Sismo Hub Guide: Add a Data Provider to the Sismo Factory

Developer tutorial

What's inside?

This tutorial will explain how to create a Data Provider in the Sismo Hub. Users (through the Factory) and developers (through the Sismo Hub) will be able to use your Data Provider in order to create Group Generators.

Group Generators are functions that enable the creation of groups at the center of the Sismo protocol. As groups need to contain data, we need Data Providers to fetch data from external APIs (e.g., Lens, GitHub, Snapshot).

You can find all already existing Data Providers here.

By following this tutorial you will learn how to create a new Data Provider to fetch the data for your desired group.

Data Providers are at the foundation of the Sismo Hub architecture. They are designed to abstract away the complexity of fetching data from external APIs, so that future developers and group creators can gather data easily and reliably when generating groups.

You can find the pull request associated with this tutorial here.

What are Data Providers?

Data Providers enable other group creators to reuse specific data fetching logic, such as calls to APIs.

Groups are composed of Data Sources:

  • Web2 accounts: Twitter, GitHub, Telegram

  • Web3 accounts: Ethereum addresses, ENS, Lens handle

When creating a group, you need to create a Group Generator. There are 3 different ways to do so:

Data Providers are also useful because they enable automatic and frequent group updates. A group built from a hardcoded list of accounts, by contrast, would not update automatically.

If you want to create a group that contains all the voters of your last DAO proposal on Snapshot, simply use the queryProposalVoters function from the Snapshot Data Provider in your Group Generator by giving the Proposal Identifier as an argument.

For example:

const snapshotProviderData0 = await snapshotProvider.queryProposalVoters({
      proposal: "0x864685d094312bc51eeb01af69e83dc7c496108cca54a15b491d79e9ce389889"
    });

This will fetch all the voters of the CoW Swap proposal number 23 (CIP-23).

If you want to know more on Data Providers, check out this page. Alternatively, you can check out the Contributor Guide before creating one yourself.

Tutorial use case

This tutorial will show you how to create a Data Provider and make it usable in the Factory.

We will take the example of a Lens Data Provider on top of the Lens API and build our first query that gets the list of collectors of a specific Lens post.

At the end of this tutorial, it will be up to you to open a pull request on the Sismo Hub to see your Data Provider integrated and usable in the Factory (green box) and in the Sismo Hub (red box) 🙌

Tutorial

Setup of the local environment

First, you need to clone the Sismo Hub repository:

# using ssh
git clone git@github.com:sismo-core/sismo-hub.git
cd sismo-hub

# install dependencies
yarn

You can install yarn here.

Creation of the Data Provider

Let's build our Data Provider now 🧑‍💻

Go to the group-generators/helpers/data-providers folder and create a new folder:

# in a new terminal
# you are in sismo-hub

cd ./group-generators/helpers/data-providers
# create the data-provider folder
mkdir tutorial-lens
cd tutorial-lens

For our case in this tutorial, we will split our Data Provider into 3 different files:

  • index.ts: [mandatory] Contains all the functions you make available to use in a group generator. It will help format data that come from your API queries.

  • types.ts: [mandatory] The file where you can define all the types you need.

  • queries.ts: Defines all the requests to the API you want to use.

However, keep in mind that it's totally up to you to choose the structure you want for the files of your data provider. For this case, we found that it was better to split the Data Provider into these 3 files, but feel free to structure it however you like.

Only index.ts and types.ts are mandatory.

Let's start coding! 🧑‍💻

First, you need to define which query and which types you will use. From what we see in the Lens API docs, the whoCollectedPublication() query uses and returns some types, so you need to implement them in your data provider.

Start by creating the query that will use these types in the queries.ts file:

# create the queries file
touch queries.ts

// group-generators/helpers/data-providers/tutorial-lens/queries.ts

// import gql in order to have prettier written requests
import { gql } from "graphql-request";
// import the types (defined in types.ts, created in the next step)
import { GetWhoCollectedPublicationType } from "./types";
// import the GraphQLProvider class
import { GraphQLProvider } from "@group-generators/helpers/data-providers/graphql";

// function that contains the query
export const getWhoCollectedPublicationQuery = async (
  graphqlProvider: GraphQLProvider,
  publicationId: string,
  cursor: string
): Promise<GetWhoCollectedPublicationType> => {
  // query
  return graphqlProvider.query<GetWhoCollectedPublicationType>(
    gql`
      query whoCollectedPublication($request: WhoCollectedPublicationRequest!) {
        whoCollectedPublication(request: $request) {
          items {
            address
          }
          pageInfo {
            prev
            next
            totalCount
          }
        }
      }
    `,
    {
      // the request fetches the results 50 at a time
      request: {
        // the id of the Lens post
        publicationId: publicationId,
        // number of items to fetch per request
        limit: 50,
        // this parameter tells the API where to resume fetching
        // example: cursor 50 & limit 50 => you will fetch items 50 to 100
        ...(cursor ? { cursor } : {}),
      },
    }
  );
};

Once done, create the types.ts file:

# create the types definitions file
touch types.ts

Then define all the types you need when using the whoCollectedPublication() query:

// group-generators/helpers/data-providers/tutorial-lens/types.ts

// this type informs us which parts of the set of elements we have fetched.
// it allows us to do pagination
export type PageInfo = {
  prev: string;
  next: string;
  totalCount: number;
};

export type Wallet = {
  address: string;
};

export type GetWhoCollectedPublicationType = {
  whoCollectedPublication: {
    items: Wallet[];
    pageInfo: PageInfo;
  };
};

// this type will be used in the index.ts file following the tutorial
export type PublicationId = {
  publicationId: string;
};

Because Lens API is a GraphQL API, you can use the GraphQL Data Provider already created in the Hub in order to create the query.

As you can see, this query allows fetching 50 users at a time. So you need to create a function that will iterate on this query in order to fetch all the users.

All of this occurs in the index.ts file:

# create the index file
touch index.ts

// group-generators/helpers/data-providers/tutorial-lens/index.ts

// import the queries
import {
  getWhoCollectedPublicationQuery,
} from "./queries";
// import the types
import {
  GetWhoCollectedPublicationType,
  PublicationId,
  Wallet
} from "./types";
// import the GraphQL Provider class
import { GraphQLProvider } from "@group-generators/helpers/data-providers/graphql";
// Import the FetchedData type
import { FetchedData } from "topics/group";

export class TutorialLensProvider extends GraphQLProvider {
  constructor() {
    super({
      // url of the API
      url: "https://api.lens.dev",
    });
  }
  
  // method that creates the FetchedData object
  public async getWhoCollectedPublication(
    publication: PublicationId
  ): Promise<FetchedData> {
    const dataProfiles: FetchedData = {};
    for await (const item of this._getWhoCollectedPublication(publication)) {
      dataProfiles[item.address] = 1;
    }
    return dataProfiles;
  }

  // method that iterates over the pages returned by getWhoCollectedPublicationQuery
  private async *_getWhoCollectedPublication({
    publicationId,
  }: PublicationId): AsyncGenerator<Wallet, void, undefined> {
    let cursor = "";
    let lensCollectors: GetWhoCollectedPublicationType;
    // fetch data as long as there is data
    do {
      lensCollectors = await getWhoCollectedPublicationQuery(
        this,
        publicationId,
        cursor
      );
      yield* lensCollectors.whoCollectedPublication.items;
      cursor = lensCollectors.whoCollectedPublication.pageInfo.next;
    } while (lensCollectors.whoCollectedPublication.items.length > 0);
  }
}
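The pagination loop can be tried in isolation, without calling the Lens API. The following is a minimal sketch under stated assumptions: `fakeFetchPage`, `iterateCollectors`, `collectAll`, the `Page` type, the page size of 2, and the offset-style cursor are all hypothetical stand-ins invented for illustration. Only the loop structure mirrors the provider code.

```typescript
// Hypothetical stand-ins for the Lens API types (illustration only).
type Wallet = { address: string };
type Page = { items: Wallet[]; next: string };

// Fake data source: 5 collectors, served 2 at a time.
const ALL_COLLECTORS: Wallet[] = ["0xa1", "0xa2", "0xa3", "0xa4", "0xa5"].map(
  (address) => ({ address })
);
const PAGE_SIZE = 2;

// Stand-in for getWhoCollectedPublicationQuery.
// Assumption: the cursor is simply the start offset, encoded as a string.
async function fakeFetchPage(cursor: string): Promise<Page> {
  const start = cursor ? parseInt(cursor, 10) : 0;
  const items = ALL_COLLECTORS.slice(start, start + PAGE_SIZE);
  return { items, next: String(start + items.length) };
}

// Same shape as _getWhoCollectedPublication: yield every item of a page,
// then follow the cursor until a page comes back empty.
async function* iterateCollectors(): AsyncGenerator<Wallet, void, undefined> {
  let cursor = "";
  let page: Page;
  do {
    page = await fakeFetchPage(cursor);
    yield* page.items;
    cursor = page.next;
  } while (page.items.length > 0);
}

// Same shape as getWhoCollectedPublication: fold the stream of wallets
// into an address -> value map.
async function collectAll(): Promise<Record<string, number>> {
  const data: Record<string, number> = {};
  for await (const wallet of iterateCollectors()) {
    data[wallet.address] = 1;
  }
  return data;
}
```

In this sketch the loop stops only after it receives an empty page, so it issues one more request than there are non-empty pages.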

It is very important that the function used by the Factory takes a single object containing all of its arguments.

When the Factory creates a group using your data provider, it takes all the arguments given by the user as input and builds an object from them. It then calls your function with this object, like this:

const lensProviderData0 = await lensProvider.getWhoCollectedPublication({
    publicationId: "0x26e5-0x09"
});
// for this case there is only 1 argument

That is why the argument of the function used by the Factory must be an object that contains all the possible arguments of your function.

For example in our case we needed to define this type:

export type PublicationId = {
  publicationId: string;
  // you can add other arguments here
};
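To illustrate why a single object argument is convenient, here is a minimal sketch of how a caller might assemble user inputs into one object before calling a provider method. This is not the Factory's actual code: `StubLensProvider`, `UserInput`, and `buildArgs` are hypothetical names, and the stub returns canned data synchronously for brevity instead of querying the Lens API.

```typescript
// Simplified stand-in for the Sismo Hub FetchedData type.
type FetchedData = { [address: string]: string | number };
type PublicationId = { publicationId: string };

// Hypothetical stub provider: returns canned data instead of calling the Lens API.
class StubLensProvider {
  public getWhoCollectedPublication(publication: PublicationId): FetchedData {
    const data: FetchedData = {};
    data["collector-of-" + publication.publicationId] = 1;
    return data;
  }
}

// One user input as a form might collect it (assumption): argument name and value.
type UserInput = { argName: string; value: string };

// Build a single arguments object from all the user inputs.
function buildArgs(inputs: UserInput[]): Record<string, string> {
  const collected: Record<string, string> = {};
  for (const input of inputs) {
    collected[input.argName] = input.value;
  }
  return collected;
}

const lensArgs = buildArgs([{ argName: "publicationId", value: "0x26e5-0x09" }]);
const stubProvider = new StubLensProvider();
// The whole object is passed as the only argument of the provider method.
const stubResult = stubProvider.getWhoCollectedPublication(lensArgs as PublicationId);
```

Because every argument travels inside one object, adding a second argument to the type would not change how the function is called.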

It is worth noting that the final variable you return at the end of your getWhoCollectedPublication function is of type FetchedData. This type is needed if you want to create Group Generators.

Indeed, a Group Generator returns a group that is composed of metadata:

{
  name: "test",
  timestamp: context.timestamp,
  description: "This is a test group",
  specs: "",
  data: dataProfiles,
  valueType: ValueType.Score,
  tags: [],
}

The data field is where the group of accounts is stored, and its type is a FetchedData.

That's why FetchedData must be the format of the variable you return.

export type FetchedData = {
    [address: string]: BigNumberish;
};

And here's an example of what a FetchedData object looks like:

{
  "0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045": "1",
  "0xF61CabBa1e6FC166A66bcA0fcaa83762EdB6D4Bd": "1",
  "0x2E21f5d32841cf8C7da805185A041400bF15f21A": "1",
}

Here there are only "0x123..."-style addresses, but a FetchedData object can also contain:

  • ENS: "vitalik.eth"

  • Lens handle: "stani.lens"

  • Twitter account: "twitter:Sismo_eth"

  • GitHub account: "github:leosayous21"

  • Telegram account: "telegram:pelealexandru"
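As a minimal sketch of a FetchedData object that mixes these identifier formats, the snippet below builds one from plain lists. The helpers `toFetchedData` and `mergeFetchedData` are hypothetical and written only for this illustration, and `BigNumberish` is simplified here to `string | number`, which is an assumption since the real type accepts more forms.

```typescript
// Simplified stand-in (assumption): the real BigNumberish type accepts more forms.
type BigNumberish = string | number;
type FetchedData = { [address: string]: BigNumberish };

// Hypothetical helper: give every identifier the same value (1 by default).
function toFetchedData(identifiers: string[], value: BigNumberish = 1): FetchedData {
  const data: FetchedData = {};
  for (const id of identifiers) {
    data[id] = value;
  }
  return data;
}

// Hypothetical helper: merge several FetchedData objects.
// On duplicate keys, the later object wins.
function mergeFetchedData(...parts: FetchedData[]): FetchedData {
  const merged: FetchedData = {};
  for (const part of parts) {
    for (const key of Object.keys(part)) {
      merged[key] = part[key];
    }
  }
  return merged;
}

const web3Accounts = toFetchedData([
  "0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045",
  "vitalik.eth",
  "stani.lens",
]);
const web2Accounts = toFetchedData(["twitter:Sismo_eth", "github:leosayous21"], 2);
const mixedAccounts = mergeFetchedData(web3Accounts, web2Accounts);
```

The keys are plain strings, so the different account types are told apart only by their format (a prefix such as `twitter:` or a suffix such as `.eth`).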

Finally, you need to reference your newly created Data Provider in the data-providers folder in order to use it anywhere in the Sismo Hub:

// group-generators/helpers/data-providers/index.ts

...
import { TutorialLensProvider } from "./tutorial-lens";

export const dataProviders = {
  ...
  TutorialLensProvider,
};

Congrats 🎉 You've just created your first Data Provider!

But now that you have finished, you probably want to test it, right?

To do this you will have to create a group (only for testing) and import your Data Provider into it, so you can try your request. If you don't know how to create a group, check out the group tutorial here.
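As a rough idea of what such a test group could look like, here is a minimal, self-contained sketch. All the types are local stand-ins and therefore assumptions (`GenerationContext`, `ValueType`, `TestGroup`), and `StubTutorialLensProvider` returns canned data instead of querying the Lens API. In the Sismo Hub you would import the real types and your real provider instead, as described in the group tutorial.

```typescript
// Local stand-ins for Sismo Hub types (assumptions, simplified for illustration).
type FetchedData = { [address: string]: string | number };
type GenerationContext = { timestamp: number };
enum ValueType {
  Score = "Score",
  Info = "Info",
}
type TestGroup = {
  name: string;
  timestamp: number;
  description: string;
  specs: string;
  data: FetchedData;
  valueType: ValueType;
  tags: string[];
};

// Hypothetical stub: the real provider would query the Lens API here.
class StubTutorialLensProvider {
  public async getWhoCollectedPublication(_args: {
    publicationId: string;
  }): Promise<FetchedData> {
    return { "0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045": 1 };
  }
}

// Builds one test group whose data field is filled by the provider.
async function generateTestGroup(context: GenerationContext): Promise<TestGroup[]> {
  const provider = new StubTutorialLensProvider();
  const dataProfiles = await provider.getWhoCollectedPublication({
    publicationId: "0x26e5-0x09",
  });
  return [
    {
      name: "test",
      timestamp: context.timestamp,
      description: "This is a test group",
      specs: "",
      data: dataProfiles,
      valueType: ValueType.Score,
      tags: [],
    },
  ];
}
```

Swapping the stub for your real provider lets you check that the returned data has the expected accounts before opening a pull request.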

Make your Data Provider usable by anyone on the Factory

Okay, so now that your data provider is finished and tested, it can be used by all devs in the Sismo Hub. The next step is making it available to anyone in the Factory. Pretty exciting, isn't it? 🤩

To do so you need to create an interface-schema.json file that contains all the information the Factory needs to use your function and display relevant information.

# create the interface-schema file
touch interface-schema.json

// group-generators/helpers/data-providers/tutorial-lens/interface-schema.json

{
  "name": "Tutorial Lens",
  "iconUrl": "",
  "providerClassName": "TutorialLensProvider",
  "functions": [
    {
      "name": "Get collectors of",
      "functionName": "getWhoCollectedPublication",
      "countFunctionName": "getPublicationCollectorsCount",
      "description": "Returns all collectors of a specific publication",
      "args": [
        {
          "name": "publication identifier",
          "argName": "publicationId",
          "type": "string",
          "example": "0x26e5-0x03",
          "description": "A Lens publication identifier in the following format {profileId}-{publicationId}. https://docs.lens.xyz/docs/get-publication"
        }
      ]
    }
  ]
}

If you want more info on this schema, check this section on the Data Provider page.
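A common mistake is a schema that names a method the provider class does not have. The sketch below shows one possible sanity check. The types are inferred from the JSON shown above and are not an official Sismo Hub type, and `findMissingMethods` and `IncompleteProvider` are hypothetical names used only for this illustration.

```typescript
// Shape inferred from the interface-schema.json above (not an official type).
type SchemaArg = {
  name: string;
  argName: string;
  type: string;
  example: string;
  description: string;
};
type SchemaFunction = {
  name: string;
  functionName: string;
  countFunctionName: string;
  description: string;
  args: SchemaArg[];
};
type InterfaceSchema = {
  name: string;
  iconUrl: string;
  providerClassName: string;
  functions: SchemaFunction[];
};

// Returns the method names referenced by the schema that the provider lacks.
function findMissingMethods(schema: InterfaceSchema, provider: object): string[] {
  const target = provider as Record<string, unknown>;
  const missing: string[] = [];
  for (const fn of schema.functions) {
    for (const methodName of [fn.functionName, fn.countFunctionName]) {
      if (typeof target[methodName] !== "function") {
        missing.push(methodName);
      }
    }
  }
  return missing;
}

// Hypothetical provider that forgot to implement the count function.
class IncompleteProvider {
  public getWhoCollectedPublication(): Record<string, number> {
    return {};
  }
}

const tutorialSchema: InterfaceSchema = {
  name: "Tutorial Lens",
  iconUrl: "",
  providerClassName: "TutorialLensProvider",
  functions: [
    {
      name: "Get collectors of",
      functionName: "getWhoCollectedPublication",
      countFunctionName: "getPublicationCollectorsCount",
      description: "Returns all collectors of a specific publication",
      args: [
        {
          name: "publication identifier",
          argName: "publicationId",
          type: "string",
          example: "0x26e5-0x03",
          description: "A Lens publication identifier",
        },
      ],
    },
  ],
};

const missingMethods = findMissingMethods(tutorialSchema, new IncompleteProvider());
```

Here `missingMethods` contains only `getPublicationCollectorsCount`, because the stub class implements the fetch function but not the count function.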

Then you have to reference your schema:

// group-generators/helpers/data-providers/index.ts

...
import tutorialLensProviderSchema from "./tutorial-lens/interface-schema.json";

export const dataProvidersInterfacesSchemas = [
  ...
  tutorialLensProviderSchema,
];

You're almost finished! 😉

As you may have noticed, in the interface-schema.json there is a countFunctionName. This is called a count function. Count functions allow the Factory to know how many accounts a request will fetch and thus to know if the data that the user wants to fetch exists.

Let's create it:

// group-generators/helpers/data-providers/tutorial-lens/index.ts

// also add getPublicationStatsQuery to the imports from "./queries"
...
  public async getPublicationCollectorsCount(
    publication: PublicationId
  ): Promise<number> {
    const publicationStats = await getPublicationStatsQuery(
      this,
      publication.publicationId
    );
    return publicationStats.publication.stats.totalAmountOfCollects;
  }

// group-generators/helpers/data-providers/tutorial-lens/queries.ts

// also add GetPublicationStatsType to the imports from "./types"
...
export const getPublicationStatsQuery = async (
  graphqlProvider: GraphQLProvider,
  publicationId: string
): Promise<GetPublicationStatsType> => {
  return graphqlProvider.query<GetPublicationStatsType>(
    gql`
      query PublicationStats($request: PublicationQueryRequest!) {
        publication(request: $request) {
          ... on Post {
            stats {
              totalAmountOfCollects
            }
          }
          ... on Comment {
            stats {
              totalAmountOfCollects
            }
          }
          ... on Mirror {
            stats {
              totalAmountOfCollects
            }
          }
        }
      }
    `,
    {
      request: {
        publicationId: publicationId,
      },
    }
  );
};

// group-generators/helpers/data-providers/tutorial-lens/types.ts

...
// shape of the response returned by getPublicationStatsQuery
export type GetPublicationStatsType = {
  publication: {
    stats: {
      totalAmountOfCollects: number;
    };
  };
};

This count function (getPublicationCollectorsCount) will return the number of collectors for a specific Lens post. It allows the Factory to verify if the publicationId given by the user is correct but also to display the number of accounts the query will fetch on the website.

Finally, you have to reference your count function in the dataProvidersAPIEndpoints:

// group-generators/helpers/data-providers/index.ts

export const dataProvidersAPIEndpoints = {
  ...
  // add your data provider to the /data-provider-interfaces endpoint of the Sismo Hub API
  TutorialLensProvider: {
    // add your count function to your data provider on the Sismo Hub API
    getPublicationCollectorsCount: async (_: any) =>
      new TutorialLensProvider().getPublicationCollectorsCount(_),
  },
};
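To make the role of this map more concrete, here is a minimal sketch of how a caller could look up a count function by provider name and function name, then use the result to decide whether a request is valid. This is not the Factory's actual logic: `checkRequest`, `stubEndpoints`, and the fake counts are hypothetical, and treating "count greater than zero and no error" as valid is an assumption.

```typescript
type CountFunction = (args: Record<string, string>) => Promise<number>;
type Endpoints = {
  [providerName: string]: { [countFunctionName: string]: CountFunction };
};

// Fake collect counts keyed by publication id (illustration only).
const FAKE_COLLECT_COUNTS: Record<string, number> = { "0x26e5-0x09": 42 };

// Stub map with the same nesting as dataProvidersAPIEndpoints.
const stubEndpoints: Endpoints = {
  TutorialLensProvider: {
    getPublicationCollectorsCount: async (args) => {
      const count = FAKE_COLLECT_COUNTS[args["publicationId"]];
      if (count === undefined) {
        throw new Error("unknown publication");
      }
      return count;
    },
  },
};

// Looks up the count function and reports whether the request seems valid.
// Assumption: a request is valid when the count function succeeds and returns > 0.
async function checkRequest(
  endpoints: Endpoints,
  providerName: string,
  countFunctionName: string,
  args: Record<string, string>
): Promise<{ valid: boolean; count: number }> {
  const providerEndpoints = endpoints[providerName];
  const countFunction = providerEndpoints
    ? providerEndpoints[countFunctionName]
    : undefined;
  if (!countFunction) {
    return { valid: false, count: 0 };
  }
  try {
    const count = await countFunction(args);
    return { valid: count > 0, count };
  } catch (_error) {
    return { valid: false, count: 0 };
  }
}
```

The names used for the lookup are the same strings as `providerClassName` and `countFunctionName` in the interface schema, which is why they must match exactly.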

By adding this interface file and the count function, you just allowed the Factory to:

  • Fetch information about your newly created Data Provider. It can now be displayed (red box) and used by Factory users when adding eligible accounts.

  • Call the right count function (regarding the user selection).

If you wish to choose the collectors for a Lens post, you would need to input the post's ID. After clicking the 'Add' button, the Factory will invoke your count function to verify the post's validity and display the number of eligible accounts at the bottom of the screen (see the example above).

You have finally set up your Data Provider for the Factory, congrats! 🎉

For our case, we use the Lens API which doesn't require any access token. However, for many APIs, that's not the case, so if you want to set an access token for the API you want to use, you need to create an environment variable. Here is how to do it:

# you are in sismo-hub root
cp .example.env .env

Then you can add your own access token by adding a new line to this file:

# in the .env file
export YOUR_DATA_PROVIDER_API_KEY="<API_KEY>"

And you need to export the variable in order for you to use it locally on your computer:

source .env

You can now use it in the Sismo Hub with the following syntax:

process.env.YOUR_DATA_PROVIDER_API_KEY
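Since an unset variable only shows up as `undefined` at runtime, it can help to fail early with a clear message. Here is a minimal sketch, assuming hypothetical helpers `requireApiKey` and `buildAuthHeaders`. The `Authorization: Bearer` header is also an assumption, so check which header or query parameter your API actually expects.

```typescript
// Minimal view of an environment object such as process.env.
type Env = { [key: string]: string | undefined };

// Hypothetical helper: return the key or fail with an explicit error.
function requireApiKey(env: Env, variableName: string): string {
  const value = env[variableName];
  if (value === undefined || value === "") {
    throw new Error(
      "Missing environment variable " + variableName + ". Did you run `source .env`?"
    );
  }
  return value;
}

// Hypothetical helper. Assumption: the API expects a Bearer token header.
function buildAuthHeaders(apiKey: string): { [header: string]: string } {
  return { Authorization: "Bearer " + apiKey };
}

// In a provider you would pass process.env as the first argument:
//   const apiKey = requireApiKey(process.env, "YOUR_DATA_PROVIDER_API_KEY");
const exampleHeaders = buildAuthHeaders(
  requireApiKey({ YOUR_DATA_PROVIDER_API_KEY: "abc" }, "YOUR_DATA_PROVIDER_API_KEY")
);
```

Passing the environment as a parameter keeps the helper easy to test with a plain object.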

NB: To be able to use the API key in the Sismo Hub when your badge is deployed, please contact us in #dev-support (Sismo Discord) after creating your PR so we can add your access token to our infrastructure.

See your work integrated on Sismo

Your Data Provider is now ready, so it's time to push it into prod!

You don't have to store or launch anything on your side. You will just have to add a pull request to the Sismo Hub to see your Data Provider live! 😁

First, you will have to fork the Sismo Hub repository on GitHub.

Then you will have to add a new remote to your repo:

git remote add fork <your_fork_repo_link.git>

You can also create a new branch for your work:

git checkout -b tutorial-lens-data-provider

In this tutorial you should have 5 different files changed:

  • 4 files created:

    • group-generators/helpers/data-providers/tutorial-lens/index.ts

    • group-generators/helpers/data-providers/tutorial-lens/queries.ts

    • group-generators/helpers/data-providers/tutorial-lens/types.ts

    • group-generators/helpers/data-providers/tutorial-lens/interface-schema.json

  • 1 file changed:

    • group-generators/helpers/data-providers/index.ts

Next, add all your changes and push them to your fork:

# git add all your changes except the group you made to test your data provider
git add <all-your-files-except-the-test-group>
git commit -m "feat: add tutorial-lens data provider"
# if you created the branch tutorial-lens-data-provider
git push fork tutorial-lens-data-provider
# otherwise
git push

Don't forget to only add the files related to the Data Provider, and do not add the group you made for testing your Data Provider.

You will then see the following message on your forked repository. Click on "Compare & pull request" to begin creating your pull request:

You will finally see your branch being compared to the main branch of the Sismo Hub repository (blue boxes). You can review your changes and add a meaningful title and comments, and when you are happy with your PR, click "Create Pull request". 😇

For example, here is a valid pull request: https://github.com/sismo-core/sismo-hub/pull/1407 (The PR is closed because it's only used as an example for this tutorial)

If your Data Provider requires an API key, contact the team on the Sismo Discord in #dev-support in order for us to make your API key usable in the main Sismo Hub and on the Factory.

When your PR is merged, you'll be able to see your Data Provider implemented in the Sismo Factory.

You will also see it in the Sismo Hub API: https://hub.sismo.io/data-provider-interfaces

Finally, your Data Provider will be available in the Sismo Hub and anyone will be able to use it to create groups. Great job! 💪

Contribute to the Sismo Hub

You want to contribute, but you don't have any ideas for Data Providers to create? Check out the current Sismo Hub GitHub issues, where you will find some interesting ideas for Data Providers to implement.

If you have any questions or need help while creating your Data Provider, do not hesitate to join our Discord and ask us in #dev-support or on our Dev Telegram. We will be glad to answer you 🤗
