Abstracting pagination across third-party APIs

Written by Elias Meire

Fragmented pagination styles are a challenge many developers face when integrating multiple APIs, because each API can use a different pagination strategy. In this post, we'll cover how we came up with a unified pagination API that supports all types of pagination, resulting in a more consistent and better developer experience.

Unified Pagination

The Problem

At Apideck, we’re hard at work building Unify. Unify is a single API for popular software categories such as CRM, accounting, and marketing automation tools, so you only need to integrate one API per domain.

Creating a single API contract to handle all pagination types is a complex problem. APIs use many different pagination strategies. In the CRM APIs we integrate with, we found 4 different pagination styles.

  1. Offset pagination, for example Pipedrive
  2. Page pagination, for example Copper
  3. Cursor pagination, for example HubSpot CRM
  4. Link pagination, for example Microsoft Dynamics

The interface of both the request and the response varies per API. Most commonly, APIs accept pagination parameters via the query string, but others use a request body or even headers (looking at you, Microsoft Dynamics).

In addition, validation rules differ. Some APIs support returning up to 5000 results in a single call, while others only support 100 results at most.

Let’s cover the different pagination strategies in more detail first.

Pagination Strategies: Offsets vs Cursors vs Links

Offsets & page-based

Offset pagination

Offsets are the most widely used pagination strategy. Clients pass an offset and a limit parameter. In response, the API then skips the number of results as indicated by the offset parameter, and returns the number of results requested via the limit parameter.

js
const response = await fetch(
  '<pipedrive-api>/organizations?start=0&limit=20'
) 
const hasNextPage = response
  .additional_data
  .pagination
  .more_items_in_collection

if (hasNextPage) {
  const nextPage = await fetch(
    '<pipedrive-api>/organizations?start=20&limit=20'
  )
}

Offset pagination in the Pipedrive API

Code samples are simplified for brevity. Authorization and JSON parsing are omitted, for example.

Many APIs use a variation of offset pagination with a page parameter. Instead of passing an offset telling the API how many results to skip, clients pass a page number. The API then translates this page number into an offset: with 1-based pages, the offset is (page - 1) × limit.

Page pagination
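To make the page-to-offset translation concrete, here is a minimal sketch. It assumes 1-based page numbers, which is common but not universal, and `pageToOffset` is a hypothetical helper rather than part of any API mentioned in this post.

```javascript
// Hypothetical helper: convert a 1-based page number to an offset.
// Assumes the first page is page 1; an API with 0-based pages
// would use page * limit instead.
function pageToOffset(page, limit) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError('page must be an integer >= 1')
  }
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be an integer >= 1')
  }
  return (page - 1) * limit
}

console.log(pageToOffset(1, 20)) // 0
console.log(pageToOffset(3, 20)) // 40
```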

Offset pagination has a couple of advantages:

  • Clients can jump to any position in the result set. This is a limitation of cursor pagination as we'll see in the next section.
  • It’s often the easiest to implement. For example, it directly maps to SQL:
sql
-- ORDER BY keeps page boundaries deterministic; "id" is an assumed column
SELECT * FROM users ORDER BY id LIMIT 20 OFFSET 20;

But there are also some disadvantages:

  • It doesn’t scale well for large result sets. With most technologies, the database still has to read through all results up to the offset. The higher the offset becomes, the worse the performance.
  • It is not suitable for result sets that change often. If results get added or removed while a client is reading pages, the client could get duplicate results, or skip results by accident.
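The second disadvantage is easy to reproduce. The sketch below uses an in-memory array as a stand-in for a database table (an assumption made purely for illustration) and shows a client receiving the same result twice when a row is added between two requests.

```javascript
// In-memory stand-in for a database table, newest first.
const rows = ['e', 'd', 'c', 'b', 'a']

// Offset pagination: skip `offset` rows, then take `limit` rows.
function getPage(offset, limit) {
  return rows.slice(offset, offset + limit)
}

const firstPage = getPage(0, 2) // ['e', 'd']

// A new row is added at the front while the client is between requests.
rows.unshift('f')

const secondPage = getPage(2, 2) // ['d', 'c'], so 'd' is returned twice

console.log(firstPage, secondPage)
```

Removing a row between requests has the opposite effect: everything shifts forward by one and the client silently skips a result.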

Cursors

Cursor pagination

Cursor pagination works by returning a pointer to a specific result in the set. This cursor can be an opaque string value. The client can use these cursors to get the next results from a specific result onwards.

This type of pagination has some great advantages over offset pagination:

  • Performance is not affected by the size of the result set, nor the position in the list.
  • Cursor pagination is great for fast-changing result sets. Because the cursor points to a specific result, the client can't receive duplicate results or skip results by accident.

But it comes with its trade-offs:

  • Clients can't jump to any page in the result set. Instead, they have to go through in sequence, using the cursor from the previous page.
  • There is usually no way for clients to know the total number of results in the set, as the server does not read the full list.
js
const response = await fetch(
  '<hubspot-api>/objects/companies?limit=20'
)

// The cursor for the next page is only present when more results exist
const nextCursor = response.paging?.next?.after
if (nextCursor) {
  const nextPage = await fetch(
    `<hubspot-api>/objects/companies?after=${nextCursor}&limit=20`
  )
}

Cursor pagination in the HubSpot API
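To illustrate why cursors stay stable when the result set changes, here is a minimal server-side sketch. It assumes rows are sorted by a unique, ascending id and uses the id of the last returned row as the cursor; real APIs usually wrap this in an opaque string. The names `getPageAfter` and `rows` are hypothetical.

```javascript
// In-memory stand-in for a table sorted by a unique, ascending id.
const rows = [{ id: 10 }, { id: 20 }, { id: 30 }, { id: 40 }, { id: 50 }]

// Return up to `limit` rows whose id is greater than the cursor.
// Here the cursor is simply the id of the last row the client received.
function getPageAfter(cursor, limit) {
  const remaining =
    cursor == null ? rows : rows.filter((row) => row.id > cursor)
  const data = remaining.slice(0, limit)
  const hasMore = remaining.length > limit
  return { data, next: hasMore ? data[data.length - 1].id : null }
}

const first = getPageAfter(null, 2) // ids 10 and 20, next = 20

// A row is inserted before the cursor position between requests.
rows.unshift({ id: 5 })

const second = getPageAfter(first.next, 2) // ids 30 and 40, no duplicates

console.log(first, second)
```

Because the second request asks for "everything after id 20" rather than "skip two rows", the newly inserted row does not shift the page boundaries.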

Links

Link pagination returns a URL to the next page in the result set. The client can then simply make a new request to this URL to fetch the next page. It is not really a separate pagination strategy, since a link can wrap either cursor or offset pagination. But because some APIs only return links, Unify needs to be able to handle them.

js
const response = await fetch(
  '<ms-dynamics-api>/accounts',
  { headers: { Prefer: 'odata.maxpagesize=20' } }
)

const nextLink = response['@odata.nextLink']

if (nextLink) {
  const nextPage = await fetch(
    nextLink,
    { headers: { Prefer: 'odata.maxpagesize=20' } }
  )
}

Link pagination in the Microsoft Dynamics API
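Following links until none is returned can be sketched as a simple loop. In the sketch below, `fetchJson` is a hypothetical function that takes a URL and resolves to the parsed JSON body, and the results are assumed to live in a `value` array next to `@odata.nextLink`, as in OData-style responses. A fake two-page API stands in for the network.

```javascript
// Follow next-page links until the API stops returning one.
// `fetchJson` is a hypothetical function: (url) => Promise<parsed JSON body>.
// `maxPages` is a safety cap so a misbehaving API cannot loop forever.
async function fetchAllPages(fetchJson, firstUrl, maxPages = 100) {
  const results = []
  let url = firstUrl
  for (let page = 0; url && page < maxPages; page++) {
    const body = await fetchJson(url)
    results.push(...body.value)
    url = body['@odata.nextLink'] // undefined on the last page
  }
  return results
}

// Fake two-page API so the sketch runs without network access.
const fakeLinkPages = {
  '/accounts': { value: [1, 2], '@odata.nextLink': '/accounts?page=2' },
  '/accounts?page=2': { value: [3] },
}
const fakeFetchJson = async (url) => fakeLinkPages[url]

fetchAllPages(fakeFetchJson, '/accounts').then((all) => {
  console.log(all) // [ 1, 2, 3 ]
})
```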

The Solution

We needed a solution that abstracts all these pagination types, so that Unify has a single, consistent API across all CRM APIs. It also needed to be familiar and easy for developers to use.

Cursors as an abstraction

We came up with the idea of using cursors to encode all pagination strategies. Since cursors are opaque string values, we can encode any information inside. Let's see how this works in practice.

The Pipedrive API, for example, uses offset pagination. So how do we abstract this into cursor pagination?

The client makes a request to Pipedrive using the Unify CRM API.

http
GET https://unify.apideck.com/crm/leads?limit=50
X-Apideck-Service-Id: pipedrive

In the response, Unify returns a cursor to access the next page of results.

json
"meta": {
  "cursors": {
    "next": "cGlwZWRyaXZlOjpvZmZzZXQ6OjUw"
  }
}

The client can now do another call with this cursor as a parameter, and get the next set of results.

http
GET https://unify.apideck.com/crm/leads
      ?limit=50
      &cursor=cGlwZWRyaXZlOjpvZmZzZXQ6OjUw
X-Apideck-Service-Id: pipedrive
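From the client's point of view, reading a full list then comes down to passing each returned cursor back until none is left. The sketch below shows that loop as an async generator. `getPage` is a hypothetical function that performs the HTTP call for a given cursor, and the `data` array next to `meta` is an assumption about the response shape; a fake two-page response is used so the sketch runs offline.

```javascript
// Read every page by passing the returned cursor back to the API.
// `getPage` is a hypothetical function: (cursor) => Promise<{ data, meta }>.
async function* paginate(getPage) {
  let cursor = null
  do {
    const page = await getPage(cursor)
    yield* page.data
    cursor = page.meta?.cursors?.next ?? null
  } while (cursor)
}

// Fake API with two pages, keyed by the cursor that requests them.
const fakeCursorPages = {
  start: { data: ['lead-1', 'lead-2'], meta: { cursors: { next: 'abc' } } },
  abc: { data: ['lead-3'], meta: { cursors: { next: null } } },
}
const fakeGetPage = async (cursor) => fakeCursorPages[cursor ?? 'start']

async function collectLeads() {
  const leads = []
  for await (const lead of paginate(fakeGetPage)) leads.push(lead)
  return leads
}

collectLeads().then((leads) => console.log(leads))
// [ 'lead-1', 'lead-2', 'lead-3' ]
```

The client never needs to know which pagination style the underlying service uses; it only forwards an opaque string.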

So how does this work?

The cursors are base64-encoded strings. If we decode the cursor from the previous example, we see it contains pipedrive::offset::50. Unify decodes this cursor and translates it into a request for Pipedrive.

http
GET https://api.pipedrive.com/v1/leads?start=50&limit=50

Unify then generates new cursors based on Pipedrive's response. In case there are more results, it will return a new cursor with the value pipedrive::offset::100.

Other pagination styles work similarly:

  • Page pagination, for example Copper: copper::page::5
  • Cursor pagination, for example HubSpot: hubspot::cursor::7151
  • Link pagination, for example Microsoft Dynamics: microsoft-dynamics::link::<url>
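The encoding itself can be sketched in a few lines. This is an illustration of the service::type::value format described above, not Unify's actual implementation: `encodeCursor` and `decodeCursor` are hypothetical names, and the sketch relies on Node.js's `Buffer` for base64.

```javascript
const SEPARATOR = '::'

// Build an opaque cursor from a service id, a pagination type, and a value.
function encodeCursor(service, type, value) {
  const raw = [service, type, String(value)].join(SEPARATOR)
  return Buffer.from(raw, 'utf8').toString('base64')
}

// Only the first two segments are treated as service and type, so a value
// that itself contains "::" (a URL, for instance) survives intact.
function decodeCursor(cursor) {
  const raw = Buffer.from(cursor, 'base64').toString('utf8')
  const [service, type, ...rest] = raw.split(SEPARATOR)
  if (!service || !type || rest.length === 0) {
    throw new Error('Malformed cursor')
  }
  return { service, type, value: rest.join(SEPARATOR) }
}

console.log(encodeCursor('pipedrive', 'offset', 50))
// cGlwZWRyaXZlOjpvZmZzZXQ6OjUw

console.log(decodeCursor('cGlwZWRyaXZlOjpvZmZzZXQ6OjUw'))
// { service: 'pipedrive', type: 'offset', value: '50' }
```

Note that base64 is an encoding, not encryption: clients can decode these cursors, so they should be validated on the way back in like any other user input.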

API Augmentation

Some APIs return up to 5000 results, while others max out at 100. But this doesn't mean that the Unify API is limited to the lowest limit. Because Unify has full control over the requests and cursors, Unify can augment the functionality of APIs.

For the Unify API, we return up to 200 results per request. Let's look at an example to see how this works with APIs that have a lower limit.

The HubSpot API uses cursor pagination and only returns up to 100 results in a single call.

http
GET https://unify.apideck.com/crm/leads?limit=200
X-Apideck-Service-Id: hubspot

Unify has an OpenAPI spec for the HubSpot API, and the spec says that HubSpot has a maximum limit of 100. Based on this, Unify knows it needs to make 2 calls to HubSpot.

http
GET https://api.hubapi.com/crm/v3/objects/contacts?limit=100

From this call, Unify receives a cursor from HubSpot to fetch the next page. Unify then uses that cursor in the second request.

http
GET https://api.hubapi.com/crm/v3/objects/contacts
      ?limit=100
      &after=16701

The cursor returned from the HubSpot API in the last request is encoded into hubspot::cursor::17653 and returned in the response from Unify.

This example uses cursor pagination, but this also works for the other pagination strategies.
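The fan-out described above can be sketched as a loop that keeps calling the downstream API until the requested limit is reached or the results run out. This is a simplified illustration, not Unify's actual code: `fetchAugmented` and `fetchUpstream` are hypothetical names, and the fake upstream below simply slices an array and uses the next index as its cursor.

```javascript
// Collect up to `limit` results from an upstream API that returns at most
// `upstreamMax` results per call. `fetchUpstream` is a hypothetical function:
// ({ limit, after }) => Promise<{ results, nextCursor }>.
async function fetchAugmented(fetchUpstream, limit, upstreamMax, after = null) {
  const results = []
  let cursor = after
  while (results.length < limit) {
    const pageSize = Math.min(upstreamMax, limit - results.length)
    const page = await fetchUpstream({ limit: pageSize, after: cursor })
    results.push(...page.results)
    cursor = page.nextCursor ?? null
    // Stop when the upstream has no more results or makes no progress.
    if (!cursor || page.results.length === 0) break
  }
  return { results, nextCursor: cursor }
}

// Fake upstream with 250 records and a cap of 100 per call.
const records = Array.from({ length: 250 }, (_, i) => i + 1)
async function fakeUpstream({ limit, after }) {
  const start = after ?? 0
  const results = records.slice(start, start + Math.min(limit, 100))
  const end = start + results.length
  return { results, nextCursor: end < records.length ? end : null }
}

fetchAugmented(fakeUpstream, 200, 100).then(({ results, nextCursor }) => {
  console.log(results.length, nextCursor) // 200 200
})
```

The cursor from the last upstream call is what would be encoded and handed back to the client, so the next request resumes exactly where this one stopped.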

Improving DX

In the Unify REST API, we also return pagination links. This makes it even easier to fetch the next or previous pages. Clients can simply make a new HTTP call to the URL. In practice, these links contain the request URL with the next or previous cursor in the query.

json
"links": {
  "previous":
    "https://unify.apideck.com/crm/leads?limit=20",
  "current":
    "https://unify.apideck.com/crm/leads?limit=20&cursor=<cursor>",
  "next":
    "https://unify.apideck.com/crm/leads?limit=20&cursor=<next>"
}
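Building such links can be sketched with the standard `URL` API. The `buildLinks` helper below is hypothetical, and it assumes a `null` cursor means there is no page in that direction; how the first page is represented as a "previous" link is left out of this sketch.

```javascript
// Build pagination links by setting the `cursor` query parameter on the
// request URL. Hypothetical helper; a null cursor yields a null link.
function buildLinks(requestUrl, cursors) {
  const withCursor = (cursor) => {
    if (cursor == null) return null
    const url = new URL(requestUrl)
    // searchParams.set percent-encodes the value where needed.
    url.searchParams.set('cursor', cursor)
    return url.toString()
  }
  return {
    previous: withCursor(cursors.previous),
    current: requestUrl,
    next: withCursor(cursors.next),
  }
}

const links = buildLinks(
  'https://unify.apideck.com/crm/leads?limit=20&cursor=bbb',
  { previous: 'aaa', next: 'ccc' }
)

console.log(links.previous)
// https://unify.apideck.com/crm/leads?limit=20&cursor=aaa
console.log(links.next)
// https://unify.apideck.com/crm/leads?limit=20&cursor=ccc
```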

GraphQL

In the Unify GraphQL API, cursor pagination works in the same way. Cursors are returned from list queries and can be used as parameters. In this example query, we're retrieving a list of leads from the CRM API.

graphql
{
 crm {
   leads(limit: 100 cursor: "em9oby1jcm06OnBhZ2U6OjE=") {
     data {
       id
       name
     }
     meta {
       cursors {
         previous
         next
       }
     }
   }
 }
}

The GraphQL API returns the requested cursors. The links section is not included here since GraphQL APIs only have a single endpoint.

json
{
  "crm": {
    "leads": {
      "meta": {
        "cursors": {
          "previous": "em9oby1jcm06OnBhZ2U6OjIo",
          "next": "em9oby1jcm06OnBhZ2U6OjI="
        }
      }
    }
  }
}

Closing thoughts

If you are interested in the code that powers this pagination abstraction, be sure to let us know. If there is interest, we would be happy to open-source it and give back to the community.

We hope you enjoyed this deep dive into API pagination. We are pretty excited about our new Unified Pagination and hope you like it as well. If you have any questions or notice any mistakes, feel free to contact us. Unified Pagination for Apideck Unify is now live, so be sure to check out the docs.
