DynamoDB Pagination - The Ultimate Guide (with Example)
Written by Ashan Fernando
Published on November 29th, 2021
Working With DynamoDB Pagination
If you build on the AWS platform, DynamoDB is an obvious choice to store your application data. However, if you come from a relational database background, you will find several things quite different. One such area is the pagination support in DynamoDB.
What is DynamoDB Pagination?
Pagination, in general, is a technique to split the data records retrieved from DynamoDB into multiple segments. As a database that supports storing large amounts of data, DynamoDB already puts a default upper limit on how much data a single Query or Scan call returns: 1 MB. On top of that, we can also limit the number of records each query evaluates by using the Limit parameter.
Besides, in DynamoDB, every read has a direct cost. The charge depends on the amount of data read rather than on a simple per-record count, but treating each retrieved record as a cost is a useful rule of thumb. So, pagination is the key to limiting the records that you retrieve and to saving costs directly.
Pagination with Page Number
DynamoDB is designed as a distributed system to store massive amounts of data. So tables are stored in multiple partitions (likely in multiple servers), and there is no query support across the partitions to count the number of records available in a table.
So, if you plan to do pagination with page numbers and need to get the total number of records available for the query, you are out of luck. But, there is a better alternative if you opt for continuous loading of pages, like showing an infinite scroll instead of page numbers.
Best Practices
If you are getting started with pagination, you will find that DynamoDB returns a LastEvaluatedKey in the query response whenever more data remains to be read. This key is returned in either of these cases:
- The query results have hit the upper limit (a DynamoDB Query operation returns at most 1 MB of data per call).
- You have set the Limit parameter in the query, and more records remain to evaluate for the next page.
Then you can pass the LastEvaluatedKey value as the ExclusiveStartKey parameter of the subsequent query.
Using the LastEvaluatedKey and ExclusiveStartKey, you can implement a complete pagination solution that supports the on-demand loading of pages in your application.
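As a rough illustration of this handoff, the sketch below fetches one page and returns the key for the next one. It is a minimal sketch under stated assumptions, not a definitive implementation: runQuery is an assumed helper (any async function that takes Query parameters and resolves to the Query response), and fetchPage, pageSize, and startKey are names introduced here for illustration.

```javascript
// Fetch one page of query results.
// ASSUMPTION: `runQuery` is a caller-supplied async function that takes
// DynamoDB Query parameters and resolves to the Query response.
async function fetchPage(runQuery, baseParams, pageSize, startKey) {
  const params = { ...baseParams, Limit: pageSize };

  // Only send ExclusiveStartKey when continuing from a previous page.
  if (startKey) {
    params.ExclusiveStartKey = startKey;
  }

  const response = await runQuery(params);

  return {
    items: response.Items || [],
    // Absent (undefined) when there is nothing left to read.
    nextKey: response.LastEvaluatedKey,
  };
}
```

With the AWS SDK for JavaScript v3, runQuery could be (params) => docClient.send(new QueryCommand(params)), where docClient is a DynamoDBDocumentClient. The caller keeps nextKey and passes it back as startKey when the user asks for the next page.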
Dynamic Results for Pages
Keep in mind that the number of items retrieved via pagination could vary. It’s especially true for the last page, where there could be just one item, or when the size of your items varies. Things get more interesting if you combine filter parameters with a query, since DynamoDB applies the filter on top of the records read for a given page. A middle page might then return anywhere from 0 items up to the limit you specify.
Note: Never decide whether more pages exist based on the number of items returned. Always check for the LastEvaluatedKey value.
Therefore, if the user is scrolling down, you might need to show a data loader, keep looking for data, and show “No more records available” only when the response no longer contains a LastEvaluatedKey.
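One way to "keep looking for data" is sketched below: the loop continues until it has collected the wanted number of items or the response carries no LastEvaluatedKey. This is a sketch under assumptions: runQuery is an assumed async helper that takes Query parameters and resolves to the Query response, and loadMore, wanted, and done are hypothetical names.

```javascript
// Keep querying until `wanted` items are collected or DynamoDB reports
// that no data is left (the response has no LastEvaluatedKey).
// ASSUMPTION: `runQuery` is a caller-supplied async function that takes
// Query parameters and resolves to the Query response.
async function loadMore(runQuery, baseParams, wanted, startKey) {
  const items = [];
  let nextKey = startKey;

  do {
    // Ask only for what is still missing, so the result never exceeds
    // `wanted` and `nextKey` always points right after the last item read.
    const params = { ...baseParams, Limit: wanted - items.length };
    if (nextKey) {
      params.ExclusiveStartKey = nextKey;
    }

    const response = await runQuery(params);
    items.push(...(response.Items || []));
    nextKey = response.LastEvaluatedKey;
  } while (nextKey && items.length < wanted);

  // `done` tells the UI when to show "No more records available".
  return { items, nextKey, done: !nextKey };
}
```

Shrinking Limit on each round keeps the cursor exact but can cause many small requests when a filter discards most items; a fixed, larger Limit is a reasonable alternative if returning a few extra items is acceptable.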
When to use Pagination in DynamoDB
First of all, it’s important to evaluate whether we need to set up pagination in the first place.
For example, if we have an application feature to update the status of daily invoices, it might make sense to retrieve all the invoices for a given day and work on them directly. On the other hand, an account reconciliation needs the entire dataset, so you can optimize by using maximum page sizes to retrieve all the records for a given month or year.
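For the case where a process needs the entire dataset, a sketch like the following could walk through every page. It assumes the same kind of runQuery helper (an async function from Query parameters to the Query response); queryAllPages is a hypothetical name. No Limit is set, so each page can be as large as DynamoDB allows for one call.

```javascript
// Yield every page of a query, one array of items at a time.
// ASSUMPTION: `runQuery` is a caller-supplied async function that takes
// Query parameters and resolves to the Query response.
async function* queryAllPages(runQuery, baseParams) {
  let startKey;

  do {
    const params = startKey
      ? { ...baseParams, ExclusiveStartKey: startKey }
      : { ...baseParams };

    const response = await runQuery(params);
    yield response.Items || [];

    // Continue only while DynamoDB reports more data.
    startKey = response.LastEvaluatedKey;
  } while (startKey);
}
```

A batch job could consume it with for await (const page of queryAllPages(runQuery, params)) and process each page before requesting the next, which keeps memory use bounded.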
Generally, if there is a long list of items in an application, it’s always preferred to implement pagination.
So, start by thinking from the application's perspective:
- How will the user or the process use the data retrieved?
- What is the data set used by the user for a given session?
- How should we use the Read Capacity Units as efficiently as possible to reduce the costs?
Besides, knowing how Scan differs from Query will be really helpful when you implement pagination.
Pagination with Filter
Adding a filter to a paginated query reduces the data returned to your application. However, DynamoDB applies the filter after it reads the page, so the query still consumes Read Capacity Units for all the data read for that page, including the items the filter discards.
Querying → Page Limiting (with ExclusiveStartKey + Limit) → Filtering (Boolean true or false)
Let’s look at a DynamoDB Query example with filters for better understanding in Node.js:
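A minimal sketch of such a query follows. The table name (Invoices), the key and attribute names (customerId, status), and the function name buildFilteredQuery are all hypothetical, chosen only for illustration. The function just builds the parameters; the commented lines show how they might be sent with the AWS SDK for JavaScript v3.

```javascript
// Build Query parameters for one page of a filtered query.
// ASSUMPTION: table, key, and attribute names are hypothetical.
function buildFilteredQuery(customerId, status, pageSize, startKey) {
  const params = {
    TableName: "Invoices",
    // Step 1: the key condition selects the items to read.
    KeyConditionExpression: "customerId = :cid",
    // Step 3: the filter is applied after the page has been read.
    // "status" is a DynamoDB reserved word, hence the #st placeholder.
    FilterExpression: "#st = :status",
    ExpressionAttributeNames: { "#st": "status" },
    ExpressionAttributeValues: { ":cid": customerId, ":status": status },
    // Step 2: Limit caps the items evaluated, not the items returned.
    Limit: pageSize,
  };

  if (startKey) {
    params.ExclusiveStartKey = startKey;
  }
  return params;
}

// Possible usage with AWS SDK v3 (requires the SDK packages and credentials):
// const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");
// const { DynamoDBDocumentClient, QueryCommand } = require("@aws-sdk/lib-dynamodb");
// const docClient = DynamoDBDocumentClient.from(new DynamoDBClient({}));
// const { Items, Count, ScannedCount, LastEvaluatedKey } =
//   await docClient.send(new QueryCommand(buildFilteredQuery("c-1", "PAID", 10)));
```

In the response, ScannedCount is the number of items evaluated before the filter and Count is the number that passed it, so comparing the two shows how much of the read the filter discarded.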
FAQs
Does DynamoDB support pagination?
Yes. It supports pagination, but not in the traditional sense, where you get the total number of records to derive the page count. You have to go for continuous querying of pages using the LastEvaluatedKey, ExclusiveStartKey, and Limit parameters.
What is the cursor in DynamoDB?
In DynamoDB, a cursor is a pointer to the last evaluated item in a query result. It uses the Primary Key of the record returned as LastEvaluatedKey, allowing subsequent queries to continue from that position. Moving backward is not built in: the application has to keep the keys of earlier pages or query in reverse order with ScanIndexForward set to false.
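Applications often hand this cursor to the client as an opaque string. The sketch below shows one possible encoding; it is an assumption for illustration, not something DynamoDB prescribes. encodeCursor and decodeCursor are hypothetical names, and the approach assumes the key holds plain JSON values (strings and numbers, as returned by the Document Client), not binary attributes.

```javascript
// Turn a LastEvaluatedKey into an opaque, URL-safe string.
// ASSUMPTION: the key contains only JSON-serializable values.
function encodeCursor(lastEvaluatedKey) {
  if (!lastEvaluatedKey) {
    return null; // no key means there is no next page
  }
  return Buffer.from(JSON.stringify(lastEvaluatedKey), "utf8").toString("base64url");
}

// Turn the string back into a value usable as ExclusiveStartKey.
function decodeCursor(cursor) {
  if (!cursor) {
    return undefined; // first page: send no ExclusiveStartKey
  }
  return JSON.parse(Buffer.from(cursor, "base64url").toString("utf8"));
}
```

Because the client can tamper with the string, a real service would likely validate or sign the decoded key before using it.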
What is LastEvaluatedKey in DynamoDB?
LastEvaluatedKey is the Primary Key of the record where a Query or Scan operation stopped. It identifies the last item evaluated for the returned result set.
You can pass this value as the ExclusiveStartKey to retrieve the next page of records.
How can I find page numbers when using DynamoDB pagination?
There is no direct way of identifying the page number. You have to keep track of the total number of items and calculate the pages manually. Still, keep in mind that the number of items on a page can vary on the last page and whenever you have filter parameters defined. Therefore, continuous pagination (from the application's perspective, infinite scrolling that loads the next page) is recommended.
How can I get the total count when using DynamoDB pagination?
There is no native query that returns the total count up front. You have to keep track of the number of items manually. For example, you can keep a counter in a DynamoDB table itself that holds the number of items inserted into a table that match a given query.
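A counter of this kind could be maintained with an atomic ADD update. The sketch below only builds the update parameters; the table name (Counters), key attribute (counterName), counter attribute (itemCount), and function name buildCounterUpdate are hypothetical.

```javascript
// Build Update parameters that atomically add `delta` to a counter item.
// ASSUMPTION: table and attribute names are hypothetical.
function buildCounterUpdate(counterName, delta) {
  return {
    TableName: "Counters",
    Key: { counterName },
    // ADD creates the attribute if it does not exist yet, then adds delta.
    UpdateExpression: "ADD itemCount :delta",
    ExpressionAttributeValues: { ":delta": delta },
    // Return the new counter value with the response.
    ReturnValues: "UPDATED_NEW",
  };
}

// Possible usage with the AWS SDK v3 Document Client:
// await docClient.send(new UpdateCommand(buildCounterUpdate("invoices#2021-11", 1)));
```

Pass 1 when an item is inserted and -1 when it is deleted. Keeping the counter exactly in step with the item writes would need both operations in one transaction, which this sketch does not cover.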
How does DynamoDB handle large datasets?
DynamoDB handles large datasets by distributing data across multiple partitions. Each partition can store up to 10 GB of data and handle up to 3,000 read capacity units and 1,000 write capacity units. When a table's size or throughput exceeds these limits, DynamoDB automatically adds more partitions to handle the load. This partitioning mechanism ensures that the database can scale horizontally and manage large volumes of data efficiently.
Handling Pagination in DynamoDB Streams
DynamoDB Streams provide a time-ordered sequence of item-level changes in a table. When dealing with large datasets, you might need to paginate through these streams. The ShardIterator is used to retrieve stream records in batches. By using the GetShardIterator and GetRecords API calls, you can paginate through the stream records efficiently. This is particularly useful for applications that need to process real-time data changes without overwhelming the system with large data loads at once.
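The batch loop can be sketched as follows. Each GetRecords response carries a NextShardIterator, which plays the same role for streams that LastEvaluatedKey plays for queries. This is a sketch under assumptions: streams is an assumed wrapper object whose getShardIterator and getRecords methods take the API parameters and resolve to the API responses, and readShard and maxBatches are hypothetical names.

```javascript
// Read up to `maxBatches` batches of records from one stream shard.
// ASSUMPTION: `streams` is a caller-supplied wrapper exposing async
// getShardIterator(params) and getRecords(params) methods.
async function readShard(streams, streamArn, shardId, maxBatches) {
  const first = await streams.getShardIterator({
    StreamArn: streamArn,
    ShardId: shardId,
    // Start from the oldest record still available in the shard.
    ShardIteratorType: "TRIM_HORIZON",
  });

  let iterator = first.ShardIterator;
  const records = [];

  // An open shard keeps returning iterators, so the loop is bounded.
  for (let i = 0; i < maxBatches && iterator; i++) {
    const batch = await streams.getRecords({ ShardIterator: iterator, Limit: 100 });
    records.push(...(batch.Records || []));
    // Absent once a closed shard has been read to its end.
    iterator = batch.NextShardIterator;
  }

  return records;
}
```

With the AWS SDK v3, the wrapper methods could delegate to GetShardIteratorCommand and GetRecordsCommand on a DynamoDBStreamsClient. A long-running consumer would process each batch as it arrives instead of collecting everything in memory.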