Commit 3bf2bf6

committed
Merge remote-tracking branch 'github-api/add-pagination-information' into add-guide-on-pagination
2 parents c6b7666 + 927f3d9 commit 3bf2bf6

File tree

4 files changed

+252
-0
lines changed

Lines changed: 248 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,248 @@
1+
---
2+
title: Traversing with Pagination | GitHub API
3+
---
4+
5+
# Traversing with Pagination
6+
7+
* TOC
8+
{:toc}
9+
10+
The GitHub API provides a vast wealth of information for developers to consume.
11+
Most of the time, you might even find that you're asking for _too much_ information,
12+
and in order to keep our servers happy, the API will automatically [paginate the requested items][pagination].
13+
14+
In this guide, we'll make some calls to the GitHub Search API, and iterate over
15+
the results using pagination. You can find the complete source code for this project
16+
in the [platform-samples][platform samples] repository.
17+
18+
## Basics of Pagination
19+
20+
To start with, it's important to know a few facts about receiving paginated items:
21+
22+
1. Different API calls respond with different defaults. For example, a call to
23+
[list GitHub's public repositories](http://developer.github.com/v3/repos/#list-all-public-repositories)
24+
provides paginated items in sets of 30, whereas a call to the GitHub Search API
25+
provides items in sets of 100.
26+
2. You can specify how many items to receive (up to a maximum of 100); but,
27+
3. For technical reasons, not every endpoint behaves the same. For example,
28+
[events](http://developer.github.com/v3/activity/events/) won't let you set a maximum for items to receive.
29+
Be sure to read the documentation on how to handle paginated results for specific endpoints.
30+
31+
Information about pagination is provided in [the Link header](http://tools.ietf.org/html/rfc5988)
32+
of an API call. For example, let's make a curl request to the search API, to find
33+
out how many times Mozilla projects use the phrase `addClass`:
34+
35+
curl -I "https://api.github.com/search/code?q=addClass+user:mozilla"
36+
37+
The `-I` parameter indicates that we only care about the headers, not the actual
38+
content. In examining the result, you'll notice some information in the Link header
39+
that looks like this:
40+
41+
Link: <https://api.github.com/search/code?q=addClass+user%3Amozilla&page=2>; rel="next",
42+
<https://api.github.com/search/code?q=addClass+user%3Amozilla&page=34>; rel="last"
43+
44+
Let's break that down. `rel="next"` says that the next page is `page=2`. This makes
45+
sense, since by default, all paginated queries start at page `1`. `rel="last"`
46+
provides some more information, stating that the last page of results is on page `34`.
47+
Thus, we have 33 more pages of information about `addClass` that we can consume.
48+
Nice!
49+
50+
Keep in mind that you should **always** rely on these link relations provided
51+
to you. Don't try to guess or construct your own URL. Some API calls, like [listing
52+
commits on a repository][listing commits], use pagination results that are based
53+
on SHA values, not numbers.
54+
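To make "rely on the link relations" concrete, here is a minimal sketch of turning a Link header value into a lookup table of relation names and URLs, so a script can follow `next` or `last` without building URLs by hand. The helper name `parse_link_header` is made up for this illustration, and the header string is copied from the example above rather than fetched live.

```ruby
# Parse a Link header value into a hash of rel => URL.
# `parse_link_header` is a hypothetical helper, not part of any library.
def parse_link_header(header)
  # Each entry looks like: <URL>; rel="name"
  pairs = header.scan(/<([^>]+)>;\s*rel="([^"]+)"/)
  pairs.each_with_object({}) do |(url, rel), links|
    links[rel] = url
  end
end

# Header value copied from the curl example above.
header = '<https://api.github.com/search/code?q=addClass+user%3Amozilla&page=2>; rel="next", ' \
         '<https://api.github.com/search/code?q=addClass+user%3Amozilla&page=34>; rel="last"'

links = parse_link_header(header)
puts links["next"]
puts links["last"]
```

A relation that is absent from the header (for example `prev` on the first page) simply comes back as `nil` from the hash.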
55+
### Navigating through the pages
56+
57+
Now that you know how many pages there are to receive, you can start navigating
58+
through the pages to consume the results. You do this by passing in a `page`
59+
parameter. By default, `page` always starts at `1`. Let's jump ahead to page 14
60+
and see what happens:
61+
62+
curl -I "https://api.github.com/search/code?q=addClass+user:mozilla&page=14"
63+
64+
Here's the link header once more:
65+
66+
Link: <https://api.github.com/search/code?q=addClass+user%3Amozilla&page=15>; rel="next",
67+
<https://api.github.com/search/code?q=addClass+user%3Amozilla&page=34>; rel="last",
68+
<https://api.github.com/search/code?q=addClass+user%3Amozilla&page=1>; rel="first",
69+
<https://api.github.com/search/code?q=addClass+user%3Amozilla&page=13>; rel="prev"
70+
71+
As expected, `rel="next"` is at 15, and `rel="last"` is still 34. But now we've
72+
got some more information: `rel="first"` indicates the URL for the _first_ page,
73+
and more importantly, `rel="prev"` lets you know the page number of the previous
74+
page. Using this information, you could construct some UI that lets users jump
75+
between the first, previous, next, or last list of results in an API call.
76+
77+
### Changing the number of items received
78+
79+
By passing the `per_page` parameter, you can specify how many items you want
80+
each page to return, up to 100 items. Let's try asking for 50 items about `addClass`:
81+
82+
curl -I "https://api.github.com/search/code?q=addClass+user:mozilla&per_page=50"
83+
84+
Notice what it does to the header response:
85+
86+
Link: <https://api.github.com/search/code?q=addClass+user%3Amozilla&per_page=50&page=2>; rel="next",
87+
<https://api.github.com/search/code?q=addClass+user%3Amozilla&per_page=50&page=20>; rel="last"
88+
89+
As you might have guessed, the `rel="last"` information says that the last page
90+
is now 20. This is because each page now holds more results, so fewer pages
91+
are needed to cover them.
92+
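The relationship between `per_page` and the page count is a rounded-up division. The sketch below shows the arithmetic only; `page_count` is a hypothetical helper, and the total of 1,000 items is an assumed figure chosen for illustration, not a value returned by the API.

```ruby
# Pages needed to hold total_items when each page carries per_page items.
# `page_count` is a hypothetical helper; it uses integer ceiling division.
def page_count(total_items, per_page)
  (total_items + per_page - 1) / per_page
end

total_items = 1_000 # assumed total, for illustration only
[30, 50, 100].each do |per_page|
  puts "per_page=#{per_page}: #{page_count(total_items, per_page)} pages"
end
```

In a real script you would still read the page count from `rel="last"` rather than computing it, since the server decides how results are split.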
93+
## Consuming the information
94+
95+
You don't want to be making low-level curl calls just to be able to work with
96+
pagination, so let's write a little Ruby script that does everything we've
97+
just described above.
98+
99+
As always, first we'll require [GitHub's Octokit.rb][octokit.rb] Ruby library, and
100+
pass in our [personal access token][personal token]:
101+
102+
#!ruby
103+
require 'octokit'
104+
105+
# !!! DO NOT EVER USE HARD-CODED VALUES IN A REAL APP !!!
106+
# Instead, set and test environment variables, like below
107+
client = Octokit::Client.new :access_token => ENV['MY_PERSONAL_TOKEN']
108+
109+
Next, we'll execute the search, using Octokit's `search_code` method. Unlike
110+
using `curl`, we can also immediately retrieve the number of results, so let's
111+
do that:
112+
113+
#!ruby
114+
results = client.search_code('addClass user:mozilla')
115+
total_count = results.total_count
116+
117+
Now, let's grab the number of the last page, similar to the `page=34>; rel="last"`
118+
information in the link header. Octokit.rb supports pagination information through
119+
an implementation called "[Hypermedia link relations][hypermedia-relations]."
120+
We won't go into detail about what that is, but, suffice it to say, each element
121+
in the `results` variable has a hash called `rels`, which can contain information
122+
about `:next`, `:last`, `:first`, and `:prev`, depending on which result you're
123+
on. These relations also contain information about the resulting URL, by calling
124+
`rels[:last].href`.
125+
126+
Knowing this, let's grab the page number of the last result, and present all
127+
this information to the user:
128+
129+
#!ruby
130+
last_response = client.last_response
131+
number_of_pages = last_response.rels[:last].href.match(/page=(\d+)$/)[1]
132+
133+
puts "There are #{total_count} results, on #{number_of_pages} pages!"
134+
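The regular expression above assumes that `page` is the final query parameter in the URL. As an alternative sketch, the page number can be read by parsing the query string with Ruby's standard `uri` library, which works wherever `page` appears. The helper name `page_number_from` is invented for this example, and the URLs are sample values modeled on the headers shown earlier.

```ruby
require 'uri'

# Read the `page` query parameter from a link relation URL.
# `page_number_from` is a hypothetical helper name.
def page_number_from(href)
  query = URI.parse(href).query
  return nil if query.nil?

  # decode_www_form returns [[key, value], ...]; turn it into a hash.
  params = URI.decode_www_form(query).to_h
  params["page"] ? params["page"].to_i : nil
end

href = "https://api.github.com/search/code?q=addClass+user%3Amozilla&per_page=50&page=20"
puts page_number_from(href)
```

With Octokit, the input would be `client.last_response.rels[:last].href`, as in the snippet above.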
135+
Finally, let's iterate through the results. You could do this with a loop `for i in 1..number_of_pages.to_i`,
136+
but instead, let's follow the `rels[:next]` headers to retrieve information from
137+
each page. For the sake of simplicity, let's just grab the file path of the first
138+
result from each page. To do this, we'll need a loop; and at the end of every loop,
139+
we'll retrieve the data set for the next page by following the `rels[:next]` information.
140+
The loop will finish when there is no `rels[:next]` information to consume (in other
141+
words, we are at `rels[:last]`). It might look something like this:
142+
143+
#!ruby
144+
loop do
145+
puts last_response.data.items.first.path
146+
break if last_response.rels[:next].nil? # stop once the last page has been printed
147+
sleep 4 # back off from the API rate limiting; don't do this in Real Life
148+
last_response = last_response.rels[:next].get
149+
end
150+
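To see the follow-the-next-link pattern without calling the API, the sketch below swaps the Octokit responses for small in-memory stand-ins. `FakeRel` and `FakePage` are invented for this illustration and only mimic the `rels[:next].get` shape used above; they are not Octokit classes. The loop handles the current page first and checks for a next link before following it, so the final page is processed too.

```ruby
# In-memory stand-ins for paginated responses (hypothetical, not Octokit classes).
FakeRel = Struct.new(:target) do
  # Following a relation returns the page it points to.
  def get
    target
  end
end

FakePage = Struct.new(:first_path, :next_page) do
  # Mimic `rels`: include :next only when a following page exists.
  def rels
    next_page ? { :next => FakeRel.new(next_page) } : {}
  end
end

page3 = FakePage.new("c.js", nil)
page2 = FakePage.new("b.js", page3)
page1 = FakePage.new("a.js", page2)

seen = []
response = page1
loop do
  seen << response.first_path          # handle the current page first
  break if response.rels[:next].nil?   # stop once there is no next link
  response = response.rels[:next].get  # otherwise follow the link
end

puts seen.inspect
```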
151+
Changing the number of items per page is extremely simple with Octokit.rb. Simply
152+
pass a `:per_page` option to the `search_code` call. After that,
153+
your code should remain intact:
154+
155+
#!ruby
156+
require 'octokit'
157+
158+
# !!! DO NOT EVER USE HARD-CODED VALUES IN A REAL APP !!!
159+
# Instead, set and test environment variables, like below
160+
client = Octokit::Client.new :access_token => ENV['MY_PERSONAL_TOKEN']
161+
162+
results = client.search_code('addClass user:mozilla', :per_page => 100)
163+
total_count = results.total_count
164+
165+
last_response = client.last_response
166+
number_of_pages = last_response.rels[:last].href.match(/page=(\d+)$/)[1]
167+
168+
puts last_response.rels[:last].href
169+
puts "There are #{total_count} results, on #{number_of_pages} pages!"
170+
171+
puts "And here's the first path for every set"
172+
173+
loop do
174+
puts last_response.data.items.first.path
174+
break if last_response.rels[:next].nil? # stop once the last page has been printed
175+
sleep 4 # back off from the API rate limiting; don't do this in Real Life
176+
last_response = last_response.rels[:next].get
178+
end
179+
180+
## Constructing Pagination Links
181+
182+
Normally, with pagination, your goal isn't to concatenate all of the possible
183+
results, but rather, to produce a set of navigation, like this:
184+
185+
![Sample of pagination links](/images/pagination_sample.png)
186+
187+
Let's sketch out a micro-version of what that might entail.
188+
189+
From the code above, we already know we can get the `number_of_pages` in the
190+
paginated results from the first call:
191+
192+
#!ruby
193+
require 'octokit'
194+
195+
# !!! DO NOT EVER USE HARD-CODED VALUES IN A REAL APP !!!
196+
# Instead, set and test environment variables, like below
197+
client = Octokit::Client.new :access_token => ENV['MY_PERSONAL_TOKEN']
198+
199+
results = client.search_code('addClass user:mozilla')
200+
total_count = results.total_count
201+
202+
last_response = client.last_response
203+
number_of_pages = last_response.rels[:last].href.match(/page=(\d+)$/)[1]
204+
205+
puts last_response.rels[:last].href
206+
puts "There are #{total_count} results, on #{number_of_pages} pages!"
207+
208+
209+
From there, we can construct a beautiful ASCII representation of the number boxes:
210+
211+
#!ruby
212+
numbers = ""
213+
for i in 1..number_of_pages.to_i
214+
numbers << "[#{i}] "
215+
end
216+
puts numbers
217+
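As a small variation on the number boxes, the sketch below builds the same row with `map` and `join` and optionally marks the page the user is on. The helper name `page_boxes` and the asterisk marker are choices made for this example only.

```ruby
# Build the row of page boxes, marking the current page with asterisks.
# `page_boxes` is a hypothetical helper name.
def page_boxes(number_of_pages, current_page = nil)
  boxes = (1..number_of_pages).map do |i|
    i == current_page ? "[*#{i}*]" : "[#{i}]"
  end
  boxes.join(" ")
end

puts page_boxes(5, 3)
```

Called as `page_boxes(number_of_pages.to_i)`, it produces the unmarked row of boxes for the page count retrieved earlier.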
218+
Let's simulate a user clicking on one of these boxes, by constructing a random
219+
number:
220+
221+
#!ruby
222+
random_page = Random.new
223+
random_page = random_page.rand(1..number_of_pages.to_i)
224+
225+
puts "A User appeared, and clicked number #{random_page}!"
226+
227+
Now that we have a page number, we can use Octokit to explicitly retrieve that
228+
individual page, by passing the `:page` option:
229+
230+
#!ruby
231+
clicked_results = client.search_code('addClass user:mozilla', :page => random_page)
232+
233+
If we wanted to get fancy, we could also grab the previous and next pages, in
234+
order to generate links for back (`<<`) and forward (`>>`) elements:
235+
236+
#!ruby
237+
prev_page_href = client.last_response.rels[:prev] ? client.last_response.rels[:prev].href : "(none)"
238+
next_page_href = client.last_response.rels[:next] ? client.last_response.rels[:next].href : "(none)"
239+
240+
puts "The prev page link is #{prev_page_href}"
241+
puts "The next page link is #{next_page_href}"
242+
243+
[pagination]: /v3/#pagination
244+
[platform samples]: https://github.com/github/platform-samples/tree/master/api/ruby/traversing-with-pagination
245+
[octokit.rb]: https://github.com/octokit/octokit.rb
246+
[personal token]: https://help.github.com/articles/creating-an-access-token-for-command-line-use
247+
[hypermedia-relations]: https://github.com/octokit/octokit.rb#pagination
248+
[listing commits]: http://developer.github.com/v3/repos/commits/#list-commits-on-a-repository

content/v3.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -295,6 +295,8 @@ $ curl https://api.github.com/user/repos?page=2&per_page=100
295295
Note that page numbering is 1-based and that omitting the `?page`
296296
parameter will return the first page.
297297

298+
For more information on pagination, check out our guide on [Traversing with Pagination][pagination-guide].
299+
298300
### Link Header
299301

300302
The pagination info is included in [the Link
@@ -555,3 +557,4 @@ A link that looks like this:
555557
["url2", {:rel => "foo", :bar => "baz"}]] %>
556558

557559
[support]: https://github.com/contact?form[subject]=APIv3
560+
[pagination-guide]: /guides/traversing-with-pagination

layouts/guides.html

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,7 @@ <h2><a href="/v3/">API</a></h2>
2626
<li><h3><a href="/guides/basics-of-authentication/">Basics of Authentication</a></h3></li>
2727
<li><h3><a href="/guides/rendering-data-as-graphs/">Rendering Data as Graphs</a></h3></li>
2828
<li><h3><a href="/guides/working-with-comments/">Working with Comments</a></h3></li>
29+
<li><h3><a href="/guides/traversing-with-pagination/">Traversing with Pagination</a></h3></li>
2930
</ul>
3031
</div>
3132

6.04 KB