---
title: "rbioapi: User-Friendly R Interface to Biologic Web Services' API"
author: "Moosa Rezwani"
description: >
  Connect to Enrichr, JASPAR, miEAA, PANTHER, Reactome, STRING, and UniProt in R with the rbioapi package.
date: "`r Sys.Date()`"
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE, setup, echo=FALSE, results="hide"}
knitr::opts_chunk$set(echo = TRUE,
eval = TRUE,
message = FALSE,
warning = FALSE,
collapse = TRUE,
tidy = FALSE,
cache = FALSE,
dev = "png",
fig.path = "man/figures/README-",
out.width = "100%",
comment = "#>")
```
# <img src="man/figures/logo.svg" align="right" width="200"/>
<!-- badges: start -->
[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/rbioapi)](https://cran.r-project.org/package=rbioapi) [![R-CMD-check](https://github.com/moosa-r/rbioapi/workflows/R-CMD-check/badge.svg)](https://github.com/moosa-r/rbioapi/actions)
<!-- badges: end -->
# What does rbioapi do?
Currently fully supports **Enrichr**, **JASPAR**, **miEAA**, **PANTHER**, **Reactome**, **STRING**, and **UniProt**!
The goal of rbioapi is to provide a user-friendly and consistent interface to biological databases and services, one that insulates the user from the technicalities of web service APIs and offers a unified, easy-to-use interface to biological and medical web services.
With rbioapi, you do not need technical knowledge about web service APIs, nor do you need to learn a new package for every biological service or database. This is an ongoing project; new databases and services will be added periodically. Feel free to [suggest](https://github.com/moosa-r/rbioapi/issues "Issue section in rbioapi GitHub repository") any databases or services you often use.
# What is Supported by rbioapi?
rbioapi is dedicated to **Biological or Medical** databases and web services. Currently, rbioapi supports and covers every API resource of the following services (in alphabetical order):
In the CRAN (stable) version (<https://cran.r-project.org/package=rbioapi>):
1. [Enrichr](https://maayanlab.cloud/Enrichr/ "Enrichr") ([rbioapi vignette article](https://rbioapi.moosa-r.com/articles/rbioapi_enrichr.html "rbioapi & Enrichr vignette article")) ^(new)^
2. [JASPAR](https://jaspar.elixir.no/ "JASPAR - A database of transcription factor binding profiles") ([rbioapi vignette article](https://rbioapi.moosa-r.com/articles/rbioapi_jaspar.html "rbioapi & JASPAR vignette article")) ^(new)^
3. [miEAA](https://ccb-compute2.cs.uni-saarland.de/mieaa2 "miRNA Enrichment Analysis and Annotation Tool (miEAA)") ([rbioapi vignette article](https://rbioapi.moosa-r.com/articles/rbioapi_mieaa.html "rbioapi & miEAA vignette article"))
4. [PANTHER](https://www.pantherdb.org "Protein Analysis THrough Evolutionary Relationships (PANTHER)") ([rbioapi vignette article](https://rbioapi.moosa-r.com/articles/rbioapi_panther.html "rbioapi & PANTHER vignette article"))
5. [Reactome](https://reactome.org/) ([rbioapi vignette article](https://rbioapi.moosa-r.com/articles/rbioapi_reactome.html "rbioapi & Reactome vignette article"))
6. [STRING](https://string-db.org/ "STRING: Protein-Protein Interaction Networks Functional Enrichment Analysis") ([rbioapi vignette article](https://rbioapi.moosa-r.com/articles/rbioapi_string.html "rbioapi & STRING vignette article"))
7. [UniProt](https://www.uniprot.org "Universal Protein Resource (UniProt)") ([rbioapi vignette article](https://rbioapi.moosa-r.com/articles/rbioapi_uniprot.html "rbioapi & UniProt vignette article"))
Only in the GitHub (development) version (<https://github.com/moosa-r/rbioapi/>):
1. currently none
Each service has its own dedicated vignette article. In this article, however, I will cover the general framework of rbioapi. Make sure to check the vignette article of each service to learn more about how to use it.
**Note that:** rbioapi is an ongoing project. New databases and services will be implemented periodically to gradually make the package as comprehensive as possible. Do you often find yourself using a certain database or service? Feel free to suggest it by creating an issue on our GitHub [repository](https://github.com/moosa-r/ "rbioapi GitHub repository"). I will appreciate any suggestions.
# How to install?
You can install the stable release version of [rbioapi](rbioapi.html "rbioapi: User-Friendly R Interface to Biologic Web Services' API") from [CRAN](https://cran.r-project.org/package=rbioapi "rbioapi page on CRAN (The Comprehensive R Archive Network)") with:
```{r install_cran, eval=FALSE}
install.packages("rbioapi")
```
However, the CRAN version is released at most once every 1-2 months. You can install the most recent (development) version from [GitHub](https://github.com/moosa-r/rbioapi/ "rbioapi repository on GitHub") with:
```{r install_github, eval=FALSE}
install.packages("remotes")
remotes::install_github("moosa-r/rbioapi")
```
Now, we can load the package:
```{r load_rbioapi, echo=TRUE}
library(rbioapi)
```
```{r prevent_vignette_errors, message=FALSE, warning=FALSE, include=FALSE}
rba_options(timeout = 30, skip_error = TRUE)
```
# Naming conventions
To keep the namespace organized, functions are named with the following pattern:
`rba_[service_name]_[resource_name]`
For example, `rba_string_version()` will call [STRING](https://string-db.org/ "STRING: Protein-Protein Interaction Networks Functional Enrichment Analysis")'s version resource.
```{r naming_example, echo=TRUE, message=TRUE}
rba_string_version()
```
Thus, as of this version, every rbioapi function follows one of these naming schemes:
1. rba_enrichr\_\*
2. rba_jaspar\_\*
3. rba_mieaa\_\*
4. rba_panther\_\*
5. rba_reactome\_\*
6. rba_string\_\*
7. rba_uniprot\_\*
There are three exceptions: `rba_options()`, `rba_connection_test()`, and `rba_pages()`; these are helper functions. More on that later.
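Because the pattern is regular, function names can be grouped by service with a few lines of base R. The following is only a minimal sketch and not part of rbioapi: `rba_service_of()` is a hypothetical helper written for this illustration. It returns `NA` for names that do not follow the pattern, such as the helper functions.

```{r sketch_naming, eval=FALSE}
## Services covered in this version (copied from the list above).
services <- c("enrichr", "jaspar", "mieaa", "panther",
              "reactome", "string", "uniprot")

## Hypothetical helper (not an rbioapi function): extract the service name
## from function names that follow rba_[service_name]_[resource_name].
rba_service_of <- function(fn_names) {
  pattern <- paste0("^rba_(", paste(services, collapse = "|"), ")_.+$")
  ifelse(grepl(pattern, fn_names),
         sub(pattern, "\\1", fn_names),
         NA_character_)
}

## "string" for the first name, NA for the two helper functions.
rba_service_of(c("rba_string_version", "rba_options", "rba_pages"))

## If rbioapi is installed, count its exported names per service.
if (requireNamespace("rbioapi", quietly = TRUE)) {
  print(table(rba_service_of(getNamespaceExports("rbioapi")), useNA = "ifany"))
}
```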
# Changing the options
To provide more control, multiple options have been implemented. See the manual of the `rba_options()` function for a full description of the available options. In short, some options govern rbioapi's connection with servers (e.g. timeout, retry) and others modify your experience with rbioapi (e.g. verbose, diagnostics, save_file). There are two ways to change an option, both described below. You can also get a table of the available rbioapi options and their current values by calling `rba_options()` without any argument:
```{r rba_options, echo=TRUE, message=TRUE}
rba_options()
```
Now, let us consider the ways in which we can alter the settings:
## Change the option globally
Changing an option globally means that, for the rest of your R session, every rbioapi function will respect the changed option. To do this, use `rba_options()`. Each argument of this function corresponds to an option; thus, by calling it with your desired new values, you can alter that rbioapi option globally. For example:
```{r rba_options_global, eval=FALSE}
rba_options(save_file = TRUE)
## From now on, the raw file of server's response will be saved to your working directory.
rba_options(verbose = FALSE)
## From now on, the package will be quiet.
```
## Change the option only within a function call
You can pass additional arguments to any rbioapi function using the "ellipsis" (the familiar `...`, or dot dot dot!). This means that you can call any function with additional arguments, each being an 'option = value' pair. This way, any change in options is confined to that particular function call. For example:
```{r rba_options_ellipsis1, eval=FALSE}
## Save the server's raw response file:
x <- rba_reactome_species(only_main = TRUE, save_file = "reactome_species.json")
## Also, in the case of connection failure, retry up to 10 times:
x <- rba_reactome_species(only_main = TRUE,
save_file = "reactome_species.json", retry_max = 10)
```
```{r rba_options_ellipsis2, eval=FALSE, echo=TRUE, message=TRUE}
## Run this code in your own R session to see the difference.
## Show the (rather boring) internal diagnostics details:
x <- rba_uniprot_proteins_crossref(db_id = "CD40", db_name = "HGNC", diagnostics = TRUE)
## The next function you call, will still use the default rbioapi options
x <- rba_uniprot_proteins_crossref(db_id = "CD40", db_name = "HGNC")
```
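To build intuition for the difference between the two approaches, here is a toy sketch in plain R. It is not rbioapi's actual implementation; `toy_defaults`, `toy_set_options()`, and `toy_call()` are hypothetical names made up for this illustration. The point is only that a global change persists, whereas values passed through `...` apply to a single call.

```{r sketch_option_scope, eval=FALSE}
## Toy session-wide option store (hypothetical; not how rbioapi stores options).
toy_defaults <- new.env()
toy_defaults$opts <- list(verbose = TRUE, retry_max = 0)

## Global change: overwrite the stored values for the rest of the session.
toy_set_options <- function(...) {
  toy_defaults$opts <- modifyList(toy_defaults$opts, list(...))
  invisible(toy_defaults$opts)
}

## Per-call change: merge `...` over the stored values without saving them.
toy_call <- function(...) {
  modifyList(toy_defaults$opts, list(...))
}

toy_call(retry_max = 10)$retry_max  # overridden for this call only
toy_call()$retry_max                # the stored value is unchanged
toy_set_options(retry_max = 3)
toy_call()$retry_max                # now changed for every later call
```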
# Connection test
The second exception in functions' naming schema is `rba_connection_test()`. Run this simple function to check your connection with the supported services/databases. If you encounter errors when using rbioapi, kindly run this function to make sure that your internet connection or the servers are fine.
```{r rba_connection_test, message=TRUE}
rba_connection_test(print_output = TRUE)
```
# Iterating over paginated results
Some API resources return paginated responses. This is particularly common for API resources whose responses can be very large. For these cases, rbioapi functions have arguments such as "page_number" (with a default value of 1) and, if the API resource allows it, "page_size". To save time, you may use `rba_pages()`. This function iterates over the pages you have specified.
Take `rba_uniprot_taxonomy_name()` as an example. This function allows you to search taxonomic nodes in [UniProt](https://www.uniprot.org "Universal Protein Resource (UniProt)"). The response can be very large, so [UniProt](https://www.uniprot.org "Universal Protein Resource (UniProt)") returns it in pages. For example, if we search for nodes that contain "adenovirus", there is a large number of hits:
```{r rba_pages1, echo=TRUE}
adeno <- rba_uniprot_taxonomy_name(name = "adenovirus",
search_type = "contain",
page_number = 1)
str(adeno, max.level = 2)
```
As you can see, the server has returned the first page of the response. To retrieve the other pages, you can make separate calls and change the "page_number" argument in each call, or simply use `rba_pages()` as demonstrated below:
```{r rba_pages2, echo=TRUE}
adeno_pages <- rba_pages(quote(rba_uniprot_taxonomy_name(name = "adenovirus",
                                                          search_type = "contain",
                                                          page_number = "pages:1:3")))
## You can inspect the structure of the response:
str(adeno_pages, max.level = 2)
```
As you can see, what we did was:
1. Wrap the function call in `quote()` and supply that as the input to `rba_pages()`.
2. Replace the value of the argument we want to iterate over with a string in this format: "pages:start:end". For example, we supplied page_number = "pages:1:3" to get the responses of pages 1 to 3.
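For comparison, the manual alternative mentioned earlier (one call per page) could be sketched as follows. This is only an illustration under stated assumptions: `collect_pages()` and `fake_fetch()` are hypothetical helpers, not rbioapi functions, and the `page_` element names are an arbitrary choice. The stand-in `fake_fetch()` lets the sketch run without a network connection.

```{r sketch_manual_pages, eval=FALSE}
## Hypothetical helper: call `fetch_page(p)` once per page number
## and return the results as a named list.
collect_pages <- function(fetch_page, pages) {
  out <- lapply(pages, fetch_page)
  names(out) <- paste0("page_", pages)
  out
}

## Offline stand-in for an API call that returns two hits per page.
fake_fetch <- function(p) {
  list(page = p, hits = paste0("hit_", (p - 1) * 2 + 1:2))
}

res <- collect_pages(fake_fetch, pages = 1:3)
str(res, max.level = 1)

## With rbioapi loaded, the same helper could wrap the call used above:
## adeno_manual <- collect_pages(
##   function(p) rba_uniprot_taxonomy_name(name = "adenovirus",
##                                         search_type = "contain",
##                                         page_number = p),
##   pages = 1:3)
```

Compared with this loop, `rba_pages()` needs only the quoted call and the "pages:start:end" string.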
# How and what to cite?
rbioapi is an interface between you and other databases and services. Thus, if you have used rbioapi in published research, **in addition to kindly citing rbioapi, [*make sure to fully and properly cite the databases/services you have used*]{.underline}**. Suggested citations have been added in the functions' manuals, under the "references" section; Nevertheless, it is the user's responsibility to check for proper citations and to properly cite the database/services that they have used.
## How to cite rbioapi
- Moosa Rezwani, Ali Akbar Pourfathollah, Farshid Noorbakhsh, rbioapi: user-friendly R interface to biologic web services' API, Bioinformatics, Volume 38, Issue 10, 15 May 2022, Pages 2952--2953, <https://doi.org/10.1093/bioinformatics/btac172>
## How to cite the databases and web services
- [How to cite Enrichr](https://rbioapi.moosa-r.com/articles/rbioapi_enrichr.html#citations "How to cite Enrichr"). (See on [Enrichr website](https://maayanlab.cloud/Enrichr/help#terms))
- [How to cite JASPAR](https://rbioapi.moosa-r.com/articles/rbioapi_jaspar.html#citations "How to cite JASPAR"). (See on [JASPAR website](https://jaspar.elixir.no/faq/))
- [How to cite miEAA](https://rbioapi.moosa-r.com/articles/rbioapi_mieaa.html#citations "How to cite miEAA"). (See on [miEAA website](https://ccb-compute2.cs.uni-saarland.de/mieaa2))
- [How to cite PANTHER](https://rbioapi.moosa-r.com/articles/rbioapi_panther.html#citations "How to cite PANTHER"). (See on [PANTHER website](https://www.pantherdb.org/publications.jsp#HowToCitePANTHER))
- [How to cite Reactome](https://rbioapi.moosa-r.com/articles/rbioapi_reactome.html#citations "How to cite Reactome"). (See on [Reactome website](https://reactome.org/cite))
- [How to cite STRING](https://rbioapi.moosa-r.com/articles/rbioapi_string.html#citations "How to cite STRING"). (See on [STRING website](https://string-db.org/cgi/about?footer_active_subpage=references))
- [How to cite UniProt](https://rbioapi.moosa-r.com/articles/rbioapi_uniprot.html#citations "How to cite UniProt"). (See on [UniProt website](https://www.uniprot.org/help/publications))
## Code of conduct
This package, rbioapi, is an unofficial interface implementation and is not associated, endorsed, or officially connected in any way with the original databases and web services. The creators and maintainers of rbioapi are independent entities and have no official relationship with those databases and web services.
When using rbioapi, remember that you are querying data from web services, so please be considerate. Never flood a server with requests. If you need to download *unreasonably* large volumes of data, directly downloading the databases supplied by those services may be a better alternative. If you find yourself being rate-limited by a server (HTTP **429 Too Many Requests** response status code), you are sending more requests than the server considers normal behavior, so please seek other methods or use `Sys.sleep()` between your requests.
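One simple way to space out requests is to pause between consecutive calls. The following is a minimal sketch: `throttled_lapply()` is a hypothetical helper, not part of rbioapi, and the one-second default pause is an arbitrary assumption rather than a limit documented by any of the services.

```{r sketch_throttle, eval=FALSE}
## Hypothetical helper: apply `fun` to each element of `x`,
## sleeping `pause` seconds between consecutive calls.
throttled_lapply <- function(x, fun, pause = 1, ...) {
  out <- vector("list", length(x))
  for (i in seq_along(x)) {
    if (i > 1) Sys.sleep(pause)       # wait between calls, not before the first
    out[i] <- list(fun(x[[i]], ...))  # list() keeps NULL results in place
  }
  names(out) <- names(x)
  out
}

## Offline demonstration with a trivial function:
squares <- throttled_lapply(1:3, function(i) i^2, pause = 0.1)

## With rbioapi loaded, and `my_ids` being your own character vector of
## HGNC identifiers, the calls could be spaced out like this:
## res <- throttled_lapply(my_ids, function(id) {
##   rba_uniprot_proteins_crossref(db_id = id, db_name = "HGNC")
## }, pause = 1)
```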
# What next?
Each supported service has a dedicated vignette article. Make sure to check those too.
1. [Enrichr](https://rbioapi.moosa-r.com/articles/rbioapi_enrichr.html "rbioapi & Enrichr vignette")
2. [JASPAR](https://rbioapi.moosa-r.com/articles/rbioapi_jaspar.html "rbioapi & JASPAR vignette article")
3. [miEAA](https://rbioapi.moosa-r.com/articles/rbioapi_mieaa.html "rbioapi & miEAA vignette article")
4. [PANTHER](https://rbioapi.moosa-r.com/articles/rbioapi_panther.html "rbioapi & PANTHER vignette article")
5. [Reactome](https://rbioapi.moosa-r.com/articles/rbioapi_reactome.html "rbioapi & Reactome vignette article")
6. [STRING](https://rbioapi.moosa-r.com/articles/rbioapi_string.html "rbioapi & STRING vignette article")
7. [UniProt](https://rbioapi.moosa-r.com/articles/rbioapi_uniprot.html "rbioapi & UniProt vignette article")
We are also adding vignette articles focusing on tasks and workflows:
1. [Do with rbioapi: Enrichment (Over-Representation) Analysis in R](https://rbioapi.moosa-r.com/articles/rbioapi_do_enrich.html "Do with rbioapi: Enrichment (Over-Representation) Analysis in R")
# Design philosophy of rbioapi
To learn more about the design philosophy and the concepts behind developing rbioapi, please read our paper in Bioinformatics:
[rbioapi: user-friendly R interface to biologic web services' API](https://doi.org/10.1093/bioinformatics/btac172 "Rezwani, M., Pourfathollah, A. A., & Noorbakhsh, F. (2022). rbioapi: user-friendly R interface to biologic web services’ API. Bioinformatics, 38(10), 2952–2953. doi: 10.1093/bioinformatics/btac172")
# Links
- [This article in rbioapi documentation site](https://rbioapi.moosa-r.com/articles/rbioapi.html "rbioapi: User-Friendly R Interface to Biologic Web Services' API")
- [Functions references in rbioapi documentation site](https://rbioapi.moosa-r.com/reference/index.html "rbioapi reference")
# Session info
```{r sessionInfo, echo=FALSE}
sessionInfo()
```