Reactome graph database: Efficient access to complex pathway data
- PMID: 29377902
- PMCID: PMC5805351
- DOI: 10.1371/journal.pcbi.1005968
Abstract
Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high-quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency, because queries that traverse highly interconnected data perform poorly in a relational model. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data through object-oriented queries, and also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high-performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.
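To make the traversal argument concrete, the following is a minimal sketch and not the authors' implementation. It models a tiny pathway hierarchy as an in-memory adjacency list and collects every reaction reachable from a pathway, which is the kind of variable-depth walk a graph query expresses directly. The identifiers, the `reactions_under` helper and the toy data are hypothetical. The Cypher string assumes node labels `Pathway` and `ReactionLikeEvent`, a `hasEvent` relationship and an `stId` property; these names are assumptions and should be checked against the actual Reactome schema before use.

```python
from collections import deque

# Hypothetical toy data: each key "hasEvent" the children listed for it.
# The identifiers are invented for illustration, not real Reactome stable IDs.
HAS_EVENT = {
    "PATHWAY-A": ["PATHWAY-B", "REACTION-1"],
    "PATHWAY-B": ["REACTION-2", "REACTION-3"],
}

# Cypher text asking the same question of a graph database. Labels, the
# relationship type and the property name are assumptions (see lead-in).
CYPHER_SKETCH = (
    "MATCH (p:Pathway {stId: $stId})-[:hasEvent*]->(r:ReactionLikeEvent) "
    "RETURN DISTINCT r.stId"
)


def reactions_under(pathway_id, has_event=HAS_EVENT):
    """Breadth-first walk over hasEvent edges, returning sorted leaf events.

    In this toy model any node without outgoing hasEvent edges is treated
    as a reaction.
    """
    seen = set()
    reactions = []
    queue = deque([pathway_id])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # guard against revisiting shared sub-pathways
        seen.add(node)
        children = has_event.get(node)
        if children is None:
            reactions.append(node)
        else:
            queue.extend(children)
    return sorted(reactions)


if __name__ == "__main__":
    print(reactions_under("PATHWAY-A"))
```

In a relational store the same question typically needs one self-join per level of nesting or a recursive query, whereas the variable-length pattern `[:hasEvent*]` states the whole walk in a single clause.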
Conflict of interest statement
The authors have declared that no competing interests exist.