Introduction

Human bodies are built of >30 trillion cells specialized to fulfill diverse roles within our tissues, organs, and organ systems. All these cells originate from a single cell, a zygote formed at conception. From zygote to fetus, and throughout childhood, adolescence, and adulthood, cells divide and commit to different fates in order for the organism to develop, sustain and regenerate. The series of steps that lead from an undifferentiated progenitor cell, such as a stem cell, to one of a number of possible specialized descendants constitutes a cell lineage path. Recent technological advances have allowed researchers to collect high-throughput omics data from single cells of multicellular organisms and use it to track and manipulate cell fates (Burgess 2018; Saelens et al. 2019). This opens the door to the possibility of deciphering cell lineage paths at single-cell resolution, a critical requirement for the advancement of regenerative medicine and cancer medicine.

Researchers who trace cell lineages by studying genomic, epigenomic, transcriptomic, and proteomic features of single cells currently expend significant effort extracting information from published sources. The resulting local annotations of biological markers, which are needed to interpret and analyze high-throughput experimental data, are short-lived and limited in scope. This creates an urgent need for an open-source, comprehensive, continuously maintained reference repository of cell lineage paths framed on pre-existing knowledge of tissue morphogenesis, together with a cell type-specific pathway resource that can serve as a framework for interpreting and integrating the latest experimental findings.

Cytomics Reactome aims to become an integrative systems biology resource of cell lineage paths and a toolset for the analysis of single-cell omics data. The Reactome web application and pathway visualization tools (Croft et al. 2011) have been extended to enable Cytomics Reactome visual display and search functions (Milacic et al. 2024) on the live website.

Database schema features to support Cytomics

The Reactome data model provides a robust framework for the annotation of biological entities and events (Joshi-Tope et al. 2005). The basic unit of the Reactome data model is a Reaction, an event that converts input to output physical entities. The participants in reactions are PhysicalEntities, which can be simple or complex molecules or molecular complexes. These PhysicalEntities can also act as reaction catalysts and regulators. Reactions are grouped into causal chains to form Pathways (Joshi-Tope et al. 2005).

The Reactome data schema was expanded as described below to accommodate new classes needed for Cytomics Reactome annotations.

  • Cell: A subclass of PhysicalEntity, represents a type of cell in a particular state of development/differentiation (Figure 1A).
    • A Cell instance has CellType, TissueLayer, Tissue, and Organ as single-valued attributes. These are populated with terms from public open-source databases, the Cell Ontology (Sarntivijai et al. 2014; Osumi-Sutherland 2017) and UBERON (Haendel et al. 2014), and are cross-referenced to those databases.
    • Cell instances also have multi-valued ProteinMarker and RNAMarker attributes, which hold markers that are specific to the cell type and/or the cell state or are differentially upregulated in the Cell (Figure 1B). The markers (protein or RNA) are manually curated; each marker instance is associated with one or more literature references as evidence, including, where applicable, citations of the CellMarker (Hu et al. 2022) and PanglaoDB (Franzén et al. 2019) databases.

Figure 1. Cell Physical Entity class (A) and new attributes (B)

  • Cell Development Step and Cell Lineage Path are two Event subclasses (Figure 2A) that describe developmental/differentiation relationships among Cells:
    • A Cell Development Step is a ReactionLikeEvent that contains Cell instances as its inputs and outputs. The input attribute represents the cell of origin, and the output attribute represents the destination cell type. Additional Cell Development Step attributes include regulators (molecules promoting or inhibiting the step) and required input components (input cell proteins required for the action of regulators) (Figure 2B). 
    • A Cell Lineage Path, organizationally similar to the preexisting Pathway subclass, is composed of other Cell Lineage Path instances or Cell Development Steps as subevents that are connected through shared input/output Cells (Figure 2C).
    • Both Cell Development Step and Cell Lineage Path are associated with a relevant Gene Ontology (GO) biological process term (The Gene Ontology Consortium 2019) and with a “tissue” term from UBERON (Haendel et al. 2014).

Figure 2. Cell Development Step and Cell Lineage Path schema (A) and attributes (B,C).
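The classes described above can be pictured as a small object model. The following Python sketch is illustrative only: the class and field names mirror the schema terms used in this section, but the Reactome schema itself is not defined in Python, each step is simplified to a single input and a single output Cell, and the cell and marker names in the example are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Marker:
    """A protein or RNA marker with its supporting literature references."""
    name: str
    references: List[str] = field(default_factory=list)


@dataclass
class Cell:
    """A cell type in a particular state of development/differentiation."""
    name: str
    # Single-valued ontology terms (Cell Ontology or UBERON).
    cell_type: Optional[str] = None
    tissue_layer: Optional[str] = None
    tissue: Optional[str] = None
    organ: Optional[str] = None
    # Multi-valued marker attributes.
    protein_markers: List[Marker] = field(default_factory=list)
    rna_markers: List[Marker] = field(default_factory=list)


@dataclass
class CellDevelopmentStep:
    """Leads from a cell of origin (input) to a destination cell (output)."""
    name: str
    input_cell: Cell
    output_cell: Cell
    regulators: List[str] = field(default_factory=list)
    required_input_components: List[str] = field(default_factory=list)


@dataclass
class CellLineagePath:
    """Sub-events (steps or nested paths) connected through shared Cells."""
    name: str
    sub_events: List[Union["CellLineagePath", CellDevelopmentStep]] = field(
        default_factory=list
    )

    def steps(self) -> List[CellDevelopmentStep]:
        """Flatten nested paths into a flat list of development steps."""
        found: List[CellDevelopmentStep] = []
        for event in self.sub_events:
            if isinstance(event, CellLineagePath):
                found.extend(event.steps())
            else:
                found.append(event)
        return found


def preceding_steps(path: CellLineagePath, step: CellDevelopmentStep):
    """Return steps whose output Cell is the input Cell of `step`."""
    return [s for s in path.steps() if s.output_cell is step.input_cell]


# Hypothetical three-cell lineage: progenitor -> intermediate -> mature.
progenitor = Cell(name="progenitor cell")
intermediate = Cell(name="intermediate cell")
mature = Cell(name="mature cell", protein_markers=[Marker("MARKER1")])
step1 = CellDevelopmentStep("step 1", progenitor, intermediate)
step2 = CellDevelopmentStep("step 2", intermediate, mature)
path = CellLineagePath("example path", [step1, step2])
```

In this sketch, `preceding_steps(path, step2)` finds `step1` because the two steps share the intermediate Cell, which is the same shared input/output relationship that connects Cell Development Steps within a Cell Lineage Path.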

The Cytomics content is searchable on the Reactome website. Running a search from the homepage search bar may list hits within various categories including pathway, reaction, complex, and cell. Clicking on an item in the search results list will redirect to a Details page, which displays more information about the specific record.

  • For example, a details page for Cell (Figure 3A) includes information on location in the pathway hierarchy, cell type, and histological description of the given cell with ontology terms, a list of protein and RNA markers, and corresponding literature references for each marker. A scroll bar appears when there are more than 5 markers in the same category. The page also lists event(s) (i.e., Cell Development Step), in which the Cell participates as an input/output.  
    • Clicking on a marker or Cell Development Step will redirect to the details page of the marker or event, respectively.
    • Clicking on the Cell in the expanded hierarchical view of the Developmental Biology pathway (Figure 3B) on the details page will redirect to the Pathway Browser view, highlighting the Cell Lineage Path in the hierarchy panel and showing the selected Cell in the pathway diagram.

 Figure 3. Details page for Cell (A)

Figure 3. Expanded view of the Cell location in the Pathway Browser (B)

  • See the details pages for a marker instance (Figure 4), a Cell Development Step (Figure 5), and a Cell Lineage Path (Figure 6) below.
    • The content of individual events can be exported from the event’s details page (Figures 5, 6) as SBML, PDF, SVG, PNG, PPTX, or SBGN files.

 Figure 4. Details page for Marker instance

Figure 5. Details page for Cell Development Step

Figure 6. Details page for Cell Lineage Path

Navigation of a Cell Lineage Path in the Pathway Browser

Cell Lineage Paths are available under the “Developmental Biology” pathway in the Reactome Pathway Browser, where they are grouped in the “Developmental Cell Lineages” subpathway (Figure 7). 


Although the pathway diagram depicts a selection of cell types that are planned for annotation, only one type has been annotated so far; it is indicated by a selectable blue box label in the diagram.

Figure 7. Pathway browser view of “Developmental Cell Lineages” subpathway

Selecting an individual Cell Lineage Path in the pathway hierarchy will display all Cell Development Steps of the selected path in the pathway diagram (Figure 8). An individual Cell is depicted with a double-layered outer compartment representing the plasma membrane and the cytosol, along with an inner blue rectangle symbolizing the nucleus.

The Description tab in the Details panel of a selected Cell Lineage Path provides additional information about the Cell Lineage Path including name, species, assigned GO Biological Process term (if applicable), and literature reference(s) (Figure 8).

Figure 8. Selecting a Cell Lineage Path from the pathway hierarchy

Clicking on an individual Cell Development Step either in the pathway hierarchy or in the pathway diagram will display relevant information for the selected event in the details panel below the diagram (Figure 9). The description tab of the details panel shows input and output, which are defined by Cell instances, regulators of the event (when applicable), GO biological process, and UBERON tissue terms. It may also display preceding event(s) connecting individual Cell Development Steps within the same Cell Lineage Path (Figure 9). Details of participating Cell instances including histological terms, specific protein and RNA markers, and corresponding marker references are accessible from the description tab of Cell Development Step upon clicking on the plus sign to the right of the input/output attributes.

Figure 9. Selecting a Cell Development Step

Alternatively, the Cell details can be displayed in the Details panel by selecting a “Cell” icon in the diagram of the Cell Lineage Path (Figure 10). Right-clicking or clicking the blue info icon when hovering over a “Cell” icon in the pathway diagram will open a popup list of protein and RNA markers for the selected Cell (Figure 11).

Figure 10. Selecting a Cell

Figure 11. Displaying the list of Cell markers

The Molecules tab in the Details panel (Figure 12A) displays a downloadable list of all markers for a Cell, or of both markers and regulators for a Cell Development Step or a Cell Lineage Path.

The Structures tab (Figure 12B) in the Details panel displays the Protein Data Bank structures of molecules listed in the Molecules tab, and the Expression tab shows gene expression data from Expression Atlas.

Figure 12. Displaying Molecules Tab (A) 

Figure 12. Structures tab (B)

The Analysis tab (Figures 13, 14) in the Details panel displays data after running the dataset analysis. Refer to Rothfels et al. (2023) for detailed instructions on how to use the Reactome Analysis Tool.

The results of the analysis are also visualized in the pathway diagram panel, where the “Cell” icons are colored (Figures 13, 14). The colored area in the nucleus of the individual Cell corresponds to the coverage of the cell markers in the submitted data set.

In Pathway Enrichment Analysis, olive green is applied to color the nucleus of the cells in the Cell Lineage Path that contain markers from the query list. The extent of coloration is proportional to the number of markers hit (Figure 13).
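The proportional coloring described above can be thought of as a coverage fraction. The following Python sketch is a simplified illustration, not Reactome's implementation: the function name `marker_coverage` and the marker identifiers are hypothetical, and the statistics computed by the Reactome Analysis Tool are not reproduced here.

```python
def marker_coverage(cell_markers, query_identifiers):
    """Return the fraction of a Cell's markers found in the query list.

    A value of 0.0 means no marker was hit (nucleus left uncolored);
    1.0 means every marker of the Cell was hit.
    """
    markers = set(cell_markers)
    if not markers:
        return 0.0  # a Cell without annotated markers cannot be covered
    hits = markers & set(query_identifiers)
    return len(hits) / len(markers)


# Hypothetical Cell with four markers; two of them appear in the query.
coverage = marker_coverage(
    ["MARKER_A", "MARKER_B", "MARKER_C", "MARKER_D"],
    ["MARKER_B", "MARKER_D", "UNRELATED_X"],
)
print(coverage)  # 0.5
```

Identifiers in the query that are not markers of the Cell, such as `UNRELATED_X` here, do not affect the fraction.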

The results of gene expression analysis, including Reactome Gene Set Analysis (Reactome GSA), appear when zoomed out as a single vertical bar of color representing the average expression of the markers from the query list (Figure 14A). Zooming in on the “Cell” icon reveals bars for individual cell markers with their expression values (Figure 14B). Expression values of cell markers from the query data set can also be viewed in the popup information panel of protein and RNA markers for the selected Cell (Figure 14B).

Expression values for the markers of a selected cell are displayed as blue horizontal lines within the expression scale (Figure 14C). The median cell marker expression among the cell's matching markers is shown in black. On the left, when a specific marker bar is hovered over, the currently hovered marker’s expression value is displayed in red, and the expression values of other markers in the same cell are shown in yellow.
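The average and median values mentioned above are summary statistics over the markers of a Cell that match the submitted data. The following Python sketch shows one way to compute them under simple assumptions: the function name `summarize_marker_expression`, the marker identifiers, and the expression values are all hypothetical, and the exact aggregation used by the Pathway Browser is not specified in this guide.

```python
from statistics import mean, median


def summarize_marker_expression(cell_markers, expression):
    """Summarize expression over the Cell's markers present in the data.

    `expression` maps identifiers from the query data set to numeric values.
    Returns None when none of the Cell's markers match the data.
    """
    matched = [expression[m] for m in cell_markers if m in expression]
    if not matched:
        return None
    return {"mean": mean(matched), "median": median(matched)}


# Hypothetical Cell with four markers, three of which have expression data.
summary = summarize_marker_expression(
    ["MARKER_A", "MARKER_B", "MARKER_C", "MARKER_D"],
    {"MARKER_A": 2.0, "MARKER_B": 4.0, "MARKER_C": 9.0, "UNRELATED_X": 1.0},
)
print(summary)  # {'mean': 5.0, 'median': 4.0}
```

Only the matching markers contribute: `MARKER_D` has no value in the data and `UNRELATED_X` is not a marker of the Cell, so both are ignored.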

Figure 13. Overrepresentation analysis displayed within the nuclei of the cells in the Cell Lineage Path

Figure 14. Gene expression analysis results are displayed within the nuclei of the cells. The level with coloration reflects the average expression value of the cell markers (A)

Figure 14. A zoomed-in view of the expression bars of individual markers (B) 

Figure 14. The expression scale (C)

  1. Burgess, Darren J. 2018. “Tracing Cell-Lineage Histories.” Nature Reviews. Genetics.
  2. Croft, David, Gavin O’Kelly, Guanming Wu, Robin Haw, Marc Gillespie, Lisa Matthews, Michael Caudy, et al. 2011. “Reactome: A Database of Reactions, Pathways and Biological Processes.” Nucleic Acids Research 39 (Database issue): D691–97.
  3. Franzén, O., L. M. Gan, and J. L. M. Björkegren. 2019. “PanglaoDB: A Web Server for Exploration of Mouse and Human Single Cell RNA Sequencing Data.” Database (Oxford) 2019.
  4. Haendel, Melissa A., James P. Balhoff, Frederic B. Bastian, David C. Blackburn, Judith A. Blake, Yvonne Bradford, Aurelie Comte, et al. 2014. “Unification of Multi-Species Vertebrate Anatomy Ontologies for Comparative Biology in Uberon.” Journal of Biomedical Semantics 5 (May): 21.
  5. Hu, C., T. Li, Y. Xu, X. Zhang, F. Li, J. Bai, J. Chen, et al. 2022. “CellMarker 2.0: An Updated Database of Manually Curated Cell Markers in Human/Mouse and Web Tools Based on scRNA-seq Data.” Nucleic Acids Research.
  6. Joshi-Tope, G., M. Gillespie, I. Vastrik, P. D’Eustachio, E. Schmidt, B. de Bono, B. Jassal, et al. 2005. “Reactome: A Knowledgebase of Biological Pathways.” Nucleic Acids Research 33 (Database issue): D428–32.
  7. Milacic, M., D. Beavers, P. Conley, et al. 2024. “The Reactome Pathway Knowledgebase 2024.” Nucleic Acids Research 52 (D1): D672–78.
  8. Osumi-Sutherland, David. 2017. “Cell Ontology in an Age of Data-Driven Cell Classification.” BMC Bioinformatics 18 (Suppl 17): 558.
  9. Rothfels, Karen, Marija Milacic, Lisa Matthews, et al. 2023. “Using the Reactome Database.” Current Protocols 3 (4): e722.
  10. Saelens, Wouter, Robrecht Cannoodt, Helena Todorov, and Yvan Saeys. 2019. “A Comparison of Single-Cell Trajectory Inference Methods.” Nature Biotechnology 37 (5): 547–54.
  11. Sarntivijai, Sirarat, Yu Lin, Zuoshuang Xiang, Terrence F. Meehan, Alexander D. Diehl, Uma D. Vempati, Stephan C. Schürer, et al. 2014. “CLO: The Cell Line Ontology.” Journal of Biomedical Semantics 5 (August): 37.
  12. The Gene Ontology Consortium. 2019. “The Gene Ontology Resource: 20 Years and Still GOing Strong.” Nucleic Acids Research 47 (D1): D330–38.