SuperCell is an R package for coarse-graining large single-cell RNA-seq data into metacells and performing downstream analysis at the metacell level.
The exponential growth in the size of scRNA-seq datasets is a major hurdle for downstream analyses. One way to facilitate the analysis of large-scale and noisy scRNA-seq data is to merge transcriptionally highly similar cells into metacells. This concept was first introduced by Baran et al., 2019 (MetaCell) and by Iacono et al., 2018 (bigSCale). More recent methods to build metacells have been described in Ben-Kiki et al., 2022 (MetaCell2), Bilous et al., 2022 (SuperCell) and Persad et al., 2022 (SEACells). Despite some differences in implementation, all of these methods are network-based and can be summarized as follows:
1. A single-cell network is computed based on cell-to-cell similarity (in transcriptomic space).
2. Highly similar cells are identified as those forming dense regions in the single-cell network and merged together into metacells (coarse-graining).
3. Transcriptomic information within each metacell is combined (average or sum).
4. Metacell data are used for the downstream analyses instead of large-scale single-cell data.
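The four steps above can be sketched on simulated data. This is a toy illustration and not SuperCell's own implementation: the random expression matrix, the parameter values (`n_pc`, `k`, `gamma`) and all variable names are assumptions made for the example, and walktrap community detection from igraph is used here only as one possible way to find dense regions of the network.

```r
library(igraph)

set.seed(1)
n_genes <- 50
n_cells <- 200
gamma   <- 10                      # graining level (cells per metacell, on average)
k       <- 10                      # neighbors per cell in the kNN network
n_pc    <- 5                       # principal components used for similarity

# Simulated genes x cells expression matrix (illustrative only).
expr <- matrix(rnorm(n_genes * n_cells), nrow = n_genes, ncol = n_cells)

# Step 1: single-cell kNN network built from distances in PCA space.
pcs <- prcomp(t(expr))$x[, seq_len(n_pc)]
d   <- as.matrix(dist(pcs))
nn  <- sapply(seq_len(n_cells), function(i) order(d[i, ])[2:(k + 1)])  # k x n_cells
edges <- cbind(rep(seq_len(n_cells), each = k), as.vector(nn))
g <- simplify(graph_from_edgelist(edges, directed = FALSE))

# Step 2: coarse-grain the network into n_cells / gamma groups of similar cells.
n_metacells <- round(n_cells / gamma)
membership  <- cut_at(cluster_walktrap(g), no = n_metacells)

# Step 3: combine expression within each metacell (here: average).
metacell_ge <- sapply(split(seq_len(n_cells), membership),
                      function(idx) rowMeans(expr[, idx, drop = FALSE]))

# Step 4: downstream analyses would now run on the much smaller matrix.
dim(metacell_ge)   # genes x metacells
```

Because metacells differ in size, downstream statistics on `metacell_ge` are usually weighted by the number of cells per metacell, for example `table(membership)`.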
Unlike clustering, the aim of metacells is not to identify large groups of cells that comprehensively capture biological concepts, such as cell types, but to merge cells that share highly similar profiles and may carry repetitive information. Metacells therefore represent a compromise structure that removes redundant information in scRNA-seq data while preserving the biologically relevant heterogeneity.
An important concept when building metacells is the graining level (γ), which we define as the ratio between the number of single cells in the initial data and the number of metacells. We suggest applying γ between 10 and 50, which significantly reduces the computational resources needed to perform the downstream analyses while preserving most of the results of the initial (i.e., single-cell) analyses.
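As a quick worked example of this definition, the number of metacells follows from the number of single cells divided by γ. The dataset size below is a hypothetical value chosen only for illustration.

```r
# Hypothetical dataset size, chosen only for illustration.
n_cells <- 10000

# Graining levels spanning the suggested range.
gammas <- c(10, 20, 50)

# gamma = n_cells / n_metacells, so n_metacells = n_cells / gamma.
n_metacells <- round(n_cells / gammas)

data.frame(gamma = gammas, n_metacells = n_metacells)
```

For 10,000 cells this gives 1,000, 500 and 200 metacells respectively, so downstream steps operate on a matrix with 10 to 50 times fewer columns.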
SuperCell requires igraph, RANN, WeightedCluster, corpcor, weights, Hmisc, Matrix, matrixStats, plyr, irlba, grDevices, patchwork, ggplot2. SuperCell uses velocyto.R for RNA velocity.
install.packages(c("igraph", "RANN", "WeightedCluster", "corpcor", "weights",
                   "Hmisc", "Matrix", "matrixStats", "plyr", "irlba",
                   "patchwork", "ggplot2"))
Installing the SuperCell package from GitHub
if (!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes")
remotes::install_github("GfellerLab/SuperCell")
library(SuperCell)
- Building and analyzing metacells with SuperCell
- RNA velocity applied to SuperCell object
- Building metacells with SuperCell and analyzing them with a standard Seurat pipeline
- Data integration of metacells built with SuperCell
If you use SuperCell in a publication, please cite: Bilous et al. Metacells untangle large and complex single-cell transcriptome networks, BMC Bioinformatics (2022).