The recount2 project re-processed RNA sequencing (RNA-seq) data on over
70,000 human RNA-seq samples spanning a variety of tissues, cell types
and disease conditions. Researchers can easily access these data via the
recount
Bioconductor package, and can quickly import gene, exon,
exon-exon junction and base-pair coverage data for uniformly processed
data from the SRA, GTEx and TCGA projects in R for analysis. This
workshop will include an overview of the recount2 project as well as
methods and tools that have been to improve it since 2017. The overview
will be followed by a hands-on session where you will dive into
recount
and related packages along the lines of the recount
workflow.
We will also cover new related packages such as
GenomicState
.
In more detail, this workshop will cover different use cases of the
recount
package, including downloading and normalizing data,
processing and cleaning relevant phenotype data (including
recount-brain
), performing differential expression (DE) analyses using
other Bioconductor packages. The workshop will also cover how to use the
base-pair coverage data for annotation-agnostic DE analyses and for
visualizing coverage data for features of interest. After taking this
workshop, attendees will be ready to enhance their analyses by
leveraging RNA-seq data from 70,000 human samples.
recount2 website, recount package, paper, recount workflow.
Prior editions: Bioc2017, BioC2019.
- Basic knowledge of R syntax
- Basic knowledge of RNA-seq or gene expression data
- Familiarity with
SummarizedExperiment
vignette - Familiarity with the
RangedSummarizedExperiment
class
Students are expected to ask questions during the recount2 overview presentation and to bring their own laptop so they can follow the hands-on portion of the workshop.
The slides for the workshop are available through speakerdeck.
<script async class="speakerdeck-embed" data-id="78092f5c558242f1aa3085d9abc9c061" data-ratio="1.77777777777778" src="//speakerdeck.com/assets/embed.js"></script>You can download a Docker image with all the workshop files using:
docker run -e PASSWORD=bioc2020 -p 8787:8787 -d --rm lcollado/recountworkshop2020
Then, log in to RStudio at http://localhost:8787 using username
rstudio
and password bioc2020
. Note that on Windows you need to
provide your localhost IP address like http://191.163.92.108:8787/
-
find it using docker-machine ip default
in Docker’s terminal.
To see the vignette on RStudio’s window (from the docker image), run
browseVignettes(package = "recountWorkshop2020")
. Click on one of the
links, “HTML”, “source”, “R code”. In case of The requested page was not found
error, add help/
to the URL right after the hostname, e.g.,
http://localhost:8787/help/library/recountWorkshop2020/doc/recount-workshop.html.
This is a known
bug.
SummarizedExperiment
, recount
, GenomicRanges
, DESeq2
,
derfinder
, regionReport
are among the packages that will be
explicitly covered.
Activity | Time |
---|---|
recount2 project overview | 10m |
accessing gene count data with recount | 15m |
Gene-level analysis with recount2 and recount-brain | 15m |
Un-annotated transcriptome methods overview: derfinder , GenomicState |
5 min |
the future: recount3 | 5m |
Total: a 50-55 minute session.
- Understand the publicly data available via the recount2 project beyond gene counts
- Identify how recount2 data could be useful for your RNA-seq projects and how you can use it
- Describe the annotation-agnostic methods powered by recount2 that complement other RNA-seq analysis pipelines
- Re-analyze public RNA-seq data using the recount Bioconductor package
- Recognize how to access the different types of data hosted by recount2
- Locate resources that enhance recount2 such as recount-brain,
derfinder
, andGenomicState
Get the latest stable R
release from
CRAN. Then install recountWorkshop2020
using from GitHub with the following code:
if (!requireNamespace("BiocManager", quietly = TRUE)) {
install.packages("BiocManager")
}
BiocManager::install("LieberInstitute/recountWorkshop2020")
Below is the citation output from using citation('recount
in R. Please
run this yourself to check for any updates on how to cite recount.
print(citation("recount")[2], bibtex = TRUE)
#>
#> Collado-Torres L, Nellore A, Jaffe AE (2017). "recount workflow:
#> Accessing over 70,000 human RNA-seq samples with Bioconductor [version
#> 1; referees: 1 approved, 2 approved with reservations]."
#> _F1000Research_. doi: 10.12688/f1000research.12223.1 (URL:
#> https://doi.org/10.12688/f1000research.12223.1), <URL:
#> https://f1000research.com/articles/6-1558/v1>.
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Article{,
#> title = {recount workflow: Accessing over 70,000 human RNA-seq samples with Bioconductor [version 1; referees: 1 approved, 2 approved with reservations]},
#> author = {Leonardo Collado-Torres and Abhinav Nellore and Andrew E. Jaffe},
#> year = {2017},
#> journal = {F1000Research},
#> doi = {10.12688/f1000research.12223.1},
#> url = {https://f1000research.com/articles/6-1558/v1},
#> }
Please note that the recount
was only made possible thanks to many
other R and bioinformatics software authors, which are cited either in
the vignettes and/or the paper(s) describing this package.
Please note that the recountWorkshop2020
project is released with a
Contributor Code of
Conduct.
By contributing to this project, you agree to abide by its terms.
- Continuous code testing is possible thanks to GitHub actions through usethis, remotes, sysreqs and rcmdcheck customized to use Bioconductor’s docker containers and BiocCheck.
- Code coverage assessment is possible thanks to codecov and covr.
- The documentation website is automatically updated thanks to pkgdown.
- The code is styled automatically thanks to styler.
- The documentation is formatted thanks to devtools and roxygen2.
For more details, check the dev
directory.
This package was developed using biocthis.