wigglescout is an R library that allows you to calculate summary values across bigWig files and BED files and visualize them in a genomics-relevant manner. It is based on widely used libraries such as rtracklayer and GenomicRanges, among others, for calculation, and mostly ggplot2 for visualization. You can look at the DESCRIPTION file to get more information about all the libraries that make this one possible.
Many other tools overlap with wigglescout to a greater or lesser extent, but no single tool included everything I needed. The aim of this library is therefore not to replace any of those tools, or to provide a silver-bullet solution to genomics data analysis, but to provide a comprehensive yet simple set of tools focused on bigWig files that can be used entirely from the R environment, without switching back and forth between tools.
Other tools and libraries with similar purposes that you may be looking for include deepTools, SeqPlots, bwtool and wiggletools, among many others.
wigglescout allows you to summarize and visualize the contents of bigWig files in two main ways:
- Genome-wide. The genome is partitioned into equally sized bins and an aggregated value is calculated for each bin. This is useful to get a general idea of the signal distribution without looking at specific places.
- Across sets of loci. Values can either be summarized per category or kept as individual per-locus values, as in the genome-wide analyses.
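The genome-wide mode can be illustrated with a minimal base R sketch. This is not how wigglescout computes its values; it only shows the idea of partitioning a sequence into equally sized bins and aggregating the signal per bin. The signal vector, the chromosome length and the bin size are all made up for illustration.

```r
# Toy per-base signal for a hypothetical 1,000 bp chromosome (made-up data).
set.seed(1)
signal <- rpois(1000, lambda = 5)

# Hypothetical bin size in base pairs.
bin_size <- 100

# Assign each position to a bin: positions 1-100 -> bin 1, 101-200 -> bin 2, ...
bin_id <- ceiling(seq_along(signal) / bin_size)

# Aggregate the signal per bin; the aggregate used here is the mean.
bin_means <- tapply(signal, bin_id, mean)

# Collect bin coordinates and aggregated values in a data.frame.
bin_index <- as.integer(names(bin_means))
bins <- data.frame(
  start = (bin_index - 1L) * bin_size + 1L,
  end = pmin(bin_index * bin_size, length(signal)),
  mean_signal = as.numeric(bin_means)
)

head(bins)
```

Smaller bins give finer resolution at the cost of more rows and noisier values per bin.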
wigglescout functionality is built in two layers. Functions that calculate values over bigWig files have names starting with bw_. These return GRanges objects when possible and data.frame objects otherwise (that is, when values are summarized over some category, since genomic location is lost in the process). Functions that plot such values, which usually call bw_ functions internally, have names starting with plot_.
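Because the two layers are distinguished by name prefix, one way to see what each layer offers is to list the exported functions by prefix. The sketch below uses only base R; `exports_with_prefix` is a hypothetical helper written for this example and is not part of wigglescout. It returns an empty vector if the package is not installed.

```r
# List the exported names of a package that start with a given prefix.
# `exports_with_prefix` is a hypothetical helper, not part of wigglescout.
exports_with_prefix <- function(pkg, prefix) {
  # Return nothing rather than failing when the package is not installed.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(character(0))
  }
  exports <- getNamespaceExports(pkg)
  sort(exports[startsWith(exports, prefix)])
}

# Calculation layer: functions that compute values over bigWig files.
exports_with_prefix("wigglescout", "bw_")

# Plotting layer: functions that plot those values.
exports_with_prefix("wigglescout", "plot_")
```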
wigglescout is a package under active development. You can install it from this repository. For this, you will need remotes (and devtools if you plan to work on it):
install.packages(c('devtools', 'remotes'))
Additionally, there was an issue in the past with installing dependencies that come from the Bioconductor repository. This seems to have been fixed now, but if you run into problems, I recommend installing these dependencies manually before running the installation:
install.packages('BiocManager')
BiocManager::install(c('GenomeInfoDb', 'GenomicRanges', 'rtracklayer'))
Then you can install directly from this GitHub repository:
library(remotes)
remotes::install_github('cnluzon/wigglescout', build_vignettes = TRUE)
The vignettes or the online documentation give a comprehensive overview of what is available in the package. You can check the vignettes with browseVignettes("wigglescout").
Q: When running install_github I get the following error:
Error: package or namespace load failed for ‘GenomeInfoDb’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
there is no package called ‘GenomeInfoDbData’
Error: package ‘GenomeInfoDb’ could not be loaded
Execution halted
A: This seemed to be a problem with installing Bioconductor dependencies. A workaround is to install the Bioconductor packages manually:
if (!requireNamespace('BiocManager', quietly = TRUE))
    install.packages('BiocManager')
BiocManager::install(c('GenomeInfoDb', 'GenomicRanges', 'rtracklayer'))
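Before retrying the installation, it may help to check which of the relevant packages are still missing. The following is a small sketch using base R only; `missing_packages` is a hypothetical helper, and the package list is taken from the error message and the workaround above rather than from the full dependency list in DESCRIPTION.

```r
# Return the subset of `pkgs` whose namespaces cannot be loaded.
# `missing_packages` is a hypothetical helper written for this example.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages named in the error message and in the workaround above.
deps <- c("GenomeInfoDbData", "GenomeInfoDb", "GenomicRanges", "rtracklayer")

# An empty result means all of them can be loaded.
missing_packages(deps)
```

Any package names returned can then be passed to BiocManager::install().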