R package for computing multiple hypothesis tests on rows/columns of a matrix or a data.frame

karoliskoncevicius/matrixTests

Matrix Tests

A package dedicated to running multiple statistical hypothesis tests on rows and columns of matrices.

Goals

  1. Fast execution via vectorization.
  2. Convenient and detailed output format.
  3. Compatibility with tests implemented in base R.
  4. Careful handling of missing values and edge cases.
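
The first and third goals can be illustrated with a small base R sketch. The helper `welch_rows` below is hypothetical and is not the package's implementation; it only shows the general idea of replacing a per-row loop over `t.test` with whole-matrix operations, assuming complete data with no missing values.

```r
# Hypothetical helper: Welch t-test of each row of x against the same row of y.
# Assumes no missing values and at least two columns in each matrix.
welch_rows <- function(x, y) {
  nx <- ncol(x)
  ny <- ncol(y)
  mx <- rowMeans(x)
  my <- rowMeans(y)
  # Row-wise sample variances (the mean vectors recycle down the columns).
  vx <- rowSums((x - mx)^2) / (nx - 1)
  vy <- rowSums((y - my)^2) / (ny - 1)
  se2x <- vx / nx
  se2y <- vy / ny
  stat <- (mx - my) / sqrt(se2x + se2y)
  # Welch-Satterthwaite degrees of freedom.
  df <- (se2x + se2y)^2 / (se2x^2 / (nx - 1) + se2y^2 / (ny - 1))
  data.frame(statistic = stat, df = df, pvalue = 2 * pt(-abs(stat), df))
}

set.seed(1)
x <- matrix(rnorm(1000), ncol = 10)
y <- matrix(rnorm(1000), ncol = 10)

fast <- welch_rows(x, y)

# Reference: one call to base R t.test() per row (Welch is its default).
slow <- sapply(seq_len(nrow(x)), function(i) t.test(x[i, ], y[i, ])$p.value)

all.equal(fast$pvalue, slow)
```

The vectorized version computes every row in a handful of matrix operations, while the reference loop pays the overhead of a full `t.test` call for each row.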

Examples

1. Bartlett's test on columns

Bartlett's test on every column of the iris dataset, using Species as groups:

col_bartlett(iris[,-5], iris$Species)
             obs.tot obs.groups var.pooled df statistic                pvalue
Sepal.Length     150          3 0.26500816  2 16.005702 0.0003345076070163084
Sepal.Width      150          3 0.11538776  2  2.091075 0.3515028004158132768
Petal.Length     150          3 0.18518776  2 55.422503 0.0000000000009229038
Petal.Width      150          3 0.04188163  2 39.213114 0.0000000030547839322
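
For comparison, a roughly equivalent result can be obtained in base R by calling `bartlett.test` once per column. This is a minimal sketch that uses only base R; the name `base_res` is chosen here for illustration.

```r
# Run base R bartlett.test() on each numeric column of iris, grouped by Species.
base_res <- t(sapply(iris[, -5], function(x) {
  bt <- bartlett.test(x, iris$Species)
  c(df        = unname(bt$parameter),
    statistic = unname(bt$statistic),
    pvalue    = bt$p.value)
}))

print(base_res)
```

The `df`, `statistic` and `pvalue` columns should match the corresponding columns in the output above.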

2. Welch t-test on rows

Welch t-test performed on each row of two large (million-row) matrices:

X <- matrix(rnorm(10000000), ncol = 10)
Y <- matrix(rnorm(10000000), ncol = 10)

row_t_welch(X, Y)  # running time: 2.4 seconds

Confidence interval computation can be turned off for a further increase in speed:

row_t_welch(X, Y, conf.level = NA)  # running time: 1 second
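
Because the result has one row per test, it can be passed directly to downstream steps such as multiple-testing correction. The sketch below assumes that matrixTests is installed and that the result carries a `pvalue` column, as in the Bartlett output above; the smaller matrix size and the choice of the Benjamini-Hochberg method are illustrative only.

```r
library(matrixTests)

set.seed(1)
X <- matrix(rnorm(10000), ncol = 10)
Y <- matrix(rnorm(10000), ncol = 10)

# One Welch t-test per row; confidence intervals are skipped for speed.
res <- row_t_welch(X, Y, conf.level = NA)

# Adjust the p-values for the number of rows tested.
padj <- p.adjust(res$pvalue, method = "BH")

# Number of rows below a 5% false discovery rate threshold.
sum(padj < 0.05)
```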

Available Tests

| Variant | Name | Function |
|---|---|---|
| Location tests (1 group) | Single sample Student's t.test | row_t_onesample |
| | Single sample Wilcoxon's test | row_wilcoxon_onesample |
| Location tests (2 groups) | Equal variance Student's t.test | row_t_equalvar |
| | Welch adjusted Student's t.test | row_t_welch |
| | Two sample Wilcoxon's test | row_wilcoxon_twosample |
| Location tests (paired) | Paired Student's t.test | row_t_paired |
| | Paired Wilcoxon's test | row_wilcoxon_paired |
| Location tests (2+ groups) | Equal variance oneway anova | row_oneway_equalvar |
| | Welch's oneway anova | row_oneway_welch |
| | Kruskal-Wallis test | row_kruskalwallis |
| | van der Waerden's test | row_waerden |
| Scale tests (2 groups) | F variance test | row_f_var |
| Scale tests (2+ groups) | Bartlett's test | row_bartlett |
| | Fligner-Killeen test | row_flignerkilleen |
| | Levene's test | row_levene |
| | Brown-Forsythe test | row_brownforsythe |
| Association tests | Pearson's correlation test | row_cor_pearson |
| Periodicity tests | Cosinor | row_cosinor |
| Distribution tests | Kolmogorov-Smirnov test | row_kolmogorovsmirnov_twosample |
| Normality tests | Jarque-Bera test | row_jarquebera |
| | Anderson-Darling test | row_andersondarling |
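
The table lists the row_ variants, while the Bartlett example above uses col_bartlett. The sketch below assumes that matrixTests is installed and that a col_ function behaves like its row_ counterpart applied to the transposed input; that correspondence is an assumption made here for illustration and is checked with `all.equal` rather than taken for granted.

```r
library(matrixTests)

# Column-wise test on the original orientation.
by_col <- col_bartlett(iris[, -5], iris$Species)

# Row-wise test on the transposed data (t() turns the data.frame into a matrix).
by_row <- row_bartlett(t(iris[, -5]), iris$Species)

all.equal(by_col, by_row)
```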

Further Information

For more information, please refer to the Wiki pages:

  1. Installation Instructions
  2. Design Decisions
  3. Speed Benchmarks
  4. Bug Fixes and Improvements to Base R

See Also

Literature

Computing thousands of test statistics simultaneously in R, Holger Schwender, Tina Müller.
Statistical Computing & Graphics. Volume 18, No 1, June 2007.

Packages

CRAN:

  1. ttests() in the Rfast package.
  2. row.ttest.stat() in the metaMA package.
  3. MultiTtest() in the ClassComparison package.
  4. bartlettTests() in the heplots package.
  5. harmonic.regression() in the HarmonicRegression package.

Bioconductor:

  1. lmFit() in the limma package.
  2. rowttests() in the genefilter package.
  3. mt.teststat() in the multtest package.
  4. row.T.test() in the HybridMTest package.
  5. rowTtest() in the viper package.
  6. lmPerGene() in the GSEAlm package.

GitHub:

  1. rowWilcoxonTests() in the sanssouci package.
  2. matrix.t.test() in the pi0 package.
  3. wilcoxauc() in the presto package.