DCoNA: tool for fast Differential Correlation Network Analysis

DCoNA is a statistical tool for identifying pairwise interactions whose correlation changes significantly between two conditions. DCoNA was designed to test this hypothesis for a predefined list of source and target pairs ("Network" regime). However, it can also be used in the complete-network regime, when no list is given ("Exhaustive" regime); in this regime, DCoNA tests the hypothesis for all possible pairs of molecules from the expression data. Aside from hypothesis testing, DCoNA can test whether the significantly altered correlations of a particular source molecule are overrepresented among all significantly changed correlations. DCoNA can also compute the mean, median, and other quantiles of the z-statistics associated with a particular molecule and its targets to determine a trend in correlation changes.

If you use DCoNA in work contributing to a scientific publication, we ask you to cite our publication:

Anton Zhiyanov, Narek Engibaryan, Stepan Nersisyan, Maxim Shkurnikov, Alexander Tonevitsky, Differential co-expression network analysis with DCoNA reveals isomiR targeting aberrations in prostate cancer, Bioinformatics, Volume 39, Issue 2, February 2023, btad051
https://doi.org/10.1093/bioinformatics/btad051

Installation

Installation using pip

pip install dcona

Downloading example dataset

You can try DCoNA on the TCGA-PRAD test dataset.

Usage

You can use DCoNA either as a Python module or as a command-line tool.

Example

Detailed description of functions with data example and test launch.

Available functions

`dcona.ztest`

It tests the hypothesis of correlation equivalence between pairs of genes.

dcona.ztest(data_df, description_df, reference_group, experimental_group, correlation='spearman', alternative='two-sided', interaction=None, repeats_number=None, output_dir=None, process_number=None)

Command-line usage:
```
dcona ztest config.json
```
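
For intuition about what a test of correlation equivalence involves, the sketch below applies the classical Fisher z-transformation test to a single pair. It is a textbook approach shown under stated assumptions and is not necessarily the exact statistic DCoNA computes. The helper name `fisher_z_test`, the sign convention (experimental minus reference), and the input numbers are all hypothetical.

```python
import math


def fisher_z_test(r_ref, n_ref, r_exp, n_exp):
    """Two-sided test of H0: the two correlations are equal.

    Hypothetical helper for illustration; not part of the dcona API.
    """
    # The Fisher transformation makes the sampling distribution roughly normal.
    z_ref = math.atanh(r_ref)
    z_exp = math.atanh(r_exp)
    # Standard error of the difference (Pearson-style variance; an assumption).
    se = math.sqrt(1.0 / (n_ref - 3) + 1.0 / (n_exp - 3))
    stat = (z_exp - z_ref) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Made-up correlations of one pair in the reference and experimental groups.
stat, p_value = fisher_z_test(r_ref=0.10, n_ref=52, r_exp=0.75, n_exp=52)
print(f"z = {stat:.2f}, p = {p_value:.4g}")
```

A positive statistic here means the correlation is higher in the experimental group; whether DCoNA uses the same orientation is not stated in this README.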

`dcona.zscore`

It aggregates the correlation changes of a source molecule with all its targets.

dcona.zscore(data_df, description_df, reference_group, experimental_group, correlation='spearman', score='mean', alternative='two-sided', interaction=None, repeats_number=None, output_dir=None, process_number=None)

Command-line usage:
```
dcona zscore config.json
```
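
The aggregation idea can be illustrated without the package: group per-pair z-statistics by source and summarise each group. This is a minimal sketch under assumptions. The input rows, the molecule names, and the helper `aggregate_by_source` are hypothetical, and only the `mean` and `median` scores are modelled.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical per-pair z-statistics: (source, target, z).
pair_stats = [
    ("miR-A", "gene_1", -2.0),
    ("miR-A", "gene_2", -1.0),
    ("miR-A", "gene_3", -3.0),
    ("miR-B", "gene_1", 0.5),
    ("miR-B", "gene_4", 1.5),
]


def aggregate_by_source(rows, score="mean"):
    """Summarise the z-statistics of each source over all its targets."""
    aggregators = {"mean": mean, "median": median}
    grouped = defaultdict(list)
    for source, _target, z in rows:
        grouped[source].append(z)
    return {source: aggregators[score](values) for source, values in grouped.items()}


scores = aggregate_by_source(pair_stats, score="mean")
print(scores)
```

A consistently negative or positive aggregate for a source suggests that its correlations with its targets shift in one direction between the two groups.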

`dcona.hypergeom`

It groups pairs with changed correlations by the source molecules and finds overrepresented groups using the hypergeometric test.

dcona.hypergeom(ztest_df, alternative='two-sided', oriented=True, output_dir=None)

Command-line usage:
You should launch ztest and then hypergeom with the same config file.
```
dcona hypergeom config.json
```
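
To make the overrepresentation idea concrete, here is a small standard-library sketch of an upper-tail hypergeometric test per source. It assumes the population is all tested pairs and the "successes" are pairs with a significantly changed correlation; the `alternative` and `oriented` options of `dcona.hypergeom` are not modelled. The data and the helper `hypergeom_upper_tail` are hypothetical.

```python
from collections import Counter
from math import comb


def hypergeom_upper_tail(k, total, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(total, successes, draws)."""
    upper = min(successes, draws)
    numerator = sum(
        comb(successes, i) * comb(total - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(total, draws)


# Hypothetical test results: (source, target, significantly_changed).
pairs = [
    ("miR-A", "g1", True), ("miR-A", "g2", True), ("miR-A", "g3", True),
    ("miR-B", "g1", False), ("miR-B", "g2", False), ("miR-B", "g4", True),
    ("miR-C", "g5", False), ("miR-C", "g6", False),
    ("miR-C", "g7", False), ("miR-C", "g8", False),
]

total = len(pairs)
changed = sum(flag for _, _, flag in pairs)
per_source = Counter(source for source, _, _ in pairs)
changed_per_source = Counter(source for source, _, flag in pairs if flag)

# A small p-value means the source has more changed pairs than expected by chance.
p_values = {
    source: hypergeom_upper_tail(changed_per_source[source], total, changed, size)
    for source, size in per_source.items()
}
print(p_values)
```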

Data structure for CLI launch

To run the tool from the command line, you need the following data:

config.json containing data filenames and tool usage parameters

{
	"data_path": "./example/data/data.csv",
	"description_path": "./example/data/description.csv",
	"interaction_path": "./example/data/interactions.csv",
	"output_dir_path": "./../output/",
	
	"reference_group": "Normal",
	"experimental_group": "Tumor",

	"correlation": "spearman",
	"alternative": "two-sided",
	"score": "mean",
	"repeats_number": 500,
	"process_number": 2
}

Both relative and absolute file paths can be used.
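
If you prefer to generate the config programmatically, a sketch like the following may help. It uses only the keys shown in the example above; writing to a temporary directory is an arbitrary choice for illustration. Dropping `interaction_path` corresponds to the exhaustive regime described in this README.

```python
import json
import os
import tempfile

# Keys and values copied from the example config above.
config = {
    "data_path": "./example/data/data.csv",
    "description_path": "./example/data/description.csv",
    "interaction_path": "./example/data/interactions.csv",
    "output_dir_path": "./../output/",
    "reference_group": "Normal",
    "experimental_group": "Tumor",
    "correlation": "spearman",
    "alternative": "two-sided",
    "score": "mean",
    "repeats_number": 500,
    "process_number": 2,
}

# Exhaustive regime: the same config without the interaction_path key.
exhaustive_config = {k: v for k, v in config.items() if k != "interaction_path"}

# Write the network-regime config to a temporary location (illustrative only).
config_path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(config_path, "w") as handle:
    json.dump(config, handle, indent=4)

print(config_path)
```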

Data description:

data_path : data.csv contains an expression table. Rows of the table correspond to genes, miRNAs, isomiRNAs and other molecules. Columns of the table correspond to patients (samples) taken from two different groups.

Structure of data.csv:

| | sample_1 | ... | sample_n |
| --- | --- | --- | --- |
| gene_1 | 1.2345 | ... | 1.2345 |
| ... | ... | ... | ... |
| gene_n | 1.2345 | ... | 1.2345 |

description_path : description.csv divides patients into two non-intersecting groups (e.g. Normal and Tumor patients). A patient must not belong to both groups simultaneously.

Structure of description.csv:

| Sample | Group |
| --- | --- |
| sample_1 | condition_1 |
| ... | ... |
| sample_n | condition_2 |

Column names have to be exactly Sample and Group.
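
Before launching the tool, it can be useful to check a description file against the requirements above. The following is a sketch under assumptions: it treats the file as comma-separated, and the helper `read_groups` and the sample contents are hypothetical, not part of DCoNA.

```python
import csv
import io

# Hypothetical file contents; the comma delimiter is an assumption.
description_text = """Sample,Group
sample_1,Normal
sample_2,Normal
sample_3,Tumor
"""


def read_groups(handle):
    """Return {sample: group}, checking column names and group uniqueness."""
    reader = csv.DictReader(handle)
    columns = set(reader.fieldnames or [])
    if not {"Sample", "Group"} <= columns:
        raise ValueError(f"expected columns Sample and Group, got {reader.fieldnames}")
    groups = {}
    for row in reader:
        sample, group = row["Sample"], row["Group"]
        # A sample must not be assigned to two different groups.
        if groups.get(sample, group) != group:
            raise ValueError(f"sample {sample!r} belongs to more than one group")
        groups[sample] = group
    return groups


groups = read_groups(io.StringIO(description_text))
print(groups)
```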
interaction_path (optional): interaction.csv contains source/target pairs; correlations will be computed for these pairs (network mode). To run in exhaustive mode, delete the interaction_path line from the config file.

Structure of interaction.csv:

| Source | Target |
| --- | --- |
| source_gene_1 | target_gene_2 |
| ... | ... |
| source_gene_n | target_gene_n |

Column names have to be exactly Source and Target.
output_dir_path is a path to an output directory.

Usage parameters:

reference_group, experimental_group are names of the patient groups.
correlation : spearman or pearson, defines the type of correlation that will be used in the tool.
alternative : two-sided, less or greater.

TODO: describe the parameter meaning in ztest and zscore regimes.

Network and exhaustive regimes

DCoNA has two working regimes:

Network (interactions) regime - performs calculations only on given gene pairs. Requires an interaction.csv file.
Exhaustive (all vs all) regime - generates all possible gene pairs from genes listed in data.csv and performs calculations. An interaction.csv file is not needed.
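
The difference between the two regimes is essentially which pairs get tested. The sketch below contrasts a given pair list with an all-vs-all enumeration. The gene names are hypothetical, and treating pairs as unordered is an assumption, since this README does not say whether DCoNA distinguishes (a, b) from (b, a) in the exhaustive regime.

```python
from itertools import combinations

# Hypothetical row names of data.csv.
genes = ["gene_1", "gene_2", "gene_3", "gene_4"]

# Network regime: only the listed (source, target) pairs are tested.
network_pairs = [("gene_1", "gene_2"), ("gene_1", "gene_4")]

# Exhaustive regime: every pair of rows (unordered pairs; an assumption).
exhaustive_pairs = list(combinations(genes, 2))

# The number of unordered pairs is n * (n - 1) / 2.
n = len(genes)
assert len(exhaustive_pairs) == n * (n - 1) // 2

print(len(network_pairs), len(exhaustive_pairs))
```

The number of pairs grows quadratically with the number of rows: under the same unordered-pair assumption, 20,000 genes would give 199,990,000 pairs, so the exhaustive regime involves far more tests than a curated interaction list.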
