This repository contains the code for the paper "Identification of causal effects of neuroanatomy on cognitive decline requires modeling unobserved confounders." If you use this code, please cite:
@article{Poelsterl2022-adj,
title = {{Identification of causal effects of neuroanatomy on cognitive decline requires modeling unobserved confounders}},
author = {P{\"{o}}lsterl, Sebastian and Wachinger, Christian},
journal = {Alzheimer's Dement},
year = {2022},
pages = {},
doi = {10.1002/alz.12825},
}
It is recommended to run the code via Docker.
If you want to use the code for development, you can use conda
to create an environment with all dependencies from requirements.yaml.
Pre-built packages are provided, but you can also build the Docker image yourself:
- Install Docker.
- Build the Docker image causalad:
docker build -t causalad .
This section provides an overview of how to obtain the data to
reproduce the results presented in the paper. As the data cannot
be shared publicly, you will have to perform the data processing
yourself to fill in the missing values of data/adni-data-template.csv
and data/ukb-data-template.csv. These files list the patient ID and,
for ADNI, the visit and image ID, which uniquely identify the data
you need to obtain. We expect that you have been approved to access
the data and are familiar with the data portals of ADNI and UK Biobank.
- Log in to the ADNI Data Portal.
- Download ADNIMERGE.CSV and UPENNBIOMK_MASTER.csv.
- Use ABETA, PTAU, TAU from UPENNBIOMK_MASTER.csv to determine which patients have Alzheimer's pathology by creating a column ATN_status that describes the A/T/N scheme, e.g. A+/T+/N- if ABETA ≤ 192, PTAU ≥ 23, and TAU < 93, following the thresholds from Ekman et al., 2018:
The individual CSF values were considered pathological (+) if ≤192 pg/ml for Aβ42, ≥93 pg/ml for t-tau, and ≥23 pg/ml for p-tau.
- Download T1 structural brain MRI from the ADNI Data Portal and segment each with FreeSurfer 5.3 to obtain volume and thickness measurements.
- Fill in the values of data/adni-data-template.csv by taking ABETA, PTAU, TAU from UPENNBIOMK_MASTER.csv, ATN_status from above, volume and thickness measurements computed by FreeSurfer, and the remaining variables from ADNIMERGE.CSV. Save the resulting file as data/adni-data.csv.
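The thresholding described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' processing code: the function name atn_status is hypothetical, it assumes the three CSF values have already been converted to plain numbers in pg/ml, and it returns None when a value is missing.

```python
import math
from typing import Optional

# Thresholds in pg/ml, as quoted above from Ekman et al., 2018.
ABETA_MAX = 192.0  # A+ if ABETA <= 192
PTAU_MIN = 23.0    # T+ if PTAU >= 23
TAU_MIN = 93.0     # N+ if TAU >= 93


def atn_status(
    abeta: Optional[float], ptau: Optional[float], tau: Optional[float]
) -> Optional[str]:
    """Return a label such as 'A+/T+/N-', or None if any value is missing.

    Hypothetical helper for illustration; not part of this repository.
    """
    values = (abeta, ptau, tau)
    if any(v is None or math.isnan(v) for v in values):
        return None
    a = "+" if abeta <= ABETA_MAX else "-"
    t = "+" if ptau >= PTAU_MIN else "-"
    n = "+" if tau >= TAU_MIN else "-"
    return f"A{a}/T{t}/N{n}"


print(atn_status(180.0, 30.0, 80.0))  # A+/T+/N-
```

Applying the function row by row to the ABETA, PTAU, and TAU columns yields the ATN_status column. Entries that are not plain numbers would need to be cleaned first.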
- Log in to the UK Biobank Access Management System.
- Download data on Sex, and Age at first imaging visit.
- Download T1 structural brain MRI and segment each with FreeSurfer 5.3 to obtain volume measurements.
- Fill in the values of data/ukb-data-template.csv and save the result as data/ukb-data.csv.
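Before running the experiments, it can help to confirm that no cells of a filled-in file are still empty. The sketch below uses only the Python standard library. The helper name find_missing_cells and the markers treated as missing (empty string, NA, NaN) are assumptions, since the templates may mark missing values differently.

```python
import csv
from typing import List, Tuple

# Assumed markers for a cell that still needs to be filled in.
MISSING_MARKERS = {"", "na", "nan"}


def find_missing_cells(path: str) -> List[Tuple[int, str]]:
    """Return (row number, column name) for every cell that looks unfilled.

    Row numbers are 1-based and count data rows, excluding the header.
    Hypothetical helper for illustration; not part of this repository.
    """
    missing = []
    with open(path, newline="") as fin:
        for row_number, row in enumerate(csv.DictReader(fin), start=1):
            for column, value in row.items():
                if column is None:
                    continue  # extra cells beyond the header are ignored
                if value is None or value.strip().lower() in MISSING_MARKERS:
                    missing.append((row_number, column))
    return missing
```

For example, find_missing_cells("data/ukb-data.csv") should return an empty list once the file is complete, and the same check applies to data/adni-data.csv.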
- Make sure you created data/adni-data.csv as outlined above.
- The entire workflow is summarized in a shell script, which can be executed by running:
docker run -it --rm \
-v $(pwd)/data:/workspace/data \
-v $(pwd)/outputs:/workspace/outputs \
ghcr.io/ai-med/causal-effects-in-alzheimers-continuum:v0.2.0 \
./adni-experiments.sh
- Upon completion, the main results will be available in the outputs/adni/results folder.
  - plot-betareg-coef_outputs.ipynb: This notebook will contain a figure comparing the estimated credible intervals for each model.
  - estimate_ace_outputs.ipynb: This notebook will contain figures comparing the average causal effect (ACE) across models.
- The estimated substitute confounders will be stored in the outputs/adni/subst_conf folder.
  - adni_bpmf_subst_conf_dim6.h5: Transformed features with 6 substitute confounders estimated by BPMF.
  - adni_ppca_subst_conf_dim6.h5: Transformed features with 6 substitute confounders estimated by PPCA.
- The estimated mean coefficients for all models will be stored in the outputs/adni/models folder.
  - coef_adni_bpmf_subst_conf_dim6.csv: Estimated coefficients of the Beta-regression model when accounting for observed confounders and 6 substitute confounders estimated by BPMF.
  - coef_adni_ppca_subst_conf_dim6.csv: Estimated coefficients of the Beta-regression model when accounting for observed confounders and 6 substitute confounders estimated by PPCA.
  - coef_adni_original.csv: Estimated coefficients of the Beta-regression model when ignoring confounding.
  - coef_adni_age_residualized.csv: Estimated coefficients of the Beta-regression model when accounting for observed confounders via the regress-out approach.
  - coef_adni_combat_residualized.csv: Estimated coefficients of the Beta-regression model when harmonizing volume and thickness measures via the ComBat approach.
- Make sure you created data/ukb-data.csv as outlined above.
- To execute all steps of the simulation study, you will need at least 64 GB of RAM. Running the entire pipeline can take days. Start it by executing:
docker run -it --rm \
-v $(pwd)/data:/workspace/data \
-v $(pwd)/outputs:/workspace/outputs \
ghcr.io/ai-med/causal-effects-in-alzheimers-continuum:v0.2.0 \
./ukb-experiments.sh
- The main results of the experiments will be stored in the outputs/ukb/results folder.
  - ukb_visualize_output.ipynb: The notebook contains a table of the Bayesian p-values for each model and latent dimension. Moreover, it will contain a table summarizing the bias in the estimates of the causal effects compared to the true causal effects.
  - experiments_summary.csv: Table summarizing the bias in the estimates of the causal effects compared to the true causal effects.
  - all_experiments.h5: Contains the bias of estimated causal effects for each individual experiment, i.e. ratio of direct to confounding effect, model, and repetition. To load the results for the direct to confounding effect ratio 10/1, use pandas:
import pandas as pd
results = pd.read_hdf("all_experiments.h5", key="x10_z1")
The rows are the coefficients, and the columns are organized hierarchically such that the first level is the experiment, the second level the model, and the third level the repetition.
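Given this layout, one way to condense the repetitions is to average over the third column level. The following is a minimal sketch on a small synthetic frame, not part of the repository. The helper name mean_over_repetitions and the experiment and model labels are made up for illustration, and the levels are addressed by position because their names in all_experiments.h5 are not documented here.

```python
import numpy as np
import pandas as pd


def mean_over_repetitions(results: pd.DataFrame) -> pd.DataFrame:
    """Average over the third column level (repetition).

    Hypothetical helper; it assumes a three-level column index ordered as
    (experiment, model, repetition), addressed by position.
    """
    # Transpose so the column levels become row levels, group by the first
    # two levels, average, and transpose back.
    return results.T.groupby(level=[0, 1]).mean().T


# Synthetic stand-in for the frame returned by pd.read_hdf; labels are made up.
columns = pd.MultiIndex.from_product([["exp_a"], ["model_1", "model_2"], [0, 1]])
demo = pd.DataFrame(
    np.array([[1.0, 3.0, 10.0, 20.0], [2.0, 4.0, 30.0, 50.0]]),
    index=["coef_1", "coef_2"],
    columns=columns,
)
summary = mean_over_repetitions(demo)
print(summary)
```

The result keeps one column per (experiment, model) pair, with rows still indexed by coefficient.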