Update Tutorial_20221025.md
chenweng1991 authored Oct 27, 2022
1 parent 0915b0c commit cbc9394
Showing 1 changed file with 41 additions and 2 deletions.
43 changes: 41 additions & 2 deletions Tutorial_20221025.md
@@ -77,7 +77,12 @@ multiqc fastqc
```console
$MyMultiome/MultiomeATAC_mito.sh -n Tutorial_Mito -1 Example.R1.fq.gz -2 Example.R2.fq.gz -i Example.i5.fq.gz -c 0 -t 8 -m $MyMultiome -b Genome/hg38.mitoMask # add -q for quick mode, which skips the QC steps
```
After this step, a QC plot is generated:
![Old1_HSPC_Mito QCplot](https://user-images.githubusercontent.com/43254272/198318936-1c7f1f4b-c203-4b93-8997-b1d82adc3b62.png)

## Step3: Parse the Cellranger qualified cell barcode
In a ReDeeM experiment, the ATAC and RNA data are analyzed with Cell Ranger ARC. The resulting barcodes.tsv.gz is parsed here.
Because the barcodes in barcodes.tsv.gz are RNA barcodes, they need to be translated into ATAC barcodes (via RNAbc2ATAC.R) so that they can be matched to the mtDNA data.
```console
ln -s $REDEEM_V/source/barcodes.tsv.gz ./
Rscript $MyMultiome/Helpers/RNAbc2ATAC.R barcodes.tsv.gz Tutorial_atac.barcodes.tsv
@@ -104,6 +109,40 @@ $mitoConsensus/Finalize.sh $WD $Threads $mitoConsensus

# Expected result
- Location: The results are saved in $WD/final
- Major 5 files: QualifiedTotalCts, RawGenotypes.Total.StrandBalance(Least stringent), RawGenotypes.VerySensitive.StrandBalance, RawGenotypes.Sensitive.StrandBalance, RawGenotypes.Specific.StrandBalance (Most Stringent)
- Major 5 files:
  - QualifiedTotalCts
  - RawGenotypes.Total.StrandBalance (least stringent)
  - RawGenotypes.VerySensitive.StrandBalance (less stringent)
  - RawGenotypes.Sensitive.StrandBalance (stringent)
  - RawGenotypes.Specific.StrandBalance (most stringent)
- QualifiedTotalCts is a table with 6 columns that shows the mtDNA coverage per position per cell

| Cell barcode | Coordinate on mt genome | # unique frag (total) | # unique frag (less stringent) | # unique frag (stringent) | # unique frag (very stringent) |
|--------------|-------------------------|-----------------------|--------------------------------|---------------------------|--------------------------------|
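
As a rough illustration of how the six columns can be used, the sketch below computes the mean number of unique fragments per listed position (column 3, the "total" count) for each cell. The sample rows, the file name `QualifiedTotalCts.sample`, the helper name `mean_total_coverage`, and the assumption that fields are tab-separated are all hypothetical; check them against your own output before reusing this.

```shell
# Hypothetical sample in the 6-column layout described above.
# Assumption: fields are tab-separated and there is no header line.
printf 'CellA\t1\t10\t8\t6\t4\nCellA\t2\t20\t16\t12\t8\nCellB\t1\t5\t5\t4\t3\n' > QualifiedTotalCts.sample

# Mean of column 3 (# unique frag, total) over the positions listed for each cell barcode.
mean_total_coverage() {
  awk -F'\t' '{ sum[$1] += $3; n[$1]++ }
       END { for (bc in sum) printf "%s\t%.1f\n", bc, sum[bc] / n[bc] }' "$1" | sort
}

mean_total_coverage QualifiedTotalCts.sample
```

Swapping `$3` for `$4`, `$5`, or `$6` would give the same summary at the other stringency levels. Note that the mean is taken only over positions that appear in the file.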

- RawGenotypes is a table with 14 columns that shows the consensus variant calls. Each row is a molecule with a potential variant

|MoleculeID | CellBC | Pos | Variant | V | Ref | FamSize | V-counts | CSS | DB_Cts | SG_Cts | Is+ | Is- | TotalDepth|
|-----------|--------|-----|--------|---------|----------|---------|---------|-----|--------|---------|-----|-----|-----------|

1. MoleculeID: Cell barcode + start + end, the identifier that defines a molecule
2. CellBC: Cell barcode
3. Pos: The coordinate of the variant
4. Variant: A description of the variant
5. V: The variant base called at Pos
6. Ref: The reference base at Pos
7. FamSize: The consensus family size, i.e. the total number of PCR copies for the given molecule
8. V-counts: Number of PCR copies that support the variant
9. CSS: Consensus score, i.e. the proportion of PCR copies that support the variant
10. DB_Cts: Number of double-cover copies, in which the position is sequenced by both Read1 and Read2
11. SG_Cts: Number of single-cover copies, in which the position is sequenced by only Read1 or Read2
12. Is+: Whether the variant is detected on the plus strand
13. Is-: Whether the variant is detected on the minus strand
14. TotalDepth: Total number of unique fragments covering this position in the given cell
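
To make the relationship between FamSize, V-counts, and CSS concrete, the sketch below recomputes CSS as V-counts / FamSize (column 8 divided by column 7) and prints it next to the reported CSS (column 9). The sample rows are made up, and the file name `RawGenotypes.sample`, the helper name `recompute_css`, the tab-separated layout, the `Pos_Ref_V` spelling of the Variant column, and the 1/0 coding of Is+ and Is- are assumptions for illustration only.

```shell
# Hypothetical rows in the 14-column layout described above.
# Assumptions: tab-separated fields, no header line, made-up values.
printf 'CellA_100_250\tCellA\t150\t150_A_G\tG\tA\t4\t3\t0.75\t2\t1\t1\t0\t30\n' >  RawGenotypes.sample
printf 'CellB_300_420\tCellB\t310\t310_T_C\tC\tT\t2\t2\t1\t2\t0\t1\t1\t12\n'   >> RawGenotypes.sample

# Print MoleculeID, Variant, reported CSS (column 9), and CSS recomputed as V-counts / FamSize.
recompute_css() {
  awk -F'\t' '{ printf "%s\t%s\t%s\t%.2f\n", $1, $4, $9, $8 / $7 }' "$1"
}

recompute_css RawGenotypes.sample
```

In the first sample row, 3 of 4 PCR copies support the variant, so the recomputed score is 0.75; in the second, both copies agree and the score is 1.00.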

- Strand-biased variants have been removed
![StrandBias](https://user-images.githubusercontent.com/43254272/198328824-40977739-6fdf-4813-9461-9c5bee18d53a.png)

- QualifiedTotalCts and RawGenotypes.* are the inputs for REDEEM-R for downstream mutation filtering and phylogenetic tree reconstruction

