Description
I have two datasets, A and B. Dataset A contains a high-abundance OTU (id: OTU_54). To compare the abundance of OTU_54 between the two datasets, I pooled the raw sequencing data of A and B (=> A+B) and clustered it following the example steps provided on the website, with the same parameters as when A and B were analyzed individually. In the otutab(A+B) produced by the new clustering, the OTU_54 from the original A dataset has very low abundance.
So I ran BLAST between OTU_54 and all.nonchimeras.fasta (the file before clustering at 97% similarity) of A, B, and A+B, filtered the BLAST results by identity > 97% and alignment length > 300, and counted the matches. A+B has far fewer OTU_54 matches than either A or B alone.
wc -l filt_nonchim* # filtered blast results.
76966 filt_nonchim18.txt #generated from datasetB
157240 filt_nonchim19.txt #generated from datasetA
12369 filt_nonchim.txt #generated from A+B
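The filtering step described above can be sketched roughly as follows. This is only an illustration, not the exact commands that were run: it assumes the BLAST results are in tabular format (`-outfmt 6`), where column 3 is percent identity and column 4 is alignment length. The function name `filter_blast_hits` and the sample rows are hypothetical, and the rows are shortened to the first four columns for brevity.

```python
import csv
import io


def filter_blast_hits(lines, min_identity=97.0, min_length=300):
    """Keep tabular BLAST rows with identity > min_identity and
    alignment length > min_length (both strict, as in the post)."""
    kept = []
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        identity = float(row[2])  # column 3: percent identity (assumed outfmt 6)
        length = int(row[3])      # column 4: alignment length (assumed outfmt 6)
        if identity > min_identity and length > min_length:
            kept.append(row)
    return kept


# Hypothetical sample rows: query id, subject id, identity, alignment length.
sample = io.StringIO(
    "read1\tOTU_54\t99.5\t410\n"
    "read2\tOTU_54\t96.8\t405\n"
    "read3\tOTU_54\t98.2\t250\n"
    "read4\tOTU_54\t97.4\t301\n"
)
hits = filter_blast_hits(sample)
print(len(hits))  # number of rows passing both thresholds
```

With a real file, the same function could be called on an open file handle, and `len(hits)` would then correspond to the line counts reported by `wc -l` above, provided the thresholds and column layout match.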
How can I address or optimize the analysis process? Thanks!