Colabfold template selection #249

Open
GISTAL opened this issue Aug 21, 2024 · 0 comments

GISTAL commented Aug 21, 2024

Hello everyone,

I am facing the following situation:
I am trying to run colabfold locally. For that, I have disabled the MSA search and am using a large, self-created template folder with 792 templates (x002-x793). During the run, I see that only a portion of the templates (about 100) were used (see below). I tried the same thing with another protein sequence and a different template folder (892 templates), but again only some of the templates (about 50%) were loaded.

My questions:

  1. What determines the template selection? Sequence length? Sequence similarity between the protein sequence and the templates, and/or among the templates themselves? Can it be adjusted so that all templates are used?
  2. I disabled the MSA search because, as I understand it, an MSA makes the contribution of templates negligible. Is that correct?

Run
colabfold_batch --amber --templates --custom-template-path TempGISTAL --recycle-early-stop-tolerance 0.1 --num-recycle 20 --num-models 3 --rank plddt --use-gpu-relax TARGETGISTAL.a3m outputdir_TARGETGISTALCF/ &

log file
2024-08-21 20:57:07,763 Running colabfold 1.5.5 (fdf3b235b88746681c46ea12bcded76ecf8e1f76)
2024-08-21 20:57:07,925 Running on GPU
2024-08-21 20:57:09,214 Found 9 citations for tools or databases
2024-08-21 20:59:36,289 Query 1/1: TARGET (length 653)
2024-08-21 21:00:20,122 Sequence 0 found templates: ['x481_A', 'x481_B', 'x492_A', 'x492_B', 'x461_A', 'x461_B', 'x476_A', 'x476_B', 'x514_A', 'x514_B', 'x508_A', 'x508_B', 'x474_A', 'x474_B', 'x478_A', 'x478_B', 'x505_A', 'x505_B', 'x191_A', 'x191_B', 'x193_A', 'x193_B', 'x176_A', 'x176_B', 'x216_A', 'x216_B', 'x175_A', 'x175_B', 'x213_A', 'x213_B', 'x206_A', 'x206_B', 'x205_A', 'x205_B', 'x177_A', 'x177_B', 'x198_A', 'x198_B', 'x221_A', 'x221_B', 'x201_A', 'x201_B', 'x641_A', 'x641_B', 'x640_A', 'x640_B', 'x590_A', 'x590_B', 'x638_A', 'x638_B', 'x601_A', 'x601_B', 'x612_A', 'x612_B', 'x598_A', 'x598_B', 'x642_A', 'x642_B', 'x595_A', 'x595_B', 'x608_A', 'x608_B', 'x622_A', 'x622_B', 'x624_A', 'x624_B', 'x633_A', 'x633_B', 'x626_A', 'x626_B', 'x637_A', 'x637_B', 'x736_A', 'x736_B', 'x758_A', 'x758_B', 'x756_A', 'x756_B', 'x752_A', 'x752_B', 'x786_A', 'x786_B', 'x740_A', 'x740_B', 'x791_A', 'x791_B', 'x768_A', 'x768_B', 'x781_A', 'x781_B', 'x734_A', 'x734_B', 'x742_A', 'x742_B', 'x743_A', 'x743_B', 'x779_A', 'x779_B', 'x277_A', 'x277_B', 'x279_A', 'x279_B', 'x254_A', 'x254_B', 'x275_A', 'x275_B', 'x238_A', 'x238_B', 'x248_A', 'x248_B']
2024-08-21 21:01:03,640 Sequence 1 found templates: ['x481_A', 'x481_B', 'x492_A', 'x492_B', 'x461_A', 'x461_B', 'x476_A', 'x476_B', 'x514_A', 'x514_B', 'x508_A', 'x508_B', 'x474_A', 'x474_B', 'x478_A', 'x478_B', 'x505_A', 'x505_B', 'x191_A', 'x191_B', 'x193_A', 'x193_B', 'x176_A', 'x176_B', 'x216_A', 'x216_B', 'x175_A', 'x175_B', 'x213_A', 'x213_B', 'x206_A', 'x206_B', 'x205_A', 'x205_B', 'x177_A', 'x177_B', 'x198_A', 'x198_B', 'x221_A', 'x221_B', 'x201_A', 'x201_B', 'x641_A', 'x641_B', 'x640_A', 'x640_B', 'x590_A', 'x590_B', 'x638_A', 'x638_B', 'x601_A', 'x601_B', 'x612_A', 'x612_B', 'x598_A', 'x598_B', 'x642_A', 'x642_B', 'x595_A', 'x595_B', 'x608_A', 'x608_B', 'x622_A', 'x622_B', 'x624_A', 'x624_B', 'x633_A', 'x633_B', 'x626_A', 'x626_B', 'x637_A', 'x637_B', 'x736_A', 'x736_B', 'x758_A', 'x758_B', 'x756_A', 'x756_B', 'x752_A', 'x752_B', 'x786_A', 'x786_B', 'x740_A', 'x740_B', 'x791_A', 'x791_B', 'x768_A', 'x768_B', 'x781_A', 'x781_B', 'x734_A', 'x734_B', 'x742_A', 'x742_B', 'x743_A', 'x743_B', 'x779_A', 'x779_B', 'x277_A', 'x277_B', 'x279_A', 'x279_B', 'x254_A', 'x254_B', 'x275_A', 'x275_B', 'x238_A', 'x238_B', 'x248_A', 'x248_B']
2024-08-21 21:01:04,137 Setting max_seq=7, max_extra_seq=1
2024-08-21 21:02:09,765 alphafold2_multimer_v3_model_1_seed_000 recycle=0 pLDDT=69.1 pTM=0.377 ipTM=0.368
...
2024-08-21 21:06:56,453 alphafold2_multimer_v3_model_1_seed_000 recycle=20 pLDDT=69 pTM=0.4 ipTM=0.391 tol=1.55
2024-08-21 21:06:56,454 alphafold2_multimer_v3_model_1_seed_000 took 346.6s (20 recycles)
2024-08-21 21:07:12,333 alphafold2_multimer_v3_model_2_seed_000 recycle=0 pLDDT=66.8 pTM=0.381 ipTM=0.376
...
2024-08-21 21:12:15,761 alphafold2_multimer_v3_model_2_seed_000 recycle=20 pLDDT=73.1 pTM=0.381 ipTM=0.377 tol=0.186
2024-08-21 21:12:15,761 alphafold2_multimer_v3_model_2_seed_000 took 318.7s (20 recycles)
2024-08-21 21:12:31,470 alphafold2_multimer_v3_model_3_seed_000 recycle=0 pLDDT=57.4 pTM=0.391 ipTM=0.377
...
2024-08-21 21:17:23,735 alphafold2_multimer_v3_model_3_seed_000 recycle=20 pLDDT=58.7 pTM=0.366 ipTM=0.36 tol=0.626
2024-08-21 21:17:23,736 alphafold2_multimer_v3_model_3_seed_000 took 307.3s (20 recycles)
2024-08-21 21:17:24,306 reranking models by 'plddt' metric
2024-08-21 21:17:25,555 Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.
2024-08-21 21:17:45,193 Relaxation took 20.9s
2024-08-21 21:17:45,194 rank_001_alphafold2_multimer_v3_model_2_seed_000 pLDDT=73.1 pTM=0.381 ipTM=0.377
2024-08-21 21:18:05,214 Relaxation took 20.0s
2024-08-21 21:18:05,214 rank_002_alphafold2_multimer_v3_model_1_seed_000 pLDDT=69 pTM=0.4 ipTM=0.391
2024-08-21 21:18:29,338 Relaxation took 24.0s
2024-08-21 21:18:29,339 rank_003_alphafold2_multimer_v3_model_3_seed_000 pLDDT=58.7 pTM=0.366 ipTM=0.36
2024-08-21 21:18:30,773 Done
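
To quantify how many distinct templates the log actually reports, the "found templates" lines can be parsed. The following is only a minimal sketch: it assumes the log format shown above (`Sequence N found templates: [...]`) and that each entry is named `<template>_<chain>`. The function name `count_templates` is hypothetical and not part of colabfold.

```python
import ast
import re

# Matches log lines such as:
#   2024-08-21 21:00:20,122 Sequence 0 found templates: ['x481_A', 'x481_B']
LINE_RE = re.compile(r"Sequence (\d+) found templates: (\[.*\])")


def count_templates(log_text):
    """Return {sequence_index: (n_chain_entries, n_unique_templates)}."""
    counts = {}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue
        # The list is printed as a Python literal, so it can be parsed safely.
        entries = ast.literal_eval(match.group(2))
        # Assumption: the chain ID is whatever follows the last underscore.
        templates = {entry.rsplit("_", 1)[0] for entry in entries}
        counts[int(match.group(1))] = (len(entries), len(templates))
    return counts


example = (
    "2024-08-21 21:00:20,122 Sequence 0 found templates: "
    "['x481_A', 'x481_B', 'x492_A', 'x492_B']"
)
print(count_templates(example))  # {0: (4, 2)}
```

Running this on the full log file separates the number of chain entries from the number of distinct template structures, which makes it easier to compare against the number of files in the template folder.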

Kind regards,
GISTAL.
