A trainable PyTorch reproduction of AlphaFold 3.
For more information on the model's performance and capabilities, see our technical report.
You can follow us on Twitter or join the conversation on our Discord server.
Follow these steps to set up and run Protenix:
- Install Docker (with GPU support). Ensure that Docker is installed and configured with GPU support:
  - Install Docker if it is not already installed.
  - Install the NVIDIA Container Toolkit to enable GPU support.
  - Verify the setup with:
    docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
- Pull the Docker image, which was built from this Dockerfile:
  docker pull ai4s-cn-beijing.cr.volces.com/infra/protenix:v0.0.1
- Clone this repository and cd into it:
  git clone https://github.com/bytedance/protenix.git
  cd ./protenix
  pip install -e .
- Run Docker with an interactive shell:
  docker run --gpus all -it -v $(pwd):/workspace -v /dev/shm:/dev/shm ai4s-cn-beijing.cr.volces.com/infra/protenix:v0.0.1 /bin/bash
After running the above commands, you’ll be inside the container’s environment and can execute commands as you would in a normal Linux terminal.
# You may need to install libxrender1 and libxext6 first. On Debian, run:
# apt-get update
# apt-get install libxrender1
# apt-get install libxext6
pip3 install protenix
- Custom CUDA layernorm kernels, modified from FastFold and OneFlow, speed up training by about 30%-50%, depending on the training stage. To use this feature, run the following command:
  export LAYERNORM_TYPE=fast_layernorm
  If the environment variable LAYERNORM_TYPE is set to fast_layernorm, the model will use the layernorm we have developed; otherwise, the naive PyTorch layernorm will be used. The kernels will be compiled when fast_layernorm is called for the first time.
- The DeepSpeed DS4Sci_EvoformerAttention kernel is a memory-efficient attention kernel developed as part of a collaboration between OpenFold and the DeepSpeed4Science initiative. To use this feature, simply pass:
  --use_deepspeed_evo_attention true
  on the command line. DS4Sci_EvoformerAttention is implemented on top of CUTLASS. You need to clone the CUTLASS repository and specify its path in the environment variable CUTLASS_PATH. The Dockerfile already includes this setting:
  RUN git clone -b v3.5.1 https://github.com/NVIDIA/cutlass.git /opt/cutlass
  ENV CUTLASS_PATH=/opt/cutlass
  The kernels will be compiled when DS4Sci_EvoformerAttention is called for the first time.
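The LAYERNORM_TYPE switch described above can be pictured with a small sketch. Only the variable name `LAYERNORM_TYPE` and the value `fast_layernorm` come from this README; the function name `resolve_layernorm_type` and the fallback label `"torch"` are hypothetical and chosen for illustration, so the actual dispatch code in Protenix may look different.

```python
import os

FAST_LAYERNORM = "fast_layernorm"  # value named in this README


def resolve_layernorm_type(environ=None):
    """Return which layernorm implementation would be selected.

    Hypothetical helper: it mirrors the rule stated above, where only
    LAYERNORM_TYPE=fast_layernorm enables the custom CUDA kernels and any
    other value (or an unset variable) falls back to PyTorch's layernorm.
    """
    env = os.environ if environ is None else environ
    if env.get("LAYERNORM_TYPE") == FAST_LAYERNORM:
        return FAST_LAYERNORM
    return "torch"  # illustrative label for the PyTorch fallback


print(resolve_layernorm_type({"LAYERNORM_TYPE": "fast_layernorm"}))
print(resolve_layernorm_type({}))
```

Passing a plain dictionary instead of reading `os.environ` directly keeps the rule easy to check without changing the real environment.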
To download the wwPDB dataset and the preprocessed training data, you need at least 1 TB of disk space.
Use the following commands to download the preprocessed wwPDB training databases:
wget -P /af3-dev/release_data/ https://af3-dev.tos-cn-beijing.volces.com/release_data.tar.gz
tar -xzvf /af3-dev/release_data/release_data.tar.gz -C /af3-dev/release_data/
rm /af3-dev/release_data/release_data.tar.gz
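Because the download is large, it may be worth checking free disk space before starting. The sketch below is one possible pre-flight check, not part of Protenix; the helper name `has_enough_space` is hypothetical, and reading "1T" as 10^12 bytes is an assumption.

```python
import shutil

REQUIRED_BYTES = 10**12  # assumption: "1T" interpreted as 1 TB (decimal)


def has_enough_space(path, required_bytes=REQUIRED_BYTES):
    """Return True if the filesystem holding `path` has enough free space."""
    return shutil.disk_usage(path).free >= required_bytes


# "." is a placeholder; point this at the intended download directory.
print(has_enough_space("."))
```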
The data should be placed in the /af3-dev/release_data/ directory. You can also download it to a different directory, but remember to modify DATA_ROOT_DIR in configs/configs_data.py accordingly. The data hierarchy after extraction is as follows:
├── components.v20240608.cif [408M] # ccd source file
├── components.v20240608.cif.rdkit_mol.pkl [121M] # rdkit Mol object generated by ccd source file
├── indices [33M] # chain or interface entries
├── mmcif [283G] # raw mmcif data
├── mmcif_bioassembly [36G] # preprocessed wwPDB structural data
├── mmcif_msa [450G] # msa files
├── posebusters_bioassembly [42M] # preprocessed posebusters structural data
├── posebusters_mmcif [361M] # raw mmcif data
├── recentPDB_bioassembly [1.5G] # preprocessed recentPDB structural data
└── seq_to_pdb_index.json [45M] # sequence to pdb id mapping file
With the above data, you can run the training demo from scratch. components.v20240608.cif and components.v20240608.cif.rdkit_mol.pkl are also used in the inference pipeline to generate the CCD reference features. If you only want to run inference, the full released data is not necessary; you can download these two files separately:
wget -P /af3-dev/release_data/ https://af3-dev.tos-cn-beijing.volces.com/release_data/components.v20240608.cif
wget -P /af3-dev/release_data/ https://af3-dev.tos-cn-beijing.volces.com/release_data/components.v20240608.cif.rdkit_mol.pkl
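After downloading, a quick check that the expected entries are in place can save a failed run. The following is a minimal sketch, assuming the entry names from the directory listing above; the helper `missing_entries` is hypothetical and not part of Protenix, and the path should be changed if DATA_ROOT_DIR points elsewhere.

```python
from pathlib import Path

# Names taken from the directory listing in this README.
INFERENCE_FILES = [
    "components.v20240608.cif",
    "components.v20240608.cif.rdkit_mol.pkl",
]
TRAINING_ENTRIES = INFERENCE_FILES + [
    "indices",
    "mmcif",
    "mmcif_bioassembly",
    "mmcif_msa",
    "posebusters_bioassembly",
    "posebusters_mmcif",
    "recentPDB_bioassembly",
    "seq_to_pdb_index.json",
]


def missing_entries(data_root, inference_only=False):
    """List expected files or directories that are absent under `data_root`."""
    expected = INFERENCE_FILES if inference_only else TRAINING_ENTRIES
    root = Path(data_root)
    return [name for name in expected if not (root / name).exists()]


# Default location used in this README.
print(missing_entries("/af3-dev/release_data", inference_only=True))
```

An empty list means every expected entry was found; with `inference_only=True` only the two CCD files are checked.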
Data processing scripts are still being organized and prepared, and distillation data will be released in the future.
Use the following command to download the pretrained checkpoint [1.4G]:
wget -P /af3-dev/release_model/ https://af3-dev.tos-cn-beijing.volces.com/release_model/model_v1.pt
The checkpoint should be placed in the /af3-dev/release_model/ directory.
You can use notebooks/protenix_inference.ipynb to run the model inference.
You can run the script inference_demo.sh to do model inference:
bash inference_demo.sh
Arguments in this script are explained as follows:
- load_checkpoint_path: path to the model checkpoints.
- input_json_path: path to a JSON file that fully describes the input.
- dump_dir: path to a directory where the results of the inference will be saved.
- dtype: data type used in inference. Valid options include "bf16" and "fp32".
- use_deepspeed_evo_attention: whether to use the EvoformerAttention provided by DeepSpeed.
- use_msa: whether to use the MSA feature; the default is true. To disable the MSA feature, add --use_msa false to the inference_demo.sh script.
Alternatively, you can run inference with:
# run with the examples folder
protenix_infer --input_json_path examples/example.json --dump_dir ./output
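When several input files need to be processed, the command above can be generated once per file. The sketch below only builds the argument lists; it uses the two flags shown in this README (`--input_json_path` and `--dump_dir`), while the helper name `build_infer_commands` and the one-output-folder-per-input layout are assumptions made for illustration.

```python
from pathlib import Path


def build_infer_commands(json_dir, output_root):
    """Build one protenix_infer argument list per JSON file in `json_dir`."""
    commands = []
    for json_path in sorted(Path(json_dir).glob("*.json")):
        # Assumed layout: each input gets its own folder under output_root.
        dump_dir = Path(output_root) / json_path.stem
        commands.append([
            "protenix_infer",
            "--input_json_path", str(json_path),
            "--dump_dir", str(dump_dir),
        ])
    return commands


for cmd in build_infer_commands("examples", "./output"):
    print(" ".join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```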
Detailed information on the format of the input JSON file and the output files can be found here.
Predicted structures for the posebusters set are available at:
https://af3-dev.tos-cn-beijing.volces.com/pb_samples_release.tar.gz
After installation and data preparation, you can run the following command to train the model from scratch:
bash train_demo.sh
Key arguments in this script are explained as follows:
- dtype: data type used in training. Valid options include "bf16" and "fp32".
  - --dtype fp32: the model will be trained in full FP32 precision.
  - --dtype bf16: the model will be trained in BF16 mixed precision. By default, the SampleDiffusion, ConfidenceHead, Mini-rollout and Loss parts will still be trained in FP32 precision. If you want to train and infer the model in full BF16 mixed precision, pass the following arguments to train_demo.sh:
    --skip_amp.sample_diffusion_training false \
    --skip_amp.confidence_head false \
    --skip_amp.sample_diffusion false \
    --skip_amp.loss false \
- use_deepspeed_evo_attention: whether to use the EvoformerAttention provided by DeepSpeed, as mentioned above.
- ema_decay: the decay rate of the EMA; the default is 0.999.
- sample_diffusion.N_step: during evaluation, the number of steps for the diffusion process is reduced to 20 to improve efficiency.
- data.train_sets/data.test_sets: the datasets used for training and evaluation. If there are multiple datasets, separate them with commas.
- Some settings follow those in the AlphaFold 3 paper. The table below shows the training settings for the different fine-tuning stages:

  | Arguments | Initial training | Fine tuning 1 | Fine tuning 2 | Fine tuning 3 |
  |---|---|---|---|---|
  | train_crop_size | 384 | 640 | 768 | 768 |
  | diffusion_batch_size | 48 | 32 | 32 | 32 |
  | loss.weight.alpha_pae | 0 | 0 | 0 | 1.0 |
  | loss.weight.alpha_bond | 0 | 1.0 | 1.0 | 0 |
  | loss.weight.smooth_lddt | 1.0 | 0 | 0 | 0 |
  | loss.weight.alpha_confidence | 1e-4 | 1e-4 | 1e-4 | 1e-4 |
  | loss.weight.alpha_diffusion | 4.0 | 4.0 | 4.0 | 0 |
  | loss.weight.alpha_distogram | 0.03 | 0.03 | 0.03 | 0 |
  | train_confidence_only | False | False | False | True |
  | full BF16-mixed speed (A100, s/step) | ~12 | ~30 | ~44 | ~13 |
  | full BF16-mixed peak memory (G) | ~34 | ~35 | ~48 | ~24 |

  We recommend carrying out the training on A100-80G or H20/H100 GPUs. If you use full BF16-mixed precision training, the initial training stage can also be performed on A800-40G GPUs. For GPUs with smaller memory, such as the A30, you'll need to reduce the model size, for example by decreasing model.pairformer.nblocks and diffusion_batch_size.
- In this version, we do not use the template and RNA MSA features for training, as reflected in the default settings in configs/configs_base.py and configs/configs_data.py:
  --model.template_embedder.n_blocks 0 \
  --data.msa.enable_rna_msa false \
  These features will be considered in our future work.
- The model also supports distributed training with PyTorch’s torchrun. For example, if you’re running distributed training on a single node with 4 GPUs, you can use:
  torchrun --nproc_per_node=4 runner/train.py
  You can also pass other arguments with --<ARGS_KEY> <ARGS_VALUE> as needed.
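The per-stage training settings listed above can be turned into `--<ARGS_KEY> <ARGS_VALUE>` pairs programmatically. This is a minimal sketch: the values are copied from the table, but the stage labels, the helper names, and the lowercase rendering of booleans (by analogy with `--use_msa false`) are assumptions, and it has not been confirmed here that every key is accepted on the command line in this form.

```python
# Hypothetical labels for the four columns of the table.
STAGES = ["initial_training", "fine_tuning_1", "fine_tuning_2", "fine_tuning_3"]

# One row per argument, one value per stage (copied from the table).
TABLE = {
    "train_crop_size": [384, 640, 768, 768],
    "diffusion_batch_size": [48, 32, 32, 32],
    "loss.weight.alpha_pae": [0, 0, 0, 1.0],
    "loss.weight.alpha_bond": [0, 1.0, 1.0, 0],
    "loss.weight.smooth_lddt": [1.0, 0, 0, 0],
    "loss.weight.alpha_confidence": [1e-4, 1e-4, 1e-4, 1e-4],
    "loss.weight.alpha_diffusion": [4.0, 4.0, 4.0, 0],
    "loss.weight.alpha_distogram": [0.03, 0.03, 0.03, 0],
    "train_confidence_only": [False, False, False, True],
}


def format_value(value):
    """Render a setting as a command-line string."""
    # Assumption: booleans are passed in lowercase, as in `--use_msa false`.
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def stage_cli_args(stage):
    """Return ['--key', 'value', ...] for one training stage."""
    column = STAGES.index(stage)
    args = []
    for key, row in TABLE.items():
        args += [f"--{key}", format_value(row[column])]
    return args


print(" ".join(stage_cli_args("fine_tuning_3")))
```

Keeping the table as data makes it easy to compare stages or to append the resulting list to a `torchrun` invocation.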
If you want to fine-tune the model on a specific subset, such as an antibody dataset, you only need to provide a PDB list file and load the pretrained weights as finetune_demo.sh shows:
checkpoint_path="/af3-dev/release_model/model_v1.pt"
...
--load_checkpoint_path ${checkpoint_path} \
--load_checkpoint_ema_path ${checkpoint_path} \
--data.weightedPDB_before2109_wopb_nometalc_0925.base_info.pdb_list examples/subset.txt \
where subset.txt is a file containing PDB IDs such as:
6hvq
5mqc
5zin
3ew0
5akv
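A malformed list file is easy to catch before launching a fine-tuning run. The sketch below assumes one classic 4-character PDB ID per line (a digit followed by three alphanumeric characters) and normalizes IDs to lowercase to match the example; both the helper `parse_pdb_list` and the lowercase normalization are assumptions rather than documented Protenix behavior.

```python
import re

# Classic PDB IDs: one digit followed by three alphanumeric characters.
PDB_ID_PATTERN = re.compile(r"^[0-9][A-Za-z0-9]{3}$")


def parse_pdb_list(text):
    """Return lowercase PDB IDs from `text`, one per line, without duplicates."""
    ids = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        if not PDB_ID_PATTERN.match(entry):
            raise ValueError(
                f"line {line_number}: not a 4-character PDB ID: {entry!r}"
            )
        entry = entry.lower()
        if entry not in ids:
            ids.append(entry)
    return ids


print(parse_pdb_list("6hvq\n5mqc\n5zin\n3ew0\n5akv\n"))
```

To check a real file, pass its contents, for example `parse_pdb_list(open("examples/subset.txt").read())`.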
The implementation of the layernorm operators draws on OneFlow and FastFold. We used OpenFold for some module implementations, except for LayerNorm.
Please check Contributing for more details. If you encounter problems using Protenix, feel free to create an issue! We also welcome pull requests from the community.
Please check Code of Conduct for more details.
If you discover a potential security issue in this project, or think you may have discovered a security issue, we ask that you notify Bytedance Security via our security center or vulnerability reporting email.
Please do not create a public GitHub issue.
This project, including code and model parameters, is made available under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License. You can find details at: https://creativecommons.org/licenses/by-nc/4.0/
For commercial use, please reach out to us at [email protected] for the commercial license. We welcome all types of collaborations.