life sciences, msa, open source software, openfold, protein, protein folding, protein template
This dataset contains multiple sequence alignments (MSAs) and predicted structures for 13 million long monomers (sequence length >= 200 amino acids) from the MGnify database. The MSAs were generated using the AF3 protocol and were then used to predict structures with AlphaFold2. The data serves as the long-monomer distillation set for OpenFold3, open-source software for all-atom structure prediction of proteins, RNA, and ligands.
Update frequency: Never
https://github.com/aqlaboratory/openfold3-training-data-RODA/tree/main
OpenFold
https://github.com/aqlaboratory/openfold3-training-data-RODA/issues
OpenFold3 Training Data was accessed on DATE from https://registry.opendata.aws/openfold3. Additionally, please cite our prior manuscript.
Amazon Resource Name (ARN): arn:aws:s3:::openfold3
AWS Region: us-east-2
AWS CLI access (no AWS account required): aws s3 ls --no-sign-request s3://openfold3/
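For readers who prefer to browse the bucket from a script rather than the AWS CLI, the following is a minimal sketch using only the Python standard library. It assumes the bucket accepts anonymous S3 ListObjectsV2 requests over HTTPS, as the unsigned CLI command suggests. The bucket name and region come from this entry; the helper names (`build_list_url`, `parse_listing`, `list_bucket`) are hypothetical and not part of any OpenFold tooling. No key prefixes are assumed, since the bucket layout is not described here.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BUCKET = "openfold3"   # bucket name taken from the ARN in this entry
REGION = "us-east-2"   # region taken from this entry
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"  # S3 XML namespace


def build_list_url(prefix="", max_keys=20):
    """Build an unsigned ListObjectsV2 URL for the bucket (virtual-hosted style)."""
    query = urllib.parse.urlencode({
        "list-type": "2",    # selects the ListObjectsV2 API
        "delimiter": "/",    # group keys into folder-like prefixes
        "prefix": prefix,
        "max-keys": str(max_keys),
    })
    return f"https://{BUCKET}.s3.{REGION}.amazonaws.com/?{query}"


def parse_listing(xml_text):
    """Return (folder-like prefixes, object keys) from a ListBucketResult document."""
    root = ET.fromstring(xml_text)
    prefixes = [el.text for el in root.findall(f"{S3_NS}CommonPrefixes/{S3_NS}Prefix")]
    keys = [el.text for el in root.findall(f"{S3_NS}Contents/{S3_NS}Key")]
    return prefixes, keys


def list_bucket(prefix="", max_keys=20):
    """Fetch one page of the listing over HTTPS (requires network access)."""
    with urllib.request.urlopen(build_list_url(prefix, max_keys)) as resp:
        return parse_listing(resp.read().decode("utf-8"))
```

Calling `list_bucket()` with no arguments should return one page of top-level entries, comparable to the CLI command above, provided anonymous listing is permitted. Only a single page is fetched; following continuation tokens for larger listings is left out of this sketch.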