Boltz-1 Training Data

deep learning, life sciences, molecular docking, open source software, protein folding

Description

This is the data used to train the Boltz-1 model. It contains the following datasets:

  • Our pre-processed version of the Protein Data Bank
  • Our pre-processed version of the multiple sequence alignment data for each protein chain
  • The raw multiple sequence alignment data
  • A pre-computed symmetry file for symmetry correction during training

Update Frequency

None

License

MIT License

Documentation

https://github.com/jwohlwend/boltz/blob/main/docs/training.md

Managed By

MIT CSAIL - Regina Barzilay Group

Contact

jwohlwend@csail.mit.edu

How to Cite

Boltz-1 Training Data was accessed on DATE from https://registry.opendata.aws/boltz1.

Resources on AWS

  • Description: S3 Bucket containing the Boltz-1 data
    Resource type: S3 Bucket
    Amazon Resource Name (ARN): arn:aws:s3:::boltz1
    AWS Region: us-east-2
    AWS CLI Access (No AWS account required): aws s3 ls --no-sign-request s3://boltz1/
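
The same unauthenticated listing can be sketched in Python using only the standard library. This is a minimal sketch, assuming the bucket answers anonymous ListObjectsV2 requests over HTTPS at the standard virtual-hosted S3 endpoint for us-east-2. The function names (build_list_url, parse_listing, list_bucket) are hypothetical helpers introduced here for illustration, and the layout of objects inside the bucket is not assumed.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BUCKET = "boltz1"     # from the ARN arn:aws:s3:::boltz1
REGION = "us-east-2"  # AWS Region listed for the bucket
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"  # namespace of S3 XML responses


def build_list_url(prefix="", max_keys=20):
    """Build an unsigned ListObjectsV2 URL (hypothetical helper)."""
    query = urllib.parse.urlencode(
        {"list-type": "2", "delimiter": "/", "prefix": prefix, "max-keys": max_keys}
    )
    return f"https://{BUCKET}.s3.{REGION}.amazonaws.com/?{query}"


def parse_listing(xml_data):
    """Return (sub-prefixes, object keys) from a ListObjectsV2 XML response."""
    root = ET.fromstring(xml_data)
    prefixes = [e.text for e in root.findall(f"{S3_NS}CommonPrefixes/{S3_NS}Prefix")]
    keys = [e.text for e in root.findall(f"{S3_NS}Contents/{S3_NS}Key")]
    return prefixes, keys


def list_bucket(prefix="", max_keys=20):
    """Fetch one page of the listing without any AWS credentials."""
    with urllib.request.urlopen(build_list_url(prefix, max_keys), timeout=30) as resp:
        return parse_listing(resp.read())


if __name__ == "__main__":
    # Print the top-level "directories" and files, like `aws s3 ls`.
    prefixes, keys = list_bucket()
    for name in prefixes + keys:
        print(name)
```

Passing one of the printed prefixes back in, as list_bucket(prefix=...), descends one level. Only the first page of results is returned; a full crawl would also need to follow the continuation token in the response.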
