KAUST Research Conference

Computational Advances in Structural Biology

May 1-3, 2023, Auditorium between buildings 4 & 5

OpenFold: Insights gained from rebuilding and retraining AlphaFold2


Abstract:

AlphaFold2 represents a significant milestone in protein structure prediction. However, its implementation lacks the code and data necessary to train new models for novel tasks, such as predicting protein-ligand complex structures. In this presentation, I will discuss OpenFold, a fast, memory-efficient, and trainable implementation of AlphaFold2. We trained OpenFold from scratch and discovered that it can generalize remarkably well to previously unseen regions of protein structure space, even when trained on highly reduced datasets representing only ~1% of the original AlphaFold2 training data. This has profound implications for training AlphaFold2-style models in data-sparse regimes, such as nucleic acids or small molecules.

OpenFold has led to exciting applications, including the release of the ESM Metagenomic Atlas by Meta AI, which contains over 600 million predicted protein structures.
