KAUST Research Conference
Computational Advances in Structural Biology
May 1-3, 2023, Auditorium between Buildings 4 and 5
Abstract:
AlphaFold2 represents a significant milestone in protein structure prediction. However, its implementation lacks the code and data necessary to train new models for novel tasks, such as predicting protein-ligand complex structures. In this presentation, I will discuss OpenFold, a fast, memory-efficient, and trainable implementation of AlphaFold2. We trained OpenFold from scratch and discovered that it can generalize remarkably well to previously unseen regions of protein structure space, even when trained on highly reduced datasets representing only ~1% of the original AlphaFold2 training data. This has profound implications for training AlphaFold2-style models in data-sparse regimes, such as nucleic acids or small molecules.
OpenFold has led to exciting applications, including the release of the ESM Metagenomic Atlas by Meta AI, which contains over 600 million predicted protein structures.