Improved protein structure prediction using…

February 2020

Author(s)

A.W. Senior, R. Evans, J. Jumper, J. Kirkpatrick, L. Sifre, T. Green, C. Qin, A. Žídek, A.W.R. Nelson, A. Bridgland, H. Penedones, S. Petersen, K. Simonyan, S. Crossan, P. Kohli, D.T. Jones, D. Silver, K. Kavukcuoglu & D. Hassabis

Sources

Nature volume 577, 706–710 (2020)

Nature 577, 627–628 (2020), Mohammed AlQuraishi, doi: 10.1038/d41586-019-03951-0

Protein structure prediction can be used to determine the three-dimensional shape of a protein from its amino acid sequence. This problem is of fundamental importance as the structure of a protein largely determines its function; however, protein structures can be difficult to determine experimentally. Considerable progress has recently been made by leveraging genetic information.

It is possible to infer which amino acid residues are in contact by analysing covariation in homologous sequences, which aids in the prediction of protein structures. The idea is predicated on the following premise: if two amino-acid residues in a protein are close together in 3D space, then a mutation that replaces one of them with a different residue (for example, large for small) will probably induce, at a later time, a mutation that alters the other residue in a compensatory direction (in our example, swapping small for large). The set of co-evolving residues, therefore, encodes valuable spatial information and can be found by analysing the sequences of evolutionarily related proteins.
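The covariation idea can be illustrated with a minimal sketch. It scores pairs of alignment columns by mutual information, a deliberately simple stand-in for the more sophisticated statistical models used in practice, and not the method of the paper. The function name `column_mutual_information` and the toy alignment are assumptions made for illustration only.

```python
from collections import Counter
from math import log


def column_mutual_information(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j.

    `msa` is a list of equal-length aligned sequences. A high score means
    the residues at the two positions tend to change together.
    """
    n = len(msa)
    col_i = Counter(seq[i] for seq in msa)
    col_j = Counter(seq[j] for seq in msa)
    pairs = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        # Ratio of the joint frequency to the product of the marginals.
        ratio = count * n / (col_i[a] * col_j[b])
        mi += p_ab * log(ratio)
    return mi


# Hypothetical toy alignment: columns 0 and 3 always change together
# (A pairs with G, G pairs with A), while column 1 varies independently.
msa = [
    "ALKGW",
    "ALKGW",
    "GLKAW",
    "GLKAW",
    "AIKGW",
    "GIKAW",
]

covarying = column_mutual_information(msa, 0, 3)
independent = column_mutual_information(msa, 0, 1)
print(covarying, independent)
```

In this toy case the covarying pair scores higher than the independent pair, which is the signal a contact predictor would look for. Real alignments also need corrections for phylogenetic bias and indirect couplings, which this sketch ignores.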

AlphaFold predicts the probabilities of residues being separated by different distances. Because probabilities and energies are interconvertible, AlphaFold predicts an energy landscape — one that overlaps in its lowest basin with the true landscape but is much smoother. In fact, AlphaFold’s landscape is so smooth that it nearly eliminates the need for searching. This makes it possible to use a simple procedure to find the most favourable conformation, rather than the complex search algorithms employed by other methods.
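A small sketch can make the link between probabilities and energies concrete. It assumes the standard Boltzmann relation, in which probability is proportional to exp(-E/kT), with kT set to 1 in arbitrary units. The function names and the four-bin distance distribution are hypothetical and are not taken from the paper.

```python
import math


def probabilities_to_energies(probs, kT=1.0, floor=1e-8):
    """Convert bin probabilities to energies via E = -kT * ln(p).

    `floor` avoids taking the logarithm of zero for empty bins.
    """
    return [-kT * math.log(max(p, floor)) for p in probs]


def energies_to_probabilities(energies, kT=1.0):
    """Convert energies back to normalized Boltzmann probabilities."""
    lowest = min(energies)  # shift for numerical stability
    weights = [math.exp(-(e - lowest) / kT) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical predicted distribution over four distance bins
# for one pair of residues.
distance_probs = [0.05, 0.70, 0.20, 0.05]

energies = probabilities_to_energies(distance_probs)
recovered = energies_to_probabilities(energies)
print(energies)
print(recovered)
```

The most probable distance bin becomes the lowest-energy bin, and converting back recovers the original distribution, which is the sense in which the two descriptions carry the same information.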

A neural network can be trained to make accurate predictions of the distances between pairs of residues, which convey more information about the structure than contact predictions. Using this information, a potential of mean force can be constructed that accurately describes the shape of a protein. We find that the resulting potential can be optimized by a simple gradient descent algorithm to generate structures without complex sampling procedures.
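The following sketch illustrates how gradient descent on a distance-based potential can recover a structure. It is a simplified stand-in under stated assumptions, not the paper's implementation: the potential is a plain harmonic penalty on deviations from target distances rather than a learned potential of mean force, the variables are Cartesian coordinates, and the function names, target structure, starting points, and step size are all hypothetical.

```python
import math


def potential_and_gradient(coords, targets):
    """Harmonic potential sum over pairs of (d_ij - t_ij)^2, with gradient.

    `coords` is a list of [x, y, z] points; `targets` maps (i, j) index
    pairs to target distances.
    """
    energy = 0.0
    grad = [[0.0, 0.0, 0.0] for _ in coords]
    for (i, j), t in targets.items():
        diff = [a - b for a, b in zip(coords[i], coords[j])]
        d = math.sqrt(sum(c * c for c in diff))
        energy += (d - t) ** 2
        if d > 0.0:
            scale = 2.0 * (d - t) / d
            for k in range(3):
                grad[i][k] += scale * diff[k]
                grad[j][k] -= scale * diff[k]
    return energy, grad


def gradient_descent(coords, targets, step=0.05, iterations=2000):
    """Plain fixed-step gradient descent on the potential above."""
    coords = [list(p) for p in coords]
    for _ in range(iterations):
        _, grad = potential_and_gradient(coords, targets)
        for p, g in zip(coords, grad):
            for k in range(3):
                p[k] -= step * g[k]
    return coords


# Hypothetical four-point "structure" used only to generate target distances.
true_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 3.8)]
targets = {
    (i, j): math.dist(true_coords[i], true_coords[j])
    for i in range(4)
    for j in range(i + 1, 4)
}

# Perturbed starting guess (also hypothetical).
start = [(0.5, -0.3, 0.2), (3.0, 0.4, -0.5), (4.5, 3.0, 0.6), (-0.4, 4.2, 3.0)]

initial_energy, _ = potential_and_gradient(start, targets)
final_coords = gradient_descent(start, targets)
final_energy, _ = potential_and_gradient(final_coords, targets)
print(initial_energy, final_energy)
```

Because the potential depends only on pairwise distances, the optimized points match the target structure up to rotation, translation, and reflection. On a smooth potential like this one, a fixed-step descent from a reasonable starting guess is enough, which mirrors the point made above about avoiding complex sampling.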

The resulting algorithm outperformed all entrants at the most recent blind assessment of methods used to predict protein structures (the CASP13 event), generating the best structure for 25 out of 43 proteins, compared with 3 out of 43 for the next-best method. AlphaFold’s predictions had a median accuracy of 6.6 ångströms. AlphaFold represents a considerable advance in protein-structure prediction.

AlphaFold is not yet accurate enough for most applications, such as working out the catalytic mechanisms of enzymes or how drugs bind to proteins (which both typically require 2–3 Å resolution). And although AlphaFold’s search procedure is much simpler than most modern methods, it can still be slow, taking tens to hundreds of hours to make a single prediction.
