Tommy Johnson

The Success of DeepMind’s AlphaFold in Protein Structure Prediction

AI, Bioinformatics, DeepMind, Protein Folding, Scientific Breakthrough

AlphaFold caught the field off guard when it debuted at CASP13 in 2018, outperforming long-running groups with decades of experience.

The platform’s success was immense, and many saw it as potentially game-changing for drug discovery. Yet a key question remained about its long-term viability: would it live up to its promise?

Accuracy

Accurate protein structure prediction has long been a challenge in biology. Proteins are chains of amino acids that fold into intricate three-dimensional shapes, and those shapes determine how they interact with other molecules. Determining a three-dimensional structure from sequence data alone is no small task, even with high-resolution experimental structures of related proteins available as supporting evidence.

AlphaFold’s success at predicting protein structures has been an immense advantage to scientists. It lets researchers study a protein’s structure starting from its sequence alone, which is crucial for understanding how proteins function in living organisms and for tackling problems such as malaria and antibiotic resistance. It also supports drug discovery by giving designers the structural knowledge they need to develop new drugs faster.

AlphaFold stands out among protein folding software for its approach. Starting from an input sequence of amino acid residues, the system searches sequence databases for related proteins and builds a multiple sequence alignment (MSA). Positions in the alignment that change together across related sequences hint at residues that sit close together in the folded protein. From this it builds a pair representation describing the relationships between pairs of residues, which AlphaFold then uses to generate a structural model of the protein.
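The idea that alignment columns carry pairwise information can be illustrated with a toy example. The sketch below is not AlphaFold's method: AlphaFold learns its pair representation with a neural network, whereas this swaps in mutual information, a classical covariation statistic, purely for illustration. The four-sequence alignment and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def column(msa, i):
    """Return the residues found at alignment position i."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    count_i = Counter(column(msa, i))
    count_j = Counter(column(msa, j))
    count_ij = Counter(zip(column(msa, i), column(msa, j)))
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi


def pair_matrix(msa):
    """Toy L x L pair feature: one covariation score per pair of positions."""
    length = len(msa[0])
    return [[mutual_information(msa, i, j) for j in range(length)]
            for i in range(length)]


# Hypothetical alignment: positions 0 and 3 always change together,
# position 1 never changes, position 2 varies independently of position 0.
toy_msa = ["ACDA", "GCDG", "ACEA", "GCEG"]
pairs = pair_matrix(toy_msa)
print(pairs[0][3], pairs[0][1], pairs[0][2])
```

In this toy alignment the score is highest for positions 0 and 3, which always mutate together, and zero for pairs that vary independently. Real pipelines work with thousands of sequences and correct for effects this sketch ignores, such as gaps and uneven sampling of related sequences.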

AlphaFold also differs from earlier state-of-the-art models in how it reaches its final model: it refines its prediction iteratively, feeding its own output back in over several rounds. The result is a single static snapshot of the protein’s structure rather than a picture of how it moves over time. Predictions can then be compared with experimental structures solved by methods such as X-ray crystallography and cryo-EM, an imaging technique that produces high-resolution images of macromolecules.

During training, the results of such comparisons are used to adjust the model, leading to more accurate predictions. Predicted structures are stored in a searchable database that is freely accessible worldwide. This has already led to some exciting applications, such as using AlphaFold to understand how honeybees respond to disease and to speed up the development of malaria vaccines.

Speed

AlphaFold was trained for several weeks on computing power roughly equivalent to 100-200 GPUs. The first version of the system learned to estimate the distances between pairs of amino acid residues, then used those estimates to work out how a new protein might fold into its final shape.
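To make "distances between pairs of residues" concrete, here is a minimal sketch that goes in the easy direction: from known 3D coordinates to a distance map and a contact map. A predictor works the other way round, estimating such a map from sequence. The coordinates, the one-point-per-residue simplification, and the 8 Å contact cutoff are illustrative assumptions, not values taken from AlphaFold.

```python
from math import dist


def distance_map(coords):
    """Pairwise Euclidean distances between residue positions."""
    return [[dist(a, b) for b in coords] for a in coords]


def contact_map(coords, cutoff=8.0):
    """Mark residue pairs closer than `cutoff` (an assumed threshold)."""
    return [[d < cutoff for d in row] for row in distance_map(coords)]


# Hypothetical positions for four residues, one point per residue.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (3.8, 3.8, 0.0),
    (20.0, 0.0, 0.0),
]
dmap = distance_map(toy_coords)
cmap = contact_map(toy_coords)
print(round(dmap[0][2], 3), cmap[0][2], cmap[0][3])
```

The map is symmetric with zeros on the diagonal, so only one triangle of it carries information.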

Unlike earlier state-of-the-art models, which were accurate only for certain proteins, AlphaFold produced accurate 3D structures across a broad range of proteins. It could also produce them within hours, where traditional approaches could take days or even weeks.

This combination of speed and accuracy is unprecedented; it marks a critical step toward digital biology’s promise of simulating biological systems and providing insight at far greater scale and scope.

AlphaFold has already had numerous practical impacts. The three-dimensional structure of a protein matters to many projects, from understanding disease mechanisms to designing chemical probes for drug discovery. Because the AlphaFold team released its code openly, structural biologists can independently validate its results and use the model to predict structures themselves.

The latest version of the software can not only predict the 3D shape of a protein from its sequence but also estimate where a ligand, a small molecule that may bind to the protein and alter its function, is likely to sit. Predicting where a ligand binds can speed up drug discovery because it reduces the experimental work needed to identify potential new therapies.

However, binding a ligand can itself change a protein’s structure, so AlphaFold’s predictions cannot be guaranteed to be fully accurate. Even so, researchers benefit immensely from the tool: a first look at a protein’s structure no longer requires years of work or millions of dollars, just a few clicks.

Scalability

Proteins are complex molecules with many functions in living cells. Understanding how they assemble into defined structures is an immense challenge; however, DeepMind, the artificial intelligence subsidiary of Google’s parent company Alphabet, has made significant strides with its machine learning program.

AlphaFold was an impressive accomplishment that demonstrated DeepMind’s ability to accurately predict the three-dimensional structures of many proteins from their amino acid sequences, a huge step toward solving one of biology’s remaining challenges, according to Demis Hassabis, CEO of DeepMind.

Development of the algorithm required both hard work and luck, says Mr. Reiffel. It drew on a huge database of experimentally determined protein structures released in the Protein Data Bank (PDB), but it also needed to surpass previous machine learning approaches.

To this end, the team developed a new neural network architecture and training procedure, incorporating evolutionary, physical, and geometric constraints into the model as it predicts protein structures. In a competition called the Critical Assessment of Structure Prediction (CASP), their predictions were compared with experimentally solved structures. One approach, AlphaFold2, easily outshone the others: its median GDT score in CASP14 reached 92.4 out of 100, an amazing feat. It also predicted structural features with great accuracy, including the location of antiparallel β-sheets, a common element of protein structure, and even loops, which are extremely challenging to predict.
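For readers unfamiliar with the GDT score, a minimal sketch of the GDT_TS calculation follows. GDT_TS averages the percentage of residues whose predicted position lies within 1, 2, 4, and 8 Å of the experimental position. The sketch assumes the two structures have already been superimposed and takes the per-residue distances as given; the real procedure also searches over superpositions to maximize each percentage. The example distances are made up.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT_TS (0-100) from per-residue distances in angstroms.

    Assumes `distances` come from one fixed superposition of the model
    onto the experimental structure, which simplifies the real method.
    """
    if not distances:
        raise ValueError("at least one residue distance is required")
    n = len(distances)
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical distances for a five-residue model.
example = [0.5, 0.9, 1.5, 3.0, 9.0]
print(gdt_ts(example))
```

Here two of five residues fall within 1 Å, three within 2 Å, and four within both 4 Å and 8 Å, so the score is the average of 40, 60, 80, and 80, which is 65. A score of 92.4 therefore means that nearly every residue sits within a couple of ångströms of its experimental position.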

The success of AlphaFold2 has brought greater awareness to protein structure prediction, but that’s only half the story. In a follow-up paper published in Nature, the team announced that they had used their system to predict protein structures at scale for 20 species, including most of the human proteome, and had made the structures freely available in an online public database.

Researchers suggest that extending this approach to proteins from many more organisms could give greater insight into protein function and lead to significant biomedical advances. They could use structural models of proteins involved in diseases such as malaria, tuberculosis, and cancer to search for new medicines.

Reliability

DeepMind’s AlphaFold protein folding AI represents an immense scientific achievement and significant leap forward for artificial intelligence research. While its fundamental insights could potentially revolutionise drug discovery, turning them into products with tangible value will require years of hard work from researchers and entrepreneurs.

Proteins are large, complex molecules made up of chains of amino acids that perform many vital functions within our bodies. Most proteins settle into a characteristic folded shape, and their interactions with other proteins or small ligands depend on that unique three-dimensional structure, which traditionally had to be determined through experiments. Working out the 3D structure from the sequence has long been considered one of the biggest challenges in biology, known as the “protein folding problem”, and for fifty years it has been one of the greatest hurdles to advancing biological knowledge. DeepMind’s AlphaFold outperformed every other entrant at the Critical Assessment of Protein Structure Prediction (CASP) competition in 2020, and the assessors recognized the result as a historic milestone for artificial intelligence research.

The AlphaFold architecture consists of several modules, interlinked computational building blocks that each perform a specific task, working in tandem. Starting from a one-dimensional amino acid sequence, the system predicts how the chain will fold in 3D and refines that prediction over several passes to produce the final structure. During training and benchmarking, predictions are evaluated against experimentally determined protein structures to assess their accuracy.

AlphaFold reports its confidence using a metric called the predicted local distance difference test, or pLDDT, a per-residue score from 0 to 100. It is the model’s own estimate of how well the predicted positions around each residue would agree with an experimentally determined structure under the local distance difference test (lDDT), a standard measure in structural biology that compares distances between atoms in a model with those in an experimental structure. A higher pLDDT score indicates greater confidence in that part of the prediction.
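As a rough guide to reading these scores, the sketch below groups per-residue pLDDT values into the four confidence bands used for display by the AlphaFold Protein Structure Database (above 90, 70 to 90, 50 to 70, below 50). How the exact boundary values are assigned is an assumption made here, and the function names and example scores are hypothetical.

```python
def plddt_band(score):
    """Map one pLDDT value (0-100) to a confidence band label."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("pLDDT must lie between 0 and 100")
    if score >= 90.0:  # boundary handling is an assumption of this sketch
        return "very high"
    if score >= 70.0:
        return "confident"
    if score >= 50.0:
        return "low"
    return "very low"


def summarize(scores):
    """Return the mean pLDDT and the number of residues in each band."""
    counts = {"very high": 0, "confident": 0, "low": 0, "very low": 0}
    for s in scores:
        counts[plddt_band(s)] += 1
    return sum(scores) / len(scores), counts


# Hypothetical per-residue scores for a short chain.
example_scores = [95.0, 91.5, 82.0, 64.0, 41.5]
mean_plddt, band_counts = summarize(example_scores)
print(mean_plddt, band_counts)
```

Low-scoring stretches are often flexible or disordered regions, so a per-residue view like this is usually more informative than a single average for the whole chain.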

The AlphaFold source code is open source, so other research groups can adapt and modify it. Compared with the system entered in the earlier competition, the AlphaFold2 design introduced several improvements: intermediate losses that drive iterative refinement of the prediction, a masked MSA loss so that the alignment representation is trained jointly with the structure, an equivariant attention architecture for end-to-end structure prediction, self-distillation on unlabelled sequences, and a built-in estimate of the model’s own error.
