Google’s artificial intelligence that predicts the structure of proteins

Google's recent activity in the field of research is hard to match. Today we describe the artificial intelligence algorithm that allowed Google to predict the structure of proteins, a problem that has been hotly debated in the scientific world for over 50 years. The problem was considered so important that in 1994 the scientific community set up a global competition to search for a solution.

File: Proteína MSK1.gif - Wikimedia Commons

The competition in question is CASP (Critical Assessment of protein Structure Prediction), which is held at a global level every two years. Google, through its DeepMind research group, has taken part in the competition since 2018. In just a few years, its work has brought about unprecedented progress in the ability of computational methods to predict the structure of proteins.

Genesis of the problem

Proteins are molecules present in all living organisms. They are made up of chains of amino acids (20 standard types, or 21 counting selenocysteine) that fold in on themselves to give each protein a precise shape. Just as the letters of the alphabet form words, amino acids can be combined in many ways to form proteins.

In his acceptance speech for the 1972 Nobel Prize in Chemistry, Christian Anfinsen asserted that, in theory, a protein's amino acid sequence should completely determine its structure. This hypothesis sparked five decades of research aimed at computationally predicting the 3D structure of a protein based solely on its 1D amino acid sequence. The idea was to offer this new type of research as a complementary alternative to the expensive and time-consuming experimental methods of those years.

The structure of a protein, from the simple chain of amino acids to the more complex structures.

The idea of predicting the structure of a protein by simply calculating every possible arrangement that the amino acid chain could adopt was quickly discarded. It is estimated that enumerating all possible configurations of a typical protein by brute force would take longer than the age of the known universe. For practical purposes, an infinite time. It is almost like trying to predict the Divine Comedy knowing only the letters from A to Z.
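To get a feel for the scale of that estimate, here is a rough back-of-the-envelope sketch. The chain length, the number of conformations per amino acid and the sampling rate are illustrative assumptions, not figures from the research described here; only the order of magnitude matters.

```python
# Illustrative assumptions (not figures from the article):
RESIDUES = 100                   # amino acids in a smallish protein chain
STATES_PER_RESIDUE = 3           # coarse number of local conformations each
SAMPLES_PER_SECOND = 1e13        # hypothetical rate of trying configurations
AGE_OF_UNIVERSE_YEARS = 1.38e10  # approximate age of the known universe
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def brute_force_years(residues, states, rate):
    """Years needed to try every configuration once at the given rate."""
    configurations = states ** residues  # every residue varies independently
    return configurations / rate / SECONDS_PER_YEAR


years = brute_force_years(RESIDUES, STATES_PER_RESIDUE, SAMPLES_PER_SECOND)
print(f"configurations to try: {STATES_PER_RESIDUE ** RESIDUES:.2e}")
print(f"years of brute force:  {years:.2e}")
print(f"ages of the universe:  {years / AGE_OF_UNIVERSE_YEARS:.2e}")
```

Even with these deliberately modest assumptions, the search takes many times the age of the universe, which is why an exhaustive approach was never a realistic option.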

Knowing the structure of a protein is a real step forward for science, because the structure of a protein largely determines the functions it performs. Being able to predict that structure in advance therefore gives researchers an edge. One example is the Spike protein of SARS-CoV-2, the virus that makes COVID-19 so feared: it can be fought more effectively when its behavior is known precisely.

How did artificial intelligence algorithms allow Google to predict the structure of proteins?

Google develops AlphaFold

The project team focused mainly on modelling the shape of proteins from scratch, without using previously solved structures as templates. The extraordinary aspect lies in the degree of accuracy obtained. The project's software is called AlphaFold, and its second version is the one the whole scientific world is talking about.

The approach uses deep neural networks to predict the properties of a protein starting from its amino acid sequence. In particular, the two properties predicted by the neural network are:

  • the distances between the pairs of amino acids;
  • the angles between the chemical bonds that connect them.

Structure of the artificial intelligence algorithm developed by Google DeepMind for the prediction of protein structure
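To make the two properties listed above more concrete, the following minimal sketch computes them for a known toy structure. It assumes each amino acid is reduced to a single 3D point and measures a simple angle between consecutive bonds; the coordinates are invented for illustration and this is not DeepMind's code.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between all points (one per amino acid)."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def bond_angle(a, b, c):
    """Angle in degrees at point b between the bonds b-a and b-c."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cosine))


# Hypothetical toy chain of four amino acids (arbitrary units).
chain = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]

distances = distance_matrix(chain)
angles = [bond_angle(chain[i - 1], chain[i], chain[i + 1])
          for i in range(1, len(chain) - 1)]
print(distances[0][3])  # distance between the first and last amino acid
print(angles)           # one angle for each interior amino acid
```

The neural network works in the opposite direction: it is given only the sequence and has to predict quantities of this kind before any structure is known.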

The neural network was trained to predict a distribution of the distances between each pair of amino acids. These probabilities were then combined into a score that estimates the accuracy of a proposed protein structure. A second neural network was then used to estimate how close the proposed structure is to the right answer.
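One way to picture how predicted distance distributions can be combined into a single score is sketched below. The article does not say how the probabilities are combined, so this sketch assumes a summed negative log-probability; the distance bins, the probabilities and the two candidate structures are hypothetical.

```python
import math

# Hypothetical distance bins in angstroms: <4, 4-8, 8-12, >=12.
BIN_EDGES = [4.0, 8.0, 12.0]


def bin_index(d):
    """Index of the distance bin that contains d."""
    for i, edge in enumerate(BIN_EDGES):
        if d < edge:
            return i
    return len(BIN_EDGES)


def structure_score(coords, predicted):
    """Summed negative log-probability of the structure's pairwise distances.

    predicted maps a pair (i, j) to one probability per bin.
    Lower is better: the structure agrees more with the predictions.
    """
    total = 0.0
    for (i, j), probs in predicted.items():
        d = math.dist(coords[i], coords[j])
        p = max(probs[bin_index(d)], 1e-9)  # avoid log(0)
        total += -math.log(p)
    return total


# Made-up network output for a chain of three amino acids.
predicted = {
    (0, 1): [0.80, 0.10, 0.05, 0.05],
    (1, 2): [0.80, 0.10, 0.05, 0.05],
    (0, 2): [0.10, 0.70, 0.10, 0.10],
}
compact = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
stretched = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]

print(structure_score(compact, predicted))
print(structure_score(stretched, predicted))
```

The compact candidate places every pair in the bin the predictions favor, so it receives the lower (better) score of the two.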

The proposed method builds on techniques already known in structural biology: it repeatedly replaces pieces of a protein structure with new protein fragments. The novelty lies in the idea of generating new fragments, which are used to continuously improve the score of the proposed protein structure.
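The replace-and-keep-if-better loop described above can be sketched as a simple search. Everything here is a stand-in: the structure is reduced to a list of angles, the score is a toy distance to a hidden target rather than a learned score, and fragments are drawn at random rather than produced by a trained network.

```python
import random


def toy_score(angles, target):
    """Stand-in score: squared error against a target (lower is better).

    Angle periodicity is ignored to keep the example short.
    """
    return sum((a - t) ** 2 for a, t in zip(angles, target))


def propose_fragment(length, rng):
    """Stand-in for the fragment generator: random angles in degrees."""
    return [rng.uniform(-180.0, 180.0) for _ in range(length)]


def fragment_search(target, steps=2000, fragment_len=3, seed=0):
    """Repeatedly swap in a new fragment and keep it only if the score improves."""
    rng = random.Random(seed)
    current = propose_fragment(len(target), rng)
    start_score = best = toy_score(current, target)
    for _ in range(steps):
        start = rng.randrange(len(current) - fragment_len + 1)
        candidate = current[:]
        candidate[start:start + fragment_len] = propose_fragment(fragment_len, rng)
        score = toy_score(candidate, target)
        if score < best:  # accept only improvements
            current, best = candidate, score
    return current, start_score, best


# Hypothetical target angles for a chain of eight amino acids.
target = [-60.0, -45.0, -60.0, -45.0, 120.0, 130.0, -60.0, -45.0]
structure, start_score, final_score = fragment_search(target)
print(start_score, final_score)
```

Because a replacement is kept only when it lowers the score, the final score can never be worse than the starting one; the quality of the result depends on how good the proposed fragments are, which is where a generative network would help.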

The algorithm was able to leverage a huge training dataset. In fact, the field of genomics is quite rich in data thanks to the rapid reduction in the cost of genetic sequencing. As a result, deep learning approaches to the prediction problem that rely on genomic data have become increasingly efficient in recent years. With this approach, the artificial intelligence algorithm developed by Google should be able to predict the possible structures of proteins quickly.

DeepMind, the Google laboratory

DeepMind is a British research start-up founded in 2010 and acquired by Google in 2014. Following its acquisition, it changed its name to Google DeepMind. Although the company's work was initially linked mainly to the world of video games, in 2018 its focus shifted to scientific research and ethics. Already in its first participation in CASP, it ranked first, outperforming well-established software.

In recent years, DeepMind has also launched the DeepMind Ethics and Society unit, which focuses mainly on the ethical issues raised by the use of artificial intelligence.

We are sure that we will hear more about these projects and the results that will be disclosed from time to time.

The article "Google's Artificial Intelligence Predicting Protein Structure" comes from TechCuE.