In 2020, an artificial intelligence lab called DeepMind unveiled technology that could predict the shape of proteins—the microscopic mechanisms that drive the behavior of the human body and all other living things.
A year later, the lab shared the tool, called AlphaFold, with scientists and published the predicted shapes for more than 350,000 proteins, including all proteins expressed by the human genome. The release immediately changed the course of biological research. If scientists can identify the shapes of proteins, they can more quickly understand disease, create new drugs, and otherwise investigate the mysteries of life on Earth.
Now, DeepMind has published predictions for almost every protein known to science. On Thursday, the London-based lab, which is owned by the same parent company as Google, said it had added more than 200 million predictions to an online database freely available to scientists around the world.
With this new launch, the scientists behind DeepMind hope to accelerate research into more obscure organisms and unleash a new field called metaproteomics.
“Scientists can now explore this entire database and look for patterns: correlations between species and evolutionary patterns that might not have been apparent until now,” Demis Hassabis, chief executive of DeepMind, said in a telephone interview.
Proteins begin as chains of chemical compounds, then twist and fold into three-dimensional shapes that define how these molecules attach to each other. If scientists can identify the shape of a particular protein, they can decipher how it works.
This knowledge is often a vital part of fighting disease and illness. For example, bacteria resist antibiotics by expressing certain proteins. If scientists can understand how these proteins work, they can begin to counter antibiotic resistance.
Previously, identifying the shape of a protein required extensive experimentation with X-rays, microscopes and other tools on a lab bench. Now, given the chain of chemical compounds that make up a protein, AlphaFold can predict its shape.
The technology is not perfect. But it can predict the shape of a protein with an accuracy that rivals physical experiments about 63 percent of the time, according to independent benchmark tests. With a prediction in hand, scientists can verify its accuracy relatively quickly.
Kliment Verba, a researcher at the University of California, San Francisco, who uses the technology to understand the coronavirus and prepare for similar pandemics, said the technology had “supercharged” that work, often saving months of experimentation. Others have used the tool while fighting gastroenteritis, malaria and Parkinson’s disease.
The technology has also accelerated research beyond the human body, including an effort to improve the health of bees. DeepMind’s expanded database can help an even larger community of scientists achieve similar benefits.
Like Dr. Hassabis, Dr. Verba believes the database will provide new ways of understanding how proteins behave across species. He also sees it as a way to educate a new generation of scientists. Not all researchers are versed in this type of structural biology; a database of all known proteins lowers the barrier to entry. “It can bring structural biology to the masses,” said Dr. Verba.