A.I. Predicts the Shapes of Molecules to Come
For several years now, John McGeehan, biologist and director of the Center for Enzyme Innovation in Portsmouth, England, has been looking for a molecule capable of breaking down the 150 million tonnes of soda bottles and other plastic waste scattered around the world.
Working with researchers on both sides of the Atlantic, he has found some good options. But his task is that of the most discerning locksmith: to pinpoint the chemical compounds that, on their own, will twist and fold into the microscopic shape that fits perfectly into the molecules of a plastic bottle and splits them apart, like a key opening a door.
Determining the exact chemical content of a given enzyme is a fairly straightforward challenge these days. But identifying its three-dimensional shape may require years of biochemical experimentation. So last fall, after reading that an artificial intelligence lab in London called DeepMind had built a system that automatically predicts the shape of enzymes and other proteins, Dr. McGeehan asked the lab if it could help with his project.
Toward the end of one workweek, he sent DeepMind a list of seven enzymes. The following Monday, the lab returned the shapes for all seven. “It moved us forward by a year, if not two,” said Dr. McGeehan.
Now any biochemist can speed up their work in much the same way. On Thursday, DeepMind published the predicted shapes of more than 350,000 proteins – the microscopic mechanisms that govern the behavior of bacteria, viruses, the human body and all other living things. This new database includes the three-dimensional structures of all proteins expressed by the human genome, as well as those of proteins that appear in 20 other organisms, including mice, fruit flies and E. coli bacteria.
This large and detailed biological map – which provides around 250,000 shapes that were previously unknown – can accelerate the ability to understand diseases, develop new drugs, and repurpose existing ones. It can also lead to new types of biological tools, like an enzyme that efficiently breaks down plastic bottles and converts them into materials that are easily reused and recycled.
“It can move you forward in time – influence how you view problems and help solve them faster,” said Gira Bhabha, assistant professor in the cell biology department at New York University. “Whether you are studying neuroscience or immunology – whatever your field of biology – this can be helpful.”
This new knowledge is its own kind of key: If scientists can determine the shape of a protein, they can determine how other molecules will bind to it. It could reveal, for example, how bacteria resist antibiotics – and how to counter that resistance. Bacteria resist antibiotics by expressing certain proteins; if scientists were able to identify the shapes of these proteins, they could develop new antibiotics or new drugs that suppress them.
In the past, pinpointing the shape of a protein required months, years, or even decades of trial-and-error experiments involving X-rays, microscopes, and other tools on the lab bench. But DeepMind can cut the timeline dramatically with its AI technology, known as AlphaFold.
When Dr. McGeehan sent DeepMind his list of seven enzymes, he told the lab he had already identified the shapes of two of them, but did not specify which ones. It was a way of testing how well the system worked; AlphaFold passed the test, correctly predicting both shapes.
It was even more remarkable, Dr. McGeehan said, that the predictions arrived within days. He later learned that AlphaFold had actually completed the task in just a few hours.
AlphaFold predicts protein structures using what’s called a neural network, a mathematical system that can learn tasks by analyzing large amounts of data – in this case, thousands of known proteins and their physical shapes – and then extrapolating to the unknown.
It’s the same technology that identifies the commands you bark into your smartphone, recognizes faces in the photos you post to Facebook, and translates one language into another on Google Translate and other services. But many experts believe that AlphaFold is one of the technology’s most powerful applications.
“It shows that AI can do useful things in the complexity of the real world,” said Jack Clark, one of the authors of the AI Index, an effort to track advances in artificial intelligence technology worldwide.
As Dr. McGeehan discovered, it can be remarkably precise. AlphaFold can predict a protein’s shape with an accuracy that rivals physical experiments about 63 percent of the time, according to independent benchmark tests that compare its predictions to known protein structures. Most experts had assumed that such powerful technology was still years away.
“I thought it would take another 10 years,” said Randy Read, professor at the University of Cambridge. “It was a complete change.”
But the accuracy of the system varies, so some of the predictions in DeepMind’s database will be less useful than others. Each prediction in the database is accompanied by a “confidence score” indicating how accurate it is likely to be. DeepMind researchers estimate that the system provides a “good” prediction about 95 percent of the time.
As a result, the system cannot completely replace physical experiments. It is used alongside work on the lab bench, helping scientists determine which experiments to conduct and filling in gaps when experiments fail. Using AlphaFold, researchers at the University of Colorado at Boulder recently helped identify a protein structure they had struggled to determine for more than a decade.
The developers at DeepMind have chosen to share their database of protein structures freely rather than sell access, in hopes of spurring advances in the biological sciences. “We are interested in maximum impact,” said Demis Hassabis, CEO and co-founder of DeepMind, which is owned by the same parent company as Google but operates more like a research lab than a commercial enterprise.
Some scientists have compared DeepMind’s new database to the Human Genome Project. Completed in 2003, the Human Genome Project provided a map of all human genes. Now, DeepMind has provided a map of the roughly 20,000 proteins expressed by the human genome – another step toward understanding how our bodies work and how we can respond when things go wrong.
The hope is also that the technology will continue to evolve. A lab at the University of Washington has built a similar system called RoseTTAFold, and like DeepMind, it has openly shared the computer code that drives its system. Anyone can use the technology, and anyone can work to improve it.
Even before DeepMind began to openly share its technology and data, AlphaFold was powering a wide range of projects. Researchers at the University of Colorado are using the technology to understand how bacteria like E. coli and salmonella develop resistance to antibiotics and to develop ways to fight this resistance. At the University of California at San Francisco, researchers used the tool to improve their understanding of the coronavirus.
The coronavirus wreaks havoc on the body through 26 different proteins. With help from AlphaFold, researchers have improved their understanding of a key protein and hope the technology can help them better understand the other 25.
Even if this work comes too late to have an impact on the current pandemic, it could help prepare for the next one. “A better understanding of these proteins will help us not only target this virus but other viruses,” said Kliment Verba, one of the San Francisco researchers.
The possibilities are endless. After DeepMind gave Dr. McGeehan the shapes of seven enzymes that could potentially rid the world of plastic waste, he sent the lab a list of 93 more. “They are working on it now,” he said.