In the last few months Baker’s team has been working with biologists who were previously stuck trying to figure out the shape of proteins they were studying. “There’s a lot of pretty cool biological research that’s been really sped up,” he says. A public database containing hundreds of thousands of ready-made protein shapes should be an even bigger accelerator.
“It looks astonishingly impressive,” says Tom Ellis, a synthetic biologist at Imperial College London studying the yeast genome, who is excited to try the database. But he cautions that most of the predicted shapes have not yet been verified in the lab.
In the new version of AlphaFold, predictions come with a confidence score that the tool uses to flag how close it thinks each predicted shape is to the real thing. Using this measure, DeepMind found that AlphaFold predicted shapes for 36% of human proteins with accuracy down to the level of individual atoms. This is good enough for drug development, says Hassabis.
Until now, after decades of work, only 17% of the proteins in the human body have had their structures identified in the lab. If AlphaFold’s predictions are as accurate as DeepMind says, the tool has more than doubled this number in just a few weeks.
Even predictions that are not fully accurate at the atomic level are still useful. For more than half of the proteins in the human body, AlphaFold has predicted a shape that should be good enough for researchers to figure out the protein’s function. The rest of AlphaFold’s current predictions are either incorrect, or are for the third of proteins in the human body that don’t have a structure at all until they bind with others. “They’re floppy,” says Hassabis.
“The fact that it can be applied at this level of quality is an impressive thing,” says Mohammed AlQuraishi, a systems biologist at Columbia University who has developed his own software for predicting protein structure. He also points out that having structures for most of the proteins in an organism will make it possible to study how these proteins work as a system, not just in isolation. “That’s what I think is most exciting,” he says.
DeepMind is releasing its tools and predictions for free and will not say if it has plans for making money from them in future. It is not ruling out the possibility, however. To set up and run the database, DeepMind is partnering with the European Molecular Biology Laboratory, an international research institution that already hosts a large database of protein information.
For now, AlQuraishi can’t wait to see what researchers do with the new data. “It’s pretty spectacular,” he says. “I don’t think any of us thought we would be here this quickly. It’s mind-boggling.”