Machine-learning model AlphaFold transformed the study of proteins, predicting folding patterns faster and more accurately than humans ever have. But it has done little to elucidate protein history.
“The origin of proteins was very complex. That great complexity surely did not spring into being like Athena from the brow of Zeus, out of nothing. But with current methods, we can’t see the causal dynamic that made proteins possible,” says SFI External Professor D. Eric Smith, a researcher at the Earth-Life Science Institute (ELSI) in Japan.
Smith has co-organized a working group looking for hidden rules that underpin proteins. “Assembly Theory for Folded Matter” brings molecular biologists, bioinformaticians, statisticians, machine-learning experts, and more to SFI August 18–21 to explore how proteins emerged, how they could evolve in the future, and how we might build new ones for medical treatment and beyond.
One of the meeting's key pillars is the idea that truly understanding proteins requires searching for underlying principles in all “folded matter”: any macromolecule that displays folding, including polynucleotides and some polysaccharides.
“Molecular biologists studying the history of a fold traditionally think about repetition within one protein family. It’s hard for us to systematically consider connections in a more global space: how is this fold a reuse of widespread forms, and what does that tell us about its generation or discovery?” says Liam Longo, a meeting co-organizer and researcher at ELSI.
To interpret reuse across folded matter, participants will apply the tenets of assembly theory, developed by SFI External Professors Sara Walker (Arizona State University), Lee Cronin (University of Glasgow), and their labs. Organizers hope to determine if the theory, built to explain the biochemistry of small molecules, also fits larger macromolecules. If it does, that could have major implications for activities such as developing new drugs, where approaches for small molecules like aspirin and large folded proteins like antibodies can differ.
“Assembly theory centers on a quantitative measure of complexity that asks, what is the shortest path to make an object where you can reuse pieces you made along the way?” says co-organizer Cole Mathis, an assistant professor at Arizona State University. “With this working group, we’re asking how that measure holds up against a new physical system, and what it reveals about the deep history and future evolution of proteins.”
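To make the "shortest path with reuse" idea concrete, here is a minimal sketch of the measure on a toy system: text strings rather than molecules. It assumes that the basic building blocks are the string's individual characters, that each step joins two already-available pieces end to end, and that any piece built earlier can be reused at no extra cost. The function name `assembly_index` and the brute-force search are illustrative choices for this sketch, not the algorithm used by the theory's authors.

```python
from itertools import product


def assembly_index(target: str) -> int:
    """Fewest joins needed to build `target` from its single characters,
    where every piece built along the way can be reused for free.

    Toy string analogue only (an assumption of this sketch); exhaustive
    search, so it is practical only for short strings.
    """
    if len(target) <= 1:
        return 0  # a single building block needs no joins

    basics = frozenset(target)  # the free starting pieces

    def reachable(pool: frozenset, steps_left: int) -> bool:
        if target in pool:
            return True
        if steps_left == 0:
            return False
        for a, b in product(pool, repeat=2):
            piece = a + b
            # Only contiguous fragments of the target can end up inside it,
            # so any other piece would be a wasted step.
            if piece in pool or piece not in target:
                continue
            if reachable(pool | {piece}, steps_left - 1):
                return True
        return False

    # Iterative deepening: the first depth that succeeds is the minimum.
    depth = 1
    while not reachable(basics, depth):
        depth += 1
    return depth


if __name__ == "__main__":
    for word in ["ABCD", "ABABAB", "BANANA"]:
        print(word, assembly_index(word))
```

In this toy setting, a string with repeated motifs such as "ABABAB" needs fewer joins than its length suggests, because "AB" is built once and then reused, while a string with no repeats such as "ABCD" needs one join per added character.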
Participants will also probe how machine-learning methods can spot repeated forms and clarify the balance of invention and reuse as proteins evolved.
“We’ve reached a moment when the three meeting pillars — machine learning, assembly theory, and the concept of folded matter — can make more concrete progress together now than they can individually. Building on SFI’s long history of developing biological theory, this working group may help create a theoretical framework for a crucial element of the origin of life,” says co-organizer and ELSI researcher Harrison Smith.
The organizers hope the meeting will launch a continuing partnership between SFI and ELSI, generate new research topics that are reasonable in scope and timeline for a Ph.D. student or postdoc to tackle, and build a network of interdisciplinary collaborators.
Hero Image: From Fig. 1 of Longo, L. M., R. Kolodny, and S. E. McGlynn. 2022. “Evidence for the emergence of β-trefoils by ‘Peptide Budding’ from an IgG-like β-sandwich.” PLOS Computational Biology 18 (2): e1009833. DOI: 10.1371/journal.pcbi.1009833. Courtesy Liam Longo.