This column is the latest in the "Science in a Complex World" series written by Santa Fe Institute researchers and published in The Santa Fe New Mexican. This article appeared on October 27, 2014.
By Andreas Wagner, External Professor, Santa Fe Institute
When we think of great innovators, the winter flounder usually does not come to mind. Yet this otherwise unremarkable flatfish, which dwells in the frigid waters of the North Atlantic, harbors innovative molecules akin to the antifreeze in your car.
These molecules are proteins, long string-like molecules made out of 20 different kinds of amino acids. What’s innovative about them is their specific sequence of amino acids, which allows them to keep the flounder’s body fluids flowing where those of less hardy species would freeze solid.
And they are not alone. Each cell in the winter flounder’s body harbors thousands of different kinds of proteins, each a different sequence of “letters” in the molecular alphabet of 20 amino acids, each dedicated to a specific task, such as delivering oxygen, sensing nutrients, degrading toxic molecules or transmitting signals to other cells. Each of them was an innovation when it originated in one of the flounder’s ancestors, somewhere along life’s long lineage that started almost 4 billion years ago.
Not just fish, but any organism alive today, from the most humble bacteria to humans, is packed to the rafters with molecular innovations undreamed of by evolution’s great pioneer Charles Darwin. And while Darwin recognized how natural selection allowed innovations to spread once they had originated, his theory remained silent about their origins, and he humbly admitted as much.
The early 20th-century Dutch botanist Hugo de Vries put it best when he said that “natural selection may explain the survival of the fittest, but it cannot explain the arrival of the fittest.” Since then, biologists have reconstructed the natural history of myriad innovations, but beyond knowing that nature innovates by randomly altering DNA, they became none the wiser about the deep reasons behind life’s ability to innovate — its “innovability.”
This has been changing, thanks to a molecular and genomic revolution that has engulfed the life sciences over the past few decades, a revolution that coincided with my own academic coming-of-age. Together with fellow researchers at the Santa Fe Institute and a group of researchers I direct at the University of Zurich in Switzerland, I have been privileged to help solve the problem of nature’s boundless creativity.
To understand what proteins have taught us about innovability, imagine a library whose books contain every possible string of letters that can be written in the English alphabet. While this library, enormous beyond comprehension, would contain much nonsense, it would also contain all meaningful texts, including descriptions of all conceivable human innovations, from the wheel to the steam engine to the transistor, and even innovations we have not yet imagined.
Nature innovates by exploring libraries just like this, except that their texts are all possible proteins and are numerous beyond imagination. If all proteins were only 100 amino acids long (many are much longer), nature’s library would contain more than 10^130 texts; in English, that’s a one with 130 zeroes. This vast library encodes all “meaningful” proteins (proteins able to perform a useful task) that nature has already discovered, and many more that remain undiscovered.
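For readers who want to check the arithmetic, the size of this library follows from the two numbers given above: 20 kinds of amino acids and a length of 100. The short Python sketch below is only an illustration; the function names are made up for this example and are not part of any research software.

```python
# Illustrative check of the library size quoted in the column.
# Assumptions: 20 amino acids, proteins exactly 100 amino acids long.
ALPHABET_SIZE = 20
PROTEIN_LENGTH = 100


def library_size(alphabet_size: int, length: int) -> int:
    """Number of distinct texts of a given length over a given alphabet."""
    return alphabet_size ** length


def order_of_magnitude(n: int) -> int:
    """Number of digits after the leading digit of a positive integer."""
    return len(str(n)) - 1


size = library_size(ALPHABET_SIZE, PROTEIN_LENGTH)
print("Order of magnitude:", order_of_magnitude(size))
print("More than 10^130 texts:", size > 10 ** 130)
```

Because Python integers have arbitrary precision, the count is exact: 20^100 is a little over 10^130, which matches the figure in the text.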
A population of organisms that evolves through random DNA mutations (in other words, by exploring the protein library) is like a crowd of library patrons, each walking step by random step through a human library, except that in nature’s library, each step can have much more severe consequences. If a mutation disrupts an essential protein such as hemoglobin (a protein in red blood cells that carries oxygen from the lungs or gills to the rest of the body), the individual dies, courtesy of natural selection.
But some random steps lead to improved texts and are rewarded, like that step made by some distant ancestor of the bar-headed goose, whose hemoglobin binds oxygen more tightly and allows the bird to migrate through the oxygen-deprived heights of the Himalayas, getting a wing up on its competitors.
Because libraries like this harbor the secret to nature’s innovability, we work hard to study and understand their catalogues. We have already learned that protein library catalogues are uncannily well suited for innovation through trial and error. They also answer an old question about Darwinian evolution: If problems like transporting oxygen had only one solution, one protein among more than 10^100 (a one with 100 zeroes), this protein could never be found through a blind search of the library: too many texts, not enough time.
Yet in nature’s libraries we find that any one problem has too many solutions to count. For example, the functions of many proteins like hemoglobin or antifreeze proteins are encoded by 10^50 or more texts, far fewer than the number of possible texts, but far more than a single text.
Fortunately, these viable texts are not scattered haphazardly through the library. Rather, they form a network of synonymous texts: texts with different combinations of letters but with roughly the same “meanings.” This network of texts extends through the entire library so that one can step from one text to a neighboring text to another neighboring text, and so on, exploring almost the entire library in multiple directions, while remaining on the relative safety of the fraction of texts that encode the same life-bestowing protein: hemoglobin or antifreeze proteins, for example.
And that’s a good thing because an evolving population otherwise would be confined to a single volume or to a tiny region of the vast library, or be forced to make an exploratory, blind journey that is fraught with deadly missteps.
With such a network (which we call a genotype network), a population can explore the library far and wide and reach innovative texts — neighboring texts with slightly different but perhaps vastly advantageous meanings — wherever they occur. What’s more, some proteins have larger networks than others, and as my Santa Fe Institute colleague Evandro Ferrada has shown, those proteins have “discovered” more innovative texts in their evolutionary history. That’s how nature accesses so many innovations so rapidly. Large genotype networks provide an advantage so great that we have difficulty computing it because millions, if not trillions, of potential innovations can become accessible through them.
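The idea of stepping from one viable text to a neighboring viable text can be made concrete with a toy model. The Python sketch below is purely illustrative and is not the method used in the research described here: it assumes a two-letter alphabet instead of 20 amino acids, texts only eight letters long, and an invented viability rule (a text "works" as long as it never contains two B's in a row). It then finds every viable text that can be reached from a starting text by changing one letter at a time without ever landing on a non-viable text.

```python
from collections import deque

ALPHABET = "AB"    # toy alphabet; real proteins use 20 amino acids
TEXT_LENGTH = 8    # toy length; real proteins are far longer


def is_viable(text: str) -> bool:
    """Invented rule for this toy model: no two B's in a row."""
    return "BB" not in text


def neighbors(text: str):
    """Yield every text that differs from `text` in exactly one letter."""
    for i, old in enumerate(text):
        for new in ALPHABET:
            if new != old:
                yield text[:i] + new + text[i + 1:]


def explore(start: str) -> set:
    """Breadth-first search that only ever steps onto viable texts."""
    if not is_viable(start):
        raise ValueError("the starting text must be viable")
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for candidate in neighbors(current):
            if candidate not in seen and is_viable(candidate):
                seen.add(candidate)
                queue.append(candidate)
    return seen


network = explore("A" * TEXT_LENGTH)
total = len(ALPHABET) ** TEXT_LENGTH
print(len(network), "of", total, "texts are reachable without a deadly misstep")
```

In this toy library of 256 texts, 55 are viable, and all of them are connected to one another through single-letter steps. A walker starting at AAAAAAAA can reach a text as different as BABABABA without ever leaving the network, which is the property that lets a real population explore far and wide.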
While Darwin did not have the technology to explore nature’s libraries, he might be pleased to hear that the thorny problem of innovability has a simple solution, one that makes natural selection work. And although we cannot expect the winter flounder to be grateful for genotype networks, I certainly am, for without them, neither one of us would have crawled out of the primordial soup.
Andreas Wagner is a leading researcher in evolutionary biology, an external professor at the Santa Fe Institute, a professor at the University of Zurich in Switzerland, and an adjunct professor at The University of New Mexico. His book, Arrival of the Fittest: Solving Evolution’s Greatest Puzzle, appeared this month.