Predicting the Evolutionary Utility of Nucleotide Compositions Using a Statistical Mechanics Approach to RNA Evolution

Erik Schultes*, Peter T. Hraber, Thomas H. LaBean

Abstract

We are developing an ensemble approach to the analysis of RNA molecular evolution, where the details of nucleotide sequence, secondary structure, and phylogenetic relationships are replaced with distributions of nucleotide composition. The space of possible nucleotide compositions can be exactly represented as a tetrahedral simplex, where each point in the volume of the tetrahedron represents a unique nucleotide composition. If we can understand statistically the effects of selection and mutation on nucleotide compositions, we can predict a priori the compositions of evolved RNA molecules.
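The mapping from a sequence to a point in the tetrahedron can be illustrated with a minimal sketch. The vertex coordinates, the base ordering (A, C, G, U), and the function names `composition` and `to_tetrahedron` are assumptions chosen for illustration and are not taken from the abstract. Any regular tetrahedron with one vertex per base would serve equally well.

```python
from collections import Counter

BASES = "ACGU"

# Hypothetical vertex assignment: one corner of a regular tetrahedron per base.
VERTICES = {
    "A": (1.0, 1.0, 1.0),
    "C": (1.0, -1.0, -1.0),
    "G": (-1.0, 1.0, -1.0),
    "U": (-1.0, -1.0, 1.0),
}


def composition(seq):
    """Return the fractions of A, C, G, U in seq (T is read as U)."""
    seq = seq.upper().replace("T", "U")
    counts = Counter(b for b in seq if b in BASES)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or U")
    return tuple(counts[b] / total for b in BASES)


def to_tetrahedron(comp):
    """Map a composition (fA, fC, fG, fU) to Cartesian coordinates.

    The four fractions act as barycentric weights on the four vertices,
    so every composition lands inside or on the tetrahedron.
    """
    return tuple(
        sum(f * VERTICES[b][axis] for f, b in zip(comp, BASES))
        for axis in range(3)
    )
```

Under these assumptions, a sequence made of a single base maps to a vertex, and a sequence with equal amounts of all four bases maps to the centroid of the tetrahedron.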

RNA molecular evolution can be considered a search via mutation and natural selection for fit molecules. We propose a simple mathematical model that formalizes the effects of mutation on nucleotide composition by characterizing random walks in RNA sequence space under various assumptions. Assuming that selection will tend to maximize base pairing as molecules become better adapted, we use a mean field approximation to relate nucleotide composition to base pairing propensity in random RNA molecules. Hence, we can evaluate all possible compositions with respect to both their accessibility under mutation and their probability of forming unfrustrated structures, a combined quantity we call evolutionary utility. Evolutionary utility is a measure of the probability that Darwinian evolutionary processes will find fit sequences within a specific nucleotide composition.
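One simple way to picture how composition alone could constrain base pairing is sketched below. It is an assumption made for illustration, not the mean field model of the abstract: it treats positions as independent draws from the composition and computes the chance that two such positions can pair (A-U, G-C, and optionally the G-U wobble pair). The function name `pairing_probability` and the `wobble` flag are hypothetical.

```python
def pairing_probability(comp, wobble=True):
    """Chance that two independently drawn bases can pair.

    comp is (fA, fC, fG, fU). Illustrative assumption: positions are
    independent, so each pair type contributes twice the product of its
    two base frequencies (either order of the two positions).
    """
    fa, fc, fg, fu = comp
    p = fa * fu + fg * fc          # Watson-Crick pairs
    if wobble:
        p += fg * fu               # G-U wobble pair
    return 2.0 * p
```

For a uniform composition this gives 0.375 with wobble pairs and 0.25 without, while a composition made of a single base gives 0, since no base pairs with itself under these rules.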

We can demonstrate the validity of this approach by comparing our predicted utility values to the large amount of RNA sequence data that has been collected to date. We have plotted within the tetrahedral simplex the compositions of over 2000 individual, single-stranded RNA molecules from functionally distinct classes of RNA and from evolutionarily distinct organisms. We find a significant, universal clustering that qualitatively matches the distribution of compositions predicted to have high evolutionary utility. Refining what we believe to be the effects of selection and mutation on nucleotide composition will resolve the relative contributions of historical and ahistorical constraints during RNA molecular evolution and will have practical implications in the design of intelligent search strategies in combinatorial approaches to molecular design.


*presently: Department of Microbiology, Duke University Medical Center, Durham, NC 27710. (919) 684-2714, -2217. schultes@abacus.mc.duke.edu