RNA molecular evolution can be considered a search, via mutation and natural selection, for fit molecules. We propose a simple mathematical model that formalizes the effects of mutation on nucleotide composition by characterizing random walks in RNA sequence space under various assumptions. Assuming that selection will tend to maximize base pairing as molecules become better adapted, we use a mean field approximation to relate nucleotide composition to base pairing propensity in random RNA molecules. Hence, we can evaluate all possible compositions with respect to both their accessibility under mutation and their probability of forming unfrustrated structures, a combined quantity we call evolutionary utility. Evolutionary utility is a measure of the probability that Darwinian evolutionary processes will find fit sequences within a specific nucleotide composition.
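To make the link between composition and pairing propensity concrete, the following is a minimal sketch of one independence-style approximation: the probability that two bases drawn independently from a given composition can pair. It is an illustration only and not necessarily the mean field model used here. The pairing rules (Watson-Crick plus G-U wobble) and the names `PAIRS` and `pairing_probability` are assumptions introduced for this example.

```python
from itertools import product

# Assumed pairing rules for this sketch: Watson-Crick pairs plus the G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pairing_probability(comp, pairs=PAIRS):
    """Probability that two bases drawn independently from `comp` can pair.

    `comp` maps each base to a count or frequency; it is normalized here.
    """
    total = sum(comp.values())
    freqs = {base: n / total for base, n in comp.items()}
    return sum(
        freqs[x] * freqs[y]
        for x, y in product(freqs, repeat=2)
        if (x, y) in pairs
    )


uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "U": 0.25}
gc_only = {"A": 0.0, "C": 0.5, "G": 0.5, "U": 0.0}
print(pairing_probability(uniform))
print(pairing_probability(gc_only))
```

Under these assumed rules, compositions that balance complementary bases score higher than compositions dominated by a single base, which is the qualitative behaviour the pairing argument relies on.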
We demonstrate the validity of this approach by comparing our predicted utility values to the large body of RNA sequence data collected to date. We have plotted within the tetrahedral simplex the compositions of over 2000 individual, single-stranded RNA molecules from functionally distinct classes of RNA and from evolutionarily distinct organisms. We find a significant, universal clustering that qualitatively matches the distribution of compositions predicted to have high evolutionary utility. Refining our understanding of the effects of selection and mutation on nucleotide composition will help resolve the relative contributions of historical and ahistorical constraints during RNA molecular evolution and will have practical implications for the design of intelligent search strategies in combinatorial approaches to molecular design.
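As an illustration of how a sequence's composition can be placed within a tetrahedral simplex, the sketch below treats the four base frequencies as barycentric weights on the vertices of a regular tetrahedron. The assignment of bases to vertices, the vertex coordinates, the conversion of T to U, and the names `VERTICES`, `composition`, and `simplex_point` are all assumptions made for this example, not details taken from the analysis described above.

```python
from collections import Counter

# Assumed vertex assignment: one base per corner of a regular tetrahedron.
VERTICES = {
    "A": (1.0, 1.0, 1.0),
    "C": (1.0, -1.0, -1.0),
    "G": (-1.0, 1.0, -1.0),
    "U": (-1.0, -1.0, 1.0),
}


def composition(seq):
    """Return the fractional A, C, G, U composition of a sequence.

    T is read as U (an assumption of this sketch); other symbols are ignored.
    """
    counts = Counter(seq.upper().replace("T", "U"))
    total = sum(counts[base] for base in VERTICES)
    if total == 0:
        raise ValueError("sequence contains no A, C, G or U")
    return {base: counts[base] / total for base in VERTICES}


def simplex_point(comp):
    """Map a composition to 3D coordinates as a weighted mean of the vertices."""
    return tuple(
        sum(comp[base] * VERTICES[base][axis] for base in VERTICES)
        for axis in range(3)
    )


# An equimolar sequence lands at the centre of the tetrahedron.
print(simplex_point(composition("ACGU")))
```

A sequence made of a single base maps to the corresponding vertex, and any mixed composition falls strictly inside the tetrahedron, so clustering of real sequences can be inspected as a cloud of points in three dimensions.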