USING THE RNA SIMPLEX
Last updated September 1999
For more information, including references to manuscripts and
kinemage source, please read this.
If you have questions, please contact:
Erik Schultes
Whitehead Institute for Biomedical Research
9 Cambridge Center
Cambridge MA 02142 USA
phone: 617-258-6373
fax: 617-258-6768
email: schultes@wi.mit.edu
If the RNA Simplex or the accompanying data are used in further research,
please cite:
- E. Schultes, P.T. Hraber and T.H. LaBean. 1997. Global similarities in
nucleotide base composition among disparate functional classes of
single-stranded RNA imply adaptive evolutionary convergence. RNA 3:792-806.
- E. Schultes, P.T. Hraber and T.H. LaBean. 1999. A parameterization of RNA
sequence space. Complexity 4:61-71.
To work with the RNA simplex, open the KINEMAGE files with your copy of
MAGE. A large-format monitor is best if one is available. The simplex
can be rotated in any direction using the mouse. Various data can be
toggled on and off using the buttons on the right-hand side of the
window. Clicking on a data point identifies, in the lower left-hand
corner of the window, the organismal source of the RNA
sequence. Be sure to explore all the KINEMAGEs by choosing "next" from
the KINEMAGE menu.
- Start up MAGE.5.01 and click on the PROCEED button.
- Use the EXPAND button located in the upper right-hand corner of
the graphics window to expand the size of the graphics window to
maximum dimensions.
- Use the OPEN FILE menu option to open the "RNA Simplex.kin" data file.
- The RNA simplex can be rotated by dragging the mouse. The ZOOM
control is a slider located on the right-hand side of the graphics
window. Other options can be found in the menu bar above the graphics
window.
- The buttons located along the right-hand side of the graphics window will
display various data:
- Tetrahedron: displays the boundaries of the RNA Simplex
- Lines: displays the loci in composition space that obey the rule
A=U and C=G (Chargaff's Axis, i.e., the gradient in G+C), A=G and C=U
(gradient in G+A), or A=C and G=U (gradient in G+U). The compass feature
displays a reference object at the isoheteropolymers (A=C=G=U) and points
in the directions of the homopolymers.
- GA manifold: displays one of four order-disorder manifolds in RNA
composition space. Sequences located on the manifold represent the
most direct routes from the ordered region of sequence space
(Chargaff's Axis) to disordered regions of sequence space (edges of
the simplex other than the AU or CG edge). RNA sequence data are most
closely associated with this manifold.
- RNA: RNA composition data are listed as buttons by functional class and
taxonomic description.
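
The geometry described above can be sketched in a few lines of Python.
This is only an illustration: the vertex coordinates in VERTICES and the
function names are assumptions made here, and the kinemage itself may
orient the tetrahedron differently. Each sequence is reduced to its four
base fractions, which are then used as weights on the four corners
(the homopolymers) of a regular tetrahedron.

```python
# Hypothetical vertex placement for a regular tetrahedron; the kinemage
# may orient the simplex differently.
VERTICES = {
    "A": (1.0, 1.0, 1.0),
    "C": (1.0, -1.0, -1.0),
    "G": (-1.0, 1.0, -1.0),
    "U": (-1.0, -1.0, 1.0),
}


def composition(seq):
    """Return the fractions of A, C, G and U in an RNA sequence."""
    seq = seq.upper().replace("T", "U")
    counts = {b: seq.count(b) for b in "ACGU"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or U")
    return {b: counts[b] / total for b in "ACGU"}


def to_xyz(comp):
    """Weight the four vertices by base fraction to get a 3-D point."""
    return tuple(
        sum(comp[b] * VERTICES[b][i] for b in "ACGU") for i in range(3)
    )


def on_chargaff_axis(comp, tol=1e-9):
    """True when A=U and C=G within a small tolerance."""
    return (abs(comp["A"] - comp["U"]) <= tol
            and abs(comp["C"] - comp["G"]) <= tol)
```

With this vertex choice the isoheteropolymers fall at the origin, and
Chargaff's Axis runs along the z axis between the midpoints of the AU
and CG edges.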
The following buttons are specific to simulation_simplex.kin:
- P, MF.A: mean base pairing propensity (P) of random sequences
calculated using the mean-field approximation (MF.A.). The 1771
composition vectors spanning the simplex are arbitrarily divided into
three groups from high to low mean base pairing propensity. This is
the distribution of self-organization of RNA as a function of base
composition.
- F, T.A.: mean frustration (F) of random sequences calculated
using the thermodynamic approximation (T.A.). For each composition
vector, 100 random sequences were folded into secondary structures
using computer algorithms. Displayed are the computed free energies
averaged for each vector. The 1771 composition vectors spanning the
simplex are arbitrarily divided into three groups from low to high
mean frustration. This is the distribution of self-organization of RNA
as a function of base composition.
- Entropy: displays all composition vectors in the simplex
with Shannon entropy values between 1.3 and 1.5. The Shannon entropy
has a minimum value of 0.0 at the homopolymers and a maximum value of
2.0 at the isoheteropolymers. Random walks in RNA sequence space will
converge to high entropy composition vectors.
- MF.A (0.64): evolutionary potential calculated using the
mean-field approximation with m=0.64. Only the top 1% of composition
vectors are displayed.
- T.A. (0.91): evolutionary potential calculated using the
thermodynamic approximation with m=0.91. The 1771 composition vectors
spanning the simplex are arbitrarily divided into four groups from
high to low mean frustration.
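
The Entropy button and the count of 1771 composition vectors can be
reproduced with a short Python sketch. Two assumptions are made here and
are not stated in the text above: that the vectors lie on a grid of 20
equal increments (5% steps), which happens to give exactly 1771
four-part compositions, and that the entropy is measured in bits (base-2
logarithm), which matches the stated maximum of 2.0. The names STEPS,
composition_grid and shannon_entropy are illustrative only.

```python
from itertools import product
from math import log2

STEPS = 20  # assumed grid resolution: 20 increments of 5% each


def composition_grid(steps=STEPS):
    """All (A, C, G, U) fractions on a grid with the given number of steps."""
    grid = []
    for a, c, g in product(range(steps + 1), repeat=3):
        u = steps - a - c - g  # the fourth fraction is fixed by the others
        if u >= 0:
            grid.append((a / steps, c / steps, g / steps, u / steps))
    return grid


def shannon_entropy(comp):
    """Shannon entropy in bits; zero fractions contribute nothing."""
    return sum(-p * log2(p) for p in comp if p > 0)


grid = composition_grid()
# Composition vectors in the entropy band shown by the Entropy button.
band = [v for v in grid if 1.3 <= shannon_entropy(v) <= 1.5]
```

Under these assumptions a homopolymer such as (1, 0, 0, 0) has entropy
0.0 and the equal-composition vector (0.25, 0.25, 0.25, 0.25) has entropy
2.0, in agreement with the limits given for the Entropy button.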