The RNA Simplex: An Interactive Graphics Tool for Visualizing RNA Composition Space

Erik Schultes, Whitehead Institute for Biomedical Research
Peter T. Hraber, National Center for Genome Resources
Thomas H. LaBean, Computer Science & Biochemistry, Duke University

Last updated September 1999

The RNA Simplex is a tool for analyzing nucleotide base composition in RNA (or DNA) sequences. Geometrically, each sequence is represented as a point within the volume of a tetrahedron: the four homopolymers (poly-A, poly-C, poly-G, poly-U) lie at the vertices, while sequences with equal numbers of the four nucleotides lie at the center of gravity of the tetrahedron, equidistant from the homopolymers.
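As a rough illustration of this geometry, the sketch below maps a sequence to a point in a tetrahedron by weighting each vertex by the fraction of the corresponding base. The vertex coordinates, the function names, and the treatment of T as U are assumptions made for this example; they are not taken from the RNA Simplex KINEMAGES, which may place the vertices differently.

```python
import math

# Assumed vertex placement: a regular tetrahedron centered at the origin.
# These coordinates are illustrative only.
VERTICES = {
    "A": (1.0, 1.0, 1.0),
    "C": (1.0, -1.0, -1.0),
    "G": (-1.0, 1.0, -1.0),
    "U": (-1.0, -1.0, 1.0),
}


def composition(seq):
    """Return the fractions of A, C, G, and U in a sequence.

    T is counted as U so that DNA sequences can be used as well.
    Any other character (for example an ambiguity code) is ignored.
    """
    seq = seq.upper().replace("T", "U")
    counts = {base: seq.count(base) for base in "ACGU"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or U residues")
    return {base: counts[base] / total for base in "ACGU"}


def simplex_point(fractions):
    """Map base fractions to (x, y, z) as a weighted average of the vertices."""
    return tuple(
        sum(fractions[base] * VERTICES[base][axis] for base in "ACGU")
        for axis in range(3)
    )


if __name__ == "__main__":
    center = simplex_point(composition("ACGU"))
    print("equal composition:", center)
    for base in "ACGU":
        corner = simplex_point(composition(base * 10))
        print("poly-" + base, corner, "distance from center:", math.dist(center, corner))
```

With this placement, a homopolymer lands exactly on its vertex, and a sequence with equal numbers of the four bases lands at the origin, which is the same distance from all four vertices.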

We have visualized the RNA Simplex using an interactive graphics package that is freely available (see below). Also available are the data sets used in our analysis of RNA base composition in both naturally and artificially evolved RNA sequences, and in our analysis of the relationship between base composition and the thermodynamic stability of RNA secondary structure.

If the RNA Simplex or the accompanying data are used in further research, please cite:

MAGE and KINEMAGES

MAGE is a general-purpose, interactive graphics program created at the Department of Biochemistry, Duke University, by Dave and Jane Richardson. MAGE allows the user to display and manipulate multi-dimensional graphical data. The data sets used by MAGE are called KINEMAGES (or KINEtic iMAGES). A KINEMAGE is a plain-text file of coordinate data and commented display lists.

Though the program is most often used to visualize molecular structures in three dimensions, we have used it to represent the RNA Simplex and the distributions of observed RNAs and our theoretical predictions.

Click here to learn more about MAGE and KINEMAGES on the Official Kinemage Homepage.

Click here for MAGE software.

The RNA Simplex KINEMAGES

To work with the RNA Simplex, open the KINEMAGE files with your copy of MAGE. It is best to use a large-format monitor. The simplex can be rotated in any direction using the mouse, and the various data sets can be toggled on and off using the buttons on the right-hand side of the window. Instructions for using the RNA Simplex are provided here.

Currently, two RNA Simplex KINEMAGES are available (labeled name.kin). The first, RNA_simplex.kin, summarizes the nucleotide base composition of over 2800 individual single-stranded RNA sequences from 15 functional classes. All data are phylogenetically representative and taken from sequences that are more than 90% complete. Artificial sequences evolved in vitro are included.

The second KINEMAGE, simulation_simplex.kin, is an analysis of RNA thermodynamic properties as a function of base composition. These simulations were conducted to explain the distributions observed in RNA_simplex.kin.

Future KINEMAGES will include InVitro_simplex.kin (an analysis of artificial ribozymes evolved in vitro from random-sequence RNA pools) and Genome_simplex.kin (base composition analyses of entire genomes).

RNA Base Composition Text Data Files

These text files (labeled name.txt) contain the composition vectors of over 2800 phylogenetically representative, single-stranded RNA sequences. The files are named by functional class. Composition vectors were calculated from full-length sequence data collected from various electronic sources. Each file has a constant format consisting of 14 tab-delimited columns:

  1. ACCESSION NUMBER
  2. RNA FUNCTIONAL CLASS
  3. ORGANISM NAME
  4. DOMAIN
  5. SUB-TAXON
  6. NUMBER OF A RESIDUES
  7. NUMBER OF C RESIDUES
  8. NUMBER OF G RESIDUES
  9. NUMBER OF U RESIDUES
  10. TOTAL NUMBER OF RESIDUES (N)
  11. MAGE-FORMATTED NAME (IN BRACKETS {})
  12. FRACTION A RESIDUES
  13. FRACTION C RESIDUES
  14. FRACTION G RESIDUES

Fields with missing data are marked with a short string of X's. Because some sequences have ambiguous sites, the sum of A, C, G, and U residues may not equal N in all cases. Click here to obtain these text files.
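The 14-column layout described above could be read with a short script along the following lines. This is only a sketch: the field names, the function names, and the rule that a field made up entirely of X characters counts as missing are assumptions chosen for the example, not part of the distributed files.

```python
import re

# Hypothetical field names for the 14 tab-delimited columns, in file order.
FIELDS = [
    "accession", "functional_class", "organism", "domain", "sub_taxon",
    "n_a", "n_c", "n_g", "n_u", "n_total", "mage_name",
    "frac_a", "frac_c", "frac_g",
]
INT_FIELDS = {"n_a", "n_c", "n_g", "n_u", "n_total"}
FLOAT_FIELDS = {"frac_a", "frac_c", "frac_g"}

# Assumption: a missing value is a field consisting only of X characters.
MISSING = re.compile(r"^[Xx]+$")


def parse_record(line):
    """Parse one tab-delimited line into a dict; missing fields become None."""
    values = line.rstrip("\r\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(
            "expected %d columns, found %d" % (len(FIELDS), len(values))
        )
    record = {}
    for name, raw in zip(FIELDS, values):
        raw = raw.strip()
        if MISSING.match(raw):
            record[name] = None
        elif name in INT_FIELDS:
            record[name] = int(raw)
        elif name in FLOAT_FIELDS:
            record[name] = float(raw)
        else:
            record[name] = raw
    return record


def ambiguous_sites(record):
    """Return N minus (A + C + G + U), or None if any count is missing.

    A nonzero result corresponds to the ambiguous sites mentioned above.
    """
    keys = ("n_a", "n_c", "n_g", "n_u", "n_total")
    if any(record[key] is None for key in keys):
        return None
    return record["n_total"] - sum(record[key] for key in keys[:4])


def read_records(path):
    """Yield one parsed record per non-empty line of a composition file."""
    with open(path, encoding="ascii", errors="replace") as handle:
        for line in handle:
            if line.strip():
                yield parse_record(line)
```

Returning None for missing fields keeps them distinct from a genuine count of zero, and ambiguous_sites makes it easy to find the records whose base counts do not sum to N.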

Click here to learn more about RNA in general.

If you have any questions, please email Erik Schultes, Mage Magnate.