Calibration curves from comparing fungal sequences with plant
sequences (green), and from comparing fungal sequences with
rhizobacterial sequences (blue) indicate low overlap in hexamer
composition and clearly separated medians
(Figure 4.2A). Fungal calibration
curves are less adequately approximated by a normal distribution than
plant and rhizobacterial calibration curves
(Figure 4.2A and B), but for t > 0, a
normal approximation does not grossly misrepresent calibration curves.
Confidence curves (Figure 4.2B, yellow
and magenta lines) calculated from normal approximations to
calibration curves indicate 15.2% and 8.7% comparison-wide error
rates for rejecting the null hypothesis that a particular sequence
resembles the hexamer composition of fungal sequences when compared with
plants (yellow line) and rhizobacteria (magenta line), respectively.
Evaluating a confidence curve at a particular confidence level gives
the approximate critical test value for t, above which we can reject
the null hypothesis with an arbitrary, but known, degree of certainty.
For a one-tailed test at the 95% confidence level, the approximate
critical values of t are 312 for comparisons between fungi and plants,
and 384 for comparisons between fungi and rhizobacteria
(Figure 4.2B).
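
As a rough illustration of how a critical value can be read from a normal approximation, the sketch below fits a normal distribution to a set of calibration values using their median and standard deviation, measures how far the empirical cumulative distribution departs from the fit, and inverts the fitted distribution at a chosen one-tailed confidence level. The function names, the sample values, and the assumption that a single fitted calibration curve determines the critical value are illustrative only; the confidence curves in Figure 4.2B may have been computed differently.

```python
from statistics import NormalDist, median, stdev


def fit_normal(values):
    """Approximate calibration values by a normal distribution.

    Uses the median as the location and the sample standard deviation
    as the scale (an assumption made for this sketch).
    """
    return NormalDist(mu=median(values), sigma=stdev(values))


def max_cdf_deviation(values):
    """Largest gap between the empirical CDF and the fitted normal CDF."""
    fitted = fit_normal(values)
    ordered = sorted(values)
    n = len(ordered)
    return max(
        max(abs((i + 1) / n - fitted.cdf(v)), abs(i / n - fitted.cdf(v)))
        for i, v in enumerate(ordered)
    )


def critical_value(null_values, confidence=0.95):
    """One-tailed critical t: the point below which `confidence` of the
    fitted null distribution lies."""
    return fit_normal(null_values).inv_cdf(confidence)


# Hypothetical calibration values, NOT data from this study.
example_null = [-120.0, -80.0, -35.0, -10.0, 0.0, 15.0, 40.0, 75.0, 130.0]
t_crit = critical_value(example_null, confidence=0.95)
deviation = max_cdf_deviation(example_null)
```

A large value of `deviation` would suggest that the normal approximation is poor for that calibration curve, in which case a critical value taken directly from the empirical distribution might be preferable.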
Figure 4.2:
Calibration and confidence distributions. (A) Calibration curves
showing cumulative probability distributions of t for two
pairwise comparisons between training sets: between fungi and plants
([...] and [...], green lines), and between fungi and rhizobacteria
([...] and [...], blue lines). Table 4.2
summarizes constituents of training sets. Calibration curves were
obtained from 100 resampled replicates in which each training set was
randomly halved, and one half was used to establish hexamer counts,
while the other half was used to compute t. The degree of
overlap in the tails of calibration curves about [...] is
used to establish experiment-wide false positive and false negative
rates. Here, [...], [...], [...], and [...]. (B) Confidence curves (yellow and
magenta) indicate the comparison-wide confidence level for rejecting
the null hypothesis that a sequence is from taxon A, as calculated
from normal approximations to calibration curves in (A). Parameters
(median, [...], and standard deviation, [...]) used to estimate normal
distributions are shown in the figure legend. This measure of
confidence varies continuously with t, and is computed as [...].
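
The resampling procedure described in the caption of Figure 4.2 can be sketched roughly as follows. Each training set is randomly halved, hexamer counts are taken from one half, and held-out sequences from the other half are scored. The helper names, the use of overlapping hexamer windows, the choice to score each held-out sequence separately, and in particular `placeholder_score` are assumptions made for illustration; `placeholder_score` is a stand-in and is not the test statistic t used in this work.

```python
import random
from collections import Counter


def hexamer_counts(sequences, k=6):
    """Count overlapping k-mers (hexamers by default) across sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def random_halves(items, rng):
    """Shuffle a copy of `items` and split it into two halves."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def placeholder_score(seq, counts_a, counts_b, k=6):
    """Hypothetical stand-in for t: difference in summed hexamer counts.

    Positive when the sequence's hexamers are more frequent in set B's
    counts than in set A's. NOT the statistic defined in this work.
    """
    seq = seq.upper()
    hexamers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    return sum(counts_b[h] - counts_a[h] for h in hexamers)


def calibration_values(set_a, set_b, score, replicates=100, seed=0):
    """Collect scores of held-out set-A sequences over resampled replicates."""
    rng = random.Random(seed)
    values = []
    for _ in range(replicates):
        train_a, held_a = random_halves(set_a, rng)
        train_b, _ = random_halves(set_b, rng)
        counts_a = hexamer_counts(train_a)
        counts_b = hexamer_counts(train_b)
        values.extend(score(s, counts_a, counts_b) for s in held_a)
    return values
```

Sorting the returned values and plotting rank divided by sample size against value would give an empirical cumulative curve of the kind shown in panel (A), under the assumptions stated above.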
Figure 4.3:
Calibration, confidence, and comparison curves. (A) Cumulative
distributions of test results from four libraries, compared with
fungi and plants. Libraries are described in
Table 4.1. Calibration and
confidence curves (thick green and yellow lines, respectively) are
as in Figure 4.2. (B) Cumulative
distributions of test results, as in (A), except the comparison is
between fungi and rhizobacteria. Calibration curves appear as thick
blue lines.
Peter T. Hraber
2001-06-13