where h(a,p) is the Hamming distance between antibody a and pathogen p. In other words, for each pathogen, we find the antibody with the minimal Hamming distance to the pathogen. The score is a number between 0 and 1, being maximal for a perfect match, at Hamming distance 0, and minimal for the case of complementary bit strings. Note that I use identical lengths for the antibody and the pathogen strings and that the bit strings are aligned prior to calculating the Hamming distance.
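As a minimal sketch of the matching rule just described, the Python below computes the Hamming distance between aligned bit strings and the score of the best-matching antibody for one pathogen. The function names, the use of Python strings for bit strings, and the linear form of the score (1 - h/L, with L the common string length) are assumptions made for illustration. The text fixes only the endpoints: 1 at Hamming distance 0 and 0 for complementary strings.

```python
from __future__ import annotations


def hamming(a: str, p: str) -> int:
    """Number of positions at which two aligned, equal-length bit strings differ."""
    if len(a) != len(p):
        raise ValueError("antibody and pathogen must have the same length")
    return sum(x != y for x, y in zip(a, p))


def match_score(a: str, p: str) -> float:
    """Assumed linear score: 1 for a perfect match, 0 for complementary strings."""
    return 1.0 - hamming(a, p) / len(p)


def best_match_score(antibodies: list[str], p: str) -> float:
    """Score of the antibody with the minimal Hamming distance to pathogen p."""
    return max(match_score(a, p) for a in antibodies)


if __name__ == "__main__":
    library = ["1100", "0011", "1010"]
    # The closest antibodies differ from the pathogen in one of four bits.
    print(best_match_score(library, "1110"))  # 0.75
```

Taking the maximum score is equivalent to taking the minimum Hamming distance only because the assumed score decreases with h. Any other decreasing score would give the same best-matching antibody.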
where P is the number of pathogens and the remaining quantity is the score of the library averaged over all pathogens. Thus, the survival probability s is a monotonically increasing function of this average score.
For the selection scheme that I used (described below), only the relative ranking of the fitnesses of different libraries is important. Therefore, under the assumption that the fitness of an individual depends only on its survival probability s, and since s increases monotonically with the average score, we can identify the fitness with the average score of the library.
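Formally, the fitness f of an individual is the score of its library averaged over the pathogen set, where the score against each pathogen is that of the antibody with the minimal Hamming distance to it.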
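In code, this fitness can be sketched as the average, over pathogens, of the best score that any antibody in the library achieves. This is an illustrative sketch, not the original implementation: the name `fitness`, the representation of bit strings as Python strings, and the linear score 1 - h/L are assumptions.

```python
def fitness(antibodies, pathogens):
    """Average over pathogens of the best antibody score (assumed form 1 - h/L).

    All bit strings are assumed to be aligned and of equal length L.
    """
    if not antibodies or not pathogens:
        raise ValueError("need at least one antibody and one pathogen")
    total = 0.0
    for p in pathogens:
        # Minimal Hamming distance from any antibody to this pathogen.
        h_min = min(sum(x != y for x, y in zip(a, p)) for a in antibodies)
        total += 1.0 - h_min / len(p)
    return total / len(pathogens)


if __name__ == "__main__":
    antibodies = ["1100", "0011"]
    pathogens = ["1100", "0111", "1111"]
    # Per-pathogen best scores are 1.0, 0.75 and 0.5.
    print(fitness(antibodies, pathogens))  # 0.75
```

Because selection depends only on relative ranking, any monotonically increasing transformation of this value would order the libraries in the same way.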
A note about the random number streams: the basic function of the random number stream returns a random deviate from a uniform distribution on the interval [0,1). The algorithm is given in Knuth (1973), and the implementation that I used was written by Terry Jones. This function can in turn be used to generate uniform random deviates on the interval [0,b) for any positive value b.
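As an illustration of one way a deviate on [0,1) can be mapped onto an interval from 0 to a positive upper bound, the sketch below simply scales it. Python's built-in `random.Random` (a Mersenne Twister generator) stands in for the stream described in the text. It is not Knuth's algorithm or Terry Jones's implementation, and the function name `uniform_deviate` is hypothetical.

```python
import random


def uniform_deviate(rng: random.Random, upper: float) -> float:
    """Uniform deviate on [0, upper), obtained by scaling a [0, 1) deviate.

    Floating-point rounding can, in rare cases, make the product equal `upper`.
    """
    if upper <= 0:
        raise ValueError("upper must be positive")
    return rng.random() * upper


if __name__ == "__main__":
    rng = random.Random(12345)  # seeded stream, so runs are reproducible
    print([uniform_deviate(rng, 10.0) for _ in range(3)])
```

Passing the generator object explicitly keeps separate random number streams independent, and each one can be seeded on its own.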