Case Study Of Bacillus Anthracis

In this section, we will present some quantitative considerations related to microbial forensics. We will use published research on Bacillus anthracis to illustrate several important issues. We do so because the anthrax terrorist attacks of October 2001 have received considerable attention and, thus, offer concrete data for discussion. However, we will gloss over certain complications of this particular example in order to emphasize more general issues that would probably apply to similar cases in the future.

Following the anthrax attacks, Read et al. (ref. 8) sequenced and compared the complete genomes of two B. anthracis isolates. One isolate was a forensic sample taken from a victim in Florida, while the other was from the government research laboratory at Porton Down in the United Kingdom. The Porton Down isolate had been previously "cured" of the two extrachromosomal plasmids that encode virulence factors, but otherwise it was presumed to be representative of the Ames strain that had been widely used in anthrax research. These two isolates, as well as other potential sources of the Florida attack strain, are shown in Figure 16.1. These other sources include U.S. governmental laboratories and field isolates.

Read et al. reported that "Only four differences were discovered between the main chromosomes of the Florida and Porton isolates . . . two of these are SNPs and two are short indels." SNPs refer to single-nucleotide polymorphisms, which in this context are typical point mutations that distinguish the two sequences. Indels are insertion or deletion mutations, which often occur in specific hypermutable regions. We will focus initially on the two point mutations that distinguish the Florida and Porton Down isolates. A complication is that the genome sequence of the Porton Down isolate was, in fact, based on DNA preparations from two substrains of the Porton Down strain, and these substrains were shown to have several mutational differences in their sequences. In our analysis, we will use only those differences that distinguish the Florida isolate from both Porton Down samples.

FIGURE 16.1 Possible derivations of the Bacillus anthracis isolated from the Florida anthrax victim in relation to several potential sources of the Ames strain. Chromosomes of the Florida isolate and the Porton Down laboratory strain have been sequenced. The Ames strain was originally isolated from a dead cow in Texas in 1981, then stored at Fort Detrick, and distributed to several other research laboratories (solid arrows), including Porton Down. The Porton Down strain was cured of its two virulence plasmids. (Reprinted with permission from ref. 8. Copyright 2002, American Association for the Advancement of Science.)

Given that there are two point mutations that distinguish the Florida and Porton Down isolates, we can ask several questions. Is that surprisingly little difference, or is it a lot? What might these data tell us about how long ago the forensic isolate shared a common ancestor with the Ames laboratory strain? What might the data say about the relative likelihood of the forensic isolate coming from one source versus others?

To begin to answer these questions, we first need to understand the genomic mutation rate and the evidence concerning this rate in bacteria. The genomic mutation rate is simply the expected number of mutations per generation across the entire genome. To illustrate, consider Escherichia coli, which is the best studied bacterium. Our calculations assume functional DNA repair and ignore hypermutable sites, which constitute a small proportion of the genome. E. coli has a genome size of about 5 × 10⁶ bp and a point mutation rate of about 5 × 10⁻¹⁰ per bp per generation (ref. 6; see ref. 7 for a somewhat lower estimate). The product of these two quantities is the total genomic mutation rate, which in this case is approximately 2.5 × 10⁻³ point mutations per generation. The inverse of the genomic mutation rate is the expected number of generations until the first mutation occurs in a cell lineage (not in a population: see below), in this case about 400 generations. As it turns out, B. anthracis has a similar genome size (ref. 9), and its mutation rate measured for a particular gene also is similar to E. coli (ref. 10).
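The arithmetic above can be checked with a minimal sketch. The constants are the approximate E. coli values quoted in the text; the function and variable names are illustrative only.

```python
# Approximate E. coli values quoted in the text (order-of-magnitude estimates).
GENOME_SIZE_BP = 5e6   # genome size in base pairs
RATE_PER_BP = 5e-10    # point mutations per bp per generation


def genomic_mutation_rate(genome_size_bp, rate_per_bp):
    """Expected number of point mutations per generation across the genome."""
    return genome_size_bp * rate_per_bp


def expected_generations_to_first_mutation(genomic_rate):
    """Mean waiting time, in generations, to the first mutation in one lineage."""
    return 1.0 / genomic_rate


rate = genomic_mutation_rate(GENOME_SIZE_BP, RATE_PER_BP)
wait = expected_generations_to_first_mutation(rate)
print(f"genomic mutation rate: {rate:.1e} per generation")
print(f"expected wait to first mutation: {wait:.0f} generations")
```

Because both inputs are rough estimates, the result is best read as an order of magnitude rather than a precise figure.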

At first glance, the expected time of 400 generations until the first mutation may seem an inappropriately long period, because it ignores the fact that a bacterial population may contain millions of cells, such that many mutations can occur every generation. However, comparisons between genomic sequences are based on single representatives of each sample, not on entire populations. Without belaboring this point, the expected time that is relevant to our analysis will generally be somewhat longer, not shorter, than we estimated above.

In point of fact, we are most interested in a quantity called the genomic substitution rate. A substitution is any mutation that spreads throughout a population of interest. The details can get complicated quickly but, fortunately, there are some mathematical shortcuts. In the present context, we can treat the number of substitutions as the number of mutations that distinguish the two individuals whose genomes are under comparison. Neutral mutations have no effect on fitness; synonymous point mutations are often used as a proxy for neutral mutations. A robust result from theoretical population genetics is that the expected substitution rate of the class of neutral mutations is equal to their corresponding mutation rate (ref. 11). Deleterious mutations (those which reduce a cell's survival or growth rate) have a substitution rate lower than the corresponding mutation rate, while beneficial mutations have a substitution rate above their mutation rate. Because many more mutations are deleterious than are beneficial, the genomic substitution rate is, in general, somewhat lower than the genomic mutation rate. Hence, as noted above, the expected time to the first substitution of a point mutation will be longer than 400 generations.
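The equality of the neutral substitution rate and the neutral mutation rate follows from a standard textbook argument, sketched here for a haploid population of constant size. The symbols N (population size), μ (neutral mutation rate per genome per generation), and k (substitution rate) are introduced only for this illustration and do not appear in the source.

```latex
k \;=\; \underbrace{N\mu}_{\text{new neutral mutations per generation}}
   \;\times\;
   \underbrace{\frac{1}{N}}_{\text{probability that each one eventually fixes}}
   \;=\; \mu
```

The population size cancels, which is why the result does not depend on how many cells are in the culture.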

So what might we begin to conclude from the data? Given the two point mutations that distinguish the Florida and Porton Down isolates, the inferred time since their common ancestor is on the order of 800 cell generations (i.e., twice the expected time to the first mutation). Although not a huge number, it is to us surprisingly large in the context of standard lab practice, where one would expect working subcultures to be repeatedly restarted from a master culture (stored in a non-growing state, as spores or frozen vegetative cells), rather than by sustained propagation of subcultures. The inferred 800 generations would correspond to about 30 rounds of plating for single colonies (each colony representing some 25 cell divisions), and even more rounds if cells were propagated by serial dilution and transfer. Translating generations into chronological time would further depend on knowing how long cells might have been stored in a non-mutating state, for example as spores or in a freezer.
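The conversion from observed differences to generations and then to rounds of plating can be sketched as follows. The rate, the number of differences, and the 25 divisions per colony are taken from the text; the function names are illustrative, and the calculation assumes the genomic rate estimated for E. coli applies unchanged.

```python
import math

GENOMIC_RATE = 2.5e-3       # point mutations per generation (from the text)
OBSERVED_DIFFERENCES = 2    # point mutations separating the two isolates
DIVISIONS_PER_COLONY = 25   # cell divisions per single-colony plating round


def inferred_generations(differences, genomic_rate):
    """Point estimate of generations needed to accumulate the differences."""
    return differences / genomic_rate


def plating_rounds(generations, divisions_per_colony):
    """Number of single-colony platings spanning that many generations."""
    return generations / divisions_per_colony


generations = inferred_generations(OBSERVED_DIFFERENCES, GENOMIC_RATE)
rounds = plating_rounds(generations, DIVISIONS_PER_COLONY)
cells = 2 ** DIVISIONS_PER_COLONY  # cells in a colony after 25 doublings

print(f"inferred generations: {generations:.0f}")
print(f"rounds of plating:    {rounds:.0f}")
print(f"cells per colony:     {cells:,}")
```

The result of 32 rounds is consistent with the "about 30" quoted above, and 25 doublings correspond to a colony of roughly 3 × 10⁷ cells.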

The inferred time since the common ancestor could be reduced, perhaps dramatically, if the Ames strain were defective in DNA repair (see refs. 7, 12 for the effects of loss of repair in E. coli) or if there were a history of mutagenesis. In terms of DNA repair, it appears that the Ames strain retains these functions, given that mutation-rate estimates at specific loci are in line with estimates for E. coli strains that have normal DNA repair functions (ref. 10). However, with respect to growth under mutagenic conditions, it could be relevant that curing the two virulence plasmids from the Porton Down strain involved treating cells with high temperature and an antibiotic (ref. 9). In fact, depending on the timing of these treatments relative to the derivation of the two substrains of the Porton Down strain, these conditions may even explain the differences between the substrains as well as between the Florida isolate and the Porton Down strain.

Statistical uncertainties are also important with respect to inferring the time since two strains diverged from a common ancestor. Even if we accept the substitution rate as known, there remains the intrinsic error arising from the stochastic (random) occurrence of mutations. Using the Poisson distribution to reflect this intrinsic error, the probability of observing two or more mutations in anything fewer than 142 generations is below 5%. At the other end of the distribution, two strains are also unlikely to have substituted as few as two mutations in 1900 generations or longer (p < 5%). Although these bounds are already large, they are the best that one can do given only two point-mutation differences, because they assume that all else is known precisely. The bounds become even larger with uncertainty in, for example, the mutation rate. Despite these wide bounds, one might still exclude certain scenarios. For the sake of illustration (leaving aside the facts that the Porton Down strain lacks the virulence plasmids, and was subjected to mutagenic treatments), these two point mutations make it unlikely that the forensic isolate came directly from the Porton Down strain or even that it was derived from that strain via fewer than several rounds of plating or subculturing.
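The Poisson tail probabilities behind such bounds can be explored with a short sketch. It assumes a single known rate of 2.5 × 10⁻³ substitutions per generation (the E. coli-based figure used earlier) and treats the number of substitutions after t generations as Poisson with mean rate × t; the function names are illustrative.

```python
import math

GENOMIC_RATE = 2.5e-3  # assumed substitutions per generation


def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson random variable with the given mean."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i)
               for i in range(k + 1))


def prob_at_least(k, generations, rate=GENOMIC_RATE):
    """Probability of k or more substitutions within the given generations."""
    return 1.0 - poisson_cdf(k - 1, rate * generations)


def prob_at_most(k, generations, rate=GENOMIC_RATE):
    """Probability of k or fewer substitutions within the given generations."""
    return poisson_cdf(k, rate * generations)


# Lower bound: two or more substitutions in only 142 generations.
p_low = prob_at_least(2, 142)
print(f"P(>= 2 substitutions in 142 generations)  = {p_low:.4f}")

# Upper end: few substitutions despite 1900 generations.
for k in (1, 2):
    p_high = prob_at_most(k, 1900)
    print(f"P(<= {k} substitutions in 1900 generations) = {p_high:.4f}")
```

Under these assumptions the lower-tail probability at 142 generations falls just below 5%. The upper-tail value is sensitive to whether the tail is taken as "at most two" or "fewer than two" substitutions, so the sketch prints both rather than assuming which convention underlies the bound quoted in the text.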

Thus, after considering what is known about rates of mutation in bacteria and placing this information in an evolutionary context, the two point mutations separating the forensic isolate from Florida and the Ames laboratory strain appear to be more than might have been reasonably expected, not less, provided that the Porton Down strain is indeed representative of the Ames strain more generally. We will return to this proviso a bit later.

It should also be clear from what has been said that "either-or" inferences based on match versus no-match between a possible source and a forensic sample are less conclusive for bacteria than for humans, owing to the much greater possibility of exact clones (identical twins) in the microbial case and especially in the context of a deliberate attack using a strain taken from the laboratory.

Let us now shift gears, and consider these data from the perspective of establishing the most probable "line of descent" of a forensic isolate in relation to multiple potential sources. In this context, even one or a few distinguishing genetic substitutions could—in principle—provide compelling evidence to support or exclude certain scenarios. In the paragraphs that follow, we examine two such scenarios to illustrate how the data could be profitably analyzed. The first scenario is hypothetical and invokes imaginary data in order to make certain points clear. The second scenario accords with the relevant published data.
