Traditional epidemiology relies on the combination of clinical presentation of disease, identification of the pathogen, and anecdotal circumstances to explain where the infection began and how it spread throughout a population. Common features are established to link the transmission to the source of infection. The addition of molecular tools to the investigative effort allows the infectious agent to be identified at the genetic level and enhances our understanding of virus origin, emergence, and transmission.
Molecular epidemiology uses phylogenetic methods to reconstruct a path of virus transmission based on heredity. As a virus replicates and moves through a population, mutations accumulate. The genetic variability displayed by the viruses is compared to deduce common ancestors and explain how one virus sequence gave rise to another. Viruses evolve, sometimes very quickly, to adapt to new hosts and environments, so the genetic makeup of the pathogen tends to become more variable with time and passage. Closely related virus sequences therefore correlate with recent infection and transmission. During an outbreak, virus sequences are obtained by PCR and other molecular tools from infected humans or from animal or insect reservoirs. These new virus sequences are then compared with a database of known virus sequences. The comparison is not a simple match of nucleotides between the virus genomes; rather, it involves sophisticated algorithms that create clusters of virus sequences sharing a common evolutionary ancestor.
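The idea that accumulated mutations measure relatedness can be illustrated with a very small sketch. It assumes two sequences that have already been aligned to the same length, computes the proportion of differing sites (the p-distance), and applies the Jukes-Cantor correction, one of the simplest substitution models. The function names and the two sequence fragments are hypothetical and chosen only for illustration; real analyses rely on dedicated alignment and phylogenetics software and on richer models.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of comparable aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Skip gaps and ambiguous bases so that only comparable sites are counted.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    mismatches = sum(1 for a, b in pairs if a != b)
    return mismatches / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor distance: estimated substitutions per site."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical aligned fragments, invented for illustration only.
isolate = "ATGCGTACGTTAGCTAGCTA"
reference = "ATGCGTACCTTAGCTAGTTA"

p = p_distance(isolate, reference)
d = jukes_cantor(p)
print(f"p-distance = {p:.3f}, Jukes-Cantor distance = {d:.3f}")
```

The corrected distance is slightly larger than the raw proportion because the model allows for repeated substitutions at the same site, which a simple count of mismatches cannot see.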
Phylogenetic trees display how a set of virus sequences might have been derived during evolution, and provide guides in the placement and classification of an unknown virus. The trees are a graphical illustration of the evolutionary linkage of newly isolated virus sequences with known virus genomes (Figure 4.4) (10). The outer nodes (tips) of the tree represent existing virus sequences. The internal nodes represent hypothetical ancestral virus sequences that gave rise to the recently isolated virus genome. The length of each branch corresponds to the amount of genetic change between the ancestral virus and the currently circulating virus. Additional algorithms may be applied to relate the amount of evolutionary change to a temporal scale. A viral sequence that is very distantly related may be used as an outgroup to orient the tree with a direction of evolutionary change. Because close relatives tend to be similar in their biology, phylogenetic analyses that reveal the ancestry of an unknown virus enable initial hypotheses about its basic life history, including its hosts and transmission patterns.
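How pairwise distances turn into a tree with branch lengths can be sketched with UPGMA, a simple distance-based clustering method. This is a substitute chosen for brevity, not the method behind Figure 4.4, which is based on maximum likelihood. UPGMA assumes a constant rate of evolution and places the root itself rather than using an outgroup. The tip names and distances below are hypothetical.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster tips by UPGMA and return the tree as a Newick string.

    names: list of tip labels
    dist:  dict mapping frozenset({a, b}) to the distance between tips a and b
    """
    # Each cluster maps its label to (newick text, number of tips, node height).
    clusters = {name: (name, 1, 0.0) for name in names}
    d = dict(dist)
    while len(clusters) > 1:
        # Join the two clusters separated by the smallest average distance.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        nwk_a, size_a, h_a = clusters.pop(a)
        nwk_b, size_b, h_b = clusters.pop(b)
        height = d[frozenset((a, b))] / 2.0
        # A branch length is the height difference between parent and child.
        merged = f"({nwk_a}:{height - h_a:.3f},{nwk_b}:{height - h_b:.3f})"
        label = f"{a}+{b}"
        # Distance to every remaining cluster is the size-weighted mean.
        for c in clusters:
            d[frozenset((label, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[label] = (merged, size_a + size_b, height)
    newick = next(iter(clusters.values()))[0]
    return newick + ";"


# Hypothetical pairwise distances (substitutions per site), for illustration.
names = ["isolate", "refA", "refB", "outgroup"]
pairwise = {
    ("isolate", "refA"): 0.02,
    ("isolate", "refB"): 0.10,
    ("refA", "refB"): 0.10,
    ("isolate", "outgroup"): 0.40,
    ("refA", "outgroup"): 0.40,
    ("refB", "outgroup"): 0.40,
}
dist = {frozenset(pair): value for pair, value in pairwise.items()}

tree = upgma(names, dist)
print(tree)
```

In the resulting Newick string the new isolate is grouped with refA on short branches, which is the kind of placement used to classify an unknown virus. The distant outgroup sequence joins last, at the root.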
The phylogenetic tree can be interpreted as clusters of virus sequences that are used in classification schemes, and often these groupings complement traditional methods. Broad virus families can be defined using sequences from conserved genes such as polymerases and other enzymes required in the virus life cycle. Finer distinctions are noted by using sequences from genes that are more specialized for a particular virus. For example, genes encoding envelope and other structural proteins are often used to define virus subgroups within larger families. These subgroups often correspond to the serogroups defined by traditional serological methods, which group viruses by their ability to cross-react with a particular antibody. Entire genes or portions of genes can be used in the phylogenetic analysis, but complete virus genomes are rarely used, because the sequence length that can be analyzed is limited by computation time.
[Figure 4.4: phylogenetic tree. Tip labels include Japanese encephalitis virus, Murray Valley encephalitis virus, Kunjin, West Nile, St. Louis encephalitis virus, Dengue 1 to 4, Banzi, Uganda S, yellow fever, Sepik, Kyasanur Forest, louping ill, and tick-borne encephalitis.]
Adapted from Holmes and Twiddy, Infection, Genetics and Evolution 3:19-28, 2003.
Figure 4.4 Phylogenetic tree of the RNA virus family Flaviviridae. The family was originally defined by serological assays; the phylogenetic tree shows inferred genealogical relationships based on maximum likelihood analysis of nucleic acid sequences. The family is divided into separate genera, Pestivirus, Hepacivirus, and Flavivirus. Flaviviruses are further subdivided into species that fall into three groups that can be transmitted via ticks or mosquitoes or without an arthropod vector (black vertical lines). Dengue viruses are separated into four serotypes or genotypes based on serology and envelope protein divergence. Within each Dengue serotype/genotype, representatives responsible for outbreaks of disease have been isolated from around the world. Given that RNA viruses mutate rapidly, a given isolate from an infected individual exists as a nonhomogeneous population of variants termed a "quasispecies." Figure adapted from Ref. (10).
Epidemiological questions can be resolved with phylogenetic trees. The most fundamental questions in epidemiology are the mode of transmission and the origin of an outbreak. As the virus is passed through a population, genetic differences accumulate. Viruses from more recent infections generally share more derived genetic changes, and the most closely related sequences cluster together on a tree. Transmission patterns are revealed as virus sequences from isolates are compared with sequences from a database and the clusters of sequences reveal a common ancestor.
An immediate question to solve during a virus outbreak is the mechanism by which the virus spreads. Viruses frequently infect animal or insect vectors that serve to pass the virus to humans. Once the virus has been identified through sequence analysis, a hypothetical reservoir can be predicted from the placement of the sequence on the phylogenetic tree, since viruses that share a mode of transmission often cluster together. More than one type of vector may be used within the same family of viruses, but individual members depend on a particular vector. Viruses from the family Bunyaviridae can be transmitted to humans by such pests as ticks, mosquitoes, flies, and rodents. The particular vector utilized often distinguishes individual subgroups within the family. Phylogenetic trees have also been instrumental in demonstrating virus transmission among family members, within hospital settings, and among susceptible members of a population. By comparing the virus sequence isolated from a patient with virus sequences isolated from individuals in the population, transmission routes can be deduced. For example, it can be concluded that a doctor became infected from a patient seen at a hospital rather than from the general community if the virus sequences from doctor and patient are more similar to each other than to sequences found in the population.
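The doctor and patient comparison can be sketched as a simple distance check. The example assumes aligned sequences of equal length and uses a plain count of differing sites. An actual investigation would place all sequences on a phylogenetic tree and assess statistical support before drawing a conclusion. The function names and every sequence below are hypothetical.

```python
def hamming(seq_a: str, seq_b: str) -> int:
    """Number of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def closer_to_candidate(case_seq, candidate_seq, community_seqs):
    """Compare a case with a suspected source and with community isolates.

    Returns (distance to candidate, smallest distance to any community
    isolate, True if the candidate is strictly closer than every
    community isolate).
    """
    d_candidate = hamming(case_seq, candidate_seq)
    d_community = min(hamming(case_seq, s) for s in community_seqs)
    return d_candidate, d_community, d_candidate < d_community


# Hypothetical aligned fragments, invented for illustration only.
doctor = "ATGGCTAACGTTAGC"
patient = "ATGGCTAACGTTAGT"
community = [
    "ATGACTAACGTCAGA",
    "ATGGCTTACGATAGT",
]

d_patient, d_population, patient_is_closer = closer_to_candidate(
    doctor, patient, community
)
print(d_patient, d_population, patient_is_closer)
```

Here the doctor's sequence differs from the patient's at fewer sites than from either community isolate, which is consistent with infection from the patient. A small difference in distance does not establish the direction or certainty of transmission on its own.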
The geographic location of virus infection can also be predicted by the use of phylogenetic trees. Viruses isolated from reservoirs in different geographic locations can define where people became infected. Arenaviruses are carried by rodents, but domestic rodents are often responsible for causing infections rather than rodents found in the fields or forests. By analyzing virus sequences obtained from infected patients and from house, field, and forest mice, the group of mice serving as a reservoir can be determined, and public health measures to prevent rodents from entering homes can be implemented. The origin of a virus outbreak can also involve larger geographic areas. As the SARS outbreak demonstrates, importation of new viruses is an increasing global health concern. By accessing a database of virus genomes from around the world, a new virus outbreak could be rapidly identified as having been imported from another region. As viruses travel, some become endemic, and it is necessary to distinguish between endemic and imported outbreaks so that appropriate control measures can be applied. Viruses may also cause seasonal outbreaks. Influenza virus changes its genome from year to year. By showing the extent to which the virus alters its genome from one year to the next, phylogenetic analysis assists in predicting the next year's strain and allows scientists to begin production of the upcoming yearly vaccine.