In the Examples section below we describe an existing, fully automated computational pipeline for whole-genome analysis that defines candidate DNA signatures for bacterial and viral pathogens. Depending on the organism, this can yield anywhere from very few (or zero) to several tens of thousands of anonymous candidates. Whatever the number, it is desirable to know the following about the gene, if any, on which each candidate lands:
— Is anything known about that gene?
— Is the gene involved in virulence?
Clearly, gene annotation is vital for the design and selection of pathogen diagnostics produced via whole-genome methods. (In contrast, traditional pathogen diagnostic design starts with a gene known to be involved in virulence.) Candidate unique diagnostic signatures that land on genes known to be involved in virulence should receive the highest priority. Experience has shown that whole-genome analysis may also uncover unique signatures that land on genes of unknown function, and these too might be associated with virulence or host range selection, so some of them should be taken forward for wet lab testing. It may also be prudent to randomly select some signature candidates that land on intergenic regions. These may turn out to be interesting gene regulation regions or genes that were missed by the gene-calling programs. Additionally, a few good intergenic detection signatures provide insurance against genetic engineering intended to foil signatures that target obvious, known virulence gene regions.
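The prioritization just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline presented in the Examples section: the `Candidate` record, its field names, the numeric tiers, and the `select_for_wet_lab` function are all hypothetical. The text does not say how to rank candidates on genes of known, non-virulence function, so the sketch places them in a separate tier and leaves them unselected.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one annotated signature candidate."""
    signature_id: str
    gene: Optional[str]                 # None: the signature lands in an intergenic region
    function_known: bool = False
    virulence_associated: bool = False


def priority_tier(c: Candidate) -> int:
    """Assign an illustrative tier; a lower number means higher priority."""
    if c.gene is None:
        return 3                        # intergenic: sampled at random later
    if c.virulence_associated:
        return 1                        # known virulence gene: highest priority
    if not c.function_known:
        return 2                        # gene of unknown function
    return 4                            # known, non-virulence function (assumed lowest)


def select_for_wet_lab(candidates: List[Candidate], n_unknown: int,
                       n_intergenic: int, seed: int = 0) -> List[Candidate]:
    """Keep all tier-1 candidates, some tier-2, and a random intergenic sample."""
    tiers = {1: [], 2: [], 3: [], 4: []}
    for c in candidates:
        tiers[priority_tier(c)].append(c)

    selected = list(tiers[1])
    # Assumption: take unknown-function candidates in input order, since the
    # text only says that "some" of them should go forward.
    selected += tiers[2][:n_unknown]
    # The text suggests a random selection for intergenic candidates.
    rng = random.Random(seed)
    selected += rng.sample(tiers[3], min(n_intergenic, len(tiers[3])))
    return selected
```

Fixing the random seed keeps the intergenic sample reproducible, which may be helpful when the same candidate list is revisited after wet lab results come back.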
In the realm of protein detection and forensic diagnostic design, annotation plays additional roles. When detection of nondisrupted organisms is desired, it can ensure that the design focuses only on proteins available on the organism's surface (envelope, capsid, or spore surface proteins). Knowledge of active site regions and solvent-accessible protein regions can guide the selection of protein signature candidates.
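As a rough illustration of how such annotation might be applied, the following Python sketch filters protein regions by surface location and solvent accessibility. The `ProteinRegion` record, its fields, the location labels, and `surface_candidates` are hypothetical names chosen for this example. Active-site annotation is left out because the text does not say how it should influence the choice.

```python
from dataclasses import dataclass
from typing import List

# Assumed location labels, taken from the surface protein types named in the text.
SURFACE_LOCATIONS = {"envelope", "capsid", "spore surface"}


@dataclass(frozen=True)
class ProteinRegion:
    """Hypothetical annotated region of a protein."""
    protein: str
    location: str              # annotated cellular or virion location
    start: int
    end: int
    solvent_accessible: bool


def surface_candidates(regions: List[ProteinRegion],
                       intact_organism: bool = True) -> List[ProteinRegion]:
    """Keep solvent-accessible regions; if the organism is to be detected
    intact (nondisrupted), also require a surface location."""
    kept = []
    for r in regions:
        if intact_organism and r.location not in SURFACE_LOCATIONS:
            continue
        if not r.solvent_accessible:
            continue
        kept.append(r)
    return kept
```

Setting `intact_organism=False` drops the surface requirement, reflecting the case where organisms are disrupted before detection and internal proteins become available.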