Protein Signature Pipeline

With straightforward extensions, the techniques described above for nucleic acid detection diagnostics can be applied to protein sequences. When multiple proteomes are available from different strains or isolates, we use DIALIGN to perform a multiple sequence alignment on each protein and determine a protein consensus. The consensus sequences are then compared against the GenBank nonredundant protein database, nr, using Vmatch. The output is analyzed for apparently unique peptides of a given length (normally six or more amino acids in our application). Longer unique peptides are favored if, for example, all their substrings of length 6 are also unique.
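The uniqueness criterion can be illustrated with a small sketch. This is not the pipeline's implementation: the real comparison runs Vmatch against nr, whereas here a short in-memory list of background sequences stands in for the database, and the function names (`kmers`, `background_kmers`, `unique_peptides`) are hypothetical. The sketch reports maximal stretches of the consensus in which every length-6 substring is absent from the background, which is one way to read the preference for longer peptides whose 6-mers are all unique.

```python
from typing import Iterable, Iterator, List, Set, Tuple

K = 6  # minimum peptide length, following the text


def kmers(seq: str, k: int = K) -> Iterator[Tuple[int, str]]:
    """Yield (start, k-mer) for every window of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield i, seq[i:i + k]


def background_kmers(background: Iterable[str], k: int = K) -> Set[str]:
    """Collect every k-mer seen in the background sequences.

    Assumption: a toy stand-in for a search against nr.
    """
    seen: Set[str] = set()
    for seq in background:
        for _, kmer in kmers(seq, k):
            seen.add(kmer)
    return seen


def unique_peptides(consensus: str, background: Iterable[str],
                    k: int = K) -> List[Tuple[int, str]]:
    """Return (start, peptide) for each maximal stretch of the consensus
    whose k-mers are all absent from the background."""
    bg = background_kmers(background, k)
    flags = [kmer not in bg for _, kmer in kmers(consensus, k)]
    results: List[Tuple[int, str]] = []
    run_start = None
    # A trailing False acts as a sentinel that closes the final run.
    for i, is_unique in enumerate(flags + [False]):
        if is_unique and run_start is None:
            run_start = i
        elif not is_unique and run_start is not None:
            # Unique k-mers start at run_start..i-1, so the peptide
            # covers residues run_start..(i - 1 + k - 1).
            results.append((run_start, consensus[run_start:i - 1 + k]))
            run_start = None
    return results


if __name__ == "__main__":
    hits = unique_peptides("MKTAYIAKQRQISFVK", ["MKTAYIAAAA", "QISFVKSHFS"])
    for start, peptide in hits:
        print(start, peptide)
```

Exact k-mer lookup is used here only for brevity; a production search would also have to deal with near matches and database scale, which is why an indexed tool is used in practice.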

We focus on proteins known, through annotation, to be presented on the outer surface of the pathogen in the form in which we expect to encounter it (e.g., the spore in the case of anthrax). Because few pathogen virulence proteins have had their structures determined experimentally, we use the protein structure prediction method described above to attempt to produce a 3-D model. We must then determine which of the apparently conserved and unique protein sequence fragments lie on the protein surface in its normal conformation. When we have been able to produce a structure model, we use NACCESS to assess solvent accessibility. We eliminate from further consideration all candidate monoclonal antibody targets that are not sufficiently exposed. Our protein signature development capability now far exceeds the speed at which monoclonal antibodies can be created for testing in the wet lab.
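The exposure filter can be sketched as follows. The text does not state what counts as "sufficiently exposed", so the cutoffs below (`min_rel` of 25% relative accessibility and `min_fraction` of 0.5) are illustrative assumptions, as is the function name `exposed_candidates`. The sketch also assumes that per-residue relative accessibility values have already been extracted from the accessibility calculation into a list indexed like the protein sequence.

```python
from typing import List, Sequence, Tuple


def exposed_candidates(
    candidates: Sequence[Tuple[int, str]],
    rel_accessibility: Sequence[float],
    min_rel: float = 25.0,      # assumed cutoff, percent relative accessibility
    min_fraction: float = 0.5,  # assumed share of residues that must be exposed
) -> List[Tuple[int, str, float]]:
    """Keep candidate peptides whose residues are mostly solvent exposed.

    candidates: (start, peptide) pairs, with start a 0-based sequence index.
    rel_accessibility: one relative accessibility value per residue.
    Returns (start, peptide, exposed_fraction) for each retained peptide.
    """
    kept: List[Tuple[int, str, float]] = []
    for start, peptide in candidates:
        window = rel_accessibility[start:start + len(peptide)]
        # Skip empty peptides and peptides that run past the modeled region.
        if not peptide or len(window) != len(peptide):
            continue
        exposed = sum(1 for value in window if value >= min_rel)
        fraction = exposed / len(window)
        if fraction >= min_fraction:
            kept.append((start, peptide, fraction))
    return kept


if __name__ == "__main__":
    accessibility = [5, 10, 40, 60, 80, 30, 2, 1, 0, 3]
    peptides = [(2, "ABCD"), (6, "WXYZ")]
    print(exposed_candidates(peptides, accessibility))
```

Scoring by the fraction of exposed residues rather than requiring every residue to be exposed is a design choice made for the sketch; a stricter or looser rule may suit a particular antibody target.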

Figure 15.7 shows the result of this process for one pathogen protein. Regions shown in green indicate conserved and unique peptides that are accessible on the surface of the protein, and thus may provide good antibody detection targets. Potential applications of this process to determine vaccine or therapeutic targets, in addition to detection diagnostics, should be apparent. Additional work in progress at LLNL, requiring the use of very high-end parallel computers, is attempting to mine this information to automatically determine good HAL targets and, ultimately, reagents. A target goal would be to sequence a new pathogen in a day, annotate in a day, develop nucleic acid diagnostics in a day, model all proteins in a day, and develop protein diagnostics before the end of the week.

FIGURE 15.7 A 3-D model of a pathogen protein, highlighting conserved and unique protein sequence peptides that are accessible on the protein surface. These indicate potential locations for protein detection signatures. This example shows how the basic structure model shown in Fig. 15.3 can be further developed. (See color insert.)
