DNA Sequencing

DNA sequencing is considered the "gold standard" for mutation detection. However, before proceeding with a description of methods and technologies, it is important to point out that sequencing is not perfect, and it certainly is not magic. Like any analytical method, one can encounter false positives and false negatives. However, when properly done and properly interpreted, sequencing a DNA fragment will almost always reveal a high percentage, near 100%, of the sequence variations present.

Fig. 17. PCR-mediated site-directed mutagenesis analysis of the cystic fibrosis mutation G542X. (A) A BstNI site is introduced into the wild-type PCR product by changing the T (indicated by an asterisk) to a C during PCR amplification using mismatched primers. The new BstNI site is indicated by the shaded box and the cutting site is indicated by the arrows. BstNI digestion produces three fragments as a result of the presence of the introduced restriction site and a constitutive restriction site not associated with the mutation. When the wild-type G (surrounded by a small box) is mutated to a T, a BstNI site is not created by the mismatched primers. The PCR product with the mutation is cut only once at the constitutive restriction site to produce two fragments. (B) A 295-bp region of the CFTR gene that flanks the G542X mutation was amplified and digested with BstNI. The PCR products were separated by electrophoresis through a 10% acrylamide gel and then stained with ethidium bromide. Lane 1, normal individual (170- and 101-bp fragments); lane 2, individual heterozygous for the G542X mutation (195-, 170-, and 101-bp fragments); lane 3, water blank; lane 4, φX174 DNA size markers. The 24-bp fragment migrates quickly through the gel and is not visible. (Courtesy of William G. Learning, Molecular Genetics Laboratory, University of North Carolina Hospitals.)

Conceptually, three steps are required to obtain the base sequence of any DNA fragment. First, starting from some defined point (e.g., the 5' end of a PCR product or an annealed primer), one must separate the DNA into four reactions and specifically cleave it at each of the four nucleotide bases; that is, in the A reaction, the DNA must be broken into all of the possible oligonucleotides ending in an A, the G reaction must produce all of the possible oligonucleotides that end in a G, and so on. Second, these oligonucleotides must be labeled in some manner to enable detection. Third, the labeled fragments must be separated, at single-base resolution, and detected.
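The logic of the three conceptual steps can be sketched in a few lines of Python. This is an idealized illustration only, not a laboratory procedure: the function names and the example fragment are hypothetical, every possible fragment is assumed to be present and detected, and a fragment is represented simply by its length and the pool it came from.

```python
from typing import Dict, List


def base_specific_pools(sequence: str) -> Dict[str, List[int]]:
    """Step 1: for each base, collect the lengths of all fragments that
    start at the defined origin and end in that base."""
    pools: Dict[str, List[int]] = {base: [] for base in "ACGT"}
    for position, base in enumerate(sequence, start=1):
        pools[base].append(position)
    return pools


def read_sequence(pools: Dict[str, List[int]]) -> str:
    """Steps 2 and 3: treat every fragment as a detectable band tagged with
    its pool, order the bands by length (single-base resolution), and read
    the pool of each band from shortest to longest."""
    bands = [(length, base) for base, lengths in pools.items() for length in lengths]
    return "".join(base for _, base in sorted(bands))


fragment = "GATTACAGGC"  # hypothetical example fragment
pools = base_specific_pools(fragment)
print(pools["A"])            # [2, 5, 7]
print(read_sequence(pools))  # GATTACAGGC
```

Because each length occurs in exactly one pool, sorting the bands by length recovers the original order of bases, which is why single-base resolution in the separation step is essential.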

In 1977, two articles appeared that demonstrated how to carry out these three steps and read a DNA sequence. The two methods created pools of oligonucleotides, each pool consisting of DNA fragments ending in one of the four bases, in quite different ways. Maxam and Gilbert used chemicals to fragment DNA at specific bases (60). Sanger and colleagues used the DNA to be sequenced as a template and synthesized four pools of new oligonucleotides, each pool terminating at one of the four bases by incorporation of a chain-terminating dideoxynucleotide triphosphate (61). These two publications are considered to be among the most significant milestones in molecular biology. In 1980, Gilbert and Sanger received the Nobel Prize in Chemistry "for their contributions concerning the determination of base sequences in nucleic acids." Although the Maxam-Gilbert chemical method has applications in sequencing small DNA fragments, such as oligonucleotides, almost all DNA sequencing is currently done using the enzymatic Sanger method. Therefore, the rest of this discussion will focus exclusively on the enzymatic Sanger method.

The Sanger sequencing reaction is a multistep process. As most often practiced in the modern diagnostic laboratory, the first step is the PCR amplification of the target of interest. Clearly, this is a critical step, and a poorly designed PCR strategy with nonspecific amplification or poor yields will result in uninterpretable sequence. The next step involves the removal of excess deoxynucleotide triphosphates (dNTPs) and PCR primers. Next, the PCR product is denatured and an oligonucleotide (the sequencing primer) is added and annealed 5' to the region to be sequenced. A DNA polymerase (typically a recombinant thermostable enzyme) and a mixture of dNTPs and dideoxynucleotide triphosphates (ddNTPs) are then added (and the reaction is subjected to thermal cycling if a thermostable enzyme is used). The DNA polymerase will extend the annealed primer in the 5' to 3' direction, making a new strand of DNA that is complementary to the PCR product template. Because the ddNTPs retain the 5' triphosphate group, they can be incorporated into the growing complementary strand of DNA. However, as they lack the 3' hydroxyl, they cannot be further extended by DNA polymerase. Thus, for example, when a ddATP is incorporated, the chain terminates at that A position, complementary to the corresponding T in the template. The rate at which ddNTPs are incorporated in the growing strand is dependent on both the ratio of dNTPs to ddNTPs in the reaction and the efficiency with which the ddNTPs are recognized by the polymerase. Most applications in the clinical laboratory utilize dye-terminator chemistry. In this approach, the detectable label, in this case a fluorescent tag, is linked to the ddNTPs. Each of the four ddNTPs is labeled with one of four fluorescent molecules. Thus, the fragment terminated by incorporation of a ddATP will be labeled with one color, a fragment terminating in a G will be labeled with another color, and so on.
The pool of DNA fragments is then subjected to high-resolution electrophoresis, either slab gel or CE, and the presence of each of the four fluors is scored as the DNA fragments migrate past a fixed, multiwavelength fluorescence detector. The output resembles a chromatogram and is read from the shortest fragment to the longer fragments, with a blue (for example) peak being read as an A, a red peak being read as a C, and so forth. A schematic of the principal modifications of the Sanger sequencing chemistry is shown in Fig. 18.
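The dependence of chain termination on the dNTP:ddNTP ratio and on polymerase discrimination can be illustrated with a deliberately simplified model. The sketch below assumes, purely for illustration, that at every position the polymerase chooses between a dNTP and the matching ddNTP independently of sequence context, with the ddNTP weighted by a single relative-efficiency factor. The function names, parameters, and numerical values are hypothetical and are not taken from the cited work.

```python
def termination_probability(ddntp_to_dntp_ratio: float,
                            relative_efficiency: float = 1.0) -> float:
    """Probability that a ddNTP rather than a dNTP is incorporated at one
    position, under the simplifying assumptions stated above.

    relative_efficiency < 1 models a polymerase that discriminates
    against ddNTPs; 1.0 models no discrimination.
    """
    weighted = ddntp_to_dntp_ratio * relative_efficiency
    return weighted / (1.0 + weighted)


def fraction_terminated_at(k: int, p: float) -> float:
    """Fraction of chains that terminate exactly at the k-th position:
    k - 1 successful extensions followed by one termination."""
    return p * (1.0 - p) ** (k - 1)


# Hypothetical example: one ddNTP per 100 dNTPs, no discrimination.
p = termination_probability(0.01)
for k in (1, 100, 500):
    print(f"position {k}: fraction of chains = {fraction_terminated_at(k, p):.5f}")
print(f"mean fragment length = {1.0 / p:.0f} positions")
```

In this model, raising the ddNTP proportion shortens the average fragment and strengthens the signal from short fragments at the expense of long ones, which is one way to picture why the ratio has to be tuned to the desired read length.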

There have been enormous technical advances in DNA sequencing technology since the first reports. In the early days of sequencing, the ability to read 100 bases after several days of work was the state of the art. Today, automated sequence analyzers capable of turning out sequence at the rate of up to 100,000 to 200,000 bases per day are available and, although expensive, are within the reach of large clinical laboratories.

7.1. ENZYMES USED FOR DNA SEQUENCING One area in which major advances have occurred is the area of enzyme technology (62). The original enzyme used by Sanger et al. was the Klenow fragment of Escherichia coli DNA polymerase I. The Klenow fragment retains the 5' to 3' DNA polymerase activity of the polymerase holoenzyme, but lacks much of the 3' to 5' exonuclease, or proofreading activity. Enzymes that retain the 3' to 5' exonuclease activity are not useful in DNA-sequencing applications for several reasons. First, it is difficult to achieve maximum labeling because the exonuclease activity can degrade the sequencing primer. Second, the exonuclease can remove the labeled dye terminator, allowing the new strand to continue growing by the addition of dNTPs and leaving gaps in the sequence. Third, the polymerase can pause at certain sequences and cycle back and forth between the polymerase and exonuclease activities. This can lead to marked variability in peak intensity as ddNTPs are given many more chances to incorporate at a pause site (63).

The Klenow fragment was used for sequencing for several years and is still used on occasion. However, its low processivity and relatively low elongation rate lead to higher backgrounds and lower signals than some other enzymes. Sequenase™ is the trade name of T7 DNA polymerases developed by Tabor and Richardson. Sequenase version 1 was a chemically modified enzyme that retained the high processivity and elongation rate of the native polymerase while possessing approx 1% of the original 3' to 5' exonuclease activity (64). Sequenase version 2 is a genetically modified enzyme that retains even less exonuclease activity (65). The addition of manganese ions to the sequencing reaction virtually eliminates the incorporation efficiency differences between dNTPs and ddNTPs, resulting in extremely uniform bands (66). Sequenase is marketed by US Biochemicals (Cleveland, OH).

Thermal cycling and the use of thermostable DNA polymerases have certain advantages for DNA sequencing. In contrast to the PCR reaction, in which exponential amplification is achieved by the use of two primers, DNA sequencing uses a single primer. Thus, thermal cycling results in linear amplification of labeled sequencing products. This 20- to 30-fold amplification is sufficient to greatly reduce the amount of template required for each reaction. In addition, because fluorescent-detection methods are not as sensitive as radioactive detection, this linear amplification is particularly useful for fluorescent sequencing. Taq polymerase, the most commonly used enzyme for PCR, does not have significant 3' to 5' exonuclease activity. It does, however, have 5' to 3' exonuclease activity. When used for sequencing, enzymes with this activity can degrade the labeled sequencing products from the 5' end, resulting in undesirable length heterogeneity of products that retain the 3' fluorescent label. Tabor and Richardson developed two genetically modified Taq polymerases in 1995 (67).
One of them, marketed under the name Amplitaq FS® by Applied Biosystems (Foster City, CA), has a single point mutation that virtually eliminates the 5' to 3' exonuclease activity. The other, marketed under the name Thermo-Sequenase® by Amersham Biosciences (Piscataway, NJ) and US Biochemicals (Cleveland, OH), contains another point mutation that greatly reduces the discrimination against ddNTPs, resulting in more even band intensities.
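The contrast between linear and exponential amplification drawn above comes down to simple arithmetic. The sketch below assumes idealized reactions in which every template is copied with a fixed per-cycle efficiency; the function names and the cycle numbers are illustrative choices, not values from a specific protocol.

```python
def cycle_sequencing_yield(cycles: int, efficiency: float = 1.0) -> float:
    """Single primer: products are not templates for further rounds, so
    each cycle adds at most one product per original template strand."""
    return cycles * efficiency


def pcr_yield(cycles: int, efficiency: float = 1.0) -> float:
    """Two primers: each product serves as a template in later cycles, so
    the number of copies is multiplied every cycle."""
    return (1.0 + efficiency) ** cycles


for n in (20, 25, 30):
    print(f"{n} cycles: linear {cycle_sequencing_yield(n):.0f}-fold, "
          f"exponential {pcr_yield(n):.3g}-fold")
```

Under these assumptions, 20 to 30 cycles of single-primer extension give a 20- to 30-fold yield of labeled product per template, whereas the same number of PCR cycles would in principle give a million-fold or greater increase.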

Fig. 18. A schematic of three popular permutations of the Sanger sequencing chemistry. Shown first is the original Sanger method, in which labeling is accomplished by inclusion of 35S- (or 33P-) dATP. The reaction is done in four parts: the G reaction is done in the presence of ddGTP, the A reaction in the presence of ddATP, and so forth. Each reaction is loaded onto one lane of a slab gel. Detection is by autoradiography of the dried gel. Next is a schematic of the fluorescent dye-primer method. In this method, the reaction is again done in four parts: The G reaction is done in the presence of unlabeled ddGTP and a sequencing primer labeled with fluorescent dye 1, the A reaction is done with a primer labeled with dye 2, and so forth. After the extension reaction is completed, the individual reactions are pooled and run on one lane (or one capillary) in a sequence analyzer with a multiwavelength fluorescence detector. The last method is the dye-terminator method, in which the reaction is carried out in a single tube in the presence of all four ddNTPs, with the ddATP labeled with dye 1, the ddGTP labeled with dye 2, and so forth. After the extension reaction, the products are electrophoresed in a single lane (or capillary) and detected with a multiwavelength fluorescence detector.

7.2. LABELING OF ddNTP-TERMINATED FRAGMENTS Another area in which great technical advances have occurred is in the area of labeling technology. The original Sanger method utilized a 35S-labeled deoxynucleotide triphosphate for labeling of the ddNTP-terminated fragments. In this method, a radiolabeled dNTP (typically α-35S dATP or α-33P dATP) is included in the reaction mixture. The labeled dNTP does not terminate the growing chain and is incorporated randomly into the growing DNA chains. After electrophoresis on thin polyacrylamide gels, the gels are dried and exposed to large sheets of X-ray film to visualize the bands. This method is straightforward, does not require expensive equipment, and is still widely practiced today. This method does, however, have some drawbacks in addition to the obvious use of radioactivity. Although the electrophoresis is typically carried out on long (40-50 cm) gels, there is a definite limit to the number of bands that can be resolved on each loading. Under routine conditions, there is very high resolution of lower-molecular-weight fragments that have traveled the longest distance through the gel, and there is less resolution with increasing molecular weight and decreasing distance traveled through the gel. One solution to this problem is to "double load" the gel. In this approach, the four reaction mixtures (each terminated by ddATP, ddCTP, ddGTP, or ddTTP) are loaded in adjacent lanes and subjected to electrophoresis for 1-2 h. The electrophoresis is then paused, the samples are loaded a second time into clean wells, and the electrophoresis resumed. By the time the lowest-molecular-weight bands in the second load have reached the bottom of the gel, these fragments have migrated off the end in the lanes that were loaded first. In the lanes loaded first, the longer fragments are resolved, whereas in the lanes with less electrophoresis time, the shorter fragments are resolved. Using this approach, several hundred bases can be resolved on one gel.
Other approaches involve reduction of resolution at the bottom of the gel while retaining resolution on the upper portion. An example of this strategy is the use of wedge spacers. Spacers, placed between the glass plates before casting the gel, define the shape and thickness of the gel. If the spacers are thicker at the bottom than the top of the gel, the effective voltage (in V/cm) is reduced at the bottom of the gel, leading to decreased mobility and less spacing between shorter fragments. Thus, the gels can be run longer before the smaller fragments migrate off the end, and more bases can be read. The proprietary gel formulation Long Ranger® (Cambrex Bio Science Walkersville, Inc., Walkersville, MD) yields an electrophoretic pattern similar to that seen with wedge spacers, but without the difficulty of pouring non-uniform-shaped gels.
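Why a thicker gel lowers the local field strength can be shown with a small calculation. The sketch below assumes a gel of constant width and uniform conductivity, so that the same current passes through every cross-section and the local field is inversely proportional to the local thickness. The function name, gel dimensions, and applied voltage are hypothetical values chosen only for illustration.

```python
from typing import Callable, List


def field_profile(voltage: float,
                  length_cm: float,
                  thickness_mm: Callable[[float], float],
                  steps: int = 1000) -> List[float]:
    """Return the field strength (V/cm) at steps + 1 evenly spaced points
    from the top (x = 0) to the bottom (x = length_cm) of the gel.

    With constant current, E(x) is proportional to 1 / thickness(x); the
    proportionality constant is fixed by requiring that E integrates to
    the applied voltage over the gel length.
    """
    dx = length_cm / steps
    inverse_thickness = [1.0 / thickness_mm(i * dx) for i in range(steps + 1)]
    # Trapezoid-rule integral of 1 / thickness over the gel length.
    integral = sum(
        0.5 * (inverse_thickness[i] + inverse_thickness[i + 1]) * dx
        for i in range(steps)
    )
    return [voltage * value / integral for value in inverse_thickness]


gel_length_cm = 50.0  # hypothetical gel length
volts = 2000.0        # hypothetical applied voltage

uniform = field_profile(volts, gel_length_cm, lambda x: 0.4)
wedge = field_profile(volts, gel_length_cm,
                      lambda x: 0.2 + 0.4 * x / gel_length_cm)

print(f"uniform gel: {uniform[0]:.1f} V/cm at top, {uniform[-1]:.1f} V/cm at bottom")
print(f"wedge gel:   {wedge[0]:.1f} V/cm at top, {wedge[-1]:.1f} V/cm at bottom")
```

In this idealized wedge, which is three times thicker at the bottom than at the top, the field at the bottom is one-third of the field at the top, so short fragments slow down as they approach the end of the gel.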

Fluorescent labeling of ddNTP-terminated fragments was a major advance in sequencing technology. Smith et al. described a system in 1985 that involved labeling the 5' end of the sequencing primer with four different fluorescent dyes (68,69). The sequencing reactions were carried out in four aliquots, as for radioactive sequencing. However, in the fluorescent approach, the A reaction was carried out with a primer labeled with one fluor, the C reaction was carried out with a primer labeled with a second fluor, and so on. After the reactions were complete, they were pooled and run together in one lane of a polyacrylamide gel and detected with a multiwavelength fluorescence detector located at a fixed position near the bottom of the gel. Automated sequencing was born. Other approaches labeled the primer with a single fluorescent dye and kept the A, C, G, and T reactions separate, with electrophoresis in four lanes (70). The multiple-fluor approach increased throughput by reducing the number of lanes required by a factor of 4, whereas the single-fluor approach decreased instrument complexity. In addition, different geometries for fluorescence excitation and detection have been proposed. Smith et al. utilized a detection format in which the laser and detector are both mounted in a housing that is moved back and forth near the bottom of the gel (69). Ansorge et al. used a number of fixed, single-wavelength detectors and a fixed laser that illuminated the fluors through the gel from the side (70). The four-color, one-lane, moving-detector instrument design was marketed by Applied Biosystems (Foster City, CA) as the 370, 373, and 377 series of DNA sequencers. The one-color, four-lane, fixed-detector design was marketed by Pharmacia (now Amersham Biosciences [Piscataway, NJ]) as the ALF family of sequence analyzers. Both designs have been replaced by CE-based instruments.

The next innovation in sequencing technology was to replace dye-primer chemistry with dye-terminator chemistry. This innovation eliminated the need to prepare an expensive fluor-labeled sequencing primer for each piece of DNA to be sequenced. Perhaps more importantly, it eliminated the need to split each reaction into four. The ability to use unlabeled sequencing primers and the ability to carry out the entire sequencing reaction in one tube allowed for efficient translation to robotic workstations for assay setup and led to a dramatically decreased cost. The four fluorescent dyes that are most commonly used are carboxyfluorescein (abbreviated FAM), carboxy-4',5'-dichloro-2',7'-dimethoxyfluorescein (JOE), carboxytetramethylrhodamine (TAMRA), and carboxy-X-rhodamine (ROX). These dyes are particularly useful because their emission wavelengths are evenly spaced, which facilitates discrimination between colors and gives highly accurate base calls. Two lasers are required, however: one at 488 nm for excitation of the fluorescein dyes (FAM and JOE) and one at 543 nm for excitation of the rhodamine dyes (TAMRA and ROX).

The most recent innovation is the introduction of energy transfer dyes (ET dyes) by Mathies and colleagues (71). Fluorescence resonance energy transfer (FRET) is a quantum phenomenon that takes place when two fluorescent molecules (1) have excitation and emission maxima such that the excitation spectrum of one fluor overlaps the emission spectrum of the other and (2) are very close to one another when the fluor with the lower excitation wavelength is illuminated at its excitation maximum. Under these conditions, when the fluorescent molecule with the lower excitation wavelength moves to the excited state after absorption of a photon, instead of relaxing to the ground state by emission of a photon, a nonradiative transfer of energy occurs in which the first fluor returns to the ground state and the second molecule is excited and fluoresces at its higher emission wavelength. FRET has numerous applications in molecular biology and is the basis for many real-time PCR techniques, such as TaqMan, Molecular Beacons, and FRET probes. As labels for DNA-sequencing reactions, ET dyes have the advantage of requiring only a single laser for excitation. The ET dyes use as second fluors the same FAM, JOE, TAMRA, and ROX dyes used for single-dye labeling. They have been adapted to dye-terminator chemistry (72) and are marketed by Applied Biosystems (Foster City, CA) under the name BigDye®. The "big dyes" are markedly superior to single dyes in terms of signal strength, lack of differential effects on DNA mobility between dyes, and ability to detect heterozygotes (73). The use of these dyes has virtually replaced the use of the single dyes. ET dye-terminator chemistry combined with cycle sequencing (74) and CE separation is the combination of methods of choice for many DNA diagnostics laboratories.
An example of sequence traces around the zidovudine (AZT)-resistance mutation at codon 215 of the human immunodeficiency virus (HIV) reverse transcriptase gene, obtained using these techniques in a "home-brew" protocol, is shown in Fig. 19.
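The requirement that the two fluors in an ET dye be very close to one another can be made quantitative with the standard Förster relation, in which transfer efficiency falls off with the sixth power of the donor-acceptor distance. The sketch below is illustrative only: the function name is hypothetical, and the Förster radius of 5 nm is an assumed round figure, not a measured property of any particular dye pair.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Förster relation: E = 1 / (1 + (r / R0)^6), where r is the
    donor-acceptor distance and R0 is the distance at which transfer
    is 50% efficient."""
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


assumed_r0_nm = 5.0  # assumed illustrative Förster radius
for r in (2.5, 5.0, 10.0):
    print(f"r = {r:4.1f} nm: transfer efficiency = "
          f"{fret_efficiency(r, assumed_r0_nm):.3f}")
```

Halving the distance from the Förster radius raises the efficiency from 50% to about 98%, whereas doubling it reduces the efficiency to under 2%, which is why the donor and acceptor must be held close together within a single label.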

7.3. CAPILLARY ARRAY INSTRUMENTS That CE is superior in many respects to slab gel electrophoresis has been previously noted in this chapter. Other advantages include its speed: a DNA-sequencing run reading 500-600 bases can be completed in approx 2 h, whereas the same separation would typically require 12-16 h on a slab gel instrument. However, slab gels have the clear advantage of being able to handle multiple samples (up to 96 lanes) on one gel, whereas CE analyzes one sample at a time. In 1992, Mathies and Huang introduced the concept of capillary array electrophoresis (CAE) (75). This system utilized 25 parallel capillaries and injection and detection systems that allowed the parallel analysis of 25 DNA-sequencing reactions. Others soon followed with different methods of accomplishing CAE. Currently, to the author's knowledge, there are three manufacturers offering 96-capillary array instruments commercially. The Molecular Dynamics (Sunnyvale, CA) MegaBace 1000 instrument is based on the design of the Mathies group (75). The Applied Biosystems (Foster City, CA) 3700 instrument is based on the work of


Fig. 19. Partial sequence of the HIV reverse transcriptase gene generated using ET dye-primer chemistry and capillary electrophoresis on an ABI 3100 sequence analyzer (Applied Biosystems, Foster City, CA). Shown are 40-base segments of a sequence from the HIV-1 reverse transcriptase genes from (A) a wild-type, AZT-sensitive laboratory strain of HIV-1 and (B) a mutant sequence from a patient who failed AZT therapy. Note the presence of the two point mutations resulting in the T215Y amino acid change. (Courtesy of Heui Ra Yoo, University of Maryland Medical Systems.)

Table 2

Comparison of Some Salient Features of Three CAE Instruments
