Advanced Sequencing Technologies

Worldwide, approximately US$3 billion was spent in 2005 on all aspects of genomic sequencing: analyzer equipment, software, and sequencing reagents and enzymes. The vast majority of this sequence output was determined using CE technology with Sanger sequencing, which has been developed progressively over the past 20 years. Following the commercial release of the "Sequencer GS20" array pyrosequencer by the 454 Life Science Corporation (Margulies et al., 2005a, 2007), this industrial space will be increasingly occupied by sequencing technologies that do not use electrophoresis. For many conventional sequencing projects, however, Sanger sequencing employing fragment separation will remain the preferred technology, particularly if its advantage of long sequence reads is used in combination with the alternative ultra-high-throughput sequencing technologies that generate short read lengths (Desany et al., 2005).

2.1. Capillary electrophoresis and Sanger sequencing

CE offers high resolution and high throughput, automatic operation and data acquisition, with on-line detection of dyes bound to DNA extension products; for reviews see Kan et al. (2004) and Mitchelson (2003). Operational advances such as graduated electric fields and automated thermal ramping programs as the run progresses can result in higher base resolution and longer sequence reads. Advanced base-calling algorithms and DNA marker additives that utilize known fragment sizing landmarks can also help to improve fragment base calling, increasing call accuracy and read lengths by 20-30% (http://www.nucleics.com). Yet, despite the high efficiency of CE sequencers, the delineation of the human genome and its implications for genome-wide analysis in personalized medicine are driving the development of devices and chemistries capable of massively increased sequence throughput compared with that achievable on conventional CE sequencers. Miniaturization of CE onto chip-based devices provides some increased efficiencies: a significant improvement in the speed of long sequence reads and improved automation of handling and analysis. However, the new array-based sequencing devices also promise a step increase in efficiency, which microchip sequencers will find hard to match. Each of these new devices provides extremely high throughput, high-quality data and low process costs, yet currently their reads average only between 30 and 100 bp in length.

2.2. High-throughput capillary-array sequencing

A CE instrument comprises two electrolyte chambers linked by a thin silica capillary 50-100 µm in diameter, or by a fine microchannel on a silicon chip. The thin capillary rapidly dissipates heat generated by the large electric fields, stabilizing band resolution. An in-line detector positioned close to the capillary outlet acquires data from the size-fractionated molecules. Typically, dyes attached to the DNA fragments are detected using laser-induced fluorescence (LIF). During the recent accumulation phase of genome research, electrophoresis-based capillary-array sequence analysis developed rapidly, becoming the paradigm for DNA sequencing and providing simultaneous multi-parallel analyses. Capillary-array electrophoresis (CAE) is a multiplied version of conventional CE, with up to 100 parallel capillaries, or channels on miniature CE chips arranged radially (Paegel et al., 2002a, 2003), such that each one can simultaneously analyze an individual sample. Sequencers such as the Megabace 4500 system (GE Healthcare) with 384 capillaries process sequence data from long 800 bp reads, equivalent to some 2.8 Mb per day. New block co-polymer separation media for CE and chip electrophoresis devices represent a new paradigm for longer and faster sequence analysis (Doherty et al., 2004). Sparsely cross-linked "nanogels" composed of sub-colloidal polymer structures of covalently linked, linear polyacrylamide (LPA) chains act as novel replaceable DNA sequencing matrixes, particularly for microchip electrophoresis (Barron, 2006). The physical network stability provided by the internally cross-linked structure of the nanogels results in substantially longer average read lengths compared to conventional LPA matrix. More conventional sequencing developments involving Sanger sequencing include improvements in chip CE equipment and design (Kan et al., 2004; Xiong and Cheng, 2007), while Jovanovich and colleagues (http://microchipbiotech.com/) and Blazej et al. (2006) intend to push Sanger-based sequencing toward its performance limit in a fully automated, bench-top system. The heart of these systems will be a microchip-based device that can label and process DNA fragments from individual microbeads or solution in low-volume reactions, followed by ultra-fast separation and analysis on microfabricated CE channels.
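
The capillary-array throughput quoted above can be checked with simple arithmetic. The following Python sketch takes the 384 capillaries, 800 bp reads and 2.8 Mb per day cited in the text and derives the number of runs per day and the run duration that those figures imply. The function names are illustrative, and the assumption that every capillary yields a full-length read is ours, not the manufacturer's.

```python
def bases_per_run(capillaries: int, read_length_bp: int) -> int:
    """Bases produced by one run, assuming every capillary yields a full read."""
    return capillaries * read_length_bp


def implied_runs_per_day(daily_output_bp: float, capillaries: int,
                         read_length_bp: int) -> float:
    """Number of runs per day needed to reach the quoted daily output."""
    return daily_output_bp / bases_per_run(capillaries, read_length_bp)


# Figures quoted in the text for a 384-capillary instrument.
per_run = bases_per_run(384, 800)
runs = implied_runs_per_day(2.8e6, 384, 800)
hours_per_run = 24 / runs  # assumes round-the-clock operation

print(f"{per_run} bases/run, {runs:.1f} runs/day, {hours_per_run:.1f} h/run")
```

Under these assumptions the quoted output corresponds to roughly nine runs per day, or a little over two and a half hours per run including any turnaround time.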

2.3. Signal detection dyes and detectors

Dye-tagged nucleotides with higher sensitivity and better spectral discrimination than earlier fluorophore dyes are being developed, in concert with new enzyme systems to efficiently incorporate them into DNA (Metzker, 2005; Kumar and Fuller, 2007). Highly sensitive photometric devices such as CCD cameras can detect extremely low numbers of signal molecules, despite constraints on the ease of detection of single molecules due to photobleaching of dyes (Eggeling et al., 2006). Micromachined sheath-flow cuvettes that precisely control both capillary alignment and matrix flow, and confocal LIF systems that use one lens for both excitation and detection optics and scan continuously across a bundle of capillaries held in register, can provide both longer reads and more accurate base-calling. Alternative detection methods such as time-resolved fluorescence decay, electrochemical detection, chemiluminescence and near-infrared (NIR) detectors can also be incorporated into CE devices, yet none has so far emerged as a competitor to fluorescent dyes except for the IR dye sequencing systems from LiCor (www.licor.com). Recent development of efficient incorporation systems for these alternative dyes will make their use more widespread. Dye-tagged nucleotides with longer linker lengths and charge matching can also improve the incorporation of these bulky molecules into nascent DNA (Kumar and Fuller, 2007). Finn et al. (2003) created a series of charge-modified, dye-labeled 2',3'-dideoxynucleoside-5'-triphosphate terminators, which possess a net positive charge and migrate in the opposite direction to dye-labeled Sanger fragments during electrophoresis. The charge-modified nucleotides are efficiently incorporated by a number of DNA polymerases. Post-sequencing purification is not required to remove unreacted nucleotides prior to electrophoresis, and DNA sequencing reaction mixtures can be loaded directly onto a separating medium.
New nucleotides that are dye-labeled at the terminal phosphate (Kumar et al., 2005; Edwards et al., 2007) will also play a role in the new single nucleotide-addition sequencing (SBE) systems. Chain extension halts after each addition step, allowing excitation and fluorescent signal detection, and further extension is prevented until the dye is removed, regenerating a terminal phosphate.

2.4. Microchip electrophoresis

We also draw the reader's attention to reviews by Bruin (2000) and Xiong and Cheng (2007) and to research by Krishnan et al. (2001) on recent advances in the miniaturization and automation of nano- and microliter-volume reaction processing and their roles in the improvement of DNA amplification processes and in alternative approaches to sequencing. The microchip CE analysis format has significant advantages over conventional CE. It is up to 10 times faster and uses sub-microliter volumes of analysis reagents. Pal and colleagues (2005) elegantly demonstrated extremely rapid DNA analysis using an integrated microchip system incorporating both DNA amplification and CE separation of products, and identified the sequence-specific hemagglutinin A subtype for the A/LA/1/87 strain of influenza virus. This system integrated fluidic and thermal components such as heaters, temperature sensors and addressable valves to control two nanoliter reactors in series and is suitable for a variety of genetic analyses. Significant advances in the amplification and detection of single DNA template molecules on integrated devices are providing unprecedented levels of sensitivity.

New constriction-microchannel designs from Paegel et al. (2002a, 2002b, 2003) improve fragment resolution and increase the scope for longer path lengths, permitting single base resolution over longer fragments (see also Figure 2h). The miniaturization of chip CE systems with nanochannels < 1 µm allows analysis to be undertaken on limiting numbers of molecules held at pico- and nanomolar concentrations, with amplification and detection of signals from single template molecules (Xiong and Cheng, 2007). Microfabricated multi-reflection absorbance cells for microchip-based CE are being built with 5- to 10-fold enhanced sensitivity over single-pass devices. These schemes are being built into devices with hundreds of capillaries to achieve high speed, extremely high throughput and high detection sensitivity.

2.5. Capillary electrophoretic sequencing on microcapillary chips

Fig. 2. An integrated nanoliter-scale nucleic acid bioprocessor for Sanger DNA sequencing developed by the Mathies group. (a) Top view of the assembled bioprocessor containing two sets of thermal cycling reactors, purification/concentration chambers, CE channels (black), RTDs (red), microvalves/pumps (green), pneumatic manifold channels (blue) and surface heaters (orange). (b) Expanded view, showing microdevice layers. Rim colors indicate the surface on which the respective features are fabricated. The top two glass wafers are thermally bonded and then assembled with a featureless PDMS membrane and manifold wafer. (c) A photograph of the microdevice, showing one of the two complete nucleic acid processing systems. Colors indicate the location of sequencing reagent (green), capture gel (yellow), separation gel (red) and pneumatic channels (blue) (scale bar, 5 mm). Notations d-h correspond to the following component microphotographs: (d) a 250-nl thermal cycling reactor with RTDs (scale bar, 1 mm); (e) a 5-nl displacement volume microvalve; (f) a 500-µm diameter via hole; (g) capture chamber and cross injector; (h) a 65-µm wide tapered turn (scale bars, 300 µm). All features are etched to a depth of 30 µm. Reprinted from Blazej et al. (2006). Copyright (2006), reprinted with permission from National Academy of Sciences (USA).

Doherty et al. (2004) demonstrated the improvement of microchip-based DNA sequencing read lengths and base-call accuracy with nanogel matrixes in a high-throughput microfabricated DNA sequencing device consisting of 96 separation channels densely fabricated on a 6-in. glass wafer. Aborn et al. (2005) also described the development of a 768-lane microfabricated system built from two large-format (25 cm x 50 cm) 384-lane arrays for high-throughput de novo genomic DNA sequencing. The two 384-lane plates are alternately cycled between electrophoresis and regeneration and achieve a total of >172,000 bases at 99% accuracy (quality score Phred 20) for each run of a 384-lane plate. This corresponds to a throughput of >4 Mb of raw Phred 20 sequence per day. This microcapillary format allows operation at "1/32x" Sanger chemistry, and tests suggest that the sample can be further reduced to 1/256x Sanger chemistry in the microdevice. Yet this microcapillary device still used conventional microliter-scale processing, which generates 1000-fold more product than is needed for analysis (Shi et al., 1999). Together, these advances directly address the cost model requirements for the next step beyond capillary array instruments, while retaining the long read lengths of Sanger chemistries. Other nanoreactors, with serial electrodes that provide for high "sweeping field" separation using low-voltage supplies suited to hand-held devices, can achieve PCR amplification in 15 min and CE analysis in 2 min (Krishnan et al., 2001; Pal et al., 2005). The nanoreactors can be interfaced to either microelectrophoresis chips or capillary gel tubes via micromachined capillary connectors or zero-dead-volume unions, and signals are detected using an NIR fluorescence detector. Moreover, new low-voltage closed-loop CE devices also offer the promise of hand-held or readily transportable analysis instruments. Microdevices that require no operator intervention and that integrate sample purification, sample amplification, amplicon product purification and DNA sequencing by CE have been developed (Paegel et al., 2002a). Unincorporated dye terminators are electrically separated from sequencing products under high voltage into a waste channel, prior to diverting the sample into a separation capillary for size resolution and sequence analysis.
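
The link between "Phred 20" and 99% accuracy quoted for the 384-lane plates follows from the standard Phred definition, Q = -10 log10(P), where P is the per-base error probability. The Python sketch below applies that definition and then derives two figures implied by the quoted lower bounds: the mean number of Phred 20 bases per lane and the number of plate runs per day. The variable and function names are illustrative, and because the text gives only lower bounds (">172,000", ">4 Mb") the derived values are approximate.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score Q to a per-base error probability."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Convert a per-base error probability to a Phred quality score."""
    return -10 * math.log10(p)


# Figures quoted in the text for one 384-lane plate (both are lower bounds).
BASES_PER_RUN = 172_000
LANES_PER_PLATE = 384
BASES_PER_DAY = 4_000_000

accuracy_at_q20 = 1 - phred_to_error_probability(20)
mean_bases_per_lane = BASES_PER_RUN / LANES_PER_PLATE
implied_runs_per_day = BASES_PER_DAY / BASES_PER_RUN

print(f"Phred 20 accuracy: {accuracy_at_q20:.2%}")
print(f"Mean Phred 20 bases per lane: {mean_bases_per_lane:.0f}")
print(f"Implied plate runs per day: {implied_runs_per_day:.1f}")
```

On these numbers each lane contributes roughly 450 Phred 20 bases per run, and the daily figure implies about 23 plate runs per day across the two alternating plates.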

Further advances with these types of Sanger sequencing devices may permit complete CE analysis in an easily transportable format (a DARPA request), a capacity that the large new SBS array devices with ultra-sensitive optical systems currently do not allow. In this regard, Blazej et al. (2006) recently reported on the construction and application of an efficient, nanoliter-scale microfabricated bioprocessor (Figure 2) integrating all three Sanger sequencing steps of thermal cycling, sample purification and CE. The design had a number of novel features that aided miniaturization into an integrated device: the use of electrophoretic and pneumatic forces for sample movement improved sample transfer through holes into channels, and the transition from a monolithic substrate to a hybrid glass-PDMS assembly was also critical to the function of the device (Figure 2a-c). Multi-layer construction also enabled a much greater design complexity and permitted the exchange of materials across fluidic and pneumatic lines (Skelley et al., 2005), and was necessary for parallel processing. The wafer-scale device was constructed to form a single microfabricated instrument with 250-nl reactor chambers, affinity-capture purification chambers, high-performance CE channels, and pneumatic valves and pumps. This device involved "lab-on-a-chip-level integration" (Krishnan et al., 2001; Hansen and Quake, 2003) of each of these functional components and was shown to be capable of undertaking complete Sanger sequencing from only 1 fmol of DNA template. The associated development of optimized capture and resolution gels also aided single base separation. The volume of affinity-capture gel (Paegel et al., 2002b) used to pre-concentrate the sequencing products was scaled down to 250 nl to eliminate excess sample.
The resolution of the commonly used dye-terminator samples was improved by extension of the separation channel from 16 to 30 cm, which increased the resolving power of the system, producing error rates of ~1 in a million between 100 and 300 bases read. The device was capable of single-base-length fractionation and thus continuous reading of up to 556 bases with 99% accuracy. The lengths of these reads are thus still superior to those of the best SBS devices and are realistically the lengths required for de novo sequencing of mammalian and other complex genomes.

Blazej et al. (2006) note that additional improvements to their system may be possible, and that appropriately tuned separation gels (Doherty et al., 2004) could resolve peaks uniformly over greater amplicon lengths and extend the range currently read at >99% accuracy. Improvement of injection techniques combined with increased scanner sensitivity could extrapolate the ultimate minimal template quantity to a conservative ~100 amol within a fabricated 25-nl reactor (requiring a further 10-fold reduction in volume), which would represent a 400-fold reduction from the current Sanger sequencing reagent consumption and require 800-fold less DNA template (Karlinsey et al., 2005). The practical limits to any reduction in reaction volume will be determined by the sensitivity of the detection system. The ability to achieve single-molecule detection (SMD), coupled with the miniaturization technologies described above, will be required for the analysis and manipulation of samples on a single-molecule scale. Dittrich and Manz (2005) present the unique benefits of single fluorescent molecule detection in microfluidic channels, which may be central to the reduction of Sanger sequencing chemistries. The integrated device of Blazej et al. (2006) provided a new performance benchmark for evaluating the feasibility of miniaturizing high-throughput Sanger sequencing and for comparing the costs of this established technology against newly emerging solid-phase-array sequencing systems with low consumable costs (Margulies et al., 2007).
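
The scaling figures in the preceding paragraph can be laid out as a short calculation. The Python sketch below uses only numbers quoted in the text (1 fmol and 250 nl for the reported device; ~100 amol and 25 nl for the projected one; 400-fold and 800-fold reductions relative to current Sanger practice) and back-calculates the conventional reaction they imply. Two points are our assumptions rather than statements from the source: that the 400-fold reagent reduction can be read as a reduction in reaction volume, and that the fold values are measured against the projected device. Variable names are illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules(amount_mol: float) -> float:
    """Number of template molecules in a given molar amount."""
    return amount_mol * AVOGADRO


# Figures quoted in the text.
reported_template_mol = 1e-15     # 1 fmol in the reported device
projected_template_mol = 100e-18  # ~100 amol projected minimum
reported_reactor_l = 250e-9       # 250-nl reactor
projected_reactor_l = 25e-9       # 25-nl reactor

template_fold = reported_template_mol / projected_template_mol
volume_fold = reported_reactor_l / projected_reactor_l

# Assumption: fold reductions are relative to the projected device,
# and reagent consumption scales with reaction volume.
implied_conventional_template_mol = projected_template_mol * 800
implied_conventional_volume_l = projected_reactor_l * 400

print(f"Template reduction, reported to projected: {template_fold:.0f}-fold")
print(f"Volume reduction, reported to projected: {volume_fold:.0f}-fold")
print(f"Projected template molecules: {molecules(projected_template_mol):.2e}")
print(f"Implied conventional template: {implied_conventional_template_mol * 1e15:.0f} fmol")
print(f"Implied conventional volume: {implied_conventional_volume_l * 1e6:.0f} microliters")
```

Read this way, the quoted fold reductions correspond to a conventional reaction of about 10 microliters containing about 80 fmol of template, and even the projected 100 amol still represents tens of millions of template molecules, which is why detection sensitivity rather than template supply sets the practical limit.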

2.6. Sequencing by mass spectrometry

Mass spectrometry is used to determine the sequence of a polynucleotide by analysis of the atomic masses of a series of polynucleotide sub-fragments derived by partial and uniform fragmentation, or by uniform extension, of the polynucleotide (see this volume, Ehrich et al., 2007). The polynucleotide sub-fragments are released from a solid phase and analyzed by matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF MS). The techniques include partial enzymatic cleavage, chemical cleavage, base-specific cleavage of PCR products and primer extension Sanger sequencing (single nucleotide primer extension, SNuPE) methodologies. Nucleotide analogs are used widely during DNA sequence analysis by mass spectrometry: to modify polynucleotide electrochemistry and stabilize the N-glycosidic linkages, to block base loss and subsequent random backbone cleavage, or to introduce site-specific backbone weakness for controlled fragmentation at particular types of base. Wolfe et al. (2002, 2003) have employed analogs such as 7-deaza analogs, 5'-amino-5'-deoxy- and 5'-amino-2',5'-dideoxy-analogs, and stable mass isotope tags to substitute for particular nucleotides. These nucleotide analogs introduce differential mass properties or differential stability into the DNA subfragments, which improves the mass separations. Similarly, acids and bases can cleave dideoxy and amine-modified analog backbones, generating sequencing ladders that may be analyzed by mass spectrometry.
