The origins of the genomic era arguably lie in the Nobel Prize-winning work of Frederick Sanger, who developed methods that use in vitro DNA synthesis in the presence of dideoxyribonucleotides to generate a ladder of partial-length DNA fragments differing by single-nucleotide steps, allowing the sequence of the DNA polymer to be determined (Sanger et al., 1977a, 1977b, 1982), and who also developed methods of cloning in single-stranded bacteriophage to aid rapid DNA sequencing (Sanger et al., 1980). Sanger sequencing has undergone a remarkable evolution since its radioactive slab-gel beginnings more than 30 years ago, to the current practice of electrophoretic sizing on parallel arrays of microcapillaries and chip-based ultra-microcapillaries. Its pre-eminence throughout the entire era of primary genome discovery stems from its robustness across a wide range of DNA sequence motifs and its unique ability to generate both long reads and highly accurate base calls.
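The ladder-reading principle described above can be illustrated with a minimal sketch. The Python below is an idealized model only, not a description of any instrument or base-calling software: it assumes that exactly one terminated fragment is recovered for every position of the synthesized strand, and the function names `termination_lanes` and `read_ladder` are hypothetical.

```python
BASES = "ACGT"


def termination_lanes(strand: str) -> dict[str, list[int]]:
    """Group fragment lengths by the dideoxynucleotide that terminated them.

    A fragment of length n ends in base b when position n of the
    synthesized strand is b and a dideoxy form of b was incorporated there.
    """
    lanes = {base: [] for base in BASES}
    for position, base in enumerate(strand, start=1):
        lanes[base].append(position)
    return lanes


def read_ladder(lanes: dict[str, list[int]]) -> str:
    """Read the sequence by ordering fragments from shortest to longest.

    This mimics electrophoretic sizing: each one-base step in length
    reveals the terminal base of the next fragment in the ladder.
    """
    base_at_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_at_length[n] for n in sorted(base_at_length))


# Hypothetical ten-base synthesized strand, used purely for illustration.
strand = "GATTACAGGC"
lanes = termination_lanes(strand)
recovered = read_ladder(lanes)
```

In this idealized case `recovered` equals the input strand. Real reads degrade as fragments become longer and harder to resolve, which is one reason read length and base-call accuracy are treated as distinguishing strengths of the method.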
To many scientists, the report of the initial draft sequence of the human genome in 2001 by the International Human Genome Sequencing Consortium (IHGSC, 2001), and the subsequent report of the entire euchromatic genome in 2004 (IHGSC, 2004), ended the decades-long "genomic era". This era commenced in earnest in the early 1990s with initial reports from collaborative projects to sequence the E. coli genome (Yura et al., 1992) and the C. elegans genome (Sulston et al., 1992). Planning for the massive undertaking of sequencing the human genome (Collins and Galas, 1993) also commenced in this period. These large sequencing projects (the Human Genome Project was the largest biological program ever undertaken) began with Sanger sequencing, employing electrophoretic size separation of sequentially terminated primer extensions from fragmented genomic templates. This era also saw a large increase in sequencing efficiency as new technologies for sequence reading (capillary array sequencers, liquid-handling robotics, and improved fluorescent dye chemistries and signal detectors) were developed to meet the worldwide demand for better and more cost-effective DNA sequencing.
We have now ostensibly entered the "post-genomic era", with the accumulation of significantly more than 150 billion base pairs of sequence information (142 Gb at June 2006), which provides an unprecedented opportunity to investigate deeply the information revealed by the known genomes. Yet DNA sequencing will continue as a major and growing research and diagnostic activity: in biomedicine, to understand the basis of disease; to gain understanding of the depth of information encoded in the primary genomes; for forensic, biosafety and species identification purposes; and for comparative, archeological, taxonomic and evolutionary studies, to name but a few of the myriad applications.
The de novo sequencing of the genomes of many economically and medically important species is currently being undertaken (Benson et al., 2005; GenBank, 2006), as is an accelerated re-sequencing of known mammalian genomes to discover the genetic variation underlying phenotypic diversity and disease susceptibility (The International HapMap Consortium, 2003, 2005; The ENCODE Project Consortium, 2004). Each of these sequencing activities will undoubtedly continue apace as more and more variant genomes are examined to determine the genetic component(s) of a myriad of common diseases and conditions in man and in model animals.
Sequencing technology based on Sanger sequencing with capillary electrophoresis (CE) is still the gold standard, and many large and small laboratories have access to it. However, for the planned large-scale screening and re-sequencing projects, the current technology is either too expensive or too limited in capacity. Nevertheless, Blazej et al. (2006) have recently suggested that the continued use of Sanger sequencing may still be viable, and that it is essential to push the cost efficiency and scalability of the technique to their ultimate limits (Figure 1). Blazej and colleagues propose that a fully integrated microfluidic genome sequencing system should achieve this aim, and should also lead to significant infrastructure and labor savings, with template and reagent requirements reduced a further 100-fold from current array CE sequencer levels.
To address the need for a major improvement in sequencing throughput, both companies and governments are pushing to develop novel technologies that will bring the cost of sequencing to between $100,000 (http://grants.nih.gov/grants/guide/rfa-files/RFA-HG-05-003.html) and $1,000 (http://grants.nih.gov/grants/guide/rfa-files/RFA-HG-05-004.html) per whole human genome, compared with today's price of $3 million. In the main, these new technologies move away from Sanger sequencing to modular unitary base addition onto multiple templates arrayed on solid surfaces. They use microfluidics, new approaches to sequence addition and ultra-sensitive signal-reading technologies to capture the signals from the densely arrayed reactions (10⁴-10⁶ features per cm²), and thus bring about enormous increases in productivity and in the volume of quality data generated.
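The scale of these targets can be made concrete with a short worked calculation. The sketch below uses only the costs and feature densities quoted above; the array area is a hypothetical value chosen for illustration, and the function names are likewise illustrative.

```python
# Figures quoted in the text above.
CURRENT_COST_USD = 3_000_000
TARGET_COSTS_USD = {
    "RFA-HG-05-003": 100_000,
    "RFA-HG-05-004": 1_000,
}
FEATURE_DENSITY_PER_CM2 = (1e4, 1e6)  # lower and upper bounds

# Assumption: a 1 cm^2 array, chosen only to make the arithmetic visible.
HYPOTHETICAL_AREA_CM2 = 1.0


def fold_reduction(current_cost: float, target_cost: float) -> float:
    """Return how many times cheaper the target is than the current cost."""
    return current_cost / target_cost


def features_on_array(density_per_cm2: float, area_cm2: float) -> float:
    """Return the number of arrayed reactions for a given density and area."""
    return density_per_cm2 * area_cm2


for programme, target in TARGET_COSTS_USD.items():
    fold = fold_reduction(CURRENT_COST_USD, target)
    print(f"{programme}: {fold:,.0f}-fold cost reduction required")

low, high = (
    features_on_array(density, HYPOTHETICAL_AREA_CM2)
    for density in FEATURE_DENSITY_PER_CM2
)
print(f"{low:,.0f} to {high:,.0f} parallel features on the assumed array")
```

On the quoted figures, the two targets correspond to 30-fold and 3,000-fold reductions in cost per genome, which is why incremental improvement of existing instruments alone is not expected to suffice.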
Several excellent recent reviews (Marziali and Akeson, 2001; Shendure et al., 2004; Metzker, 2005; Mitchelson, 2005; Church et al., 2006) provide comprehensive information about the advantages of these new sequencing technologies and the problems faced in their development. They discuss polymerase colony (polony) sequencing, nanopore sequencing, sequencing by hybridization (SBH), CE and microchip sequencing, and the sequencing of repeated elements, and they also provide very detailed information on the different DNA termination strategies and on chemistries for signal detection. This introductory chapter will draw attention to recent advances in these new sequencing technologies, and will also indicate some of the surprising and exciting applications in which they provide real advantage.
Fig. 1. A microfabricated nanoliter-scale "Bioprocessor Sequencing Factory". The microfabricated device integrates all three Sanger sequencing steps: thermal cycling, sample purification and CE. Importantly, a combination of glass and polydimethylsiloxane (PDMS) wafers was used to construct the different functional elements, including 250-nl reactors, affinity-capture purification chambers, high-performance CE channels, and pneumatic valves and pumps, on a single microfabricated device. This "lab-on-a-chip" level of integration requires only 1 fmol of DNA template for complete Sanger sequencing of up to 550 continuous bases with 99% accuracy. The performance of this miniaturized DNA sequencer provides a new benchmark for reducing the cost, and determining the efficiency limits, of Sanger sequencing at the read lengths required for de novo sequencing of human and other complex genomes. Reprinted from Blazej et al. (2006). Copyright (2006), reprinted with permission from National Academy of Sciences (USA).
1.1. Biotechnological implications of ultra-high-throughput sequencing capability
Several different technological approaches are being developed to increase sequence-reading throughput while simultaneously reducing the cost of sequencing by several orders of magnitude. These developing (and prospective) technologies are being pursued by both companies and academic researchers (see Table 1). They include new single-molecule sequencing (SMS) technologies and instruments employing the "clonal single-molecule array" developed by Solexa Inc. and "single-molecule sequencing by synthesis" developed by Helicos BioSciences Corporation (see Table 1 for web site links); improvement of "sequencing by synthesis" (SBS) using pyrosequencing by 454 Life Sciences Corporation (Margulies et al., 2005a, 2007); and both sequencing-by-ligation (Shendure et al., 2005) and fluorescent in situ sequencing (FISSEQ) (Mitra et al., 2003). Advances have also been made to greatly reduce the costs of microchip sequencing, which, although it uses Sanger chemistries, provides substantially longer read lengths than the above solid-phase sequencers and has a sequence turnaround of minutes rather than hours while using minute volumes of reagents.
Table 1. Web sites providing information on different aspects of new DNA sequencing technology and sequencing output
Topics covered:
Array capillary electrophoresis instrumentation and cyclic dye terminator chemistry
MALDI-TOF mass spectrographic sequencing. SNuPE and small oligomer fragmentation sequencing
"Sequencing by synthesis" and sequential polymerization enzymology
Massively parallel "sequencing by synthesis". Array chemistry and advanced signal detection technologies. Advanced base-calling software
Nanofluidic barrier technology. Nanopore technologies. Sequencing of single DNA molecules
Alternative DNA sequencing tools. DNA-barrier breaking enzymes, sequencing enhancers, enzyme re-engineering
DNA and oligonucleotide arraying technologies. Optical masking, nanobead arrays
Laboratory-on-a-chip technologies, nanoengineering, nanofluidic and nanoanalytical technologies
Fundamental advances in dye chemistries, CE equipment and micro/nanofabricated device engineering
Microarrays for specific "sequencing by hybridization" of known SNPs and tiling of known genome regions. Non-coding RNA gene detection, promoter detection
Physical and electrical analysis of DNA sequence
Genomic analysis technologies
Genomic sequence databases
Advanced access tools for genomic analysis and sequence analysis
MicroRNA analysis and microRNA genomics tools
European Bioinformatics Institute
DOE Office of Science
International HapMap Project
Human Genome Project
Web addresses (only fragments survive): http://www.cchem.berkeley.edu/ramgrp/alpha/; sk05barbas.html