Sequence Of The Human Genome

The haploid human genome contains approximately 3 × 10⁹ base pairs of DNA, which in the typical diploid cell is organized into 23 pairs of chromosomes (22 pairs of autosomes and one pair of sex chromosomes, XX or XY). It has long been suggested that determining the complete sequence of the human genome would enable the genetic causes of human disease to be investigated (28-30). Practical methods for DNA sequencing appeared in the mid to late 1970s (31-33), and numerous reports of DNA sequences corresponding to segments of the human genome soon followed. In the mid-1980s, a project to sequence the complete human genome was proposed, and work began in the later years of that decade. The development of automated methods for DNA sequencing (34,35) made the ambitious goals of the Human Genome Project attainable (36,37). Subsequently, detailed genetic and physical maps of the human genome appeared (38-43), expressed sequences were identified and characterized (44-46), and gene maps of the human genome were constructed (47,48). Efforts by several consortiums using differing approaches (49-51) to large-scale sequencing of human DNA and sequence contig assembly culminated in 2001 with the publication of a draft sequence of the human genome (52,53).

The actual number of genes contained in the human genome is not yet known. Early estimates suggested that the human genome might contain 70,000 to 100,000 genes (54-56). More recently, however, the number has been estimated at approximately 30,000-40,000 (52,53,57,58). Early analysis of the draft sequences of the human genome revealed considerable variability between individuals, including at least 1.1-1.4 million single-nucleotide polymorphisms (SNPs) distributed throughout

[Figure: structures of the purine deoxyribonucleosides (2'-deoxyadenosine and 2'-deoxyguanosine) and the pyrimidine deoxyribonucleosides (2'-deoxycytidine and 2'-deoxythymidine).]

Fig. 2. The chemical structure of purine and pyrimidine deoxyribonucleosides.

the genome (53,59). Refinement of the human genome sequence, identification and characterization of the genes it contains, and description of the features of the genome (SNPs and other variations) continue at a rapid pace. In early 2003, it was announced that the Human Genome Project had completely sequenced 99% of the gene-containing portion of the human genome, and new plans and goals for genomics research were described (60,61).

