FIGURE 24.18 Mapping of Sequence Tagged Sites

[Figure labels: Fragment collection; 6 shared fragments; 2 shared fragments]

STS mapping is shown for four STSs on a single chromosome. A variety of restriction enzyme digests are performed to cut the chromosome into many fragments of different sizes. The number of times two STS sequences are found on the same fragment reveals how close the two markers are to each other. In this example, the two purple STSs are found on the same fragment six times and must be close to each other on the chromosome. The two green STSs are found on the same fragment only two times and are therefore farther apart. The purple STSs are never found on the same fragment as the green STSs, so they must be far apart on the chromosome. Adding more STSs refines the map further.

Sequence tags are mapped relative to each other by analyzing how frequently tags are found together on the same chromosome fragments.

fragment depends on how close they are on the original chromosome. Neighboring STSs will tend to be found together on many fragments, whereas those farther apart will only rarely be found on the same fragment. This type of data can be used to construct a linkage map for the STS sites examined.
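The counting idea behind this kind of map can be sketched in a few lines of Python. This is only an illustration: the marker names (`P1`, `P2`, `G1`, `G2`) and the fragment collection are hypothetical, chosen to mirror the purple and green STSs of Figure 24.18, and the function name `shared_fragment_counts` is an assumption rather than part of any standard tool.

```python
from collections import Counter
from itertools import combinations


def shared_fragment_counts(fragments):
    """Count, for every pair of STS markers, how many fragments carry both."""
    counts = Counter()
    for markers in fragments:
        # Sort so each pair is always stored in the same order, e.g. ("G1", "P1").
        for pair in combinations(sorted(set(markers)), 2):
            counts[pair] += 1
    return counts


# Hypothetical fragment collection: each set lists the STSs detected on one fragment.
fragments = (
    [{"P1", "P2"}] * 6      # the two "purple" STSs share six fragments
    + [{"G1", "G2"}] * 2    # the two "green" STSs share two fragments
    + [{"P1"}, {"P2"}, {"G1"}, {"G2"}]  # fragments carrying a single STS
)

counts = shared_fragment_counts(fragments)

# Pairs that share the most fragments are inferred to lie closest together.
for pair, n in counts.most_common():
    print(pair, n)
```

A pair that never shares a fragment, such as `("G1", "P1")` here, simply has a count of zero, which corresponds to markers that lie far apart.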

The chromosome fragments to be examined were originally derived by cloning large segments of DNA into high capacity vectors such as yeast artificial chromosomes

FIGURE 24.19 Radiation Hybrid Mapping

To determine how close STSs and ESTs are to each other, many large chromosome fragments must be analyzed. Radiation hybrid mapping allows large human chromosome fragments to be inserted into hamster cells. First, the human chromosomes, which have the gene for thymidine kinase (TK+), are fragmented by irradiation. The human cells are then fused with hamster cells, which are TK-. If a human cell and hamster cell fuse successfully, the hybrid should express thymidine kinase and can be selected by plating on selective medium. Random loss of human chromosome fragments occurs during this process. Consequently, each radiation hybrid cell line will contain a different set of human chromosome fragments, which can be screened for the presence of STSs and ESTs.

[Figure labels: TK positive donor human cells; Irradiate; Chromosomes fragmented; TK negative donor hamster cells; Cell fusion; Donor fragments taken up; Select cells that express tk; Radiation hybrid line (TK positive)]

Radiation hybrids are cell lines that contain fragments of chromosomes from other eukaryotic cells.

(YACs). Unfortunately, YAC clones often contain two or more segments of DNA from different original locations. In practice, locating STS and EST sites relative to each other has mostly been done by radiation hybrid mapping (Fig. 24.19). A radiation hybrid is a cell (usually from a rodent) that contains fragments of chromosomes from another species.

To make a radiation hybrid, cultured human cells are irradiated with a lethal dose of X-rays or γ-rays. This treatment breaks the chromosomes into fragments. The dying human cells are then fused with hamster cells. Cell fusion is promoted with polyethylene glycol or by using Sendai virus. The resultant hybrid cells contain random selections of the human chromosome fragments. Typical fragments are 5-10 Mbp in size and each hamster cell contains about 15-35% of the human genome. The hybrid cells are screened to see which STSs or ESTs are found together, that is, on the same human chromosome fragments. The more often two STSs are found in the same hybrid cell, the more closely they were linked on the original human chromosome before fragmentation. By the late 1990s, STS-based maps of over 30,000 sites had been constructed for the human genome. This gives a density of approximately one marker per 100 kbp of DNA.

radiation hybrid A cell (usually from a rodent) that contains fragments of chromosomes (generated by irradiation) from another species

Sequencing very large numbers of small fragments provides enough information to assemble a complete genome sequence—if your computer is powerful enough

The bacterium Haemophilus had the honor of being the first organism to be totally sequenced.

Assembling Small Genomes by Shotgun Sequencing

Individual sequencing reactions give lengths of sequence that are several hundred base pairs long. A whole genome must be assembled from vast numbers of such short sequences. There are three approaches to whole genome assembly: shotgun sequencing, cloned contig sequencing, and the directed shotgun approach, which is really a mixture of the first two.

In shotgun sequencing the genome is broken randomly into short fragments (1 to 2 kbp long) suitable for sequencing. The fragments are ligated into a suitable vector and then partially sequenced. Around 400-500 bp of sequence can be generated from each fragment in a single sequencing run. In some cases, both ends of a fragment are sequenced. Computerized searching for overlaps between individual sequences then assembles the complete sequence. Overlapping sequences are assembled to generate contigs (Fig. 24.20). The term contig refers to a known DNA sequence that is contiguous and lacks gaps.
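The overlap search described above can be illustrated with a toy greedy assembler. This is a minimal sketch, not the method used by any real assembly program: it assumes error-free reads that all come from the same strand, uses a naive all-against-all comparison, and the names `overlap` and `greedy_assemble`, the `min_len` threshold, and the example reads are all hypothetical.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the longest such overlap is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=4):
    """Repeatedly merge the two sequences with the longest overlap.

    Stops when no overlap of at least min_len remains; whatever is left
    is the list of contigs.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads
    while True:
        # Discard sequences wholly contained in another sequence.
        contigs = [c for c in contigs
                   if not any(c != other and c in other for other in contigs)]
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


# Hypothetical reads: three overlap, one is a duplicate, one comes from a
# region with no overlapping neighbor and so ends up as its own contig.
reads = ["ATGGCGTACG", "GTACGTTAGC", "TTAGCCTAGGA", "GTACGTTAGC", "CCCAAATTTGGG"]
contigs = greedy_assemble(reads)
print(contigs)
```

With these reads the three overlapping sequences merge into one contig and the unlinked read remains separate, leaving a gap between two contigs. Real data also contain sequencing errors, reads from both strands, and repeats, none of which this sketch handles.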

Since fragments are cloned at random, duplicates will quite often be sequenced. To get full coverage, the total amount of sequence obtained must therefore be several times that of the genome to allow for duplications. For example, 99.8% coverage requires a total amount of sequence that is 6- to 8-fold the genome size. In principle, all that is required to assemble a genome, however large, from small sequences is a sufficiently powerful computer. No genetic map or prior information is needed about the organism whose genome is to be sequenced. The original limitation to shotgun sequencing was the massive data handling that is required. The development of faster computers has overcome this problem. Nowadays, a more important issue is that repetitive sequences create ambiguities.
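The relation between the amount of sequence and the fraction of the genome covered can be estimated with a simple model. The sketch below assumes that reads land uniformly at random along the genome, so that the chance a given base is missed at c-fold coverage is about e^(-c). This Poisson approximation is an assumption introduced here for illustration and is not stated in the text; the function names are hypothetical.

```python
import math


def expected_fraction_covered(fold_coverage):
    """Expected fraction of the genome hit by at least one read.

    Assumes reads fall uniformly at random (Poisson approximation).
    """
    return 1.0 - math.exp(-fold_coverage)


def fold_coverage_needed(target_fraction):
    """Fold coverage at which the expected covered fraction reaches the target."""
    return -math.log(1.0 - target_fraction)


for c in (1, 3, 6, 8):
    print(f"{c}-fold: {expected_fraction_covered(c):.4%} of bases covered")

print(f"99.8% coverage needs about {fold_coverage_needed(0.998):.1f}-fold sequence")
```

Under this model, roughly 6-fold sequence is needed for 99.8% coverage, which is consistent with the lower end of the 6- to 8-fold range. Real libraries are not perfectly random, so more sequence is often needed in practice.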

The first bacterial genome to be sequenced, that of Haemophilus influenzae, was deduced from just under 25,000 sequences averaging 480 bp each. This gave a total of almost 12 million bp of sequence—six times the genome size. Computerized assembly using overlaps resulted in 140 regions of contiguous sequence—i.e. 140 contigs.

The gaps between the contigs may be closed by more targeted procedures. The easiest method is to re-screen the original set of clones with pairs of probes corresponding to sequences on the two sides of each gap. Clones that hybridize to both members of such a pair of probes presumably carry DNA that bridges the gap between two contigs. Such clones are then sequenced in full to close the gaps between contigs. However, many of the gaps between contigs are due to regions of DNA that are unstable when cloned, especially in a multicopy vector. Therefore a second library in a different vector, often a single-copy vector such as a lambda phage, is frequently used during the later stages of shotgun cloning. Pairs of end-of-contig probes are used to screen the new library for clones that hybridize to both probes and carry DNA that bridges the gap between the two contigs (Fig. 24.21A). A third approach, which avoids cloning altogether, is to run PCR reactions on whole genomic DNA, using random pairs of PCR primers corresponding to contig ends. A PCR product will result only if the two contig ends are within a few kb of each other (Fig. 24.21B).

Race for the Human Genome

As described above, the first completely sequenced genomes of living cells were from bacteria with genomes consisting of a few million base pairs. The genomes of most higher animals and plants contain a thousandfold more DNA, and their sequencing therefore presents greater problems. The original aim of the Human Genome Project

contig A stretch of known DNA sequence that is contiguous and lacks gaps

Human Genome Project Program to sequence all the DNA making up the human genome

shotgun sequencing Approach in which the genome is broken into many random short fragments for sequencing. The complete genome sequence is then assembled by computerized searching for overlaps between individual sequences

FIGURE 24.20 Shotgun Sequencing

The first step in shotgun sequencing an entire genome is to digest the genome into a large number of small fragments suitable for sequencing. All the small fragments are then cloned and sequenced. Computers analyze the sequence data for overlapping regions and assemble the sequences into several large contigs. Since some regions of the genome are unstable when cloned, some gaps may remain even after this procedure is repeated several times.

Many small fragments

A) Clone and sequence many small fragments

B) Line up sequences and
