The coding sequence of a gene is a series of three-nucleotide codons that specify the linear sequence of amino acids in its polypeptide product. Thus far we have tacitly assumed that the coding sequence is contiguous: the codon for one amino acid is immediately adjacent to the codon for the next amino acid in the polypeptide chain. This is true in the vast majority of cases in bacteria and their phage. But it is not always so for eukaryotic genes. In those cases, the coding sequence is periodically interrupted by stretches of noncoding sequence.
Thus many eukaryotic genes are mosaics, consisting of blocks of coding sequences separated from each other by blocks of noncoding sequences. The coding sequences are called exons and the intervening sequences are called introns. As a consequence of this alternating pattern of exons and introns, genes bearing noncoding interruptions are often said to be "in pieces" or "split."
Figure 13-1 shows a typical eukaryotic gene in which the coding region is interrupted by three introns, splitting it into four exons. The number of introns found within a gene varies enormously: from one in the case of most intron-containing yeast genes (and a few human genes), to 50 in the case of the chicken proα2 collagen gene, to as many as 303 in the case of the titin gene of humans. The sizes of exons and introns also vary. Indeed, introns are very often much longer than the exons they separate. For example, exons are typically on the order of 150 nucleotides, whereas introns, though they too can be short, can be as long as 800,000 nucleotides (800 kb). As another example, the mammalian gene for the enzyme dihydrofolate reductase is more than 31 kb long, and within it are dispersed six exons that correspond to 2 kb of mRNA. Thus, in this case, the coding portion of the gene is less than 10% of its total length.
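The arithmetic behind the dihydrofolate reductase example can be checked with a short calculation. The following Python sketch is illustrative only: the function name `coding_fraction` is hypothetical, and the gene length is taken as exactly 31 kb even though the text says "more than 31 kb," so the result is an upper bound on the true fraction.

```python
def coding_fraction(exonic_length_kb: float, gene_length_kb: float) -> float:
    """Return the fraction of a gene's length accounted for by exon sequence."""
    if gene_length_kb <= 0:
        raise ValueError("gene length must be positive")
    if not 0 <= exonic_length_kb <= gene_length_kb:
        raise ValueError("exonic length must lie between 0 and the gene length")
    return exonic_length_kb / gene_length_kb


# Figures from the text: six exons totalling 2 kb of mRNA in a gene of
# more than 31 kb. Using 31 kb (an assumption, since the exact length is
# not given) makes this an upper bound on the exonic fraction.
dhfr_fraction = coding_fraction(exonic_length_kb=2.0, gene_length_kb=31.0)
print(f"Exonic fraction of the gene: at most {dhfr_fraction:.1%}")
```

Because the real gene is somewhat longer than 31 kb, the actual fraction is slightly smaller than the printed value, which is consistent with the statement that the coding portion is less than 10% of the gene.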