Name | Sequence* | Function
ATP/GTP binding | [A,G]-X4-G-K-[S,T] | Residues within a nucleotide-binding domain that contact the nucleotide
Prenyl-group binding site | C-Ø-Ø-X (C-terminus) | C-terminal sequence covalently attached to isoprenoid lipids in some lipid-anchored proteins (e.g., Ras)
Zinc finger (C2H2 type) | C-X2–4-C-X3-Ø-X8-H-X3–5-H | Zn2+-binding sequence within DNA- or RNA-binding domain of some proteins
DEAD box | Ø2-D-E-A-D-[R,K,E,N]-Ø | Sequence present in many ATP-dependent RNA helicases
Heptad repeat | (Ø-X2-Ø-X3)n | Repeated sequence in proteins that form coiled-coil structures

*Single-letter amino acid abbreviations used for sequences (see Figure 2-13). X = any residue; Ø = hydrophobic residue. Brackets enclose alternative permissible residues.
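Motif notation of this kind translates fairly directly into regular expressions. The following is a minimal Python sketch of that translation, not a standard tool. The set of residues treated as hydrophobic is an assumption (the table does not list them), the test peptide is a hypothetical string chosen to contain two motifs, and the heptad repeat is left out because the table gives no repeat count.

```python
import re

# Assumption: the table does not say which residues count as hydrophobic;
# this set is one plausible choice, used only for illustration.
HYDROPHOBIC = "[AVLIMFWY]"

# Each motif from the table written as a regular expression.
# "." stands for X (any residue); {n} and {m,n} give repeat counts.
MOTIFS = {
    "ATP/GTP binding": r"[AG].{4}GK[ST]",
    # "$" anchors the match at the C-terminus of the sequence.
    "Prenyl-group binding site": rf"C{HYDROPHOBIC}{{2}}.$",
    "Zinc finger (C2H2 type)": rf"C.{{2,4}}C.{{3}}{HYDROPHOBIC}.{{8}}H.{{3,5}}H",
    "DEAD box": rf"{HYDROPHOBIC}{{2}}DEAD[RKEN]{HYDROPHOBIC}",
}


def find_motifs(protein):
    """Return {motif name: [(start index, matched text), ...]} for motifs found."""
    hits = {}
    for name, pattern in MOTIFS.items():
        found = [(m.start(), m.group()) for m in re.finditer(pattern, protein)]
        if found:
            hits[name] = found
    return hits


# Hypothetical test peptide (single-letter amino acid code).
example = "MTEYKLVVVGAGGVGKSALTIQCVLS"
print(find_motifs(example))
```

Start indices are zero-based. A match only shows that the sequence fits the pattern; whether the region actually has the listed function has to be confirmed by other evidence.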

Comparison of Related Sequences from Different Species Can Give Clues to Evolutionary Relationships Among Proteins

BLAST searches for related protein sequences may reveal that proteins belong to a protein family. (The corresponding genes constitute a gene family.) Protein families are thought to arise by two different evolutionary processes, gene duplication and speciation, discussed in Chapter 10. Consider, for example, the tubulin family of proteins, which constitute the basic subunits of microtubules. According to the simplified scheme in Figure 9-32a, the earliest eukaryotic cells are thought to have contained a single tubulin gene that was duplicated early in evolution; subsequent divergence of the different copies of the original tubulin gene formed the ancestral versions of the α- and β-tubulin genes. As different species diverged from these early eukaryotic cells, each of these gene sequences further diverged, giving rise to the slightly different forms of α-tubulin and β-tubulin now found in each species.

▲ FIGURE 9-32 The generation of diverse tubulin sequences during the evolution of eukaryotes. (a) Probable mechanism giving rise to the tubulin genes found in existing species. It is possible to deduce that a gene duplication event occurred before speciation because the α-tubulin sequences from different species (e.g., humans and yeast) are more alike than are the α-tubulin and β-tubulin sequences within a species. (b) A phylogenetic tree representing the relationship between the tubulin sequences. The branch points (nodes), indicated by small numbers, represent common ancestral genes at the time that two sequences diverged. For example, node 1 represents the duplication event that gave rise to the α-tubulin and β-tubulin families, and node 2 represents the divergence of yeast from multicellular species. Braces and arrows indicate, respectively, the orthologous tubulin genes, which differ as a result of speciation, and the paralogous genes, which differ as a result of gene duplication. This diagram is simplified somewhat because each of the species represented actually contains multiple α-tubulin and β-tubulin genes that arose from later gene duplication events.

All the different members of the tubulin family are sufficiently similar in sequence to suggest a common ancestral sequence. Thus all these sequences are considered to be homologous. More specifically, sequences that presumably diverged as a result of gene duplication (e.g., the α- and β-tubulin sequences) are described as paralogous. Sequences that arose because of speciation (e.g., the α-tubulin genes in different species) are described as orthologous. From the degree of sequence relatedness of the tubulins present in different organisms today, evolutionary relationships can be deduced, as illustrated in Figure 9-32b. Of the three types of sequence relationships, orthologous sequences are the most likely to share the same function.
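The "degree of sequence relatedness" is often summarized as percent identity between aligned sequences. The Python sketch below shows that calculation under simplifying assumptions: the sequences are already aligned to equal length, and the four strings are hypothetical toy sequences, not real tubulin data. They are constructed so that the orthologous pair is more alike than the paralogous pair, mirroring the reasoning in the text.

```python
def percent_identity(a, b):
    """Percent of aligned positions with identical residues (a gap never matches)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    same = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * same / len(a)


# Hypothetical, pre-aligned toy sequences keyed by (family, species).
toy = {
    ("alpha", "species1"): "MKRECISIHV",
    ("alpha", "species2"): "MKRECISVHV",
    ("beta", "species1"): "MKREIVHLQA",
    ("beta", "species2"): "MKREIVHIQA",
}

# Orthologous pair: same family, different species.
ortholog_id = percent_identity(toy[("alpha", "species1")], toy[("alpha", "species2")])
# Paralogous pair: different families, same species.
paralog_id = percent_identity(toy[("alpha", "species1")], toy[("beta", "species1")])

print(f"alpha vs alpha across species: {ortholog_id:.1f}%")
print(f"alpha vs beta within species:  {paralog_id:.1f}%")
```

When identity is higher across species within a family than between families within a species, the pattern is consistent with a duplication that preceded speciation. Real analyses align the sequences first and build trees from many pairwise comparisons.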

Genes Can Be Identified Within Genomic DNA Sequences

The complete genomic sequence of an organism contains within it the information needed to deduce the sequence of every protein made by the cells of that organism. For organisms such as bacteria and yeast, whose genomes have few introns and short intergenic regions, most protein-coding sequences can be found simply by scanning the genomic sequence for open reading frames (ORFs) of significant length. An ORF usually is defined as a stretch of DNA containing at least 100 codons that begins with a start codon and ends with a stop codon. Because the probability that a random DNA sequence will contain no stop codons for 100 codons in a row is very small, most ORFs encode a protein.
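The ORF scan described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a gene-finding tool: it treats ATG as the only start codon, scans only the three frames of the given strand (the reverse complement would be scanned the same way), and counts codons from the start codon up to, but not including, the stop codon. The helper `prob_no_stop` assumes all 64 codons are equally likely, which real genomes only approximate.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=100):
    """Return (start, end, frame) tuples for ORFs on the given strand.

    An ORF here runs from an ATG to the first in-frame stop codon and must
    contain at least `min_codons` codons, counting the start codon but not
    the stop codon. `end` is the index just past the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon since the last stop
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume looking for the next start codon
    return orfs


def prob_no_stop(n_codons):
    """Chance that n random codons contain no stop codon (61 of 64 are sense codons)."""
    return (61 / 64) ** n_codons


# Tiny hypothetical sequence; the threshold is lowered so the example is short.
dna = "CC" + "ATG" + "GCT" * 4 + "TAA" + "G"
print(find_orfs(dna, min_codons=5))
print(round(prob_no_stop(100), 4))
```

Under the equal-frequency assumption, the chance of 100 consecutive codons with no stop is a little under 1 percent, which is why long ORFs are unlikely to occur by chance.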

ORF analysis correctly identifies more than 90 percent of the genes in yeast and bacteria. Some of the very shortest genes are missed by this method, and occasionally long open reading frames that are not actually genes arise by chance. Both types of misassignment can be corrected by more sophisticated analysis of the sequence and by genetic tests for gene function. Of the Saccharomyces genes identified in this manner, about half were already known by some functional criterion such as mutant phenotype. The functions of some of the proteins encoded by the remaining putative genes identified by ORF analysis have been assigned based on their sequence similarity to known proteins in other organisms.

Identification of genes in organisms with a more complex genome structure requires more sophisticated algorithms than searching for open reading frames. Figure 9-33 compares the genes identified in a representative 50-kb segment from the genomes of yeast, Drosophila, and humans. Because most genes in higher eukaryotes, including humans and Drosophila, are composed of multiple, relatively short coding regions (exons) separated by noncoding

