Nucleic acids function and structure

Nucleic acids represent a prominent category of biomolecule present in living cells. The term incorporates both DNA and RNA. DNA represents the repository of genetic information (the genome) of most life forms. RNA replaces DNA as the repository of genetic information in some viruses. In most life forms, however, RNA plays a role in mediating the conversion of genetic information stored in specific DNA sequences (genes) into polypeptides. There are three subcategories of RNA, each playing a different role in the conversion of gene sequences into the amino acid sequence of polypeptides. Messenger RNA (mRNA) carries the genetic coding information from the gene to the ribosome, where the polypeptide is actually synthesized. Ribosomal RNA (rRNA), along with a number of proteins, forms the ribosome itself, and transfer RNA (tRNA) functions as an adaptor molecule, transferring a specific amino acid to a growing polypeptide chain on the ribosomal site of polypeptide synthesis. Therefore, nucleic acids, between them all, mediate the flow of genetic information via the processes of replication, transcription and translation as outlined in what has become known as the central dogma of molecular biology (Figure 3.1).
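The flow of information from gene to mRNA to polypeptide can be illustrated with a minimal Python sketch. The example gene sequence, the function names and the deliberately partial codon table are illustrative assumptions; a real translation would need all 64 codons of the genetic code.

```python
# Partial codon table (standard genetic code). Only the codons used in the
# example are listed; this is an illustrative subset, not the full table.
CODON_TABLE = {
    "AUG": "M",   # methionine, also the usual start codon
    "GCU": "A",   # alanine
    "AAA": "K",   # lysine
    "GGC": "G",   # glycine
    "UAA": None,  # stop codon
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA (5'->3') for a DNA coding strand: same sequence, U for T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA codon by codon and return one-letter amino acid codes."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"codon {codon} is not in the partial table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:  # stop codon reached
            break
        residues.append(amino_acid)
    return "".join(residues)


gene = "ATGGCTAAAGGCTAA"      # hypothetical coding-strand sequence
mrna = transcribe(gene)       # transcription
peptide = translate(mrna)     # translation
print(mrna, peptide)
```

The sketch treats transcription as a simple T-to-U substitution on the coding strand, which gives the same result as copying the complementary template strand.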

Structurally, nucleic acids are polymers in which the basic recurring monomer is a nucleotide (i.e. nucleic acids are polynucleotides). Nucleotides themselves consist of three components: a phosphate group, a pentose (five-carbon sugar) and a nitrogen-containing cyclic structure known as a base (Figure 3.2). The nucleotide sugar associated with RNA is ribose, whereas that found in DNA is deoxyribose (Figure 3.3). In total, five different bases are found in nucleic acids. They are categorized as either purines (adenine and guanine, or A and G, found in both RNA and DNA) or pyrimidines (cytosine, thymine and uracil, or C, T and U). Cytosine is found in both RNA and DNA, whereas thymine is unique to DNA and uracil is unique to RNA (Figure 3.4).

Figure 3.2 (a) The basic structure of a nucleotide. (b) The actual chemical structure of one representative nucleotide (adenylate, i.e. adenosine 5'-monophosphate)

Figure 3.3 Chemical structure of (a) ribose and (b) 2'-deoxyribose, the nucleotide pentoses found in RNA and DNA respectively. The differences in chemical structure are highlighted by the dotted circles

Figure 3.4 The five bases found in nucleic acids may be categorized as either pyrimidines or purines. Refer to text for details

The DNA or RNA polymer consists of a chain of nucleotides of specific base sequence, linked via phosphodiester bonds (Figure 3.5). RNA is a single-stranded polynucleotide, although RNA molecules tend to adopt higher order three-dimensional shapes. DNA, on the other hand, is a double-stranded molecule (Figure 3.6) that assumes a double helical structure. The two polynucleotide strands face each other in an antiparallel manner (Figure 3.6), with the hydrophilic sugar and phosphate residues facing outwards, towards the surrounding aqueous-based environment, and the more hydrophobic bases point inwards. The base sequence of each chain displays complementarity. Wherever thymine is found in one chain, adenine is found positioned opposite it in the other. Wherever guanine is found in one chain, cytosine is found positioned opposite it in the second chain. Complementarity provides an obvious mechanism to ensure the fidelity of DNA replication and to underlie transcription. The double helical DNA structure is stabilized by (a) hydrogen bonding between complementary opposite bases (two hydrogen bonds between A and T, three hydrogen bonds between G and C; Figure 3.6) and (b) by hydrophobic stacking interactions between the planar, largely hydrophobic bases effectively stacked above each other along the length of each strand.

Figure 3.5 The basic polynucleotide structure as shown in (a) outline form and (b) in chemical detail. The 5' end of the chain is defined by lacking a nucleotide attached to the first sugar's carbon number 5; the 3' end lacks a nucleotide attached to the carbon number 3 of the last sugar in the backbone

Figure 3.6 (a) DNA structure. The two complementary polynucleotide strands in DNA are antiparallel to each other in orientation (one runs 5'→3', the other 3'→5'; see Figure 3.5). The two strands are held together by hydrogen bonds between opposite complementary bases, as well as hydrophobic interactions between stacked bases, as described in the text. (b) The double polynucleotide chain adopts a double helical structure
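The base-pairing rules described in the text (A opposite T, G opposite C, strands antiparallel) lend themselves to a short Python sketch. The function names and the example sequence are illustrative assumptions; the hydrogen bond counts are those given in the text.

```python
# Watson-Crick partners in DNA and the hydrogen bonds each base contributes
# to its pair (A-T: two, G-C: three).
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complementary_strand(strand: str) -> str:
    """Given one strand written 5'->3', return its partner, also written 5'->3'.

    Because the strands are antiparallel, the partner is read in the
    reverse direction, hence the reversal before pairing.
    """
    return "".join(PARTNER[base] for base in reversed(strand.upper()))


def hydrogen_bond_count(strand: str) -> int:
    """Total hydrogen bonds holding this strand to its complement."""
    return sum(H_BONDS[base] for base in strand.upper())


top = "ATGCGT"                      # hypothetical sequence
bottom = complementary_strand(top)
print(top, bottom, hydrogen_bond_count(top))
```

Taking the complement of the complement returns the original strand, which is a convenient sanity check on the pairing table.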

3.2.1 Genome and gene organization

The genome refers to the entire hereditary information present in an organism. As discussed earlier, this is usually encoded by double-stranded DNA, although the genomes of most plant viruses and some animal and bacterial viruses are RNA based. DNA-based genomes are largely or exclusively organized into chromosomes, each chromosome being a single DNA molecule housing multiple genes, as well as non-coding sequences.

Bacteria normally harbour a single, circular chromosome that tends to be tethered to the bacterial plasma membrane and tends to have few if any closely associated proteins. Many bacteria also contain extra-chromosomal DNA in the form of plasmids, as will be discussed later. Eukaryotes (plants, animals and yeasts) possess multiple linear chromosomes contained within a cell nucleus, and these chromosomes are normally closely associated with proteins termed histones (the protein-DNA complex is termed chromatin). Eukaryotes also invariably possess DNA sequences within mitochondria and, in plants, within chloroplasts. These (usually circular) DNA molecules are much shorter than chromosomal DNA, are often present in multiple copy number and tend to house genes coding for proteins required within these organelles. Human mitochondrial DNA, for example, is approximately 16 600 base pairs (16.6 kbp) in length and houses 37 genes. Such DNA molecules are believed to be vestiges of chromosomes from ancient bacteria that gained entry into early eukaryotic cells.

Table 3.1 The number of chromosomes found in selected species/cells, along with their predicted/estimated (approximate) number of genes

Cell/species

No. chromosomes

No. genes

E. coli

S. cerevisiae

Fern

Fruit fly

Mouse

Human

1 16

1 200

4 400 6 200 13 600 13 000

30000-35000 23 000 19 300

30000-35000

The genomes of different species are organized into different numbers of chromosomes, as is evident from Table 3.1. Chromosomes present in all cells contain both coding regions (i.e. genes, which are stretches of DNA that encode the specific amino acid sequence of a particular polypeptide or the exact nucleotide sequence of a tRNA or rRNA) and non-coding regions. Coding regions, as we will subsequently see, often represent only a small fraction of total genome sequences.

In close association with gene sequences are regulatory elements, i.e. stretches or regions of DNA that mark the beginning or end of a gene or a series of related genes or which regulate the level of gene expression (Figure 3.7). A characteristic regulatory sequence upstream (i.e. on the 5' side) of a gene is termed the promoter region (P), which RNA polymerases (the enzymes responsible for transcribing the gene into RNA) identify and bind. Immediately adjacent to this is a characteristic sequence that represents the starting point for transcription (TC). Immediately downstream of the gene is a transcriptional termination site (tC). The intervening sequence, of course, represents the precise stretch of DNA that is copied into RNA and is often called the transcriptional unit. The gene sequence will often contain start and stop signals or sequences (TL and tL) that ultimately dictate the precise stretch of transcriptional unit actually translated into polypeptide (Figure 3.7). Other regulatory regions controlling gene expression can also be present, upstream and/or downstream of the gene itself. In addition to genes and their associated regulatory sequences, DNA molecules also invariably house additional non-coding sequences in the form of various kinds of repeat sequences. For example, genes account for only some 30 per cent of the total human genome sequence.

Figure 3.7 Generalized gene organization within the genome. Refer to text for details

Figure 3.8 The lac operon houses three structural genes: lacZ, lacY and lacA. These code for three enzymes required for lactose metabolism (β-galactosidase, galactoside permease and a transacetylase). Immediately upstream of these structural genes is a control region that houses a promoter (P) and operator (O) sequence. The operator represents a binding site for a 'repressor' protein that is in turn coded for by a repressor gene (lacI) found nearby but upstream of the lac operon. The repressor gene is in turn controlled by its own promoter (Pi). In the absence of the sugar lactose (or, more accurately, an isomer of lactose called 1,6-allolactose, which acts as an inducer) the repressor gene product is bound to the lac operator site, preventing transcription of the lac operon. In the presence of lactose (and hence the inducer), the inducer binds the repressor and the inducer-repressor complex dissociates from the operator, allowing transcription to go ahead. A polycistronic mRNA is produced, but the operon also houses translational start and stop sites that allow for independent ribosomal production of the three gene products

The detail and arrangement of gene structure is also normally different in prokaryotes and eukaryotes. In prokaryotes, genes of related function are often clustered together in operons, which are usually under the control of a single promoter/regulatory region. An example is the well-known 'lac operon' described in Figure 3.8. Transcribed operon mRNA thus usually contains coding sequence information for several polypeptides, and such mRNA is termed polycistronic. Although common in prokaryotes, the presence of polycistronic operons is infrequent in lower eukaryotes and essentially absent from higher eukaryotes, where virtually all protein-encoding genes are transcribed separately. Eukaryotic genes, however, usually contain coding sequences (exons) that are interrupted by non-coding intervening sequences (introns), and in many cases exons represent a minor proportion of the entire gene length (Figure 3.9, and also see Figure 4.3). For example, of the 30 per cent of the human genome believed to be taken up by genes, an estimated 28.5 per cent is accounted for by introns with only some 1.5 per cent being accounted for by exons.

mRNA transcripts in eukaryotes undergo substantial editing. The introns are enzymatically removed from (spliced out of) the primary transcript, and further characteristic modifications include the addition of a cap at the mRNA's 5' end and the addition of a polyadenine nucleotide tail (poly A tail) at the molecule's 3' terminus.
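The splicing and polyadenylation steps can be pictured as simple string operations, as in the following minimal Python sketch. The transcript sequence, exon coordinates, function name and default tail length are hypothetical values chosen for illustration. The 5' cap is a modified nucleotide rather than part of the base sequence, so it is not modelled here.

```python
def process_transcript(primary: str, exons: list[tuple[int, int]],
                       poly_a_length: int = 100) -> str:
    """Return a mature mRNA sequence from a primary transcript.

    exons are (start, end) positions, 0-based and end-exclusive, listed
    in 5'->3' order; everything between them is treated as intron and
    dropped. A poly A tail of poly_a_length adenylates is then appended.
    """
    spliced = "".join(primary[start:end] for start, end in exons)
    return spliced + "A" * poly_a_length


# Hypothetical primary transcript: exon 1, one intron, exon 2.
exon_1 = "AUGGCU"
intron = "GUAAGUUUUCAG"
exon_2 = "AAAGGCUAA"
primary_transcript = exon_1 + intron + exon_2

exon_positions = [(0, 6), (18, 27)]
mature = process_transcript(primary_transcript, exon_positions, poly_a_length=80)

# Fraction of the primary transcript that is exon sequence.
exon_fraction = (len(exon_1) + len(exon_2)) / len(primary_transcript)
print(mature[:15], len(mature), round(exon_fraction, 2))
```

Passing exon coordinates explicitly keeps the sketch independent of how splice sites are recognized, which the text does not cover.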

3.2.2 Nucleic acid purification

A prerequisite step to any rDNA work is the initial isolation of DNA or RNA from the source material (which can be microbial, plant, animal or viral). Numerous methodologies have been developed to achieve nucleic acid purification, and some of these methodologies have been adapted for use in a variety of commercially available purification kits. Although details vary, the general approach adopted entails initial liberation of the nucleic acid by disruption of any cell wall present (or viral capsid) and of the cellular plasma membrane, followed by selective precipitation and often chromatography. In the context of plants and some microorganisms, initial disruption of the cell wall may require application of physical or other vigorous disruptive influences (see Chapter 6). This can potentially complicate DNA purification, particularly as it can cause physical shearing (fragmentation) of the extremely long DNA chromosome. The gentlest method of cell lysis usually involves incubation with cell-wall-degrading enzymes, and the addition of detergent will solubilize the plasma membrane.

Figure 3.9 Overview of the transcription of eukaryote genes and subsequent mRNA editing. Prior to the completion of the synthesis of the primary transcript, a 5' cap (7-methylguanosine) is enzymatically added at the 5' end of the growing RNA chain. This helps to prevent mRNA degradation by nuclease enzymes. The non-coding introns (see main text) are enzymatically removed from the primary transcript (a process called splicing), yielding a mature mRNA sequence coding for the intended polypeptide. Finally, the mRNA 3' end is also modified by the addition of a poly A tail, comprising 80-250 adenylate residues, which again likely helps protect the mRNA from degradation

Following cellular disruption, initial purification steps normally entail solvent-based extraction/precipitation. For example, shaking in the presence of phenol (or a mixture of phenol and chloroform), followed by standing or centrifugation (to achieve phase separation) results in extraction of the (now denatured) proteins into the phenol phase and/or accumulation at the interphase, with nucleic acids remaining in the upper, aqueous phase. Further purification may be achieved by selective precipitation of the nucleic acids using ethanol or isopropanol as precipitant. If DNA is required, then the RNA present may now be removed by the addition of the enzyme ribonuclease, which selectively degrades RNA. On the other hand, if (eukaryotic) mRNA is required, then affinity-based purification may be undertaken using an oligo (dT) column (Figure 3.10).

Figure 3.10 Affinity-based purification of mRNA. The unpurified mRNA-containing solution is percolated through a column packed with cellulose beads (a), to which a short chain of deoxythymidylate (an oligo dT chain) has been attached. Any mRNA present is retained in the column due to complementary base pairing between its 3' poly A tail and the immobilized oligo dT (b). Non-bound material can then be washed out of the column, with subsequent desorption of the mRNA by passing a low-salt buffer through the column. The mRNA collected may then be precipitated out of solution using ethanol, followed by its collection via centrifugation. An alternative, and now more commonly used, variation entails the direct addition of oligo (dT)-bound magnetic beads into the cell lysate and 'pulling out' the mRNA using a magnet. The method is rapid, thus minimizing contact time of the mRNA with degradative ribonucleases present naturally in the cytoplasm

Nucleic acids absorb UV light maximally at 260 nm (compared with 280 nm in the case of proteins); thus, absorbance at 260 nm can be used to quantify the amount of nucleic acid present and to follow the purification protocol. The ratio of absorbance at 260 nm versus 280 nm can also be used to determine how contaminated the nucleic acid preparation is with protein. The A260/A280 ratio is approximately 1.8 for pure DNA and 2.0 for pure RNA preparations; lower ratios usually indicate the presence of contaminant protein. DNA can also be detected and quantified by the addition of the chemical ethidium bromide. Ethidium bromide molecules intercalate (bind) in between DNA bases and fluoresce when illuminated with UV light.
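The A260/A280 purity check described above can be sketched in a few lines of Python. The expected ratios (1.8 for DNA, 2.0 for RNA) come from the text; the function names, the tolerance of 0.1 and the absorbance readings are hypothetical values chosen for illustration, not recommended laboratory thresholds.

```python
# Expected A260/A280 ratios for pure preparations, as quoted in the text.
EXPECTED_RATIO = {"DNA": 1.8, "RNA": 2.0}


def purity_ratio(a260: float, a280: float) -> float:
    """Return the A260/A280 ratio for a pair of absorbance readings."""
    if a280 <= 0:
        raise ValueError("A280 must be a positive absorbance reading")
    return a260 / a280


def suggests_protein_contamination(a260: float, a280: float,
                                   kind: str = "DNA",
                                   tolerance: float = 0.1) -> bool:
    """True if the ratio falls clearly below the expected value.

    tolerance is an assumed margin for measurement noise, not a figure
    taken from the text.
    """
    return purity_ratio(a260, a280) < EXPECTED_RATIO[kind] - tolerance


clean_ratio = purity_ratio(0.90, 0.50)                     # hypothetical readings
clean_flag = suggests_protein_contamination(0.90, 0.50)
dirty_flag = suggests_protein_contamination(0.75, 0.50)
print(round(clean_ratio, 2), clean_flag, dirty_flag)
```

A ratio on its own cannot identify the contaminant; the text only states that lower values usually point to protein.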
