ppm are between 6.5 and 7.6 ppm. Note that the CONH2 groups at the end of the Asn (N) and Gln (Q) side chains have distinct chemical shifts for the two HN protons due to hindered rotation of the amide linkage (circles with a cross). As we move outward in the "hydrocarbon" side chains, the chemical shifts move upfield because the distance to the nearest functional group (α-amino and carbonyl of the backbone) increases. The methyl groups appear at the "classic" hydrocarbon positions of about 0.8 ppm. Long side chains of repeated methylene units show the same behavior, but the trend then reverses when there is a nitrogen at the end: CH-(CH2)n-N. We see this for Lys (K), Arg (R), and Pro (P), which have shifts of 3-3.6 ppm for the last CH2 before the nitrogen. The most upfield methylene group is the γ-CH2 because it is far from both the backbone functional groups and the nitrogen at the other end of the chain (backbone N in the case of Pro). Proline lacks an HN but the position of the δ-CH2 is similar to where the HN would be located, so the same kinds of NOE interactions can be observed.

We see this effect of the last CH2 group shifting downfield to a lesser extent in the five-spin systems Glu, Gln, and Met. The carbonyl group (Glu, Gln) or sulfur (Met) "pulls" the γ-CH2 group downfield to the 2-3 ppm range. Even farther downfield than the final CH2 of the chains ending in nitrogen are the β-protons of Ser and Thr, which are on an oxygenated carbon. These Hβ resonances are near the Hα at around 4 ppm, standing out clearly from the other amino acids.

The first goal of an NMR study is to identify the spin systems present and assign each one to one of the 20 amino acids or at least to a group of amino acids. In a 2D TOCSY spectrum in 90% H2O we can observe the entire spin system at the F2 position of the HN proton (see Chapter 9, Fig. 9.45). Many of these patterns of chemical shifts are unique to one of the 20 amino acids. For example, valine has a β-proton at 2.1 ppm and two diastereotopic methyl resonances at around 0.8 ppm. This pattern is clearly recognizable in a TOCSY spectrum, and so we know it is a valine residue. A protein will typically have a number of valines in the sequence, so at this point we only know that it represents one of these residues. Other spin systems fall into a group of amino acids. For example, a number of amino acids have the spin system CHα-CH2-R where R is a "dead end" for J coupling: a quaternary carbon or a heteroatom (such as oxygen or sulfur). These are called AMX or three-spin systems, and they include Asp, Asn, Cys, Phe, Tyr, Trp, and His. All have, in addition to the backbone HN and Hα, two Hβ resonances in the vicinity of 3 ppm (2.6-3.4 ppm). If we observe this pattern in a TOCSY spectrum along an HN line, we can conclude that it belongs to this "AMX" group of amino acids. Serine is technically an AMX spin system (CHα-CH2OH), but the β-protons are shifted farther downfield by the oxygen, closer to 4 ppm and just upfield of the Hα resonance. This makes Ser a recognizable "unique" spin system rather than part of the AMX group. Another group can be recognized from chemical shifts as "five-spin" systems (Fig. 12.9): CHα-CH2-CH2-R, where again R is a "dead end": Glu, Gln, and Met. The two β-proton resonances appear around 2 ppm and the γ-protons (which may or may not be degenerate) are farther downfield (2.3-2.6 ppm). This pattern can usually be distinguished from the AMX pattern.
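The pattern-matching rules in this paragraph can be written out as a small lookup, shown here as a rough Python sketch rather than a real assignment tool. The function name `classify_spin_system` and the widths of the valine, serine, and β-proton windows are assumptions chosen for illustration; only the 2.6-3.4 ppm (AMX Hβ) and 2.3-2.6 ppm (five-spin Hγ) ranges come directly from the text. The input is the list of side-chain resonances seen along one HN line of the TOCSY, with the HN and Hα peaks left out.

```python
def classify_spin_system(side_chain_ppm):
    """Guess a residue type or group from side-chain 1H shifts (ppm).

    The list should hold the resolved side-chain peaks on one HN line,
    excluding the HN and H-alpha peaks themselves.
    """
    peaks = list(side_chain_ppm)

    def in_window(lo, hi):
        # Peaks falling inside one chemical-shift window.
        return [p for p in peaks if lo <= p <= hi]

    def all_covered(*groups):
        # True if every observed peak was placed in one of the groups.
        return sum(len(g) for g in groups) == len(peaks)

    # Valine: one H-beta near 2.1 ppm plus two methyls near 0.8 ppm.
    # (Window widths here are illustrative assumptions.)
    beta_val = in_window(1.9, 2.3)
    methyls = in_window(0.6, 1.0)
    if len(beta_val) == 1 and len(methyls) == 2 and all_covered(beta_val, methyls):
        return "Val"

    # Serine: H-beta pair pulled close to 4 ppm by the oxygen
    # (assumed window; the two protons may be degenerate).
    beta_ser = in_window(3.7, 4.1)
    if len(beta_ser) in (1, 2) and all_covered(beta_ser):
        return "Ser"

    # AMX group: H-beta pair in the 2.6-3.4 ppm range given in the text.
    beta_amx = in_window(2.6, 3.4)
    if len(beta_amx) in (1, 2) and all_covered(beta_amx):
        return "AMX"  # Asp, Asn, Cys, Phe, Tyr, Trp, or His

    # Five-spin group: H-beta near 2 ppm, H-gamma at 2.3-2.6 ppm.
    beta_five = in_window(1.8, 2.2)
    gamma_five = in_window(2.3, 2.6)
    if (len(beta_five) in (1, 2) and len(gamma_five) in (1, 2)
            and all_covered(beta_five, gamma_five)):
        return "five-spin"  # Glu, Gln, or Met

    return "unassigned"


if __name__ == "__main__":
    print(classify_spin_system([2.1, 0.85, 0.80]))  # valine-like pattern
    print(classify_spin_system([3.1, 2.9]))         # AMX-like pattern
```

In a folded protein the shifts move away from these random-coil windows, so a rule set like this gives only a first guess that must be checked against the rest of the data.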

12.5.1 Sequence-Specific Chemical Shifts in Structured Proteins

The random-coil chemical shifts shown in Figure 12.9 show very little variation in the HN or Hα positions among the 20 amino acids. Furthermore, if we have more than one of a particular amino acid in an unstructured protein, they will be indistinguishable by chemical shift. Sometimes proteins are observed in an "unfolded" state, and this can be clearly seen from the COSY or TOCSY spectrum, looking at the "fingerprint region" of F2 = HN and F1 = Hα. If all of the amino acid residues are showing the random-coil shift values, we would see all of the HN-Hα crosspeaks in a small region less than one ppm wide in each dimension (HN 8-9 ppm, Hα 4-4.8 ppm) and there could be no more than 20 crosspeaks regardless of the size of the protein (Fig. 12.10). Even if a protein has folded only to the extent of "hydrophobic collapse" (aggregation of the hydrophobic side chains to protect them from water) we will see this very poor dispersion of chemical shifts. This state is called the "molten globule" state because these hydrophobic clusters are like drops of oil in a liquid state: they do not have the very specific packing arrangements of side chains characteristic of folded (native) proteins. An unfolded or molten globule form of a protein can easily be identified by NMR, even from a 1D proton spectrum, because of this very poor dispersion. If part of a protein is disordered, those residues will fall in the narrow "random coil" chemical-shift range and will also give much sharper and stronger crosspeaks because their greater flexibility gives them the long T2 values of a smaller molecule.

Figure 12.10 (axis label: F2 = HN, ppm)
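The dispersion test described above can be sketched as a simple count of how many HN-Hα crosspeaks fall inside the random-coil box (HN 8-9 ppm, Hα 4-4.8 ppm). This is a minimal Python illustration, not a validated criterion: the function names, the 0.9 cutoff, and the peak lists in the usage lines are all hypothetical, and the peak values are invented for illustration rather than measured.

```python
def fraction_in_random_coil_box(crosspeaks,
                                hn_range=(8.0, 9.0),
                                ha_range=(4.0, 4.8)):
    """Fraction of (HN, H-alpha) crosspeaks inside the random-coil box.

    crosspeaks is a list of (hn_ppm, ha_ppm) pairs from the
    fingerprint region of a COSY or TOCSY spectrum.
    """
    if not crosspeaks:
        raise ValueError("need at least one crosspeak")
    inside = sum(
        1 for hn, ha in crosspeaks
        if hn_range[0] <= hn <= hn_range[1]
        and ha_range[0] <= ha <= ha_range[1]
    )
    return inside / len(crosspeaks)


def looks_poorly_dispersed(crosspeaks, cutoff=0.9):
    """Flag a peak list as unfolded-like when nearly every crosspeak
    sits in the random-coil box. The cutoff is an arbitrary assumption."""
    return fraction_in_random_coil_box(crosspeaks) >= cutoff


if __name__ == "__main__":
    # Invented peak lists, for illustration only.
    folded_like = [(7.7, 4.3), (9.6, 5.1), (8.4, 4.3), (9.8, 4.9)]
    unfolded_like = [(8.2, 4.3), (8.4, 4.4), (8.6, 4.2), (8.3, 4.6)]
    print(looks_poorly_dispersed(folded_like))
    print(looks_poorly_dispersed(unfolded_like))
```

A count like this ignores linewidths and peak overlap, so it only mirrors the visual judgment one would make from the fingerprint region.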

In a folded protein, the random-coil chemical shifts are changed slightly by the immediate environment of the spin system: the precise orientations of nearby aromatic rings and peptide bonds lead to specific changes in these chemical shifts due to through-space effects of these unsaturated groups, such as "ring currents" (anisotropic effects). Thus in a protein there may be many serine residues but each one will have slightly different chemical shifts for the HN, Hα, and two Hβ protons. This is illustrated by some of the 1H chemical shifts for a small (63 residue) globular protein, the Heregulin-α EGF domain (Fig. 12.11). Heregulin-α is a protein ligand for a membrane receptor associated with breast cancer, and the EGF domain is a part of this protein "cut out" for structure determination by NMR. The "generic" chemical shifts for each residue type (e.g., glycine) are shown above the specific residue chemical shifts for each occurrence of that amino acid in the protein. There are seven lysine residues, for example, and each one has a unique pattern of chemical shifts similar to the random-coil values (in the same general region of the spectrum) but not identical. Because the Lys side chain is charged and likely to be exposed to solvent, we do not see a great deal of variation in the side-chain chemical shifts, but the backbone (Hα and HN) shifts are widely separated, ranging from 7.7-9.6 ppm for the HN resonances (random-coil HN is 8.4). Similar variations are seen for each of the amino acids found in the protein (e.g., 4 Gly, 3 Ala, 4 Leu, 5 Val, 3 Pro, 6 Cys, 3 Phe, and 2 Tyr). This sequence-specific variation of chemical shifts is what makes protein NMR possible: each residue in the sequence has its own "address" (set of precise chemical shifts) that allows us to measure distances and dihedral angles from a vast number of precisely defined positions within the protein. Before we can use these addresses, however, we need a "phone book" that pairs up the chemical shifts with the sequence-specific locations within the protein. For example, we need to know that the resonance in the 1H spectrum at 9.80 ppm (farthest left in Fig. 12.11) is not just an HN resonance and not just the HN of a phenylalanine residue, but that this resonance is the HN of Phe 21: a single specific proton in the entire molecule. It's like looking in the phone book and finding several pages of "Jones": we need the address of one particular Jones, Samuel P. Jones, in order to find his house and make a map of his neighbors. We have already encountered this process of assignment with natural products, but it is more complicated in a
