## A.3 The Model and Auxiliary Data Analysis Methods

Because of the large number of peaks under the amide I and amide II bands of proteins [5-8], auxiliary methods are required to aid in the resolution of the spectra. The nonlinear regression analysis must be assisted by relatively accurate initial choices of the parameters.

In our examples, second derivatives of the spectra were used to obtain initial peak positions on the frequency axis and to identify the number of peaks. Software to obtain derivative spectra is usually available in an FT-IR spectrometer's software package. A regression model consisting of a sum of Gaussian peaks, similar to that listed in Table 3.8, gave the best fit to protein and polypeptide spectra [9, 10].

Therefore, the first step of the analysis is to examine the second derivative of the spectrum and extract from it the approximate number of peaks in the amide I and II envelopes, along with initial guesses of their positions. To use the second derivative spectra, we should be familiar with the shape of the second derivative of a Gaussian peak. Shapes of a Gaussian peak and its first and second derivatives are given in Figure 7.1. The second derivative has a characteristic negative peak at the identical position on the x axis as the original Gaussian peak. Therefore, one negative peak should be in the second derivative spectrum for each component peak in the original spectrum.
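The shape of the second derivative can be checked directly. As a worked sketch, assume a single Gaussian peak of height $h$, position $x_0$, and width $W$, where $W$ is taken here to be the standard deviation (the width convention is our assumption for this illustration):

$$
g(x) = h \exp\left[-\frac{(x - x_0)^2}{2W^2}\right]
$$

$$
g''(x) = \frac{h}{W^2}\left[\frac{(x - x_0)^2}{W^2} - 1\right]\exp\left[-\frac{(x - x_0)^2}{2W^2}\right]
$$

Under this assumption, $g''(x_0) = -h/W^2$ is the minimum of the second derivative, the zero crossings lie at $x_0 \pm W$, and the positive side lobes reach their maxima of $2e^{-3/2}h/W^2 \approx 0.446\,h/W^2$ at $x_0 \pm \sqrt{3}\,W$. Because the depth of the negative peak scales as $1/W^2$, narrow components are emphasized relative to broad ones in the second derivative spectrum.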

We can now recognize the negative peaks of the second derivative spectrum of the protein lysozyme (Figure 7.3) as characteristic of each underlying Gaussian peak in the original spectrum. In our analysis of FT-IR spectra, the second derivative spectra are used to obtain the approximate number of component peaks. Initial guesses of the peak positions on the frequency axis are based on the positions of the negative second derivative peaks.
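The peak-counting step can be sketched numerically. The following is a minimal illustration on synthetic data, not the procedure of any particular spectrometer package: the two-peak spectrum, the grid, the `gaussian` helper, and the 10% height threshold are all assumptions chosen for the example. It uses `numpy.gradient` for the derivative and `scipy.signal.find_peaks` to locate the negative peaks.

```python
import numpy as np
from scipy.signal import find_peaks

def gaussian(x, h, x0, w):
    """Gaussian peak; w is taken as the standard deviation (an assumption)."""
    return h * np.exp(-((x - x0) ** 2) / (2.0 * w**2))

# Synthetic spectrum (hypothetical values): two overlapping peaks whose
# separation (14) is less than 2*w, so the envelope has a single maximum.
x = np.arange(1500.0, 1750.5, 0.5)
y = gaussian(x, 1.0, 1640.0, 7.5) + gaussian(x, 0.8, 1654.0, 7.5)

# Numerical second derivative by applying a central difference twice.
dx = x[1] - x[0]
d2 = np.gradient(np.gradient(y, dx), dx)

# Negative peaks of d2 are positive peaks of -d2. The height threshold
# (10% of the deepest minimum, an arbitrary choice) rejects numerical ripple.
idx, _ = find_peaks(-d2, height=0.1 * np.max(-d2))

n_peaks = idx.size      # approximate number of component peaks
x0_guess = x[idx]       # initial guesses for the peak positions
print(n_peaks, x0_guess)
```

In this sketch the minima fall close to, but not exactly at, the true positions, because the positive side lobe of each component shifts the minimum of its neighbor. This is why the positions serve only as initial guesses for the regression. With real data, noise is amplified by differentiation, so smoothing and a more careful threshold would probably be needed.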

The second auxiliary technique used is Fourier deconvolution [2]. This partial resolution method was discussed in Section 4.A.3. The procedure is different from the Fourier transform used to convert the interferogram into a frequency-based spectrum. Conceptually, Fourier deconvolution enhances the resolution of the spectrum by transforming the component Gaussian peaks into peaks with larger heights and smaller widths, while maintaining the same area and frequency as the original peaks.
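The effect described above can be illustrated with a simplified sketch. It is not necessarily the algorithm of [2] or of any instrument software: we assume a Gaussian broadening function, a hard frequency cutoff in place of a smooth apodization function, and a noise-free synthetic peak. The function name `fourier_deconvolve` and its parameters `w_remove` and `f_cut` are hypothetical.

```python
import numpy as np

def fourier_deconvolve(y, dx, w_remove, f_cut):
    """Narrow Gaussian bands by removing a Gaussian broadening of std. dev. w_remove.

    The Fourier transform of the spectrum is divided by the transform of a
    unit-area Gaussian, exp(-2 pi^2 w^2 f^2). Components above f_cut are set
    to zero, because the division would otherwise amplify noise without bound.
    """
    n = y.size
    f = np.fft.rfftfreq(n, d=dx)            # cycles per wavenumber unit
    keep = f <= f_cut
    gain = np.zeros_like(f)
    gain[keep] = np.exp(2.0 * np.pi**2 * w_remove**2 * f[keep] ** 2)
    return np.fft.irfft(np.fft.rfft(y) * gain, n=n)

# Synthetic single peak (hypothetical values): height 1, position 1650, W = 8.
x = np.arange(1500.0, 1750.5, 0.5)
dx = x[1] - x[0]
y = np.exp(-((x - 1650.0) ** 2) / (2.0 * 8.0**2))

y_fd = fourier_deconvolve(y, dx, w_remove=6.0, f_cut=0.1)

area_raw, area_fd = y.sum() * dx, y_fd.sum() * dx
print("areas:", area_raw, area_fd)
print("height ratio:", y_fd.max() / y.max())
print("position:", x[np.argmax(y_fd)])
```

Under these assumptions the peak keeps its area and position, because the gain is real and equals 1 at zero frequency. Its width shrinks from 8 to about $\sqrt{8^2 - 6^2} \approx 5.3$, and its height grows by roughly the inverse ratio. Removing too much width, or setting the cutoff too high, produces side lobes and amplified noise in real spectra.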

A spectrum that has been partially resolved by Fourier deconvolution can be fit with the same model (Table 7.1) as used for the original spectrum. The requirement that both sets of data give identical parameters and peak areas can be used as a convergence criterion.

Figure 7.3 The second derivative FTIR spectrum of amide I and II bands of lysozyme in aqueous solution (x axis: wavenumber, 1750 to 1500 cm⁻¹). The smooth line is experimental data. The jagged line on outer envelope is computed from the final regression analysis corresponding to the model and parameters found for the original spectrum, as described in the text. The inset shows the plot of connected residuals between calculated and experimental second derivative. (Reprinted with permission from [10], copyright by the American Chemical Society.)

Table 7.1 Model for the Analysis of Amide I and Amide II FT-IR Bands of Proteins

Assumptions: Solvent background has been subtracted; water vapor and carbon dioxide contributions have been removed

Regression equation:

$$
A = A_0 + \sum_{i=1}^{n} h_i \exp\left[-\frac{(x - x_{0,i})^2}{2W_i^2}\right]
$$

Regression parameters: $A_0$, $h_i$, $x_{0,i}$, $W_i$ for $i = 1$ to $n$

Data: $A$ (absorbance) vs. $x$ (frequency)

Auxiliary data sets: Second derivative of spectrum, Fourier deconvoluted (FD) spectrum

Special instructions:

1. Use the second derivative spectrum to identify the number of peaks and their initial positions ($x_{0,i}$)

2. Enhance resolution of the original spectrum using FD

3. Do initial regression analyses on the FD spectrum by first fixing the $x_{0,i}$. Take the results of these initial fits and do regressions again to find all the parameters. Use these parameters to fit the raw data spectrum

4. The major criteria for acceptance of the parameters are a final successful fit to the original spectrum using the fixed $x_{0,i}$ found from fitting the FD spectrum, a random deviation plot, and SD < $e_y$. Confirm the correct number of peaks ($n$) in the model by using the extra sum of squares F test (Section 3.C.1) for models with $n$, $n - 1$, and $n + 1$

5. True fractions of asparagine (ASN) and glutamine (GLN) should be subtracted from the fractional areas of bands at appropriate frequencies
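The two-stage regression in instruction 3 can be sketched as follows. This is an illustration under stated assumptions, not the authors' program: the model is taken as a constant baseline plus Gaussian peaks with $W$ as the standard deviation, the data are synthetic with added random noise, and the names `gaussian_sum` and `fit_two_stage` and all numerical values are hypothetical. The fit uses `scipy.optimize.curve_fit`.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian_sum(x, a0, heights, centers, widths):
    """Baseline a0 plus a sum of Gaussian peaks (widths as standard deviations)."""
    x = np.asarray(x, dtype=float)[:, None]
    peaks = heights * np.exp(-((x - centers) ** 2) / (2.0 * widths**2))
    return a0 + peaks.sum(axis=1)

def fit_two_stage(x, y, x0_guess, h_guess=0.5, w_guess=8.0):
    """Fit with positions fixed first, then refit with all parameters free."""
    x0_guess = np.asarray(x0_guess, dtype=float)
    n = x0_guess.size

    # Stage 1: positions held at their initial guesses; fit A0, h_i, W_i.
    def fixed_positions(x, a0, *p):
        return gaussian_sum(x, a0, np.array(p[:n]), x0_guess, np.array(p[n:]))

    p1, _ = curve_fit(fixed_positions, x, y,
                      p0=[0.0] + [h_guess] * n + [w_guess] * n)

    # Stage 2: release the positions, starting from the stage 1 results.
    def all_free(x, a0, *p):
        return gaussian_sum(x, a0, np.array(p[:n]),
                            np.array(p[n:2 * n]), np.array(p[2 * n:]))

    p0 = np.concatenate([p1[:1 + n], x0_guess, p1[1 + n:]])
    p2, _ = curve_fit(all_free, x, y, p0=p0)

    resid = y - all_free(x, *p2)
    sd = np.sqrt(np.sum(resid**2) / (x.size - p2.size))  # standard deviation of fit
    return p2, sd

# Synthetic two-peak spectrum with noise (hypothetical values).
rng = np.random.default_rng(0)
x = np.arange(1500.0, 1750.5, 0.5)
noise_sd = 0.002
y = gaussian_sum(x, 0.01, np.array([1.0, 0.8]),
                 np.array([1640.0, 1654.0]), np.array([7.5, 7.5]))
y = y + rng.normal(0.0, noise_sd, x.size)

# Initial positions as they might be read from second derivative minima.
params, sd = fit_two_stage(x, y, x0_guess=[1639.5, 1655.0])
n = 2
centers = params[1 + n:1 + 2 * n]
print("positions:", centers, "SD:", sd)
```

In this sketch the parameter vector is ordered as $A_0$, the heights, the positions, and the widths. The returned SD can be compared with the expected error in the absorbance, and the residuals `y - fit` can be plotted against $x$ to check for a random deviation pattern. Repeating the fit with $n - 1$ and $n + 1$ peaks would supply the sums of squares needed for the F test in instruction 4.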
