B.2 Peak-Shaped Data

Many experiments in chemistry and biochemistry give peak-shaped signals. For methods such as chromatography, overlapped peaks result from poor separation of components of the sample during the analysis. Overlapped peaks in UV-VIS or infrared absorbance, fluorescence, X-ray photo-electron (XPS), and nuclear magnetic resonance (NMR) spectroscopy represent overlapped spectral features. Hence, there is a real need to separate overlapped peaks in a reliable way to extract the information relevant to each component peak.

For most of these techniques, NMR excepted, there is little fundamental theory on which to base models for nonlinear regression. Therefore, model building for peak-shaped data tends to be somewhat empirical. Nevertheless, various peak shapes can be employed reliably as models for such data, e.g., Gaussian and Lorentzian shapes. The Gaussian shape is derived from the normal curve of error, and the Lorentzian shape is narrower at the top but wider at the very bottom of the peak (Figure 3.6). Models for a number of these peak shapes are listed in Table 3.8. The parameters are typically the peak height, the half-width of the peak at half-height, and the position of the peak maximum on the x axis.
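The difference between the two shapes can be checked numerically. The following is a minimal sketch, assuming the Gaussian and Lorentzian equations of Table 3.8 and illustrative values h = 1, x0 = 0, W = 1 that do not come from the text. The function names are ours.

```python
import math

def gaussian(x, h, x0, w):
    """Gaussian peak: h * exp(-(x - x0)^2 / (2 w^2))."""
    return h * math.exp(-((x - x0) ** 2) / (2.0 * w ** 2))

def lorentzian(x, h, x0, w):
    """Lorentzian peak: h * w^2 / ((x - x0)^2 + w^2)."""
    return h * w ** 2 / ((x - x0) ** 2 + w ** 2)

# Illustrative parameters (assumed, not from the text)
h, x0, w = 1.0, 0.0, 1.0

# Tabulate both shapes at increasing distance from the peak maximum
for dx in (0.0, 0.5, 1.0, 3.0, 5.0):
    g = gaussian(x0 + dx, h, x0, w)
    l = lorentzian(x0 + dx, h, x0, w)
    print(f"x - x0 = {dx:3.1f}   G = {g:.4f}   L = {l:.4f}")
```

With equal h, x0, and W, the Lorentzian lies below the Gaussian near the maximum and above it far from the maximum, which is the behavior described for Figure 3.6.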

Models for overlapped peaks can be constructed by summing up two or more Gaussian, Lorentzian, or other peak-shape functions. For example, suppose we have chromatographic peaks that are approximated well by Gaussian peak shapes. Two overlapped peaks are represented by the sum of two Gaussian equations (Table 3.8).

Figure 3.6 Gaussian (solid line) and Lorentzian (dashed line) peak shapes used for models of peak-shaped data.

Although experimental data may require background terms or a function that accounts for tailing [1], we illustrate the use of an overlapped peak model in the ideal case where none of these refinements is necessary. The model for two overlapped Gaussian peaks (Table 3.8) was used to fit noisy simulated data for two overlapping peaks. The computed and measured data agree very well (Figure 3.7). Initial parameters were chosen as the best guesses possible by inspection of a graph of the raw data. Parameters and statistics are summarized in Table 3.9. The table shows the characteristics of a good fit: SD < $e_y$ (0.005), where $e_y$ is the noise level in y, and the deviation plot is random.
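A fit of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions, not the program used to produce Table 3.9: it uses SciPy's `curve_fit` as the regression routine, takes the true and initial parameter values from Table 3.9, and assumes an x grid (0 to 7 min, 141 points) and a random seed that are not given in the text.

```python
import numpy as np
from scipy.optimize import curve_fit

def two_gaussians(x, b1, b2, b3, b4, b5, b6):
    """Sum of two Gaussian peaks (heights b1, b4; positions b2, b5; widths b3, b6)."""
    return (b1 * np.exp(-(x - b2) ** 2 / (2 * b3 ** 2))
            + b4 * np.exp(-(x - b5) ** 2 / (2 * b6 ** 2)))

true_b = [1.0, 2.0, 0.5, 0.6, 3.7, 0.8]      # true values, Table 3.9
guess_b = [1.0, 2.0, 0.51, 0.56, 3.7, 0.9]   # initial values, Table 3.9

# Simulated data: the x grid and seed are assumptions made for this sketch
rng = np.random.default_rng(0)
x = np.linspace(0.0, 7.0, 141)
y_clean = two_gaussians(x, *true_b)
e_y = 0.005 * y_clean.max()                  # absolute noise, 0.5% of maximum y
y = y_clean + rng.normal(0.0, e_y, x.size)

# Nonlinear least-squares fit starting from the initial guesses
popt, pcov = curve_fit(two_gaussians, x, y, p0=guess_b)

# Standard deviation of the regression and the residuals for a deviation plot
resid = y - two_gaussians(x, *popt)
sd = np.sqrt(np.sum(resid ** 2) / (x.size - len(popt)))

for i, (b_fit, b_true) in enumerate(zip(popt, true_b), start=1):
    print(f"b{i}: fitted {b_fit:.4f}, true {b_true:.4f}")
print(f"SD = {sd:.4f}, e_y = {e_y:.4f}")
```

For a good fit, SD should be comparable to or smaller than e_y, and a plot of `resid` against `x` should show no systematic pattern.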

The use of nonlinear regression for peak resolution has the advantage that the component peaks can be reconstructed from the regression parameters. In this example, the computed estimates of height, width, and peak position are simply used with the equation for the shape of a single Gaussian peak (Table 3.8) to generate each component peak. In this way, accurate peak positions, heights, and areas (e.g., by integration) can be obtained for the components of the overlapped peaks.
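As a sketch of this reconstruction step, the code below takes the final parameter values from Table 3.9, regenerates each component peak with the single-Gaussian equation, and obtains its area both by numerical integration and from the closed-form Gaussian area $hW\sqrt{2\pi}$. The integration limits (eight widths on either side of the maximum) and the helper names are our assumptions.

```python
import math

def gaussian(x, h, x0, w):
    """Single Gaussian peak: h * exp(-(x - x0)^2 / (2 w^2))."""
    return h * math.exp(-((x - x0) ** 2) / (2.0 * w ** 2))

# Final regression values from Table 3.9: (height, position, width)
fitted = {
    "peak 1": (0.998, 1.998, 0.4996),
    "peak 2": (0.5987, 3.699, 0.804),
}

def trapezoid_area(h, x0, w, lo, hi, n=2000):
    """Area under one component peak by the trapezoidal rule on n intervals."""
    step = (hi - lo) / n
    ys = [gaussian(lo + i * step, h, x0, w) for i in range(n + 1)]
    return step * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

for name, (h, x0, w) in fitted.items():
    numeric = trapezoid_area(h, x0, w, x0 - 8 * w, x0 + 8 * w)
    analytic = h * w * math.sqrt(2.0 * math.pi)
    print(f"{name}: area = {numeric:.4f} (numerical), {analytic:.4f} (closed form)")
```

Numerical integration is the more general route, since it also applies to peak shapes that have no simple closed-form area.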

Figure 3.7 Deconvolution of Gaussian chromatographic peaks by nonlinear regression analysis (x axis: time, min). In the outer envelope, points are experimental and the line is the best fit from the analysis. Underlying peaks are component peaks computed from the results of the analysis.

Table 3.8 Models for Peak-Shaped Data

| Peak shape | Model equation | Parameters |
|---|---|---|
| Gaussian (G) | $y = h \exp\{-(x - x_0)^2 / 2W^2\}$ | $h, x_0, W$ |
| Lorentzian (L) | $y = hW^2 / [(x - x_0)^2 + W^2]$ | $h, x_0, W$ |
| Two overlapped Gaussian peaks | $y = b_1 \exp\{-(x - b_2)^2 / 2b_3^2\} + b_4 \exp\{-(x - b_5)^2 / 2b_6^2\}$ | $b_1 = h_1$, $b_2 = x_{0,1}$, $b_3 = W_1$, $b_4 = h_2$, $b_5 = x_{0,2}$, $b_6 = W_2$ |
| Combined Gaussian and Lorentzian shapes | $y = h\{fG + (1 - f)L\}$ | $h, x_0, W$; $f$ = fraction Gaussian |
| Gaussian model for n peaks with baseline | $y = mx + y_0 + \sum_{i=1}^{n} h_i \exp\{-(x - x_{0,i})^2 / 2W_i^2\}$ | $h_i, x_{0,i}, W_i$ |
| n Gaussian/Lorentzian peaks with baseline | $y = mx + y_0 + \sum_{i=1}^{n} h_i\{f_i G_i + (1 - f_i)L_i\}$ | $h_i, x_{0,i}, W_i, f_i$; $f$ = fraction Gaussian |
| Chromatographic peak with tailing (R vs. t data) | $R = \int dt'\, h \exp[-(t' - t_0)^2 / 2W^2] \times \exp[-(t - t' + t_0)/\tau]$ | $h, t_0, W, \tau$ (see Chapter 14) |
| Chromatographic peak with tailing (R vs. t data) | $R = A / \{\exp[-a(t - c)] + \exp[-b(t - c)]\}$ | $A, c, a, b$ (see Chapter 14) |

Symbols: $h$ = peak height, $x_0$ = peak maximum location on x axis, $W$ = peak width at half-height, $t_0$ = peak maximum on t axis, $m$ = baseline slope, $y_0$ = baseline intercept. In the combined models, G and L denote the Gaussian and Lorentzian shapes at unit height.

Table 3.9 Results of Overlapped Gaussian Peak Model Fit onto Data Representing Two Overlapping Peaks

| Parameter/statistic | Initial value | True value$^a$ | Final value | Error |
|---|---|---|---|---|
| $b_1$ | 1 | 1.00 | 0.998 | 0.2% |
| $b_2$ | 2 | 2.00 | 1.998 | 0.1% |
| $b_3$ | 0.51 | 0.500 | 0.4996 | 0.08% |
| $b_4$ | 0.56 | 0.600 | 0.5987 | 0.2% |
| $b_5$ | 3.7 | 3.7 | 3.699 | 0.02% |
| $b_6$ | 0.9 | 0.80 | 0.804 | 0.05% |
| SD | | | 0.0033 | |
| Deviation plot | | | Random | |

$^a$ Data generated by using the equation for two overlapped Gaussian peaks (Table 3.8) with absolute normally distributed noise at 0.5% of the maximum y.

A bit of preliminary testing is often needed to find the best model for peaks from a given experiment. This is best done with data representing only single peaks. If we find the best model for single peaks, we can usually apply it to the overlapped peak situation.
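Such preliminary testing can be sketched as follows: fit each candidate shape to the same single-peak data and compare the standard deviations of the regressions. The data here are simulated from a Gaussian with assumed parameters (h = 1, x0 = 2, W = 0.5, noise 0.005), and SciPy's `curve_fit` stands in for whatever regression program is actually used.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, h, x0, w):
    return h * np.exp(-(x - x0) ** 2 / (2 * w ** 2))

def lorentzian(x, h, x0, w):
    return h * w ** 2 / ((x - x0) ** 2 + w ** 2)

def fit_sd(model, x, y, p0):
    """Fit one candidate model; return best-fit parameters and the regression SD."""
    popt, _ = curve_fit(model, x, y, p0=p0)
    resid = y - model(x, *popt)
    return popt, np.sqrt(np.sum(resid ** 2) / (x.size - len(popt)))

# Simulated single peak (all values assumed for this illustration)
rng = np.random.default_rng(1)
x = np.linspace(0.0, 4.0, 81)
y = gaussian(x, 1.0, 2.0, 0.5) + rng.normal(0.0, 0.005, x.size)

p0 = [0.9, 2.1, 0.6]  # rough guesses read from a plot of the data
for name, model in (("Gaussian", gaussian), ("Lorentzian", lorentzian)):
    popt, sd = fit_sd(model, x, y, p0)
    print(f"{name:10s} SD = {sd:.4f}")
```

The model whose SD is close to the noise level and whose deviation plot is random is the one to carry over to the overlapped peaks.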

The literature provides considerable guidance in choosing models for certain types of data. As mentioned earlier, successful fitting of chromatographic data has been done by using a Gaussian peak model convoluted with a decreasing exponential in time after the peak to account for peak tailing [1] (Table 3.8). The last entry in Table 3.8 shows another function that accounts for tailing of chromatographic peaks (Chapter 14).

For NMR spectroscopy, the choice of a Gaussian or a Lorentzian peak shape may depend on theoretical considerations (Chapter 8). Most peaks in Fourier transform infrared spectroscopy can be reliably fit with Gaussian or Gaussian-Lorentzian models (see Chapter 7).

A useful general model for peak-shaped data involves a linear combination of Lorentzian (L) and Gaussian (G) peak shapes (Table 3.8). A parameter $f$ is included as the fraction of Gaussian character. Thus, four parameters per peak are usually used in the regression analysis. Product functions of G and L have also been used [1]. A Gaussian-Lorentzian model for n peaks is given in Table 3.8, with $mx + y_0$ representing the slope ($m$) and intercept ($y_0$) of a linear background.
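A minimal sketch of this general model is given below. It assumes that G and L are evaluated at unit height, so that h remains the height of the combined peak, and that each peak carries its own fraction f. The function names and the example parameter values are ours.

```python
import math

def gl_peak(x, h, x0, w, f):
    """One combined peak: h * (f * G + (1 - f) * L), with G and L at unit height."""
    g = math.exp(-((x - x0) ** 2) / (2.0 * w ** 2))
    l = w ** 2 / ((x - x0) ** 2 + w ** 2)
    return h * (f * g + (1.0 - f) * l)

def gl_model(x, m, y0, peaks):
    """Linear background m*x + y0 plus a sum of combined peaks.

    peaks is a sequence of (h, x0, w, f) tuples, four parameters per peak.
    """
    return m * x + y0 + sum(gl_peak(x, *p) for p in peaks)

# Example: two peaks on a sloping baseline (illustrative values only)
peaks = [(1.0, 2.0, 0.5, 0.7), (0.6, 3.7, 0.8, 0.4)]
for xi in (1.0, 2.0, 3.0, 3.7, 5.0):
    print(f"x = {xi:3.1f}   y = {gl_model(xi, 0.01, 0.05, peaks):.4f}")
```

Setting f = 1 recovers a pure Gaussian peak and f = 0 a pure Lorentzian, so the same routine can serve for all three cases.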
