## A.1 Recognizing Correlated Parameters

In Section 2.A.1 we discussed correlation between x and y in a linear regression analysis. This correlation is measured by the product-moment correlation coefficient, r. We found that x, y correlation in a linear regression analysis is a good thing. For a calibration plot, for example, we would usually like to achieve the largest possible x, y correlation, i.e., $|r|$ as close to 1 as possible.

In Section 3.B.1 we mentioned briefly another kind of correlation; that is, correlation between two parameters in a regression analysis. This type of correlation is not desirable. If two parameters are fully correlated, unique and independent values cannot be found for each of them. A value found for one of these parameters will depend upon the value of the other one. Partial correlation between parameters can also occur and can pose serious problems if it is too large. We must avoid large correlations between parameters in nonlinear regression analysis.

How can we recognize correlation between parameters? In some cases, this is quite easy. For example, consider Table 4.1, which lists a series of models with correlated parameters. In entries 1-3 in this table, two parameters either multiply, divide, or are subtracted from one another. If parameters appear only in one of these relationships in the model, they are fully correlated and cannot be estimated independently by nonlinear regression analysis.

Table 4.1 Examples of Models with Correlated Parameters

| Correlated model | Correlated parameters | Corrected model | Relation |
|---|---|---|---|
| 1. $y = b_0 b_1/(1 + b_2 x^2)$ | $b_0, b_1$ | $y = b_0/(1 + b_2 x^2)$ | $b_0 = b_0 b_1$ |
| 2. $y = b_0(b_1 x^3 + \exp[-b_2 x])/b_3$ | $b_0, b_3$ | $y = b_0(b_1 x^3 + \exp[-b_2 x])$ | $b_0 = b_0/b_3$ |
| 3. $y = b_0 \exp[(b_2 - b_1)x]$ | $b_1, b_2$ | $y = b_0 \exp[b_1 x]$ | $b_1 = b_2 - b_1$ |
| 4. $y = F(b_0, b_1, \ldots, b_k)$, where $b_0 = f(b_1)$ or where the model does not depend on $b_0$ | $b_0, b_1$ | $y = F(b_1, b_2, \ldots, b_k)$ | $b_1 = F(b_0, b_1)$ |

In Table 4.1, $b_0 b_1$ (model 1), $b_0/b_3$ (model 2), and $b_2 - b_1$ (model 3) or $b_2 + b_1$ can be employed only as single parameters. The correct solution for this type of situation is to replace the correlated parameters in the model with a single parameter. This is shown in the "corrected model" column of Table 4.1; the relation between original and corrected model parameters is given in the final column.

We should not conclude that correlation exists anytime we find two parameters that are multiplied, divided, subtracted, or added in a model. For example, suppose model 1 in Table 4.1 were to have an additional term, so that

$$y = \frac{b_0 b_1}{1 + b_2 x^2} + \frac{b_1}{x} \qquad (4.1)$$

We now have $b_1$ by itself in a term completely separate from $b_0 b_1$. This should allow the reliable estimation of each parameter, $b_0$, $b_1$, and $b_2$. Thus, full correlation can be assumed when the product (or quotient, sum, or difference) of parameters is the only place in which those parameters appear in the model. If another term exists in the model in which one of these parameters appears again, this usually removes the correlation.
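One way to check this reasoning numerically is to compare the sensitivities of y to each parameter, that is, the columns of partial derivatives $\partial y/\partial b_j$ evaluated at the x values of the data. If two columns are exactly proportional, the two parameters are fully correlated. The following is a minimal sketch only; the parameter values, the x grid, and the helper name `cosine` are arbitrary assumptions chosen for illustration and do not come from the text.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two sensitivity columns."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Arbitrary illustrative values (assumptions, not from the text).
b0, b1, b2 = 2.0, 3.0, 0.5
xs = [0.5 + 0.25 * i for i in range(20)]

# Model 1 of Table 4.1: y = b0*b1 / (1 + b2*x**2)
d_b0_model1 = [b1 / (1 + b2 * x**2) for x in xs]   # dy/db0
d_b1_model1 = [b0 / (1 + b2 * x**2) for x in xs]   # dy/db1

# Eq. (4.1): y = b0*b1 / (1 + b2*x**2) + b1/x
d_b0_eq41 = d_b0_model1                            # extra term has no b0
d_b1_eq41 = [b0 / (1 + b2 * x**2) + 1 / x for x in xs]

cos_model1 = cosine(d_b0_model1, d_b1_model1)
cos_eq41 = cosine(d_b0_eq41, d_b1_eq41)
print(f"model 1:   cosine = {cos_model1:.6f}")
print(f"eq. (4.1): cosine = {cos_eq41:.6f}")
```

A cosine of exactly 1 (or -1) means the two columns are proportional, so no data set can separate the two parameters. A cosine below 1 shows only that the columns are linearly independent; it is not by itself a measure of partial correlation.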

The fourth model in Table 4.1 represents a general situation where one of the parameters exists in a mathematical relation with another parameter in the model. That is, if we can find an equation directly relating two of the parameters in a model, they are correlated.

In some cases, we may write a model for a set of data that has no information about a certain parameter in the model. An example is a current (y) vs. potential (x) curve for a completely irreversible electrode reaction controlled by electrode kinetics alone. The rising portion of this curve follows the expression

$$y = b_0 \exp[(x - E^\circ)/b_1] \qquad (4.2)$$

In this case, the standard potential E° cannot be used as an adjustable parameter because the data depend mainly on the kinetics of the electrode reaction. In other words, the data contain no information about the standard potential, which is a thermodynamic quantity. We would have to use an independently determined value of E° as a fixed parameter in this case.

Additionally, we can write an expression tying together $b_0$, $E^\circ$, and $b_1$, i.e.,

$$y = b_0 \exp[-E^\circ/b_1]\,\exp[x/b_1] \qquad (4.2a)$$

Even from an algebraic point of view, eq. (4.2a) contains more parameters than any data set could provide. As mentioned above, a value found for one of these parameters will depend upon the value of the other one.
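The dependence of one parameter on the other can be seen by evaluating eq. (4.2) for two different parameter sets. In the sketch below, $E^\circ$ is shifted and $b_0$ is rescaled so that the combination $b_0 \exp[-E^\circ/b_1]$ stays the same. All numerical values and the function name `current` are hypothetical and serve only as an illustration.

```python
import math

def current(x, b0, e0, b1):
    """Eq. (4.2): y = b0 * exp((x - E0) / b1)."""
    return b0 * math.exp((x - e0) / b1)

b1 = 0.05                                  # hypothetical value
set_a = {"b0": 1.0e-6, "e0": 0.00}         # hypothetical values

# Shift E0 by 0.10 and rescale b0 so that b0*exp(-E0/b1) is unchanged.
e0_b = 0.10
set_b = {"b0": set_a["b0"] * math.exp((e0_b - set_a["e0"]) / b1),
         "e0": e0_b}

xs = [0.20 + 0.01 * i for i in range(21)]
max_rel_diff = max(
    abs(current(x, set_a["b0"], set_a["e0"], b1)
        - current(x, set_b["b0"], set_b["e0"], b1))
    / current(x, set_a["b0"], set_a["e0"], b1)
    for x in xs
)
print(f"b0 (set A) = {set_a['b0']:.3e}, b0 (set B) = {set_b['b0']:.3e}")
print(f"largest relative difference between the curves: {max_rel_diff:.2e}")
```

The two parameter sets differ, yet the computed curves agree to within rounding error, so a regression program has no basis for preferring one set over the other.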

A similar situation arises if the same variability in the data is accounted for twice. For example, any simple exponential decay process, which can be expressed as

$$y = y_0 \exp\{-kt\}$$

requires just two parameters, $y_0$ and $k$, to fully account for all variability. Attempts to tailor this model may lead the unwary into proposing a more "suitable" form, for example

$$y = y_0 \exp\{-k(t - t_0)\}$$

We look at this apparently three-parameter model and ask ourselves: "What kind of variability do we account for with each parameter?" The presence of a completely correlated pair of parameters becomes obvious. For any data set consistent with the exponential decay model, we can choose several values of $t_0$ and, while holding each one constant, fit $k$ and $y_0$. The model will fit equally well in each case, with the same summary statistics and identical deviation plots. Even the resultant $k$'s will be the same. On the other hand, the values of $y_0$ will depend on the choice of $t_0$, since the data fix only the product $y_0 \exp\{kt_0\}$. If we allow $t_0$ to vary during a nonlinear regression routine, we may see matrix inversion errors in the Marquardt-Levenberg approach. Convergence will usually take a little longer, but the final fit may be quite good. However, the final values of the parameters will depend strongly on the initial guesses of the adjustable parameters. This is because the measured y contains no information about $t_0$. Thus, we need to be sure that the data contain information about all the parameters we are trying to estimate by nonlinear regression analysis.
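The experiment of holding $t_0$ fixed at several values can be sketched as follows, assuming the three-parameter form $y = y_0 \exp\{-k(t - t_0)\}$. For brevity the sketch replaces nonlinear regression with a straight-line least-squares fit of $\ln y$ against $t$, which is adequate here only because the synthetic data are noise-free and positive. The data values and the function name `fit_decay_with_fixed_t0` are assumptions made for illustration.

```python
import math

def fit_decay_with_fixed_t0(ts, ys, t0):
    """Fit y = y0*exp(-k*(t - t0)) with t0 held fixed.

    Uses a straight-line least-squares fit of ln(y) against t as a
    simple stand-in for nonlinear regression (requires positive y).
    """
    n = len(ts)
    logs = [math.log(y) for y in ys]
    t_mean = sum(ts) / n
    l_mean = sum(logs) / n
    slope = (sum((t - t_mean) * (l - l_mean) for t, l in zip(ts, logs))
             / sum((t - t_mean) ** 2 for t in ts))
    intercept = l_mean - slope * t_mean    # equals ln(y0) + k*t0
    k = -slope
    y0 = math.exp(intercept - k * t0)
    return y0, k

# Synthetic, noise-free data (assumed values): amplitude 5 at t = 0, k = 0.8
ts = [0.25 * i for i in range(17)]
ys = [5.0 * math.exp(-0.8 * t) for t in ts]

results = {t0: fit_decay_with_fixed_t0(ts, ys, t0) for t0 in (0.0, 1.0, 2.0)}
for t0, (y0, k) in results.items():
    product = y0 * math.exp(k * t0)
    print(f"t0 = {t0:.1f}: y0 = {y0:.4f}, k = {k:.4f}, "
          f"y0*exp(k*t0) = {product:.4f}")
```

Each choice of $t_0$ returns the same $k$ and the same product $y_0 \exp\{kt_0\}$, while $y_0$ itself changes with $t_0$. This is the behavior expected when the data carry no information about $t_0$.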

In designing models for nonlinear regression, we may encounter situations where parameters will be at least partially correlated. The reason is that many functions we might want to include in the model, based on our analysis of the physical nature of the experiment, behave in similar ways over the range of the independent variable. Look once again at the exponential model. Assume that we have determined the presence of a long-term linear drift in the instrument which we used to study a first-order decay. Appropriately, we modify our model to include this effect:

$$y = y_0 \exp\{-kt\} + mt + a$$

Here, the adjustable parameters are $a$, the value of the linear background at time $t = 0$; $m$, the slope of the linear background; $y_0$, the deflection of the exponential from the background at $t = 0$; and $k$, the rate constant of the decay. If our largest value of $t$ is significantly less than $4.6\,k^{-1}$, obtaining a reliable value for both $k$ and $m$ will be increasingly difficult. This problem will become very serious if our maximum $t$ falls below $1/k$. The reason for this is quite clear from inspection of our data: the observed decay is practically a straight line! In fact, now the exponential term is approximately

$$\exp\{-kt\} \approx 1 - kt \qquad (4.2e)$$

There is correlation between k and m. The algorithm we use for fitting the data cannot tell which variability to assign to what part of the model.
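The origin of this correlation can be made explicit by substituting the approximation of eq. (4.2e) into the drift model. The short derivation below assumes that the model has the form $y = y_0 \exp\{-kt\} + mt + a$, which is consistent with the parameter definitions given above.

```latex
\begin{align*}
y &= y_0 \exp\{-kt\} + mt + a \\
  &\approx y_0 (1 - kt) + mt + a \\
  &= (a + y_0) + (m - k y_0)\, t
\end{align*}
```

In this limit the data determine only an intercept, $a + y_0$, and a slope, $m - k y_0$. Any change in $k$ can be offset by a change in $m$ that leaves the slope unchanged, and likewise for $a$ and $y_0$ in the intercept.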

A good way to uncover more subtle correlations between parameters is by examining parameter correlation coefficients in a correlation matrix. A method of computing this matrix is described by Bates and Watts [1].

The diagonal elements of the symmetric correlation matrix are all unity. The off-diagonal elements can be interpreted like the correlation coefficient between y and x in a linear regression. An off-diagonal element in row $i$ and column $j$, with $i$ not equal to $j$, gives an indication of the correlation between the $i$th and the $j$th parameters. If its absolute value is close to 1, the two parameters are highly correlated. As a rule, correlation is usually not a serious problem if absolute values of off-diagonal elements are smaller than about 0.980. For example, if off-diagonal elements of 0.67 and 0.38 were found in a correlation matrix for a three-parameter fit, we would conclude that there are essentially no correlations between parameters.
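As an illustration of how such a matrix can be obtained and read, the sketch below uses a common construction: the parameter covariance matrix is taken as proportional to $(J^T J)^{-1}$, where $J$ holds the partial derivatives of the model with respect to each parameter, and each element is then divided by the square roots of the corresponding diagonal elements. This is not necessarily the procedure of ref. [1]. The drift model form $y = y_0 \exp\{-kt\} + mt + a$, the parameter values, the time grids, and all function names are assumptions chosen for the example.

```python
import math

def invert(matrix):
    """Invert a small square matrix (Gauss-Jordan, partial pivoting)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def correlation_matrix(jacobian):
    """Parameter correlation matrix from the Jacobian (rows = data points).

    The covariance matrix is proportional to inv(J^T J); the scale
    factor cancels when dividing by sqrt(V_ii * V_jj).
    """
    p = len(jacobian[0])
    jtj = [[sum(row[i] * row[j] for row in jacobian) for j in range(p)]
           for i in range(p)]
    cov = invert(jtj)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(p)]
            for i in range(p)]

def drift_jacobian(ts, y0, k):
    """Columns: d/da, d/dm, d/dy0, d/dk of y = y0*exp(-k*t) + m*t + a."""
    return [[1.0, t, math.exp(-k * t), -y0 * t * math.exp(-k * t)]
            for t in ts]

y0, k = 1.0, 1.0                                    # assumed values
short_ts = [0.5 / k * i / 40 for i in range(41)]    # t_max = 0.5/k
long_ts = [10.0 / k * i / 40 for i in range(41)]    # t_max = 10/k

corr_short = correlation_matrix(drift_jacobian(short_ts, y0, k))
corr_long = correlation_matrix(drift_jacobian(long_ts, y0, k))

names = ["a", "m", "y0", "k"]
for label, corr in (("short", corr_short), ("long", corr_long)):
    for i in range(4):
        for j in range(i):
            flag = "  <-- above 0.98" if abs(corr[i][j]) > 0.98 else ""
            print(f"{label}: r({names[i]},{names[j]}) = "
                  f"{corr[i][j]: .4f}{flag}")
```

With these assumed values, the element linking $k$ and $m$ should lie much closer to ±1 for the short time window than for the long one, in line with the discussion of the drift model above. Only the lower triangle is printed because the matrix is symmetric.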

As an example of a correlation matrix, we will refer back to the sample regression analysis summarized in Table 3.7. Recall that data were fit to a model of the form

$$y = \frac{b_1}{1 + \theta} + b_4 x + b_5 \qquad (4.3)$$

where

The correlation matrix elements obtained from this fit are shown (Table 4.2) only below the diagonal because it is a symmetric matrix. Elements above the diagonal are the mirror images of those below it. The bold numbers outside of the matrix are parameter labels. Thus, the correlation between parameters 2 and 3 is 0.045, that between 2 and 4 is 0.91, and so on. In this particular example, the only troublesome correlation is that between 1 and 4, at -0.991. Its absolute value is a little above the arbitrary upper limit of safety we discussed previously. However, the other parameters show acceptable correlations, and the other goodness of fit criteria are acceptable (see Section 3.B.1). Such a high correlation could be avoided by including more baseline data. This has not been done in the present case because the partial correlation creates no serious problems.

Table 4.2 Correlation Matrix for the Example in Table 3.7

0.065 0.059

An easy way to find out if such "borderline" correlations cause significant errors is to compare calculations done with several sets of starting parameters. We have done this for the data used to generate Table 3.7. Results for a different starting point are listed in Table 4.3. The final parameter values of $b_1$ and $b_4$ have not changed very much in these two tables, and the errors in parameters $b_1$, $b_2$, and $b_3$ remain small. We can conclude that the partial correlation between parameters $b_1$ and $b_4$ is not highly significant in this analysis. The error in $b_4$ is large because of its small significance in determining y (see Section 3.B.1).