## C1 Goodness of Fit Criteria

In many situations, the exact model for a set of data has not been established. The goodness of fit criteria from a series of nonlinear regression analyses of the data onto all of the possible models can be used to find the model that best fits the data. The usual goodness of fit criteria, namely deviation plots, standard deviations (SD), and error sums (S), are compared for this task.

Section 2.B.4 illustrated the use of SD, standard errors of parameters, and deviation plots for assessing goodness of fit. A SD smaller than the standard error in y ($e_y$), assuming $e_y$ is constant or absolute, along with a random deviation plot, can usually be taken as good evidence of an acceptable fit. If the number of parameters in every model examined is the same, the lowest SD consistent with an uncorrected model and the best degree of randomness of the deviation plot can be used as criteria for choosing the best model. The use of summary statistics such as SD alone for such comparisons is risky, because summary statistics give little indication of systematic deviations of the data from the models. Deviation (residual) plots are often better indicators of goodness of fit than summary statistics because each data point is checked separately for its adherence to the regression model. The type of results needed to distinguish between models with the same number of parameters is illustrated in Table 3.10 for a hypothetical case with four models.

Models 1-3 in Table 3.10 all have SD > $e_y$ and nonrandom deviation plots. These are criteria for the rejection of these models. Only model 4 gives a random deviation plot and SD < $e_y$, indicating an acceptable model for the data.
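The screening of Table 3.10 can be written out as a small sketch. The values below are transcribed from the table; the `FitResult` class, the `acceptable` function, and the encoding of the visual judgment of the deviation plot as a boolean are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FitResult:
    model: int
    e_y: float          # standard error in y
    sd: float           # standard deviation of the regression
    random_plot: bool   # True if the deviation plot was judged random (assumed encoding)


# Values transcribed from Table 3.10 (hypothetical case with four models).
results = [
    FitResult(1, 0.01, 0.078, False),
    FitResult(2, 0.01, 0.033, False),
    FitResult(3, 0.01, 0.019, False),
    FitResult(4, 0.01, 0.0087, True),
]


def acceptable(fit: FitResult) -> bool:
    """Accept a model only if SD < e_y and its deviation plot is random."""
    return fit.sd < fit.e_y and fit.random_plot


accepted = [fit.model for fit in results if acceptable(fit)]
print(accepted)  # prints [4]
```

Both conditions are required together, which reflects the warning that SD alone gives little indication of systematic deviations.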

If the models being compared have different numbers of regression parameters, summary statistics and deviation plots cannot be compared directly, and a slightly different approach is needed to compare goodness of fit. Here, the individual regression analyses have different numbers of degrees of freedom, defined as the number of data points ($n$) analyzed minus the number of parameters ($p$). The difference in degrees of freedom must be accounted for when using a summary statistic to find the best of two models. A statistical test that can be used in such situations employs the extra sum of squares principle [1]. This involves calculating the F ratio:

$$F(p_2 - p_1,\; n - p_2) = \frac{(S_1 - S_2)/(p_2 - p_1)}{S_2/(n - p_2)} \tag{3.15}$$

where $S_1$ and $S_2$ are the residual error sums ($S$, eq. (2.10)) from regression analyses of the same data onto models 1 and 2, $p_1$ and $p_2$ now represent the numbers of parameters in each model, and the subscripts refer to the specific models.
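As a minimal sketch, the F ratio can be computed as follows, assuming eq. (3.15) has the standard extra sum of squares form, F = [(S1 - S2)/(p2 - p1)] / [S2/(n - p2)]. The function name `extra_sum_of_squares_f` and the numerical inputs are hypothetical and are not taken from any table in the text.

```python
def extra_sum_of_squares_f(s1, s2, p1, p2, n):
    """Return (F, df_numerator, df_denominator) for the extra sum of squares test.

    s1, s2: residual error sums for models 1 and 2 (same data set)
    p1, p2: numbers of parameters; model 2 is the more general model, so p2 > p1
    n:      number of data points
    """
    if p2 <= p1:
        raise ValueError("model 2 must have more parameters than model 1")
    if n <= p2:
        raise ValueError("need more data points than parameters in model 2")
    if s2 <= 0:
        raise ValueError("residual error sum of model 2 must be positive")
    df_num = p2 - p1   # extra parameters in model 2
    df_den = n - p2    # degrees of freedom of the model 2 fit
    f = ((s1 - s2) / df_num) / (s2 / df_den)
    return f, df_num, df_den


# Hypothetical inputs chosen only to exercise the formula.
f, df_num, df_den = extra_sum_of_squares_f(s1=12.0, s2=10.0, p1=4, p2=6, n=56)
print(f"F({df_num}, {df_den}) = {f:.2f}")  # prints F(2, 50) = 5.00
```

The two degrees of freedom are returned with F because they are needed to look up the tabulated value for comparison.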

Table 3.10 Hypothetical Results from Analysis of a Single Data Set onto Four Different Models, Each with the Same Number of Parameters

| Model | $e_y$ | SD | Deviation plot |
|---|---|---|---|
| 1 | 0.01 | 0.078 | Nonrandom |
| 2 | 0.01 | 0.033 | Nonrandom |
| 3 | 0.01 | 0.019 | Nonrandom |
| 4 | 0.01 | 0.0087 | Random |

To use eq. (3.15), model 2 must be a generalization of model 1. Regression analyses are done onto the two models, and the F statistic is calculated from eq. (3.15). The value of F obtained is then compared to the F value from tables at the desired confidence level, such as $F(p_2 - p_1, n - p_2)_{90\%}$. If the experimental $F(p_2 - p_1, n - p_2)$ is larger than $F(p_2 - p_1, n - p_2)_{90\%}$ but smaller than $F(p_2 - p_1, n - p_2)_{95\%}$ from the tables, then model 2 is the most probable model at a 90% confidence level. Different levels of confidence ($P\%$) can be employed until $F(p_2 - p_1, n - p_2)$ falls between two $F(p_2 - p_1, n - p_2)_{P\%}$, with the lowest of the two $P\%$s giving the confidence level with which model 2 is the most probable choice. If $F(p_2 - p_1, n - p_2) < F(p_2 - p_1, n - p_2)_{P\%}$ for all $P\% \geq 70$, then there is little confidence for concluding that model 2 fits the data better. In such cases, model 1 can be taken as the more appropriate model, because it explains the data with fewer parameters.
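The bracketing of the experimental F between tabulated values can be sketched as below. The function names are hypothetical, and the tabulated values in `f_tab_placeholder` are placeholders for illustration only; in practice they would be read from F tables for the appropriate degrees of freedom.

```python
from typing import Mapping, Optional


def confidence_level(f_exp: float, f_tab: Mapping[int, float]) -> Optional[int]:
    """Return the highest confidence level P (in %) whose tabulated F is
    exceeded by the experimental F, or None if no level is exceeded.

    f_tab maps P% to the tabulated F(p2 - p1, n - p2) at that level.
    Tabulated F values grow with P, so the result is the lower of the
    two levels that bracket f_exp.
    """
    passed = [level for level, f_crit in f_tab.items() if f_exp > f_crit]
    return max(passed) if passed else None


def preferred_model(f_exp: float, f_tab: Mapping[int, float]) -> str:
    """Apply the decision rule: fall back to model 1 when no level is passed."""
    level = confidence_level(f_exp, f_tab)
    if level is None:
        return "model 1 (fewer parameters)"
    return f"model 2 at the {level}% confidence level"


# Placeholder tabulated values (NOT taken from F tables), keyed by P%.
f_tab_placeholder = {70: 1.2, 80: 1.7, 90: 2.4, 95: 3.2}

print(preferred_model(2.0, f_tab_placeholder))  # model 2 at the 80% confidence level
print(preferred_model(1.0, f_tab_placeholder))  # model 1 (fewer parameters)
```

Starting the table at 70% mirrors the rule that an experimental F below the 70% value gives little confidence that model 2 fits the data better.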

An example of the use of the extra sum of squares F test is illustrated in Table 3.11 for models that consist of a single exponential and the sum of two and three exponentials. The F test is done for the two- and three-exponential models. We see that the experimental F value is 2.79, which lies between the tabulated values of $F(2, 49)_{90\%}$ and $F(2, 49)_{80\%}$. This indicates that the three-exponential model can be accepted with at least an 80% confidence level. If the F value were smaller than $F(2, 49)_{70\%}$, this would reflect an insignificant difference between the error sums of the two models, and the two-exponential model would be the better choice.

Given the same degree of goodness of fit, the deviation plot will appear somewhat more random to the eye as the number of parameters is increased in related models. For fits with about the same SD for models of similar type, an increase in the number of parameters tends to produce greater scatter in the deviation plots. This complicates direct comparisons between deviation plots for closely related models with differences between $p_1$ and $p_2$ ≤ 2. However, with this knowledge in hand, an analysis based on deviation plots and the extra sum of squares F test can be used to distinguish between the models.