## B2 Nonlinear Regression Algorithms

General programs that require the user to supply only the model and the data are available for nonlinear regression. Some of these are available in several commercial graphics software packages, and some have been written by independent authors [1, 6-8], Many of these programs use numerical differentiation of the model so that analytical derivatives need not be provided by the user.

Commercial software packages for mathematics such as Mathcad and Matlab have built in minimization functions that allow facile construction of general programs for regression analysis. A version of a nonlinear regression program in the Mathcad environment is provided in Section B.4 of this chapter.

Programs for nonlinear regression differ mainly in the algorithm used to minimize the error sum S. General programs have a "model subroutine" into which the user writes the program code for the desired regression model. The model can be an explicit closed form equation, a series of equations, or a numerical simulation, as long as it supplies computed values of the dependent variable y;(calc) to the main program.

A detailed discussion of nonlinear regression algorithms falls outside the aims of this book. In the following, we give a qualitative description of some of the more popular algorithms.

One commonly used algorithm employs the steepest descent method , in which the search for the minimum S travels on the gradient of the error surface. The gradient is estimated numerically by finite difference methods. The progress toward the minimum S is monitored at each iteration or cycle and adjusted when necessary. Steepest descent provides fast convergence in the initial stages of the computation, but slows down considerably near the convergence point. Such programs written with conservative tolerances for convergence are extremely reliable, but may be very slow for some applications.

The Gauss-Newton algorithm approximates nonlinear models for y;(calc) by linear Taylor series expansions . After initial guesses, new values of the parameters are found at each cycle by methods similar to linear least squares, using expressions involving first derivatives of S with respect to each parameter. Ideally, each iterative cycle gives successively better estimates for the parameters until the absolute minimum in S is reached. Unlike the steepest descent method, convergence is often fast in the vicinity of the minimum.

The Marquardt-Levenberg algorithm  contains elements of both steepest descent and Gauss-Newton methods but converges more rapidly than either of these. The Marquardt-Levenberg algorithm behaves like an efficient steepest descent method when the parameters place the error sum far from minimum S on the error surface. Close to the minimum 5, it behaves like the Gauss-Newton method, again under conditions where the latter algorithm is efficient.

Another algorithm that can be used for nonlinear regression is the modified simplex. For k parameters, this method employs a (k + l)-dimensional geometric construct called a simplex. A three-dimensional simplex is a triangle. A set of rules are used by the algorithm to move the simplex toward the minimum of the error surface as it simultaneously contracts in size , Convergence times are comparable to steepest descent methods but longer than the Marquardt-Levenberg method.

Nonlinear regression programs can be interfaced with graphics subroutines that plot the experimental data along with the curve computed from the model. If such a plot is consulted after the initial parameter guesses are made, but before the iterations begin, a serious mismatch of experimental and computed curves can immediately be recognized and better initial guesses can be made before the regression is started. The graphics subroutine can also be used as a rough visual check of final goodness of fit, as discussed already.

Nonlinear regression programs should include criteria to automatically test for convergence and terminate the program when preset conditions for convergence are reached. As mentioned previously, reliable tests for convergence are often based on the rate of change of S or on the rate of change of the parameters. Criteria for convergence can often be adjusted by the user, but this is normally not necessary. Although convergence can be tested over a series of cycles, this is self-defeating in terms of computational time in an algorithm as fast as Marquardt-Levenberg, and convergence is tested on successive cycles.

The job of regression algorithms is to find the global minimum of S. If the error surface is irregularly shaped, the algorithm could conceivably stop at a local minimum with an error sum considerably larger than the global minimum. Parameter values would then be in error. Although such false minima are possible, they are rarely encountered in practice in well-conditioned, properly designed regression analyses. A more complete discussion of error surfaces is included in Chapter 4.

A graphic view of the initial computed response in regression analysis aids in avoiding starting points too far from the absolute minimum, which might cause convergence to a false minimum. Local minima can be detected by starting regression analyses with several different, but physically reasonable, sets of initial values of the parameters. Convergence to significantly different values of S and different final parameters for different starting points suggests that local minima are being reached. Convergence to identical values of S and identical parameters for any reasonable starting point indicates that the global minimum has been found.