B.3 Error Surfaces

In Section 2.B.1 we introduced the concept of the error surface. The global minimum of the error surface must be found for a successful regression analysis. Recall that the error surface has as its "vertical axis" the error sum, S,

S = \sum_{j=1}^{n} w_j \left( y_j - y_j^{\mathrm{calc}} \right)^2

where the w_j are weighting factors, which depend on the distribution of errors in x and y. The other axes correspond to the parameters in the regression model. For a two-parameter model, we were able to visualize the error surface as a three-dimensional graph. A nonlinear regression analysis was thus visualized as a journey to the minimum S on the error surface, starting from a point corresponding to the initial parameter guesses (Figure 2.4).
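
In computational terms, S is easy to evaluate. The short Python sketch below is our own illustration, not part of the original text; the function name errsum, the example data, and the unit weights are assumptions made for clarity:

    import numpy as np

    def errsum(params, x, y, model, w=None):
        """Weighted error sum S = sum_j w_j * (y_j - y_j_calc)**2."""
        w = np.ones_like(y) if w is None else w
        resid = y - model(x, *params)
        return float(np.sum(w * resid ** 2))

    # Example: unit weights and a straight-line model y = b0 + b1*x
    x = np.array([0.0, 1.0, 2.0, 3.0])
    y = np.array([0.1, 1.1, 1.9, 3.2])
    line = lambda x, b0, b1: b0 + b1 * x
    print(errsum((0.0, 1.0), x, y, line))   # S at the guess b0=0, b1=1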

We now consider the shapes of some real error surfaces and how they can provide insight into the convergence properties of a particular analysis problem. Consider the two-parameter linear model:

y = b_0 + b_1 x

The error surface for this model is an elliptical paraboloid (Figure 4.3). It has a sharp minimum and sides that rise steeply in every direction away from the global minimum, so it should pose no convergence problems for any reliable minimization algorithm.
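
Readers who wish to reproduce a surface like Figure 4.3 can tabulate S over a grid of the two parameters. The following sketch uses synthetic data and arbitrary grid limits of our own choosing:

    import numpy as np

    # Synthetic straight-line data, y = 1 + 2x plus a little noise
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 5.0, 20)
    y = 1.0 + 2.0 * x + rng.normal(0.0, 0.1, x.size)

    # Tabulate S over a grid of intercept (b0) and slope (b1) values
    b0, b1 = np.meshgrid(np.linspace(0.0, 2.0, 101), np.linspace(1.0, 3.0, 101))
    S = ((y - (b0[..., None] + b1[..., None] * x)) ** 2).sum(axis=-1)

    # The minimum of the elliptical paraboloid lies near the true (1.0, 2.0)
    i, j = np.unravel_index(S.argmin(), S.shape)
    print(b0[i, j], b1[i, j])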

Nonlinear models can generate more complicated error surfaces than that in Figure 4.3. In Figure 4.4, we show an error surface for an increasing exponential model:

y = b_0 e^{b_1 x}, \qquad b_1 > 0

This error surface does not have a sharp minimum; instead, a steep-sided, curved valley leads down to a rather shallow minimum, and the region to the right and in front of the minimum slopes much more gently.

A second model we shall consider describes the rate of an enzyme-catalyzed reaction as a function of reactant concentration (cf. Section 6.B.3):

v = \frac{V_{\max} c}{K_m + c}

where v is the reaction rate, c is the reactant concentration, V_{\max} is the maximum rate, and K_m is the Michaelis constant.

The error surface for this model (Figure 4.5) shows a somewhat broad valley that opens out into an even broader, flatter region near the minimum.
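
The two nonlinear surfaces can be tabulated in the same way as the linear one. In the sketch below, the model functions, the synthetic noise-free data, and the parameter ranges are again our own illustrative assumptions; any contour-plotting routine applied to S will reproduce valley shapes like those of Figures 4.4 and 4.5:

    import numpy as np

    def surface(model, p1, p2, x, y):
        """Tabulate S over a grid of two model parameters."""
        g1, g2 = np.meshgrid(p1, p2)
        ycalc = model(x, g1[..., None], g2[..., None])
        return g1, g2, ((y - ycalc) ** 2).sum(axis=-1)

    expo = lambda x, b0, b1: b0 * np.exp(b1 * x)     # increasing exponential
    hyper = lambda x, vmax, km: vmax * x / (km + x)  # hyperbolic enzyme-rate model

    x = np.linspace(0.1, 4.0, 15)
    _, _, S_exp = surface(expo, np.linspace(0.5, 2.0, 80),
                          np.linspace(0.2, 1.2, 80), x, 1.0 * np.exp(0.7 * x))
    _, _, S_hyp = surface(hyper, np.linspace(1.0, 5.0, 80),
                          np.linspace(0.2, 3.0, 80), x, 3.0 * x / (1.0 + x))
    print(S_exp.min(), S_hyp.min())   # both approach zero near the true parameters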

Error surfaces can be useful in understanding why a poor choice of initial values of parameters may result in a failure to converge or in a matrix inversion error.

* Section 4.B.3 was written by Artur Sucheta.

Figure 4.3 Error surface for a linear model.

Figure 4.4 Error surface for an increasing exponential; relative (uniform) error = 0.025y. Arrow points to global minimum.

Figures 4.4 and 4.5 both show areas of the error surface that are relatively flat, nearly parallel to the x, y plane, as well as regions with steep gradients. This combination can pose a problem for minimum-seeking algorithms in which one criterion for the direction of the next iteration is the value of the gradient at the current point on the error surface.

For example, programs based upon the rapidly converging Marquardt-Levenberg algorithm sometimes exhibit convergence problems when starting points are chosen in broad regions of the error surface close to the absolute minimum of S. A typical program [6] compares two estimates of direction at each iteration. One is obtained by approximating the error surface as a parabola and locating the position of its minimum by matrix inversion [7]. The second is obtained by evaluating the local gradient on the error surface. If the two directions differ by more than 90°, the discrepancy is deemed impossible to resolve and the program terminates prematurely, returning a "matrix inversion error" message to the user.
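
Modern libraries report such failures differently from the program of ref. [6]; for example, scipy's curve_fit, which uses a Levenberg-Marquardt routine for unbounded problems, raises a RuntimeError rather than printing a "matrix inversion error". The sketch below, with illustrative data and starting points of our own choosing, shows how a poor start can surface as an exception while a reasonable one converges; whether a particular start actually fails depends on the data and the implementation:

    import numpy as np
    from scipy.optimize import curve_fit

    expo = lambda x, b0, b1: b0 * np.exp(b1 * x)
    x = np.linspace(0.0, 4.0, 25)
    y = 1.0 * np.exp(0.7 * x)

    for guess in [(1e-6, 10.0), (1.0, 1.0)]:  # a poor start, then a reasonable one
        try:
            popt, _ = curve_fit(expo, x, y, p0=guess)  # Levenberg-Marquardt
            print(guess, "->", popt)
        except RuntimeError as err:  # raised when the fit fails to converge
            print(guess, "-> failed:", err)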

How can such a situation be resolved by a researcher eager to analyze a data set? Simply choosing a new initial set of parameters sometimes leads to convergence. The new initial set may give a better representation of the experimental data or may lie in a more favorable region of the error surface. Thus, if the set of parameters defining the point P1 in Figure 4.5 gives convergence problems or premature program termination, an initial point on the other side of Pmin, further back in the steeper valley, may well give better convergence properties and result in a successful analysis of the data.

Figure 4.5 Error surface for a hyperbolic model. Arrow labeled Pmin points to global minimum.

At present, choosing the best starting point in such situations must be done by trial and error. Successful convergence depends on the shape of the error surface, and every error surface is a little different.
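
Such trial and error is easy to automate. The following Python sketch, again with illustrative data and starting points of our own choosing, retries a fit from several initial guesses and keeps the result with the smallest error sum S:

    import numpy as np
    from scipy.optimize import curve_fit

    expo = lambda x, b0, b1: b0 * np.exp(b1 * x)
    x = np.linspace(0.0, 4.0, 25)
    y = 1.0 * np.exp(0.7 * x)

    # Try several starting points; keep the fit with the smallest S
    best = None
    for p0 in [(0.1, 5.0), (5.0, 0.1), (1.0, 1.0)]:
        try:
            popt, _ = curve_fit(expo, x, y, p0=p0)
            S = float(np.sum((y - expo(x, *popt)) ** 2))
            if best is None or S < best[0]:
                best = (S, popt)
        except RuntimeError:
            continue  # this start failed; move on to the next
    print(best)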

We have found that these types of convergence problems occur more frequently with programs based on the Marquardt-Levenberg algorithm than with other, slower algorithms. Perhaps because the Marquardt-Levenberg method is so rapid, it is less forgiving of unusually shaped error surfaces.

In cases where the matrix inversion problem cannot be solved by a new set of initial parameters, a more conservative algorithm such as steepest descent or simplex (see Section 2.B.2) can usually be relied upon to find the global minimum, albeit with a larger number of iterations.
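
In Python, the simplex method is available as the Nelder-Mead option of scipy's minimize; the sketch below, once more with illustrative data of our own choosing, shows the idea. Because the method uses neither gradients nor matrix inversion, it tolerates awkwardly shaped surfaces at the cost of more function evaluations:

    import numpy as np
    from scipy.optimize import minimize

    expo = lambda x, b0, b1: b0 * np.exp(b1 * x)
    x = np.linspace(0.0, 4.0, 25)
    y = 1.0 * np.exp(0.7 * x)

    S = lambda p: float(np.sum((y - expo(x, *p)) ** 2))  # error sum to minimize

    # Nelder-Mead (simplex) search from a rough starting point
    res = minimize(S, x0=(0.5, 1.5), method="Nelder-Mead")
    print(res.x, res.nfev)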
