## A.1 Linear Regression Analysis

The classification of a model as linear means that we can fit it to experimental data by the method of linear regression. Although we shall be concerned mainly with nonlinear models and nonlinear regression in this book, it is instructive to review the method of linear regression, also called linear least squares. We shall see that the same principles used in linear least squares also apply to nonlinear regression analysis.

Linear least squares is familiar to most scientists and students of science. To discuss its principles, we express a single-equation linear model with k parameters in a general form:

$$y_j(\text{calc}) = F(x_j, b_1, b_2, \ldots, b_k) \tag{2.8}$$

where the $y_j(\text{calc})$ depend linearly on the parameters $b_1, \ldots, b_k$. Equation (2.8) provides a way to compute the response $y_j(\text{calc})$ from the linear model. We will also have n experimentally measured values $y_j(\text{meas})$, one at each $x_j$. For a set of n measured values $y_j(\text{meas})$, we define the error sum, S, as

$$S = \sum_{j=1}^{n} w_j \left[ y_j(\text{meas}) - y_j(\text{calc}) \right]^2 \tag{2.9}$$

where the $w_j$ are weighting factors that depend on the distribution of random errors in x and y. The simplest and most often used set of assumptions is as follows: the x values are free of error, and the random errors in y are independent of the magnitude of the $y_j(\text{meas})$. Another way of expressing this error distribution is that the variances in all of the $y_j$ ($\sigma_y^2$, where $\sigma_y$ is the standard deviation in y) are equal. In this special case $w_j = 1$. Other important options for $w_j$ will be discussed later.
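As a minimal sketch of eq. (2.9), the following Python function evaluates the weighted error sum for lists of measured and calculated responses. The function name `error_sum` and the sample numbers are illustrative assumptions, not taken from the text; omitting the weights corresponds to the special case of unit weighting factors.

```python
def error_sum(y_meas, y_calc, weights=None):
    """Weighted sum of squared residuals: S = sum_j w_j * (y_meas_j - y_calc_j)**2."""
    if len(y_meas) != len(y_calc):
        raise ValueError("y_meas and y_calc must have the same length")
    if weights is None:
        # Equal variances in y: every weighting factor is 1.
        weights = [1.0] * len(y_meas)
    if len(weights) != len(y_meas):
        raise ValueError("weights must have the same length as the data")
    return sum(w * (ym - yc) ** 2 for w, ym, yc in zip(weights, y_meas, y_calc))


# Hypothetical data, for illustration only.
y_meas = [1.0, 2.0, 4.0]
y_calc = [1.0, 2.5, 3.5]
s_unweighted = error_sum(y_meas, y_calc)
s_weighted = error_sum(y_meas, y_calc, weights=[1.0, 2.0, 4.0])
```

Passing a different list of weights changes how strongly each residual contributes to S without changing the function itself.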

The principle of least squares is used in both linear and nonlinear regression. Its major premise is that the best values of the parameters $b_1, \ldots, b_k$ will be obtained when S (eq. (2.9)) is at a minimum with respect to these parameters. Exact formulas for the parameters can be derived for a linear model by taking the first derivative of S with respect to each parameter and setting each of these derivatives equal to zero. This results in a set of linear simultaneous equations that can be solved in closed form for the unique value of each parameter [1, 2].
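The procedure just described can be sketched in Python for any model that is a linear combination of known functions of x. The names `fit_linear_model` and `solve_linear_system`, the use of Gaussian elimination, the unit weighting factors, and the quadratic example are all assumptions made for illustration; this is not one of the programs cited in the text.

```python
def solve_linear_system(a, rhs):
    """Solve a * b = rhs by Gaussian elimination with partial pivoting."""
    k = len(rhs)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(row) + [r] for row, r in zip(a, rhs)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("simultaneous equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, k):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[row][c] -= factor * aug[col][c]
    # Back substitution.
    b = [0.0] * k
    for row in range(k - 1, -1, -1):
        tail = sum(aug[row][c] * b[c] for c in range(row + 1, k))
        b[row] = (aug[row][k] - tail) / aug[row][row]
    return b


def fit_linear_model(x, y_meas, basis):
    """Least-squares parameters for y_calc = sum_i b_i * basis[i](x), unit weights.

    Setting dS/db_i = 0 for each parameter gives k simultaneous equations:
    sum_m [sum_j f_i(x_j) f_m(x_j)] b_m = sum_j f_i(x_j) y_j(meas).
    """
    f = [[g(xj) for g in basis] for xj in x]  # f[j][i] = basis function i at x_j
    k = len(basis)
    a = [[sum(row[i] * row[m] for row in f) for m in range(k)] for i in range(k)]
    rhs = [sum(row[i] * yj for row, yj in zip(f, y_meas)) for i in range(k)]
    return solve_linear_system(a, rhs)


# Hypothetical example: quadratic model with three parameters.
basis = [lambda t: t * t, lambda t: t, lambda t: 1.0]
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 * xj * xj + 3.0 * xj + 1.0 for xj in x]  # error-free synthetic data
params = fit_linear_model(x, y, basis)
```

Because the synthetic data contain no random error, the fit recovers the generating coefficients to within rounding error. With real data the same equations give the parameter values that minimize S.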

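Consider this procedure applied to the straight line model in eq. (2.3):

$$y_j(\text{calc}) = b_1 x_j + b_2 \tag{2.3}$$

With the assumption $w_j = 1$, the form of the error sum is

$$S = \sum_{j=1}^{n} \left[ y_j(\text{meas}) - b_1 x_j - b_2 \right]^2$$

For a given set of data, S depends on parameters $b_1$ and $b_2$. The required derivatives are

$$\frac{\partial S}{\partial b_1} = 0, \qquad \frac{\partial S}{\partial b_2} = 0 \tag{2.11}$$

Equation (2.11) yields two equations in two unknowns, which can be solved for $b_1$ and $b_2$. If we define $x_{\text{av}}$ and $y_{\text{av}}$ as

$$x_{\text{av}} = \frac{\sum_j x_j}{n} \tag{2.12}$$

$$y_{\text{av}} = \frac{\sum_j y_j(\text{meas})}{n}$$

the resulting expression for the slope of the line is

$$b_1 = \frac{\sum_j (x_j - x_{\text{av}})\left[ y_j(\text{meas}) - y_{\text{av}} \right]}{\sum_j (x_j - x_{\text{av}})^2}$$

and the intercept follows as $b_2 = y_{\text{av}} - b_1 x_{\text{av}}$.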
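The closed-form result for the straight line can be checked with a short Python sketch. It uses the standard least-squares expressions for the slope and intercept in terms of the averages of x and y, and assumes unit weighting factors. The function name `fit_line` and the data are hypothetical.

```python
def fit_line(x, y_meas):
    """Unweighted least-squares straight line; returns (slope, intercept)."""
    n = len(x)
    if n != len(y_meas) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_av = sum(x) / n
    y_av = sum(y_meas) / n
    # Sums of squares and cross products about the averages.
    sxx = sum((xj - x_av) ** 2 for xj in x)
    if sxx == 0.0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((xj - x_av) * (yj - y_av) for xj, yj in zip(x, y_meas))
    slope = sxy / sxx
    intercept = y_av - slope * x_av
    return slope, intercept


# Hypothetical data scattered about a straight line.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.0]
slope, intercept = fit_line(x, y)
```

Working about the averages, rather than with raw sums of x and xy, tends to reduce rounding error when the x values are large compared with their spread.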

The standard deviations in the slope and intercept can be computed from simple formulas. A BASIC program for linear regression using a straight line model is listed in Appendix I. Also, built-in functions in mathematics software such as Mathcad, Mathematica, and Matlab can be used for linear regression. The book by Bevington is an excellent source of linear regression programs (in FORTRAN) for a variety of linear models.
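For the standard deviations in the slope and intercept mentioned above, one common set of formulas from standard statistics texts estimates the variance of the residuals with n - 2 degrees of freedom and scales it by the spread of the x values. The Python sketch below applies those formulas, assuming error-free x values and unit weighting factors. Whether they match the cited formulas term for term is an assumption, and the function name `line_fit_with_errors` and the data are hypothetical.

```python
import math


def line_fit_with_errors(x, y_meas):
    """Return slope, intercept, and their estimated standard deviations."""
    n = len(x)
    if n != len(y_meas) or n < 3:
        raise ValueError("need at least three (x, y) pairs of equal length")
    x_av = sum(x) / n
    y_av = sum(y_meas) / n
    sxx = sum((xj - x_av) ** 2 for xj in x)
    if sxx == 0.0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((xj - x_av) * (yj - y_av) for xj, yj in zip(x, y_meas))
    slope = sxy / sxx
    intercept = y_av - slope * x_av
    # Minimum error sum S, then the residual standard deviation
    # with n - 2 degrees of freedom (two fitted parameters).
    s_min = sum((yj - slope * xj - intercept) ** 2 for xj, yj in zip(x, y_meas))
    s_y = math.sqrt(s_min / (n - 2))
    sd_slope = s_y / math.sqrt(sxx)
    sd_intercept = s_y * math.sqrt(sum(xj * xj for xj in x) / (n * sxx))
    return slope, intercept, sd_slope, sd_intercept


# Hypothetical data, for illustration only.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.0]
slope, intercept, sd_slope, sd_intercept = line_fit_with_errors(x, y)
```

At least three points are required here because a line through two points leaves no degrees of freedom for estimating the scatter.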

An important question to be answered as a part of any regression analysis is how well the model fits the data, that is, what is the goodness of fit? The product-moment correlation coefficient, often called simply the correlation coefficient, is a statistic often used to test the goodness of fit of linear least squares models to data. This correlation coefficient (r) is defined as