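Suppose we have made a series of $N$ measurements, giving $N$ data points,
each consisting of a triple
$(x_i, y_i, \sigma_i)$, where $\sigma_i$ is
the standard deviation of the measurement $y_i$. Suppose further that
we expect that the relationship between $x$ and $y$ is given by the
expression

$$ y = m x + b, \tag{1} $$

where the slope $m$ and the intercept $b$ are unknown constants. We take as their best-fit values the ones that minimize the weighted sum of squared deviations

$$ \chi^2 = \sum_{i=1}^{N} \frac{(y_i - m x_i - b)^2}{\sigma_i^2}. \tag{2} $$

We will give some justification for this formula in the next section. Notice that this procedure tends to make the differences between the $y_i$ values, i.e. the “observed” values, and the $m x_i + b$ values, i.e. the “predicted” values, as small as possible. Note also that each point is weighted by $1/\sigma_i^2$, so the weighting of a point is larger if its standard deviation is smaller. This makes sense: we want the more certain measurements to have a bigger influence on the agreement between the model and the data.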
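The weighted sum described above can be sketched in a few lines of Python. This is only an illustration: it assumes the straight-line model $y = m x + b$ with slope `m` and intercept `b`, and the function name `chi_square` and the data values are hypothetical, chosen so the arithmetic is easy to follow.

```python
def chi_square(xs, ys, sigmas, m, b):
    """Weighted sum of squared residuals for the line y = m*x + b."""
    total = 0.0
    for x, y, sigma in zip(xs, ys, sigmas):
        residual = y - (m * x + b)        # observed minus predicted
        total += (residual / sigma) ** 2  # smaller sigma gives a larger weight
    return total


# Hypothetical data, made up for illustration only.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]
sigmas = [0.5, 0.5, 1.0]

# The line y = 2x + 1 passes through every point, so chi-square vanishes.
exact = chi_square(xs, ys, sigmas, m=2.0, b=1.0)

# Lowering the intercept by 1 makes every residual equal to 1; the two
# points with sigma = 0.5 then contribute 4 each, the third contributes 1.
offset = chi_square(xs, ys, sigmas, m=2.0, b=0.0)
```

The two precise points dominate `offset`, which is the weighting behaviour the paragraph above describes.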

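The minimum of $\chi^2$ occurs when its derivatives with respect to both parameters vanish:

$$ \frac{\partial \chi^2}{\partial m} = 0, \qquad \frac{\partial \chi^2}{\partial b} = 0. \tag{3} $$

To work with these conditions it helps to expand the square in $\chi^2$ and collect terms:

$$ \chi^2 = \sum_i \frac{(y_i - m x_i - b)^2}{\sigma_i^2} = S_{yy} - 2 m S_{xy} - 2 b S_y + m^2 S_{xx} + 2 m b S_x + b^2 S, \tag{4} $$

where we have defined

$$ S = \sum_i \frac{1}{\sigma_i^2}, \quad S_x = \sum_i \frac{x_i}{\sigma_i^2}, \quad S_y = \sum_i \frac{y_i}{\sigma_i^2}, \quad S_{xx} = \sum_i \frac{x_i^2}{\sigma_i^2}, \quad S_{xy} = \sum_i \frac{x_i y_i}{\sigma_i^2}, \quad S_{yy} = \sum_i \frac{y_i^2}{\sigma_i^2}. \tag{5} $$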
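As a sanity check on the expansion, the following sketch accumulates the six weighted sums and confirms numerically that the expanded expression agrees with the direct sum of squared, weighted residuals. It assumes the model $y = m x + b$; the helper names (`weighted_sums`, `chi_square_from_sums`, `chi_square_direct`) and the data are hypothetical.

```python
def weighted_sums(xs, ys, sigmas):
    """Return (S, Sx, Sy, Sxx, Sxy, Syy), each weighted by 1/sigma**2."""
    S = Sx = Sy = Sxx = Sxy = Syy = 0.0
    for x, y, sigma in zip(xs, ys, sigmas):
        w = 1.0 / sigma**2
        S += w
        Sx += w * x
        Sy += w * y
        Sxx += w * x * x
        Sxy += w * x * y
        Syy += w * y * y
    return S, Sx, Sy, Sxx, Sxy, Syy


def chi_square_from_sums(sums, m, b):
    """Chi-square written as a quadratic expression in m and b."""
    S, Sx, Sy, Sxx, Sxy, Syy = sums
    return Syy - 2*m*Sxy - 2*b*Sy + m*m*Sxx + 2*m*b*Sx + b*b*S


def chi_square_direct(xs, ys, sigmas, m, b):
    """Chi-square as the plain sum over data points."""
    return sum(((y - m*x - b) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))


# Hypothetical data, made up for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
sigmas = [0.1, 0.2, 0.1, 0.3]

sums = weighted_sums(xs, ys, sigmas)
direct = chi_square_direct(xs, ys, sigmas, m=1.9, b=0.2)
expanded = chi_square_from_sums(sums, m=1.9, b=0.2)
```

One practical point in favour of the expanded form: the sums depend only on the data, so they are computed once and $\chi^2$ can then be evaluated for any trial slope and intercept without another pass over the points. The expanded form does subtract large terms from one another, so it is somewhat more exposed to rounding error than the direct sum.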

Equation (4) is called a quadratic form. It is a generalization of a quadratic to more than one variable. Here the variables are $m$ and $b$. It happens to be a “positive-definite” quadratic form (provided the $x_i$ are not all equal), which means that it has a minimum. The minimum gives our best fit value of the slope $m$ and intercept $b$. Call this point $m^*$ and $b^*$. Then $\chi^2$ increases in all directions as we vary $m$ and $b$ away from this point. Think of the two variables $m$ and $b$ as defining a plane and think of $\chi^2$ as defining the height of a surface above the plane. The surface then looks like a bowl. The contours of constant elevation are ellipses in $m$ and $b$.

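Our next task is to locate the bottom of the bowl. We do this by
setting both partial derivatives of $\chi^2$ to zero:

$$ \frac{\partial \chi^2}{\partial m} = -2 \sum_i \frac{x_i (y_i - m x_i - b)}{\sigma_i^2} = -2 S_{xy} + 2 m S_{xx} + 2 b S_x = 0, \tag{6} $$

$$ \frac{\partial \chi^2}{\partial b} = -2 \sum_i \frac{y_i - m x_i - b}{\sigma_i^2} = -2 S_y + 2 m S_x + 2 b S = 0. \tag{7} $$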
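The two partial derivatives can be checked against a finite-difference estimate. The sketch below assumes the model $y = m x + b$; the function names, the step size `h`, and the data are hypothetical choices for illustration.

```python
def chi_square(xs, ys, sigmas, m, b):
    """Weighted sum of squared residuals for the line y = m*x + b."""
    return sum(((y - m*x - b) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))


def gradient(xs, ys, sigmas, m, b):
    """Analytic partial derivatives of chi-square with respect to m and b."""
    dm = db = 0.0
    for x, y, s in zip(xs, ys, sigmas):
        w = 1.0 / s**2
        r = y - m*x - b          # residual at this point
        dm += -2.0 * w * x * r
        db += -2.0 * w * r
    return dm, db


def numeric_gradient(xs, ys, sigmas, m, b, h=1e-6):
    """Central-difference estimate of the same two derivatives."""
    dm = (chi_square(xs, ys, sigmas, m + h, b)
          - chi_square(xs, ys, sigmas, m - h, b)) / (2.0 * h)
    db = (chi_square(xs, ys, sigmas, m, b + h)
          - chi_square(xs, ys, sigmas, m, b - h)) / (2.0 * h)
    return dm, db


# Hypothetical data and trial parameters, made up for illustration only.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.5, 5.5]
sigmas = [0.5, 1.0, 0.5]

analytic = gradient(xs, ys, sigmas, m=1.0, b=0.5)
numeric = numeric_gradient(xs, ys, sigmas, m=1.0, b=0.5)
```

Because $\chi^2$ is exactly quadratic in the two parameters, the central difference has no truncation error here, and any disagreement between the two results comes from floating-point rounding alone.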

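This is a simple linear system of the form

$$ S_{xx}\, m + S_x\, b = S_{xy}, \tag{8} $$

$$ S_x\, m + S\, b = S_y, \tag{9} $$

or, in matrix notation, $A \mathbf{v} = \mathbf{c}$ with $\mathbf{v} = (m, b)^T$, where the matrix $A$ is

$$ A = \begin{pmatrix} S_{xx} & S_x \\ S_x & S \end{pmatrix} \tag{10} $$

and the vector $\mathbf{c}$ is

$$ \mathbf{c} = \begin{pmatrix} S_{xy} \\ S_y \end{pmatrix}. \tag{11} $$

The solutions are, in compact matrix form,

$$ \mathbf{v}^* = A^{-1} \mathbf{c}, \tag{12} $$

where the inverse matrix is

$$ A^{-1} = \frac{1}{\Delta} \begin{pmatrix} S & -S_x \\ -S_x & S_{xx} \end{pmatrix} \tag{13} $$

and $\Delta$ is the determinant of $A$,

$$ \Delta = S\, S_{xx} - S_x^2. \tag{14} $$

Written out in components,

$$ m^* = \frac{S\, S_{xy} - S_x S_y}{\Delta}, \tag{15} $$

$$ b^* = \frac{S_{xx} S_y - S_x S_{xy}}{\Delta}. \tag{16} $$

This is the main result.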
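A minimal sketch of the resulting fit procedure follows. It assumes the model $y = m x + b$ and the standard closed-form solution of the two-by-two linear system in terms of the $1/\sigma_i^2$-weighted sums; the function name `fit_line` and the test data are hypothetical.

```python
def fit_line(xs, ys, sigmas):
    """Weighted least-squares slope and intercept for y = m*x + b."""
    S = Sx = Sy = Sxx = Sxy = 0.0
    for x, y, sigma in zip(xs, ys, sigmas):
        w = 1.0 / sigma**2       # weight of this measurement
        S += w
        Sx += w * x
        Sy += w * y
        Sxx += w * x * x
        Sxy += w * x * y

    delta = S * Sxx - Sx * Sx    # determinant of the 2x2 system
    if delta == 0.0:
        # Exact-zero check only; nearly degenerate data is not detected.
        raise ValueError("need at least two distinct x values")

    m = (S * Sxy - Sx * Sy) / delta
    b = (Sxx * Sy - Sx * Sxy) / delta
    return m, b


# Hypothetical data lying exactly on y = 2x + 1, with unequal uncertainties.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
sigmas = [0.1, 0.2, 0.3, 0.4]

m_fit, b_fit = fit_line(xs, ys, sigmas)
```

When the data lie exactly on a line, the fit should recover that line regardless of the weights, which makes this a convenient first test. If NumPy is available, `numpy.polyfit` with degree 1 and weights set to $1/\sigma_i$ (not $1/\sigma_i^2$, since it applies the weights to the unsquared residuals) should give the same slope and intercept.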

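A quadratic in a single variable,

$$ f(x) = \alpha x^2 + \beta x + \gamma, \tag{17} $$

can be rewritten by completing the square, which displays its extremum at $x_0 = -\beta/(2\alpha)$ explicitly:

$$ f(x) = \alpha (x - x_0)^2 + f(x_0). \tag{18} $$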
Likewise, the quadratic form in Eq. (4) can be
written as