Correlations and Contours

Figure 1: One sigma error ellipse, showing a strong correlation between the parameters $a_{1}$ and $a_{2}$. The ellipse is obtained from Eq. (19) by setting $\Delta \chi ^2 = \chi ^2 - \chi ^2_0 = 1$, i.e. one unit above the best value of $\chi ^2$.

As we have seen in the previous sections, with linear least squares, the chi square surface is a quadratic form given by Eq. (19). The contour lines of constant $\chi ^2$ are concentric ellipses, centered at the best fit value. It can be shown that a one-standard-deviation change in the parameter values raises $\chi ^2$ one unit above the minimum. This $1 \sigma$ contour is called the error ellipse. If the mixed term in $\Delta m \Delta b$ vanishes, then the major and minor axes of the ellipses are parallel to the parameter axes. If not, then the ellipse is rotated with respect to the axes. If the ellipses are particularly eccentric and rotated, as shown in Fig. 1, we see that it takes a large change in parameter values along the major axis to reach the $1 \sigma$ contour, while a much smaller change across it already does so. This implies that the parameter values are strongly correlated. The parameter $\rho$ in Eq. (31) measures this correlation. It satisfies $-1 \le \rho \le 1$. A small or zero value implies little correlation. A value close to $1$ corresponds to a strong correlation, and a value close to $-1$ to a strong anticorrelation.
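
To make the geometry concrete, here is a minimal Python sketch that starts from a 2x2 parameter covariance matrix and extracts the correlation coefficient together with the semi-axes and tilt of the $\Delta \chi ^2 = 1$ ellipse. The matrix entries are invented for illustration, the function name `error_ellipse` is hypothetical, and $\rho$ is computed with the usual definition (covariance divided by the product of the standard deviations), which is assumed to agree with Eq. (31).

```python
import numpy as np

def error_ellipse(cov):
    """Return (rho, semi_major, semi_minor, angle) for the Delta chi^2 = 1 ellipse.

    cov is the 2x2 covariance matrix of the two fit parameters.
    angle is the tilt of the major axis from the first parameter axis,
    in radians, folded into [0, pi).
    """
    cov = np.asarray(cov, dtype=float)
    sigma1, sigma2 = np.sqrt(np.diag(cov))
    # Usual correlation coefficient (assumed to match Eq. (31)).
    rho = cov[0, 1] / (sigma1 * sigma2)
    # The ellipse dx^T cov^{-1} dx = 1 has semi-axes sqrt(eigenvalue of cov).
    # eigh returns the eigenvalues in ascending order.
    evals, evecs = np.linalg.eigh(cov)
    semi_minor, semi_major = np.sqrt(evals)
    major = evecs[:, 1]
    # Eigenvectors have an arbitrary sign, so fold the angle into [0, pi).
    angle = np.arctan2(major[1], major[0]) % np.pi
    return rho, semi_major, semi_minor, angle

# Hypothetical covariance matrix with strongly correlated parameters.
cov = np.array([[1.0, 0.9],
                [0.9, 1.0]])
rho, a, b, angle = error_ellipse(cov)
```

For this example matrix the ellipse is tilted at 45 degrees and is much longer than it is wide, which is the situation sketched in Fig. 1. Setting the off-diagonal entries to zero gives $\rho = 0$ and axes parallel to the parameter axes.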

Figure 2: A linear fit producing strong correlations between slope and intercept. The two lines represent small departures from the best fit. An increase in slope can be compensated by a decrease in intercept.

An example of a situation that leads to such a correlation would be a linear fit to a cluster of $x,y$ points that are all distant from the $y$ axis, as in Fig. 2. The $y$ intercept is especially uncertain, because it is far from the points. But the value of the intercept is strongly correlated with the slope, since any reasonably good fit must be a line that passes through the points. A small change in the slope of this line requires a compensating change in the intercept. If we look at the Hessian matrix for a linear $\chi ^2$ fit, we find that in precisely this situation, the off-diagonal terms are nearly as large in magnitude as the geometric mean of the diagonal terms, so $|\rho|$ is close to 1.
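
The situation described above can be checked numerically. The following Python sketch assumes a straight-line model $y = m x + b$ with equal, uncorrelated uncertainties on every $y$ value. It builds the curvature matrix (half the Hessian of $\chi ^2$), inverts it to get the covariance matrix, and evaluates the slope-intercept correlation. The function name `slope_intercept_correlation` and the $x$ values are hypothetical, chosen only for illustration. For a linear fit the result depends only on the $x$ positions, not on the measured $y$ values.

```python
import numpy as np

def slope_intercept_correlation(x, sigma=1.0):
    """Correlation between slope m and intercept b for a fit of y = m*x + b.

    Assumes the same uncertainty sigma on every y value.
    """
    x = np.asarray(x, dtype=float)
    # Design matrix, with columns ordered as (m, b).
    A = np.column_stack([x, np.ones_like(x)]) / sigma
    # Curvature matrix: half the Hessian of chi^2 for a linear model.
    alpha = A.T @ A
    # The parameter covariance matrix is the inverse of the curvature matrix.
    cov = np.linalg.inv(alpha)
    return cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])

# A cluster of points far from the y axis ...
x_far = np.linspace(99.0, 101.0, 11)
# ... and the same cluster shifted so that it is centered on x = 0.
x_centered = x_far - x_far.mean()

rho_far = slope_intercept_correlation(x_far)
rho_centered = slope_intercept_correlation(x_centered)
```

Here `rho_far` comes out very close to $-1$, since raising the slope must be compensated by lowering the intercept, while `rho_centered` is essentially zero. Under these assumptions, shifting the origin of $x$ to the center of the data removes the correlation without changing the fitted line.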