The two methods we described above have problems. (1) The steepest descent method has no good way to determine the length of the step. (2) Newton's method is based on solving the linear system in Eq. (41), and the matrix to be inverted can be singular. (3) Moreover, unless it is started close to the minimum, Newton's method sometimes leads to divergent oscillations that move away from the answer. That is, it overshoots, then overcompensates, and so on. We discuss the cure for the second problem and then show how the first and third can be used to cure each other.
The standard cure for the singular matrix problem is to drop the second partials in the expression for the matrix, so that only products of first derivatives remain.
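As a sketch in one common notation (the symbols $y(x_i;a)$, $\sigma_i$, and $\alpha_{kl}$ are assumptions here and may differ from those used in Eq. (41)), half the second-derivative matrix of $\chi^2 = \sum_i [y_i - y(x_i;a)]^2/\sigma_i^2$ and its approximation with the second partials dropped would read:

```latex
% Assumed notation: y(x_i; a) is the model, sigma_i the data errors.
\alpha_{kl} \equiv \frac{1}{2}\frac{\partial^2 \chi^2}{\partial a_k\,\partial a_l}
  = \sum_i \frac{1}{\sigma_i^2}\left[
      \frac{\partial y(x_i;a)}{\partial a_k}\,\frac{\partial y(x_i;a)}{\partial a_l}
      - \bigl(y_i - y(x_i;a)\bigr)\,
        \frac{\partial^2 y(x_i;a)}{\partial a_k\,\partial a_l}
    \right]
\quad\longrightarrow\quad
\alpha_{kl} \approx \sum_i \frac{1}{\sigma_i^2}\,
      \frac{\partial y(x_i;a)}{\partial a_k}\,\frac{\partial y(x_i;a)}{\partial a_l}
```

The dropped term is weighted by the residuals $y_i - y(x_i;a)$, so it tends to be small when the fit is good.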
The problem with overshooting can be solved by a method of Levenberg and Marquardt that combines steepest descent with Newton. The idea is to modify the matrix once more by enlarging its diagonal by an amount controlled by a parameter $\lambda$.
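In one common notation (assumed here, with $\alpha_{kl}$ the matrix built from first derivatives, $\beta_k = -\tfrac{1}{2}\,\partial\chi^2/\partial a_k$, and $\delta a_l$ the parameter step), two standard choices for the modified matrix are sketched below. Which of the two the text intends is not something this sketch can settle.

```latex
% Marquardt's multiplicative form (diagonal scaled by 1 + lambda):
\alpha'_{kl} = \alpha_{kl}\,\bigl(1 + \lambda\,\delta_{kl}\bigr)
% Levenberg's additive form:
\qquad\text{or}\qquad
\alpha'_{kl} = \alpha_{kl} + \lambda\,\delta_{kl}
% In either case the step solves
\qquad\text{with}\qquad
\sum_l \alpha'_{kl}\,\delta a_l = \beta_k .
```

For $\lambda \to 0$ this reduces to the Newton step. For large $\lambda$ the diagonal dominates, and in the multiplicative form $\delta a_k \approx \beta_k/(\lambda\,\alpha_{kk})$, a downhill step whose length shrinks like $1/\lambda$.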
So how do we pick $\lambda$? We could start the Levenberg-Marquardt scheme with a small value of $\lambda$. Then we are doing Newton. If we overshoot the minimum, we would get an increase in $\chi^2$. Then we back up and increase $\lambda$ by, say, a factor of 10 and try again. Eventually $\lambda$ may be very large. Then the method is more like steepest descent with a step size that gets smaller each time we increase $\lambda$. So by continuing to increase $\lambda$ we are guaranteed to decrease $\chi^2$ eventually. If $\chi^2$ decreases, on the other hand, we decrease $\lambda$ by a factor of 10. In this way the method tunes $\lambda$ to the circumstances. As we approach the minimum, we expect Newton to do better, so we should find, finally, that a successful search concludes with a small value of $\lambda$.
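The adjustment rule just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production fitter: the exponential model, the synthetic noise-free data, the function names (`model`, `chi2`, `beta_and_alpha`, `solve2`, `levenberg_marquardt`), the starting value of the damping parameter, and the tolerance are all hypothetical choices, and the diagonal is scaled by one plus the damping parameter (Marquardt's multiplicative form).

```python
import math

def model(x, a, b):
    """Illustrative two-parameter model y = a * exp(-b * x) (an assumption)."""
    return a * math.exp(-b * x)

def chi2(params, xs, ys, sigmas):
    a, b = params
    return sum(((y - model(x, a, b)) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))

def beta_and_alpha(params, xs, ys, sigmas):
    """Return beta_k = -(1/2) d(chi2)/da_k and alpha_kl with second partials dropped."""
    a, b = params
    beta = [0.0, 0.0]
    alpha = [[0.0, 0.0], [0.0, 0.0]]
    for x, y, s in zip(xs, ys, sigmas):
        e = math.exp(-b * x)
        grad = [e, -a * x * e]              # dy/da, dy/db
        resid = y - a * e
        for k in range(2):
            beta[k] += resid * grad[k] / s ** 2
            for l in range(2):
                alpha[k][l] += grad[k] * grad[l] / s ** 2
    return beta, alpha

def solve2(m, v):
    """Solve the 2x2 linear system m x = v by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(v[0] * m[1][1] - m[0][1] * v[1]) / det,
            (m[0][0] * v[1] - m[1][0] * v[0]) / det]

def levenberg_marquardt(params, xs, ys, sigmas, lam=1e-3, tol=1e-8, max_iter=200):
    current = chi2(params, xs, ys, sigmas)
    for _ in range(max_iter):
        beta, alpha = beta_and_alpha(params, xs, ys, sigmas)
        # Scale the diagonal by (1 + lam): small lam is Newton-like,
        # large lam gives a short downhill step.
        m = [[alpha[0][0] * (1 + lam), alpha[0][1]],
             [alpha[1][0], alpha[1][1] * (1 + lam)]]
        step = solve2(m, beta)
        trial = [params[0] + step[0], params[1] + step[1]]
        new = chi2(trial, xs, ys, sigmas)
        if new < current:
            # Success: accept the step and relax the damping.
            converged = (current - new) < tol
            params, current = trial, new
            lam /= 10.0
            if converged:
                break
        else:
            # Overshoot: back up (keep the old params) and damp harder.
            lam *= 10.0
    return params, current, lam

# Usage on noise-free synthetic data generated with a = 2.0, b = 0.7.
xs = [0.5 * i for i in range(9)]
ys = [model(x, 2.0, 0.7) for x in xs]
sigmas = [0.05] * len(xs)
fit, chi2_min, lam_final = levenberg_marquardt([1.0, 0.3], xs, ys, sigmas)
print(fit, chi2_min, lam_final)
```

The stopping test here (an accepted step that lowers the objective by less than `tol`) is only one possible choice; a rejected step never ends the search, it only raises the damping.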
How much should we fuss over a determination of the minimum of $\chi^2$? As far as the determination of the parameters is concerned, we should really be content with getting $\chi^2$ to within one of its true minimum. The reason is that an increase in $\chi^2$ by one corresponds roughly to a change of one standard deviation in the fitting parameters, so it is “in the noise”. However, the estimate of the error in the parameters through formula (45) is often quite sensitive to how close we are to the true minimum, so it is a good idea to adjust the stopping criterion for the minimization in accordance with the resulting effect on the error estimate.
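The "plus one" rule can be checked numerically in the simplest case. The sketch below assumes a one-parameter fit of a constant to data with equal, uncorrelated errors; the data values and names (`chi2_const`, `a_best`, `sigma_a`) are made up for illustration. In this special case the rise is one exactly, up to rounding; with several correlated parameters the statement holds only in the rough sense used above.

```python
import math

def chi2_const(a, ys, sigma):
    """Chi-square for fitting a constant a to data ys with a common error sigma."""
    return sum(((y - a) / sigma) ** 2 for y in ys)

ys = [9.8, 10.1, 10.4, 9.9, 10.3]      # hypothetical measurements
sigma = 0.2                             # hypothetical common error

a_best = sum(ys) / len(ys)              # the mean minimizes chi-square here
sigma_a = sigma / math.sqrt(len(ys))    # standard deviation of the fitted constant

# Shift the parameter by one standard deviation and measure the rise.
delta = chi2_const(a_best + sigma_a, ys, sigma) - chi2_const(a_best, ys, sigma)
print(delta)
```

The cross term in the difference vanishes because the residuals sum to zero at the mean, leaving $N\sigma_a^2/\sigma^2 = 1$.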