The two methods described above both have problems. (1) The steepest descent method has no good way to determine the length of the step. (2) Newton's method is based on solving the linear system in Eq. (35), and the matrix to be inverted can be singular. (3) Moreover, unless it is started close to the minimum, Newton's method sometimes leads to divergent oscillations that move away from the answer: it overshoots, then overcompensates, and so on. We first discuss the cure for the second problem and then show how the first and third can be used to cure each other.
The standard cure to the singular matrix problem is to drop the second
partials in the expression for the matrix of the linear system in
Eq. (35), which gives the approximate matrix of Eq. (40).
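
For orientation, the approximation can be written out in one common notation. The symbols here are assumptions, since this section does not repeat the definitions behind Eq. (35): $y_i$ are the data with errors $\sigma_i$, $f_i = f(x_i; a)$ is the model, and $\chi^2 = \sum_i (y_i - f_i)^2/\sigma_i^2$. Half the matrix of second partials of $\chi^2$ then splits into a first-derivative term and a term containing the second partials of $f$, and the latter is the one dropped:

```latex
\frac{1}{2}\frac{\partial^2 \chi^2}{\partial a_j\,\partial a_k}
  = \sum_i \frac{1}{\sigma_i^2}
    \left[ \frac{\partial f_i}{\partial a_j}\frac{\partial f_i}{\partial a_k}
         - (y_i - f_i)\,\frac{\partial^2 f_i}{\partial a_j\,\partial a_k} \right]
  \;\approx\;
    \sum_i \frac{1}{\sigma_i^2}\,
    \frac{\partial f_i}{\partial a_j}\frac{\partial f_i}{\partial a_k}
```

The surviving term is a sum of outer products, so it has no negative eigenvalues, and the dropped term is weighted by the residuals $y_i - f_i$, which tend to average out for a good fit.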
The problem with overshooting can be solved by a method of Levenberg
and Marquardt that combines steepest descent with Newton's method. The
idea is to modify the matrix once more so that it takes the form given
in Eqs. (41) and (42).
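
As a sketch, one widely used form of the Levenberg-Marquardt modification scales the diagonal of the matrix by $1 + \lambda$. Whether this matches Eqs. (41) and (42) term by term is an assumption, and the names $M$ (the matrix of the Newton linear system with second partials dropped), $\beta$, and $\delta a$ (the parameter step) are chosen here for illustration:

```latex
M'_{jj} = M_{jj}\,(1 + \lambda), \qquad
M'_{jk} = M_{jk} \quad (j \neq k), \qquad
\sum_k M'_{jk}\,\delta a_k = \beta_j, \qquad
\beta_j = -\frac{1}{2}\frac{\partial \chi^2}{\partial a_j}
```

For $\lambda \to 0$ this reduces to the Newton step. For $\lambda \gg 1$ the diagonal dominates and $\delta a_j \approx \beta_j/(\lambda M_{jj})$, a step along the (rescaled) downhill gradient whose length shrinks as $1/\lambda$.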
So how do we pick the parameter $\lambda$? We could start the
Levenberg-Marquardt scheme with a small value of $\lambda$. Then we are
essentially doing Newton's method. If we overshoot the minimum, we get
an increase in $\chi^2$. Then we back up, increase $\lambda$ by, say, a
factor of 10, and try again. Eventually $\lambda$ may be very large.
Then the method is more like steepest descent with a step size that
gets smaller each time we increase $\lambda$. So by continuing to
increase $\lambda$ we are guaranteed to decrease $\chi^2$ eventually.
If $\chi^2$ decreases, on the other hand, we decrease $\lambda$ by a
factor of 10.
In this way the method tunes $\lambda$ to the circumstances. As we
approach the minimum, we expect Newton's method to do better, so we
should find, finally, that a successful search concludes with a small
value of $\lambda$.
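
The tuning strategy just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names (`chi_square`, `lm_fit`, `model`, `model_jac`), the starting value `lam=1e-3`, the tolerance `tol`, the diagonal scaling by `1 + lam`, and the exponential test model are all hypothetical choices made for the example.

```python
import numpy as np

def chi_square(f, a, x, y, sigma):
    """Sum of squared, error-weighted residuals."""
    r = (y - f(x, a)) / sigma
    return float(r @ r)

def lm_fit(f, jac, a0, x, y, sigma, lam=1e-3, tol=1e-8, max_iter=200):
    """Levenberg-Marquardt loop with the factor-of-10 tuning of lam.

    f(x, a) returns the model values; jac(x, a) returns the (N, P) array
    of first derivatives of the model with respect to the parameters.
    """
    a = np.asarray(a0, dtype=float)
    chi2 = chi_square(f, a, x, y, sigma)
    for _ in range(max_iter):
        r = (y - f(x, a)) / sigma            # weighted residuals
        J = jac(x, a) / sigma[:, None]       # weighted first derivatives
        M = J.T @ J                          # second partials dropped
        beta = J.T @ r                       # minus half the gradient of chi2
        M_mod = M + lam * np.diag(np.diag(M))  # diagonal scaled by (1 + lam)
        try:
            da = np.linalg.solve(M_mod, beta)
        except np.linalg.LinAlgError:
            lam *= 10.0                      # singular: damp harder, retry
            continue
        chi2_new = chi_square(f, a + da, x, y, sigma)
        if chi2_new <= chi2:                 # success: accept, trust Newton more
            gain = chi2 - chi2_new
            a, chi2 = a + da, chi2_new
            lam /= 10.0
            if gain < tol:                   # illustrative stopping test
                break
        else:                                # overshoot: back up, damp harder
            lam *= 10.0
    return a, chi2, lam

# Hypothetical test problem: noise-free data from y = 2 exp(-0.5 x).
def model(x, a):
    return a[0] * np.exp(-a[1] * x)

def model_jac(x, a):
    e = np.exp(-a[1] * x)
    return np.column_stack([e, -a[0] * x * e])

x = np.linspace(0.0, 5.0, 20)
y = model(x, np.array([2.0, 0.5]))
sigma = np.full_like(x, 0.1)
a_fit, chi2_min, lam_final = lm_fit(model, model_jac, [1.0, 1.0], x, y, sigma)
print(a_fit, chi2_min, lam_final)
```

A rejected step leaves the parameters untouched and only raises `lam`, which is the "back up and try again" move in the text; an accepted step lowers `lam` so that the iteration drifts back toward Newton's method near the minimum.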
How much should we fuss over a determination of the minimum of
$\chi^2$? As far as the determination of the parameters is concerned,
we should really be content with getting $\chi^2$ below the true
minimum plus one. The reason is that an increase in $\chi^2$ by one
corresponds roughly to a change of one standard deviation in the
fitting parameters, so is ``in the noise''. However, the estimate of
the error in the parameters through the formula (39) is often quite
sensitive to how close we are to the true minimum, so it is a good
idea to adjust the stopping criterion for the minimization in
accordance with the resulting effect on the error estimate.
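
One way to act on this advice is to stop only when both $\chi^2$ and the error estimates have settled. The sketch below assumes that formula (39) amounts to taking the parameter errors from the diagonal of the inverse of the matrix used in the fit; that reading, the function names, and the thresholds `chi2_tol` and `err_rtol` are illustrative assumptions rather than values taken from the text.

```python
import numpy as np

def param_errors(M):
    """Error estimates, assumed here to be sqrt(diag(M^-1))."""
    return np.sqrt(np.diag(np.linalg.inv(M)))

def converged(chi2_old, chi2_new, M_old, M_new, chi2_tol=1.0, err_rtol=0.01):
    """Stop when chi2 barely drops AND the error estimates are stable."""
    small_drop = 0.0 <= chi2_old - chi2_new < chi2_tol
    e_old, e_new = param_errors(M_old), param_errors(M_new)
    stable = np.all(np.abs(e_new - e_old) <= err_rtol * e_new)
    return bool(small_drop and stable)
```

With `chi2_tol=1.0` alone the parameters would already be acceptable by the argument above; the second condition keeps iterating until the error estimates stop changing by more than the chosen fraction.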