## Steepest Descent

One way to minimize a general function $f(\mathbf{a})$ of several variables is the method of steepest descent. This method is based on the fact that the gradient of a function points uphill, directly opposite the “fall-line”, the path of steepest descent. The idea is to start somewhere, say at the starting vector $\mathbf{a}_0$, and then step off in the direction of steepest descent by an amount

$$\delta\mathbf{a} = -c\,\nabla f(\mathbf{a}_0) \qquad (38)$$

so that the new trial parameter vector is $\mathbf{a}_0 + \delta\mathbf{a}$. The problem with this method is that we don't know how far to go along the line of steepest descent, i.e. we don't know what to take for the constant $c$. Actually the problem isn't just one of choosing one constant: we might want to step with a different constant $c_i$ for each component $a_i$. We could make up an algorithm that chooses a constant or set of constants and then, if taking a step makes $f$ increase, backs up and takes a smaller step. However, we can do better. Let's first consider a different approach and then we'll return to this question.
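
The back-up-and-retry scheme described above can be sketched in Python. This is only a minimal illustration under stated assumptions, not the method these notes go on to develop: the names `steepest_descent`, `f`, `grad`, `a0`, `c`, and `shrink`, the choice of halving the step, and the quadratic test function are all hypothetical and chosen for the example.

```python
import numpy as np

def steepest_descent(f, grad, a0, c=1.0, shrink=0.5, tol=1e-8, max_iter=10_000):
    """Minimize f by stepping along -grad(a), shrinking the step when f rises.

    f      : function of a parameter vector, returns a scalar (assumed name)
    grad   : function returning the gradient of f at a (assumed name)
    a0     : starting parameter vector
    c      : initial trial step constant tried at every iteration
    shrink : factor applied to the step each time a trial step fails
    """
    a = np.asarray(a0, dtype=float)
    fa = f(a)
    for _ in range(max_iter):
        g = grad(a)
        if np.linalg.norm(g) < tol:
            break                      # gradient is essentially zero: stop
        step = c
        while True:
            trial = a - step * g       # step along the direction of steepest descent
            f_trial = f(trial)
            if f_trial < fa:
                break                  # the function decreased: accept the step
            step *= shrink             # otherwise back up and try a smaller step
            if step < 1e-16:
                return a               # no decrease is possible at this precision
        a, fa = trial, f_trial
    return a

# Hypothetical test function: an elongated bowl with its minimum at (1, -2).
def f_example(a):
    return (a[0] - 1.0) ** 2 + 10.0 * (a[1] + 2.0) ** 2

def grad_example(a):
    return np.array([2.0 * (a[0] - 1.0), 20.0 * (a[1] + 2.0)])

a_min = steepest_descent(f_example, grad_example, a0=[0.0, 0.0])
print(a_min)
```

The sketch uses a single constant for all components and accepts any step that lowers the function. It still has to guess the step size by trial and error, which is the weakness the text points out.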