A great many matrices (more generally linear operators) are characterized by their eigenvalues and eigenvectors. They play a crucial role in all branches of science and engineering. Most of the time, finding them requires resorting to numerical methods. So we discuss some simpler methods.

Generally speaking, eigenvalues of a square matrix are roots of
the so-called characteristic polynomial:

That is, start with the matrix and modify it by subtracting the same variable from each diagonal element. Then calculate the determinant of the resulting matrix and you get a polynomial. Here is how it works using a matrix:

The three zeros of this cubic polynomial are , so this matrix has three distinct eigenvalues. For an matrix we get a polynomial of degree . Why? It is easy to see if we remember from the previous section that the determinant is a sum over products of matrix elements. One of those products runs down the diagonal. Since each diagonal element has a in it, the diagonals alone give you a polynomial of degree . The other products have fewer diagonal elements, so they can't increase the degree of the polynomial. beyond .

A polynomial of degree has roots, although some of them may
appear more than once. Call the zeros of the characteristic
polynomial, *i.e. *, the eigenvalues, . If we factor
the polynomial in terms of its roots, we get

Notice that the determinant of the matrix itself is the value of the characteristic polynomial at . Plugging in into the factored expression above leads to the result that the determinant of the matrix is the product of its eigenvalues.

When a matrix has a zero determinant, we can find a nontrivial (column
vector) solution to the equation

or

This is the standard equation for eigenvalue and eigenvector . There can be as many as linearly independent solutions to this equation as follows

Notice that the eigenvector is not unique. We can multiply both sides of the equation by a constant to see that if is a solution for eigenvalue , so is .

Often we deal with real symmetric matrices (the transpose of the
matrix is equal to the itself). In that case the eigenvectors form a
complete set of orthogonal vectors. They can be used to define the
directions of coordinate axes, so we can write any dimensional
vector as a linear combination

where the coefficient is the component of the vector in the direction .

More generally, if there are linearly independent eigenvectors , this is also possible. Then we have a simple method for finding the eigenvalue with the largest magnitude, namely the ``power method.''

The power method originates from the general statement that we can use
the eigenvectors of a matrix to represent any vector :

We multiply by and get

So we get a new vector whose coefficients are each multiplied by the corresponding eigenvalue: . The coefficients with the larger eigenvalues get bigger compared with the coefficients with smaller eigenvalues. So let's say we have sorted the eigenvalues so the one with smallest magnitude is , and the one with the largest magnitude is . If we multiply by times, the coefficients become . If we keep going, the th term (corresponding to the largest eigenvalue) will eventually swamp all the others and we get (for very large )

So we get an eigenvector corresponding to the largest eigenvalue. Another way of saying this is that when we hit the vector with the matrix we get a new vector that tends to point more in the direction of the leading eigenvector . The more factors of we pile on, the more precisely it points in that direction.

So how do we get the eigenvalue if we have an eigenvector pointing
in the direction of ? If it points in that direction, we must
have . Then remember that eigenvectors satisfy

That is, the vector has every component of multiplied by the eigenvalue . We can use any of the components to read off the answer.

To turn this process into a practical algorithm, we normalize the
vector after each multiplication by . Normalization simply means
multiplying by a constant to put the vector in our favorite standard
form. A common normalization is to divide by the Cartesian norm so we
get a vector of unit length. The normalization we will use here is
dividing the whole vector by the component with the largest magnitude.
If we take the absolute values of the components, the largest one is
called the *infinity norm*. So we divide by the infinity norm if
that component is positive and by minus the infinity norm if it is
negative. Here is an example. Suppose we have

The infinity norm is 3. We divide by 3 to normalize to get :

Let's call the component with the largest magnitude the ``leading component''. In the example, the leading component in is the first one and it is positive. If it was , instead, we'd divide by . The goal is to get a vector proportioanl to the original vector, but with one component equal to and with the rest of the components no larger in magnitude than 1.

The reason we pick this way of normalizing the vector is that we can then easily read off the eigenvalue by looking at what happens to the leading component when we multiply by . Also, the infinity norm is cheaper to compute, since it doesn't require any arithmetic - just comparisons.

Suppose, after normalizing to get , we multiply by one more
time and get

We can read off the eigenvalue from the leading component. It is 2. Of course, we could check every component to see that each one got multiplied by 2.

So here is the power algorithm

Start with any arbitrary vector y. Repeat steps 1 - 3 until convergence. Step 1: Normalize y to get x. The leading component is now 1. Step 2: Compute y = Ax. Step 3: The approximate eigenvalue is the new leading component.Convergence means that when you normalize , the new is close enough to the old to declare victory. You choose what is close enough to suit the requirements of your problem.

Notice that with the power algorithm, you can start with any arbitrary vector and the method will converge to the same result. Well, that is almost any starting vector. You might be unlucky and pick a starting vector that has a zero component along the leading eigenvector . But numerical calculations are usually subject to roundoff error, so even if you unwittingly started witn , chances are very good that after a few hits with the matrix , you develop a tiny nonzero , and then it is just a matter of time before its coefficient grows to dominate the iteration.

The inverse power method works with the inverse , assuming it
exists. It is easy to check that the eigenvectors of the matrix
are also eigenvectors of its inverse, but the eigenvalues are the
algebraic inverses:

where . So now the eigenvalue with the largest magnitude corresponds to the eigenvalue with the

So we can get the largest and smallest eigenvalues. How do we get the
ones between? For a matrix whose eigenvalues are all real, we can do
this by generalizing the inverse power method. We take the inverse of
the shifted matrix , where is any number we like. (We
intend to vary .) The eigenvectors of this matrix are still the
same as the eigenvectors of :

where, now, . Which is the largest ? It depends on . If is close to one of the 's, then is maximum for that . So if we hold that fixed and run the power method, we eventually get the eigenvector . Then we change and rerun the power method. It's like tuning a radio dial. As gets close to a new eigenvalue, we get the next ``broadcast station'',

Clearly, if any of the eigenvalues are complex, we would have a lot of searching to do, because we'd need to search the entire complex plane, and not just the real line interval between and . There are better methods.

The power and inverse power method are simple and very easy to implement, but if you want all the eigenvalues, those methods are very inefficient. There are other, much more sophisticated and efficient methods, though. Here we describe in broad terms the Householder/ algorithm for real symmetric matrices. For details, please see standard texts in numerical methods.

A real symmetric matrix can be put into diagonal form by a real
orthogonal similarity transform. In other words, there exists a real
orthogonal matrix such that the product (similarity transform)

(1) |

Of course, finding the transform is a challenge. With the
Householder/ algorithm it is done through an iterative process
that eventually converges to the answer. The first step in the
process, the Householder step, is to find an orthogonal similarity
transform that puts the matrix in tridiagonal form. This can be done
exactly in a finite number of steps. The Householder method finds a
matrix that is not only orthogonal, it is symmetric ():

(2) |

In the next phase, the phase, we apply a succession of orthogonal
similarity transforms on the tridiagonal matrix that make
the off-diagonal values smaller. Eventually they become small enough
that we can say it is diagonal for all intents and purposes. The
first similarity transform is applied to the tridiagonal matrix :

(3) |

(4) |

(5) |

(6) |

So where does the in the algorithm come in? At each step the
tridiagonal matrix is factored into a product of an
orthogonal matrix and an upper triangular matrix

(7) |

(8) |

As we can see, finding the eigenvalues this way takes some work. It turns out that the rate of convergence depends on the spacing of the eigenvalues. If two eigenvalues are very close to each other (compared with their average spacing), more iterations are needed to get convergence. If they are well separated, fewer iterations are required.

The algorithm also works with (complex) Hermitian matrices (). This covers a vast number of cases in science and engineering. The eigenvalues are still real, but the similarity transform is unitary ( ). Here the dagger means the complex conjugate transpose.