A great many matrices (more generally, linear operators) are characterized by their eigenvalues and eigenvectors. These play a crucial role in all branches of science and engineering. Most of the time, finding them requires resorting to numerical methods, so here we discuss some of the simpler ones.
Generally speaking, the eigenvalues of a square matrix $A$ are the roots of
the so-called characteristic polynomial:
$$p(\lambda) = \det(A - \lambda I) = 0 .$$
A polynomial of degree $n$ has $n$ roots, although some of them may
appear more than once. Call the zeros of the characteristic
polynomial, i.e. the eigenvalues, $\lambda_1, \lambda_2, \ldots, \lambda_n$. If we factor
the polynomial in terms of its roots, we get
$$p(\lambda) = (-1)^n (\lambda - \lambda_1)(\lambda - \lambda_2)\cdots(\lambda - \lambda_n) .$$
When the matrix $A - \lambda I$ has a zero determinant, we can find a nontrivial (column
vector) solution $x$ to the equation
$$(A - \lambda I)\, x = 0 , \qquad \text{i.e.} \qquad A x = \lambda x .$$
Often we deal with real symmetric matrices (the transpose of the
matrix is equal to the matrix itself, $A^T = A$). In that case the eigenvectors form a
complete set of orthogonal vectors. They can be used to define the
directions of coordinate axes, so we can write any $n$-dimensional
vector $y$ as a linear combination
$$y = \sum_{i=1}^{n} c_i x_i .$$
More generally, if there are $n$ linearly independent eigenvectors $x_i$, this is also possible. Then we have a simple method for finding the eigenvalue with the largest magnitude, namely the ``power method.''
The power method originates from the general statement that we can use
the eigenvectors $x_i$ of a matrix $A$ to represent any vector $y$:
$$y = \sum_{i=1}^{n} c_i x_i .$$
Repeated multiplication by $A$ then gives
$$A^k y = \sum_{i=1}^{n} c_i \lambda_i^k x_i ,$$
and if $\lambda_1$ is the eigenvalue of largest magnitude, the term $c_1 \lambda_1^k x_1$ eventually dominates the sum, so $A^k y$ points more and more in the direction of $x_1$. So how do we get the eigenvalue if we have an eigenvector pointing
in the direction of $x_1$? If it points in that direction, we must
have $y = c\, x_1$ for some constant $c$. Then remember that eigenvectors satisfy
$$A x_1 = \lambda_1 x_1 ,$$
so multiplying $y$ by $A$ simply scales it by $\lambda_1$.
To turn this process into a practical algorithm, we normalize the
vector after each multiplication by $A$. Normalization simply means
multiplying by a constant to put the vector in our favorite standard
form. A common normalization is to divide by the Cartesian norm so we
get a vector of unit length. The normalization we will use here is
dividing the whole vector by the component with the largest magnitude.
If we take the absolute values of the components, the largest one is
called the infinity norm. So we divide by the infinity norm if
that component is positive and by minus the infinity norm if it is
negative. For example, if the component of $y$ with the largest magnitude is $y_k$, then we set
$$x = \frac{1}{y_k}\, y , \qquad \text{so that } x_k = 1 .$$
The reason we pick this way of normalizing the vector is that we can then easily read off the eigenvalue by looking at what happens to the leading component when we multiply by $A$. Also, the infinity norm is cheap to compute, since it doesn't require any arithmetic, just comparisons.
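To make the normalization concrete, here is a small sketch in Python with NumPy (the vector is illustrative, not from the original text):

```python
import numpy as np

# Infinity-norm normalization: divide by the signed component of
# largest magnitude, so the leading component becomes exactly 1.
y = np.array([3.0, -6.0, 2.0])         # illustrative vector (an assumption)
lead = y[np.argmax(np.abs(y))]         # signed leading component: -6.0
x = y / lead                           # x = [-0.5, 1.0, -1/3]
inf_norm = np.linalg.norm(y, np.inf)   # 6.0: found by comparisons alone
```

Note that `lead` is $-6$ here, i.e. minus the infinity norm, because the component of largest magnitude is negative.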
Suppose, after normalizing to get $x$, we multiply by $A$ one more
time and get
$$y = A x .$$
If $x$ is already close to the eigenvector $x_1$, then $y \approx \lambda_1 x$, and since the leading component of $x$ is $1$, the leading component of $y$ is approximately the eigenvalue $\lambda_1$.
So here is the power algorithm. Start with any arbitrary vector $y$ and repeat steps 1-3 until convergence.

Step 1: Normalize $y$ to get $x$. The leading component is now 1.
Step 2: Compute $y = Ax$.
Step 3: The approximate eigenvalue is the new leading component.

Convergence means that when you normalize $y$, the new $x$ is close enough to the old $x$ to declare victory. You choose what is close enough to suit the requirements of your problem.
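The steps above can be sketched as follows in Python with NumPy (the function name and the test matrix are our own illustrative choices):

```python
import numpy as np

def power_method(A, y0, tol=1e-10, max_iter=1000):
    """Power method with the normalization described above:
    divide by the signed component of largest magnitude."""
    y = np.asarray(y0, dtype=float)
    x_old = np.zeros_like(y)
    for _ in range(max_iter):
        # Step 1: normalize so the leading component of x becomes 1.
        lead = y[np.argmax(np.abs(y))]
        x = y / lead
        # Converged when the normalized vector stops changing.
        if np.linalg.norm(x - x_old) < tol:
            break
        x_old = x
        # Step 2: multiply by the matrix again.
        y = A @ x
    # Step 3: the leading component of y = A x approximates the
    # eigenvalue of largest magnitude.
    return y[np.argmax(np.abs(y))], x

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])     # illustrative: eigenvalues 3 and 1
lam, x = power_method(A, [1.0, 0.0])
```

Here `lam` converges to the dominant eigenvalue $3$ and `x` to the corresponding eigenvector $(1, 1)$ in normalized form.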
Notice that with the power algorithm, you can start with any arbitrary vector and the method will converge to the same result. Well, that is almost any starting vector. You might be unlucky and pick a starting vector that has a zero component $c_1$ along the leading eigenvector $x_1$. But numerical calculations are usually subject to roundoff error, so even if you unwittingly started with $c_1 = 0$, chances are very good that after a few hits with the matrix $A$, you develop a tiny nonzero $c_1$, and then it is just a matter of time before that component grows to dominate the iteration.
The inverse power method works with the inverse $A^{-1}$, assuming it
exists. It is easy to check that the eigenvectors of the matrix $A$
are also eigenvectors of its inverse, but the eigenvalues are the reciprocals:
$$A^{-1} x_i = \frac{1}{\lambda_i}\, x_i .$$
So we can get the largest and smallest eigenvalues. How do we get the
ones in between? For a matrix whose eigenvalues are all real, we can do
this by generalizing the inverse power method. We take the inverse of
the shifted matrix $A - sI$, where $s$ is any number we like. (We
intend to vary $s$.) The eigenvectors of this matrix are still the
same as the eigenvectors of $A$:
$$(A - sI)^{-1} x_i = \frac{1}{\lambda_i - s}\, x_i .$$
The power method applied to $(A - sI)^{-1}$ therefore converges to the eigenvalue closest to $s$, so by varying the shift we can pick out each eigenvalue in turn.
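A sketch of the shifted inverse power method in Python with NumPy (names are our own; rather than forming the inverse explicitly, it solves a linear system at each step, a standard practice not spelled out in the text):

```python
import numpy as np

def shifted_inverse_power(A, s, y0, tol=1e-10, max_iter=500):
    """Power method applied to (A - sI)^{-1}: converges to the
    eigenvalue of A closest to the shift s."""
    B = A - s * np.eye(A.shape[0])
    y = np.asarray(y0, dtype=float)
    x_old = np.zeros_like(y)
    mu = 1.0
    for _ in range(max_iter):
        lead = y[np.argmax(np.abs(y))]   # signed leading component
        x = y / lead
        mu = lead                        # estimates 1/(lambda - s)
        if np.linalg.norm(x - x_old) < tol:
            break
        x_old = x
        y = np.linalg.solve(B, x)        # y = (A - sI)^{-1} x
    return s + 1.0 / mu                  # invert mu = 1/(lambda - s)

# Illustrative diagonal matrix with eigenvalues 1, 4, 10; the shift
# s = 3.5 is closest to the eigenvalue 4.
lam = shifted_inverse_power(np.diag([1.0, 4.0, 10.0]), 3.5, [1.0, 1.0, 1.0])
```

Solving $(A - sI)\,y = x$ each iteration costs the same as one back-substitution once the matrix is factored, which is why practical codes never form the inverse.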
Clearly, if any of the eigenvalues are complex, we would have a lot of searching to do, because we'd need to search the entire complex plane, and not just the real-line interval between the smallest and largest eigenvalues. There are better methods.
The power and inverse power methods are simple and very easy to implement, but if you want all the eigenvalues, they are very inefficient. There are other, much more sophisticated and efficient methods, though. Here we describe in broad terms the Householder/QR algorithm for real symmetric matrices. For details, please see standard texts in numerical methods.
A real symmetric matrix $A$ can be put into diagonal form by a real
orthogonal similarity transform. In other words, there exists a real
orthogonal matrix $V$ such that the product (similarity transform)
$$V^T A V = D$$
is a diagonal matrix $D$, whose diagonal elements are the eigenvalues of $A$.
Of course, finding the transform is a challenge. With the
Householder/QR algorithm it is done through an iterative process
that eventually converges to the answer. The first step in the
process, the Householder step, is to find an orthogonal similarity
transform that puts the matrix in tridiagonal form. This can be done
exactly in a finite number of steps. The Householder method finds a
matrix $P$ that is not only orthogonal, it is symmetric ($P^T = P$):
$$P A P = T ,$$
where $T$ is tridiagonal, i.e. its only nonzero elements lie on the main diagonal and the two neighboring diagonals.
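A minimal sketch of the Householder reduction to tridiagonal form, assuming NumPy; it applies one symmetric, orthogonal reflector per column, forming full matrices for clarity rather than efficiency (production codes work on vectors and never build the reflectors explicitly):

```python
import numpy as np

def householder_tridiagonalize(A):
    """Reduce a real symmetric matrix to tridiagonal form by a
    sequence of symmetric, orthogonal Householder reflectors H
    (H = H^T, H^2 = I), applied as similarity transforms H T H."""
    T = np.array(A, dtype=float)
    n = T.shape[0]
    for k in range(n - 2):
        # Reflector that zeros column k below the first subdiagonal.
        x = T[k + 1:, k]
        alpha = -np.sign(x[0] if x[0] != 0 else 1.0) * np.linalg.norm(x)
        v = x.copy()
        v[0] -= alpha                    # v = x - alpha*e1 (stable sign)
        if np.linalg.norm(v) < 1e-14:
            continue                     # column already in desired form
        v /= np.linalg.norm(v)
        H = np.eye(n)
        H[k + 1:, k + 1:] -= 2.0 * np.outer(v, v)
        T = H @ T @ H                    # orthogonal similarity transform
    return T

A = np.array([[4.0, 1.0, 2.0],          # illustrative symmetric matrix
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 1.0]])
T = householder_tridiagonalize(A)       # tridiagonal, same eigenvalues
```

Because each step is an orthogonal similarity transform, the eigenvalues of `T` are exactly those of `A`; only the off-tridiagonal entries have been annihilated.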
In the next phase, the QR phase, we apply a succession of orthogonal
similarity transforms to the tridiagonal matrix $T$ that make
the off-diagonal values smaller. Eventually they become small enough
that we can say the matrix is diagonal for all intents and purposes. The
first similarity transform is applied to the tridiagonal matrix $T$:
$$T_1 = Q_1^T T Q_1 .$$
So where does the QR in the algorithm come in? At each step the
tridiagonal matrix $T_k$ is factored into a product of an
orthogonal matrix $Q_k$ and an upper triangular matrix $R_k$,
$$T_k = Q_k R_k ,$$
and the next iterate is obtained by multiplying the factors in reverse order:
$$T_{k+1} = R_k Q_k = Q_k^T T_k Q_k .$$
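The QR phase can be sketched as follows in Python with NumPy (an unshifted iteration for simplicity; practical implementations add shifts and deflation for speed):

```python
import numpy as np

def qr_iteration(T, tol=1e-12, max_iter=10000):
    """Unshifted QR iteration: factor T_k = Q_k R_k, then form
    T_{k+1} = R_k Q_k = Q_k^T T_k Q_k.  Each step is an orthogonal
    similarity transform, so the eigenvalues are unchanged while
    the off-diagonal entries shrink toward zero."""
    T = np.array(T, dtype=float)
    for _ in range(max_iter):
        Q, R = np.linalg.qr(T)
        T = R @ Q
        # Stop when the off-diagonal part is negligible.
        off_diag = T - np.diag(np.diag(T))
        if np.max(np.abs(off_diag)) < tol:
            break
    return np.diag(T)                   # approximate eigenvalues

T = np.array([[2.0, 1.0, 0.0],         # illustrative tridiagonal matrix
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
eigs = qr_iteration(T)
```

For this example the diagonal converges to the eigenvalues $3 \pm \sqrt{3}$ and $3$; since they are well separated, convergence takes only a few dozen iterations.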
As we can see, finding the eigenvalues this way takes some work. It turns out that the rate of convergence depends on the spacing of the eigenvalues. If two eigenvalues are very close to each other (compared with their average spacing), more iterations are needed to get convergence. If they are well separated, fewer iterations are required.
The QR algorithm also works with (complex) Hermitian matrices ($A^\dagger = A$). This covers a vast number of cases in science and engineering. The eigenvalues are still real, but the similarity transform is unitary ($V^\dagger V = I$). Here the dagger means the complex-conjugate transpose.