<BODY > <HR><center> <table border=1> <td> <A NAME="tex2html36" HREF="node2.html"><img height=36 width=36 alt="PREVIOUS" src="../../../icons/latex2html/previous.gif"></A></td> <td> <A NAME="tex2html42" HREF="linalg.html"><img height=36 width=36 alt="UP" src="../../../icons/latex2html/up.gif"></A></td> <td> <A NAME="tex2html44" HREF="node4.html"><img height=36 width=36 alt="NEXT" src="../../../icons/latex2html/next.gif"></A></td> <td><a href="contents-node3.html"><img height=36 width=36 alt="CONTENTS" src="../../../icons/latex2html/toc.gif"></a></td> </table> </center> <HR> <!--End of Navigation Panel--> <H1><A NAME="SECTION00030000000000000000"> Eigenvalues and eigenvectors</A> </H1> <P> A great many matrices (more generally, linear operators) are characterized by their eigenvalues and eigenvectors. They play a crucial role in all branches of science and engineering. Most of the time, finding them requires resorting to numerical methods, so here we discuss some of the simpler ones. <P> <H2><A NAME="SECTION00031000000000000000"> Characteristic Polynomial</A> </H2> <P> Generally speaking, the eigenvalues of a square matrix <IMG WIDTH="16" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img66.png" ALT="$A$"> are the roots of the so-called characteristic polynomial: <BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} \det|A - \lambda I| = P(\lambda) = 0 \end{displaymath} --> <IMG WIDTH="167" HEIGHT="28" BORDER="0" SRC="img67.png" ALT="\begin{displaymath} \det\vert A - \lambda I\vert = P(\lambda) = 0 \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> That is, start with the matrix <IMG WIDTH="16" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img66.png" ALT="$A$"> and subtract the same variable <IMG WIDTH="14" HEIGHT="15" ALIGN="BOTTOM" BORDER="0" SRC="img68.png" ALT="$\lambda$"> from each diagonal element. The determinant of the resulting matrix is then a polynomial in <IMG WIDTH="14" HEIGHT="15" ALIGN="BOTTOM" BORDER="0" SRC="img68.png" ALT="$\lambda$">. 
Here is how it works using a <IMG WIDTH="40" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img35.png" ALT="$3\times 3$"> matrix: <BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} A = \left( \begin{array}{rrr} 1/2 & 3/2 & 0 \\ 3/2 & 1/2 & 0 \\ 0 & 0 & 1 \end{array} \right) \end{displaymath} --> <IMG WIDTH="159" HEIGHT="64" BORDER="0" SRC="img69.png" ALT="\begin{displaymath} A = \left( \begin{array}{rrr} 1/2 &amp; 3/2 &amp; 0 \\ 3/2 &amp; 1/2 &amp; 0 \\ 0 &amp; 0 &amp; 1 \end{array}\right) \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> <BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} \det|A - \lambda I| = \left| \begin{array}{rrr} 1/2-\lambda & 3/2 & 0 \\ 3/2 & 1/2-\lambda & 0 \\ 0 & 0 & 1 -\lambda \end{array} \right| = -2 + \lambda + 2\lambda^2 - \lambda^3 = (-1 - \lambda)(1 - \lambda)(2 - \lambda) \end{displaymath} --> <IMG WIDTH="584" HEIGHT="64" BORDER="0" SRC="img70.png" ALT="\begin{displaymath} \det\vert A - \lambda I\vert = \left\vert \begin{array}{rrr... ...ambda^2 - \lambda^3 = (-1 - \lambda)(1 - \lambda)(2 - \lambda) \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> The three zeros of this cubic polynomial are <IMG WIDTH="67" HEIGHT="32" ALIGN="MIDDLE" BORDER="0" SRC="img71.png" ALT="$(-1, 1, 2)$">, so this matrix has three distinct eigenvalues. For an <IMG WIDTH="43" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img72.png" ALT="$n\times n$"> matrix we get a polynomial of degree <IMG WIDTH="14" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img62.png" ALT="$n$">. Why? It is easy to see if we remember from the previous section that the determinant is a sum over products of matrix elements. One of those products runs down the diagonal. Since each diagonal element has a <IMG WIDTH="14" HEIGHT="15" ALIGN="BOTTOM" BORDER="0" SRC="img68.png" ALT="$\lambda$"> in it, the diagonals alone give you a polynomial of degree <IMG WIDTH="14" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img62.png" ALT="$n$">. 
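<P> As a quick numerical check of this example, the characteristic polynomial and its roots can be computed directly. This is a sketch in Python, assuming NumPy is available; note that <TT>np.poly</TT> uses the monic convention det(lambda I - A), which differs from the det(A - lambda I) written above by an overall sign when n is odd.

```python
import numpy as np

# The 3x3 example matrix from the text.
A = np.array([[0.5, 1.5, 0.0],
              [1.5, 0.5, 0.0],
              [0.0, 0.0, 1.0]])

# Coefficients of the monic characteristic polynomial det(lambda*I - A):
# lambda^3 - 2*lambda^2 - lambda + 2, i.e. [1, -2, -1, 2].
coeffs = np.poly(A)

# Its roots are the eigenvalues.
roots = np.roots(coeffs)

print(np.round(coeffs, 10))
print(np.sort(roots.real))   # the eigenvalues -1, 1, 2
```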
The other products have fewer diagonal elements, so they can't increase the degree of the polynomial beyond <IMG WIDTH="14" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img62.png" ALT="$n$">. <P> A polynomial of degree <IMG WIDTH="14" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img62.png" ALT="$n$"> has <IMG WIDTH="14" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img62.png" ALT="$n$"> roots, although some of them may appear more than once. Call the zeros of the characteristic polynomial, <I>i.e. </I>, the eigenvalues, <IMG WIDTH="19" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img73.png" ALT="$\lambda_i$">. If we factor the polynomial in terms of its roots, we get <BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} P(\lambda) = (\lambda_1 - \lambda)(\lambda_2 - \lambda) \ldots{} (\lambda_n - \lambda). \end{displaymath} --> <IMG WIDTH="256" HEIGHT="28" BORDER="0" SRC="img74.png" ALT="\begin{displaymath} P(\lambda) = (\lambda_1 - \lambda)(\lambda_2 - \lambda) \ldots{} (\lambda_n - \lambda). \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> Notice that the determinant of the matrix itself is the value of the characteristic polynomial at <IMG WIDTH="43" HEIGHT="15" ALIGN="BOTTOM" BORDER="0" SRC="img75.png" ALT="$\lambda = 0$">. Plugging <IMG WIDTH="43" HEIGHT="15" ALIGN="BOTTOM" BORDER="0" SRC="img75.png" ALT="$\lambda = 0$"> into the factored expression above leads to the result that the determinant of the matrix is the product of its eigenvalues. 
<BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} \det|A| = P(0) = \lambda_1 \lambda_2 \ldots{} \lambda_n \end{displaymath} --> <IMG WIDTH="194" HEIGHT="28" BORDER="0" SRC="img76.png" ALT="\begin{displaymath} \det\vert A\vert = P(0) = \lambda_1 \lambda_2 \ldots{} \lambda_n \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> <P> <H2><A NAME="SECTION00032000000000000000"> Eigenvalue equation</A> </H2> <P> When a matrix has a zero determinant, we can find a nontrivial (column vector) solution <IMG WIDTH="13" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img77.png" ALT="$v$"> to the equation <BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} (A - \lambda I)v = 0 \end{displaymath} --> <IMG WIDTH="97" HEIGHT="28" BORDER="0" SRC="img78.png" ALT="\begin{displaymath} (A - \lambda I)v = 0 \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> or <BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} A v = \lambda v \end{displaymath} --> <IMG WIDTH="59" HEIGHT="24" BORDER="0" SRC="img79.png" ALT="\begin{displaymath} A v = \lambda v \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> This is the standard equation for eigenvalue <IMG WIDTH="14" HEIGHT="15" ALIGN="BOTTOM" BORDER="0" SRC="img68.png" ALT="$\lambda$"> and eigenvector <IMG WIDTH="13" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img77.png" ALT="$v$">. There can be as many as <IMG WIDTH="14" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img62.png" ALT="$n$"> linearly independent solutions to this equation: <BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} A v_i = \lambda_i v_i \, . \end{displaymath} --> <IMG WIDTH="80" HEIGHT="26" BORDER="0" SRC="img80.png" ALT="\begin{displaymath} A v_i = \lambda_i v_i . \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> Notice that the eigenvector is not unique. 
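<P> These statements are easy to check numerically. The sketch below, assuming NumPy is available, reuses the 3-by-3 example from the beginning of the section; the variable names are illustrative.

```python
import numpy as np

# The 3x3 example matrix from the text.
A = np.array([[0.5, 1.5, 0.0],
              [1.5, 0.5, 0.0],
              [0.0, 0.0, 1.0]])

lam, V = np.linalg.eig(A)   # eigenvalues lam[i], eigenvectors V[:, i]

for i in range(3):
    v = V[:, i]
    # Each column satisfies A v = lambda v ...
    assert np.allclose(A @ v, lam[i] * v)
    # ... and so does any scalar multiple c v: the eigenvector is not unique.
    assert np.allclose(A @ (5.0 * v), lam[i] * (5.0 * v))

# The determinant is the product of the eigenvalues.
assert np.isclose(np.linalg.det(A), np.prod(lam))
print(np.sort(lam))   # the eigenvalues -1, 1, 2
```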
We can multiply both sides of the equation by a constant <IMG WIDTH="11" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img81.png" ALT="$c$"> to see that if <IMG WIDTH="17" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img82.png" ALT="$v_i$"> is a solution for eigenvalue <IMG WIDTH="19" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img73.png" ALT="$\lambda_i$">, so is <IMG WIDTH="24" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img83.png" ALT="$c v_i$">. <P> Often we deal with real symmetric matrices (the transpose of the matrix is equal to the matrix itself). In that case the eigenvectors form a complete set of orthogonal vectors. They can be used to define the directions of coordinate axes, so we can write any <IMG WIDTH="14" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img62.png" ALT="$n$">-dimensional vector <IMG WIDTH="13" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img84.png" ALT="$x$"> as a linear combination <BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} x = \alpha_1 v_1 + \alpha_2 v_2 + \ldots{} + \alpha_n v_n \end{displaymath} --> <IMG WIDTH="208" HEIGHT="26" BORDER="0" SRC="img85.png" ALT="\begin{displaymath} x = \alpha_1 v_1 + \alpha_2 v_2 + \ldots{} + \alpha_n v_n \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> where the coefficient <IMG WIDTH="20" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img86.png" ALT="$\alpha_i$"> is the component of the vector in the direction <IMG WIDTH="17" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img82.png" ALT="$v_i$">. <P> More generally, such an expansion is possible whenever there are <IMG WIDTH="14" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img62.png" ALT="$n$"> linearly independent eigenvectors <IMG WIDTH="17" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img82.png" ALT="$v_i$">. 
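<P> For a real symmetric matrix this expansion is easy to carry out numerically, since the orthonormal eigenvectors make the coefficients simple dot products. A minimal sketch with NumPy; the vector <TT>x</TT> and the names here are illustrative choices.

```python
import numpy as np

A = np.array([[0.5, 1.5, 0.0],
              [1.5, 0.5, 0.0],
              [0.0, 0.0, 1.0]])

# For a symmetric matrix, eigh returns orthonormal eigenvectors
# as the columns of V.
lam, V = np.linalg.eigh(A)

x = np.array([3.0, 2.0, -1.0])   # an arbitrary vector to expand
alpha = V.T @ x                  # alpha_i = v_i . x, component along each v_i

# Reassembling the linear combination x = alpha_1 v_1 + ... + alpha_n v_n
# recovers the original vector.
assert np.allclose(V @ alpha, x)
```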
Then we have a simple method for finding the eigenvalue with the largest magnitude, namely the ``power method''. <P> <H2><A NAME="SECTION00033000000000000000"> Power method</A> </H2> <P> The power method originates from the general statement that we can use the eigenvectors of a matrix to represent any vector <IMG WIDTH="13" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img84.png" ALT="$x$">: <BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} x = \alpha_1 v_1 + \alpha_2 v_2 + \ldots{} + \alpha_n v_n \end{displaymath} --> <IMG WIDTH="208" HEIGHT="26" BORDER="0" SRC="img85.png" ALT="\begin{displaymath} x = \alpha_1 v_1 + \alpha_2 v_2 + \ldots{} + \alpha_n v_n \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> We multiply by <IMG WIDTH="16" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img66.png" ALT="$A$"> and get <P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{eqnarray*} Ax &=& \alpha_1 Av_1 + \alpha_2 Av_2 + \ldots{} + \alpha_n Av_n \\ &=& \alpha_1 \lambda_1 v_1 + \alpha_2 \lambda_2 v_2 + \ldots{} + \alpha_n \lambda_n v_n \end{eqnarray*} --> <IMG WIDTH="294" HEIGHT="50" BORDER="0" SRC="img87.png" ALT="\begin{eqnarray*} Ax &amp;=&amp; \alpha_1 Av_1 + \alpha_2 Av_2 + \ldots{} + \alpha_n Av... ...v_1 + \alpha_2 \lambda_2 v_2 + \ldots{} + \alpha_n \lambda_n v_n \end{eqnarray*}"></DIV> <BR CLEAR="ALL"><P></P> So we get a new vector whose coefficients are each multiplied by the corresponding eigenvalue: <!-- MATH $\alpha_i \lambda_i$ --> <IMG WIDTH="34" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img88.png" ALT="$\alpha_i \lambda_i$">. The coefficients belonging to the larger-magnitude eigenvalues grow relative to those belonging to the smaller ones. So let's say we have sorted the eigenvalues so the one with the smallest magnitude is <IMG WIDTH="21" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img89.png" ALT="$\lambda_1$">, and the one with the largest magnitude is <IMG WIDTH="22" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img90.png" ALT="$\lambda_n$">. 
If we multiply by <IMG WIDTH="16" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img66.png" ALT="$A$"> <IMG WIDTH="18" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img91.png" ALT="$m$"> times, the coefficients become <!-- MATH $\alpha_i \lambda_i^m$ --> <IMG WIDTH="41" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img92.png" ALT="$\alpha_i \lambda_i^m$">. If we keep going, the <IMG WIDTH="14" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img62.png" ALT="$n$">th term (corresponding to the largest eigenvalue) will eventually swamp all the others and we get (for very large <IMG WIDTH="18" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img91.png" ALT="$m$">) <BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} A^m x \approx \lambda_n^m \alpha_n v_n = y \end{displaymath} --> <IMG WIDTH="140" HEIGHT="28" BORDER="0" SRC="img93.png" ALT="\begin{displaymath} A^m x \approx \lambda_n^m \alpha_n v_n = y \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> So we get an eigenvector corresponding to the largest eigenvalue. Another way of saying this is that when we hit the vector <IMG WIDTH="13" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img84.png" ALT="$x$"> with the matrix <IMG WIDTH="16" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img66.png" ALT="$A$"> we get a new vector that tends to point more in the direction of the leading eigenvector <IMG WIDTH="21" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img94.png" ALT="$v_n$">. The more factors of <IMG WIDTH="16" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img66.png" ALT="$A$"> we pile on, the more precisely it points in that direction. <P> So how do we get the eigenvalue if we have an eigenvector <IMG WIDTH="13" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img95.png" ALT="$y$"> pointing in the direction of <IMG WIDTH="21" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img94.png" ALT="$v_n$">? If it points in that direction, we must have <IMG WIDTH="57" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img96.png" ALT="$y = c v_n$">. 
Then remember that eigenvectors satisfy <BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} Ay = A cv_n = \lambda_n c v_n = \lambda_n y \, . \end{displaymath} --> <IMG WIDTH="192" HEIGHT="27" BORDER="0" SRC="img97.png" ALT="\begin{displaymath} Ay = A cv_n = \lambda_n c v_n = \lambda_n y . \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> That is, the vector <IMG WIDTH="25" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img98.png" ALT="$Ay$"> has every component of <IMG WIDTH="13" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img95.png" ALT="$y$"> multiplied by the eigenvalue <IMG WIDTH="22" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img90.png" ALT="$\lambda_n$">. We can use any of the components to read off the answer. <P> To turn this process into a practical algorithm, we normalize the vector after each multiplication by <IMG WIDTH="16" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img66.png" ALT="$A$">. Normalization simply means multiplying by a constant to put the vector in our favorite standard form. A common normalization is to divide by the Cartesian norm so we get a vector of unit length. The normalization we will use here is dividing the whole vector by the component with the largest magnitude. If we take the absolute values of the components, the largest one is called the <EM>infinity norm</EM>. So we divide by the infinity norm if that component is positive and by minus the infinity norm if it is negative. Here is an example. Suppose we have <BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} y = (3, 2, -1). \end{displaymath} --> <IMG WIDTH="96" HEIGHT="28" BORDER="0" SRC="img99.png" ALT="\begin{displaymath} y = (3, 2, -1). \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> The infinity norm is 3. We divide by 3 to normalize to get <IMG WIDTH="13" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img84.png" ALT="$x$">: <BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} x = (1, 2/3, -1/3). 
\end{displaymath} --> <IMG WIDTH="128" HEIGHT="28" BORDER="0" SRC="img100.png" ALT="\begin{displaymath} x = (1, 2/3, -1/3). \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> Let's call the component with the largest magnitude the ``leading component''. In the example, the leading component in <IMG WIDTH="13" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img95.png" ALT="$y$"> is the first one and it is positive. If it were <IMG WIDTH="25" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img101.png" ALT="$-3$"> instead, we'd divide by <IMG WIDTH="25" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img101.png" ALT="$-3$">. The goal is to get a vector proportional to the original vector, but with one component equal to <IMG WIDTH="25" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img102.png" ALT="$+1$"> and with the rest of the components no larger in magnitude than 1. <P> The reason we pick this way of normalizing the vector is that we can then easily read off the eigenvalue by looking at what happens to the leading component when we multiply by <IMG WIDTH="16" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img66.png" ALT="$A$">. Also, the infinity norm is cheaper to compute, since it doesn't require any arithmetic, only comparisons. <P> Suppose, after normalizing <IMG WIDTH="13" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img95.png" ALT="$y$"> to get <IMG WIDTH="13" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img84.png" ALT="$x$">, we multiply by <IMG WIDTH="16" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img66.png" ALT="$A$"> one more time and get <BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} Ax = (2, 4/3, -2/3)\, . \end{displaymath} --> <IMG WIDTH="142" HEIGHT="28" BORDER="0" SRC="img103.png" ALT="\begin{displaymath} Ax = (2, 4/3, -2/3) . \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> We can read off the eigenvalue from the leading component. It is 2. Of course, we could check every component to see that each one got multiplied by 2. 
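<P> The normalize-and-multiply iteration just illustrated is easy to automate. Below is a sketch in Python with NumPy, applied to the 3-by-3 matrix from the beginning of the section (whose largest-magnitude eigenvalue is 2); the function name, tolerance, and iteration cap are illustrative choices, not part of the text.

```python
import numpy as np

def power_method(A, tol=1e-10, max_iter=1000):
    """Largest-magnitude eigenvalue and eigenvector via the power method,
    normalizing by the signed leading component at each step."""
    x = np.ones(A.shape[0])              # arbitrary starting vector
    lam = 0.0
    for _ in range(max_iter):
        y = A @ x
        lead = y[np.argmax(np.abs(y))]   # signed infinity-norm component
        x_new = y / lead                 # leading component is now +1
        if np.allclose(x_new, x, atol=tol):
            return lead, x_new
        x, lam = x_new, lead
    return lam, x

A = np.array([[0.5, 1.5, 0.0],
              [1.5, 0.5, 0.0],
              [0.0, 0.0, 1.0]])
lam, v = power_method(A)
print(lam)   # converges to 2, the largest-magnitude eigenvalue
```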
<P> So here is the power algorithm: <PRE>
Start with any arbitrary vector y.
Repeat steps 1 - 3 until convergence.
  Step 1: Normalize y to get x. The leading component is now 1.
  Step 2: Compute y = Ax.
  Step 3: The approximate eigenvalue is the new leading component.
</PRE> Convergence means that when you normalize <IMG WIDTH="13" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img95.png" ALT="$y$">, the new <IMG WIDTH="13" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img84.png" ALT="$x$"> is close enough to the old <IMG WIDTH="13" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img84.png" ALT="$x$"> to declare victory. You choose what is close enough to suit the requirements of your problem. <P> Notice that with the power algorithm, you can start with any arbitrary vector and the method will converge to the same result. Well, that is almost any starting vector. You might be unlucky and pick a starting vector that has a zero component <IMG WIDTH="23" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img104.png" ALT="$\alpha_n$"> along the leading eigenvector <IMG WIDTH="21" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img94.png" ALT="$v_n$">. But numerical calculations are usually subject to roundoff error, so even if you unwittingly started with <IMG WIDTH="52" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img105.png" ALT="$\alpha_n = 0$">, chances are very good that after a few hits with the matrix <IMG WIDTH="16" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img66.png" ALT="$A$">, you develop a tiny nonzero <IMG WIDTH="23" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img104.png" ALT="$\alpha_n$">, and then it is just a matter of time before it grows to dominate the iteration. <P> <H2><A NAME="SECTION00034000000000000000"> Inverse power method</A> </H2> <P> The inverse power method works with the inverse <IMG WIDTH="33" HEIGHT="16" ALIGN="BOTTOM" BORDER="0" SRC="img106.png" ALT="$A^{-1}$">, assuming it exists. 
It is easy to check that the eigenvectors of the matrix <IMG WIDTH="16" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img66.png" ALT="$A$"> are also eigenvectors of its inverse, but the eigenvalues are the reciprocals: <BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} A^{-1} v_i = \mu_i v_i \end{displaymath} --> <IMG WIDTH="90" HEIGHT="27" BORDER="0" SRC="img107.png" ALT="\begin{displaymath} A^{-1} v_i = \mu_i v_i \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> where <!-- MATH $\mu_i = 1/\lambda_i$ --> <IMG WIDTH="71" HEIGHT="32" ALIGN="MIDDLE" BORDER="0" SRC="img108.png" ALT="$\mu_i = 1/\lambda_i$">. So now the eigenvalue <IMG WIDTH="19" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img109.png" ALT="$\mu_i$"> with the largest magnitude corresponds to the eigenvalue <IMG WIDTH="19" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img73.png" ALT="$\lambda_i$"> with the <EM>smallest</EM> magnitude. <P> So we can get the largest and smallest eigenvalues. How do we get the ones in between? For a matrix whose eigenvalues are all real, we can do this by generalizing the inverse power method. We take the inverse of the shifted matrix <IMG WIDTH="64" HEIGHT="32" ALIGN="MIDDLE" BORDER="0" SRC="img110.png" ALT="$(A - qI)$">, where <IMG WIDTH="12" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img111.png" ALT="$q$"> is any number we like. (We intend to vary <IMG WIDTH="12" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img111.png" ALT="$q$">.) 
The eigenvectors of this matrix are still the same as the eigenvectors of <IMG WIDTH="16" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img66.png" ALT="$A$">: <BR><P></P> <DIV ALIGN="CENTER"> <!-- MATH \begin{displaymath} (A - qI)^{-1} v_i = \mu_i v_i \end{displaymath} --> <IMG WIDTH="136" HEIGHT="28" BORDER="0" SRC="img112.png" ALT="\begin{displaymath} (A - qI)^{-1} v_i = \mu_i v_i \end{displaymath}"> </DIV> <BR CLEAR="ALL"> <P></P> where, now, <!-- MATH $\mu_i = 1/(\lambda_i - q)$ --> <IMG WIDTH="110" HEIGHT="32" ALIGN="MIDDLE" BORDER="0" SRC="img113.png" ALT="$\mu_i = 1/(\lambda_i - q)$">. Which is the largest <IMG WIDTH="19" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img109.png" ALT="$\mu_i$">? It depends on <IMG WIDTH="12" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img111.png" ALT="$q$">. If <IMG WIDTH="12" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img111.png" ALT="$q$"> is close to one of the <IMG WIDTH="19" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img73.png" ALT="$\lambda_i$">'s, then <IMG WIDTH="28" HEIGHT="32" ALIGN="MIDDLE" BORDER="0" SRC="img114.png" ALT="$\vert\mu_i\vert$"> is maximum for that <IMG WIDTH="10" HEIGHT="17" ALIGN="BOTTOM" BORDER="0" SRC="img115.png" ALT="$i$">. So if we hold that <IMG WIDTH="12" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img111.png" ALT="$q$"> fixed and run the power method, we eventually get the eigenvector <IMG WIDTH="17" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img82.png" ALT="$v_i$">. Then we change <IMG WIDTH="12" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img111.png" ALT="$q$"> and rerun the power method. It's like tuning a radio dial. As <IMG WIDTH="12" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img111.png" ALT="$q$"> gets close to a new eigenvalue, we get the next ``broadcast station'', <I>i.e.</I> the next eigenvector. If we keep going, we eventually get them all. 
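<P> This tuning procedure can be sketched as follows, assuming NumPy. Rather than forming the inverse explicitly, each iteration solves a linear system with the shifted matrix, which is the usual practice; the function name, shifts, and tolerances here are illustrative. (The shift <TT>q</TT> must not equal an eigenvalue exactly, or the shifted matrix is singular.)

```python
import numpy as np

def shifted_inverse_power(A, q, tol=1e-10, max_iter=1000):
    """Power method on (A - qI)^(-1): converges to the eigenvalue of A
    closest to the shift q.  Solves a linear system each step instead of
    forming the inverse explicitly."""
    n = A.shape[0]
    B = A - q * np.eye(n)
    x = np.arange(1.0, n + 1.0)          # generic starting vector
    mu = 0.0
    for _ in range(max_iter):
        y = np.linalg.solve(B, x)        # y = (A - qI)^(-1) x
        lead = y[np.argmax(np.abs(y))]   # signed infinity-norm component
        x_new = y / lead
        if np.allclose(x_new, x, atol=tol):
            mu = lead
            break
        x, mu = x_new, lead
    return q + 1.0 / mu                  # mu = 1/(lambda - q)

A = np.array([[0.5, 1.5, 0.0],
              [1.5, 0.5, 0.0],
              [0.0, 0.0, 1.0]])

# Tuning the dial: each shift picks out the nearest eigenvalue.
print(shifted_inverse_power(A, -0.7))   # near -1
print(shifted_inverse_power(A,  0.9))   # near  1
print(shifted_inverse_power(A,  1.8))   # near  2
```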
<P> Clearly, if any of the eigenvalues are complex, we would have a lot of searching to do, because we'd need to search the entire complex plane, and not just the real line interval between <IMG WIDTH="21" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img89.png" ALT="$\lambda_1$"> and <IMG WIDTH="22" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img90.png" ALT="$\lambda_n$">. There are better methods. <P> <H2><A NAME="SECTION00035000000000000000"> Other methods: QR algorithm</A> </H2> <P> The power and inverse power methods are simple and very easy to implement, but if you want all the eigenvalues, those methods are very inefficient. There are other, much more sophisticated and efficient methods, though. Here we describe in broad terms the Householder/<IMG WIDTH="29" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img116.png" ALT="$QR$"> algorithm for real symmetric matrices. For details, please see standard texts on numerical methods. <P> A real symmetric matrix <IMG WIDTH="16" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img66.png" ALT="$A$"> can be put into diagonal form by a real orthogonal similarity transform. In other words, there exists a real orthogonal matrix <IMG WIDTH="17" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img117.png" ALT="$Q$"> such that the product (similarity transform) <BR> <DIV ALIGN="RIGHT"> <!-- MATH $$\Lambda = \tilde Q A Q$$ --> <TABLE WIDTH="100%" ALIGN="CENTER"> <TR VALIGN="MIDDLE"><TD ALIGN="CENTER" NOWRAP><IMG WIDTH="68" HEIGHT="27" BORDER="0" SRC="img118.png" ALT="\begin{displaymath}$$\Lambda = \tilde Q A Q$$\end{displaymath}"></TD> <TD WIDTH=10 ALIGN="RIGHT"> (1)</TD></TR> </TABLE> <BR CLEAR="ALL"></DIV><P></P> is a diagonal matrix <IMG WIDTH="15" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img119.png" ALT="$\Lambda$">. 
(An orthogonal matrix <IMG WIDTH="17" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img117.png" ALT="$Q$"> is one whose transpose <IMG WIDTH="17" HEIGHT="39" ALIGN="MIDDLE" BORDER="0" SRC="img120.png" ALT="$\tilde Q$"> is its inverse: <!-- MATH $Q\tilde Q = \tilde Q Q = 1$ --> <IMG WIDTH="105" HEIGHT="39" ALIGN="MIDDLE" BORDER="0" SRC="img121.png" ALT="$Q\tilde Q = \tilde Q Q = 1$">.) This solves the problem, because the eigenvalues of the matrix <IMG WIDTH="16" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img66.png" ALT="$A$"> are the diagonal values in <IMG WIDTH="15" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img119.png" ALT="$\Lambda$">, and the eigenvectors are the column vectors of <IMG WIDTH="17" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img117.png" ALT="$Q$">. We say that the transform <IMG WIDTH="17" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img117.png" ALT="$Q$"> ``diagonalizes'' the matrix. <P> Of course, finding the transform <IMG WIDTH="17" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img117.png" ALT="$Q$"> is a challenge. With the Householder/<IMG WIDTH="29" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img116.png" ALT="$QR$"> algorithm it is done through an iterative process that eventually converges to the answer. The first step in the process, the Householder step, is to find an orthogonal similarity transform that puts the matrix in tridiagonal form. This can be done exactly in a finite number of steps. The Householder method finds a matrix <IMG WIDTH="17" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img45.png" ALT="$P$"> that is not only orthogonal but also symmetric (<IMG WIDTH="50" HEIGHT="21" ALIGN="BOTTOM" BORDER="0" SRC="img122.png" ALT="$P = \tilde P$">): <BR> <DIV ALIGN="RIGHT"> <!-- MATH $$\hat A = P A P \, .$$ --> <TABLE WIDTH="100%" ALIGN="CENTER"> <TR VALIGN="MIDDLE"><TD ALIGN="CENTER" NOWRAP><IMG WIDTH="76" HEIGHT="24" BORDER="0" SRC="img123.png" ALT="\begin{displaymath} \hat A = P A P . 
\end{displaymath}"></TD> <TD WIDTH=10 ALIGN="RIGHT"> (2)</TD></TR> </TABLE> <BR CLEAR="ALL"></DIV><P></P> The matrix <IMG WIDTH="16" HEIGHT="18" ALIGN="BOTTOM" BORDER="0" SRC="img124.png" ALT="$\hat A$"> is tridiagonal (and real and symmetric). <P> In the next phase, the <IMG WIDTH="29" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img116.png" ALT="$QR$"> phase, we apply a succession of orthogonal similarity transforms <IMG WIDTH="32" HEIGHT="38" ALIGN="MIDDLE" BORDER="0" SRC="img125.png" ALT="$Q^{(i)}$"> on the tridiagonal matrix that make the off-diagonal values smaller. Eventually they become small enough that we can say it is diagonal for all intents and purposes. The first similarity transform is applied to the tridiagonal matrix <IMG WIDTH="16" HEIGHT="18" ALIGN="BOTTOM" BORDER="0" SRC="img124.png" ALT="$\hat A$">: <BR> <DIV ALIGN="RIGHT"> <!-- MATH $$\hat A^{(1)} = \tilde Q^{(1)} \hat A Q^{(1)}\, .$$ --> <TABLE WIDTH="100%" ALIGN="CENTER"> <TR VALIGN="MIDDLE"><TD ALIGN="CENTER" NOWRAP><IMG WIDTH="127" HEIGHT="27" BORDER="0" SRC="img126.png" ALT="\begin{displaymath} \hat A^{(1)} = \tilde Q^{(1)} \hat A Q^{(1)} . \end{displaymath}"></TD> <TD WIDTH=10 ALIGN="RIGHT"> (3)</TD></TR> </TABLE> <BR CLEAR="ALL"></DIV><P></P> The transform is constructed so the resulting matrix <IMG WIDTH="33" HEIGHT="18" ALIGN="BOTTOM" BORDER="0" SRC="img127.png" ALT="$\hat A^{(1)}$"> is still tridiagonal, but the off-diagonal elements are smaller. Then we apply the second similarity transform to the result above: <BR> <DIV ALIGN="RIGHT"> <!-- MATH $$\hat A^{(2)} = \tilde Q^{(2)} \hat A^{(1)} Q^{(2)}\, .$$ --> <TABLE WIDTH="100%" ALIGN="CENTER"> <TR VALIGN="MIDDLE"><TD ALIGN="CENTER" NOWRAP><IMG WIDTH="144" HEIGHT="27" BORDER="0" SRC="img128.png" ALT="\begin{displaymath} \hat A^{(2)} = \tilde Q^{(2)} \hat A^{(1)} Q^{(2)} . 
\end{displaymath}"></TD> <TD WIDTH=10 ALIGN="RIGHT"> (4)</TD></TR> </TABLE> <BR CLEAR="ALL"></DIV><P></P> We keep going until eventually <IMG WIDTH="35" HEIGHT="18" ALIGN="BOTTOM" BORDER="0" SRC="img129.png" ALT="$\hat A^{(n)}$">, for large <IMG WIDTH="14" HEIGHT="13" ALIGN="BOTTOM" BORDER="0" SRC="img62.png" ALT="$n$">, is close enough to a diagonal matrix that we can call it our <IMG WIDTH="15" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img119.png" ALT="$\Lambda$">. Putting all the transforms together, we get <BR> <DIV ALIGN="RIGHT"> <!-- MATH $$\Lambda = \lim_{n\to\infty} \tilde Q^{(n)} \tilde Q^{(n-1)} \ldots{} \tilde Q^{(2)} \tilde Q^{(1)} P A P Q^{(1)} Q^{(2)} \ldots{} Q^{(n-1)} Q^{(n)}$$ --> <TABLE WIDTH="100%" ALIGN="CENTER"> <TR VALIGN="MIDDLE"><TD ALIGN="CENTER" NOWRAP><IMG WIDTH="426" HEIGHT="35" BORDER="0" SRC="img130.png" ALT="\begin{displaymath} \Lambda = \lim_{n\to\infty} \tilde Q^{(n)} \tilde Q^{(n-1)}... ...lde Q^{(1)} P A P Q^{(1)} Q^{(2)} \ldots{} Q^{(n-1)} Q^{(n)} \end{displaymath}"></TD> <TD WIDTH=10 ALIGN="RIGHT"> (5)</TD></TR> </TABLE> <BR CLEAR="ALL"></DIV><P></P> It is easy to show that the product of real orthogonal matrices is also real orthogonal. So the product <BR> <DIV ALIGN="RIGHT"> <!-- MATH $$Q = PQ^{(1)} Q^{(2)} \ldots{} Q^{(n-1)} Q^{(n)}$$ --> <TABLE WIDTH="100%" ALIGN="CENTER"> <TR VALIGN="MIDDLE"><TD ALIGN="CENTER" NOWRAP><IMG WIDTH="206" HEIGHT="27" BORDER="0" SRC="img131.png" ALT="\begin{displaymath}$$Q = PQ^{(1)} Q^{(2)} \ldots{} Q^{(n-1)} Q^{(n)}$$\end{displaymath}"></TD> <TD WIDTH=10 ALIGN="RIGHT"> (6)</TD></TR> </TABLE> <BR CLEAR="ALL"></DIV><P></P> is the orthogonal matrix that diagonalizes <IMG WIDTH="16" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img66.png" ALT="$A$">. That is what we wanted. <P> So where does the <IMG WIDTH="17" HEIGHT="14" ALIGN="BOTTOM" BORDER="0" SRC="img132.png" ALT="$R$"> in the <IMG WIDTH="29" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img116.png" ALT="$QR$"> algorithm come in? 
At each step the tridiagonal matrix <IMG WIDTH="32" HEIGHT="18" ALIGN="BOTTOM" BORDER="0" SRC="img133.png" ALT="$\hat A^{(i)}$"> is factored into a product of an orthogonal matrix <IMG WIDTH="32" HEIGHT="38" ALIGN="MIDDLE" BORDER="0" SRC="img125.png" ALT="$Q^{(i)}$"> and an upper triangular matrix <IMG WIDTH="32" HEIGHT="17" ALIGN="BOTTOM" BORDER="0" SRC="img134.png" ALT="$R^{(i)}$"> <BR> <DIV ALIGN="RIGHT"> <!-- MATH $$\hat A^{(i)} = Q^{(i)} R^{(i)}.$$ --> <TABLE WIDTH="100%" ALIGN="CENTER"> <TR VALIGN="MIDDLE"><TD ALIGN="CENTER" NOWRAP><IMG WIDTH="106" HEIGHT="27" BORDER="0" SRC="img135.png" ALT="\begin{displaymath}$$\hat A^{(i)} = Q^{(i)} R^{(i)}.$$\end{displaymath}"></TD> <TD WIDTH=10 ALIGN="RIGHT"> (7)</TD></TR> </TABLE> <BR CLEAR="ALL"></DIV><P></P> Hence the name <IMG WIDTH="29" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img116.png" ALT="$QR$">. This factorization can be done exactly with a finite number of steps. Then one can show that the combination <BR> <DIV ALIGN="RIGHT"> <!-- MATH $$\hat A^{(i+1)} = \tilde Q^{(i)} \hat A^{(i)} Q^{(i)}\, .$$ --> <TABLE WIDTH="100%" ALIGN="CENTER"> <TR VALIGN="MIDDLE"><TD ALIGN="CENTER" NOWRAP><IMG WIDTH="153" HEIGHT="27" BORDER="0" SRC="img136.png" ALT="\begin{displaymath} \hat A^{(i+1)} = \tilde Q^{(i)} \hat A^{(i)} Q^{(i)} . \end{displaymath}"></TD> <TD WIDTH=10 ALIGN="RIGHT"> (8)</TD></TR> </TABLE> <BR CLEAR="ALL"></DIV><P></P> is tridiagonal. That gives us the next matrix in the sequence. The same factorization is done on it, and the process continues. <P> As we can see, finding the eigenvalues this way takes some work. It turns out that the rate of convergence depends on the spacing of the eigenvalues. If two eigenvalues are very close to each other (compared with their average spacing), more iterations are needed to get convergence. If they are well separated, fewer iterations are required. 
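<P> A bare-bones version of the iteration can be sketched with NumPy's built-in <TT>qr</TT> factorization. For clarity this sketch skips the Householder tridiagonalization and the shifts that production implementations use to accelerate convergence; it is the textbook unshifted iteration, applied to the 3-by-3 example from earlier in the section.

```python
import numpy as np

def qr_eigenvalues(A, iters=200):
    """Textbook (unshifted) QR iteration for a real symmetric matrix.
    Production codes first reduce A to tridiagonal form (Householder)
    and use shifts; this bare version just shows the idea."""
    Ak = A.astype(float).copy()
    for _ in range(iters):
        Q, R = np.linalg.qr(Ak)   # factor A^(i) = Q R
        Ak = R @ Q                # = Q^T A^(i) Q, a similarity transform
    return np.diag(Ak)            # off-diagonals have decayed toward zero

A = np.array([[0.5, 1.5, 0.0],
              [1.5, 0.5, 0.0],
              [0.0, 0.0, 1.0]])
print(np.sort(qr_eigenvalues(A)))   # approaches -1, 1, 2
```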
<P> The <IMG WIDTH="29" HEIGHT="30" ALIGN="MIDDLE" BORDER="0" SRC="img116.png" ALT="$QR$"> algorithm also works with (complex) Hermitian matrices (<IMG WIDTH="56" HEIGHT="20" ALIGN="BOTTOM" BORDER="0" SRC="img137.png" ALT="$A^\dagger = A$">). This covers a vast number of cases in science and engineering. The eigenvalues are still real, but the similarity transform is unitary (<!-- MATH $Q^\dagger Q = Q Q^\dagger = 1$ --> <IMG WIDTH="118" HEIGHT="35" ALIGN="MIDDLE" BORDER="0" SRC="img138.png" ALT="$Q^\dagger Q = Q Q^\dagger = 1$">). Here the dagger means the complex conjugate transpose. <P> <HR> </BODY>