10 Pre-Class Assignment: Eigenvectors and Eigenvalues

1. Eigenvectors and Eigenvalues

Understanding Eigenvector and Eigenvalues can be very challenging. These are complex topics with many facets. Different textbooks approach the problem from different directions. All have value. These facets include:

  • Understanding the mathematical definition of Eigenvalues.

  • Being able to calculate an Eigenvalue and Eigenvector.

  • Understanding what Eigenvalues and Eigenvectors represent.

  • Understanding how to use Eigenvalues and Eigenvectors to solve problems.

In this course we consider it more important to understand what eigenvectors and eigenvalues represent and how to use them. However, often this understanding comes from first learning how to calculate them.

Eigenvalues are a special set of scalars associated with a square matrix that are sometimes also known as characteristic roots, characteristic values (Hoffman and Kunze 1971), proper values, or latent roots (Marcus and Minc 1988, p. 144).

The determination of the eigenvalues and eigenvectors of a matrix is extremely important in physics and engineering, where it is equivalent to matrix diagonalization and arises in such common applications as stability analysis, the physics of rotating bodies, and small oscillations of vibrating systems, to name only a few.

The decomposition of a square matrix \(A\) into eigenvalues and eigenvectors is known in this work as eigen decomposition, and the fact that this decomposition is always possible as long as the matrix consisting of the eigenvectors of \(A\) is square. This is known as the eigen decomposition theorem.

From: http://mathworld.wolfram.com/Eigenvalue.html

The following video provides an intuition for eigenvalues and eigenvectors.

from IPython.display import YouTubeVideo
YouTubeVideo("ue3yoeZvt8E",width=640,height=360, cc_load_policy=True)


Let \(A\) be an \(n\times n\) matrix. Find a vector \(x\) in \(R^n\) such that:

\[Ax=\lambda x\]

The above can be rewritten as the following homogeneous equation:

\[(A-\lambda I_n)x = 0\]

The trivial solution is \(x=0\).

However, if we define eigenvectors to be nonzero vectors then \(|A-\lambda I_n| = 0\). Nonzero (i.e. non-trivial) solutions to this system of equations can only exist if the matrix of coefficients is singular, i.e. the determinant of \(|A - \lambda I_n| = 0\). Therefore, solving the equation \(|A - \lambda I_n| = 0\) for \(\lambda\) leads to all the eigenvalues of \(A\).

Note: the above logic is key. Make sure you understand. If not, ask questions.

QUESTION: Explain why nonzero solutions to a system of homogeneous systems require the matrix to be singular.

Put your answer here

from IPython.display import YouTubeVideo
YouTubeVideo("PFDu9oVAE-g",width=640,height=360, cc_load_policy=True)


Here are a few more examples of how eigenvalues and eigenvectors are used (You are not required to understand all):

Using singular value decomposition for image compression. This is a note explaining how you can compress an image by throwing away the small eigenvalues of \(A^TA\). It takes an 88 megapixel image of an Allosaurus and shows how the image looks after compressing by selecting the largest singular values.

[Deriving Special Relativity]((https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxuYXNsdW5kZXJpY3xneDo2ZTAyNzA4NTZmOGZmNmU4) is more natural in the language of linear algebra. In fact, Einstein’s second postulate really states that “Light is an eigenvector of the Lorentz transform.” This document goes over the full derivation in detail.

Spectral Clustering. Whether it’s in plants and biology, medical imaging, buisness and marketing, understanding the connections between fields on Facebook, or even criminology, clustering is an extremely important part of modern data analysis. It allows people to find important subsystems or patterns inside noisy data sets. One such method is spectral clustering, which uses the eigenvalues of the graph of a network. Even the eigenvector of the second smallest eigenvalue of the Laplacian matrix allows us to find the two largest clusters in a network.

Dimensionality Reduction/PCA. The principal components correspond to the largest eigenvalues of \(A^\top A\), and this yields the least squared projection onto a smaller dimensional hyperplane, and the eigenvectors become the axes of the hyperplane. Dimensionality reduction is extremely useful in machine learning and data analysis as it allows one to understand where most of the variation in the data comes from.

Low rank factorization for collaborative prediction. This is what Netflix does (or once did) to predict what rating you’ll have for a movie you have not yet watched. It uses the singular value decomposition and throws away the smallest eigenvalues of \(A^\top A\).

The Google Page Rank algorithm. The largest eigenvector of the graph of the internet is how the pages are ranked.

From: https://math.stackexchange.com/questions/1520832/real-life-examples-for-eigenvalues-eigenvectors

2. Solving Eigenproblems - A 2x2 Example

from IPython.display import YouTubeVideo
YouTubeVideo("0UbkMlTu1vo",width=640,height=360, cc_load_policy=True)

Consider calculating eigenvalues for any \(2\times 2\) matrix. We want to solve:

\[|A - \lambda I_2 | = 0\]
\[\begin{split} \left| \left[ \begin{matrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{matrix} \right] - \lambda \left[ \begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix} \right] \right| = \left| \left[ \begin{matrix} a_{11}-\lambda & a_{12} \\ a_{21} & a_{22}-\lambda \end{matrix} \right] \right| =0 \end{split}\]

We know this determinant:

\[(a_{11}-\lambda)(a_{22}-\lambda) - a_{12} a_{21} = 0 \]

If we expand the above, we get:

\[a_{11}a_{22}+\lambda^2-a_{11}\lambda-a_{22}\lambda - a_{12} a_{21} = 0\]


\[\lambda^2-(a_{11}+a_{22})\lambda+a_{11}a_{22} - a_{12} a_{21} = 0\]

This is a simple quadratic equation. The roots pf \(A\lambda^2+B\lambda+C = 0\) can be solved using the quadratic formula:

\[ \frac{-B \pm \sqrt{B^2 - 4AC}}{2A}\]

QUESTION: Using the above equation. What are the eigenvalues for the following \(2\times 2\) matrix. Try calculating this by hand and then store the lower value in a variable namede1 and the larger value in e2 to check your answer:

\[\begin{split}A = \left[ \begin{matrix} -4 & -6 \\ 3 & 5 \end{matrix} \right] \end{split}\]
# Put your answer here
from answercheck import checkanswer

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-5-825ed7a81fa9> in <module>
----> 1 from answercheck import checkanswer
      3 checkanswer.float(e1,'c54490d3480079138c8c027a87a366e3');

~/REPOS/MTH314_Textbook/MakeTextbook/bookfiles/answercheck.py in <module>
      1 import hashlib
      2 import numpy as np
----> 3 import sympy as sym
      4 import sys
      5 import textwrap

ModuleNotFoundError: No module named 'sympy'
from answercheck import checkanswer


DO THIS Find a numpy function that will calculate eigenvalues and verify the answers from above.

# Put your answer here

QUESTION: What are the corresponding eigenvectors to the matrix \(A\)? This time you can try calculating by hand or just used the function you found in the previous answer. Store the eigenvector associated with the e1 value in a vector named v1 and the eigenvector associated with the eigenvalue e2 in a vector named v2 to check your answer.

Put your answer to the above question here.

from answercheck import checkanswer

from answercheck import checkanswer


QUESTION: Both sympy and numpy can calculate many of the same things. What is the fundamental difference between these two libraries?

Put your answer to the above question here.

3. Introduction to Markov Models

In probability theory, a Markov model is a stochastic model used to model randomly changing systems. It is assumed that future states depend only on the current state, not on the events that occurred before it.

State space diagram. See text for description A diagram representing a two-state Markov process, with the states labelled E and A.

Each number represents the probability of the Markov process changing from one state to another state, with the direction indicated by the arrow. For example, if the Markov process is in state A, then the probability it changes to state E is 0.4, while the probability it remains in state A is 0.6.

From: Wikipedia

The above state model can be represented by a transition matrix.

At each time step (\(t\)) the probability to move between states depends on the previous state (\(t-1\)):

\[A_{t} = 0.6A_{(t-1)}+0.7E_{(t-1)}\]
\[E_{t} = 0.4A_{(t-1)}+0.3E_{(t-1)}\]

The above state model (\(S_t = [A_t, E_t]^T\)) can be represented in the following matrix notation:

\[S_t = PS_{(t-1)}\]

DO THIS : Create a \(2 \times 2\) matrix (P) representing the transition matrix for the above Markov space.

#Put your answer to the above question here
from answercheck import checkanswer


3. Assignment wrap-up

Assignment-Specific QUESTION: Both sympy and numpy can calculate many of the same things. What is the fundamental difference between these two libraries?

Put your answer to the above question here

QUESTION: Summarize what you did in this assignment.

Put your answer to the above question here

QUESTION: What questions do you have, if any, about any of the topics discussed in this assignment after working through the jupyter notebook?

Put your answer to the above question here

QUESTION: How well do you feel this assignment helped you to achieve a better understanding of the above mentioned topic(s)?

Put your answer to the above question here

QUESTION: What was the most challenging part of this assignment for you?

Put your answer to the above question here

QUESTION: What was the least challenging part of this assignment for you?

Put your answer to the above question here

QUESTION: What kind of additional questions or support, if any, do you feel you need to have a better understanding of the content in this assignment?

Put your answer to the above question here

QUESTION: Do you have any further questions or comments about this material, or anything else that’s going on in class?

Put your answer to the above question here

QUESTION: Approximately how long did this pre-class assignment take?

Put your answer to the above question here

Written by Dr. Dirk Colbry, Michigan State University Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.