10 Pre-Class Assignment: Eigenvectors and Eigenvalues¶

Readings for this topic (Recommended in bold)¶

Goals for today’s pre-class assignment¶

Eigenvectors and Eigenvalues
Solving Eigenproblems - A 2x2 Example
Introduction to Markov Models

1. Eigenvectors and Eigenvalues¶

Understanding Eigenvector and Eigenvalues can be very challenging. These are complex topics with many facets. Different textbooks approach the problem from different directions. All have value. These facets include:

Understanding the mathematical definition of Eigenvalues.
Being able to calculate an Eigenvalue and Eigenvector.
Understanding what Eigenvalues and Eigenvectors represent.
Understanding how to use Eigenvalues and Eigenvectors to solve problems.

In this course we consider it more important to understand what eigenvectors and eigenvalues represent and how to use them. However, often this understanding comes from first learning how to calculate them.

Eigenvalues are a special set of scalars associated with a square matrix that are sometimes also known as characteristic roots, characteristic values (Hoffman and Kunze 1971), proper values, or latent roots (Marcus and Minc 1988, p. 144).

The determination of the eigenvalues and eigenvectors of a matrix is extremely important in physics and engineering, where it is equivalent to matrix diagonalization and arises in such common applications as stability analysis, the physics of rotating bodies, and small oscillations of vibrating systems, to name only a few.

The decomposition of a square matrix \(A\) into eigenvalues and eigenvectors is known in this work as eigen decomposition, and the fact that this decomposition is always possible as long as the matrix consisting of the eigenvectors of \(A\) is square. This is known as the eigen decomposition theorem.

From: http://mathworld.wolfram.com/Eigenvalue.html

The following video provides an intuition for eigenvalues and eigenvectors.

from IPython.display import YouTubeVideo
YouTubeVideo("ue3yoeZvt8E",width=640,height=360, cc_load_policy=True)

Definition¶

Let \(A\) be an \(n\times n\) matrix. Find a vector \(x\) in \(R^n\) such that:

\[Ax=\lambda x\]

The above can be rewritten as the following homogeneous equation:

\[(A-\lambda I_n)x = 0\]

The trivial solution is \(x=0\).

However, if we define eigenvectors to be nonzero vectors then \(|A-\lambda I_n| = 0\). Nonzero (i.e. non-trivial) solutions to this system of equations can only exist if the matrix of coefficients is singular, i.e. the determinant of \(|A - \lambda I_n| = 0\). Therefore, solving the equation \(|A - \lambda I_n| = 0\) for \(\lambda\) leads to all the eigenvalues of \(A\).

Note: the above logic is key. Make sure you understand. If not, ask questions.

✅ QUESTION: Explain why nonzero solutions to a system of homogeneous systems require the matrix to be singular.

Put your answer here

from IPython.display import YouTubeVideo
YouTubeVideo("PFDu9oVAE-g",width=640,height=360, cc_load_policy=True)

Examples:¶

Here are a few more examples of how eigenvalues and eigenvectors are used (You are not required to understand all):

Using singular value decomposition for image compression. This is a note explaining how you can compress an image by throwing away the small eigenvalues of \(A^TA\). It takes an 88 megapixel image of an Allosaurus and shows how the image looks after compressing by selecting the largest singular values.

[Deriving Special Relativity]((https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxuYXNsdW5kZXJpY3xneDo2ZTAyNzA4NTZmOGZmNmU4) is more natural in the language of linear algebra. In fact, Einstein’s second postulate really states that “Light is an eigenvector of the Lorentz transform.” This document goes over the full derivation in detail.

Spectral Clustering. Whether it’s in plants and biology, medical imaging, buisness and marketing, understanding the connections between fields on Facebook, or even criminology, clustering is an extremely important part of modern data analysis. It allows people to find important subsystems or patterns inside noisy data sets. One such method is spectral clustering, which uses the eigenvalues of the graph of a network. Even the eigenvector of the second smallest eigenvalue of the Laplacian matrix allows us to find the two largest clusters in a network.

Dimensionality Reduction/PCA. The principal components correspond to the largest eigenvalues of \(A^\top A\), and this yields the least squared projection onto a smaller dimensional hyperplane, and the eigenvectors become the axes of the hyperplane. Dimensionality reduction is extremely useful in machine learning and data analysis as it allows one to understand where most of the variation in the data comes from.

Low rank factorization for collaborative prediction. This is what Netflix does (or once did) to predict what rating you’ll have for a movie you have not yet watched. It uses the singular value decomposition and throws away the smallest eigenvalues of \(A^\top A\).

The Google Page Rank algorithm. The largest eigenvector of the graph of the internet is how the pages are ranked.

From: https://math.stackexchange.com/questions/1520832/real-life-examples-for-eigenvalues-eigenvectors

2. Solving Eigenproblems - A 2x2 Example¶

from IPython.display import YouTubeVideo
YouTubeVideo("0UbkMlTu1vo",width=640,height=360, cc_load_policy=True)

Consider calculating eigenvalues for any \(2\times 2\) matrix. We want to solve:

\[|A - \lambda I_2 | = 0\]

\[\begin{split} \left| \left[ \begin{matrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{matrix} \right] - \lambda \left[ \begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix} \right] \right| = \left| \left[ \begin{matrix} a_{11}-\lambda & a_{12} \\ a_{21} & a_{22}-\lambda \end{matrix} \right] \right| =0 \end{split}\]

We know this determinant:

\[(a_{11}-\lambda)(a_{22}-\lambda) - a_{12} a_{21} = 0 \]

If we expand the above, we get:

\[a_{11}a_{22}+\lambda^2-a_{11}\lambda-a_{22}\lambda - a_{12} a_{21} = 0\]

and

\[\lambda^2-(a_{11}+a_{22})\lambda+a_{11}a_{22} - a_{12} a_{21} = 0\]

This is a simple quadratic equation. The roots pf \(A\lambda^2+B\lambda+C = 0\) can be solved using the quadratic formula:

\[ \frac{-B \pm \sqrt{B^2 - 4AC}}{2A}\]

✅ QUESTION: Using the above equation. What are the eigenvalues for the following \(2\times 2\) matrix. Try calculating this by hand and then store the lower value in a variable namede1 and the larger value in e2 to check your answer:

\[\begin{split}A = \left[ \begin{matrix} -4 & -6 \\ 3 & 5 \end{matrix} \right] \end{split}\]

# Put your answer here

from answercheck import checkanswer

checkanswer.float(e1,'c54490d3480079138c8c027a87a366e3');

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-5-825ed7a81fa9> in <module>
----> 1 from answercheck import checkanswer
      2 
      3 checkanswer.float(e1,'c54490d3480079138c8c027a87a366e3');

~/REPOS/MTH314_Textbook/MakeTextbook/bookfiles/answercheck.py in <module>
      1 import hashlib
      2 import numpy as np
----> 3 import sympy as sym
      4 import sys
      5 import textwrap

ModuleNotFoundError: No module named 'sympy'

from answercheck import checkanswer

checkanswer.float(e2,'d1bd83a33f1a841ab7fda32449746cc4');

✅ DO THIS Find a numpy function that will calculate eigenvalues and verify the answers from above.

# Put your answer here

✅ QUESTION: What are the corresponding eigenvectors to the matrix \(A\)? This time you can try calculating by hand or just used the function you found in the previous answer. Store the eigenvector associated with the e1 value in a vector named v1 and the eigenvector associated with the eigenvalue e2 in a vector named v2 to check your answer.

Put your answer to the above question here.

from answercheck import checkanswer

checkanswer.eq_vector(v1,'35758bc2fa8ff4f04cfbcd019844f93d');

from answercheck import checkanswer

checkanswer.eq_vector(v2,'90b0437e86d2cf70050d1d6081d942f4');

✅ QUESTION: Both sympy and numpy can calculate many of the same things. What is the fundamental difference between these two libraries?

Put your answer to the above question here.

3. Introduction to Markov Models¶

In probability theory, a Markov model is a stochastic model used to model randomly changing systems. It is assumed that future states depend only on the current state, not on the events that occurred before it.
A diagram representing a two-state Markov process, with the states labelled E and A.

Each number represents the probability of the Markov process changing from one state to another state, with the direction indicated by the arrow. For example, if the Markov process is in state A, then the probability it changes to state E is 0.4, while the probability it remains in state A is 0.6.

From: Wikipedia

The above state model can be represented by a transition matrix.

At each time step (\(t\)) the probability to move between states depends on the previous state (\(t-1\)):

\[A_{t} = 0.6A_{(t-1)}+0.7E_{(t-1)}\]

\[E_{t} = 0.4A_{(t-1)}+0.3E_{(t-1)}\]

The above state model (\(S_t = [A_t, E_t]^T\)) can be represented in the following matrix notation:

\[S_t = PS_{(t-1)}\]

✅DO THIS : Create a \(2 \times 2\) matrix (P) representing the transition matrix for the above Markov space.

#Put your answer to the above question here

from answercheck import checkanswer

checkanswer.matrix(P,'de1c99f4b4a8d7ea541a084d836ba7e4');

3. Assignment wrap-up¶

✅ Assignment-Specific QUESTION: Both sympy and numpy can calculate many of the same things. What is the fundamental difference between these two libraries?