19 Pre-Class Assignment: Least Squares Fit (Regression)¶

Readings for this topic (Recommended in bold)¶

Goals for today’s pre-class assignment¶

Least Squares Fit
Linear Regression
One-to-one and Inverse transform
Inverse of a Matrix
Assignment Wrap-up

1. Least Squares Fit¶

Review Chapters Chapter 13 pg 225-239 of the Boyd textbook.

In this first part of this course, we try to solve the system of linear equations $Ax=b$ with an $m\times n$ matrix $A$ and a column vector $b$.

There are three possible outcomes: an unique solution, no solution, and infinite many solutions. (Review the material on this part if you are no familiar with when the three types of outcomes happen.)

When $m<n$, we call the matrix $A$ underdeterminated, because we can not have an unique solution for it. When $m>n$, we call the matrix $A$ overdeterminated, becasue we may not have a solution with high probability.

However, if we still need to find a best $x$, even when there is no solution or infinite many solutions we use a technique called least squares fit (LSF). Least squares fit find $x$ such that $\|Ax-b\|$ is the smallest (i.e. we try to minimize the estimation error).

When there is no solution, we want to find $x$ such that $Ax-b$ is small (here, we want $\|Ax-b\|$ to be small).
If the null space of $A$ is just $\{0\}$, we can find an unique $x$ to obtain the smallest $\|Ax-b\|$.
- If there is a unique solution $x^*$ for $Ax=b$, then $x^*$ is the optimal $x$ to obtain the smallest $\|Ax-b\|$, which is 0.
- Because the null space of $A$ is just $\{0\}$, you can not have infinite many solutions for $Ax=b$.
If the null space of $A$ is not just $\{0\}$, we know that we can always add a nonzero point $x_0$ in the null space of $A$ to a best $x^*$, and $\|A(x^*+x_0)-b\|=\|Ax^*-b\|$. Therefore, when we have multiple best solutions, we choose to find the $x$ in the rowspace of $A$, and this is unique.

QUESTION 1: Let $$A=\begin{bmatrix}1\\2\end{bmatrix},\quad b=\begin{bmatrix}1.5 \\ 2\end{bmatrix}$$ Find the best $x$ such that $|Ax-b|$ has the smallest value.

Put your answer to the above question here.

QUESTION 2: Compute $(A^\top A)^{-1}A^\top b$.

Put your answer to the above question here.

2. Linear Regression¶

Watch the video for using Least Squares to do linear regression.

from IPython.display import YouTubeVideo
YouTubeVideo("Lx6CfgKVIuE",width=640,height=360, cc_load_policy=True)

QUESTION 3: How to tell it is a good fit or a bad one?

Put your answer to the above question here.

3. One-to-one and Inverse transform¶

Read Section 4.9 of the textbook if you are not familiar with this part.

Definition: A transformation $T:U\mapsto V$ is said to be one-to-one if each element in the range is the image of just one element in the domain. That is, for two elements ($x$ and $y$) in $U$. $T(x)=T(y)$ happens only when $x=y$.

Theorem: Let $T:U\mapsto V$ be a one-to-one linear transformation. If $\{u_1,\dots,u_n\}$ is linearly independent in $U$, then $\{T(u_1),\dots,T(u_n)\}$ is linearly independent in $V$.

Definition: A linear transformation $T:U\mapsto V$ is said to be invertible if there exists a transformation $S:V\mapsto U$, such that $$S(T(u))=u,\qquad T(S(v))=v,$$ for any $v$ in $V$ and any $u$ in $U$.

QUESTION 4: If linear transformation $T:U\mapsto V$ is invertible, and the dimension of $U$ is 2, what is the dimension of $V$? Why?

Put your answer to the above question here.

4. Inverse of a Matrix¶

Recall the four fundamental subspaces of a $m\times n$ matrix $A$
- The rowspace and nullspace of $A$ in $R^n$
- The columnspace and the nullspace of $A^\top$ in $R^m$

The two-sided inverse gives us the following $$ {A}{A}^{-1}=I={A}^{-1}{A} $$
- For this we need $r = m = n$, here $r$ is the rank of the matrix.

For a left-inverse, we have the following
- Full column rank, with $r = n \leq m$ (but possibly more rows)
- The nullspace contains just the zero vector (columns are independent)
- The rows might not all be independent
- We thus have either no or only a single solution to $Ax=b$.
- $A^\top $ will now also have full row rank
- From $(A^\top A)^{-1}A^\top A = I$ follows the fact that $(A^\top A)^{-1}A^\top$ is a left-sided inverse
- Note that $(A^\top A)^{-1}A^\top$ is a $n\times m$ matrix and $A$ is of size $m\times n$, theire mulitiplication $(A^\top A)^{-1}A^\top A$ results in a $n\times n$ identity matrix
- The $A(A^\top A)^{-1}A^\top$ is a $m\times m$ matrix. BUT $A(A^\top A)^{-1}A^\top\neq I$ if $m\neq n$. The matrix $A(A^\top A)^{-1}A^\top$ is the projection matrix onto the column space of $A$.

QUESTION 5: What is the projection matrix that projects any vector onto the subspace spanned by $[1,2]^\top$. (What matrix will give the same result as projecting any point onto the vector $[1,2]^\top$.)

Put your answer to the above question here.

QUESTION 6: If $m=n$, is the left inverse the same as the inverse?

Put your answer to the above question here.

Theorem: For a matrix $A$ with $r=n<m$, the columnspace of $A$ has dimension $r(=n)$. The linear transfrom $A: R^n\mapsto R^m$ is one-to-one. In addition, the linear transformation $A$ from $R^n$ to the columnspace of $A$ is one-to-one and onto (it means that for any element in the columnspace of $A$, we can find $x$ in $R^n$ such that it equals $Ax$.) Then the left inverse of $A$ is a one-to-one mapping from the columnspace of $A$ to $R^n$, and it can be considered as an inverse transform of $A$.

3. Assignment wrap-up¶

✅ Assignment-Specific QUESTION: There is no Assignment specific question for this notebook. You can just say “none”.