Exercise 4
- Due: No due date
- Points: None
Exercise 4.1 Boosting on sonar data
Run the notebook ex6_boosting.ipynb, where you will tweak some code that implements boosting with a weight vector and a simple loop. Then compare it to sklearn's AdaBoost implementation and see how well you got it to work; you should be able to reach similar performance with rather simple means.
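For orientation, here is a minimal sketch of the weight-vector-and-loop idea (not the notebook's code; the function names are mine, and it assumes binary labels encoded as -1/+1, which the sonar labels would first be mapped to):

```python
# Minimal AdaBoost with decision stumps, as a sketch of the idea in the
# notebook. Assumes labels y are in {-1, +1}.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def simple_adaboost(X, y, n_rounds=50):
    n = len(y)
    w = np.full(n, 1.0 / n)                    # start with uniform weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.clip(w @ (pred != y), 1e-10, 1 - 1e-10)  # weighted error
        alpha = 0.5 * np.log((1 - err) / err)  # weight of this weak learner
        w *= np.exp(-alpha * y * pred)         # upweight misclassified points
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(X, stumps, alphas):
    return np.sign(sum(a * s.predict(X) for s, a in zip(stumps, alphas)))
```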
Exercise 4.2 Singular Value Decomposition
a) Prove that if $U$ is a unitary matrix and $B = UA$, then $\|B\|_F = \|A\|_F$, i.e. the sum of squares of the matrix elements is the same for $A$ and $B$ ("Unitary matrices don't change lengths").
b) Given a matrix $A$, one way of finding the $U$, $S$ and $V$ in the SVD decomposition $A = USV^T$ is to compute the eigenvectors and eigenvalues of the two matrices $A^TA$ and $AA^T$. Explain why, and how $S$ can be obtained from this information. (But it is not the best method.)
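A quick numerical sanity check of the relationship (my own illustration, not part of the exercise):

```python
# The eigenvalues of A^T A are the squared singular values of A, and its
# eigenvectors are the columns of V (up to sign).
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
eigvals, eigvecs = np.linalg.eigh(A.T @ A)      # eigh sorts ascending

print(np.allclose(s, np.sqrt(eigvals[::-1])))   # True
```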
c) Use an SVD to rewrite the normal equation $A^TAx = A^Tb$ and solve for $x$. Show that the solution (if $A^TA$ is invertible) is given by $x = VS^{-1}U^Tb$.
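A sketch of the algebra, assuming the economy SVD $A = USV^T$ with $S$ square, which is invertible precisely when $A^TA$ is:

$$A^TAx = A^Tb \;\Longrightarrow\; VS^2V^Tx = VSU^Tb \;\Longrightarrow\; x = VS^{-2}V^T \cdot VSU^Tb = VS^{-1}U^Tb,$$

using $U^TU = I$ and $V^TV = VV^T = I$.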
d) Same problem, but with Tikhonov regularisation, for which the normal equation becomes $(A^TA + \lambda I)x = A^Tb$.
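The same substitution as in c) then gives, as a sketch:

$$(A^TA + \lambda I)x = A^Tb \;\Longrightarrow\; V(S^2 + \lambda I)V^Tx = VSU^Tb \;\Longrightarrow\; x = V(S^2 + \lambda I)^{-1}SU^Tb.$$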
e) In the lecture we introduced the projections $P$ and $P^\perp$. Prove the claims stated on slides 37-38 (i.e. that $P^2 = P$ etc.).
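The slides themselves are not reproduced here, but if $P = UU^T$ is the usual projection onto the column space (with $U$ having orthonormal columns) and $P^\perp = I - P$, the idempotence claims follow in one line:

$$P^2 = UU^TUU^T = U(U^TU)U^T = UU^T = P, \qquad (P^\perp)^2 = (I - P)^2 = I - 2P + P^2 = I - P = P^\perp.$$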
f) The SVD can be used to approximate a matrix with a matrix of lower rank. Write Matlab or Python code that takes an image, represented as a matrix $A$ of pixel intensities, and calculates an optimal rank-$k$ approximation $A_k$. Try different values of $k$ and plot the images $A_k$.
Hint: $A_k = U_k S_k V_k^T$, where $U_k$, $S_k$ and $V_k$ are the parts corresponding to the $k$ largest singular values.
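A minimal Python sketch of f) (the course also provides svd_rank_k_approx.m below; the file name "image.png" is a placeholder):

```python
# Best rank-k approximation of an image matrix via the SVD (Eckart-Young).
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.image import imread

def rank_k_approx(A, k):
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

A = imread("image.png")[..., :3].mean(axis=2)  # assumes an RGB(A) image file

for i, k in enumerate([5, 20, 50], start=1):
    plt.subplot(1, 3, i)
    plt.imshow(rank_k_approx(A, k), cmap="gray")
    plt.title(f"k = {k}")
plt.show()
```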
Exercise 4.3 K-means and precoding
Investigate the code kmeans.ipynb, which uses the following idea: first it clusters the data vectors of length 64, corresponding to flattened 8x8 images of the digits 0-9, into 10 clusters. Then it represents each image as a vector of 10 coordinates, namely the distances to the 10 cluster centers, so each image is now represented by a vector of length 10. It then compares the performance of a logistic regression classifier on these 10 numbers to that of a classifier using the original 64 numbers.
Unfortunately, the performance turned out to be worse. But the idea can actually be made to work: try to tune the algorithm so that a positive improvement is obtained. You should be able to get about a 1% average improvement, possibly even 2%.
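A sketch of how such an experiment can be set up, assuming sklearn's digits data as in the notebook (variable names are mine; n_clusters=100 follows the hint given with the solutions below):

```python
# Precoding: represent each 64-pixel image by its distances to k-means
# cluster centers, then classify with logistic regression.
from sklearn.datasets import load_digits
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)            # flattened 8x8 digit images
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

km = KMeans(n_clusters=100, n_init=10, random_state=0).fit(X_tr)
Z_tr, Z_te = km.transform(X_tr), km.transform(X_te)   # distances to centers

raw = LogisticRegression(max_iter=5000).fit(X_tr, y_tr)
pre = LogisticRegression(max_iter=5000).fit(Z_tr, y_tr)
print("raw 64 features:      ", raw.score(X_te, y_te))
print("100 cluster distances:", pre.score(Z_te, y_te))
```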
Exercise 4.4 K-means
Implement your own K-means algorithm (in a language of your choice). Try it on a data set of your choice.
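A minimal NumPy sketch of one possible implementation (Lloyd's algorithm; the function name and interface are mine):

```python
import numpy as np

def my_kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # random init
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center moves to the mean of its assigned points.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)])
        if np.allclose(new_centers, centers):
            break                                            # converged
        centers = new_centers
    return labels, centers
```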
Exercise 4.5 LDA
Prove that the decision regions in LDA are bounded by linear functions.
Hint: The decision region between two classes is most easily determined from the fact that the LDA classifier chooses the class $k$ giving the largest log-likelihood $\log \pi_k - \frac{1}{2}(x-\mu_k)^T\Sigma^{-1}(x-\mu_k)$ (up to additive terms common to all classes).
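One way to see the linearity, assuming the usual LDA model of Gaussian class densities with a shared covariance $\Sigma$: on the boundary between classes $k$ and $l$ the two log-likelihoods are equal, and the quadratic term $x^T\Sigma^{-1}x$ cancels, leaving

$$x^T\Sigma^{-1}(\mu_k - \mu_l) = \tfrac{1}{2}\big(\mu_k^T\Sigma^{-1}\mu_k - \mu_l^T\Sigma^{-1}\mu_l\big) + \log\frac{\pi_l}{\pi_k},$$

which is linear in $x$.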
Exercise 4.6 PCA and MNIST
Investigate the code lec6pcaMNIST.ipynb.
a) How far can you compress the images before classifier performance deteriorates?
b) Was it a good idea to apply the standard scaler to each pixel? Try without.
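A sketch of the kind of sweep a) asks for, using sklearn's digits data as a stand-in (the actual notebook is not reproduced here, and its loading and scaling may differ):

```python
# Sweep the number of PCA components and watch test accuracy.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for n in [5, 10, 20, 40, 64]:
    clf = make_pipeline(StandardScaler(), PCA(n_components=n),
                        LogisticRegression(max_iter=5000))
    clf.fit(X_tr, y_tr)
    print(f"{n:3d} components: test accuracy {clf.score(X_te, y_te):.3f}")
```

For b), drop the StandardScaler step from the pipeline and compare.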
We will also talk more about lab1.
Solutions
- sol4new.pdf
- boosting_answer.ipynb (exercise 4.1)
- svd_rank_k_approx.m (exercise 4.2f)
- exercise 4.3: try n_clusters=100
- my_kmeans.ipynb (exercise 4.4)