Exercise 4
Exercise 4.1 Boosting on sonar data
Run the notebook ex04_boosting.ipynb, where you will tweak some code that implements boosting with a weight vector and a simple loop. Then compare it to sklearn's AdaBoost implementation and see how well you got it to work; you should be able to reach similar performance with rather simple means.
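For orientation, here is a minimal sketch of such a weight-vector loop (an AdaBoost-style sketch using sklearn decision stumps; the notebook's actual code, data loading and variable names will differ):

```python
# A minimal boosting loop with an explicit sample-weight vector.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost(X, y, n_rounds=50):
    """X: (n, d) features, y: labels in {-1, +1}."""
    n = len(y)
    w = np.ones(n) / n                     # start with uniform weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)   # weak learner on weighted data
        pred = stump.predict(X)
        err = np.sum(w * (pred != y))      # weighted training error
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))
        w *= np.exp(-alpha * y * pred)     # up-weight misclassified points
        w /= w.sum()                       # renormalise
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def predict(stumps, alphas, X):
    # Final classifier: weighted majority vote of the weak learners.
    score = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(score)
```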
Exercise 4.2 Singular Value Decomposition
a) Prove that if $U$ is a unitary matrix and $B = UA$, then $\|B\|_F = \|A\|_F$, i.e. the sum of squares of the matrix elements is the same for $A$ and $B$ ("Unitary matrices don't change lengths").
b) Given a matrix $A$, one way of finding the $U$, $S$ and $V$ in the SVD decomposition $A = USV^T$ is to compute the eigenvectors and eigenvalues of the two matrices $AA^T$ and $A^TA$. Explain why, and how $U$, $S$ and $V$ can be obtained from this information. (But it is not the best method.)
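As a starting point, plugging $A = USV^T$ (with $U$, $V$ unitary) into the two products gives
$$AA^T = USV^T V S^T U^T = U(SS^T)U^T, \qquad A^TA = VS^T U^T U S V^T = V(S^TS)V^T,$$
so the columns of $U$ and $V$ are eigenvectors of $AA^T$ and $A^TA$ respectively, with the squared singular values as eigenvalues.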
c) Use an SVD to rewrite the normal equation $A^TAx = A^Tb$, and solve for $x$. Show that the solution (if $A^TA$ is invertible) is given by $x = VS^{-1}U^Tb$.
d) Same problem but with Tikhonov regularisation, for which the normal equation becomes $(A^TA + \lambda I)x = A^Tb$.
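A small numerical check of c) and d) may help (a sketch; the test matrix and the value of $\lambda$ are arbitrary, chosen only for illustration):

```python
# Sketch: least squares and Tikhonov regularisation via the thin SVD.
# The closed forms x = V S^{-1} U^T b and x = V (S^2 + lam I)^{-1} S U^T b
# are compared against direct solves of the normal equations.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 5))
b = rng.normal(size=20)
lam = 0.1

U, s, Vt = np.linalg.svd(A, full_matrices=False)  # thin SVD: A = U diag(s) Vt

# c) ordinary least squares: x = V S^{-1} U^T b
x_ls = Vt.T @ ((U.T @ b) / s)
assert np.allclose(x_ls, np.linalg.solve(A.T @ A, A.T @ b))

# d) Tikhonov: (A^T A + lam I) x = A^T b  =>  x = V diag(s/(s^2+lam)) U^T b
x_tik = Vt.T @ ((s / (s**2 + lam)) * (U.T @ b))
assert np.allclose(x_tik, np.linalg.solve(A.T @ A + lam * np.eye(5), A.T @ b))
```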
e) In the lecture we introduced the projections $UU^T$ and $VV^T$. Prove the claims stated on slides 37-38 (e.g. that these are orthogonal projections).
f) The SVD can be used to approximate a matrix with a matrix of lower rank. Write MATLAB or Python code that takes an image, represented as a matrix $A$ of pixel intensities, and calculates an optimal rank-$k$ approximation $A_k$. Try different values of $k$ and plot the images $A_k$.
Hint: $A_k = U_k S_k V_k^T$, where $U_k$, $S_k$ and $V_k$ are the parts corresponding to the $k$ largest singular values.
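A possible Python sketch (the sample image from scipy.datasets is just an example and needs the optional pooch package; any grayscale matrix works):

```python
# Sketch: optimal rank-k approximation of an image via truncated SVD
# (Eckart-Young). Any 2-D intensity matrix can replace the sample image.
import numpy as np
import matplotlib.pyplot as plt
from scipy import datasets  # requires the optional "pooch" package

A = datasets.ascent().astype(float)              # 512 x 512 grayscale image
U, s, Vt = np.linalg.svd(A, full_matrices=False)

for i, k in enumerate([5, 20, 50, 200], start=1):
    Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # keep k largest singular values
    plt.subplot(2, 2, i)
    plt.imshow(Ak, cmap="gray")
    plt.title(f"rank {k}")
    plt.axis("off")
plt.show()
```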
Exercise 4.3 K-means and precoding
Investigate the code ex04_kmeans.ipynb, which uses the following idea: first it clusters the data vectors of length 64, corresponding to flattened 8x8 images of the digits 0-9, into 10 clusters. Then it represents each image as a vector of 10 coordinates corresponding to the distances to the 10 cluster centers, so each image is now represented by a vector of length 10. Finally it compares the performance of a logistic regression classifier using these 10 numbers to one using the original 64 numbers.
(Extra: you can plot the 10 cluster centers to get visual intuition for what the k-means algorithm is doing.)
Unfortunately, the performance gets worse. But the idea can actually be made to work: try to tune the algorithm so that the precoding gives a positive improvement. You should be able to get about a 1% average improvement, possibly even 2% (the basic setup is sketched below).
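A minimal sketch of the distance-to-centers encoding on sklearn's 8x8 digits (the notebook's own code and data splits will differ; note that `KMeans.transform` returns exactly these distances, and that the number of clusters need not equal the number of classes):

```python
# Sketch of the precoding idea: represent each 64-pixel digit by its
# distances to k cluster centers, then classify on those k numbers.
from sklearn.datasets import load_digits
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)          # 1797 flattened 8x8 images
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for k in [10, 100]:                          # try more clusters than classes
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_tr)
    Z_tr, Z_te = km.transform(X_tr), km.transform(X_te)  # distances to centers
    clf = LogisticRegression(max_iter=5000).fit(Z_tr, y_tr)
    print(k, clf.score(Z_te, y_te))

# Baseline on the raw 64 pixels for comparison.
base = LogisticRegression(max_iter=5000).fit(X_tr, y_tr)
print("raw", base.score(X_te, y_te))
```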
Exercise 4.4 K-means
Implement your own K-means algorithm (in a language of your choice). Try it on a data set of your choice.
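For reference, a minimal NumPy skeleton of the algorithm (a sketch, not the only way to do it; the initialisation and empty-cluster handling are deliberately simple):

```python
# Minimal K-means skeleton: alternate between assigning points to the
# nearest center and moving each center to the mean of its points.
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # random init
    for _ in range(n_iter):
        # Assignment step: index of the nearest center for every point.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Update step: each center becomes the mean of its assigned points
        # (empty clusters keep their old center).
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):   # converged
            break
        centers = new
    return centers, labels
```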
Exercise 4.5 LDA
Prove that the decision regions in LDA are bounded by linear functions.
Hint: The decision boundary between two classes is most easily found by using the fact that the LDA classifier chooses the class with the highest posterior probability, which according to Bayes' rule is proportional to the likelihood times the prior $\pi_k$ (the probability of data belonging to class $k$):
$$p(k \mid x) \propto p(x \mid k)\,\pi_k.$$
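One way to proceed (a sketch, using the standard LDA assumption of Gaussian class densities with a shared covariance $\Sigma$): for class $k$,
$$\log\big(p(x \mid k)\,\pi_k\big) = -\tfrac{1}{2}(x-\mu_k)^T \Sigma^{-1} (x-\mu_k) + \log \pi_k + \text{const},$$
and the quadratic term $x^T \Sigma^{-1} x$ is the same for every class, so it cancels when two classes are compared; what remains on the boundary is an equation linear in $x$.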
Exercise 4.6 PCA and MNIST
Investigate the code ex04_pcaMNIST.ipynb.
a) How far can you compress the images before classifier performance deteriorates?
b) In terms of representing as much variance as possible with as few components as possible, was it a good idea to apply the standard scaler to each pixel? Try it without.
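A minimal sweep for a), using sklearn's small digits set as a stand-in for the notebook's data (the component grid is arbitrary, chosen only for illustration):

```python
# Sketch for a): sweep the number of PCA components and watch when
# classification accuracy starts to drop.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for n in [2, 5, 10, 20, 40, 64]:
    pipe = make_pipeline(PCA(n_components=n),
                         LogisticRegression(max_iter=5000))
    pipe.fit(X_tr, y_tr)
    print(n, pipe.score(X_te, y_te))
```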
Solutions: sol4new.pdf
Hint for Exercise 4.3: try n_clusters=100.