April 2022 On Campus Exam

Solutions to this exam are available here: FRTN65_examApr2022_solutions.pdf

A code solution to Problem 3 could look like this: solutioncode3.m

Use your anonymization code when marking papers or sending in your files.

You can hand in either hand-written solutions on paper or send in files to our administrator mika@control.lth.se, or use a mix of both, whatever is easiest for you. If mailing, send one single zip-file and name this file NR.zip where NR is your anonymization code.

Allowed aids: All material may be used, including online resources. You will need web access for Problems 2, 3 and 4. Communication with other persons, except exam supervisors, is not allowed!

The total is 50p. The limit for passing is 25p.

Good luck!

Problem 1 [10 points] Dimensional analysis

Background: The attraction force between a proton and an electron with charge $e\approx 1.6\cdot10^{-19}\textrm{ C}$ depends on Coulomb's constant $k \approx 9.0\cdot 10^9 \textrm{ kg}\textrm{ m}^3\textrm{ s}^{-2}\textrm{ C}^{-2}$. The formula for the binding energy $E$ of the electron in a hydrogen atom (i.e. the energy needed to kick the electron away from the proton) also involves the mass $m_{electron}\approx 9.1\cdot 10^{-31} \textrm{ kg}$, and because this is a quantum effect, Planck's constant $\hbar \approx 1.1\cdot 10^{-34} \textrm{ kg}\textrm{ m}^2\textrm{ s}^{-1}$ is also expected to appear.

a) Determine integer coefficients $a,b,c,d,f$ so that $m_{electron}^a k^b e^c \hbar^d E^f$ becomes a dimensionless variable. Your analysis should describe how you found $a,b,c,d,f$ from a linear equation system $Ax=0$. Hint: You can use null(A) in MATLAB to calculate solutions to $Ax=0$.
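The null-space computation mentioned in the hint can also be sketched in Python (a choice made here; the exam itself suggests MATLAB's null(A)). The matrix below is one possible setup, assuming one row per base unit (kg, m, s, C) and one column per quantity in the order $m_{electron}, k, e, \hbar, E$, with $E$ measured in joules, i.e. kg m^2 s^-2. The helper `null_space` is a hypothetical name for a small exact Gauss-Jordan routine, not a library function.

```python
from fractions import Fraction

# Exponent of each base unit (rows) in each quantity (columns).
# Columns: m_electron, k, e, hbar, E. Rows: kg, m, s, C.
A = [
    [1,  1, 0,  1,  1],   # kg
    [0,  3, 0,  2,  2],   # m
    [0, -2, 0, -1, -2],   # s
    [0, -2, 1,  0,  0],   # C
]


def null_space(matrix):
    """Return a basis of {x : matrix @ x = 0} via exact Gauss-Jordan elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        if r == len(rows):
            break
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [v / pivot for v in rows[r]]
        # Clear column c in every other row.
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [vi - factor * vr for vi, vr in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    free_cols = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for fc in free_cols:
        x = [Fraction(0)] * n_cols
        x[fc] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            x[pc] = -rows[row_idx][fc]
        basis.append(x)
    return basis


basis = null_space(A)
x = basis[0]
scale = Fraction(-1) / x[4]            # pick the solution with f = -1
exponents = [int(v * scale) for v in x]
print(exponents)                        # prints [1, 2, 4, -2, -1]
```

The null space is one-dimensional here, so the exponents are unique up to a common scale factor; fixing $f$ selects one representative.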

b) From the analysis, choosing the solution with $f=-1$, we expect a formula for the energy of the form $E=D \cdot m_{electron}^a k^b e^c \hbar^d$ for some dimensionless constant $D$. This also turns out to be correct, with the constant $D=\frac{1}{2}$. Calculate the binding energy $E$ for the electron.
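The numerical evaluation can be sketched as follows. The exponents $a=1, b=2, c=4, d=-2$ are an assumption of this sketch: they are the $f=-1$ solution of the dimension equations when the units are set up as in the background text. The constants are the rounded values given above.

```python
# Rounded constants from the problem statement (SI units).
m_electron = 9.1e-31   # kg
k = 9.0e9              # kg m^3 s^-2 C^-2
e = 1.6e-19            # C
hbar = 1.1e-34         # kg m^2 s^-1
D = 0.5                # dimensionless constant given in b)

# Assumed exponents: the f = -1 solution of the dimension equations.
a, b, c, d = 1, 2, 4, -2

E_joule = D * m_electron**a * k**b * e**c * hbar**d
E_ev = E_joule / e     # 1 eV equals e joules, with e in coulombs
print(f"E = {E_joule:.2e} J = {E_ev:.1f} eV")
```

With these rounded inputs the result is roughly $2.0\cdot10^{-18}$ J, about 12.5 eV. The commonly quoted value is 13.6 eV; the gap comes mainly from rounding $\hbar$ to $1.1\cdot10^{-34}$.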

Problem 2 [10 points] Supervised learning

This Google Colab notebook investigates a data set consisting of a few hundred penguins of 3 different species ('Adelie', 'Gentoo' or 'Chinstrap'). There are 6 features for the penguins: 4 of them are numerical and 2 are categorical. The goal is to find a good classifier that can determine the species from (some of) the features. To reduce future penguin measurement work, one wants to use few features. We will try to use only 2 instead of all 6.

Answer subproblems a,b,c,d described in the notebook. You do NOT have to hand in any code.

a) The first logistic regression does not work very well. Explain why.
b) Explain how the improvement is achieved.
c) Make a better choice of which 2 features to use for classification. Describe what accuracy you achieve.
d) We now use all 6 features. Describe what is done in the code and how the categorical variables have been used. Any idea why the LDA method seems to work slightly better than the logistic regression for this data set?

Problem 3 [12 points] System Identification Hands-on

The file problem3.mat contains input u and output y for a SISO system with sample rate h=0.01s. Identify a discrete time linear model of the system. Aim for using few model parameters. Be sure to describe your methodology, including outlier analysis, choice of suitable model structure and model order, and include model validation with residual analysis. Also hand in your MATLAB code.

Hint: Useful commands might include help ident, systemIdentification, arx, oe, armax, bj, present, compare, resid, bodeplot, pzmap, ...
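To illustrate what a command such as arx does conceptually, here is a minimal Python sketch of a first-order ARX least-squares fit followed by a simple residual check. Everything in it is an assumption made for illustration: the data are synthetic (not problem3.mat), the model order is fixed to one, and `simulate_arx`, `fit_arx_first_order` and `lag1_autocorr` are hypothetical helper names. It is not the exam's intended solution, which should be done in MATLAB on the supplied data.

```python
import random


def simulate_arx(n, a=-0.8, b=0.5, noise_std=0.05, seed=0):
    """Simulate y[t] + a*y[t-1] = b*u[t-1] + e[t] with a white-noise input."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [0.0] * n
    for t in range(1, n):
        y[t] = -a * y[t - 1] + b * u[t - 1] + rng.gauss(0.0, noise_std)
    return u, y


def fit_arx_first_order(u, y):
    """Least-squares estimate of (a, b) using regressors [-y[t-1], u[t-1]]."""
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(1, len(y)):
        p1, p2 = -y[t - 1], u[t - 1]
        s11 += p1 * p1
        s12 += p1 * p2
        s22 += p2 * p2
        r1 += p1 * y[t]
        r2 += p2 * y[t]
    det = s11 * s22 - s12 * s12          # solve the 2x2 normal equations
    a_hat = (s22 * r1 - s12 * r2) / det
    b_hat = (s11 * r2 - s12 * r1) / det
    return a_hat, b_hat


def lag1_autocorr(x):
    """Normalized lag-1 autocorrelation; near zero for white residuals."""
    mean = sum(x) / len(x)
    c0 = sum((v - mean) ** 2 for v in x)
    c1 = sum((x[i] - mean) * (x[i - 1] - mean) for i in range(1, len(x)))
    return c1 / c0


u, y = simulate_arx(2000)
a_hat, b_hat = fit_arx_first_order(u, y)
residuals = [y[t] + a_hat * y[t - 1] - b_hat * u[t - 1] for t in range(1, len(y))]
rho1 = lag1_autocorr(residuals)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}, residual lag-1 autocorrelation = {rho1:.3f}")
```

On real data the same pattern applies after removing outliers: fit on one part of the record, then check on a separate validation part that the residuals look white and are uncorrelated with the input.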

Problem 4 [10 points] Causal inference and DAGs

a) Draw the DAG corresponding to the following structural causal model. Be careful to get it right, since you will use the DAG in problems b)-i).
$$\begin{aligned}
S &:= n_s\\
T &:= S+n_t\\
Z &:= n_z\\
V &:= 0.5Z+n_v\\
X &:= Z+n_x\\
Y &:= 1.5V+X+n_y\\
W &:= 0.6S+X+2Y+n_w\\
U &:= W+n_u
\end{aligned}$$
where $n_s,\ldots,n_u \sim N(0,1)$ are independent random variables.

The next task is to draw conclusions about statements such as "T is independent of Y" (which is true) or "T is independent of Y given W" (which is false), using the DAG and this Google Colab code.
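The two example statements can be checked numerically with a sketch like the one below. It is not the linked Colab code; it is a stand-alone illustration that simulates the structural causal model above and uses the fact that, for linear models with Gaussian noise, (conditional) independence corresponds to zero (partial) correlation. The helper names `simulate`, `corr` and `partial_corr` are hypothetical.

```python
import math
import random


def simulate(n, seed=0):
    """Draw n samples from the structural causal model in a)."""
    rng = random.Random(seed)
    names = "STZVXYWU"
    samples = {name: [] for name in names}
    for _ in range(n):
        S = rng.gauss(0, 1)
        T = S + rng.gauss(0, 1)
        Z = rng.gauss(0, 1)
        V = 0.5 * Z + rng.gauss(0, 1)
        X = Z + rng.gauss(0, 1)
        Y = 1.5 * V + X + rng.gauss(0, 1)
        W = 0.6 * S + X + 2 * Y + rng.gauss(0, 1)
        U = W + rng.gauss(0, 1)
        for name, value in zip(names, (S, T, Z, V, X, Y, W, U)):
            samples[name].append(value)
    return samples


def corr(x, y):
    """Sample Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after conditioning on a single variable z."""
    rxy, rxz, ryz = corr(x, y), corr(x, z), corr(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz**2) * (1 - ryz**2))


data = simulate(20000)
r_ty = corr(data["T"], data["Y"])
r_ty_given_w = partial_corr(data["T"], data["Y"], data["W"])
print(f"corr(T, Y) = {r_ty:.3f}")                 # close to zero
print(f"corr(T, Y | W) = {r_ty_given_w:.3f}")     # clearly nonzero
```

Conditioning on W opens the path between T and Y because W is a common descendant (a collider) of S and Y, which is why the partial correlation is no longer zero. Statements that condition on two variables need a more general partial-correlation routine than this single-variable formula.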

Are the following statements true or false? Motivate the answers.

b) T and W are independent
c) Y is independent of Z given X and V
d) S is independent of Y given U
e) T and U are independent
f) X and S are independent
g) X is independent of U given W
h) X is independent of V given Z
i) X is independent of V given Z and U

Correct answers give +1 and wrong answers -1. You can skip subproblems if you want.

Problem 5 [8 points]

A certain genotype can take three different forms: AA, Aa and aa. Assume these three cases occur with probability $\theta^2$, $2\theta(1-\theta)$ and $(1-\theta)^2$ respectively. To estimate $\theta$ you make $N$ independent observations, getting $n_1, n_2, n_3$ samples from each category (so $N=n_1+n_2+n_3$).

a) Find a lower bound on the achievable error variance $E(\hat \theta-\theta_{true})^2$ for any unbiased estimator $\hat\theta$.

b) Find an optimal estimator achieving this lower bound, or prove that no such estimator exists.
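A candidate answer can be sanity-checked by simulation. The sketch below makes two assumptions that are not stated in the exam text: it takes the allele-counting estimator $\hat\theta=(2n_1+n_2)/(2N)$ as the candidate, and it compares its mean squared error against $\theta(1-\theta)/(2N)$, which is what the Cramér-Rao bound evaluates to for this model. The values $\theta=0.3$ and $N=50$ and the function names are arbitrary choices for illustration.

```python
import random


def estimate_theta(n1, n2, n3):
    """Candidate estimator: fraction of 'A' alleles among the 2N observed alleles."""
    return (2 * n1 + n2) / (2 * (n1 + n2 + n3))


def simulate_mse(theta, N, trials, seed=0):
    """Monte Carlo mean and mean squared error of the candidate estimator."""
    rng = random.Random(seed)
    weights = [theta**2, 2 * theta * (1 - theta), (1 - theta) ** 2]
    estimates = []
    for _ in range(trials):
        draws = rng.choices(["AA", "Aa", "aa"], weights=weights, k=N)
        n1, n2, n3 = draws.count("AA"), draws.count("Aa"), draws.count("aa")
        estimates.append(estimate_theta(n1, n2, n3))
    mean = sum(estimates) / trials
    mse = sum((t - theta) ** 2 for t in estimates) / trials
    return mean, mse


theta, N = 0.3, 50
mean, mse = simulate_mse(theta, N, trials=20000)
bound = theta * (1 - theta) / (2 * N)   # assumed closed form of the bound
print(f"mean estimate = {mean:.4f}, MSE = {mse:.5f}, bound = {bound:.5f}")
```

A simulation like this only supports a conjecture; the exam still asks for an analytical argument that the bound is attained or cannot be attained.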