January 2021 Take Home Exam

This was the exam given for the 2020 course. Note that in 2021 we skipped the "differentiability index", so problems like Problem 1 will not be given. On the other hand, in 2021 we covered dimensionless variables and the Buckingham Pi theorem (Lecture 7), which were not included in 2020 and are therefore not present on the exam below.

Solutions to this exam are available here: exam2021Jan_solutions.pdf

---------------------------------------------------

You are not allowed to discuss the exam with anyone other than bo.bernhardsson@control.lth.se

The maximum score on the exam is 50 points. The time limit is 48 hours.

Good luck.

Problem 1 [3 points] DAE

Consider the following differential algebraic equation

$\begin{align*} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\dot z + \begin{bmatrix} 0& -2 \\ 1 & 2 \end{bmatrix} z &= \begin{bmatrix} 0 \\ 1 \end{bmatrix}u \\ y&= z_1 + z_2 \end{align*}$

a) Write the system in the "standard form I" (Hint: Use $w_1=z_1$ and $w_2=z_1+2z_2$).
b) What is the index of the system?

Problem 2 [10 points] System identification - theory

Consider the following "true" system

$y(t) + 0.5y(t-1) = 0.4u(t) + v(t)$

where $v(t)$ is zero-mean white noise with variance 2. The input $u$ is zero-mean white noise with variance 1 and is uncorrelated with $v$.

Assume we fit ARX models

$y(t) + a_1y(t-1) + \ldots + a_{n_a}y(t-n_a) = b_1u(t)+\ldots + b_{n_b}u(t-n_b+1) + e(t)$

using standard least squares, i.e. minimizing the prediction error

$V_N(\theta) = \sum_{t=1}^N (y(t)-\hat y(t|\theta))^2$

a) Assume that $n_a=n_b=1$. To what values do the estimates of $a_1$ and $b_1$ converge when $N\to\infty$? What is the variance of these estimates as a function of $N$?

b) Assume instead that $n_a=2$ and $n_b=1$. To what values do the estimates of $a_1,a_2$ and $b_1$ converge when $N\to\infty$? What is the variance of these estimates as a function of $N$?

c) What can you say in general about the estimated parameter values when we vary $n_a\geq 1, n_b\geq 1$ for this system? Can you also guess what will happen to the parameter error variances when $n_a$ and $n_b$ increase?

(Hint: You do not need to simulate anything. But it can of course be a good idea for verification of your calculations, if you have time for it.)
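For the verification mentioned in the hint, a minimal simulation sketch could look as follows. It assumes Python with NumPy, which is not part of the exam material; the function names `simulate_true_system` and `fit_arx`, the zero initial condition, and the sample size are illustrative assumptions.

```python
import numpy as np


def simulate_true_system(n_samples, rng):
    """Simulate y(t) + 0.5 y(t-1) = 0.4 u(t) + v(t)."""
    u = rng.normal(0.0, 1.0, n_samples)            # input, variance 1
    v = rng.normal(0.0, np.sqrt(2.0), n_samples)   # noise, variance 2
    y = np.zeros(n_samples)
    y[0] = 0.4 * u[0] + v[0]                       # assumes y(-1) = 0
    for t in range(1, n_samples):
        y[t] = -0.5 * y[t - 1] + 0.4 * u[t] + v[t]
    return u, y


def fit_arx(u, y, na, nb):
    """Least-squares ARX fit; returns (a_1..a_na), (b_1..b_nb)."""
    start = max(na, nb - 1)                        # first t with all regressors available
    n = len(y)
    cols = [-y[start - k : n - k] for k in range(1, na + 1)]   # -y(t-k)
    cols += [u[start - k : n - k] for k in range(0, nb)]       # u(t-k)
    phi = np.column_stack(cols)
    theta, *_ = np.linalg.lstsq(phi, y[start:], rcond=None)
    return theta[:na], theta[na:]


rng = np.random.default_rng(0)
u, y = simulate_true_system(50_000, rng)
a_hat, b_hat = fit_arx(u, y, na=1, nb=1)
print("a estimates:", a_hat, "b estimates:", b_hat)
```

Repeating the fit over many independent realizations and several values of $N$ gives an empirical check of how the estimate variances scale with $N$.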

Problem 3 [10 points] System Identification hands-on

Download the MATLAB file 2021jan_problem3.mat, which contains sampled signals $u$ and $y$ (sample time is $T_s=0.1$). In MATLAB, type load 2021jan_problem3.mat to load the data.

a) Construct an appropriate black-box model fitting the data, with the constraints that the total number of parameters is $\leq 8$ and that the fit to validation data is at least 80%. Report

plot of the fitted model vs validation data
parameter values and their uncertainty
residual plot
bode plot
poles and zero placement.

Discuss and comment on your choices and results.

b) To which of the following Bode plots is your model most similar?

problem3fig

(Hint: Useful commands might include help ident, systemIdentification, arx, oe, armax, bj, present, compare, resid, bodeplot, pzmap, ...)
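The fit requirement in a) can be made concrete with a small sketch. It assumes that the fit percentage is the normalized root-mean-square measure of the kind reported by MATLAB's compare, and it is written in Python with NumPy for illustration only. The helper names `fit_percent` and `split_half`, the half/half split, and the sine-wave stand-in signal are assumptions, not part of the exam data.

```python
import numpy as np


def fit_percent(y, y_hat):
    """Fit in percent: 100 * (1 - ||y - y_hat|| / ||y - mean(y)||)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return 100.0 * (1.0 - np.linalg.norm(y - y_hat) / np.linalg.norm(y - y.mean()))


def split_half(u, y):
    """Split data into estimation (first half) and validation (second half)."""
    half = len(y) // 2
    return (u[:half], y[:half]), (u[half:], y[half:])


# Hypothetical stand-in signals with sample time 0.1 (not the exam data).
t = np.arange(0.0, 20.0, 0.1)
u_demo = np.cos(t)
y_demo = np.sin(t)
(u_est, y_est), (u_val, y_val) = split_half(u_demo, y_demo)

perfect = fit_percent(y_val, y_val)                               # model output equals data
mean_only = fit_percent(y_val, np.full_like(y_val, y_val.mean())) # model predicts the mean
print(perfect, mean_only)
```

With this measure, a model that only predicts the mean of the validation output scores 0% and a perfect model scores 100%, so the 80% requirement is relative to the mean predictor.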

Problem 4 [10 points] Supervised Learning - EEG task

EEG task

The problem is described on this page.

Problem 5 [10 points] Causality and DAGs

Consider the following structural causal model

$\begin{align*} V&:= N_V \\ X &:= 4V+N_X \\ Y &:= -X + N_Y \\ Z &:= \alpha X + N_Z \\ W &:= -4V+2Y+2Z + N_W \end{align*}$

with independent Gaussian random variables $N_V,N_X,N_Y,N_Z,N_W \sim N(0,1)$.

a) Draw the graph corresponding to the SCM

b) Set $\alpha = 2$ and simulate 10000 data points from the joint distribution. Plot the values of $W$ versus $X$ to visualize the distribution $P(W | X)$. If $X=3$, what is a good guess of $W$?

c) Still using $\alpha = 2$, simulate 10000 data points from the intervention distribution $P(W | do(X:=x))$ obtained by changing the equation for $X$ to $X:=3$. What is a good guess of $W$ after the intervention $X:=3$?

d) Describe how you can estimate the causal influence from $X$ to $W$,

$\frac{\partial }{\partial x}E^{\mathrm{do}(X:=x)}[W]$,

from a large amount of data $V,X,Y,Z,W$. (I.e., you know the structure of the graph and that the SCM is linear with Gaussian variables, but you do not know the actual coefficients in the equations.)

e) A directed path from one node to another does not necessarily imply that the former node has a causal effect on the latter. Find a value of $\alpha$ so that $X$ has no causal effect on $W$.

Hint: You might find code from Lecture 8 useful.
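Independently of the Lecture 8 code, the simulations in b) and c) can be sketched as follows. This is one possible setup in Python with NumPy, not the course's own code. The function name `simulate_scm` and its `do_x` argument are illustrative assumptions, and the plot of $W$ versus $X$ is left out. The last lines show one possible approach to d): a linear regression of $W$ on $X$ that also includes $V$, which is a common cause of $X$ and $W$ in this graph.

```python
import numpy as np


def simulate_scm(n, alpha, rng, do_x=None):
    """Sample (V, X, Y, Z, W) from the SCM.

    If do_x is given, the equation for X is replaced by X := do_x.
    """
    n_v, n_x, n_y, n_z, n_w = rng.standard_normal((5, n))
    v = n_v
    x = 4.0 * v + n_x if do_x is None else np.full(n, float(do_x))
    y = -x + n_y
    z = alpha * x + n_z
    w = -4.0 * v + 2.0 * y + 2.0 * z + n_w
    return v, x, y, z, w


rng = np.random.default_rng(0)

# b) Observational data: straight-line fit of W on X, evaluated at X = 3.
v, x, y, z, w = simulate_scm(10_000, alpha=2.0, rng=rng)
slope, intercept = np.polyfit(x, w, 1)
w_given_x3 = slope * 3.0 + intercept

# c) Interventional data: X := 3, then average W.
_, _, _, _, w_do = simulate_scm(10_000, alpha=2.0, rng=rng, do_x=3.0)
w_do_x3 = w_do.mean()

# d) One possible estimate: regress W on X and V (plus a constant) and read
#    off the coefficient of X. Y and Z are not included as regressors.
regressors = np.column_stack([x, v, np.ones_like(x)])
coef, *_ = np.linalg.lstsq(regressors, w, rcond=None)
effect_hat = coef[0]

print("guess of W given X=3:   ", w_given_x3)
print("guess of W after do(X:=3):", w_do_x3)
print("estimated causal effect: ", effect_hat)
```

Rerunning the sketch with other values of `alpha` shows how the estimated effect depends on $\alpha$, which can be used to check an answer to e).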

Problem 6 [7 points] Bayesian Estimation and parameter accuracy

(Note: There is no actual data that you will need for solving this problem.)

Background: Several researchers have tried to estimate the efficiency of different interventions to reduce the spread of covid-19. Some influential reports from Imperial College in the UK were released during the spring, and the following article was later published (you will not have to read it!)

[1] Flaxman et al, Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe, Nature, June 2020

The statistical analysis and conclusions in the paper were later criticized by several researchers, and the following paper describes some of the concerns (you will not have to read that either!)

[2] Soltesz et al, Matters arising: The effect of interventions on COVID-19, Nature, Dec 2020

The analysis in [1] aims at describing how the so-called reproduction number $R(t)$ is impacted by different interventions. The following 5 interventions were studied:

1. "Social distancing encouraged"
2. "Self isolation"
3. "School closure"
4. "Public events banned"
5. "Complete lockdown"

The model assumption in [1] can be described by the following linear equation

$\begin{equation} y(t) = \alpha_0 + \alpha_1 x_1(t) + \alpha_2 x_2(t) + \alpha_3 x_3(t) + \alpha_4 x_4(t) + \alpha_5 x_5(t) + e(t) \qquad(1) \end{equation}$

where for day $t=1,2,\ldots$ (counted from what was considered the starting day of the pandemic) one defines $y(t) = \log(R(t))$ and $x_i(t)=1$ if intervention number $i$ was active that day and $x_i(t)=0$ if it was inactive. Here $\alpha_0 = \log(R_0)$, where $R_0$ is the reproduction number one would get without any interventions, such as during the first days of the pandemic.

In practice, the output $y(t)$ is unknown and must be estimated from e.g. death rates, hospitalization statistics, or covid PCR tests. In this assignment we will make the optimistic assumption that $y(t)$ is known (within some error which can be included in $e(t)$).

a) Describe how one can estimate the coefficients $\alpha_0, \ldots, \alpha_5$ from data $y(t)$ and $x_i(t)$ using least squares regression.

b) What problem will you get if two different interventions, say 4 and 5, were introduced the same day so that one had $x_4(t) = x_5(t)$ for all $t$?
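Since no actual data is needed for this problem, the setup in a) and b) can be explored on synthetic data. The sketch below assumes Python with NumPy; the intervention start days, the "true" coefficients, and the noise level are all hypothetical values chosen for illustration and are not taken from [1].

```python
import numpy as np

rng = np.random.default_rng(0)
days = np.arange(1, 71)                       # t = 1, ..., 70

# Hypothetical start days of the five interventions (not from [1]).
start_days = [10, 15, 20, 25, 30]
x_int = np.column_stack([(days >= s).astype(float) for s in start_days])

# Regressor matrix with columns 1, x_1(t), ..., x_5(t).
phi = np.column_stack([np.ones(len(days)), x_int])

# Hypothetical "true" coefficients alpha_0, ..., alpha_5 and noise level.
alpha_true = np.array([1.2, -0.1, -0.2, -0.3, -0.2, -0.5])
y = phi @ alpha_true + 0.05 * rng.standard_normal(len(days))

# a) Least squares estimate of alpha_0, ..., alpha_5.
alpha_hat, *_ = np.linalg.lstsq(phi, y, rcond=None)

# b) Same regressors, but with x_5(t) = x_4(t) for all t.
phi_dup = phi.copy()
phi_dup[:, 5] = phi_dup[:, 4]
rank_full = np.linalg.matrix_rank(phi)
rank_dup = np.linalg.matrix_rank(phi_dup)

print("estimates:", alpha_hat)
print("rank with distinct start days:", rank_full)
print("rank with x_4 = x_5:", rank_dup)
```

Comparing the two ranks with the number of unknown coefficients is a starting point for answering b).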

In article [1], data from 11 European countries were studied. For each country, a model of the form (1) above was introduced. It was assumed that the parameter $\alpha_0$ differed between countries, but that the parameters $\alpha_1, \ldots, \alpha_5$ were the same for all countries. (This meant that the model assumed the same effect in every country of a specific intervention, and that no other factors influenced the spread of the disease besides the mentioned interventions.)

For instance, for the UK and Sweden we would have

$\begin{align*} y^{UK}(t) &= \alpha_0^{UK} + \alpha_1 x_1^{UK}(t) + \alpha_2 x_2^{UK}(t) + \alpha_3 x_3^{UK}(t) + \alpha_4 x_4^{UK}(t) + \alpha_5 x_5^{UK}(t) + e^{UK}(t) \\ y^{\textrm{SW}}(t) &= \alpha_0^{\textrm{SW}} + \alpha_1 x_1^{\textrm{SW}}(t) + \alpha_2 x_2^{\textrm{SW}}(t) + \alpha_3 x_3^{\textrm{SW}}(t) + \alpha_4 x_4^{\textrm{SW}}(t) + \alpha_5 x_5^{\textrm{SW}}(t) + e^{\textrm{SW}}(t) \end{align*}$

One can now use data from all 11 countries to try to estimate the parameters (11+5 parameters in total) by stacking the data from all countries into a large vector $y$. Using 70 days of data we get a model of the form $y = X\theta + e$, where $y$ is a vector of length 770 and $X$ is a matrix of dimension $770 \times 16$.

c) For simplicity we will assume that $e$ is Gaussian; the Fisher information matrix is then $FIM = X^TX$. Since the elements of $X$ are binary (0 or 1), the elements of the FIM matrix have a nice interpretation. What is it?

d) When calculating the SVD of the FIM matrix, it turns out that there is one large singular value but that the rest of the singular values are quite small. What is the interpretation of this result?
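The stacked model and its FIM can be explored numerically on synthetic data. The sketch below assumes Python with NumPy and draws the intervention start days at random for each country; these dates are hypothetical and are not the ones used in [1], so the resulting singular values only illustrate the computation and say nothing about the real data set. The column ordering (11 country columns followed by 5 intervention columns) is also an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n_countries, n_days, n_int = 11, 70, 5
days = np.arange(1, n_days + 1)

blocks = []
for c in range(n_countries):
    # Hypothetical start days for this country (not the dates used in [1]).
    starts = np.sort(rng.integers(10, 40, size=n_int))
    country_cols = np.zeros((n_days, n_countries))
    country_cols[:, c] = 1.0                  # regressor for alpha_0 of country c
    x_cols = np.column_stack([(days >= s).astype(float) for s in starts])
    blocks.append(np.hstack([country_cols, x_cols]))

X = np.vstack(blocks)                         # shape (770, 16)
fim = X.T @ X                                 # shape (16, 16)
singular_values = np.linalg.svd(fim, compute_uv=False)

# Compare one FIM entry with a direct count of the rows in which both
# corresponding columns of X are equal to 1.
i, j = 11, 15
count_both = int(np.sum((X[:, i] == 1) & (X[:, j] == 1)))

print("X shape:", X.shape)
print("FIM entry:", fim[i, j], "direct count:", count_both)
print("singular values:", singular_values)
```

Trying different index pairs `i`, `j`, and making the start days more or less similar across countries, is a way to build intuition for c) and d).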