January 2021 Take Home Exam
This was the exam given for the 2020 course. Note that in 2021 we have skipped "differentiability index", so problems like Problem 1 will not be given. On the other hand, in 2021 we have covered dimensionless variables and the Buckingham Pi theorem (Lecture 7), which were not included in 2020 and therefore do not appear in the exam below.
Solutions to this exam are available here: exam2021Jan_solutions.pdf
---------------------------------------------------
You are not allowed to discuss the exam with anyone other than bo.bernhardsson@control.lth.se.
The maximum is 50 points on the exam. The time limit is 48 hours.
Good luck.
Problem 1 [3 points] DAE
Consider the following differential algebraic equation
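$$\begin{bmatrix}1 & 0\\ 0 & 0\end{bmatrix}\dot z + \begin{bmatrix}0 & -2\\ 1 & 2\end{bmatrix} z = \begin{bmatrix}0\\ 1\end{bmatrix} u, \qquad y = z_1 + z_2$$
a) Write the system in the "standard form I" (Hint: Use $w_1=z_1$ and $w_2=z_1+2z_2$).
b) What is the index of the system?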
Problem 2 [10 points] System identification - theory
Consider the following "true" system
$$y(t)+0.5\,y(t-1)=0.4\,u(t)+v(t)$$
where $v(t)$ is zero-mean white noise with variance 2. The input $u$ is white noise with zero mean and variance 1, uncorrelated with $v$.
Assume we fit ARX models
$$y(t)+a_1y(t-1)+\dots+a_{n_a}y(t-n_a)=b_1u(t)+\dots+b_{n_b}u(t-n_b+1)+e(t)$$
using standard least squares, i.e. minimizing the prediction error
$$V_N(\theta)=\sum_{t=1}^{N}\bigl(y(t)-\hat y(t|\theta)\bigr)^2$$
a) Assume that $n_a=n_b=1$. To what values do the estimates of $a_1$ and $b_1$ converge as $N\to\infty$? What is the variance of these estimates as a function of $N$?
b) Assume instead that $n_a=2$ and $n_b=1$. To what values do the estimates of $a_1$, $a_2$ and $b_1$ converge as $N\to\infty$? What is the variance of these estimates as a function of $N$?
c) What can you say in general about the estimated values of the parameters when we vary $n_a\ge 1$, $n_b\ge 1$ for this system? Can you also guess what will happen to the parameter error variances when $n_a$ and $n_b$ increase?
(Hint: You do not need to simulate anything. But it can of course be a good idea for verification of your calculations, if you have time for it.)
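The verification mentioned in the hint could be sketched roughly as follows. This is a minimal Python/NumPy sketch, not the MATLAB toolbox workflow used in the course; the function names `simulate_true_system` and `fit_arx`, the sample size, and the zero initial condition are assumptions made for illustration only.

```python
import numpy as np

def simulate_true_system(n_samples, rng):
    """Simulate y(t) + 0.5 y(t-1) = 0.4 u(t) + v(t), assuming y(-1) = 0."""
    u = rng.standard_normal(n_samples)                 # input, variance 1
    v = np.sqrt(2.0) * rng.standard_normal(n_samples)  # noise, variance 2
    y = np.zeros(n_samples)
    for t in range(n_samples):
        y_prev = y[t - 1] if t > 0 else 0.0
        y[t] = -0.5 * y_prev + 0.4 * u[t] + v[t]
    return u, y

def fit_arx(u, y, na, nb):
    """Least-squares ARX fit; returns [a_1..a_na, b_1..b_nb]."""
    start = max(na, nb - 1)  # first sample where all regressors exist
    n = len(y)
    cols = [-y[start - k:n - k] for k in range(1, na + 1)]  # -y(t-k)
    cols += [u[start - k:n - k] for k in range(0, nb)]      # u(t-k)
    phi = np.column_stack(cols)
    theta, *_ = np.linalg.lstsq(phi, y[start:], rcond=None)
    return theta

rng = np.random.default_rng(0)
u, y = simulate_true_system(20000, rng)
for na, nb in [(1, 1), (2, 1)]:
    print(f"na={na}, nb={nb}:", np.round(fit_arx(u, y, na, nb), 3))
```

Repeating the fit over many independent realizations and several values of $N$ gives an empirical estimate of the parameter variances to compare with the hand calculation.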
Problem 3 [10 points] System Identification hands-on
Download the MATLAB file 2021jan_problem3.mat, which contains sampled signals $u$ and $y$ (sample time is $T_s=0.1$). In MATLAB, type load 2021jan_problem3.mat to load the data.
a) Construct an appropriate black-box model fitting the data, with the constraints that the total number of parameters is ≤8 and that the fit to validation data is at least 80%. Report
- plot of the fitted model vs validation data
- parameter values and their uncertainty
- residual plot
- bode plot
- pole and zero placement.
Discuss and comment on your choices and results.
b) To which of the following Bode plots is your model most similar?
(Hint: Useful commands might include help ident, systemIdentification, arx, oe, armax, bj, present, compare, resid, bodeplot, pzmap, ...)
Problem 4 [10 points] Supervised Learning - EEG task
The problem is described on this page.
Problem 5 [10 points] Causality and DAGs
Consider the following structural causal model
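$$\begin{aligned}
V &:= N_V\\
X &:= 4V + N_X\\
Y &:= -X + N_Y\\
Z &:= \alpha X + N_Z\\
W &:= -4V + 2Y + 2Z + N_W
\end{aligned}$$
with independent Gaussian random variables $N_V, N_X, N_Y, N_Z, N_W \sim N(0,1)$.
a) Draw the graph corresponding to the SCM.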
b) Set $\alpha=2$ and simulate 10000 data points from the joint distribution. Plot the values of $W$ versus $X$ to visualize the distribution $P(W|X)$. If $X=3$, what is a good guess of $W$?
c) Still using $\alpha=2$, simulate 10000 data points from the intervention distribution $P(W|do(X:=x))$ obtained by changing the equation for $X$ to $X:=3$. What is a good guess of $W$ after the intervention $X=3$?
d) Describe how you can estimate the value of the causal influence from $X$ to $W$,
$$\frac{\partial}{\partial x}E_{do(X:=x)}[W],$$
from a large amount of data $V,X,Y,Z,W$. (I.e. you know the structure of the graph and that the SCM is linear with Gaussian variables, but you do not know the actual coefficients in the equations.)
e) A directed path from one node to another does not necessarily imply that the former node has a causal effect on the latter. Find a value of $\alpha$ so that $X$ has no causal effect on $W$.
Hint: You might find the code from Lecture 8 useful.
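As a starting point for parts b) and c), sampling from the SCM and from an intervention on $X$ could be sketched as below. This is an illustrative Python/NumPy sketch and not the Lecture 8 code; the function name `simulate_scm` and its `do_x` argument are hypothetical. A scatter plot of $W$ against $X$ (for example with matplotlib) can be added on top of the returned samples.

```python
import numpy as np

def simulate_scm(n, alpha, rng, do_x=None):
    """Sample n points from the SCM; if do_x is given, replace X's equation by X := do_x."""
    n_v, n_x, n_y, n_z, n_w = rng.standard_normal((5, n))  # independent N(0,1) noises
    V = n_v
    X = 4 * V + n_x if do_x is None else np.full(n, float(do_x))
    Y = -X + n_y
    Z = alpha * X + n_z
    W = -4 * V + 2 * Y + 2 * Z + n_w
    return {"V": V, "X": X, "Y": Y, "Z": Z, "W": W}

rng = np.random.default_rng(0)

# Observational data: straight-line fit of W on X as a summary of P(W|X)
obs = simulate_scm(10000, alpha=2.0, rng=rng)
slope, intercept = np.polyfit(obs["X"], obs["W"], 1)
print("observational fit: W ~", round(slope, 2), "* X +", round(intercept, 2))

# Interventional data: X is forced to 3 for every sample
intv = simulate_scm(10000, alpha=2.0, rng=rng, do_x=3.0)
print("mean of W under do(X:=3):", round(intv["W"].mean(), 2))
```

Passing the intervention as an optional argument keeps the other structural equations identical in both runs, so any difference between the two printed summaries comes only from replacing the equation for $X$.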
Problem 6 [7 points] Bayesian Estimation and parameter accuracy
(Note: There is no actual data that you will need for solving this problem.)
Background: Several researchers have tried to estimate the effectiveness of different interventions to reduce the spread of covid-19. Some influential reports from Imperial College in the UK were released during the spring, and the following article was later published (you will not have to read it!)
The statistical analysis and conclusions in the paper were later criticized by several researchers, and the following paper describes some of the concerns (you will not have to read that either!)
The analysis in [1] aims at describing how the so-called reproduction number $R(t)$ is impacted by different interventions. The following 5 interventions were studied:
1. "Social distancing encouraged"
2. "Self isolation"
3. "School closure"
4. "Public events banned"
5. "Complete lockdown"
The model assumption in [1] can be described by the following linear equation
$$y(t)=\alpha_0+\alpha_1x_1(t)+\alpha_2x_2(t)+\alpha_3x_3(t)+\alpha_4x_4(t)+\alpha_5x_5(t)+e(t) \qquad (1)$$
where for day $t=1,2,\dots$ (counted from what was considered the starting day of the pandemic) one defines $y(t)=\log(R(t))$, and $x_i(t)=1$ if intervention number $i$ was active that day and $x_i(t)=0$ if it was inactive. Here $\alpha_0=\log(R_0)$, where $R_0$ is the reproduction number one would get without any interventions, such as during the first days of the pandemic.
In practice, the output $y(t)$ is unknown, and must be estimated from e.g. death rates, hospitalization statistics, or covid PCR tests. In this assignment we will make the optimistic assumption that $y(t)$ is known (within some error which can be included in $e(t)$).
a) Describe how one can estimate the coefficients $\alpha_0,\dots,\alpha_5$ from data $y(t)$ and $x_i(t)$ using least squares regression.
b) What problem will you get if two different interventions, say 4 and 5, were introduced the same day so that one had $x_4(t)=x_5(t)$ for all $t$?
In article [1], data from 11 European countries were studied. For each country, a model of the form (1) above was introduced. It was assumed that the parameter $\alpha_0$ differed between countries, but that the parameters $\alpha_1,\dots,\alpha_5$ were the same for all countries. (This meant that the model assumed the same effect in every country of a specific intervention, and that no other factors influenced the spread of the disease besides the mentioned interventions.)
For instance, for the UK and Sweden we would have
$$\begin{aligned}
y^{UK}(t) &= \alpha_0^{UK}+\alpha_1x_1^{UK}(t)+\alpha_2x_2^{UK}(t)+\alpha_3x_3^{UK}(t)+\alpha_4x_4^{UK}(t)+\alpha_5x_5^{UK}(t)+e^{UK}(t)\\
y^{SW}(t) &= \alpha_0^{SW}+\alpha_1x_1^{SW}(t)+\alpha_2x_2^{SW}(t)+\alpha_3x_3^{SW}(t)+\alpha_4x_4^{SW}(t)+\alpha_5x_5^{SW}(t)+e^{SW}(t)
\end{aligned}$$
One can now use data from all 11 countries to try to estimate the parameters (11+5 parameters in total) by stacking the data from all countries into a large vector $y$. Using 70 days of data we get a model of the form $y=X\theta+e$, where $y$ is a vector of length 770, and $X$ a matrix of dimension $770\times 16$.
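The stacking described above could look roughly like the following Python/NumPy sketch. The intervention start days, the parameter values and the noise level are entirely hypothetical and are not taken from [1]; the column ordering (11 country intercepts first, then the 5 shared intervention effects) is also just one possible convention.

```python
import numpy as np

n_countries, n_days, n_interventions = 11, 70, 5

# Hypothetical start days (NOT real data): in country c, intervention i
# becomes active on day 10 + 4*i + (c mod 4).
start_day = (10 + 4 * np.arange(n_interventions)[None, :]
             + (np.arange(n_countries)[:, None] % 4))

def build_design(start_day, n_days):
    """Stack one (n_days x 16) block per country into the full design matrix X."""
    n_c, _ = start_day.shape
    days = np.arange(1, n_days + 1)
    blocks = []
    for c in range(n_c):
        intercept = np.zeros((n_days, n_c))
        intercept[:, c] = 1.0  # selects this country's alpha_0
        # x_i(t) = 1 once intervention i has started in country c
        active = (days[:, None] >= start_day[c][None, :]).astype(float)
        blocks.append(np.hstack([intercept, active]))
    return np.vstack(blocks)

X = build_design(start_day, n_days)

# Synthetic data from hypothetical parameters, then a least-squares fit
rng = np.random.default_rng(0)
theta_true = np.concatenate([np.full(n_countries, 1.0),
                             np.full(n_interventions, -0.2)])
y = X @ theta_true + 0.1 * rng.standard_normal(X.shape[0])
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(X.shape, y.shape, theta_hat.shape)
```

With real intervention dates in place of the hypothetical ones, the same $X$ is the matrix whose properties parts c) and d) below ask about.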
c) We will for simplicity assume that $e$ is Gaussian; the Fisher information matrix is then $\mathrm{FIM}=X^TX$. Since the elements of $X$ are binary (0 or 1), the elements of the FIM have a nice interpretation. What is it?
d) When calculating the SVD of the FIM matrix, it turns out that there is one large singular value but that the rest of the singular values are quite small. What is the interpretation of this result?