Practice Exam 2021

(About 6 of the following problems)

Solutions to the problems 

Code to solutions:

prob1.m

prob2.m

problem6.m


1. System Identification Theory

a) Suppose that we would like to identify a model, where the true system is given by

$$y(t) = 0.4u(t-1) + 0.3u(t-2) + 0.2u(t-3) + 0.1u(t-4) + e(t)$$

where $e(t) \sim N(0,1)$ is white noise with zero mean and unit variance. Suppose
that the input signal is a sinusoid, $u(t) = \sin(\pi t/4)$, and that you
estimate the parameters in a model of the form

$$y(t) = b_1u(t-1) + b_2u(t-2) + e(t)$$

using a standard least-squares prediction error method. What are the estimates of $b_1$ and $b_2$ when $N \to \infty$?
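The asymptotic estimates can be sanity-checked numerically. The following Python sketch (an illustration with an arbitrary seed and sample size, not the intended exam solution) simulates the true system and solves the least squares problem for a large $N$:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000                      # large N approximates the asymptotic case
t = np.arange(N)
u = np.sin(np.pi * t / 4)        # sinusoidal input
e = rng.standard_normal(N)       # white noise, zero mean, unit variance

# true FIR system: y(t) = 0.4u(t-1) + 0.3u(t-2) + 0.2u(t-3) + 0.1u(t-4) + e(t)
y = np.zeros(N)
y[4:] = 0.4*u[3:-1] + 0.3*u[2:-2] + 0.2*u[1:-3] + 0.1*u[:-4] + e[4:]

# least squares fit of the 2-tap model y(t) = b1*u(t-1) + b2*u(t-2)
Phi = np.column_stack([u[3:-1], u[2:-2]])   # regressors u(t-1), u(t-2)
b_hat, *_ = np.linalg.lstsq(Phi, y[4:], rcond=None)
print(b_hat)                     # compare with your hand calculation
```

Because the input contains only the single frequency $\pi/4$, the contribution of the unmodelled taps is absorbed into $b_1$ and $b_2$, so the estimates are biased away from 0.4 and 0.3.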

b) In another scenario assume the true system is given by

$$y(t) = 0.2u(t-1) + 0.4u(t-2) + w(t)$$

with $w(t) \sim N(0,1)$ white noise of variance 1. Assume an identification experiment is carried out with an input $u$ uncorrelated with $w$ but with autocovariance

$$R_u(\tau) = \begin{cases}
1, & \tau = 0\\
0.5, & |\tau| = 1 \\
0, & |\tau| > 1
\end{cases}$$

Calculate the asymptotic values of the estimates $\hat{b}_1$ and $\hat{b}_2$ when $N \to \infty$. Are the estimates asymptotically correct? The model is of the correct form

$$y(t) = b_1 u(t-1) + b_2 u(t-2) + e(t).$$

Also determine the error variances $\mathrm{Var}(b_1-\hat b_1)$ and $\mathrm{Var}(b_2-\hat b_2)$ of the parameter estimates for a finite amount of data $N$.
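For part b), one convenient way to generate an input with exactly this autocovariance is $u(t) = (v(t) + v(t-1))/\sqrt{2}$ with $v$ white of unit variance (a choice made here only for illustration; any signal with the stated $R_u$ works). The Python sketch below checks both the consistency of the estimates and the asymptotic covariance formula:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000
v = rng.standard_normal(N + 1)
u = (v[1:] + v[:-1]) / np.sqrt(2)    # R_u(0)=1, R_u(+-1)=0.5, R_u(|tau|>1)=0
w = rng.standard_normal(N)           # white noise, independent of u

# true system: y(t) = 0.2u(t-1) + 0.4u(t-2) + w(t)
y = np.zeros(N)
y[2:] = 0.2*u[1:-1] + 0.4*u[:-2] + w[2:]

Phi = np.column_stack([u[1:-1], u[:-2]])
b_hat, *_ = np.linalg.lstsq(Phi, y[2:], rcond=None)
print(b_hat)                         # should approach (0.2, 0.4): correct model structure

# asymptotic covariance of the estimation error: (sigma^2 / N) * R^{-1},
# with R the 2x2 input covariance matrix [[1, 0.5], [0.5, 1]]
R = np.array([[1.0, 0.5], [0.5, 1.0]])
print(np.linalg.inv(R) / N)          # diagonal entries give Var(b_i - b_hat_i)
```

The diagonal of $R^{-1}/N$ evaluates to $4/(3N)$ for both parameters, which is what the finite-$N$ variance calculation should reproduce.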

c) Suggest another signal $u(t)$, also with variance 1 as in b), which gives lower error variances $\mathrm{Var}(b_1-\hat b_1)$ and $\mathrm{Var}(b_2-\hat b_2)$.


2. System Identification Practice

The data for this problem are in the file sysid02.mat. Load the data into Matlab; inside you will find input and output signals u and y (the sample time is $T_s = 1$).

Use that data to construct one or more appropriate black-box models, choosing between ARX, OE, ARMAX and BJ structures of appropriate orders. For your best model report:

  • plot of the fitted model vs validation data. (Hint: compare())
  • parameter values and uncertainty
  • residual analysis plot (resid)
  • Bode plot (bode or bodeplot)
  • poles and zeros (pzmap)

You can either use the systemIdentification GUI or write your own Matlab code.


3. Modeling, Modelica and DAE systems

Subproblems

Consider the electric circuit below, driven by a current source with input current $I$ (and $V$ is a voltage).

electric.png

a) Write a DAE in the variables $i_1$, $i_2$, and $V$, with $I$ as input.

[In 2021-22 we have skipped talking about the differentiability index, and therefore questions of the form b) and c) will not be given in Jan 2022-3. Therefore skip the next two subproblems.

b) What is the differentiability index $k$ of the DAE?

c) Let $w = L_1 i_1 - L_2 i_2$ and $y = V$. Show that the model can be written in the form

\begin{align*}
\dot w &= Aw + BI \\
y &= Cw + D_0 I + \ldots + D_{k-1} I^{(k-1)}
\end{align*}

where $I^{(k-1)}$ is the $(k-1)$-th derivative of $I$, and $k$ is the differentiability index.]

d) If the current source is replaced by a voltage source (similar diagram, but with voltage $V$ as input), is it possible to write the system in state space form $\dot x = Ax + Bu,\ y = Cx + Du$? (Note: different matrices $A, B, C, D$ than in c.)

You can assume that the parameters $L_1$ and $L_2$ are non-zero.


4. Supervised Learning - Practice and theory 

The EEG data needed here is not included, so you can't solve this problem. On the exam, the problem would be stated in more detail. Don't spend all your time trying to optimize performance.

The Google Colab notebook xxx loads data from an EEG experiment measuring brain activity from persons looking at images on a computer screen. The images belong to 3 different categories (denoted 0, 1, 2 in the data). It is known that brain activity differs when processing images from these categories.

The EEG data has the following structure:

The data is split into a training set of X images which you should use to train your classifier and a test set of Y images which you should use to evaluate your algorithm.

a) Choose a suitable algorithm described in the course and train a classifier on the data. High performance is of course good, but your result will be judged mainly on your methodology and how well you describe your method and results.

b) Describe how one could interpret the information one obtains from the singular value decomposition [U,S,V]=svd(A)

of the EEG data matrix A = (here follows a description of the matrix A). Say, for instance, that only 5 singular values are significantly larger than 0.


5. Causal Inference, Theory or Practice

The following DAG describes a linear Gaussian structural causal model, where we assume we do not know the parameters (the values on the edges).

causal.jpg

The equations of the SCM are given by

\begin{align*}
A &:= N_A \\
B &:= N_B \\
X &:= A + 2B + N_C \\
D &:= 3B + 4X + N_D \\
Y &:= 2D + 3X + N_E
\end{align*}

where $N_A, \ldots, N_E$ are normally distributed $N(0,1)$ random variables.

a) We are interested in estimating the causal effect from $X$ to $Y$, i.e. finding $\frac{\partial}{\partial x} E(Y \mid \mathrm{do}(X:=x))$ (which in this case is 11). Draw a figure indicating the updated DAG after an intervention corresponding to this situation has been made.

b) If such an intervention is not practically possible, describe how the causal effect from $X$ to $Y$ can be obtained from linear regression on available data. Determine which of these linear regressions will give the correct value:

\begin{align*}
&Y \sim X \\
&Y \sim X + B \\
&Y \sim X + A + B
\end{align*}

For the example $Y \sim X + A + B$ this would mean that we find the correct coefficient $\theta_1 = 11$ (asymptotically, when the number of data points goes to infinity) from the least squares regression

$$Y = \theta_1 X + \theta_2 A + \theta_3 B + \textrm{noise}$$

c) Confirm your results numerically by generating a large number of data points according to the true SCM and performing the three linear regressions described in b). (Hint: in Python you can use the ols command in the statsmodels package. You can also solve the problem in Matlab.)
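A possible shape of the numerical check (a sketch; plain numpy least squares is used here instead of statsmodels to keep it self-contained):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
NA, NB, NC, ND, NE = rng.standard_normal((5, n))

# generate data from the true SCM
A = NA
B = NB
X = A + 2*B + NC
D = 3*B + 4*X + ND
Y = 2*D + 3*X + NE

def coef_on_X(*regressors):
    """Least squares coefficient on X when regressing Y on X plus the others."""
    Phi = np.column_stack([X, *regressors, np.ones(n)])
    theta, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return theta[0]

print(coef_on_X())        # Y ~ X
print(coef_on_X(B))       # Y ~ X + B
print(coef_on_X(A, B))    # Y ~ X + A + B
```

Adjusting for B blocks the backdoor path $X \leftarrow B \to D \to Y$, so the last two regressions recover the causal effect 11, while the first does not.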


6. Grey Box Identification

The following continuous-time model describes the one-dimensional position $y$ of a mobile robot. The input signal $u$ to a motor generates a force $F$ on the robot. The motor has a time constant $T$. The robot is initially at rest.

\begin{align*}
M \ddot y &= F \\
T \dot F &= -F + ku
\end{align*}

The parameters $M$, $T$, and $k$ are unknown and should be estimated from input-output data $(y, u)$.

a) Use the state $x = \begin{bmatrix} y \\ \dot y \\ F \end{bmatrix}$ and write the model in state space form $\dot x = A(\theta) x + B(\theta)u; \quad y = C(\theta) x; \quad x(0) = 0$,

suitable for Grey-box identification.

b) Explain why all three parameters $M$, $T$, $k$ cannot be identified from any input-output data $(y, u)$.

c) The file problem6data.mat contains data $(y, u)$ sampled at $T_s = 0.1$. (The data includes some noise.) Estimate the two parameters $k$ and $T$, assuming that the mass $M = 5$ is known.
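The estimation step can be done with a prediction error approach: simulate the model for candidate $(k, T)$ and minimize the squared output error (in Matlab this is conveniently done with idgrey/greyest). Since problem6data.mat is not included here, the Python sketch below generates its own synthetic data with hypothetical true values $k = 2$, $T = 0.5$, just to illustrate the procedure:

```python
import numpy as np
from scipy.optimize import minimize

M, Ts = 5.0, 0.1   # known mass and sample time

def simulate(k, T, u):
    """Forward-Euler simulation of M*ydd = F, T*Fd = -F + k*u, starting at rest."""
    y = np.zeros(len(u))
    v = F = 0.0
    for i in range(len(u) - 1):
        y[i + 1] = y[i] + Ts * v
        v += Ts * F / M
        F += Ts * (-F + k * u[i]) / T
    return y

# synthetic data with hypothetical true parameters (stand-in for problem6data.mat)
rng = np.random.default_rng(1)
u = rng.standard_normal(300)
y_meas = simulate(2.0, 0.5, u) + 0.001 * rng.standard_normal(300)

def cost(p):
    k, T = p
    if T <= Ts / 2:                  # keep the Euler recursion stable
        return np.inf
    return np.sum((y_meas - simulate(k, T, u)) ** 2)

res = minimize(cost, x0=[1.0, 1.0], method="Nelder-Mead")
print(res.x)                         # estimated (k, T)
```

The same loop applied to the provided data file (with its measured u, y) gives the requested estimates.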


7. Bayesian Estimation

Say we know that the data $y_1, \ldots, y_N$ are drawn independently from the probability density

$$p(y;\theta) = (1-\theta)f_0(y) + \theta f_1(y)$$

where $f_0(y)$ and $f_1(y)$ are known functions, but the parameter $\theta \in (0,1)$ is unknown and should be estimated.

a) Calculate the Fisher information $I(\theta)$ and show that any unbiased estimator $\hat\theta = t(y)$ must satisfy

\begin{equation*}
E(\theta - \hat \theta)^2 \geq \frac{1}{N}\left[ \int \frac{(f_0(y)-f_1(y))^2}{(1-\theta)f_0(y)+\theta f_1(y)}\,dy \right]^{-1}
\end{equation*}

(where we assume the integral exists)

b) Suggest a method to estimate $\theta$ from the data $y_1, \ldots, y_N$ that works well when $N \to \infty$ (assuming $f_0$ is different from $f_1$).
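One natural candidate is the maximum likelihood estimator, maximizing $\sum_i \log\bigl((1-\theta)f_0(y_i) + \theta f_1(y_i)\bigr)$ over $\theta \in (0,1)$; the log-likelihood is concave in $\theta$ (log of an affine function), so a one-dimensional search is reliable. A sketch with hypothetical Gaussian choices for $f_0$ and $f_1$:

```python
import numpy as np

# hypothetical known densities (any known f0 != f1 would do)
def f0(y):
    return np.exp(-y**2 / 2) / np.sqrt(2 * np.pi)          # N(0,1)

def f1(y):
    return np.exp(-(y - 2)**2 / 2) / np.sqrt(2 * np.pi)    # N(2,1)

rng = np.random.default_rng(0)
theta_true, N = 0.3, 50_000
from_f1 = rng.random(N) < theta_true
y = np.where(from_f1, rng.normal(2.0, 1.0, N), rng.normal(0.0, 1.0, N))

# maximize the log-likelihood over a grid; concavity makes this safe
grid = np.linspace(0.001, 0.999, 999)
loglik = [np.sum(np.log((1 - th) * f0(y) + th * f1(y))) for th in grid]
theta_hat = grid[int(np.argmax(loglik))]
print(theta_hat)   # should be close to theta_true = 0.3
```

By standard ML theory the estimator is consistent and asymptotically efficient, so for large $N$ its variance approaches the Cramér-Rao bound from part a).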