FRTN65
Exam Aug 2024
2023 HT/Autumn

  • Due: no due date
  • Points: 50
  • Submission: file upload

Allowed aids: All material is allowed, including old exams, internet access, and tools such as ChatGPT.

If ChatGPT or a similar tool is used, we ask you to briefly describe how it was used (on which problems, what kind of prompts, etc.). Note: this information is only used as feedback for future course development; it will not impact your score.

Instructions: Name the files you hand in on Canvas using your anonymization code, for example NR.zip or NR-problem1.pdf. We prefer that all solutions are handed in via Canvas (photos of handwritten solutions are fine). If you really need to hand in some handwritten solutions on paper at the exam, that is acceptable, but these must then be marked with both your anonymization code and your personal identifier.

All solutions must be well motivated. 

Code that is relevant for your solutions should be submitted.

Preliminary grade limits (out of 50p): 3: 25p, 4: 33p, 5: 42p.

Good luck!


1 Dimensional Analysis [4p] 

To study the force $F$ generated by a propeller on a small drone, assume that the relevant variables are

  • $\rho$, the density of air (kg/m^3)
  • $\omega$, the angular rate of the propeller (1/s)
  • $L$, the length of the propeller (m)

Use dimensional analysis to determine a physically motivated relation of the form $F = \textrm{const} \cdot L^a \rho^b \omega^c$ with integers $a$, $b$, $c$. (The constant will depend on the shape of the propeller.)
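As a sketch of the general method (not a solution to this problem), matching exponents of the base units kg, m and s leads to a small linear system. The Python example below applies this to a different, hypothetical relation: a drag force assumed to have the form F = const · ρ^a v^b A^c, with speed v (m/s) and area A (m^2). The function name `solve_exponents` and the drag example are illustrative assumptions, not part of the exam.

```python
from fractions import Fraction


def solve_exponents(columns, target):
    """Solve sum_j x_j * columns[j] = target exactly for the exponents x.

    columns[j] holds the (kg, m, s) exponents of variable j and target holds
    those of the modeled quantity. Assumes a square system with a unique solution.
    """
    n = len(target)
    # Augmented matrix: row i is base unit i, column j is variable j.
    aug = [[Fraction(columns[j][i]) for j in range(n)] + [Fraction(target[i])]
           for i in range(n)]
    for col in range(n):
        # Swap a row with a nonzero pivot into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Normalize the pivot row, then eliminate this column from other rows.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Hypothetical example: unit exponents as (kg, m, s).
rho = (1, -3, 0)    # density, kg/m^3
v = (0, 1, -1)      # speed, m/s
area = (0, 2, 0)    # area, m^2
force = (1, 1, -2)  # force, kg*m/s^2

exponents = solve_exponents([rho, v, area], force)
print(exponents)  # exponents of rho, v and area, in that order
```

The same setup carries over to other variable sets by changing the unit tuples; exact fractions are used so that non-integer exponents, if they occur, are visible rather than rounded.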


2 Evaluating classification performance [8p]

In binary classification we often use classifiers that compare a score $x$ with a threshold $t$:

        IF $x < t$ THEN "negative case" ELSE "positive case"

In some situations it can be unclear what should be defined as "positive cases" and "negative cases". This problem concerns the consequences of this choice.

The figure below illustrates the score $x$ that is the output from two slightly different implementations of the same classifier. The only difference between implementations 1 and 2 is a switched sign of the score $x$:

  • In implementation 1 (left), "cancer" is defined as the "positive case" and the classifier generates a higher score $x$ for the blue cases (cancer) than for the red cases (healthy).
  • In implementation 2 (right), "healthy" is instead defined as the "positive case" and the sign of the score is flipped (the score is $-x$ instead).

 

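[Figure: cancer-1.png. Scores for implementation 1 (left) and implementation 2 (right); blue = cancer, red = healthy.]

There are 1000 healthy patients and 100 cancer patients (the same patients appear in both figures).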

a) The following figure illustrates the ROC curve for implementation 1.

[Figure: ROC curve for implementation 1]

What will be true for the ROC curve for implementation 2? (Motivate!)

  • The ROC curve will be exactly the same
  • The ROC curve will be a mirrored version of the curve above
  • The two curves can be completely different

b) AUC = 0.918 for implementation 1. Will the AUC for implementation 2 be the same? Motivate.

c) The following figure illustrates the precision vs recall curve for implementation 1.

[Figure: Precision vs recall for implementation 1]

For implementation 1, use this figure to determine (approximately) the best achievable F1 score (= 2/(1/precision + 1/recall)) if the threshold $t$ is chosen optimally.

d) What will be true for the precision-recall curve for implementation 2? (Motivate!)

  • The curve will be exactly the same
  • The curve will be a mirrored version of the curve above
  • The two curves can be completely different

(Reminder: TPR = TP/(TP+FN), FPR = FP/(FP+TN), Precision = TP/(TP+FP), Recall = TPR.)
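The definitions in the reminder can be sketched in Python as follows. The scores, labels and threshold are made-up values chosen only to illustrate the formulas; they are not data from the figures, and the function names are hypothetical.

```python
def classify(x, t):
    """Threshold rule from the problem: x < t is negative, otherwise positive."""
    return x >= t


def metrics(scores, labels, t):
    """Count TP/FP/TN/FN at threshold t and derive the standard ratios."""
    tp = fp = tn = fn = 0
    for x, actual in zip(scores, labels):
        predicted = classify(x, t)
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and actual:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Return NaN when a ratio is undefined (empty denominator).
        return num / den if den else float("nan")

    return {
        "tpr": ratio(tp, tp + fn),        # recall
        "fpr": ratio(fp, fp + tn),
        "precision": ratio(tp, tp + fp),
        # Algebraically equal to 2 / (1/precision + 1/recall).
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Made-up example: True marks a positive case.
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
labels = [False, False, True, True, True, False]
result = metrics(scores, labels, t=0.3)
print(result)
```

Sweeping `t` over the observed scores and recording (FPR, TPR) or (recall, precision) pairs would trace out the corresponding curves.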


3 Supervised Learning [12p]

This Google Colab code studies a classification problem: predicting the quality of 1599 different red wines based on 11 measured input variables. Quality is given by an integer in the range 1-10, and the input variables are numerical values describing e.g. acidity, sugar and alcohol levels.

a) Describe some drawbacks with the existing code and how to improve it. Do not spend much time optimizing performance in this subproblem.

b) Rewrite the code so that the prediction is treated as a regression problem, where quality is predicted as a real value, and optimize the mean squared error (MSE) instead.

Hand in your code.

Your solutions need to be well motivated and explained. Only handing in the code will not suffice. (Motivations and explanations can be written as comments in your code if you want.)


4 Causal Inference [7p]

The following directed acyclic graph illustrates a linear structural causal model. You might find this Google Colab code helpful to generate data from the model.

We are interested in calculating the causal impact of three different variables ($A$, $B$, and $C$) on the output $Y$. Remember that in the course the causal effect of $X$ on $Y$ was defined as $\frac{\partial}{\partial x} E[Y \mid \mathbf{do}(X := x)]$. All variables are real-valued, and each node has an associated linear equation indicated by the graph, such as $A = c_2 B + c_5 C + \textrm{noise}$, etc. The model parameters $c_1, \ldots, c_6$ are considered unknown.

 

[Figure: diagram-7.png. Directed acyclic graph of the linear structural causal model.]

Which of the statements below are correct concerning the coefficients in the ordinary least squares (OLS) regression Y ~ A + B + C - 1?

a) True or False: The coefficient of A measures the causal effect of A on Y. (Hint: This causal effect equals $c_1$.)

b) True or False: The coefficient of B measures the causal effect of B on Y. Also, what is the correct value of this causal effect (give an expression using the coefficients $c_1, \ldots, c_6$)?

c) True or False: The coefficient of C measures the causal effect of C on Y. Also, what is the correct value of this causal effect (give an expression using the coefficients $c_1, \ldots, c_6$)?

d) Furthermore: If any of the statements in a), b) or c) is false, suggest an alternative linear regression that would give the correct result instead.
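The do-operator definition of causal effect used in this problem can be illustrated on a much smaller, hypothetical model (not the graph of this problem): a confounder Z that affects both X and Y, plus a direct effect of X on Y. In the Python sketch below, all names and parameter values are assumptions made for illustration. An intervention replaces the structural equation for X by a constant, and the causal effect is estimated as a finite difference of E[Y | do(X := x)].

```python
import random


def simulate(n, beta, gamma, a, do_x=None, seed=0):
    """Hypothetical linear SCM: Z -> X, Z -> Y, X -> Y.

    X = a*Z + noise, Y = beta*X + gamma*Z + noise.
    If do_x is given, the equation for X is replaced by X := do_x.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        noise_x = rng.gauss(0, 1)
        noise_y = rng.gauss(0, 1)
        x = a * z + noise_x if do_x is None else do_x
        y = beta * x + gamma * z + noise_y
        xs.append(x)
        ys.append(y)
    return xs, ys


def mean(values):
    return sum(values) / len(values)


def ols_slope(xs, ys):
    """Slope of the simple regression of y on x (with intercept)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


BETA, GAMMA, A = 1.5, 2.0, 1.0  # assumed values, for illustration only
N = 20000

# Interventional estimate: finite difference of E[Y | do(X := x)] with dx = 1.
_, y_do0 = simulate(N, BETA, GAMMA, A, do_x=0.0)
_, y_do1 = simulate(N, BETA, GAMMA, A, do_x=1.0)
causal_effect = mean(y_do1) - mean(y_do0)

# Observational regression of Y on X alone, which ignores the confounder Z.
xs, ys = simulate(N, BETA, GAMMA, A)
naive_slope = ols_slope(xs, ys)

print(causal_effect, naive_slope)
```

In this toy model the interventional estimate recovers `BETA`, while the regression on observational data absorbs part of the confounder's influence. The same simulate-and-intervene approach can be used to check answers numerically once the structural equations of a graph are written out.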


5 System Identification - Hands-on [12p]

The file sysid.mat contains some data from a linear system with one input u and one output y sampled at the rate h=0.5.

The code sysidproblem.m contains an initial very quick investigation of the data.

Identify a discrete time model of the system. Aim for using few model parameters. Be sure to describe your methodology, including outlier analysis and suitable preprocessing, choice of suitable model structure and model order, and include model validation with residual analysis etc.

(Hint: Useful commands might include help ident, systemIdentification, arx, oe, armax, bj, present, compare, resid, bodeplot, pzmap, pwelch, detrend, ...)

Also hand in your code.

Your solution needs to be well motivated. Only handing in uncommented code will give a low score.


6 System Identification Theory [7p]

Assume we want to estimate the parameters $b_0$ and $b_1$ in the model

$$y(t) = b_0 u(t) + b_1 u(t-1) + e(t), \quad t = 2, \ldots, N$$

Here $u$ (known) and $e$ (unknown) are random signals with zero mean and with $E(u^2(t)) = \sigma_u^2$ and $E(e^2(t)) = \sigma_e^2$ for all $t$.
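To experiment with the model numerically, data can be generated along the following lines. This is a minimal Python sketch under stated assumptions: the signals are drawn as independent Gaussian white noise (the problem only specifies zero mean and the variances), and the parameter values and the function name `simulate_fir` are hypothetical.

```python
import random


def simulate_fir(n, b0, b1, sigma_u, sigma_e, seed=0):
    """Generate u(1..n) and y(2..n) from y(t) = b0*u(t) + b1*u(t-1) + e(t).

    Lists are 0-based: u[k] holds u(k+1), and y[k-1] holds y(k+1) for k >= 1.
    Gaussian white noise is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    u = [rng.gauss(0.0, sigma_u) for _ in range(n)]
    e = [rng.gauss(0.0, sigma_e) for _ in range(n)]
    # y is only defined from t = 2, since it needs the previous input sample.
    y = [b0 * u[k] + b1 * u[k - 1] + e[k] for k in range(1, n)]
    return u, y


# Assumed values, for illustration only.
u, y = simulate_fir(n=20000, b0=1.0, b1=0.5, sigma_u=1.0, sigma_e=0.3)

# Sanity check: the sample second moment of u should be near sigma_u^2.
mean_u2 = sum(v * v for v in u) / len(u)
print(len(u), len(y), mean_u2)
```

Simulated data of this kind can be used to check analytical answers, for example by comparing the empirical spread of estimates over many seeds with a derived covariance.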

a) Write the estimation, where data for $t = 2$ to $N$ is used, as a least squares problem of the form $Y = \Phi \theta + E$.

b) If the signals $u$ and $e$ are white noise, with $e$ and $u$ independent, we know from the course that the parameter estimates are asymptotically correct. For this case, determine the matrix $P$ in the expression for the asymptotic estimation error

$$\sqrt{N} (\widehat{\theta}_N - \theta_0) \to \mathcal{N}(0, P), \quad \textrm{when } N \to \infty.$$

c) Suppose $e$ and $u$ are independent white noise (the same situation as in b), but the model is instead

$$y(t) = b_0 u(t) + b_1 u(t-1) + e(t) + 0.5 e(t-1), \quad t = 2, \ldots, N$$

Will the parameter estimates converge to the true values?

All solutions must be well motivated. 
