PART I - ML Competition
- Due No due date
- Points 0
- Submitting a file upload
Now that you have gone through Part I of the book, it is time to test your skills on a mini ML project. Woohoo! Your project will be evaluated against the steps of the pipeline explained in Chapter 2:
Pipeline
- Look at the big picture.
- Get the data.
- Discover and visualize the data to gain insights.
- Prepare the data for Machine Learning algorithms.
- Select a model and train it. Choose from the models covered in Part I of the book.
- Fine-tune your model.
- Present your solution at the Presentation Session on March 13.
- Launch, monitor, and maintain your system. For this step, just comment on what would need to be done, as well as on the performance you achieve.
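The middle steps of the pipeline (get the data, prepare it, select and train a model, fine-tune it) can be sketched as follows. This is only a minimal illustration under several assumptions: it uses scikit-learn, it loads the copy of the Breast Cancer Wisconsin (Diagnostic) data bundled with scikit-learn rather than the Kaggle CSV, and the choices of logistic regression, the F1 score, and the grid of `C` values are placeholders that you should replace with choices you can motivate.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Get the data. Assumption: the scikit-learn copy of the data set is used
# here so the sketch runs without downloading the Kaggle file.
# In this copy, target 0 means malignant and 1 means benign.
X, y = load_breast_cancer(return_X_y=True)

# Set aside a test set before exploring or tuning anything.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Prepare the data and select a model in one pipeline, so that the
# scaler is fitted on the training folds only during cross-validation.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Fine-tune: a small grid over the regularization strength (illustrative values).
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

# Evaluate once on the held-out test set.
test_f1 = f1_score(y_test, search.predict(X_test))
print("Best parameters:", search.best_params_)
print("Test F1 score:", round(test_f1, 3))
```

Note that the F1 score above treats class 1 (benign) as the positive class. If detecting malignant tumours is what matters for your big-picture goal, a different positive label or a recall-oriented measure may be easier to motivate.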
You can choose between the following data sets:
- Breast Cancer Wisconsin (Diagnostic) Data Set - https://www.kaggle.com/uciml/breast-cancer-wisconsin-data
- Red Wine quality - https://www.kaggle.com/uciml/red-wine-quality-cortez-et-al-2009
Note: the second data set can be used for both regression and classification.
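To illustrate the note above, the same quality score can serve as a regression target or be turned into a class label. The sketch below uses a handful of made-up scores rather than the real data, and the threshold of 7 for a "good" wine is purely an assumption; any cut-off you pick should be motivated in your notebook.

```python
# Made-up quality scores for illustration only (not taken from the data set).
quality = [5, 6, 7, 4, 8, 5]

# Regression: predict the score itself as a number.
y_regression = [float(q) for q in quality]

# Classification: predict whether a wine is "good".
# Assumption: "good" means a score of 7 or higher.
GOOD_THRESHOLD = 7
y_classification = [int(q >= GOOD_THRESHOLD) for q in quality]

print(y_regression)
print(y_classification)
```

A thresholded label like this is often imbalanced, which affects the choice of performance measure.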
PART I - ML Competition tasks:
- Create a pipeline and comment on each step (e.g., in a Jupyter notebook). Motivate your choice of model, performance measure, etc.
- Send your well-commented pipeline to your assigned opponent by midnight on March 10 at the latest. Please cc carolina.bergeling@control.lth.se.
- Create a project presentation (approx. 5 min).
- Prepare your opponent duty: check the respondent's pipeline and prepare questions regarding its different steps (approx. 5 min discussion).
- Take part in presentation session on March 13!
Presenter | Opponent | Data set chosen (due March 3) |
Frida | Lars | Breast Cancer |
Birgitta | Frida | Breast Cancer |
Harry | Birgitta | Breast Cancer |
Nils | Harry | Breast Cancer |
Olle | Nils | Wine |
Johan | Olle | Breast Cancer |
Lars | Johan | Wine |
Jeff | - |