<!-- vim: set ts=2 sw=2 et tw=80: -->

---
header-includes:
- \usepackage[utf8]{inputenc}
- \usepackage[T1]{fontenc}
- \usepackage[sc]{mathpazo}
- \usepackage{caption, subcaption}
- \usepackage{hyperref}
- \usepackage[english]{babel}
- \usepackage{amsmath, amsfonts}
- \usepackage{listings}
- \usepackage{graphicx}
- \graphicspath{{Figures/}{./}}
- \usepackage{float}
- \usepackage{geometry}
- \geometry{paper=a4paper,top=2.5cm,bottom=3cm,left=3cm,right=3cm}
- \usepackage{sectsty}
- \sectionfont{\vspace{6pt}\centering\normalfont\scshape}
- \subsectionfont{\normalfont\bfseries}
- \subsubsectionfont{\normalfont\itshape}
- \paragraphfont{\normalfont\scshape}
- \usepackage{scrlayer-scrpage}
- \ofoot*{\pagemark}
- \ifoot*{Maggioni Claudio}
- \cfoot*{}
---
\title{
\normalfont\normalsize
\textsc{Machine Learning\\
Universit\`a della Svizzera italiana}\\
\vspace{25pt}
\rule{\linewidth}{0.5pt}\\
\vspace{20pt}
{\huge Assignment 1}\\
\vspace{12pt}
\rule{\linewidth}{1pt}\\
\vspace{12pt}
}
\author{\LARGE Maggioni Claudio}
\date{\normalsize\today}
\maketitle

The assignment is split into two parts: you are asked to solve a
regression problem and to answer some questions. You can use all the
books, material, and help you need. Bear in mind that the questions you
are asked are similar to those you may find in the final exam, and are
related to very important and fundamental machine learning concepts. As
such, sooner or later you will need to learn them to pass the course. We
will give you some feedback afterwards.\
!! Note that this file is just meant as a template for the report, in
which we reproduced **part of** the assignment text for convenience. You
must always refer to the text in the README.md file as the assignment
requirements.

# Regression problem

This section should contain a detailed description of how you solved the
assignment, including all required statistical analyses of the models'
performance and a comparison between the linear regression and the model
of your choice. Limit the assignment to 2500 words (formulas, tables,
figures, etc., do not count as words) and do not include any code in the
report.

## Task 1

Use the family of models
$f(\mathbf{x}, \boldsymbol{\theta}) = \theta_0 + \theta_1 \cdot x_1 +
\theta_2 \cdot x_2 + \theta_3 \cdot x_1 \cdot x_2 + \theta_4 \cdot
\sin(x_1)$
to fit the data. Write the formula of the model in the report,
substituting the parameters $\theta_0, \ldots, \theta_4$ with the
estimates you found:
$$f(\mathbf{x}, \boldsymbol{\theta}) = \_ + \_ \cdot x_1 + \_
\cdot x_2 + \_ \cdot x_1 \cdot x_2 + \_ \cdot \sin(x_1)$$
Evaluate the test performance of your model using the mean squared error
as the performance measure.
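
Because the model family above is linear in $\boldsymbol{\theta}$, one way to estimate the parameters is ordinary least squares on the feature vector $[1, x_1, x_2, x_1 x_2, \sin(x_1)]$. The following is only a minimal sketch under stated assumptions: the data are synthetic stand-ins (the real dataset is not reproduced here), and the helper names `design_matrix`, `fit_theta`, and `mse` are hypothetical.

```python
import numpy as np

def design_matrix(X):
    """Map raw inputs of shape (n, 2) to the Task 1 features
    [1, x1, x2, x1*x2, sin(x1)]."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, np.sin(x1)])

def fit_theta(X, y):
    """Ordinary least-squares estimate of theta_0..theta_4."""
    theta, *_ = np.linalg.lstsq(design_matrix(X), y, rcond=None)
    return theta

def mse(X, y, theta):
    """Mean squared error of the fitted model on (X, y)."""
    residuals = design_matrix(X) @ theta - y
    return float(np.mean(residuals ** 2))

# Synthetic stand-in data (assumption: NOT the assignment dataset).
rng = np.random.default_rng(0)
true_theta = np.array([1.0, -0.5, 2.0, 0.3, 1.5])
X = rng.uniform(-3, 3, size=(500, 2))
y = design_matrix(X) @ true_theta + rng.normal(0, 0.1, size=500)

# Hold out the last 100 samples as a test set.
X_train, y_train = X[:400], y[:400]
X_test, y_test = X[400:], y[400:]

theta_hat = fit_theta(X_train, y_train)
test_mse = mse(X_test, y_test, theta_hat)
print("theta estimates:", theta_hat)
print("test MSE:", test_mse)
```

The entries of `theta_hat` are the values that would fill the blanks in the formula above, and `test_mse` is computed only on samples that were not used for fitting.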

## Task 2

Consider any family of non-linear models of your choice to address the
above regression problem. Evaluate the test performance of your model
using the mean squared error as performance measure. Compare your model
with the linear regression of Task 1. Which one is **statistically**
better?
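
One possible way to make the comparison statistical rather than a bare comparison of two MSE values is a paired test on the per-sample squared errors of the two models over the same test set. The sketch below is an assumption about how this could be done, not the method prescribed by the assignment: the error values are randomly generated placeholders, `paired_t_statistic` is a hypothetical helper, and the 1.96 threshold is the usual large-sample approximation for a two-sided test at the 95% level.

```python
import math
import random
import statistics

def paired_t_statistic(errors_a, errors_b):
    """T statistic for the mean of the paired differences errors_a - errors_b."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / std_err

# Placeholder per-sample squared errors (assumption: synthetic, for illustration).
random.seed(0)
n = 200
sq_err_linear = [random.gauss(0, 0.5) ** 2 for _ in range(n)]
sq_err_nonlinear = [random.gauss(0, 0.2) ** 2 for _ in range(n)]

t = paired_t_statistic(sq_err_linear, sq_err_nonlinear)
# Large-sample approximation: |t| > 1.96 rejects "equal mean error" at ~95%.
significant = abs(t) > 1.96
print("t statistic:", t, "significant:", significant)
```

A positive and significant `t` would suggest that the second model has a lower expected squared error; if `|t|` stays below the threshold, the observed MSE gap alone does not support calling either model better.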

## Task 3 (Bonus)

In the [**GitHub repository of the
course**](https://github.com/marshka/ml-20-21), you will find a trained
Scikit-learn model that we built using the same dataset you are given.
This baseline model achieves an MSE of **0.0194** when
evaluated on the test set. You will get extra points if the test
performance of your model is better (i.e., the MSE is lower) than ours.
Of course, you also have to tell us why you think that your model is
better.

# Questions

## Q1. Training versus Validation

1. **Explain the curves' behavior in each of the three highlighted
sections of the figures, namely (a), (b), and (c).**

I don't know.

1. **Are any of the three sections associated with the concepts of
overfitting and underfitting? If yes, explain it.**

1. **Is there any evidence of high approximation risk? Why? If yes, in
which of the subfigures below?**

1. **Do you think that by further increasing the model complexity you
will be able to bring the training error to zero?**

1. **Do you think that by further increasing the model complexity you
will be able to bring the structural risk to zero?**

## Q2. Linear Regression

Comment and compare how the (a.) training error, (b.) test error and
(c.) coefficients would change in the following cases:

1. **$x_3$ is a normally distributed independent random variable
$x_3 \sim \mathcal{N}(1, 2)$**

1. **$x_3 = 2.5 \cdot x_1 + x_2$**

1. **$x_3 = x_1 \cdot x_2$**
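
The first two cases above can be explored numerically. The sketch below rests on several assumptions that are not stated in this template: the baseline is a linear model in $x_1$ and $x_2$ with an intercept, the target is itself linear in those inputs plus noise, and the second parameter of $\mathcal{N}(1, 2)$ is read as a variance. The data and the helper `fit_report` are hypothetical.

```python
import numpy as np

# Synthetic data (assumption: the true target is linear in x1 and x2).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(0, 0.1, size=n)

def fit_report(features):
    """Least-squares fit with an intercept; returns (training MSE, matrix rank)."""
    A = np.column_stack([np.ones(n)] + features)
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    train_mse = float(np.mean((A @ theta - y) ** 2))
    return train_mse, int(np.linalg.matrix_rank(A))

base_mse, base_rank = fit_report([x1, x2])

# Case 1: independent noise feature, N(1, 2) read as mean 1 and variance 2.
x3_noise = rng.normal(1.0, np.sqrt(2.0), size=n)
noise_mse, noise_rank = fit_report([x1, x2, x3_noise])

# Case 2: x3 is an exact linear combination of the existing features.
collinear_mse, collinear_rank = fit_report([x1, x2, 2.5 * x1 + x2])

print(base_mse, noise_mse, collinear_mse)
print(base_rank, noise_rank, collinear_rank)
```

In this setting the collinear feature leaves the rank of the design matrix, and therefore the training error, unchanged, which means the coefficients are no longer uniquely determined. The noise feature can only lower the training error, since the baseline is a special case of the larger model.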

## Q3. Classification

1. **Your boss asked you to solve the problem using a perceptron and now
he's upset because you are getting poor results. How would you
justify the poor performance of your perceptron classifier to your
boss?**

1. **Would you expect to have better luck with a neural network with
activation function $h(x) = - x \cdot e^{-2}$ for the hidden units?**

1. **What are the main differences and similarities between the
perceptron and the logistic regression neuron?**
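
On the second question above, it may help to note that the proposed activation is linear: $h(x) = c \cdot x$ with $c = -e^{-2}$. A minimal sketch for a single hidden layer, where the weights $W_1, W_2$ and biases $\mathbf{b}_1, \mathbf{b}_2$ are hypothetical symbols introduced only for illustration and $h$ is applied element-wise:

$$W_2 \, h(W_1 \mathbf{x} + \mathbf{b}_1) + \mathbf{b}_2 = c \, W_2 W_1 \mathbf{x} + (c \, W_2 \mathbf{b}_1 + \mathbf{b}_2), \qquad c = -e^{-2}$$

Under these assumptions the output before the final unit is still an affine function of $\mathbf{x}$, so the hidden layer adds no non-linearity beyond what a single linear unit already provides.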