From 68712ae0e0b7dbf374629a7d6d862f84716d582f Mon Sep 17 00:00:00 2001
From: "Claudio Maggioni (maggicl)"
Date: Tue, 4 May 2021 16:18:52 +0200
Subject: [PATCH] wip

---
 assignment_1/report_Maggioni_Claudio.md | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/assignment_1/report_Maggioni_Claudio.md b/assignment_1/report_Maggioni_Claudio.md
index 6362809..808dc7e 100644
--- a/assignment_1/report_Maggioni_Claudio.md
+++ b/assignment_1/report_Maggioni_Claudio.md
@@ -112,7 +112,7 @@ better.
    the training error curve decreases as the model complexity increases,
    albeit in a less steep fashion as its behaviour in (a).
 
-1. **Is any of the three section associated with the concepts of
+2. **Is any of the three sections associated with the concepts of
    overfitting and underfitting? If yes, explain it.**
 
    Section (a) is associated with underfitting and section (c) is associated
@@ -143,7 +143,7 @@ better.
    data without learning noise. Thus, both the validation and the test MSE
    curves reach their lowest point in this region of the graph.
 
-1. **Is there any evidence of high approximation risk? Why? If yes, in
+3. **Is there any evidence of high approximation risk? Why? If yes, in
    which of the below subfigures?**
 
    Depending on the scale and magnitude of the x axis, there could be
@@ -158,7 +158,7 @@ better.
    test error, since the inherent structure behind the chosen family of
    models would be unable to capture the true behaviour of the data.
 
-1. **Do you think that by further increasing the model complexity you
+4. **Do you think that by further increasing the model complexity you
    will be able to bring the training error to zero?**
 
    Yes, I think so. The model complexity could be increased up to the point
@@ -170,7 +170,7 @@ better.
    learned as well, thus making the model completely useless for prediction
    of new datapoints.
 
-1. **Do you think that by further increasing the model complexity you
+5. **Do you think that by further increasing the model complexity you
    will be able to bring the structural risk to zero?**
 
    No, I don't think so. In order to achieve zero structural risk we would need
@@ -197,6 +197,11 @@ Comment and compare how the (a.) training error, (b.) test error and
    justify the poor performance of your perceptron classifier to your
    boss?**
 
+   The classification problem shown in the graph is essentially the XOR
+   (exclusive-or) problem. Minsky and Papert proved in 1969 that a single
+   perceptron cannot solve it, since the classes are not linearly separable,
+   so the poor performance of the classifier is expected.
+
 1. **Would you expect to have better luck with a neural network with
    activation function $h(x) = - x \cdot e^{-2}$ for the hidden units?**
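
The last hunk above argues that the classifier's poor performance mirrors the XOR problem of Minsky and Papert (1969). As a minimal sketch of that argument (not part of the patch itself, and assuming a standard single-layer Rosenblatt perceptron with the error-correction update rule), the same learning rule that solves the linearly separable AND problem never reaches full accuracy on XOR:

```python
# Illustrative sketch only: a classic Rosenblatt perceptron trained on
# AND (linearly separable) and XOR (not linearly separable). Since XOR
# admits no linear separator, no weight setting can classify all 4 points.

def step(z):
    """Heaviside threshold used as the perceptron activation."""
    return 1 if z >= 0 else 0

def train_perceptron(data, lr=0.1, epochs=1000):
    """Train weights (w0, w1) and bias b with the perceptron update rule."""
    w0 = w1 = b = 0.0
    for _ in range(epochs):
        for (x0, x1), y in data:
            err = y - step(w0 * x0 + w1 * x1 + b)
            w0 += lr * err * x0
            w1 += lr * err * x1
            b += lr * err
    return w0, w1, b

def n_correct(params, data):
    """Count how many points the trained perceptron classifies correctly."""
    w0, w1, b = params
    return sum(step(w0 * x0 + w1 * x1 + b) == y for (x0, x1), y in data)

and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

correct_and = n_correct(train_perceptron(and_data), and_data)  # converges: 4/4
correct_xor = n_correct(train_perceptron(xor_data), xor_data)  # never 4/4
print(f"AND: {correct_and}/4, XOR: {correct_xor}/4")
```

The AND run converges by the perceptron convergence theorem, while the XOR weights cycle forever and top out at 3/4 at best, which is the formal content of the "justify it to your boss" answer.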