report done up to training

Claudio Maggioni 2023-05-24 18:15:43 +02:00
parent 185bee2933
commit 2797dc7a9d
3 changed files with 16 additions and 1 deletion

@ -150,7 +150,15 @@ The script `./train_classifiers.py`, according to the random seed $3735924759$,
The metrics for each classifier and each hyperparameter configuration in decreasing order of
accuracy are reported in the following sections.
For each classifier, I then choose the hyperparameter configuration with the highest accuracy. These configurations are:

| **Classifier** | **Hyperparameter configuration** | **Precision** | **Accuracy** | **Recall** | **F1 Score** |
|:----|:--------|-:|-:|-:|--:|
| DecisionTreeClassifier | `criterion`: gini, `splitter`: best | 0.7885 | 0.8506 | 0.9535 | 0.8632 |
| GaussianNB | -- | 0.8 | 0.6782 | 0.4651 | 0.5882 |
| MLPClassifier | `activation`: logistic, `hidden_layer_sizes`: (60, 80, 100), `learning_rate`: constant, `max_iter`: 500000, `solver`: lbfgs | 0.8958 | 0.9425 | 1 | 0.9451 |
| RandomForestClassifier | `class_weight`: balanced, `criterion`: gini, `max_features`: sqrt | 0.8367 | 0.8851 | 0.9535 | 0.8913 |
| SVC | `gamma`: scale, `kernel`: rbf | 0.7174 | 0.7356 | 0.7674 | 0.7416 |
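
The selection rule described above (keep, for each classifier, the configuration with the highest accuracy) can be sketched with pandas as follows. This is only an illustrative sketch, not the code in `./train_classifiers.py`: the helper name `best_by_accuracy` and the sample rows are hypothetical, and the column names `classifier`, `params` and `accuracy` are assumed to match the results table.

```python
import pandas as pd

# Hypothetical results: one row per (classifier, configuration).
# These values are made up for illustration and are not from the report.
df = pd.DataFrame({
    "classifier": ["A", "A", "B", "B"],
    "params": ["p1", "p2", "q1", "q2"],
    "accuracy": [0.70, 0.85, 0.90, 0.60],
})


def best_by_accuracy(df: pd.DataFrame) -> pd.DataFrame:
    """Return, for each classifier, the row with the highest accuracy."""
    # idxmax yields the index label of the first maximum in each group
    best_idx = df.groupby("classifier")["accuracy"].idxmax()
    return df.loc[best_idx].reset_index(drop=True)


df_best = best_by_accuracy(df)
```

Under this sketch, ties are resolved in favour of the first row encountered; the report does not state how ties are handled, so that behaviour is an assumption.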

## Decision Tree (DT)
@ -300,6 +308,8 @@ For sake of brevity, only the top 100 results by accuracy are shown.
| gini | balanced_subsample | log2 | 0.803922 | 0.862069 | 0.953488 | 0.87234 |
| entropy | balanced_subsample | log2 | 0.803922 | 0.862069 | 0.953488 | 0.87234 |

# Evaluation

## Output Distributions

Binary file not shown.

@ -150,6 +150,11 @@ def find_best_and_save(df: pd.DataFrame):
    metrics = ['precision', 'accuracy', 'recall', 'f1']
    df_best.loc[:, metrics] = df_best.loc[:, metrics].round(decimals=4)

    # Put the identifying columns first, followed by the metric columns
    df_best = df_best.reindex(
        ['classifier', 'params'] +
        [x for x in df_best.columns if x in metrics],
        axis=1)

    print(df_best.to_markdown(index=False))