compiled report for part 1

Claudio Maggioni 2023-03-06 15:33:39 +01:00
parent 541a13b352
commit 3b04697ffe
2 changed files with 111 additions and 0 deletions

report/main.md Normal file

@@ -0,0 +1,111 @@
---
author: Claudio Maggioni
title: Information Modelling & Analysis -- Project 1
---
<!--
The following shows a minimal submission report for project 1. If you
choose to use this template, replace all template instructions (the
yellow bits) with your own values. In addition, for any section, if
**and only if** anything was unclear or warnings were raised by the
code, and you had to make assumptions about the correct implementation
(e.g., about details of a metric), describe your assumptions in one or
two sentences.
You may, at your own risk, also choose not to use this template. As
long as your submission is a LaTeX-generated, English PDF containing all
expected info, you'll be fine.
-->
# Code Repository
The code and result files that are part of this submission can be found at:
::: center
Repository: \url{https://github.com/infoMA2023/project-01-god-classes-maggicl}
Commit ID: **TBD**
:::
# Data Pre-Processing
## God Classes
::: {#tab:god_classes}
------------------------------------------------- ---------------
**Class Name**                                    **\# Methods**
org.apache.xerces.dom.CoreDocumentImpl            125
org.apache.xerces.impl.xs.traversers.XSDHandler   118
org.apache.xerces.xinclude.XIncludeHandler        116
org.apache.xerces.impl.dtd.DTDGrammar             101
------------------------------------------------- ---------------
: Identified God Classes
:::
The god classes I identified and their corresponding numbers of methods
can be found in Table [1](#tab:god_classes){reference-type="ref"
reference="tab:god_classes"}.
## Feature Vectors
Table [2](#tab:feat_vec){reference-type="ref" reference="tab:feat_vec"}
shows aggregate numbers regarding the extracted feature vectors for the
god classes.
::: {#tab:feat_vec}
---------------- ------------------------ ---------------------
**Class Name**   **\# Feature Vectors**   **\# Attributes\***
\...             \...                     \...
---------------- ------------------------ ---------------------
: Feature vector summary (\*= used at least once)
:::
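As an illustration of how the two counts in the table could be derived, here is a minimal sketch. It assumes that each feature vector is a 0/1 row describing one method, with one column per candidate attribute. This layout, the function name `summarize_feature_vectors`, and the toy data are hypothetical and are not taken from the project code.

```python
def summarize_feature_vectors(vectors):
    """Return (number of vectors, number of attributes used at least once).

    `vectors` is a hypothetical list of equal-length 0/1 lists, one per method.
    """
    n_vectors = len(vectors)
    if n_vectors == 0:
        return 0, 0
    n_columns = len(vectors[0])
    # An attribute counts as "used" if any vector has a non-zero entry for it.
    used = sum(1 for j in range(n_columns) if any(row[j] for row in vectors))
    return n_vectors, used


# Toy data (made up): three methods, four candidate attributes.
example = [
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]
summary = summarize_feature_vectors(example)
```

In the toy data, only the first and last columns are ever non-zero, so two of the four attributes count as used.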
# Clustering {#sec:clustering}
## Algorithm Configurations
Report and comment on the algorithm configurations (distance function,
linkage rule, etc.). You may do so in any form you see fit, but a short
paragraph of text is probably sufficient.
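To make the notions of distance function and linkage rule concrete, here is a minimal pure-Python sketch of agglomerative clustering. The Euclidean distance, the three linkage rules, and all names are assumptions chosen for illustration. They are not necessarily the configuration used in this submission.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Each linkage rule reduces all pairwise distances between two clusters to one value.
LINKAGES = {
    "single": min,
    "complete": max,
    "average": lambda ds: sum(ds) / len(ds),
}


def agglomerative(points, k, linkage="complete", distance=euclidean):
    """Merge the two closest clusters until k remain; returns lists of point indices."""
    if k < 1:
        raise ValueError("k must be at least 1")
    combine = LINKAGES[linkage]
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None  # (distance, index of first cluster, index of second cluster)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                ds = [distance(points[p], points[q])
                      for p in clusters[i] for q in clusters[j]]
                d = combine(ds)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy one-dimensional data (made up for illustration).
points = [[0.0], [1.0], [10.0], [11.0], [30.0]]
clusters = agglomerative(points, k=2, linkage="single")
```

The sketch recomputes all pairwise distances at every merge, which keeps it short but makes it unsuitable for large inputs.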
## Testing Various K & Silhouette Scores
\(1\) Report data about the clusters produced by the two algorithms at
various k (number of clusters, cluster sizes, silhouette scores). You may
use any suitable format (table, graph, \...).
\(2\) Briefly comment on your results. What is the best configuration, and
why? Anything else you observed?
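For reference, the silhouette score of a clustering can be computed as sketched below. This is a minimal pure-Python version that assumes Euclidean distance and scores points in singleton clusters as 0. The function names and the toy data are hypothetical, and the project code may well rely on a library implementation instead.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels, distance=euclidean):
    """Mean silhouette coefficient over all points (singletons score 0)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton contributes 0 to the sum
        # a: mean distance to the other members of the point's own cluster
        a = sum(distance(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(distance(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Toy one-dimensional data (made up): two obvious groups.
points = [[0.0], [1.0], [10.0], [11.0]]
good_score = silhouette_score(points, [0, 0, 1, 1])
bad_score = silhouette_score(points, [0, 1, 0, 1])
```

A labelling that keeps nearby points together scores close to 1, while one that splits them scores below 0, which is why the score can be used to compare different values of k.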
# Evaluation
## Ground Truth
I computed the ground truth using the command \.... The generated files
are checked into the repository with the names \....
Comment briefly on the strengths & weaknesses of our ground truth.
## Precision and Recall
::: {#tab:eval}
---------------- ------------------- -------- ------------- --------
**Class Name**   **Agglomerative**            **K-Means**
                 Prec.               Recall   Prec.         Recall
\...             \...                \...     \...          \...
---------------- ------------------- -------- ------------- --------
: Evaluation Summary
:::
Precision and Recall, for the optimal configurations found in Section
[3](#sec:clustering){reference-type="ref" reference="sec:clustering"},
are reported in Table [3](#tab:eval){reference-type="ref"
reference="tab:eval"}.
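One common way to compute precision and recall for a clustering is over intra-cluster pairs: every pair of elements placed in the same cluster is compared against the pairs that share a cluster in the ground truth. The sketch below assumes this pairwise definition, which this report does not spell out. The function names and the toy clusters are hypothetical.

```python
from itertools import combinations


def intra_pairs(clusters):
    """Set of unordered element pairs that share a cluster."""
    pairs = set()
    for members in clusters:
        # Sorting makes each pair appear in one canonical order.
        for a, b in combinations(sorted(members), 2):
            pairs.add((a, b))
    return pairs


def precision_recall(predicted, ground_truth):
    """Pairwise precision and recall of `predicted` against `ground_truth`."""
    pred = intra_pairs(predicted)
    truth = intra_pairs(ground_truth)
    common = pred & truth
    precision = len(common) / len(pred) if pred else 0.0
    recall = len(common) / len(truth) if truth else 0.0
    return precision, recall


# Toy clusterings over four made-up method names.
predicted = [{"a", "b", "c"}, {"d"}]
truth = [{"a", "b"}, {"c", "d"}]
result = precision_recall(predicted, truth)
```

In the toy data, one of the three predicted pairs also appears in the ground truth, and that pair is one of the two ground-truth pairs, giving a precision of 1/3 and a recall of 1/2.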
## Practical Usefulness
Discuss the practical usefulness of the obtained code refactoring
assistant in a realistic setting (1 paragraph).

BIN
report/main.pdf Normal file

Binary file not shown.