Information Modelling & Analysis: Project 1
Student: enter your name here
Project instructions:
Please follow the instructions provided in the project slides. For your convenience, the source code to be analyzed (xerces2) has already been added to this repository (/resources/xerces2-j-src).
Attention: Please consider the submission instructions available on iCorsi.
Report: You may want to use the template distributed on iCorsi.
Install dependencies
# create venv
python -m venv env
source env/bin/activate
pip3 install -r requirements.txt
Running part 1: find god classes
./find_god_classes.py
The resulting CSV file, which lists the God classes, is written to god_classes/god_classes.csv.
Running part 3: clustering and silhouette metric
To compute the optimal k-means and agglomerative clusterings for all classes, using silhouette validation, run:
./silhouette.py --validate --autorun
To compute a k-means or agglomerative clustering for a specific number of clusters K and a specific class KLASS, run respectively:
./k_means.py KLASS K
./hierarchical.py KLASS K
Then, to check their silhouette metric run:
./silhouette.py
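For readers unfamiliar with the metric, the following is a minimal sketch of the standard silhouette coefficient in plain Python. It is not the code in silhouette.py; the function name `silhouette_score`, the use of Euclidean distance, and the sample points are assumptions made for illustration only.

```python
from math import dist  # Euclidean distance, Python 3.8+


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (hypothetical helper)."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            # Common convention: a singleton cluster contributes 0.
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Assumed toy data: two well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])  # clusters match the pairs
bad = silhouette_score(points, [0, 1, 0, 1])   # clusters split the pairs
```

Values close to 1 indicate compact, well-separated clusters, while values at or below 0 suggest that points sit closer to another cluster than to their own.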
Compile report
- Install Pandoc;
- Run pandoc main.md -o main.pdf in the report directory.