This repository has been archived on 2023-06-18. You can view files and clone it, but cannot push or open issues or pull requests.

Information Modelling & Analysis: Project 1

Student: enter your name here

Project instructions:

Please follow the instructions provided in the project slides. For your convenience, the source code to be analyzed (xerces2) has already been added to this repository (/resources/xerces2-j-src).

Attention: Please follow the submission instructions available on iCorsi.

Report: You may want to use the template distributed on iCorsi.

Install dependencies

# create venv
python -m venv env
source env/bin/activate

pip3 install -r requirements.txt

Running part 1: find god classes

./find_god_classes.py

The resulting CSV file, which lists the God classes, is written to god_classes/god_classes.csv.

Running part 3: clustering and silhouette metric

To compute the optimal k-means and agglomerative clusterings for all classes, using silhouette validation, run:

./silhouette.py --validate --autorun

To compute a k-means or an agglomerative clustering of a specific class KLASS with a specific number of clusters K, run the first or the second command, respectively:

./k_means.py KLASS K
./hierarchical.py KLASS K
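
For readers who want to see what a k-means step involves, here is a minimal sketch of Lloyd's algorithm in plain Python. It is not the code in k_means.py, which is not reproduced in this readme; the function names k_means and sq_dist, the parameters, and the list-of-lists input format are assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(vectors, k, max_iter=100, seed=0):
    """Cluster `vectors` into k groups (hypothetical helper, Lloyd's algorithm).

    Returns (labels, centroids), where labels[i] is the cluster index of
    vectors[i].
    """
    rng = random.Random(seed)
    # Start from k input vectors chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Calling k_means(vectors, 2) on two well-separated groups of points should place each group in its own cluster. The result depends on the random starting centroids, so the seed is fixed here to keep runs repeatable.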

Then, to check the silhouette metric of the resulting clusterings, run:

./silhouette.py
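
As background on the metric itself, the sketch below computes the mean silhouette coefficient of a clustering in plain Python. It is an illustration under stated assumptions, not the code in silhouette.py: the function name silhouette_score, the Euclidean distance, and the input format (a list of vectors plus a parallel list of cluster labels) are all hypothetical choices.

```python
import math


def silhouette_score(vectors, labels):
    """Mean silhouette coefficient of a clustering (hypothetical helper).

    The value lies between -1 and 1; higher means tighter, better
    separated clusters.
    """
    clusters = {}
    for vector, label in zip(vectors, labels):
        clusters.setdefault(label, []).append(vector)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(vector, others):
        return sum(math.dist(vector, o) for o in others) / len(others)

    total = 0.0
    for vector, label in zip(vectors, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # by convention a singleton cluster scores 0
        # a: mean distance to the other members of the same cluster.
        # The sum includes the vector's zero distance to itself,
        # so dividing by n - 1 gives the mean over the others.
        a = sum(math.dist(vector, o) for o in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            mean_dist(vector, members)
            for other, members in clusters.items()
            if other != label
        )
        if max(a, b) > 0:  # avoid dividing by zero when all points coincide
            total += (b - a) / max(a, b)
    return total / len(vectors)
```

A common way to use such a score is to cluster the same data for several values of K and keep the K with the highest mean silhouette.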

Compile report

  • Install Pandoc;
  • Run pandoc main.md -o main.pdf in the report directory.