report done

Claudio Maggioni 2023-05-06 17:12:32 +02:00
parent a0043d03be
commit 11c5b98b72
7 changed files with 94 additions and 0 deletions

Binary file not shown (presumably Assignment2_part2/report.pdf, the output of the build script below).

Assignment2_part2/report/.gitignore (new file)

@@ -0,0 +1 @@
_tmp.md

Build script (new file in Assignment2_part2/report/; file name not shown)

@@ -0,0 +1,9 @@
#!/bin/bash
set -e
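# Resolve the directory containing this script, so it can be invoked from anywhere.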
SCRIPT_DIR=$(cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd)
cd "$SCRIPT_DIR"
m4 -I"$SCRIPT_DIR" main.md > _tmp.md
pandoc _tmp.md -o ../report.pdf

Three binary image files not shown (230 KiB, 226 KiB, and 2.4 MiB): presumably dashboard.png, canvas_any.png, and canvas_city.png, the screenshots referenced in the report below.

Assignment2_part2/report/main.md (new file)

@@ -0,0 +1,84 @@
---
author: Claudio Maggioni
title: Visual Analytics -- Assignment 2 -- Part 2
geometry: margin=2cm,bottom=3cm
---
changequote(`{{', `}}')
# Indexing

Similarly to part 1 of the assignment, the first step of indexing is to convert
the newly given CSV dataset (stored in `data/restaurants_extended.csv`) into a
JSON-lines file that can be used directly as the HTTP request body of
Elasticsearch document insertion requests.

The conversion is performed by the script `./convert.sh`. The converted file
is stored in the JSON-lines file `data/restaurants_extended.jsonl`.
The sources of `./convert.sh` are listed below:

```shell
include({{../convert.sh}})
```

The only change in the script is the way the field containing the restaurant
location is parsed. In the extended dataset, city, country, and continent are
all stored in this field, separated by `/`. The script maps the three values
into separate fields and also copies the entire string into an additional
`cityRaw` field, which is used to generate the `continent_scripted` field for
part 2 (see below).
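
Purely as an illustration of this split (the actual `convert.sh` sources are
included above), a hypothetical `jq` invocation on a made-up `location` field
could look as follows:

```shell
# Illustrative sketch only: the input field name "location" and the sample
# value are assumptions, not taken from the real dataset or script.
echo '{"location":"Lugano/Switzerland/Europe"}' | jq '
  (.location / "/") as $p
  | {city: $p[0], country: $p[1], continent: $p[2], cityRaw: .location}'
```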

The sources of the updated upload script, which loads the new index, are
listed below:

```shell
include({{../upload.sh}})
```
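
For reference, a minimal sketch of the kind of bulk-insertion request the
upload script presumably performs is shown below. The index name
`restaurants` is an assumption, and the JSON-lines file is assumed to already
interleave the `{"index":{}}` action lines that the bulk API expects:

```shell
# Minimal sketch; index name and file layout are assumptions (see above).
curl -s -X POST "localhost:9200/restaurants/_bulk" \
  -H "Content-Type: application/x-ndjson" \
  --data-binary "@data/restaurants_extended.jsonl"
```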

Mappings are stored in `mappings.json` and are identical to the ones in part 1
except for the new location fields and their `.keyword` counterparts, which
are generated in the same way as for the old `city` field.
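
As a minimal sketch of what such a mapping may look like (the actual
`mappings.json` is not reproduced here, and the index name `restaurants` is an
assumption):

```shell
# Illustrative sketch: a text field with a .keyword sub-field, as used for
# the new location fields.
curl -s -X PUT "localhost:9200/restaurants" \
  -H "Content-Type: application/json" -d '{
    "mappings": {
      "properties": {
        "country":   {"type": "text",
                      "fields": {"keyword": {"type": "keyword"}}},
        "continent": {"type": "text",
                      "fields": {"keyword": {"type": "keyword"}}}
      }
    }
  }'
```
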
9499 documents are imported.
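
The import can be checked, for instance, with a document count query (index
name assumed):

```shell
# Assumes the index is named "restaurants"; should report 9499 documents.
curl -s "localhost:9200/restaurants/_count"
```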

# Data Visualization

The Dashboard, Canvas, and requested dependencies (such as scripted fields and
stored searches) are stored in the JSON object export file `export.ndjson`.
Screenshots of the Dashboard and Canvas can be found below.

The scripted field `continent_scripted` has been generated with the following
Painless expression:

```java
doc['cityRaw.keyword'].value.substring(doc['cityRaw.keyword'].value.lastIndexOf("/") + 1)
```

The expression extracts the last portion of the `cityRaw` field, i.e. the
portion of text between the last `/` and the end of the field, which contains
the continent. For example, for a `cityRaw` value of the form
`City/Country/Continent`, the expression yields `Continent`.

![Part 2 Dashboard](dashboard.png)

![Part 2 Canvas with no city selected](canvas_any.png)

![Part 2 Canvas with a city selected](canvas_city.png)

# Ingestion Plugin

Sources for the ingestion plugin can be found in the GitLab repository
[_usi-si-teaching/msde/2022-2023/visual-analytics-atelier/elasticsearch-plugin/ingest-lookup-maggicl_](https://gitlab.com/usi-si-teaching/msde/2022-2023/visual-analytics-atelier/elasticsearch-plugin/ingest-lookup-maggicl).

The plugin can be built and installed on Elasticsearch with the script
`./install-on-ec.sh` included in the repository, after changing the variable
`ES_LOCATION` to point to the local Elasticsearch installation.

The plugin works as illustrated in the `README.md` file in the repository, and
it has been tested with a unit test suite included in its sources.

The plugin lookup procedure works by splitting the indicated field into words
(non-empty sequences of non-space characters, i.e. `\S+` in PCRE terms) and
matching each word against the given substitution map, performing
substitutions where needed.
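
As a hedged usage sketch, an ingest processor of this kind can be exercised
through the Elasticsearch pipeline simulate API. The processor name `lookup`
and its `field`/`map` option names below are assumptions made for
illustration, not taken from the plugin's README:

```shell
# Illustrative sketch: the processor name and its options are assumptions.
curl -s -X POST "localhost:9200/_ingest/pipeline/_simulate" \
  -H "Content-Type: application/json" -d '{
    "pipeline": {
      "processors": [
        {"lookup": {"field": "city", "map": {"NYC": "New York"}}}
      ]
    },
    "docs": [{"_source": {"city": "NYC"}}]
  }'
```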