report done

parent a0043d03be
commit 11c5b98b72
7 changed files with 94 additions and 0 deletions
BIN  Assignment2_part2/report.pdf  Normal file (binary file not shown)
1  Assignment2_part2/report/.gitignore  vendored  Normal file
@@ -0,0 +1 @@
_tmp.md
9  Assignment2_part2/report/build.sh  Executable file
@@ -0,0 +1,9 @@
#!/bin/bash

set -e

SCRIPT_DIR=$(cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd)

cd "$SCRIPT_DIR"
m4 -I"$SCRIPT_DIR" main.md > _tmp.md
pandoc _tmp.md -o ../report.pdf
BIN  Assignment2_part2/report/canvas_any.png  Normal file (binary file not shown; 230 KiB)
BIN  Assignment2_part2/report/canvas_city.png  Normal file (binary file not shown; 226 KiB)
BIN  Assignment2_part2/report/dashboard.png  Normal file (binary file not shown; 2.4 MiB)
84  Assignment2_part2/report/main.md  Normal file
@@ -0,0 +1,84 @@
---
author: Claudio Maggioni
title: Visual Analytics -- Assignment 2 -- Part 2
geometry: margin=2cm,bottom=3cm
---

changequote(`{{', `}}')
# Indexing

Similarly to part 1 of the assignment, the first step of indexing is to convert
the newly given CSV dataset (stored in `data/restaurants_extended.csv`) into a
JSON-lines file which can be used directly as the HTTP request body of
Elasticsearch document insertion requests.

The conversion is performed by the script `./convert.sh`. The converted file
is stored in the JSON-lines file `data/restaurants_extended.jsonl`.

The sources of `./convert.sh` are listed below:

```shell
include({{../convert.sh}})
```

The only change in the script is the way the field containing the restaurant
location is parsed. In the extended dataset, city, country and continent are
stored in this field, separated by `/`. The script maps the three values to
separate fields, and additionally maps the entire string to an extra `cityRaw`
field, which is used in the generation of the runtime field for part 2.
The sources of the updated upload script, which loads the new index, are
listed below:

```shell
include({{../upload.sh}})
```

Mappings are stored in `mappings.json` and are identical to the ones in Part 1,
other than the new location fields and their `.keyword` counterparts, which are
generated similarly to the old `city` field.
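For illustration, a text field with a `.keyword` counterpart follows the standard Elasticsearch multi-field pattern; a sketch of what such an entry in `mappings.json` could look like is shown below (field names match the report's description, but the exact settings are assumptions, not copied from the real file):

```json
{
  "mappings": {
    "properties": {
      "country": {
        "type": "text",
        "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
      },
      "continent": {
        "type": "text",
        "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
      }
    }
  }
}
```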
9499 documents are imported.

# Data Visualization

The Dashboard, Canvas, and requested dependencies (like scripted fields and
stored searches) are stored in the JSON object export file `export.ndjson`.
Screenshots of the Dashboard and Canvas can be found below.

The scripted field `continent_scripted` has been generated with the following
Painless expression:

```java
doc['cityRaw.keyword'].value.substring(doc['cityRaw.keyword'].value.lastIndexOf("/") + 1)
```

The expression extracts the last portion of the `cityRaw` field, i.e. the
portion of text between the last `/` and the end of the field, which contains
the continent.
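The same extraction can be sanity-checked outside Elasticsearch; a minimal shell equivalent of the Painless logic (substring after the last `/`, with an invented example value) is:

```shell
#!/bin/bash
# Mirror of the Painless expression: take the text after the last '/'.
cityRaw="Lugano/Switzerland/Europe"   # hypothetical cityRaw value
continent="${cityRaw##*/}"            # like lastIndexOf("/") + substring
echo "$continent"                     # prints: Europe
```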
![Part 2 Dashboard](dashboard.png)

![Part 2 Canvas with no city selected](canvas_any.png)

![Part 2 Canvas with a city selected](canvas_city.png)

# Ingestion Plugin

Sources for the ingestion plugin can be found in the Gitlab repository:

[_usi-si-teaching/msde/2022-2023/visual-analytics-atelier/elasticsearch-plugin/ingest-lookup-maggicl_](https://gitlab.com/usi-si-teaching/msde/2022-2023/visual-analytics-atelier/elasticsearch-plugin/ingest-lookup-maggicl).

The plugin can be built and installed on Elasticsearch with the script
`./install-on-ec.sh` included in the repository, after changing the variable
`ES_LOCATION` to the path of the local Elasticsearch installation.

The plugin works as illustrated in the `README.md` file in the repository, and
it has been tested with a unit test suite included in its sources.

The plugin lookup procedure works by splitting the indicated field into words
(non-empty sequences of non-space characters, according to the PCRE regular
expression specification) and matching each word against the given
substitution map, performing substitutions when needed.
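The plugin itself is implemented in Java (see the repository above); as a language-neutral sketch of the lookup idea, the following shell function splits a value into whitespace-separated words and replaces each word found in a substitution map. The map contents here are invented for illustration:

```shell
#!/bin/bash
# Sketch only: split a field value into words (runs of non-space characters)
# and substitute each word that appears in a hypothetical lookup map.
lookup() {
  local word
  local out=()
  for word in $1; do                 # default IFS splits on whitespace runs
    case "$word" in                  # invented substitution map
      USA) word="United States" ;;
      UK)  word="United Kingdom" ;;
    esac
    out+=("$word")
  done
  printf '%s\n' "${out[*]}"          # rejoin substituted words with spaces
}

lookup "New York USA"                # prints: New York United States
```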