report done

parent a0043d03be
commit 11c5b98b72
7 changed files with 94 additions and 0 deletions
BIN  Assignment2_part2/report.pdf  Normal file (binary file not shown)
1  Assignment2_part2/report/.gitignore  vendored  Normal file
@@ -0,0 +1 @@
_tmp.md
9  Assignment2_part2/report/build.sh  Executable file
@@ -0,0 +1,9 @@
#!/bin/bash

set -e

SCRIPT_DIR=$(cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd)

cd "$SCRIPT_DIR"
m4 -I"$SCRIPT_DIR" main.md > _tmp.md
pandoc _tmp.md -o ../report.pdf
BIN  Assignment2_part2/report/canvas_any.png  Normal file (binary file not shown; 230 KiB)
BIN  Assignment2_part2/report/canvas_city.png  Normal file (binary file not shown; 226 KiB)
BIN  Assignment2_part2/report/dashboard.png  Normal file (binary file not shown; 2.4 MiB)
84  Assignment2_part2/report/main.md  Normal file
@@ -0,0 +1,84 @@
---
author: Claudio Maggioni
title: Visual Analytics -- Assignment 2 -- Part 2
geometry: margin=2cm,bottom=3cm
---

changequote(`{{', `}}')
# Indexing

Similarly to part 1 of the assignment, the first step of indexing is to convert
the newly given CSV dataset (stored in `data/restaurants_extended.csv`) into a
JSON-lines file which can be used directly as the HTTP request body of
Elasticsearch document insertion requests.

The conversion is performed by the script `./convert.sh`. The converted file
is stored in the JSON-lines file `data/restaurants_extended.jsonl`.

The sources of `./convert.sh` are listed below:

```shell
include({{../convert.sh}})
```

The only change in the script is the way the field containing the restaurant
location is parsed. In the extended dataset, city, country and continent are
stored in this field, separated by `/`. The script maps the three values to
separate fields, and additionally maps the entire string to an extra `cityRaw`
field, which is used in the generation of the runtime field for part 2.
The sources of the updated upload script, which loads the new index, are
listed below:

```shell
include({{../upload.sh}})
```

Mappings are stored in `mappings.json` and are identical to the ones in Part 1,
other than the new location fields and their `.keyword` counterparts, which are
generated similarly to the old `city` field.
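For illustration, a text field with a `.keyword` counterpart follows the standard Elasticsearch multi-field pattern; a sketch of what such an entry in `mappings.json` could look like is shown below (field names match the report's description, but the exact settings are assumptions, not copied from the real file):

```json
{
  "mappings": {
    "properties": {
      "country": {
        "type": "text",
        "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
      },
      "continent": {
        "type": "text",
        "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
      }
    }
  }
}
```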
9499 documents are imported.

# Data Visualization

The Dashboard, Canvas, and requested dependencies (like scripted fields and
stored searches) are stored in the JSON object export file `export.ndjson`.
Screenshots of the Dashboard and Canvas can be found below.

The scripted field `continent_scripted` has been generated with the following
Painless expression:

```java
doc['cityRaw.keyword'].value.substring(doc['cityRaw.keyword'].value.lastIndexOf("/") + 1)
```

The expression extracts the last portion of the `cityRaw` field, i.e. the
portion of text between the last `/` and the end of the field, which contains
the continent.
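The same extraction can be sanity-checked outside Elasticsearch; a minimal shell equivalent of the Painless logic (substring after the last `/`, with an invented example value) is:

```shell
#!/bin/bash
# Mirror of the Painless expression: take the text after the last '/'.
cityRaw="Lugano/Switzerland/Europe"   # hypothetical cityRaw value
continent="${cityRaw##*/}"            # like lastIndexOf("/") + substring
echo "$continent"                     # prints: Europe
```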
![Part 2 Dashboard](dashboard.png)

![Part 2 Canvas with no city selected](canvas_any.png)

![Part 2 Canvas with a city selected](canvas_city.png)

# Ingestion Plugin

Sources for the ingestion plugin can be found in the Gitlab repository:

[_usi-si-teaching/msde/2022-2023/visual-analytics-atelier/elasticsearch-plugin/ingest-lookup-maggicl_](https://gitlab.com/usi-si-teaching/msde/2022-2023/visual-analytics-atelier/elasticsearch-plugin/ingest-lookup-maggicl).

The plugin can be built and installed on Elasticsearch with the script
`./install-on-ec.sh` included in the repository, after changing the variable
`ES_LOCATION` to the path of the local Elasticsearch installation.

The plugin works as illustrated in the `README.md` file in the repository, and
it has been tested with a unit test suite included in its sources.

The plugin lookup procedure works by splitting the indicated field into words
(non-empty sequences of non-space characters, according to the PCRE regular
expression specification) and matching each word against the given
substitution map, performing substitutions when needed.
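The plugin itself is implemented in Java (see the repository above); as a language-neutral sketch of the lookup idea, the following shell function splits a value into whitespace-separated words and replaces each word found in a substitution map. The map contents here are invented for illustration:

```shell
#!/bin/bash
# Sketch only: split a field value into words (runs of non-space characters)
# and substitute each word that appears in a hypothetical lookup map.
lookup() {
  local word
  local out=()
  for word in $1; do                 # default IFS splits on whitespace runs
    case "$word" in                  # invented substitution map
      USA) word="United States" ;;
      UK)  word="United Kingdom" ;;
    esac
    out+=("$word")
  done
  printf '%s\n' "${out[*]}"          # rejoin substituted words with spaces
}

lookup "New York USA"                # prints: New York United States
```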