report done

This commit is contained in:
parent a0043d03be
commit 11c5b98b72

7 changed files with 94 additions and 0 deletions
BIN  Assignment2_part2/report.pdf  (normal file; binary file not shown)
Assignment2_part2/report/.gitignore  (vendored, normal file, 1 addition)
@@ -0,0 +1 @@
_tmp.md
Assignment2_part2/report/build.sh  (executable file, 9 additions)
@@ -0,0 +1,9 @@
#!/bin/bash

set -e

SCRIPT_DIR=$(cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd)

cd "$SCRIPT_DIR"
m4 -I"$SCRIPT_DIR" main.md > _tmp.md
pandoc _tmp.md -o ../report.pdf
BIN  Assignment2_part2/report/canvas_any.png  (normal file; binary file not shown; size after: 230 KiB)
BIN  Assignment2_part2/report/canvas_city.png  (normal file; binary file not shown; size after: 226 KiB)
BIN  Assignment2_part2/report/dashboard.png  (normal file; binary file not shown; size after: 2.4 MiB)
Assignment2_part2/report/main.md  (normal file, 84 additions)
@@ -0,0 +1,84 @@
---
author: Claudio Maggioni
title: Visual Analytics -- Assignment 2 -- Part 2
geometry: margin=2cm,bottom=3cm
---

changequote(`{{', `}}')

# Indexing

Similarly to part 1 of the assignment, the first step of indexing is to convert
the newly given CSV dataset (stored in `data/restaurants_extended.csv`) into a
JSON-lines file which can be directly used as the HTTP request body of
Elasticsearch document insertion requests.

The conversion is performed by the script `./convert.sh`. The converted file
is stored in the JSON-lines file `data/restaurants_extended.jsonl`.

The sources of `./convert.sh` are listed below:

```shell
include({{../convert.sh}})
```
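
The exact output of `./convert.sh` is only visible once the m4 include above is expanded, but the JSON-lines layout it targets is the standard Elasticsearch `_bulk` format of alternating action and document lines. The sketch below illustrates that layout with purely hypothetical field values:

```shell
# Illustrative sketch of the JSON-lines (_bulk) layout the conversion produces;
# an action line precedes each document line. Field values here are made up.
cat <<'EOF'
{"index":{}}
{"name":"Da Enzo al 29","city":"Rome","country":"Italy","continent":"Europe","cityRaw":"Rome/Italy/Europe"}
EOF
```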

The only change in the script is the way the field containing the restaurant
location is parsed. In the extended dataset, city, country, and continent are
stored in this field, separated by `/`. The script maps the three values to
separate fields and additionally copies the entire string to an additional
`cityRaw` field, which is used in the generation of the runtime field for
part 2.
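
The split described above can be sketched in shell as follows (the sample value and output field names are illustrative; the real logic lives in `convert.sh`, included earlier):

```shell
# Split a hypothetical "City/Country/Continent" value into three fields and
# keep the raw string for the cityRaw field.
location='Rome/Italy/Europe'
IFS='/' read -r city country continent <<< "$location"
printf '{"city":"%s","country":"%s","continent":"%s","cityRaw":"%s"}\n' \
  "$city" "$country" "$continent" "$location"
```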

The sources of the updated upload script, which loads the new index, are
listed below:

```shell
include({{../upload.sh}})
```

Mappings are stored in `mappings.json` and are identical to the ones in Part 1,
except for the new location fields and their `.keyword` counterparts, which are
generated in the same way as for the old `city` field.

9499 documents are imported.

# Data Visualization

The Dashboard, Canvas, and requested dependencies (like scripted fields and
stored searches) are stored in the JSON object export file `export.ndjson`.
Screenshots of the Dashboard and Canvas can be found below.

The scripted field `continent_scripted` has been generated with the following
Painless expression:

```java
doc['cityRaw.keyword'].value.substring(doc['cityRaw.keyword'].value.lastIndexOf("/") + 1)
```

The expression extracts the last portion of the `cityRaw` field, i.e. the
portion of text between the last `/` and the end of the field, which contains
the continent.
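
The same extraction can be sanity-checked in shell with parameter expansion (the sample value is hypothetical):

```shell
# Extract everything after the last '/' -- equivalent to the
# lastIndexOf("/") + substring logic of the Painless expression above.
cityRaw='Rome/Italy/Europe'
continent="${cityRaw##*/}"   # strip the longest prefix ending in '/'
echo "$continent"
```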

![Part 2 Dashboard](dashboard.png)

![Part 2 Canvas with no city selected](canvas_any.png)

![Part 2 Canvas with a city selected](canvas_city.png)
# Ingestion Plugin

Sources for the ingestion plugin can be found in the GitLab repository:

[_usi-si-teaching/msde/2022-2023/visual-analytics-atelier/elasticsearch-plugin/ingest-lookup-maggicl_](https://gitlab.com/usi-si-teaching/msde/2022-2023/visual-analytics-atelier/elasticsearch-plugin/ingest-lookup-maggicl).

The plugin can be built and installed on Elasticsearch with the script
`./install-on-ec.sh` included in the repository, after changing the variable
`ES_LOCATION` to the path of the local Elasticsearch installation.

The plugin works as illustrated in the `README.md` file in the repository, and
it has been tested with a unit test suite included in its sources.

The plugin lookup procedure works by splitting the indicated field into words
(non-empty sequences of non-space characters, according to the PCRE regular
expression specification) and matching each word against the given
substitution map, performing substitutions where needed.
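
The split-and-substitute procedure can be sketched in shell (the real plugin is written in Java; the map entries and input below are hypothetical, and bash 4+ is assumed for associative arrays):

```shell
# Hypothetical substitution map; words not in the map pass through unchanged.
declare -A map=( ["NYC"]="New York" ["LA"]="Los Angeles" )

lookup() {
  local word out=()
  for word in $1; do           # word-split on whitespace, mirroring \S+ matching
    out+=( "${map[$word]:-$word}" )
  done
  printf '%s\n' "${out[*]}"
}

lookup "I love NYC and LA"    # substitutes both mapped words
```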