report has user evaluation without statistics

Claudio Maggioni 2020-12-08 18:44:18 +01:00
parent ec8199ec8d
commit 3df2c17797
2 changed files with 61 additions and 3 deletions



@@ -19,6 +19,7 @@
\begin{document}
\maketitle
\tableofcontents
\listoffigures
\newpage
\section{Introduction}
@@ -307,6 +308,7 @@ search are a 100 results limit and the use of \texttt{t\_*} fields to match
documents (lines 25 and 23 -- remember the definition of the \texttt{text} field).
\section{User interface}
\subsection{UI flow}
Figure \ref{fig:ui} illustrates the IR system's UI and its main features.
\begin{figure}[H]
@@ -335,13 +337,11 @@ Figure \ref{fig:ui} illustrates the IR system's UI showing its features.
\label{fig:ui}
\end{figure}
\subsection{Technical details}
The UI has been implemented using HTML5, vanilla CSS and vanilla JS, with the
exception of the \textit{FoamTree} library from the Carrot2 project, which is
used to display the clustering bar to the left of the search results.
Event handlers offered by \textit{FoamTree} allowed for the implementation of
the results filtering feature, which is triggered when a cluster is selected as
a filter.
This is a single-page application, i.e.\ all updates to the UI happen without
reloading the page. This was achieved by using AJAX requests to interact
with Solr.
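For illustration, such an AJAX request against Solr's \texttt{select} endpoint
could look roughly like the following sketch; the host, the core name
(\texttt{images}) and the \texttt{search} helper shown here are assumptions for
illustration and do not necessarily match the code in the \texttt{ui}
directory.
\begin{verbatim}
// Minimal sketch of an AJAX search request to Solr (assumed host and
// core name; the actual UI code may build the query differently).
const SOLR_SELECT = "http://localhost:8983/solr/images/select";

async function search(query) {
  const params = new URLSearchParams({
    q: query,     // user query, matched against the text / t_* fields
    rows: "100",  // the 100-results limit mentioned earlier
    wt: "json"    // request a JSON response from Solr
  });
  // The "CORS Everywhere" extension mentioned below is needed for this
  // cross-origin request to succeed.
  const response = await fetch(`${SOLR_SELECT}?${params.toString()}`);
  const data = await response.json();
  return data.response.docs; // array of matching documents
}
\end{verbatim}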
@@ -350,5 +350,63 @@ All UI files can be found under the \texttt{ui} directory in the repository root
directory. In order to run the UI, a ``CORS Everywhere'' extension must be
installed in the browser used to view it. See the installation instructions for
details.
\subsection{Clustering component}
Event handlers offered by \textit{FoamTree} allowed for the implementation of
the results filtering feature, which is triggered when a cluster is selected as
a filter.
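As a sketch of how this event wiring could look (the container id, the cluster
data shape and the \texttt{filterResults} helper below are hypothetical and
only illustrate the pattern, not the exact code in the \texttt{ui} directory):
\begin{verbatim}
// Minimal sketch: a FoamTree instance whose click handler narrows the
// visible search results down to the documents of the clicked cluster.
// "visualization", the group fields and filterResults() are assumptions.
const foamtree = new CarrotSearchFoamTree({
  id: "visualization",  // id of the container element for the cluster bar
  dataObject: {
    groups: clusters.map(c => ({ label: c.label, ids: c.docIds }))
  },
  onGroupClick: function (event) {
    if (event.group) {
      // Keep only the results whose ids belong to the selected cluster.
      filterResults(event.group.ids);
    }
  }
});
\end{verbatim}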
\section{User evaluation}
The user evaluation was conducted remotely over Microsoft Teams: three of my
colleagues were asked to install and run the project on their local systems.
Each session took approximately 20 minutes, including installation.
Data for the evaluation was collected using a questionnaire, implemented with
Qualtrics and available at \href{TBD}{this link}, consisting of a ``before
test'' and an ``after test'' section.
In the ``before test'' section, users agreed to the test procedures and stated
their level of familiarity with text retrieval (TR) systems for image search,
indicating in particular whether they had ever searched for user-created images
or stock photos. All participants stated they were at least very familiar with
TR image search systems and had searched for user-created images before. Only
one participant had never searched for stock photos.
\begin{figure}[H]
\begin{tabular}{p{4cm}|p{3.5cm}|p{3.5cm}|p{3.5cm}}
Question & Subject 1 & Subject 2 & Subject 3 \\
\hline
\textsc{metadata:} Start time & 2020-12-06 14:52 & 2020-12-07 13:35 & 2020-12-07 13:54\\
\textsc{metadata:} End time & 2020-12-06 14:58 & 2020-12-07 13:48 & 2020-12-07 14:05\\
Familiarity with image search TR systems & 4 & 4 & 5 \\
Has searched for user images & Yes & Yes & Yes \\
Has searched for stock photos & Yes & No & Yes \\
The UI was easy to use & 6 & 6 & 7 \\
Accurate results for ``find a person sneezing'' & 6 & 7 & 7 \\
Accurate results for ``find Varenna'' & 5 & 7 & 7 \\
Personal task & ``Churchill Pfeil'' & Eiffel tower from query ``France'' &
``Italian traditional masks'' \\
Accurate results for personal task & 7 & 7 & 7 \\
Clustering was helpful & 6 & 7 & 7 \\
Irrelevant results were numerous and distracting & 4 & 5 & 2 \\
Suggestions & Missing Search button & Did not understand clustering was a
filter & \textit{Great survey background image} \\
\end{tabular}
\caption{Data collected from the questionnaire.}
\label{fig:qs}
\end{figure}
Figure \ref{fig:qs} illustrates the data gathered from the questionnaire.
Numeric values represent a 5-point Likert scale for the ``Familiarity with
image TR systems'' question (from 1 to 5: ``Not familiar at all'', ``Slightly
familiar'', ``Moderately familiar'', ``Very familiar'', ``Extremely familiar'')
and a 7-point Likert scale for all other questions (from 1 to 7: ``Strongly
disagree'', ``Disagree'', ``Slightly disagree'', ``Neither agree nor disagree'',
``Somewhat agree'', ``Agree'', ``Strongly agree''). The start and end times are
expressed in CET, the participants' local time (DST is not in effect in
December). All participants started the questionnaire before they actually
began using the IR system, so these timestamps can also be used to estimate the
length of each session.
\end{document}