
\documentclass[12pt]{article}

\usepackage[utf8]{inputenc}
\usepackage[margin=2cm]{geometry}

\title{Homework 1 -- Programming Fundamentals 3}
\author{Claudio Maggioni}

\begin{document}
\maketitle
\tableofcontents
\section{Exercise 1}
\subsection{Question 1}
\texttt{MultipleUpdatesPerThread} is neither correct nor efficient. It is incorrect because access to \emph{result} is not
synchronized: the statement \texttt{result += partialSum;} is not atomic, but consists of a read of \emph{result}, an addition and
a write. One thread can therefore read \emph{result} before another thread has finished writing its own update back, so that one
of the two updates is lost and the final sum is wrong.

Synchronizing the access to \emph{result} fixes correctness but effectively makes the application sequential, since only one
thread at a time can change \emph{result}. Once synchronized, \texttt{MultipleUpdatesPerThread} therefore runs even slower than
\texttt{SequentialSum}, since it also has to pay for all the synchronization overhead.
\subsection{Question 2}
While more efficient, \texttt{SingleUpdatePerThread} is still incorrect because access to \emph{result} is still not synchronized.
As in \textit{Question 1}, the non-atomic statement \texttt{result += partialSum;} lets one thread read \emph{result} before
another has finished writing to it, so updates can be lost and the final result can be wrong.
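
A minimal sketch of the fix is given below. It is not the code of the assignment: the class name \texttt{SingleUpdateSketch}, the \texttt{LOCK} object and the way the array is split into chunks are assumptions made for illustration. Each thread accumulates into a local \texttt{partialSum} and enters the critical section only once.

\begin{verbatim}
public class SingleUpdateSketch {
    // Hypothetical names; the shared static field mirrors the exercise.
    private static final Object LOCK = new Object();
    private static long result;

    public static long sum(int[] data, int numThreads)
            throws InterruptedException {
        result = 0;
        Thread[] workers = new Thread[numThreads];
        // Ceiling division, so that every element belongs to a chunk.
        int chunk = (data.length + numThreads - 1) / numThreads;
        for (int t = 0; t < numThreads; t++) {
            final int from = Math.min(data.length, t * chunk);
            final int to = Math.min(data.length, from + chunk);
            workers[t] = new Thread(() -> {
                long partialSum = 0; // private to this thread
                for (int i = from; i < to; i++) {
                    partialSum += data[i];
                }
                synchronized (LOCK) { // one locked update per thread
                    result += partialSum;
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join(); // join() makes the workers' writes visible
        }
        return result;
    }
}
\end{verbatim}

Since \emph{result} is still a static field, this sketch is correct only as long as a single caller at a time invokes \texttt{sum}.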
\subsection{Question 6}
\texttt{CollectingResults} is the only thread-safe implementation because it is the only one that does not use a static field to
compute the final result. If multiple threads use any of the other classes concurrently, \emph{result} is shared between the
calls and the results become inconsistent. To solve this problem, either the entire \texttt{sum(...)} method must be synchronized
on the class object (defeating the point of concurrent access to the summing class), or the scope of \emph{result} must be
bound to the calling thread (e.g. by making \emph{result} a private instance field and making the inner classes non-static, or by
making it a local variable, as \texttt{CollectingResults} does).
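
The second option can be sketched as follows. This is an illustration under our own assumptions, not the actual source of \texttt{CollectingResults}: the class name \texttt{LocalResultsSketch} and the chunking scheme are hypothetical. Every piece of state is local to the call, so concurrent callers never share anything.

\begin{verbatim}
public class LocalResultsSketch {
    public static long sum(int[] data, int numThreads)
            throws InterruptedException {
        // Local to this call: concurrent callers never share it.
        long[] partialResults = new long[numThreads];
        Thread[] workers = new Thread[numThreads];
        int chunk = (data.length + numThreads - 1) / numThreads;
        for (int t = 0; t < numThreads; t++) {
            final int slot = t;
            final int from = Math.min(data.length, t * chunk);
            final int to = Math.min(data.length, from + chunk);
            workers[t] = new Thread(() -> {
                long partialSum = 0;
                for (int i = from; i < to; i++) {
                    partialSum += data[i];
                }
                partialResults[slot] = partialSum; // own slot only
            });
            workers[t].start();
        }
        long result = 0; // local variable, not a static field
        for (int t = 0; t < numThreads; t++) {
            workers[t].join(); // wait before reading the slot
            result += partialResults[t];
        }
        return result;
    }
}
\end{verbatim}

No lock is needed here, because each worker writes only to its own slot and the caller reads a slot only after joining the corresponding thread.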
\subsection{Question 7}
The slowest implementations are \texttt{MultipleUpdatesPerThreadSynch} and \linebreak[4] \texttt{MultipleUpdatesPerThreadAtomic}. As discussed in \textit{Question 1}, their execution is essentially sequential because every update required for the sum is synchronized, which makes them even slower than a sequential algorithm due to the synchronization overhead.

\texttt{SingleUpdatesPerThreadSynch}, \texttt{CollectingResults} and \texttt{SingleUpdatesPerThreadAtomic} perform better, with the \textit{Atomic} version being marginally faster due to ISA-level support for atomic updates of \texttt{result}. They are faster than the previous two because they actually compute the sum in parallel. This advantage holds only for a reasonable value of \texttt{NUM\_THREADS}: not too small, but also not approaching the number of elements in the array.
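
For reference, a single atomic update per thread can be sketched with \texttt{java.util.concurrent.atomic.AtomicLong}. The class name \texttt{AtomicUpdateSketch} and the chunking scheme are assumptions made for illustration and are not taken from the assignment; unlike the exercise, the accumulator is kept local to the call.

\begin{verbatim}
import java.util.concurrent.atomic.AtomicLong;

public class AtomicUpdateSketch {
    public static long sum(int[] data, int numThreads)
            throws InterruptedException {
        AtomicLong result = new AtomicLong(); // starts at 0
        Thread[] workers = new Thread[numThreads];
        int chunk = (data.length + numThreads - 1) / numThreads;
        for (int t = 0; t < numThreads; t++) {
            final int from = Math.min(data.length, t * chunk);
            final int to = Math.min(data.length, from + chunk);
            workers[t] = new Thread(() -> {
                long partialSum = 0;
                for (int i = from; i < to; i++) {
                    partialSum += data[i];
                }
                // One atomic read-modify-write per thread, no lock.
                result.addAndGet(partialSum);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return result.get();
    }
}
\end{verbatim}

The call to \texttt{addAndGet} is typically compiled to an atomic hardware instruction, which avoids acquiring and releasing a monitor.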
\subsection{Question 8}
The sequential sum implementation always performs better than the \textit{MultipleUpdates*} implementations.

In addition, the sequential implementation outperforms the other three implementations when
\texttt{NUM\_THREADS} is close to or greater than the array length: each thread in the parallel algorithms then sums only a few elements, which makes the computation less parallel, since the final sum of the \texttt{partialResult}s is sequential.
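
As a rough illustration, consider a simplified cost model (an assumption of ours, not a measurement): summing one element costs $c$, and each thread adds a fixed overhead $s$ for its creation, its join and the sequential addition of its partial result. With $n$ elements and $p$ threads,
\[
  T_{\mathrm{seq}}(n) \approx c\,n, \qquad T_{\mathrm{par}}(n, p) \approx \frac{c\,n}{p} + s\,p .
\]
For $p \approx n$ this gives $T_{\mathrm{par}} \approx c + s\,n$, which exceeds $c\,n$ as soon as $s \ge c$, i.e. as soon as handling one thread costs at least as much as a single addition.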
\end{document}