report

2021-07-12 21:28:33 +02:00 · db8178e98c
parent 6278aa25bc
commit db8178e98c
3 changed files with 23 additions and 21 deletions
--- a/report/Claudio_Maggioni_report.pdf
+++ b/report/Claudio_Maggioni_report.pdf
--- a/report/Claudio_Maggioni_report.tex
+++ b/report/Claudio_Maggioni_report.tex
@ -37,7 +37,7 @@
 \advisor[Universit\`a della Svizzera Italiana,
 Switzerland]{Prof.}{Walter}{Binder}
 \assistant[Universit\`a della Svizzera Italiana,
-Switzerland]{Dr.}{Andrea}{Ros\'a}
+Switzerland]{Dr.}{Andrea}{Ros\`a}
 \end{committee}
 \abstract{The thesis aims at comparing two different traces coming from large
@ -65,7 +65,7 @@ avoid wasting resources and avoid failures.
 In 2011 Google released a month long data trace of their own cluster management
 system~\cite{google-marso-11} \textit{Borg}, containing a lot of data regarding
 scheduling, priority management, and failures of a real production workload.
-This data was the foundation of the 2015 Ros\'a et al.\ paper
+This data was the foundation of the 2015 Ros\`a et al.\ paper
 \textit{Understanding the Dark Side of Big Data Clusters: An Analysis beyond
 Failures}~\cite{dsn-paper}, which in its many conclusions highlighted the need
 for better cluster management highlighting the high amount of failures found in
@ -116,7 +116,7 @@ exploiting the power of parallel computing, following most of the time a
 MapReduce-like structure.
 %\subsection{Contribution}
-This project aims to repeat the analysis performed in 2015 DSN Ros\'a et al.\
+This project aims to repeat the analysis performed in 2015 DSN Ros\`a et al.\
 paper~\cite{dsn-paper} to highlight similarities and differences in Google Borg
 workload and the behaviour and patterns of executions within it. Thanks to this
 analysis, we aim to understand even better the causes of failures and how to
@ -207,7 +207,7 @@ bugs~\cite{9}~\cite{10}~\cite{11}~\cite{12}.
 However, the community has not yet performed any research on the new Borg
 traces analysing unsuccessful executions, their possible causes, and the
 relationships between tasks and jobs. Therefore, the only current research in
-this field is this very report, providing and update to the the 2015 Ros\'a et
+this field is this very report, providing and update to the the 2015 Ros\`a et
 al.\ paper~\cite{dsn-paper} focusing on the new trace.
 \section{Background}\label{sec3}
@ -517,7 +517,7 @@ task termination counts. After the task events are sorted, the script iterates
 over the events in chronological order, storing each execution attempt time and
 registering all execution termination types by checking the event type field.
 The task termination is then equal to the last execution termination type,
-following the definition originally given in the 2015 Ros\'a et al. DSN paper.
+following the definition originally given in the 2015 Ros\`a et al. DSN paper.
 If the task termination is determined to be unsuccessful, the tally counter of
 task terminations for the matching task property is increased. Otherwise, all
@ -533,7 +533,7 @@ in the clear and coincise tables found in Figure~\ref{fig:taskslowdown}.
 \section{Analysis: Performance Input of Unsuccessful Executions}\label{sec5}
 Our first investigation focuses on replicating the analysis done by the paper of
-Ros\'a et al.\ paper~\cite{dsn-paper} regarding usage of machine time
+Ros\`a et al.\ paper~\cite{dsn-paper} regarding usage of machine time
 and resources.
 In this section we perform several analyses focusing on how machine time and
@ -639,7 +639,7 @@ Refer to Figure~\ref{fig:taskslowdown} for a comparison between the 2011 and
 means are computed on a cluster-by-cluster basis for 2019 data in
 Figure~\ref{fig:taskslowdown-csts}.
-In 2015 Ros\'a et al.~\cite{dsn-paper} measured mean task slowdown per each task
+In 2015 Ros\`a et al.~\cite{dsn-paper} measured mean task slowdown per each task
 priority value, which at the time were numeric values between 0 and 11. However,
 in 2019 traces, task priorities are given as a numeric value between 0 and 500.
 Therefore, to allow an easier comparison, mean task slowdown values are computed
@ -740,7 +740,7 @@ traces.
 \section{Analysis: Patterns of Task and Job Events}\label{sec6}
 This section aims to use some of the tecniques used in section IV of
-the Ros\'a et al.\ paper~\cite{dsn-paper} to find patterns and interpendencies
+the Ros\`a et al.\ paper~\cite{dsn-paper} to find patterns and interpendencies
 between task and job events by gathering event statistics at those events. In
 particular, Section~\ref{tabIII-section} explores how the success of a
 task is inter-correlated with its own event patterns, which
@ -873,15 +873,16 @@ Additionally, it is noteworthy that cluster A has no \texttt{EVICT}ed jobs.
 \section{Analysis: Potential Causes of Unsuccessful Executions}\label{sec7}
-This section re-applies the tecniques used in Section V of the Ros\'a et al.\
+This section re-applies the tecniques used in Section V of the Ros\`a et al.\
-paper~\cite{dsn-paper} to find patterns and interpendencies
+paper~\cite{dsn-paper} to find causes for unsuccessful events related to
-between task and job events by gathering event statistics at those events. In
+task-level parameters (analyzed in Section~\ref{fig7-section}),
-particular, Section~\ref{tabIII-section} explores how tasks of the success of a
+usage of machine resources by tasks (analyzed in Section~\ref{fig8-section}),
-task is inter-correlated with its own event patterns, which
+and job-level parameters (analyzed in Section~\ref{fig9-section}). In all the
-Section~\ref{figV-section} explores even further by computing task success
+analyses we use the ``event rate'' metric, which represents the relative
-probabilities based on the number of task termination events of a specific type.
+percentage of termination type events over a certain task/job parameter
-Finally, Section~\ref{tabIV-section} aims to find similar correlations, but at
+configuration. We compute this metric for all the possible terminations (i.e.\
-the job level.
+\texttt{EVICT}, \texttt{FAIL}, \texttt{FINISH} and \texttt{KILL}) in order to
 find correlations with the several trace parameters.
 \subsection{Task Event Rates vs.\ Task Priority, Event Execution Time, and
 Machine Concurrency.}\label{fig7-section} \input{figures/figure_7}
@ -911,7 +912,7 @@ From this analysis we can make the following observations:
    Figure~\ref{fig:figureVII-b-csts}) for the 2019 traces
    are quite different than 2011 ones, here it
  seems there is a good correlation between short task execution times
-  and finish event rates, instead of the ``U shape'' curve found in the Ros\'a
+  and finish event rates, instead of the ``U shape'' curve found in the Ros\`a
    et al.\ 2015 DSN paper~\cite{dsn-paper};
 \item
  The behaviour among different clusters for the event execution time
--- a/report/usiinfbachelorproject.cls
+++ b/report/usiinfbachelorproject.cls
@ -229,7 +229,7 @@
    {\newpage }
        {\textwidth 5cm}
-%%% put ToC, LoF, LoT and Index entries in the ToC use of \phantomsection is required for dealing with the hyperref package and depends on the nohyper option
+%%% put ToC, LoF, LoT and Index entries in the ToC use of \phantomsection is required for dealing with the ryperref package and depends on the nohyper option
 %%% other useful packages
@ -241,7 +241,8 @@
 \RequirePackage{amsmath}
 %%% switch on hyperref support
 \ifthenelse{\boolean{@hypermode}}{%
-\RequirePackage[unicode,plainpages=false,pdfpagelabels,breaklinks]{hyperref}
+\RequirePackage[svgnames]{xcolor}
+\RequirePackage[colorlinks=true,linkcolor=Maroon,allcolors=Maroon,unicode,plainpages=false,pdfpagelabels,breaklinks]{hyperref}
 \RequirePackage[all]{hypcap}
 }{}
@ -256,7 +257,7 @@
 	\textsf{Advisor's approval}{}
 	(\DTLforeach*[\DTLiseq{\type}{r}]{committee}%
 	{\actitle=title,\first=first,\last=last,\type=type}{%
-		\DTLiffirstrow{}{, }\textsf{\print@blank{\actitle}\first \ \last}, \textsf{Dr. Andrea Ros\'a}):%
+		\DTLiffirstrow{}{, }\textsf{\print@blank{\actitle}\first \ \last}, \textsf{Dr. Andrea Ros\`a}):%
 	\hspace{4cm}
 	& \textsf{Date: }
 	}