report progress

Claudio Maggioni 2021-05-17 18:50:25 +02:00
parent 9bc8ff3b01
commit 0d139cea38
2 changed files with 55 additions and 62 deletions

Binary file not shown.


@@ -1,11 +1,11 @@
 \documentclass{usiinfbachelorproject}
 \title{Understanding and Comparing Unsuccessful Executions in Large Datacenters}
 \author{Claudio Maggioni}
 \usepackage{enumitem}
-\usepackage[parfill]{parskip}
-\setlength{\parskip}{7pt}
+\usepackage{parskip}
+\setlength{\parskip}{5pt}
 \setlength{\parindent}{0pt}
 %\usepackage[printfigures]{figcaps}
 \usepackage{xcolor}
 \usepackage{amsmath}
 \usepackage{subcaption}
@@ -93,42 +93,36 @@ are encoded and stored in the trace as rows of various tables. Among the
 information events provide, the field ``type'' provides information on
 the execution status of the job or task. This field can have the
 following values:
-\begin{itemize}
-\item
-  \textbf{QUEUE}: The job or task was marked not eligible for scheduling
-  by Borg's scheduler, and thus Borg will move the job/task into a long
-  wait queue;
-\item
-  \textbf{SUBMIT}: The job or task was submitted to Borg for execution;
-\item
-  \textbf{ENABLE}: The job or task became eligible for scheduling;
-\item
-  \textbf{SCHEDULE}: The job or task's execution started;
-\item
-  \textbf{EVICT}: The job or task was terminated in order to free
-  computational resources for a higher priority job;
-\item
-  \textbf{FAIL}: The job or task terminated its execution unsuccessfully
-  due to a failure;
-\item
-  \textbf{FINISH}: The job or task terminated successfully;
-\item
-  \textbf{KILL}: The job or task terminated its execution because of a
-  manual request to stop it;
-\item
-  \textbf{LOST}: It is assumed that a job or task has been terminated, but
-  due to missing data there is insufficient information to identify when
-  or how;
-\item
-  \textbf{UPDATE\_PENDING}: The metadata (scheduling class, resource
-  requirements, \ldots) of the job/task was updated while the job was
-  waiting to be scheduled;
-\item
-  \textbf{UPDATE\_RUNNING}: The metadata (scheduling class, resource
-  requirements, \ldots) of the job/task was updated while the job was in
-  execution;
-\end{itemize}
+\begin{center}
+\begin{tabular}{p{3cm}p{12cm}}
+\toprule
+\textbf{Type code} & \textbf{Description} \\
+\midrule
+\texttt{QUEUE} & The job or task was marked not eligible for scheduling
+by Borg's scheduler, and thus Borg will move the job/task into a long
+wait queue\\
+\texttt{SUBMIT} & The job or task was submitted to Borg for execution\\
+\texttt{ENABLE} & The job or task became eligible for scheduling\\
+\texttt{SCHEDULE} & The job or task's execution started\\
+\texttt{EVICT} & The job or task was terminated in order to free
+computational resources for a higher priority job\\
+\texttt{FAIL} & The job or task terminated its execution unsuccessfully
+due to a failure\\
+\texttt{FINISH} & The job or task terminated successfully\\
+\texttt{KILL} & The job or task terminated its execution because of a
+manual request to stop it\\
+\texttt{LOST} & It is assumed that a job or task has been terminated, but
+due to missing data there is insufficient information to identify when
+or how\\
+\texttt{UPDATE\_PENDING} & The metadata (scheduling class, resource
+requirements, \ldots) of the job/task was updated while the job was
+waiting to be scheduled\\
+\texttt{UPDATE\_RUNNING} & The metadata (scheduling class, resource
+requirements, \ldots) of the job/task was updated while the job was in
+execution\\
+\bottomrule
+\end{tabular}
+\end{center}
 
 Figure~\ref{fig:eventTypes} shows the expected transitions between event
 types.
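
To give a concrete idea of how these type codes can drive the analysis, the following is a minimal Python sketch (names and structure are illustrative assumptions, not code taken from this commit) that classifies an execution's outcome by the last terminal event it reaches:

\begin{verbatim}
# Hypothetical sketch: derive the final outcome of a job/task from the
# chronological sequence of its event type codes. The set of "terminal"
# codes follows the table above.
TERMINAL_TYPES = {"EVICT", "FAIL", "FINISH", "KILL", "LOST"}

def final_outcome(event_types):
    """Return the last terminal event type seen, or None if the trace
    never reaches a terminal state (e.g. it ends while still queued)."""
    outcome = None
    for t in event_types:
        if t in TERMINAL_TYPES:
            outcome = t
    return outcome

# A task that is evicted, rescheduled and eventually fails:
print(final_outcome(["SUBMIT", "SCHEDULE", "EVICT", "SCHEDULE", "FAIL"]))
# -> FAIL
\end{verbatim}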
@@ -177,22 +171,16 @@ file segments) where each carriage return separated line represents a
 single record for that table.
 
 There are 5 different table ``files'':
-\begin{itemize}
-\item
-  \texttt{machine\_configs}, which is a table containing each physical
-  machine's configuration and its evolution over time;
-\item
-  \texttt{instance\_events}, which is a table of task events;
-\item
-  \texttt{collection\_events}, which is a table of job events;
-\item
-  \texttt{machine\_attributes}, which is a table containing (obfuscated)
-  metadata about each physical machine and its evolution over time;
-\item
-  \texttt{instance\_usage}, which contains resource (CPU/RAM) measures
-  of jobs and tasks running on individual machines.
-\end{itemize}
+\begin{description}
+\item[\texttt{machine\_configs},] which is a table containing each physical
+  machine's configuration and its evolution over time;
+\item[\texttt{instance\_events},] which is a table of task events;
+\item[\texttt{collection\_events},] which is a table of job events;
+\item[\texttt{machine\_attributes},] which is a table containing (obfuscated)
+  metadata about each physical machine and its evolution over time;
+\item[\texttt{instance\_usage},] which contains resource (CPU/RAM) measures
+  of jobs and tasks running on individual machines.
+\end{description}
 
 The scope of this thesis focuses on the tables
 \texttt{machine\_configs}, \texttt{instance\_events} and
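
To make this file layout concrete, the following sketch shows how such Gzip-compressed, newline-delimited table segments could be loaded with Spark. The path, the assumption that records are JSON-encoded, and the field name \texttt{type} are placeholders for illustration, not the project's actual values:

\begin{verbatim}
import json
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("borg-trace-load").getOrCreate()
sc = spark.sparkContext

# textFile() transparently decompresses .gz segments and yields one
# element per carriage-return-separated line, i.e. one record.
# The glob below is a placeholder, not the real dataset location.
raw = sc.textFile("instance_events/*.json.gz")

# Parse each line and drop records whose "type" field is unknown,
# mirroring the filtering step described in the analysis methodology.
events = raw.map(json.loads).filter(lambda r: r.get("type") is not None)

print(events.take(1))
\end{verbatim}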
@@ -224,7 +212,11 @@ analysis}\label{project-requirements-and-analysis}}
 
 \hypertarget{analysis-methodology}{%
 \section{Analysis methodology}\label{analysis-methodology}}
 
-\textbf{TBD}
+Due to the inherent complexity of analyzing traces of this size,
+state-of-the-art data engineering techniques were adopted to perform the
+required computations. We used the Apache Spark framework to perform
+efficient, parallel Map-Reduce computations. In this section, we discuss
+the technical details behind our approach.
 
 \hypertarget{introduction-on-apache-spark}{%
 \subsection{Introduction on Apache
@@ -302,15 +294,16 @@ the presence of incomplete data (i.e.~records which contain fields whose values
 is unknown). This filtering is performed using the \texttt{.filter()} operation
 of Spark's RDD API.
 
-The core of each query is often a \texttt{groupby()} followed by a \texttt{map()}
-operation on the aggregated data. The \texttt{groupby()} groups the set of all records
-into several subsets of records each having something in common. Then, each of
-these small clusters is reduced with a \texttt{map()} operation to a single
-record. The motivation behind this computation is often to analyze a time
-series of several different traces of programs. This is implemented by
-\texttt{groupby()}-ing records by program id, and then \texttt{map()}-ing each
-trace set by sorting the traces by time and computing the desired property in
-the form of a record.
+The core of each query is often a \texttt{groupby()} followed by a
+\texttt{map()} operation on the aggregated data. The \texttt{groupby()} groups
+the set of all records into several subsets of records that share a common
+key. Each of these subsets is then reduced with a \texttt{map()} operation to
+a single record. The motivation behind this way of computing data is that the
+analysis in this thesis often needs to study the behaviour of tasks or jobs
+over time by looking at their events. These queries are therefore implemented
+by \texttt{groupby()}-ing records by task or job, and then \texttt{map()}-ing
+each set of event records: the events are sorted by time and the desired
+computation is performed on the resulting chronological event log.
 
 Intermediate results are sometimes saved in Spark's Parquet format, so that
 they can be computed once and then reused by later queries.
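
The pattern described above can be sketched as follows in PySpark. The input path and the field names \texttt{collection\_id}, \texttt{time} and \texttt{type} are assumptions made for illustration, and the per-group computation (extracting the last event type) is deliberately simplified; the project's real queries are more involved:

\begin{verbatim}
import json
from pyspark.sql import Row, SparkSession

spark = SparkSession.builder.appName("borg-event-analysis").getOrCreate()
sc = spark.sparkContext

# Placeholder input path; records are assumed to be JSON, one per line.
events = sc.textFile("instance_events/*.json.gz").map(json.loads)

# 1) filter(): drop incomplete records whose relevant fields are unknown.
complete = events.filter(lambda r: r.get("time") is not None
                         and r.get("type") is not None)

# 2) groupBy(): one group of event records per job/task identifier.
by_job = complete.groupBy(lambda r: r["collection_id"])

# 3) map(): sort each group's events chronologically and reduce the
#    event log to a single record (here: the last event type seen).
def last_event(pair):
    job_id, recs = pair
    log = sorted(recs, key=lambda r: r["time"])
    return Row(collection_id=job_id, last_type=log[-1]["type"])

results = by_job.map(last_event)

# Intermediate results can be persisted as Parquet and reloaded later.
spark.createDataFrame(results).write.mode("overwrite").parquet("out/last_events")
\end{verbatim}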