report progress

Claudio Maggioni 2021-05-17 18:50:25 +02:00
parent 9bc8ff3b01
commit 0d139cea38
2 changed files with 55 additions and 62 deletions

Binary file not shown.


@@ -1,11 +1,11 @@
 \documentclass{usiinfbachelorproject}
 \title{Understanding and Comparing Unsuccessful Executions in Large Datacenters}
 \author{Claudio Maggioni}
 \usepackage{enumitem}
-\usepackage[parfill]{parskip}
-\setlength{\parskip}{7pt}
+\usepackage{parskip}
+\setlength{\parskip}{5pt}
 \setlength{\parindent}{0pt}
 %\usepackage[printfigures]{figcaps}
 \usepackage{xcolor}
 \usepackage{amsmath}
 \usepackage{subcaption}
@@ -93,42 +93,36 @@ are encoded and stored in the trace as rows of various tables. Among the
 information events provide, the field ``type'' provides information on
 the execution status of the job or task. This field can have the
 following values:
-\begin{itemize}
-\item
-  \textbf{QUEUE}: The job or task was marked not eligible for scheduling
-  by Borg's scheduler, and thus Borg will move the job/task into a long
-  wait queue;
-\item
-  \textbf{SUBMIT}: The job or task was submitted to Borg for execution;
-\item
-  \textbf{ENABLE}: The job or task became eligible for scheduling;
-\item
-  \textbf{SCHEDULE}: The job or task's execution started;
-\item
-  \textbf{EVICT}: The job or task was terminated in order to free
-  computational resources for a higher priority job;
-\item
-  \textbf{FAIL}: The job or task terminated its execution unsuccessfully
-  due to a failure;
-\item
-  \textbf{FINISH}: The job or task terminated successfully;
-\item
-  \textbf{KILL}: The job or task terminated its execution because of a
-  manual request to stop it;
-\item
-  \textbf{LOST}: It is assumed that a job or task has been terminated, but
-  due to missing data there is insufficient information to identify when
-  or how;
-\item
-  \textbf{UPDATE\_PENDING}: The metadata (scheduling class, resource
-  requirements, \ldots) of the job/task was updated while the job was
-  waiting to be scheduled;
-\item
-  \textbf{UPDATE\_RUNNING}: The metadata (scheduling class, resource
-  requirements, \ldots) of the job/task was updated while the job was in
-  execution;
-\end{itemize}
+\begin{center}
+\begin{tabular}{p{3cm}p{12cm}}
+\toprule
+\textbf{Type code} & \textbf{Description} \\
+\midrule
+\texttt{QUEUE} & The job or task was marked not eligible for scheduling
+by Borg's scheduler, and thus Borg will move the job/task into a long
+wait queue\\
+\texttt{SUBMIT} & The job or task was submitted to Borg for execution\\
+\texttt{ENABLE} & The job or task became eligible for scheduling\\
+\texttt{SCHEDULE} & The job or task's execution started\\
+\texttt{EVICT} & The job or task was terminated in order to free
+computational resources for a higher priority job\\
+\texttt{FAIL} & The job or task terminated its execution unsuccessfully
+due to a failure\\
+\texttt{FINISH} & The job or task terminated successfully\\
+\texttt{KILL} & The job or task terminated its execution because of a
+manual request to stop it\\
+\texttt{LOST} & It is assumed that a job or task has been terminated, but
+due to missing data there is insufficient information to identify when
+or how\\
+\texttt{UPDATE\_PENDING} & The metadata (scheduling class, resource
+requirements, \ldots) of the job/task was updated while the job was
+waiting to be scheduled\\
+\texttt{UPDATE\_RUNNING} & The metadata (scheduling class, resource
+requirements, \ldots) of the job/task was updated while the job was in
+execution\\
+\bottomrule
+\end{tabular}
+\end{center}
 
 Figure~\ref{fig:eventTypes} shows the expected transitions between event
 types.
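
To give a concrete idea of how these type codes can drive the analysis, the following is a minimal Python sketch (names and structure are illustrative assumptions, not code taken from this commit) that classifies an execution's outcome by the last terminal event it reaches:

\begin{verbatim}
# Hypothetical sketch: derive the final outcome of a job/task from the
# chronological sequence of its event type codes. The set of "terminal"
# codes follows the table above.
TERMINAL_TYPES = {"EVICT", "FAIL", "FINISH", "KILL", "LOST"}

def final_outcome(event_types):
    """Return the last terminal event type seen, or None if the trace
    never reaches a terminal state (e.g. it ends while still queued)."""
    outcome = None
    for t in event_types:
        if t in TERMINAL_TYPES:
            outcome = t
    return outcome

# A task that is evicted, rescheduled and eventually fails:
print(final_outcome(["SUBMIT", "SCHEDULE", "EVICT", "SCHEDULE", "FAIL"]))
# -> FAIL
\end{verbatim}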
@@ -177,22 +171,16 @@ file segments) where each carriage return separated line represents a
 single record for that table.
 
 There are 5 different table ``files'':
-\begin{itemize}
-\item
-  \texttt{machine\_configs}, which is a table containing each physical
-  machine's configuration and its evolution over time;
-\item
-  \texttt{instance\_events}, which is a table of task events;
-\item
-  \texttt{collection\_events}, which is a table of job events;
-\item
-  \texttt{machine\_attributes}, which is a table containing (obfuscated)
-  metadata about each physical machine and its evolution over time;
-\item
-  \texttt{instance\_usage}, which contains resource (CPU/RAM) measures
-  of jobs and tasks running on individual machines.
-\end{itemize}
+\begin{description}
+\item[\texttt{machine\_configs},] which is a table containing each physical
+  machine's configuration and its evolution over time;
+\item[\texttt{instance\_events},] which is a table of task events;
+\item[\texttt{collection\_events},] which is a table of job events;
+\item[\texttt{machine\_attributes},] which is a table containing (obfuscated)
+  metadata about each physical machine and its evolution over time;
+\item[\texttt{instance\_usage},] which contains resource (CPU/RAM) measures
+  of jobs and tasks running on individual machines.
+\end{description}
 
 The scope of this thesis focuses on the tables
 \texttt{machine\_configs}, \texttt{instance\_events} and
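
To make this file layout concrete, the following sketch shows how such Gzip-compressed, newline-delimited table segments could be loaded with Spark. The path, the assumption that records are JSON-encoded, and the field name \texttt{type} are placeholders for illustration, not the project's actual values:

\begin{verbatim}
import json
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("borg-trace-load").getOrCreate()
sc = spark.sparkContext

# textFile() transparently decompresses .gz segments and yields one
# element per carriage-return-separated line, i.e. one record.
# The glob below is a placeholder, not the real dataset location.
raw = sc.textFile("instance_events/*.json.gz")

# Parse each line and drop records whose "type" field is unknown,
# mirroring the filtering step described in the analysis methodology.
events = raw.map(json.loads).filter(lambda r: r.get("type") is not None)

print(events.take(1))
\end{verbatim}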
@@ -224,7 +212,11 @@ analysis}\label{project-requirements-and-analysis}}
 
 \hypertarget{analysis-methodology}{%
 \section{Analysis methodology}\label{analysis-methodology}}
 
-\textbf{TBD}
+Due to the inherent complexity of analyzing traces of this size,
+state-of-the-art data engineering techniques were adopted to perform the
+required computations. We used the Apache Spark framework to perform
+efficient, parallel Map-Reduce computations. In this section, we discuss
+the technical details behind our approach.
 
 \hypertarget{introduction-on-apache-spark}{%
 \subsection{Introduction on Apache
@@ -302,15 +294,16 @@ the presence of incomplete data (i.e.~records which contain fields whose values
 is unknown). This filtering is performed using the \texttt{.filter()} operation
 of Spark's RDD API.
 
-The core of each query is often a \texttt{groupby()} followed by a \texttt{map()}
-operation on the aggregated data. The \texttt{groupby()} groups the set of all records
-into several subsets of records each having something in common. Then, each of
-these small clusters is reduced with a \texttt{map()} operation to a single
-record. The motivation behind this computation is often to analyze a time
-series of several different traces of programs. This is implemented by
-\texttt{groupby()}-ing records by program id, and then \texttt{map()}-ing each
-trace set by sorting the traces by time and computing the desired property in
-the form of a record.
+The core of each query is often a \texttt{groupby()} followed by a
+\texttt{map()} operation on the aggregated data. The \texttt{groupby()} groups
+the set of all records into several subsets of records that share a common
+key. Each of these subsets is then reduced with a \texttt{map()} operation to
+a single record. The motivation behind this way of computing data is that the
+analysis in this thesis often needs to study the behaviour of tasks or jobs
+over time by looking at their events. These queries are therefore implemented
+by \texttt{groupby()}-ing records by task or job, and then \texttt{map()}-ing
+each set of event records: the events are sorted by time and the desired
+computation is performed on the resulting chronological event log.
 
 Intermediate results are sometimes saved in Spark's Parquet format, so that
 they can be computed once and then reused by later queries.
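
The pattern described above can be sketched as follows in PySpark. The input path and the field names \texttt{collection\_id}, \texttt{time} and \texttt{type} are assumptions made for illustration, and the per-group computation (extracting the last event type) is deliberately simplified; the project's real queries are more involved:

\begin{verbatim}
import json
from pyspark.sql import Row, SparkSession

spark = SparkSession.builder.appName("borg-event-analysis").getOrCreate()
sc = spark.sparkContext

# Placeholder input path; records are assumed to be JSON, one per line.
events = sc.textFile("instance_events/*.json.gz").map(json.loads)

# 1) filter(): drop incomplete records whose relevant fields are unknown.
complete = events.filter(lambda r: r.get("time") is not None
                         and r.get("type") is not None)

# 2) groupBy(): one group of event records per job/task identifier.
by_job = complete.groupBy(lambda r: r["collection_id"])

# 3) map(): sort each group's events chronologically and reduce the
#    event log to a single record (here: the last event type seen).
def last_event(pair):
    job_id, recs = pair
    log = sorted(recs, key=lambda r: r["time"])
    return Row(collection_id=job_id, last_type=log[-1]["type"])

results = by_job.map(last_event)

# Intermediate results can be persisted as Parquet and reloaded later.
spark.createDataFrame(results).write.mode("overwrite").parquet("out/last_events")
\end{verbatim}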