report: added introduction

Claudio Maggioni 2021-05-19 14:06:38 +02:00
parent 18ce409cde
commit 87b869b92d
2 changed files with 30 additions and 3 deletions

Binary file not shown.

@ -43,9 +43,36 @@ system attributes such asmachine locality and concurrency level.}
\tableofcontents
\newpage
\section{Introduction}
In today's world there is an ever-growing demand for efficient, large-scale
computations. The rising trend of ``big data'' has put the need for efficient
management of large-scale parallel computing at an all-time high. This
also increases the demand for research in the field of distributed systems, in
particular on how to schedule computations effectively, avoid wasting resources,
and avoid failures.

In 2011 Google released a month-long data trace of its own \textit{Borg} cluster
management system, containing extensive data on scheduling, priority
management, and failures of a real production workload. This data was the
foundation of the 2015 paper by Ros\`a et al., \textit{Understanding the Dark
Side of Big Data Clusters: An Analysis beyond Failures}, which, among its many
conclusions, pointed to the high number of failures found in the traces as
evidence of the need for better cluster management.

In 2019 Google released an updated version of the \textit{Borg} cluster traces,
which not only contain data from a far larger workload, thanks to the growth in
computing power described by Moore's law, but also cover 8 different
\textit{Borg} cells from datacenters all over the world. The new traces are
therefore about 100 times larger than the old ones, occupying approximately
8~TiB of storage (when compressed and stored in JSONL format). Analyzing them
requires considerable computational power and the implementation of dedicated
data engineering techniques.

This project aims to repeat the analysis performed in 2015 in order to highlight
the similarities and differences in workload that this decade brought, and to
expand the old analysis to better understand the causes of failures and how to
prevent them. Additionally, this report provides an overview of the data
engineering techniques used to perform the queries and analyses on the 2019
traces.
\hypertarget{state-of-the-art}{%
\section{State of the Art}\label{state-of-the-art}}