
---
documentclass: usiinfbachelorproject
title: Understanding and Comparing Unsuccessful Executions in Large Datacenters
author: Claudio Maggioni
pandoc-options:
  - --filter=pandoc-include
  - --latex-engine-opt=--shell-escape
  - --latex-engine-opt=--enable-write18
header-includes:
  - |
    ```{=latex}
    \usepackage{subcaption}
    \usepackage{booktabs}
    \usepackage{graphicx}
    \captionsetup{labelfont={bf}}
    %\subtitle{The (optional) subtitle}
    \versiondate{\today}
    \begin{committee}
    \advisor[Universit\`a della Svizzera Italiana, Switzerland]{Prof.}{Walter}{Binder}
    \assistant[Universit\`a della Svizzera Italiana, Switzerland]{Dr.}{Andrea}{Ros\'a}
    \end{committee}
    \abstract{The project aims at comparing two different traces coming from large datacenters, focusing in particular on unsuccessful executions of jobs and tasks submitted by users. The objective of this project is to compare the resource waste caused by unsuccessful executions, their impact on application performance, and their root causes. We will show the strong negative impact on CPU and RAM usage and on task slowdown. We will analyze patterns of unsuccessful jobs and tasks, particularly focusing on their interdependency. Moreover, we will uncover their root causes by inspecting key workload and system attributes such as machine locality and concurrency level.}
    ```
---

Introduction (including Motivation)

State of the Art

Introduce Rosà 2015 DSN paper on analysis
Describe Google Borg clusters
Describe Traces contents
Differences between 2011 and 2019 traces

Project requirements and analysis

(describe our objective with this analysis in detail)

Analysis methodology

Technical overview of traces' file format and schema

Overview of challenging aspects of the analysis (data size, schema, available computation resources)

Introduction to Apache Spark

General description of the Apache Spark workflow

The analysis of the Google 2019 Borg cluster traces was conducted using Apache Spark and its Python 3 API (PySpark). Spark was used to execute a series of queries that compute sums and other aggregations over the entire dataset provided by Google.

In general, each query follows a common Map-Reduce template: traces are first read and parsed, then filtered and transformed by performing selections, projections and computing new derived fields. The trace records are then often grouped by one of their fields, clustering related data together, before a reduce or fold operation is applied to each group.
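The template can be sketched as follows. This is a minimal illustration, not one of the actual analysis scripts: the field name `collection_id` and the per-key event count are assumptions chosen for the example. `run_local` is a pure-Python stand-in that mirrors the grouping and reduction on small inputs, while `run_spark` expresses the same steps with the RDD calls `map()` and `reduceByKey()`.

```python
from collections import defaultdict
from functools import reduce


def to_pair(record):
    """Map step: key each record by a field and emit the value to aggregate.

    The field name "collection_id" is an assumption made for this sketch.
    """
    return (record["collection_id"], 1)


def combine(a, b):
    """Reduce step: associative merge of two partial results."""
    return a + b


def run_local(records):
    """Pure-Python stand-in for the grouping and reduction, for small inputs."""
    groups = defaultdict(list)
    for key, value in map(to_pair, records):
        groups[key].append(value)
    return {key: reduce(combine, values) for key, values in groups.items()}


def run_spark(rdd):
    """Same template on a Spark RDD of dict records (requires pyspark)."""
    return rdd.map(to_pair).reduceByKey(combine)
```

Keeping the map and reduce functions free of Spark dependencies makes them easy to check on a handful of records before running them on the full dataset.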

Most input data is in JSONL format and adheres to a schema Google provided in the form of a Protocol Buffers specification¹.

One of the main quirks of the traces is that fields with a "zero" value (i.e. a value like 0 or the empty string) are often omitted from the JSON object records. When reading the traces in Apache Spark, it is therefore necessary to check for this possibility and populate those zero fields when they are omitted.
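A minimal sketch of this step is given below. The table `ZERO_DEFAULTS` and the field names in it are hypothetical placeholders for this example; a real query would list the fields it reads together with the zero value of their type in the schema. `load_records` only shows where the parser would plug into Spark via `textFile()` and `map()`.

```python
import json

# Hypothetical subset of fields and their "zero" values (assumption for
# this sketch; the real schema defines many more fields).
ZERO_DEFAULTS = {"time": 0, "type": 0, "instance_index": 0, "user": ""}


def parse_record(line, defaults=ZERO_DEFAULTS):
    """Parse one JSONL line and restore fields omitted because they were zero."""
    record = json.loads(line)
    for field, zero in defaults.items():
        # setdefault leaves fields that are present in the input untouched.
        record.setdefault(field, zero)
    return record


def load_records(sc, path):
    """Spark side (requires pyspark): one parsed dict per input line."""
    return sc.textFile(path).map(parse_record)
```

For instance, `parse_record('{"time": 5}')` yields a record in which `time` is 5 and the three omitted fields carry their zero values.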

Most queries use only two or three fields of each trace record, while the original records often consist of a couple of dozen fields. To save memory during the query, a projection is often applied to the data by means of a .map() operation over the entire trace set, performed using Spark's RDD API.
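The projection could look like the sketch below. The three field names are assumptions picked for illustration; each query would select its own. Returning a tuple rather than a dict avoids storing the key strings once per record.

```python
def project(record, fields=("collection_id", "time", "type")):
    """Keep only the fields a query needs, as a compact tuple.

    The default field names are assumptions made for this sketch.
    """
    return tuple(record.get(name) for name in fields)


def project_rdd(rdd):
    """Spark side (requires pyspark): apply the projection to every record."""
    return rdd.map(project)
```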

Another operation often needed before the Map-Reduce core of each query is record filtering, usually motivated by the presence of incomplete data (i.e. records containing fields whose value is unknown). This filtering is performed using the .filter() operation of Spark's RDD API.
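A possible predicate for this filter is sketched here. It assumes that an unknown value shows up as a missing key or as `None` at the point where the filter runs, and that legitimate zero values have already been restored, so that 0 or the empty string are not mistaken for missing data. The names in `REQUIRED` are hypothetical.

```python
# Fields a hypothetical query cannot work without (assumption for this sketch).
REQUIRED = ("collection_id", "time")


def is_complete(record, required=REQUIRED):
    """True if every required field is present and not None.

    Zero values (0, "") count as known; only None or a missing key is
    treated as unknown in this sketch.
    """
    return all(record.get(name) is not None for name in required)


def drop_incomplete(rdd):
    """Spark side (requires pyspark): keep only complete records."""
    return rdd.filter(is_complete)
```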

The core of each query is often a groupBy() followed by a map() operation on the aggregated data. The groupBy() partitions the set of all records into subsets whose records share a common key. Each of these small clusters is then reduced to a single record by a .map() operation. The motivation behind this computation is often to analyze the time series formed by the traces of each program. This is implemented by grouping records by program id with groupBy(), then using map() on each program's trace set to sort the traces by time and compute the desired property in the form of a record.
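The following sketch illustrates this pattern under stated assumptions: the field names `program_id`, `time` and `type`, and the properties computed per program (event count, last event type, time span), are invented for the example and are not taken from the analysis scripts. `run_local` emulates the grouping in plain Python; `run_spark` uses the RDD `groupBy()` and `map()` calls, where each grouped element is a `(key, iterable)` pair.

```python
from collections import defaultdict


def summarize_program(program_id, events):
    """Sort one program's events by time and reduce them to a single record."""
    ordered = sorted(events, key=lambda e: e["time"])
    return {
        "program_id": program_id,
        "n_events": len(ordered),
        "last_type": ordered[-1]["type"],
        "duration": ordered[-1]["time"] - ordered[0]["time"],
    }


def run_local(events):
    """Pure-Python stand-in for groupBy() followed by map(), for small inputs."""
    groups = defaultdict(list)
    for event in events:
        groups[event["program_id"]].append(event)
    return [summarize_program(key, group) for key, group in groups.items()]


def run_spark(rdd):
    """Same computation on a Spark RDD of dict records (requires pyspark)."""
    return (rdd.groupBy(lambda e: e["program_id"])
               .map(lambda kv: summarize_program(kv[0], kv[1])))
```

Note that `groupBy()` gathers all records of a key in one place, so this approach presumes that the trace set of any single program fits in the memory of one executor.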

Sometimes intermediate results are computed beforehand and saved in the Parquet format, so that later queries can load them instead of recomputing them.
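One way to organize such a cache is sketched below. The helper `load_or_compute` is hypothetical, and it assumes the output path is on a local filesystem so that `os.path.exists` can detect a previous run. `compute` is any zero-argument function returning a Spark DataFrame; results held in an RDD would first have to be converted, for example with `spark.createDataFrame()`.

```python
import os


def load_or_compute(spark, path, compute):
    """Return the DataFrame stored at `path`, computing and saving it first if absent.

    `spark` is a SparkSession and `compute` a zero-argument function that
    returns a DataFrame. Hypothetical helper, local paths only.
    """
    if not os.path.exists(path):
        # First run: materialize the intermediate result as Parquet.
        compute().write.parquet(path)
    # Always read back from disk so every run sees the same stored data.
    return spark.read.parquet(path)
```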

General Query script design

Ad-Hoc presentation of some analysis scripts (w diagrams)

Analysis (w observations)

machine_configs

\input{figures/machine_configs}

Observations:

machine configurations are definitely more varied than the ones in the 2011 traces
some clusters have more machine variability

machine_time_waste

\input{figures/machine_time_waste}

Observations:

task_slowdown

spatial_resource_waste

figure_7

figure_8

figure_9

table_iii, table_iv, figure_v

Potential causes of unsuccessful executions

Implementation issues -- Analysis limitations

Discussion on unknown fields

Limitation on computation resources required for the analysis

Other limitations ...

Conclusions and future work or possible developments

Some examples

Figure 1{reference-type="ref" reference="fig:USILogo"} shows how to insert figures in the document.

{#fig:USILogo width="50%"}

Table 1{reference-type="ref" reference="tab:numbers"} shows how to insert tables in the document.

::: {#tab:numbers}
  Col 1   Col 2   Col 3   Col 4
  ------- ------- ------- --------
  1       2       3       Goofy
  4       5       6       Mickey

  : Caption of the table
:::

¹ Google 2019 Borg traces Protocol Buffers specification on GitHub
