report work

This commit is contained in:
Claudio Maggioni 2021-06-09 22:20:27 +02:00
parent d7071adb87
commit a9c90c14f7
4 changed files with 138 additions and 118 deletions

Binary file not shown.


@ -620,13 +620,12 @@ This section aims to use some of the techniques used in section IV of
the Ros\'a et al.\ paper\cite{dsn-paper} to find patterns and interdependencies
between task and job events by gathering statistics on those events.
\subsection{Unsuccessful Task Event Patterns}
\input{figures/table_iii} % has table III and table IV in it
\subsection{Unsuccessful Task Event Patterns}\label{tabIII-section}
\input{figures/table_iii}
In this analysis we compute the distribution of termination events by type at
the task-level events and the conditional probability of a task successfully
terminating given a number of \texttt{EVICT}, \texttt{FAIL} and \texttt{FINISH}
termination events during the task execution.
the task-level events, namely \texttt{EVICT}, \texttt{FAIL}, \texttt{FINISH}
and \texttt{KILL} termination events.
A comparison of the termination event distribution between the 2011 and 2019
traces is shown in figure~\ref{fig:tableIII}. Additionally, a cluster-by-cluster
@ -688,38 +687,54 @@ corresponding task to terminate in an unsuccessful way: a task with no
\texttt{KILL} events have 0.02\%, 0.20\%, 0.44\%, 0.04\%, and
0.07\% probabilities of success respectively. The same effect can be observed,
albeit in a less drastic fashion, for the \texttt{EVICT} and \texttt{FAIL}
curves. The \texttt{EVICT} curve has for 0 to 5
curves. For tasks with 0 to 5 \texttt{EVICT} events, the \texttt{EVICT} curve shows 19.70\%,
15.94\%, 1.94\%, 1.67\%, 0.35\% and 0.00\% success probabilities respectively.
The \texttt{FAIL} probability curve has instead 18.55\%, 1.79\%, 14.49\%,
2.08\%, 2.40\%, and 1.29\% success probabilities for the same range.
Refer to figure \ref{fig:figureV}.
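As a sketch of the quantity plotted in these curves, and assuming that a task
counts as successful when it terminates with a \texttt{FINISH} event (the
symbols $T$, $S$ and $n_e$ are notation introduced here for illustration only),
the success probability for tasks with $k$ events of type $e$ may be written as:
\[
P(\mathrm{success} \mid n_e = k) =
\frac{|\{t \in T : n_e(t) = k \land t \in S\}|}{|\{t \in T : n_e(t) = k\}|}
\]
where $T$ is the set of all tasks, $S \subseteq T$ the set of successful tasks,
and $n_e(t)$ the number of events of type $e$ recorded for task $t$.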
Considering cluster-to-cluster behaviour in the 2019 traces (as shown in
figure~\ref{fig:figureV-csts}), some clusters behave much like
the aggregated plot (namely clusters A, F, and H), while other clusters
show strongly oscillating success probability curves for
\texttt{EVICT} and \texttt{FINISH} events. \texttt{KILL} behaviour is instead
homogeneous even on a per-cluster basis.
\textbf{Observations}:
\begin{itemize}
\item
Behaviour is very different from cluster to cluster
\item
There is no easy conclusion, unlike in 2011, on the correlation
between success probability and the number of events of a specific type.
\item
Clusters B, C and D in particular have very irregular curves that vary a
lot for small differences in the number of events. This may be due to an uneven
distribution of the number of events in the traces.
\end{itemize}
\subsection{Unsuccessful Job Event Patterns}
\input{figures/table_iv}
\textbf{Observations}:
This analysis uses techniques very similar to the ones used in
section~\ref{tabIII-section}, but focuses on the job level instead. The aim is
to better understand the task-job level relationship and to understand how
task-level termination events can influence the termination state of a job.
\begin{itemize}
\item
Again the mean number of tasks is significantly higher than in the 2011
traces, indicating a higher complexity of workloads
\item
Cluster A has no evicted jobs
\item
The number of events is however lower than the event means in the 2011
traces
\end{itemize}
A comparison of the analyzed parameters between the 2011 and 2019
traces is shown in figure~\ref{fig:tableIV}. Additionally, a cluster-by-cluster
breakdown of the same data for the 2019 traces is shown in
figure~\ref{fig:tableIV-csts}.
Considering the distribution of the number of tasks in a job, the 2019 traces show a
decrease in the mean (e.g.\ for \texttt{FAIL}ed jobs, from a mean of 60.5
tasks per job in 2011 to a mean of 43.126 tasks per job in 2019) and a fluctuation
of the 95-th percentile (e.g.\ for \texttt{FAIL}ed jobs it rose from 110
to 200, but for \texttt{KILL}ed jobs it decreased from 400 to 178).
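As a rough, derived illustration (these relative changes are computed here from
the figures quoted above and are not part of the tables), the changes amount to:
\[
\frac{43.126 - 60.5}{60.5} \approx -28.7\%, \qquad
\frac{200 - 110}{110} \approx +81.8\%, \qquad
\frac{178 - 400}{400} = -55.5\%
\]
for the mean number of tasks of \texttt{FAIL}ed jobs, the 95-th percentile of
\texttt{FAIL}ed jobs, and the 95-th percentile of \texttt{KILL}ed jobs
respectively.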
Considering the distribution of the number of task-wise termination events
instead, the 2019 traces show values generally one or two orders of magnitude
below the ones in 2011. While the behaviour of \texttt{EVICT}ed jobs stays the
same, \texttt{FAIL}ed and \texttt{KILL}ed jobs show a dramatic difference in
the event distribution, with \texttt{KILL} becoming the most frequent task-level
event, with means of 12.833 and 11.337 task events per job respectively. Finally,
the \texttt{FINISH}ed job category has a new event distribution too, with
\texttt{FINISH} task events being the most frequent at 1.778 events per job in
the 2019 traces.
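One possible formalisation of the event columns, assuming that each value is the
mean over jobs of the per-job total of task events of a given type (the symbols
$J_c$, $T_j$ and $n_e$ are notation introduced here for illustration only), is:
\[
\mu_{c,e} = \frac{1}{|J_c|} \sum_{j \in J_c} \sum_{t \in T_j} n_e(t)
\]
where $J_c$ is the set of jobs whose final state is $c$, $T_j$ the set of tasks
of job $j$, and $n_e(t)$ the number of termination events of type $e$ recorded
for task $t$. Under this reading, the 12.833 figure quoted above corresponds to
$c = \texttt{FAIL}$ and $e = \texttt{KILL}$ in the 2019 traces.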
The cluster-by-cluster comparison in figure~\ref{fig:tableIV-csts} shows that
the number of tasks per job is generally distributed similarly to the
aggregated data, with only cluster H having a remarkably low mean and 95-th
percentile overall. Event-wise, for \texttt{EVICT}ed, \texttt{FINISH}ed,
and \texttt{KILL}ed jobs the distributions are again similar to the aggregated
one. For some clusters (namely B, C, and D), the mean numbers of \texttt{FAIL} and
\texttt{KILL} task events for \texttt{FINISH}ed jobs are almost the same.
\section{Analysis: Potential Causes of Unsuccessful Executions}


@ -121,90 +121,3 @@ overall mean accompanied by the 95-th percentile of all termination
events, followed by a mean of events per event type of each
termination event.}\label{fig:tableIII-csts}
\end{figure}
\begin{figure}[p]
\begin{subfigure}{\textwidth}
\centering
\begin{tabular}{lrrrrr}
\toprule
\tableIVh%
\midrule
EVICT & 0.989 (1) & 1.000 & 0.000 & 0.000 & 0.000 \\
FAIL & 43.126 (200) & 0.114 & 2.300 & 0.981 & 12.833 \\
FINISH & 3.074 (2) & 0.005 & 0.153 & 1.778 & 0.014 \\
KILL & 53.919 (178) & 0.235 & 0.103 & 0.288 & 11.337 \\
\bottomrule
\end{tabular}
\caption{2011 data}
\vspace{0.5cm}
\end{subfigure}
\begin{subfigure}{\textwidth}
\centering
\begin{tabular}{lrrrrr}
\toprule
\tableIVh%
\midrule
EVICT & 0.989 (1) & 1.000 & 0.000 & 0.000 & 0.000 \\
FAIL & 43.126 (200) & 0.114 & 2.300 & 0.981 & 12.833 \\
FINISH & 3.074 (2) & 0.005 & 0.153 & 1.778 & 0.014 \\
KILL & 53.919 (178) & 0.235 & 0.103 & 0.288 & 11.337 \\
\bottomrule
\end{tabular}
\caption{2019 data}
\end{subfigure}
\caption{tbd}
\end{figure}
\begin{figure}[p]
\tableIV{A}{
EVICT & -- & -- & -- & -- & -- \\
FAIL & 90.793 (499) & 0.695 & 0.684 & 0.086 & 1.850 \\
FINISH & 1.187 (1) & 0.005 & 0.001 & 1.073 & 0.024 \\
KILL & 16.533 (10) & 1.045 & 0.074 & 0.461 & 1.189 \\
}
\tableIV{B}{
EVICT & 1.000 (1) & 1.000 & 0.000 & 0.000 & 0.000 \\
FAIL & 74.368 (374) & 2.003 & 1.994 & 0.267 & 4.944 \\
FINISH & 6.304 (10) & 0.022 & 0.008 & 2.349 & 0.013 \\
KILL & 69.853 (234) & 1.696 & 0.158 & 0.614 & 3.009 \\
}
\tableIV{C}{
EVICT & 1.000 (1) & 1.001 & 0.000 & 0.000 & 0.000 \\
FAIL & 41.982 (200) & 3.484 & 0.998 & 0.376 & 3.998 \\
FINISH & 1.991 (1) & 0.022 & 0.017 & 1.565 & 0.017 \\
KILL & 110.681 (652) & 0.627 & 0.059 & 0.656 & 2.267 \\
}
\tableIV{D}{
EVICT & 1.000 (1) & 1.000 & 0.000 & 0.000 & 0.000 \\
FAIL & 43.356 (250) & 6.112 & 0.949 & 0.531 & 6.498 \\
FINISH & 2.109 (2) & 0.268 & 0.013 & 1.723 & 0.019 \\
KILL & 89.648 (283) & 1.013 & 0.054 & 0.283 & 3.256 \\
}
\tableIV{E}{
EVICT & 1.000 (1) & 1.000 & 0.000 & 0.000 & 0.000 \\
FAIL & 23.081 (25) & 0.247 & 0.666 & 0.717 & 1.588 \\
FINISH & 7.776 (2) & 0.019 & 0.029 & 1.934 & 0.021 \\
KILL & 88.790 (309) & 0.706 & 0.029 & 0.461 & 7.572 \\
}
\tableIV{F}{
EVICT & 1.000 (1) & 1.000 & 0.000 & 0.000 & 0.000 \\
FAIL & 17.161 (8) & 0.621 & 0.546 & 0.426 & 7.559 \\
FINISH & 2.941 (2) & 0.015 & 0.051 & 1.670 & 0.162 \\
KILL & 103.889 (361) & 0.183 & 0.064 & 0.417 & 5.824 \\
}
\tableIV{G}{
EVICT & 1.000 (1) & 1.000 & 0.000 & 0.000 & 0.000 \\
FAIL & 51.835 (250) & 0.556 & 3.335 & 0.608 & 20.352 \\
FINISH & 8.519 (36) & 0.002 & 0.630 & 1.760 & 0.005 \\
KILL & 37.055 (100) & 5.687 & 0.065 & 0.080 & 19.166 \\
}
\tableIV{H}{
EVICT & 1.000 (1) & 1.000 & 0.000 & 0.000 & 0.000 \\
FAIL & 20.504 (1) & 0.114 & 2.300 & 0.981 & 12.833 \\
FINISH & 4.278 (14) & 0.005 & 0.153 & 1.778 & 0.014 \\
KILL & 11.023 (3) & 0.235 & 0.103 & 0.288 & 11.337 \\
}
\caption{tbd}
\end{figure}


@ -0,0 +1,92 @@
\begin{figure}[p]
\begin{subfigure}{\textwidth}
\centering
\begin{tabular}{lrrrrr}
\toprule
\tableIVh%
\midrule
EVICT & 1 (1) & 1 & 0 & 0 & 0 \\
FAIL & 60.5 (110) & $139.0$ & $788.5$ & $49.2$ & $9.5$ \\
FINISH & 2.7 (1) & $0.4$ & $0.1$ & $5 \cdot 10^{-4}$ & $2.7$ \\
KILL & 86.8 (400) & $13.3$ & $20.9$ & $26.9$ & $62.7$ \\
\bottomrule
\end{tabular}
\caption{2011 data}
\vspace{0.5cm}
\end{subfigure}
\begin{subfigure}{\textwidth}
\centering
\begin{tabular}{lrrrrr}
\toprule
\tableIVh%
\midrule
EVICT & 1.000 (1) & 1.000 & 0.000 & 0.000 & 0.000 \\
FAIL & 43.126 (200) & 0.114 & 2.300 & 0.981 & 12.833 \\
FINISH & 3.074 (2) & 0.005 & 0.153 & 1.778 & 0.014 \\
KILL & 53.919 (178) & 0.235 & 0.103 & 0.288 & 11.337 \\
\bottomrule
\end{tabular}
\caption{2019 data}
\end{subfigure}
\caption{Mean number of tasks and event distribution per job type for the
2011 and 2019 (all clusters aggregated) traces. The tables show the
mean and 95-th percentile of the number of tasks in a job, and
additionally show the mean of the job-wise total of task termination events.}\label{fig:tableIV}
\end{figure}
\begin{figure}[p]
\tableIV{A}{
EVICT & -- & -- & -- & -- & -- \\
FAIL & 90.793 (499) & 0.695 & 0.684 & 0.086 & 1.850 \\
FINISH & 1.187 (1) & 0.005 & 0.001 & 1.073 & 0.024 \\
KILL & 16.533 (10) & 1.045 & 0.074 & 0.461 & 1.189 \\
}
\tableIV{B}{
EVICT & 1.000 (1) & 1.000 & 0.000 & 0.000 & 0.000 \\
FAIL & 74.368 (374) & 2.003 & 1.994 & 0.267 & 4.944 \\
FINISH & 6.304 (10) & 0.022 & 0.008 & 2.349 & 0.013 \\
KILL & 69.853 (234) & 1.696 & 0.158 & 0.614 & 3.009 \\
}
\tableIV{C}{
EVICT & 1.000 (1) & 1.001 & 0.000 & 0.000 & 0.000 \\
FAIL & 41.982 (200) & 3.484 & 0.998 & 0.376 & 3.998 \\
FINISH & 1.991 (1) & 0.022 & 0.017 & 1.565 & 0.017 \\
KILL & 110.681 (652) & 0.627 & 0.059 & 0.656 & 2.267 \\
}
\tableIV{D}{
EVICT & 1.000 (1) & 1.000 & 0.000 & 0.000 & 0.000 \\
FAIL & 43.356 (250) & 6.112 & 0.949 & 0.531 & 6.498 \\
FINISH & 2.109 (2) & 0.268 & 0.013 & 1.723 & 0.019 \\
KILL & 89.648 (283) & 1.013 & 0.054 & 0.283 & 3.256 \\
}
\tableIV{E}{
EVICT & 1.000 (1) & 1.000 & 0.000 & 0.000 & 0.000 \\
FAIL & 23.081 (25) & 0.247 & 0.666 & 0.717 & 1.588 \\
FINISH & 7.776 (2) & 0.019 & 0.029 & 1.934 & 0.021 \\
KILL & 88.790 (309) & 0.706 & 0.029 & 0.461 & 7.572 \\
}
\tableIV{F}{
EVICT & 1.000 (1) & 1.000 & 0.000 & 0.000 & 0.000 \\
FAIL & 17.161 (8) & 0.621 & 0.546 & 0.426 & 7.559 \\
FINISH & 2.941 (2) & 0.015 & 0.051 & 1.670 & 0.162 \\
KILL & 103.889 (361) & 0.183 & 0.064 & 0.417 & 5.824 \\
}
\tableIV{G}{
EVICT & 1.000 (1) & 1.000 & 0.000 & 0.000 & 0.000 \\
FAIL & 51.835 (250) & 0.556 & 3.335 & 0.608 & 20.352 \\
FINISH & 8.519 (36) & 0.002 & 0.630 & 1.760 & 0.005 \\
KILL & 37.055 (100) & 5.687 & 0.065 & 0.080 & 19.166 \\
}
\tableIV{H}{
EVICT & 1.000 (1) & 1.000 & 0.000 & 0.000 & 0.000 \\
FAIL & 20.504 (1) & 0.114 & 2.300 & 0.981 & 12.833 \\
FINISH & 4.278 (14) & 0.005 & 0.153 & 1.778 & 0.014 \\
KILL & 11.023 (3) & 0.235 & 0.103 & 0.288 & 11.337 \\
}
\caption{Mean number of tasks and event distribution per job type for each
cluster in the 2019 traces. The tables show the
mean and 95-th percentile of the number of tasks in a job, and
additionally show the mean of the job-wise total of task termination events.}\label{fig:tableIV-csts}
\end{figure}