soft-an01/report.tex

\documentclass[11pt,a4paper]{scrartcl}
\usepackage{algorithm}
\usepackage{algpseudocode}
\usepackage[utf8]{inputenc}
\usepackage[margin=2.25cm]{geometry}
\usepackage{hyperref}
\usepackage{listings}
\usepackage{xcolor}
\usepackage{lmodern}
\usepackage{booktabs}
\usepackage{graphicx}
\usepackage{float}
\usepackage{tikz}
\usepackage{listings}
\usepackage{pgfplots}
\pgfplotsset{compat=1.18}
\usepackage{subcaption}
\setlength{\parindent}{0cm}
\setlength{\parskip}{0.3em}
\hypersetup{pdfborder={0 0 0}}
%\usepackage[nomessages]{fp} no easter eggs this time
\usepackage{amsmath}
\DeclareMathOperator*{\argmax}{arg\,max}
\DeclareMathOperator*{\argmin}{arg\,min}
\usepackage{minted}

\renewcommand{\MintedPygmentize}{python3 ./pygmentize.py}

\definecolor{codegreen}{rgb}{0,0.6,0}
\definecolor{codegray}{rgb}{0.5,0.5,0.5}
\definecolor{codepurple}{rgb}{0.58,0,0.82}
\definecolor{backcolour}{rgb}{0.95,0.95,0.92}

\lstdefinestyle{mystyle}{
  backgroundcolor=\color{backcolour},
  commentstyle=\color{codegreen},
  keywordstyle=\color{magenta},
  keywordstyle=[2]{\color{olive}},
  numberstyle=\tiny\color{codegray},
  stringstyle=\color{codepurple},
  basicstyle=\ttfamily\footnotesize,
  breakatwhitespace=false,
  breaklines=true,
  captionpos=b,
  keepspaces=true,
  numbers=left,
  numbersep=5pt,
  showspaces=false,
  showstringspaces=false,
  showtabs=false,
  tabsize=2,
  aboveskip=0.8em,
  belowcaptionskip=0.8em
}
\lstset{style=mystyle}

\geometry{left=2cm,right=2cm,top=2cm,bottom=3cm}
\title{
\vspace{-5ex}
Assignment 1 -- Software Analysis \\\vspace{0.5cm}
\Large Deductive verification with Dafny
\vspace{-1ex}
}
\author{Claudio Maggioni}
\date{\vspace{-3ex}}

\begin{document}
\maketitle

\section{Choice of Sorting Algorithm}

In order to choose I browse the implementations of some classical sorting
algorithms to find one that strikes a good tradeoff between ease-of-proof and a
solid understanding of how it works on my part.

I decide to implement and verify the correctness of selection sort. The
algorithm sorts a given list in place with average and worst-case complexity
$O(n^2)$. It works by iteratively finding either the minimum or maximum element
in the list, pushing it to respectively either the beginning or the end of the
list, and subsequently running the next iteration over the remaining $n-1$
elements.

For the sake of this assignment, I choose to implement and analyze the variant
where the minimum element is computed. The pseudocode of selection sort is given
in algorithm \ref{alg:sel}.

\begin{algorithm}
  \caption{Selection sort}\label{alg:sel}
  \begin{algorithmic}[1]
  \Require $a$ list of values
  \Ensure $a$ is sorted in-place
  \State $s \gets 0$
  \While{$s < |a|$}
    \State {$m \gets \argmin_x{a[x]}$ for $s \leq x < |a|$}
    \State {\textbf{swap} $a[x]$, $a[s]$}
    \State {$s \gets s + 1$}
  \EndWhile
  \end{algorithmic}
\end{algorithm}

I choose this algorithm due to its procedural nature, since I feel more
comfortable tackling loops instead of recursive calls when writing verification
code as we already covered them in class.

Additionally, given the algorithm incrementally places a ever-growing portion of
sorted elements at the beginning of the list as $s$ increases, finding a loop
invariant for the \textbf{while} loop shown in the pseudocode should be simple
as I can formalize this fact into a predicate.

\subsection{Dafny implementation}

To implement and verify the algorithm I use
\href{https://dafny.org/}{\color{blue} Dafny}, a programming language that is
verification-aware and equipped with a static program verifier.

I first write an implementation of the pseudocode, listed below.

\begin{listing}[H]
\begin{minted}[linenos]{dafny}
method SelectionSort(a: array<int>)
{
  if (a.Length == 0) {
    return;
  }

  var s := 0;

  while (s < a.Length - 1)
  {
    var min: int := s;
    var i: int := s + 1;

    while (i < a.Length)
    {
      if (a[i] < a[min]) {
        min := i;
      }
      i := i + 1;
    }

    a[s], a[min] := a[min], a[s];
    s := s + 1;
  }
}
\end{minted}
    \caption{Implementation of selection sort in Dafny}
    \label{lst:sel}
\end{listing}

The implementation is only slightly different from the pseudocode. The biggest
difference lies in the inner \textbf{while} loop at lines 14-20. This is just a
procedural implementation of the assignment

$$m \gets \argmin_x{l[x]} \text{\hspace{1cm} for \hspace{1cm}} s \leq x < |l|$$

at line 3 of the pseudocode.

\section{Verification}

I now verify that the implementation in listing \ref{lst:sel} is correct by
adding a specification to it, namely a method precondition, a method
postcondition, and invariants and variants for the outer and inner loop.

\subsection{Method precondition and postcondition}

Aside the \mintinline{dafny}{array<int>} type declaration, no other condition is
needed to constrain the input parameter \texttt{a} as a sorting algorithm should
sort any list. Therefore, the method precondition is

\mintinline{dafny}{requires true}

which can just be omitted.

Regarding postconditions, as the assignment description suggests, we need to
verify that the method indeed sorts the values, while preserving the values in
the list (i.e. without adding or deleting values).

We can define the sorted condition by saying that for any pair of monotonically
increasing indices of $a$ the corresponding elements should be monotonically
non-decreasing. This can be expressed with the predicate:

\begin{minted}{dafny}
predicate sorted(s: seq<int>)
{
  forall i,j: int :: 0 <= i < j < |s| ==> s[i] <= s[j]
}
\end{minted}

According to advice given during lecture, we can express order-indifferent
equality with the predicate:

\begin{minted}{dafny}
predicate sameElements(a: seq<int>, b: seq<int>)
{
  multiset(a) == multiset(b)
}
\end{minted}

Therefore, the method signature including preconditions and postconditions
is:

\begin{minted}{dafny}
method SelectionSort(a: array<int>)
  modifies a
  ensures sorted(a[..])
  ensures sameElements(a[..], old(a[..]))
\end{minted}

\subsection{Outer loop variant and invariant}

As mentioned already, the outer \textbf{while} loop in selection sort strongly
relates with the incremental selection of minimum values from the non-processed
part of the list (i.e. for indices in $[s,|a|)$) and them being moved to the
beginning of the list in the correct order. Indeed, the outer loop maintains two
main properties:

\begin{itemize}
\item The processed elements (i.e. indices $[0, s)$) are sorted, as the elements
    in them are the minimum, the second-minimum, the third-minimum and so on in
    this order;
\item As all the processed elements have been selected as minimums, all of them
    are by definition greater than or equal than the non-processed elements.
    Even if this property seems trivial, it is quite important to ensure we
    can simply ``append'' elements to the processed portion as they will for
    sure be greater than all the values in indices $[0, s)$.
\end{itemize}

We can formalize these two facts respectively with two Dafny loop invariants on
the outer while loop:

\begin{minted}{dafny}
invariant s >= 1 ==> sorted(a[0..s]) // saying one item is sorted makes little sense
invariant forall i,j: int :: 0 <= i < s <= j < a.Length ==> a[i] <= a[j]
\end{minted}

Since this loop has an index $s$ iterating over the values in $[0,|a| - 1]$
(note that we condiser the value of $s$ after loop termination as well) in
steps of one the corresponding loop invariant and variant on the index are quite
straightforward\footnote{We specify loop variants as well for completeness'
sake, even if Dafny is able to infer them in these circumnstances.}. Indeed,
they are respectively:

\begin{minted}{dafny}
invariant s <= a.Length - 1
decreases a.Length - s
\end{minted}

Finally, since the only mutation performed on the list is the \textbf{swap}
operation at line 22 of the code, and a swap operation does not create or
destroy values (either it swaps the position of two values or it swaps the same
position with itself -- i.e. doing nothing), the order-indifferent equality
predicate can be simply added to this loop as an invariant

\begin{minted}{dafny}
invariant sameElements(a[..], old(a[..]))
\end{minted}

However trivial of a fact as it is, Dafny requires it as an invariant to
complete the proof of correctness, and therefore we need to be a little redundant.

\subsection{Inner loop variant and invariant}

The inner loop implements the \textit{arg min} expression used in the assignment
at line 6 of the pseudocode. From this fact we can easily say this loop does not
mutate $a$ in any array and therefore the order-indifferent equality predicate
holds:

\begin{minted}{dafny}
invariant sameElements(a[..], old(a[..]))
\end{minted}

Then, we can define the invariant for the index $i$, which iterates
over the values in $[s + 1, |a|]$ in single increasing steps. We also know that
the loop condition is based on the upper bound of this interfal, so we can
define the loop variant as well:

\begin{minted}{dafny}
invariant s + 1 <= i <= a.Length
decreases a.Length - i
\end{minted}

The only assignments to variable \textit{min} are at line 11 and line 17 of the
pseudocode, where \textit{min} is initialized to $s$ and \textit{min} is
assigned the value of $i$ respectively. Therefore we can say the value of
\textit{min} at any point where is in scope is either $s$ or $\in [s + 1,|a|)$
as it is assigned a possible value of $i$ \textit{inside} the loop. We can
formalize this fact with an invariant:

\begin{minted}{dafny}
invariant s <= min < a.Length
\end{minted}

Finally % it's here for you, it's the last member of the DK crew
we can define a loop invariant for the values of the list element referenced by
\textit{min}. As the loop scans for values referenced by $i$ less than
\mintinline{dafny}{a[min]}, updating \textit{min} with the value of $i$ when
this is true, we can say that for all values smaller than $i$ in the scanned
region \mintinline{dafny}{a[min]} indeed contains the smallest value. This fact
in predicate form results in the following loop invariant:

\begin{minted}{dafny}
invariant forall j: int :: s <= j < i ==> a[min] <= a[j]
\end{minted}

\subsection{Dafny implementation with specification}
The Dafny implementation of selection sort complete with the specification
explained so far can be found in the file \texttt{selectionsort.dfy} found in
the repository:

\begin{center}
    \href{https://gitlab.com/usi-si-teaching/msde/2022-2023/software-analysis/maggioni/assigment-1}{gitlab.com:usi-si-teaching/msde/2022-2023/software-analysis/maggioni/assigment-1}
\end{center}

on the \textit{main} branch.

\section{Conclusions}

Writing specifications with Dafny turned out to indeed be a challenge. Choosing
the right predicates and the right places to put them to satisfy Danfy and
Boogie, its embedded theorem prover, was an iterative process requiring many
attempts and careful scrutiny at the reported errors.

As part of this process, I have introduced many \mintinline{dafny}{assert} and
\mintinline{dafny}{assume} clauses to tackle the algorithm portion by portion.
As \mintinline{dafny}{assume}s are basically ``cheats'' w.r.t. the correctness
proving process and \mintinline{dafny}{assert}s were redundant, no such
statement survives in my implementation.

Lines 3-5 of the algorithm's implementation (a simple check of the list $a$
being empty, and an immediate \mintinline{dafny}{return} if this is the case)
are not strictly needed according to the pseudocode. However, handling this edge
case specifically simplifies somewhat the process of determining
the loop invariants as we can assume in all further statements that the list is
non-empty.

Indeed, the hardest step in the verification process was determining appropriate
loop invariants to satisfy the theorem proven. The need for redundant
order-indifferent equality predicates in both loops is quite counter-intuitive
as those invariants merely re-state a property that is true across the method.

\end{document}