report 70% done

This commit is contained in:
Claudio Maggioni 2023-05-01 13:24:46 +02:00
parent 0d3fadaafa
commit f50bfb288b
2 changed files with 148 additions and 16 deletions

Binary file not shown.

View file

@ -148,10 +148,15 @@ assertions) simply run:
env MAVEN_OPTS="-ea" mvn clean test
\end{verbatim}
The state of the assignment repository when the type checker was first ran
successfully is pinned by the \textit{git} tag \textit{before-refactor}. A copy
of the CheckerFramework relevant portion of the compilation output at that tag
is stored in the file \textit{before-refactor.txt}.
Apache Commons Text includes classes that have been deprecated. As changing the
interface and behaviour of these classes would be useless, as alternatives to
them exist in the library already, I choose to ignore them while refactoring by
adding a \textit{@SuppressWarning} annotation in each of them. The state of the
assignment repository after the deprecated classes are annotated and when the
type checker was first ran successfully is pinned by the \textit{git} tag
\textit{before-refactor}. A copy of the CheckerFramework relevant portion of the
compilation output at that tag is stored in the file
\textit{before-refactor.txt}.
No CheckerFramework checkers other than the index checker is used in this
analysis as the code in the project mainly manipulates strings and arrays and a
@ -175,7 +180,7 @@ significant number of warnings are generated even by using this checker only..
Total & 513 & 483 \\ \bottomrule
\end{tabular}
\caption{Number of CheckerFramework Type Checker warnings by category before
and after refactoring.}
and after refactoring, ignoring deprecated classes.}
\label{tab:check}
\end{table}
@ -187,8 +192,8 @@ in the following classes:
\begin{multicols}{2}
\begin{itemize}
\item AlphabetConverter
\item StringSubstitutor
\item AlphabetConverter % done
\item StringSubstitutor % done
\item similarity.LongestCommonSubsequence
\item translate.AggregateTranslator
\item translate.CharSequenceTranslator
@ -204,20 +209,147 @@ in the following classes:
\end{itemize}
\end{multicols}
The strategy I adopt to perform the refactoring is based on the compiler errors
thrown on the original code. In every flagged statement I attempt to find the
root cause of the warning and eliminate it with either extended type qualifier
annotations or assertions when adding only the former fails.
Instead of using \texttt{@SuppressWarning} annotations I choose to use
\texttt{@AssumeAssertion}-annotated assertions as I aim to use the existing
Commons Text test suite to aid in finding incorrectly-placed annotations. As
mentioned before in the report, I run the test suite of the project by enabling
assertions and I verify that all tests still pass and no
\textit{AssertionError}s are thrown.
In total, the refactor consists in the placement of 16 extended type qualifiers
and 14 assertions. A more detailed description of salient refactoring decisions
taken to extinguish the warnings follows.
\begin{listing}[H]
\begin{minted}[linenos,firstnumber=139]{java}
public static int toMillisInt(final Duration duration) {
Objects.requireNonNull(duration, "duration");
// intValue() does not do a narrowing conversion here
return LONG_TO_INT_RANGE.fit(Long.valueOf(duration.toMillis())).intValue();
\subsection{Class \textit{AlphabetConverter}}
\begin{minted}[linenos,firstnumber=387]{java}
for (int j = 0; j < encoded.length();) {
final int i = encoded.codePointAt(j);
final String s = codePointToString(i);
if (s.equals(originalToEncoded.get(i))) {
result.append(s);
j++; // because we do not encode in Unicode extended the
// length of each encoded char is 1
} else {
if (j + encodedLetterLength > encoded.length()) {
throw new UnsupportedEncodingException("Unexpected end "
+ "of string while decoding " + encoded);
}
final String nextGroup = encoded.substring(j,
j + encodedLetterLength);
\end{minted}
Here the \texttt{substring(\ldots)} call at line 151 is flagged by
CheckerFramework warning the start and end index may be negative and that the
start index may be greater than the end index. As the attribute
\texttt{encodedLetterLength} is positive according to the contract of the class
constructor and \texttt{j} is only incremented in the for loop or by a factor of
\texttt{encodedLetterLength}, the code is correct. After introducing a
\mintinline{java}{@Positive} annotation on the declaration of \texttt{j} and an
\mintinline{java}{assert encodedLength > 0} after line 395, CheckerFramework
agrees with my judgement.
\subsection{Class \textit{StringSubstitutor}}
\begin{minted}[linenos,breaklines,firstnumber=910]{java}
/** [...]
* @throws IllegalArgumentException if variable is not found when its allowed to throw exception
* @throws StringIndexOutOfBoundsException if {@code offset} is not in the
* range {@code 0 <= offset <= source.length()}
* @throws StringIndexOutOfBoundsException if {@code length < 0}
* @throws StringIndexOutOfBoundsException if {@code offset + length > source.length()}
*/
public String replace(final String source, final int offset, final int length) {
if (source == null) {
return null;
}
final TextStringBuilder buf = new TextStringBuilder(length).append(source, offset, length);
if (!substitute(buf, 0, length)) {
return source.substring(offset, offset + length);
}
return buf.toString();
}
\end{minted}
\caption{Method \textit{toMillisInt(Duration)} of class
\textit{time.DurationUtils} in Apache Commons Lang 3.12.0.}
\end{listing}
The implementation of method \texttt{replace} is flagged by the extended type
checker as the indices \texttt{offset} and \texttt{length} are not bound checked
against the string \texttt{source}. As the unsafe behaviour of the method is
documented in its \textit{javadoc} with appropriate \texttt{@throws} clauses, I
simply add this implied preconditions to the method's contract by using extended
type qualifiers:
\begin{minted}[breaklines]{java}
public String replace(final String source,
final @IndexOrHigh("#1") int offset,
final @NonNegative @LTLengthOf(value = "#1", offset = "#2 - 1") int length)
\end{minted}
\subsection{Class \textit{translate.CharSequenceTranslator} and implementors}
Apache Commons Text provides the aforementioned abstract class implementation as
a template method of sorts for expressing text encoding and decoding algorithms.
The class essentially provides facilities to scan UTF-16 code points
sequentially, and delegating the translation of each code point to the
implementation of the abstract method:
\begin{minted}[breaklines]{java}
public abstract int translate(CharSequence input, int index, Writer writer) throws IOException;
\end{minted}
CheckerFramework gives some warnings about some of the implementations of this
method, highlighting that they assume the \texttt{input} \textit{CharSequence}
is non-empty and the \texttt{index} parameter is a valid index for the string.
Even if the method is public, I choose to interpret this hierarchy to mainly be
a template method pattern, with high coupling between the algorithm in the
abstract class and each abstract method implementation. Given this, I decide to
restrict the method's precondition to highlight conditions already provided by
the caller algorithm, namely the length and index constraints provided by
CheckerFramework.
The new signature of the abstract method is this:
\begin{minted}[breaklines]{java}
public abstract int translate(@MinLen(1) CharSequence input,
@NonNegative @LTLengthOf("#1") int index,
Writer writer) throws IOException;
\end{minted}
As some methods have a more forgiving implementation, and a broader child method
argument type from a more restrictive parent type does not break the rules of
inheritance (thanks to contravariance), I choose to propagate the extended type
annotations only when needed and avoid introducing additional preconditions to
more tolerant implementations of the template method.
\subsection{Class \textit{translate.SinglePassTranslator} and implementors}
\textit{SinglePassTranslator} is one of the implementor classes of the
aforementioned \textit{CharSequenceTranslator} template method. However, the
class is itself a template method pattern ``for processing whole input in single
pass''\footnote{According to the class \textit{javadoc} description.}, i.e.
essentially performing an abstraction inversion of the codepoint-by-codepoint
algorithm in \textit{CharSequenceTranslator} by implementing the encoding or
decoding process in a single go.
The class immediately delegates the implementation of the translation algorithm
to the abstract package-private method:
\begin{minted}[breaklines]{java}
abstract void translateWhole(CharSequence input, Writer writer) throws IOException;
\end{minted}
and requires callers of the public implementation of \texttt{translate} to call
it with \texttt{index} equal to 0.
I simply propagate the non-empty extended type annotation on \texttt{input}
(i.e. \mintinline{java}{@MinLen(1)}) on this new abstract method and
implementors.
\section{Conclusions}