<!-- vim: set ts=2 sw=2 et tw=80: -->

---
title: Homework 5 -- Optimization Methods
author: Claudio Maggioni
header-includes:
- \usepackage{amsmath}
- \usepackage{amssymb}
- \usepackage{hyperref}
- \usepackage[utf8]{inputenc}
- \usepackage[margin=2.5cm]{geometry}
- \usepackage[ruled,vlined]{algorithm2e}
- \usepackage{float}
- \floatplacement{figure}{H}
- \hypersetup{colorlinks=true,linkcolor=blue}
---

\maketitle

# Exercise 1

## Exercise 1.1

### The Simplex method

The simplex method solves constrained minimization problems with a linear cost
function and linear equality and inequality constraints. Its main idea is to
consider only the basic feasible points, i.e. the vertices of the feasible
region polytope, and to iteratively hop from one vertex to a neighbouring one,
at each step trying to decrease the cost function.

Although the simplex method is efficient in most practical applications, it has
exponential worst-case complexity: it has been proven that a carefully crafted
$n$-dimensional problem can have up to $2^n$ polytope vertices, making the
method inefficient on such problems.

### Interior-point method

The interior-point method aims to have a better worst-case complexity than the
simplex method while retaining acceptable performance in practice. Instead of
performing many inexpensive iterations walking along the polytope boundary, the
interior-point method takes Newton-like steps travelling through "interior"
points of the feasible region (hence the name of the method), thus reaching the
constrained minimizer in fewer iterations. Additionally, the interior-point
method is easier to parallelize.

### Penalty method

The penalty method allows a linearly constrained minimization problem with
equality constraints to be converted into an unconstrained minimization
problem, so that conventional unconstrained minimization algorithms can be used
to solve it. Namely, the penalty method builds a new unconstrained objective
function which is the sum of:

- The original objective function;
- An additional term for each constraint, which is positive when the current
  point $x$ violates that constraint and zero otherwise.

With some fine tuning of the coefficients of these new "penalty" terms, it is
possible to build an equivalent unconstrained minimization problem whose
minimizer is also a constrained minimizer for the original problem.

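As an illustration, here is a minimal MATLAB sketch of this idea using a
quadratic penalty; the function handles `f` and `h` and the schedule for the
penalty weight `mu` are assumptions made up for the example, not part of the
exercise (`fminsearch` is base MATLAB):

```matlab
% Quadratic-penalty sketch: minimize f(x) subject to h(x) = 0 by minimizing
% Q(x; mu) = f(x) + (mu/2) * h(x)^2 for an increasing sequence of weights mu.
f = @(x) x(1)^2 + 2 * x(2)^2;    % hypothetical objective
h = @(x) x(1) + x(2) - 1;        % hypothetical equality constraint
x = [0; 0];                      % starting guess
for mu = [1 10 100 1000]
    Q = @(x) f(x) + (mu / 2) * h(x)^2;  % penalized unconstrained objective
    x = fminsearch(Q, x);               % warm-start from previous solution
end
disp(x')  % approaches the constrained minimizer [2/3, 1/3] as mu grows
```
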
## Exercise 1.2

The simplex method, as said in the previous section, works by iterating over
basic feasible points while minimizing the cost function. In linear algebra
terms, the simplex method starts by finding an initial set of indices
$\mathcal{B}$, representing column indices of $A$ that form a basis. At each
iteration, the vector $s$ of Lagrange multipliers for the inequality
constraints (the reduced costs) is computed and checked for negative
components, since in order to satisfy the KKT conditions all of its components
must be $\geq 0$. The iteration then removes one of the negative components by
changing the index set $\mathcal{B}$, effectively swapping one of the basis
columns with one of the non-basic ones. The method terminates once all
components of $s$ are non-negative.

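The sketch below shows how one such iteration could look in MATLAB, following
Nocedal's notation for a standard-form problem ($\min c^Tx$ subject to
$Ax = b$, $x \geq 0$); the variables `basis` and `nonbasis` holding the index
sets are assumptions made for the example:

```matlab
% One simplex iteration (sketch): the columns A(:, basis) form the basis B.
[m, n] = size(A);
B = A(:, basis);  N = A(:, nonbasis);
xB = B \ b;                      % values of the basic variables
lambda = B' \ c(basis);          % simplex multipliers
s = c(nonbasis) - N' * lambda;   % reduced costs (dual slacks)
if all(s >= 0)
    % KKT conditions hold: the current basic feasible point is optimal
else
    [~, q] = min(s);             % entering index: most negative component
    d = B \ A(:, nonbasis(q));   % direction along a polytope edge
    ratios = xB ./ d;  ratios(d <= 0) = Inf;
    [~, p] = min(ratios);        % leaving index from the ratio test
    % swap basis(p) and nonbasis(q), then repeat
end
```
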
Geometrically speaking, each iteration is a transition from one basic feasible
point to a neighbouring one, and the search step is effectively one of the
edges of the polytope. If the entering component of $s$ is always chosen to be
the smallest (i.e. we choose the "most negative" component), then the method
effectively behaves like a gradient descent on the cost function that hops
between basic feasible points.

## Exercise 1.3

In order to discuss the interior-point method in detail, we first define two
sets, named the "feasible set" ($\mathcal{F}$) and the "strictly feasible set"
($\mathcal{F}^{o}$) respectively:

$$
\begin{array}{l} \mathcal{F}=\left\{(x, \lambda, s) \mid A x=b, A^{T}
\lambda+s=c,(x, s) \geq 0\right\} \\ \mathcal{F}^{o}=\left\{(x, \lambda, s) \mid
A x=b, A^{T} \lambda+s=c,(x, s)>0\right\} \end{array}
$$

The central path $\mathcal{C}$ is defined as an arc composed of strictly
feasible points, parametrized by a scalar $\tau>0$, where each point
$\left(x_{\tau}, \lambda_{\tau}, s_{\tau}\right) \in \mathcal{C}$ satisfies the
following conditions:

$$
\begin{array}{c}
A^{T} \lambda+s=c \\
A x=b \\
x_{i} s_{i}=\tau \quad i=1,2, \ldots, n \\
(x, s)>0
\end{array}
$$

We can observe that these conditions are very similar to the KKT conditions:
they differ only in that the complementarity condition $x_i s_i = 0$ is
replaced by $x_i s_i = \tau$ and the pair $(x, s)$ is required to be strictly
positive. With this, we can define the central path as follows:

$$
\mathcal{C}=\left\{\left(x_{\tau}, \lambda_{\tau}, s_{\tau}\right) \mid
\tau>0\right\}
$$

Given these definitions, we can also observe that, as $\tau$ approaches zero,
the conditions we have defined become a closer and closer approximation of the
original KKT conditions. Therefore, if the central path $\mathcal{C}$ converges
to anything as $\tau$ approaches zero, then it converges to a solution of the
linear program: the central path leads us to a solution by keeping $x$ and $s$
positive while reducing the pairwise products $x_i s_i$ to zero. Usually,
Newton's method is used to take steps following $\mathcal{C}$ rather than the
boundary of the feasible set $\mathcal{F}$, because this allows for longer
steps before the positivity constraint is violated.

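A single damped Newton step on these perturbed KKT conditions could be sketched
in MATLAB as follows; this is an illustrative sketch only, assuming a strictly
feasible starting triple `(x, lambda, s)`, problem data `A`, `b`, `c`, and a
centering parameter `sigma` in $(0,1)$:

```matlab
% One primal-dual interior-point step on the perturbed KKT system.
[m, n] = size(A);
mu = x' * s / n;                 % duality measure
tau = sigma * mu;                % target pairwise product on the central path
J = [zeros(n)  A'          eye(n);
     A         zeros(m)    zeros(m, n);
     diag(s)   zeros(n, m) diag(x)];
r = [A' * lambda + s - c;        % dual residual
     A * x - b;                  % primal residual
     x .* s - tau];              % centrality residual
delta = -J \ r;                  % Newton direction
dx = delta(1:n);  dl = delta(n+1:n+m);  ds = delta(n+m+1:end);
% step length that keeps (x, s) strictly positive
alpha = 0.99 * min([1; -x(dx < 0) ./ dx(dx < 0); -s(ds < 0) ./ ds(ds < 0)]);
x = x + alpha * dx;  lambda = lambda + alpha * dl;  s = s + alpha * ds;
```
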
# Exercise 2

## Exercise 2.1

The resulting MATLAB plot of each constraint and of the feasible region is
shown below:

![Plot of feasible region and constraints\label{fig:a}](./ex2-1.png)

## Exercise 2.2

According to Nocedal, a vector $x$ is a basic feasible point if it is in the
feasible region and if there exists a subset $\beta$ of the index set
$\{1, 2, \ldots, n\}$ such that:

- $\beta$ contains exactly $m$ indices, where $m$ is the number of rows of $A$;
- For any $i \notin \beta$, $x_i = 0$, meaning the bound $x_i \geq 0$ can be
  inactive only if $i \in \beta$;
- The $m \times m$ matrix $B$ defined by $B = [A_i]_{i \in \beta}$ (where $A_i$
  is the $i$-th column of $A$) is non-singular, i.e. all columns corresponding
  to the indices in $\beta$ are linearly independent from each other.

The geometric interpretation of basic feasible points is that all of them are
vertices of the polytope that bounds the feasible region. We will use this
proven property to manually solve the constrained minimization problem
presented in this section, with the aid of the plot of the feasible region in
figure \ref{fig:a}.

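To make the definition concrete, the following MATLAB fragment checks a
candidate basis for this exercise's problem once slack variables are added to
reach standard form; the variable ordering, with the three slacks in columns 3
to 5, is an assumption made for this illustration:

```matlab
% Standard form of the Exercise 2 constraints: variables [x1 x2 z1 z2 z3].
A = [-3 2 1 0 0;
      2 3 0 1 0;
      2 1 0 0 1];
b = [3; 6; 4];
basis = [3 4 5];                 % candidate basis: the three slack columns
B = A(:, basis);
if rank(B) == numel(basis) && all(B \ b >= 0)
    disp('basic feasible point') % here x1 = x2 = 0: the origin vertex
end
```
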
## Exercise 2.3

Since the geometric interpretation of the definition states that the basic
feasible points are none other than the vertices of the feasible region, we
first look at the plot above and at these points (i.e. the vertices of the
bright green non-transparent region). Then, we look at which constraint
boundaries intersect at each vertex, and we formulate an algebraic expression
to find these points. In clockwise order, we have:

- The lower-left point at the origin, given by the boundaries of the constraints
  $x_1 \geq 0$ and $x_2 \geq 0$:
  $$x^*_1 = \begin{bmatrix}0\\0\end{bmatrix}$$
- The top-left point, at the intersection of the constraint boundaries
  $x_1 \geq 0$ and $-3x_1 + 2x_2 \leq 3$:
  $$x_1 = 0 \;\;\; 2x_2 = 3 \Leftrightarrow x_2 = \frac32 \;\;\; x^*_2 =
  \frac12 \cdot \begin{bmatrix}0\\3\end{bmatrix}$$
- The top-center-left point, at the intersection of the constraint boundaries
  $-3x_1 + 2x_2 \leq 3$ and $2x_1 + 3x_2 \leq 6$ (adding $\frac32$ times the
  second boundary equation to the first eliminates $x_1$):
  $$-3x_1 + 2x_2 + 3x_1 + \frac92 x_2 = \frac{13}{2} x_2 = 3 + 9 = 12
  \Leftrightarrow x_2 = 12 \cdot \frac{2}{13} = \frac{24}{13}$$
  $$-3x_1 + 2 \cdot \frac{24}{13} = 3 \Leftrightarrow x_1 = \frac{39 - 48}{13}
  \cdot \frac{1}{-3} = \frac{3}{13} \;\;\; x^*_3 = \frac{1}{13} \cdot
  \begin{bmatrix}3\\24\end{bmatrix}$$
- The top-center-right point, at the intersection of the constraint boundaries
  $2x_1 + 3x_2 \leq 6$ and $2x_1 + x_2 \leq 4$ (subtracting the second boundary
  equation from the first):
  $$2x_1 + 3x_2 - 2x_1 - x_2 = 2x_2 = 6 - 4 = 2 \Leftrightarrow x_2 = 1 \;\;\;
  2x_1 + 1 = 4 \Leftrightarrow x_1 = \frac32 \;\;\; x^*_4 = \frac12 \cdot
  \begin{bmatrix}3\\2\end{bmatrix}$$
- The right point, at the intersection of $2x_1 + x_2 \leq 4$ and $x_2 \geq 0$:
  $$x_2 = 0 \;\;\; 2x_1 + 0 = 4 \Leftrightarrow x_1 = 2 \;\;\; x^*_5 =
  \begin{bmatrix}2\\0\end{bmatrix}$$

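Each of these vertices can be cross-checked in MATLAB by solving the
$2 \times 2$ linear system of the two active constraint boundaries; for
example, for the top-center-left point:

```matlab
% Intersection of -3*x1 + 2*x2 = 3 and 2*x1 + 3*x2 = 6:
x = [-3 2; 2 3] \ [3; 6]   % returns [3/13; 24/13], i.e. x*_3
```
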
Therefore, $x^*_1$ to $x^*_5$ are all of the basic feasible points of this
constrained minimization problem.

We then compute the objective function value for each basic feasible point
found. The smallest objective value will correspond to the solution of the
constrained minimization problem.

$$
x^*_1 = \begin{bmatrix}0\\0\end{bmatrix} \;\;\; f(x^*_1) = 4 \cdot 0 + 3 \cdot 0 =
0$$
$$
x^*_2 = \frac12 \cdot \begin{bmatrix}0\\3\end{bmatrix} \;\;\;
f(x^*_2) = 4 \cdot 0 + 3 \cdot \frac{3}{2} = \frac92$$
$$
x^*_3 = \frac{1}{13} \cdot \begin{bmatrix}3\\24\end{bmatrix} \;\;\; f(x^*_3) = 4
\cdot \frac{3}{13} + 3 \cdot \frac{24}{13} = \frac{84}{13}$$
$$
x^*_4 = \frac12 \cdot \begin{bmatrix}3\\2\end{bmatrix} \;\;\; f(x^*_4) = 4 \cdot
\frac32 + 3 \cdot 1 = 9$$
$$
x^*_5 = \begin{bmatrix}2\\0\end{bmatrix} \;\;\; f(x^*_5) = 4 \cdot 2 + 3 \cdot 0
= 8$$

Therefore, $x^* = x^*_1 = \begin{bmatrix}0 & 0\end{bmatrix}^T$ is the global
constrained minimizer.

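This result can also be verified with MATLAB's `linprog` (assuming, as the
evaluations above imply, the objective $f(x) = 4x_1 + 3x_2$; `linprog` requires
the Optimization Toolbox):

```matlab
% Cross-check of the hand computation: min f'*x s.t. A*x <= b, x >= 0.
f = [4; 3];
A = [-3 2;
      2 3;
      2 1];
b = [3; 6; 4];
lb = [0; 0];
x = linprog(f, A, b, [], [], lb, [])  % returns [0; 0], the origin vertex
```
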
# Exercise 3

## Exercise 3.1

We consider the given problem, which is exactly the same as one of the problems
of the previous assignment (Homework 4):

$$\min_{x} f(x) = 3x^2_1 + 2x_1x_2 + x_1x_3 +
2.5x^2_2 + 2x_2x_3 + 2x^2_3 - 8x_1 - 3x_2 - 3x_3$$
$$\text{subject to } x_1 + x_3 = 3 \;\;\; x_2 + x_3 = 0$$

Defining $x$ as $(x_1,\,x_2,\,x_3)^T$, this can be written in the form of a
quadratic minimization problem:

$$\min_{x} f(x) = \dfrac{1}{2} \langle x,\, Gx\rangle + \langle x,\, c\rangle
\quad \text{subject to } Ax = b$$

where $G\in \mathbb{R}^{n\times n}$ is a symmetric positive definite matrix and
$x$, $c \in \mathbb{R}^n$. The equality constraints are defined in terms of the
matrix $A\in \mathbb{R}^{m\times n}$, with $m \leq n$, and the vector $b \in
\mathbb{R}^m$. Here, matrix $A$ has full rank.

Yes, the problem can be solved with _Uzawa_'s method, since it can be
reformulated as a saddle-point system. The KKT conditions of the problem can be
written as a matrix-vector equality in the following way:

$$\begin{bmatrix}G & -A^T\\A & 0 \end{bmatrix} \begin{bmatrix}
x^*\\\lambda^* \end{bmatrix} = \begin{bmatrix} -c\\b \end{bmatrix}.$$

If we then express the minimizer $x^*$ in terms of an approximation $x$ and a
search step $p$ (i.e. $x^* = x + p$), we obtain the following system:

$$\begin{bmatrix}
G & A^T\\
A & 0
\end{bmatrix}
\begin{bmatrix}
-p\\
\lambda^*
\end{bmatrix} =
\begin{bmatrix}
g\\
h
\end{bmatrix}$$

This is the system that _Uzawa_'s method will solve. Therefore, we need to
check whether the matrix

$$K = \begin{bmatrix}G & A^T \\ A& 0\end{bmatrix} = \begin{bmatrix}
6 & 2 & 1 & 1 & 0 \\
2 & 5 & 2 & 0 & 1 \\
1 & 2 & 4 & 1 & 1 \\
1 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 \\
\end{bmatrix}$$

(recalling the computed values of $A$ and $G$ from the previous assignment) has
both positive and negative non-zero eigenvalues. We compute the eigenvalues of
this matrix with MATLAB, and we find:

$$\begin{bmatrix}
-0.4818\\
-0.2685\\
2.6378\\
4.3462\\
8.7663\end{bmatrix}$$

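A one-liner along these lines reproduces them (sketch, reusing `G` and `A` as
defined above):

```matlab
K = [G A'; A zeros(2)];
sort(eig(K))   % two negative and three positive eigenvalues
```
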
Therefore, the system is indeed a saddle-point system and it can be solved with
_Uzawa_'s method.

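For illustration, a minimal sketch of the classical Uzawa iteration applied
directly to the first KKT system above could look as follows; the step size
`alpha` and the iteration count are assumptions made for the example, and
`main.m` may be organized differently:

```matlab
G = [6 2 1; 2 5 2; 1 2 4];
A = [1 0 1; 0 1 1];
c = [-8; -3; -3];
b = [3; 0];
lambda = zeros(2, 1);
alpha = 0.5;   % must satisfy 0 < alpha < 2 / lambda_max(A * inv(G) * A')
for k = 1:500
    x = G \ (A' * lambda - c);              % solve G*x - A'*lambda = -c for x
    lambda = lambda - alpha * (A * x - b);  % dual update on the residual
end
```
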
## Exercise 3.2

The MATLAB code used to find the solution can be found under section 3.2 of the
`main.m` script. The solution is:

$$x=\begin{bmatrix}2\\-1\\1\end{bmatrix} \;\;\; \lambda=
\begin{bmatrix}3\\-2\end{bmatrix}$$

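A quick cross-check is to solve the KKT system directly with backslash, which
must agree with the Uzawa iterates:

```matlab
sol = [G -A'; A zeros(2)] \ [-c; b];   % sol = [x; lambda]
```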
|