\documentclass{scrartcl}
\usepackage[utf8]{inputenc}
\usepackage{graphicx}
\usepackage{subcaption}
\usepackage{amsmath}
\usepackage{pgfplots}
\pgfplotsset{compat=newest}
\usetikzlibrary{plotmarks}
\usetikzlibrary{arrows.meta}
\usepgfplotslibrary{patchplots}
\usepackage{grffile}
\usepgfplotslibrary{external}
\tikzexternalize
\usepackage[margin=2.5cm]{geometry}

% To compile:
% sed -i 's#title style={font=\\bfseries#title style={yshift=1ex, font=\\tiny\\bfseries#' *.tex
% lualatex --shell-escape main.tex

\pgfplotsset{every x tick label/.append style={font=\tiny, yshift=0.5ex}}
\pgfplotsset{every title/.append style={font=\tiny, align=center}}
\pgfplotsset{every y tick label/.append style={font=\tiny, xshift=0.5ex}}
\pgfplotsset{every z tick label/.append style={font=\tiny, xshift=0.5ex}}
\setlength{\parindent}{0cm}
\setlength{\parskip}{0.5\baselineskip}

\title{Optimisation methods -- Homework 1}
\author{Claudio Maggioni}

\begin{document}
\maketitle
\section{Exercise 1}

\subsection{Gradient and Hessian}

The gradient and the Hessian for $f$ are the following:
\[\nabla f = \begin{bmatrix} 2x_1 + x_2 \cos(x_1) \\ 9x_2^2 + \sin(x_1) \end{bmatrix}\]
\[H_f = \begin{bmatrix} 2 - x_2 \sin(x_1) & \cos(x_1) \\ \cos(x_1) & 18x_2 \end{bmatrix}\]
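The function $f$ itself is not restated in this section; a form consistent with the gradient and Hessian above is $f(x_1, x_2) = x_1^2 + x_2 \sin(x_1) + 3x_2^3$ (an assumption reconstructed from the derivatives, not taken from the exercise text). A quick symbolic sanity check of both results, sketched in Python with \texttt{sympy}:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
# Assumed form of f, reconstructed from the gradient above (not given in the text).
f = x1**2 + x2*sp.sin(x1) + 3*x2**3

grad = sp.Matrix([sp.diff(f, v) for v in (x1, x2)])
hess = sp.hessian(f, (x1, x2))

# The hand-derived gradient and Hessian from the section above.
grad_expected = sp.Matrix([2*x1 + x2*sp.cos(x1), 9*x2**2 + sp.sin(x1)])
hess_expected = sp.Matrix([[2 - x2*sp.sin(x1), sp.cos(x1)],
                           [sp.cos(x1), 18*x2]])

print(sp.simplify(grad - grad_expected))  # zero matrix if they agree
print(sp.simplify(hess - hess_expected))  # zero matrix if they agree
```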
\subsection{Taylor expansion}

\[f(h) = 0 + \left\langle \begin{bmatrix} 0 + 0 \\ 0 + 0 \end{bmatrix}, \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} \right\rangle + \frac12 \left\langle \begin{bmatrix} 2 - 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}, \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} \right\rangle + O(\|h\|^3)\]
\[f(h) = \frac12 \left\langle \begin{bmatrix} 2h_1 + h_2 \\ h_1 \end{bmatrix}, \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} \right\rangle + O(\|h\|^3)\]
\[f(h) = \frac12 \left(2h_1^2 + 2h_1 h_2\right) + O(\|h\|^3)\]
\[f(h) = h_1^2 + h_1 h_2 + O(\|h\|^3)\]
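The expansion can also be checked numerically: for small $h$ the remainder $f(h) - (h_1^2 + h_1 h_2)$ should shrink like $\|h\|^3$. A minimal sketch in Python, again under the assumption $f(x_1, x_2) = x_1^2 + x_2 \sin(x_1) + 3x_2^3$:

```python
import math

def f(h1, h2):
    # Assumed form of f, consistent with the gradient of the previous subsection.
    return h1**2 + h2*math.sin(h1) + 3*h2**3

def taylor2(h1, h2):
    # Second order Taylor expansion derived above: h1^2 + h1*h2.
    return h1**2 + h1*h2

for t in (1e-1, 1e-2, 1e-3):
    err = abs(f(t, t) - taylor2(t, t))
    print(t, err)  # the error shrinks roughly like t**3
```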
\section{Exercise 2}

\subsection{Gradient and Hessian}

For $A$ symmetric, we have:
\[\frac{d}{dx} \langle b, x \rangle = \langle b, \cdot \rangle = b\]
\[\frac{d}{dx} \langle Ax, x \rangle = 2 \langle Ax, \cdot \rangle = 2Ax\]
Then:
\[\nabla J = Ax - b\]
\[H_J = \frac{d}{dx} \nabla J = A\]
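The closed-form gradient can be checked against central finite differences. A sketch in Python with \texttt{numpy}, assuming $J(x) = \frac12 \langle Ax, x \rangle - \langle b, x \rangle$ (the quadratic functional consistent with $\nabla J = Ax - b$); the matrix and vectors are random illustrative data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
A = M + M.T                        # symmetric A, as assumed in the text
b = rng.standard_normal(n)
x = rng.standard_normal(n)

def J(x):
    # J(x) = 1/2 <Ax, x> - <b, x>
    return 0.5 * x @ A @ x - b @ x

grad = A @ x - b                   # closed-form gradient derived above

# Central finite differences along each coordinate direction.
eps = 1e-6
fd = np.array([(J(x + eps*e) - J(x - eps*e)) / (2*eps) for e in np.eye(n)])
print(np.max(np.abs(grad - fd)))   # should be tiny
```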
\subsection{First order necessary condition}

It is a necessary condition for a minimizer $x^*$ of $J$ that:
\[\nabla J(x^*) = 0 \Leftrightarrow Ax^* = b\]

\subsection{Second order necessary condition}

It is a necessary condition for a minimizer $x^*$ of $J$ that:
\[\nabla^2 J(x^*) \geq 0 \Leftrightarrow A \text{ is positive semi-definite}\]
\subsection{Sufficient conditions}

It is a sufficient condition for $x^*$ to be a minimizer of $J$ that the first order necessary condition holds and that:
\[\nabla^2 J(x^*) > 0 \Leftrightarrow A \text{ is positive definite}\]
\subsection{Does $\min_{x \in R^n} J(x)$ have a unique solution?}

Not in general. If, for example, both $A$ and $b$ contain only zeros, then $J(x) = 0$ for all $x \in R^n$, and thus $J$ has infinitely many minimizers.

However, if $A$ is guaranteed to have full rank, then the first order necessary condition holds for exactly one point $x^*$, because the linear system $Ax^* = b$ has one and only one solution when $A$ is full rank; the minimizer is therefore unique.
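Both situations can be illustrated numerically. A sketch in Python (the matrices below are illustrative choices, not data from the exercise):

```python
import numpy as np

# Rank-deficient case: A is symmetric PSD with rank 1, b = 0,
# so every x with x[0] = x[1] is a minimizer of J.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
b = np.zeros(2)

def J(x):
    return 0.5 * x @ A @ x - b @ x

print(J(np.array([0.0, 0.0])), J(np.array([3.0, 3.0])))  # both are minimizers, J = 0

# Full-rank SPD case: the minimizer is the unique solution of Ax = b.
A2 = np.array([[2.0, 0.0], [0.0, 4.0]])
b2 = np.array([2.0, 4.0])
xstar = np.linalg.solve(A2, b2)
print(xstar)
```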
\section{Exercise 3}

\subsection{Quadratic form}

$f(x,y)$ can be written in quadratic form in the following way:
\[f(v) = \frac12 \left\langle \begin{bmatrix} 2 & 0 \\ 0 & 2\mu \end{bmatrix} v, v \right\rangle + \left\langle \begin{bmatrix} 0 \\ 0 \end{bmatrix}, v \right\rangle\]
where:
\[v = \begin{bmatrix} x \\ y \end{bmatrix}\]
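A quick numerical check that the quadratic form above reproduces $f(x,y) = x^2 + \mu y^2$ (the form of $f$ implied by the matrix $\operatorname{diag}(2, 2\mu)$), sketched in Python for one illustrative $\mu$:

```python
import numpy as np

mu = 10.0
Q = np.array([[2.0, 0.0], [0.0, 2.0*mu]])

def f(x, y):
    # f(x, y) = x^2 + mu*y^2, the function implied by the quadratic form above.
    return x**2 + mu*y**2

def quad_form(v):
    # 1/2 <Qv, v>; the linear term is zero, matching the derivation.
    return 0.5 * v @ Q @ v

x, y = 1.5, -0.3
print(f(x, y), quad_form(np.array([x, y])))  # identical values
```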
\subsection{Matlab implementation with \texttt{surf} and \texttt{contour}}

The graphs generated by MATLAB are shown below:
\resizebox{\textwidth}{!}{\input{surf.tex}}
\resizebox{\textwidth}{!}{\input{contour.tex}}
\resizebox{\textwidth}{!}{\input{yseries.tex}}
\resizebox{\textwidth}{!}{\input{norms.tex}}
Isolines get stretched along the $x$ axis as $\mu$ increases: the level sets of $f$ are ellipses whose extent along $y$ shrinks relative to their extent along $x$. For a large $\mu$, starting points far from the axes can be problematic, since a naive gradient based method picks search directions and step lengths that make the iterates zig-zag towards the minimizer, reaching it slowly.
\end{document}