\documentclass{scrartcl}
\usepackage[utf8]{inputenc}
\usepackage{graphicx}
\usepackage{subcaption}
\usepackage{amsmath}
\usepackage{pgfplots}
\pgfplotsset{compat=newest}
\usetikzlibrary{plotmarks}
\usetikzlibrary{arrows.meta}
\usepgfplotslibrary{patchplots}
\usepackage{grffile}
\usepgfplotslibrary{external}
\tikzexternalize
\usepackage[margin=2.5cm]{geometry}

% To compile:
% sed -i 's#title style={font=\\bfseries#title style={yshift=1ex, font=\\tiny\\bfseries#' *.tex
% lualatex --shell-escape main.tex

\pgfplotsset{every x tick label/.append style={font=\tiny, yshift=0.5ex}}
\pgfplotsset{every title/.append style={font=\tiny, align=center}}
\pgfplotsset{every y tick label/.append style={font=\tiny, xshift=0.5ex}}
\pgfplotsset{every z tick label/.append style={font=\tiny, xshift=0.5ex}}
\setlength{\parindent}{0cm}
\setlength{\parskip}{0.5\baselineskip}

\title{Optimisation methods -- Homework 1}
\author{Claudio Maggioni}

\begin{document}
\maketitle
\section{Exercise 1}

\subsection{Gradient and Hessian}

The gradient and the Hessian for $f$ are the following:
\[\nabla f = \begin{bmatrix} 2x_1 + x_2 \cos(x_1) \\ 9x_2^2 + \sin(x_1) \end{bmatrix}\]
\[H_f = \begin{bmatrix} 2 - x_2 \sin(x_1) & \cos(x_1) \\ \cos(x_1) & 18x_2 \end{bmatrix}\]
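The function $f$ itself is not restated in this section; a form consistent with the gradient and Hessian above is $f(x_1, x_2) = x_1^2 + x_2 \sin(x_1) + 3x_2^3$ (an assumption reconstructed from the derivatives, not taken from the exercise text). A quick symbolic sanity check of both results, sketched in Python with \texttt{sympy}:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
# Assumed form of f, reconstructed from the gradient above (not given in the text).
f = x1**2 + x2*sp.sin(x1) + 3*x2**3

grad = sp.Matrix([sp.diff(f, v) for v in (x1, x2)])
hess = sp.hessian(f, (x1, x2))

# The hand-derived gradient and Hessian from the section above.
grad_expected = sp.Matrix([2*x1 + x2*sp.cos(x1), 9*x2**2 + sp.sin(x1)])
hess_expected = sp.Matrix([[2 - x2*sp.sin(x1), sp.cos(x1)],
                           [sp.cos(x1), 18*x2]])

print(sp.simplify(grad - grad_expected))  # zero matrix if they agree
print(sp.simplify(hess - hess_expected))  # zero matrix if they agree
```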
\subsection{Taylor expansion}

\[f(h) = 0 + \left\langle \begin{bmatrix} 0 + 0 \\ 0 + 0 \end{bmatrix}, \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} \right\rangle + \frac12 \left\langle \begin{bmatrix} 2 - 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}, \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} \right\rangle + O(\|h\|^3)\]
\[f(h) = \frac12 \left\langle \begin{bmatrix} 2h_1 + h_2 \\ h_1 \end{bmatrix}, \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} \right\rangle + O(\|h\|^3)\]
\[f(h) = \frac12 \left(2h_1^2 + 2h_1 h_2\right) + O(\|h\|^3)\]
\[f(h) = h_1^2 + h_1 h_2 + O(\|h\|^3)\]
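The expansion can also be checked numerically: for small $h$ the remainder $f(h) - (h_1^2 + h_1 h_2)$ should shrink like $\|h\|^3$. A minimal sketch in Python, again under the assumption $f(x_1, x_2) = x_1^2 + x_2 \sin(x_1) + 3x_2^3$:

```python
import math

def f(h1, h2):
    # Assumed form of f, consistent with the gradient of the previous subsection.
    return h1**2 + h2*math.sin(h1) + 3*h2**3

def taylor2(h1, h2):
    # Second order Taylor expansion derived above: h1^2 + h1*h2.
    return h1**2 + h1*h2

for t in (1e-1, 1e-2, 1e-3):
    err = abs(f(t, t) - taylor2(t, t))
    print(t, err)  # the error shrinks roughly like t**3
```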
\section{Exercise 2}

\subsection{Gradient and Hessian}

For $A$ symmetric, we have:
\[\frac{d}{dx} \langle b, x \rangle = \langle b, \cdot \rangle = b\]
\[\frac{d}{dx} \langle Ax, x \rangle = 2 \langle Ax, \cdot \rangle = 2Ax\]
Then:
\[\nabla J = Ax - b\]
\[H_J = \frac{d}{dx} \nabla J = A\]
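The closed-form gradient can be checked against central finite differences. A sketch in Python with \texttt{numpy}, assuming $J(x) = \frac12 \langle Ax, x \rangle - \langle b, x \rangle$ (the quadratic functional consistent with $\nabla J = Ax - b$); the matrix and vectors are random illustrative data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
A = M + M.T                        # symmetric A, as assumed in the text
b = rng.standard_normal(n)
x = rng.standard_normal(n)

def J(x):
    # J(x) = 1/2 <Ax, x> - <b, x>
    return 0.5 * x @ A @ x - b @ x

grad = A @ x - b                   # closed-form gradient derived above

# Central finite differences along each coordinate direction.
eps = 1e-6
fd = np.array([(J(x + eps*e) - J(x - eps*e)) / (2*eps) for e in np.eye(n)])
print(np.max(np.abs(grad - fd)))   # should be tiny
```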
\subsection{First order necessary condition}

It is a necessary condition for a minimizer $x^*$ of $J$ that:
\[\nabla J(x^*) = 0 \Leftrightarrow Ax^* = b\]

\subsection{Second order necessary condition}

It is a necessary condition for a minimizer $x^*$ of $J$ that:
\[\nabla^2 J(x^*) \geq 0 \Leftrightarrow A \text{ is positive semi-definite}\]
\subsection{Sufficient conditions}

It is a sufficient condition for $x^*$ to be a minimizer of $J$ that the first order necessary condition holds and that:
\[\nabla^2 J(x^*) > 0 \Leftrightarrow A \text{ is positive definite}\]
\subsection{Does $\min_{x \in R^n} J(x)$ have a unique solution?}

Not in general. If, for example, both $A$ and $b$ contain only zeros, then $J(x) = 0$ for all $x \in R^n$, and thus $J$ has infinitely many minimizers.

However, if $A$ is guaranteed to have full rank, then the first order necessary condition holds for exactly one point $x^*$, because the linear system $Ax^* = b$ has one and only one solution when $A$ is full rank; the minimizer is therefore unique.
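Both situations can be illustrated numerically. A sketch in Python (the matrices below are illustrative choices, not data from the exercise):

```python
import numpy as np

# Rank-deficient case: A is symmetric PSD with rank 1, b = 0,
# so every x with x[0] = x[1] is a minimizer of J.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
b = np.zeros(2)

def J(x):
    return 0.5 * x @ A @ x - b @ x

print(J(np.array([0.0, 0.0])), J(np.array([3.0, 3.0])))  # both are minimizers, J = 0

# Full-rank SPD case: the minimizer is the unique solution of Ax = b.
A2 = np.array([[2.0, 0.0], [0.0, 4.0]])
b2 = np.array([2.0, 4.0])
xstar = np.linalg.solve(A2, b2)
print(xstar)
```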
\section{Exercise 3}

\subsection{Quadratic form}

$f(x,y)$ can be written in quadratic form in the following way:
\[f(v) = \frac12 \left\langle \begin{bmatrix} 2 & 0 \\ 0 & 2\mu \end{bmatrix} v, v \right\rangle + \left\langle \begin{bmatrix} 0 \\ 0 \end{bmatrix}, v \right\rangle\]
where:
\[v = \begin{bmatrix} x \\ y \end{bmatrix}\]
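A quick numerical check that the quadratic form above reproduces $f(x,y) = x^2 + \mu y^2$ (the form of $f$ implied by the matrix $\operatorname{diag}(2, 2\mu)$), sketched in Python for one illustrative $\mu$:

```python
import numpy as np

mu = 10.0
Q = np.array([[2.0, 0.0], [0.0, 2.0*mu]])

def f(x, y):
    # f(x, y) = x^2 + mu*y^2, the function implied by the quadratic form above.
    return x**2 + mu*y**2

def quad_form(v):
    # 1/2 <Qv, v>; the linear term is zero, matching the derivation.
    return 0.5 * v @ Q @ v

x, y = 1.5, -0.3
print(f(x, y), quad_form(np.array([x, y])))  # identical values
```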
\subsection{Matlab implementation with \texttt{surf} and \texttt{contour}}

The graphs generated by MATLAB are shown below:
\resizebox{\textwidth}{!}{\input{surf.tex}}
\resizebox{\textwidth}{!}{\input{contour.tex}}
\resizebox{\textwidth}{!}{\input{yseries.tex}}
\resizebox{\textwidth}{!}{\input{norms.tex}}
Isolines get stretched along the $x$ axis as $\mu$ increases: the level sets of $f$ are ellipses whose extent along $y$ shrinks relative to their extent along $x$. For a large $\mu$, starting points far from the axes can be problematic, since a naive gradient based method picks search directions and step lengths that make the iterates zig-zag towards the minimizer, reaching it slowly.
\end{document}