Suggest edit — SU-CS361 APR042024

Title

Name

Note

---
title: "SU-CS361 APR042024"
source: https://www.jemoka.com/posts/kbhsu_cs361_apr042024/
date: 2024-04-04
---

optimization inequalities cannot be strict Consider:
\begin{align} \min_{x}&amp;\ x \\ s.t.\ &amp; x &gt; 1 \end{align}
this has NO SOLUTION. (1,1) wouldn&rsquo;t actually be in feasible set. So, we usually specify optimization without a strict inequality.
So, instead, we write:
\begin{align} \min_{x}&amp;\ x \\ s.t.\ &amp; x \geq 1 \end{align}
Univariate Conditions First order Necessary Condition \begin{equation} \nabla f(x^{*}) = 0 \end{equation}
Second order necessary condition \begin{equation} \nabla^{2}f(x^{*}) \geq 0 \end{equation}
Derivative \begin{equation} f&rsquo;(x) = \frac{\Delta f(x)}{\Delta x} \end{equation}
Or gradient; our convention is that gradients are a COLUMN vector&mdash;
\begin{equation} \nabla f(x) = \mqty(\pdv{f(x)}{x_1} \\ \pdv{f(x)}{x_2} \\ \dots \\ \pdv{f(x)}{x_{n}}) \end{equation}
Hessian matrix (2nd order partial); its just this, where columns are the second index and rows are the first index.
Directional Derivative \begin{align} \nabla_{s} f(x) &amp;= \lim_{h \to 0} \frac{f(x+hs) - f(x)}{h} \\ &amp;= \lim_{h \to 0} \frac{f(x+\frac{hs}{2}) - f(x- \frac{hs}{2})}{h} \end{align}
i.e. this is &ldquo;derivative along a direction&rdquo;
Numerical Method Finite-Difference Method All of these methods suffer from the fact that $f(x+h) - f(x)$ cancels out at small values of $x$ and $h$, because of floating point errors. To fix this, use Complex-Difference Method.
Forward Difference Recall the Taylor Series about $f(x+h)$:
\begin{equation} f(x+h) = f(x) + \frac{f&rsquo;(x)}{1} h + \frac{f&rsquo;&rsquo;(x)}{2!} h^{2} + \dots \end{equation}
Moving it around to get $f&rsquo;(x)$ by itself:
\begin{equation} f&rsquo;(x)h = f(x+h) - f(x) - \frac{f&rsquo;&rsquo;(x)}{2!} h^{2} - \dots \end{equation}
So:
\begin{equation} f&rsquo;(x) \approx \frac{f(x+h)-f(x)}{h} \end{equation}
where $&hellip;$ errors in the end at $O(h)$. So:
\begin{equation} f&rsquo;(x) = \lim_{h \to 0}\frac{f(x+h)-f(x)}{h} \end{equation}
Error Analysis
$\frac{f&rsquo;&rsquo;(x)}{2!}h + &hellip; h^{n}$, the biggest error term lives with $h$, so this scheme has asymtotic error at $O(h)$.
Central Difference Slightly different formulation, which gives quadratic error $O(h^{2})$, because the non-squared $h$ term cancels out:
\begin{equation} f&rsquo;(x)= \lim_{h \to 0}\frac{f\qty(x+\frac{h}{2})-f\qty(x-\frac{h}{2})}{h} \end{equation}
Backward Difference Forward difference, backward:
\begin{equation} f&rsquo;(x) = \lim_{h \to 0} \frac{f(x)-f(x-h)}{h} \end{equation}
Complex-Difference Method Consider a Taylor approximation of complex difference:
\begin{equation} f(x + ih) = f(x) + ih f&rsquo;(x) - h^{2} \frac{f&rsquo;&rsquo;(x)}{2!} - ih^{3} \frac{f&rsquo;&rsquo;&rsquo;(x)}{3!} + \dots \end{equation}
Let&rsquo;s again try to get $f&rsquo;(x)$ by itself; rearranging and thinking for a bit, we will get every other term on the expression above:
\begin{equation} f&rsquo;(x) = \frac{\Im (f(x+ih))}{h} + \dots \end{equation}
Where the $&hellip;$ errors is at $O(h^{2})$
NOTICE: we no longer have the cancellation error because we no longer have subtraction.
Automatic Differentiation See Automatic Differentiation
Bracketing Given a unimodal function, the global minimum is guaranteed to be within $[a,c]$ with $b \in [a,c]$ if we have that $f(a) &gt; f(b) &lt; f( c)$.
So let&rsquo;s find this bracket.
Unimodality A function $f$ is unimodal if:
$\exists$ unique $x^{*}$ such that $f$ is monotonically decreasing for $x \leq x^{*}$ and monotonically increasing for $x \geq x^{*}$
Bracketing Procedure If we don&rsquo;t know anything, we might as well start with $a=-1, b=0, c=1$.
One of three things:
we already satisfy $f(a) &gt; f(b) &lt; f( c)$, well, we are done if our left side $f(a)$ is too low, we will move $a$ to the left without moving $c$&mdash;doubling the step size every time until it works if our right side is too low to the other thing, move it too, &hellip; Fibonacci Search Say you wanted to evaluate your sequence a finite number of times to maximally lower the interval for bracketing.
Two Evaluations
At two evaluations, you should pick two points right down the middle very close together; this will cut your interval in half.
$n$ evaluations
Evaluate intervals with lengths
\begin{equation} F_{n} = \begin{cases} 1, n\leq 2 \\ F_{n-1} + F_{n-2} \end{cases} \end{equation}
as in; say you are allowed $n$ evaluations; figure the sequence $\{F_1, \dots, F_{n}\}$, and then partition your space between $a$ and $b$ into $F_{n}$ slices; evaluate at locations $\frac{F_{n-1}}{F_{n}}$ and $1- \frac{F_{n-1}}{F_{n}}$, and lop off the half-line which is to the extrema of the higher point.
Golden Section Search Because of Binet&rsquo;s Formula, we can write:
\begin{equation} \lim_{n \to \infty} \frac{F_{n-1}}{F_{n}} = \frac{1}{\varphi} \end{equation}
the inverse of the the golden ratio. So just shrink intervals by that instead.