---
title: "sigmoid"
source: https://www.jemoka.com/posts/kbhsigmoid/
---

The sigmoid function squashes any real number into the open interval \((0, 1)\), so its output can be read as a probability. It is symmetric about the point \((0, \tfrac{1}{2})\), meaning \(\sigma(-z) = 1 - \sigma(z)\).
\begin{equation} \sigma(z) = \frac{1}{1+ e^{-z}} \end{equation}
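A minimal Python sketch of the definition above, in case it helps to see it evaluated. The function name `sigmoid` and the sign-based branching (used only to avoid floating-point overflow for very negative inputs) are choices made for this illustration, not part of the original note.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic sigmoid 1 / (1 + e^{-z})."""
    if z >= 0:
        # exp(-z) is at most 1 here, so it cannot overflow
        return 1.0 / (1.0 + math.exp(-z))
    # For z < 0 use the algebraically equal form e^{z} / (1 + e^{z}),
    # because exp(-z) could overflow for very negative z
    ez = math.exp(z)
    return ez / (1.0 + ez)

if __name__ == "__main__":
    for z in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(z, sigmoid(z))
```

Every output lies between 0 and 1, and `sigmoid(z) + sigmoid(-z)` comes out to 1 up to rounding, which is the symmetry mentioned earlier.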
Say you have one discrete variable \(X\), and one continuous variable \(Y\), and you desire to express \(P(x \mid y)\).
The simplest way to do this, of course, is to say something like:
\begin{equation} P(x^{j} \mid y) = \begin{cases} 0, & y < \theta \\ 1, & y > \theta \end{cases} \end{equation}
whereby \(P(x^{j} \mid y)\) jumps from \(0\) to \(1\) as \(y\) crosses the threshold \(\theta\). But we often don't want such a hard cap.
To soften this, we can use a sigmoid model:
\begin{equation} P(x^{1} \mid y) = \frac{1}{1 + \exp \qty(-2 \frac{y-\theta_{1}}{\theta_{2}})} \end{equation}
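whereby \(\theta_{1}\) is where the threshold of activation sits (at \(y = \theta_{1}\) the probability is exactly \(\tfrac{1}{2}\)), and \(\theta_{2}\) is how soft you want the spread to be: a larger \(\theta_{2}\) gives a more gradual transition, while a small positive \(\theta_{2}\) approaches the hard threshold.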
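To get a feel for the two parameters, here is a small Python sketch of the sigmoid model above. The function name `soft_threshold`, the argument names, and the sample values are hypothetical and chosen only for illustration; it assumes \(\theta_{2} > 0\).

```python
import math

def soft_threshold(y: float, theta1: float, theta2: float) -> float:
    """P(x^1 | y) = 1 / (1 + exp(-2 (y - theta1) / theta2)), assuming theta2 > 0."""
    z = 2.0 * (y - theta1) / theta2
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Equivalent form for z < 0, avoiding overflow in exp(-z)
    ez = math.exp(z)
    return ez / (1.0 + ez)

if __name__ == "__main__":
    theta1 = 3.0  # illustrative threshold location
    for theta2 in (0.1, 1.0, 10.0):  # illustrative softness values
        probs = [soft_threshold(y, theta1, theta2) for y in (2.0, 3.0, 4.0)]
        print(theta2, probs)
```

At `y == theta1` the result is always one half, and shrinking `theta2` pushes the values on either side of the threshold toward 0 and 1.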
The derivative of the plain sigmoid \(\sigma(z)\) is also dead simple:
\begin{equation} \dv{\sigma(z)}{z} = \sigma(z) (1-\sigma(z)) \end{equation}
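For completeness, a sketch of why this holds, using only the chain rule and the definition of \(\sigma(z)\):
\begin{equation} \dv{\sigma(z)}{z} = \dv{z} \qty(1 + e^{-z})^{-1} = \frac{e^{-z}}{\qty(1 + e^{-z})^{2}} = \frac{1}{1 + e^{-z}} \cdot \frac{e^{-z}}{1 + e^{-z}} = \sigma(z) \qty(1 - \sigma(z)) \end{equation}
where the last step uses \(1 - \sigma(z) = \frac{e^{-z}}{1 + e^{-z}}\).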