inference
Inference is the act of updating the distribution of a random variable based on the distribution of the variables actually observed:
\begin{equation} P(X \mid Y) \end{equation}
where Y is observed, and we want to know how likely X therefore is.
We call the set X the “query variables”, the set Y the “evidence variables”, and any unobserved variables that connect the two the “hidden variables”.
If X and Y are not in the right order (you have P(Y | X) but want P(X | Y)), consider Bayes' rule.
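Written out for the query and evidence variables, and assuming P(Y) > 0, Bayes' rule reads:
\begin{equation} P(X \mid Y) = \frac{P(Y \mid X) P(X)}{P(Y)} \end{equation}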
Inference is Hard
mix of continuous and discrete distributions
results could be either a PMF or a PDF
Example
Suppose we’d like to know P(b^{1} | d^{1}, c^{1}), where b^{1} is the query variable, and d^{1} and c^{1} are the evidence variables. The definition of conditional probability gives us:
\begin{equation} P(b^{1} \mid d^{1}, c^{1}) = \frac{P(b^{1}, d^{1}, c^{1})}{P(d^{1}, c^{1})} \end{equation}
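To compute P(b^{1}, d^{1}, c^{1}), we first write the full joint with the chain rule. Assuming the hidden variables are S and E, with E depending on B and S, and D and C each depending on E:
\begin{equation} P(b^{1}, s, e, d^{1}, c^{1}) = P(b^{1}) P(s) P(e \mid b^{1}, s) P(d^{1} \mid e) P(c^{1} \mid e) \end{equation}
and then use the law of total probability to sum out the hidden variables:
\begin{equation} P(b^{1}, d^{1}, c^{1}) = \sum_{s} \sum_{e} P(b^{1}) P(s) P(e \mid b^{1}, s) P(d^{1} \mid e) P(c^{1} \mid e) \end{equation}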
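You will note this is computationally very expensive: the nested sums visit every joint assignment of the hidden variables (here, every combination of e and s), so the cost grows exponentially with the number of hidden variables. If you have something like 1000 hidden variables, you will die.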
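The brute-force computation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: all variables are binary, the network is B, S -> E -> D, C, and every probability value and function name below is hypothetical rather than taken from these notes.

```python
import itertools

# Hypothetical probabilities, chosen only for illustration.
P_B = {0: 0.99, 1: 0.01}                       # P(B)
P_S = {0: 0.98, 1: 0.02}                       # P(S)
P_E1 = {(0, 0): 0.10, (0, 1): 0.90,            # P(E = 1 | b, s)
        (1, 0): 0.90, (1, 1): 0.99}
P_D1 = {0: 0.05, 1: 0.90}                      # P(D = 1 | e)
P_C1 = {0: 0.02, 1: 0.95}                      # P(C = 1 | e)


def bern(p1, value):
    """Probability that a binary variable equals `value` when P(1) = p1."""
    return p1 if value == 1 else 1.0 - p1


def joint(b, s, e, d, c):
    """Chain-rule joint P(b, s, e, d, c) for the assumed structure."""
    return (P_B[b] * P_S[s] * bern(P_E1[(b, s)], e)
            * bern(P_D1[e], d) * bern(P_C1[e], c))


def query_b_given_dc(d, c):
    """P(B | d, c) by summing the joint over every hidden assignment (s, e)."""
    unnorm = {}
    for b in (0, 1):
        unnorm[b] = sum(joint(b, s, e, d, c)
                        for s, e in itertools.product((0, 1), repeat=2))
    z = sum(unnorm.values())  # this is P(d, c), the denominator
    return {b: p / z for b, p in unnorm.items()}


posterior = query_b_given_dc(d=1, c=1)
```

With k binary hidden variables, the inner sum runs over 2^k assignments, which is what makes this approach blow up.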
We therefore introduce sum-product elimination.
sum-product elimination
You will note the summation in the example above has a lot of interlocking for loops. You can “factor them out” via the sum-product elimination algorithm.
Suppose you are interested in the same query as in the example above:
\begin{equation} P(b^{1} \mid d^{1}, c^{1}) \end{equation}
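Step 1: write down factors
Write down all factors associated with this computation, one per conditional distribution in the network:
\begin{equation} \phi_{1}(B), \quad \phi_{2}(S), \quad \phi_{3}(E,B,S), \quad \phi_{4}(D,E), \quad \phi_{5}(C,E) \end{equation}
We have evidence at two variables: D and C.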
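Step 2: perform factor conditioning for all evidence variables
Because we observed d and c, \phi_{4} and \phi_{5} can be replaced by their conditioned versions: we fix D = d^{1} and C = c^{1}, so the new factors no longer need d, c as input because we know them.
To replace \phi_{4} and \phi_{5}, we now have:
\begin{equation} \phi_{6}(E) = \phi_{4}(d^{1}, E), \quad \phi_{7}(E) = \phi_{5}(c^{1}, E) \end{equation}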
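Factor conditioning on its own can be sketched as a simple table filter. The table for \phi_{4} below is hypothetical (keyed by (d, e), with illustrative values), and `condition_first` is a made-up helper name, not part of any library.

```python
# Hypothetical table for phi_4(D, E); keys are (d, e), values are illustrative.
phi4 = {(0, 0): 0.95, (1, 0): 0.05, (0, 1): 0.10, (1, 1): 0.90}


def condition_first(table, observed):
    """Fix the first variable of a two-variable factor to `observed`.

    Rows that disagree with the evidence are dropped, and the observed
    variable is removed from the key, leaving a factor over the other one.
    """
    return {rest: value
            for (first, rest), value in table.items()
            if first == observed}


# phi_6(E) = phi_4(d^1, E): only the rows with d = 1 survive.
phi6 = condition_first(phi4, observed=1)
```

The conditioned factor has half as many rows as the original, and nothing downstream needs to mention D again.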
Step 3: using the law of total probability and factor product, get rid of hidden variables
We then choose an ordering of the hidden variables and apply a factor product using the law of total probability to get rid of them:
First get rid of any hidden variables, one at a time: multiply the factors that mention the variable, then sum it out
Then use factor product to combine the remaining results
Eliminating E first:
\begin{equation} \phi_{8}(B,S) = \sum_{e} \phi_{3}(e,B,S) \phi_{6}(e) \phi_{7}(e) \end{equation}
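Eliminating S the same way (with \phi_{2} the factor over S):
\begin{equation} \phi_{9}(B) = \sum_{s} \phi_{2}(s) \phi_{8}(B,s) \end{equation}
We now only have two factors left: \phi_{1}(B) and \phi_{9}(B). We finally apply factor product again and normalize:
\begin{equation} P(B \mid d^{1}, c^{1}) \propto \phi_{1}(B) \phi_{9}(B) \end{equation}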
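The three steps can be put together in a small Python sketch. It assumes binary variables, represents a factor as a hypothetical (variables, table) pair, and uses made-up probability values and helper names (`cpt`, `condition`, `product`, `marginalize`, `eliminate`); it is one possible way to write the procedure, not a reference implementation.

```python
import itertools

# A factor is (variables, table): `variables` is a tuple of names, and
# `table` maps each 0/1 assignment tuple (same order) to a number.


def cpt(variables, p1):
    """Factor for P(child | parents); p1 maps parents -> P(child = 1)."""
    table = {}
    for parents, p in p1.items():
        table[(1,) + parents] = p
        table[(0,) + parents] = 1.0 - p
    return variables, table


def condition(factor, var, value):
    """Step 2: keep rows where var == value and drop var from the factor."""
    variables, table = factor
    if var not in variables:
        return factor
    i = variables.index(var)
    rows = {a[:i] + a[i + 1:]: p for a, p in table.items() if a[i] == value}
    return variables[:i] + variables[i + 1:], rows


def product(f, g):
    """Factor product over the union of the two variable sets."""
    (f_vars, f_table), (g_vars, g_table) = f, g
    new_vars = f_vars + tuple(v for v in g_vars if v not in f_vars)
    table = {}
    for a in itertools.product((0, 1), repeat=len(new_vars)):
        env = dict(zip(new_vars, a))
        table[a] = (f_table[tuple(env[v] for v in f_vars)]
                    * g_table[tuple(env[v] for v in g_vars)])
    return new_vars, table


def marginalize(factor, var):
    """Sum var out of the factor (law of total probability)."""
    variables, table = factor
    i = variables.index(var)
    out = {}
    for a, p in table.items():
        key = a[:i] + a[i + 1:]
        out[key] = out.get(key, 0.0) + p
    return variables[:i] + variables[i + 1:], out


def eliminate(factors, evidence, hidden_order):
    """Condition on evidence, eliminate hidden variables, then normalize."""
    for var, value in evidence.items():
        factors = [condition(f, var, value) for f in factors]
    for var in hidden_order:
        touching = [f for f in factors if var in f[0]]
        if not touching:
            continue
        rest = [f for f in factors if var not in f[0]]
        combined = touching[0]
        for f in touching[1:]:
            combined = product(combined, f)
        factors = rest + [marginalize(combined, var)]
    result = factors[0]
    for f in factors[1:]:
        result = product(result, f)
    variables, table = result
    z = sum(table.values())
    return variables, {a: p / z for a, p in table.items()}


# Hypothetical numbers, for illustration only.
phi1 = cpt(("B",), {(): 0.01})
phi2 = cpt(("S",), {(): 0.02})
phi3 = cpt(("E", "B", "S"),
           {(0, 0): 0.10, (0, 1): 0.90, (1, 0): 0.90, (1, 1): 0.99})
phi4 = cpt(("D", "E"), {(0,): 0.05, (1,): 0.90})
phi5 = cpt(("C", "E"), {(0,): 0.02, (1,): 0.95})

variables, posterior = eliminate([phi1, phi2, phi3, phi4, phi5],
                                 evidence={"D": 1, "C": 1},
                                 hidden_order=["E", "S"])
```

Here `posterior` is a table over B alone. The elimination order (E, then S) matches the walkthrough; other orders give the same answer but can create larger intermediate factors.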
Approximate Inference
See Approximate Inference
Gaussian Inference
See Inference for Gaussian Models