@jemoka / Jemoka Knowledge Base / wiki/concepts/cross_entropy_method.md
---
title: "Cross Entropy Method"
type: concept
related: [Gaussian Distribution, Utility Theory]
source: https://www.jemoka.com/posts/kbhcross_entropy_method/
confidence: high
status: active
---

This method searches with a distribution over parameters instead of a set of discrete points:

\begin{equation}
p(\theta | \psi)
\end{equation}

Here \(\psi\) parameterizes how the policy parameters \(\theta\) are distributed (for instance, if we assume \(\theta\) is Gaussian distributed, \(\psi\) is the mean and variance).

1. Given this distribution, we draw \(m\) samples of \(\theta\) from it. Those are our starting candidate points.
2. We evaluate each candidate's policy for its utility via the Roll-out utility.
3. We keep the top \(k = m_{elite}\) performers, called “elite samples”.
4. Using the set of \(m_{elite}\) points, we fit a new distribution parameter \(\psi\) that describes those samples.

This lets us bound how many Roll-out utility evaluations we perform, since each round of sampling costs exactly \(m\) of them.

As a rule of thumb, use about 10 elite samples per dimension of \(\theta\) (1D should have 10 elite samples, 2D should have 20, etc.).
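The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the note's own implementation: the search distribution is taken to be a diagonal Gaussian (so \(\psi\) is a per-dimension mean and standard deviation), the loop repeats for a fixed number of iterations, and `toy_utility` is a hypothetical stand-in for the Roll-out utility. All function and parameter names are made up for this example.

```python
import random
import statistics


def cross_entropy_method(utility, mean, std, m=100, m_elite=20, iterations=30, seed=0):
    """Maximize `utility` with a diagonal-Gaussian search distribution (assumed form)."""
    rng = random.Random(seed)
    mean, std = list(mean), list(std)
    for _ in range(iterations):
        # Draw m candidate parameter vectors theta ~ p(theta | psi).
        samples = [
            [rng.gauss(mu, sigma) for mu, sigma in zip(mean, std)]
            for _ in range(m)
        ]
        # Score every candidate once and keep the m_elite best performers.
        elite = sorted(samples, key=utility, reverse=True)[:m_elite]
        # Refit psi (mean and standard deviation per dimension) to the elites.
        columns = list(zip(*elite))
        mean = [statistics.fmean(col) for col in columns]
        # Small floor (an assumption) keeps the distribution from collapsing to zero width.
        std = [statistics.pstdev(col) + 1e-6 for col in columns]
    return mean, std


def toy_utility(theta):
    # Hypothetical stand-in for a Roll-out utility; its maximum is at (3, -1).
    return -((theta[0] - 3.0) ** 2) - (theta[1] + 1.0) ** 2


# theta is 2D, so the 10-per-dimension rule of thumb suggests 20 elite samples.
best_mean, best_std = cross_entropy_method(
    toy_utility, mean=[0.0, 0.0], std=[5.0, 5.0], m=100, m_elite=20, iterations=30
)
```

With these settings the utility is evaluated `m * iterations` times in total, which is the bound on roll-outs that the method provides.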