@jemoka / Jemoka Knowledge Base / wiki/concepts/policy_optimization.md
---
title: "Policy Optimization"
type: concept
related: [Policy, Cross Entropy Method, Policy Iteration, Local Policy Search, Policy Optimization]
source: https://www.jemoka.com/posts/kbhpolicy_optimization/
confidence: high
status: active
---

Policy Optimization covers algorithms that directly optimize the parameters \(\theta\) of a policy \(\pi_{\theta}\). This is unlike value iteration, policy iteration, and online planning, which use a surrogate (such as a value function or some future discounted reward) to calculate a policy.

- Local Policy Search (aka Hooke-Jeeves Policy Search)
- Genetic Policy Search
- Cross Entropy Method
- Policy Gradient, Regression Gradient and Likelihood Ratio Gradient
- Reward-to-Go
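
To make "optimizing directly against \(\theta\)" concrete, here is a minimal sketch of a Hooke-Jeeves-style local policy search. Everything in it is an assumption made for illustration and does not come from the note: the toy 1-D dynamics, the linear policy \(a = \theta_0 s + \theta_1\), the reward, and the names `rollout_return` and `local_policy_search`. The search never builds a value function; it only evaluates the utility of candidate parameters by rollout and moves to the best neighbor.

```python
def rollout_return(theta, s0=1.0, depth=10, gamma=0.9):
    """Discounted return of the linear policy a = theta[0]*s + theta[1].

    The task is a hypothetical toy problem: drive the state s toward 0
    while spending little effort. Dynamics and reward are assumptions.
    """
    s, total = s0, 0.0
    for t in range(depth):
        a = theta[0] * s + theta[1]          # policy pi_theta(s)
        s = s + a                            # deterministic toy dynamics
        total += (gamma ** t) * -(s * s + 0.1 * a * a)
    return total


def local_policy_search(utility, theta, step=1.0, shrink=0.5, tol=1e-4,
                        max_iters=10_000):
    """Hooke-Jeeves-style search over policy parameters.

    Each iteration evaluates +/- step along every coordinate of theta,
    moves to the best neighbor if it improves the utility, and otherwise
    shrinks the step. Stops once the step falls below tol.
    """
    theta = list(theta)
    best = utility(theta)
    for _ in range(max_iters):
        if step <= tol:
            break
        best_neighbor, best_neighbor_u = None, best
        for i in range(len(theta)):
            for delta in (step, -step):
                candidate = list(theta)
                candidate[i] += delta
                u = utility(candidate)
                if u > best_neighbor_u:
                    best_neighbor, best_neighbor_u = candidate, u
        if best_neighbor is None:
            step *= shrink                   # no neighbor improved: refine
        else:
            theta, best = best_neighbor, best_neighbor_u
    return theta, best


# Start from the "do nothing" policy and search directly in parameter space.
theta_star, best_u = local_policy_search(rollout_return, [0.0, 0.0])
print(theta_star, best_u)
```

Because the toy rollout is deterministic, a single rollout is enough to score a candidate. In a stochastic problem the utility would presumably be estimated by averaging several rollouts, which makes comparisons between neighbors noisy.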