@jemoka / Jemoka Knowledge Base / raw/concept/kbhg_dice.md
---
title: "G-DICE"
source: https://www.jemoka.com/posts/kbhg_dice/
---

## Motivation

It's the same. It hasn't changed: curses of dimensionality and history.

Goal: to solve decentralized multi-agent MDPs.

## Key Insights

1. macro-actions (MAs) to reduce computational complexity (like hierarchical planning)
2. uses cross entropy to make the infinite-horizon problem tractable

## Prior Approaches

- masked Monte Carlo search (MMCS): heuristic based, no optimality guarantees
- MCTS: poor performance

## Direct Cross Entropy

see also Cross Entropy Method

1. sample a value function \(k\)
2. take the \(n\) highest sampled values
3. update parameter \(\theta\)
4. resample until the distribution converges
5. take the best sample \(x\)

## G-DICE

1. create a graph with an exogenous number \(N\) of nodes and \(O\) outgoing edges (designed beforehand)
2. use Direct Cross Entropy to solve for the best policy

### Results

1. demonstrates improved performance over MMCS and MCTS
2. does not need robot communication
3. guarantees convergence for both finite and infinite horizon
4. can choose an exogenous number of nodes in order to gain computational savings
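
The Direct Cross Entropy steps can be sketched as a small search over a graph-shaped policy. This is only a minimal illustration under assumptions, not the method as published: the function name `cross_entropy_search`, the categorical parameterization of \(\theta\) (one row of choice probabilities per graph node), the smoothing factor `alpha`, and the `score` callback (standing in for evaluating a sampled policy, for example by simulation) are all hypothetical.

```python
import numpy as np

def cross_entropy_search(score, n_nodes, n_choices,
                         k=100, n_elite=10, iters=40, alpha=0.5, seed=0):
    """Cross-entropy search over one discrete choice per graph node.

    Assumption: theta is an (n_nodes, n_choices) table of categorical
    probabilities, and `score` maps a candidate (one choice per node)
    to a number where higher is better.
    """
    rng = np.random.default_rng(seed)
    # Start from a uniform distribution at every node.
    theta = np.full((n_nodes, n_choices), 1.0 / n_choices)
    best_x, best_val = None, -np.inf

    for _ in range(iters):
        # 1. Sample k candidates; column i holds the choices for node i.
        samples = np.stack(
            [rng.choice(n_choices, size=k, p=theta[i]) for i in range(n_nodes)],
            axis=1,
        )
        values = np.array([score(x) for x in samples])

        # Track the best sample seen so far.
        top = int(values.argmax())
        if values[top] > best_val:
            best_val, best_x = values[top], samples[top].copy()

        # 2. Keep the n_elite highest-valued samples.
        elite = samples[np.argsort(values)[-n_elite:]]

        # 3. Update theta toward the elite frequencies (smoothed by alpha).
        freq = np.stack(
            [np.bincount(elite[:, i], minlength=n_choices) for i in range(n_nodes)]
        ) / n_elite
        theta = (1 - alpha) * theta + alpha * freq

        # 4. Stop once every node's distribution is nearly deterministic.
        if theta.max(axis=1).min() > 0.99:
            break

    # 5. Return the best sample found, its value, and the final theta.
    return best_x, best_val, theta
```

As a toy usage, `score` could count how many nodes match a hidden target assignment, for instance `cross_entropy_search(lambda x: int(np.sum(x == np.array([2, 0, 1, 3]))), n_nodes=4, n_choices=4)`. In a G-DICE-like setting the score would instead come from evaluating the joint policy, which this sketch does not attempt.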