@jemoka / Jemoka Knowledge Base / wiki/concepts/cpomdp.md
---
title: "CPOMDP"
type: concept
related: [Partially Observable Markov Decision Process, Cpomdp]
source: https://www.jemoka.com/posts/kbhcpomdp/
confidence: high
status: active
---

A CPOMDP, or Constrained Partially Observable Markov Decision Process, gives the system two things to account for: a reward function \(r(s,a)\) to maximize and a set of constraints \(c(s,a) \geq 0\) to satisfy.

Specifically, we formulate it as a POMDP \((S, A, \Omega, T, O, R)\) with an additional set of constraints \(\bold{C}\) and budgets \(\beta\).

We seek to maximize the infinite-horizon discounted reward \(\mathbb{E}_{t} \qty[R(s_{t}, a_{t})]\), subject to:

\begin{equation}
C_{i}(s,a) \leq \beta_{i}, \quad \forall\, C_{i} \in \bold{C},\ \beta_{i} \in \beta
\end{equation}
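
To make the formulation concrete, here is a minimal Python sketch that scores a fixed policy on both objectives and checks it against a budget. Everything in it is an illustrative assumption rather than part of the note: the two-state model (`T`, `O`, `R`, `C`), the discount `GAMMA`, the initial belief `B0`, the single budget `BUDGET`, and the helper names. It also assumes that observations depend only on the current state, that the policy is memoryless (it maps the current observation to an action), and that the constraint is read as a bound on the expected discounted cumulative cost, which is a common reading that the note does not spell out.

```python
from itertools import product

GAMMA = 0.9          # discount factor (assumed)
BUDGET = 4.0         # budget beta for the single cost function (assumed)
B0 = [0.5, 0.5]      # initial belief over the two states (assumed)

# Hypothetical model: T[s][a][s2], O[s][o], R[s][a], C[s][a]
T = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.7, 0.3], [0.1, 0.9]],
]
O = [[0.8, 0.2], [0.3, 0.7]]
R = [[0.0, 1.0], [0.5, 2.0]]
C = [[0.0, 1.0], [0.0, 1.0]]   # taking action 1 costs 1 per step


def discounted_value(policy, payoff, iters=1000):
    """Expected discounted sum of `payoff` under policy[o][a], starting from B0."""
    n_s, n_a, n_o = len(T), len(T[0]), len(O[0])
    # Action probabilities per state induced by the observation-based policy.
    act = [
        [sum(O[s][o] * policy[o][a] for o in range(n_o)) for a in range(n_a)]
        for s in range(n_s)
    ]
    v = [0.0] * n_s
    for _ in range(iters):  # iterative policy evaluation
        v = [
            sum(
                act[s][a]
                * (payoff[s][a]
                   + GAMMA * sum(T[s][a][s2] * v[s2] for s2 in range(n_s)))
                for a in range(n_a)
            )
            for s in range(n_s)
        ]
    return sum(B0[s] * v[s] for s in range(n_s))


def evaluate(policy):
    """Return (expected reward, expected cost, whether the budget is met)."""
    reward = discounted_value(policy, R)
    cost = discounted_value(policy, C)
    return reward, cost, cost <= BUDGET


def best_feasible_deterministic():
    """Brute-force the deterministic memoryless policies that meet the budget."""
    n_a, n_o = len(T[0]), len(O[0])
    best = None
    for choice in product(range(n_a), repeat=n_o):  # one action per observation
        policy = [[1.0 if a == choice[o] else 0.0 for a in range(n_a)]
                  for o in range(n_o)]
        reward, cost, ok = evaluate(policy)
        if ok and (best is None or reward > best[1]):
            best = (choice, reward, cost)
    return best
```

Calling `evaluate([[0.0, 1.0], [0.0, 1.0]])` scores the policy that always takes action 1: it earns the most reward in this toy model but exceeds the budget, which is the tension a CPOMDP solver has to resolve. The brute-force search only covers deterministic memoryless policies; constrained problems can in general require randomized or belief-based policies, so this is a sanity check, not a solver.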