@jemoka / Jemoka Knowledge Base / wiki/concepts/reliable_rl.md
---
title: "Reliable RL"
type: concept
source: https://www.jemoka.com/posts/kbhreliable_rl/
confidence: high
status: active
---

Thinking about advances in the capabilities of RL:

Knowledge Discovery -> Reasoning (programming assistance) ->(ongoing)-> Robotics

Insight: as time goes on, the “risk-criticality” of our applications increases; yet as risk-critical scenarios become more common, it's harder to get data.

## Reliable Feedback Loop

General desirable structure…

Verify (claims and requirements) => Safeguard (safe continuous deployment) => Generalize (via compositional generalization: incrementally adding behavior without losing behavior) => Verify => …

## Deal with Stochasticity

An RL algorithm is replicable if, WHP, running it on the same MDP with fixed randomness results in the same outcomes.

=> \(\epsilon\)-optimal replicable algorithms for tabular / linear settings, with sample complexity polynomial in the parameters.

### Quantization for Tie Break

## Compositional Generalization

We can decompose relevant problems into subparts, which allows us to compose them together to solve new tasks.
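
To make the replicability idea from "Deal with Stochasticity" concrete, here is a minimal sketch. It assumes that "Quantization for Tie Break" refers to rounding an estimate onto a randomly shifted grid whose offset comes from the shared (fixed) internal randomness; the note itself does not spell out the construction. The names `replicable_mean`, `agreement_rate`, `grid_width`, and `shared_seed` are hypothetical and chosen only for illustration.

```python
import math
import random


def replicable_mean(samples, grid_width, shared_seed):
    """Estimate a mean, then snap it to a randomly shifted grid.

    Assumption: `shared_seed` plays the role of the fixed internal
    randomness. Two runs may see different samples, but they share
    the same grid offset.
    """
    offset = random.Random(shared_seed).uniform(0.0, grid_width)
    mean = sum(samples) / len(samples)
    # Index of the grid cell that contains the empirical mean.
    cell = math.floor((mean - offset) / grid_width)
    # Return the centre of that cell; nearby means usually share a cell.
    return offset + (cell + 0.5) * grid_width


def agreement_rate(mean_a, mean_b, grid_width, seeds):
    """Fraction of shared seeds for which two nearby estimates agree exactly."""
    same = sum(
        replicable_mean([mean_a], grid_width, s)
        == replicable_mean([mean_b], grid_width, s)
        for s in seeds
    )
    return same / len(seeds)


if __name__ == "__main__":
    data_rng = random.Random(0)
    # Two independent batches of data from the same distribution.
    run_a = [data_rng.gauss(1.0, 1.0) for _ in range(10_000)]
    run_b = [data_rng.gauss(1.0, 1.0) for _ in range(10_000)]
    print(replicable_mean(run_a, grid_width=0.5, shared_seed=42))
    print(replicable_mean(run_b, grid_width=0.5, shared_seed=42))
    print(agreement_rate(0.50, 0.51, grid_width=0.2, seeds=range(200)))
```

In this sketch the two runs disagree only when a grid boundary falls between their empirical means. A wider grid makes exact agreement more likely, at the cost of a coarser estimate (the output can be up to `grid_width / 2` away from the empirical mean).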