@jemoka / Jemoka Knowledge Base / raw/concept/kbhhindsight_optimization.md
---
title: "Hindsight Optimization"
source: https://www.jemoka.com/posts/kbhhindsight_optimization/
---

When tele-operating a robot, we ideally want to minimize cost. To assist, we estimate the user's goal from their inputs, predict the most likely goal, and assist toward it. The framing is: "find a cost function for which user input \(u\) is optimal".

- the system does not know the goal
- the user may not change their goal on a whim (the goal is treated as fixed)

## Hindsight Optimization

To solve this, we use QMDP: "select the best actions by estimating the cost-to-go assuming full observability".

\begin{equation}
Q(b,a,u) = \sum_{g} b(g) Q_{g}(x,a,u)
\end{equation}

Here \(b(g)\) is the belief that the user's goal is \(g\), and \(Q_{g}(x,a,u)\) is the cost-to-go for goal \(g\) at state \(x\), given robot action \(a\) and user input \(u\).

## Result

Users felt less in control with Hindsight Optimization, despite reaching the goal faster with this policy. This highlights the tension between "task completion" and "user satisfaction".
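
The QMDP equation above can be illustrated with a minimal sketch. It assumes a small discrete set of goals and actions, and that the per-goal cost-to-go values \(Q_{g}(x,a,u)\) have already been computed for the current state \(x\) and user input \(u\). The function name `qmdp_action` and the toy numbers are hypothetical and are not taken from the original post. Because \(Q\) is a cost, the sketch picks the action with the lowest belief-weighted value.

```python
from typing import List, Sequence, Tuple


def qmdp_action(
    belief: Sequence[float],
    q_per_goal: Sequence[Sequence[float]],
) -> Tuple[int, List[float]]:
    """Pick the action minimizing Q(b, a, u) = sum_g b(g) * Q_g(x, a, u).

    belief[g]        : probability that the user's goal is g
    q_per_goal[g][a] : cost-to-go of action a if the goal were g,
                       evaluated at the current state x and user input u
                       (assumed precomputed; hypothetical layout)
    """
    num_actions = len(q_per_goal[0])
    # Belief-weighted cost-to-go for each candidate action.
    expected = [
        sum(b * q_g[a] for b, q_g in zip(belief, q_per_goal))
        for a in range(num_actions)
    ]
    # Q is a cost here, so the best action has the smallest value.
    best = min(range(num_actions), key=expected.__getitem__)
    return best, expected


# Toy example: two candidate goals, three robot actions.
q_values = [
    [1.0, 4.0, 2.0],  # costs if the goal is g0
    [5.0, 1.0, 2.0],  # costs if the goal is g1
]

uncertain_choice, uncertain_q = qmdp_action([0.7, 0.3], q_values)
confident_choice, confident_q = qmdp_action([0.9, 0.1], q_values)
print(uncertain_choice, confident_choice)
```

With the belief split 0.7 / 0.3, the sketch selects the third action, which is moderately good for either goal; once the belief sharpens to 0.9 / 0.1, it commits to the action that is cheapest for the likely goal.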