@jemoka / Jemoka Knowledge Base / wiki/concepts/advantage_function.md
---
title: "advantage function"
type: concept
related: [Utility Theory, Policy, Advantage Function]
source: https://www.jemoka.com/posts/kbhadvantage_function/
confidence: high
status: active
---

an advantage function is a method for scoring a policy's action in a given state based on how much additional value it provides compared to the greedy action:

\begin{align}
A(s,a) &= Q(s,a) - U(s) \\
&= Q(s,a) - \max_{a'}Q(s,a')
\end{align}

that is, how much does the action-value of your policy's chosen action differ from that of the action that maximizes the utility. Because \(U(s)\) is the maximum of \(Q(s,a')\) over all actions, \(A(s,a) \leq 0\) for every action. For a greedy policy, which always picks the maximizing action, \(A = 0\).
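
As a rough illustration of the definition, the sketch below computes \(A(s,a)\) from a small tabular \(Q\). The table values, the state and action names, and the helper names `utility` and `advantage` are all made up for this example and do not come from the source.

```python
# Hypothetical action-value table Q[state][action]; the numbers are invented
# purely for illustration.
Q = {
    "s0": {"left": 1.0, "right": 3.0, "stay": 2.0},
    "s1": {"left": 0.5, "right": 0.5, "stay": 4.0},
}


def utility(Q, s):
    """U(s) = max over a' of Q(s, a'): the value of acting greedily in s."""
    return max(Q[s].values())


def advantage(Q, s, a):
    """A(s, a) = Q(s, a) - U(s): value lost by taking a instead of the greedy action."""
    return Q[s][a] - utility(Q, s)


# Print the advantage of every state-action pair.
for s, actions in Q.items():
    for a in actions:
        print(s, a, advantage(Q, s, a))
```

In this table the greedy actions (`right` in `s0`, `stay` in `s1`) have an advantage of zero, and every other action has a negative advantage.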