@jemoka / Jemoka Knowledge Base / raw/concept/kbhdialogue_state_architecture.md
---
title: "Dialogue State Architecture"
source: https://www.jemoka.com/posts/kbhdialogue_state_architecture/
---

Dialogue State Architecture uses dialogue acts instead of simple frame filling to perform generation; it is currently used more in research.

- NLU: slot fillers that extract information from the user's utterance, using ML
- Dialogue State Tracker: maintains the current state of the dialogue
- Dialogue policy: decides what to do next (think GUS' policy: ask, fill, respond), though nowadays we have more complex dynamics
- NLG: generates the response

## dialogue acts

dialogue acts combine speech acts with underlying states

## slot filling

we typically do this with BIO Tagging with a BERT, just like NER Tagging, but we tag for frame slots.

the final `<cls>` token may also work to classify domain + intent.

## corrections are hard

folks sometimes use hyperarticulation ("exaggerated prosody") for correction, which trips up ASR

correction acts may need to be detected explicitly as a speech act.

## dialogue policy

we can choose the next act conditioned on the last frame, agent utterance, and user utterance:

\begin{equation}
\hat{A}\_{i} = \arg\max\_{A\_{i}} P(A\_{i} | F\_{i-1}, A\_{i-1}, U\_{i-1})
\end{equation}

we can probably use a neural architecture to do this.

whether to confirm can be decided from the ASR confidence, with thresholds \(\alpha < \beta < \gamma\):

- \(<\alpha\): reject
- \(\geq \alpha\): confirm explicitly
- \(\geq \beta\): confirm implicitly
- \(\geq \gamma\): no need to confirm

## NLG

once the speech act is determined, we need to actually go generate it:

1. choose some attributes
2. generate utterance

We typically want to delexicalize the keywords (Henry serves French food => [restaurant] serves [cuisine] food), then run through NLG, then rehydrate with the frame.
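
Two of the steps above are mechanical enough to sketch in code: the threshold-based confirmation choice and the delexicalize/rehydrate round trip around NLG. This is a minimal sketch and not the method of any particular system. The function names, the action labels, and the slot names `restaurant` and `cuisine` are all hypothetical. It assumes the thresholds are ordered \(\alpha \leq \beta \leq \gamma\), and it uses plain string replacement as a stand-in for whatever slot alignment a real system would use.

```python
from typing import Dict


def confirmation_action(confidence: float, alpha: float, beta: float, gamma: float) -> str:
    """Map an ASR confidence score to a confirmation strategy (hypothetical labels)."""
    # Assumption: thresholds are ordered alpha <= beta <= gamma.
    if not alpha <= beta <= gamma:
        raise ValueError("expected alpha <= beta <= gamma")
    if confidence < alpha:
        return "reject"
    if confidence < beta:
        return "confirm_explicitly"
    if confidence < gamma:
        return "confirm_implicitly"
    return "no_confirmation"


def delexicalize(utterance: str, frame: Dict[str, str]) -> str:
    """Replace each slot value found in the utterance with a [slot] placeholder."""
    out = utterance
    # Replace longer values first so a short value cannot split a longer one.
    for slot, value in sorted(frame.items(), key=lambda kv: -len(kv[1])):
        out = out.replace(value, f"[{slot}]")
    return out


def rehydrate(template: str, frame: Dict[str, str]) -> str:
    """Fill each [slot] placeholder in a generated template from the frame."""
    out = template
    for slot, value in frame.items():
        out = out.replace(f"[{slot}]", value)
    return out


if __name__ == "__main__":
    frame = {"restaurant": "Henry", "cuisine": "French"}
    template = delexicalize("Henry serves French food", frame)
    print(template)
    print(rehydrate(template, frame))
    print(confirmation_action(0.7, alpha=0.3, beta=0.6, gamma=0.9))
```

Naive string replacement breaks when a slot value also appears as an ordinary word in the utterance or inside a placeholder name, so a real pipeline would more likely reuse the token spans produced by the slot tagger.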