[[
wikihub
]]
Search
⌘K
Explore
People
For Agents
Sign in
Explore
People
For Agents
Sign in
@jemoka / Jemoka Knowledge Base / raw/concept/kbhcontext_free_grammars.md
Suggest edit
Cancel
Submit suggestion
Title
Name
Note
--- title: "Context-Free Grammar" source: https://www.jemoka.com/posts/kbhcontext_free_grammars/ --- We use CFGs to highlight the recursive structure in expressions. A CFG consists of Non-Recursed Terminals (if, else, etc.) \(T\) Non-Terminals (expr, stmt, etc.) \(N\) Start symbol (i.e., program) \(S \in N\) Set of productions generative semantics Let \(G\) be an CFG with start symbol \(S\); the language of \(G\) is… \begin{equation} \qty {a_1, \dots, a_{n} | S \to^{*} a_{1}, a_{n}} \end{equation} with each \(a_{j}, \forall j\) being terminals. That is, the “language of a CFG” is just the set of languages that could be produced by as many moves as needed until we obtain the output. productions \begin{equation} X \to Y_1, Y_2, \dots, Y_{n} \end{equation} whereby \(X \in N\), \(Y_{i} \in T \cup N \cup \qty {\varepsilon}\). Notably, we could replace a non-terminal with nothing. derivation a derivation is a sequence of productions which leads to a string of only terminals A derivation can be drawn as a tree— start symbol is the root for production \(X \to Y_1, …, Y_{n}\), add children \(Y_1, …, Y_{n}\) to note \(X\) stratification stratification allows us to deal with the PEMDAS (/associativity) of the operations. Consider an addition and multiplication world: E = E+E | E*E | (E) | id how do we parse id * id + id. We can just have a “multiplication world” E’ language, whereby after en counting an multiplication, we are not allowed to do addition anymore unless we encounter a parentheses. E = E' + E | E' E' = id * E' | id | (E) * E' | (E) dangling if problem Consider the following ambiguous grammar: E -> if E then E | if E then E else E | .... Consider: if something then if something then something else something Is the else the first or the second expression?~:w to fix this, we need to make a world of “matched if expression” E -> MIF -- all "then" are matched | UIF -- some "then" is unmatched UIF -> if E then E | if E then MIF else UIF {- "we don't allow unmatched if expressions in then branches" -} MIF -> if E then MIF else MIF | OTHER Most tools, like flex/bison, will allow you to control precedence. conventions non-terminals in upper case terminals in lower case start symbol is the non-terminal of the first production membership in a language is a “yes”/“no” we must handle errors gracefully bison could help us actually implement the parser example EXPR -> if EXPR then EXPR else EXPR fi | while EXPR loop EXPR pool | id