Context-Free Grammar
We use CFGs to highlight the recursive structure in expressions.
A CFG consists of
Non-Recursed Terminals (if, else, etc.) T Non-Terminals (expr, stmt, etc.) N Start symbol (i.e., program) S \in N Set of productions generative semantics Let G be an CFG with start symbol S; the language of G is…
with each a_{j}, \forall j being terminals.
That is, the “language of a CFG” is just the set of languages that could be produced by as many moves as needed until we obtain the output.
productions \begin{equation} X \to Y_1, Y_2, \dots, Y_{n} \end{equation}
whereby X \in N, Y_{i} \in T \cup N \cup \left\{\varepsilon\right\}. Notably, we could replace a non-terminal with nothing.
derivation a derivation is a sequence of productions which leads to a string of only terminals
A derivation can be drawn as a tree—
start symbol is the root for production X \to Y_1, …, Y_{n}, add children Y_1, …, Y_{n} to note X stratification stratification allows us to deal with the PEMDAS (/associativity) of the operations. Consider an addition and multiplication world:
E = E+E | E*E | (E) | id how do we parse id * id + id.
We can just have a “multiplication world” E’ language, whereby after en counting an multiplication, we are not allowed to do addition anymore unless we encounter a parentheses.
E = E' + E | E' E' = id * E' | id | (E) * E' | (E) dangling if problem Consider the following ambiguous grammar:
E -> if E then E | if E then E else E | .... Consider:
if something then if something then something else something Is the else the first or the second expression?~:w
to fix this, we need to make a world of “matched if expression”
E -> MIF -- all "then" are matched | UIF -- some "then" is unmatched UIF -> if E then E | if E then MIF else UIF {- "we don't allow unmatched if expressions in then branches" -} MIF -> if E then MIF else MIF | OTHER Most tools, like flex/bison, will allow you to control precedence.
conventions non-terminals in upper case
terminals in lower case
start symbol is the non-terminal of the first production
membership in a language is a “yes”/“no”
we must handle errors gracefully
bison could help us actually implement the parser
example EXPR -> if EXPR then EXPR else EXPR fi | while EXPR loop EXPR pool | id