Suggest edit — SU-CS143 APR242025

Title

Name

Note

---
title: "SU-CS143 APR242025"
source: https://www.jemoka.com/posts/kbhsu_cs143_apr242025/
date: 2025-04-24
---

Bottom-Up Parsing Bottom-Up Parsing is more general than top-down parsing (i.e., it doesn&rsquo;t require left factorization, etc.)
This is generally the preferred method. Key insight: we reduce strings into increasingly higher level symbols
Sketch Because of the Properties of Bottom-Up Parsing, we can&hellip;.
initialization split the string into two pieces right substring has terminals, not yet examined left substring has terminals and non-terminals We begin by splitting to the left since we begin with no non-terminals.
x1 x2 x3 &hellip;. xn moves Shift / Reduce
shift: move | one place to the right &mdash; shifts a terminal to the left string reduce: apply an inverse production at the right end of the left string (i.e., take a substring closest to the bar, and reduce it) Think: left string is a stack. top: is the |; push: the &ldquo;shift&rdquo; operation to push a terminal onto the stack; reduce: pop something onto the stack.
conflict When Shift/Reduce are both legal, that&rsquo;s a &ldquo;shift reduce&rdquo; conflict; if multiple reductions are possible, that&rsquo;s a &ldquo;reduce reduce&rdquo; conflict.
deciding moves handle
We want to reduce only if the result can still be reduced to the starting symbol.
A handle is a single production in a legal sequence coming from $S$ which can be reduced into a single nonterminal. Consider:
\begin{equation} S \to^{*} \alpha X \omega \to \alpha \beta \omega \end{equation}
then $X \to \beta$ in the position after $\alpha$ is a handle of $\alpha \beta \omega$. We reduce at handles.
The handles are always top of the stack, otherwise it&rsquo;d not be a rightmost derivation in reverse.
SLR
Simple Left-to-right Rightmost derivation is a class of CFGs for which we have simple heuristics for detecting handles.
heuristics
item
An item is a production with &ldquo;.&rdquo; somewhere on the RHS, denoting a focus point. For instance, $T \to .(E)$.
The only item for $X \to \epsilon$ is $X → .$
We also call these LR(0) items.
viable prefix
$\alpha$ is a &ldquo;viable prefix&rdquo; if there exists $\omega$ such that $\alpha | \omega$ is a state of a shift-reduce parser.
a viable prefix doesn&rsquo;t extend beyond the right end of handle its a variable prefix because its a possible prefix of a handle as long as a parser has viable prefixes on the stack, no parsing error detecting viable prefixs
We us an NFA!
add a new start production $S&rsquo; \to S$ to the grammar the NFA states are items of the grammar For item $E \to \alpha . X \beta$ add the transition $[E \to \alpha . X \beta] \to^{x} [E \to \alpha X . \beta]$ For item $E \to \alpha . X \beta$ and production $X \to \gamma$, add the transition $[E \to \alpha . X \beta] \to^{\epsilon} [X \to . \gamma]$ every state is an accepting state, start state is $S&rsquo; \to .S$
LR(0) Parsing
Suppose stack contains $\alpha$, next input is $t$. DFA on input $\alpha$ terminated it state $s$.
Reduce by $X \to \beta$ if $s$ contains item $X → β .$ Shift if $s$ contains item $X \to \beta .t\omega$ (that is, there&rsquo;s a transition labeled $t$ We will get reduce/reduce conflicts if one state contains $X → β .$ and $Y → ω .$; we will get shift/reduce conflict if $X → β.$ and $Y \to \omega .t \gamma$
LR(0) to SLR
Exactly the same process, but!
Suppose stack contains $\alpha$, next input is $t$. DFA on input $\alpha$ terminated it state $s$.
Reduce by $X \to \beta$ if $s$ contains item $X → β. $; and if $t \in \text{Follow}\qty(X)$ Shift if $s$ contains item $X \to \beta .t\omega$ (that is, there&rsquo;s a transition labeled $t$ If there are conflicts under these rules, then you don&rsquo;t have an SLR grammar.
Remember to keep (symbol, DFA state) on the stack.
Properties of Bottom-Up Parsing bottom-up parser traces a rightmost derivation in reverse in shift-reduce strategy, handles appear only at the top of the stack, never inside the stack for any SLR grammar, the set of viable prefixes is a regular language! Let $\alpha \beta \omega$ be a step of bottom up parse, assume the next reduction is $X \to \beta$, then $\omega$ should already have been a string of terminals (since we go rightmost first) Example int * int + int | T -&gt; int int * T + int | T -&gt; int * T T + int | T -&gt; int T + T | E -&gt; T T + E | E -&gt; T + E E Weird! It somehow starts in the middle.
Generally, the parsing strategy expands non-terminals bottom-up in the rightmost first.