Syntax Analysis (cont.)


Bottom-Up Parsing

Reduce the input to the start symbol instead of deriving the input from it.

Shift-Reduce Parsing: a general style of bottom-up parsing

First list the LR(0) items, then build the NFA by hand, then convert it to a DFA; from the DFA the LR(0) Parsing Table can be constructed.
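As a rough sketch of the construction, the code below builds the LR(0) DFA with the usual `closure` and `goto` operations. It skips the explicit NFA and produces DFA states directly, which yields the same states as running subset construction on the item NFA. The toy grammar (`S' -> S`, `S -> ( S ) | x`) and all function names are assumptions for illustration, not taken from the course slides. An item is a pair `(production index, dot position)`.

```python
# Hypothetical toy grammar (not from the notes): S' -> S ; S -> ( S ) | x
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("(", "S", ")")),
    ("S", ("x",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add B -> .gamma for every nonterminal B that appears right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def build_lr0_dfa():
    """Return (states, edges); edges maps (state index, symbol) -> state index."""
    start = closure({(0, 0)})
    states, edges, work = [start], {}, [start]
    while work:
        state = work.pop()
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for sym in sorted(symbols):
            target = goto(state, sym)
            if target not in states:
                states.append(target)
                work.append(target)
            edges[(states.index(state), sym)] = states.index(target)
    return states, edges


states, edges = build_lr0_dfa()
```

In the resulting DFA, an edge on a terminal becomes a shift entry, an edge on a nonterminal becomes a goto entry, and a state containing a completed item (dot at the end) gets a reduce entry under every terminal.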

The reduction process is shown in the example on page 17 of Chapter3-3.pdf, which walks through how LR(0) actually operates.
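For readers without the slides at hand, here is a minimal sketch of the table-driven shift-reduce loop. The table is hand-written for a hypothetical grammar (`S -> ( S ) | x`, augmented with `S' -> S`); the state numbering, the `ACTION`/`GOTO` names, and the `parse` function are assumptions for illustration and are not the slide's example.

```python
# Production number -> (left-hand side, length of right-hand side).
# 1: S -> ( S )    2: S -> x    (hypothetical grammar, not the slide's example)
PRODUCTIONS = {1: ("S", 3), 2: ("S", 1)}
TERMINALS = ["(", ")", "x", "$"]

# Hand-built LR(0) table. States: 0 start, 1 S'->S., 2 S->(.S), 3 S->x.,
# 4 S->(S.), 5 S->(S).
ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (4, ")"): ("shift", 5),
}
for t in TERMINALS:
    # LR(0): a state with a completed item reduces regardless of the next token.
    ACTION[(3, t)] = ("reduce", 2)
    ACTION[(5, t)] = ("reduce", 1)
GOTO = {(0, "S"): 1, (2, "S"): 4}


def parse(tokens):
    """Return (accepted, list of production numbers in reduction order)."""
    stack = [0]                      # stack of DFA states
    tokens = list(tokens) + ["$"]    # append the end marker
    pos = 0
    reductions = []
    while True:
        action = ACTION.get((stack[-1], tokens[pos]))
        if action is None:           # empty table cell: syntax error
            return False, reductions
        kind, arg = action
        if kind == "shift":
            stack.append(arg)
            pos += 1
        elif kind == "reduce":
            lhs, length = PRODUCTIONS[arg]
            del stack[len(stack) - length:]       # pop one state per RHS symbol
            stack.append(GOTO[(stack[-1], lhs)])  # then follow the goto edge
            reductions.append(arg)
        else:
            return True, reductions
```

Calling `parse("((x))")` reduces by `S -> x` once and then by `S -> ( S )` twice before accepting, which is the reverse of a rightmost derivation.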

Simple LR Parsing

Put reduce actions into the table only where indicated by the FOLLOW set.

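First build the LR(0) DFA, then compute the FOLLOW set of every non-terminal.

Unlike LR(0), a reduce in SLR(1) no longer depends on the state alone: we have to look ahead at the next symbol to decide which shift or reduce action to take.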
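The sketch below computes FOLLOW sets with a standard fixed-point iteration and then shows on which lookaheads SLR(1) would place a reduce. The grammar (`S' -> E $`, `E -> T + E | T`, `T -> x`) is an assumed example chosen because it has a shift-reduce conflict under LR(0); the function names are hypothetical.

```python
# Hypothetical grammar: S' -> E $ ; E -> T + E | T ; T -> x
GRAMMAR = [
    ("S'", ("E", "$")),
    ("E", ("T", "+", "E")),
    ("E", ("T",)),
    ("T", ("x",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def compute_follow(grammar):
    """Iterate nullable / FIRST / FOLLOW together until nothing changes."""
    nullable = set()
    first = {n: set() for n in NONTERMINALS}
    follow = {n: set() for n in NONTERMINALS}

    def first_of(symbol):
        return first[symbol] if symbol in NONTERMINALS else {symbol}

    def all_nullable(symbols):
        return all(s in nullable for s in symbols)

    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            if lhs not in nullable and all_nullable(rhs):
                nullable.add(lhs)
                changed = True
            for i, sym in enumerate(rhs):
                updates = []  # pairs (target set, source set)
                if all_nullable(rhs[:i]):
                    updates.append((first[lhs], first_of(sym)))
                if sym in NONTERMINALS:
                    for j in range(i + 1, len(rhs)):
                        if all_nullable(rhs[i + 1:j]):
                            updates.append((follow[sym], first_of(rhs[j])))
                    if all_nullable(rhs[i + 1:]):
                        updates.append((follow[sym], follow[lhs]))
                for target, source in updates:
                    if not source <= target:
                        target |= source
                        changed = True
    return follow


FOLLOW = compute_follow(GRAMMAR)


def slr_reduce_lookaheads(production_index):
    """SLR(1): reduce by A -> alpha only on terminals in FOLLOW(A)."""
    lhs, _ = GRAMMAR[production_index]
    return FOLLOW[lhs]
```

For the state containing both `E -> T . + E` and `E -> T .`, LR(0) would put a reduce under `+` as well and clash with the shift. Here `slr_reduce_lookaheads(2)` contains only `$`, so the cell for `+` holds just the shift and the conflict disappears.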

LR(1) parsing is more powerful than SLR parsing.

LR(0): chooses shift or reduce only based on the current state
SLR: chooses shift or reduce based on more information (whether the next token is in the specific Follow set)
LR(1): add more information into the state of DFA
- Specify which terminal symbols can appear after the whole RHS of the production in certain context.

An LR(1) item consists of an LR(0) item and a lookahead symbol: (A → α.β, x)

Besides the production and the position within the RHS (marked by the dot), an LR(1) item also carries a lookahead symbol.

A DFA still has to be constructed.
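To make the role of the lookahead concrete, here is a sketch of the LR(1) closure step used while building that DFA: for an item `(A → α.Bβ, x)`, every production `B → γ` is added with each lookahead in FIRST(βx). The grammar (`S' -> S`, `S -> C C`, `C -> c C | d`) is an assumed textbook-style example, and the code assumes the grammar has no ε-productions so that FIRST(βx) depends only on the first symbol of β. All names are hypothetical.

```python
# Hypothetical grammar: S' -> S ; S -> C C ; C -> c C | d
# Assumption: no epsilon productions, so FIRST never has to skip a symbol.
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("C", "C")),
    ("C", ("c", "C")),
    ("C", ("d",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def compute_first():
    """FIRST sets for nonterminals (valid only without epsilon productions)."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = (first[head] if head in NONTERMINALS else {head}) - first[lhs]
            if new:
                first[lhs] |= new
                changed = True
    return first


FIRST = compute_first()


def closure1(items):
    """LR(1) closure; an item is (production index, dot position, lookahead)."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot, lookahead = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot == len(rhs) or rhs[dot] not in NONTERMINALS:
            continue
        rest = rhs[dot + 1:]             # beta in (A -> alpha . B beta, x)
        if rest:
            head = rest[0]
            lookaheads = FIRST[head] if head in NONTERMINALS else {head}
        else:
            lookaheads = {lookahead}     # beta is empty: inherit x
        for i, (lhs, _) in enumerate(GRAMMAR):
            if lhs == rhs[dot]:
                for w in lookaheads:
                    item = (i, 0, w)
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return frozenset(result)


start_state = closure1({(0, 0, "$")})
```

In `start_state`, the items for `C` carry lookaheads `c` and `d` (whatever can begin the second `C`), while `S -> . C C` inherits `$` from the start item. A reduce is later placed only under an item's own lookahead.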

[!TIP]

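Notice that the progression LR(0) -> SLR(1) -> LR(1) is all about reducing the conflicts caused by reduce actions, so that more grammars can be handled: LR(0) blindly places the reduce under every terminal, SLR(1) restricts it to the FOLLOW set, and LR(1) narrows it further to the lookaheads that can actually occur in that context.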

LR(1) parsing tables can be very large, with many states.

LALR(1) parsing: merge any two states whose items are identical except for lookahead sets in the LR(1) parsing table

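That is, states whose items have the same productions and dot positions and differ only in their lookahead symbols are merged into a single state, which reduces the number of states.

However, this can occasionally (though rarely) introduce a reduce-reduce conflict.

Note that the merged state 5&6 has a reduce-reduce conflict, which shows that this grammar is not an LALR(1) grammar.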
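The merging step and the conflict it can introduce can be sketched as follows. The two LR(1) states are written by hand for an assumed grammar (`S -> a E c | a F d | b F c | b E d`, `E -> e`, `F -> e`); they are the states reached after reading `a e` and `b e`. They are not necessarily the states 5 and 6 of the course example, and the helper names are hypothetical.

```python
from collections import defaultdict

# Length of each production's right-hand side (hypothetical grammar fragment).
RHS_LEN = {"E->e": 1, "F->e": 1}

# LR(1) items are (production, dot position, lookahead).
STATE_AFTER_A_E = frozenset({("E->e", 1, "c"), ("F->e", 1, "d")})
STATE_AFTER_B_E = frozenset({("E->e", 1, "d"), ("F->e", 1, "c")})


def core(state):
    """The LR(0) part of a state: its items with lookaheads dropped."""
    return frozenset((prod, dot) for prod, dot, _ in state)


def merge_by_core(states):
    """LALR(1) merge: union the items of all states that share a core."""
    merged = defaultdict(set)
    for state in states:
        merged[core(state)] |= set(state)
    return [frozenset(items) for items in merged.values()]


def reduce_reduce_conflicts(state):
    """Map each lookahead to the productions that all want to reduce on it."""
    by_lookahead = defaultdict(set)
    for prod, dot, lookahead in state:
        if dot == RHS_LEN[prod]:          # completed item: dot at the end
            by_lookahead[lookahead].add(prod)
    return {la: prods for la, prods in by_lookahead.items() if len(prods) > 1}


lalr_states = merge_by_core([STATE_AFTER_A_E, STATE_AFTER_B_E])
```

Each original state is conflict-free, because `E -> e` and `F -> e` reduce on different lookaheads. After merging, both productions carry both `c` and `d`, so `reduce_reduce_conflicts(lalr_states[0])` reports a clash on each of them.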

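From the merged DFA, the LALR(1) Parsing Table is constructed in exactly the same way as for LR(1).

Likewise, if the resulting Parsing Table has no conflicting entries, the grammar is LALR(1); otherwise it is not.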