编译技术2-7

发表于2025-05-01|更新于2025-05-01|编译原理

|字数总计:731|阅读时长:3分钟|阅读量:|评论数:

编译技术2-7

📌 Conflict Resolutions 冲突解决

From Regular Expressions to Implementation 从正则表达式到实现

1. From Regular Expressions to NFAs

从正则表达式到 NFA

Inductive method 归纳方法

2. From NFA to DFA

从 NFA 到 DFA

Subset construction algorithm 子集构造算法

3. Minimizing DFA

最小化 DFA

State-Minimization Algorithm 状态最小化算法

🚧 Challenges in Scanning 扫描中的挑战

How do we determine which lexemes are associated with each token? 如何确定哪些词素与每个标记相关联？
When there are multiple ways we could scan the input, how do we know which one to pick? 当有多种扫描输入的方式时，我们如何知道选择哪一种？
- Maximal munch 最大匹配原则
- Priority system 优先级系统
How do we address these concerns efficiently? 如何高效地解决这些问题？

⚠️ Lexing Ambiguities 词法分析歧义

Example 示例：

1 2	T_For: for T_Identifier: [A-Za-z_][A-Za-z0-9_]*

Input: "fort"

Possible tokenizations:

"for" + "t" → T_For + T_Identifier
"f" + "o" + "r" + "t" → 4 T_Identifier tokens
"fo" + "rt" → 2 T_Identifier tokens
etc.

🔄 Conflict Resolution Algorithm 冲突解决算法

Left-to-right scan 从左到右扫描
Maximal munch 最大匹配原则
- Always match the longest possible prefix of the remaining text 总是匹配剩余文本中最长的可能前缀

⚙️ Implementing Maximal Munch 实现最大匹配原则

Approach 方法：

Convert all regular expressions to NFAs 将所有正则表达式转换为 NFA
Run all NFAs in parallel 并行运行所有 NFA
Keep track of the last match 跟踪最后一个匹配
When all automata get stuck, report the last match and restart 当所有自动机都卡住时，报告最后一个匹配并重新开始

Example with Tokens 示例

T_Do: "do"
T_Double: "double"
T_Mystery: [A-Za-z]

输入: "DOUBLE" → 所有 token 的 NFA 会并行运行来扫描输入。

img

🥇 Priority System 优先级系统

When two regular expressions match the same text:

Choose the one defined first (higher priority) 选择先定义的规则（更高优先级）

Example:

T_Do
T_Double
T_Identifier

对于输入 "double"，T_Double 会优先于 T_Identifier 被选择。

❗ Error Handling 错误处理

Catch-all rule 通用规则：

Matches any character and reports an error 匹配任何字符并报告错误
Ensures the scanner never gets stuck 确保扫描器永远不会卡住

✅ Summary of Conflict Resolution

冲突解决总结

Construct an automaton for each regular expression 为每个正则表达式构建一个自动机
Merge them into one automaton with a new start state 将它们合并为一个具有新起始状态的自动机
Scan input while tracking the last known match 扫描输入同时跟踪最后一个已知匹配
Break ties using rule precedence 使用规则优先级打破平局
Include a catch-all error rule 包含一个通用的错误规则

🎬 From Regular Expressions to Implementation

从正则表达式到实现

讲座过渡内容：

观看 SPOC 平台上的 MOOC 视频
完成 SPOC 第三讲中的小测验

📺 Key Topics in Upcoming Videos 后续视频重点内容

From regular expressions to finite automata (3-5) 从正则表达式到有限自动机
Conversion from NFA to DFA (3-6) 从 NFA 转换为 DFA
DFA for word recognition (3-7) 使用 DFA 进行单词识别

文章作者: Totoro

文章链接: http://totorocatcat.top/%E7%BC%96%E8%AF%91%E6%8A%80%E6%9C%AF2-7/

版权声明: 本博客所有文章除特别声明外，均采用 CC BY-NC-SA 4.0 许可协议。转载请注明来自 Totoroの旅！

相关推荐

Lec2 Logic and Proof, Sets, and Function

CS61B 课程笔记（Lecture 02）

CS61B 课程笔记（Lecture 04）

CS61B 课程笔记（Lecture 03）

CS61B 课程笔记（Lecture 07）

CS61B 课程笔记（Lecture 06）

评论

数据库加载中