
Translate to Intermediate Code


Chapter7.pdf

Compiler Principles (undergraduate), Liu Zhongxin (刘忠鑫)

Three-Address Code

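Definition

Three-address code is a loose notion without a strict formal definition. It is used to represent the evaluation of arithmetic expressions, with instructions of the form x = y op z.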

There is at most one operator on the right side of an instruction.

The entire sequence of three-address instructions is implemented as an array or a linked list.

image-20250414194805197

Implementation

Storing a three-address instruction requires three addresses and one operator, so each instruction is stored as a quadruple.
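As a minimal sketch of this layout, a quadruple can be a struct with one operator field and three address fields, and the instruction sequence can be an array of such structs. The names `Quad`, `QuadList`, `emit`, and `quad_to_string` are illustrative assumptions rather than part of the course material, and addresses are kept as plain strings for simplicity.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical quadruple layout: one operator and three addresses. */
typedef struct {
    char op;            /* operator, e.g. '+' or '*' */
    const char *arg1;   /* first operand address */
    const char *arg2;   /* second operand address */
    const char *result; /* destination address */
} Quad;

#define MAX_QUADS 16

/* The instruction sequence is kept in an array. */
typedef struct {
    Quad quads[MAX_QUADS];
    int count;
} QuadList;

/* Append the instruction `result = arg1 op arg2` and return its index. */
static int emit(QuadList *list, char op, const char *arg1,
                const char *arg2, const char *result) {
    assert(list->count < MAX_QUADS);
    Quad q = { op, arg1, arg2, result };
    list->quads[list->count] = q;
    return list->count++;
}

/* Format instruction i as text, e.g. "t1 = b * c". */
static void quad_to_string(const QuadList *list, int i, char *buf, size_t n) {
    const Quad *q = &list->quads[i];
    snprintf(buf, n, "%s = %s %c %s", q->result, q->arg1, q->op, q->arg2);
}
```

For example, x = a + b * c needs two instructions because each one may hold at most one operator: `emit(&list, '*', "b", "c", "t1")` followed by `emit(&list, '+', "a", "t1", "x")`, where `t1` is a compiler-generated temporary.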

image-20250414195352560

Intermediate Representation Trees

Expressions

image-20250414205537869

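"evaluate" means computing the value of an expression.

ESEQ(s, e) is sequencing: the statement s is executed first for its side effects (yielding environment B), then e is evaluated and its value becomes the result of the whole expression.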

Statements

image-20250414205550127

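EXP(e) evaluates the expression e and discards the result. It turns an expression into a statement when only the side effects of the expression matter, not its value.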

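JUMP transfers control to the address e. Note that e need not be a literal label: it can also be an address computed by an expression.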

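CJUMP is a conditional jump; t and f are the two target addresses, taken when the condition is true and false respectively.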

SEQ joins two statements so that they run one after the other, much like the semicolon in C.

LABEL(n) binds the symbol n to the current machine-code address.
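To make the roles of ESEQ, EXP, and SEQ concrete, here is a small interpreter sketch over a cut-down tree. It is an assumption-laden miniature, not the Tree module itself: the node kinds (`E_CONST`, `E_TEMP`, `E_PLUS`, `E_ESEQ`, `S_MOVE`, `S_EXP`, `S_SEQ`), the `temps` array, and the functions `eval` and `exec` are hypothetical names, and JUMP, CJUMP, and LABEL are left out because they need a notion of program position.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Exp Exp;
typedef struct Stm Stm;

/* Hypothetical expression node: only the kinds needed for this sketch. */
struct Exp {
    enum { E_CONST, E_TEMP, E_PLUS, E_ESEQ } kind;
    int value;          /* E_CONST: the constant; E_TEMP: temporary index */
    Exp *left, *right;  /* E_PLUS operands; E_ESEQ result is in right */
    Stm *stm;           /* E_ESEQ: statement executed first */
};

/* Hypothetical statement node. */
struct Stm {
    enum { S_MOVE, S_EXP, S_SEQ } kind;
    int dst;              /* S_MOVE: destination temporary index */
    Exp *exp;             /* S_MOVE source, or S_EXP operand */
    Stm *first, *second;  /* S_SEQ: the two statements */
};

static int temps[8];      /* toy register file for temporaries */

static void exec(const Stm *s);

/* Evaluate an expression to its value. */
static int eval(const Exp *e) {
    switch (e->kind) {
    case E_CONST: return e->value;
    case E_TEMP:  return temps[e->value];
    case E_PLUS: {
        int l = eval(e->left);      /* force left-to-right evaluation */
        return l + eval(e->right);
    }
    case E_ESEQ:
        exec(e->stm);               /* side effects first */
        return eval(e->right);      /* then the result value */
    }
    assert(0 && "unknown expression kind");
    return 0;
}

/* Execute a statement for its side effects. */
static void exec(const Stm *s) {
    switch (s->kind) {
    case S_MOVE: temps[s->dst] = eval(s->exp); break;
    case S_EXP:  (void)eval(s->exp); break;     /* value is discarded */
    case S_SEQ:  exec(s->first); exec(s->second); break;
    }
}
```

With this sketch, evaluating an ESEQ whose statement moves 5 into temporary 0 and whose expression is temporary 0 plus 1 gives 6. Wrapping the same ESEQ in an EXP statement still performs the move but throws the 6 away.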

Translate AST into IR Trees

Each kind of AST node is translated in its own way.

Expressions

  • Expressions with return values: T_exp
    • Ex
  • Expressions that return no value: T_stm
    • Nx
  • Expressions with Boolean values, such as a > b: a conditional jump
    • Cx
typedef struct Tr_exp_ *Tr_exp;
/* A conditional: stm holds jumps whose true/false targets are not filled in yet. */
struct Cx { patchList trues; patchList falses; T_stm stm; };
struct Tr_exp_ {
    enum { Tr_ex, Tr_nx, Tr_cx } kind;
    union {
        T_exp ex;      /* expression with a value */
        T_stm nx;      /* statement, no value */
        struct Cx cx;  /* conditional */
    } u;
};
static Tr_exp Tr_Ex(T_exp ex);
static Tr_exp Tr_Nx(T_stm nx);
static Tr_exp Tr_Cx(patchList trues, patchList falses, T_stm stm);

Tr_*: Translate module

T_*: Tree module

image-20250414221405963

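NULLt marks the jump target for the true case. The actual address is not yet known at this point, so it is recorded as null and later replaced with the correct address through the patchList.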
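The patching step can be sketched as a linked list of pointers to the label slots that still hold null. The shape below is modeled on the `patchList` type used in the declarations above, but the details are assumptions: `Temp_label` is reduced to a string, and `struct cjump` is a hypothetical stand-in that keeps only the two label slots of a CJUMP node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef const char *Temp_label;   /* assumption: a label is just a name */

/* A list of label slots that are still waiting for their real label. */
typedef struct patchList_ *patchList;
struct patchList_ { Temp_label *head; patchList tail; };

/* Prepend one slot to a patch list. */
static patchList PatchList(Temp_label *head, patchList tail) {
    patchList p = malloc(sizeof *p);
    assert(p != NULL);
    p->head = head;
    p->tail = tail;
    return p;
}

/* Fill every recorded slot with the label that is now known. */
static void doPatch(patchList list, Temp_label label) {
    for (; list != NULL; list = list->tail)
        *list->head = label;
}

/* Hypothetical stand-in for a CJUMP node: only its label slots matter here. */
struct cjump { Temp_label true_label, false_label; };
```

A conditional starts with both slots set to NULL, records `&j.true_label` in its trues list and `&j.false_label` in its falses list, and once the surrounding construct has chosen its labels, one `doPatch` call per list fills them in.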

image-20250414222457387

Simple Variables

Translate a simple variable v declared in the current procedure’s stack frame:

image-20250414222547764

  • k is the offset of the variable within the stack frame
  • fp is the frame-pointer register
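As a rough illustration, assume the tree for such a variable has the shape MEM(BINOP(PLUS, TEMP fp, CONST k)), that is, a memory load at address fp + k. The sketch below prints that tree and simulates the load on a toy memory. The function names, the 4-byte word size, and the memory size are all assumptions chosen for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Render the assumed IR tree for a frame-resident variable at offset k. */
static void simple_var_tree(int k, char *buf, size_t n) {
    snprintf(buf, n, "MEM(BINOP(PLUS, TEMP fp, CONST %d))", k);
}

/* Toy machine: memory is an array of words, addresses are byte offsets. */
enum { WORD_SIZE = 4, MEM_WORDS = 64 };
static int memory[MEM_WORDS];

/* What the tree computes at run time: load the word at address fp + k. */
static int load_simple_var(int fp, int k) {
    int addr = fp + k;
    assert(addr >= 0 && addr % WORD_SIZE == 0 && addr / WORD_SIZE < MEM_WORDS);
    return memory[addr / WORD_SIZE];
}
```

With fp = 100 and k = -8, the load reads the word at byte address 92. Negative offsets are used here only because locals commonly sit below the frame pointer; the actual sign depends on the frame layout.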

image-20250414223128363

Array Variables

In Tiger, array variables behave like pointers.

Structured L-Values

An integer or pointer value is a “scalar”: it has only one component.

Subscripting and Field Selection

  • To compute the address of a[i]: (i − l) × s + a
    • a: the base address of the array elements
    • l: the lower bound of the index range
    • s: the size (in bytes) of each array element
  • To calculate the address of the field f of a record a: offset(f) + a
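The two formulas above translate directly into address arithmetic. This is only a sketch: the function names are hypothetical, addresses are modeled as `size_t` byte addresses, and the check `i >= l` stands in for a real bounds check.

```c
#include <assert.h>
#include <stddef.h>

/* Address of a[i]: (i - l) * s + a. */
static size_t element_address(size_t a, long i, long l, size_t s) {
    assert(i >= l);                 /* index must not fall below the lower bound */
    return (size_t)(i - l) * s + a;
}

/* Address of field f of record a: offset(f) + a. */
static size_t field_address(size_t a, size_t offset_f) {
    return offset_f + a;
}
```

For instance, with base address 1000, lower bound 1, and 4-byte elements, element 3 lives at (3 − 1) × 4 + 1000 = 1008. When the lower bound is 0, the formula reduces to i × s + a.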

In the Tiger language, all record and array values are really pointers to record and array structures.

image-20250415121857053

Arithmetic

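The Tree language has no unary arithmetic operators, so the negation of x is expressed as the subtraction 0 - x.

Tiger does not support floating-point numbers.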

Conditionals

An expression such as x < 5 will be translated as a Cx with:

stm = CJUMP(LT, x, CONST(5), NULLt, NULLf)
trues = {t}
falses = {f}

The most direct approach is to translate the jump logic as it stands:

image-20250415123629612

But this is not efficient:

  • If e2 and e3 are both “statements”, unEx will work, but it might be better to recognize this case specially.
  • If e2 or e3 is a Cx expression, unEx will yield a horrible tangle of jumps and labels.

We try to optimize the latter case by folding the Cx forms of e2 and e3 into the translation of e1:

image-20250415123143098

While Loops

For Loops

Function Call

Translation of Declarations

Variable Definition

Function Definition