Translate to Intermediate Code
Three-Address Code
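Definition
Three-address code is a loose notion with no strict form. It represents the evaluation of arithmetic expressions as instructions of the form x = y op z.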
There is at most one operator on the right side of an instruction.
The entire sequence of three-address instructions is implemented as an array or a linked list.
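Implementation
Storing one three-address instruction requires three addresses and one operator, so we store each instruction as a quadruple (op, arg1, arg2, result).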
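The quadruple layout described above can be sketched as follows. This is only an illustration: the names Quad, QuadList, emit, and lower_example are hypothetical and do not come from the notes, and operators are reduced to a single character for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* One three-address instruction stored as a quadruple: result = arg1 op arg2.
   All names here are illustrative assumptions. */
typedef struct {
    char op;             /* operator, e.g. '+' or '*' */
    const char *arg1;    /* first operand (variable or temporary) */
    const char *arg2;    /* second operand */
    const char *result;  /* destination */
} Quad;

enum { MAX_QUADS = 64 };

/* The instruction sequence, kept as an array. */
typedef struct {
    Quad code[MAX_QUADS];
    size_t len;
} QuadList;

/* Append one instruction and return its index. */
static size_t emit(QuadList *q, char op, const char *arg1,
                   const char *arg2, const char *result) {
    assert(q->len < MAX_QUADS);
    q->code[q->len] = (Quad){op, arg1, arg2, result};
    return q->len++;
}

/* Lower a = b * c + d into two instructions using one temporary t1,
   so that each instruction has at most one operator on its right side. */
static void lower_example(QuadList *q) {
    q->len = 0;
    emit(q, '*', "b", "c", "t1");   /* t1 = b * c  */
    emit(q, '+', "t1", "d", "a");   /* a  = t1 + d */
}
```

A linked list would work equally well; an array simply makes it easy to refer to an instruction by its index.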
Intermediate Representation Trees
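Expressions
To evaluate an expression means to compute its value.
ESEQ(s, e) is sequencing: the statement s is executed first for its side effects (yielding environment B), then e is evaluated and its result becomes the value of the whole expression.
Statements
EXP(e) evaluates the expression and discards the result. It turns an expression into a statement when only the expression's side effects matter, not its value.
JUMP transfers control to the address e. Note that e need not be a literal label; it may be any expression that computes an address.
CJUMP is a conditional jump; t and f are the two target addresses, taken when the comparison is true and false respectively.
SEQ joins two statements in sequence, much like the semicolon in C.
LABEL(n) defines the symbol n to be the current machine-code address.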
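To make the semantics of ESEQ, EXP, and SEQ concrete, here is a small interpreter for a cut-down tree language. It is a sketch under several assumptions: only CONST, TEMP, addition, and ESEQ expressions and MOVE, EXP, and SEQ statements are modelled, temporaries are a fixed global array, and every type and function name is invented for the illustration rather than taken from the Tree module.

```c
#include <assert.h>
#include <stdlib.h>

enum { NTEMPS = 8 };
static int temps[NTEMPS];   /* simulated temporaries */

typedef struct Exp Exp;
typedef struct Stm Stm;

struct Exp {
    enum { E_CONST, E_TEMP, E_PLUS, E_ESEQ } kind;
    int n;        /* E_CONST: the value; E_TEMP: index into temps */
    Exp *l, *r;   /* E_PLUS: operands; E_ESEQ: r is the result expression */
    Stm *s;       /* E_ESEQ: statement executed before r */
};

struct Stm {
    enum { S_MOVE, S_EXP, S_SEQ } kind;
    int dst;      /* S_MOVE: destination temporary */
    Exp *e;       /* S_MOVE: source; S_EXP: expression whose value is dropped */
    Stm *a, *b;   /* S_SEQ: first and second statement */
};

/* Node allocation; nodes are never freed in this short demo. */
static Exp *new_exp(void) { Exp *e = calloc(1, sizeof *e); assert(e != NULL); return e; }
static Stm *new_stm(void) { Stm *s = calloc(1, sizeof *s); assert(s != NULL); return s; }

static Exp *Const(int n) { Exp *e = new_exp(); e->kind = E_CONST; e->n = n; return e; }
static Exp *Temp(int t) {
    Exp *e = new_exp();
    assert(t >= 0 && t < NTEMPS);
    e->kind = E_TEMP; e->n = t;
    return e;
}
static Exp *Plus(Exp *l, Exp *r) { Exp *e = new_exp(); e->kind = E_PLUS; e->l = l; e->r = r; return e; }
static Exp *Eseq(Stm *s, Exp *r) { Exp *e = new_exp(); e->kind = E_ESEQ; e->s = s; e->r = r; return e; }

static Stm *Move(int dst, Exp *e) {
    Stm *s = new_stm();
    assert(dst >= 0 && dst < NTEMPS);
    s->kind = S_MOVE; s->dst = dst; s->e = e;
    return s;
}
static Stm *ExpStm(Exp *e) { Stm *s = new_stm(); s->kind = S_EXP; s->e = e; return s; }
static Stm *Seq(Stm *a, Stm *b) { Stm *s = new_stm(); s->kind = S_SEQ; s->a = a; s->b = b; return s; }

static void exec(const Stm *s);

/* Evaluate an expression to its value. */
static int eval(const Exp *e) {
    switch (e->kind) {
    case E_CONST: return e->n;
    case E_TEMP:  return temps[e->n];
    case E_PLUS:  { int l = eval(e->l); return l + eval(e->r); }
    case E_ESEQ:  exec(e->s); return eval(e->r);   /* side effects first, then the value */
    }
    return 0;
}

/* Execute a statement for its side effects. */
static void exec(const Stm *s) {
    switch (s->kind) {
    case S_MOVE: temps[s->dst] = eval(s->e); break;
    case S_EXP:  (void)eval(s->e); break;          /* compute, then discard */
    case S_SEQ:  exec(s->a); exec(s->b); break;
    }
}
```

For example, eval(Eseq(Move(0, Const(5)), Plus(Temp(0), Const(1)))) first stores 5 in temporary 0 and then yields 6, which shows that the expression part of an ESEQ sees the side effects of its statement part.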
Translate AST into IR Trees
Each kind of AST node is translated in its own way.
Expressions
- Expressions with return values: T_exp
- Ex
- Expressions that return no value: T_stm
- Nx
- Expressions with Boolean values, such as a > b: a conditional jump
- Cx
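typedef struct Tr_exp_ *Tr_exp;
/* Conditional: a statement plus the lists of true/false targets still to be filled in. */
struct Cx { patchList trues; patchList falses; T_stm stm; };
struct Tr_exp_ {
    enum { Tr_ex, Tr_nx, Tr_cx } kind;
    union {
        T_exp ex;       /* expression with a value */
        T_stm nx;       /* expression with no value */
        struct Cx cx;   /* conditional */
    } u;
};
static Tr_exp Tr_Ex(T_exp ex);
static Tr_exp Tr_Nx(T_stm nx);
static Tr_exp Tr_Cx(patchList trues, patchList falses, T_stm stm);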
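Tr_*: Translate module
T_*: Tree module
NULL_t marks the jump target for the true case. The actual address is not yet known at this point, so it is recorded as NULL and later replaced with the correct address through the patchList.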
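The backpatching idea can be sketched like this. The sketch assumes a patch list is a linked list of pointers to the label fields that still hold NULL; Label stands in for Temp_label, and CJump, PatchList, patch_cons, and do_patch are illustrative names rather than the course's actual definitions.

```c
#include <assert.h>
#include <stdlib.h>

typedef const char *Label;   /* stand-in for Temp_label (assumption) */

/* A conditional jump whose targets may still be unknown (NULL). */
typedef struct { Label t, f; } CJump;

/* A list of the places where a label must be filled in later. */
typedef struct PatchList_ *PatchList;
struct PatchList_ { Label *head; PatchList tail; };

static PatchList patch_cons(Label *head, PatchList tail) {
    PatchList p = malloc(sizeof *p);
    assert(p != NULL);
    p->head = head;
    p->tail = tail;
    return p;
}

/* Fill every hole on the list with the label that is now known. */
static void do_patch(PatchList list, Label label) {
    for (; list != NULL; list = list->tail)
        *list->head = label;
}
```

A conditional such as a > b would start with both targets set to NULL, with &jump.t on the trues list and &jump.f on the falses list. Once the enclosing construct knows where each outcome should go, it calls do_patch on each list.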
Simple Variables
Translate a simple variable v declared in the current procedure’s stack frame:
- k is the offset of the variable within the stack frame
- fp is the frame-pointer register
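In the usual Tree notation this access is written MEM(BINOP(PLUS, TEMP fp, CONST k)). The sketch below mimics it on a simulated memory; the word size, the memory size, and the names memory, mem, and simple_var are assumptions made for the illustration, as is the choice of negative offsets for locals in the usage note.

```c
#include <assert.h>

enum { WORD_SIZE = 8, MEM_WORDS = 32 };
static long memory[MEM_WORDS];   /* simulated memory, one slot per word */

/* MEM(addr): the word stored at byte address addr. */
static long *mem(long addr) {
    assert(addr >= 0 && addr % WORD_SIZE == 0 && addr / WORD_SIZE < MEM_WORDS);
    return &memory[addr / WORD_SIZE];
}

/* Simple variable at offset k in the current frame:
   MEM(BINOP(PLUS, TEMP fp, CONST k)). */
static long *simple_var(long fp, long k) {
    return mem(fp + k);
}
```

With fp = 128, a local at offset k = -16 lives at byte address 112, so *simple_var(128, -16) = 42 writes that word and a later *simple_var(128, -16) reads it back.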
Array Variables
In Tiger, array variables behave like pointers.
Structured L-Values
An integer or pointer value is a “scalar”: it has only one component.
Subscripting and Field Selection
- To compute the address of a[i]: (i − l) × s + a
- a: the base address of the array elements
- l: the lower bound of the index range
- s: the size (in bytes) of each array element
- To calculate the address of the field f of a record a: offset(f) + a
In the Tiger language, all record and array values are really pointers to record and array structures.
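The two address formulas above translate directly into code. In this minimal sketch addresses are plain integers, and the function names elem_addr and field_addr are made up for the example.

```c
#include <assert.h>

/* Address of a[i]: (i - l) * s + a, where a is the base address,
   l the lower bound of the index range, and s the element size in bytes. */
static long elem_addr(long a, long i, long l, long s) {
    assert(s > 0 && i >= l);
    return (i - l) * s + a;
}

/* Address of field f of record a: offset(f) + a. */
static long field_addr(long a, long offset_f) {
    assert(offset_f >= 0);
    return offset_f + a;
}
```

For an array based at address 1000 with lower bound 1 and 4-byte elements, element 5 sits at (5 − 1) × 4 + 1000 = 1016. When the lower bound is 0 the formula reduces to i × s + a.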
Arithmetic
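The Tree language has no unary arithmetic operators; the negation -x is expressed as the subtraction 0 - x.
Tiger does not support floating-point numbers.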
Conditionals
An expression such as x < 5 will be translated as a Cx with:
The simplest, most brute-force approach is to translate the jump logic directly:
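As a rough picture of that direct translation, assume the construct being translated is if e1 then e2 else e3 with e1 = x < 5, and take e2 = 10 and e3 = 20 as made-up values. The C function below lays out the control flow with explicit gotos; the comments give the Tree statement each line stands for, and the label names t, f, and join and the temporary r are illustrative.

```c
/* if x < 5 then 10 else 20, laid out the way a direct translation would. */
static int if_direct(int x) {
    int r;                  /* TEMP r: holds the result of the if-expression */
    if (x < 5) goto t;      /* CJUMP(LT, x, CONST 5, t, f) */
    goto f;
t:
    r = 10;                 /* MOVE(TEMP r, e2) */
    goto join;              /* JUMP join */
f:
    r = 20;                 /* MOVE(TEMP r, e3) */
    goto join;              /* JUMP join */
join:
    return r;               /* the value of the whole expression is TEMP r */
}
```

Both branches write the same temporary and then meet at the join label, which is why this scheme needs e2 and e3 to be converted to value-producing expressions first.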
But this is not efficient:
- If e2 and e3 are both “statements”, unEx will work, but it might be better to recognize this case specially.
- If e2 or e3 is a Cx expression, unEx will yield a horrible tangle of jumps and labels.
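We try to optimize the latter case: when e2 or e3 is a Cx, fold its conditional jumps into those of e1 instead of converting it with unEx.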