02 IR

编译器基础

graph TB;
  SC([Source Code]) -- Scanner --> Tokens -- Parser --> AST -- Type Checker --> DA[Decorated AST] -- Translator --> IR -- Code Generator --> MC([Machine Code])

阶段	分析类型	工具
Scanner	Lexical Analysis	Regular Expression
Parser	Syntax Analysis	Context-Free Grammar
Type Checker	Semantic Analysis	Attribute Grammar
(After Translator)	Static Analysis

AST

high-level and closed to grammar structure
usually language dependent
suitable for fast type checking
lack of control flow information

low-level and closed to machine code
usually language independent
compact and uniform
contains control flow information
usually considered as the basis for static analysis

IR

3-Address Code(3AC)

每个语句只出现3个地址（等号右边只出现一个操作符）

地址：变量、常量、编译器生成的临时变量

Soot(Jimple)

typed 3AC

“每个指令最多三个地址”的规则主要针对的是简单的算术和逻辑运算指令，而方法调用在 Jimple 中是一种特殊的、更复杂的指令，它本身就可以包含任意数量的参数。

Static Single Assignment (SSA)

All assignments in SSA are to variables with distinct names

Give each definition a fresh name
Propagate fresh name to subsequent uses
Every variable has exactly one definition
A special merge operator, (called phi-function), is introduced to select the values at merge nodes
has the value x0 if the control flow passes through the true part of the conditional and the value x1 otherwise

Why SSA?

Flow information is indirectly incorporated into the unique variable names
- May help deliver some simpler analyses, e.g., flow-insensitive analysis gains partial precision of flow-sensitive analysis via SSA
Define-and-Use pairs are explicit Enable more effective data facts storage and propagation in some on-demand tasks
- Some optimization tasks perform better on SSA (e.g., conditional constant propagation, global value numbering)

Why not SSA?

SSA may introduce too many variables and phi-functions
May introduce inefficiency problem when translating to machine code (due to copy operations)

How to build Basic Blocks?

INPUT: A sequence of three-address instructions of P

OUTPUT: A list of basic blocks of P

METHOD:

寻找P中的leaders

第一条指令是leader
所有条件或非条件跳转的目标是leader
条件或非条件跳转之后的第一条指令

构建基本块

每个leader到下一个leader之间的部分就是基本块

Quartz 4

探索