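# Intermediate Representation (IR)

## Review of Compiler

Source code -> lexical analysis -> syntax analysis -> semantic analysis -> intermediate code generation -> **(static analysis)** -> target code generation -> machine code

For more on compiler principles, consult a compiler textbook or see my column.

### AST vs. three-address code

#### AST

1. high-level and close to the grammar
2. language dependent
3. suitable for fast type checking
4. lacks control-flow information

#### IR (3-address)

1. low-level and close to machine code
2. language independent
3. simple, compact and uniform
4. contains control-flow information

## Soot and Jimple

Soot is an open-source static analysis framework for Java, hosted on GitHub. Its IR is called Jimple.

e.g. 1: for loop

```java
package xjtu.edu.e1;

public class ForLoop3AC {
    public static void main(String[] args) {
        int x = 0;
        for (int i = 0; i < 10; i++) {
            x = x + 1;
        }
    }
}

// Corresponding Jimple:
public static void main(java.lang.String[])
{
    java.lang.String[] r0;
    int i1;

    r0 := @parameter0: java.lang.String[];
    i1 = 0;                      // x has been optimized away: it is dead (never read)

label1:
    if i1 >= 10 goto label2;     // conditional goto
    i1 = i1 + 1;
    goto label1;                 // unconditional goto

label2:
    return;
}
```

e.g. 2: method call

```java
package example;

public class MethodCall3AC {
    String combine(String string1, String string2) {
        return string1 + " " + string2;
    }

    public static void main(String[] args) {
        MethodCall3AC mc = new MethodCall3AC();
        System.out.println(mc.combine("xjtu", "se"));
    }
}

// Jimple for combine:
java.lang.String combine(java.lang.String, java.lang.String)
{
    example.MethodCall3AC r0;
    java.lang.String r1, r2, $r7;                  // $ marks a temporary variable in Jimple
    java.lang.StringBuilder $r3, $r4, $r5, $r6;

    r0 := @this: example.MethodCall3AC;
    r1 := @parameter0: java.lang.String;
    r2 := @parameter1: java.lang.String;
    $r3 = new java.lang.StringBuilder;
    // Method signature format: <ClassName: returnType methodName(paramTypes)>
    specialinvoke $r3.<java.lang.StringBuilder: void <init>()>();
    // Calls StringBuilder.append: appends r1 to the empty builder and returns the builder.
    // The next two lines work the same way.
    $r4 = virtualinvoke $r3.<java.lang.StringBuilder: java.lang.StringBuilder append(java.lang.String)>(r1);
    $r5 = virtualinvoke $r4.<java.lang.StringBuilder: java.lang.StringBuilder append(java.lang.String)>(" ");
    $r6 = virtualinvoke $r5.<java.lang.StringBuilder: java.lang.StringBuilder append(java.lang.String)>(r2);
    $r7 = virtualinvoke $r6.<java.lang.StringBuilder: java.lang.String toString()>();
    return $r7;
}

// Jimple for main:
public static void main(java.lang.String[])
{
    java.lang.String[] r0;
    // The temporaries start at $r3, presumably because Soot converted combine and
    // main together and r1, r2 were already taken.
    example.MethodCall3AC $r3;
    java.io.PrintStream $r4;
    java.lang.String $r5;

    r0 := @parameter0: java.lang.String[];
    $r3 = new example.MethodCall3AC;
    specialinvoke $r3.<example.MethodCall3AC: void <init>()>();
    // out is a static final field of java.lang.System, so it is read directly, not invoked.
    $r4 = <java.lang.System: java.io.PrintStream out>;
    $r5 = virtualinvoke $r3.<example.MethodCall3AC: java.lang.String combine(java.lang.String,java.lang.String)>("xjtu", "se");
    virtualinvoke $r4.<java.io.PrintStream: void println(java.lang.String)>($r5);
    return;
}
```

e.g. 3: class declaration and use

```java
package example;

public class class3AC {
    public static final double pi = 3.14;

    public static void main(String[] args) {
    }
}

// Corresponding Jimple:
public class example.class3AC extends java.lang.Object
{
    public static final double pi;

    // Default constructor generated by the compiler
    public void <init>()
    {
        example.class3AC r0;

        r0 := @this: example.class3AC;
        specialinvoke r0.<java.lang.Object: void <init>()>();
        return;
    }

    public static void main(java.lang.String[])
    {
        java.lang.String[] r0;

        r0 := @parameter0: java.lang.String[];
        return;
    }

    // Static initializer: assigns the static field
    public static void <clinit>()
    {
        <example.class3AC: double pi> = 3.14;
        return;
    }
}
```

### The four kinds of method invocation in JVM bytecode

1. invokespecial: calls constructors, superclass methods, and private methods (resolved toward the parent class)
2. invokevirtual: calls instance methods with virtual dispatch (resolved toward the subclass)
3. invokeinterface: calls interface methods; must check that the receiver implements the interface
4. invokestatic: calls static methods

Java 7 added a fifth instruction, invokedynamic, which these notes do not cover.

## Static Single Assignment (SSA)

Every definition gets a distinct name, so each variable is assigned exactly once.

```text
3AC:
p = a + b
q = p - c
p = q * e
p = p * c

SSA:
p1 = a + b
q1 = p1 - c
p2 = q1 * e
p3 = p2 * c
```

When several control-flow paths merge, a phi-function selects among the versions of the same variable arriving from the different paths and assigns the result to a new version of that variable. Definitions after the merge point use the new version.

SSA may introduce too many phi-functions.

## Control Flow Graph (CFG)

### Basic Block

1. contains three-address code
2. can be entered only at its first instruction
3. has only one exit: control can leave only at its last instruction

```text
BB1: (1)  a = input
     (2)  b = a + 2
BB2: (3)  c = a * b
     (4)  if c > 20 goto (7)
BB3: (5)  b = b + 1
     (6)  goto (3)
BB4: (7)  d = b / a
     (8)  p = b - d
     (9)  if d == p goto (11)
BB5: (10) goto (3)
BB6: (11) return
```

In short, an instruction is the first line (leader) of a BB if it is:

(1) the first instruction of the whole program
(2) the target of a jump
(3) the instruction immediately after a jump

### Building the flow edges

1. Every BB falls through to the next BB by default, unless it ends in an unconditional goto.
2. A BB that ends in a goto (conditional or unconditional) also gets an edge to the BB that contains the jump target.

Applying these two rules to the example gives:

```text
BB1 -> BB2
BB2 -> BB3 (fall-through)    BB2 -> BB4 (goto 7)
BB3 -> BB2 (goto 3)
BB4 -> BB5 (fall-through)    BB4 -> BB6 (goto 11)
BB5 -> BB2 (goto 3)
```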
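
The leader rules and the two flow rules can be checked mechanically. Below is a minimal Java sketch, not Soot's implementation: it hard-codes the 11-instruction example above as two arrays and derives the basic blocks and edges from them. The class name `BasicBlockSketch`, the `TARGET`/`UNCONDITIONAL` encoding, and both method names are made up for illustration. It also assumes that `return` appears only as the last instruction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class BasicBlockSketch {
    // Hypothetical encoding of the example (index 0 is unused, instructions are 1..11).
    // TARGET[i] is the jump target of instruction i, or 0 if i is not a jump.
    static final int[] TARGET = {0, 0, 0, 0, 7, 0, 3, 0, 0, 11, 3, 0};
    // UNCONDITIONAL[i] is true when instruction i is a plain goto (no fall-through).
    static final boolean[] UNCONDITIONAL =
        {false, false, false, false, false, false, true, false, false, false, true, false};

    /** Leaders: the first instruction, every jump target, every instruction after a jump. */
    static List<Integer> findLeaders(int[] target) {
        int n = target.length - 1;
        TreeSet<Integer> leaders = new TreeSet<>();
        leaders.add(1);
        for (int i = 1; i <= n; i++) {
            if (target[i] > 0) {
                leaders.add(target[i]);
                if (i + 1 <= n) {
                    leaders.add(i + 1);
                }
            }
        }
        return new ArrayList<>(leaders);
    }

    /** Edges as "BBx->BBy" strings, built from the two flow rules. */
    static List<String> buildEdges(int[] target, boolean[] unconditional) {
        List<Integer> leaders = findLeaders(target);
        int n = target.length - 1;
        List<String> edges = new ArrayList<>();
        for (int b = 0; b < leaders.size(); b++) {
            boolean hasNext = b + 1 < leaders.size();
            // The last instruction of a block sits just before the next leader.
            int last = hasNext ? leaders.get(b + 1) - 1 : n;
            // Rule 1: fall through to the next block unless the block ends in a plain goto.
            if (hasNext && !unconditional[last]) {
                edges.add("BB" + (b + 1) + "->BB" + (b + 2));
            }
            // Rule 2: a jump adds an edge to the block whose leader is the jump target.
            if (target[last] > 0) {
                edges.add("BB" + (b + 1) + "->BB" + (leaders.indexOf(target[last]) + 1));
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        System.out.println(findLeaders(TARGET));
        System.out.println(buildEdges(TARGET, UNCONDITIONAL));
    }
}
```

Running `main` prints the leaders `[1, 3, 5, 7, 10, 11]`, which are exactly the first lines of BB1 to BB6, followed by the edges `[BB1->BB2, BB2->BB3, BB2->BB4, BB3->BB2, BB4->BB5, BB4->BB6, BB5->BB2]`. BB3 and BB5 end in a plain goto, so neither has a fall-through edge.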