Compiler Design, Spring 2017
6.0 Runtime system and object layout
Dr. Zoltán Majó, Compiler Group, Java HotSpot Virtual Machine, Oracle Corporation
Runtime system
  Some open issues from last time
    Handling stack overflows
    Safepoint support
  Good examples of compiler-runtime coupling
  Not required for JavaLi
Generated code
  static int m(int i) { return m(i + 1); }

  SOE.m(I)I [0x00007f1eaa4851c0, 0x00007f1eaa485228] 104 bytes
  [Entry Point]
  [Verified Entry Point]
  [Constants]
    # {method} {0x00007f1e6fc01250} 'm' '(I)I' in 'SOE'
    # parm0: rsi = int
    # [sp+0x20] (sp of caller)
    ;; N1: # B1 <- B2 B3 Freq: 1
    ;; B1: # B3 B2 <- BLOCK HEAD IS JUNK Freq: 1
    0x00007f1eaa4851c0: mov %eax,-0x16000(%rsp)
    0x00007f1eaa4851c7: push %rbp
    0x00007f1eaa4851c8: sub $0x10,%rsp            ;*synchronization entry
                                                  ; - SOE::m@-1 (line 3)
    0x00007f1eaa4851cc: inc %esi                  ;*iadd {reexecute=0 rethrow=0 return_oop=0}
                                                  ; - SOE::m@2 (line 3)
    0x00007f1eaa4851ce: nop
    0x00007f1eaa4851cf: callq 0x00007f1ea2ad29a0  ; ImmutableOopMap{}
                                                  ;*invokestatic m {reexecute=0 rethrow=0 return_oop=0}
                                                  ; - SOE::m@3 (line 3)
                                                  ; {static_call}
    ;; B2: # N1 <- B1 Freq: 0.99998
    0x00007f1eaa4851d4: add $0x10,%rsp
    0x00007f1eaa4851d8: pop %rbp
    0x00007f1eaa4851d9: test %eax,0x1c8dee21(%rip)  # 0x00007f1ec6d64000
                                                  ; {poll_return}
    0x00007f1eaa4851df: retq

  Categories of assembly instructions:
    program semantics
    reserve/release stack space
    set up frame
    stack overflow detection
    safepoint support
Stack overflow support
  Requires a single instruction at the beginning of every compiled method
    mov %eax,-0x16000(%rsp)    // AT&T syntax
  How does it work?
    Hint: Mechanism similar to implicit null checks
    Hint: Mechanism uses memory protection
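The observable effect of this check can be reproduced from plain Java. Following the hints above, a common description of the mechanism is that the store touches an address a fixed distance below the stack pointer; when that address falls into a protected region at the end of the thread stack, the resulting memory fault is caught by the VM and reported as a StackOverflowError. The sketch below only demonstrates the Java-level outcome; the class name StackOverflowDemo and the depth counter are illustrative and not part of the slides.

```java
class StackOverflowDemo {
    // Counts how many frames were entered before the overflow (illustrative only).
    static int depth = 0;

    // Same shape as SOE.m from the slides: unbounded recursion.
    static int m(int i) {
        depth++;
        return m(i + 1);
    }

    // Returns true if the runaway recursion surfaced as a StackOverflowError.
    static boolean overflows() {
        depth = 0;
        try {
            m(0);
            return false; // never reached: the recursion does not terminate normally
        } catch (StackOverflowError e) {
            return depth > 0;
        }
    }

    public static void main(String[] args) {
        System.out.println("overflow caught: " + overflows());
        System.out.println("frames entered:  " + depth);
    }
}
```

The number of frames entered depends on the stack size and on whether m runs interpreted or compiled, so it differs between runs and machines.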
Safepoint support
  Garbage collector (GC) modifies the heap graph
    Not OK while the program is running (in general)
    Program must stop to let the GC proceed
  GC signals the need for a safepoint
    E.g., a safepoint is needed because the VM is running low on memory
    Once all program threads have stopped, the GC can proceed
  Notification mechanism: polling
    Program polls regularly to check whether it must go to a safepoint
    How to poll efficiently?
Safepoint support (cont'd)
  Compiler inserts an extra instruction before method returns
    test %eax,0x1c8dee21(%rip)    # 0x00007f1ec6d64000
  Also at loop back-edges (ensures short wait times)
  How does it work?
    Hint: Mechanism similar to implicit null checks
    Hint: Mechanism uses memory protection
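The polling protocol itself can be sketched in Java with an explicit flag. This is a simplified model under stated assumptions: the generated code above polls by reading a fixed memory address, whereas this sketch checks a volatile boolean at the loop back-edge. All names (SafepointPollDemo, safepointRequested, stopAtSafepoint) are hypothetical.

```java
import java.util.concurrent.CountDownLatch;

class SafepointPollDemo {
    static volatile boolean safepointRequested;
    static volatile boolean done;
    static volatile long counter; // written only by the worker thread

    // Requests a safepoint, waits for the worker to stop, and checks that
    // the worker makes no progress while it is stopped.
    static boolean stopAtSafepoint() throws InterruptedException {
        safepointRequested = false;
        done = false;
        counter = 0;
        CountDownLatch reached = new CountDownLatch(1);
        CountDownLatch resume = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            while (!done) {
                counter++;                 // "program semantics"
                if (safepointRequested) {  // poll at the loop back-edge
                    reached.countDown();   // tell the VM we have stopped
                    try {
                        resume.await();    // block until the VM is finished
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        });
        worker.start();

        safepointRequested = true;         // the "VM" asks for a safepoint
        reached.await();                   // all (here: one) threads stopped
        long before = counter;
        Thread.sleep(20);                  // stand-in for the GC doing its work
        long after = counter;

        safepointRequested = false;
        done = true;
        resume.countDown();                // let the worker continue and exit
        worker.join();
        return before == after;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker idle during safepoint: " + stopAtSafepoint());
    }
}
```

An explicit flag costs a load, a compare, and a branch per poll. Polling by touching a page whose protection the VM can change, as the hints suggest, keeps the fast path to a single instruction.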
Outline
  6.1 Introduction
  6.2 Basic types
  6.3 Object instances and references
  6.4 Inheritance
  6.5 Program startup
  6.6 Handling stack overflows & GC safepoints
  Extra question
Extra question
  Look (again) at the code generated by the HotSpot JVM for the SOE::m(int i) method.
  Why did the compiler generate two method entry points?
    Entry Point
    Verified Entry Point
Extra question (cont'd)
  Observation: For the static method there is no difference between the two entry points. (They both act as a label for the first instruction of the method.)
  Let's change method m(int i) from static to virtual (instance method) and see what happens.
    int m(int i) { return m(i + 1); }
Generated code
  int m(int i) { return m(i + 1); }

  SOE.m(I)I [0x00007f354a4851c0, 0x00007f354a485228] 104 bytes
  [Entry Point]
  [Constants]
    # {method} {0x00007f350db00258} 'm' '(I)I' in 'SOE'
    # this: rsi:rsi = 'SOE'
    # parm0: rdx = int
    # [sp+0x20] (sp of caller)
    ;; N34: # B1 <- BLOCK HEAD IS JUNK Freq: 1
    0x00007f354a4851c0: cmp 0x8(%rsi),%rax
    0x00007f354a4851c4: jne 0x00007f3542ad2220    ; {runtime_call ic_miss_stub}
    0x00007f354a4851ca: nop
    0x00007f354a4851cb: nop
    0x00007f354a4851cc: nop
    0x00007f354a4851cd: nop
    0x00007f354a4851ce: nop
    0x00007f354a4851cf: nop
  [Verified Entry Point]
    ;; B1: # B3 B2 <- BLOCK HEAD IS JUNK Freq: 1
    0x00007f354a4851d0: mov %eax,-0x16000(%rsp)
    0x00007f354a4851d7: push %rbp
    0x00007f354a4851d8: sub $0x10,%rsp            ;*synchronization entry
                                                  ; - SOE::m@-1 (line 3)
    0x00007f354a4851dc: inc %edx                  ;*iadd {reexecute=0 rethrow=0 return_oop=0}
Inline caching
  Call sites of method m() may contain a direct call to SOE::m()
    Costly vtable lookup avoided
  But m() is virtual: the call can be to m() in a sub/superclass of SOE
    Verification needed
  (Unverified) entry point performs a dynamic type check
    Is the receiver's type the same as the type declaring the method?
      YES: Continue execution
      NO: Update call site to perform proper vtable lookup
        Other actions possible as well
  Approach is called inline caching
    Implemented in different language VMs (Self, HotSpot, V8)
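The check-then-dispatch logic can be modeled in a few lines of Java. This is a sketch under several assumptions: the type check is placed at the call site rather than in the callee's unverified entry point, a HashMap stands in for the vtable, and on a miss the site re-binds to the new receiver type (one of the "other actions" possible). The names InlineCacheDemo, CallSite, VTABLE, and SubSOE are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

class InlineCacheDemo {
    static class SOE { }
    static class SubSOE extends SOE { }

    // Stand-in for the vtable: receiver class -> implementation of m(int).
    static final Map<Class<?>, IntUnaryOperator> VTABLE = new HashMap<>();
    static {
        VTABLE.put(SOE.class, i -> i + 1);
        VTABLE.put(SubSOE.class, i -> i + 100);
    }

    // One call site with a single-entry (monomorphic) inline cache.
    static final class CallSite {
        Class<?> expected;        // receiver type the cached target is valid for
        IntUnaryOperator target;  // cached "direct call" target
        int misses;

        int invoke(Object receiver, int arg) {
            if (receiver.getClass() != expected) {
                // Type check failed: do the slow lookup and re-bind the site.
                misses++;
                expected = receiver.getClass();
                target = VTABLE.get(expected);
            }
            // Fast path: call the cached target without a table lookup.
            return target.applyAsInt(arg);
        }
    }

    public static void main(String[] args) {
        CallSite site = new CallSite();
        System.out.println(site.invoke(new SOE(), 1));
        System.out.println(site.invoke(new SOE(), 2));
        System.out.println(site.invoke(new SubSOE(), 1));
        System.out.println("misses: " + site.misses);
    }
}
```

As long as every receiver at the site has the same type, only the first call pays for the lookup.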
Compiler Design, Spring 2017
7.0 Code generation
Code generation
  Input: IR (operator trees, AST)
  Output: Machine code (assembly)
  Crucial part of compiler
    In many compilers, most effort is devoted to code generation (and to establishing that the conditions for optimization are met)
    Many compilers start with existing front-end/semantic analyzers
  Key requirement: correctness
  Sometimes: three requirements instead
    Performance, performance, performance
    Until correctness becomes important again (usually very quickly)
Code generator
  Major activities of a code generator
    1. Code selection
       Decide on machine instruction(s) that are generated for a given IR node or group of IR nodes
Code selection
  Consider x = y - x - 1
  Assume x is in %eax, y is in %ecx
  One possible code sequence (AT&T syntax)
    subl %eax, %ecx
    movl %ecx, %eax
    decl %eax
    movl %eax, x
  Another option
    notl %eax
    addl %ecx, %eax
    movl %eax, x
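Both sequences compute the same value because, in two's complement arithmetic, ~x equals -x - 1, so y + ~x equals y - x - 1 even when intermediate results wrap around. A small Java check of this identity (class and method names are illustrative):

```java
class CodeSelectionIdentity {
    // Mirrors the first sequence: subtract, then decrement.
    static int viaSubDec(int x, int y) {
        int t = y - x;   // subl %eax, %ecx
        return t - 1;    // decl %eax
    }

    // Mirrors the second sequence: bitwise not, then add.
    static int viaNotAdd(int x, int y) {
        int t = ~x;      // notl %eax
        return t + y;    // addl %ecx, %eax
    }

    // Checks that both sequences agree on a set of sample inputs,
    // including values where the arithmetic wraps around.
    static boolean agree() {
        int[] samples = {0, 1, -1, 42, -42, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int x : samples) {
            for (int y : samples) {
                if (viaSubDec(x, y) != viaNotAdd(x, y)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("sequences agree: " + agree());
    }
}
```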
Code selection
  Swap x, y
  Assume x is in %eax, y is in %ecx
  Option 1
    movl %eax, %edx
    movl %ecx, %eax
    movl %edx, %ecx
  Option 2
    xchg %eax, %ecx
  Finding good sequences may be far from easy
    Not always clear which sequence is fastest
    May depend on processor model
Code generator
  Major activities of a code generator
    1. Code selection
       Decide on machine instruction(s) that are generated for a given IR node or group of IR nodes
    2. Code scheduling
       Determines order of execution of (unrelated) instructions
Code scheduling
  Not much to do for a single small tree
  Options for multiple trees
  Consider
    a = b + c;
    x = y + 1;
  Assume that a memory read access takes 2 cycles
Code scheduling (cont'd)
  a = b + c;
  x = y + 1;
  Option 1
    movl b, %eax
    addl c, %eax
    movl %eax, a
    movl y, %ecx
    incl %ecx
    movl %ecx, x
  Option 2
    movl b, %eax
    movl y, %ecx
    addl c, %eax
    incl %ecx
    movl %eax, a
    movl %ecx, x
  Not always clear which option is faster
    Processors reorder instructions on-the-fly
    Caching affects performance as well
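To see why the order can matter, the two options can be run through a toy timing model. The model is an assumption made for illustration, not a description of any real processor: one instruction issues per cycle in program order, an instruction waits until its source register is ready, instructions that read memory take 2 cycles, and everything else takes 1 cycle. All names (ScheduleModel, Instr, cycles) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class ScheduleModel {
    static final class Instr {
        final String dest;   // register written, or null for a store
        final String src;    // register read, or null if none
        final int latency;   // cycles until the result is available
        Instr(String dest, String src, int latency) {
            this.dest = dest; this.src = src; this.latency = latency;
        }
    }

    static Instr load(String reg)   { return new Instr(reg, null, 2); }  // movl mem, reg
    static Instr addMem(String reg) { return new Instr(reg, reg, 2); }   // addl mem, reg
    static Instr inc(String reg)    { return new Instr(reg, reg, 1); }   // incl reg
    static Instr store(String reg)  { return new Instr(null, reg, 1); }  // movl reg, mem

    // In-order, single-issue model: returns the cycle in which the last result is done.
    static int cycles(Instr... code) {
        Map<String, Integer> readyAt = new HashMap<>();
        int nextIssue = 0;
        int finish = 0;
        for (Instr in : code) {
            int issue = nextIssue;
            if (in.src != null) {
                issue = Math.max(issue, readyAt.getOrDefault(in.src, 0)); // stall on operand
            }
            int done = issue + in.latency;
            if (in.dest != null) {
                readyAt.put(in.dest, done);
            }
            finish = Math.max(finish, done);
            nextIssue = issue + 1;
        }
        return finish;
    }

    static int option1() {
        return cycles(load("eax"), addMem("eax"), store("eax"),
                      load("ecx"), inc("ecx"), store("ecx"));
    }

    static int option2() {
        return cycles(load("eax"), load("ecx"), addMem("eax"),
                      inc("ecx"), store("eax"), store("ecx"));
    }

    public static void main(String[] args) {
        System.out.println("option 1: " + option1() + " cycles");
        System.out.println("option 2: " + option2() + " cycles");
    }
}
```

Under these assumptions Option 1 needs 9 cycles and Option 2 needs 6, because the second load fills the time the first load is still outstanding. On a processor that reorders instructions itself, the difference may shrink or disappear, which is the caveat the slide makes.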
Code generator
  Major activities of a code generator
    1. Code selection
       Decide on machine instruction(s) that are generated for a given IR node or group of IR nodes
    2. Code scheduling
       Determines order of execution of (unrelated) instructions
    3. Register allocation and assignment
       Allocation: decide which operand goes into a register
       Assignment: decide which register holds a given operand
Register allocation
  Consider (in some class C)
    int inc(int x) { return x + 1; }
    int foo(int p) {
      int a, b, c;
      a = p + 1;
      b = inc(a);
      c = a + b;
      return c;
    }
Register allocation
  Assume that method inc() uses (and destroys) the %eax register
    Return value in %eax
  Method foo() has two options
    int inc(int x) { return x + 1; }
    int foo(int p) {
      int a, b, c;
      a = p + 1;
      b = inc(a);
      c = a + b;
      return c;
    }
Options for register allocation
  Option 1: use %eax for the first statement
    Evaluate (p+1) into %eax
    Invoke inc()
    Reload a into %ebx
    Add %ebx to %eax
  Option 2: don't use %eax for the first statement
    Evaluate (p+1) into %ebx
    Invoke inc()
    Add %ebx and %eax
  The 2nd option saves (re)loading a
    int inc(int x) { return x + 1; }
    int foo(int p) {
      int a, b, c;
      a = p + 1;
      b = inc(a);
      c = a + b;
      return c;
    }
  Register requirements of inc() may not be known at the time that foo() is compiled
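The reason a needs special care is that its value is still needed after the call to inc(). One way to find such variables is a backward liveness pass over the statements. The sketch below is a minimal illustration for straight-line code only; the statement encoding and all names (LiveAcrossCall, Stmt, liveAcrossCalls) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class LiveAcrossCall {
    static final class Stmt {
        final String def;         // variable assigned, or null
        final List<String> uses;  // variables read
        final boolean isCall;     // true if the statement invokes a method
        Stmt(String def, boolean isCall, String... uses) {
            this.def = def; this.isCall = isCall; this.uses = Arrays.asList(uses);
        }
    }

    // Returns the variables whose values must survive at least one call.
    static Set<String> liveAcrossCalls(List<Stmt> body) {
        Set<String> live = new HashSet<>();      // live after the current statement
        Set<String> result = new TreeSet<>();
        for (int i = body.size() - 1; i >= 0; i--) {
            Stmt s = body.get(i);
            if (s.isCall) {
                Set<String> across = new HashSet<>(live);
                across.remove(s.def);            // the call itself produces this value
                result.addAll(across);
            }
            live.remove(s.def);                  // a definition ends the live range
            live.addAll(s.uses);                 // a use (re)starts it
        }
        return result;
    }

    // The body of foo() from the slide.
    static List<Stmt> foo() {
        List<Stmt> body = new ArrayList<>();
        body.add(new Stmt("a", false, "p"));      // a = p + 1;
        body.add(new Stmt("b", true, "a"));       // b = inc(a);
        body.add(new Stmt("c", false, "a", "b")); // c = a + b;
        body.add(new Stmt(null, false, "c"));     // return c;
        return body;
    }

    public static void main(String[] args) {
        System.out.println("live across calls: " + liveAcrossCalls(foo()));
    }
}
```

For foo() the pass reports only a, which is why keeping a in a register that inc() does not destroy (Option 2) avoids the reload.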
Bad (interesting) news
  Code selection, register allocation, code scheduling
    Code selection depends on code scheduling
    Code scheduling depends on register allocation
    Register allocation depends on code selection
  Close coupling
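Different register use for
  a = b + c;
  x = y + 1;
  Option 1
    movl b, %eax
    addl c, %eax
    movl %eax, a
    movl y, %eax
    incl %eax
    movl %eax, x
  Option 2
    movl b, %eax
    movl y, %eax
    addl c, %eax
    incl %eax
    movl %eax, a
    movl %eax, x
  With %eax as the only register, Option 2 no longer computes a = b + c: the load of y overwrites b before the add. This interleaved schedule needs a second register.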