Harry Xu May 2012
Complex, concurrent software. Precision (no false positives). Find real bugs in real executions.
Need to modify the JVM (e.g., object layout, GC, or ISA-level code). Need to demonstrate realism (usually performance).
Otherwise use RoadRunner, BCEL, Pin, LLVM,
Keeping track of stuff as the program executes? Change application behavior (add instrumentation); store per-object/per-field metadata; piggyback on GC; uninterruptible code. JVM written in Java?!
Jikes RVM resources: Guide, Research Archive, research mailing list
Jikes RVM source code -> boot image writer (run with another JVM) -> boot image (native code + initial heap space)
  Dynamic compilers
  Build configurations:
    BaseBase (prototype)
    BaseAdaptive (prototype-opt)
    FullAdaptive (development)
    FastAdaptive (production)
  Trade-offs between configurations: testing, faster builds, faster runs, performance
  Edit with Eclipse (see Guide)
Change application behavior (add instrumentation)
Bytecode -> baseline compiler -> native code
  Each bytecode -> several x86 instructions (BaselineCompilerImpl.java)
Profiling -> adaptive optimization system -> optimizing compiler -> (faster) native code
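The baseline-then-optimize flow above can be pictured with a toy model. This is only a sketch: the class name, the invocation counter, and `HOT_THRESHOLD` are hypothetical, and a real adaptive optimization system would decide from profiling data rather than a fixed call count.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of baseline-then-optimize recompilation. All names are hypothetical. */
class TieredCompilationSketch {
    enum Tier { BASELINE, OPTIMIZED }

    // Hypothetical fixed threshold; stands in for a real profiling-based decision.
    static final int HOT_THRESHOLD = 3;

    private final Map<String, Integer> invocationCounts = new HashMap<>();
    private final Map<String, Tier> compiledTier = new HashMap<>();

    /** Records one invocation and returns the tier this invocation ran at. */
    Tier invoke(String method) {
        // First call: compile quickly with the baseline tier.
        Tier tier = compiledTier.computeIfAbsent(method, m -> Tier.BASELINE);
        int count = invocationCounts.merge(method, 1, Integer::sum);
        // The method looks hot: recompile so that later calls run faster code.
        if (tier == Tier.BASELINE && count >= HOT_THRESHOLD) {
            compiledTier.put(method, Tier.OPTIMIZED);
        }
        return tier;
    }

    /** Returns the tier the method is currently compiled at, or null if never invoked. */
    Tier tierOf(String method) {
        return compiledTier.get(method);
    }

    public static void main(String[] args) {
        TieredCompilationSketch vm = new TieredCompilationSketch();
        for (int i = 0; i < 5; i++) {
            System.out.println("call " + i + " ran at " + vm.invoke("hot()"));
        }
    }
}
```

The point for instrumentation is that the same method may be compiled more than once, by different compilers, so analysis code generally has to be added in both.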
Bytecode -> optimizing compiler -> (faster) native code
  Bytecode -> HIR -> LIR -> MIR -> (faster) native code
    HIR: resembles bytecode
    LIR: resembles typical compiler IR (3-address code)
    MIR: resembles assembly code
  Opt levels: 0, 1, 2 -> (even faster) native code
  Add instrumentation at reads, writes, allocation, synchronization: ExpandRuntimeServices.java
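To make "add instrumentation at reads, writes, allocation, synchronization" concrete, here is a minimal sketch of such a pass over a made-up IR. None of it is Jikes RVM API: `Op`, `Instr`, and the hook names (`onRead`, `onWrite`, `onAlloc`, `onSync`) are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy instrumentation pass over a made-up IR; nothing here is Jikes RVM API. */
class InstrumentationPassSketch {
    enum Op { GETFIELD, PUTFIELD, NEW, MONITORENTER, ADD, CALL_HOOK }

    static final class Instr {
        final Op op;
        final String operand;
        Instr(Op op, String operand) { this.op = op; this.operand = operand; }
        @Override public String toString() { return op + " " + operand; }
    }

    /** Builds a hypothetical call to an analysis hook describing the target instruction. */
    private static Instr hook(String hookName, Instr target) {
        return new Instr(Op.CALL_HOOK, hookName + "(" + target.operand + ")");
    }

    /** Returns a copy of the code with a hook call next to every instruction of interest. */
    static List<Instr> instrument(List<Instr> code) {
        List<Instr> out = new ArrayList<>();
        for (Instr i : code) {
            switch (i.op) {
                // Reads, writes, and lock acquires: hook goes before the instruction.
                case GETFIELD:     out.add(hook("onRead", i));  out.add(i); break;
                case PUTFIELD:     out.add(hook("onWrite", i)); out.add(i); break;
                case MONITORENTER: out.add(hook("onSync", i));  out.add(i); break;
                // Allocation: hook goes after, once the new object exists.
                case NEW:          out.add(i); out.add(hook("onAlloc", i)); break;
                // Everything else is copied unchanged.
                default:           out.add(i); break;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Instr> code = Arrays.asList(
            new Instr(Op.NEW, "Foo"),
            new Instr(Op.PUTFIELD, "Foo.x"),
            new Instr(Op.ADD, "t1, t2"),
            new Instr(Op.GETFIELD, "Foo.x"));
        for (Instr i : instrument(code)) {
            System.out.println(i);
        }
    }
}
```

Placing the allocation hook after the `NEW` is a design choice of this sketch, made so the hook can refer to the freshly created object.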
Store per-object/per-field metadata
Object layout (low address -> high address): header | field0 | field1 | field2
  Scalar object: type info block | locking & GC | field0 | field1 | field2
  Array object: type info block | locking & GC | array length | elem0 | elem1
  Both are reached through an object reference.
Where to put per-object metadata:
  Steal bits
  Add a misc header word (MiscHeader.java): misc | type info block | locking & GC | field0 | field1 | field2
Example: a counter in the extra header word
  Layout: counter | type info block | locking & GC | field0 | field1 | field2
  Magic! Compiles down to three x86 instructions
  Gotcha: can't actually use the LSB of the leftmost word
  What's the problem with this code?
  [Code shown on these slides is not preserved in this text.]
  Layout with the extra word not used: not used | type info block | locking & GC | field0 | field1 | field2
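Since the slide's own code is not available here, the following is an independent sketch of keeping a counter in a header word while leaving its lowest bits alone. The 32-bit word, the choice of two reserved low bits, and all method names are assumptions made for illustration; they are not the actual Jikes RVM header layout or its magic API.

```java
/** Toy bit-packing of a counter into a 32-bit header word. The layout is hypothetical. */
class HeaderBitsSketch {
    // Assumption: the low 2 bits belong to some other user of the word and must be preserved.
    static final int RESERVED_BITS = 2;
    static final int RESERVED_MASK = (1 << RESERVED_BITS) - 1;

    /** Extracts the counter stored above the reserved bits. */
    static int getCounter(int word) {
        return word >>> RESERVED_BITS;
    }

    /** Returns the word with its counter replaced and its reserved bits unchanged. */
    static int setCounter(int word, int counter) {
        return (counter << RESERVED_BITS) | (word & RESERVED_MASK);
    }

    /** Plain read-modify-write increment; NOT atomic. */
    static int increment(int word) {
        return setCounter(word, getCounter(word) + 1);
    }

    public static void main(String[] args) {
        int word = 0b11;             // reserved bits set, counter is 0
        word = increment(word);
        word = increment(word);
        System.out.println("counter = " + getCounter(word));
        System.out.println("reserved bits = " + (word & RESERVED_MASK));
    }
}
```

Note that `increment` in this sketch is a plain read-modify-write, so if two threads updated the same word concurrently, one update could be lost.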
Storing a reference in the misc header word (misc | type info block | locking & GC | field0 | field1 | field2): What if GC moves the object? What if GC collects the object?
Piggyback on GC
GC tracing over objects laid out as: type info block | locking & GC | field0 | field1 | field2

  // Initially worklist populated with roots
  while worklist has elements
    Object obj = worklist.pop()
    foreach reference field obj.f
      obj.f = markandpossiblycopy(obj.f)
      worklist.push(obj.f)
With a misc header word (misc | type info block | locking & GC | field0 | field1 | field2), the tracing loop must also trace obj.misc. See TraceLocal.scanObject().

  // Initially worklist populated with roots
  while worklist has elements
    Object obj = worklist.pop()
    foreach reference field obj.f
      obj.f = markandpossiblycopy(obj.f)
      worklist.push(obj.f)
    obj.misc = markandpossiblycopy(obj.misc)
    worklist.push(obj.misc)
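The loop above, including the extra `misc` edge, can be tried out on a toy heap. This sketch makes several simplifying assumptions: `Obj`, `trace`, and `mark` are hypothetical names, the collector only marks and never copies (so the reference updates from the pseudocode are omitted), and an object is pushed only the first time it is marked so that cycles terminate.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy mark-only tracer that also follows a per-object "misc" reference. */
class TracingSketch {
    static final class Obj {
        final String name;
        final List<Obj> fields = new ArrayList<>();
        Obj misc;        // stands in for a reference stored in an extra header word
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    /** Marks everything reachable from the roots via fields and misc; returns how many objects were marked. */
    static int trace(List<Obj> roots) {
        Deque<Obj> worklist = new ArrayDeque<>();
        int markedCount = 0;
        // Initially the worklist is populated with the roots.
        for (Obj r : roots) {
            if (mark(r)) { worklist.push(r); markedCount++; }
        }
        while (!worklist.isEmpty()) {
            Obj obj = worklist.pop();
            for (Obj f : obj.fields) {
                if (mark(f)) { worklist.push(f); markedCount++; }
            }
            // The extra edge: without this, metadata reachable only via misc would be collected.
            if (mark(obj.misc)) { worklist.push(obj.misc); markedCount++; }
        }
        return markedCount;
    }

    /** Returns true only the first time a non-null object is marked. */
    static boolean mark(Obj o) {
        if (o == null || o.marked) return false;
        o.marked = true;
        return true;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), meta = new Obj("meta"), garbage = new Obj("garbage");
        a.fields.add(b);
        a.misc = meta;            // metadata hangs off the header, not a normal field
        meta.fields.add(a);       // cycle back to a
        System.out.println("marked: " + trace(Arrays.asList(a)));
        System.out.println("garbage marked? " + garbage.marked);
    }
}
```

In a copying collector the same loop would also have to store the possibly updated address back into each field and into `misc`, as the pseudocode does.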
For objects without a misc word (type info block | locking & GC | field0 | field1 | field2), the tracing loop is the unmodified one shown earlier. See TraceLocal.processNode().
Uninterruptible code
Normal application code can be interrupted:
  Allocation -> GC
  Synchronization & yield points -> join a GC
Some VM code shouldn't be interrupted:
  Heap etc. in inconsistent state
Most instrumentation can't be interrupted:
  Reads & writes aren't GC-safe points
@Uninterruptible
static void mymethod(Object o) {
  // No allocation or synchronization
  // No calls to interruptible methods
}
If uninterruptible code must allocate, defer GC around the allocation:

@Uninterruptible
static void mymethod(Object o) {
  currentthread.defergc = true;
  Metadata m = new Metadata();
  currentthread.defergc = false;
  setmischeader(o, offset, m);
}
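One way to picture the defer-GC pattern is the toy model below. It is not Jikes RVM code, and the semantics are an assumption: here a collection requested while the flag is set is simply postponed until the flag is cleared. The names `DeferGcSketch`, `setDeferGC`, and `allocate`, and the `heapFull` parameter, are hypothetical.

```java
/** Toy model of deferring GC around an allocation; not Jikes RVM code. */
class DeferGcSketch {
    private boolean deferGC;     // hypothetical per-thread flag
    private boolean gcPending;   // a collection was requested while deferred
    private int collections;     // how many collections have run

    /** Sets or clears the flag; clearing it runs any collection that was postponed. */
    void setDeferGC(boolean defer) {
        deferGC = defer;
        if (!defer && gcPending) {
            gcPending = false;
            collect();
        }
    }

    /** Allocates an object; a full heap triggers a collection unless GC is deferred. */
    Object allocate(boolean heapFull) {
        if (heapFull) {
            if (deferGC) {
                gcPending = true;   // remember the request instead of collecting now
            } else {
                collect();
            }
        }
        return new Object();
    }

    private void collect() { collections++; }

    int collections() { return collections; }

    boolean gcPending() { return gcPending; }

    public static void main(String[] args) {
        DeferGcSketch thread = new DeferGcSketch();
        thread.setDeferGC(true);
        Object metadata = thread.allocate(true);   // would normally trigger a GC
        System.out.println("collections while deferred: " + thread.collections());
        thread.setDeferGC(false);                   // postponed GC runs here
        System.out.println("collections after: " + thread.collections());
    }
}
```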
Use Jikes RVM when you need to modify JVM internals or need to demonstrate realism.
  Guide: overview of other tasks & components
  Jikes RVM Research Archive: dynamic analysis examples
  Research mailing list: help (especially for novices)
Object layout: extra bits or words in header; stealing bits from references; discuss magic here
Adding instrumentation: baseline & optimizing compilers; allocation sites, reads & writes; inlining instrumentation
Garbage collection: piggybacking on GC; new spaces
Low-level stuff: uninterruptible code; walking the stack
Concurrency: atomic stores; thread-local data