Inside Out: A Modern Virtual Machine Revealed

1 Inside Out: A Modern Virtual Machine Revealed John Coomes, Brian Goetz, Tony Printezis (Sun Microsystems)

3 Some Questions > Why not compile my program to an executable ahead of time? > Why can't I tell the VM what/when to compile? > Why not save and reuse the compiled code? > Why not include an explicit free() method? > Answers: coming up... 3

4 Virtual Machine > An abstraction layer Between application and system Provides virtual instruction set > Offers Portability: write once, run anywhere Security: VM is intermediary between application and system resources Performance: monitor application behavior; adapt, recompile, etc., as conditions change Productivity: enable higher-level abstractions

5 Virtual Machine > Virtual instruction set (bytecode) May offer higher-level abstractions than native instruction sets May constrain the programming model, usually for good reason! Example: no pointers, which enables significant performance optimizations and relocating garbage collection > Two opportunities for compilation Static compilation: source to bytecode Dynamic compilation: bytecode to native

6 Dynamic Compilation > Happens while program is running JIT == "just in time" compilation Part of VM, not javac > When a method is first run, bytecode is interpreted VM gathers profiling data If a method is hot enough It is compiled to native code by the JIT compiler Execute native code on next call to method Or transfer to native code while still in the method If necessary... Invalidate native code and recompile 6
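
The interpret-then-compile cycle on this slide can be mimicked in plain Java. The sketch below is only a toy model and does not reflect how any real VM is implemented: `HotMethod`, its threshold, and `deoptimize()` are hypothetical names, and "native code" is simulated by a second lambda.

```java
import java.util.function.IntUnaryOperator;

/** Toy model of "interpret first, compile when hot". All names are hypothetical. */
class HotMethod {
    private final IntUnaryOperator interpreted; // slow path, stands in for the bytecode interpreter
    private final IntUnaryOperator compiled;    // fast path, stands in for JIT-generated native code
    private final int threshold;                // invocations before the method counts as hot
    private int invocations = 0;                // profiling data gathered while interpreting
    private boolean useCompiled = false;

    HotMethod(IntUnaryOperator interpreted, IntUnaryOperator compiled, int threshold) {
        this.interpreted = interpreted;
        this.compiled = compiled;
        this.threshold = threshold;
    }

    int invoke(int arg) {
        if (useCompiled) {
            return compiled.applyAsInt(arg);
        }
        invocations++;
        if (invocations >= threshold) {
            useCompiled = true; // the "native" version runs on the next call
        }
        return interpreted.applyAsInt(arg);
    }

    /** Models invalidation: fall back to the interpreter and start profiling again. */
    void deoptimize() {
        useCompiled = false;
        invocations = 0;
    }

    boolean isCompiled() {
        return useCompiled;
    }
}
```

With a threshold of 3, the first three calls run the interpreted lambda and the fourth runs the compiled one; calling `deoptimize()` restarts the cycle.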

7 Dynamic Compilation > Compiler has more information, so it makes better decisions Precise knowledge of target hardware: number and type of CPUs, cache line size, NUMA, etc. Whole-program information: which classes are loaded right now Online profiling: which branches are taken and which are not, which loops are hot, whether a null object has been seen at a given point > Enables adaptive and speculative techniques Compile optimistically, recover if necessary

8 Dynamic Compilation > Very flexible! Can freely mix interpretation and native execution > Events can invalidate compiled code: new classes being loaded, change in program behavior (phase change), gathering more profiling data > Must be able to recover Deoptimization: interpret and/or recompile > Benefit: better long-term performance > Cost: less predictable short-term performance

9 VM Philosophy > Make the common case fast Don't worry about the uncommon/infrequent case > Defer optimization decisions until you have enough data Revisit prior decisions if new data warrants > Cede some control, and you will be rewarded No pointers: safety, fast allocation, efficient GC, many optimizations Dynamic compilation: better data, speculative optimizations

10 Virtual Method Calls > Virtual method calls can be more expensive than direct calls > C++ approach: Make programmer decide virtual vs. non-virtual Don't make me pay for what I don't use > VM approach: Make them fast when necessary (and possible) Let the VM agonize over low-level performance details 10

11 Virtual Method Calls > Overhead: 2 or 3 dependent loads + branch C++ is similar 11

12 Virtual Method Calls > VM has multiple tricks to speed up method dispatch Devirtualize: avoid redundant branch target computation Some methods are obviously monomorphic: static, final, and private methods Much more pipeline-friendly! Inlining: eliminate call overhead entirely Copy the callee code right into the calling method Inline decision based on time/space tradeoff heuristics Inline caching: cache target in generated code Fast receiver-type check plus predictable branch

13 Inlining > Not just about eliminating call overhead Provides optimizer with bigger blocks Enables other optimizations: hoisting, dead code elimination, common subexpression elimination, code motion, strength reduction, ... > When inlined, invocation overhead is zero > Most small methods are inlined Such as getters and setters Inlined code is more compact than a call Moral: don't fret about small methods

14 Inlining > What kinds of method calls can we inline? Nearly everything! static: always; final: always; private: always; virtual: often; reflective: sometimes

15 Example 1
    class BailoutFund {
        private int _spent; // in billions
        void bailout(String name, int amount) {
            DB.log(name);
            _spent += amount;
        }
    }

16 Example 1
    BailoutFund fund = new BailoutFund(...);
    fund.bailout("John's Insurance", 24);
Initial (Naïve Version)
    if (fund == null) throw new NullPointerException();
    fund.bailout("John's Insurance", 24); // virtual call

17 Example 1 > fund was just allocated and allocation was successful > JIT can prove fund is not null Can safely eliminate the null check
Step 1: Null Check Elimination
    if (fund == null) throw new NullPointerException(); // eliminated
    fund.bailout("John's Insurance", 24);

18 Example 1
Intermediate
    fund.bailout("John's Insurance", 24); // virtual call

19 Example 1 > VM knows that bailout() is not overridden Class hierarchy analysis > Can inline bailout() Avoids the virtual call
Step 2: Inline bailout()
    fund.bailout("John's Insurance", 24); // replaced by the two statements below
    DB.log("John's Insurance");
    fund._spent += 24;

20 Example 1
Final (Optimized Version)
    DB.log("John's Insurance");
    fund._spent += 24;

21 Speculative Optimization > If a method is truly monomorphic, we can inline it But how can we know? Classes are loaded dynamically No such thing as a "fully linked executable" Closed-world whole-program analysis is impossible Or is it? > VM can analyze the classes currently loaded And optimize based on that May have to back out optimizations If a subsequent class load would violate assumptions Then recompile the affected code 21
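
The "optimize on what is loaded now, back out later" idea can be sketched as bookkeeping over loaded classes. This is a toy model under stated assumptions, not a description of any VM's internals: `SpeculativeInliner`, the string call-site labels, and the per-method implementer count are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of class hierarchy analysis with recorded dependencies. Names are hypothetical. */
class SpeculativeInliner {
    // method name -> number of currently loaded classes that implement it
    private final Map<String, Integer> implementers = new HashMap<>();
    // method name -> compiled call sites that assumed a single implementer
    private final Map<String, List<String>> dependents = new HashMap<>();
    private final List<String> invalidated = new ArrayList<>();

    /** Called when a class that declares or overrides the method is loaded. */
    void classLoaded(String method) {
        int count = implementers.merge(method, 1, Integer::sum);
        if (count > 1) {
            // Assumption violated: back out every optimization that relied on it.
            List<String> sites = dependents.remove(method);
            if (sites != null) {
                invalidated.addAll(sites);
            }
        }
    }

    /** Returns true if the call could be inlined; records the dependency if so. */
    boolean tryInline(String callSite, String method) {
        if (implementers.getOrDefault(method, 0) != 1) {
            return false; // zero or several implementers: no speculation
        }
        dependents.computeIfAbsent(method, k -> new ArrayList<>()).add(callSite);
        return true;
    }

    /** Call sites whose compiled code must be discarded and recompiled. */
    List<String> invalidatedSites() {
        return invalidated;
    }
}
```

Loading a second implementer of a method moves every call site that inlined it onto the invalidated list, which corresponds to the "recompile the affected code" step.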

22 Standard Compiler Optimizations > Dynamic compilers can apply all the standard optimizations Dead code elimination Loop-invariant code hoisting Common subexpression elimination Loop unrolling and strength reduction Null check and array bounds check elimination > Inlining creates more opportunities for these optimizations

23 Example 2
    class BailoutItem {
        final private String _name;
        final private int _amount; // in billions
        BailoutItem(String name, int amount) {
            _name = name;
            _amount = amount;
        }
        String name() { return _name; }
        int amount() { return _amount; }
    }

24 Example 2
    BailoutFund fund = new BailoutFund(...);
    BailoutItem[] items = new BailoutItem[...];
    // fill in items[]...
    for (int i = 0; i < items.length; ++i)
        fund.bailout(items[i].name(), items[i].amount());
Initial (Naïve Version)
    for (int i = 0; i < items.length; ++i) {
        if (i < 0 || items.length <= i) throw new ArrayIndexOOBE();
        fund.bailout(items[i].name(), items[i].amount());
    }

25 Example 2 > Array bounds checks required by the language Potentially expensive > Can prove i is in range Eliminate array bounds check safely
Step 1: Bounds Check Elimination
    for (int i = 0; i < items.length; ++i) {
        if (i < 0 || items.length <= i) throw new ArrayIndexOOBE(); // eliminated
        fund.bailout(items[i].name(), items[i].amount());
    }

26 Example 2
Intermediate
    for (int i = 0; i < items.length; ++i) {
        fund.bailout(items[i].name(), items[i].amount());
    }

27 Example 2 > Hoist invariant code Likely into a register No need to access items.length every time around the loop
Step 2: Hoist items.length
    int length = items.length;
    for (int i = 0; i < items.length; ++i) { // items.length becomes length
        fund.bailout(items[i].name(), items[i].amount());
    }

28 Example 2
Intermediate
    int length = items.length;
    for (int i = 0; i < length; ++i) {
        fund.bailout(items[i].name(), items[i].amount());
    }

29 Example 2 > Avoid redundant memory accesses Read items[i] once in the loop, not twice Likely kept in a register
Step 3: Sub-Expression Elimination
    int length = items.length;
    for (int i = 0; i < length; ++i) {
        BailoutItem item = items[i];
        fund.bailout(items[i].name(), items[i].amount()); // items[i] becomes item
    }

30 Example 2
Intermediate
    int length = items.length;
    for (int i = 0; i < length; ++i) {
        BailoutItem item = items[i];
        fund.bailout(item.name(), item.amount());
    }

31 Example 2 > VM knows name() and amount() are not overridden Class hierarchy analysis > Can inline name() and amount() Avoids virtual calls > Must record dependencies Single implementer of name() and amount() Allows recovery
Step 4: Inline name() and amount()
    int length = items.length;
    for (int i = 0; i < length; ++i) {
        BailoutItem item = items[i];
        fund.bailout(item.name(), item.amount()); // arguments become (item._name, item._amount)
    }

32 Example 2
Intermediate
    int length = items.length;
    for (int i = 0; i < length; ++i) {
        BailoutItem item = items[i];
        fund.bailout(item._name, item._amount);
    }

33 Example 2 > VM knows that bailout() is not overridden Class hierarchy analysis > Can inline bailout() Avoids the virtual call > Must record dependency Single implementer of bailout() Allows recovery
Step 5: Inline bailout()
    int length = items.length;
    for (int i = 0; i < length; ++i) {
        BailoutItem item = items[i];
        fund.bailout(item._name, item._amount); // replaced by the two statements below
        DB.log(item._name);
        fund._spent += item._amount;
    }

34 Example 2
Final (Optimized Version)
    int length = items.length;
    for (int i = 0; i < length; ++i) {
        BailoutItem item = items[i];
        DB.log(item._name);
        fund._spent += item._amount;
    }
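
To check that the final form of Example 2 computes the same thing as the source loop, both can be written out by hand and compared. This is a sketch under assumptions: the talk never shows `DB`, so the list-backed `DB` here is a stand-in, the fields are made package-private so the hand-inlined loop compiles, and `Example2` is a hypothetical holder class.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the slides' DB class, which the talk does not show. */
class DB {
    static final List<String> LOG = new ArrayList<>();
    static void log(String name) { LOG.add(name); }
}

class BailoutItem {
    final String _name; // package-private here so the hand-inlined loop can read it
    final int _amount;  // in billions
    BailoutItem(String name, int amount) { _name = name; _amount = amount; }
    String name() { return _name; }
    int amount() { return _amount; }
}

class BailoutFund {
    int _spent; // in billions; package-private for the same reason
    void bailout(String name, int amount) {
        DB.log(name);
        _spent += amount;
    }
}

class Example2 {
    /** The loop as written in the source. */
    static void naive(BailoutFund fund, BailoutItem[] items) {
        for (int i = 0; i < items.length; ++i)
            fund.bailout(items[i].name(), items[i].amount());
    }

    /** The same loop after hoisting, subexpression elimination, and inlining, done by hand. */
    static void optimized(BailoutFund fund, BailoutItem[] items) {
        int length = items.length;
        for (int i = 0; i < length; ++i) {
            BailoutItem item = items[i];
            DB.log(item._name);
            fund._spent += item._amount;
        }
    }
}
```

Running both methods over the same array leaves the two funds with equal `_spent` totals and appends the same names to the log. Writing the optimized form by hand is normally unnecessary, since the slides' point is that the JIT derives it from the readable version.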

35 Scalar Replacement > Object allocation (new) is under programmer control Mostly... the VM can fake us out User deals in references, but VM owns the pointers VM can often optimize away allocations Even though allocation is fast, extra objects still cause GC churn and more cache misses > Many objects are just holders for related values Like java.awt.Point VM can put fields in registers, so the object is unnecessary This is called scalar replacement

36 Escape Analysis > Eliding allocation relies on escape analysis Tells us if a reference escapes a certain scope Such as the method in which it was allocated Determines dynamic scope of object reference > If an object does not escape Can eliminate heap allocation Place fields into registers or allocate on stack Can eliminate locking > Inlining creates more opportunities for escape analysis

37 Example 3
    class Treasury {
        ...
        BailoutItem nextBailout() {
            return new BailoutItem(nextName(), calcAmount());
        }
    }

38 Example 3
    Treasury treasury = new Treasury(...);
    BailoutFund fund = new BailoutFund(...);
    BailoutItem item = treasury.nextBailout();
    fund.bailout(item.name(), item.amount());
Initial (Naïve Version)
    BailoutItem item = treasury.nextBailout();
    fund.bailout(item.name(), item.amount());

39 Example 3 > VM knows that nextBailout() is not overridden Class hierarchy analysis > Can inline nextBailout() Avoids the virtual call > Must record dependency Single implementer of nextBailout()
Step 1: Inline nextBailout()
    BailoutItem item = treasury.nextBailout(); // call becomes new BailoutItem(treasury.nextName(), treasury.calcAmount())
    fund.bailout(item.name(), item.amount());

40 Example 3
Intermediate
    BailoutItem item = new BailoutItem(treasury.nextName(), treasury.calcAmount());
    fund.bailout(item.name(), item.amount());

41 Example 3 > Assume item is not used further JIT can prove item is nonescaping > Eliminate the allocation of item > Fields become a set of simple (scalar) variables Object item is replaced by variables localName and localAmount Likely will only appear in registers
Step 2: Scalar Replace Allocation
    localName = treasury.nextName();
    localAmount = treasury.calcAmount();
    BailoutItem item = new BailoutItem(treasury.nextName(), treasury.calcAmount()); // eliminated
    fund.bailout(item.name(), item.amount()); // arguments become (localName, localAmount)

42 Example 3
Intermediate
    localName = treasury.nextName();
    localAmount = treasury.calcAmount();
    fund.bailout(localName, localAmount);

43 Example 3 > Remove localName and localAmount > They are aliases of treasury.nextName() and treasury.calcAmount()
Step 3: Eliminate Aliases
    localName = treasury.nextName();      // eliminated
    localAmount = treasury.calcAmount();  // eliminated
    fund.bailout(localName, localAmount); // arguments become (treasury.nextName(), treasury.calcAmount())

44 Example 3
Intermediate
    fund.bailout(treasury.nextName(), treasury.calcAmount());

45 Example 3 > VM knows that bailout() is not overridden Class hierarchy analysis > Can inline bailout() Avoids the virtual call > Must record dependency Single implementer of bailout()
Step 4: Inline bailout()
    fund.bailout(treasury.nextName(), treasury.calcAmount()); // replaced by the two statements below
    DB.log(treasury.nextName());
    fund._spent += treasury.calcAmount();

46 Example 3
Final (Optimized Version)
    DB.log(treasury.nextName());
    fund._spent += treasury.calcAmount();
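
Scalar replacement in Example 3 can be illustrated by writing the allocation-free form by hand and comparing it with the source. This is a sketch under assumptions: the bodies of `nextName()` and `calcAmount()` are elided in the slides, so the ones below are invented placeholders, `DB` is a list-backed stand-in, and `Example3` is a hypothetical holder class. The sketch stops at the slide-42 form, which keeps the two locals.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the slides' DB class, which the talk does not show. */
class DB {
    static final List<String> LOG = new ArrayList<>();
    static void log(String name) { LOG.add(name); }
}

class BailoutItem {
    private final String _name;
    private final int _amount; // in billions
    BailoutItem(String name, int amount) { _name = name; _amount = amount; }
    String name() { return _name; }
    int amount() { return _amount; }
}

class BailoutFund {
    int _spent; // in billions
    void bailout(String name, int amount) {
        DB.log(name);
        _spent += amount;
    }
}

class Treasury {
    private int counter = 0;
    String nextName() { return "Institution " + (++counter); } // placeholder body
    int calcAmount() { return 10 * counter; }                   // placeholder body
    BailoutItem nextBailout() {
        return new BailoutItem(nextName(), calcAmount());
    }
}

class Example3 {
    /** As written in the source: allocates a short-lived BailoutItem. */
    static void naive(Treasury treasury, BailoutFund fund) {
        BailoutItem item = treasury.nextBailout();
        fund.bailout(item.name(), item.amount());
    }

    /** Hand-written equivalent of the slide-42 form: fields live in locals, no allocation. */
    static void scalarReplaced(Treasury treasury, BailoutFund fund) {
        String localName = treasury.nextName();
        int localAmount = treasury.calcAmount();
        fund.bailout(localName, localAmount);
    }
}
```

Given two fresh `Treasury` instances, both methods log the same name and add the same amount, which is the condition that makes dropping the non-escaping object safe.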

47 Lock Coarsening > Reduces the overhead of locking > Combine adjacent synchronized blocks That lock the same object > Can also move code into (but not out of) a synchronized block to facilitate lock coarsening 47

48 Example 4
    class BailoutFund {
        ...
        synchronized void bailoutMT(String name, int amount) {
            bailout(name, amount);
        }
    }

49 Example 4
    BailoutFund fund = new BailoutFund(...);
    fund.bailoutMT("John's Insurance", 24);
    fund.bailoutMT("Brian's Bank", 10);
    fund.bailoutMT("Tony's Car Factory", 16);
Initial (Naïve Version)
    lock(fund);
    fund.bailoutMT("John's Insurance", 24);
    unlock(fund);
    lock(fund);
    fund.bailoutMT("Brian's Bank", 10);
    unlock(fund);
    lock(fund);
    fund.bailoutMT("Tony's Car Factory", 16);
    unlock(fund);

50 Example 4 > Three adjacent blocks All lock the same object Assume blocks are small enough > Interior lock / unlock operations can be removed
Step 1: Lock Coarsening
    lock(fund);
    fund.bailoutMT("John's Insurance", 24);
    unlock(fund); lock(fund); // removed
    fund.bailoutMT("Brian's Bank", 10);
    unlock(fund); lock(fund); // removed
    fund.bailoutMT("Tony's Car Factory", 16);
    unlock(fund);

51 Example 4
Intermediate
    lock(fund);
    fund.bailoutMT("John's Insurance", 24);
    fund.bailoutMT("Brian's Bank", 10);
    fund.bailoutMT("Tony's Car Factory", 16);
    unlock(fund);

52 Example 4 > VM knows that bailoutMT() and bailout() are not overridden Class hierarchy analysis > Can inline bailoutMT() and bailout() Avoids the virtual calls > Must record dependencies Single implementer of bailout() and bailoutMT()
Step 2: Inline bailoutMT()
    lock(fund);
    fund.bailoutMT("John's Insurance", 24);   // becomes DB.log("John's Insurance"); fund._spent += 24;
    fund.bailoutMT("Brian's Bank", 10);       // becomes DB.log("Brian's Bank"); fund._spent += 10;
    fund.bailoutMT("Tony's Car Factory", 16); // becomes DB.log("Tony's Car Factory"); fund._spent += 16;
    unlock(fund);

53 Example 4
Final (Optimized Version)
    lock(fund);
    DB.log("John's Insurance");
    fund._spent += 24;
    DB.log("Brian's Bank");
    fund._spent += 10;
    DB.log("Tony's Car Factory");
    fund._spent += 16;
    unlock(fund);
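
In Java source, the `lock(fund)` / `unlock(fund)` pairs on these slides correspond to entering and leaving a `synchronized` region. The sketch below writes the coarsened form by hand so the two versions can be compared; it is an illustration only, `DB.log` is left out to keep it short, and the static helpers `naive` and `coarsened` are hypothetical names.

```java
class BailoutFund {
    private int _spent; // in billions

    private void bailout(String name, int amount) {
        _spent += amount; // DB.log(name) omitted in this sketch
    }

    synchronized void bailoutMT(String name, int amount) {
        bailout(name, amount);
    }

    synchronized int spent() {
        return _spent;
    }

    /** As written in the source: three separate lock/unlock pairs on fund. */
    static void naive(BailoutFund fund) {
        fund.bailoutMT("John's Insurance", 24);
        fund.bailoutMT("Brian's Bank", 10);
        fund.bailoutMT("Tony's Car Factory", 16);
    }

    /** Hand-coarsened and inlined: one lock held across all three updates. */
    static void coarsened(BailoutFund fund) {
        synchronized (fund) {
            fund._spent += 24;
            fund._spent += 10;
            fund._spent += 16;
        }
    }
}
```

Both helpers leave `spent()` at 50. The coarsened version acquires the monitor once instead of three times, at the cost of holding it slightly longer, which is why the slide assumes the blocks are small.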

54 Lock Elision > Further reduces the overhead of locking > If the JIT can prove that an object is non-escaping It is guaranteed that only one thread will access it Therefore, no synchronization is necessary Can totally elide synchronization for that object 54

55 Example 4 (Alt 1)
    BailoutFund fund = new BailoutFund(...);
    fund.bailoutMT("John's Insurance", 24);
    fund.bailoutMT("Brian's Bank", 10);
    fund.bailoutMT("Tony's Car Factory", 16);
Final (with Lock Elision)
    lock(fund);   // elided
    DB.log("John's Insurance");
    fund._spent += 24;
    DB.log("Brian's Bank");
    fund._spent += 10;
    DB.log("Tony's Car Factory");
    fund._spent += 16;
    unlock(fund); // elided

56 Biased Locking > Even if we cannot coarsen / elide synchronization, we can still optimize it > Often, object A is only ever locked by thread T The VM can bias object A to thread T T never needs to lock A again A simple test proves that A is biased to T > If, however, another thread tries to lock A The VM will need to unbias A This is very expensive 56

57 Example 4 (Alt 2)
    BailoutFund fund = new BailoutFund(...);
    fund.bailoutMT("John's Insurance", 24);
    fund.bailoutMT("Brian's Bank", 10);
    fund.bailoutMT("Tony's Car Factory", 16);
Final (with Biased Locking)
    if (fund.biased_to() != Thread.current()) {
        // fund biased/locked by another thread:
        // revoke bias, then acquire lock
    }
    DB.log("John's Insurance");
    fund._spent += 24;
    DB.log("Brian's Bank");
    fund._spent += 10;
    DB.log("Tony's Car Factory");
    fund._spent += 16;

58 Inline caching > Pure inlining is great, but not always possible A call site may have multiple targets ("megamorphic") But the most likely one may still be known, thanks to online profiling And we can optimize for the common case! > If a single target is very common, the JIT can generate an "inline cache" Cache predicted jump target in code Do a fast type check, and if receiver type matches predicted, branch to predicted target Hardware can predict this branch effectively

59 Example 5
    class BailoutFund {
        private int _spent; // in billions
        void bailout(String name, int amount) {
            DB.log(name);
            _spent += amount;
        }
    }
    class TARPFund extends BailoutFund {
        void bailout(String name, int amount) { ... }
    }

60 Example 5
    BailoutFund fund = new BailoutFund(...);
    fund.bailout("John's Insurance", 24);
Initial (Naïve Version)
    fund.bailout("John's Insurance", 24); // virtual call

61 Example 5 > JIT cannot inline bailout() unconditionally It is overridden > However, assume that in most cases fund is an instance of BailoutFund Dynamic profiling tells us > Add a fast type check and inline the common case Avoids the virtual call for the common case
Step 2: Inline bailout() and type check
    fund.bailout("John's Insurance", 24); // replaced by the guarded form below
    if (fund.getClass() == BailoutFund.class) {
        DB.log("John's Insurance");
        fund._spent += 24;
    } else {
        fund.bailout("John's Insurance", 24);
    }

62 Example 5
Final (Optimized Version)
    if (fund.getClass() == BailoutFund.class) {
        DB.log("John's Insurance");
        fund._spent += 24;
    } else {
        fund.bailout("John's Insurance", 24); // virtual call
    }
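
The guarded form in Example 5 is ordinary Java and can be run directly. In this sketch the body of `TARPFund.bailout` is invented, since the slides elide it, `DB.log` is omitted, and `InlineCacheDemo` with its `fastPathHits` counter is a hypothetical addition used only to observe which path ran.

```java
class BailoutFund {
    int _spent; // in billions
    void bailout(String name, int amount) {
        _spent += amount; // DB.log(name) omitted in this sketch
    }
}

class TARPFund extends BailoutFund {
    @Override
    void bailout(String name, int amount) {
        _spent += 2 * amount; // placeholder body; the slides elide the real one
    }
}

class InlineCacheDemo {
    static int fastPathHits = 0; // counts how often the predicted type matched

    /** Hand-written version of the type-checked, inlined call site. */
    static void call(BailoutFund fund, String name, int amount) {
        if (fund.getClass() == BailoutFund.class) {
            fastPathHits++;
            fund._spent += amount;      // inlined body of BailoutFund.bailout
        } else {
            fund.bailout(name, amount); // fall back to the virtual call
        }
    }
}
```

A plain `BailoutFund` receiver takes the inlined path, while a `TARPFund` receiver falls through to normal dispatch and still gets its own override, so the guard changes speed but not behavior.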

63 Garbage Collection > Automatic memory management As opposed to explicit malloc / free > The usual advantages Eliminates dangling pointers Eliminates memory leaks But not memory retention Improves productivity Simpler API/library design > Trade-offs Less predictability Throughput? 63

64 Object Relocation > Garbage collection enables object relocation Compaction: eliminates fragmentation Generational GC: decreases GC overhead Linear allocation: best allocation performance Fast path: ~10 instructions, inlined, no synchronization [Figure: linear allocation, with labels top, new top, end, new object]
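
Linear allocation amounts to bumping a pointer, which is why the fast path is so short. The sketch below models it with integer offsets instead of real addresses; `LinearAllocator` and its methods are hypothetical names, and a real VM would typically give each thread its own region to avoid synchronization, which this toy does not model.

```java
/** Toy bump-pointer ("linear") allocator over a fixed region. Names are hypothetical. */
class LinearAllocator {
    private int top = 0;   // offset of the next free byte
    private final int end; // one past the last usable offset

    LinearAllocator(int capacity) {
        this.end = capacity;
    }

    /** Returns the offset of the new object, or -1 if the region is full (a real VM would collect here). */
    int allocate(int size) {
        if (size < 0 || size > end - top) {
            return -1;
        }
        int object = top; // the new object starts at the old top
        top += size;      // "new top" in the slide's figure
        return object;
    }

    /** Once survivors have been copied elsewhere, the whole region is reclaimed at once. */
    void reset() {
        top = 0;
    }
}
```

Allocation is a bounds check plus an addition, and reclamation of everything that did not survive is a single assignment in `reset()`.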

65 Garbage Collection and Throughput > Recent publication shows malloc / free outperforms GC when heap is tight But: GC can match (or better) malloc / free when there is enough breathing room Matthew Hertz and Emery Berger, "Quantifying the performance of garbage collection vs. explicit memory management," in Proceedings of OOPSLA '05, Oct

66 Generational GC Is Fast! > malloc / free: cost_malloc * all_objects + cost_free * freed_objects > Generational GC with copying young generation: cost_linear_alloc * all_objects + cost_copy * surviving_objects Notice: no reclamation cost (i.e., cost_free) for all reclaimed objects in the young generation > But consider: cost_linear_alloc is much lower than cost_malloc, and surviving_objects is, say, ~5% or less of all_objects
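
The comparison on this slide can be written as a break-even condition. The derivation below is a sketch under one added assumption that the slide does not state: every object that does not survive is eventually freed, so freed_objects = (1 - s) * all_objects, where s is the surviving fraction. N stands for all_objects.

```latex
\begin{align*}
C_{\text{malloc/free}} &= c_{\text{malloc}}\,N + c_{\text{free}}\,(1-s)\,N \\
C_{\text{gen}}         &= c_{\text{linear\_alloc}}\,N + c_{\text{copy}}\,s\,N \\
C_{\text{gen}} < C_{\text{malloc/free}}
  &\iff c_{\text{linear\_alloc}} + s\,c_{\text{copy}} < c_{\text{malloc}} + (1-s)\,c_{\text{free}} \\
s = 0.05
  &\implies c_{\text{linear\_alloc}} + 0.05\,c_{\text{copy}} < c_{\text{malloc}} + 0.95\,c_{\text{free}}
\end{align*}
```

With a 5% survival rate, copying can cost many times more per object than freeing and the generational scheme still comes out ahead, as long as linear allocation is no more expensive than malloc.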

67 Object Relocation Other Benefits > Compaction: can improve page locality > Relocation order: can improve cache locality > Important points Object allocation & reclamation are fast Object relocation can also improve application performance 67

68 Obscure Optimizations > Significant payoff On some hardware For some applications > But tedious to do in portable C/C++ apps > VM gives them for free Large Pages NUMA 68

72 Large Pages > Scarce resources: Disk? 1 TB for under $100! RAM? Laptops have 2+ GB Cache? Maybe, but not what I'm looking for... TLB: cache of virtual-to-physical address mappings Used on many (or all) memory references TLB miss: memory access even slower, must walk page tables

73 Large Pages > TLBs are scarce A few 10s to a few 100s of entries With 4 KB pages Need 1M entries to cover 4 GB > What to do? Make TLB larger add more entries Must remain very fast Hard to keep up with memory growth Allow TLB entries to cover a larger region Large pages 64 KB, 2 MB, 4 MB, 256 MB, 1 GB 73
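
The "1M entries to cover 4 GB" figure is a straightforward division, and the same arithmetic shows why larger pages help. The helper below is a hypothetical illustration, not part of any VM API.

```java
/** Arithmetic behind the slide: TLB entries needed to map a given amount of memory. */
class TlbCoverage {
    static long entriesNeeded(long memoryBytes, long pageBytes) {
        return (memoryBytes + pageBytes - 1) / pageBytes; // round up to whole pages
    }

    public static void main(String[] args) {
        long fourGB = 4L << 30;
        System.out.println(entriesNeeded(fourGB, 4L << 10));   // 4 KB pages:   1048576
        System.out.println(entriesNeeded(fourGB, 2L << 20));   // 2 MB pages:   2048
        System.out.println(entriesNeeded(fourGB, 256L << 20)); // 256 MB pages: 16
        System.out.println(entriesNeeded(fourGB, 1L << 30));   // 1 GB pages:   4
    }
}
```

Since a TLB holds only tens to hundreds of entries, 4 KB pages cover a tiny fraction of a 4 GB heap, whereas a few hundred 256 MB or 1 GB entries would cover all of it.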

74 Large Pages > Exploiting large pages VM queries OS for available page sizes Selects appropriate size for Java heap Young + old generations Permanent generation Dynamically generated code Other structures Card table, bitmaps, region tables > Performance boost: 5% - 10% With large heaps, on some benchmarks; YMMV 74

75 Large Pages > OS Support Solaris: Enabled by default More than two page sizes Regions can expand and contract Linux, Windows: Administrator must enable (search the web for "java large pages") At most two page sizes Region size fixed at initialization > Summary: obscure, but worth it

76 NUMA > All memory is not equal > Each CPU can access Local memory Remote memory Longer latency up to 2x > Hypothetical: 4 locality groups Random placement wrong 75% of the time CPU starved for data > Solution? 76

78 NUMA > Make VM NUMA-aware Each thread has a home node Object allocation Uses memory local to thread's home node when possible > Payoff: Modest to Huge Some applications, some platforms 2 sockets * 2 cores - 10% 8 sockets * 2 cores - 30% 72 sockets * 2 cores - 280% > Summary: obscure, but necessary 78

79 Conclusions > Give up some control Pointers Ahead-of-time compilation Some predictability > Get a lot back Performance Ultra-fast allocation Dynamic optimizations Safety No pointer bugs Portability 79

80 Acknowledgements > Vladimir Kozlov > Igor Veresov > John Rose > Charlie Hunt 80

81 John Coomes Brian Goetz Tony Printezis

Managed runtimes & garbage collection. CSE 6341 Some slides by Kathryn McKinley

Managed runtimes & garbage collection. CSE 6341 Some slides by Kathryn McKinley Managed runtimes & garbage collection CSE 6341 Some slides by Kathryn McKinley 1 Managed runtimes Advantages? Disadvantages? 2 Managed runtimes Advantages? Reliability Security Portability Performance?

More information

Lecture 9 Dynamic Compilation

Lecture 9 Dynamic Compilation Lecture 9 Dynamic Compilation I. Motivation & Background II. Overview III. Compilation Policy IV. Partial Method Compilation V. Partial Dead Code Elimination VI. Escape Analysis VII. Results Partial Method

More information

Managed runtimes & garbage collection

Managed runtimes & garbage collection Managed runtimes Advantages? Managed runtimes & garbage collection CSE 631 Some slides by Kathryn McKinley Disadvantages? 1 2 Managed runtimes Portability (& performance) Advantages? Reliability Security

More information

SABLEJIT: A Retargetable Just-In-Time Compiler for a Portable Virtual Machine p. 1

SABLEJIT: A Retargetable Just-In-Time Compiler for a Portable Virtual Machine p. 1 SABLEJIT: A Retargetable Just-In-Time Compiler for a Portable Virtual Machine David Bélanger dbelan2@cs.mcgill.ca Sable Research Group McGill University Montreal, QC January 28, 2004 SABLEJIT: A Retargetable

More information

Running class Timing on Java HotSpot VM, 1

Running class Timing on Java HotSpot VM, 1 Compiler construction 2009 Lecture 3. A first look at optimization: Peephole optimization. A simple example A Java class public class A { public static int f (int x) { int r = 3; int s = r + 5; return

More information

2011 Oracle Corporation and Affiliates. Do not re-distribute!

2011 Oracle Corporation and Affiliates. Do not re-distribute! How to Write Low Latency Java Applications Charlie Hunt Java HotSpot VM Performance Lead Engineer Who is this guy? Charlie Hunt Lead JVM Performance Engineer at Oracle 12+ years of

More information

Performance of Non-Moving Garbage Collectors. Hans-J. Boehm HP Labs

Performance of Non-Moving Garbage Collectors. Hans-J. Boehm HP Labs Performance of Non-Moving Garbage Collectors Hans-J. Boehm HP Labs Why Use (Tracing) Garbage Collection to Reclaim Program Memory? Increasingly common Java, C#, Scheme, Python, ML,... gcc, w3m, emacs,

More information

A JVM Does What? Eva Andreasson Product Manager, Azul Systems

A JVM Does What? Eva Andreasson Product Manager, Azul Systems A JVM Does What? Eva Andreasson Product Manager, Azul Systems Presenter Eva Andreasson Innovator & Problem solver Implemented the Deterministic GC of JRockit Real Time Awarded patents on GC heuristics

More information

Compiler construction 2009

Compiler construction 2009 Compiler construction 2009 Lecture 3 JVM and optimization. A first look at optimization: Peephole optimization. A simple example A Java class public class A { public static int f (int x) { int r = 3; int

More information
