Yak: A High-Performance Big-Data-Friendly Garbage Collector. Khanh Nguyen, Lu Fang, Guoqing Xu, Brian Demsky, Sanazsadat Alamian Shan Lu

Size: px

Start display at page:

Download "Yak: A High-Performance Big-Data-Friendly Garbage Collector. Khanh Nguyen, Lu Fang, Guoqing Xu, Brian Demsky, Sanazsadat Alamian Shan Lu"

Amelia Cameron
5 years ago
Views:

1 Yak: A High-Performance Big-Data-Friendly Garbage Collector Khanh Nguyen, Lu Fang, Guoqing Xu, Brian Demsky, Sanazsadat Alamian Shan Lu University of California, Irvine University of Chicago Onur Mutlu ETH Zürich

2 2

3 Naiad 2

4 Background: GC Heap 3

5 Background: GC Heap 4

6 Background: GC Heap 4

7 Background: GC Heap 4

8 Background: GC Heap 4

9 Background: GC Heap 4

10 5

11 Scalability JVM crashes due to OutOfMemory error at early stage [Fang et al., SOSP 15] 5

12 Scalability JVM crashes due to OutOfMemory error at early stage [Fang et al., SOSP 15] Management cost GC time accounts for up to 50% of the execution time [Bu et al., ISMM 13] 5

13 Scalability JVM crashes due to OutOfMemory error at early stage [Fang et al., SOSP 15] Management cost GC time accounts for up to 50% of the execution time [Bu et al., ISMM 13] 5

14 Two Paths, Two Hypotheses Data Loads and Feeds Queries and Results Data Publishing Cloud Partitioner Cluster Controller Node Controller... Node Controller Aggregate Join UDF Aggregate Join UDF 6

15 Two Paths, Two Hypotheses Data Loads and Feeds Queries and Results Data Publishing Cloud Partitioner Cluster Controller Node Controller... Node Controller Aggregate Join UDF Aggregate Join UDF 6

16 Two Paths, Two Hypotheses Data Loads and Feeds Queries and Results Data Publishing Cloud Partitioner Cluster Controller Node Controller... Node Controller Aggregate Join UDF Aggregate Join UDF Control path Data path 6

17 Two Paths, Two Hypotheses Data Loads and Feeds Queries and Results Data Publishing Cloud Partitioner Cluster Controller Node Controller... Node Controller Aggregate Join UDF Aggregate Join UDF Control path 6

18 Two Paths, Two Hypotheses Data Loads and Feeds Queries and Results Data Publishing Cloud Partitioner Cluster Controller MASTER Node Controller... Node Controller Aggregate Join UDF Aggregate Join UDF Control path 6

19 Two Paths, Two Hypotheses Data Loads and Feeds Queries and Results Data Publishing Cloud Partitioner Cluster Controller MASTER Node Controller... Node Controller Aggregate Join UDF Aggregate Join UDF Control path 6

20 Two Paths, Two Hypotheses Data Loads and Feeds Queries and Results Data Publishing Cloud Partitioner Cluster Controller MASTER SLAVE Node Controller... Node Controller SLAVE Aggregate Join UDF Aggregate Join UDF Control path 6

21 Two Paths, Two Hypotheses Data Loads and Feeds Queries and Results Data Publishing Cloud Partitioner Cluster Controller Node Controller... Node Controller Aggregate Join UDF Aggregate Join UDF Control path 6

22 Two Paths, Two Hypotheses Data Loads and Feeds Queries and Results Data Publishing Cloud Partitioner Cluster Controller Node Controller... Node Controller Aggregate Join UDF Aggregate Join UDF Data path 6

23 Two Paths, Two Hypotheses Data Loads and Feeds Queries and Results Data Publishing Cloud Partitioner Cluster Controller Node Controller... Node Controller Aggregate Join UDF Aggregate Join UDF Data path 6

24 WordCount 7

25 WordCount Document Mapper 7

26 WordCount Document Mapper setup 7

27 WordCount Document Mapper setup 7

28 WordCount Document Mapper setup map 7

29 WordCount Document Mapper setup Heap map 7

30 WordCount Document Mapper setup Heap map 7

31 WordCount Document Mapper setup Heap map 7

32 WordCount Document Mapper setup Heap map 7

33 WordCount Document Mapper setup Heap map 7

34 WordCount Document Mapper setup Heap map cleanup 7

35 WordCount Document Mapper setup Heap map cleanup 7

36 WordCount Document Mapper setup Heap map cleanup 7

37 WordCount Document setup Many data objects have the same life span: Mapper Heap map cleanup 7

38 WordCount Document setup Many data objects have the same life span: Mapper Heap GC map cleanup 7

39 WordCount Document setup Many data objects have the same life span: Mapper Heap GC map cleanup GC run in the middle is wasted 7

40 WordCount Document setup Many data objects have the same life span: Mapper Memory Heap Region GC map cleanup GC run in the middle is wasted Can be efficiently reclaimed together 7

41 WordCount Document setup Many data objects have the same life span: Mapper Memory Heap Region GC map cleanup GC run in the middle is wasted Can be efficiently reclaimed together 7

42 WordCount Document setup Many data objects have the same life span: Mapper Memory Region GC map cleanup GC run in the middle is wasted Can be efficiently reclaimed together 7

43 Region-based Memory Management Sophisticated static analysis won t work for data-intensive systems 8

44 Region-based Memory Management Sophisticated static analysis won t work for data-intensive systems What about control path? 8

45 Region-based Memory Management Sophisticated static analysis won t work for data-intensive systems What about control path? generational GC region-based memory management 8

46 Yak Approach Heap 9

47 Yak Approach Control Space Data Space 9

48 Yak Approach Control Space Data Space Generational GC 9

49 Yak Approach Control Space Data Space Generational GC Region-based Memory Management 9

50 Yak Approach Control Space Data Space Generational GC Region-based Memory Management Reduced memory management cost 9

51 Yak Approach Control Space Data Space Generational GC Region-based Memory Management Reduced memory management cost annotate epoch boundary: - epoch_start() - epoch_end() 9

52 Correctness Region 10

53 Correctness Region 10

54 Correctness User-based approach solution: Facade [Nguyen et al., ASPLOS 15] 10

55 Correctness User-based approach solution: Facade [Nguyen et al., ASPLOS 15] annotation & refactoring 10

56 Correctness User-based approach solution: Facade [Nguyen et al., ASPLOS 15] annotation & refactoring Developers 10

57 Correctness User-based approach solution: Facade [Nguyen et al., ASPLOS 15] annotation & refactoring Developers Yak 10

58 Yak: An Automatic Solution Yak: the first hybrid GC 11

59 Yak: An Automatic Solution Yak: the first hybrid GC Implemented in OpenJDK 8 Modified the interpreter, two JIT compilers, the heap layout, the Parallel Scavenge GC NO code refactoring needed; correctness guaranteed by the system 11

60 Yak: An Automatic Solution Yak: the first hybrid GC Implemented in OpenJDK 8 Modified the interpreter, two JIT compilers, the heap layout, the Parallel Scavenge GC NO code refactoring needed; correctness guaranteed by the system On average, vs. default production GC (PS): Reduce 33% execution time Reduce 78% GC time 11

61 Challenges How to create regions? How to reclaim regions correctly? 12

62 How to Create Regions? A region starts at a call to epoch-start and ends at a call to epoch-end The location of epochs affects performance but not correctness Regions are thread-local Regions can be nested 13

63 How to Create Regions? void main() { } //end of main 14

64 How to Create Regions? void main() { CS,* } //end of main 14

65 How to Create Regions? void main() { epoch_start(); for( ) { epoch #1 CS,* 1,t 1 1,t 2 } epoch_end(); } //end of main 14

$void main() { epoch_start(); for( ) { epoch_start(); for( ) { How to Create Regions?$

66 void main() { epoch_start(); for( ) { epoch_start(); for( ) { How to Create Regions? epoch #1 epoch #2 1,t 1 CS,* 1,t 2 2,t 1 2,t 2 } epoch_end(); } epoch_end(); } //end of main 14

67 void main() { epoch_start(); for( ) { epoch_start(); for( ) { epoch_start(); for( ) { How to Create Regions? epoch #1 epoch #2 epoch #3 1,t 1 CS,* 1,t 2 } epoch_end(); 2,t 1 2,t 2 } epoch_end(); } epoch_end(); 3,t 1 3,t 2 } //end of main 14

68 Region Semilattice void main() { epoch_start(); for( ) { epoch #1 CS,* epoch_start(); for( ) { epoch_start(); for( ) { epoch #2 epoch #3 1,t 1 1,t 2 } epoch_end(); region 2,t 1 2,t 2 } epoch_end(); } epoch_end(); 3,t 1 3,t 2 } //end of main 14

69 Region Semilattice void main() { epoch_start(); for( ) { epoch #1 CS,* epoch_start(); for( ) { epoch_start(); for( ) { epoch #2 epoch #3 1,t 1 1,t 2 } epoch_end(); region 2,t 1 2,t 2 } epoch_end(); } epoch_end(); } //end of main 3,t 1 3,t 2 JOIN( 3,t 1, 2,t 1 ) = 2,t 1 14

70 Region Semilattice void main() { epoch_start(); for( ) { epoch #1 CS,* epoch_start(); for( ) { epoch_start(); for( ) { epoch #2 epoch #3 1,t 1 1,t 2 } epoch_end(); region 2,t 1 2,t 2 } epoch_end(); } epoch_end(); } //end of main 3,t 1 3,t 2 JOIN( 3,t 1, 2,t 1 ) = 2,t 1 14

71 Region Semilattice void main() { epoch_start(); for( ) { epoch #1 CS,* epoch_start(); for( ) { epoch_start(); for( ) { epoch #2 epoch #3 1,t 1 1,t 2 } epoch_end(); region 2,t 1 2,t 2 } epoch_end(); } epoch_end(); } //end of main 3,t 1 3,t 2 JOIN( 2,t 1, 2,t 2 ) = CS,* 14

72 Region Semilattice void main() { epoch_start(); for( ) { epoch #1 CS,* epoch_start(); for( ) { epoch_start(); for( ) { epoch #2 epoch #3 1,t 1 1,t 2 } epoch_end(); region 2,t 1 2,t 2 } epoch_end(); } epoch_end(); } //end of main 3,t 1 3,t 2 JOIN( 2,t 1, 2,t 2 ) = CS,* 14

73 How to Reclaim Regions Correctly? Object Promotion Algorithm 15

74 How to Reclaim Regions Correctly? Object Promotion Algorithm Two key tasks: What: Identify escaping objects: 15

75 How to Reclaim Regions Correctly? Object Promotion Algorithm Two key tasks: What: Identify escaping objects: Tracking of cross-region/space references in write barrier A fast path for intra-region references Inter-region references are recorded in the remember sets of their destination regions 15

76 How to Reclaim Regions Correctly? Object Promotion Algorithm Two key tasks: What: Identify escaping objects: Tracking of cross-region/space references in write barrier A fast path for intra-region references Inter-region references are recorded in the remember sets of their destination regions Where: Decide the relocation destination: 15

77 How to Reclaim Regions Correctly? Object Promotion Algorithm Two key tasks: What: Identify escaping objects: Tracking of cross-region/space references in write barrier A fast path for intra-region references Inter-region references are recorded in the remember sets of their destination regions Where: Decide the relocation destination: Query region semilattice 15

78 Region Deallocation <CS,*> <r 1, t 1 > epoch_end() <r 2, t 1 > 16

79 Region Deallocation <CS,*> <r 1, t 1 > epoch_end() Remember Set <r 2, t 1 > 16

80 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() Remember Set <r 2, t 1 > 16

81 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() Remember Set <r 2, t 1 > 16

82 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() Remember Set Barrier <r 2, t 1 > 16

83 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() t 2 s stack t 3 s stack Remember Set Barrier <r 2, t 1 > 16

84 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() t 2 s stack t 3 s stack Remember Set Barrier <r 2, t 1 > 16

85 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() t 2 s stack t 3 s stack Remember Set escaping roots Barrier <r 2, t 1 > 16

86 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() t 2 s stack t 3 s stack Remember Set escaping roots Barrier <r 2, t 1 > 16

87 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() t 2 s stack t 3 s stack Remember Set escaping roots Barrier <r 2, t 1 > 16

88 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() t 2 s stack t 3 s stack Remember Set escaping roots Barrier <r 2, t 1 > 16

89 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() t 2 s stack t 3 s stack Remember Set escaping roots Barrier <r 2, t 1 > 16

90 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() t 2 s stack t 3 s stack Remember Set escaping roots Barrier <r 2, t 1 > 16

91 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() t 2 s stack t 3 s stack Remember Set escaping roots Barrier <r 2, t 1 > 16

92 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() t 2 s stack t 3 s stack Remember Set escaping roots Barrier <r 2, t 1 > 16

93 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() t 2 s stack t 3 s stack Remember Set escaping roots Barrier <r 2, t 1 > 16

94 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() t 2 s stack t 3 s stack Barrier 16

95 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() t 2 s stack t 3 s stack 16

96 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack epoch_end() 16

97 Region Deallocation <CS,*> <r 1, t 1 > t 1 s stack 16

98 Evaluations 3 real-world systems, 9 applications: Hadoop Popular distributed MapReduce implementation Hyracks [Borkar et al., ICDE 11] Data-parallel platform to run data-intensive jobs on a cluster of shared-nothing machines GraphChi [Kyrola et al., OSDI 12] High-performance graph analytical framework for a single machine 17

99 Improvement Summary Normalized Performance (to PS) Hyracks Hadoop GraphChi Hyracks Hadoop GraphChi Hyracks Hadoop GraphChi GC Time App. Time Total Time 18

100 Improvement Summary Normalized Performance (to PS) Hyracks Hadoop GraphChi Hyracks Hadoop GraphChi Hyracks Hadoop GraphChi GC Time App. Time Total Time 18

101 Improvement Summary Normalized Performance (to PS) Hyracks Hadoop GraphChi Hyracks Hadoop GraphChi Hyracks Hadoop GraphChi GC Time App. Time Total Time 18

Improvement Summary Normalized Performance (to PS) 1.8 1.6 1.4 1.2 1.0 0.8 0.6 0.4 0.2 0.

102 Improvement Summary Normalized Performance (to PS) Hyracks Hadoop GraphChi Hyracks Hadoop GraphChi Hyracks Hadoop GraphChi GC Time App. Time Total Time 18

103 More Data in the Paper Statistics on Yak s heap Overhead breakdown Write barrier cost Extra space cost 19

104 Conclusion Goal: Reduce memory management cost in Big Data systems Solution: Yak, the first hybrid GC 33% execution time savings 78% GC time reduction Requires almost zero user effort 20

105 Thank You!

Yak: A High-Performance Big-Data-Friendly Garbage Collector

Yak: A High-Performance Big-Data-Friendly Garbage Collector Khanh Nguyen, Lu Fang, Guoqing Xu, and Brian Demsky; University of California, Irvine; Shan Lu, University of Chicago; Sanazsadat Alamian, University