Ghislain Fourny. Big Data. 7. Resource Management
Data Technology Stack: User interfaces, Querying, Data stores, Indexing, Processing, Validation, Data models, Syntax, Encoding, Storage
Where we are: User interfaces, Querying, Data stores, Indexing, Processing, Validation, Data models, Syntax, Encoding, Storage
Last week: MapReduce. Input data -> Map (many in parallel) -> Intermediate data (shuffled) -> Reduce (many in parallel) -> Output data
Hadoop infrastructure (version 1), HDFS only: one Namenode (holding the namespace, e.g. /dir/file), many Datanodes
Hadoop infrastructure (version 1), with MapReduce: Namenode + JobTracker on the master, Datanode + TaskTracker on every worker node
Responsibilities of the MapReduce JobTracker: Resource Management, Scheduling, Monitoring, Job lifecycle, Fault-tolerance
Issue 1: scalability. Fewer than 4,000 nodes, fewer than 40,000 tasks
Issue 2: bottleneck. The single JobTracker is a bottleneck for all the TaskTrackers
Issue 3: Jack of all trades. The JobTracker does both Scheduling and Monitoring
Issue 4: Utilization (task slots). Slots are static (the Map/Reduce split is decided at configuration time) and fixed-size
Issue 5: Not fungible. Map slots and Reduce slots are not interchangeable: one kind (e.g. Map) can be working at maximum capacity while the other (e.g. Reduce) is idle
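Issues 4 and 5 can be illustrated with a little arithmetic. The following is a minimal sketch, not Hadoop code: the function names and the slot counts (6 Map slots, 4 Reduce slots, 20 pending Map tasks) are made up for the illustration.

```python
def static_slot_utilization(map_slots, reduce_slots, map_tasks, reduce_tasks):
    """Fraction of busy slots when Map and Reduce slots are not interchangeable."""
    busy = min(map_slots, map_tasks) + min(reduce_slots, reduce_tasks)
    return busy / (map_slots + reduce_slots)


def pooled_utilization(total_slots, map_tasks, reduce_tasks):
    """Fraction of busy slots if any slot could run any kind of task."""
    busy = min(total_slots, map_tasks + reduce_tasks)
    return busy / total_slots


# Hypothetical node: 6 Map slots and 4 Reduce slots, fixed at configuration time.
# During the map phase there are 20 pending Map tasks and no Reduce task yet.
static = static_slot_utilization(6, 4, map_tasks=20, reduce_tasks=0)
pooled = pooled_utilization(10, map_tasks=20, reduce_tasks=0)
print(static, pooled)
```

With static slots, the four Reduce slots stay idle even though Map tasks are waiting; with interchangeable resources the same node would be fully used.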
YARN: Yet Another Resource Negotiator
YARN: Scheduling and Application management are done by the Resource Manager; Monitoring is delegated to the Application Masters (one per application)
Scales more: 10,000 nodes, 100,000 tasks
YARN architecture: one ResourceManager, many NodeManagers, Containers on the nodes
Remember... It does ring a bell, doesn't it?
Master-slave architecture: one Master, many Slaves
HDFS server architecture: one Namenode (holding the namespace: /dir/file1, /dir/file2, /file3), many Datanodes
YARN: one ResourceManager, many NodeManagers, Containers on the nodes
YARN: the Client submits a Job to the ResourceManager
YARN: RM allocates an Application Master. The ResourceManager schedules a Container on one of the nodes, and the Application Master for the Job runs in it
Application Master communicates with containers: it executes and monitors the work in its Containers
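The submission flow above can be mimicked with a toy model. This is only a sketch under strong simplifications: the classes and methods below are hypothetical, are not the YARN API, and treat containers as a simple counter.

```python
class ResourceManager:
    """Toy ResourceManager: only hands out containers, never runs tasks."""

    def __init__(self, free_containers):
        self.free_containers = free_containers

    def allocate(self, count):
        granted = min(count, self.free_containers)
        self.free_containers -= granted
        return granted


class ApplicationMaster:
    """Toy Application Master: requests containers, then executes the tasks."""

    def __init__(self, rm, tasks):
        self.rm = rm
        self.tasks = tasks

    def run(self):
        granted = self.rm.allocate(len(self.tasks))
        # Only as many tasks as granted containers run in this round.
        return [f"ran {task}" for task in self.tasks[:granted]]


def submit(rm, tasks):
    """Client side: the first container is used for the Application Master."""
    if rm.allocate(1) == 0:
        raise RuntimeError("no container available for the Application Master")
    return ApplicationMaster(rm, tasks).run()


rm = ResourceManager(free_containers=4)
results = submit(rm, ["map-0", "map-1", "reduce-0"])
print(results, rm.free_containers)
```

The point of the sketch is the division of labour: the ResourceManager object never sees the tasks, only container counts.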
YARN's Resource Manager
Resource Manager: Capacity guarantees, Cluster utilization, Fairness, SLAs
Communication with clients. Client Service: Application (start, end), Queue information, Statistics. Admin Service: Refresh the node list, Queue configuration
Communication with the node managers: Resource Tracker, Liveliness, Nodes List Manager (valid and invalid nodes)
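Liveliness tracking is commonly done with heartbeats and an expiry interval. The following is a minimal sketch of that idea, assuming a hypothetical `LivelinessMonitor` class and a 10-second expiry; it is not the actual YARN implementation.

```python
class LivelinessMonitor:
    """Remembers the last heartbeat of each node and reports the expired ones."""

    def __init__(self, expiry_seconds):
        self.expiry_seconds = expiry_seconds
        self.last_heartbeat = {}

    def heartbeat(self, node_id, now):
        self.last_heartbeat[node_id] = now

    def expired(self, now):
        # A node is considered lost if it has been silent for too long.
        return sorted(
            node
            for node, seen in self.last_heartbeat.items()
            if now - seen > self.expiry_seconds
        )


monitor = LivelinessMonitor(expiry_seconds=10)
monitor.heartbeat("node-1", now=0)
monitor.heartbeat("node-2", now=0)
monitor.heartbeat("node-1", now=8)   # node-2 stays silent
lost = monitor.expired(now=12)
print(lost)
```

Timestamps are passed in explicitly so that the example is deterministic; a real monitor would read a clock.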
Communication with the application masters: Application Master Service (registration, container requests), Liveliness, Applications Manager + Launcher
Authentication: Application Token, Application ACL, Container Token
Pure scheduler: does not monitor tasks; does not restart upon failure
Scheduling strategies: pluggable scheduler
Scheduling strategies: FIFO scheduler (applications are served in order of arrival)
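A FIFO policy can be sketched in a few lines. This is an illustration only, assuming a hypothetical `fifo_allocate` function, a cluster of 10 identical containers, and made-up application names.

```python
from collections import deque


def fifo_allocate(capacity, requests):
    """Grant containers strictly in arrival order.

    requests: list of (application, containers wanted), oldest first.
    Returns {application: containers granted}.
    """
    granted = {app: 0 for app, _ in requests}
    queue = deque(requests)
    free = capacity
    while queue and free > 0:
        app, wanted = queue.popleft()
        give = min(wanted, free)
        granted[app] = give
        free -= give
    return granted


allocation = fifo_allocate(10, [("big", 8), ("small", 2), ("late", 3)])
print(allocation)
```

The application that arrived last gets nothing until earlier ones release containers, which is the weakness that the other strategies address.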
Scheduling strategies: Capacity scheduler (the cluster capacity is divided between queues: Queue 1, Queue 2)
Hierarchical queues: Root, with child queues Math (weight 4), Physics (weight 1), CS (weight 5), i.e. 40%, 10%, 50% of the cluster
Hierarchical queues: Math is split into Analysis (10) and Algebra (40); CS is split into TI (20) and DB (80). Cluster shares: Analysis 8%, Algebra 32%, Physics 10%, TI 10%, DB 40%
Hierarchical queues with a Best effort queue (0) in place of TI: Analysis 8%, Algebra 32%, Physics 10%, Best effort 0%, DB 50%
Hierarchical queues when DB is idle: Analysis 8%, Algebra 32%, Physics 10%, Best effort 50%, DB 0%
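The percentages in the queue hierarchy follow from multiplying relative weights down the tree. The sketch below recomputes them; the nested-dictionary representation and the function name `leaf_shares` are assumptions made for the illustration, not a scheduler configuration format.

```python
def leaf_shares(queue, share=100.0):
    """Return {leaf queue name: percentage of the whole cluster}."""
    children = queue.get("children", [])
    if not children:
        return {queue["name"]: share}
    total = sum(child["weight"] for child in children)
    shares = {}
    for child in children:
        # Each child gets its parent's share, scaled by its relative weight.
        child_share = share * child["weight"] / total if total else 0.0
        shares.update(leaf_shares(child, child_share))
    return shares


root = {
    "name": "Root",
    "children": [
        {"name": "Math", "weight": 4, "children": [
            {"name": "Analysis", "weight": 10},
            {"name": "Algebra", "weight": 40},
        ]},
        {"name": "Physics", "weight": 1},
        {"name": "CS", "weight": 5, "children": [
            {"name": "TI", "weight": 20},
            {"name": "DB", "weight": 80},
        ]},
    ],
}

shares = leaf_shares(root)
print(shares)
```

Setting the weight of the first CS child to 0 reproduces the Best effort case, where DB is guaranteed the whole 50% of CS.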
Scheduling strategies: Fair scheduler
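The basic idea of fair scheduling is that running applications get, on average, an equal share of the resources, and that shares are rebalanced as applications come and go. The sketch below shows only this simplest form, with a hypothetical `fair_shares` function and a single resource measured in percent.

```python
def fair_shares(capacity, active_apps):
    """Split the capacity equally between the currently active applications."""
    if not active_apps:
        return {}
    return {app: capacity / len(active_apps) for app in active_apps}


# Shares are recomputed whenever an application starts or finishes.
alone = fair_shares(100, ["A"])
two = fair_shares(100, ["A", "B"])
four = fair_shares(100, ["A", "B", "C", "D"])
print(alone, two, four)
```

Unlike with FIFO, a newly arrived application starts receiving resources as soon as the shares are rebalanced.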
Fine grained resource requests, Memory only: Application A requests 10 GB, Application B requests 30 GB, so the shares are 25% and 75%
Fine grained resource requests: Memory and CPU
Dominant Resource Fairness: Memory (total 1 TB), CPU (total 100 cores)
Dominant Resource Fairness: Application A requests 300 GB and 4 cores (30% Memory, 4% CPU); Application B requests 10 GB and 50 cores (1% Memory, 50% CPU)
Dominant Resource Fairness: the dominant shares are 30% (A, Memory) and 50% (B, CPU), which gives 37.5% for A and 62.5% for B
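The numbers on this slide can be recomputed as follows. This sketch only reproduces the arithmetic shown above, where the final split is proportional to the dominant shares; it assumes 1 TB = 1000 GB, calls the second application B, and uses hypothetical names throughout.

```python
TOTAL = {"memory_gb": 1000, "cores": 100}  # assumes 1 TB = 1000 GB

requests = {
    "A": {"memory_gb": 300, "cores": 4},
    "B": {"memory_gb": 10, "cores": 50},
}


def dominant_share(request, total):
    """Largest fraction of any single resource that the request takes up."""
    return max(request[resource] / total[resource] for resource in total)


dominant = {app: dominant_share(req, TOTAL) for app, req in requests.items()}

# Split in proportion to the dominant shares, as on the slide.
total_dominant = sum(dominant.values())
split = {app: 100 * share / total_dominant for app, share in dominant.items()}

for app in requests:
    print(app, round(100 * dominant[app], 1), round(split[app], 1))
```

A's dominant resource is memory and B's is CPU, so the comparison is made between 30% and 50% rather than between values of the same resource.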
Fine grained resource requests: Memory, CPU, and (work in progress) Disk, Network
Resource container: X GB of memory, W cores at U GHz, Y TB of disk, Z MBps of network
YARN's Node Manager
NodeManager: one per node
Monitoring: Memory, CPU, Disk, Network
Reports to ResourceManager: Memory, CPU, Disk, Network
Container
YARN's Application Masters
Application Master: one per application, and application-specific
Framework-specific application masters: MapReduce, DAG distributed processing, Message Passing Interface, Graph processing
Complexity is moved to the Application Master
Application Master: negotiates resources with the ResourceManager; executes and monitors through the NodeManagers
Fault tolerance is on the application master: relaunch upon failure
Application-specific monitoring: no longer a bottleneck
Application Master is not trusted: it could have an evil plan to book containers and not use them
Summary: Separation between scheduling and monitoring, Scalability, Availability, Multi-tenancy
Forward compatibility with DAGs of tasks