MixApart: Decoupled Analytics for Shared Storage Systems. Madalin Mihailescu, Gokul Soundararajan, Cristiana Amza University of Toronto and NetApp

1 MixApart: Decoupled Analytics for Shared Storage Systems Madalin Mihailescu, Gokul Soundararajan, Cristiana Amza University of Toronto and NetApp

2 Hadoop Pig, Hive Hadoop + Enterprise storage?! Shared storage (e.g., NAS)

3 Hadoop+Enterprise: Two Storage Silos. Costs: Hadoop hardware ($$$), cross-silo data management ($$$), periodic data ingest.

4 Our Solution: MixApart. MapReduce analytics on enterprise storage. Enterprise storage: single reliable data store. On-disk cache for scalability. Transparent and on-demand ingest. (Diagram: MapReduce nodes, each with a cache.)

5 Data Flow with MixApart. Data reuse. Map task parallelism depends on: storage bandwidth, cache reuse, Map task I/O rates. (Diagram: Map tasks feeding Reduce tasks.)

6 Workload Analysis. Extrapolate from recent studies*: production traces from Facebook, Bing, Yahoo. Insights: high data reuse across jobs (e.g., ~60%); low I/O to CPU ratio in input phases (e.g., ~25 Mbps); predictable I/O demands. * Ananthanarayanan et al., NSDI '12; Chen et al., VLDB '12.

7 Scale Estimates. Map task I/O rate: 25 Mbps. Shared storage bandwidth: 10 Gbps. (Chart: number of Map tasks, on an axis reaching 10,000, against data reuse ratio; annotations read "100, parallel tasks" and "1, parallel tasks".)
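The scale estimate can be reproduced with a short back-of-the-envelope calculation. The sketch below is not from the slides: it assumes that a fraction `reuse_ratio` of each Map task's input is served from the on-disk cache, so only the remainder reaches shared storage. The function name and the formula are illustrative assumptions; only the 25 Mbps and 10 Gbps figures come from the deck.

```python
def max_parallel_map_tasks(storage_gbps: float, task_mbps: float,
                           reuse_ratio: float) -> float:
    """Estimate how many Map tasks shared storage can feed at once.

    Assumption (not stated on the slide): cached input generates no
    storage traffic, so each task draws task_mbps * (1 - reuse_ratio).
    """
    if not 0.0 <= reuse_ratio < 1.0:
        raise ValueError("reuse_ratio must be in [0, 1)")
    storage_mbps = storage_gbps * 1000.0  # 1 Gbps = 1000 Mbps
    return storage_mbps / (task_mbps * (1.0 - reuse_ratio))


if __name__ == "__main__":
    # Slide figures: 10 Gbps shared storage, 25 Mbps per Map task.
    for reuse in (0.0, 0.6, 0.9):
        tasks = max_parallel_map_tasks(10, 25, reuse)
        print(f"reuse={reuse:.1f}: ~{round(tasks)} parallel Map tasks")
```

Under this model, 10 Gbps of storage bandwidth sustains about 400 tasks with no reuse and about 1,000 tasks at the ~60% reuse reported in the workload analysis.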

8 MixApart Design. Storage back-end bandwidth management: saturate bandwidth with Map I/O streams without impacting job performance. Cache management: ensure high cached data reuse. Compute management: assign Map tasks to nodes with cached data.

9 MapReduce Optimization. Predictable job I/O demands at submission: user-specified job input data path; derived Map task I/O rates. Just-in-time parallel data prefetch within and across jobs.

10 MixApart Architecture. (Diagram: JobTracker, XDFS NameNode, Scheduler, Location Map, Data Transfer Scheduler, and nodes, each with a cache; flows are labeled job priorities, job I/O demands, and data locations.) Scheduler: co-locates compute and data using job priorities and data in the cache. Data Transfer Scheduler: issues prefetches using available storage bandwidth, job priorities, and Map I/O rates.
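One way to picture how a data transfer scheduler could combine available storage bandwidth, job priorities, and Map I/O rates is a greedy admission pass. The sketch below is a hypothetical illustration, not the MixApart implementation: `PrefetchRequest`, `admit_prefetches`, and the convention that a larger priority number is more important are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class PrefetchRequest:
    job: str
    priority: int      # assumed convention: larger means more important
    file: str
    rate_mbps: float   # Map task I/O rate for this input


def admit_prefetches(requests, available_mbps):
    """Greedily admit prefetch streams, highest priority first.

    A request is admitted only if its rate still fits in the remaining
    storage bandwidth budget; otherwise it is deferred.
    """
    admitted, deferred = [], []
    remaining = available_mbps
    # sorted() is stable, so equal-priority requests keep arrival order.
    for req in sorted(requests, key=lambda r: -r.priority):
        if req.rate_mbps <= remaining:
            admitted.append(req)
            remaining -= req.rate_mbps
        else:
            deferred.append(req)
    return admitted, deferred


if __name__ == "__main__":
    pending = [
        PrefetchRequest("B", 1, "F9", 25.0),
        PrefetchRequest("A", 2, "F2", 25.0),
        PrefetchRequest("A", 2, "F4", 25.0),
    ]
    admitted, deferred = admit_prefetches(pending, available_mbps=60.0)
    print([r.file for r in admitted], [r.file for r in deferred])
```

Capping admitted streams at the bandwidth budget is one simple reading of the design goal of saturating storage bandwidth without hurting job performance; a real scheduler would also re-run admission as transfers finish.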

11 MixApart in Action. Step 1: a job with inputs F1, F2, F3, F4 is submitted; JobTracker and XDFS NameNode exchange job input info. Cache state: Node 1 holds F1, Node 2 holds F3. Shared storage holds F1, F2, F3, F4.

12 MixApart in Action. Step 2: the Data Transfer Scheduler transfers F2 to Node 1 and F4 to Node 2. Step 3: tasks T1, T2, T3, T4 are created.

13 MixApart in Action. Step 4: compute T1 (Node 1) and T3 (Node 2) on cached data while F2 and F4 are prefetched; T2 and T4 wait at the schedulers.

14 MixApart in Action. T2 runs on Node 1 and T4 on Node 2 once F2 and F4 are in the caches.
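The walkthrough above can be sketched as a small planning function that splits a job's inputs into tasks that can run immediately on cached data and files that must be prefetched first. This is a simplified illustration under stated assumptions: `plan_job` and its data structures are hypothetical names, each file is assumed to be cached on at most one node, and the choice of target node for a prefetch is left out.

```python
def plan_job(input_files, cache_contents):
    """Split job inputs into runnable tasks and pending prefetches.

    input_files: ordered list of file names; task Ti reads the i-th file.
    cache_contents: dict mapping node name -> set of files cached there.
    Returns (run_now, prefetch) where run_now holds (task, file, node)
    tuples and prefetch holds (task, file) tuples.
    """
    # Invert the cache map: file -> node that holds it (assumes one copy).
    location = {f: node for node, files in cache_contents.items() for f in files}
    run_now, prefetch = [], []
    for i, f in enumerate(input_files, start=1):
        task = f"T{i}"
        if f in location:
            run_now.append((task, f, location[f]))  # co-locate with cached data
        else:
            prefetch.append((task, f))              # fetch from shared storage
    return run_now, prefetch


if __name__ == "__main__":
    # Cache state from the slides: Node 1 holds F1, Node 2 holds F3.
    caches = {"Node 1": {"F1"}, "Node 2": {"F3"}}
    run_now, prefetch = plan_job(["F1", "F2", "F3", "F4"], caches)
    print("compute now:", run_now)
    print("prefetch:", prefetch)
```

With the cache state shown on the slides, T1 and T3 are scheduled immediately on Node 1 and Node 2, and F2 and F4 are queued for prefetch, matching step 4 of the walkthrough.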

15 MixApart Prototype. Re-engineered Hadoop MapReduce and HDFS. XDFS cache: stateless HDFS + NFS support. Compute scheduler: FIFO task scheduler, made cache-aware. Data transfer scheduler: module in NameNode.

16 Evaluation on Amazon EC2. MixApart vs. Hadoop. 100-core compute cluster: 50 EC2 VM instances, each with 7.5 GB RAM and 850 GB local storage; local VM instance storage used for XDFS cache and HDFS. NFS server: EC2 instance with 4 EBS volumes in a RAID-0 setting; 1 Gbps bandwidth for analytics.

17 Microbenchmarks. Dataset: 12 days of Wikipedia statistics. Workload: MR job to aggregate page views for a regex. Job on uncompressed data: I/O intensive. Job on compressed data: CPU intensive.

18 Impact of Ingest. (Bar chart: time in seconds, split into compute and ingest, for MixApart and Hadoop+ingest on the I/O-intensive and CPU-intensive jobs; visible annotation: -28%.) MixApart is faster: overlap of compute and ingest. Next: MixApart vs. ideal Hadoop with no static ingest.

19 Microbenchmark Job Durations. (Chart: job duration in seconds against data reuse ratio for MixApart, Hadoop-ideal, and Hadoop+ingest; annotation: "reuse: MixApart ~ Hadoop".)

20 2 Jobs Co-scheduled. Job A: high priority, high reuse. Job B: low priority, low reuse. (Timeline chart, time normalized to Hadoop. Hadoop-ideal: compute A while B waits, then compute B. MixApart: compute A while prefetching B, then compute B.)

21 2 Jobs Co-scheduled. Job A: high priority, low reuse. Job B: low priority, high reuse. (Timeline chart, time normalized to Hadoop. Hadoop-ideal: compute A while B waits, then compute B. MixApart: compute A and compute B overlap; visible annotation: -37%.) MixApart: work-conserving compute scheduling.

22 Facebook Hadoop Trace. (Chart: data reuse fraction by hour.)

23 Facebook Job Durations. MixApart matches Hadoop when ignoring ingest! (Bar chart: job duration in seconds for MixApart and Hadoop-ideal across reuse traces, including 0.48 and 0.81; visible annotations: +0.9%, +0.2%.)

24 Facebook Concurrency. (Chart: CDF of the number of running tasks for MixApart and Hadoop-ideal, showing Map phase and Reduce phase parallelism.)

25 MixApart Summary. MapReduce analytics on enterprise storage. Enterprise storage as the single reliable data store: optimized storage efficiency, simplified data management. MixApart is faster than ingest-then-compute Hadoop and comparable to Hadoop with no ingest. (Diagram: MapReduce nodes, each with a cache.)

26 Thank you! Questions?
