The Evolving Apache Hadoop Ecosystem: What It Means for the Storage Industry


The Evolving Apache Hadoop Ecosystem: What It Means for the Storage Industry
Sanjay Radia, Architect/Founder, Hortonworks Inc.

Outline
- Hadoop (HDFS) and storage
- Data platform drivers and how Hadoop changes the game
- File systems leading to HDFS
- HDFS architecture
- RAID disks or not?
- HDFS's generic storage layer
- Archival storage
- Upcoming HDFS features

Next-Generation Data Platform Drivers
Business drivers: need for new business models and faster growth (20%+); find insights for competitive advantage and optimal returns.
Technical drivers: data continues to grow exponentially; data is increasingly everywhere and in many formats; legacy solutions are unfit for the new requirements and growth.
Financial drivers: the cost of data systems, as a percentage of IT spend, continues to grow; commodity hardware and open source offer cost advantages.

Big Data Changes the Game
Transactions + Interactions + Observations = BIG DATA.
Data volume and variety keep climbing: megabytes of ERP data (purchase details, purchase and payment records), gigabytes of CRM data (segmentation, customer touches, support contacts, offer details and history), terabytes of web data (web logs, user clickstreams, A/B testing, dynamic pricing, search marketing, affiliate networks, dynamic funnels, behavioral targeting), and petabytes of big data (mobile, social interactions and feeds, sentiment, SMS/MMS, speech-to-text, spatial and GPS coordinates, sensors/RFID/devices, business data feeds, external demographics, user-generated content, HD video, audio, and images). Data variety and complexity keep increasing.

How Hadoop Changes the Game
Cost: commodity servers and JBOD disks; horizontal scaling by adding new servers as needed; ride the technology curve; scale from small to very large, in data size or in computation.
The data refinery: traditional data marts optimize for time and space. With Hadoop you don't need to pre-select a subset of the schema, you can experiment to gain insights into your whole data, your insights can change over time, and you don't need to throw your old data away.
Open and growing ecosystem: you aren't locked into a vendor, and there is an ever-growing choice of tools and components.

Distributed File Systems and HDFS

File Systems Background (1): Scaling
(1) Mainframe file systems: vertically scale the namespace, IO, and storage.
(2) Single-system-image distributed file systems: the namespace still scales vertically, while namespace operations and IO scale horizontally.
(3) Modern file systems (Hadoop HDFS, GFS, pNFS): horizontally scale IO and storage.

File Systems Background (3): Leading to Google FS and HDFS
- Separation of metadata from data (1978, 1980): "Separating Data from Function in a Distributed File System" (1978) by J. E. Israel, J. G. Mitchell, and H. E. Sturgis; "A Universal File Server" (1980) by A. D. Birrell and R. M. Needham.
- Horizontal scaling of storage nodes and IO bandwidth (1999-2003): several startups building scalable NFS in the late 1990s; Lustre (1999-2002); Google FS (2003); later Hadoop's HDFS and pNFS.
- Commodity hardware with JBODs, replication, and non-POSIX semantics: Google FS (2003); not using RAID was a very important decision.
- Computation close to the data: parallel databases; Google FS/MapReduce.

HDFS: Scalable, Reliable, Manageable
- Scale IO, storage, and CPU by adding commodity servers and JBODs; 6K nodes in a cluster, 120 PB.
- Fault tolerant and easy to manage: built-in redundancy, tolerates disk and node failures, automatically manages the addition and removal of nodes; one operator per 3K nodes.
- Storage servers are also used for computation: move computation to the data. Not a SAN, but high-bandwidth network access to data via Ethernet.
- Scalable file system: read, write, rename, append; no random writes.
The simplicity of the design is why a small team could build such a large system in the first place.

HDFS Architecture: Two Main Layers
Namespace layer:
- Consists of directories, files, and blocks.
- Operations: create, delete, modify, and list directories/files.
- Supports multiple independent namespaces (federation), which can be mounted using client-side mount tables.
Block storage layer:
- A generic block service that manages DataNode cluster membership.
- Block pool: the set of blocks belonging to a namespace volume; storage is shared across all volumes with no partitioning. Namespace volume = namespace + block pool.
- Block operations: create/delete/modify/getBlockLocation; read and write access to blocks.
- Detects and tolerates faults: monitors node and disk health, periodically verifies blocks, and recovers from failures via replication and placement.
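The split between the two layers is visible even from the client API. The following is a minimal, illustrative Java sketch (not from the talk; the cluster URI and file path are made up): a namespace operation served by the NameNode, followed by a query for the block locations that the block-storage layer provides.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TwoLayersDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(new java.net.URI("hdfs://namenode:8020"), conf);

    Path file = new Path("/data/events.log");     // namespace: a path in the directory tree
    FileStatus status = fs.getFileStatus(file);   // namespace operation, served by the NameNode

    // Block-storage view: which DataNodes hold each block of the file.
    BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
    for (BlockLocation b : blocks) {
      System.out.println("offset=" + b.getOffset()
          + " length=" + b.getLength()
          + " hosts=" + String.join(",", b.getHosts()));
    }
    fs.close();
  }
}
```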

HDFS Architecture
The NameNode holds the namespace state: a hierarchical namespace mapping file names to block IDs, with persistent namespace metadata and a journal (e.g., stored on NFS), plus a block map from block ID to block locations. DataNodes hold the block storage itself (block ID to data) on JBOD disks and report to the NameNode via heartbeats and block reports. This design horizontally scales IO and storage.
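As a rough mental model of the slide above, the NameNode can be thought of as keeping two maps: file name to block IDs (persisted via the image and journal) and block ID to replica locations (rebuilt from DataNode block reports). This is a deliberately simplified sketch, not actual NameNode code.

```java
import java.util.*;

class MiniNamenode {
  // Namespace: file name -> ordered list of block IDs (persisted via image + journal).
  private final Map<String, List<Long>> namespace = new HashMap<>();
  // Block map: block ID -> replica locations (rebuilt from block reports, not persisted).
  private final Map<Long, Set<String>> blockMap = new HashMap<>();

  void addBlock(String file, long blockId) {
    namespace.computeIfAbsent(file, f -> new ArrayList<>()).add(blockId);
  }

  // Called when a block report or blockReceived message arrives from a DataNode.
  void recordReplica(long blockId, String datanode) {
    blockMap.computeIfAbsent(blockId, b -> new HashSet<>()).add(datanode);
  }

  // For each block of the file, return the DataNodes currently holding replicas.
  List<Set<String>> locate(String file) {
    List<Set<String>> locations = new ArrayList<>();
    for (long blockId : namespace.getOrDefault(file, Collections.emptyList())) {
      locations.add(blockMap.getOrDefault(blockId, Collections.emptySet()));
    }
    return locations;
  }
}
```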

Clients Read and Write Directly from the Closest Server
To read, a client (1) opens the file at the NameNode, which consults the namespace state and block map, and then (2) reads the blocks directly from the closest DataNodes. To write, the client (1) creates the file at the NameNode and then (2) writes the data directly to DataNodes. Checksums are verified end to end.
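A hedged illustration of that flow using the standard Hadoop FileSystem API (the URI and path are hypothetical): the NameNode is contacted only for create/open, while the bytes stream directly between the client and the DataNodes, with checksums verified on read.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadWriteDemo {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new java.net.URI("hdfs://namenode:8020"), new Configuration());
    Path p = new Path("/tmp/hello.txt");

    try (FSDataOutputStream out = fs.create(p, true)) {  // NameNode allocates blocks,
      out.writeUTF("hello hdfs");                        // data flows through the DataNode pipeline
    }
    try (FSDataInputStream in = fs.open(p)) {            // open() returns block locations,
      System.out.println(in.readUTF());                  // the read goes to the closest replica
    }
    fs.close();
  }
}
```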

Quiz: What Is the Common Attribute?

HDFS Actively Maintains Data Reliability
Block checksums are checked periodically. When a bad or lost block replica is detected, the NameNode (1) schedules re-replication, (2) a surviving DataNode copies the block to another DataNode, and (3) the receiving DataNode confirms with a blockReceived report that updates the block map.
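Conceptually, the replication logic behaves like the following sketch (my simplification, not the real NameNode implementation): detect under-replicated blocks and schedule copies from a surviving replica to a DataNode that does not yet hold the block.

```java
import java.util.*;

class ReplicationMonitor {
  static final int TARGET_REPLICATION = 3;

  // Assumes at least one healthy replica of the block remains.
  void check(long blockId, Set<String> liveReplicas, List<String> liveDatanodes) {
    int missing = TARGET_REPLICATION - liveReplicas.size();
    for (int i = 0; i < missing; i++) {
      String source = liveReplicas.iterator().next();           // any healthy replica
      String target = pickTarget(liveDatanodes, liveReplicas);  // a node without this block
      System.out.printf("schedule copy of block %d: %s -> %s%n", blockId, source, target);
      liveReplicas.add(target); // the DataNode later confirms with a blockReceived report
    }
  }

  private String pickTarget(List<String> nodes, Set<String> exclude) {
    for (String n : nodes) if (!exclude.contains(n)) return n;
    throw new IllegalStateException("no eligible DataNode");
  }
}
```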

Significance of Not Using Disk RAID
The key new idea was to not use RAID on the local disks and instead replicate the data.
- Device failures, media failures, node failures, and so on are handled uniformly: nodes and clusters continue to run, with slightly lower capacity, and nodes and disks can be fixed when convenient.
- Recovery from failures happens in parallel: a RAID-5 rebuild takes over half a day to recover from a 1 TB disk failure, whereas HDFS recovers 12 TB in minutes because recovery runs in parallel, and it gets faster on larger clusters.
- Operational advantage: the system recovers automatically and very rapidly from node and disk failures; one operator for 4K servers at mature Hadoop installations.
- Yes, there is storage overhead. File-RAID is available (1.4-1.6x efficiency) and is practical for a portion of the data; otherwise node recovery takes too long.
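The recovery-time claim can be sanity-checked with back-of-the-envelope arithmetic. The numbers below are assumptions chosen only to illustrate the shape of the argument, not measurements from the talk.

```java
public class RecoveryMath {
  public static void main(String[] args) {
    // RAID-5: the rebuild is throttled by a single disk's write bandwidth.
    double rebuildMBps = 25;          // assumed throttled rebuild rate of one disk
    double failedDiskTB = 1;
    double raidHours = failedDiskTB * 1e6 / rebuildMBps / 3600;

    // HDFS: many DataNodes re-replicate different blocks of the failed node at once.
    double failedNodeTB = 12;         // assumed data on one failed DataNode
    int parallelStreams = 200;        // assumed concurrent source/target copy pairs
    double perStreamMBps = 50;
    double hdfsMinutes = failedNodeTB * 1e6 / (parallelStreams * perStreamMBps) / 60;

    System.out.printf("RAID-5 rebuild of 1 TB: ~%.1f hours%n", raidHours);            // roughly half a day
    System.out.printf("HDFS re-replication of 12 TB: ~%.0f minutes%n", hdfsMinutes);  // minutes
  }
}
```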

Current HDFS Availability and Data Integrity
Simple design, storage fault tolerance:
- Storage: rely on the OS's file system rather than raw disk.
- Storage fault tolerance: multiple replicas and active monitoring.
- Single NameNode master: persistent state kept as multiple copies plus checkpoints; restart on failure.
How well did it work?
- Lost 19 out of 329 million blocks on 10 clusters with 20K nodes in 2009, i.e. seven 9s of reliability; the underlying bugs were fixed in 0.20 and 0.21.
- An 18-month study found 22 failures on 25 clusters, or 0.58 failures per cluster per year; only 8 of them would have benefited from HA failover (0.23 failures per cluster per year).
- The NameNode is very robust, can take a lot of abuse, and is resilient against overload caused by misbehaving apps.
- Automatic failover is available in Hadoop 1, and in Hadoop 2 (alpha).

HDFS as a Generic Storage Service: Opportunities for Innovation
- Federation: a distributed (partitioned) namespace, simple and robust due to independent masters, providing scalability, isolation, and availability.
- Alternate NameNode implementations and new services on independent block pools: a new file system keeping only a partial namespace in memory; MapReduce tmp storage or HBase directly on block storage; shadow file systems that cache HDFS, NFS, or S3.
- Future: move block management into the DataNodes, which simplifies namespace and application implementation and makes a distributed NameNode significantly simpler.
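The client-side mount tables mentioned with federation are provided by ViewFs, which lets a client stitch several independent namespaces into one view. A hedged sketch follows; the NameNode URIs and mount points are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ViewFsDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "viewfs:///");
    // Each link maps a path in the unified view onto one federated NameNode.
    conf.set("fs.viewfs.mounttable.default.link./user", "hdfs://nn1:8020/user");
    conf.set("fs.viewfs.mounttable.default.link./tmp",  "hdfs://nn2:8020/tmp");

    FileSystem viewFs = FileSystem.get(conf);
    for (FileStatus s : viewFs.listStatus(new Path("/user"))) {
      System.out.println(s.getPath());   // actually served by nn1
    }
  }
}
```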

Archival Data: Where Should It Sit?
Hadoop encourages keeping old data for future analysis. Should it sit in a separate cluster with lower computing power? Spindles are one of the bottlenecks in big data systems, and we can't waste precious spindles on another cluster where they are underutilized; it is better to keep archival data in the main cluster where the spindles can be actively used. As data grows over time, a big data cluster may have a smaller percentage of hot data. Should we arrange data on the disk platter based on access patterns? The challenge is ever-growing data. Do tapes play a role?

HDFS in Hadoop 1 and Hadoop 2
Hadoop 1 (GA): security; append/fsync (HBase); WebHDFS + SPNEGO; write pipeline improvements; local write optimization; performance improvements; disk-fail-in-place; HA NameNode; full-stack HA in progress.
Hadoop 2 (alpha): new append; federation; wire compatibility; edit logs rewrite; faster startup; HA NameNode (hot); full-stack HA.
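For the HA NameNode, a client typically addresses a logical nameservice rather than a single host. The sketch below shows roughly what the client-side configuration looks like in later Hadoop 2 releases; the nameservice ID, NameNode IDs, and hostnames are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class HaClientConfig {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://mycluster");
    conf.set("dfs.nameservices", "mycluster");
    conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
    conf.set("dfs.namenode.rpc-address.mycluster.nn1", "namenode1:8020");
    conf.set("dfs.namenode.rpc-address.mycluster.nn2", "namenode2:8020");
    // Client-side proxy that retries against whichever NameNode is currently active.
    conf.set("dfs.client.failover.proxy.provider.mycluster",
        "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");

    FileSystem fs = FileSystem.get(conf);  // transparently fails over between nn1 and nn2
    System.out.println(fs.getUri());
  }
}
```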

Hadoop Full-Stack HA Architecture
The slave nodes of the Hadoop cluster keep running jobs (with applications also running outside the cluster) while the master daemons (NameNode, JobTracker) run on an HA cluster with N+K failover; during failover the JobTracker is put into safe mode.

Upcoming HDFS Features
- Continued improvements: performance, monitoring and management, rolling upgrades, disaster recovery.
- Full-stack HA in Hadoop 2.
- Snapshots (a prototype has already been published).
- Support for heterogeneous storage on DataNodes, which will allow flash disks to be used for specific use cases.
- Block grouping, to allow a large number of smaller blocks.
- Keeping only the working set of the namespace in memory, to support a very large number of files.
- Other protocols, such as NFS.
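Snapshots, listed above as upcoming, later surfaced in the FileSystem API roughly as follows. This is a sketch: the path and snapshot name are hypothetical, and an administrator must first mark the directory snapshottable (e.g. `hdfs dfsadmin -allowSnapshot /data/warehouse`).

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SnapshotDemo {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new java.net.URI("hdfs://namenode:8020"), new Configuration());
    Path dir = new Path("/data/warehouse");

    // Take a read-only, point-in-time snapshot of the directory.
    Path snap = fs.createSnapshot(dir, "before-cleanup");
    System.out.println("created " + snap);   // e.g. /data/warehouse/.snapshot/before-cleanup

    // Snapshots are cheap to drop once no longer needed.
    fs.deleteSnapshot(dir, "before-cleanup");
    fs.close();
  }
}
```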

Which Apache Hadoop Distro?
Stable, reliable, and well supported, but without locking into a vendor.

Hortonworks Data Platform
Delivers enterprise-grade functionality on a proven Apache Hadoop distribution to ease management, simplify use, and ease integration into the enterprise:
- Simplified deployment to get started quickly and easily.
- Monitoring and management of any size cluster with a familiar console and tools.
- The only platform to include data integration services to interact with any data source.
- Metadata services that open the platform for integration with existing applications.
- A dependable high-availability architecture.
- The only 100% open source data platform for Apache Hadoop.

Hortonworks Data Platform: A Tested, Hardened, and Proven Distribution Reduces Risk
Hadoop 1.0.x is the stable kernel. HDP integrates and tests all necessary Apache component projects (Hadoop, HBase, Hive, Pig, HCatalog, Zookeeper, Oozie, Sqoop, Ambari, plus Talend), choosing the most stable and compatible versions of each, and stays closely aligned with each Apache component project's code line, with a closedown process that provides the necessary buffer. The same QE/beta process that produced every single stable Apache Hadoop release is being used to QE and alpha/beta test the current Apache Hadoop 2.

Summary
Hadoop changes the big data game in fundamental ways:
- Cost, storage, and compute capacity: JBODs rather than RAID, RAID arrays, or SANs; the block-storage layer is being further generalized; horizontal scaling from small to very, very large.
- The data refinery: keep all your data and gain new business insights. Archival data is merely cold/warm data, so where should it sit? Growth is exponential (new data plus old data) and the cost is low.
- Open, growing ecosystem: no lock-in.
Hortonworks and the HDP distribution:
- The team that originally created Apache Hadoop at Yahoo, and that is driving key developments in Apache Hadoop.
- QE'ed and alpha/beta tested every stable Hadoop release, including Hadoop 1 and Hadoop 2 (alpha).
- HDP is fully open source, contributed to Apache with nothing held back, and close or identical to the Apache releases.

Thank You! Questions & Answers: sanjay@hortonworks.com