Hadoop
Mahboubeh Dadkhah, Annual Workshop of the Web Technology Laboratory, Winter 1391 (2012-13)
Outline
- Big Data
- Big Data examples
- Challenges with traditional storage
- NoSQL
- Hadoop: HDFS, MapReduce, architecture
Big Data
In information technology, big data is a collection of data sets so large and complex that it becomes difficult to process them using on-hand database management tools or traditional data-processing applications.
Big Data Examples
- Science: astronomy, atmospheric science, satellite imagery, medical records, genomics, biological and biogeochemical data
- Web and social: social networks, social data, web logs, photography archives, video archives, Internet text and documents, Internet search indexing, web server logs
- Sensors and operations: sensor networks, RFID, call detail records, large-scale e-commerce, traffic-flow sensors
- Industry and government: marketing, banking transactions, scans of government documents, military surveillance
- The Large Hadron Collider (LHC) experiments have about 150 million sensors delivering data 40 million times per second; this is equivalent to 500 quintillion (5 x 10^20) bytes per day.
- Decoding the human genome originally took 10 years; now it can be achieved in one week.
- The NASA Center for Climate Simulation (NCCS) stores 32 petabytes of climate observations and simulations.
http://en.wikipedia.org/wiki/big_data
Social Media

Facebook
http://www.searchenginejournal.com/stats-on-facebook-2012-infographic/40301/

Facebook
Facebook announced that monthly active users were at 1.01 billion as of September 30th, an increase of 26% year-over-year.
http://techcrunch.com/2012/10/23/facebook-announces-monthly-active-users-were-at-1-01-billion-as-of-september-30th/

Twitter
http://www.mediabistro.com/alltwitter/twitter-statistics-2012_b18914
General Internet Statistics, 2012
In one day on the Internet:
- Enough information is consumed to fill 168 million DVDs
- 294 billion emails are sent
- 2 million blog posts are written (enough posts to fill TIME magazine for 770 million years)
- 250 million photos are uploaded
- 864,000 hours of video are uploaded to YouTube
- 4.7 billion minutes are spent on Facebook
- 532 million statuses are updated
- 22 million hours of TV and movies are watched on Netflix
- More than 35 million apps are downloaded
- More iPhones are sold than people are born
- 172 million people visit Facebook, 40 million visit Twitter, 22 million visit LinkedIn, 20 million visit Google+, and 17 million visit Pinterest
http://thesocialskinny.com/100-social-media-mobile-and-internet-statistics-for-2012/
The Three Vs of Big Data
Commonly used to characterize different aspects of big data:
- Volume: processing large amounts of information is the main attraction of big data analytics and the most immediate challenge to conventional IT structures. It calls for scalable storage and a distributed approach to querying.
- Velocity: industry terminology for such fast-moving data tends to be either "streaming data" or "complex event processing". There are two main reasons to consider stream processing: the input data arrive too fast to store in their entirety, or the application demands an immediate response to the data.
- Variety: a common theme in big data systems is that the source data is diverse and doesn't fall into neat relational structures: text from social networks, image data, a raw feed directly from a sensor source.
http://strata.oreilly.com/2012/01/what-is-big-data.html#variety
Challenges with Traditional Storage
Relational databases work very well, but because they are vertically scaled, performance degrades quickly as the amount of data and the number of users grow. To increase performance, either very expensive software and hardware must be bought, or some RDBMS advantages must be given up.
Big Data solutions were born to solve two problems classic databases could not:
- being truly scalable at low cost
- being able to work with non-modeled and non-structured data (i.e., originally, Internet data)
http://www.capgemini.com/technology-blog/2012/06/nosql-hadoop/
Non-structured Data
http://www.capgemini.com/technology-blog/2011/11/what-is-big-data/
NoSQL
To name this new family of solutions, the word "NoSQL" was invented and first used in 1998. NoSQL doesn't mean "No SQL" but "Not only SQL"; and "SQL" here stands for relational databases, not the SQL language. The "NoSQL" expression can be confusing, but it sounds good, which is why it is still used today.
NoSQL is a broad class of database management systems. NoSQL databases are not built primarily on tables and generally do not use SQL for data manipulation.
The read/write latency of a NoSQL database like Cassandra can be up to 30 times lower than that of an equivalent relational database like MySQL when both databases are loaded with 50+ GB of data.
http://www.capgemini.com/technology-blog/2012/06/nosql-hadoop/
http://www.appdynamics.com/blog/2011/05/18/will-nosql-kill-the-dba/
NoSQL Implementations
The most well-known technology used for Big Data is Hadoop.
The Solution for Big Data
We are looking at newer programming models, with supporting algorithms and data structures.

History
Google published its Google File System (GFS, 2003) and MapReduce (2004) papers, and Hadoop is derived from these papers.

The Solution for Big Data
But Google's implementations are not open source, so Doug Cutting created the open-source version named Hadoop. The two major components of Hadoop:
- a programming model called MapReduce, for processing big data (from Google's MapReduce)
- a supporting file system called the Hadoop Distributed File System (HDFS) (from the Google File System, GFS)
Storage: HDFS
HDFS
The Hadoop Distributed File System is a distributed file system designed to run on commodity hardware.
- Designed to store very large data sets reliably
- Self-healing: rebalances files across the cluster
- Scalable, just by adding new nodes
- Highly fault-tolerant when nodes fail
HDFS stores file system metadata and application data separately, and blocks are replicated to handle hardware failure (block oriented). Node types:
- Namenode (only one per cluster)
- Secondary namenode (checkpoint node; only one per cluster)
- Datanodes (many per cluster)
HDFS Blocks
- Disk blocks: the minimum amount of data that can be read or written (~512 bytes).
- Filesystem blocks: an abstraction over disk blocks (~a few kilobytes).
- HDFS blocks: an abstraction over filesystem blocks, to facilitate distribution over the network and other requirements of Hadoop. Usually 64 MB or 128 MB.
The block abstraction keeps the design simple: e.g., replication is at block level rather than file level. A file is split into blocks for storage in HDFS, and blocks of the same file can reside on multiple machines in the cluster. Each block is stored as a file in the local filesystem of the DataNode. Block size does not mean size on disk: a 1 MB file will not take up 64 MB on disk.
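The block-splitting behavior described above, including why the last block of a file does not waste a full 64 MB on disk, can be sketched in plain Python (illustrative only, not Hadoop's actual API; 64 MB is the classic default block size):

```python
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, the classic HDFS default

def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return (offset, length) pairs for each block of a file.

    The last block only occupies the bytes actually written, which is
    why a 1 MB file does not consume 64 MB on disk.
    """
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks

# A 200 MB file -> three full 64 MB blocks plus one 8 MB block
print(split_into_blocks(200 * 1024 * 1024))
# A 1 MB file -> a single 1 MB block
print(split_into_blocks(1024 * 1024))
```

Each (offset, length) pair would correspond to one block file stored in the local filesystem of some DataNode.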
Processing: MapReduce
What is MapReduce?
It is a framework to:
- automatically partition jobs that have large input data sets into simpler work units or tasks and distribute them to the nodes of a cluster (map), and
- combine the intermediate results of those tasks (reduce) in a way that produces the required results.
It runs on key/value pairs and moves computation to the data.
Simple Example
(Diagram: the input data is split and mapped on Node 1 and Node 2, and the mapped data is then combined into the result.)
What is MapReduce?
A parallel programming model meant for large clusters: the user implements Map() and Reduce(). It is also a parallel computing framework that handles parallelization, fault tolerance, data distribution, load balancing, and status monitoring, simplifying the parallelization and distribution of large-scale computations in clusters. The MapReduce library does most of the hard work for us! It is used extensively in many applications inside Google and Yahoo that require simple processing tasks but have large input data sets.
Word Count Execution
Input (three splits):
  "the quick brown fox" | "the fox ate the mouse" | "how now brown cow"
Map output, one mapper per split:
  (the,1) (quick,1) (brown,1) (fox,1) | (the,1) (fox,1) (ate,1) (the,1) (mouse,1) | (how,1) (now,1) (brown,1) (cow,1)
Shuffle & sort groups pairs by key; Reduce output, split across two reducers:
  (brown,2) (fox,2) (how,1) (now,1) | (the,3) (ate,1) (cow,1) (mouse,1) (quick,1)
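The word-count flow on this slide can be simulated on a single machine in plain Python (a sketch, not Hadoop code; the function names `map_fn` and `reduce_fn` are illustrative):

```python
from collections import defaultdict

# The three input splits from the slide
lines = ["the quick brown fox", "the fox ate the mouse", "how now brown cow"]

def map_fn(line):
    # Map: emit (word, 1) for every word in one input split
    return [(word, 1) for word in line.split()]

# Shuffle & sort: group all intermediate pairs by key
groups = defaultdict(list)
for line in lines:
    for key, value in map_fn(line):
        groups[key].append(value)

def reduce_fn(key, values):
    # Reduce: sum all counts for one word
    return (key, sum(values))

counts = dict(reduce_fn(k, v) for k, v in sorted(groups.items()))
print(counts)
# {'ate': 1, 'brown': 2, 'cow': 1, 'fox': 2, 'how': 1,
#  'mouse': 1, 'now': 1, 'quick': 1, 'the': 3}
```

The result matches the reduce output shown on the slide, e.g. (the,3) and (brown,2).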
An Optimization: The Combiner
A local reduce function for repeated keys produced by the same map. It works for associative operations like sum, count, and max, and decreases the amount of intermediate data. Example: local counting for word count:

  def combiner(key, values):
      output(key, sum(values))
Word Count with Combiner
Same input; the combiner runs after each map, so the second mapper ("the fox ate the mouse") emits (the,2) instead of two separate (the,1) pairs. The reduce output is unchanged:
  (brown,2) (fox,2) (how,1) (now,1) | (the,3) (ate,1) (cow,1) (mouse,1) (quick,1)
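The combiner step can be sketched concretely (a toy sketch; the pair format and function name are illustrative, not Hadoop's API), using the second mapper's output from the slide:

```python
from collections import defaultdict

def combiner(pairs):
    # Local reduce on ONE mapper's output: sum repeated keys
    # before anything is sent over the network in the shuffle.
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())

# Output of the second mapper ("the fox ate the mouse") before combining:
mapped = [("the", 1), ("fox", 1), ("ate", 1), ("the", 1), ("mouse", 1)]
combined = combiner(mapped)
print(sorted(combined))
# [('ate', 1), ('fox', 1), ('mouse', 1), ('the', 2)] -- 4 pairs instead of 5
```

Only repeated keys shrink, but for skewed data (like the word "the") the saving in intermediate data can be large.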
Programming Model
Input and output are each a set of key/value pairs.
Map: processes input key/value pairs and computes a set of intermediate key/value pairs:
  map(in_key, in_value) -> list(int_key, intermediate_value)
Reduce: combines all the intermediate values that share the same key and produces a set of merged output values (usually just one per key):
  reduce(int_key, list(intermediate_value)) -> list(out_value)
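A minimal single-machine sketch of these two signatures (illustrative only; real MapReduce distributes the work across a cluster). As a usage example other than word count, it builds a tiny inverted index; the document names and texts are made up:

```python
from collections import defaultdict

def map_reduce(inputs, map_fn, reduce_fn):
    """Toy driver for the programming model above.

    map_fn(in_key, in_value)  -> list of (int_key, intermediate_value)
    reduce_fn(int_key, values) -> merged output value(s) for that key
    """
    intermediate = defaultdict(list)
    for in_key, in_value in inputs:
        for int_key, int_value in map_fn(in_key, in_value):
            intermediate[int_key].append(int_value)
    return {k: reduce_fn(k, vs) for k, vs in sorted(intermediate.items())}

# Usage: inverted index -- map emits (word, doc_id), reduce merges doc lists
docs = [("d1", "hadoop stores data"), ("d2", "hadoop processes data")]
index = map_reduce(
    docs,
    lambda doc_id, text: [(word, doc_id) for word in text.split()],
    lambda word, doc_ids: sorted(set(doc_ids)),
)
print(index["hadoop"])  # ['d1', 'd2']
```

Note how the same driver runs any job: only the two user functions change.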
Example: Count Occurrences of Each Letter in a Big File
Big file (640 MB): "a t b o m a p r r e d u c e g o o o g l e a p i m a c a c a b r a a r r o z f e i j a o t o m a t e c r u i m e s s o l ..."
There are 26 different keys (letters in the range [a..z]).
1) Split the file into 10 pieces of 64 MB each. The master and all workers start idle.
2) The master assigns map and reduce tasks to the idle workers: some become mappers, some become reducers.
3) Each map task reads its split of the data; the reduce tasks stay idle for now.
4) Process the data in memory. A partition function maps the letters to regions: the 26 letters [a..z] are divided among the regions R1..R4. Map task 1 on machine 1 emits pairs such as (a,1), (t,1), (b,1), (o,1), (m,1), (a,1), (p,1), (r,1) into the in-memory buffers for R1..R4.
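The partition function can be written, for example, as a simple range partitioner. The slide fixes letters a-g in region R1; the exact boundaries of R2-R4 below are an assumption for illustration:

```python
import string

NUM_REGIONS = 4  # R = 4 reduce regions, as in the slides

def partition(key, num_regions=NUM_REGIONS):
    """Range partitioner: assign a lowercase-letter key to a region 1..R.

    With 26 letters and 4 regions this puts a-g in R1 (matching the
    slide); the boundaries of R2-R4 are an illustrative assumption.
    """
    index = string.ascii_lowercase.index(key)
    per_region = -(-26 // num_regions)   # ceil(26 / 4) = 7 letters per region
    return index // per_region + 1       # regions numbered R1..R4

print([partition(c) for c in "abcdefg"])  # all 7 letters land in region 1
print(partition("z"))                     # the last letters land in region 4
```

Real MapReduce defaults to hashing (`hash(key) mod R`) instead of ranges, but any deterministic function of the key works: all pairs with the same key must reach the same reducer.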
5) Apply the combiner function in memory: (a,1) and (a,1) become (a,2), so the buffered pairs for map task 1 become (a,2), (b,1), (m,1), (o,1), (p,1), (r,1), (t,1) across regions R1..R4.
6) Store the combined intermediate results on the local disk of machine 1, partitioned into the regions R1..R4.
7) Inform the master about the position of map task 1's intermediate results on the local disk.
8) The master assigns the next task (map task 5) to the worker that has just become free.
9) The master forwards the locations of map task 1's intermediate results to the reducers: region R1's location goes to reduce task 1, and so on for each region Rx.
Reduce task 1 is responsible for region 1 (letters a-g). The mappers' region-1 outputs are, for example: (a,2) (b,1) (e,1) (d,1) (c,1) (e,1) (g,1) (e,1) (a,3) (c,1) (c,1) (a,1) (b,1) (a,2) (f,1) (e,1) (a,2) (e,1) (c,1) (e,1).
10) Reduce task 1 (on machine N) reads the region-1 data from each map task.
11) Reduce task 1 sorts the data by key: (a,2) (a,3) (a,1) (a,2) (a,2) (b,1) (b,1) (c,1) (c,1) (c,1) (c,1) (d,1) (e,1) (e,1) (e,1) (e,1) (e,1) (e,1) (f,1) (g,1).
12) It then passes each key and the corresponding set of intermediate values to the user's reduce function: (a, {2,3,1,2,2}), (b, {1,1}), (c, {1,1,1,1}), (d, {1}), (e, {1,1,1,1,1,1}), (f, {1}), (g, {1}).
13) Finally, after executing the user's reduce function, it generates output file 1 (of R output files): (a,10), (b,2), (c,4), (d,1), (e,6), (f,1), (g,1).
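The final reduce step can be checked in Python using the grouped values from step 12 (the data comes from the slides; the function name `reduce_fn` is illustrative):

```python
# Grouped intermediate data delivered to reduce task 1 (from step 12):
grouped = {
    "a": [2, 3, 1, 2, 2],
    "b": [1, 1],
    "c": [1, 1, 1, 1],
    "d": [1],
    "e": [1, 1, 1, 1, 1, 1],
    "f": [1],
    "g": [1],
}

def reduce_fn(key, values):
    # User's reduce: sum the per-mapper partial counts for one letter
    return key, sum(values)

output = dict(reduce_fn(k, v) for k, v in grouped.items())
print(output)
# {'a': 10, 'b': 2, 'c': 4, 'd': 1, 'e': 6, 'f': 1, 'g': 1}
```

This reproduces output file 1 exactly: (a,10), (b,2), (c,4), (d,1), (e,6), (f,1), (g,1).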
MapReduce Characteristics
- Very large scale data: petabytes, exabytes
- Write-once, read-many data: allows for parallelism without mutexes
- Map and Reduce are the main operations: simple code
- There are other supporting operations, such as combine
- All map tasks must complete before the reduce phase starts
- The numbers of map tasks and reduce tasks are configurable
- Operations are provisioned near the data
- Commodity hardware and storage
- The runtime takes care of splitting and moving data for operations
Hadoop Architecture

HDFS Architecture
Namenode
The "master" node:
- Maintains the HDFS namespace, the filesystem tree, and metadata
- Maintains the mapping from each file to its list of blocks
- Maintains in memory the locations of each block (block-to-datanode mapping)
- Issues instructions to datanodes to create/replicate/delete blocks
- Is a single point of failure
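The two mappings the namenode maintains can be pictured with a toy Python sketch (the paths, block IDs, and node names are made up; this is not the real NameNode data structure or API):

```python
# File -> ordered list of block IDs (part of the persistent namespace)
file_to_blocks = {
    "/logs/day1.log": ["blk_1", "blk_2"],
}

# Block -> datanodes holding a replica. Kept in memory and rebuilt from
# DataNode block reports rather than persisted (3 replicas per block).
block_to_datanodes = {
    "blk_1": ["datanode-1", "datanode-3", "datanode-7"],
    "blk_2": ["datanode-2", "datanode-3", "datanode-5"],
}

def locate(path):
    """Answer a client's open() request: for each block of the file,
    return the datanodes the client should read it from."""
    return [(blk, block_to_datanodes[blk]) for blk in file_to_blocks[path]]

print(locate("/logs/day1.log"))
```

After this lookup the client talks to the datanodes directly, which is why user data never flows through the NameNode.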
Datanodes
The "slaves":
- Serve as storage for data blocks; hold no metadata
- Report all their blocks to the namenode (BlockReport)
- Send periodic "heartbeats" to the namenode
- Serve read and write requests, and perform block creation, deletion, and replication upon instruction from the namenode
User data never flows through the NameNode.
Block Reports and Heartbeats
- A block report contains the block ID and the length of each block replica. The first is sent immediately after DataNode registration; subsequent block reports are sent every hour.
- During normal operation, DataNodes send heartbeats to the NameNode to confirm that the DataNode is operating and the block replicas it hosts are available. Heartbeats from a DataNode also carry its total storage capacity and the fraction of storage in use. The default heartbeat interval is three seconds.
MapReduce Architecture
MapReduce: JobTracker and TaskTracker
Master-slave architecture.
JobTracker:
- Accepts jobs submitted by users
- Assigns Map and Reduce tasks to TaskTrackers
- Makes all scheduling decisions; schedules tasks on nodes close to the data
- Monitors task and TaskTracker status; re-executes tasks upon failure
TaskTracker:
- Asks for new tasks, executes them, monitors them, and reports status
- Runs Map and Reduce tasks upon instruction from the JobTracker
- Manages storage and transmission of intermediate output
Other Features: Failures
Re-execution is the main mechanism for fault tolerance.
Worker failures:
- The master detects worker failures via periodic heartbeats
- The master drives the re-execution of tasks: completed and in-progress map tasks are re-executed, and in-progress reduce tasks are re-executed
Master failure:
- The initial implementation did not support failure of the master
The approach is robust: Google once lost 1,600 of 1,800 machines during a job, but it still finished fine.
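Re-execution as a fault-tolerance mechanism can be illustrated with a toy retry loop (purely illustrative; the real master reschedules the failed task on another worker rather than re-calling a function in-process, and the attempt limit here is an assumption):

```python
def run_with_reexecution(task, max_attempts=4):
    """Toy sketch of the master's policy: a failed task is simply
    scheduled again from scratch, since tasks are deterministic
    functions of their input split."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            continue  # re-execute the task
    raise RuntimeError("task failed on every attempt")

# A flaky task that fails twice (e.g. its worker is lost), then succeeds:
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise IOError("worker lost")
    return "done"

print(run_with_reexecution(flaky))  # 'done', on the third attempt
```

The key property making blind re-execution safe is that map and reduce tasks have no side effects outside their output files.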
Hadoop Characteristics
- Commodity hardware + horizontal scaling: add inexpensive servers; storage servers and their disks are not assumed to be highly reliable and available; servers can be added or upgraded over time
- Replication across servers deals with unreliable storage/servers
- Support for moving computation close to the data, i.e., servers have two purposes: data storage and computation
- Metadata/data separation keeps the design simple: storage scales horizontally, metadata scales vertically
- Automatic re-execution and redistribution on failure
Real-world Hadoop
Facebook:
- In 2010, Facebook claimed to have the largest Hadoop cluster in the world, with 21 PB of storage
- On July 27, 2011, they announced the data had grown to 30 PB
- On June 13, 2012, they announced the data had grown to 100 PB
- On November 8, 2012, they announced the warehouse grows by roughly half a PB per day; the world's largest Hadoop cluster spans more than 100 petabytes of data and analyzes about 105 terabytes every 30 minutes
Yahoo:
- As of June 15, 2012, it stores 140 petabytes in Hadoop. Since Hadoop keeps all data sets in triplicate, over 400 petabytes of storage are needed to sustain its systems
Thank you