CS Project Report
Kshitij Sudan
kshitij@cs.utah.edu

1 Introduction

With the growth in services provided over the Internet, the amount of data processing required has grown tremendously. To satisfy the computing requirements of large web applications, an underlying distributed platform is typically used. These platforms are usually clusters of commodity computers and can consist of thousands of machines. The Map-Reduce software framework and its open-source implementation, Hadoop, are typically used to program these large distributed systems. Since these clusters operate at a large scale, the cost of operating a cluster over its lifetime becomes a dominant cost. Thus the energy cost of operating a cluster should be included when designing such systems.

Several approaches are taken to lower the energy consumption of commodity clusters. For example, Google and Facebook chose to re-design the underlying commodity hardware to make it more power efficient. A more recent approach is to virtualize the I/O sub-system and simplify each node to just compute and memory. The advantage of virtualizing I/O is that most components of a traditional motherboard can be removed or reduced, saving energy and physical space. The downside, especially for disk virtualization, is that the traditional 1-to-1 mapping between disk and compute is altered, because one can now pack more compute than disk into a given power and space envelope.

However, the traditional assumption for scale-out web applications like Map-Reduce is that there is a 1-to-1 disk-to-compute mapping. This notion was formed using traditional servers, and there is no clear evidence that it is still useful. Many Map-Reduce computations are not as disk I/O intensive as previously assumed, and if there is sufficient network bandwidth to shuttle data around, then it is better to virtualize disk I/O for energy efficiency reasons.
As an example, Figure 1 shows how 64 disks are shared among 512 nodes of a Hadoop cluster. With such a configuration, 8 CPU cores share one physical disk, i.e. a ratio of 1 disk per 8 cores. For such a system, we analyzed the performance of a collection of Hadoop benchmarks and concluded that a 1-to-1 mapping of compute and disk bandwidth is not necessarily beneficial when optimizing for energy efficiency.

Figure 1. Virtualized Disk I/O for Hadoop Framework.

The disk virtualization layer shown in Figure 1 is implemented using a combination of hardware and software techniques. Disk virtualization is transparent to the operating system, and virtualized disks appear as standard devices within the OS. This obviates any need for system software modification. With disk virtualization, the OS running on each node is presented with an independent disk, which under the hood is an offset range on a shared physical disk. As an example, if 16 CPU cores are configured to share a 1 TB physical disk, then each CPU is presented a disk of 64 GB. CPU-0 accesses the disk from offset 0 up to 64 GB, CPU-1 accesses the same disk from offset 64 GB to 128 GB, and so on. With such an implementation of disk virtualization, the OS behaves as if it has an independent local disk attached to the system.

2 Motivation

[Figure 2 plots: Disk BW (MB/sec) over execution time, showing read and write bandwidth with their 60-period moving averages. (a) Core i7 - TeraSort - Aggregate Disk BW, executing on a Core i7 based cluster. (b) Atom - TeraSort - Aggregate Disk BW, executing on an Atom based cluster.]

Figure 2. Disk bandwidth utilization by Core i7 and Atom based clusters while executing TeraSort benchmark. The overlaid lines are 1 minute averages of individual datapoints.

A typical recommendation for a Hadoop cluster configuration is to use one physical disk per CPU core. This ensures that maximum disk bandwidth is available to the core without interference from requests originating at other cores. Framework overheads, however, constrain the maximum disk bandwidth usable by the application, and Hadoop jobs typically use far less than the full disk bandwidth. Consider that each disk access request made by the application has to propagate through multiple layers of abstraction (Hadoop, JVM, TCP/IP stack, and the OS), with each layer having an associated overhead. Due to these overheads, even machines with heavyweight CPUs are unable to utilize the full disk bandwidth.

Figure 2 shows the disk bandwidth usage for Core i7 and Atom CPU based clusters while executing TeraSort. Bandwidth usage is plotted on the Y-axis and execution time on the X-axis. Bandwidth was measured using the dstat utility, which reported bandwidth usage every 1 sec. The 60 sec moving average of these 1 sec measurements is overlaid in the figure. It can be seen that the sustained average bandwidth usage for both Core i7 and Atom clusters varies between MB/sec.
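The 60 sec moving average overlaid in Figure 2 can be computed from the 1 sec samples along the following lines. This is a minimal sketch, assuming the per-second dstat readings have already been parsed into a list of MB/sec values. The function name `moving_average`, the handling of the first partial window, and the sample readings are illustrative assumptions, not taken from the measurements in this report.

```python
from collections import deque


def moving_average(samples, window=60):
    """Return the trailing moving average of 1-second bandwidth samples.

    samples: iterable of MB/sec readings, one per second (assumed input).
    window:  number of samples to average over (60 gives a 1-minute average).
    Until a full window is available, the average covers the samples seen so far.
    """
    recent = deque()
    running_sum = 0.0
    averages = []
    for value in samples:
        recent.append(value)
        running_sum += value
        if len(recent) > window:
            # Drop the oldest sample so the window never exceeds its size.
            running_sum -= recent.popleft()
        averages.append(running_sum / len(recent))
    return averages


# Hypothetical readings (MB/sec), smoothed with a 3-sample window for brevity.
readings = [10.0, 20.0, 30.0, 40.0]
smoothed = moving_average(readings, window=3)
```

With `window=60` and one sample per second, each output point summarizes the preceding minute, which is what makes the sustained bandwidth visible beneath the per-second spikes.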
The sustained bandwidth values observed for TeraSort are considerably lower than the raw disk bandwidth each CPU can drive from the disk. Core i7 based systems can drive a sustained average raw disk read bandwidth of 112 MB/sec, while the Atom CPUs in our proposed system can drive up to 80 MB/sec of read bandwidth and 40 MB/sec of write bandwidth. Note that these bandwidth limits are also a function of the disk controller and the disk internals, and not just the CPU. (These results were collected using the dd utility on Unix systems.)

When using low-power CPUs like the Intel Atom, not only is the disk bandwidth greatly over-provisioned, but power consumption is also disproportionately distributed across the cluster. Typical disks consume between 7 and 25 Watts [1], which is 0.8x-3x the CPU power (8.5 W). Thus, 1 disk per Atom CPU leads to the disk consuming a large fraction of total node power and energy, while the resource itself (disk bandwidth) is under-utilized.

3 Problem Statement

For a system architecture that uses disk virtualization for improved energy efficiency, one of the major configuration parameters is the appropriate number of cores-per-disk (CPD). This project develops two analytical models that take the MapReduce application characteristics into account and suggest the appropriate CPD configuration value. Note that these configuration suggestions are intended to be best-effort suggestions, and further tuning of the system might be required.

4 Hadoop Analytical Models for Systems with Virtualized I/O

In this section, we discuss two models that describe the relation between cores-per-disk and the execution time of the application. The first model assumes a fixed number of CPU cores in the system and a fixed input dataset size. This model aims to determine the CPD value that yields the least execution time. The second model assumes that the power budget for the system is fixed and, assuming a fixed dataset size, attempts to determine the number of CPU cores and disks that should be used for the least execution time.

4.1 Constant CPU Cores

For this model, we assume the number of CPU cores and the dataset size ($n$) are fixed. Let $t_C$ be the time the application spends executing on the CPU, $t_D$ the time spent accessing disks, and $t_{misc}$ the time spent in miscellaneous operations like network communication latency, job setup and tear-down time, etc. To keep the model simple, we assume that the miscellaneous costs are constant. Assuming the application completes in only one MapReduce round, the lower bound on the execution time $t_E$ of the MapReduce application can then be represented, similar to the model proposed by Goodrich et al. [2], as:

$$t_E = \Omega(t_C + t_D + t_{misc})$$

Since the number of cores is fixed at $N$, the time taken to perform the necessary compute operations for the application is also fixed. Thus $t_C$ is fixed and can be represented as $t_C = f(cores, n)$. However, the time it takes the application to access the disk is variable and depends on the CPD value and the dataset size, i.e. $t_D = f(cpd, n)$. Since the equation above gives only a lower bound on $t_E$, we now wish to tighten this bound a little.
To tighten the bound, note that if the application is CPU bound, then assuming $t_{misc}$ to be negligible, the upper bound on execution time can be represented as $t_E = O(t_C)$. Similarly, if the application is purely I/O bound, then the upper bound is $t_E = O(t_D)$. If the application is CPU bound, then the system can be trivially configured to the maximum CPD value so as to use as few disks as possible. If the application is I/O bound, then the system should be configured to the least CPD value possible, since fewer cores sharing a disk means fewer disk seeks and hence higher available disk bandwidth. Apart from these two corner cases, the CPD ratio has to be chosen such that the compute and disk I/O times are comparable. This leads to the least possible application execution time with a given number of CPU cores. Note that since the CPU core count is fixed, arbitrarily reducing the disk I/O time will not improve the overall application execution time. This occurs because Hadoop MapReduce applications cannot exploit asynchrony in resource utilization at the system level as much as other applications, as explained next.

It can be argued that at the system level, many operations are asynchronous; for example, the CPU might issue a disk access request and then context switch to do some other useful work. However, for Hadoop MapReduce applications, much of this asynchrony cannot be exploited due to the way the systems are configured. Hadoop clusters are typically configured with a single Map and a single Reduce task per core. This is usually done so as not to over-subscribe the compute resources. Since there is an implicit global barrier between the Map and Reduce phases, these two phases cannot overlap. As a result, the computation for any phase can start only when all the data for the phase has been fetched from the disk into main memory. Due to this limitation, the model aims to achieve comparable compute and disk I/O times.
Since the compute time is fixed for an application with a fixed dataset size and number of cores, we attempt to define the disk I/O time as a function of the CPD value. Disk accesses are very sensitive to seek latency, as a disk head seek takes significantly longer than the actual data read. When multiple cores share a single disk, the disk head activity increases significantly due to the mixing of access streams from different CPUs. Thus, to first order, disk access time can be represented in terms of CPD as follows:

$$t_D = \text{disk seek latency} \times CPD + const$$

Here the constant denotes the cost of actual data transfer. Typically this value is much smaller than the seek latency and is usually dropped. Thus $t_D$ can be approximated in terms of the CPD value alone. To achieve the least execution time with a fixed number of CPU cores and a fixed dataset size, the values of $t_D$ and $t_C$ have to be nearly equal. This leads to:

$$t_C \approx t_D$$

This can be re-written as:

$$CPD \approx \frac{t_C}{\text{disk seek latency}} \qquad (1)$$

This result can be intuitively understood as computing for nearly as long as it takes to read data from the disk.

4.2 Iso-Power System

To improve energy efficiency, not only does the execution time have to be minimized, but power consumption also has to be taken into account. In Section 4.1 we assumed that the dataset size and the number of CPU cores are fixed. We relax these constraints by now assuming that the number of CPU cores can also be varied. With this relaxation, the compute time $t_C$ becomes a variable. As noted above, $t_C = f(cores, n)$ and $CPD = cores/disks$; thus, using Equation 1 above, we get:

$$\frac{cores}{disks} \approx \frac{f(cores, n)}{\text{disk seek latency}} \qquad (2)$$

A simple relation between the number of disks, CPU cores, and power budget can be written as:

$$\text{power budget} = a \times disks + b \times cores + const \qquad (3)$$

where $a$ and $b$ are constants that characterize the power consumed by disks and cores, respectively. The $const$ term accounts for the fixed overheads that cannot be amortized over disks and cores, like cooling fans and power supplies. Since MapReduce workloads are scale-out applications, i.e. with a larger number of compute cores more parallelization can be achieved to lower the execution time, as a first order approximation $t_C = const \times N / cores$.
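One way to combine the relations above is sketched below in Python. Substituting the first-order approximation for $t_C$ into Equation 2 gives $disks \approx cores^2 \times \text{disk seek latency} / (const \times N)$, and substituting that into Equation 3 gives a quadratic in the number of cores. This is a sketch under stated assumptions, not the procedure used in this project: the function names and all numeric values are hypothetical, `k` stands for the constant in the $t_C$ approximation, `n` for its $N$ term, and `fixed_power` for the const term in Equation 3.

```python
import math


def cpd_for_balance(t_c, seek_latency):
    """Equation 1: CPD that makes disk time roughly equal to compute time."""
    return t_c / seek_latency


def iso_power_config(power_budget, a, b, fixed_power, k, n, seek_latency):
    """Solve Equations 2 and 3 for (cores, disks), assuming t_C = k * n / cores.

    Equation 2 with this t_C gives disks = q * cores**2, q = seek_latency / (k * n).
    Equation 3 then becomes a*q*cores**2 + b*cores - (power_budget - fixed_power) = 0.
    """
    usable = power_budget - fixed_power
    if usable <= 0:
        raise ValueError("power budget does not cover the fixed overheads")
    q = seek_latency / (k * n)
    # Positive root of the quadratic in cores.
    cores = (-b + math.sqrt(b * b + 4.0 * a * q * usable)) / (2.0 * a * q)
    disks = q * cores ** 2
    return cores, disks


# Hypothetical inputs: 200 W budget, 10 W per disk, 8.5 W per core,
# 44 W of fixed overhead, and illustrative values for k, n and seek latency.
cores, disks = iso_power_config(
    power_budget=200.0, a=10.0, b=8.5, fixed_power=44.0,
    k=0.01, n=128.0, seek_latency=0.01,
)
cpd = cores / disks
```

The solver returns real numbers, so in practice the result would be rounded to whole cores and disks and then re-checked against the power budget. The constants `a`, `b`, `k` and the seek latency would have to be measured on the target system.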
Thus, using Equations 2 and 3 above, one can derive an appropriate number of cores and disks to be used for an application, for a given power budget.

5 Conclusions and Future Work

This project explored two basic models to understand the impact of various system configuration parameters on MapReduce applications. The presented models show that if the power consumed is not a constraint, then the CPD value is simply related to the CPU time the application consumes. If, however, the power budget is fixed, then using the iso-power model one can derive the appropriate number of CPU cores and disks.

The current model contains many system-specific parameters that need to be empirically measured for a given system. The most critical measurement is the time the application spends on the CPU. My current attempts to characterize this parameter were not very successful, since it is hard to break up the execution time of an application into time for compute and time for disk accesses. In the future, I plan to develop a methodology to accurately account for the distribution of application time between compute and disk accesses. I also plan to leverage these models to present guidelines so that algorithms can be developed for MapReduce applications that are aware of the underlying effects of disk virtualization. This can be leveraged to develop energy-efficient algorithms for the system architecture described here.
References

[1] Internet-Scale Datacenter Economics: Costs and Opportunities. In High Performance Transaction Systems,
[2] M. T. Goodrich, N. Sitchinava, and Q. Zhang. Sorting, Searching, and Simulation in the MapReduce Framework. CoRR, abs/ ,
Was ist dran an einer spezialisierten Data Warehousing platform? Hermann Bär Oracle USA Redwood Shores, CA Schlüsselworte Data warehousing, Exadata, specialized hardware proprietary hardware Introduction
More informationAccelerate Big Data Insights
Accelerate Big Data Insights Executive Summary An abundance of information isn t always helpful when time is of the essence. In the world of big data, the ability to accelerate time-to-insight can not
More informationTopics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples
Hadoop Introduction 1 Topics Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples 2 Big Data Analytics What is Big Data?
More informationIBM V7000 Unified R1.4.2 Asynchronous Replication Performance Reference Guide
V7 Unified Asynchronous Replication Performance Reference Guide IBM V7 Unified R1.4.2 Asynchronous Replication Performance Reference Guide Document Version 1. SONAS / V7 Unified Asynchronous Replication
More informationCESM (Community Earth System Model) Performance Benchmark and Profiling. August 2011
CESM (Community Earth System Model) Performance Benchmark and Profiling August 2011 Note The following research was performed under the HPC Advisory Council activities Participating vendors: Intel, Dell,
More informationAn Empirical Model for Predicting Cross-Core Performance Interference on Multicore Processors
An Empirical Model for Predicting Cross-Core Performance Interference on Multicore Processors Jiacheng Zhao Institute of Computing Technology, CAS In Conjunction with Prof. Jingling Xue, UNSW, Australia
More informationScalable Shared Databases for SQL Server 2005
White Paper Scalable Shared Databases for SQL Server 2005 Achieving Linear Scalability for Scale-out Reporting using SQL Server 2005 Enterprise Edition Abstract: Microsoft SQL Server 2005 Enterprise Edition
More informationPerformance & Scalability Testing in Virtual Environment Hemant Gaidhani, Senior Technical Marketing Manager, VMware
Performance & Scalability Testing in Virtual Environment Hemant Gaidhani, Senior Technical Marketing Manager, VMware 2010 VMware Inc. All rights reserved About the Speaker Hemant Gaidhani Senior Technical
More informationMySQL Performance Optimization and Troubleshooting with PMM. Peter Zaitsev, CEO, Percona
MySQL Performance Optimization and Troubleshooting with PMM Peter Zaitsev, CEO, Percona In the Presentation Practical approach to deal with some of the common MySQL Issues 2 Assumptions You re looking
More informationSpark Over RDMA: Accelerate Big Data SC Asia 2018 Ido Shamay Mellanox Technologies
Spark Over RDMA: Accelerate Big Data SC Asia 2018 Ido Shamay 1 Apache Spark - Intro Spark within the Big Data ecosystem Data Sources Data Acquisition / ETL Data Storage Data Analysis / ML Serving 3 Apache
More informationHadoop Virtualization Extensions on VMware vsphere 5 T E C H N I C A L W H I T E P A P E R
Hadoop Virtualization Extensions on VMware vsphere 5 T E C H N I C A L W H I T E P A P E R Table of Contents Introduction... 3 Topology Awareness in Hadoop... 3 Virtual Hadoop... 4 HVE Solution... 5 Architecture...
More informationAnalytical Modeling of Parallel Systems. To accompany the text ``Introduction to Parallel Computing'', Addison Wesley, 2003.
Analytical Modeling of Parallel Systems To accompany the text ``Introduction to Parallel Computing'', Addison Wesley, 2003. Topic Overview Sources of Overhead in Parallel Programs Performance Metrics for
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY Database Systems: Fall 2008 Quiz II
Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.830 Database Systems: Fall 2008 Quiz II There are 14 questions and 11 pages in this quiz booklet. To receive
More informationCA485 Ray Walshe Google File System
Google File System Overview Google File System is scalable, distributed file system on inexpensive commodity hardware that provides: Fault Tolerance File system runs on hundreds or thousands of storage
More informationLecture 12 DATA ANALYTICS ON WEB SCALE
Lecture 12 DATA ANALYTICS ON WEB SCALE Source: The Economist, February 25, 2010 The Data Deluge EIGHTEEN months ago, Li & Fung, a firm that manages supply chains for retailers, saw 100 gigabytes of information
More informationThe amount of data increases every day Some numbers ( 2012):
1 The amount of data increases every day Some numbers ( 2012): Data processed by Google every day: 100+ PB Data processed by Facebook every day: 10+ PB To analyze them, systems that scale with respect
More informationAddressing the Stranded Power Problem in Datacenters using Storage Workload Characterization. January 30 th, 2010 Sriram Sankar and Kushagra Vaid
Addressing the Stranded Power Problem in Datacenters using Storage Workload Characterization January 30 th, 2010 Sriram Sankar and Kushagra Vaid 1 Microsoft Online Services Across the company, all over
More information2/26/2017. The amount of data increases every day Some numbers ( 2012):
The amount of data increases every day Some numbers ( 2012): Data processed by Google every day: 100+ PB Data processed by Facebook every day: 10+ PB To analyze them, systems that scale with respect to
More informationCIS 601 Graduate Seminar. Dr. Sunnie S. Chung Dhruv Patel ( ) Kalpesh Sharma ( )
Guide: CIS 601 Graduate Seminar Presented By: Dr. Sunnie S. Chung Dhruv Patel (2652790) Kalpesh Sharma (2660576) Introduction Background Parallel Data Warehouse (PDW) Hive MongoDB Client-side Shared SQL
More informationVoldemort. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation
Voldemort Smruti R. Sarangi Department of Computer Science Indian Institute of Technology New Delhi, India Smruti R. Sarangi Leader Election 1/29 Outline 1 2 3 Smruti R. Sarangi Leader Election 2/29 Data
More informationDistributed File Systems II
Distributed File Systems II To do q Very-large scale: Google FS, Hadoop FS, BigTable q Next time: Naming things GFS A radically new environment NFS, etc. Independence Small Scale Variety of workloads Cooperation
More informationDynamic Load balancing for I/O- and Memory- Intensive workload in Clusters using a Feedback Control Mechanism
Dynamic Load balancing for I/O- and Memory- Intensive workload in Clusters using a Feedback Control Mechanism Xiao Qin, Hong Jiang, Yifeng Zhu, David R. Swanson Department of Computer Science and Engineering
More informationAchieving Horizontal Scalability. Alain Houf Sales Engineer
Achieving Horizontal Scalability Alain Houf Sales Engineer Scale Matters InterSystems IRIS Database Platform lets you: Scale up and scale out Scale users and scale data Mix and match a variety of approaches
More informationSmartSaver: Turning Flash Drive into a Disk Energy Saver for Mobile Computers
SmartSaver: Turning Flash Drive into a Disk Energy Saver for Mobile Computers Feng Chen 1 Song Jiang 2 Xiaodong Zhang 1 The Ohio State University, USA Wayne State University, USA Disks Cost High Energy
More informationMulti-tenancy version of BigDataBench
Multi-tenancy version of BigDataBench Gang Lu Institute of Computing Technology, Chinese Academy of Sciences BigDataBench Tutorial MICRO 2014 Cambridge, UK INSTITUTE OF COMPUTING TECHNOLOGY 1 Multi-tenancy
More informationIBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads
89 Fifth Avenue, 7th Floor New York, NY 10003 www.theedison.com @EdisonGroupInc 212.367.7400 IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads A Competitive Test and Evaluation Report
More informationA Cool Scheduler for Multi-Core Systems Exploiting Program Phases
IEEE TRANSACTIONS ON COMPUTERS, VOL. 63, NO. 5, MAY 2014 1061 A Cool Scheduler for Multi-Core Systems Exploiting Program Phases Zhiming Zhang and J. Morris Chang, Senior Member, IEEE Abstract Rapid growth
More informationIME (Infinite Memory Engine) Extreme Application Acceleration & Highly Efficient I/O Provisioning
IME (Infinite Memory Engine) Extreme Application Acceleration & Highly Efficient I/O Provisioning September 22 nd 2015 Tommaso Cecchi 2 What is IME? This breakthrough, software defined storage application
More informationTechnical Paper. Performance and Tuning Considerations for SAS on the Hitachi Virtual Storage Platform G1500 All-Flash Array
Technical Paper Performance and Tuning Considerations for SAS on the Hitachi Virtual Storage Platform G1500 All-Flash Array Release Information Content Version: 1.0 April 2018. Trademarks and Patents SAS
More informationTECHNICAL OVERVIEW ACCELERATED COMPUTING AND THE DEMOCRATIZATION OF SUPERCOMPUTING
TECHNICAL OVERVIEW ACCELERATED COMPUTING AND THE DEMOCRATIZATION OF SUPERCOMPUTING Table of Contents: The Accelerated Data Center Optimizing Data Center Productivity Same Throughput with Fewer Server Nodes
More informationUsing Containers to Deliver an Efficient Private Cloud
Using Containers to Deliver an Efficient Private Cloud Software-Defined Servers Using Containers to Deliver an Efficient Private Cloud iv Contents 1 Solving the 3 Challenges of Containers 1 2 The Fit with
More informationFuxiSort. Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc
Fuxi Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc {jiamang.wang, yongjun.wyj, hua.caihua, zhipeng.tzp, zhiqiang.lv,
More informationModels and Metrics for Energy-Efficient Computer Systems. Suzanne Rivoire May 22, 2007 Ph.D. Defense EE Department, Stanford University
Models and Metrics for Energy-Efficient Computer Systems Suzanne Rivoire May 22, 2007 Ph.D. Defense EE Department, Stanford University Power and Energy Concerns Processors: power density [Borkar, Intel]
More information