High Volume Throughput Computers (HVC): An ICT View of Datacenter Computers

1 High Volume Throughput Computers (HVC): An ICT View of Datacenter Computers. Jianfeng Zhan (詹剑锋)

2 Outline: Motivation; Related work; Challenges and opportunities; Work in progress in my group

3 Datacenter hosting computing

4 Three trends in datacenter hosting computing
More and more services that involve large amounts of data are deployed in datacenters to serve the masses.
More and more data are produced, stored, analyzed, and served.
Many users rely on streaming media or VoIP for entertainment or communication.

5 Trend of Big Data. Data warehouse: 20 PB. Daily increase: 60 TB. Number of users: 750 million. Number of images: 260 billion. Weekly increase: 1 billion.

6 Scales of deployed data centers

7 We coin a new term, High Volume Throughput Computing (HVC for short), to describe those workloads.

8 What is HVC?
A datacenter-based computing paradigm focusing on throughput-oriented workloads.
Workloads: a large number of loosely coupled jobs instead of one big job.
Nature: throughput computing.
Target: increase the volume of throughput in terms of requests (services), processed data (data processing applications), or the maximum number of simultaneous subscribers (interactive real-time applications).
Jianfeng Zhan, Lixin Zhang, Lei Wang, Ninghui Sun, et al. High Volume Throughput Computing: Identifying and Characterizing Throughput Oriented Workloads in Data Centers. LSPP 2012.
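The three throughput targets on this slide can be made concrete with a small sketch. The `Window` record and the function names below are hypothetical and are not taken from the talk or the LSPP paper; they only illustrate how each workload class might map to its own volume metric.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Window:
    """One measurement window of a workload (hypothetical record layout)."""
    seconds: float             # length of the measurement window
    requests_done: int = 0     # services: requests completed in the window
    bytes_processed: int = 0   # data processing: input bytes consumed
    peak_sessions: int = 0     # interactive real-time: concurrent sessions served


def requests_per_second(w: Window) -> float:
    """Throughput metric for services."""
    return w.requests_done / w.seconds


def bytes_per_second(w: Window) -> float:
    """Throughput metric for data processing applications."""
    return w.bytes_processed / w.seconds


def max_simultaneous_subscribers(windows: Iterable[Window]) -> int:
    """Throughput metric for interactive real-time applications."""
    return max(w.peak_sessions for w in windows)
```

In this sketch a service is judged by completed requests per second, a data processing job by data volume per second, and an interactive real-time application by the largest number of sessions it sustained at once.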

9 Three categories of workloads in HVC
Services: a service is a group of applications that collaborate to receive user requests and return responses to end users.
Data processing applications: only loosely coupled data-intensive computing is included.
Interactive real-time applications: an interactive real-time application maintains a user session over a long period while guaranteeing real-time quality of service.

10 The relationships between WSC and DISC [Figure: diagram relating WSC and DISC. Labels: Data Intensive Services (Web search); loosely coupled MapReduce jobs.]

11 The relationships among HVC, WSC and DISC [Figure: diagram relating HVC, WSC, and DISC. Labels: traditional server workloads (SQL, Web server); services; Data Intensive Services (Web search); data processing applications; interactive real-time applications (desktop cloud, streaming media).]

12 The relationship between HVC and Cloud [Figure: diagram relating HVC and cloud computing. Labels: HPC in Cloud; HVC; server consolidation systems; cloud computing.]

13 Differences of HVC from other computing

14 Differences of HVC from other computing

15 Outline: Motivation; Related work; Challenges and opportunities; Work in progress in my group

16 Google Data Center Solution: low-end server clusters; x86 CPUs; single-thread performance; commodity off-the-shelf networks; software built in house; software-level fault tolerance.

17 A Large SMP Server vs. a Low-end, PC-Class Server

18 Why Does Google Object to Hardware Specialization?
The diverse requirements of multiple services: the datacenter must be a general-purpose computing system. The breadth of requirements makes it less likely that specialized hardware can make a large overall impact on the operation.
The speed of workload churn: product requirements evolve rapidly, and smart programmers learn from experience and rewrite the baseline algorithms and data structures much more rapidly than hardware itself can evolve. There is a substantial risk that by the time a specialized hardware solution is implemented, it is no longer a good fit even for the problem area for which it was designed.

19 IBM Technologies
Z-series: super I/O capability; high redundancy; suitable for banking systems.
P-series: SMT is an afterthought; achieves high single-thread performance and reliability; suitable for high-end commercial computing.

20 Sun UltraSparc T1/T2
Designed for high throughput.
Poor single-thread performance: the single-thread performance of the T2 is about 1/5 that of a 2.6 GHz Nehalem.
Power consumption is still high: the T2 processor draws about 100 watts.
The Rock project was canceled.

21 Outline: Motivation; Related work; Challenges and opportunities; Work in progress in my group

22 (1) HVC is a Multidisciplinary Field: Computer Science and Engineering; Electrical Engineering; Cooling; Mechanical Engineering

23 Breakdown of a Datacenter

24 Categories of Datacenters
Tier I datacenters: a single path for power and cooling distribution, without redundant components.
Tier II datacenters: add redundant components to this design (N + 1), improving availability.
Tier III datacenters: multiple power and cooling distribution paths but only one active path. They provide redundancy even during maintenance, usually with an N + 2 setup.
Tier IV datacenters: two active power and cooling distribution paths, with redundant components in each path.

25 Google's Power Efficiency Metrics
The first term in the efficiency calculation, (a), is the power usage effectiveness (PUE): the ratio of total building power to IT power. IT power is the power consumed by the actual computing equipment (servers, network equipment, etc.).
The second term, (b), is the server PUE (SPUE), which accounts for the overheads inside the IT equipment itself.
As computer scientists, we can improve only the third term, (c).
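The efficiency calculation with terms (a), (b), and (c) is not reproduced in this transcript. The following is a minimal sketch, assuming the decomposition used in The Datacenter as a Computer: efficiency = (1/PUE) × (1/SPUE) × (computation per unit of energy delivered to the electronic components). The function names and all power figures are illustrative assumptions, not measurements from the talk.

```python
def pue(total_building_power_kw: float, it_power_kw: float) -> float:
    """Term (a): ratio of total building power to IT power."""
    return total_building_power_kw / it_power_kw


def spue(server_input_power_w: float, useful_power_w: float) -> float:
    """Term (b): ratio of server input power to the power that actually
    reaches the electronic components doing the computation."""
    return server_input_power_w / useful_power_w


def overall_efficiency(pue_value: float, spue_value: float,
                       computation_per_joule: float) -> float:
    """Computation per joule drawn by the whole facility.

    computation_per_joule is term (c): work done per joule delivered to
    the electronic components (assumed decomposition, see lead-in).
    """
    return (1.0 / pue_value) * (1.0 / spue_value) * computation_per_joule


# Made-up inputs for illustration only.
a = pue(total_building_power_kw=2000.0, it_power_kw=1000.0)
b = spue(server_input_power_w=300.0, useful_power_w=250.0)
eff = overall_efficiency(a, b, computation_per_joule=12.0)
```

Under this assumed decomposition, terms (a) and (b) are set by the facility and the server's power delivery, which is why software and architecture work mainly moves term (c).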

26 LBNL Survey of the PUE of 24 Datacenters

27 (2) No Benchmark
High-performance computing (HPC): LINPACK, the Green 500, Graph 500.
No initiative for Internet services. We want to build one; however, it is very difficult because no one would like to share data.

28 (3) Cost Challenges: the differences between service systems and enterprise systems; data cost models.

29 The Difference between Amazon Service and Enterprise Systems. Data source: James Hamilton, VP of Amazon, ISCA '09 keynote.

30 Cost Models of Datacenters [Figure: worldwide server market spending in US$B (axis from $0 to $120), comparing power and cooling spending with new server spending.] Source: werinlargescaledatacenter.aspx. Source: IDC, The Impact of Power and Cooling on Data Center Infrastructure, May 2006.

31 Main Concerns
Server costs. Workload spikes. Power-related costs. Power and cooling infrastructure.
Energy proportionality: energy-proportional systems consume almost no power when idle and gradually consume more power as the activity level increases.
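To illustrate energy proportionality, here is a minimal sketch that assumes a server whose power rises linearly from an idle level to a peak level. The linear model, the function names, and the wattages in the usage note are assumptions for illustration and do not come from the talk.

```python
def linear_power(utilization: float, idle_w: float, peak_w: float) -> float:
    """Power draw (watts) of a server whose power rises linearly with load."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    return idle_w + (peak_w - idle_w) * utilization


def efficiency_relative_to_peak(utilization: float, idle_w: float,
                                peak_w: float) -> float:
    """Work per watt at this utilization, normalized to work per watt at
    full load. Work is assumed to be proportional to utilization."""
    if utilization == 0.0:
        return 0.0
    return (utilization * peak_w) / linear_power(utilization, idle_w, peak_w)
```

With an assumed 100 W idle and 200 W peak, a server at 30% utilization draws 130 W and reaches only about 46% of its full-load efficiency, whereas an ideal energy-proportional server (0 W idle) keeps a relative efficiency of 1.0 at every nonzero load.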

32 Human Energy Usage vs. Activity Levels

33 Subsystem Power Usage in an x86 Server

34 Activity Profile of a Sample of 5,000 Google Servers over a Period of 6 Months.

35 Space-constrained Energy Efficiency
Most Chinese service companies depend on hosted data centers, which are space-constrained and have a fixed power supply. How can performance be improved under these constraints?

36 (4) System Architecture Challenges
Limitations: code investments; rapidly changing workloads.
Opportunities: ample parallelism; the power concerns.

37 Rapidly Changing Workloads
Internet services are in their infancy as an application area. New products appear and gain popularity at a very fast pace: the YouTube video-sharing site exploded in popularity within a few months. Some services have different architectural needs.
There is a difficult mismatch between the time scale of radical workload behavior changes and the design and life cycles of data centers.

38 Building Balanced Systems from Imbalanced Components
Processors continue to get faster and more energy efficient. Memory systems and magnetic storage are not evolving at the same pace, whether in performance or energy efficiency. Non-CPU components dominate computer performance and energy usage.
Architects must try to build efficient large-scale systems that strike some balance in performance and cost despite the shortcomings of the available components.

39 (5) Massive Data Challenges How to store data? How to process data? How to present data as insight?

40 Outline: Motivation; Related work; Challenges and opportunities; Work in progress in my group

41 Benchmarks

42 Current Benchmarks: SPEC CPU, SPEC Web, HPCC, PARSEC, TPC-C, Gridmix, YCSB

43 Current Benchmarks: SPEC CPU, SPEC Web, HPCC, TPC-C, Gridmix, YCSB, PARSEC. No benchmark for HVC.

44 Different benchmarks and metrics

45 Benchmark requirements [Table: requirements for the target benchmark. Column headers: Representative; Diverse Programming Models; Distributed; State-of-the-art.]

46 Current benchmark suites
Search
Benchmark category: service
Details: real-workload-based benchmark; Nutch-based benchmark
Open source: available from ks
BigDataBench
Benchmark category: big data processing
Details: 21 representative workloads; 4 programming models: MapReduce, MPI, AllPairs, WorkQueue
References:
Huafeng Xi, Jianfeng Zhan, et al. Characterization of Real Workloads of Web Search Engines. IEEE International Symposium on Workload Characterization (IISWC 2011).
Chunjie Luo, Jianfeng Zhan, et al. CloudRank-D: Benchmarking and Ranking Cloud Computing Systems for Data Processing Applications. To appear in Frontiers of Computer Science.

47 Benchmark: Search [Figure: architecture of the search benchmark. Labels: client; workload generator; search service; web server; search server.]

48 BigDataBench [Figure: BigDataBench workload categories: basic operation; classification; cluster; recommendation; association rule mining; sequence learning; warehouse operation; feature reduction; vector calculate; bioinformatics.]

49 Detailed information of BigDataBench

50 Applications

51 An ideal application: data-intensive (TB or PB). A challenging application: machine reading of the World Wide Web. Valuable services attract more searches. Professional search: information extraction of professionals.

52

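53 Search by research area: use prof.ict.ac.cn to search for "multicore systems", "Turkic literature", or "family planning"

54 Multicore systems

55 Information extraction results

56 Turkic literature

57 Family planning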

58 Data Analysis Processes

59 Summary High-level talk on the ICT view of datacenter computers. Challenges and opportunities. Some relevant work in my group.

60 Reading list
Vaquero, L. M., Rodero-Merino, L., Caceres, J., and Lindner, M. A break in the clouds: towards a cloud definition. SIGCOMM Comput. Commun. Rev. 39, 1 (Dec. 2008).
Jianfeng Zhan, Lei Wang, Xianna Li, Weisong Shi, Chuliang Weng, Wenyao Zhang, and Xiutao Zang. Cost-aware Cooperative Resource Provisioning for Heterogeneous Workloads in Data Centers. Accepted by IEEE Transactions on Computers (TC).
U. Hoelzle et al. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines. 1st ed. Morgan and Claypool Publishers.
Armbrust, M., et al. Above the clouds: A Berkeley view of cloud computing. Tech. Rep. UCB/EECS, EECS Department, UC Berkeley, Feb. 2009.
Chunjie Luo, Jianfeng Zhan, Zhen Jia, et al. CloudRank-D: Benchmarking and Ranking Cloud Computing Systems for Data Processing Applications. To appear in Frontiers of Computer Science.

61 Thank you!


More information

Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism

Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism The datacenter is the computer Luiz Andre Barroso, Google (2007) Outline Introduction to WSCs Programming Models and Workloads

More information

IBM Data Center Networking in Support of Dynamic Infrastructure

IBM Data Center Networking in Support of Dynamic Infrastructure Dynamic Infrastructure : Helping build a Smarter Planet IBM Data Center Networking in Support of Dynamic Infrastructure Pierre-Jean BOCHARD Data Center Networking Platform Leader IBM STG - Central Eastern

More information

High Performance and Cloud Computing (HPCC) for Bioinformatics

High Performance and Cloud Computing (HPCC) for Bioinformatics High Performance and Cloud Computing (HPCC) for Bioinformatics King Jordan Georgia Tech January 13, 2016 Adopted From BIOS-ICGEB HPCC for Bioinformatics 1 Outline High performance computing (HPC) Cloud

More information

Towards Energy-Proportional Datacenter Memory with Mobile DRAM

Towards Energy-Proportional Datacenter Memory with Mobile DRAM Towards Energy-Proportional Datacenter Memory with Mobile DRAM Krishna Malladi 1 Frank Nothaft 1 Karthika Periyathambi Benjamin Lee 2 Christos Kozyrakis 1 Mark Horowitz 1 Stanford University 1 Duke University

More information

ICN for Cloud Networking. Lotfi Benmohamed Advanced Network Technologies Division NIST Information Technology Laboratory

ICN for Cloud Networking. Lotfi Benmohamed Advanced Network Technologies Division NIST Information Technology Laboratory ICN for Cloud Networking Lotfi Benmohamed Advanced Network Technologies Division NIST Information Technology Laboratory Information-Access Dominates Today s Internet is focused on point-to-point communication

More information

Big Data Using Hadoop

Big Data Using Hadoop IEEE 2016-17 PROJECT LIST(JAVA) Big Data Using Hadoop 17ANSP-BD-001 17ANSP-BD-002 Hadoop Performance Modeling for JobEstimation and Resource Provisioning MapReduce has become a major computing model for

More information

Jason Waxman General Manager High Density Compute Division Data Center Group

Jason Waxman General Manager High Density Compute Division Data Center Group Jason Waxman General Manager High Density Compute Division Data Center Group Today 2015 More Users Only 25% of the world is Internet connected today 1 New technologies will connect over 1 billion additional

More information

The Impact of SSD Selection on SQL Server Performance. Solution Brief. Understanding the differences in NVMe and SATA SSD throughput

The Impact of SSD Selection on SQL Server Performance. Solution Brief. Understanding the differences in NVMe and SATA SSD throughput Solution Brief The Impact of SSD Selection on SQL Server Performance Understanding the differences in NVMe and SATA SSD throughput 2018, Cloud Evolutions Data gathered by Cloud Evolutions. All product

More information

Lecture 9: MIMD Architectures

Lecture 9: MIMD Architectures Lecture 9: MIMD Architectures Introduction and classification Symmetric multiprocessors NUMA architecture Clusters Zebo Peng, IDA, LiTH 1 Introduction MIMD: a set of general purpose processors is connected

More information

Managing CAE Simulation Workloads in Cluster Environments

Managing CAE Simulation Workloads in Cluster Environments Managing CAE Simulation Workloads in Cluster Environments Michael Humphrey V.P. Enterprise Computing Altair Engineering humphrey@altair.com June 2003 Copyright 2003 Altair Engineering, Inc. All rights

More information

The Implications from Benchmarking Three Big Data Systems

The Implications from Benchmarking Three Big Data Systems The Implications from Benchmarking Three Big Data Systems Jing Quan 1 Yingjie Shi 2 Ming Zhao 3 Wei Yang 4 1 School of Software Engineering, University of Science and Technology of China, Hefei, China

More information

2/26/2017. Originally developed at the University of California - Berkeley's AMPLab

2/26/2017. Originally developed at the University of California - Berkeley's AMPLab Apache is a fast and general engine for large-scale data processing aims at achieving the following goals in the Big data context Generality: diverse workloads, operators, job sizes Low latency: sub-second

More information

BUYING SERVER HARDWARE FOR A SCALABLE VIRTUAL INFRASTRUCTURE

BUYING SERVER HARDWARE FOR A SCALABLE VIRTUAL INFRASTRUCTURE E-Guide BUYING SERVER HARDWARE FOR A SCALABLE VIRTUAL INFRASTRUCTURE SearchServer Virtualization P art 1 of this series explores how trends in buying server hardware have been influenced by the scale-up

More information

Solace JMS Broker Delivers Highest Throughput for Persistent and Non-Persistent Delivery

Solace JMS Broker Delivers Highest Throughput for Persistent and Non-Persistent Delivery Solace JMS Broker Delivers Highest Throughput for Persistent and Non-Persistent Delivery Java Message Service (JMS) is a standardized messaging interface that has become a pervasive part of the IT landscape

More information

HPE Storage Update The All Flash Datacenter 3PAR

HPE Storage Update The All Flash Datacenter 3PAR Horizont 2016 HPE Storage Update The All Flash Datacenter 3PAR James Hall EMEA Pre-Sales Strategy Copyright 2015 Hewlett Packard Enterprise Development LP October 2016 Agenda 1 2 Business Challanges HPE

More information

CPSC 426/526. Cloud Computing. Ennan Zhai. Computer Science Department Yale University

CPSC 426/526. Cloud Computing. Ennan Zhai. Computer Science Department Yale University CPSC 426/526 Cloud Computing Ennan Zhai Computer Science Department Yale University Recall: Lec-7 In the lec-7, I talked about: - P2P vs Enterprise control - Firewall - NATs - Software defined network

More information

SR-IOV Support for Virtualization on InfiniBand Clusters: Early Experience

SR-IOV Support for Virtualization on InfiniBand Clusters: Early Experience SR-IOV Support for Virtualization on InfiniBand Clusters: Early Experience Jithin Jose, Mingzhe Li, Xiaoyi Lu, Krishna Kandalla, Mark Arnold and Dhabaleswar K. (DK) Panda Network-Based Computing Laboratory

More information

High Performance Computing in Europe and USA: A Comparison

High Performance Computing in Europe and USA: A Comparison High Performance Computing in Europe and USA: A Comparison Erich Strohmaier 1 and Hans W. Meuer 2 1 NERSC, Lawrence Berkeley National Laboratory, USA 2 University of Mannheim, Germany 1 Introduction In

More information

Availability in the Modern Datacenter

Availability in the Modern Datacenter Availability in the Modern Datacenter Adriana Rangel SIS Research Director IDC Middle East, Turkey & Africa IDC Visit us at IDC.com and follow us on Twitter: @IDC 2 US$ Mn Middle East IT Market Spending

More information

FUSION PROCESSORS AND HPC

FUSION PROCESSORS AND HPC FUSION PROCESSORS AND HPC Chuck Moore AMD Corporate Fellow & Technology Group CTO June 14, 2011 Fusion Processors and HPC Today: Multi-socket x86 CMPs + optional dgpu + high BW memory Fusion APUs (SPFP)

More information

Parallels Virtuozzo Containers

Parallels Virtuozzo Containers Parallels Virtuozzo Containers White Paper Deploying Application and OS Virtualization Together: Citrix and Parallels Virtuozzo Containers www.parallels.com Version 1.0 Table of Contents The Virtualization

More information

CS 345A Data Mining Lecture 1. Introduction to Web Mining

CS 345A Data Mining Lecture 1. Introduction to Web Mining CS 345A Data Mining Lecture 1 Introduction to Web Mining What is Web Mining? Discovering useful information from the World-Wide Web and its usage patterns Web Mining v. Data Mining Structure (or lack of

More information

The next step in Software-Defined Storage with Virtual SAN

The next step in Software-Defined Storage with Virtual SAN The next step in Software-Defined Storage with Virtual SAN Osama I. Al-Dosary VMware vforum, 2014 2014 VMware Inc. All rights reserved. Agenda Virtual SAN s Place in the SDDC Overview Features and Benefits

More information

CS 6240: Parallel Data Processing in MapReduce: Module 1. Mirek Riedewald

CS 6240: Parallel Data Processing in MapReduce: Module 1. Mirek Riedewald CS 6240: Parallel Data Processing in MapReduce: Module 1 Mirek Riedewald Why Parallel Processing? Answer 1: Big Data 2 How Much Information? Source: http://www2.sims.berkeley.edu/research/projects/ho w-much-info-2003/execsum.htm

More information

Faculté Polytechnique

Faculté Polytechnique Faculté Polytechnique INFORMATIQUE PARALLÈLE ET DISTRIBUÉE CHAPTER 7 : CLOUD COMPUTING Sidi Ahmed Mahmoudi sidi.mahmoudi@umons.ac.be 13 December 2017 PLAN Introduction I. History of Cloud Computing and

More information

Cisco Tetration Analytics

Cisco Tetration Analytics Cisco Tetration Analytics Enhanced security and operations with real time analytics John Joo Tetration Business Unit Cisco Systems Security Challenges in Modern Data Centers Securing applications has become

More information

Infrastructure Innovation Opportunities Y Combinator 2013

Infrastructure Innovation Opportunities Y Combinator 2013 Infrastructure Innovation Opportunities Y Combinator 2013 James Hamilton, 2013/1/22 VP & Distinguished Engineer, Amazon Web Services email: James@amazon.com web: mvdirona.com/jrh/work blog: perspectives.mvdirona.com

More information

A Study on Load Balancing in Cloud Computing * Parveen Kumar,* Er.Mandeep Kaur Guru kashi University, Talwandi Sabo

A Study on Load Balancing in Cloud Computing * Parveen Kumar,* Er.Mandeep Kaur Guru kashi University, Talwandi Sabo A Study on Load Balancing in Cloud Computing * Parveen Kumar,* Er.Mandeep Kaur Guru kashi University, Talwandi Sabo Abstract: Load Balancing is a computer networking method to distribute workload across

More information

TPC-E testing of Microsoft SQL Server 2016 on Dell EMC PowerEdge R830 Server and Dell EMC SC9000 Storage

TPC-E testing of Microsoft SQL Server 2016 on Dell EMC PowerEdge R830 Server and Dell EMC SC9000 Storage TPC-E testing of Microsoft SQL Server 2016 on Dell EMC PowerEdge R830 Server and Dell EMC SC9000 Storage Performance Study of Microsoft SQL Server 2016 Dell Engineering February 2017 Table of contents

More information

Integrated hardware-software solution developed on ARM architecture. CS3 Conference Krakow, January 30th 2018

Integrated hardware-software solution developed on ARM architecture. CS3 Conference Krakow, January 30th 2018 Integrated hardware-software solution developed on ARM architecture CS3 Conference Krakow, January 30th 2018 Why Object Storage Data doubles every 2 year...growing at a faster pace and is mainly unstructured

More information

A Mathematical Computational Design of Resource-Saving File Management Scheme for Online Video Provisioning on Content Delivery Networks

A Mathematical Computational Design of Resource-Saving File Management Scheme for Online Video Provisioning on Content Delivery Networks A Mathematical Computational Design of Resource-Saving File Management Scheme for Online Video Provisioning on Content Delivery Networks Dr.M.Upendra Kumar #1, Dr.A.V.Krishna Prasad *2, Dr.D.Shravani #3

More information

IT Level Power Provisioning Business Continuity and Efficiency at NTT

IT Level Power Provisioning Business Continuity and Efficiency at NTT IT Level Power Provisioning Business Continuity and Efficiency at NTT Henry M.L. Wong Intel Eco-Technology Program Office Environment Global CO 2 Emissions ICT 2% 98% Source: The Climate Group Economic

More information

TCO REPORT. NAS File Tiering. Economic advantages of enterprise file management

TCO REPORT. NAS File Tiering. Economic advantages of enterprise file management TCO REPORT NAS File Tiering Economic advantages of enterprise file management Executive Summary Every organization is under pressure to meet the exponential growth in demand for file storage capacity.

More information