: A new version of Supercomputing or life after the end of the Moore s Law

Size: px
Start display at page:

Download ": A new version of Supercomputing or life after the end of the Moore s Law"

Transcription

1 : A new version of Supercomputing or life after the end of the Moore s Law Dr.-Ing. Alexey Cheptsov SEMAPRO 2015 :: :: Dr. Alexey Cheptsov

2 OUTLINE About us Convergence of Supercomputing into Big Data Semantic Web as a new HPC domain? Mission Data-Centric Parallel Programming Models DreamCloud project approach SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 2

3 1 About HLRS Baden-Württemberg Turnover: ~1-2 bil. Staff: ~8000 Turnover: ~50 bil. Staff: ~ Turnover: ~100 bil. Staff: Turnover: ~5 bil. Staff: ~ Turnover: ~2,5 bil. Staff: ~11.00 Turnover: ~14 bil. Staff: ~ SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 3

4 1 About HLRS High Performance Computing Center Stuttgart First Cray system in Europe (Cray-2, 1986, 4 CPUs, 2GB RAM, approx. 2 GFLOPS) National HPC infrastructure provider since 1995 EU infrastructure provider since M core hours delivered to industry in 2014 HORNET (Cray XC40, Intel Haswell CPU, Aries network) nodes (24 cores) - 4 PFLOPs performance GB RAM per node - 7,8 PB Disc KW power consumption / 1.5M Euro SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 4

5 1 About HLRS TOP500 Total: - USA Japan Germany 37 - China - 37 Newcomers in 2015: - USA 34 - Germany 12 - Japan - 11 SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 5

6 1 About HLRS XXL Application Portfolio Climate simulation on a high (a few kms) resolution University of Hohenheim cores, 84 hours, 450 TB data Turbulent flow simulation for air- and gas dynamic analysis RWTH Aachen University cores, 110 hours, 80 TB data See more at: SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 6

7 1 About HLRS Maya, the bee ( みつばちマーヤの冒険 ) pictures 2 hours each 200 pictures in parallel From 9600 days to 48 days viewers in 2 weeks SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 7

8 1 About HLRS Challenges for New-Generation Systems Computation power Exascale is on his way, perhaps by 2020 More performance for less money Hazelhehn - approx. 8 PTFLOPs / 3M Euro power costs, approx. 1 PB RAM Sustainable performance - 1 PTFLOP Storage Spinning devices will go away by 2020 Flash is the core technology (aka NVM) Tapes will remain for data persistency Memory Source: both high memory capacity and high memory bandwidth are required cannot guarantee same growth as for FLOPS extremely deep hierarchy (at the cache level) programming ease 3D stacked memory, NVM SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 8

9 OUTLINE About us Convergence of Supercomputing into Big Data Semantic Web as a new HPC domain? Mission Data-Centric Parallel Programming Models DreamCloud project approach SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 9

10 2 Convergence of Big Data into Supercomputing The modern HPC have to address a new class of computing-intensive applications from data-intensive domains in the internet, media, business, science, etc. Evolution of Computational Applications Traditional Computational Sciences Data-Intensive Sciences Structure static dynamic Locality spatially and temporally local no or little Volume fit into memory do not fit into memory Arithmetical Complexity High Precision Arithmetic variable precision or integer based SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 10

11 2 Convergence of Big Data into Supercomputing Hardware Infrastruktures e-services Applications Semantic Web Linked Data Data Facts Internet Intranet Information Knowledge Cloud Data Center HPC SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 11

12 OUTLINE About us Convergence of Supercomputing into Big Data Semantic Web as a new HPC domain? Mission Data-Centric Parallel Programming Models DreamCloud project approach SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 12

13 3 Semantic Web as a New HPC Domain? What are the challenges Infrastructure on demand - distributed memory parallel clusters with low-latency intercon. - multicore machines with shared memory - GPGPU devices - alltogether? SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 13

14 3 Semantic Web as a New HPC Domain? What are the challenges A Semantic Web Integration Platform for Large-Scale Reasoning Query (SPARQL) Selection Identification Reasoner Answer (RDF) SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 14

15 3 Semantic Web as a New HPC Domain? Development Platforms The idea of LarKC LarKC = an infrastructure for large scale, high performance incomplete reasoning Decider Selecter 1 Query Transformer Identifier Reasoner Selecter 2 Flexibility, Modularity Scalability, Performance 15 SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 15

16 3 Semantic Web as a New HPC Domain? Development Platforms LarKC architecture: High-Level Overview Workflow branching Plug-in parallelisation Query Identifier Workflow branching Decider Selecter Selecter Process 1 Reasoner Process 2 Process 3 Process 4 Multi-Threading MPI Map-Reduce SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 16

17 3 Semantic Web as a New HPC Domain? Development Platforms LarKC architecture: High-Level Overview Plug-in developers Workflow designers Semantic Web Service Farm / / Plug-in Marketplace Deciders Identifiers Transformers Plug-in Plug-in Selectors Reasoners Plug-in Plug-in Registry Plug-in Managers Workflow Support System End-points LarKC platform Execution Framework Data Layer (OWLIM) Remote Invocation Framework (GAT) Monitoring Service Application end-users Infrastructure RDF data base Storage Highperformance computer Computation Monitoring SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 17

18 OUTLINE About us Convergence of Supercomputing into Big Data Semantic Web as a new HPC domain? Mission Data-Centric Parallel Programming Models DreamCloud project approach SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 18

19 4 Data-Centric Parallel Programming Models State-Of-The-Art: Distributed Memory Processing over FS Programming models: MapReduce data-centric fault-tolerant poor sustainable performance restrictive key/value model SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 19

20 4 Data-Centric Parallel Programming Models The Message-Passing Interface Data-driven scenarios with MPI easy to integrate high performance SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 20

21 4 Data-Centric Parallel Programming Models The Message-Passing Interface Issues lack of implementations for the programming languages typically used in the data-centric communities, such as Java Java implementations (MPJExpress) native implementations (mpijava) full MPI-2 standard implementation issues with supporting new high-speed interconnects (e.g. Cray), related to the JVM scalability to a peta/exaflop? support of native tools for parallel computing, i.e. debuggers, error detectors, etc. JNI is used for calling communication libraries that are available in native codes (i.e. highly optimized MPI comm.) integration with a native MPI library is not easy but if you got it running, very enjoyable performance SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 21

22 4 Data-Centric Parallel Programming Models The Message-Passing Interface mpijava Architecture import mpi.*; Applications Java wrappers JNI C Interface Java MPI bindings Open MPI MPICH Native (C) MPI implementation SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 22

23 4 Data-Centric Parallel Programming Models The Message-Passing Interface Java bindings for Open MPI (ompijava) in the Open MPI s trunk! SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 23

24 4 Data-Centric Parallel Programming Models HLRS ompijava performance P2P communication SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 24

25 4 Data-Centric Parallel Programming Models Application Scenarios Random Indexing for Large Texts A Statistical Distribution technique for word/text similarity analysis Docs Subsetting Terms Occurance vector Occurance vector Similarity index Query Expansion SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 25

26 OUTLINE About us Convergence of Supercomputing into Big Data Semantic Web as a new HPC domain? Mission Data-Centric Parallel Programming Models DreamCloud project approach SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 26

27 5 DreamCloud Project Approach Problem statement Application I/O bound Communication-bound HPC Resource Manager Infrastructure RAM ccnuma CPU NUMA node NUMA node I/O SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 27

28 5 DreamCloud Project Approach DreamCloud solution Application Scalable Low overhead Flexible Unified Performance Monitoring Framework Dynamic Scheduler Infrastructure SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 28

29 5 DreamCloud Project Approach Available application profiling tools Valgrind general analysis Likwid energy/power consumption Vampir/Paraver communication Wireshark networking infrastructure Infrastructure Monitoring Frameworks Zabbix Nagious Excess Problem: A very high user s involvement Key solution: Consolidation and Integration! SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 29

30 5 DreamCloud Project Approach Consolidated View Legenda: communication streams Total execution time Total energy consumption High-level view to be targeted by D3.2 Communicationintensive tasks 1 2 I/O-intensive tasks 4 Memoryintensive tasks 5 Pattern-based view 3 Disk RAM Detailed view SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 30

31 5 DreamCloud Project Approach Architecture Querying Interface (RESTful) Monitoring Server (Elastic Search) EXCESS monitoring agent PAPI RAPL... Node 1 3rd-party tools... Node N SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 31

32 5 DreamCloud Project Approach Performance Profiles Static (prior to the execution) Dynamic (collected at runtime) Performance Counters (PAPI etc.) Energy Counters (RAPL etc.) User-defined ones (e.g. progress tracker) SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 32

33 5 DreamCloud Project Approach Basic Metrics SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 33

34 5 DreamCloud Project Approach Power consumption SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 34

35 5 DreamCloud Project Approach Performance Profiles Representation in DreamCloud Performance Profiles(Example in CSV) timestamp;node;task;papi_tot_ins:cpu0;papi_tot_ins:cpu ;node_1[perf];task_1; ; ;node_2[perf];task_1;663789; ;node_3[balanced];task_1; ; ;node_3[energy];task_2_1; ; ;node_9[perf];task_2_1;663789; ;node_3[perf];task_2_2; ; ;node_9[perf];task_2_2;663789; Visualization SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 35

36 5 DreamCloud Project Approach Molecular Dynamics (MD) Simulation Code MS2 Simulation of movement of atoms and molecules Massively parallel, on a per-particle basis Persistent Task Adaptation Adaptation Phase Phase Adapt State Generation Generation Phase Phase Generate Individuals Evaluation Evaluation Phase Phase Generate Input Data Sets Compute Intensive (MPI) Exit Check Termination MS 2 MS 2 MS 2 MS 2 Rank Individuals Calculate Fitness Values SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 36

37 5 DreamCloud Project Approach Submission Request jdl Tasks Description Workflow Description Optimization Criteria Application 1 Submission Interface Deployment Workflow Manager RTE Scheduling Interface Progress Tracker Resource Manager Scheduling 2 Request Deployment 7 Plan 3 4 Scheduling Advisor 5 6 Heuristics Module Node Node Node Node Node Node Node Node Node Node Monitoring Framework Performance Profiles Infrastructure SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 37

38 6 Conclusion Main Results HPC is going to face new challenges related to data-centric application expansion. Parallel programming models (mainly MapReduce and MPI) are the key enablers of HPC to data-centric applications Reaching near-peak performance is going to be the major challenge Future Work Promote existing technologies, such as MPI, to solving new challenges, such as Big Data. Making existing framework more data-centric. SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 38

Cognitive-based Computation, Semantic Understanding, and Web Wisdom

Cognitive-based Computation, Semantic Understanding, and Web Wisdom Panel SEMAPRO/ADVCOMP/DATA ANALYTICS, Porto, 01.10.2013 Reutlingen University Cognitive-based Computation, Semantic Understanding, and Web Wisdom Moderation: Fritz Laux, Reutlingen University, Germany

More information

IME (Infinite Memory Engine) Extreme Application Acceleration & Highly Efficient I/O Provisioning

IME (Infinite Memory Engine) Extreme Application Acceleration & Highly Efficient I/O Provisioning IME (Infinite Memory Engine) Extreme Application Acceleration & Highly Efficient I/O Provisioning September 22 nd 2015 Tommaso Cecchi 2 What is IME? This breakthrough, software defined storage application

More information

Computing architectures Part 2 TMA4280 Introduction to Supercomputing

Computing architectures Part 2 TMA4280 Introduction to Supercomputing Computing architectures Part 2 TMA4280 Introduction to Supercomputing NTNU, IMF January 16. 2017 1 Supercomputing What is the motivation for Supercomputing? Solve complex problems fast and accurately:

More information

The Future of High Performance Computing

The Future of High Performance Computing The Future of High Performance Computing Randal E. Bryant Carnegie Mellon University http://www.cs.cmu.edu/~bryant Comparing Two Large-Scale Systems Oakridge Titan Google Data Center 2 Monolithic supercomputer

More information

MPI RUNTIMES AT JSC, NOW AND IN THE FUTURE

MPI RUNTIMES AT JSC, NOW AND IN THE FUTURE , NOW AND IN THE FUTURE Which, why and how do they compare in our systems? 08.07.2018 I MUG 18, COLUMBUS (OH) I DAMIAN ALVAREZ Outline FZJ mission JSC s role JSC s vision for Exascale-era computing JSC

More information

ENERGY-EFFICIENT VISUALIZATION PIPELINES A CASE STUDY IN CLIMATE SIMULATION

ENERGY-EFFICIENT VISUALIZATION PIPELINES A CASE STUDY IN CLIMATE SIMULATION ENERGY-EFFICIENT VISUALIZATION PIPELINES A CASE STUDY IN CLIMATE SIMULATION Vignesh Adhinarayanan Ph.D. (CS) Student Synergy Lab, Virginia Tech INTRODUCTION Supercomputers are constrained by power Power

More information

High Performance Data Analytics for Numerical Simulations. Bruno Raffin DataMove

High Performance Data Analytics for Numerical Simulations. Bruno Raffin DataMove High Performance Data Analytics for Numerical Simulations Bruno Raffin DataMove bruno.raffin@inria.fr April 2016 About this Talk HPC for analyzing the results of large scale parallel numerical simulations

More information

Green Supercomputing

Green Supercomputing Green Supercomputing On the Energy Consumption of Modern E-Science Prof. Dr. Thomas Ludwig German Climate Computing Centre Hamburg, Germany ludwig@dkrz.de Outline DKRZ 2013 and Climate Science The Exascale

More information

Mellanox InfiniBand Solutions Accelerate Oracle s Data Center and Cloud Solutions

Mellanox InfiniBand Solutions Accelerate Oracle s Data Center and Cloud Solutions Mellanox InfiniBand Solutions Accelerate Oracle s Data Center and Cloud Solutions Providing Superior Server and Storage Performance, Efficiency and Return on Investment As Announced and Demonstrated at

More information

China Big Data and HPC Initiatives Overview. Xuanhua Shi

China Big Data and HPC Initiatives Overview. Xuanhua Shi China Big Data and HPC Initiatives Overview Xuanhua Shi Services Computing Technology and System Laboratory Big Data Technology and System Laboratory Cluster and Grid Computing Laboratory Huazhong University

More information

Practical Scientific Computing

Practical Scientific Computing Practical Scientific Computing Performance-optimized Programming Preliminary discussion: July 11, 2008 Dr. Ralf-Peter Mundani, mundani@tum.de Dipl.-Ing. Ioan Lucian Muntean, muntean@in.tum.de MSc. Csaba

More information

IBM Power Systems HPC Cluster

IBM Power Systems HPC Cluster IBM Power Systems HPC Cluster Highlights Complete and fully Integrated HPC cluster for demanding workloads Modular and Extensible: match components & configurations to meet demands Integrated: racked &

More information

Scaling to Petaflop. Ola Torudbakken Distinguished Engineer. Sun Microsystems, Inc

Scaling to Petaflop. Ola Torudbakken Distinguished Engineer. Sun Microsystems, Inc Scaling to Petaflop Ola Torudbakken Distinguished Engineer Sun Microsystems, Inc HPC Market growth is strong CAGR increased from 9.2% (2006) to 15.5% (2007) Market in 2007 doubled from 2003 (Source: IDC

More information

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical Identify a problem Review approaches to the problem Propose a novel approach to the problem Define, design, prototype an implementation to evaluate your approach Could be a real system, simulation and/or

More information

Advances of parallel computing. Kirill Bogachev May 2016

Advances of parallel computing. Kirill Bogachev May 2016 Advances of parallel computing Kirill Bogachev May 2016 Demands in Simulations Field development relies more and more on static and dynamic modeling of the reservoirs that has come a long way from being

More information

Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands

Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands Unleash Your Data Center s Hidden Power September 16, 2014 Molly Rector CMO, EVP Product Management & WW Marketing

More information

GPU-centric communication for improved efficiency

GPU-centric communication for improved efficiency GPU-centric communication for improved efficiency Benjamin Klenk *, Lena Oden, Holger Fröning * * Heidelberg University, Germany Fraunhofer Institute for Industrial Mathematics, Germany GPCDP Workshop

More information

NERSC Site Update. National Energy Research Scientific Computing Center Lawrence Berkeley National Laboratory. Richard Gerber

NERSC Site Update. National Energy Research Scientific Computing Center Lawrence Berkeley National Laboratory. Richard Gerber NERSC Site Update National Energy Research Scientific Computing Center Lawrence Berkeley National Laboratory Richard Gerber NERSC Senior Science Advisor High Performance Computing Department Head Cori

More information

The Future of High- Performance Computing

The Future of High- Performance Computing Lecture 26: The Future of High- Performance Computing Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2017 Comparing Two Large-Scale Systems Oakridge Titan Google Data Center Monolithic

More information

System Software for Big Data and Post Petascale Computing

System Software for Big Data and Post Petascale Computing The Japanese Extreme Big Data Workshop February 26, 2014 System Software for Big Data and Post Petascale Computing Osamu Tatebe University of Tsukuba I/O performance requirement for exascale applications

More information

Data Movement & Tiering with DMF 7

Data Movement & Tiering with DMF 7 Data Movement & Tiering with DMF 7 Kirill Malkin Director of Engineering April 2019 Why Move or Tier Data? We wish we could keep everything in DRAM, but It s volatile It s expensive Data in Memory 2 Why

More information

The Future of High- Performance Computing

The Future of High- Performance Computing Lecture 24: The Future of High- Performance Computing Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2016 Executive Order July 29, 2015 EXECUTIVE ORDER - - - - - - - CREATING

More information

Cray XC Scalability and the Aries Network Tony Ford

Cray XC Scalability and the Aries Network Tony Ford Cray XC Scalability and the Aries Network Tony Ford June 29, 2017 Exascale Scalability Which scalability metrics are important for Exascale? Performance (obviously!) What are the contributing factors?

More information

Intel Xeon Phi архитектура, модели программирования, оптимизация.

Intel Xeon Phi архитектура, модели программирования, оптимизация. Нижний Новгород, 2017 Intel Xeon Phi архитектура, модели программирования, оптимизация. Дмитрий Прохоров, Дмитрий Рябцев, Intel Agenda What and Why Intel Xeon Phi Top 500 insights, roadmap, architecture

More information

Motivation Goal Idea Proposition for users Study

Motivation Goal Idea Proposition for users Study Exploring Tradeoffs Between Power and Performance for a Scientific Visualization Algorithm Stephanie Labasan Computer and Information Science University of Oregon 23 November 2015 Overview Motivation:

More information

Emerging Technologies for HPC Storage

Emerging Technologies for HPC Storage Emerging Technologies for HPC Storage Dr. Wolfgang Mertz CTO EMEA Unstructured Data Solutions June 2018 The very definition of HPC is expanding Blazing Fast Speed Accessibility and flexibility 2 Traditional

More information

AIST Super Green Cloud

AIST Super Green Cloud AIST Super Green Cloud A build-once-run-everywhere high performance computing platform Takahiro Hirofuchi, Ryosei Takano, Yusuke Tanimura, Atsuko Takefusa, and Yoshio Tanaka Information Technology Research

More information

Practical Scientific Computing

Practical Scientific Computing Practical Scientific Computing Performance-optimised Programming Preliminary discussion, 17.7.2007 Dr. Ralf-Peter Mundani, mundani@tum.de Dipl.-Ing. Ioan Lucian Muntean, muntean@in.tum.de Dipl.-Geophys.

More information

Performance analysis tools: Intel VTuneTM Amplifier and Advisor. Dr. Luigi Iapichino

Performance analysis tools: Intel VTuneTM Amplifier and Advisor. Dr. Luigi Iapichino Performance analysis tools: Intel VTuneTM Amplifier and Advisor Dr. Luigi Iapichino luigi.iapichino@lrz.de Which tool do I use in my project? A roadmap to optimisation After having considered the MPI layer,

More information

Inauguration Cartesius June 14, 2013

Inauguration Cartesius June 14, 2013 Inauguration Cartesius June 14, 2013 Hardware is Easy...but what about software/applications/implementation/? Dr. Peter Michielse Deputy Director 1 Agenda History Cartesius Hardware path to exascale: the

More information

Intel profiling tools and roofline model. Dr. Luigi Iapichino

Intel profiling tools and roofline model. Dr. Luigi Iapichino Intel profiling tools and roofline model Dr. Luigi Iapichino luigi.iapichino@lrz.de Which tool do I use in my project? A roadmap to optimization (and to the next hour) We will focus on tools developed

More information

Next-Generation Cloud Platform

Next-Generation Cloud Platform Next-Generation Cloud Platform Jangwoo Kim Jun 24, 2013 E-mail: jangwoo@postech.ac.kr High Performance Computing Lab Department of Computer Science & Engineering Pohang University of Science and Technology

More information

Paving the Road to Exascale

Paving the Road to Exascale Paving the Road to Exascale Gilad Shainer August 2015, MVAPICH User Group (MUG) Meeting The Ever Growing Demand for Performance Performance Terascale Petascale Exascale 1 st Roadrunner 2000 2005 2010 2015

More information

Data Intensive Scalable Computing

Data Intensive Scalable Computing Data Intensive Scalable Computing Randal E. Bryant Carnegie Mellon University http://www.cs.cmu.edu/~bryant Examples of Big Data Sources Wal-Mart 267 million items/day, sold at 6,000 stores HP built them

More information

A New NSF TeraGrid Resource for Data-Intensive Science

A New NSF TeraGrid Resource for Data-Intensive Science A New NSF TeraGrid Resource for Data-Intensive Science Michael L. Norman Principal Investigator Director, SDSC Allan Snavely Co-Principal Investigator Project Scientist Slide 1 Coping with the data deluge

More information

Overview of Tianhe-2

Overview of Tianhe-2 Overview of Tianhe-2 (MilkyWay-2) Supercomputer Yutong Lu School of Computer Science, National University of Defense Technology; State Key Laboratory of High Performance Computing, China ytlu@nudt.edu.cn

More information

Fujitsu s Approach to Application Centric Petascale Computing

Fujitsu s Approach to Application Centric Petascale Computing Fujitsu s Approach to Application Centric Petascale Computing 2 nd Nov. 2010 Motoi Okuda Fujitsu Ltd. Agenda Japanese Next-Generation Supercomputer, K Computer Project Overview Design Targets System Overview

More information

High Performance Computing. What is it used for and why?

High Performance Computing. What is it used for and why? High Performance Computing What is it used for and why? Overview What is it used for? Drivers for HPC Examples of usage Why do you need to learn the basics? Hardware layout and structure matters Serial

More information

GPUs and Emerging Architectures

GPUs and Emerging Architectures GPUs and Emerging Architectures Mike Giles mike.giles@maths.ox.ac.uk Mathematical Institute, Oxford University e-infrastructure South Consortium Oxford e-research Centre Emerging Architectures p. 1 CPUs

More information

KNL tools. Dr. Fabio Baruffa

KNL tools. Dr. Fabio Baruffa KNL tools Dr. Fabio Baruffa fabio.baruffa@lrz.de 2 Which tool do I use? A roadmap to optimization We will focus on tools developed by Intel, available to users of the LRZ systems. Again, we will skip the

More information

Shared Services Canada Environment and Climate Change Canada HPC Renewal Project

Shared Services Canada Environment and Climate Change Canada HPC Renewal Project Shared Services Canada Environment and Climate Change Canada HPC Renewal Project CUG 2017 Redmond, WA, USA Deric Sullivan Alain St-Denis & Luc Corbeil May 2017 Background: SSC's HPC Renewal for ECCC Environment

More information

Results from TSUBAME3.0 A 47 AI- PFLOPS System for HPC & AI Convergence

Results from TSUBAME3.0 A 47 AI- PFLOPS System for HPC & AI Convergence Results from TSUBAME3.0 A 47 AI- PFLOPS System for HPC & AI Convergence Jens Domke Research Staff at MATSUOKA Laboratory GSIC, Tokyo Institute of Technology, Japan Omni-Path User Group 2017/11/14 Denver,

More information

Pervasive DataRush TM

Pervasive DataRush TM Pervasive DataRush TM Parallel Data Analysis with KNIME www.pervasivedatarush.com Company Overview Global Software Company Tens of thousands of users across the globe Americas, EMEA, Asia ~230 employees

More information

Exascale: challenges and opportunities in a power constrained world

Exascale: challenges and opportunities in a power constrained world Exascale: challenges and opportunities in a power constrained world Carlo Cavazzoni c.cavazzoni@cineca.it SuperComputing Applications and Innovation Department CINECA CINECA non profit Consortium, made

More information

The way toward peta-flops

The way toward peta-flops The way toward peta-flops ISC-2011 Dr. Pierre Lagier Chief Technology Officer Fujitsu Systems Europe Where things started from DESIGN CONCEPTS 2 New challenges and requirements! Optimal sustained flops

More information

Brand-New Vector Supercomputer

Brand-New Vector Supercomputer Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13 1 New Product NEC Released A Brand-New Vector Supercomputer, SX-ACE Just Now. Vector Supercomputer for Memory Bandwidth

More information

HiTune. Dataflow-Based Performance Analysis for Big Data Cloud

HiTune. Dataflow-Based Performance Analysis for Big Data Cloud HiTune Dataflow-Based Performance Analysis for Big Data Cloud Jinquan (Jason) Dai, Jie Huang, Shengsheng Huang, Bo Huang, Yan Liu Intel Asia-Pacific Research and Development Ltd Shanghai, China, 200241

More information

SGI Overview. HPC User Forum Dearborn, Michigan September 17 th, 2012

SGI Overview. HPC User Forum Dearborn, Michigan September 17 th, 2012 SGI Overview HPC User Forum Dearborn, Michigan September 17 th, 2012 SGI Market Strategy HPC Commercial Scientific Modeling & Simulation Big Data Hadoop In-memory Analytics Archive Cloud Public Private

More information

Modernize Your IT with Dell EMC Storage and Data Protection Solutions

Modernize Your IT with Dell EMC Storage and Data Protection Solutions Modernize Your IT with Dell EMC Storage and Data Protection Solutions Steve Willson Modern Infrastructure Team GLOBAL SPONSORS Last 15 years IT-centric Systems of record Traditional applications Transactional

More information

Data Intensive Computing SUBTITLE WITH TWO LINES OF TEXT IF NECESSARY PASIG June, 2009

Data Intensive Computing SUBTITLE WITH TWO LINES OF TEXT IF NECESSARY PASIG June, 2009 Data Intensive Computing SUBTITLE WITH TWO LINES OF TEXT IF NECESSARY PASIG June, 2009 Presenter s Name Simon CW See Title & and Division HPC Cloud Computing Sun Microsystems Technology Center Sun Microsystems,

More information

Revealing Applications Access Pattern in Collective I/O for Cache Management

Revealing Applications Access Pattern in Collective I/O for Cache Management Revealing Applications Access Pattern in for Yin Lu 1, Yong Chen 1, Rob Latham 2 and Yu Zhuang 1 Presented by Philip Roth 3 1 Department of Computer Science Texas Tech University 2 Mathematics and Computer

More information

Toward Automated Application Profiling on Cray Systems

Toward Automated Application Profiling on Cray Systems Toward Automated Application Profiling on Cray Systems Charlene Yang, Brian Friesen, Thorsten Kurth, Brandon Cook NERSC at LBNL Samuel Williams CRD at LBNL I have a dream.. M.L.K. Collect performance data:

More information

Performance and Energy Usage of Workloads on KNL and Haswell Architectures

Performance and Energy Usage of Workloads on KNL and Haswell Architectures Performance and Energy Usage of Workloads on KNL and Haswell Architectures Tyler Allen 1 Christopher Daley 2 Doug Doerfler 2 Brian Austin 2 Nicholas Wright 2 1 Clemson University 2 National Energy Research

More information

Aim High. Intel Technical Update Teratec 07 Symposium. June 20, Stephen R. Wheat, Ph.D. Director, HPC Digital Enterprise Group

Aim High. Intel Technical Update Teratec 07 Symposium. June 20, Stephen R. Wheat, Ph.D. Director, HPC Digital Enterprise Group Aim High Intel Technical Update Teratec 07 Symposium June 20, 2007 Stephen R. Wheat, Ph.D. Director, HPC Digital Enterprise Group Risk Factors Today s s presentations contain forward-looking statements.

More information

Overview. Idea: Reduce CPU clock frequency This idea is well suited specifically for visualization

Overview. Idea: Reduce CPU clock frequency This idea is well suited specifically for visualization Exploring Tradeoffs Between Power and Performance for a Scientific Visualization Algorithm Stephanie Labasan & Matt Larsen (University of Oregon), Hank Childs (Lawrence Berkeley National Laboratory) 26

More information

Cycle Sharing Systems

Cycle Sharing Systems Cycle Sharing Systems Jagadeesh Dyaberi Dependable Computing Systems Lab Purdue University 10/31/2005 1 Introduction Design of Program Security Communication Architecture Implementation Conclusion Outline

More information

Leveraging Flash in HPC Systems

Leveraging Flash in HPC Systems Leveraging Flash in HPC Systems IEEE MSST June 3, 2015 This work was performed under the auspices of the U.S. Department of Energy by under Contract DE-AC52-07NA27344. Lawrence Livermore National Security,

More information

Secure, scalable storage made simple. OEM Storage Portfolio

Secure, scalable storage made simple. OEM Storage Portfolio Secure, scalable storage made simple. OEM Storage Portfolio P Data is the currency of the digital economy. It s the new oil and the lifeblood of your organization. But, how to manage it all? How can you

More information

High Performance Computing. What is it used for and why?

High Performance Computing. What is it used for and why? High Performance Computing What is it used for and why? Overview What is it used for? Drivers for HPC Examples of usage Why do you need to learn the basics? Hardware layout and structure matters Serial

More information

What are Clusters? Why Clusters? - a Short History

What are Clusters? Why Clusters? - a Short History What are Clusters? Our definition : A parallel machine built of commodity components and running commodity software Cluster consists of nodes with one or more processors (CPUs), memory that is shared by

More information

Analyzing the Performance of IWAVE on a Cluster using HPCToolkit

Analyzing the Performance of IWAVE on a Cluster using HPCToolkit Analyzing the Performance of IWAVE on a Cluster using HPCToolkit John Mellor-Crummey and Laksono Adhianto Department of Computer Science Rice University {johnmc,laksono}@rice.edu TRIP Meeting March 30,

More information

Introduction to Parallel and Distributed Computing. Linh B. Ngo CPSC 3620

Introduction to Parallel and Distributed Computing. Linh B. Ngo CPSC 3620 Introduction to Parallel and Distributed Computing Linh B. Ngo CPSC 3620 Overview: What is Parallel Computing To be run using multiple processors A problem is broken into discrete parts that can be solved

More information

Best Practices for Setting BIOS Parameters for Performance

Best Practices for Setting BIOS Parameters for Performance White Paper Best Practices for Setting BIOS Parameters for Performance Cisco UCS E5-based M3 Servers May 2013 2014 Cisco and/or its affiliates. All rights reserved. This document is Cisco Public. Page

More information

HPC Storage Use Cases & Future Trends

HPC Storage Use Cases & Future Trends Oct, 2014 HPC Storage Use Cases & Future Trends Massively-Scalable Platforms and Solutions Engineered for the Big Data and Cloud Era Atul Vidwansa Email: atul@ DDN About Us DDN is a Leader in Massively

More information

Toward portable I/O performance by leveraging system abstractions of deep memory and interconnect hierarchies

Toward portable I/O performance by leveraging system abstractions of deep memory and interconnect hierarchies Toward portable I/O performance by leveraging system abstractions of deep memory and interconnect hierarchies François Tessier, Venkatram Vishwanath, Paul Gressier Argonne National Laboratory, USA Wednesday

More information

Technologies for High Performance Data Analytics

Technologies for High Performance Data Analytics Technologies for High Performance Data Analytics Dr. Jens Krüger Fraunhofer ITWM 1 Fraunhofer ITWM n Institute for Industrial Mathematics n Located in Kaiserslautern, Germany n Staff: ~ 240 employees +

More information

The Fusion Distributed File System

The Fusion Distributed File System Slide 1 / 44 The Fusion Distributed File System Dongfang Zhao February 2015 Slide 2 / 44 Outline Introduction FusionFS System Architecture Metadata Management Data Movement Implementation Details Unique

More information

EARLY EVALUATION OF THE CRAY XC40 SYSTEM THETA

EARLY EVALUATION OF THE CRAY XC40 SYSTEM THETA EARLY EVALUATION OF THE CRAY XC40 SYSTEM THETA SUDHEER CHUNDURI, SCOTT PARKER, KEVIN HARMS, VITALI MOROZOV, CHRIS KNIGHT, KALYAN KUMARAN Performance Engineering Group Argonne Leadership Computing Facility

More information

DDN s Vision for the Future of Lustre LUG2015 Robert Triendl

DDN s Vision for the Future of Lustre LUG2015 Robert Triendl DDN s Vision for the Future of Lustre LUG2015 Robert Triendl 3 Topics 1. The Changing Markets for Lustre 2. A Vision for Lustre that isn t Exascale 3. Building Lustre for the Future 4. Peak vs. Operational

More information

Piz Daint: Application driven co-design of a supercomputer based on Cray s adaptive system design

Piz Daint: Application driven co-design of a supercomputer based on Cray s adaptive system design Piz Daint: Application driven co-design of a supercomputer based on Cray s adaptive system design Sadaf Alam & Thomas Schulthess CSCS & ETHzürich CUG 2014 * Timelines & releases are not precise Top 500

More information

Introduction CPS343. Spring Parallel and High Performance Computing. CPS343 (Parallel and HPC) Introduction Spring / 29

Introduction CPS343. Spring Parallel and High Performance Computing. CPS343 (Parallel and HPC) Introduction Spring / 29 Introduction CPS343 Parallel and High Performance Computing Spring 2018 CPS343 (Parallel and HPC) Introduction Spring 2018 1 / 29 Outline 1 Preface Course Details Course Requirements 2 Background Definitions

More information

VxRack FLEX Technical Deep Dive: Building Hyper-converged Solutions at Rackscale. Kiewiet Kritzinger DELL EMC CPSD Snr varchitect

VxRack FLEX Technical Deep Dive: Building Hyper-converged Solutions at Rackscale. Kiewiet Kritzinger DELL EMC CPSD Snr varchitect VxRack FLEX Technical Deep Dive: Building Hyper-converged Solutions at Rackscale Kiewiet Kritzinger DELL EMC CPSD Snr varchitect Introduction to hyper-converged Focus on innovation, not IT integration

More information

CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI.

CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI. CSCI 402: Computer Architectures Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI 6.6 - End Today s Contents GPU Cluster and its network topology The Roofline performance

More information

MDHIM: A Parallel Key/Value Store Framework for HPC

MDHIM: A Parallel Key/Value Store Framework for HPC MDHIM: A Parallel Key/Value Store Framework for HPC Hugh Greenberg 7/6/2015 LA-UR-15-25039 HPC Clusters Managed by a job scheduler (e.g., Slurm, Moab) Designed for running user jobs Difficult to run system

More information

Fast Forward I/O & Storage

Fast Forward I/O & Storage Fast Forward I/O & Storage Eric Barton Lead Architect 1 Department of Energy - Fast Forward Challenge FastForward RFP provided US Government funding for exascale research and development Sponsored by 7

More information

HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA

HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA STATE OF THE ART 2012 18,688 Tesla K20X GPUs 27 PetaFLOPS FLAGSHIP SCIENTIFIC APPLICATIONS

More information

Performance Matters Scaling Integration Processes to Meet the Needs of Your Business. James Ahlborn, Chief Software Architect, Dell Boomi

Performance Matters Scaling Integration Processes to Meet the Needs of Your Business. James Ahlborn, Chief Software Architect, Dell Boomi Performance Matters Scaling Integration Processes to Meet the Needs of Your Business James Ahlborn, Chief Software Architect, Dell Boomi 1 Atoms Agenda Atoms vs. Molecules Atom Clouds Atom Workers Performance

More information

HPC Technology Trends

HPC Technology Trends HPC Technology Trends High Performance Embedded Computing Conference September 18, 2007 David S Scott, Ph.D. Petascale Product Line Architect Digital Enterprise Group Risk Factors Today s s presentations

More information

NetApp: Solving I/O Challenges. Jeff Baxter February 2013

NetApp: Solving I/O Challenges. Jeff Baxter February 2013 NetApp: Solving I/O Challenges Jeff Baxter February 2013 1 High Performance Computing Challenges Computing Centers Challenge of New Science Performance Efficiency directly impacts achievable science Power

More information

CSCI-GA Multicore Processors: Architecture & Programming Lecture 3: The Memory System You Can t Ignore it!

CSCI-GA Multicore Processors: Architecture & Programming Lecture 3: The Memory System You Can t Ignore it! CSCI-GA.3033-012 Multicore Processors: Architecture & Programming Lecture 3: The Memory System You Can t Ignore it! Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com Memory Computer Technology

More information

Rechenzentrum HIGH PERFORMANCE SCIENTIFIC COMPUTING

Rechenzentrum HIGH PERFORMANCE SCIENTIFIC COMPUTING Rechenzentrum HIGH PERFORMANCE SCIENTIFIC COMPUTING Contents Scientifi c Supercomputing Center Karlsruhe (SSCK)... 4 Consultation and Support... 5 HP XC 6000 Cluster at the SSC Karlsruhe... 6 Architecture

More information

HPC Saudi Jeffrey A. Nichols Associate Laboratory Director Computing and Computational Sciences. Presented to: March 14, 2017

HPC Saudi Jeffrey A. Nichols Associate Laboratory Director Computing and Computational Sciences. Presented to: March 14, 2017 Creating an Exascale Ecosystem for Science Presented to: HPC Saudi 2017 Jeffrey A. Nichols Associate Laboratory Director Computing and Computational Sciences March 14, 2017 ORNL is managed by UT-Battelle

More information

The Software Driven Datacenter

The Software Driven Datacenter The Software Driven Datacenter Three Major Trends are Driving the Evolution of the Datacenter Hardware Costs Innovation in CPU and Memory. 10000 10 µm CPU process technologies $100 DRAM $/GB 1000 1 µm

More information

Analyzing I/O Performance on a NEXTGenIO Class System

Analyzing I/O Performance on a NEXTGenIO Class System Analyzing I/O Performance on a NEXTGenIO Class System holger.brunst@tu-dresden.de ZIH, Technische Universität Dresden LUG17, Indiana University, June 2 nd 2017 NEXTGenIO Fact Sheet Project Research & Innovation

More information

Accelerating Hadoop Applications with the MapR Distribution Using Flash Storage and High-Speed Ethernet

Accelerating Hadoop Applications with the MapR Distribution Using Flash Storage and High-Speed Ethernet WHITE PAPER Accelerating Hadoop Applications with the MapR Distribution Using Flash Storage and High-Speed Ethernet Contents Background... 2 The MapR Distribution... 2 Mellanox Ethernet Solution... 3 Test

More information

Arm Processor Technology Update and Roadmap

Arm Processor Technology Update and Roadmap Arm Processor Technology Update and Roadmap ARM Processor Technology Update and Roadmap Cavium: Giri Chukkapalli is a Distinguished Engineer in the Data Center Group (DCG) Introduction to ARM Architecture

More information

MOHA: Many-Task Computing Framework on Hadoop

MOHA: Many-Task Computing Framework on Hadoop Apache: Big Data North America 2017 @ Miami MOHA: Many-Task Computing Framework on Hadoop Soonwook Hwang Korea Institute of Science and Technology Information May 18, 2017 Table of Contents Introduction

More information

Complexity and Advanced Algorithms. Introduction to Parallel Algorithms

Complexity and Advanced Algorithms. Introduction to Parallel Algorithms Complexity and Advanced Algorithms Introduction to Parallel Algorithms Why Parallel Computing? Save time, resources, memory,... Who is using it? Academia Industry Government Individuals? Two practical

More information

Building supercomputers from embedded technologies

Building supercomputers from embedded technologies http://www.montblanc-project.eu Building supercomputers from embedded technologies Alex Ramirez Barcelona Supercomputing Center Technical Coordinator This project and the research leading to these results

More information

Performance Analysis of Parallel Scientific Applications In Eclipse

Performance Analysis of Parallel Scientific Applications In Eclipse Performance Analysis of Parallel Scientific Applications In Eclipse EclipseCon 2015 Wyatt Spear, University of Oregon wspear@cs.uoregon.edu Supercomputing Big systems solving big problems Performance gains

More information

High performance computing and numerical modeling

High performance computing and numerical modeling High performance computing and numerical modeling Volker Springel Plan for my lectures Lecture 1: Collisional and collisionless N-body dynamics Lecture 2: Gravitational force calculation Lecture 3: Basic

More information

Medical practice: diagnostics, treatment and surgery in supercomputing centers

Medical practice: diagnostics, treatment and surgery in supercomputing centers International Advanced Research Workshop on High Performance Computing from Clouds and Big Data to Exascale and Beyond Medical practice: diagnostics, treatment and surgery in supercomputing centers Prof.

More information

Cloud Programming. Programming Environment Oct 29, 2015 Osamu Tatebe

Cloud Programming. Programming Environment Oct 29, 2015 Osamu Tatebe Cloud Programming Programming Environment Oct 29, 2015 Osamu Tatebe Cloud Computing Only required amount of CPU and storage can be used anytime from anywhere via network Availability, throughput, reliability

More information

TECHNOLOGIES CO., LTD.

TECHNOLOGIES CO., LTD. A Fresh Look at HPC HUAWEI TECHNOLOGIES Francis Lam Director, Product Management www.huawei.com WORLD CLASS HPC SOLUTIONS TODAY 170+ Countries $74.8B 2016 Revenue 14.2% of Revenue in R&D 79,000 R&D Engineers

More information

The FlashStack Data Center

The FlashStack Data Center SOLUTION BRIEF The FlashStack Data Center THE CHALLENGE: DATA CENTER COMPLEXITY Deploying, operating, and maintaining data center infrastructure is complex, time consuming, and costly. The result is a

More information

Increasing Performance of Existing Oracle RAC up to 10X

Increasing Performance of Existing Oracle RAC up to 10X Increasing Performance of Existing Oracle RAC up to 10X Prasad Pammidimukkala www.gridironsystems.com 1 The Problem Data can be both Big and Fast Processing large datasets creates high bandwidth demand

More information

HPC Architectures. Types of resource currently in use

HPC Architectures. Types of resource currently in use HPC Architectures Types of resource currently in use Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License. http://creativecommons.org/licenses/by-nc-sa/4.0/deed.en_us

More information

Introduction: Modern computer architecture. The stored program computer and its inherent bottlenecks Multi- and manycore chips and nodes

Introduction: Modern computer architecture. The stored program computer and its inherent bottlenecks Multi- and manycore chips and nodes Introduction: Modern computer architecture The stored program computer and its inherent bottlenecks Multi- and manycore chips and nodes Motivation: Multi-Cores where and why Introduction: Moore s law Intel

More information

Performance Analysis and Prediction for distributed homogeneous Clusters

Performance Analysis and Prediction for distributed homogeneous Clusters Performance Analysis and Prediction for distributed homogeneous Clusters Heinz Kredel, Hans-Günther Kruse, Sabine Richling, Erich Strohmaier IT-Center, University of Mannheim, Germany IT-Center, University

More information

Software and Tools for HPE s The Machine Project

Software and Tools for HPE s The Machine Project Labs Software and Tools for HPE s The Machine Project Scalable Tools Workshop Aug/1 - Aug/4, 2016 Lake Tahoe Milind Chabbi Traditional Computing Paradigm CPU DRAM CPU DRAM CPU-centric computing 2 CPU-Centric

More information