: A new version of Supercomputing or life after the end of the Moore s Law

Size: px

Start display at page:

Download ": A new version of Supercomputing or life after the end of the Moore s Law"

Ann Skinner
6 years ago
Views:

1 : A new version of Supercomputing or life after the end of the Moore s Law Dr.-Ing. Alexey Cheptsov SEMAPRO 2015 :: :: Dr. Alexey Cheptsov

2 OUTLINE About us Convergence of Supercomputing into Big Data Semantic Web as a new HPC domain? Mission Data-Centric Parallel Programming Models DreamCloud project approach SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 2

3 1 About HLRS Baden-Württemberg Turnover: ~1-2 bil. Staff: ~8000 Turnover: ~50 bil. Staff: ~ Turnover: ~100 bil. Staff: Turnover: ~5 bil. Staff: ~ Turnover: ~2,5 bil. Staff: ~11.00 Turnover: ~14 bil. Staff: ~ SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 3

to industry in 2014 HORNET (Cray XC40, Intel Haswell CPU, Aries network) - 4.

4 1 About HLRS High Performance Computing Center Stuttgart First Cray system in Europe (Cray-2, 1986, 4 CPUs, 2GB RAM, approx. 2 GFLOPS) National HPC infrastructure provider since 1995 EU infrastructure provider since M core hours delivered to industry in 2014 HORNET (Cray XC40, Intel Haswell CPU, Aries network) nodes (24 cores) - 4 PFLOPs performance GB RAM per node - 7,8 PB Disc KW power consumption / 1.5M Euro SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 4

5 1 About HLRS TOP500 Total: - USA Japan Germany 37 - China - 37 Newcomers in 2015: - USA 34 - Germany 12 - Japan - 11 SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 5

000 cores, 84 hours, 450 TB data Turbulent flow simulation for air- and gas dynamic analysis RWTH Aachen

6 1 About HLRS XXL Application Portfolio Climate simulation on a high (a few kms) resolution University of Hohenheim cores, 84 hours, 450 TB data Turbulent flow simulation for air- and gas dynamic analysis RWTH Aachen University cores, 110 hours, 80 TB data See more at: SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 6

1 About HLRS Maya, the bee ( みつばちマーヤの冒険 ) 115.

7 1 About HLRS Maya, the bee ( みつばちマーヤの冒険 ) pictures 2 hours each 200 pictures in parallel From 9600 days to 48 days viewers in 2 weeks SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 7

8 1 About HLRS Challenges for New-Generation Systems Computation power Exascale is on his way, perhaps by 2020 More performance for less money Hazelhehn - approx. 8 PTFLOPs / 3M Euro power costs, approx. 1 PB RAM Sustainable performance - 1 PTFLOP Storage Spinning devices will go away by 2020 Flash is the core technology (aka NVM) Tapes will remain for data persistency Memory Source: both high memory capacity and high memory bandwidth are required cannot guarantee same growth as for FLOPS extremely deep hierarchy (at the cache level) programming ease 3D stacked memory, NVM SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 8

9 OUTLINE About us Convergence of Supercomputing into Big Data Semantic Web as a new HPC domain? Mission Data-Centric Parallel Programming Models DreamCloud project approach SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 9

10 2 Convergence of Big Data into Supercomputing The modern HPC have to address a new class of computing-intensive applications from data-intensive domains in the internet, media, business, science, etc. Evolution of Computational Applications Traditional Computational Sciences Data-Intensive Sciences Structure static dynamic Locality spatially and temporally local no or little Volume fit into memory do not fit into memory Arithmetical Complexity High Precision Arithmetic variable precision or integer based SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 10

11 2 Convergence of Big Data into Supercomputing Hardware Infrastruktures e-services Applications Semantic Web Linked Data Data Facts Internet Intranet Information Knowledge Cloud Data Center HPC SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 11

12 OUTLINE About us Convergence of Supercomputing into Big Data Semantic Web as a new HPC domain? Mission Data-Centric Parallel Programming Models DreamCloud project approach SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 12

13 3 Semantic Web as a New HPC Domain? What are the challenges Infrastructure on demand - distributed memory parallel clusters with low-latency intercon. - multicore machines with shared memory - GPGPU devices - alltogether? SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 13

14 3 Semantic Web as a New HPC Domain? What are the challenges A Semantic Web Integration Platform for Large-Scale Reasoning Query (SPARQL) Selection Identification Reasoner Answer (RDF) SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 14

15 3 Semantic Web as a New HPC Domain? Development Platforms The idea of LarKC LarKC = an infrastructure for large scale, high performance incomplete reasoning Decider Selecter 1 Query Transformer Identifier Reasoner Selecter 2 Flexibility, Modularity Scalability, Performance 15 SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 15

Plug-in parallelisation Query Identifier Workflow branching Decider Selecter

16 3 Semantic Web as a New HPC Domain? Development Platforms LarKC architecture: High-Level Overview Workflow branching Plug-in parallelisation Query Identifier Workflow branching Decider Selecter Selecter Process 1 Reasoner Process 2 Process 3 Process 4 Multi-Threading MPI Map-Reduce SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 16

17 3 Semantic Web as a New HPC Domain? Development Platforms LarKC architecture: High-Level Overview Plug-in developers Workflow designers Semantic Web Service Farm / / Plug-in Marketplace Deciders Identifiers Transformers Plug-in Plug-in Selectors Reasoners Plug-in Plug-in Registry Plug-in Managers Workflow Support System End-points LarKC platform Execution Framework Data Layer (OWLIM) Remote Invocation Framework (GAT) Monitoring Service Application end-users Infrastructure RDF data base Storage Highperformance computer Computation Monitoring SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 17

18 OUTLINE About us Convergence of Supercomputing into Big Data Semantic Web as a new HPC domain? Mission Data-Centric Parallel Programming Models DreamCloud project approach SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 18

19 4 Data-Centric Parallel Programming Models State-Of-The-Art: Distributed Memory Processing over FS Programming models: MapReduce data-centric fault-tolerant poor sustainable performance restrictive key/value model SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 19

20 4 Data-Centric Parallel Programming Models The Message-Passing Interface Data-driven scenarios with MPI easy to integrate high performance SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 20

21 4 Data-Centric Parallel Programming Models The Message-Passing Interface Issues lack of implementations for the programming languages typically used in the data-centric communities, such as Java Java implementations (MPJExpress) native implementations (mpijava) full MPI-2 standard implementation issues with supporting new high-speed interconnects (e.g. Cray), related to the JVM scalability to a peta/exaflop? support of native tools for parallel computing, i.e. debuggers, error detectors, etc. JNI is used for calling communication libraries that are available in native codes (i.e. highly optimized MPI comm.) integration with a native MPI library is not easy but if you got it running, very enjoyable performance SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 21

22 4 Data-Centric Parallel Programming Models The Message-Passing Interface mpijava Architecture import mpi.*; Applications Java wrappers JNI C Interface Java MPI bindings Open MPI MPICH Native (C) MPI implementation SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 22

23 4 Data-Centric Parallel Programming Models The Message-Passing Interface Java bindings for Open MPI (ompijava) in the Open MPI s trunk! SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 23

4 Data-Centric Parallel Programming Models Developments @ HLRS ompijava

24 4 Data-Centric Parallel Programming Models HLRS ompijava performance P2P communication SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 24

4 Data-Centric Parallel Programming Models Application Scenarios Random Indexing for Large Texts A Statistical Distribution technique for word/text

25 4 Data-Centric Parallel Programming Models Application Scenarios Random Indexing for Large Texts A Statistical Distribution technique for word/text similarity analysis Docs Subsetting Terms Occurance vector Occurance vector Similarity index Query Expansion SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 25

26 OUTLINE About us Convergence of Supercomputing into Big Data Semantic Web as a new HPC domain? Mission Data-Centric Parallel Programming Models DreamCloud project approach SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 26

27 5 DreamCloud Project Approach Problem statement Application I/O bound Communication-bound HPC Resource Manager Infrastructure RAM ccnuma CPU NUMA node NUMA node I/O SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 27

28 5 DreamCloud Project Approach DreamCloud solution Application Scalable Low overhead Flexible Unified Performance Monitoring Framework Dynamic Scheduler Infrastructure SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 28

29 5 DreamCloud Project Approach Available application profiling tools Valgrind general analysis Likwid energy/power consumption Vampir/Paraver communication Wireshark networking infrastructure Infrastructure Monitoring Frameworks Zabbix Nagious Excess Problem: A very high user s involvement Key solution: Consolidation and Integration! SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 29

5 DreamCloud Project Approach Consolidated View Legenda: communication streams Total execution time Total energy consumption High-level view to be targeted by D3.

30 5 DreamCloud Project Approach Consolidated View Legenda: communication streams Total execution time Total energy consumption High-level view to be targeted by D3.2 Communicationintensive tasks 1 2 I/O-intensive tasks 4 Memoryintensive tasks 5 Pattern-based view 3 Disk RAM Detailed view SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 30

31 5 DreamCloud Project Approach Architecture Querying Interface (RESTful) Monitoring Server (Elastic Search) EXCESS monitoring agent PAPI RAPL... Node 1 3rd-party tools... Node N SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 31

32 5 DreamCloud Project Approach Performance Profiles Static (prior to the execution) Dynamic (collected at runtime) Performance Counters (PAPI etc.) Energy Counters (RAPL etc.) User-defined ones (e.g. progress tracker) SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 32

33 5 DreamCloud Project Approach Basic Metrics SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 33

34 5 DreamCloud Project Approach Power consumption SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 34

5 DreamCloud Project Approach Performance Profiles Representation in DreamCloud Performance Profiles(Example in CSV) timestamp;node;task;papi_tot_ins:cpu0;papi_tot_ins:cpu1 1418120193.

35 5 DreamCloud Project Approach Performance Profiles Representation in DreamCloud Performance Profiles(Example in CSV) timestamp;node;task;papi_tot_ins:cpu0;papi_tot_ins:cpu ;node_1[perf];task_1; ; ;node_2[perf];task_1;663789; ;node_3[balanced];task_1; ; ;node_3[energy];task_2_1; ; ;node_9[perf];task_2_1;663789; ;node_3[perf];task_2_2; ; ;node_9[perf];task_2_2;663789; Visualization SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 35

5 DreamCloud Project Approach Molecular Dynamics (MD) Simulation Code MS2 Simulation of movement of atoms and molecules Massively parallel, on a per-particle basis Persistent Task Adaptation

36 5 DreamCloud Project Approach Molecular Dynamics (MD) Simulation Code MS2 Simulation of movement of atoms and molecules Massively parallel, on a per-particle basis Persistent Task Adaptation Adaptation Phase Phase Adapt State Generation Generation Phase Phase Generate Individuals Evaluation Evaluation Phase Phase Generate Input Data Sets Compute Intensive (MPI) Exit Check Termination MS 2 MS 2 MS 2 MS 2 Rank Individuals Calculate Fitness Values SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 36

37 5 DreamCloud Project Approach Submission Request jdl Tasks Description Workflow Description Optimization Criteria Application 1 Submission Interface Deployment Workflow Manager RTE Scheduling Interface Progress Tracker Resource Manager Scheduling 2 Request Deployment 7 Plan 3 4 Scheduling Advisor 5 6 Heuristics Module Node Node Node Node Node Node Node Node Node Node Monitoring Framework Performance Profiles Infrastructure SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 37

38 6 Conclusion Main Results HPC is going to face new challenges related to data-centric application expansion. Parallel programming models (mainly MapReduce and MPI) are the key enablers of HPC to data-centric applications Reaching near-peak performance is going to be the major challenge Future Work Promote existing technologies, such as MPI, to solving new challenges, such as Big Data. Making existing framework more data-centric. SEMAPRO 2015 :: :: Dr. Alexey Cheptsov 38

Cognitive-based Computation, Semantic Understanding, and Web Wisdom

Panel SEMAPRO/ADVCOMP/DATA ANALYTICS, Porto, 01.10.2013 Reutlingen University Cognitive-based Computation, Semantic Understanding, and Web Wisdom Moderation: Fritz Laux, Reutlingen University, Germany