IN11E: Architecture and Integration Testbed for Earth/Space Science Cyberinfrastructures


A Future Accelerated Cognitive Distributed Hybrid Testbed for Big Data Science Analytics

Milton Halem (1), John Edward Dorband (1), Navid Golpayegani (2), Smriti Prathapan (1), Yin Huang (1), Timothy Blattner (3) and Kaushik Velusamy (1)
(1) University of Maryland Baltimore County, Computer Science, Baltimore, MD, United States; (2) NASA Goddard Space Flight Center, Greenbelt, MD, United States; (3) National Institute of Standards and Technology, Gaithersburg, MD, United States

What is cognitive computing? Cognition is the process of acquiring knowledge and understanding through thought, experience, observation and the other senses; it encompasses integrating reasoning, memory, language and tools for attaining knowledge from information. Cognitive computing is the simulation of human learning processes by employing accelerated computing architectures that mimic the human brain. These components address problems characterized by ambiguity and uncertainty. Cognitive systems integrate diverse sources of massive information using machine learning algorithms to find patterns and then apply those patterns to respond to needs. Cognitive applications include using machine learning to regress flux measurements from long records, or integrating a variety of sensor data and reanalysis model output along with proposition-based rules to infer trends and issue warnings or recommendations to decision makers.

Basic general purpose compute cluster

THE IBM IDATAPLEX CLUSTER: BLUEWAVE The Bluewave cluster, originally known at the NASA Center for Climate Simulation (NCCS) as Discover Scalable Unit 8, was competitively awarded by NASA under the congressional Stevenson-Wydler surplus system act and came to CHMPR at UMBC in late 2013. The acquired NCCS system consisted of 6 racks with 512 nodes, of which 4 racks are employed by UMBC/CHMPR. Now called Bluewave, the system currently consists of 336 nodes, which include management nodes, storage nodes, and SLURM client compute nodes. Most nodes run a Linux image based on CentOS 6.5. Each node has 2 quad-core Intel Nehalem processors (Xeon X5560), 24 GB of RAM and 250 GB of local storage. Bluewave includes a 32-node Hadoop cloud sub-cluster, with each node containing 1 TB of local disk storage. In addition, 4 nodes with 2 octo-core Intel Sandy Bridge processors and 4 Intel Many Integrated Core (Xeon Phi) co-processors are currently being integrated into the system. The cluster is used by graduate students and faculty both for research and for integrating and testing high-performance specialized prototype computing components for Big Data analytics.

A State-of-the-Art Accelerated Cognitive Testbed (ACT) for CHMPR Big Data Science. Two IBM Minsky nodes, an acquired prototype of the processors in the next-generation 250-petaflop IBM Summit system, have been integrated into Bluewave. Each node has dual POWER8+ processors with 10 cores at 3.0 GHz, 1 TB of RAM, and four NVIDIA Tesla P100 GPUs (3,584 CUDA cores each), with two NVLink connections from the POWER8+ to each pair of GPUs. In addition, an IBM FlashSystem A900 with 30 TB of flashram, expandable to 40 TB, is connected to the Minsky nodes through a Coherent Accelerator Processor Interface (CAPI). The A900 flashram system supports both nodes, and the partitioning of the flashram is programmable. An additional 8 TB of SSD attached through an NVMe card is available as local storage.

Acquired Two IBM Minsky Nodes. Each node contains:
- 2 upgraded POWER8 processors, 10 cores each, 8 threads/core, 3.2 GHz
- 4 NVIDIA P100 GPUs with 2 NVLink connections each
- 1 TB of DRAM
- Non-volatile memory card; 4 SSDs with 4 TB each
- Memory bandwidth of 115 GB/s
- InfiniBand
In addition:
- CAPI (Coherent Accelerator Processor Interface) FPGA
- 20 TB of flash memory

NVIDIA's First GPU-to-GPU and GPU-to-CPU High-Speed Interconnect. Accelerated computing with increasing numbers of highly dense GPU server nodes has become the de facto standard for attaining HPC ratings. Interconnect bandwidth between GPUs is the most significant bottleneck to application performance. NVLink is the first high-speed interconnect for NVIDIA GPUs that addresses this interconnect problem. Four NVLink connections per GPU deliver a total of 160 GB/s of bidirectional bandwidth for the Tesla P100, over 5x the bandwidth of PCI Express Gen3.
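To make the GPU-to-GPU link concrete, the short C host-code sketch below uses the standard CUDA runtime API to discover and enable peer-to-peer access between GPU pairs; on a Minsky node, transfers between NVLink-connected peers then travel over NVLink rather than PCIe. This is an illustrative sketch only; the device numbering and the lack of error handling are assumptions, not the testbed's actual code.

```c
#include <stdio.h>
#include <cuda_runtime.h>

/* Minimal sketch: enable peer-to-peer access between every pair of GPUs.
 * Once enabled, cudaMemcpyPeer() and direct loads/stores from kernels on
 * one GPU to memory on another can use the NVLink path where it exists. */
int main(void)
{
    int ndev = 0;
    cudaGetDeviceCount(&ndev);

    for (int src = 0; src < ndev; ++src) {
        cudaSetDevice(src);
        for (int dst = 0; dst < ndev; ++dst) {
            if (src == dst) continue;
            int can = 0;
            cudaDeviceCanAccessPeer(&can, src, dst);
            if (can) {
                cudaDeviceEnablePeerAccess(dst, 0);
                printf("GPU %d -> GPU %d: peer access enabled\n", src, dst);
            }
        }
    }
    return 0;
}
```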

[Slide graphic courtesy of NVIDIA]

STANDARD DISK STORAGE - 107 TB Network-Attached Storage (NAS). The NAS has 3 components: Store1 - 43 TB, Store2 - 21 TB, Store3 - 43 TB. Each node in Bluewave can access these three stores for external persistent storage.

IBM Flashram Storage System 30TB

IBM Flashram as Cognitive Memory. IBM did not use traditional off-the-shelf SSDs, but created its own flash modules. Flashram is 3x-4x faster than SSDs since it is tied directly to the motherboard, similar to DRAM. Flash and SSDs are doing to hard-drive arrays what hard-drive arrays did to tape.

Flashram Storage as a Cognitive Memory. IBM claims all-flash arrays are cognitive in design and operation, and include intelligent storage automation that enables data to move from flash to other storage media within an array. Moving data from tier to tier is handled by the storage management system, which evaluates and learns data access patterns.

Seagate: The Holy Grail of Active Storage Research. Acquired an object-oriented key-value Seagate Kinetic disk system consisting of 2 chassis, each with 12 disks. Each disk has 4 TB and its own Ethernet cable, IP address and local processor; total storage is 96 TB. Harnessing the on-drive processors allows an object storage platform to delegate intelligent functionality to the drives, which could run their own virus scanning, content discovery, and compression functions. Developed the Lightweight Virtual File System (LVFS), which separates data retrieval from metadata information. Performed the first-ever satellite data processing directly on the active Kinetic disks, using the MapReduce parallel programming model to grid 2 years of global OCO-2 CO2 data. Interfacing the active Kinetic disks to the IBM iDataPlex for image stitching, deep belief nets, and variable-block-size out-of-core matrix multiplication.
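The gridding of OCO-2 data mentioned above amounts to a map step that routes each sounding to a latitude/longitude cell and a reduce step that averages the values landing in each cell. Below is a minimal C sketch of that reduce logic; the 1-degree grid, the Sounding record, and the field names are illustrative assumptions, not the project's actual LVFS/Kinetic code.

```c
#include <stdio.h>

#define NLAT 180            /* 1-degree global grid: assumption for illustration */
#define NLON 360

typedef struct { double lat, lon, xco2; } Sounding;   /* hypothetical record */

/* Reduce phase of a MapReduce-style gridding job: accumulate the soundings
 * routed to this process into per-cell sums and counts, then emit cell means. */
void grid_soundings(const Sounding *s, long n, double mean[NLAT][NLON])
{
    static double sum[NLAT][NLON];   /* per-process accumulation buffers */
    static long   cnt[NLAT][NLON];

    for (long i = 0; i < n; ++i) {
        int row = (int)(s[i].lat + 90.0);    /*  -90..90  -> 0..179 */
        int col = (int)(s[i].lon + 180.0);   /* -180..180 -> 0..359 */
        if (row < 0 || row >= NLAT || col < 0 || col >= NLON) continue;
        sum[row][col] += s[i].xco2;
        cnt[row][col] += 1;
    }
    for (int r = 0; r < NLAT; ++r)
        for (int c = 0; c < NLON; ++c)
            mean[r][c] = cnt[r][c] ? sum[r][c] / cnt[r][c] : 0.0;
}

int main(void)
{
    /* Two made-up soundings that fall into the same 1-degree cell. */
    Sounding demo[2] = { { 39.0, -76.6, 404.1 }, { 39.2, -76.5, 405.3 } };
    static double mean[NLAT][NLON];
    grid_soundings(demo, 2, mean);
    printf("cell(129,103) mean xco2 = %.2f ppm\n", mean[129][103]);
    return 0;
}
```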

REMOTE ACCESS TO THE D-WAVE 2X. CHMPR has access to the D-Wave 2X at the NASA Ames Research Center (ARC). The D-Wave 2X is the only commercially available quantum computer. We view the D-Wave not as a general-purpose machine, but as a hybrid quantum annealing coprocessor. The D-Wave 2X consists of ~1,152 qubits arranged in an 8 x 12 x 12 qubit array. At SC16, D-Wave announced the D-Wave 3X with ~2,048 qubits, to be delivered in the 1st quarter of 2017. We plan to couple Bluewave to the D-Wave 3X over Internet2 as an accelerator for optimization.
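For context on what such an annealing coprocessor optimizes: a D-Wave machine minimizes an Ising-model energy over binary spin variables, and a problem is offloaded by choosing the coefficients. This is the standard formulation for quantum annealers in general, not something specific to this testbed:

$$E(\mathbf{s}) \;=\; \sum_i h_i\, s_i \;+\; \sum_{i<j} J_{ij}\, s_i s_j, \qquad s_i \in \{-1, +1\},$$

where the biases $h_i$ and the couplings $J_{ij}$ (restricted to the qubit pairs physically connected on the chip) encode the optimization problem that Bluewave would hand off.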

PGI OpenACC and NIST HTGS Software. Installing the Portland Group Inc. (PGI) OpenACC compilers for moving C, C++ and Fortran 90 codes onto the GPUs. Implementing the Hybrid Task Graph Scheduler (HTGS) developed at NIST to improve programmer productivity when implementing and optimizing parallel algorithms that fully utilize the multicore CPUs and multiple GPUs, while managing dependencies and data locality and overlapping data motion with computation.
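For a flavor of the OpenACC directive model mentioned above, the minimal C sketch below offloads a generic SAXPY loop to the GPU; it illustrates how the PGI OpenACC compilers are used (e.g., pgcc -acc), and is not one of the testbed's applications.

```c
#include <stdio.h>
#include <stdlib.h>

/* Minimal OpenACC example: y = a*x + y offloaded to a GPU.
 * The data clauses describe what to copy to and from device memory;
 * the loop iterations are distributed across the GPU by the compiler. */
void saxpy(int n, float a, const float *x, float *y)
{
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    const int n = 1 << 20;
    float *x = malloc(n * sizeof *x);
    float *y = malloc(n * sizeof *y);
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    saxpy(n, 3.0f, x, y);            /* build e.g.: pgcc -acc example.c */
    printf("y[0] = %f (expect 5.0)\n", y[0]);

    free(x);
    free(y);
    return 0;
}
```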

Summary. We presented an existing accelerated cognitive distributed hybrid testbed ready for Big Data science analytics. We wish to publicly acknowledge the following organizations for enabling the provisioning of this prototype cognitive architecture at UMBC for future Big Data science: NASA/GSFC, IBM, Seagate, D-Wave, NASA/AIST, NSF/IUCRC.

Thank You (4/3/2017)