Big Data mais basse consommation L apport des processeurs manycore

Similar documents
Kalray MPPA Manycore Challenges for the Next Generation of Professional Applications Benoît Dupont de Dinechin MPSoC 2013

GPU Programming. Lecture 1: Introduction. Miaoqing Huang University of Arkansas 1 / 27

A Closer Look at the Epiphany IV 28nm 64 core Coprocessor. Andreas Olofsson PEGPUM 2013

Porting Linux to a New Architecture

Dr. Yassine Hariri CMC Microsystems

Supercomputing with Commodity CPUs: Are Mobile SoCs Ready for HPC?

GPUs and GPGPUs. Greg Blanton John T. Lubia

How to build a Megacore microprocessor. by Andreas Olofsson (MULTIPROG WORKSHOP 2017)

THE PATH TO EXASCALE COMPUTING. Bill Dally Chief Scientist and Senior Vice President of Research

There s STILL plenty of room at the bottom! Andreas Olofsson

March 10 11, 2015 San Jose

Ultra Low Power GPUs for Wearables

The Mont-Blanc approach towards Exascale

GPU Computation Strategies & Tricks. Ian Buck NVIDIA

Lecture 1: Gentle Introduction to GPUs

From Brook to CUDA. GPU Technology Conference

Parallel Architectures

Intel Many Integrated Core (MIC) Matt Kelly & Ryan Rawlins

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

Introduction to GPU computing

Exploring System Coherency and Maximizing Performance of Mobile Memory Systems

Outline Marquette University

Introduction: Modern computer architecture. The stored program computer and its inherent bottlenecks Multi- and manycore chips and nodes


Higher Level Programming Abstractions for FPGAs using OpenCL

LOW POWER LOW LATENCY VIDEO CODEC LOT 82

Power dissipation! The VLSI Interconnect Challenge. Interconnect is the crux of the problem. Interconnect is the crux of the problem.

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

Each Milliwatt Matters

Vector Architectures Vs. Superscalar and VLIW for Embedded Media Benchmarks

COMPUTING ELEMENT EVOLUTION AND ITS IMPACT ON SIMULATION CODES

Administrivia. HW0 scores, HW1 peer-review assignments out. If you re having Cython trouble with HW2, let us know.

Porting Linux to a New Architecture

Lecture 9: MIMD Architectures

At the heart of a new generation of data center infrastructures and appliances. Sept 2017

Lecture 9: MIMD Architecture

Building blocks for 64-bit Systems Development of System IP in ARM

MIMD Overview. Intel Paragon XP/S Overview. XP/S Usage. XP/S Nodes and Interconnection. ! Distributed-memory MIMD multicomputer

Complexity and Advanced Algorithms. Introduction to Parallel Algorithms

ECE 8823: GPU Architectures. Objectives

Memory Systems IRAM. Principle of IRAM

An Alternative to GPU Acceleration For Mobile Platforms

Building High Performance, Power Efficient Cortex and Mali systems with ARM CoreLink. Robert Kaye

Growth outside Cell Phone Applications

GPU for HPC. October 2010

Approaches to Parallel Computing

Gen-Z Memory-Driven Computing

Course II Parallel Computer Architecture. Week 2-3 by Dr. Putu Harry Gunawan

Intel Many Integrated Core (MIC) Architecture

BREAKING THE MEMORY WALL

Massively Parallel Computing on Silicon: SIMD Implementations. V.M.. Brea Univ. of Santiago de Compostela Spain

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono

Parallel Computing: Parallel Architectures Jin, Hai

A Building Block 3D System with Inductive-Coupling Through Chip Interfaces Hiroki Matsutani Keio University, Japan

Welcome. Altera Technology Roadshow 2013

RISC Processors and Parallel Processing. Section and 3.3.6

The Return of Innovation. David May. David May 1 Cambridge December 2005

HPC Technology Trends

An Evaluation of an Energy Efficient Many-Core SoC with Parallelized Face Detection

Leveraging OpenSPARC. ESA Round Table 2006 on Next Generation Microprocessors for Space Applications EDD

MELLANOX EDR UPDATE & GPUDIRECT MELLANOX SR. SE 정연구

On the Comparative Performance of Parallel Algorithms on Small GPU/CUDA Clusters

GPU Architecture. Alan Gray EPCC The University of Edinburgh

High Performance Computing: Blue-Gene and Road Runner. Ravi Patel

The Future of GPU Computing

Manycore and GPU Channelisers. Seth Hall High Performance Computing Lab, AUT

Parallel Computer Architecture Concepts

WHY PARALLEL PROCESSING? (CE-401)

Godson Processor and its Application in High Performance Computers

CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI.

EXASCALE COMPUTING ROADMAP IMPACT ON LEGACY CODES MARCH 17 TH, MIC Workshop PAGE 1. MIC workshop Guillaume Colin de Verdière

ARM Multimedia IP: working together to drive down system power and bandwidth

Technology Trends Presentation For Power Symposium

Copyright 2012, Elsevier Inc. All rights reserved.

GPU > CPU. FOR HIGH PERFORMANCE COMPUTING PRESENTATION BY - SADIQ PASHA CHETHANA DILIP

What is GPU? CS 590: High Performance Computing. GPU Architectures and CUDA Concepts/Terms

Cell Broadband Engine. Spencer Dennis Nicholas Barlow

Altera SDK for OpenCL

24th MONDAY. Overview 2018

The Road to ExaScale. Advances in High-Performance Interconnect Infrastructure. September 2011

Power Technology For a Smarter Future

CS 590: High Performance Computing. Parallel Computer Architectures. Lab 1 Starts Today. Already posted on Canvas (under Assignment) Let s look at it

Fault Tolerant and Secure Architectures for On Chip Networks With Emerging Interconnect Technologies. Mohsin Y Ahmed Conlan Wesson

Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP

Energy Efficient Computing Systems (EECS) Magnus Jahre Coordinator, EECS

CUDA Programming Model

Parallel Computer Architectures. Lectured by: Phạm Trần Vũ Prepared by: Thoại Nam

An Introduction to Graphical Processing Unit

Chapter 0 Introduction

Memory Industry Report T A B L E O F C O N T E N T S

Computer Systems Architecture

Parallella: A $99 Open Hardware Parallel Computing Platform

Sort vs. Hash Join Revisited for Near-Memory Execution. Nooshin Mirzadeh, Onur Kocberber, Babak Falsafi, Boris Grot

Memory Industry Dynamics & Market Drivers Memory Market Segments Memory Products Dynamics Memory Status 2007 Key Challenges for 2008 Future Trends

Introduction to parallel computers and parallel programming. Introduction to parallel computersand parallel programming p. 1

Introduction II. Overview

Motivation for Parallelism. Motivation for Parallelism. ILP Example: Loop Unrolling. Types of Parallelism

Parallel Architecture. Hwansoo Han

Supercomputing and Mass Market Desktops

Multiprocessors & Thread Level Parallelism

Transcription:

Big Data mais basse consommation L apport des processeurs manycore Laurent Julliard - Kalray Le potentiel et les défis du Big Data Séminaire ASPROM 2 et 3 Juillet 2013

Presentation Outline Kalray : the company, products and markets The rising of Manycore architectures Feartures and benefits Using Manycore processors in Big Data Opportunities and customer cases Q&A 2013 Kalray SA All Rights Reserved Confidential Information 2

Kalray at A Glance Founded in 2008 located in Paris, Grenoble (France) & Tokyo (Japan) 55+ people Independent and unique technology : MPPA MANYCORE processor (Multi-Purpose Processing Array) and software programming environment Targeting the industrial, embedded and computing intensive markets Large patent portfolio Several awards over the past years Kalray ranked in EETimes silicon 60 : Hot start up to watch in 2012 Best technology award from Les Trophées de l Embarqué in 2012 Startup of the year by ElectroniqueS magazine in 2013 2013 Kalray SA All Rights Reserved Confidential Information 3

First MPPA -256 Chips with CMOS 28nm TSMC High processing performance 700 GOPS 230 GFLOPS Low power consumption - 5W High execution predictability Released November 2012 Software programmable 2013 Kalray SA All Rights Reserved Confidential Information 4

KALRAY, a global solution Powerful, Low Power and Programmable Processors C/C++ based Software Development Kit (SDK) for massively parallel programing Development platform Reference Design Board BOARDS Reference Design board Application specific boards Multi-MPPA or Single-MPPA boards 2013 Kalray SA All Rights Reserved Confidential Information 5

Target Application Areas INTENSIVE COMPUTING Finance Numerical Simulation Geophysics Life sciences IMAGE & VIDEO Broadcast Medical Imaging Digital Cinema Augmented reality Vision EMBEDDED SYSTEMS Signal Processing Aerospace/Defence Transport Industrial Automation Video Protection TELECOM / NETWORKING Packet Switching Network Optimisation Security Services Software Defined Radio Software Defined Network 2013 Kalray SA All Rights Reserved Confidential Information 6

MPPA MANYCORE Roadmap Architecture scalability for high performances and low power Q4 2012 Q2 2014 Q2 2015 MPPA -1024 Low Power 7 W MPPA -256 V1 MPPA -256 V2 Low Power 5 W MPPA -64 Very Low Power 75 mw - 1,8 W 1 st core generation 50 GFLOPS/W 2 nd core generation 80 GFLOPS/W 3 rd core generation 100 GFLOPS/W 2013 Kalray SA All Rights Reserved Confidential Information 7

The rising of Manycore architectures

3 decades of microprocessor trend 2013 Kalray SA All Rights Reserved Confidential Information 11

Even more cores on a single chip 2008 2013 1971 2004 SMP 10 1 cores Manycore 10 3 cores Megacore 10 6 cores Monocore 10 0 cores 2013 Kalray SA All Rights Reserved Confidential Information 12

Multicore CPUs vs GPUs vs Manycore Control ALU ALU ALU ALU Cache DRAM CPU GPU Manycore CPU are optimized for sequential code performance Sophisiticated Control Logic to execute several instruction at the same time Very large cache to reduce access time to instruction and data of complex applications Clock frequency limits reached long ago 50 to 150 Watt CONSTRAINED BY Power consumption, Complexity, Scalability Originates from the video game industry Very simple control units and huge number of floating point units working in parallel SIMD processing model Smaller cache than CPUs 20 to 300 Watt CONSTRAINED BY Power consumption, Programming model, Communication overhead Rich CPU-like Control units + FPU Cluster of processors share a local memory Clusters communicate through a high speed low latency network on chip MIMD paradigm (Multiple instructions Multiple Data) rather than MIMD 5 to 10 Watt CONSTRAINED BY Disruptive technology adoption curve 2013 Kalray SA All Rights Reserved Confidential Information 13

MPPA Technology Compared to GPU & CPU MPPA-256 20pJ/Instruction Optimized for GFlops/Watt and $ Distributed Memory model Low Latency GPU 200pJ/Instruction CPU 2000pJ/Instruction 230 GFlops 5 W ~50 3000 GFlops 300 W ~10 Source: Bill Dally, To ExaScale and Beyond - NVidia 700 GFlops 130 W ~5 2013 Kalray SA All Rights Reserved Confidential Information 14

Manycore opportunities in Big Data

Be more efficient remotely Computing efficiency (GFlops/Watt) Energy consumption becomes an absolute barrier whether in high-end embedded sytems or data-centers Hardware efficiency (GFlops/$) Remove unnecessary hardware overhead Bandwidth (MB/s) Bring data in and out fast and avoid bus bottleneck 2013 Kalray SA All Rights Reserved Confidential Information 17

Do more locally Process data at the source Bring more intelligence right next to the data generator Imperatives : more Gflop/W and Gflop/$ Use new «online» information processing models Learn from data and predict continuously and in real time (e.g. HTM/CLA) 01110 10100 11011 Benefits Save on the global system cost (e.g. lighter network infrastructure) Save on energy : transmitting and processing date remotely takes 100 to 1000 times more energy than doing it locally Adapt local acquisition and screening strategies 2013 Kalray SA All Rights Reserved Confidential Information 18

MPPA for video protection Improved Content analysis High resolution camera / low false detection rate Robust algorithms high performance computing of MPPA Real Time detection More simple infrastructure Compute power at the source System integration: Ethernet input / decode / content analysis / encode Multi camera system on single chip MPPA 10GBe decode decode decode Content analysis Content analysis Content analysis 2013 Kalray SA All Rights Reserved Confidential Information 19

2013 Kalray SA All Rights Reserved Confidential Information 20

Q&A www.kalray.eu - info@kalray.eu