SpiNNaker: a Neuromorphic Supercomputer. Steve Temple, University of Manchester, UK. SOS21, 21 Mar 2017

Outline of talk
- Introduction
- Modelling neurons
- Architecture and technology
- Principles of operation
- Summary / Future plans
- [Short (2 min) video]

So what is...
Neuromorphic computing / engineering?
- Has its origins in the work of Carver Mead at Caltech in the late 1980s
- Originally coined for VLSI analogue systems that mimic neural biology
- Latterly, any brain-inspired hardware computation system or technology
SpiNNaker?
- An experimental neuromorphic computing platform for modelling spiking neurons
- Based around a custom multi-core ASIC
- Scalable from a few chips to around 1 million cores
- Targets large-scale brain models

SpiNNaker Motivation
- Can massively-parallel computing resources accelerate our understanding of brain function?
- Can our growing understanding of brain function point the way to more efficient parallel, fault-tolerant computation?

Neurons
- Neurons are the single-celled building blocks of the brain (cf. logic gates)
- Multiple inputs (dendrites)
- Single output (axon)
- High fanout (average 7,000 in the human brain)
- Communicate using spikes (or action potentials)

Brains
Brain behaviour
- Adaptive, tolerant of component failure
- Performs tasks that conventional computers find hard
- Autonomous learning
- Complex pattern recognition
- Inference
Brain characteristics
- Massive parallelism (10^11 neurons in the human brain)
- Massive connectivity (10^15 synapses in the human brain)
- Excellent power efficiency (~25 W in the human brain)
- Low-speed components (~10 Hz)
- Low-speed communication (~metres/sec)

Neural Computation
- The cell body contains short-term state (the membrane potential); a spike is output when a threshold is reached (see the sketch below)
- The membrane potential is modified by the arrival of spikes
- Neuron-to-neuron connection takes place at a synapse
- The effect of each spike is modulated by a weight in each synapse; the weights can change over time (learning!)
- There is a delay (N ms) as the spike traverses the axon
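To make that update step concrete, here is a minimal leaky integrate-and-fire (LIF) sketch in C. It is illustrative only, not the neuron code that ships with SpiNNaker: the type, function names and parameter values are assumptions, and synaptic delays and weight learning are omitted.

```c
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    float v;          /* membrane potential (mV)            */
    float v_rest;     /* resting potential (mV)             */
    float v_thresh;   /* firing threshold (mV)              */
    float v_reset;    /* reset potential after a spike (mV) */
    float decay;      /* per-timestep leak factor (0..1)    */
} neuron_t;

/* Advance one neuron by one timestep. 'input' is the summed, weight-scaled
 * effect of the spikes that arrived this timestep. Returns true if the
 * neuron crosses threshold and emits a spike. */
static bool neuron_update(neuron_t *n, float input)
{
    /* Leak towards the resting potential, then add the synaptic input. */
    n->v = n->v_rest + (n->v - n->v_rest) * n->decay + input;

    if (n->v >= n->v_thresh) {
        n->v = n->v_reset;   /* reset after the spike */
        return true;         /* spike is output       */
    }
    return false;
}

int main(void)
{
    neuron_t n = { .v = -65.0f, .v_rest = -65.0f, .v_thresh = -50.0f,
                   .v_reset = -65.0f, .decay = 0.9f };

    for (int t = 0; t < 20; t++) {
        bool spiked = neuron_update(&n, 2.0f);   /* constant input drive */
        printf("t=%2d  v=%6.2f mV%s\n", t, n.v, spiked ? "  spike!" : "");
    }
    return 0;
}
```

Driving the neuron with a constant input, as in main(), shows the membrane potential climbing from rest, crossing the threshold, and being reset when a spike is emitted.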

SpiNNaker is...
- A scalable hardware/software architecture for modelling spiking neural networks in real time
- Based around a custom chip
- Scalable to 64K chips (> 10^6 cores)
- Special packet-routing hardware for modelling spiking networks
- Low power (around 1 W per chip)
- Real-time, e.g. for robotic applications
- Flexible: mostly software, not tied to a particular modelling style or level

Chip Architecture
- 6 inter-chip links for packet communication (~6 M spikes/sec each)
- Spike packet router
- 18 ARM968 cores, each with 96 KB SRAM, clocked at 200 MHz
- 128 MB SDRAM for neural weight storage
- Ethernet for system I/O

Chip Interconnection
- Chips laid out in a 2D grid and connected to 6 neighbours
- Scalable to 64K chips
- Edges can be wrapped to make a toroid (see the sketch below)
- Packets (spikes) pass from chip to chip via the inter-chip links
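A minimal sketch of the wrapped 2D grid, assuming the commonly described six link directions (E, NE, N, W, SW, S); the link numbering and coordinate conventions of a real machine may differ.

```c
#include <stdio.h>

#define GRID_W 256   /* 256 x 256 = 64K chips */
#define GRID_H 256

/* (dx, dy) offsets for the six inter-chip links: E, NE, N, W, SW, S. */
static const int dx[6] = { +1, +1,  0, -1, -1,  0 };
static const int dy[6] = {  0, +1, +1,  0, -1, -1 };

/* Coordinates of the neighbour reached from chip (x, y) via link 'link',
 * wrapping at the edges so that the mesh closes into a toroid. */
static void neighbour(int x, int y, int link, int *nx, int *ny)
{
    *nx = (x + dx[link] + GRID_W) % GRID_W;
    *ny = (y + dy[link] + GRID_H) % GRID_H;
}

int main(void)
{
    int nx, ny;
    neighbour(0, 0, 3, &nx, &ny);                /* link 3 = West from (0,0) */
    printf("(0,0) --W--> (%d,%d)\n", nx, ny);    /* wraps to (255,0)         */
    return 0;
}
```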

SpiNNaker Software Architecture
- Each SpiNNaker core models a number of neurons (typically 10-1000)
- A neuron model which spikes sends a packet containing the source ID of that neuron (source routing; see the sketch below)
- A spike/packet is routed to all cores modelling neurons which connect to the spiking neuron
- Biological delays are modelled in software
- Only simple, abstract models can be efficiently implemented
- All state is kept in RAM
- All coding is done in C (and a little assembler)
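The following toy C program sketches this event-driven flow: a spiking neuron sends only its routing key, and a receiving core's packet handler accumulates synaptic input for its local neurons. All names here (make_key, send_mc_packet, on_packet_received) are illustrative stand-ins, not the real SpiNNaker run-time API, and delays and the SDRAM weight lookup are omitted.

```c
#include <stdint.h>
#include <stdio.h>

#define NEURONS_PER_CORE 4          /* tiny toy size for the example */

/* A routing key might combine the source core ID with the local neuron
 * index; the exact field layout is illustrative only. */
static inline uint32_t make_key(uint32_t core_id, uint32_t neuron_idx)
{
    return (core_id << 8) | (neuron_idx & 0xFF);
}

/* Per-target input accumulated on the receiving core this timestep. */
static float input_current[NEURONS_PER_CORE];

/* Toy synaptic weights from one particular source to our local neurons
 * (0.0f means "not connected"); on the real machine these live in SDRAM. */
static const float weight_from_source[NEURONS_PER_CORE] = { 0.5f, 0.0f, 1.2f, 0.0f };

/* Handler run when a spike packet arrives at this core. */
static void on_packet_received(uint32_t key)
{
    (void)key;   /* a real handler would use the key to select a weight row */
    for (int i = 0; i < NEURONS_PER_CORE; i++)
        input_current[i] += weight_from_source[i];
}

/* Stand-in for injecting a packet into the router fabric; here it simply
 * delivers the packet straight to the local handler. */
static void send_mc_packet(uint32_t key)
{
    on_packet_received(key);
}

int main(void)
{
    /* Neuron 2 on core 7 fires: send its ID and let targets accumulate input. */
    send_mc_packet(make_key(7, 2));

    for (int i = 0; i < NEURONS_PER_CORE; i++)
        printf("target %d: input %.2f\n", i, input_current[i]);
    return 0;
}
```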

Spiking Neural Simulation [diagram: neurons connected via a synapse]

SpiNNaker Chip (Mar 2011)
- 130 nm process, 10 x 10 mm
- 18 ARM9 cores, each with 96 KB SRAM
- Spike packet router
- SDRAM controller
- Asynchronous GALS NoC
- External 128 MB SDRAM in the same package
- 300-pin plastic BGA package

SpiNN-5 PCB (Oct 2013)
- 48 SpiNNaker chips
- 3 FPGAs for inter-board links
- 9 x 3 Gbit/s 'SATA' links
- Backplane connector for power
- 2 x 100 Mbit/s Ethernet ports
- Board management processor (BMP)
- Max power consumption ~60 W
- Standard Eurocard sizing

48 Chip Interconnect

HBP System (Mar 2016)
- Human Brain Project deliverable and resource
- 600 SpiNN-5 PCBs: 28,800 chips, 518,400 cores
- 1800 SATA cables
- Max power ~40 kW
- Forced-air cooling
- Dedicated machine room
- Standard servers as front-end interface
- Accessible via the HBP portal for job submission

Application Software
- Neural models running on SpiNNaker are coded in C
- Many neuroscientists don't do C (or any form of programming...)
- Must provide a way for them to describe a neural network and simulate it on SpiNNaker
- We provide a library of models exposed via high-level neural network descriptions
- PyNN is a standard language to describe networks and interact with neural simulators
- We map PyNN (and others) onto SpiNNaker

PACMAN
- PACMAN, the tool that maps the high-level network description onto the machine, currently runs on a workstation

Spike Packet Router
- A single input packet (spike) can replicate to 0-24 output ports
- CAM-based lookup based on the routing key (source neuron ID)
- CAM implementation limited to 1024 entries
- Don't-care states in the CAM bits allow output routes to be shared (see the sketch below)
- Network configuration software is responsible for setting the routing table
- Efficient use of CAM entries is essential to stay within the 1024-entry limit
- Routing delay ~200 ns per chip
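The key/mask matching can be modelled in a few lines of C. This is a software sketch of the behaviour described above, not the hardware: the field layout is assumed, and in the real router all 1024 entries are compared in parallel and unmatched packets are default-routed rather than dropped.

```c
#include <stdint.h>
#include <stdio.h>

#define RTR_ENTRIES 1024

typedef struct {
    uint32_t key;    /* value to match on the "care" bits              */
    uint32_t mask;   /* 1 = compare this bit, 0 = don't care            */
    uint32_t route;  /* one bit per output: 6 links then 18 local cores */
} rtr_entry_t;

static rtr_entry_t rtr_table[RTR_ENTRIES];   /* unused entries stay all-zero */

/* Return the output set for an incoming packet key, or 0 on a miss. */
static uint32_t route_lookup(uint32_t packet_key)
{
    for (int i = 0; i < RTR_ENTRIES; i++) {
        if (rtr_table[i].mask != 0 &&
            (packet_key & rtr_table[i].mask) == rtr_table[i].key)
            return rtr_table[i].route;       /* may set 0..24 output bits */
    }
    return 0;
}

int main(void)
{
    /* One entry covers all 256 neurons of source core 7 (the low 8 bits are
     * don't-care) and replicates the spike to link 2 and local core 3. */
    rtr_table[0].key   = 7u << 8;
    rtr_table[0].mask  = 0xFFFFFF00u;
    rtr_table[0].route = (1u << 2) | (1u << (6 + 3));

    printf("key 0x%04x -> route 0x%06x\n", 0x0712, route_lookup(0x0712)); /* hit  */
    printf("key 0x%04x -> route 0x%06x\n", 0x0812, route_lookup(0x0812)); /* miss */
    return 0;
}
```

Sharing one masked entry across a whole population of source neurons is what keeps large networks inside the 1024-entry budget.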

Spike Packet Routing

Alternative Approaches
Analogue neurons
+ Very low power
+ Neuron in a few tens of transistors
- Inflexible: model fixed in silicon
- Spike routing difficult
- Non-deterministic, hard to calibrate
Supercomputer
+ Flexible: model at many levels
+ Fast
+ Deterministic: results reproducible
- Expensive
- High power
- Spike routing slow

Summary / Future Plans
- After 10 years we have a large-scale SpiNNaker machine and over 100 smaller systems deployed worldwide
- Users are exploring various application spaces:
  - Brain modelling
  - Deep networks (using spikes)
  - Real-time Spaun (U. of Waterloo brain model)
  - Mobile apps (untethered speech/image recognition)
- SpiNNaker-2 in design phase, aiming for a 10x improvement

SpiNNaker Video https://www.youtube.com/watch?v=z1_ge_ugege