SpiNNaker - a million core ARM-powered neural HPC

1 The Advanced Processor Technologies Group SpiNNaker - a million core ARM-powered neural HPC Cameron Patterson cameron.patterson@cs.man.ac.uk School of Computer Science, The University of Manchester, UK 1

2 Outline Motivation SpiNNaker Architecture Machines Software State of the Nation Conclusions and Futures 2

3 Motivation Ubiquity of parallelism The human brain is the best example Grand challenges UK: GC5: Architecture of Brain and Mind Can we learn from the brain? As a processor technologies group 3

4 The Biological Brain Brains demonstrate: massive parallelism (10^11 neurons) massive connectivity (10^15 synapses) excellent power-efficiency low-performance components (~ 100 Hz) low-speed communication (~ metres / s) adaptivity tolerant of component failure autonomous learning 4

5 Simplified Structure 5

6 Taking Inspiration The Grand Challenges - work both ways By mimicking the brain, can we understand it? Use it to perform 'unethical' experiments Improve treatment regimes Can we learn lessons from the biology? Apply to parallel computing e.g. Energy efficiency Fault-tolerance 6

7 Artificial Neural Nets Taxonomy Three generations of neural modelling Granularity of Simulation 7

8 Outline Motivation SpiNNaker Architecture Machines Software State of the Nation Conclusions and Futures 8

9 Network Scaling Large-scale ANNs require lots of neurons SpiNNaker's aim is 1 billion plausible neurons Large-scale ANNs require lots of bandwidth In the brain discrete 'wiring' Resulting in: 10^9 neurons * 10 Hz * 10^3 synapses = 10^13 (10 trillion) network events / s Often the limiting factor for large simulations 9
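The event-rate estimate on this slide can be checked with a few lines of arithmetic. The sketch below reads the slide's figures as powers of ten (10^9 neurons, a 10 Hz mean firing rate, 10^3 synapses per neuron), which is consistent with the quoted "10 trillion" result. The variable names are illustrative only.

```python
# Back-of-envelope check of the network event rate quoted on the slide.
neurons = 10**9              # target network size (1 billion neurons)
mean_rate_hz = 10            # mean firing rate per neuron
synapses_per_neuron = 10**3  # fan-out of each neuron

# Every spike must be delivered to each of the neuron's synapses.
spikes_per_second = neurons * mean_rate_hz
events_per_second = spikes_per_second * synapses_per_neuron

print(f"{spikes_per_second:.0e} spikes/s")
print(f"{events_per_second:.0e} synaptic events/s")
```

The spike count alone is already 10^10 per second. The fan-out is what pushes the delivered-event rate to 10^13, which is why communication rather than computation tends to limit large simulations.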

10 Biology vs Electronics Luckily biology is 'slow' & electronics 'fast' This is exploited in SpiNNaker Model multiple neurons/synapses on a core SpiNNaker models neurons in software on ARM Quantity depends on fidelity required Route spikes using AER SpiNNaker has a rich interconnection fabric Supports very large number of small packets (spikes) 10

11 SpiNNaker Principles Energy frugality Low-power processors ARM, embedded GALS (Globally Asynchronous, Locally Synchronous) Event-Driven Redundancy 18 cores per chip 6 links per chip Real-time modelling 11

12 SpiNNaker Project Multi-core SpiNNaker nodes 18 ARM968 cores Programmable Interconnects Scalable up to 2^16 nodes in a system over a million processors >10^8 MIPS total 12
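As a rough sanity check of the headline numbers, the sketch below multiplies out the machine size. It assumes the node count is 2^16 (65,536), which is what "over a million processors" implies at 18 cores per node. The 16 application cores per node and ~1000 neurons per core are taken from the Neural Simulation slide, and the ~4 GIPS per chip from the Fabricated CMP slide. The constant names are illustrative.

```python
# Rough machine-scale arithmetic; all inputs are figures quoted in the slides.
NODES = 2**16               # assumed maximum node count (65,536)
CORES_PER_NODE = 18         # ARM968 cores per chip
APP_CORES_PER_NODE = 16     # cores left for applications (1 monitor, 1 spare)
NEURONS_PER_CORE = 1000     # approximate neurons simulated per core
PEAK_MIPS_PER_NODE = 4000   # ~4 GIPS peak per chip

total_cores = NODES * CORES_PER_NODE
total_neurons = NODES * APP_CORES_PER_NODE * NEURONS_PER_CORE
total_mips = NODES * PEAK_MIPS_PER_NODE

print(f"{total_cores:,} cores")
print(f"{total_neurons:,} neurons")
print(f"{total_mips:,} MIPS peak")
```

Under these assumptions the totals come out at roughly 1.18 million cores, about 1.05 billion neurons, and about 2.6 x 10^8 MIPS, in line with the "million core", "1 billion neurons", and ">10^8 MIPS" claims.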

13 Flattened Topology 13

14 System On a Chip Async Ext. Links 2 Async NoCs Comms Packetised via Router System Shared resource In package RAM Ethernet 18 Proc. Nodes 14

15 Processor Node ARM968E-S Synthesizable Fixed Point Efficient 32KB & 64KB instruction/data Local peripherals Custom DMAC JTAG 15

16 Fabricated CMP UMC 130nm Die area mm2 >100 million transistors Power consumption: 1W at 1.2V, 180MHz Peak performance ~4 GIPS 16

17 Chip Design Considerations 1 Choice of process technology UMC 130nm 1.2V 1P8M Fusion process Standard Performance & Low Leakage libraries Mature, competitively priced Physical Layout Async logic crafted with commercial EDA tools Customized macrocells for key asynchronous circuits 17

18 Chip Design Considerations 2 Power Optimization Low power embedded processors Relatively low frequency 180 MHz 32-bit fixed point arithmetic Mobile DDR SDRAM Idle processor cores put to sleep mode Architecture and logic-level clock gating Power-aware synthesis throughout the design flow 18

19 Chip Design Considerations 3 GALS Clocked Islands: 2 * 180MHz, 180MHz, system 100MHz, 166 MHz Fault Tolerance and Monitoring Redundancy: 18 cores, 6 bidir. I/O links, 2 PLLs Runtime diagnostics, temps and reconfiguration Diagnostic comms along with application traffic Comms NoC parity and framing error detection DMA optional CRC Emergency Routing for inter-chip comms failures 19

20 Emergency Routing 20

21 Outline Motivation SpiNNaker Architecture Machines Software State of the Nation Conclusions and Futures 21

22 SpiNNaker Board 3rd Generation SpiNNaker board 22

23 4th Generation Board 23

24 Hexagonal PCB structure 24

25 SpiNNaker Machines 25

26 Outline Motivation SpiNNaker Architecture Machines Software State of the Nation Conclusions and Futures 26

27 Software on SpiNNaker SpiNNaker primarily for ANNs Not limited to this Finite Element Analysis Ray Tracing Heat Diffusion All require mapping from the problem space/graph to the machine itself As this scales, the problem gets significant 27

28 Mapping Paths 2 paths being developed for large machines Desc. Desc. A B C Splitter Place Route Machine and Model Libraries PACMAN for mapping models to hardware Partition And Configuration MANager 28

29 Neural Simulation Processors 16 application + 1 monitor (+1 spare) Simulate ~1000 neurons/proc. SDRAM holds synaptic data Brought to core by DMA across System NoC Spikes coded as packets Bespoke router with multicast & point-to-point routing tables and emergency routing mechanism Source-addressed MC 'spike' packets over Comms NoC 29
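Source-addressed multicast means a spike packet carries only an identifier of the neuron that fired; each router decides locally where copies should go. The sketch below illustrates that idea with a key/mask lookup table. The 32-bit key layout, the `RouteEntry` type, and the link names are assumptions made for illustration and are not taken from the slides. What a real router does on a table miss is outside this sketch.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class RouteEntry:
    key: int           # value the masked packet key must equal
    mask: int          # which bits of the packet key are compared
    links: frozenset   # inter-chip links to copy the packet to
    cores: frozenset   # local cores to copy the packet to

def make_key(chip_x: int, chip_y: int, core: int, neuron: int) -> int:
    """Hypothetical key layout: 8-bit x, 8-bit y, 5-bit core, 11-bit neuron."""
    return (chip_x << 24) | (chip_y << 16) | (core << 11) | neuron

def route(table: list, packet_key: int) -> Optional[tuple]:
    """Return (links, cores) of the first matching entry, or None on a miss."""
    for entry in table:
        if packet_key & entry.mask == entry.key:
            return entry.links, entry.cores
    return None

# One entry covers every neuron on chip (0, 0), core 1: the mask ignores
# the 11 neuron bits, so a single entry serves a whole population.
table = [
    RouteEntry(key=make_key(0, 0, 1, 0), mask=0xFFFFF800,
               links=frozenset({"E", "NE"}), cores=frozenset({3})),
]

hit = route(table, make_key(0, 0, 1, 42))   # spike from a covered neuron
miss = route(table, make_key(0, 0, 2, 7))   # spike from another core
print(hit, miss)
```

Because the packet names its source rather than its destinations, one small packet can fan out to many targets, and changing the network's connectivity only requires rewriting the tables.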

30 Software Operation API provides h/w abstraction for modelling 3 main events ANN software deals with: Packet Received Buffer and request DMA DMA Event Read / update synapse table Timer Event Calculate and update neurons 30
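The three events on this slide can be sketched as callbacks on a per-core object. This is a minimal illustration and not the SpiNNaker API: the `Core` class, its method names, the dictionary standing in for SDRAM, and the simple leaky-integrator neuron are all assumptions. A real core would use fixed-point arithmetic and asynchronous DMA.

```python
from collections import deque

class Core:
    """Toy model of one application core handling the three event types."""

    def __init__(self, n_neurons, synapse_rows, threshold=1.0, decay=0.9):
        self.v = [0.0] * n_neurons        # membrane state per neuron
        self.inputs = [0.0] * n_neurons   # input accumulated this timestep
        self.sdram = synapse_rows         # stand-in for SDRAM: key -> [(target, weight)]
        self.pending = deque()            # spikes waiting for their synaptic row
        self.threshold = threshold
        self.decay = decay

    def on_packet_received(self, key):
        # Packet Received: buffer the spike and request its synaptic row.
        self.pending.append(key)

    def on_dma_done(self):
        # DMA Event: the row has arrived; apply its weights to the targets.
        key = self.pending.popleft()
        for target, weight in self.sdram.get(key, []):
            self.inputs[target] += weight

    def on_timer(self):
        # Timer Event: update every neuron and return the ones that fired.
        fired = []
        for i in range(len(self.v)):
            self.v[i] = self.v[i] * self.decay + self.inputs[i]
            self.inputs[i] = 0.0
            if self.v[i] >= self.threshold:
                fired.append(i)
                self.v[i] = 0.0           # reset after a spike
        return fired

core = Core(n_neurons=2, synapse_rows={7: [(0, 0.6), (1, 1.2)]})

core.on_packet_received(7)
core.on_dma_done()
first_tick = core.on_timer()

core.on_packet_received(7)
core.on_dma_done()
second_tick = core.on_timer()
print(first_tick, second_tick)
```

Keeping the handlers this small reflects the division of labour on the slide: packet arrival only queues work, the DMA completion touches synaptic data, and the timer tick is the only place neuron state advances.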

31 Example ANN Problem A Constraint: All:All B neurons / core (due to mem/cpu etc.) 25% C 12 50% 4 1:1 4 D A splits into 2 cores Mapped to core 1&2 B, C and D map to 3-5 Routes: 1->3 & 2->3 (A->B), 1->5 & 2->5 (A->D), 3->5 (B->D), 4->5 (C->D). 31
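The mapping in this example can be sketched as two steps: split each population across cores, then expand each population-level connection into core-to-core routes. The population sizes and the per-core limit below are hypothetical values chosen only so that A needs two cores. The function names are illustrative, not PACMAN's. The route list on the slide is read as source core to destination core pairs.

```python
def partition(populations, max_per_core):
    """Assign consecutive core numbers (from 1) to each population."""
    placement = {}
    next_core = 1
    for name, size in populations.items():
        n_cores = -(-size // max_per_core)   # ceiling division
        placement[name] = list(range(next_core, next_core + n_cores))
        next_core += n_cores
    return placement

def expand_routes(placement, projections):
    """Turn population-level projections into (source core, destination core) pairs."""
    return [(src_core, dst_core)
            for src, dst in projections
            for src_core in placement[src]
            for dst_core in placement[dst]]

# Hypothetical sizes: only A exceeds the per-core limit.
populations = {"A": 200, "B": 100, "C": 100, "D": 100}
projections = [("A", "B"), ("A", "D"), ("B", "D"), ("C", "D")]

placement = partition(populations, max_per_core=100)
routes = expand_routes(placement, projections)
print(placement)
print(routes)
```

With these inputs A lands on cores 1 and 2, B, C and D on cores 3 to 5, and the expanded routes are the six pairs listed on the slide.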

32 PyNN Integration PACMAN 32

33 Real-Time I/O 1 33

34 Real-Time I/O 2 34

35 Real-Time I/O 3 35

36 Outline Motivation SpiNNaker Architecture Machines Software State of the Nation Conclusions and Futures 36

37 Current Project Status Full 18-core chip: arrived May th Gen Card: 48 chips, 864 processors June 2012 Neuron models: LIF, Izhikevich, MLP Synapse models: STDP, NMDA Networks: PyNN, NEF (Nengo) -> SpiNNaker various small tools to build Router tables, etc Plans: 10^4 machine (Q4 2012), 10^5 (1H 2013), 50,000-chip 10^6 machine (Q3 2013). 37

38 Outline Motivation SpiNNaker Architecture Machines Software State of the Nation Conclusions and Futures 38

39 Conclusions SpiNNaker MPSoC Power-efficiency Scalable communications Programmability Fault-tolerance SpiNNaker machine Massively-parallel, programmable platform Aim to help neuroscientists to explore and understand information processing mechanisms in the brain Other parallel applications too 39

40 Manchester Team 40

41 Any Questions? Search the Web and YouTube for SpiNNaker Manchester chip 41

Biologically-Inspired Massively-Parallel Architectures - computing beyond a million processors Dave Lester The University of Manchester d.lester@manchester.ac.uk NeuroML March 2011 1 Outline 60 years of