Neuromorphic Computing: Our approach to developing applications using a new model of computing


Transcription

1 Neuromorphic Computing: Our approach to developing applications using a new model of computing David J. Mountain Senior Technical Director Advanced Computing Systems Research Program

2 Outline Background Info Mapping applications onto a neuromorphic computer Example Applications* AES-256 Encryption Malware Identification Quantitative results *I will not be using cats, MNIST digits or ImageNet pictures; there are lots of people demonstrating those applications

3 Neuromorphic Computing The integration of algorithms, architectures, and technologies, informed by neuroscience, to create new computational approaches. Silicon Brain not required Image: zmescience.com

4 Neural Networks Single neuron: multiply accumulate (MACC) with an activation function. [Figure labels: McCulloch/Pitts diagram with inputs x_1, x_2, x_3, weights w_1, w_2, w_3 and summation Ʃ; equation; activation function; feed-forward neural network]

5 Threshold Gates A neuron can be considered a threshold gate and used to perform logic functions. AND: inputs x_1, x_2, x_3, each with weight 1, β = -2.5. OR: inputs x_1, x_2, x_3, each with weight 1, β = -0.5. NOT: input x_1 with weight -1, β = 0.5. Many interconnected threshold gates can perform complex logic functions.
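The three gates on this slide can be checked in a few lines of Python. This is a minimal sketch, assuming the neuron outputs 1 when the weighted sum plus the bias β is greater than zero (the slide does not state the firing rule); the function names are illustrative and not from the talk.

```python
from itertools import product

def threshold_gate(inputs, weights, beta):
    """Return 1 if the weighted sum plus bias is positive, else 0 (assumed firing rule)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + beta
    return 1 if total > 0 else 0

def and3(x1, x2, x3):
    # Unit weights, beta = -2.5: fires only when all three inputs are 1.
    return threshold_gate((x1, x2, x3), (1, 1, 1), -2.5)

def or3(x1, x2, x3):
    # Unit weights, beta = -0.5: fires when at least one input is 1.
    return threshold_gate((x1, x2, x3), (1, 1, 1), -0.5)

def not1(x1):
    # Weight -1, beta = 0.5: fires only when the input is 0.
    return threshold_gate((x1,), (-1,), 0.5)

if __name__ == "__main__":
    for bits in product((0, 1), repeat=3):
        print(bits, "AND:", and3(*bits), "OR:", or3(*bits))
    print("NOT:", [not1(x) for x in (0, 1)])
```

Only the weights and bias differ between the gates; the neuron itself is unchanged.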

6 McCulloch & Pitts Threshold Logic, Neural Nets A Logical Calculus of the Ideas Immanent in Nervous Activity, Bulletin of Mathematical Biophysics, 1943 Images: (L) NESFA, Boskone V Conference, 1968 (R) Estate of Francis Bello/ScienceSource via Nautilus/A. Gefter

7 Frank Rosenblatt Mark I Perceptron, circa 1960 Images: Hecht-Nielsen, R. Neurocomputing (Reading, Mass.: Addison-Wesley, 1990) via rutherfordjournal.org.

8 Different computational primitives will become the common case: Majority function example Digital implementations are relatively inefficient for large numbers of inputs; MACC-centric design appears to have a large sweet spot [Figure: Majority Gate, Throughput per Watt; y-axis: normalized inputs/sec/watt; x-axis: number of inputs; series: MACC, Digital; inset: inputs a, b, c, d, e feeding Ʃ with bias β]
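A majority function maps onto a single threshold neuron whatever the number of inputs, which is the reason a MACC primitive suits it. A sketch under stated assumptions: unit weights, a firing rule of "sum plus bias greater than zero", and a bias of minus half the input count. The bias value on the slide did not survive extraction, so the β here is my own choice.

```python
from itertools import product

def majority_neuron(bits):
    """Single threshold neuron: unit weights, bias set so more than half the inputs must be 1."""
    n = len(bits)
    beta = -n / 2  # assumed bias; the neuron fires only when sum(bits) > n / 2
    return 1 if sum(bits) + beta > 0 else 0

def majority_reference(bits):
    """Direct count, used only to check the neuron."""
    return 1 if 2 * sum(bits) > len(bits) else 0

if __name__ == "__main__":
    for n in range(1, 8):
        ok = all(majority_neuron(b) == majority_reference(b)
                 for b in product((0, 1), repeat=n))
        print(n, "inputs:", "match" if ok else "mismatch")
```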

9 Our approach Architectures that scale to handle real applications: Ohmic Weave. Methodologies and algorithms for designing/programming these systems: Loom. Experience & experiments with applications to guide architectures and methodologies: AES-256, malware triage.

10 Implementing neurons using physics I = V/R, G = 1/R, I = VG. Multiply: each input x_i drives a weight w_i (x_1 w_1, x_2 w_2, x_3 w_3). Accumulate: Ʃ. Image: Stan Williams, HP Labs via arstechnica.com
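The multiply-accumulate on this slide follows from Ohm's law per device and current summation on a shared wire. A minimal numeric sketch follows; the voltages and resistances are hypothetical values picked for illustration, not figures from the talk.

```python
def resistance_to_conductance(r_ohms):
    """G = 1 / R."""
    return 1.0 / r_ohms

def column_current(voltages, conductances):
    """Current on one column: each device contributes I = V * G and the wire sums the contributions."""
    return sum(v * g for v, g in zip(voltages, conductances))

# Hypothetical inputs (volts) and device resistances (ohms).
inputs_v = [0.2, 0.0, 0.2]
resistances = [10e3, 20e3, 40e3]

weights_g = [resistance_to_conductance(r) for r in resistances]
i_total = column_current(inputs_v, weights_g)

if __name__ == "__main__":
    print("column current (A):", i_total)
```

The inputs play the role of x_i, the conductances the role of w_i, and the column current is the accumulated sum.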

11 3x4 Crossbar = 2 Neurons input drivers memristor comparators

12 Ohmic Weave: Single Tile 256 axons, 128 neurons, synapses 256x256 memristor crossbar 128 differential comparators All inputs and all outputs are sent to a central router [Diagram labels: input drivers, memristor crossbar, comparators, IO port]
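The "3x4 crossbar = 2 neurons" and "256 columns, 128 differential comparators" figures suggest that each neuron uses a pair of columns whose currents are compared. The sketch below assumes that pairing (adjacent columns form the positive and negative side of one neuron) and uses a constant third input as the bias row; both are assumptions for illustration, and the conductances are in arbitrary units.

```python
def crossbar_outputs(voltages, conductance):
    """Evaluate a crossbar whose columns 2k and 2k+1 are the +/- columns of neuron k (assumed pairing).

    conductance[row][col] is the device at that crosspoint; voltages has one entry per row.
    """
    n_cols = len(conductance[0])
    currents = [sum(v * row[c] for v, row in zip(voltages, conductance))
                for c in range(n_cols)]
    # Differential comparator: 1 when the positive column carries more current.
    return [1 if currents[c] > currents[c + 1] else 0 for c in range(0, n_cols, 2)]

# 3 rows (x1, x2, constant bias input) by 4 columns = 2 neurons.
# Neuron 0 computes AND(x1, x2); neuron 1 computes OR(x1, x2).
G = [
    [1.0, 0.0, 1.0, 0.0],   # x1
    [1.0, 0.0, 1.0, 0.0],   # x2
    [0.0, 1.5, 0.0, 0.5],   # bias row, driven at 1
]

if __name__ == "__main__":
    for x1 in (0, 1):
        for x2 in (0, 1):
            print((x1, x2), crossbar_outputs([x1, x2, 1], G))
```

Placing the bias on the negative column lets a device that can only hold a positive conductance represent a negative threshold.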

13 Ohmic Weave: 64 Tile General Purpose Processor* 16k axons, 8k neurons, 4M synapses, 64 port router, all-to-all connectivity *56 Tera synaptic ops per watt (TSOPS/W), 1.1 TSOPS/mm^2
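The processor totals follow from the single-tile figures. A quick check, assuming "k" and "M" are binary prefixes and that the synapse count covers every crosspoint of each 256x256 crossbar (neither is stated on the slides):

```python
TILES = 64
AXONS_PER_TILE = 256
NEURONS_PER_TILE = 128
CROSSBAR_ROWS, CROSSBAR_COLS = 256, 256

axons = TILES * AXONS_PER_TILE                        # 16 * 1024
neurons = TILES * NEURONS_PER_TILE                    # 8 * 1024
crosspoints = TILES * CROSSBAR_ROWS * CROSSBAR_COLS   # 4 * 1024 * 1024

if __name__ == "__main__":
    print(axons, neurons, crosspoints)
```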

14 Tools, Methodologies, Algorithms Loom: Ohmic Weave design tool. Python classes with C, CUDA extensions. Enables exploration of design trade-offs: limited precision weights, neural network topologies (layers, neurons per layer), connectivity pruning. Simulates Ohmic Weave designs on CPUs, GPUs. Debug with full view of internal state.

15 Methodology: Block Based Design Decompose the problem into blocks Much like block based CMOS design Can pull blocks from a circuit library Loom can compose blocks into a single larger network Will optimize by removing unused neurons and connections Compresses to minimum number of layers Handles recurrence/loops
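Removing unused neurons after composition can be pictured as a reachability pass backward from the outputs. The sketch below is not Loom's algorithm or API, which the slides do not describe; the dictionary representation and the function name are hypothetical.

```python
def prune_unused(neurons, outputs):
    """Keep only the neurons that some output depends on, directly or indirectly.

    neurons: dict mapping neuron name -> list of source names (other neurons or primary inputs)
    outputs: iterable of neuron names that must be preserved
    """
    keep = set()
    stack = list(outputs)
    while stack:
        name = stack.pop()
        if name in keep or name not in neurons:
            continue  # already visited (this also terminates loops), or a primary input
        keep.add(name)
        stack.extend(neurons[name])
    return {name: srcs for name, srcs in neurons.items() if name in keep}

# Hypothetical composed network: n4 feeds back on itself but drives no output.
net = {
    "n1": ["a", "b"],
    "n2": ["b", "c"],
    "n3": ["n1"],
    "n4": ["n2", "n4"],
}
pruned = prune_unused(net, ["n3"])

if __name__ == "__main__":
    print(sorted(pruned))
```

The visited set is what keeps the pass finite when the network contains recurrence.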

16 Digital Hierarchical Neural Nets Digital functions must be 100% correct Divide and conquer by partitioning 64 inputs = 1.6 x training vectors 4 x 16 inputs = 2.56 x 10^5 training vectors Reduce the training set size But train to 100% accuracy The logic truth table becomes the training set The training data encompasses all possible data
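The benefit of partitioning can be worked out directly from truth-table sizes. The exact counts below differ slightly from the slide's figures, which appear to be rounded; that reading of the slide is my assumption.

```python
# Rows in a full truth table over 64 binary inputs.
full_table = 2 ** 64

# Four independent blocks, each with its own 16-input truth table.
partitioned = 4 * 2 ** 16

reduction = full_table // partitioned

if __name__ == "__main__":
    print("full table rows:       ", full_table)
    print("partitioned table rows:", partitioned)
    print("reduction factor:      ", reduction)
```

Partitioning only helps when the function actually decomposes into blocks with few inputs each, which is what the block based methodology relies on.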

17 Training Loom can train blocks given a training set or truth table Uses the Concurrent Learning Algorithm* Can train for exact logic or for inexact classifiers *M. McLean, Concurrent Learning Algorithm and the Importance Map, Network Science and Cybersecurity, ed. by R. Pino, Springer 2014, vol. 55, pp
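Training a block from its truth table until it is exact can be illustrated with a single threshold neuron. The sketch swaps in the classic perceptron learning rule, not the Concurrent Learning Algorithm the talk cites (whose details are not given in the slides), and it assumes a firing rule of "sum plus bias greater than zero".

```python
from itertools import product

def fire(weights, bias, inputs):
    """Threshold neuron with an assumed firing rule: weighted sum plus bias > 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0

def train_threshold_gate(truth_table, n_inputs, max_epochs=1000):
    """Perceptron rule; returns once every row of the truth table is reproduced."""
    weights, bias = [0] * n_inputs, 0
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in truth_table:
            delta = target - fire(weights, bias, inputs)
            if delta:
                errors += 1
                weights = [w + delta * x for w, x in zip(weights, inputs)]
                bias += delta
        if errors == 0:
            return weights, bias
    raise RuntimeError("did not reach 100% accuracy; the function may not be linearly separable")

# The truth table is the whole training set: all 8 rows of a 3-input AND.
and_table = [(bits, int(all(bits))) for bits in product((0, 1), repeat=3)]
weights, bias = train_threshold_gate(and_table, 3)

if __name__ == "__main__":
    print("weights:", weights, "bias:", bias)
```

Because the table covers every possible input, zero training errors means the block is exactly correct, not just correct on a sample.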

18 Libraries and Composability Once trained, a block can be reused Train once, use often We have a growing library, starting with simple logic (NANDs, NORs, latches) and growing to more sophisticated functions (majority gates) We have algorithms for composing multiple neural networks into a single network
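Reusing a trained block can be shown with a NAND neuron composed into a larger function. This is only an illustration of "train once, use often": the weights and bias are hand-picked, the firing rule is assumed, and ordinary function calls stand in for the composition algorithms, which the slides do not describe.

```python
def nand(a, b):
    """NAND as one threshold neuron: weights -1, -1 and bias 1.5 (illustrative values)."""
    return 1 if (-a - b + 1.5) > 0 else 0

def xor(a, b):
    """XOR built from four instances of the same NAND block."""
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

def half_adder(a, b):
    """Sum and carry; the carry inverts a NAND by tying both inputs of a second NAND together."""
    n = nand(a, b)
    return xor(a, b), nand(n, n)

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print((a, b), "xor:", xor(a, b), "half adder:", half_adder(a, b))
```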

19 AES-256 Encryption Advanced Encryption Standard, 256 bit. 128 bits of data encrypted using 256 bit key. Algorithm uses 14 rounds of 4 steps each. Published standard, result must be exact. Ohmic Weave Implementation: 45 blocks, 21 unique types (16 SubBytes blocks, 3 MixCol blocks, 1 control block, 1 mux). Each SubBytes block is unique because keys are baked in. 12,500 neurons in 10 layers. Each block trained using CLA.
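A SubBytes block works on one byte, so its full truth table has only 256 rows. The sketch below builds the standard AES S-box from its definition (inverse in GF(2^8) followed by the affine transform) and then a per-block table with a key byte folded in. Folding the key as `sbox[x ^ key_byte]` is my assumption about what "keys baked in" means; the slides do not give the construction.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def gf_inv(x):
    """Multiplicative inverse as x^254; zero maps to zero by AES convention."""
    if x == 0:
        return 0
    result = 1
    for _ in range(254):
        result = gf_mul(result, x)
    return result

def rotl8(v, n):
    return ((v << n) | (v >> (8 - n))) & 0xFF

def sbox_entry(x):
    inv = gf_inv(x)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63

SBOX = [sbox_entry(x) for x in range(256)]

def subbytes_truth_table(key_byte):
    """256-row table for one block with a key byte folded in (assumed construction)."""
    return [(x, SBOX[x ^ key_byte]) for x in range(256)]

if __name__ == "__main__":
    print([hex(v) for v in SBOX[:4]])
    print(len(subbytes_truth_table(0x2B)), "training vectors per block")
```

Sixteen different key bytes give sixteen different tables, which is consistent with each SubBytes block being a unique type.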

20 AES-256 Conceptual Diagram [Diagram blocks: plain text, 32 bit mux, SubBytes (x4), ShiftRows (to other cols), MixCol A, MixCol B, MixCol C, state machine, cypher text] 45 instances of 21 unique blocks. About 12,500 neurons in 10 layers.

21 Application: Malware Detection Classifies files as malware (e.g. virus) or benign. Looks at the file one 6 byte n-gram at a time. Matches 2000 critical n-grams and notes their presence in a 2000 bit latch. Uses a neural network classifier to decide if the pattern in the latch is malware.
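The data flow on this slide can be sketched in software: slide a 6 byte window over the file, set a latch bit for each critical n-gram seen, then classify the latch. The one-byte stride, the single-neuron classifier, and the sample n-grams are all assumptions for illustration; the talk uses 2000 n-grams and a two stage neural network detector.

```python
def ngram_latch(data, critical_ngrams, n=6):
    """Slide an n-byte window over data; set latch bit i if critical_ngrams[i] appears."""
    index = {gram: i for i, gram in enumerate(critical_ngrams)}
    latch = [0] * len(critical_ngrams)
    for start in range(len(data) - n + 1):   # assumed stride of one byte
        hit = index.get(data[start:start + n])
        if hit is not None:
            latch[hit] = 1
    return latch

def classify(latch, weights, bias):
    """Stand-in classifier: one threshold neuron over the latch bits."""
    return 1 if sum(w * bit for w, bit in zip(weights, latch)) + bias > 0 else 0

# Hypothetical critical n-grams and file contents.
critical = [b"ABCDEF", b"123456", b"zzzzzz"]
sample = b"xxABCDEFyy123456"

latch = ngram_latch(sample, critical)
verdict = classify(latch, weights=[1, 1, 1], bias=-1.5)

if __name__ == "__main__":
    print("latch:", latch, "malware?", bool(verdict))
```

The latch only records presence, so the classifier sees a fixed-width bit pattern regardless of file length.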

22 Conceptual Diagram [Diagram blocks: 6 byte n-gram, 48:1536 decoder, 2000 pattern matcher, 2000 bit latch, malware detector stage 1, malware detector stage 2, output "malware?"] About 4007 instances of 5 unique blocks. About 5800 neurons in 5 layers.

23 Mapped to Ohmic Weave using Loom [Figure: tiles around the 64 port router (all-to-all connectivity), labeled Layer 1, Layer 2, Layer 3, Layer 4, Layer 5, and Unused]

24 Malware detection using neural nets: General purpose Ohmic Weave vs CPU CPU: 6 Core Intel Core i7 3930K; throughput is 1.92 Gbps. Ohmic Weave requires 12 copies to match that throughput. Numbers shown below are total for 12 copies; comparison is performance neutral. [Table: Function, Area (mm^2), Power (mW); rows: Row Drivers, Memristive Array, Comparators, Router, Total; values not recovered] x Improvement in Area 54x Improvement in Power* *Aggressive implementations of memristor technology and on-chip routing increase the power improvement to ~500x
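The throughput a single Ohmic Weave copy must sustain follows from the two numbers on this slide. A short calculation, assuming the 12 copies share the load evenly and that Gbps means 10^9 bits per second:

```python
CPU_THROUGHPUT_GBPS = 1.92   # from the slide
COPIES = 12                  # from the slide

per_copy_gbps = CPU_THROUGHPUT_GBPS / COPIES
per_copy_mbytes_per_s = per_copy_gbps * 1000 / 8   # megabytes per second

if __name__ == "__main__":
    print(f"per copy: {per_copy_gbps:.2f} Gbps, {per_copy_mbytes_per_s:.0f} MB/s")
```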

25 Energy efficiency scaling comparison [Figure: improvement factor vs. CMOS technology node; visible node labels: 32 nm, 22 nm, 15 nm, 11 nm, 8 nm]

26 Roadmap Forward Fabricate and characterize circuits Continue to characterize memristor crossbars Build increasingly mature prototype boards Explore on-chip vs. off-chip training Validate routing choices Provide more realistic power comparisons Improve tools and simulation environment More applications and comparisons (to FPGA, GPU, ASIC, etc.)

27 Acknowledgments to the NMC team that is making all this happen Chris Krieger Mark McLean Josh Prucnal Doug Palmer along with a substantial number of academic, national lab and industry partners

28 Questions?

29 A first attempt at road mapping NMC NMC attributes. Strengths: CMOS compatible; Room T operation; Integrates with other approaches (traditional, approximate); Self-learning. Neutral: Security; Programmer productivity. Weaknesses: Legacy code; FP intensive applications; Serial speed; Data selection. Nanotechnology challenges: Efficient STDP/Spiking device; Memristors; Access devices (monolithic 3D); Comparators and on-chip programming; Interconnect density; Controlled growth of interconnects; Efficient analog comms (or efficient transducers to optical, magnetic, etc.); includes sensors.

30 What are the right metrics? Ops/analyst; Ops/trained analyst; Learning rate for analysts; Scaling rate. How much of the pyramid can you augment with NMC? [Figure: skill level pyramid, labels from top: World Class Expert, Proficient, Skilled, Entry Level, Novice]


More information

CAN on Integration Technologies

CAN on Integration Technologies CAN on Integration Technologies CAN technology has reached the mature state where the powerful network technology is well covered by standard parts; mainly processors with integrated CAN periphery. Nevertheless

More information

Introduction to ICs and Transistor Fundamentals

Introduction to ICs and Transistor Fundamentals Introduction to ICs and Transistor Fundamentals A Brief History 1958: First integrated circuit Flip-flop using two transistors Built by Jack Kilby at Texas Instruments 2003 Intel Pentium 4 mprocessor (55

More information

Neural Networks CMSC475/675

Neural Networks CMSC475/675 Introduction to Neural Networks CMSC475/675 Chapter 1 Introduction Why ANN Introduction Some tasks can be done easily (effortlessly) by humans but are hard by conventional paradigms on Von Neumann machine

More information

Sharing Resources Between AES and the SHA-3 Second Round Candidates Fugue and Grøstl

Sharing Resources Between AES and the SHA-3 Second Round Candidates Fugue and Grøstl Sharing Resources Between AES and the SHA-3 Second Round Candidates Fugue and Grøstl Kimmo Järvinen Department of Information and Computer Science Aalto University, School of Science and Technology Espoo,

More information

3. HARDWARE ARCHITECTURE

3. HARDWARE ARCHITECTURE 3. HARDWARE ARCHITECTURE The architecture of the Recognition Accelerator consists of two main parts: a dedicated classifier engine and a general-purpose 16-bit microcontroller. The classifier implements

More information

Moore s Law: Alive and Well. Mark Bohr Intel Senior Fellow

Moore s Law: Alive and Well. Mark Bohr Intel Senior Fellow Moore s Law: Alive and Well Mark Bohr Intel Senior Fellow Intel Scaling Trend 10 10000 1 1000 Micron 0.1 100 nm 0.01 22 nm 14 nm 10 nm 10 0.001 1 1970 1980 1990 2000 2010 2020 2030 Intel Scaling Trend

More information

Reminder. Course project team forming deadline. Course project ideas. Friday 9/8 11:59pm You will be randomly assigned to a team after the deadline

Reminder. Course project team forming deadline. Course project ideas. Friday 9/8 11:59pm You will be randomly assigned to a team after the deadline Reminder Course project team forming deadline Friday 9/8 11:59pm You will be randomly assigned to a team after the deadline Course project ideas If you have difficulty in finding team mates, send your

More information

Challenges for Future Interconnection Networks Hot Interconnects Panel August 24, Dennis Abts Sr. Principal Engineer

Challenges for Future Interconnection Networks Hot Interconnects Panel August 24, Dennis Abts Sr. Principal Engineer Challenges for Future Interconnection Networks Hot Interconnects Panel August 24, 2006 Sr. Principal Engineer Panel Questions How do we build scalable networks that balance power, reliability and performance

More information

2. TOPOLOGICAL PATTERN ANALYSIS

2. TOPOLOGICAL PATTERN ANALYSIS Methodology for analyzing and quantifying design style changes and complexity using topological patterns Jason P. Cain a, Ya-Chieh Lai b, Frank Gennari b, Jason Sweis b a Advanced Micro Devices, 7171 Southwest

More information

Neuromorphic Data Microscope

Neuromorphic Data Microscope Neuromorphic Data Microscope CLSAC 16 October 28, 2016 David Follett Founder, CEO Lewis Rhodes Labs (LRL) david@lewis-rhodes.com 978-273-0537 Slide 1 History Neuroscience 1998-2012 Neuronal Spiking Models

More information

AIM Photonics: Manufacturing Challenges for Photonic Integrated Circuits

AIM Photonics: Manufacturing Challenges for Photonic Integrated Circuits AIM Photonics: Manufacturing Challenges for Photonic Integrated Circuits November 16, 2017 Michael Liehr Industry Driving Force EXA FLOP SCALE SYSTEM Blades SiPh Interconnect Network Memory Stack HP HyperX

More information

Neurmorphic Architectures. Kenneth Rice and Tarek Taha Clemson University

Neurmorphic Architectures. Kenneth Rice and Tarek Taha Clemson University Neurmorphic Architectures Kenneth Rice and Tarek Taha Clemson University Historical Highlights Analog VLSI Carver Mead and his students pioneered the development avlsi technology for use in neural circuits

More information

Machine Learning in Biology

Machine Learning in Biology Università degli studi di Padova Machine Learning in Biology Luca Silvestrin (Dottorando, XXIII ciclo) Supervised learning Contents Class-conditional probability density Linear and quadratic discriminant

More information

ECE 637 Integrated VLSI Circuits. Introduction. Introduction EE141

ECE 637 Integrated VLSI Circuits. Introduction. Introduction EE141 ECE 637 Integrated VLSI Circuits Introduction EE141 1 Introduction Course Details Instructor Mohab Anis; manis@vlsi.uwaterloo.ca Text Digital Integrated Circuits, Jan Rabaey, Prentice Hall, 2 nd edition

More information

Runtime Adaptation of Application Execution under Thermal and Power Constraints in Massively Parallel Processor Arrays

Runtime Adaptation of Application Execution under Thermal and Power Constraints in Massively Parallel Processor Arrays Runtime Adaptation of Application Execution under Thermal and Power Constraints in Massively Parallel Processor Arrays Éricles Sousa 1, Frank Hannig 1, Jürgen Teich 1, Qingqing Chen 2, and Ulf Schlichtmann

More information

Lecture #11: The Perceptron

Lecture #11: The Perceptron Lecture #11: The Perceptron Mat Kallada STAT2450 - Introduction to Data Mining Outline for Today Welcome back! Assignment 3 The Perceptron Learning Method Perceptron Learning Rule Assignment 3 Will be

More information

Proposers Day Workshop

Proposers Day Workshop Proposers Day Workshop Monday, January 23, 2017 @srcjump, #JUMPpdw Advanced Devices, Packaging, and Materials Horizontal Research Center Aaron Oki NG Fellow Northrop Grumman Center Motivation Active and

More information

Fundamentals of Computer Design

Fundamentals of Computer Design CS359: Computer Architecture Fundamentals of Computer Design Yanyan Shen Department of Computer Science and Engineering 1 Defining Computer Architecture Agenda Introduction Classes of Computers 1.3 Defining

More information

ALGORITHM AND SOFTWARE BASED ON MLPNN FOR ESTIMATING CHANNEL USE IN THE SPECTRAL DECISION STAGE IN COGNITIVE RADIO NETWORKS

ALGORITHM AND SOFTWARE BASED ON MLPNN FOR ESTIMATING CHANNEL USE IN THE SPECTRAL DECISION STAGE IN COGNITIVE RADIO NETWORKS ALGORITHM AND SOFTWARE BASED ON MLPNN FOR ESTIMATING CHANNEL USE IN THE SPECTRAL DECISION STAGE IN COGNITIVE RADIO NETWORKS Johana Hernández Viveros 1, Danilo López Sarmiento 2 and Nelson Enrique Vera

More information

A General Method for the Analysis and the Logical Generation of Discrete Mathematical Systems in Programmable Logical Controller

A General Method for the Analysis and the Logical Generation of Discrete Mathematical Systems in Programmable Logical Controller A General Method for the Analysis and the Logical Generation of Discrete Mathematical Systems in Programmable Logical Controller Daniel M. Dubois * Department of Applied Informatics and Artificial Intelligence,

More information

Value-driven Synthesis for Neural Network ASICs

Value-driven Synthesis for Neural Network ASICs Value-driven Synthesis for Neural Network ASICs Zhiyuan Yang University of Maryland, College Park zyyang@umd.edu ABSTRACT In order to enable low power and high performance evaluation of neural network

More information

St.MARTIN S ENGINEERING COLLEGE Dhulapally, Secunderabad

St.MARTIN S ENGINEERING COLLEGE Dhulapally, Secunderabad St.MARTIN S ENGINEERING COLLEGE Dhulapally, Secunderabad-500 014 Subject: Digital Design Using Verilog Hdl Class : ECE-II Group A (Short Answer Questions) UNIT-I 1 Define verilog HDL? 2 List levels of

More information

TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory

TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory Mingyu Gao, Jing Pu, Xuan Yang, Mark Horowitz, Christos Kozyrakis Stanford University Platform Lab Review Feb 2017 Deep Neural

More information

New STM32 F7 Series. World s 1 st to market, ARM Cortex -M7 based 32-bit MCU

New STM32 F7 Series. World s 1 st to market, ARM Cortex -M7 based 32-bit MCU New STM32 F7 Series World s 1 st to market, ARM Cortex -M7 based 32-bit MCU 7 Keys of STM32 F7 series 2 1 2 3 4 5 6 7 First. ST is first to sample a fully functional Cortex-M7 based 32-bit MCU : STM32

More information

Character Recognition Using Convolutional Neural Networks

Character Recognition Using Convolutional Neural Networks Character Recognition Using Convolutional Neural Networks David Bouchain Seminar Statistical Learning Theory University of Ulm, Germany Institute for Neural Information Processing Winter 2006/2007 Abstract

More information

Supervised Learning in Neural Networks (Part 2)

Supervised Learning in Neural Networks (Part 2) Supervised Learning in Neural Networks (Part 2) Multilayer neural networks (back-propagation training algorithm) The input signals are propagated in a forward direction on a layer-bylayer basis. Learning

More information