Asynchronous Logic: A Computer Systems Perspective. Rajit Manohar Computer Systems Lab, Cornell
|
|
- Prosper Dean
- 6 years ago
- Views:
Transcription
1 Asynchronous Logic: A Computer Systems Perspective Rajit Manohar Computer Systems Lab, Cornell
2 Approaches to computation Discrete Value Continuous Value Discrete Time Digital, synchronous logic Switched-capacitor analog Continuous Time Digital, asynchronous logic General analog computation
3 Asynchronous logic sender clock data receiver signal value time sampling window sender data 0 data 1 ack receiver sender req data ack receiver data 1 data 0 X X X X data request acknowledge acknowledge
4 Explicit versus implicit synchronization A B C A B C A1 B1 C1 A1 B1 C1 A2 B2 C2 A2 B2 C2 A3 B3 C3 time A3 B3 C3 C4 time A4 B4 C4 A4 B4 Synchronous computation Asynchronous computation
5 Differences at a glance Asynchronous Continuous time computation. Observed delay is the average over possible circuit paths. Local switching from data-driven computation. Can lead to low-power operation. Throughput determined by device speed. (Margins needed for some circuit families.) Data must be encoded, requiring additional wires for signals. Synchronous Discrete time computation. Observed delay is the maximum over possible circuit paths. Global activity from clock-driven computation. Low power requires careful clock gating. Throughput determined by device speed and additional operating margins. Unencoded data acceptable due to global clock network.
6 A little bit of (biased) history Binary addition (von Neumann) Switching theory (Miller) Illiac II (UIUC/Muller) Macro modular systems (Clarke, Molnar, Stucki, ) Atlas, MU5 (Manchester) Flow tables (Huffman) System Timing (Seitz, in Mead/Conway) Trace theory (Snepscheut) Syntactic compilation (Martin, Rem) ARM with caches (Furber) Pipelined FPGAs (Manohar) HiAER (Cauwenberghs) FACETS (Heidelberg) TrueNorth (IBM/Cornell) Micro pipelines (Sutherland) Asynchronous CPU (Martin) General hazard-free synthesis AER (Mead lab) 17FO4 MIPS CPU (Martin) Commercial 80C51 (Phillips) UltraSparc IIIi mem controller (Sun) Sensory systems (retina, cochlea, etc.) Event-driven CPU (Manohar) 10G Ethernet switch (Fulcrum) SpiNNaker (Furber) Neurogrid (Boahen)
7 I. Exploiting the gap between average and max delay Slow part: computing carries but only in the worst-case! Theorem [von Neumann, 1946]. The average-case latency for a ripple carry binary adder is O(log N) for i.i.d. inputs A.W. Burks, H.H. Goldstein, J. von Neumann. Preliminary discussion of the logical design of an electronic computing instrument. (1946)
8 I. Exploiting the gap between average and max delay Standard adder tree O(log N) latency Hybrid adder tree Tree adder + ripple adder Theorem [Winograd, 1965]. The worst-case latency for binary addition is Ω(log N) Theorem. The average-case latency of the hybrid asynchronous adder is O(log log N) for i.i.d. inputs Theorem. The average-case latency of the hybrid asynchronous adder is optimal for any input distribution S.O. Winograd. On the time required to perform addition. JACM (1965) R. Manohar and J. Tierno. Asynchronous parallel prefix computation. IEEE Transactions on Computers (1998)
9 II. Exploiting data-driven power management Low power FIR filter External, synchronous I/O Monitor occupancy of FIFOs Speed up/slow down using adaptive power supply Break-point add/multiply Detect when top half of operand is required Dynamically switch between full width and half width arithmetic State DC/DC converter Vdd FIR filter L. Nielsen and J. Sparso. A Low-power Asynchronous Datapath for a FIR filter bank. Proc. ASYNC (1996)
10 II. Exploiting data-driven power management Width-adaptive numbers Compress leading 0 s and 1 s 8 bits per VDW block v/s sender receiver 128 bit datapath Cumulative Distribution Function (%) Average: 8.4 Average: SPEC: 124.m88ksim Average: 10.9 no compaction simple compaction full compaction Operand Widths Number: receiver sender Number: R. Manohar. Width-Adaptive Data Word Architectures. Proc. ARVLSI (2001)
11 III. Exploiting elasticity of pipelines f g h Asynchronous computation is* robust to changes in circuitlevel pipelining! f g h *R. Manohar, A. J. Martin. Slack Elasticity in Concurrent Computing. Proc. Mathematics of Program Construction (1998)
12 III. Exploiting elasticity of pipelines Asynchronous FIR filter Data arrives at variable rates Data rates are predictable Fixed amount of time for async computation Variable number of clock cycles for the fixed time budget M. Singh, J.A. Tierno, A. Rylyakov, S. Rylov, S.M. Nowick. An adaptively pipelined mixed synchronous-asynchronous digital FIR filter chip operating at 1.3 GHz. Proc. ASYNC (2002)
13 III. Exploiting elasticity of pipelines FPGA throughput low due to overhead of programmable interconnect LB CB LB CB CB SB CB SB ~3x improvement in peak throughput by transparent interconnect pipelining LB CB LB CB CB SB CB SB J. Teifel and R. Manohar. Highly pipelined asynchronous FPGAs. Proc. FPGA (2004)
14 IV. Exploiting timing robustness Asynchronous FPGA Test Data Throughput (MHz) Process: TSMC 0.18um Nominal Vdd: 1.8V 77K 12K K K 200 Xilinx Virtex Voltage (V) D. Fang, J. Teifel, and R. Manohar. A High-Performance Asynchronous FPGA: Test Results. Proc. FCCM (2005)
15 IV. Exploiting timing robustness GasP logic family 6-10 FO4 Track reset Self-reset w Single-track control wire No margins on control signals RA(i) RA(i+1) Standard datapath Wait for L set, R reset Track set Latest 40nm chip operates >6 GHz Datapath driver w I. Sutherland, S. Fairbanks. GasP: a minimal FIFO control. Proc. ASYNC (2001)
16 V. Exploiting low noise and low EMI time domain frequency domain MHz MHz Synchronous 80C51 (Phillips) Asynchronous 80C51 (Phillips) J. Kessels, T. Kramer, G. den Besten, A. Peeters, V. Timm. Applying Asynchronous Circuits in Contactless Smart Cards. Proc. ASYNC (2000)
17 VI. Exploiting continuous-time information processing Continuous-time digital signal processing Automatic adaptation to input bandwidth No aliasing from sampling process Nyquist sampling Level-crossing sampling B. Schell, Y. Tsividis. A clockless adc/dsp/dac system with activitydependent power dissipation and no aliasing. Proc. ISSCC (2008)
18 VI. Exploiting continuous-time information processing Address-event representation in neuromorphic systems Time represents itself Spike timing and coincidence for information processing Asynchronous logic Temporal resolution is decoupled from throughput Automatic power management Projects Silicon retinas, cochleas, More recent: FACETS, HiAER, Neurogrid, neurop, SpiNNaker, TrueNorth
19 A note on oscillations in asynchronous logic Well-known result in asynchronous circuit theory Signal transitions: look at the i th occurrence of transition t Then: time(t, i) = the actual time of this event time 0 (t, i) =f(t)+p i time(t, i) time 0 (t, i) <B Bounded, independent of i, and t If the circuit is strongly connected, this result holds More recently: stronger periodicity under more general conditions J. Magott. Performance evaluation of concurrent systems using petri nets. IPL (1984) S.M. Burns, A.J. Martin. Performance Analysis and Optimization of Asynchronous Circuits. Proc. ARVLSI (1990) W. Hua and R. Manohar. Exact timing analysis for concurrent systems. To appear, work-in-progress session, DAC (2016)
20 Asynchronous logic in TrueNorth Spikes trigger activity so they should be asynchronous Neurons leak periodically so add a periodic event to the system (the clock) At the external timing tick : Update each neuron, delivering spikes plus any local state update If the neuron spikes, send output spike to destinations, with specified delay Spikes travel through asynchronous network Receiver buffers spikes until it is time to deliver them Digital, asynchronous logic Scheduler Router Token Control Neuron SRAM
21 Asynchronous logic in TrueNorth Spike communication network Links are a shared resource Links might be contended Spike delivery times can differ based on traffic Different runs of the same network might produce different spike patterns Instead, we use a spike scheduler Guarantees deterministic computation and hence, repeatability
22 Summary Asynchronous logic has a long history Some key properties exploited in projects Gap between average and maximum delay for a computation Automatic data-driven power management Elastic pipelining Robustness to timing uncertainty/variation Reduced EMI and noise Continuous-time information representation
23 Acknowledgments The async community Students: Ned Bingham, Wenmian Hua, Sean Ogden, Praful Purohit, Tayyar Rzayev, Nitish Srivastava John Teifel, Clinton Kelly, Virantha Ekanayake, Song Peng, David Biermann, David Fang, Chris LaFrieda, Filipp Akopyan, Basit Riaz Sheikh, Benjamin Tang, Nabil Imam, Sandra Jackson, Carlos Otero, Benjamin Hill, Stephen Longfield, Jonathan Tse, Robert Karmazin Collaborators: General: David Albonesi, Sunil Bhave, Francois Guimbretiere, Amit Lal, Yoram Moses, Ivan Sutherland, Yannis Tsividis Neuromorphic: John Arthur, Kwabena Boahen, Tobi Delbruck, Chris Eliasmith, Giacomo Indiveri, Paul Merolla, Dharmendra Modha Support: AFRL, DARPA, IARPA, NSF, ONR, IBM, Lockheed, Qualcomm, Samsung, SRC
Advances in Designing Clockless Digital Systems
Advances in Designing Clockless Digital Systems Prof. Steven M. Nowick nowick@cs.columbia.edu Department of Computer Science (and Elect. Eng.) Columbia University New York, NY, USA Introduction l Synchronous
More informationPOWER consumption has become one of the most important
704 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 39, NO. 4, APRIL 2004 Brief Papers High-Throughput Asynchronous Datapath With Software-Controlled Voltage Scaling Yee William Li, Student Member, IEEE, George
More informationAchieving Lightweight Multicast in Asynchronous Networks-on-Chip Using Local Speculation
Achieving Lightweight Multicast in Asynchronous Networks-on-Chip Using Local Speculation Kshitij Bhardwaj Dept. of Computer Science Columbia University Steven M. Nowick 2016 ACM/IEEE Design Automation
More informationA Energy-Efficient Pipeline Templates for High-Performance Asynchronous Circuits
A Energy-Efficient Pipeline Templates for High-Performance Asynchronous Circuits Basit Riaz Sheikh and Rajit Manohar, Cornell University We present two novel energy-efficient pipeline templates for high
More informationAn Asynchronous Floating-Point Multiplier
An Asynchronous Floating-Point Multiplier Basit Riaz Sheikh and Rajit Manohar Computer Systems Lab Cornell University http://vlsi.cornell.edu/ The person that did the work! Motivation Fast floating-point
More informationReconfigurable Asynchronous Logic
Reconfigurable Asynchronous Logic Rajit Manohar Computer Systems Laboratory Cornell University, Ithaca, NY 14853. Abstract Challenges in mapping asynchronous logic to a flexible substrate include developing
More informationSpiNNaker - a million core ARM-powered neural HPC
The Advanced Processor Technologies Group SpiNNaker - a million core ARM-powered neural HPC Cameron Patterson cameron.patterson@cs.man.ac.uk School of Computer Science, The University of Manchester, UK
More informationAdvances in Designing Clockless Digital Systems
Advances in Designing Clockless Digital Systems Prof. Steven M. Nowick nowick@cs.columbia columbia.edu Department of Computer Science (and Elect. Eng.) Columbia University New York, NY, USA Introduction
More informationA Synthesizable RTL Design of Asynchronous FIFO Interfaced with SRAM
A Synthesizable RTL Design of Asynchronous FIFO Interfaced with SRAM Mansi Jhamb, Sugam Kapoor USIT, GGSIPU Sector 16-C, Dwarka, New Delhi-110078, India Abstract This paper demonstrates an asynchronous
More informationSingle-Track Asynchronous Pipeline Templates Using 1-of-N Encoding
Single-Track Asynchronous Pipeline Templates Using 1-of-N Encoding Marcos Ferretti, Peter A. Beerel Department of Electrical Engineering Systems University of Southern California Los Angeles, CA 90089
More informationCHAPTER 3 ASYNCHRONOUS PIPELINE CONTROLLER
84 CHAPTER 3 ASYNCHRONOUS PIPELINE CONTROLLER 3.1 INTRODUCTION The introduction of several new asynchronous designs which provides high throughput and low latency is the significance of this chapter. The
More informationHigh Performance Asynchronous Circuit Design Method and Application
High Performance Asynchronous Circuit Design Method and Application Charlie Brej School of Computer Science, The University of Manchester, Oxford Road, Manchester, M13 9PL, UK. cbrej@cs.man.ac.uk Abstract
More informationAdvances in Designing Clockless Digital Systems
Advances in Designing Clockless Digital Systems Prof. Steven M. Nowick nowick@cs.columbia columbia.edu Department of Computer Science (and Elect. Eng.) Columbia University New York, NY, USA Introduction
More informationOUTLINE Introduction Power Components Dynamic Power Optimization Conclusions
OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions 04/15/14 1 Introduction: Low Power Technology Process Hardware Architecture Software Multi VTH Low-power circuits Parallelism
More informationGRE Architecture Session
GRE Architecture Session Session 2: Saturday 23, 1995 Young H. Cho e-mail: youngc@cs.berkeley.edu www: http://http.cs.berkeley/~youngc Y. H. Cho Page 1 Review n Homework n Basic Gate Arithmetics n Bubble
More informationModified Micropipline Architecture for Synthesizable Asynchronous FIR Filter Design
Modified Micropipline Architecture for Synthesizable Asynchronous FIR Filter Design Basel Halak and Hsien-Chih Chiu, ECS, Southampton University, Southampton, SO17 1BJ, United Kingdom Email: {bh9, hc13g09}
More informationNatalie Enright Jerger, Jason Anderson, University of Toronto November 5, 2010
Next Generation FPGA Research Natalie Enright Jerger, Jason Anderson, and Ali Sheikholeslami l i University of Toronto November 5, 2010 Outline Part (I): Next Generation FPGA Architectures Asynchronous
More informationTIMA Lab. Research Reports
ISSN 1292-862 TIMA Lab. Research Reports TIMA Laboratory, 46 avenue Félix Viallet, 38000 Grenoble France Session 1.2 - Hop Topics for SoC Design Asynchronous System Design Prof. Marc RENAUDIN TIMA, Grenoble,
More informationA 167-processor Computational Array for Highly-Efficient DSP and Embedded Application Processing
A 167-processor Computational Array for Highly-Efficient DSP and Embedded Application Processing Dean Truong, Wayne Cheng, Tinoosh Mohsenin, Zhiyi Yu, Toney Jacobson, Gouri Landge, Michael Meeuwsen, Christine
More informationIEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS 1
TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS 1 The Design of High-Performance Dynamic Asynchronous Pipelines: Lookahead Style Montek Singh and Steven M. Nowick Abstract A new class of asynchronous
More informationImplementation of ALU Using Asynchronous Design
IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) ISSN: 2278-2834, ISBN: 2278-8735. Volume 3, Issue 6 (Nov. - Dec. 2012), PP 07-12 Implementation of ALU Using Asynchronous Design P.
More informationTEMPLATE BASED ASYNCHRONOUS DESIGN
TEMPLATE BASED ASYNCHRONOUS DESIGN By Recep Ozgur Ozdag A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the
More informationMultilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology
1 Multilevel Memories Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology Based on the material prepared by Krste Asanovic and Arvind CPU-Memory Bottleneck 6.823
More informationFPGA architecture and design technology
CE 435 Embedded Systems Spring 2017 FPGA architecture and design technology Nikos Bellas Computer and Communications Engineering Department University of Thessaly 1 FPGA fabric A generic island-style FPGA
More information16x16 Multiplier Design Using Asynchronous Pipeline Based On Constructed Critical Data Path
Volume 4 Issue 01 Pages-4786-4792 January-2016 ISSN (e): 2321-7545 Website: http://ijsae.in 16x16 Multiplier Design Using Asynchronous Pipeline Based On Constructed Critical Data Path Authors Channa.sravya
More informationIntroduction to asynchronous circuit design. Motivation
Introduction to asynchronous circuit design Using slides from: Jordi Cortadella, Universitat Politècnica de Catalunya, Spain Michael Kishinevsky, Intel Corporation, USA Alex Kondratyev, Theseus Logic,
More informationOutline Marquette University
COEN-4710 Computer Hardware Lecture 1 Computer Abstractions and Technology (Ch.1) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations
More informationDesign of a Pipelined 32 Bit MIPS Processor with Floating Point Unit
Design of a Pipelined 32 Bit MIPS Processor with Floating Point Unit P Ajith Kumar 1, M Vijaya Lakshmi 2 P.G. Student, Department of Electronics and Communication Engineering, St.Martin s Engineering College,
More informationIntroduction to Asynchronous Circuits and Systems
RCIM Presentation Introduction to Asynchronous Circuits and Systems Kristofer Perta April 02 / 2004 University of Windsor Computer and Electrical Engineering Dept. Presentation Outline Section - Introduction
More informationAsynchronous Circuit Design
Asynchronous Circuit Design Chris J. Myers Lecture 9: Applications Chapter 9 Chris J. Myers (Lecture 9: Applications) Asynchronous Circuit Design 1 / 60 Overview A brief history of asynchronous circuit
More informationThe Nios II Family of Configurable Soft-core Processors
The Nios II Family of Configurable Soft-core Processors James Ball August 16, 2005 2005 Altera Corporation Agenda Nios II Introduction Configuring your CPU FPGA vs. ASIC CPU Design Instruction Set Architecture
More informationISSN Vol.08,Issue.07, July-2016, Pages:
ISSN 2348 2370 Vol.08,Issue.07, July-2016, Pages:1312-1317 www.ijatir.org Low Power Asynchronous Domino Logic Pipeline Strategy Using Synchronization Logic Gates H. NASEEMA BEGUM PG Scholar, Dept of ECE,
More informationThe Design and Implementation of a Low-Latency On-Chip Network
The Design and Implementation of a Low-Latency On-Chip Network Robert Mullins 11 th Asia and South Pacific Design Automation Conference (ASP-DAC), Jan 24-27 th, 2006, Yokohama, Japan. Introduction Current
More informationArchitecture or Parallel Computers CSC / ECE 506
Architecture or Parallel Computers CSC / ECE 506 Summer 2006 Scalable Programming Models 6/19/2006 Dr Steve Hunter Back to Basics Parallel Architecture = Computer Architecture + Communication Architecture
More informationEmbedded Systems. 7. System Components
Embedded Systems 7. System Components Lothar Thiele 7-1 Contents of Course 1. Embedded Systems Introduction 2. Software Introduction 7. System Components 10. Models 3. Real-Time Models 4. Periodic/Aperiodic
More informationAn Asynchronous Array of Simple Processors for DSP Applications
An Asynchronous Array of Simple Processors for DSP Applications Zhiyi Yu, Michael Meeuwsen, Ryan Apperson, Omar Sattari, Michael Lai, Jeremy Webb, Eric Work, Tinoosh Mohsenin, Mandeep Singh, Bevan Baas
More informationEmbedded Systems: Hardware Components (part I) Todor Stefanov
Embedded Systems: Hardware Components (part I) Todor Stefanov Leiden Embedded Research Center Leiden Institute of Advanced Computer Science Leiden University, The Netherlands Outline Generic Embedded System
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 5, Sep-Oct 2014
RESEARCH ARTICLE OPEN ACCESS A Survey on Efficient Low Power Asynchronous Pipeline Design Based on the Data Path Logic D. Nandhini 1, K. Kalirajan 2 ME 1 VLSI Design, Assistant Professor 2 Department of
More informationRecent Advances in Designing Clockless Digital Systems
Recent Advances in Designing Clockless Digital Systems Prof. Steven M. Nowick nowick@cs.columbia columbia.edu Chair, Computer Engineering Program Department of Computer Science (and Elect. Eng.) Columbia
More informationComputer Systems. Binary Representation. Binary Representation. Logical Computation: Boolean Algebra
Binary Representation Computer Systems Information is represented as a sequence of binary digits: Bits What the actual bits represent depends on the context: Seminar 3 Numerical value (integer, floating
More informationPOWER ANALYSIS OF CRITICAL PATH DELAY DESIGN USING DOMINO LOGIC
181 POWER ANALYSIS OF CRITICAL PATH DELAY DESIGN USING DOMINO LOGIC R.Yamini, V.Kavitha, S.Sarmila, Anila Ramachandran,, Assistant Professor, ECE Dept, M.E Student, M.E. Student, M.E. Student Sri Eshwar
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationA Cache Hierarchy in a Computer System
A Cache Hierarchy in a Computer System Ideally one would desire an indefinitely large memory capacity such that any particular... word would be immediately available... We are... forced to recognize the
More informationHP PA-8000 RISC CPU. A High Performance Out-of-Order Processor
The A High Performance Out-of-Order Processor Hot Chips VIII IEEE Computer Society Stanford University August 19, 1996 Hewlett-Packard Company Engineering Systems Lab - Fort Collins, CO - Cupertino, CA
More informationChapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction
More informationAn Overview of Standard Cell Based Digital VLSI Design
An Overview of Standard Cell Based Digital VLSI Design With examples taken from the implementation of the 36-core AsAP1 chip and the 1000-core KiloCore chip Zhiyi Yu, Tinoosh Mohsenin, Aaron Stillmaker,
More informationAdvanced Computer Architecture
Advanced Computer Architecture Chapter 1 Introduction into the Sequential and Pipeline Instruction Execution Martin Milata What is a Processors Architecture Instruction Set Architecture (ISA) Describes
More informationMapping AER Events to SpiNNaker Multicast Packets
SpiNNaker AppNote 8 SpiNNaker - AER interface Page 1 AppNote 8 - Interfacing AER devices to SpiNNaker using an FPGA SpiNNaker Group, School of Computer Science, University of Manchester Luis A. Plana -
More informationMicroarchitecture Overview. Performance
Microarchitecture Overview Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu January 18, 2005 Performance 4 Make operations faster Process improvements Circuit improvements Use more transistors to make
More informationDesign of 8 bit Pipelined Adder using Xilinx ISE
Design of 8 bit Pipelined Adder using Xilinx ISE 1 Jayesh Diwan, 2 Rutul Patel Assistant Professor EEE Department, Indus University, Ahmedabad, India Abstract An asynchronous circuit, or self-timed circuit,
More informationChapter 03. Authors: John Hennessy & David Patterson. Copyright 2011, Elsevier Inc. All rights Reserved. 1
Chapter 03 Authors: John Hennessy & David Patterson Copyright 2011, Elsevier Inc. All rights Reserved. 1 Figure 3.3 Comparison of 2-bit predictors. A noncorrelating predictor for 4096 bits is first, followed
More informationCS 31: Intro to Systems Digital Logic. Kevin Webb Swarthmore College February 3, 2015
CS 31: Intro to Systems Digital Logic Kevin Webb Swarthmore College February 3, 2015 Reading Quiz Today Hardware basics Machine memory models Digital signals Logic gates Circuits: Borrow some paper if
More informationA Scalable Counterflow-Pipelined Asynchronous Radix-4 Booth Multiplier Λ
A Scalable Counterflow-Pipelined Asynchronous Radix-4 Booth Multiplier Λ Justin Hensley, Anselmo Lastra and Montek Singh Department of Computer Science University of North Carolina, Chapel Hill, NC 27514,
More informationVerilog Sequential Logic. Verilog for Synthesis Rev C (module 3 and 4)
Verilog Sequential Logic Verilog for Synthesis Rev C (module 3 and 4) Jim Duckworth, WPI 1 Sequential Logic Module 3 Latches and Flip-Flops Implemented by using signals in always statements with edge-triggered
More informationMore Course Information
More Course Information Labs and lectures are both important Labs: cover more on hands-on design/tool/flow issues Lectures: important in terms of basic concepts and fundamentals Do well in labs Do well
More informationCS 31: Intro to Systems Digital Logic. Kevin Webb Swarthmore College February 2, 2016
CS 31: Intro to Systems Digital Logic Kevin Webb Swarthmore College February 2, 2016 Reading Quiz Today Hardware basics Machine memory models Digital signals Logic gates Circuits: Borrow some paper if
More informationEmerging computing paradigms: The case of neuromorphic platforms
Emerging computing paradigms: The case of neuromorphic platforms Andrea Acquaviva DAUIN Computer and control Eng. Dept. 1 Neuromorphic platforms Neuromorphic platforms objective is to enable simulation
More informationThe CoreConnect Bus Architecture
The CoreConnect Bus Architecture Recent advances in silicon densities now allow for the integration of numerous functions onto a single silicon chip. With this increased density, peripherals formerly attached
More informationAFRL-RI-RS-TR
AFRL-RI-RS-TR-2015-047 EXPERIMENTAL 3D ASYNCHRONOUS FIELD PROGRAMMABLE GATE ARRAY (FPGA) CORNELL UNIVERSITY MARCH 2015 FINAL TECHNICAL REPORT APPROVED FOR PUBLIC RELEASE; DISTRIBUTION UNLIMITED STINFO
More informationEECS150 - Digital Design Lecture 17 Memory 2
EECS150 - Digital Design Lecture 17 Memory 2 October 22, 2002 John Wawrzynek Fall 2002 EECS150 Lec17-mem2 Page 1 SDRAM Recap General Characteristics Optimized for high density and therefore low cost/bit
More informationOverview. Implementing Gigabit Routers with NetFPGA. Basic Architectural Components of an IP Router. Per-packet processing in an IP Router
Overview Implementing Gigabit Routers with NetFPGA Prof. Sasu Tarkoma The NetFPGA is a low-cost platform for teaching networking hardware and router design, and a tool for networking researchers. The NetFPGA
More informationMulticycle-Path Challenges in Multi-Synchronous Systems
Multicycle-Path Challenges in Multi-Synchronous Systems G. Engel 1, J. Ziebold 1, J. Cox 2, T. Chaney 2, M. Burke 2, and Mike Gulotta 3 1 Department of Electrical and Computer Engineering, IC Design Research
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationEmbedded Systems. 8. Hardware Components. Lothar Thiele. Computer Engineering and Networks Laboratory
Embedded Systems 8. Hardware Components Lothar Thiele Computer Engineering and Networks Laboratory Do you Remember? 8 2 8 3 High Level Physical View 8 4 High Level Physical View 8 5 Implementation Alternatives
More informationLecture 3: Modulation & Layering"
Lecture 3: Modulation & Layering" CSE 123: Computer Networks Alex C. Snoeren HW 1 out Today, due 10/09! Lecture 3 Overview" Encoding schemes Shannon s Law and Nyquist Limit Clock recovery Manchester, NRZ,
More informationInterconnecting Components
Interconnecting Components Need interconnections between CPU, memory, controllers Bus: shared communication channel Parallel set of wires for data and synchronization of data transfer Can become a bottleneck
More informationComputer Architecture: Dataflow/Systolic Arrays
Data Flow Computer Architecture: Dataflow/Systolic Arrays he models we have examined all assumed Instructions are fetched and retired in sequential, control flow order his is part of the Von-Neumann model
More informationA Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding
A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding N.Rajagopala krishnan, k.sivasuparamanyan, G.Ramadoss Abstract Field Programmable Gate Arrays (FPGAs) are widely
More informationStructured Datapaths. Preclass 1. Throughput Yield. Preclass 1
ESE534: Computer Organization Day 23: November 21, 2016 Time Multiplexing Tabula March 1, 2010 Announced new architecture We would say w=1, c=8 arch. March, 2015 Tabula closed doors 1 [src: www.tabula.com]
More informationThe design of a simple asynchronous processor
The design of a simple asynchronous processor SUN-YEN TAN 1, WEN-TZENG HUANG 2 1 Department of Electronic Engineering National Taipei University of Technology No. 1, Sec. 3, Chung-hsiao E. Rd., Taipei,10608,
More informationKiloCore: A 32 nm 1000-Processor Array
KiloCore: A 32 nm 1000-Processor Array Brent Bohnenstiehl, Aaron Stillmaker, Jon Pimentel, Timothy Andreas, Bin Liu, Anh Tran, Emmanuel Adeagbo, Bevan Baas University of California, Davis VLSI Computation
More informationANEW asynchronous pipeline style, called MOUSETRAP,
684 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 15, NO. 6, JUNE 2007 MOUSETRAP: High-Speed Transition-Signaling Asynchronous Pipelines Montek Singh and Steven M. Nowick Abstract
More informationIntroduction to Electronic Design Automation. Model of Computation. Model of Computation. Model of Computation
Introduction to Electronic Design Automation Model of Computation Jie-Hong Roland Jiang 江介宏 Department of Electrical Engineering National Taiwan University Spring 03 Model of Computation In system design,
More informationISSN: [Bilani* et al.,7(2): February, 2018] Impact Factor: 5.164
IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY A REVIEWARTICLE OF SDRAM DESIGN WITH NECESSARY CRITERIA OF DDR CONTROLLER Sushmita Bilani *1 & Mr. Sujeet Mishra 2 *1 M.Tech Student
More informationSilicon Auditory Processors as Computer Peripherals
Silicon uditory Processors as Computer Peripherals John Lazzaro, John Wawrzynek CS Division UC Berkeley Evans Hall Berkeley, C 94720 lazzaro@cs.berkeley.edu, johnw@cs.berkeley.edu M. Mahowald, Massimo
More informationReal Time NoC Based Pipelined Architectonics With Efficient TDM Schema
Real Time NoC Based Pipelined Architectonics With Efficient TDM Schema [1] Laila A, [2] Ajeesh R V [1] PG Student [VLSI & ES] [2] Assistant professor, Department of ECE, TKM Institute of Technology, Kollam
More informationWrite only as much as necessary. Be brief!
1 CIS371 Computer Organization and Design Midterm Exam Prof. Martin Thursday, March 15th, 2012 This exam is an individual-work exam. Write your answers on these pages. Additional pages may be attached
More informationCreating a Scalable Microprocessor:
Creating a Scalable Microprocessor: A 16-issue Multiple-Program-Counter Microprocessor With Point-to-Point Scalar Operand Network Michael Bedford Taylor J. Kim, J. Miller, D. Wentzlaff, F. Ghodrat, B.
More informationMicroarchitecture Overview. Performance
Microarchitecture Overview Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu January 15, 2007 Performance 4 Make operations faster Process improvements Circuit improvements Use more transistors to make
More informationPower dissipation! The VLSI Interconnect Challenge. Interconnect is the crux of the problem. Interconnect is the crux of the problem.
The VLSI Interconnect Challenge Avinoam Kolodny Electrical Engineering Department Technion Israel Institute of Technology VLSI Challenges System complexity Performance Tolerance to digital noise and faults
More informationProcessor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP
Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Presenter: Course: EEC 289Q: Reconfigurable Computing Course Instructor: Professor Soheil Ghiasi Outline Overview of M.I.T. Raw processor
More informationBuses. Disks PCI RDRAM RDRAM LAN. Some slides adapted from lecture by David Culler. Pentium 4 Processor. Memory Controller Hub.
es > 100 MB/sec Pentium 4 Processor L1 and L2 caches Some slides adapted from lecture by David Culler 3.2 GB/sec Display Memory Controller Hub RDRAM RDRAM Dual Ultra ATA/100 24 Mbit/sec Disks LAN I/O Controller
More informationAsynchronous Circuits: An Increasingly Practical Design Solution
Asynchronous Circuits: An Increasingly Practical Design Solution Peter A. Beerel Fulcrum Microsystems Calabasas Hills, CA 91301 and Electrical Engineering, Systems Division University of Southern California
More informationAMULET2e. S. B. Furber, J. D. Garside, S. Temple, Department of Computer Science, The University of Manchester, Oxford Road, Manchester M13 9PL, UK.
AMULET2e S. B. Furber, J. D. Garside, S. Temple, Department of Computer Science, The University of Manchester, Oxford Road, Manchester M13 9PL, UK. P. Day, N. C. Paver, Cogency Technology Inc., Bruntwood
More informationEmbedded processors. Timo Töyry Department of Computer Science and Engineering Aalto University, School of Science timo.toyry(at)aalto.
Embedded processors Timo Töyry Department of Computer Science and Engineering Aalto University, School of Science timo.toyry(at)aalto.fi Comparing processors Evaluating processors Taxonomy of processors
More informationA 1-GHz Configurable Processor Core MeP-h1
A 1-GHz Configurable Processor Core MeP-h1 Takashi Miyamori, Takanori Tamai, and Masato Uchiyama SoC Research & Development Center, TOSHIBA Corporation Outline Background Pipeline Structure Bus Interface
More informationESE532: System-on-a-Chip Architecture. Today. Programmable SoC. Message. Process. Reminder
ESE532: System-on-a-Chip Architecture Day 5: September 18, 2017 Dataflow Process Model Today Dataflow Process Model Motivation Issues Abstraction Basic Approach Dataflow variants Motivations/demands for
More informationFinal Lecture. A few minutes to wrap up and add some perspective
Final Lecture A few minutes to wrap up and add some perspective 1 2 Instant replay The quarter was split into roughly three parts and a coda. The 1st part covered instruction set architectures the connection
More informationCopyright 2012, Elsevier Inc. All rights reserved.
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more
More informationComputer and Hardware Architecture II. Benny Thörnberg Associate Professor in Electronics
Computer and Hardware Architecture II Benny Thörnberg Associate Professor in Electronics Parallelism Microscopic vs Macroscopic Microscopic parallelism hardware solutions inside system components providing
More informationA Reconfigurable Crossbar Switch with Adaptive Bandwidth Control for Networks-on
A Reconfigurable Crossbar Switch with Adaptive Bandwidth Control for Networks-on on-chip Donghyun Kim, Kangmin Lee, Se-joong Lee and Hoi-Jun Yoo Semiconductor System Laboratory, Dept. of EECS, Korea Advanced
More informationEE241 - Spring 2004 Advanced Digital Integrated Circuits
EE24 - Spring 2004 Advanced Digital Integrated Circuits Borivoje Nikolić Lecture 2 Impact of Scaling Class Material Last lecture Class scope, organization Today s lecture Impact of scaling 2 Major Roadblocks.
More informationField Programmable Gate Array (FPGA)
Field Programmable Gate Array (FPGA) Lecturer: Krébesz, Tamas 1 FPGA in general Reprogrammable Si chip Invented in 1985 by Ross Freeman (Xilinx inc.) Combines the advantages of ASIC and uc-based systems
More informationMemory Systems IRAM. Principle of IRAM
Memory Systems 165 other devices of the module will be in the Standby state (which is the primary state of all RDRAM devices) or another state with low-power consumption. The RDRAM devices provide several
More informationBabbage Analytical Machine
Von Neumann Machine Babbage Analytical Machine The basis of modern computers is proposed by a professor of mathematics at Cambridge University named Charles Babbage (1972-1871). He has invented a mechanical
More informationIMPROVES. Initial Investment is Low Compared to SoC Performance and Cost Benefits
NOC INTERCONNECT IMPROVES SOC ECONO CONOMICS Initial Investment is Low Compared to SoC Performance and Cost Benefits A s systems on chip (SoCs) have interconnect, along with its configuration, verification,
More informationCS250 VLSI Systems Design Lecture 8: Introduction to Hardware Design Patterns
CS250 VLSI Systems Design Lecture 8: Introduction to Hardware Design Patterns John Wawrzynek, Jonathan Bachrach, with Krste Asanovic, John Lazzaro and Rimas Avizienis (TA) UC Berkeley Fall 2012 Lecture
More informationComputer and Hardware Architecture I. Benny Thörnberg Associate Professor in Electronics
Computer and Hardware Architecture I Benny Thörnberg Associate Professor in Electronics Hardware architecture Computer architecture The functionality of a modern computer is so complex that no human can
More informationComputer Architecture Dr. Charles Kim Howard University
EECE416 Microcomputer Fundamentals & Design Computer Architecture Dr. Charles Kim Howard University 1 Computer Architecture Computer Architecture Art of selecting and interconnecting hardware components
More information