Ting Wu, Chi-Ying Tsui, Mounir Hamdi Hong Kong University of Science & Technology Hong Kong SAR, China
|
|
- Ginger Vivian Baldwin
- 6 years ago
- Views:
Transcription
1 CMOS Crossbar Ting Wu, Chi-Ying Tsui, Mounir Hamdi Hong Kong University of Science & Technology Hong Kong SAR, China
2 OUTLINE Motivations Problems of Designing Large Crossbar Our Approach - Pipelined MUX Core Interface Link and Clocking Design Conclusions
3 Motivations Advances in fiber optic link technology and WDM have made raw bandwidth in abundance Switches/Routers are replacing the transmission link as the bottleneck of the network Switches/Routers with high speed (OC-192, 10Gb/s) and large number of I/O ports (128*128 or 256*256) are becoming a necessity Key issues for designing high speed router: Switching Fabric Interconnects Queuing Schemes Arbiters Multicast
4 Fabric Interconnects: Crossbar Crossbar (Crosspoint) Fabric is becoming the preferable interconnect fabric for high speed and scalable switching It has been proven that crossbar (even input-queued) can have as high throughput as any switch. A crossbar inherently supports multicast efficiently. QoS can be implemented reasonably easy. The key challenge is the scalability for high line rate and large number of ports CMOS technology can achieve high density and low cost
5 Architecture of the CMOS Crossbar Switch Crossbar Switch Core fulfills the switch/router function Controller configures the crossbar core switching High speed data link communicates between switch fabric and line card PLL provides on-chip precise clock High Speed Data Link High Speed Data Link Controller 1GHz 256*256 Crossbar Switch Core High Speed Data Link P L L High Speed Data Link
6 Two Approaches to Build the Core X-Y Based Crossbar MUX Based Crossbar INPUT Control word INPUT OUTPUT 2:1 MUX OUTPUT Control word Scalability : N 2 Speed : limited by Cap at input and output lines Control: N 2 bits Scalability : N 2 Speed : limited by Cap only at input line Control: N*Log 2 N bits
7 Problems of Designing Large Crossbar Switch The switch core scales to N 2 Design complexity increases The performance requirement increases much faster than that can be achieved through CMOS technology scaling The throughput can be satisfied by using multiple bit-slices (e.g. 8) of the core, however, the core size increases by 8 times Wire delay is also substantial in high performance chip
8 Our Approach Pipelined MUX Crossbar Digital Core Digital MUX tree based design technique guarantees the high performance as well as the low design complexity In order to integrate large crossbar switch, only 2 bit-slices are embedded in the digital core instead of 8 (60%+ area saving) 1GHz digital core is demanded for the 2 Gb/s interface, the MUX tree can be pipelined to fulfill the requirement Additional pipeline stage is added to drive long wire
9 SDFF embedded with MUX Vdd embedded 4- to-1 MUX P1 N1 X S NAND I3 P2 I5 Q QB Sel I4 N5 N2 N6 I6 D Clk N3 N4 I1 I2 High performance Semi-Dynamic Flip-Flop (SDFF) is used [Klass98, Stojanovic et.al. 99] One of the fastest Flip-Flops due to negative setup time Little overhead for embedding with MUX function
10 Pipeline Stages Partition Static SDFF embedded Static SDFF embedded 1st stage 2nd stage (a) Static 2-to-1 SDFF embedded Static 8-to-1 SDFF embedded 1st stage 2nd stage The pipeline of the 256-to-1 MUX can be partitioned as Natural: 16-to-1 MUX in 1 st stage; 16-to-1 MUX in 2 nd stage Balanced: 8-to-1 MUX in 1 st stage; 32-to-1 MUX 2 nd stage (b)
11 Driving Long Wire Adding Repeater cannot Satisfy the 1GHz Requirement The 1 st stage is critical due to the large capacitor at the input line Distributed R-C wire model is employed Repeater can be inserted to reduce the wire delay For 256 ports, even inserting the optimal size and number of repeater, the delay is still larger than 1ns D CLK0 Q0 Qb0 driv1 Delay time (ps) Driver delay Rwire Cwire W Static 2to1 MUX in each cell CLK1 DD Wire delay Repeater cell cell cell cell SDFF_em 4to1 MUX Q1 Port No.= 64 Port No.= 128 Port No.= 256 Repeater number
12 Adding One Pipeline Stage to Drive Long Wire Add one more stage for driving the long wire by D CLK0 inserting a Flip-Flop The whole 256*256 crossbar is divided into 4 128* sub-crossbar, so that the input line only need to drive 128 cells instead of 256 For 128 ports, sub-ns delay time is achievable Q0 Qb0 driv1 Delay time (ps) Driver delay Rwire Cwire W Static 2to1 MUX Wire delay in each cell CLK1 DD Flip-Flop cell cell cell cell FF CLK1 SDFF_em 4to1 MUX Repeater number Q1 Port No.= 64 Port No.= 128 Port No.= 256
13 3-stages Pipelined MUX Crossbar Floor-Planning The 256*256 crossbar consists of 4 sub-crossbars (128*128) running at 1GHz frequency 2 pipeline stages in each sub-crossbar 2 bit-slices are embedded matching with 2Gb/s data link IN [0:63] OUT [0:63] Legend R R OUT [192:255] IN [64:127] OUT [64:127] R R R R 2:1 2:1 R R R R R 0 ~ 3 : Sub-Crossbar 0~3; IN [192:255] 2:1 2:1 2:1: SDFF embedded with 2-to-1 Mux; R 2 R R: Register c R 128*128 Sub-Crossbar OUT [128:191] IN [128:191]
14 3-stages Pipelined MUX Crossbar Timing Diagram In sub-crossbar 0, inputs [0:127] are switching in the 1 st and 2 nd stages, while in sub-crossbar 3, inputs [128:255] are switching in the 2 nd and 3 rd stages Finally, the two groups of outputs are fed into SDFF_embedded with 2-to-1 MUX to complete the 256-to-1 MUX action IN [0:63] IN [64:127] Sub-Crossbar0 Static 2-to-1 1st stage SDFF Static 2nd stage Static 3rd stage IN [128:191] IN [192:255] Sub-Crossbar3 1st stage Static 2-to-1 2nd stage SDFF Static 3rd stage Static SDFF 2-to-1 Mux OUT [0:63] OUT [192:255]
15 The Sub-Crossbar Circuits Simulation Results Technology Layout size Sub-crossbar No. Post layout simulation 1st stage 2 nd stage 3 rd stage TSMC 0.25µm SCN5M Deep 5.9 mm * 3.1 mm Sub-crossbar0 Sub-crossbar3 959 ps 845 ps 866 ps 959 ps 236 ps 866 ps
16 Control Circuits Control bits are used to configure the corresponding MUX in the crossbar pipeline in the correct timing stage. For saving the pin counts, the control inputs are embedded within the data inputs, each incoming frame packet includes one byte control word and 64 bits of data The timing constraints can be satisfied by careful pipelining the control path
17 Control Circuits (cont d) Bang-Bang PD: samples the 2Gb/s inputs, converts to 2bits, each at 1Gb/s Re-synchronization: synchronizes each input signal to the main clock 2Gb/s Interface clock Bang - Bang P D 2 1Gb/s 2 Counter Pulse Gen Main clock Counter Pulse Gen D M U X to Data - Path DMUX: demultiplexs signal to data & control bits Counter: counts 4/36 and controls the DMUX Control DMUX 8 Resynchronization subcrossbar to Control - Path to Crossbar
18 Full Crossbar Core Layout and Specification Full 256*256 crossbar core with 2 bit-slices Technology: TSMC 0.25µm SCN5M Deep, 5 Layer Metal Layout size: 14 mm*8 mm Transistor counts: 2000k Supply voltage: 2.5v Clock frequency: 1GHz Power: 40W
19 Interface Link and Clocking Design The dual loop delay locked loop () design technique is adopted in the data link for data and clock recovery The main analog generates multiple clock phases for the interpolation in the full digital periphery loop A half rate bang-bang phase detector is used in the periphery loop to sample the 2Gb/s incoming signal by using 1GHz clock A 3 rd loop, an analog PLL, provides the 1GHz on-chip clock D L L Peri - Loops P L L
20 Interface Link and Clocking Design System CLK The system clock is at 250MHz, PLL provides the precise 1GHz clock for the whole chip Several periphery loops share one analog D 2Gb/s D 2Gb/s Bang-Bang 4 Peri - Loop Bang-Bang 4 2 1Gb/s F S M Phase Selector & Interpolator 2 1Gb/s F S M Phase Selector & Interpolator PFD 1/4 VCDL P L L CLK Distribution Network P D CP+LP CP+LF VCO Peri - Loop D L L......
21 Conclusions A 2Gb/s 256*256 CMOS Crossbar Switch Core is achievable with current process technology Significant area saving is obtained by using only 2 bit-slices in the crossbar switch core 3-stages pipelined MUX circuit is proposed to decrease the cycle time to less than 1ns Post layout simulation results show that each stage can run at a clock rate higher than 1GHz Full 256*256 crossbar core has been laid out to demonstrate the design PLL & dual circuits have been designed for the clocking and high speed link in the whole chip
Chapter - 7. Multiplexing and circuit switches
Chapter - 7 Multiplexing and circuit switches Multiplexing Multiplexing is used to combine multiple communication links into a single stream. The aim is to share an expensive resource. For example several
More informationProblem Formulation. Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets.
Clock Routing Problem Formulation Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets. Better to develop specialized routers for these nets.
More informationIntroduction to Field Programmable Gate Arrays
Introduction to Field Programmable Gate Arrays Lecture 1/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden, 31 May 9 June 2007 Javier Serrano, CERN AB-CO-HT Outline Historical introduction.
More informationThe Design of the KiloCore Chip
The Design of the KiloCore Chip Aaron Stillmaker*, Brent Bohnenstiehl, Bevan Baas DAC 2017: Design Challenges of New Processor Architectures University of California, Davis VLSI Computation Laboratory
More informationIV. PACKET SWITCH ARCHITECTURES
IV. PACKET SWITCH ARCHITECTURES (a) General Concept - as packet arrives at switch, destination (and possibly source) field in packet header is used as index into routing tables specifying next switch in
More informationHigh-speed Serial Interface
High-speed Serial Interface Lect. 16 Clock and Data Recovery 3 1 CDR Design Example ( 권대현 ) Clock and Data Recovery Circuits Transceiver PLL vs. CDR High-speed CDR Phase Detector Charge Pump Voltage Controlled
More informationA Four-Terabit Single-Stage Packet Switch with Large. Round-Trip Time Support. F. Abel, C. Minkenberg, R. Luijten, M. Gusat, and I.
A Four-Terabit Single-Stage Packet Switch with Large Round-Trip Time Support F. Abel, C. Minkenberg, R. Luijten, M. Gusat, and I. Iliadis IBM Research, Zurich Research Laboratory, CH-8803 Ruschlikon, Switzerland
More informationHigh Capacity and High Performance 20nm FPGAs. Steve Young, Dinesh Gaitonde August Copyright 2014 Xilinx
High Capacity and High Performance 20nm FPGAs Steve Young, Dinesh Gaitonde August 2014 Not a Complete Product Overview Page 2 Outline Page 3 Petabytes per month Increasing Bandwidth Global IP Traffic Growth
More informationReconfigurable PLL for Digital System
International Journal of Engineering Research and Technology. ISSN 0974-3154 Volume 6, Number 3 (2013), pp. 285-291 International Research Publication House http://www.irphouse.com Reconfigurable PLL for
More informationSpiral 2-8. Cell Layout
2-8.1 Spiral 2-8 Cell Layout 2-8.2 Learning Outcomes I understand how a digital circuit is composed of layers of materials forming transistors and wires I understand how each layer is expressed as geometric
More informationPackage level Interconnect Options
Package level Interconnect Options J.Balachandran,S.Brebels,G.Carchon, W.De Raedt, B.Nauwelaers,E.Beyne imec 2005 SLIP 2005 April 2 3 Sanfrancisco,USA Challenges in Nanometer Era Integration capacity F
More informationFPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC)
FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC) D.Udhayasheela, pg student [Communication system],dept.ofece,,as-salam engineering and technology, N.MageshwariAssistant Professor
More informationA 400Gbps Multi-Core Network Processor
A 400Gbps Multi-Core Network Processor James Markevitch, Srinivasa Malladi Cisco Systems August 22, 2017 Legal THE INFORMATION HEREIN IS PROVIDED ON AN AS IS BASIS, WITHOUT ANY WARRANTIES OR REPRESENTATIONS,
More informationFile: 'ReportV37P-CT89533DanSuo.doc' CMPEN 411, Spring 2013, Homework Project 9 chip, 'Tiny Chip' fabricated through MOSIS program
MOSIS Chip Test Report Dan Suo File: 'ReportV37P-CT89533DanSuo.doc' CMPEN 411, Spring 2013, Homework Project 9 chip, 'Tiny Chip' fabricated through MOSIS program Technology: 0.5um CMOS, ON Semiconductor
More informationTiming for Optical Transmission Network (OTN) Equipment. Slobodan Milijevic Maamoun Seido
Timing for Optical Transmission Network (OTN) Equipment Slobodan Milijevic Maamoun Seido Agenda Overview of timing in OTN De-synchronizer (PLL) Requirements of OTN Phase Gain Loop Bandwidth Frequency Conversion
More informationHigh-speed Serial Interface
High-speed Serial Interface Lect. 8 SERES 1 High-Speed Circuits and Systems Lab., Yonsei University 2013-1 Block diagram Where are we today? Serializer Tx river Channel Rx Equalizer Sampler eserializer
More informationFuture of Interconnect Fabric A Contrarian View. Shekhar Borkar June 13, 2010 Intel Corp. 1
Future of Interconnect Fabric A ontrarian View Shekhar Borkar June 13, 2010 Intel orp. 1 Outline Evolution of interconnect fabric On die network challenges Some simple contrarian proposals Evaluation and
More informationMulti-gigabit Switching and Routing
Multi-gigabit Switching and Routing Gignet 97 Europe: June 12, 1997. Nick McKeown Assistant Professor of Electrical Engineering and Computer Science nickm@ee.stanford.edu http://ee.stanford.edu/~nickm
More informationProcessor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP
Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Presenter: Course: EEC 289Q: Reconfigurable Computing Course Instructor: Professor Soheil Ghiasi Outline Overview of M.I.T. Raw processor
More informationA 2 Gb/s Asymmetric Serial Link for High-Bandwidth Packet Switches
A 2 Gb/s Asymmetric Serial Link for High-Bandwidth Packet Switches Ken K. -Y. Chang, William Ellersick, Shang-Tse Chuang, Stefanos Sidiropoulos, Mark Horowitz, Nick McKeown: Computer System Laboratory,
More informationFPGA. Logic Block. Plessey FPGA: basic building block here is 2-input NAND gate which is connected to each other to implement desired function.
FPGA Logic block of an FPGA can be configured in such a way that it can provide functionality as simple as that of transistor or as complex as that of a microprocessor. It can used to implement different
More informationFABRICATION TECHNOLOGIES
FABRICATION TECHNOLOGIES DSP Processor Design Approaches Full custom Standard cell** higher performance lower energy (power) lower per-part cost Gate array* FPGA* Programmable DSP Programmable general
More informationNetworks for Multi-core Chips A A Contrarian View. Shekhar Borkar Aug 27, 2007 Intel Corp.
Networks for Multi-core hips A A ontrarian View Shekhar Borkar Aug 27, 2007 Intel orp. 1 Outline Multi-core system outlook On die network challenges A simple contrarian proposal Benefits Summary 2 A Sample
More informationCAP+ CAP. Loop Filter
STS-12/STS-3 Multirate Clock and Data Recovery Unit FEATURES Performs clock and data recovery for 622.08 Mbps (STS-12/OC-12/STM-4) or 155.52 Mbps (STS-3/OC-3/STM-1) NRZ data 19.44 MHz reference frequency
More informationAn Overview of Standard Cell Based Digital VLSI Design
An Overview of Standard Cell Based Digital VLSI Design With examples taken from the implementation of the 36-core AsAP1 chip and the 1000-core KiloCore chip Zhiyi Yu, Tinoosh Mohsenin, Aaron Stillmaker,
More informationChallenges for Future Interconnection Networks Hot Interconnects Panel August 24, Dennis Abts Sr. Principal Engineer
Challenges for Future Interconnection Networks Hot Interconnects Panel August 24, 2006 Sr. Principal Engineer Panel Questions How do we build scalable networks that balance power, reliability and performance
More informationCSE 123A Computer Networks
CSE 123A Computer Networks Winter 2005 Lecture 8: IP Router Design Many portions courtesy Nick McKeown Overview Router basics Interconnection architecture Input Queuing Output Queuing Virtual output Queuing
More informationDesign of Adaptive Communication Channel Buffers for Low-Power Area- Efficient Network-on. on-chip Architecture
Design of Adaptive Communication Channel Buffers for Low-Power Area- Efficient Network-on on-chip Architecture Avinash Kodi, Ashwini Sarathy * and Ahmed Louri * Department of Electrical Engineering and
More informationAMchip architecture & design
Sezione di Milano AMchip architecture & design Alberto Stabile - INFN Milano AMchip theoretical principle Associative Memory chip: AMchip Dedicated VLSI device - maximum parallelism Each pattern with private
More informationStreaming Encryption for a Secure Wavelength and Time Domain Hopped Optical Network
treaming Encryption for a ecure Wavelength and Time Domain Hopped Optical Network Herwin Chan, Alireza Hodjat, Jun hi, Richard Wesel, Ingrid Verbauwhede {herwin, ahodjat, junshi, wesel, ingrid} @ ee.ucla.edu
More informationAn Introduction to Programmable Logic
Outline An Introduction to Programmable Logic 3 November 24 Transistors Logic Gates CPLD Architectures FPGA Architectures Device Considerations Soft Core Processors Design Example Quiz Semiconductors Semiconductor
More informationASNT1016-PQA 16:1 MUX-CMU
16:1 MUX-CMU 16 to 1 multiplexer (MUX) with integrated CMU (clock multiplication unit). PLL-based architecture featuring both counter and forward clocking modes. Supports multiple data rates in the 9.8-12.5Gb/s
More informationEECS150 - Digital Design Lecture 17 Memory 2
EECS150 - Digital Design Lecture 17 Memory 2 October 22, 2002 John Wawrzynek Fall 2002 EECS150 Lec17-mem2 Page 1 SDRAM Recap General Characteristics Optimized for high density and therefore low cost/bit
More informationImplementing Tile-based Chip Multiprocessors with GALS Clocking Styles
Implementing Tile-based Chip Multiprocessors with GALS Clocking Styles Zhiyi Yu, Bevan Baas VLSI Computation Lab, ECE Department University of California, Davis, USA Outline Introduction Timing issues
More information10 Gbit/s Transmitter MUX with Re-timing GD16585/GD16589 (FEC)
10 Gbit/s Transmitter MUX with Re-timing GD16585/GD16589 (FEC) Preliminary General Description GD16585 and GD16589 are transmitter chips used in SDH STM-64 and SONET OC-192 optical communication systems.
More informationA mm 2 780mW Fully Synthesizable PLL with a Current Output DAC and an Interpolative Phase-Coupled Oscillator using Edge Injection Technique
A 0.0066mm 2 780mW Fully Synthesizable PLL with a Current Output DAC and an Interpolative Phase-Coupled Oscillator using Edge Injection Technique Wei Deng, Dongsheng Yang, Tomohiro Ueno, Teerachot Siriburanon,
More informationPushPull: Short Path Padding for Timing Error Resilient Circuits YU-MING YANG IRIS HUI-RU JIANG SUNG-TING HO. IRIS Lab National Chiao Tung University
PushPull: Short Path Padding for Timing Error Resilient Circuits YU-MING YANG IRIS HUI-RU JIANG SUNG-TING HO IRIS Lab National Chiao Tung University Outline Introduction Problem Formulation Algorithm -
More information16th Microelectronics Workshop Oct 22-24, 2003 (MEWS-16) Tsukuba Space Center JAXA
16th Microelectronics Workshop Oct 22-24, 2003 (MEWS-16) Tsukuba Space Center JAXA 1 The proposed presentation explores the use of commercial processes, including deep-sub micron process technology, package
More informationPacket Switch Architectures Part 2
Packet Switch Architectures Part Adopted from: Sigcomm 99 Tutorial, by Nick McKeown and Balaji Prabhakar, Stanford University Slides used with permission from authors. 999-000. All rights reserved by authors.
More informationThe Memory Hierarchy 1
The Memory Hierarchy 1 What is a cache? 2 What problem do caches solve? 3 Memory CPU Abstraction: Big array of bytes Memory memory 4 Performance vs 1980 Processor vs Memory Performance Memory is very slow
More informationA Single Chip Shared Memory Switch with Twelve 10Gb Ethernet Ports
A Single Chip Shared Memory Switch with Twelve 10Gb Ethernet Ports Takeshi Shimizu, Yukihiro Nakagawa, Sridhar Pathi, Yasushi Umezawa, Takashi Miyoshi, Yoichi Koyanagi, Takeshi Horie, Akira Hattori Hot
More informationRHiNET-3/SW: an 80-Gbit/s high-speed network switch for distributed parallel computing
RHiNET-3/SW: an 0-Gbit/s high-speed network switch for distributed parallel computing S. Nishimura 1, T. Kudoh 2, H. Nishi 2, J. Yamamoto 2, R. Ueno 3, K. Harasawa 4, S. Fukuda 4, Y. Shikichi 4, S. Akutsu
More informationNevis ADC Design. Jaroslav Bán. Columbia University. June 4, LAr ADC Review. LAr ADC Review. Jaroslav Bán
Nevis ADC Design Columbia University June 4, 2014 Outline The goals of the project Introductory remarks The road toward the design Components developed in Nevis09, Nevis10 and Nevis12 Nevis13 chip Architecture
More informationCS310 Embedded Computer Systems. Maeng
1 INTRODUCTION (PART II) Maeng Three key embedded system technologies 2 Technology A manner of accomplishing a task, especially using technical processes, methods, or knowledge Three key technologies for
More informationCHAPTER 4 DUAL LOOP SELF BIASED PLL
52 CHAPTER 4 DUAL LOOP SELF BIASED PLL The traditional self biased PLL is modified into a dual loop architecture based on the principle widely applied in clock and data recovery circuits proposed by Seema
More informationCCS Technical Documentation NHL-2NA Series Transceivers. Camera Module
CCS Technical Documentation NHL-2NA Series Transceivers Issue 1 07/02 Nokia Corporation NHL-2NA CCS Technical Documentation [This page left intentionally blank] Page 2 Nokia Corporation Issue 1 07/02 CCS
More informationDevelopment and research of different architectures of I 2 C bus controller. E. Vasiliev, MIET
Development and research of different architectures of I 2 C bus controller E. Vasiliev, MIET I2C and its alternatives I²C (Inter-Integrated Circuit) is a multi-master serial computer bus invented by Philips
More informationA Configurable Radiation Tolerant Dual-Ported Static RAM macro, designed in a 0.25 µm CMOS technology for applications in the LHC environment.
A Configurable Radiation Tolerant Dual-Ported Static RAM macro, designed in a 0.25 µm CMOS technology for applications in the LHC environment. 8th Workshop on Electronics for LHC Experiments 9-13 Sept.
More informationTECHNOLOGY BRIEF. Double Data Rate SDRAM: Fast Performance at an Economical Price EXECUTIVE SUMMARY C ONTENTS
TECHNOLOGY BRIEF June 2002 Compaq Computer Corporation Prepared by ISS Technology Communications C ONTENTS Executive Summary 1 Notice 2 Introduction 3 SDRAM Operation 3 How CAS Latency Affects System Performance
More informationCHAPTER 6 FPGA IMPLEMENTATION OF ARBITERS ALGORITHM FOR NETWORK-ON-CHIP
133 CHAPTER 6 FPGA IMPLEMENTATION OF ARBITERS ALGORITHM FOR NETWORK-ON-CHIP 6.1 INTRODUCTION As the era of a billion transistors on a one chip approaches, a lot of Processing Elements (PEs) could be located
More informationMemory. Outline. ECEN454 Digital Integrated Circuit Design. Memory Arrays. SRAM Architecture DRAM. Serial Access Memories ROM
ECEN454 Digital Integrated Circuit Design Memory ECEN 454 Memory Arrays SRAM Architecture SRAM Cell Decoders Column Circuitry Multiple Ports DRAM Outline Serial Access Memories ROM ECEN 454 12.2 1 Memory
More informationAxcelerator Family FPGAs
Product Brief Axcelerator Family FPGAs u e Leading-Edge Performance 350+ MHz System Performance 500+ MHz Internal Performance High-Performance Embedded s 700 Mb/s LVDS Capable I/Os Specifications Up to
More informationInterconnection Network Project EE482 Advanced Computer Organization May 28, 1999
Interconnection Network Project EE482 Advanced Computer Organization May 28, 1999 Group Members: Overview Tom Fountain (fountain@cs.stanford.edu) T.J. Giuli (giuli@cs.stanford.edu) Paul Lassa (lassa@relgyro.stanford.edu)
More information10 Gbit/s Transmitter MUX with Re-timing GD16585/GD16589 (FEC)
an Intel company 10 Gbit/s Transmitter MUX with Re-timing GD16585/GD16589 (FEC) Preliminary General Description GD16585 and GD16589 are transmitter chips used in SDH STM-64 and SONET OC-192 optical communication
More informationDesigning High-Speed ATM Switch Fabrics by Using Actel FPGAs
pplication Note C105 esigning High-Speed TM Switch Fabrics by Using ctel FPGs The recent upsurge of interest in synchronous Transfer Mode (TM) is based on the recognition that it represents a new level
More informationPHOTONIC ATM FRONT-END PROCESSOR
PHOTONIC ATM FRONT-END PROCESSOR OBJECTIVES: To build a photonic ATM front-end processor including the functions of virtual channel identifier (VCI) over-write and cell synchronization for future photonic
More informationA 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications
A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications Ju-Ho Sohn, Jeong-Ho Woo, Min-Wuk Lee, Hye-Jung Kim, Ramchan Woo, Hoi-Jun Yoo Semiconductor System
More informationThe RM9150 and the Fast Device Bus High Speed Interconnect
The RM9150 and the Fast Device High Speed Interconnect John R. Kinsel Principal Engineer www.pmc -sierra.com 1 August 2004 Agenda CPU-based SOC Design Challenges Fast Device (FDB) Overview Generic Device
More informationECE 485/585 Microprocessor System Design
Microprocessor System Design Lecture 4: Memory Hierarchy Memory Taxonomy SRAM Basics Memory Organization DRAM Basics Zeshan Chishti Electrical and Computer Engineering Dept Maseeh College of Engineering
More informationDesign and Implementation of High Performance Application Specific Memory
Design and Implementation of High Performance Application Specific Memory - 고성능 Application Specific Memory 의설계와구현 - M.S. Thesis Sungdae Choi Dec. 20th, 2002 Outline Introduction Memory for Mobile 3D Graphics
More informationPhysical Implementation of the DSPIN Network-on-Chip in the FAUST Architecture
1 Physical Implementation of the DSPI etwork-on-chip in the FAUST Architecture Ivan Miro-Panades 1,2,3, Fabien Clermidy 3, Pascal Vivet 3, Alain Greiner 1 1 The University of Pierre et Marie Curie, Paris,
More informationMicrocomputers. Outline. Number Systems and Digital Logic Review
Microcomputers Number Systems and Digital Logic Review Lecture 1-1 Outline Number systems and formats Common number systems Base Conversion Integer representation Signed integer representation Binary coded
More informationEECS 150 Homework 7 Solutions Fall (a) 4.3 The functions for the 7 segment display decoder given in Section 4.3 are:
Problem 1: CLD2 Problems. (a) 4.3 The functions for the 7 segment display decoder given in Section 4.3 are: C 0 = A + BD + C + BD C 1 = A + CD + CD + B C 2 = A + B + C + D C 3 = BD + CD + BCD + BC C 4
More information3. The high voltage level of a digital signal in positive logic is : a) 1 b) 0 c) either 1 or 0
1. The number of level in a digital signal is: a) one b) two c) four d) ten 2. A pure sine wave is : a) a digital signal b) analog signal c) can be digital or analog signal d) neither digital nor analog
More information2. Link and Memory Architectures and Technologies
2. Link and Memory Architectures and Technologies 2.1 Links, Thruput/Buffering, Multi-Access Ovrhds 2.2 Memories: On-chip / Off-chip SRAM, DRAM 2.A Appendix: Elastic Buffers for Cross-Clock Commun. Manolis
More informationEECS150 - Digital Design Lecture 16 Memory 1
EECS150 - Digital Design Lecture 16 Memory 1 March 13, 2003 John Wawrzynek Spring 2003 EECS150 - Lec16-mem1 Page 1 Memory Basics Uses: Whenever a large collection of state elements is required. data &
More informationHybrid On-chip Data Networks. Gilbert Hendry Keren Bergman. Lightwave Research Lab. Columbia University
Hybrid On-chip Data Networks Gilbert Hendry Keren Bergman Lightwave Research Lab Columbia University Chip-Scale Interconnection Networks Chip multi-processors create need for high performance interconnects
More informationDigital Integrated Circuits A Design Perspective. Jan M. Rabaey
Digital Integrated Circuits A Design Perspective Jan M. Rabaey Outline (approximate) Introduction and Motivation The VLSI Design Process Details of the MOS Transistor Device Fabrication Design Rules CMOS
More information6. Latches and Memories
6 Latches and Memories This chapter . RS Latch The RS Latch, also called Set-Reset Flip Flop (SR FF), transforms a pulse into a continuous state. The RS latch can be made up of two interconnected
More informationField Programmable Gate Array (FPGA)
Field Programmable Gate Array (FPGA) Lecturer: Krébesz, Tamas 1 FPGA in general Reprogrammable Si chip Invented in 1985 by Ross Freeman (Xilinx inc.) Combines the advantages of ASIC and uc-based systems
More information1. NoCs: What s the point?
1. Nos: What s the point? What is the role of networks-on-chip in future many-core systems? What topologies are most promising for performance? What about for energy scaling? How heavily utilized are Nos
More informationDesign of Optical Burst Switches based on Dual Shuffle-exchange Network and Deflection Routing
Design of Optical Burst Switches based on Dual Shuffle-exchange Network and Deflection Routing Man-Ting Choy Department of Information Engineering, The Chinese University of Hong Kong mtchoy1@ie.cuhk.edu.hk
More informationAN-12. Use Quickswitch Bus Switches to Make Large, Fast Dual Port RAMs. Dual Port RAM. Figure 1. Dual Port RAM in Dual Microprocessor System
Q QUALITY SEMICONDUCTOR, INC. AN-12 Use Quickswitch Bus Switches to Make Large, Fast Dual Port s Application Note AN-12 Abstract Dual port s are effective devices for highspeed communication between microprocessors.
More informationLab 2. Standard Cell layout.
Lab 2. Standard Cell layout. The purpose of this lab is to demonstrate CMOS-standard cell design. Use the lab instructions and the cadence manual (http://www.es.lth.se/ugradcourses/cadsys/cadence.html)
More informationUltra-Low Latency, Bit-Parallel Message Exchange in Optical Packet Switched Interconnection Networks
Ultra-Low Latency, Bit-Parallel Message Exchange in Optical Packet Switched Interconnection Networks O. Liboiron-Ladouceur 1, C. Gray 2, D. Keezer 2 and K. Bergman 1 1 Department of Electrical Engineering,
More informationRouter Architectures
Router Architectures Venkat Padmanabhan Microsoft Research 13 April 2001 Venkat Padmanabhan 1 Outline Router architecture overview 50 Gbps multi-gigabit router (Partridge et al.) Technology trends Venkat
More informationHigh-Speed NAND Flash
High-Speed NAND Flash Design Considerations to Maximize Performance Presented by: Robert Pierce Sr. Director, NAND Flash Denali Software, Inc. History of NAND Bandwidth Trend MB/s 20 60 80 100 200 The
More informationEE382C Lecture 1. Bill Dally 3/29/11. EE 382C - S11 - Lecture 1 1
EE382C Lecture 1 Bill Dally 3/29/11 EE 382C - S11 - Lecture 1 1 Logistics Handouts Course policy sheet Course schedule Assignments Homework Research Paper Project Midterm EE 382C - S11 - Lecture 1 2 What
More informationChapter 8 Memory Basics
Logic and Computer Design Fundamentals Chapter 8 Memory Basics Charles Kime & Thomas Kaminski 2008 Pearson Education, Inc. (Hyperlinks are active in View Show mode) Overview Memory definitions Random Access
More informationChapter 6. CMOS Functional Cells
Chapter 6 CMOS Functional Cells In the previous chapter we discussed methods of designing layout of logic gates and building blocks like transmission gates, multiplexers and tri-state inverters. In this
More informationAn overview of standard cell based digital VLSI design
An overview of standard cell based digital VLSI design Implementation of the first generation AsAP processor Zhiyi Yu and Tinoosh Mohsenin VCL Laboratory UC Davis Outline Overview of standard cellbased
More informationVersatile RRAM Technology and Applications
Versatile RRAM Technology and Applications Hagop Nazarian Co-Founder and VP of Engineering, Crossbar Inc. Santa Clara, CA 1 Agenda Overview of RRAM Technology RRAM for Embedded Memory Mass Storage Memory
More informationCS150 Fall 2012 Solutions to Homework 6
CS150 Fall 2012 Solutions to Homework 6 October 6, 2012 Problem 1 a.) Answer: 0.09 ns This delay is given in Table 65 as T ILO, specifically An Dn LUT address to A. b.) Answer: 0.41 ns In Table 65, this
More informationSEE Tolerant Self-Calibrating Simple Fractional-N PLL
SEE Tolerant Self-Calibrating Simple Fractional-N PLL Robert L. Shuler, Avionic Systems Division, NASA Johnson Space Center, Houston, TX 77058 Li Chen, Department of Electrical Engineering, University
More informationHEAT (Hardware enabled Algorithmic tester) for 2.5D HBM Solution
HEAT (Hardware enabled Algorithmic tester) for 2.5D HBM Solution Kunal Varshney, Open-Silicon Ganesh Venkatkrishnan, Open-Silicon Pankaj Prajapati, Open-Silicon May 9, 9, 2016 1 Agenda High Bandwidth Memory
More informationCS152 Computer Architecture and Engineering Lecture 16: Memory System
CS152 Computer Architecture and Engineering Lecture 16: System March 15, 1995 Dave Patterson (patterson@cs) and Shing Kong (shing.kong@eng.sun.com) Slides available on http://http.cs.berkeley.edu/~patterson
More informationMTJ-Based Nonvolatile Logic-in-Memory Architecture
2011 Spintronics Workshop on LSI @ Kyoto, Japan, June 13, 2011 MTJ-Based Nonvolatile Logic-in-Memory Architecture Takahiro Hanyu Center for Spintronics Integrated Systems, Tohoku University, JAPAN Laboratory
More informationSynchronous Optical Networks (SONET) Advanced Computer Networks
Synchronous Optical Networks (SONET) Advanced Computer Networks SONET Outline Brief History SONET Overview SONET Rates SONET Ring Architecture Add/Drop Multiplexor (ADM) Section, Line and Path Virtual
More informationDI-SSD: Desymmetrized Interconnection Architecture and Dynamic Timing Calibration for Solid-State Drives
DI-SSD: Desymmetrized Interconnection Architecture and Dynamic Timing Calibration for Solid-State Drives Ren-Shuo Liu and Jian-Hao Huang System and Storage Design Lab Department of Electrical Engineering
More informationThe Future of Electrical I/O for Microprocessors. Frank O Mahony Intel Labs, Hillsboro, OR USA
The Future of Electrical I/O for Microprocessors Frank O Mahony frank.omahony@intel.com Intel Labs, Hillsboro, OR USA 1 Outline 1TByte/s I/O: motivation and challenges Circuit Directions Channel Directions
More informationESE 570 Cadence Lab Assignment 2: Introduction to Spectre, Manual Layout Drawing and Post Layout Simulation (PLS)
ESE 570 Cadence Lab Assignment 2: Introduction to Spectre, Manual Layout Drawing and Post Layout Simulation (PLS) Objective Part A: To become acquainted with Spectre (or HSpice) by simulating an inverter,
More informationMarching Memory マーチングメモリ. UCAS-6 6 > Stanford > Imperial > Verify 中村維男 Based on Patent Application by Tadao Nakamura and Michael J.
UCAS-6 6 > Stanford > Imperial > Verify 2011 Marching Memory マーチングメモリ Tadao Nakamura 中村維男 Based on Patent Application by Tadao Nakamura and Michael J. Flynn 1 Copyright 2010 Tadao Nakamura C-M-C Computer
More informationSynchronous Optical Networks SONET. Computer Networks: SONET
Synchronous Optical Networks SONET 1 Telephone Networks {Brief History} Digital carrier systems The hierarchy of digital signals that the telephone network uses. Trunks and access links organized in DS
More informationToward a unified architecture for LAN/WAN/WLAN/SAN switches and routers
Toward a unified architecture for LAN/WAN/WLAN/SAN switches and routers Silvano Gai 1 The sellable HPSR Seamless LAN/WLAN/SAN/WAN Network as a platform System-wide network intelligence as platform for
More informationASIC Physical Design Top-Level Chip Layout
ASIC Physical Design Top-Level Chip Layout References: M. Smith, Application Specific Integrated Circuits, Chap. 16 Cadence Virtuoso User Manual Top-level IC design process Typically done before individual
More information3. Implementing Logic in CMOS
3. Implementing Logic in CMOS 3. Implementing Logic in CMOS Jacob Abraham Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 27 September, 27 ECE Department,
More informationASNT7120-KMA 10GS/s, 4-bit Flash Analog-to-Digital Converter
10GS/s, 4-bit Flash Analog-to-Digital Converter 18GHz analog input bandwidth. Selectable clocking mode: external high-speed clock or internal PLL with external lowspeed reference clock. Broadband operation
More informationA 20 GSa/s 8b ADC with a 1 MB Memory in 0.18 µm CMOS
A 20 GSa/s 8b ADC with a 1 MB Memory in 0.18 µm CMOS Ken Poulton, Robert Neff, Brian Setterberg, Bernd Wuppermann, Tom Kopley, Robert Jewett, Jorge Pernillo, Charles Tan, Allen Montijo 1 Agilent Laboratories,
More informationA 32nm, 0.9V Supply-Noise Sensitivity Tracking PLL for Improved Clock Data Compensation Featuring a Deep Trench Capacitor Based Loop Filter
A 32nm, 0.9V Supply-Noise Sensitivity Tracking PLL for Improved Clock Data Compensation Featuring a Deep Trench Capacitor Based Loop Filter Bongjin Kim, Weichao Xu, and Chris H. Kim University of Minnesota,
More informationToken Bit Manager for the CMS Pixel Readout
Token Bit Manager for the CMS Pixel Readout Edward Bartz Rutgers University Pixel 2002 International Workshop September 9, 2002 slide 1 TBM Overview Orchestrate the Readout of Several Pixel Chips on a
More information