Physical Implementation of the DSPIN Network-on-Chip in the FAUST Architecture
|
|
- Ella Gardner
- 6 years ago
- Views:
Transcription
1 1 Physical Implementation of the DSPI etwork-on-chip in the FAUST Architecture Ivan Miro-Panades 1,2,3, Fabien Clermidy 3, Pascal Vivet 3, Alain Greiner 1 1 The University of Pierre et Marie Curie, Paris, France 2 STMicroelectronics, Crolles, France 3 CEA-Leti, MIATEC, Grenoble, France
2 Outline Motivation FAUST architecture Migration of DSPI into FAUST DSPI implementation etworks comparison 2
3 3 Motivation Physically implement the DSPI oc into the FAUST application platform - DSPI is a oc developed between Lip6 and ST - FAUST is a stream-oriented application platform for 4G telecom applications, based on AOC, developed by CEA-Leti. Compare the performances between AOC and DSPI on a real application and traffic
4 4 FAUST architecture OC1 IF 84 Pads SPort APort EXP OFDM MOD. RAC ALAM. MOD. CDMA MOD. JTAG MAPP. Clk & Reset CTRL BIT ITER. TURBO CODER Clk, Rst COV. CODER 23 computation units Asynchronous oc (AOC) 20 AOC routers TX units oc Perf. AHB RAM ARM946 RAM EXT. RAM CTRL RAM IF 58 Pads GALS conception 24 independent Clks AHB units RX units Async/ Sync IF Async node ROTOR FRAME SYC. EQUAL. ODFM DEM. CHA. EST. CDMA DEM. COV. DEC. DE- MAPP. ETHER ET DE- ITER. ETHERET IF 17 Pads Ethernet port Internal/External RAM CPU ARM946ES Cache 4KB-I 4KB-D OC2 IF 83 Pads EXP SPort APort DART Hardware OFDM modulation/demodulation
5 5 AOC architecture Crossbar Crossbar IP IC Asynchronous handshake protocol GALS interface Crossbar Crossbar Asynchronous oc - Asynchronous send/accept handshake protocol - QDI 4-phase/4-rail asynchronous logic - QoS with two Virtual Channels (Best Effort, Guaranteed Service) oc Routers - 5 ports router - Source routing - Wormhole packet switching - 32 bits payload FIFO based GALS interfaces Mapped onto libraries - CMOS ST 130nm - TIMA TAL130 library
6 FAUST floor-plan with AOC Exp. RAC RAM1 Rotor Exp. Frame sync. Equal. P2 OFDM mod. ARM946 Channel Est. OFDM demod. Ala. P1 CDMA Mod. CDMA Dem. RAM2 CLK Reset Mapp. Conv. Dec. Demapp. Bit. Inter. Turbo Dec. Conv. Codec. Ext. RAM Ctrl. Ethernet Deinter. 4.5 M Gates 276 chip pins Chip area 80 mm² ST CMOS 130nm 166 MHz (worst-case) oc implementation : - Uses AOC router hard-macro - Perform buffering and routing of AOC links - o spaghetti routing at top level! orth Port Unit Port DART West Port East Port 6 South Port AOC router (0.211 mm²)
7 Outline Motivation FAUST architecture Migration of DSPI into FAUST DSPI implementation etworks comparison 7
8 in in in in DSPI architecture CK6 CK7 CK8 (Y+1,X-1) 1) Cluster (Y+1,X) (Y+1,X+1) CK3 CK4 CK5 (Y,X-1) Cluster (Y,X) (Y,X+1) CK0 CK1 CK 2 in in in in in Local port 8 Packet base Distributed router architecture Suited to GALS approach Mesochronous links between routers Synthesizable with standard cells either asynchronous nor custom cells Metastability resolved by bi-synchronous FIFO (Y-1,X 1,X-1) 1) Cluster (Y-1,X) (Y-1,X+1) More details in: "A Low Cost etwork-on-chip with Guaranteed Service Well Suited to the GALS Approach anoet 06
9 9 oc architectures comparison AOC Router arity 5 port router 5 port router DSPI Topology Irregular 2D mesh Regular 2D mesh Routing technique Source routing (9 hops) Address-based (8 bits) X-First algorithm Switching technique Wormhole Wormhole Flit size (payload) 34 bits (32 bits) Generic: 34 bits (32 bits) Flow control bits on the flit Virtual channels Begin of packet (BOP) End of packet (EOP) Best effort Guaranteed service Begin of packet (BOP) End of packet (EOP) Best effort Guaranteed service Programming model Message passing Shared memory (2 routers per cluster) Message passing (1 router per cluster) Clocking scheme Fully asynchronous (QDI) with GALS interfaces Multi-synchronous with mesochronous interfaces Flow control protocol Send/accept asynchronous handshake FIFO protocol (Write and WriteOk) Clock tree one One per router Physical implementation Hard macro Soft macro distributed on five modules Long wires Inter-router wires Intra-cluster wires
10 10 Packet format 34 bits 18 bits 34 bits (generic) 8 bits First flit H8... H2 H1 H0 Y X 2 bits 2 bits 2 bits 4 bits 4 bits Following flits AOC packet DSPI packet Similar packet format and control bits AOC uses Source-routing (18 bits) allowing 9 hops DSPI uses Address-based (8 bits) Packet conversion module required: - Design of Protocol_conversion module
11 FAUST integration CLK_IP CLK_IP IP IC Synchronous SED/ACCEPT IP IC Synchronous SED/ACCEPT Protocol_conversion module: Translates the routing algorithm using a LUT GALS interface Protocol_conversion Adapts the flow control signals: LUT - AOC: Send/Accept AOC router Asynchronous SED/ACCEPT Asynchronous SED/ACCEPT DSPI router Synchronous READ/WRITE Mesochronous READ/WRITE - DSPI: FIFO protocol Implements two bi-synchronous FIFOs CLK_oC AOC IP template DSPI IP template AOC GALS Interface: Implements 4 FIFOs synchronous asynchronous FAUST IPs and ICs are not modified 11
12 Outline Motivation FAUST architecture Migration of DSPI into FAUST DSPI implementation etworks comparison 12
13 13 DSPI implementation Synthesis Floorplanning Place Optimize placement Clock-tree Route Optimize Hierarchical synthesis Low Power CMOS ST 130nm technology Standard cells Without asynchronous nor custom cells Timing constraints file: - Muti-cycle path (mesochronous interfaces) - False path (asynchronous interfaces) GALS compatible Clock gating Implemented in 4 steps
14 DSPI clock-tree Mesochronous links Clk Clk Clk Clk GALS compatible Bi-synchronous FIFO [OCS 2007] Max skew 50% clock period Timing constraints: set_multi_cycle_path Clk_oC 5% skew within the router 180 phase shift and 30% skew between routers Router (0,0) Router (0,1) Router (0,2) 4 th Step 30% skew (top tree) 1 st Step 2 nd, 3 th Step 5% skew ( bottom tree) Clock-tree implementation 1. Add buffers/inverters 2. Built bottom clock tree (5% skew) 3. Characterize bottom clock-tree 4. Build top clock-tree (30% skew) 14
15 15 FAUST floor-plan with DSPI W S L RAC Rotor W S RAM1 E W Frame sync. W S W L S L L Equal. E W W E S L E E S W L OFDM mod. ARM946 L E W L S E W S E W L S E W S L E W L Channel Est. OFDM demod. Ala. P1 CDMA Dem. S E L S E CDMA Mod. CLK RAM2 W L S L E W W L L S E S E W S Conv. Dec. E W S E W Demapp. Bit. Inter. L L S E Turbo Dec. Conv. Codec. Ext. RAM Ctrl. Ethernet L E Deinter. E W L S E Distributed router implementation Soft macro approach Higher floor-plan flexibility oc adapts to the SoC Long wires are routed in a tree manner Different router configurations are possible DART (reserved area) W L S E DSPI router Placement density: %
16 16 FAUST floor-plan Mapp. Exp. P2 P2 Exp. P1 CLK Reset FAUST with AOC FAUST with DSPI
17 Outline Motivation FAUST architecture Migration of DSPI into FAUST DSPI implementation etworks comparison 17
18 oc Area AOC DSPI Router mm² mm² Interface GALS mm² mm² Clock tree mm² mm² Total mm² mm² CMOS 130nm AOC is implemented as a hard macro DSPI is implemented as a soft macro DSPI is 33% smaller than AOC 18
19 oc Throughput AOC DSPI Throughput on worst-case conditions Throughput on nominal conditions ~ 160Mflit/s 289Mflit/s ~ 220Mflit/s 408Mflit/s DSPI throughput is deterministic with respect to the clock frequency (one flit per clock cycle) Long wire latency penalty on throughput: - DSPI: critical path crosses one time the long wires - AOC: critical path crosses 4 times the long wires, 4-phase protocol AOC link pipelining is feasible In a commercial circuit, DSPI will be clocked not far away from worst-case (289 MHz) to improve the fabrication yield 19
20 Packet latency (1) AOC DSPI F = 150 MHz AOC DSPI F = 250 MHz Compute the packet latency through many routers DSPI has deterministic packet latency with respect to clock frequency DSPI has lower First and Last packet latencies AOC intermediate latency is lower than DSPI. DSPI resynchronize the data on each router Intermediate router latency First + Last latency 6.80 ns ns ns ns 6.80 ns ns ns ns 20
21 Packet latency (2) AOC DSPI AOC DSPI F = 150 MHz F = 250 MHz Latency for 5 hops path Latency for 9 hops path ns ns ns ns ns ns ns ns The packet latency are similar for clock frequencies >250 MHz Intermediate packet latency is important but the application should be mapped in order to optimize the data locality (try to communicate with neighbor IPs rather than with faraway IPs) 21
22 Power consumption AOC F = 150 MHz DSPI F = 250 MHz Router 2.07 mw 2.89 mw 4.85 mw GALS interface 1.62 mw 0.56 mw 0.81 mw Clock tree 0.00 mw 2.44 mw 4.73 mw Total 3.69 mw 5.89 mw mw Power extraction after P&R Functional packet traffic (OFDM demodulation) Power consumption majorly dominated by FIFO data registers The DSPI clock-gating reduced the power consumption by 67% DSPI clock-tree consumes as much power as the router itself - eeds to improve DSPI clock-gating - GALS clock-tree consumes only 2.5% of total clock-tree power 22
23 Conclusion Physical implementation of the DSPI etwork-on-chip on FAUST platform - Comparison between AOC and DSPI at architecture level - Adaptation of DSPI architecture to manage stream-oriented communications - Implementation up to layout of DSPI network on FAUST platform => DSPI mesochronous links fully implemented with standard tools Comparison between DSPI and AOC ocs (STMicroelectronics 130nm) - Area of DSPI is 33% smaller than AOC one - Maximum sustained throughput of DSPI is 31% higher than AOC - AOC has lower packet latency - DSPI power consumption 1.5 to 3 times higher than AOC AOC is a good candidate for low latency and low power applications, while DSPI is more suited to low area and high performance applications 23
24 24 See you at my PhD defense on May 20 th
Design of On-chip and Off-chip Interfaces for a GALS NoC Architecture
2006 Design of On-chip and Off-chip Interfaces for a GALS NoC Architecture Async 06 Edith Beigné Pascal Vivet {edith.beigne; pascal.vivet}@cea.fr Async 06 Symposium Grenoble P. Vivet 1 2006 FAUST : an
More informationTIMA Lab. Research Reports
ISSN 1292-862 TIMA Lab. Research Reports TIMA Laboratory, 46 avenue Félix Viallet, 38000 Grenoble France Session 1.2 - Hop Topics for SoC Design Asynchronous System Design Prof. Marc RENAUDIN TIMA, Grenoble,
More informationThe Design of the KiloCore Chip
The Design of the KiloCore Chip Aaron Stillmaker*, Brent Bohnenstiehl, Bevan Baas DAC 2017: Design Challenges of New Processor Architectures University of California, Davis VLSI Computation Laboratory
More informationAn Asynchronous NoC Router in a 14nm FinFET Library: Comparison to an Industrial Synchronous Counterpart
An Asynchronous NoC Router in a 14nm FinFET Library: Comparison to an Industrial Synchronous Counterpart Weiwei Jiang Columbia University, USA Gabriele Miorandi University of Ferrara, Italy Wayne Burleson
More informationDesign-for-Test Approach of an Asynchronous etwork-on-chip Architecture and its Associated Test Pattern Generation and Application
Design-for-Test Approach of an Asynchronous etwork-on-chip Architecture and its Associated Test Pattern Generation and Application Xuan-Tu Tran 1, 3, Yvain Thonnart 1, Jean Durupt 1, Vincent Beroulle 2,
More informationNoC Round Table / ESA Sep Asynchronous Three Dimensional Networks on. on Chip. Abbas Sheibanyrad
NoC Round Table / ESA Sep. 2009 Asynchronous Three Dimensional Networks on on Chip Frédéric ric PétrotP Outline Three Dimensional Integration Clock Distribution and GALS Paradigm Contribution of the Third
More informationReNoC: A Network-on-Chip Architecture with Reconfigurable Topology
1 ReNoC: A Network-on-Chip Architecture with Reconfigurable Topology Mikkel B. Stensgaard and Jens Sparsø Technical University of Denmark Technical University of Denmark Outline 2 Motivation ReNoC Basic
More informationAchieving Lightweight Multicast in Asynchronous Networks-on-Chip Using Local Speculation
Achieving Lightweight Multicast in Asynchronous Networks-on-Chip Using Local Speculation Kshitij Bhardwaj Dept. of Computer Science Columbia University Steven M. Nowick 2016 ACM/IEEE Design Automation
More informationOASIS NoC Architecture Design in Verilog HDL Technical Report: TR OASIS
OASIS NoC Architecture Design in Verilog HDL Technical Report: TR-062010-OASIS Written by Kenichi Mori ASL-Ben Abdallah Group Graduate School of Computer Science and Engineering The University of Aizu
More informationSome Design Issues for 3D NoCs: From Circuits to Systems
ReCoSoC 2013 Some Design Issues for 3D NoCs: From Circuits to Systems Frédéric Pétrot with many inputs from Abbas Sheibanyrad, Florentine Dubois, Sahar Foroutan, Maryam Bahmani TIMA System Level Synthesis
More informationAn overview of standard cell based digital VLSI design
An overview of standard cell based digital VLSI design Implementation of the first generation AsAP processor Zhiyi Yu and Tinoosh Mohsenin VCL Laboratory UC Davis Outline Overview of standard cellbased
More informationAn Overview of Standard Cell Based Digital VLSI Design
An Overview of Standard Cell Based Digital VLSI Design With examples taken from the implementation of the 36-core AsAP1 chip and the 1000-core KiloCore chip Zhiyi Yu, Tinoosh Mohsenin, Aaron Stillmaker,
More informationNetwork on Chip Architecture: An Overview
Network on Chip Architecture: An Overview Md Shahriar Shamim & Naseef Mansoor 12/5/2014 1 Overview Introduction Multi core chip Challenges Network on Chip Architecture Regular Topology Irregular Topology
More informationHigh Performance Interconnect and NoC Router Design
High Performance Interconnect and NoC Router Design Brinda M M.E Student, Dept. of ECE (VLSI Design) K.Ramakrishnan College of Technology Samayapuram, Trichy 621 112 brinda18th@gmail.com Devipoonguzhali
More informationDynamic Packet Fragmentation for Increased Virtual Channel Utilization in On-Chip Routers
Dynamic Packet Fragmentation for Increased Virtual Channel Utilization in On-Chip Routers Young Hoon Kang, Taek-Jun Kwon, and Jeff Draper {youngkan, tjkwon, draper}@isi.edu University of Southern California
More informationReal Time NoC Based Pipelined Architectonics With Efficient TDM Schema
Real Time NoC Based Pipelined Architectonics With Efficient TDM Schema [1] Laila A, [2] Ajeesh R V [1] PG Student [VLSI & ES] [2] Assistant professor, Department of ECE, TKM Institute of Technology, Kollam
More informationImplementing Tile-based Chip Multiprocessors with GALS Clocking Styles
Implementing Tile-based Chip Multiprocessors with GALS Clocking Styles Zhiyi Yu, Bevan Baas VLSI Computation Lab, ECE Department University of California, Davis, USA Outline Introduction Timing issues
More informationSystem-Level Analysis of Network Interfaces for Hierarchical MPSoCs
System-Level Analysis of Network Interfaces for Hierarchical MPSoCs Johannes Ax *, Gregor Sievers *, Martin Flasskamp *, Wayne Kelly, Thorsten Jungeblut *, and Mario Porrmann * * Cognitronics and Sensor
More informationSPATIAL PARALLELISM IN THE ROUTERS OF ASYNCHRONOUS ON-CHIP NETWORKS
SPATIAL PARALLELISM IN THE ROUTERS OF ASYNCHRONOUS ON-CHIP NETWORKS A THESIS SUBMITTED TO THE UNIVERSITY OF MANCHESTER FOR THE DEGREE OF DOCTOR OF PHILOSOPHY IN THE FACULTY OF ENGINEERING AND PHYSICAL
More informationSoC Design. Prof. Dr. Christophe Bobda Institut für Informatik Lehrstuhl für Technische Informatik
SoC Design Prof. Dr. Christophe Bobda Institut für Informatik Lehrstuhl für Technische Informatik Chapter 5 On-Chip Communication Outline 1. Introduction 2. Shared media 3. Switched media 4. Network on
More informationSoC Communication Complexity Problem
When is the use of a Most Effective and Why MPSoC, June 2007 K. Charles Janac, Chairman, President and CEO SoC Communication Complexity Problem Arbitration problem in an SoC with 30 initiators: Hierarchical
More informationBandwidth Optimization in Asynchronous NoCs by Customizing Link Wire Length
Bandwidth Optimization in Asynchronous NoCs by Customizing Wire Length Junbok You Electrical and Computer Engineering, University of Utah jyou@ece.utah.edu Daniel Gebhardt School of Computing, University
More informationSwizzle Switch: A Self-Arbitrating High-Radix Crossbar for NoC Systems
1 Swizzle Switch: A Self-Arbitrating High-Radix Crossbar for NoC Systems Ronald Dreslinski, Korey Sewell, Thomas Manville, Sudhir Satpathy, Nathaniel Pinckney, Geoff Blake, Michael Cieslak, Reetuparna
More informationTing Wu, Chi-Ying Tsui, Mounir Hamdi Hong Kong University of Science & Technology Hong Kong SAR, China
CMOS Crossbar Ting Wu, Chi-Ying Tsui, Mounir Hamdi Hong Kong University of Science & Technology Hong Kong SAR, China OUTLINE Motivations Problems of Designing Large Crossbar Our Approach - Pipelined MUX
More informationCHAPTER 3 ASYNCHRONOUS PIPELINE CONTROLLER
84 CHAPTER 3 ASYNCHRONOUS PIPELINE CONTROLLER 3.1 INTRODUCTION The introduction of several new asynchronous designs which provides high throughput and low latency is the significance of this chapter. The
More informationJoint consideration of performance, reliability and fault tolerance in regular Networks-on-Chip via multiple spatially-independent interface terminals
Joint consideration of performance, reliability and fault tolerance in regular Networks-on-Chip via multiple spatially-independent interface terminals Philipp Gorski, Tim Wegner, Dirk Timmermann University
More informationIndex 283. F Fault model, 121 FDMA. See Frequency-division multipleaccess
Index A Active buffer window (ABW), 34 35, 37, 39, 40 Adaptive data compression, 151 172 Adaptive routing, 26, 100, 114, 116 119, 121 123, 126 128, 135 137, 139, 144, 146, 158 Adaptive voltage scaling,
More informationA Novel Energy Efficient Source Routing for Mesh NoCs
2014 Fourth International Conference on Advances in Computing and Communications A ovel Energy Efficient Source Routing for Mesh ocs Meril Rani John, Reenu James, John Jose, Elizabeth Isaac, Jobin K. Antony
More informationA Single Chip Shared Memory Switch with Twelve 10Gb Ethernet Ports
A Single Chip Shared Memory Switch with Twelve 10Gb Ethernet Ports Takeshi Shimizu, Yukihiro Nakagawa, Sridhar Pathi, Yasushi Umezawa, Takashi Miyoshi, Yoichi Koyanagi, Takeshi Horie, Akira Hattori Hot
More informationFuture of Interconnect Fabric A Contrarian View. Shekhar Borkar June 13, 2010 Intel Corp. 1
Future of Interconnect Fabric A ontrarian View Shekhar Borkar June 13, 2010 Intel orp. 1 Outline Evolution of interconnect fabric On die network challenges Some simple contrarian proposals Evaluation and
More informationPart IV: 3D WiNoC Architectures
Wireless NoC as Interconnection Backbone for Multicore Chips: Promises, Challenges, and Recent Developments Part IV: 3D WiNoC Architectures Hiroki Matsutani Keio University, Japan 1 Outline: 3D WiNoC Architectures
More informationSpiNNaker - a million core ARM-powered neural HPC
The Advanced Processor Technologies Group SpiNNaker - a million core ARM-powered neural HPC Cameron Patterson cameron.patterson@cs.man.ac.uk School of Computer Science, The University of Manchester, UK
More informationOpenSMART: Single-cycle Multi-hop NoC Generator in BSV and Chisel
OpenSMART: Single-cycle Multi-hop NoC Generator in BSV and Chisel Hyoukjun Kwon and Tushar Krishna Georgia Institute of Technology Synergy Lab (http://synergy.ece.gatech.edu) hyoukjun@gatech.edu April
More informationIntellectual Property Macrocell for. SpaceWire Interface. Compliant with AMBA-APB Bus
Intellectual Property Macrocell for SpaceWire Interface Compliant with AMBA-APB Bus L. Fanucci, A. Renieri, P. Terreni Tel. +39 050 2217 668, Fax. +39 050 2217522 Email: luca.fanucci@iet.unipi.it - 1 -
More information3D WiNoC Architectures
Interconnect Enhances Architecture: Evolution of Wireless NoC from Planar to 3D 3D WiNoC Architectures Hiroki Matsutani Keio University, Japan Sep 18th, 2014 Hiroki Matsutani, "3D WiNoC Architectures",
More informationFPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC)
FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC) D.Udhayasheela, pg student [Communication system],dept.ofece,,as-salam engineering and technology, N.MageshwariAssistant Professor
More informationTrade Offs in the Design of a Router with Both Guaranteed and Best-Effort Services for Networks on Chip
Trade Offs in the Design of a Router with Both Guaranteed and BestEffort Services for Networks on Chip E. Rijpkema, K. Goossens, A. R dulescu, J. Dielissen, J. van Meerbergen, P. Wielage, and E. Waterlander
More informationAsynchronous Bypass Channel Routers
1 Asynchronous Bypass Channel Routers Tushar N. K. Jain, Paul V. Gratz, Alex Sprintson, Gwan Choi Department of Electrical and Computer Engineering, Texas A&M University {tnj07,pgratz,spalex,gchoi}@tamu.edu
More informationDesigning 3D Tree-based FPGA TSV Count Minimization. V. Pangracious, Z. Marrakchi, H. Mehrez UPMC Sorbonne University Paris VI, France
Designing 3D Tree-based FPGA TSV Count Minimization V. Pangracious, Z. Marrakchi, H. Mehrez UPMC Sorbonne University Paris VI, France 13 avril 2013 Presentation Outlook Introduction : 3D Tree-based FPGA
More informationRHiNET-3/SW: an 80-Gbit/s high-speed network switch for distributed parallel computing
RHiNET-3/SW: an 0-Gbit/s high-speed network switch for distributed parallel computing S. Nishimura 1, T. Kudoh 2, H. Nishi 2, J. Yamamoto 2, R. Ueno 3, K. Harasawa 4, S. Fukuda 4, Y. Shikichi 4, S. Akutsu
More informationThe communication bottleneck
3D-MPSoCs: architectural and design technology outlook Luca Benini DEIS Università di Bologna lbenini@deis.unibo.it The communication bottleneck Architectural issues Traditional shared buses do not scale
More informationOASIS Network-on-Chip Prototyping on FPGA
Master thesis of the University of Aizu, Feb. 20, 2012 OASIS Network-on-Chip Prototyping on FPGA m5141120, Kenichi Mori Supervised by Prof. Ben Abdallah Abderazek Adaptive Systems Laboratory, Master of
More informationModule 17: "Interconnection Networks" Lecture 37: "Introduction to Routers" Interconnection Networks. Fundamentals. Latency and bandwidth
Interconnection Networks Fundamentals Latency and bandwidth Router architecture Coherence protocol and routing [From Chapter 10 of Culler, Singh, Gupta] file:///e /parallel_com_arch/lecture37/37_1.htm[6/13/2012
More informationA Reconfigurable Crossbar Switch with Adaptive Bandwidth Control for Networks-on
A Reconfigurable Crossbar Switch with Adaptive Bandwidth Control for Networks-on on-chip Donghyun Kim, Kangmin Lee, Se-joong Lee and Hoi-Jun Yoo Semiconductor System Laboratory, Dept. of EECS, Korea Advanced
More informationA VERIOG-HDL IMPLEMENTATION OF VIRTUAL CHANNELS IN A NETWORK-ON-CHIP ROUTER. A Thesis SUNGHO PARK
A VERIOG-HDL IMPLEMENTATION OF VIRTUAL CHANNELS IN A NETWORK-ON-CHIP ROUTER A Thesis by SUNGHO PARK Submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements
More informationThe Xilinx XC6200 chip, the software tools and the board development tools
The Xilinx XC6200 chip, the software tools and the board development tools What is an FPGA? Field Programmable Gate Array Fully programmable alternative to a customized chip Used to implement functions
More informationThe Design and Implementation of a Low-Latency On-Chip Network
The Design and Implementation of a Low-Latency On-Chip Network Robert Mullins 11 th Asia and South Pacific Design Automation Conference (ASP-DAC), Jan 24-27 th, 2006, Yokohama, Japan. Introduction Current
More informationDesign and Test Solutions for Networks-on-Chip. Jin-Ho Ahn Hoseo University
Design and Test Solutions for Networks-on-Chip Jin-Ho Ahn Hoseo University Topics Introduction NoC Basics NoC-elated esearch Topics NoC Design Procedure Case Studies of eal Applications NoC-Based SoC Testing
More informationEECS 578 Interconnect Mini-project
EECS578 Bertacco Fall 2015 EECS 578 Interconnect Mini-project Assigned 09/17/15 (Thu) Due 10/02/15 (Fri) Introduction In this mini-project, you are asked to answer questions about issues relating to interconnect
More informationA 1-GHz Configurable Processor Core MeP-h1
A 1-GHz Configurable Processor Core MeP-h1 Takashi Miyamori, Takanori Tamai, and Masato Uchiyama SoC Research & Development Center, TOSHIBA Corporation Outline Background Pipeline Structure Bus Interface
More informationSpiral 2-8. Cell Layout
2-8.1 Spiral 2-8 Cell Layout 2-8.2 Learning Outcomes I understand how a digital circuit is composed of layers of materials forming transistors and wires I understand how each layer is expressed as geometric
More informationRASoC: A Router Soft-Core for Networks-on-Chip
RASoC: A Router Soft-Core for Networks-on-Chip Cesar Albenes Zeferino Márcio Eduardo Kreutz Altamiro Amadeu Susin UNIVALI CTTMar Rua Uruguai, 458 C.P. 360 CEP 88302-202 Itajaí SC BRAZIL zeferino@inf.univali.br
More informationNetSpeed ORION: A New Approach to Design On-chip Interconnects. August 26 th, 2013
NetSpeed ORION: A New Approach to Design On-chip Interconnects August 26 th, 2013 INTERCONNECTS BECOMING INCREASINGLY IMPORTANT Growing number of IP cores Average SoCs today have 100+ IPs Mixing and matching
More informationTHERE are few published methods to aid in automating
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 30, NO. 9, SEPTEMBER 2011 1387 Design of an Energy-Efficient Asynchronous NoC and Its Optimization Tools for Heterogeneous
More informationEECS 598: Integrating Emerging Technologies with Computer Architecture. Lecture 12: On-Chip Interconnects
1 EECS 598: Integrating Emerging Technologies with Computer Architecture Lecture 12: On-Chip Interconnects Instructor: Ron Dreslinski Winter 216 1 1 Announcements Upcoming lecture schedule Today: On-chip
More informationH 2 A: A Hardened Asynchronous Network on Chip
H 2 A: A Hardened Asynchronous Network on Chip Faculty of Informatics - FACIN, - PUCRS 1 Porto Alegre, RS, Brazil ney.calazans@pucrs.br Julian Pontes 1,2, Ney Calazans 1, Pascal Vivet 2 CEA-LETI 2 Grenoble,
More informationHermes-GLP: A GALS Network on Chip Router with Power Control Techniques
IEEE Computer Society Annual Symposium on VLSI Hermes-GLP: A GALS Network on Chip Router with Power Control Techniques Julian Pontes 1, Matheus Moreira 2, Rafael Soares 3, Ney Calazans 4 Faculty of Informatics,
More informationQuasi Delay-Insensitive High Speed Two-Phase Protocol Asynchronous Wrapper for Network on Chips
Guan XG, Tong XY, Yang YT. Quasi delay-insensitive high speed two-phase protocol asynchronous wrapper for network on chips. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 25(5): 1092 1100 Sept. 2010. DOI 10.1007/s11390-010-1086-3
More informationACCORDING to the International Technology Roadmap
420 IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, VOL. 1, NO. 3, SEPTEMBER 2011 A Voltage-Frequency Island Aware Energy Optimization Framework for Networks-on-Chip Wooyoung Jang,
More informationFCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow
FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow Abstract: High-level synthesis (HLS) of data-parallel input languages, such as the Compute Unified Device Architecture
More informationLow Power and Energy Efficient Asynchronous Design
Copyright 2007 American Scientific Publishers All rights reserved Printed in the United States of America Journal of Low Power Electronics Vol. 3, 234 253, 2007 Peter A. Beerel 1 and Marly E. Roncken 2
More informationCross Clock-Domain TDM Virtual Circuits for Networks on Chips
Cross Clock-Domain TDM Virtual Circuits for Networks on Chips Zhonghai Lu Dept. of Electronic Systems School for Information and Communication Technology KTH - Royal Institute of Technology, Stockholm
More informationFPGA for Software Engineers
FPGA for Software Engineers Course Description This course closes the gap between hardware and software engineers by providing the software engineer all the necessary FPGA concepts and terms. The course
More informationPrediction Router: Yet another low-latency on-chip router architecture
Prediction Router: Yet another low-latency on-chip router architecture Hiroki Matsutani Michihiro Koibuchi Hideharu Amano Tsutomu Yoshinaga (Keio Univ., Japan) (NII, Japan) (Keio Univ., Japan) (UEC, Japan)
More informationCoreVA-MPSoC: A Many-core Architecture with Tightly Coupled Shared and Local Data Memories
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, POST-PRINT, DECEMBER 2017 1 CoreVA-MPSoC: A Many-core Architecture with Tightly Coupled and Local Data Memories Johannes Ax, Gregor Sievers, Julian
More informationDesign and Implementation of Low Complexity Router for 2D Mesh Topology using FPGA
Design and Implementation of Low Complexity Router for 2D Mesh Topology using FPGA Maheswari Murali * and Seetharaman Gopalakrishnan # * Assistant professor, J. J. College of Engineering and Technology,
More informationDesign of Reconfigurable Router for NOC Applications Using Buffer Resizing Techniques
Design of Reconfigurable Router for NOC Applications Using Buffer Resizing Techniques Nandini Sultanpure M.Tech (VLSI Design and Embedded System), Dept of Electronics and Communication Engineering, Lingaraj
More informationLow-Power Interconnection Networks
Low-Power Interconnection Networks Li-Shiuan Peh Associate Professor EECS, CSAIL & MTL MIT 1 Moore s Law: Double the number of transistors on chip every 2 years 1970: Clock speed: 108kHz No. transistors:
More informationSynthesizable FPGA Fabrics Targetable by the VTR CAD Tool
Synthesizable FPGA Fabrics Targetable by the VTR CAD Tool Jin Hee Kim and Jason Anderson FPL 2015 London, UK September 3, 2015 2 Motivation for Synthesizable FPGA Trend towards ASIC design flow Design
More informationA 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology
http://dx.doi.org/10.5573/jsts.014.14.6.760 JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.14, NO.6, DECEMBER, 014 A 56-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology Sung-Joon Lee
More informationArchitecture and Automated Design Flow for Digital Network on chip for Analog/RF Building Block Control
Architecture and Automated Design Flow for Digital Network on chip for Analog/RF Building Block Control Wolfgang Eberle, PhD IMEC Bioelectronic Systems Bridging software and analog/rf Software defined
More informationConfigurable Router Design for Dynamically Reconfigurable Systems based on the SoCWire NoC
International Journal of Reconfigurable and Embedded Systems (IJRES) Vol. 2, No. 1, March 2013, pp. 27~48 ISSN: 2089-4864 27 Configurable Router Design for Dynamically Reconfigurable Systems based on the
More informationAT-501 Cortex-A5 System On Module Product Brief
AT-501 Cortex-A5 System On Module Product Brief 1. Scope The following document provides a brief description of the AT-501 System on Module (SOM) its features and ordering options. For more details please
More informationHARDWARE IMPLEMENTATION OF PIPELINE BASED ROUTER DESIGN FOR ON- CHIP NETWORK
DOI: 10.21917/ijct.2012.0092 HARDWARE IMPLEMENTATION OF PIPELINE BASED ROUTER DESIGN FOR ON- CHIP NETWORK U. Saravanakumar 1, R. Rangarajan 2 and K. Rajasekar 3 1,3 Department of Electronics and Communication
More informationAn Asynchronous Array of Simple Processors for DSP Applications
An Asynchronous Array of Simple Processors for DSP Applications Zhiyi Yu, Michael Meeuwsen, Ryan Apperson, Omar Sattari, Michael Lai, Jeremy Webb, Eric Work, Tinoosh Mohsenin, Mandeep Singh, Bevan Baas
More informationNoC Test-Chip Project: Working Document
NoC Test-Chip Project: Working Document Michele Petracca, Omar Ahmad, Young Jin Yoon, Frank Zovko, Luca Carloni and Kenneth Shepard I. INTRODUCTION This document describes the low-power high-performance
More informationFPGA BASED ADAPTIVE RESOURCE EFFICIENT ERROR CONTROL METHODOLOGY FOR NETWORK ON CHIP
FPGA BASED ADAPTIVE RESOURCE EFFICIENT ERROR CONTROL METHODOLOGY FOR NETWORK ON CHIP 1 M.DEIVAKANI, 2 D.SHANTHI 1 Associate Professor, Department of Electronics and Communication Engineering PSNA College
More informationDesign and Implementation of a Packet Switched Dynamic Buffer Resize Router on FPGA Vivek Raj.K 1 Prasad Kumar 2 Shashi Raj.K 3
IJSRD - International Journal for Scientific Research & Development Vol. 2, Issue 02, 2014 ISSN (online): 2321-0613 Design and Implementation of a Packet Switched Dynamic Buffer Resize Router on FPGA Vivek
More informationThe Nostrum Network on Chip
The Nostrum Network on Chip 10 processors 10 processors Mikael Millberg, Erland Nilsson, Richard Thid, Johnny Öberg, Zhonghai Lu, Axel Jantsch Royal Institute of Technology, Stockholm November 24, 2004
More informationJUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS
1 JUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS Shabnam Badri THESIS WORK 2011 ELECTRONICS JUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS
More informationKiloCore: A 32 nm 1000-Processor Array
KiloCore: A 32 nm 1000-Processor Array Brent Bohnenstiehl, Aaron Stillmaker, Jon Pimentel, Timothy Andreas, Bin Liu, Anh Tran, Emmanuel Adeagbo, Bevan Baas University of California, Davis VLSI Computation
More informationSoC. The Asynchronous NOC. SoC. Conceptual Summary. Ran Ginosar NOCS Tutorial San Diego, 10 May Each color is a separate clock domain
SoC The Asynchronous NOC an Ginosar NOCS Tutorial San Diego, 10 May 2009 Each color is a separate clock domain an Ginosar Async Noc Tutorial NOCS 2009 2 What clock for the interconnect? Fastest? Opportunistic?
More informationWITH THE CONTINUED advance of Moore s law, ever
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 30, NO. 11, NOVEMBER 2011 1663 Asynchronous Bypass Channels for Multi-Synchronous NoCs: A Router Microarchitecture, Topology,
More informationA 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications
A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications Ju-Ho Sohn, Jeong-Ho Woo, Min-Wuk Lee, Hye-Jung Kim, Ramchan Woo, Hoi-Jun Yoo Semiconductor System
More informationSimplifying FPGA Design for SDR with a Network on Chip Architecture
Simplifying FPGA Design for SDR with a Network on Chip Architecture Matt Ettus Ettus Research GRCon13 Outline 1 Introduction 2 RF NoC 3 Status and Conclusions USRP FPGA Capability Gen
More informationFPGAs & Multi-FPGA Systems. FPGA Abstract Model. Logic cells imbedded in a general routing structure. Logic cells usually contain:
s & Multi- Systems Fit logic into a prefabricated system Fixed inter-chip routing Fixed on-chip logic & routing XBA Partitioning Global outing Technology Map. XBA XBA Placement outing 23 Abstract Model
More informationProcessor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP
Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Presenter: Course: EEC 289Q: Reconfigurable Computing Course Instructor: Professor Soheil Ghiasi Outline Overview of M.I.T. Raw processor
More informationBasic Low Level Concepts
Course Outline Basic Low Level Concepts Case Studies Operation through multiple switches: Topologies & Routing v Direct, indirect, regular, irregular Formal models and analysis for deadlock and livelock
More informationInternational Training Workshop on FPGA Design for Scientific Instrumentation and Computing November 2013
2499-13 International Training Workshop on FPGA Design for Scientific Instrumentation and Computing 11-22 Digital CMOS Design Combinational and sequential circuits, contd. Pirouz Bazargan-Sabet Department
More information3D TECHNOLOGIES: SOME PERSPECTIVES FOR MEMORY INTERCONNECT AND CONTROLLER
3D TECHNOLOGIES: SOME PERSPECTIVES FOR MEMORY INTERCONNECT AND CONTROLLER CODES+ISSS: Special session on memory controllers Taipei, October 10 th 2011 Denis Dutoit, Fabien Clermidy, Pascal Vivet {denis.dutoit@cea.fr}
More informationPhysical Implementation
CS250 VLSI Systems Design Fall 2009 John Wawrzynek, Krste Asanovic, with John Lazzaro Physical Implementation Outline Standard cell back-end place and route tools make layout mostly automatic. However,
More informationSoC Design Lecture 13: NoC (Network-on-Chip) Department of Computer Engineering Sharif University of Technology
SoC Design Lecture 13: NoC (Network-on-Chip) Department of Computer Engineering Sharif University of Technology Outline SoC Interconnect NoC Introduction NoC layers Typical NoC Router NoC Issues Switching
More informationLecture 26: Interconnects. James C. Hoe Department of ECE Carnegie Mellon University
18 447 Lecture 26: Interconnects James C. Hoe Department of ECE Carnegie Mellon University 18 447 S18 L26 S1, James C. Hoe, CMU/ECE/CALCM, 2018 Housekeeping Your goal today get an overview of parallel
More informationLecture 18: Communication Models and Architectures: Interconnection Networks
Design & Co-design of Embedded Systems Lecture 18: Communication Models and Architectures: Interconnection Networks Sharif University of Technology Computer Engineering g Dept. Winter-Spring 2008 Mehdi
More informationFABRICATION TECHNOLOGIES
FABRICATION TECHNOLOGIES DSP Processor Design Approaches Full custom Standard cell** higher performance lower energy (power) lower per-part cost Gate array* FPGA* Programmable DSP Programmable general
More informationDeadlock-free XY-YX router for on-chip interconnection network
LETTER IEICE Electronics Express, Vol.10, No.20, 1 5 Deadlock-free XY-YX router for on-chip interconnection network Yeong Seob Jeong and Seung Eun Lee a) Dept of Electronic Engineering Seoul National Univ
More informationRe-Examining Conventional Wisdom for Networks-on-Chip in the Context of FPGAs
This work was funded by NSF. We thank Xilinx for their FPGA and tool donations. We thank Bluespec for their tool donations. Re-Examining Conventional Wisdom for Networks-on-Chip in the Context of FPGAs
More informationFPGA. Logic Block. Plessey FPGA: basic building block here is 2-input NAND gate which is connected to each other to implement desired function.
FPGA Logic block of an FPGA can be configured in such a way that it can provide functionality as simple as that of transistor or as complex as that of a microprocessor. It can used to implement different
More informationDATA REUSE DRIVEN MEMORY AND NETWORK-ON-CHIP CO-SYNTHESIS *
DATA REUSE DRIVEN MEMORY AND NETWORK-ON-CHIP CO-SYNTHESIS * University of California, Irvine, CA 92697 Abstract: Key words: NoCs present a possible communication infrastructure solution to deal with increased
More informationOn Packet Switched Networks for On-Chip Communication
On Packet Switched Networks for On-Chip Communication Embedded Systems Group Department of Electronics and Computer Engineering School of Engineering, Jönköping University Jönköping 1 Outline : Part 1
More information