A Dynamic NOC Arbitration Technique using Combination of VCT and XY Routing


1 Bharati B. Sayankar, 2 Pankaj Agrawal
1 Electronics Department, Rashtrasant Tukdoji Maharaj Nagpur University, G.H. Raisoni College Of Engineering, Nagpur, Maharashtra, India
2 Electronics Department, Rashtrasant Tukdoji Maharaj Nagpur University, G.H. Raisoni Academy of Engineering & Technology, Nagpur, Maharashtra, India

Abstract - Continuous scaling of CMOS technology allows a large number of heterogeneous devices to be integrated on a single chip, and these devices need efficient on-chip communication. Competent routers are therefore required to carry the communication between them. The proposed methodology describes an on-chip router based on a combination of XY and virtual cut-through (VCT) routing for better, optimized data transfer. The proposed router design yields an optimized routing output because a priority is allocated to each input data packet. The dynamic arbitration technique using VCT and XY is explained in this paper.

Keywords - Arbiter, Network on Chip (NoC), On-Chip Router, XY Router, Virtual Cut-Through (VCT) Router, Cross Bar Switch.

1. Introduction

To meet the demands of computation-intensive applications and the low-power requirements of high-performance systems, the amount of computing resources on a single chip has increased. When a System-on-Chip (SoC) is built by adding many computing resources such as specific IPs, CPUs and DSPs, the interconnection between these resources becomes another challenging issue. In nearly all SoC applications, a shared bus interconnection, which requires arbitration logic to serialize several bus access requests, is used for communication between the integrated processing units because of its simple control and low cost.
However, this type of shared bus architecture does not scale well, because only one master can use the bus at a time, which means that all bus accesses must be serialized by the arbiter. This bandwidth scalability constraint can be removed by using a packet-switched on-chip micro-network of interconnects, popularly known as the Network-on-Chip (NoC) architecture. The fundamental idea originated from conventional large-scale multiprocessors and distributed computing networks.

1.1 Network on Chip

A variety of interconnection techniques are presently in use, including crossbars, buses and NoCs. Of these, the latter two are dominant in the research community. Buses, however, scale poorly: performance degrades drastically as the number of processing elements increases, so they are not used where the number of processing elements is large. To remove this constraint, attention has turned to packet-based on-chip communication networks, called Network-on-Chip (NoC). Figure 1 shows the basic Network-on-Chip architecture [1].

A conventional NoC consists of computational processing elements (PEs), network interfaces (NIs) and routers, as shown in Fig. 1. The latter two form the communication architecture. Each PE is attached to an NI, which connects the PE to a local router. The NI packetizes the data before the router backbone is used to cross the NoC. When a packet is transferred from a source PE to a destination PE, it moves hop by hop through the network according to the decision made by each router. At every router, the packet is first received and stored in an input buffer. The control logic in the router is then responsible for the routing decision and channel arbitration. Finally, the granted packet traverses a crossbar to the next router. The procedure is repeated until the packet arrives at its destination.

Fig. 1 Typical NoC Architecture (PE: Processing Element, R: Router, NI: Network Interface)

1.2 On-Chip Router

The main part of an on-chip network is the router, which undertakes the crucial task of coordinating the data flow. Fig. 2 shows a typical Network-on-Chip router. The router operation revolves around two fundamental parts: (a) the data path and (b) the associated control logic. The data path consists of a number of input and output channels that facilitate packet switching and traversal [1] [2]. Generally, a 5-input x 5-output router is used. Four of the five ports face the cardinal directions (North, South, East and West) and one port is attached to the local processing element.

As in any other network, the router is the most important component in the design of the communication backbone of a NoC system. In a packet-switched network, the function of the router is to forward an incoming packet to the destination resource if that resource is directly connected to it, or otherwise to forward the packet to another router connected to it. It is very important that the design of a NoC router be as simple as possible, because implementation cost increases with the design complexity of a router.

Fig. 2 Typical NoC Router

2. Routing Techniques

In this paper, two different routing techniques are used to build the NoC router:
1. X-Y Routing
2. VCT (Virtual Cut-Through) Routing

2.1. X-Y Routing

In X-Y routing, the path of a message is determined completely by the source and destination addresses. The message first travels horizontally (in the X dimension) from the source to the column containing the destination, and from there it travels vertically. In other words, the X direction is resolved first and the Y direction second. There are four possible direction pairs in X-Y routing: east-south, east-north, west-south and west-north. Fig. 3 shows a simple example of X-Y routing [4].

Fig. 3 X-Y Routing Example
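The X-first, then-Y rule of Section 2.1 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the authors' hardware implementation: the coordinate convention (x grows eastward, y grows northward), the port names and the function names `xy_next_port` and `xy_path` are hypothetical choices made for this example.

```python
# Minimal sketch of X-Y (dimension-ordered) routing on a 2D mesh.
# Assumptions (not from the paper): routers are addressed as (x, y),
# x increases towards EAST and y increases towards NORTH.

# Coordinate change caused by leaving a router through each port.
STEP = {"EAST": (1, 0), "WEST": (-1, 0), "NORTH": (0, 1), "SOUTH": (0, -1)}


def xy_next_port(current, destination):
    """Return the output port chosen at `current` for a packet to `destination`."""
    cx, cy = current
    dx, dy = destination
    # Resolve the X dimension first.
    if dx > cx:
        return "EAST"
    if dx < cx:
        return "WEST"
    # X is aligned with the destination column: resolve the Y dimension.
    if dy > cy:
        return "NORTH"
    if dy < cy:
        return "SOUTH"
    # Both coordinates match: deliver to the local processing element.
    return "LOCAL"


def xy_path(source, destination):
    """Return the list of routers visited from `source` to `destination`."""
    path = [source]
    current = source
    while True:
        port = xy_next_port(current, destination)
        if port == "LOCAL":
            return path
        step_x, step_y = STEP[port]
        current = (current[0] + step_x, current[1] + step_y)
        path.append(current)
```

For example, `xy_path((0, 0), (2, 1))` visits `(0, 0)`, `(1, 0)`, `(2, 0)` and `(2, 1)`: two hops east followed by one hop north, which is the east-north direction pair. Because a packet never turns from the Y dimension back into the X dimension, the cyclic channel dependencies that cause deadlock cannot form on a mesh.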
Advantages of X-Y routing: it is very simple to implement and it is deadlock-free.

2.2. VCT Routing

Virtual cut-through routing is very similar to message switching. The difference is that when a message arrives at an intermediate node and its chosen outgoing channel is free, the message is sent on towards its destination before it has been completely received at that node. If the outgoing channel is busy, the message is buffered in the intermediate node. The delay caused by unnecessary buffering in front of an idle channel is therefore avoided [5]. In this design, the VCT routing technique is used to create a number of virtual channels that pass the data packets granted by the arbiter in sequential order.

3. Router Design

The design of the router, shown in Fig. 4, mainly consists of three parts:
1. FIFO
2. Arbiter
3. Crossbar

Fig. 4 5 Input-Output port Unidirectional On-Chip Router

3.1 FIFO Buffer

Buffering is extremely important in a NoC for congestion control and flow control. Choosing a proper buffer size is the key to obtaining the best performance. In this design, FIFO buffers provide input buffering. The depth of the buffer is equal to the packet size. Because the depth of the buffer is four, a minimum of four clock cycles is required for the first packet to come out of the buffer.

3.2. Arbiter

The arbiter controls the allocation of the ports and resolves contention. It keeps the updated status of all the ports and knows which ports are free and which ports are communicating with one another. In the proposed work, a round-robin arbitration algorithm is used. Packets with the same priority that are destined for the same output port are scheduled by a round-robin arbiter. Fig. 5 shows an example of a dynamic priority round-robin arbiter. If, in a given period of time, several input ports request the same output or resource, the arbiter is responsible for resolving the priorities among the competing requests. Once the last packet has finished transmission, the arbiter releases the output port connected to the crossbar, so that other waiting packets can be granted the output through arbitration.

Fig. 5 Dynamic priority round robin arbiter

A round-robin arbiter operates on the principle that a request which was just served should have the lowest priority in the next round of arbitration. Depending on the control logic, the arbiter generates select lines for the MUX-based crossbar and read or write signals for the FIFO buffers.

3.3. Crossbar

A crossbar switch connects multiple inputs to multiple outputs in a matrix manner. The crossbar switch shown in Fig. 6 has 4 inputs and 4 outputs. In this design, every input port is forced to share a single crossbar port even when multiple flits could be sent from different virtual-channel buffers. This restriction keeps the crossbar small and independent of the number of virtual channels.

3.3.1 Reconfigurable Cross Bar Switch

A reconfigurable crossbar switch has three main blocks: (1) a connection matrix, in which all topologies are implemented; (2) a decoder, which converts the reconfiguration bits into a set of matrix bits; and (3) a pre-header analyzer. For this third block, a pre-header carrying the output destination can be added to the packet by the network processor.
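The interaction between the round-robin arbiter of Section 3.2 and the MUX-based crossbar of Section 3.3 can be sketched in Python as follows. This is a behavioural illustration under stated assumptions, not the authors' HDL design: the class and function names (`RoundRobinArbiter`, `crossbar`, `route_cycle`) are hypothetical, one arbiter per output port is assumed, and the packet priority field of the dynamic priority arbiter is not modelled.

```python
# Behavioural sketch: one round-robin arbiter per output port drives the
# select line of a MUX-based crossbar. All names are hypothetical.


class RoundRobinArbiter:
    """Grants one requester per round; the input just served gets lowest priority."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.next_first = 0  # input index with highest priority in the next round

    def grant(self, requests):
        """`requests` is a list of booleans; return the granted index or None."""
        for offset in range(self.num_inputs):
            index = (self.next_first + offset) % self.num_inputs
            if requests[index]:
                # The served input moves to the back of the priority order.
                self.next_first = (index + 1) % self.num_inputs
                return index
        return None


def crossbar(inputs, selects):
    """MUX-based crossbar: output o carries inputs[selects[o]], or None if idle."""
    return [inputs[s] if s is not None else None for s in selects]


def route_cycle(arbiters, head_flits, wanted_output):
    """Arbitrate every output port once and switch the granted flits.

    head_flits[i] is the flit at the head of input buffer i (None if empty);
    wanted_output[i] is the output port that flit requests.
    """
    selects = []
    for output, arbiter in enumerate(arbiters):
        requests = [
            flit is not None and wanted == output
            for flit, wanted in zip(head_flits, wanted_output)
        ]
        selects.append(arbiter.grant(requests))
    return crossbar(head_flits, selects)
```

As a usage example, with four arbiters created by `[RoundRobinArbiter(4) for _ in range(4)]`, head flits `["a", "b", None, "d"]` and requested outputs `[1, 1, 0, 2]`, the first call to `route_cycle` places `"a"` on output 1 and `"d"` on output 2. If the same requests are presented again, output 1 is granted to `"b"`, because input 0 was just served and now has the lowest priority.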

The reconfigurable crossbar switch (RCS) uses reconfiguration bits to implement the topology. Only the reconfiguration unit and the instruction set of the network processor are able to change those bits in order to implement new topologies. Although one instruction can modify the 01 and 10 formats, the 00 format is restricted to the reconfiguration unit [3].

Fig. 6 4 x 4 Cross Bar Switch

4. Proposed Router Architecture

Figure 7 shows the overall on-chip router technique proposed and implemented in this paper. The input data packets are given to the arbiter together with a priority, as shown in Fig. 7, so that the arbiter produces a dynamic output. This output is then sent to the VCT router to find the shortest path for transferring the data packet through the bus. The X-Y router is then used to send the packets in the same direction from the bus.

Fig. 7 Proposed NoC router technique

Fig. 8 shows the combination of both routing techniques. An example with 4 packets has been taken; the technique is also applicable to an arbitrary number of packets.

Fig. 8 Combination of VCT routing and X-Y routing

5. Simulation Result

The simulation results for the proposed architecture were obtained using the Xilinx 13.1 simulator. Fig. 9 shows the output of the X-Y router. The synthesis report gives all the timing information for the X-Y router, such as total delay, input time, output time, total path delay, source and destination. Fig. 10 shows the number of devices utilized by the X-Y router to perform its task. This gives an indication of the total energy utilized by the router.

Fig. 9 Output of X-Y Router

Fig. 10 Device utilization of X-Y router

Fig. 11 shows the output of the VCT router and Fig. 12 shows the number of devices utilized by the VCT router to perform its task. This gives an indication of the total energy utilized by the router.

Fig. 11 Output of VCT router

Fig. 12 Device utilization of VCT router

Fig. 13 shows the output of the combined VCT and XY router and Fig. 14 shows the number of devices utilized by the combined VCT and XY router to perform its task.

Fig. 13 Output of VCT and XY

Fig. 14 Device utilization of VCT with XY router

Table 1 Comparison table of the results of VCT and XY router

Parameters                  | Available                | Used                   | Utilization (%)
                            | XY     VCT     VCT + XY  | XY    VCT   VCT + XY   | XY    VCT   VCT + XY
Number of slices            | 1920   5888    14752     | 127   148   2407       | 6%    2%    16%
Number of slices flip flop  | 3840   11776   29504     | 53    52    329        | 1%    0%    1%
Number of 4 input LUTs      | 3840   11776   29504     | 230   267   4574       | 5%    2%    15%
Latency (values as listed)  | -                        | 20.341 ns, 5.642 ns    |

6. Conclusion

Table 1 shows the comparative results of the XY router and the VCT router. As shown in Table 1, the utilization of slices, slice flip-flops and 4-input LUTs in the XY router is better than in the VCT router. Hence, from the above results, the XY router performs better than the VCT router. The utilization of the same parameters is further improved when the combination of the XY and VCT routers is used. Path selection can be done based on XY routing, and a virtual channel is then selected to pass the packets, so the system also has the advantages of the VCT algorithm.

The authors declare that they have no competing interests.

References

[1] V. Soteriou, R.S. Ramanujam, B. Lin, Li-Shiuan Peh. A High-Throughput Distributed Shared-Buffer NoC Router. IEEE Computer Architecture Letters, vol. 8, no. 1, pp. 21-24, Jan.-June 2009, doi:10.1109/lca.2009.5.

[2] A. Louri, J. Wang. Design of energy-efficient channel buffers with router bypassing for network-on-chips (NoCs).
[3] H. C. Freitas and C. A. P. S. Martins. Didactic Architectures and Simulator for Network Processor Learning. Workshop on Computer Architecture Education, San Diego, CA, USA, 2003, pp. 86-95.
[4] A. Kodi, A. Louri, J. Wang. Design of energy-efficient channel buffers with router bypassing for network-on-chips (NoCs). Proceedings of International Symposium on Quality of Electronic Design (ISQED), pp. 826-832, March 2009.
[5] Khalid Latif, Moazzam Niazi, Hannu Tenhunen, Tiberiu Seceleanu, Sakir Sezer. Application development flow for on-chip distributed architectures. Proceedings of IEEE International SoC Conference (SOCC), Sept. 2008, pp. 163-168.