ScienceDirect. Packet-based Adaptive Virtual Channel Configuration for NoC Systems

Size: px
Start display at page:

Download "ScienceDirect. Packet-based Adaptive Virtual Channel Configuration for NoC Systems"

Transcription

1 Available online at ScienceDirect Procedia Computer Science 34 (2014 ) International Workshop on the Design and Performance of Network on Chip (DPNoC 2014) Packet-based Adaptive Virtual Channel Configuration for NoC Systems Masoud Oveis Gharan and Gul N. Khan* Electrical and Computer Engineering, Ryerson University, Toronto, ON M5B 2K3 Canada Abstract Growing number of on-chip cores requires the introduction of an efficient communication structure such as NoC. In NoC design, the channel buffer organization facilitates the use of Virtual Channels (VC) for on-chip communication. A VC structure can be categorized as static or dynamic. In a dynamic VC structure, variable number of buffer-slots can be employed by each VC according to different traffic conditions in the NoC. We introduce a Packet-Based Virtual Channel (PBVC) scheme, where a VC is reserved when a packet enters the router and released when the packet leaves. A VC will hold the flits of only one packet at a time that subsequently removes the Head-of-Line blocking. PBVC technique is an amended version of dynamically allocated multi-queue schemes where, an input or output port employs a centralized buffer whose slots are dynamically allocated to VCs. The experimental results show that our approach improves network latency and throughput as compared to other VC designs The Elsevier Authors. B.V. Published This an by open Elsevier access B.V. article under the CC BY-NC-ND license Peer-review ( under responsibility of the Program Chairs of FNC Selection and peer-review under responsibility of Conference Program Chairs Keywords: adaptive virtual channels; head-of-line blocking; NoC; on-chip communication; VC organization. 1. Introduction In NoC (Network-on-Chip) domain, wormhole routing is mainly employed for communication among various cores of NoC 1. The flits of a packet are stored in the channel buffers. The header flit of a packet starts from the source core router and passes through a series of routers reserving the route path. This type of routing can cause traffic congestion. The blocking of one packet leads to the blocking of other packets queued for the channel. This blocking is known as Head-of-Line (HoL) blocking that increases the latency and results in lower throughput and buffer utilization. HoL problem can be alleviated by employing Virtual Channels 2. The VC mechanism enables the multiplexing and buffering of several packets in a single router channel 3. However, it does not remove HoL blocking * Corresponding author. Tel.: x6084; fax: address: gnkhan@ryerson.ca Elsevier B.V. This is an open access article under the CC BY-NC-ND license ( Selection and peer-review under responsibility of Conference Program Chairs doi: /j.procs

2 Masoud Oveis Gharan and Gul N. Khan / Procedia Computer Science 34 ( 2014 ) completely 4,5,6. Queue structure is an important component of NoC router that stores the flits of message packets. It is the main storage of a router structure that temporarily stores packet flits by following the First-Come-First-Serve (FCFS) order until network resources become available. Sometimes queue and FIFO terms are used interchangeably. Many queue architectures have been proposed such as FIFO queue, circular queue, Dynamically Allocated Multi-Queue (DAMQ) and their variants 7,8. DAMQ is a single storage array that maintains multiple FIFO queues 4. It can provide a solution to the HoL blocking problem 10. The packets are stored in the queues of a multiqueue of the output port for routing. Therefore, in case the current packet faces a blockage, the other packets destined to that output port can become blocked. Choi and Pinkston claimed that this type of blocking is not HoL 10. However, we can argue that in fact this is HoL blocking. Even, HoL blocking can also happen inside the queue of a DAMQ buffer 2,11. We present a new methodology that entirely removes the effect of HoL blocking. 2. DAMQ based VC organization A DAMQ buffer organization scheme for all the virtual channels (VCs) has been presented in the past 6. In Fig. 1, two buffer slots are reserved for each virtual channel before the buffer accepts an incoming flit. The rest of the buffer slots are free to be employed by any of the VCs according to the traffic demand. Various research groups have presented the design and implementation of DAMQ buffers 2,4,5. These designs are either expensive in terms of hardware or inefficient due to data dependencies. A centralized shared buffer architecture called Virtual Channel Regulator (ViChaR) is introduced by Nicopoulos and others 2. ViChaR requires a large size of arbiter that can create latency bottlenecks in the critical path and may limit the overall NoC frequency 11. It dynamically allocates VCs on a First-Come-First-Served basis, and there is no priority for the new packets. In the case of blocking, a packet can occupy all the slots of a channel buffer and prevents any new packet to pass through the router. If the blocking of that packet continues, all the upstream routers will be occupied by the packet, and a new packet may not pass through the routers on that route. Another drawback of ViChaR is its huge hardware for some configurations. Two more approaches are introduced by Evripidou et.al involving DAMQ mechanism 11. These are named as Mask-based and Link-List-based techniques. The Mask-based approach is cheaper in terms of hardware but it is of synchronous nature and its performance is low. The Link-List-based approach that mimics DAMQ organization is expensive however, it leads to higher performance. All the existing DAMQ based VC buffer structures suffer from HoL blocking except ViChaR that impose high latency and expensive hardware overhead. Therefore, a general solution to remove HoL blocking from DAMQ buffer organization is needed. All of the above points have motivated us to investigate and design the PBVC approach for avoiding HoL blocking Conventional wormhole VC organization Conventional Wormhole Virtual Channel (CWVC) mechanism is a common approach used in most of the existing VC based NoCs being researched. It is usually used as a basic approach to be compared with the other new approaches 5,8,9,12,13. In Fig. 2, the VC implementation of a physical channel is illustrated for a conventional static queue structure. The buffer slots in static queue structure are statically allocated to incoming packets where the buffer slots are dynamically allocated to incoming packets in DAMQ organization. In CWVC mechanism, when the header flit of a packet enters a VC buffer, the VC is reserved by the packet. The reservation of the VC is kept until the tail flit arrives. Then the VC can accept a new packet and it might have flits from previous packet. In this way, it can contain two parts of two packets at a time. When a VC has at least two parts of two packets, the blocking of HoL packet blocks the 2nd packet even if the route of the other packet is open. This is the main source of HoL blocking. Physical channel P 0 P 1 P 1 P 3 P 3 P 1 P 1 VC0 VC1 VC2 VC3 Free Fig. 1. Reserved buffer space for virtual channels.

3 554 Masoud Oveis Gharan and Gul N. Khan / Procedia Computer Science 34 ( 2014 ) Write-Pointer VC3 Read-Pointer VC3 Write-Pointer VC2 Read-Pointer VC2 Write-Pointer VC1 Read-Pointer VC1 Write-Pointer VC0 Read-Pointer VC0 P 0 P 0 P 0 Physical Channel 32 P 1 P 1 P 1 P 1 P 2 P 2 P 2 P 2 P 2 P 2 P 2 P 2 32 Fig. 2. Input-port with static queue. For DAMQ based VC buffers, the depth of VC dynamically varies according to the traffic situation. To better illustrate the structure and operation of DAMQ buffers, two different cases are illustrated in Fig. 3. In the first case, the depth of VC0 is thirteen while in the 2 nd case, VC0 depth is zero. The varying depth of VC0 can be due to different traffic conditions. This property of DAMQ buffer where a VC can occupy all the buffer space can lead to lower performance. In the case of DAMQ based CWVC organization, there may be two situations of lower performance. First of all, when a packet is blocked in a VC and the other packets can move into that VC buffer, the VC becomes bigger and occupies all the space of buffer. This prevents the other VCs to perform efficiently due to lack of buffer space. Secondly, this blockage leads to the blocking in upstream routers (related to the blocked packets). In other words, the buffer blockage will spread to the other parts of NoC causing global congestion PBVC approach Our PBVC approach provides an effective solution to remove HoL blocking. The methodology is suitable for DAMQ buffer organization. In this approach, when the header flit of a packet enters a VC, the VC is reserved by the packet. The VC becomes free when the tail flit of the packet exits. The VC does not accept a new packet until the VC buffer is holding any flit of the existing packet. In this way, our PBVC methodology removes HoL blocking completely as only one packet can occupy a VC at a particular instance. Its main features are listed below: The chances of getting a free VC for unblocked packets of a channel in the PBVC approach are much higher than the CWVC mechanism. Consider two situations of Figures 4 and 5. In Fig. 4 that represents CWVC organization, the packet 4 and 5 remain blocked until packet-0 is facing a blockage. For PBVC organization, when one of the VCs (i.e. VC1, VC2 or VC3) becomes free, the packet 4 and 5 can occupy the buffer as illustrated in Fig. 5. In the case of our PBVC approach, when a packet faces a blockage, its VC gets minimum space from the channel buffer. Consider a situation of packet blockage as shown in Fig. 6. For the CWVC scheme, the upstream router continues sending new packets to the VC and it continues allocating more buffers and can occupy all the free area of DAMQ buffer as illustrated in Fig. 6a. In the case of PBVC, the new packets stay in the upstream routers until a VC becomes empty. In fact, more free space is available for the other unblocked VCs as shown in Fig. 6b. Consequently, PBVC performance and buffer utilization is better under such scenarios. We expect that the packets reach their destinations in a sequential order. In the traditional CWVC mechanism, if any packet of a series is faced with HoL blocking, the following packets of the series can travel via a free VC and reach the destination before the blocked packet. However, our PBVC approach avoids HoL blocking and each packet of a series will reach the destination in order. H 3 B 2 T 1 H 6 T 5 B 5 B 5 H 5 T 4 B 4 B 4 H 4 T 0 B 0 B 0 H 0 Physical Channel (a) first case VC3 VC2 VC1 VC0 T 3 B 3 B 3 H 3 T 2 B 2 B 2 T 1 B 1 B 1 H 1 P 4 P 4 P 5 P 4 P 4 P 3 P 6 P 5 P 2 P 5 P 5 P 0 P 1 P 0 P 0 P 0 Physical Channel P 3 P 3 P 2 P 3 P 2 P 2 P 3 P 1 P 1 P 1 P 1 (b) 2nd case Free Header flit: H Body flit: B Tail Flit: T Fig. 3. Two different cases in a DAMQ buffer for 4 VCs/channel and 4 flits/packet.

4 Masoud Oveis Gharan and Gul N. Khan / Procedia Computer Science 34 ( 2014 ) VC3 VC2 VC1 VC0 H 3 T 2 B 2 B 2 H 2 T 1 B 1 H 5 T 4 B 4 B 4 H 4 T 0 B 0 B 0 H 0 Free H 3 T 2 B 2 B 2 H 2 T 1 B 1 T 0 B 0 B 0 H 0 Physical channel P 4 P 4 P 5 P 4 P 4 P 3 P 2 P 2 P 2 P 2 P 1 P 0 P 1 P 0 P 0 P 0 P 3 P 2 P 2 P 2 P 2 P 1 P 0 P 1 P 0 P 0 P 0 Fig. 4. CWVC: Packet-0 blockage causes HoL blocking for P4 & P5. Fig. 5. PBVC: Packet-0 blocked but no HoL blocking. H 3 B 2 T 1 H 6 T 5 B 5 B 5 H 5 T 4 B 4 B 4 H 4 T 0 B 0 B 0 H 0 T 3 B 3 B 3 H 3 T 2 B 2 B 2 T 1 T 0 B 0 B 0 H 0 Free P 4 P 4 P 5 P 4 P 4 P 3 P 5 P 5 P 5 P 2 P 6 P 0 P 1 P 0 P 0 P 0 (a) CWVC The buffer utilization of PBVC can be a bit lower than the CWVC approach. However, due to the adaptive nature of buffer, its utilization can be compensated. Moreover, in NoCs where the communication involves a large number of HoL blockings, PBVC mechanism will show better buffer utilization. 3. PBVC router structure Fig. 6. HoL blocking due to a blocked Packet-0. P 3 P 3 P 2 P 3 P 2 P 2 P 3 P 0 P 1 P 0 P 0 P 0 (b) PBVC A DAMQ based CWVC 5x5 router architecture for mesh topology is given in Fig. 7. Generally, the microarchitecture of a router consists of input and/or output ports, an arbiter and a crossbar switch. Each input or output port can utilize VCs to control flow and to enable the sharing of a NoC communication channel. We assume in this paper that a router has VCs at the input-ports level. The NoC router organization and architecture is similar for both CWVC and PBVC approaches. The only difference is the organization of their input-ports. The micro-architecture of a typical CWVC input-port is illustrated in Fig. 8. It contains an SRAM, five arrival/departure tables and other logic and control circuits. The SRAM module serves as the channel buffer. The word size of SRAM is equal to the packetflit size. The data pointed by the Read-Address will always appeared at the SRAM output. When credit-in is active, the data is stored in the SRAM slot pointed by the Write-Address. Five tables are used to implement a dynamic Link-List based CWVC mechanism. The table keeps the records of the occupied VCs. The Header-List table keeps the addresses of channel buffer (SRAM) that point to the header flits of VCs. The Tail-List table keeps the addresses of SRAM buffers that point to the tail flits of VCs. The Link-List table keeps the address of next slot of each buffer slot in the SRAM. In fact, it links the address of flits of each VC in a FIFO manner. The Slot-State table keeps the records of occupied slots in the SRAM. In a CWVC router with reserved space for each VC, a credit signal is sent to upstream routers indicating the in the down-stream router. When the capacity of a VC is full, the credit signal will change to close the VC. Assume that one slot is reserved per VC, the capacity of each VC is dynamic and it varies from one to M buffer slots dynamically, where: M = SRAM slots Total VCs 1 Assuming a total of 16 buffer slots and four virtual channels, the dynamic capacity of each VC varies from 1 slot to 13 slots. In the dynamic buffer implementation, the credits are regulated only by two conditions as given below: if {(Current VC is Empty) OR (Free Slots < Free VCs)} then credit = ON In other words, a VC is open if it is empty or at least one slot is reserved for each free VC. These conditions guarantee that at least one slot is dedicated to each VC. Subsequently, the rest of the slots are dynamically used for all the VCs. It will also remove any starvation and protocol-level deadlocks in the NoC communication. A little-bit of additional hardware is required to implement the PBVC approach in each router. In fact, the coding of two "if statements" is required to open and close each VC as illustrated in Fig. 9a. The first "if statement" is used to close each VC when the arriving flit is a tail flit. The second "if statement" is used to open each VC when the exiting flit is a tail flit. To open and close VC, we use a single bit register, PBVC-Blk. It is set in the case of open and reset otherwise. PBVC-Blk output is inverted and ANDed with the credit out signal of the related VC. Therefore, when PBVC-Blk is set, the VC will be closed as shown in Fig. 9b. PBVC and CWVC routers are also implemented for two platforms: SYNOPSYS and FPGA. The SYNOPSYS

5 556 Masoud Oveis Gharan and Gul N. Khan / Procedia Computer Science 34 ( 2014 ) platform uses 0.25μ technology to determine the power and area overhead for implementing PBVC approach. Both routers use a dual-ported SRAM memory for data buffering. The SRAMs have 16 slots with 16 bits in each slot. The power and chip area values of both routers are listed in Table 1. The PBVC router requires an extra 0.006% chip area and consumes 0.24% more power than CWVC. We have also investigated the amount of extra hardware for PBVC router when implemented in the FPGA platform. The PBVC router requires 0.4% more combinational logic and 0.9% more registers as compared to CWVC router. VC-ID VC-ID-local Arrival/Departure Tables Write-pointer Credit-in Grant Read-pointer Clk Link-list slot state VC-State Slot-State slot state Tail-List Header-List Header-List Read-pointer VC-ID-local VC-block Request Buffer-full Credit in Link N Link E Link S Link W Link L Router Input Port N Input Port E Input Port S Input Port W Input Port L Req N In N out N In E out E Crossbar In S out S In W In L grant N Arbiter (VC and SW Allocator) VC_block Aselect VC_full config out W out L Buffer_full Credit_out Link N Link E Link S LinkW Link L Buffer-full Data_in Credit_in Slot-State slot state VC-State Read-pointer Write-pointer De coder 4 3 Read-Address SRAM Write-Address 2 Data Out Data In Grant << Flit- info Out Reg. In Req Data_out VC_ID VC_ID Fig. 7. A 5 5 router architecture for 2D-mesh NoCs. Fig. 8. Micro-architecture of a CWVC router input-port. 1 Credit-in grant tail 0 tail PBVC-Blk In Out PBVC-Blk One-bit Register >... PBVC-Blk Credit-out (a) logic for if statements (b) logic for Credit-out Fig. 9. Extra hardware per VC for PBVC router input-port. Table 1. PBVC and CWVC router design. ASIC FPGA Router Area Power Combinational Registers Type (μm 2 ) (mw) Logic (elements) (bits) CWVC PBVC Extra Hardware 0.006% 0.24% 0.4% 0.9% 4. Experimental results The conventional DAMQ organization with a slot reserved per VC is employed in the CWVC mechanism and the same terminology is used. We simulated two different traffic patterns such as Random and HoL specific traffic. For Random traffic, all the destination cores are chosen randomly. HoL specific traffic creates a situation of higher HoL blocking conditions. By evaluating the results of these traffic patterns, the efficiency of our PBVC approach is demonstrated in this section. We setup our simulator for PBVC and CWVC modes in a DAMQ Link-List based router and input-port micro-architecture. We change the number of traffic packets and VCs to measure the

6 Masoud Oveis Gharan and Gul N. Khan / Procedia Computer Science 34 ( 2014 ) performance in terms of throughput and latency. The NoC topology selected is a 4 4 mesh with XY routing methodology. The communication of packets is in the form of wormhole switching where the channel width is equal to the flit size (32 bits). Each packet is made of 16 flits and the buffer depth for each input-port is 16 slots of a packet-flit each. Each flit is sent or received by a source core/router in two clock cycles. We assume that the link delays between routers are negligible as compared to the router delay. The performance of PBVC is compared with the CWVC approach during the following experiments and results are presented in Figures 10, 11, 12 and 13. In the case of Random traffic, all the sources, destinations and routers are clocked at the same rate (e.g. 1nsec). In the second experiment, the HoL specific traffic pattern is employed, where all the source cores send their first packet to one destination. The destination is set to be two times slower than the other destination cores. After sending the first packet, the rest of the packets of all the sources are sent randomly to all the destination cores. This condition increases the HoL blocking especially when the second or later packets are transferred. Our simulator is coded in Verilog and simulation is done by using the ModelSim for the Altera FPGA platform. In these experiments, the throughput is measured by the rate of packets received to the maximum number of packets (ideally) sent at a specific time. The latency is measured in terms of time that a specific number of packets are sent and received by the NoCs. 70% 60% Throghput (rate of receiving) 50% 40% 30% 20% 10% Average Latency ( ns ) % Time (ns) Fig. 10. Throughput for random traffic Packets Sent Fig. 11. Average latency for random traffic. Throghput (rate of receiving) 45% 40% 35% 30% 25% 20% 15% 10% Average Latency ( ns ) % % Time (ns) Packets Sent Fig. 12. Throughput for HoL specific traffic. Fig. 13. Average latency for HoL specific traffic.

7 558 Masoud Oveis Gharan and Gul N. Khan / Procedia Computer Science 34 ( 2014 ) Figures 10 and 11 show the throughput and latency results in the case of Random Traffic. In the beginning of simulation (around 132 nsec), the performance of PBVC is much higher than that of CWVC, and as the time passes this advantage diminishes. This is due the fact that in the beginning of simulation, the traffic is not crowded, and when the HoL blocking occurs in a channel, the incoming packet can move out of the channel. This situation will improve the performance of PBVC approach. In the case of four virtual channel (i.e. VC4 case in Figures 10 and 11), the throughput of PBVC is 2.6% higher and the latency is 8.2% lower than those of CWVC. In the second part of the experiment, both models are evaluated in a high contention environment. Figures 12 and 13 show the throughput and latency results for the HoL specific traffic where two scenarios are investigated. In the first scenario, the first packets of all the sources are intended for one destination, and afterward the packets travel to all the destinations randomly. In the second condition, the destination is two times slower than the other destinations. In this traffic pattern, the PBVC performance improvement is much better than CWVC. For 1024 packets, the average latency is 40% less than CWVC, and the average throughput is 23% higher than CWVC for 2048 nsecs. This is due to a lot of HoL blocking occurring in the beginning of simulation. As the time passes the occurrence of HoL blockings will reduce, and the throughputs of two methods are going to be close to each other. Another important point is that as the number of VCs is reduced to two, the advantage of PBVC also diminishes. This is due the fact that when the HoL blocking occurs in PBVC and there are free VCs, the new packet passes through these free VCs that improves the performance. 5. Conclusions The architecture and structure of packet-based virtual channel (PBVC) approach is presented. PBVC buffer has its root in dynamically allocated multi-queue (i.e. DAMQ) buffers. We conclude that PBVC is able to completely remove HoL blockings in NoCs. To verify our claims PBVC and the conventional wormhole VCs (CWVC) are implemented using DAMQ buffers. In the experiments, two traffic patterns i.e. random and HoL specific traffic are applied to PBVC and CWVC based NoCs. Throughput and latency for PBVC based NoC are compared with those of CWVC based NoC. The performance results are obtained in varying number of VCs and traffic conditions. The PBVC results are better on average as compared to the CWVC. In the case of HoL specific traffic, the average latency and throughput improve for our PBVC approach as compared to traditional CWVC. References 1. Yoo HJ, Lee K, Kim JK, Network-on-Chip based SoC. In: Low-Power NoC for High-Performance SoC Design. Boca Raton: CRC Press; p Nicopoulos CA, Dongkook P, Jongman K, Vijaykrishnan N, Yousif MS, Das CR. ViChaR: A dynamic virtual channel regulator for Network-on-Chip routers. In Proc. 39th Annual IEEE/ACM International Symposium on Microarchitecture, p Dally WJ. Virtual-channel flow control. IEEE Transactions on Parallel and Distributed Systems, 1992; 3: Frazier GL, Tamir Y. The design and implementation of a multiqueue buffer for VLSI communication switches. In Proc. IEEE International Conference on Computer Design: VLSI in Computers and Processors, p Liu J, Delgado-Frias JG. DAMQ self-compacting buffer schemes for systems with Network-on-Chip. In Proc. International Conference on Computer Design, p Tamir Y, Frazier GL. Dynamically-Allocated Multi-Queue Buffers for VLSI Communication Switches. IEEE Transactions on Computers, 1992; 41: Park J, O Krafka BW, Vassiliadis S, Delgado-Frias J. Design and evaluation of a DAMQ multiprocessor network with self-compacting buffers. In Proc. Supercomputing, p Benini L, Micheli GD. Register designs for queuing buffer. In: Networks on Chips: Technology And Tools. San fransisco: Morgan Kaufmann Publishers; Choi Y, Pinkston TM. Evaluation of queue designs for true fully adaptive routers. Journal of Parallel and Distributed Computing, 2004; 64: Xu Y, Zhao B, Zhang Y, Yang J. Simple virtual channel allocation for high throughput and high frequency on-chip routers. In Proc. International Symposium on High Performance Computer Architecture, p Evripidou M, Nicopoulos C, Soteriou V, Kim J. Virtualizing virtual vhannels for increased Network-on-Chip robustness and upgradeability. In Proc. IEEE Computer Society Annual Symposium on VLSI, p Zhang H, Wang K, Dai Y, Liu L. A Multi-VC Dynamically Shared Buffer with Prefetch for Network on Chip. In Proc. IEEE 7th International Conference on Networking, Architecture and Storage, p Oveis-Gharan M, Khan GN. A novel virtual channel implementation technique for multi-core on-chip communication. In Proc. IEEE Symp. Computer Architecture and High Performance Computing (WAMCA 12), p

Evaluation of NOC Using Tightly Coupled Router Architecture

Evaluation of NOC Using Tightly Coupled Router Architecture IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 18, Issue 1, Ver. II (Jan Feb. 2016), PP 01-05 www.iosrjournals.org Evaluation of NOC Using Tightly Coupled Router

More information

A dynamic distribution of the input buffer for on-chip routers Fan Lifang, Zhang Xingming, Chen Ting

A dynamic distribution of the input buffer for on-chip routers Fan Lifang, Zhang Xingming, Chen Ting Advances in Computer Science Research, volume 5 4th International Conference on Electrical & Electronics Engineering and Computer Science (ICEEECS 26) A dynamic distribution of the input buffer for on-chip

More information

HARDWARE IMPLEMENTATION OF PIPELINE BASED ROUTER DESIGN FOR ON- CHIP NETWORK

HARDWARE IMPLEMENTATION OF PIPELINE BASED ROUTER DESIGN FOR ON- CHIP NETWORK DOI: 10.21917/ijct.2012.0092 HARDWARE IMPLEMENTATION OF PIPELINE BASED ROUTER DESIGN FOR ON- CHIP NETWORK U. Saravanakumar 1, R. Rangarajan 2 and K. Rajasekar 3 1,3 Department of Electronics and Communication

More information

Deadlock-free XY-YX router for on-chip interconnection network

Deadlock-free XY-YX router for on-chip interconnection network LETTER IEICE Electronics Express, Vol.10, No.20, 1 5 Deadlock-free XY-YX router for on-chip interconnection network Yeong Seob Jeong and Seung Eun Lee a) Dept of Electronic Engineering Seoul National Univ

More information

SERVICE ORIENTED REAL-TIME BUFFER MANAGEMENT FOR QOS ON ADAPTIVE ROUTERS

SERVICE ORIENTED REAL-TIME BUFFER MANAGEMENT FOR QOS ON ADAPTIVE ROUTERS SERVICE ORIENTED REAL-TIME BUFFER MANAGEMENT FOR QOS ON ADAPTIVE ROUTERS 1 SARAVANAN.K, 2 R.M.SURESH 1 Asst.Professor,Department of Information Technology, Velammal Engineering College, Chennai, Tamilnadu,

More information

Lecture 23: Router Design

Lecture 23: Router Design Lecture 23: Router Design Papers: A Gracefully Degrading and Energy-Efficient Modular Router Architecture for On-Chip Networks, ISCA 06, Penn-State ViChaR: A Dynamic Virtual Channel Regulator for Network-on-Chip

More information

Dynamic Packet Fragmentation for Increased Virtual Channel Utilization in On-Chip Routers

Dynamic Packet Fragmentation for Increased Virtual Channel Utilization in On-Chip Routers Dynamic Packet Fragmentation for Increased Virtual Channel Utilization in On-Chip Routers Young Hoon Kang, Taek-Jun Kwon, and Jeff Draper {youngkan, tjkwon, draper}@isi.edu University of Southern California

More information

A VERIOG-HDL IMPLEMENTATION OF VIRTUAL CHANNELS IN A NETWORK-ON-CHIP ROUTER. A Thesis SUNGHO PARK

A VERIOG-HDL IMPLEMENTATION OF VIRTUAL CHANNELS IN A NETWORK-ON-CHIP ROUTER. A Thesis SUNGHO PARK A VERIOG-HDL IMPLEMENTATION OF VIRTUAL CHANNELS IN A NETWORK-ON-CHIP ROUTER A Thesis by SUNGHO PARK Submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements

More information

Dynamic Buffer Organization Methods for Interconnection Network Switches Amit Kumar Gupta, Francois Labonte, Paul Wang Lee, Alex Solomatnikov

Dynamic Buffer Organization Methods for Interconnection Network Switches Amit Kumar Gupta, Francois Labonte, Paul Wang Lee, Alex Solomatnikov I Dynamic Buffer Organization Methods for Interconnection Network Switches Amit Kumar Gupta, Francois Labonte, Paul Wang Lee, Alex Solomatnikov I. INTRODUCTION nterconnection networks originated from the

More information

Design of Adaptive Communication Channel Buffers for Low-Power Area- Efficient Network-on. on-chip Architecture

Design of Adaptive Communication Channel Buffers for Low-Power Area- Efficient Network-on. on-chip Architecture Design of Adaptive Communication Channel Buffers for Low-Power Area- Efficient Network-on on-chip Architecture Avinash Kodi, Ashwini Sarathy * and Ahmed Louri * Department of Electrical Engineering and

More information

Routing Algorithms, Process Model for Quality of Services (QoS) and Architectures for Two-Dimensional 4 4 Mesh Topology Network-on-Chip

Routing Algorithms, Process Model for Quality of Services (QoS) and Architectures for Two-Dimensional 4 4 Mesh Topology Network-on-Chip Routing Algorithms, Process Model for Quality of Services (QoS) and Architectures for Two-Dimensional 4 4 Mesh Topology Network-on-Chip Nauman Jalil, Adnan Qureshi, Furqan Khan, and Sohaib Ayyaz Qazi Abstract

More information

Switching/Flow Control Overview. Interconnection Networks: Flow Control and Microarchitecture. Packets. Switching.

Switching/Flow Control Overview. Interconnection Networks: Flow Control and Microarchitecture. Packets. Switching. Switching/Flow Control Overview Interconnection Networks: Flow Control and Microarchitecture Topology: determines connectivity of network Routing: determines paths through network Flow Control: determine

More information

Lecture 3: Flow-Control

Lecture 3: Flow-Control High-Performance On-Chip Interconnects for Emerging SoCs http://tusharkrishna.ece.gatech.edu/teaching/nocs_acaces17/ ACACES Summer School 2017 Lecture 3: Flow-Control Tushar Krishna Assistant Professor

More information

Design and Implementation of a Packet Switched Dynamic Buffer Resize Router on FPGA Vivek Raj.K 1 Prasad Kumar 2 Shashi Raj.K 3

Design and Implementation of a Packet Switched Dynamic Buffer Resize Router on FPGA Vivek Raj.K 1 Prasad Kumar 2 Shashi Raj.K 3 IJSRD - International Journal for Scientific Research & Development Vol. 2, Issue 02, 2014 ISSN (online): 2321-0613 Design and Implementation of a Packet Switched Dynamic Buffer Resize Router on FPGA Vivek

More information

A Dynamic NOC Arbitration Technique using Combination of VCT and XY Routing

A Dynamic NOC Arbitration Technique using Combination of VCT and XY Routing 727 A Dynamic NOC Arbitration Technique using Combination of VCT and XY Routing 1 Bharati B. Sayankar, 2 Pankaj Agrawal 1 Electronics Department, Rashtrasant Tukdoji Maharaj Nagpur University, G.H. Raisoni

More information

CONGESTION AWARE ADAPTIVE ROUTING FOR NETWORK-ON-CHIP COMMUNICATION. Stephen Chui Bachelor of Engineering Ryerson University, 2012.

CONGESTION AWARE ADAPTIVE ROUTING FOR NETWORK-ON-CHIP COMMUNICATION. Stephen Chui Bachelor of Engineering Ryerson University, 2012. CONGESTION AWARE ADAPTIVE ROUTING FOR NETWORK-ON-CHIP COMMUNICATION by Stephen Chui Bachelor of Engineering Ryerson University, 2012 A thesis presented to Ryerson University in partial fulfillment of the

More information

A DAMQ SHARED BUFFER SCHEME FOR NETWORK-ON-CHIP

A DAMQ SHARED BUFFER SCHEME FOR NETWORK-ON-CHIP A DAMQ HARED BUFFER CHEME FOR ETWORK-O-CHIP Jin Liu and José G. Delgado-Frias chool of Electrical Engineering and Computer cience Washington tate University Pullman, WA 99164-2752 {jinliu, jdelgado}@eecs.wsu.edu

More information

DESIGN A APPLICATION OF NETWORK-ON-CHIP USING 8-PORT ROUTER

DESIGN A APPLICATION OF NETWORK-ON-CHIP USING 8-PORT ROUTER G MAHESH BABU, et al, Volume 2, Issue 7, PP:, SEPTEMBER 2014. DESIGN A APPLICATION OF NETWORK-ON-CHIP USING 8-PORT ROUTER G.Mahesh Babu 1*, Prof. Ch.Srinivasa Kumar 2* 1. II. M.Tech (VLSI), Dept of ECE,

More information

Networks-on-Chip Router: Configuration and Implementation

Networks-on-Chip Router: Configuration and Implementation Networks-on-Chip : Configuration and Implementation Wen-Chung Tsai, Kuo-Chih Chu * 2 1 Department of Information and Communication Engineering, Chaoyang University of Technology, Taichung 413, Taiwan,

More information

OASIS Network-on-Chip Prototyping on FPGA

OASIS Network-on-Chip Prototyping on FPGA Master thesis of the University of Aizu, Feb. 20, 2012 OASIS Network-on-Chip Prototyping on FPGA m5141120, Kenichi Mori Supervised by Prof. Ben Abdallah Abderazek Adaptive Systems Laboratory, Master of

More information

Abstract. Paper organization

Abstract. Paper organization Allocation Approaches for Virtual Channel Flow Control Neeraj Parik, Ozen Deniz, Paul Kim, Zheng Li Department of Electrical Engineering Stanford University, CA Abstract s are one of the major resources

More information

Study of Dynamically-Allocated Multi-Queue Buffers for NoC Routers

Study of Dynamically-Allocated Multi-Queue Buffers for NoC Routers Study of Dynamically-llocated Multi-Queue Buffers for NoC Routers Yung-Chou Tsai Department of lectrical ngineering National Tsing Hua University Hsinchu, Taiwan d923935@oz.nthu.edu.tw Yarsun Hsu Department

More information

Ultra-Fast NoC Emulation on a Single FPGA

Ultra-Fast NoC Emulation on a Single FPGA The 25 th International Conference on Field-Programmable Logic and Applications (FPL 2015) September 3, 2015 Ultra-Fast NoC Emulation on a Single FPGA Thiem Van Chu, Shimpei Sato, and Kenji Kise Tokyo

More information

OASIS NoC Architecture Design in Verilog HDL Technical Report: TR OASIS

OASIS NoC Architecture Design in Verilog HDL Technical Report: TR OASIS OASIS NoC Architecture Design in Verilog HDL Technical Report: TR-062010-OASIS Written by Kenichi Mori ASL-Ben Abdallah Group Graduate School of Computer Science and Engineering The University of Aizu

More information

Design and Implementation of Low Complexity Router for 2D Mesh Topology using FPGA

Design and Implementation of Low Complexity Router for 2D Mesh Topology using FPGA Design and Implementation of Low Complexity Router for 2D Mesh Topology using FPGA Maheswari Murali * and Seetharaman Gopalakrishnan # * Assistant professor, J. J. College of Engineering and Technology,

More information

Lecture 13: Interconnection Networks. Topics: lots of background, recent innovations for power and performance

Lecture 13: Interconnection Networks. Topics: lots of background, recent innovations for power and performance Lecture 13: Interconnection Networks Topics: lots of background, recent innovations for power and performance 1 Interconnection Networks Recall: fully connected network, arrays/rings, meshes/tori, trees,

More information

Design and Implementation of Buffer Loan Algorithm for BiNoC Router

Design and Implementation of Buffer Loan Algorithm for BiNoC Router Design and Implementation of Buffer Loan Algorithm for BiNoC Router Deepa S Dev Student, Department of Electronics and Communication, Sree Buddha College of Engineering, University of Kerala, Kerala, India

More information

ISSN Vol.03,Issue.06, August-2015, Pages:

ISSN Vol.03,Issue.06, August-2015, Pages: WWW.IJITECH.ORG ISSN 2321-8665 Vol.03,Issue.06, August-2015, Pages:0920-0924 Performance and Evaluation of Loopback Virtual Channel Router with Heterogeneous Router for On Chip Network M. VINAY KRISHNA

More information

BARP-A Dynamic Routing Protocol for Balanced Distribution of Traffic in NoCs

BARP-A Dynamic Routing Protocol for Balanced Distribution of Traffic in NoCs -A Dynamic Routing Protocol for Balanced Distribution of Traffic in NoCs Pejman Lotfi-Kamran, Masoud Daneshtalab *, Caro Lucas, and Zainalabedin Navabi School of Electrical and Computer Engineering, The

More information

Design of Adaptive Communication Channel Buffers for Low-Power Area-Efficient Network-on-Chip Architecture

Design of Adaptive Communication Channel Buffers for Low-Power Area-Efficient Network-on-Chip Architecture Design of Adaptive Communication Channel Buffers for Low- Area-Efficient etwork-on-chip Architecture Avinash Kodi, Ashwini Sarathy and Ahmed Louri Dept. of Electrical Engineering and Computer Science,

More information

Lecture 7: Flow Control - I

Lecture 7: Flow Control - I ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 7: Flow Control - I Tushar Krishna Assistant Professor School of Electrical

More information

Routing Algorithms. Review

Routing Algorithms. Review Routing Algorithms Today s topics: Deterministic, Oblivious Adaptive, & Adaptive models Problems: efficiency livelock deadlock 1 CS6810 Review Network properties are a combination topology topology dependent

More information

Module 17: "Interconnection Networks" Lecture 37: "Introduction to Routers" Interconnection Networks. Fundamentals. Latency and bandwidth

Module 17: Interconnection Networks Lecture 37: Introduction to Routers Interconnection Networks. Fundamentals. Latency and bandwidth Interconnection Networks Fundamentals Latency and bandwidth Router architecture Coherence protocol and routing [From Chapter 10 of Culler, Singh, Gupta] file:///e /parallel_com_arch/lecture37/37_1.htm[6/13/2012

More information

Flow Control can be viewed as a problem of

Flow Control can be viewed as a problem of NOC Flow Control 1 Flow Control Flow Control determines how the resources of a network, such as channel bandwidth and buffer capacity are allocated to packets traversing a network Goal is to use resources

More information

EE482, Spring 1999 Research Paper Report. Deadlock Recovery Schemes

EE482, Spring 1999 Research Paper Report. Deadlock Recovery Schemes EE482, Spring 1999 Research Paper Report Deadlock Recovery Schemes Jinyung Namkoong Mohammed Haque Nuwan Jayasena Manman Ren May 18, 1999 Introduction The selected papers address the problems of deadlock,

More information

ISSN Vol.03, Issue.02, March-2015, Pages:

ISSN Vol.03, Issue.02, March-2015, Pages: ISSN 2322-0929 Vol.03, Issue.02, March-2015, Pages:0122-0126 www.ijvdcs.org Design and Simulation Five Port Router using Verilog HDL CH.KARTHIK 1, R.S.UMA SUSEELA 2 1 PG Scholar, Dept of VLSI, Gokaraju

More information

Design of Reconfigurable Router for NOC Applications Using Buffer Resizing Techniques

Design of Reconfigurable Router for NOC Applications Using Buffer Resizing Techniques Design of Reconfigurable Router for NOC Applications Using Buffer Resizing Techniques Nandini Sultanpure M.Tech (VLSI Design and Embedded System), Dept of Electronics and Communication Engineering, Lingaraj

More information

Design of Synchronous NoC Router for System-on-Chip Communication and Implement in FPGA using VHDL

Design of Synchronous NoC Router for System-on-Chip Communication and Implement in FPGA using VHDL Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology ISSN 2320 088X IJCSMC, Vol. 2, Issue.

More information

A Reconfigurable Crossbar Switch with Adaptive Bandwidth Control for Networks-on

A Reconfigurable Crossbar Switch with Adaptive Bandwidth Control for Networks-on A Reconfigurable Crossbar Switch with Adaptive Bandwidth Control for Networks-on on-chip Donghyun Kim, Kangmin Lee, Se-joong Lee and Hoi-Jun Yoo Semiconductor System Laboratory, Dept. of EECS, Korea Advanced

More information

Packet Switch Architecture

Packet Switch Architecture Packet Switch Architecture 3. Output Queueing Architectures 4. Input Queueing Architectures 5. Switching Fabrics 6. Flow and Congestion Control in Sw. Fabrics 7. Output Scheduling for QoS Guarantees 8.

More information

Packet Switch Architecture

Packet Switch Architecture Packet Switch Architecture 3. Output Queueing Architectures 4. Input Queueing Architectures 5. Switching Fabrics 6. Flow and Congestion Control in Sw. Fabrics 7. Output Scheduling for QoS Guarantees 8.

More information

Architecture and Design of Efficient 3D Network-on-Chip for Custom Multi-Core SoC

Architecture and Design of Efficient 3D Network-on-Chip for Custom Multi-Core SoC BWCCA 2010 Fukuoka, Japan November 4-6 2010 Architecture and Design of Efficient 3D Network-on-Chip for Custom Multi-Core SoC Akram Ben Ahmed, Abderazek Ben Abdallah, Kenichi Kuroda The University of Aizu

More information

Lecture 25: Interconnection Networks, Disks. Topics: flow control, router microarchitecture, RAID

Lecture 25: Interconnection Networks, Disks. Topics: flow control, router microarchitecture, RAID Lecture 25: Interconnection Networks, Disks Topics: flow control, router microarchitecture, RAID 1 Virtual Channel Flow Control Each switch has multiple virtual channels per phys. channel Each virtual

More information

FPGA BASED ADAPTIVE RESOURCE EFFICIENT ERROR CONTROL METHODOLOGY FOR NETWORK ON CHIP

FPGA BASED ADAPTIVE RESOURCE EFFICIENT ERROR CONTROL METHODOLOGY FOR NETWORK ON CHIP FPGA BASED ADAPTIVE RESOURCE EFFICIENT ERROR CONTROL METHODOLOGY FOR NETWORK ON CHIP 1 M.DEIVAKANI, 2 D.SHANTHI 1 Associate Professor, Department of Electronics and Communication Engineering PSNA College

More information

DLABS: a Dual-Lane Buffer-Sharing Router Architecture for Networks on Chip

DLABS: a Dual-Lane Buffer-Sharing Router Architecture for Networks on Chip DLABS: a Dual-Lane Buffer-Sharing Router Architecture for Networks on Chip Anh T. Tran and Bevan M. Baas Department of Electrical and Computer Engineering University of California - Davis, USA {anhtr,

More information

Basic Switch Organization

Basic Switch Organization NOC Routing 1 Basic Switch Organization 2 Basic Switch Organization Link Controller Used for coordinating the flow of messages across the physical link of two adjacent switches 3 Basic Switch Organization

More information

Interconnection Networks: Flow Control. Prof. Natalie Enright Jerger

Interconnection Networks: Flow Control. Prof. Natalie Enright Jerger Interconnection Networks: Flow Control Prof. Natalie Enright Jerger Switching/Flow Control Overview Topology: determines connectivity of network Routing: determines paths through network Flow Control:

More information

OpenSMART: Single-cycle Multi-hop NoC Generator in BSV and Chisel

OpenSMART: Single-cycle Multi-hop NoC Generator in BSV and Chisel OpenSMART: Single-cycle Multi-hop NoC Generator in BSV and Chisel Hyoukjun Kwon and Tushar Krishna Georgia Institute of Technology Synergy Lab (http://synergy.ece.gatech.edu) hyoukjun@gatech.edu April

More information

FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC)

FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC) FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC) D.Udhayasheela, pg student [Communication system],dept.ofece,,as-salam engineering and technology, N.MageshwariAssistant Professor

More information

NoC Test-Chip Project: Working Document

NoC Test-Chip Project: Working Document NoC Test-Chip Project: Working Document Michele Petracca, Omar Ahmad, Young Jin Yoon, Frank Zovko, Luca Carloni and Kenneth Shepard I. INTRODUCTION This document describes the low-power high-performance

More information

High Performance Interconnect and NoC Router Design

High Performance Interconnect and NoC Router Design High Performance Interconnect and NoC Router Design Brinda M M.E Student, Dept. of ECE (VLSI Design) K.Ramakrishnan College of Technology Samayapuram, Trichy 621 112 brinda18th@gmail.com Devipoonguzhali

More information

WITH the development of the semiconductor technology,

WITH the development of the semiconductor technology, Dual-Link Hierarchical Cluster-Based Interconnect Architecture for 3D Network on Chip Guang Sun, Yong Li, Yuanyuan Zhang, Shijun Lin, Li Su, Depeng Jin and Lieguang zeng Abstract Network on Chip (NoC)

More information

TECHNOLOGY scaling is expected to continue into the deep

TECHNOLOGY scaling is expected to continue into the deep IEEE TRANSACTIONS ON COMPUTERS, VOL. 57, NO. 9, SEPTEMBER 2008 1169 Adaptive Channel Buffers in On-Chip Interconnection Networks A Power and Performance Analysis Avinash Karanth Kodi, Member, IEEE, Ashwini

More information

IJMTES International Journal of Modern Trends in Engineering and Science ISSN:

IJMTES International Journal of Modern Trends in Engineering and Science ISSN: Low-Latency Wormhole Routers for On-Chip Networks N.Rajesh Kumar 1, A.Dhivya 2, K.Kalaivani 3, R.Sindhu 4 1 (Assistant Professor,Pollachi Institute of engineering and Technology,Poosaripatti,rajeshkumar.n1@gmail.com)

More information

Real-Time Mixed-Criticality Wormhole Networks

Real-Time Mixed-Criticality Wormhole Networks eal-time Mixed-Criticality Wormhole Networks Leandro Soares Indrusiak eal-time Systems Group Department of Computer Science University of York United Kingdom eal-time Systems Group 1 Outline Wormhole Networks

More information

Power and Performance Efficient Partial Circuits in Packet-Switched Networks-on-Chip

Power and Performance Efficient Partial Circuits in Packet-Switched Networks-on-Chip 2013 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing Power and Performance Efficient Partial Circuits in Packet-Switched Networks-on-Chip Nasibeh Teimouri

More information

Power and Area Efficient NOC Router Through Utilization of Idle Buffers

Power and Area Efficient NOC Router Through Utilization of Idle Buffers Power and Area Efficient NOC Router Through Utilization of Idle Buffers Mr. Kamalkumar S. Kashyap 1, Prof. Bharati B. Sayankar 2, Dr. Pankaj Agrawal 3 1 Department of Electronics Engineering, GRRCE Nagpur

More information

Demand Based Routing in Network-on-Chip(NoC)

Demand Based Routing in Network-on-Chip(NoC) Demand Based Routing in Network-on-Chip(NoC) Kullai Reddy Meka and Jatindra Kumar Deka Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, Guwahati, India Abstract

More information

LOW POWER REDUCED ROUTER NOC ARCHITECTURE DESIGN WITH CLASSICAL BUS BASED SYSTEM

LOW POWER REDUCED ROUTER NOC ARCHITECTURE DESIGN WITH CLASSICAL BUS BASED SYSTEM Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 5, May 2015, pg.705

More information

Lecture: Interconnection Networks

Lecture: Interconnection Networks Lecture: Interconnection Networks Topics: Router microarchitecture, topologies Final exam next Tuesday: same rules as the first midterm 1 Packets/Flits A message is broken into multiple packets (each packet

More information

Design of Router Architecture Based on Wormhole Switching Mode for NoC

Design of Router Architecture Based on Wormhole Switching Mode for NoC International Journal of Scientific & Engineering Research Volume 3, Issue 3, March-2012 1 Design of Router Architecture Based on Wormhole Switching Mode for NoC L.Rooban, S.Dhananjeyan Abstract - Network

More information

Lecture 12: Interconnection Networks. Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E)

Lecture 12: Interconnection Networks. Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E) Lecture 12: Interconnection Networks Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E) 1 Topologies Internet topologies are not very regular they grew

More information

PAPER Two-Level FIFO Buffer Design for Routers in On-Chip Interconnection Networks

PAPER Two-Level FIFO Buffer Design for Routers in On-Chip Interconnection Networks 2412 PAPER Two-Level FIFO Buffer Design for Routers in On-Chip Interconnection Networks Po-Tsang HUANG a), Student Member and Wei HWANG, Nonmember SUMMARY The on-chip interconnection network (OCIN) is

More information

Noc Evolution and Performance Optimization by Addition of Long Range Links: A Survey. By Naveen Choudhary & Vaishali Maheshwari

Noc Evolution and Performance Optimization by Addition of Long Range Links: A Survey. By Naveen Choudhary & Vaishali Maheshwari Global Journal of Computer Science and Technology: E Network, Web & Security Volume 15 Issue 6 Version 1.0 Year 2015 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals

More information

An Efficient Design of Serial Communication Module UART Using Verilog HDL

An Efficient Design of Serial Communication Module UART Using Verilog HDL An Efficient Design of Serial Communication Module UART Using Verilog HDL Pogaku Indira M.Tech in VLSI and Embedded Systems, Siddhartha Institute of Engineering and Technology. Dr.D.Subba Rao, M.Tech,

More information

Available online at ScienceDirect. Procedia Computer Science 89 (2016 )

Available online at  ScienceDirect. Procedia Computer Science 89 (2016 ) Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 89 (2016 ) 180 186 Twelfth International Multi-Conference on Information Processing-2016 (IMCIP-2016) A Perspective on

More information

Performance Explorations of Multi-Core Network on Chip Router

Performance Explorations of Multi-Core Network on Chip Router Performance Explorations of Multi-Core Network on Chip Router U.Saravanakumar Department of Electronics and Communication Engineering PSG College of Technology Coimbatore, India saran.usk@gmail.com R.

More information

Quality-of-Service for a High-Radix Switch

Quality-of-Service for a High-Radix Switch Quality-of-Service for a High-Radix Switch Nilmini Abeyratne, Supreet Jeloka, Yiping Kang, David Blaauw, Ronald G. Dreslinski, Reetuparna Das, and Trevor Mudge University of Michigan 51 st DAC 06/05/2014

More information

FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow

FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow Abstract: High-level synthesis (HLS) of data-parallel input languages, such as the Compute Unified Device Architecture

More information

NEtwork-on-Chip (NoC) [3], [6] is a scalable interconnect

NEtwork-on-Chip (NoC) [3], [6] is a scalable interconnect 1 A Soft Tolerant Network-on-Chip Router Pipeline for Multi-core Systems Pavan Poluri and Ahmed Louri Department of Electrical and Computer Engineering, University of Arizona Email: pavanp@email.arizona.edu,

More information

Basic Low Level Concepts

Basic Low Level Concepts Course Outline Basic Low Level Concepts Case Studies Operation through multiple switches: Topologies & Routing v Direct, indirect, regular, irregular Formal models and analysis for deadlock and livelock

More information

Routing Algorithm. How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus)

Routing Algorithm. How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus) Routing Algorithm How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus) Many routing algorithms exist 1) Arithmetic 2) Source-based 3) Table lookup

More information

Asynchronous Bypass Channel Routers

Asynchronous Bypass Channel Routers 1 Asynchronous Bypass Channel Routers Tushar N. K. Jain, Paul V. Gratz, Alex Sprintson, Gwan Choi Department of Electrical and Computer Engineering, Texas A&M University {tnj07,pgratz,spalex,gchoi}@tamu.edu

More information

Comparison of Deadlock Recovery and Avoidance Mechanisms to Approach Message Dependent Deadlocks in on-chip Networks

Comparison of Deadlock Recovery and Avoidance Mechanisms to Approach Message Dependent Deadlocks in on-chip Networks Comparison of Deadlock Recovery and Avoidance Mechanisms to Approach Message Dependent Deadlocks in on-chip Networks Andreas Lankes¹, Soeren Sonntag², Helmut Reinig³, Thomas Wild¹, Andreas Herkersdorf¹

More information

4. Networks. in parallel computers. Advances in Computer Architecture

4. Networks. in parallel computers. Advances in Computer Architecture 4. Networks in parallel computers Advances in Computer Architecture System architectures for parallel computers Control organization Single Instruction stream Multiple Data stream (SIMD) All processors

More information

ScienceDirect. Power-Aware Mapping for 3D-NoC Designs using Genetic Algorithms

ScienceDirect. Power-Aware Mapping for 3D-NoC Designs using Genetic Algorithms Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 34 (2014 ) 538 543 2014 International Workshop on the Design and Performance of Networks on Chip (DPNoC 2014) Power-Aware

More information

Fast Flexible FPGA-Tuned Networks-on-Chip

Fast Flexible FPGA-Tuned Networks-on-Chip This work was funded by NSF. We thank Xilinx for their FPGA and tool donations. We thank Bluespec for their tool donations. Fast Flexible FPGA-Tuned Networks-on-Chip Michael K. Papamichael, James C. Hoe

More information

Design of a router for network-on-chip. Jun Ho Bahn,* Seung Eun Lee and Nader Bagherzadeh

Design of a router for network-on-chip. Jun Ho Bahn,* Seung Eun Lee and Nader Bagherzadeh 98 Int. J. High Performance Systems Architecture, Vol. 1, No. 2, 27 Design of a router for network-on-chip Jun Ho Bahn,* Seung Eun Lee and Nader Bagherzadeh Department of Electrical Engineering and Computer

More information

SoC Design. Prof. Dr. Christophe Bobda Institut für Informatik Lehrstuhl für Technische Informatik

SoC Design. Prof. Dr. Christophe Bobda Institut für Informatik Lehrstuhl für Technische Informatik SoC Design Prof. Dr. Christophe Bobda Institut für Informatik Lehrstuhl für Technische Informatik Chapter 5 On-Chip Communication Outline 1. Introduction 2. Shared media 3. Switched media 4. Network on

More information

EECS 570. Lecture 19 Interconnects: Flow Control. Winter 2018 Subhankar Pal

EECS 570. Lecture 19 Interconnects: Flow Control. Winter 2018 Subhankar Pal Lecture 19 Interconnects: Flow Control Winter 2018 Subhankar Pal http://www.eecs.umich.edu/courses/eecs570/ Slides developed in part by Profs. Adve, Falsafi, Hill, Lebeck, Martin, Narayanasamy, Nowatzyk,

More information

Link-Sharing Method of Buffer in NoC Router: Its Implementation and Communication Performance

Link-Sharing Method of Buffer in NoC Router: Its Implementation and Communication Performance Link-Sharing Method of Buffer in NoC Router: Its Implementation and Communication Performance Naohisa Fukase *1, Yasuyuki Miura 2, Shigeyoshi Watanabe 3 Shonan Institute of Technology, Fujisawa, Kanagawa,

More information

CHAPTER 6 FPGA IMPLEMENTATION OF ARBITERS ALGORITHM FOR NETWORK-ON-CHIP

CHAPTER 6 FPGA IMPLEMENTATION OF ARBITERS ALGORITHM FOR NETWORK-ON-CHIP 133 CHAPTER 6 FPGA IMPLEMENTATION OF ARBITERS ALGORITHM FOR NETWORK-ON-CHIP 6.1 INTRODUCTION As the era of a billion transistors on a one chip approaches, a lot of Processing Elements (PEs) could be located

More information

DESIGN AND PERFORMANCE EVALUATION OF ON CHIP NETWORK ROUTERS

DESIGN AND PERFORMANCE EVALUATION OF ON CHIP NETWORK ROUTERS DESIGN AND PERFORMANCE EVALUATION OF ON CHIP NETWORK ROUTERS 1 U.SARAVANAKUMAR, 2 R.RANGARAJAN 1 Asst Prof., Department of ECE, PSG College of Technology, Coimbatore, INDIA 2 Professor & Principal, Indus

More information

A Simple and Efficient Mechanism to Prevent Saturation in Wormhole Networks Λ

A Simple and Efficient Mechanism to Prevent Saturation in Wormhole Networks Λ A Simple and Efficient Mechanism to Prevent Saturation in Wormhole Networks Λ E. Baydal, P. López and J. Duato Depto. Informática de Sistemas y Computadores Universidad Politécnica de Valencia, Camino

More information

The Benefits of Using Clock Gating in the Design of Networks-on-Chip

The Benefits of Using Clock Gating in the Design of Networks-on-Chip The Benefits of Using Clock Gating in the Design of Networks-on-Chip Michele Petracca, Luca P. Carloni Dept. of Computer Science, Columbia University, New York, NY 127 Abstract Networks-on-chip (NoC) are

More information

Improving Routing Efficiency for Network-on-Chip through Contention-Aware Input Selection

Improving Routing Efficiency for Network-on-Chip through Contention-Aware Input Selection Improving Routing Efficiency for Network-on-Chip through Contention-Aware Input Selection Dong Wu, Bashir M. Al-Hashimi, Marcus T. Schmitz School of Electronics and Computer Science University of Southampton

More information

Design and Simulation of Router Using WWF Arbiter and Crossbar

Design and Simulation of Router Using WWF Arbiter and Crossbar Design and Simulation of Router Using WWF Arbiter and Crossbar M.Saravana Kumar, K.Rajasekar Electronics and Communication Engineering PSG College of Technology, Coimbatore, India Abstract - Packet scheduling

More information

Real Time NoC Based Pipelined Architectonics With Efficient TDM Schema

Real Time NoC Based Pipelined Architectonics With Efficient TDM Schema Real Time NoC Based Pipelined Architectonics With Efficient TDM Schema [1] Laila A, [2] Ajeesh R V [1] PG Student [VLSI & ES] [2] Assistant professor, Department of ECE, TKM Institute of Technology, Kollam

More information

AREA-EFFICIENT DESIGN OF SCHEDULER FOR ROUTING NODE OF NETWORK-ON-CHIP

AREA-EFFICIENT DESIGN OF SCHEDULER FOR ROUTING NODE OF NETWORK-ON-CHIP AREA-EFFICIENT DESIGN OF SCHEDULER FOR ROUTING NODE OF NETWORK-ON-CHIP Rehan Maroofi, 1 V. N. Nitnaware, 2 and Dr. S. S. Limaye 3 1 Department of Electronics, Ramdeobaba Kamla Nehru College of Engg, Nagpur,

More information

Design and Implementation of Multistage Interconnection Networks for SoC Networks

Design and Implementation of Multistage Interconnection Networks for SoC Networks International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.5, October 212 Design and Implementation of Multistage Interconnection Networks for SoC Networks Mahsa

More information

Efficient And Advance Routing Logic For Network On Chip

Efficient And Advance Routing Logic For Network On Chip RESEARCH ARTICLE OPEN ACCESS Efficient And Advance Logic For Network On Chip Mr. N. Subhananthan PG Student, Electronics And Communication Engg. Madha Engineering College Kundrathur, Chennai 600 069 Email

More information

PRIORITY BASED SWITCH ALLOCATOR IN ADAPTIVE PHYSICAL CHANNEL REGULATOR FOR ON CHIP INTERCONNECTS. A Thesis SONALI MAHAPATRA

PRIORITY BASED SWITCH ALLOCATOR IN ADAPTIVE PHYSICAL CHANNEL REGULATOR FOR ON CHIP INTERCONNECTS. A Thesis SONALI MAHAPATRA PRIORITY BASED SWITCH ALLOCATOR IN ADAPTIVE PHYSICAL CHANNEL REGULATOR FOR ON CHIP INTERCONNECTS A Thesis by SONALI MAHAPATRA Submitted to the Office of Graduate and Professional Studies of Texas A&M University

More information

Lecture 22: Router Design

Lecture 22: Router Design Lecture 22: Router Design Papers: Power-Driven Design of Router Microarchitectures in On-Chip Networks, MICRO 03, Princeton A Gracefully Degrading and Energy-Efficient Modular Router Architecture for On-Chip

More information

JUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS

JUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS 1 JUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS Shabnam Badri THESIS WORK 2011 ELECTRONICS JUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS

More information

DESIGN OF EFFICIENT ROUTING ALGORITHM FOR CONGESTION CONTROL IN NOC

DESIGN OF EFFICIENT ROUTING ALGORITHM FOR CONGESTION CONTROL IN NOC DESIGN OF EFFICIENT ROUTING ALGORITHM FOR CONGESTION CONTROL IN NOC 1 Pawar Ruchira Pradeep M. E, E&TC Signal Processing, Dr. D Y Patil School of engineering, Ambi, Pune Email: 1 ruchira4391@gmail.com

More information

Design and Implementation of a Shared Memory Switch Fabric

Design and Implementation of a Shared Memory Switch Fabric 6'th International Symposium on Telecommunications (IST'2012) Design and Implementation of a Shared Memory Switch Fabric Mina Ejlali M.ejlalikamorin@ec.iut.ac.ir Mohammad Ali Montazeri Montazeri@cc.iut.ac.ir

More information

Deadlock-Avoidance Technique for Fault-Tolerant 3D-OASIS-Network-on-Chip

Deadlock-Avoidance Technique for Fault-Tolerant 3D-OASIS-Network-on-Chip Deadlock-Avoidance Technique for Fault-Tolerant 3D-OASIS-Network-on-Chip Akram Ben Ahmed, Abderazek Ben Abdallah The University of Aizu Graduate School of Computers Science and Engineering Aizu-Wakamatsu

More information

SURVEY ON LOW-LATENCY AND LOW-POWER SCHEMES FOR ON-CHIP NETWORKS

SURVEY ON LOW-LATENCY AND LOW-POWER SCHEMES FOR ON-CHIP NETWORKS SURVEY ON LOW-LATENCY AND LOW-POWER SCHEMES FOR ON-CHIP NETWORKS Chandrika D.N 1, Nirmala. L 2 1 M.Tech Scholar, 2 Sr. Asst. Prof, Department of electronics and communication engineering, REVA Institute

More information

MULTIPATH ROUTER ARCHITECTURES TO REDUCE LATENCY IN NETWORK-ON-CHIPS. A Thesis HRISHIKESH NANDKISHOR DESHPANDE

MULTIPATH ROUTER ARCHITECTURES TO REDUCE LATENCY IN NETWORK-ON-CHIPS. A Thesis HRISHIKESH NANDKISHOR DESHPANDE MULTIPATH ROUTER ARCHITECTURES TO REDUCE LATENCY IN NETWORK-ON-CHIPS A Thesis by HRISHIKESH NANDKISHOR DESHPANDE Submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment

More information

Introduction. Introduction. Router Architectures. Introduction. Recent advances in routing architecture including

Introduction. Introduction. Router Architectures. Introduction. Recent advances in routing architecture including Router Architectures By the end of this lecture, you should be able to. Explain the different generations of router architectures Describe the route lookup process Explain the operation of PATRICIA algorithm

More information