Fault Tolerant and Secure Architectures for On Chip Networks With Emerging Interconnect Technologies
Mohsin Y Ahmed, Conlan Wesson



Overview
NoC: the future generation of many-core processors on a single chip.
In current multicore processors, the cores communicate over a shared bus.
Only one core can send a message at a time, which limits the number of cores.

Overview (Contd.)
An NoC allows for more cores, i.e., it ensures scalability.
Multiple cores can send messages simultaneously.
It is somewhat similar to a computer network.
"Route Packets, Not Wires" - William J. Dally, Stanford University, NVIDIA

Interconnect Technology
The shared-medium arbitrated bus is the most frequently used on-chip interconnect architecture.
All communicating devices share the same transmission medium.
Advantages
o simple topology
o low area cost
o extensibility

Interconnect Technology (Contd.)
Disadvantages
o High intrinsic parasitic resistance and capacitance
o Bit-transfer delay increases as processing elements are added and eventually exceeds the targeted clock period
o Limited system scalability

Novel NoC Architectures
A network-on-chip (NoC) resembles the interconnect architecture of high-performance parallel computing systems.
The functional IP blocks communicate with each other with the help of intelligent switches.
An NoC decouples the processing elements (i.e., the IPs) from the communication fabric (i.e., the network).
It employs explicit parallelism, exhibits modularity to minimize the use of global wires, and utilizes locality for power minimization.

SPIN
SPIN: Scalable, Programmable, Integrated Network.
Uses a fat-tree architecture.
Every node has four children, and the parent is replicated four times at any level.

BFT
BFT: Butterfly Fat-Tree.
The IPs are placed at the leaves and the switches at the vertices.
At each subsequent level, the number of required switches reduces by a factor of 2.
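The halving rule above can be turned into a quick switch count. This is only a sketch: the slide does not say how many switches the first level has or how many levels the tree has, so the function assumes one first-level switch per four IPs and takes the number of levels as a parameter. Both choices are assumptions made for illustration.

```python
def bft_switches_per_level(num_ips: int, levels: int) -> list[int]:
    """Return the switch count at each level of a butterfly fat-tree.

    Assumptions (not stated on the slide): the first level has one switch
    per four IPs, and the caller supplies the number of levels.
    """
    if num_ips < 4 or num_ips % 4 != 0:
        raise ValueError("num_ips must be a positive multiple of 4")
    counts = []
    switches = num_ips // 4          # assumed first-level switch count
    for _ in range(levels):
        if switches < 1:
            break                    # no switches left to place
        counts.append(switches)
        switches //= 2               # halving rule from the slide
    return counts


counts = bft_switches_per_level(64, 3)
print(counts, sum(counts))           # [16, 8, 4] 28
```

Under these assumptions the total stays below half the number of IPs, because the per-level counts form a halving series.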

CLICHE
CLICHE: Chip-Level Integration of Communicating Heterogeneous Elements.
Consists of an m x n mesh of switches interconnecting computational resources (IPs).
Every switch, except those at the edges, is connected to four neighboring switches and one IP block.
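The mesh connectivity can be sketched in a few lines. The (row, col) coordinate scheme and the function name are illustrative choices, not part of the original slides.

```python
def mesh_neighbors(row: int, col: int, rows: int, cols: int) -> list[tuple[int, int]]:
    """Return the switches directly connected to switch (row, col) in an m x n mesh."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    # Edge and corner switches simply have fewer neighbors: drop out-of-range positions.
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


print(len(mesh_neighbors(1, 1, 4, 4)))   # interior switch: 4
print(len(mesh_neighbors(0, 0, 4, 4)))   # corner switch: 2
```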

2D Torus
Basically the same as a regular mesh.
The only difference is that the switches at the edges are connected to the switches at the opposite edge through wrap-around channels.
The long end-around connections can yield excessive delays.
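To see what the wrap-around channels buy in hop count, the sketch below compares minimal hop distances in a mesh and in a torus of the same size. It counts hops only and ignores the extra wire delay of the long end-around links; the function names are illustrative.

```python
def mesh_hops(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Minimal hop count between switches a and b in a mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def torus_hops(a: tuple[int, int], b: tuple[int, int], rows: int, cols: int) -> int:
    """Minimal hop count in a torus: each dimension may go the short way around."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    return min(dr, rows - dr) + min(dc, cols - dc)


# Opposite corners of a 4 x 4 network.
print(mesh_hops((0, 0), (3, 3)))          # 6
print(torus_hops((0, 0), (3, 3), 4, 4))   # 2
```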

Folded Torus
The long end-around delay can be avoided by folding the torus.
This yields a layout better suited to VLSI implementation.

Octagon
Communication between any pair of nodes takes at most two hops within the basic octagonal unit.
Each functional IP has a dedicated switch.
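The two-hop bound can be checked with a breadth-first search. The slide does not list the links of the octagonal unit, so the sketch assumes the usual arrangement: eight nodes on a ring, each also linked to the node directly across.

```python
from collections import deque


def octagon_links(n: int = 8) -> dict[int, list[int]]:
    """Assumed topology: ring links plus one link to the diametrically opposite node."""
    return {i: [(i + 1) % n, (i - 1) % n, (i + n // 2) % n] for i in range(n)}


def hops_from(source: int, links: dict[int, list[int]]) -> dict[int, int]:
    """Breadth-first search returning the hop count from source to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in links[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


links = octagon_links()
max_hops = max(max(hops_from(s, links).values()) for s in links)
print(max_hops)   # 2
```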

SWITCHING METHODOLOGIES
Switching techniques determine
o when and how internal switches connect their inputs to outputs
o the time at which message components may be transferred along these paths
Different types of switching techniques
o Circuit switching
o Packet switching
o Wormhole switching

Circuit Switching
A physical path from source to destination is reserved prior to the transmission of the data.
The path is held until all the data has been transmitted.
Network bandwidth is reserved for the entire duration of the transfer, so valuable resources are tied up for that time.
Setting up the end-to-end path may cause unnecessary delays.

Packet Switching
Data is divided into fixed-length blocks called packets.
Whenever the source has a packet to send, it transmits it.
Conventional packet switching stores entire packets in a switch, which makes the buffer requirement high.
In an NoC environment, the switches should not consume a large fraction of silicon area compared to the IP blocks.

Wormhole Switching
Packets are divided into fixed-length flow control units (flits).
The input and output buffers are expected to store only a few flits.
The buffer space requirement in the switches is therefore small, i.e., the switches are small and compact.
The first flit of a packet, the header flit, contains the routing information.
Decoding the header flit enables the switches to establish the path, and the subsequent flits simply follow this path in a pipelined fashion.
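A minimal sketch of how a packet might be cut into flits. The `Flit` class, the 4-byte flit size, the zero padding of the last flit, and the use of a (row, col) destination as routing information are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Flit:
    kind: str        # "header", "body" or "tail"
    payload: object  # routing information for the header, data bytes otherwise


def packet_to_flits(dest: tuple[int, int], data: bytes, flit_bytes: int = 4) -> list[Flit]:
    """Split a packet into a header flit followed by fixed-length data flits."""
    flits = [Flit("header", dest)]   # only the header carries routing information
    for start in range(0, len(data), flit_bytes):
        chunk = data[start:start + flit_bytes].ljust(flit_bytes, b"\x00")  # pad last flit
        flits.append(Flit("body", chunk))
    if len(flits) > 1:
        flits[-1].kind = "tail"      # the last data flit closes the packet
    return flits


flits = packet_to_flits((2, 3), b"abcdefghij")
print([f.kind for f in flits])       # ['header', 'body', 'body', 'tail']
```

Because only the header flit is routed, a switch needs buffer space for a few flits rather than for the whole packet.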

Wormhole Switching (Contd.)
Each incoming data flit of a message packet is simply forwarded along the same output channel as the preceding data flit.
No packet reordering is required at destinations.
Drawbacks
o Transmission of distinct messages cannot be interleaved or multiplexed.
o Messages must cross the channel in their entirety before the channel can be used by another message.
o Channel utilization decreases if a flit from a given packet is blocked in a buffer.

Wormhole Switching (Contd.)
Introducing virtual channels in the input and output ports can increase channel utilization considerably.
If a flit belonging to a particular packet is blocked in one of the virtual channels, flits of other packets can use the other virtual channel buffers.
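The effect of virtual channels can be illustrated with a toy model of a single physical link. Packet A is blocked downstream and packet B is free to move; the model counts how many flits of B cross the link in a few cycles. The scenario, the packet names, and the one-flit-per-cycle link are assumptions made for this sketch, not a model of any particular router.

```python
from collections import deque


def flits_delivered(num_vcs: int, cycles: int = 4) -> int:
    """Count flits of packet B that cross the link while packet A stays blocked."""
    vcs = [deque() for _ in range(num_vcs)]
    vcs[0].extend(("A", i) for i in range(2))       # packet A arrived first, blocked downstream
    b_channel = 1 if num_vcs > 1 else 0             # B gets its own buffer only if one exists
    vcs[b_channel].extend(("B", i) for i in range(2))

    delivered = 0
    for _ in range(cycles):
        for vc in vcs:
            if vc and vc[0][0] != "A":              # head flit is not part of the blocked packet
                vc.popleft()
                delivered += 1
                break                               # the physical link carries one flit per cycle
    return delivered


print(flits_delivered(num_vcs=1))   # 0: B is stuck behind A in the only buffer
print(flits_delivered(num_vcs=2))   # 2: B uses the second virtual channel
```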

NoC PERFORMANCE METRICS
It is desirable that an NoC interconnect architecture exhibits high throughput, low latency, energy efficiency, and low area overhead.
In today's power-constrained environments, it is increasingly critical to identify the most energy-efficient architectures and to quantify the energy-performance trade-offs.

Message Throughput
Message throughput is measured as the fraction of the maximum load that the network is physically capable of handling.
A throughput of 1 corresponds to all end nodes receiving one flit every cycle.
Measured in flits/cycle/IP.
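As a worked example of the unit, the sketch below divides the flits received by the number of cycles and the number of IPs. The function name and the sample numbers are made up for illustration.

```python
def message_throughput(flits_received: int, cycles: int, num_ips: int) -> float:
    """Throughput in flits/cycle/IP; 1.0 means every IP receives one flit every cycle."""
    if cycles <= 0 or num_ips <= 0:
        raise ValueError("cycles and num_ips must be positive")
    return flits_received / (cycles * num_ips)


# Hypothetical run: 16 IPs receive 8000 flits in total over 1000 cycles.
print(message_throughput(8000, 1000, 16))   # 0.5
```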

Transport Latency
Defined as the time (in clock cycles) that elapses between the injection of a message header into the network at the source node and the reception of the tail flit at the destination node.
Depending on the source/destination pair and the routing algorithm, each message may have a different latency.
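Since each message can have its own latency, results are usually reported as an average over many messages. A minimal sketch, with made-up cycle counts:

```python
def transport_latency(header_injected_at: int, tail_received_at: int) -> int:
    """Latency in clock cycles from header injection to tail flit reception."""
    return tail_received_at - header_injected_at


def average_latency(messages: list[tuple[int, int]]) -> float:
    """Average latency over (header_injected_at, tail_received_at) pairs."""
    latencies = [transport_latency(injected, received) for injected, received in messages]
    return sum(latencies) / len(latencies)


# Hypothetical trace of three messages.
messages = [(0, 12), (5, 25), (7, 16)]
print([transport_latency(i, r) for i, r in messages])   # [12, 20, 9]
print(average_latency(messages))
```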

Experimental Results

Experimental Results (Contd.)

Experimental Results (Contd.)

Wireless NoC
Some long wired lines are replaced by RF wireless links.
On-chip carbon nanotube (CNT) antennas.
Long-range wireless links, short wireline links.

Wireless NoC Architecture
The WiNoC architecture is based on the small-world property.
Networks with the small-world property have a very small average path length.
A small-world topology can be constructed from a locally connected network by rewiring connections randomly to any other node, which creates shortcuts in the network.
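The rewiring idea can be tried on a small ring. The sketch below builds a locally connected ring, rewires each link with probability p, and measures the average path length in hops. The ring starting point, the rewiring probability, and the function names are assumptions for illustration; this is not the construction used for the WiNoC itself.

```python
import random
from collections import deque


def ring_lattice(n: int, k: int = 1) -> set[frozenset]:
    """Locally connected network: each node links to its k nearest nodes on each side."""
    return {frozenset((i, (i + step) % n)) for i in range(n) for step in range(1, k + 1)}


def rewire(edges: set[frozenset], n: int, p: float, rng: random.Random) -> set[frozenset]:
    """With probability p, move one end of a link to a randomly chosen other node."""
    result = set(edges)
    for edge in sorted(edges, key=sorted):
        if rng.random() < p:
            u, _ = sorted(edge)
            candidates = [w for w in range(n) if w != u and frozenset((u, w)) not in result]
            if candidates:
                result.remove(edge)
                result.add(frozenset((u, rng.choice(candidates))))   # the new shortcut
    return result


def average_path_length(edges: set[frozenset], n: int) -> float:
    """Mean hop count over all reachable ordered pairs (breadth-first search per node)."""
    adj = {i: [] for i in range(n)}
    for edge in edges:
        u, v = tuple(edge)
        adj[u].append(v)
        adj[v].append(u)
    total = pairs = 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


lattice = ring_lattice(16)
shortcut_net = rewire(lattice, 16, p=0.2, rng=random.Random(1))
print(average_path_length(lattice, 16), average_path_length(shortcut_net, 16))
```

Rewiring can occasionally disconnect a sparse ring, which is why the average is taken over reachable pairs only.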

Scale Free Networks
Most nodes have low degree.
A few nodes have very high degree.

Wireless NoC Architecture (Contd.)
The whole system is divided into multiple small clusters of neighboring cores called subnets.
The cores in a subnet are connected to a centrally located hub through direct links.
The hubs from all subnets are connected in a second-level network.
Due to the limitations of wireless links, only a few wireless links are used, distributed between hubs separated by relatively long distances.

WiNoC Experimental Results

WiNoC Experimental Results (Contd.)

NoC Security
Cores and other devices from different manufacturers are likely to be embedded on a single chip.
This makes the chip vulnerable to hardware Trojans.
Malicious Trojans try to bypass or disable the security fence of a system.
A Trojan can continuously broadcast garbage data, leak confidential information by radio emission, route flits in wrong directions, or even tamper with the flits.
As soon as a hardware Trojan is detected, it may need to be removed from the system immediately, with minimum effect on the system.

Fault Tolerant NoC Architecture
We performed a study to find an NoC architecture that shows maximum fault tolerance in case of a node deletion.
The study was performed on both mesh and small-world topologies.
For the small-world topology, we devised an algorithm for finding an attack-tolerant architecture by iteratively reorganizing the initial topology.

Routing Algorithm
Dijkstra's shortest path routing is adopted for routing the small-world (SW) NoC.
This graph search algorithm solves the single-source shortest path problem for a graph with nonnegative edge path costs, producing a shortest path tree.
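A compact sketch of Dijkstra's algorithm using a binary heap. The graph representation (a dict of dicts with link costs) and the sample graph are assumptions for illustration; the returned `parent` map encodes the shortest path tree.

```python
import heapq


def dijkstra(graph: dict, source):
    """Single-source shortest paths for nonnegative link costs.

    Returns (dist, parent): dist[v] is the cost of the shortest path to v,
    parent[v] is the previous node on that path (the shortest path tree).
    """
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry, a shorter path was found already
        for v, cost in graph[u].items():
            new_dist = d + cost
            if new_dist < dist.get(v, float("inf")):
                dist[v] = new_dist
                parent[v] = u
                heapq.heappush(heap, (new_dist, v))
    return dist, parent


# Hypothetical four-core network with link costs.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
dist, parent = dijkstra(graph, "A")
print(dist)     # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
print(parent)   # {'A': None, 'B': 'A', 'C': 'B', 'D': 'C'}
```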

Optimal Fault Tolerant Architecture
The attack-tolerant architecture is obtained by applying an algorithm based on simulated annealing.
Specific cores in the small-world topology are attacked, i.e., they are isolated from all their neighbors so that they can neither send nor receive flits.
The topology is reorganized iteratively, by rewiring one of its existing links at a time, until the throughput converges.
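Isolating an attacked core amounts to removing every link that touches it. A minimal sketch, assuming the topology is stored as an adjacency list (an illustrative choice):

```python
def isolate_cores(adjacency: dict[int, list[int]], attacked) -> dict[int, list[int]]:
    """Return a copy of the topology in which attacked cores have no links left."""
    attacked = set(attacked)
    return {
        core: [] if core in attacked else [n for n in neighbors if n not in attacked]
        for core, neighbors in adjacency.items()
    }


# Hypothetical four-core topology; core 2 is attacked.
topology = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(isolate_cores(topology, [2]))   # {0: [1], 1: [0], 2: [], 3: []}
```

In this example core 3 is cut off as well, because its only link went through the attacked core. This is the kind of loss the reorganized topology is meant to limit.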

Simulated Annealing Metrics
$M = \frac{\sum_{i \neq j} d(i, j)}{N(N-1)}$, where $i, j$ are NoC cores, $d(i, j)$ is their shortest path distance according to Dijkstra's algorithm, and $N$ is the total number of cores in the system.
$\rho = \frac{dM}{dL}$, where $L$ is the number of levels of neighbors up to which a core is attacked.
The objective is to minimize $\rho$ to find an optimal solution.
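The metric M is the average shortest path distance over all ordered pairs of cores. The sketch below computes it for a connected topology, assuming every link costs one hop so that a breadth-first search gives the same distances as Dijkstra's algorithm. The adjacency-list representation and the function names are illustrative.

```python
from collections import deque


def hop_distances(adjacency: dict[int, list[int]], source: int) -> dict[int, int]:
    """Shortest hop count from source to every reachable core (unit link costs assumed)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_distance_metric(adjacency: dict[int, list[int]]) -> float:
    """M = (sum of d(i, j) over all ordered pairs i != j) / (N (N - 1))."""
    n = len(adjacency)
    total = 0
    for source in adjacency:
        dist = hop_distances(adjacency, source)
        if len(dist) != n:
            raise ValueError("topology is disconnected, M is undefined")
        total += sum(dist.values())
    return total / (n * (n - 1))


# Three cores in a line: 0 - 1 - 2. Pair distances sum to 8 over 6 ordered pairs.
line = {0: [1], 1: [0, 2], 2: [1]}
print(average_distance_metric(line))
```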

Simulated Annealing Algorithm
1. Initial network setup. Current network = initial network.
2. Compute the metric $\rho$ for the current network (Dijkstra routing algorithm).
3. Generate a new network configuration: randomly pick and rewire 1 link. Compute the new metric $\rho'$.
4. $\rho' < \rho$?
   yes: current network = new network.
   no: generate a uniform random number $r$ in [0, 1]. $itr \cdot e^{(\rho - \rho')} > r$?
       yes: current network = new network.
       no: keep the current network.
5. Reached convergence?
   no: return to step 3.
   yes: optimal network configuration.
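The loop can be sketched generically. This is not the authors' implementation: it uses the standard Metropolis acceptance rule with an assumed cooling schedule of temperature = 1/itr as a stand-in for the acceptance test, and it is exercised on a toy cost function rather than on a network. For the NoC case, `cost` would compute the metric for a topology and `neighbor` would rewire one randomly chosen link.

```python
import math
import random


def simulated_annealing(initial, cost, neighbor, iterations=1000, rng=None):
    """Minimize cost(state); returns the best state seen and its cost."""
    rng = rng or random.Random(0)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    for itr in range(1, iterations + 1):
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        temperature = 1.0 / itr                      # assumed cooling schedule
        # Always accept an improvement; accept a worse state with a probability
        # that shrinks as the cost gap grows and as the temperature drops.
        accept = candidate_cost < current_cost or (
            math.exp((current_cost - candidate_cost) / temperature) > rng.random()
        )
        if accept:
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost


# Toy problem standing in for the network search: find the integer closest to 3.
def toy_cost(x: int) -> int:
    return (x - 3) ** 2


def toy_neighbor(x: int, rng: random.Random) -> int:
    return x + rng.choice((-1, 1))


best, best_cost = simulated_annealing(20, toy_cost, toy_neighbor)
print(best, best_cost)
```

Keeping track of the best state seen means the result can never be worse than the starting point, even when a worse state is accepted along the way.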

Simulation Results

Questions? THANK YOU