Lecture 25: Interconnection Networks, Disks. Topics: flow control, router microarchitecture, RAID

Virtual Channel Flow Control. Each switch has multiple virtual channels per physical channel. Each virtual channel keeps track of the output channel assigned to the head flit and pointers to the packet's buffered flits. A head flit must allocate the same three resources (a virtual channel, buffer space, and channel bandwidth) in the next switch before being forwarded. By having multiple virtual channels per physical channel, two different packets can share the physical channel, so the channel is not wasted when one packet is idle.
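
As an illustration (not from the lecture), a minimal sketch of the per-virtual-channel state a router input port might keep; the class and field names are hypothetical:

    from collections import deque
    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class VirtualChannelState:
        """Bookkeeping for one virtual channel at one input port (illustrative)."""
        output_port: Optional[int] = None   # output channel chosen when the head flit was routed
        output_vc: Optional[int] = None     # virtual channel allocated on the downstream router
        credits: int = 4                    # downstream buffer slots known to be free
        flits: deque = field(default_factory=deque)  # buffered flits of the packet using this VC

        def can_advance(self) -> bool:
            # A flit may leave only if the VC is bound to an output and a
            # downstream buffer is known to be free.
            return bool(self.flits) and self.output_vc is not None and self.credits > 0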

Example: Wormhole vs. Virtual Channels. A is going from Node-1 to Node-4; B is going from Node-0 to Node-5, and Node-5 is blocked (no free VCs/buffers). [Figure: a six-node example; with wormhole flow control, B's stalled flits hold the buffers on the shared channel and A is stuck behind them, while with virtual channels A and B occupy separate VCs and A can keep moving.] Traffic analogy: B is trying to make a left turn; A is trying to go straight; there is no left-only lane with wormhole, but there is one with VCs.

Buffer Management. Credit-based: the upstream node keeps track of the number of free buffers in the downstream node; the downstream node sends back signals to increment the count when a buffer is freed; enough buffers are needed to hide the round-trip latency. On/Off: the downstream node sends back a signal when its buffers are close to being full; this reduces upstream signaling and counters, but can waste buffer space.
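
A minimal sketch of the credit counter the upstream port could maintain (my own illustration; class and method names are hypothetical):

    class CreditCounter:
        """Upstream-side credit tracking for one downstream buffer pool (illustrative)."""

        def __init__(self, downstream_buffers: int):
            # One credit per buffer slot in the downstream node.
            self.credits = downstream_buffers

        def try_send_flit(self) -> bool:
            # Send only if a downstream buffer is known to be free.
            if self.credits == 0:
                return False
            self.credits -= 1      # the flit will occupy one downstream slot
            return True

        def on_credit_return(self) -> None:
            # Downstream signals that it forwarded a flit and freed a buffer.
            self.credits += 1

Because a returned credit takes a full round trip to arrive, the downstream node needs enough buffers to cover that round-trip latency or the link will idle even when traffic is waiting.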

Deadlock Avoidance with VCs. VCs provide another way to number the links such that a route always uses ascending link numbers. [Figure: the network drawn once per VC plane; links on the first plane carry low numbers (e.g., 0-19) and the corresponding links on the second plane carry high numbers (e.g., 100-219), so a packet that would otherwise have to take a descending link crosses over to the second plane and keeps ascending.] Alternatively, use West-first routing on the 1st plane and cross over to the 2nd plane in case you need to go West again (the 2nd plane uses North-last, for example).
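
To make the ascending-numbers rule concrete, a small check plus an example of escaping to the higher-numbered plane (illustrative; the constant offset mirrors the figure's numbering convention):

    def uses_ascending_links(link_numbers) -> bool:
        """Deadlock-avoidance condition from this slide: every route must
        traverse links in strictly increasing order of their numbers."""
        return all(a < b for a, b in zip(link_numbers, link_numbers[1:]))

    # A route that would have to descend on plane 0 instead crosses to plane 1,
    # where the same physical links carry higher numbers (here: +100).
    PLANE_OFFSET = 100
    route_plane0 = [16, 17, 18]                                   # ascending so far
    route = route_plane0 + [2 + PLANE_OFFSET, 3 + PLANE_OFFSET]   # would be 2, 3 on plane 0
    assert uses_ascending_links(route)                            # 16 < 17 < 18 < 102 < 103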

Router Functions. Crossbar, buffers, arbiters, VC state and allocation, buffer management, ALUs, control logic. Typical on-chip network power breakdown: roughly 30% links, 30% buffers, 30% crossbar.

Virtual Channel Router. Buffers and channels are allocated per flit. Each physical channel is associated with multiple virtual channels; the virtual channels are allocated per packet, and the flits of different VCs can be interleaved on the physical channel. For a head flit to proceed, the router must first allocate a virtual channel on the next router. For any flit to proceed (including the head), the router must allocate the following resources: buffer space in the next router (credits indicate the available space) and access to the physical channel.

Router Pipeline. Four typical stages:
RC (routing computation): the head flit indicates the VC it belongs to, the VC state is updated, the headers are examined, and the next output channel is computed (note: this is done for all head flits arriving on the various input channels).
VA (virtual-channel allocation): the head flits compete for the available virtual channels on their computed output channels.
SA (switch allocation): a flit competes for access to its output physical channel.
ST (switch traversal): the flit is transmitted on the output channel.
A head flit goes through all four stages; the other flits do nothing in the first two stages (this is an in-order pipeline and flits cannot jump ahead); a tail flit also de-allocates the VC.

Router Pipeline. Four typical stages: RC (routing computation: compute the output channel), VA (virtual-channel allocation: allocate a VC for the head flit), SA (switch allocation: compete for the output physical channel), ST (switch traversal: transfer data on the output physical channel). Example timing:

Without a stall:
    Cycle:        1    2    3    4    5    6    7
    Head flit:    RC   VA   SA   ST
    Body flit 1:                 SA   ST
    Body flit 2:                      SA   ST
    Tail flit:                             SA   ST

With a one-cycle stall (STALL) in switch allocation for the head flit:
    Cycle:        1    2    3    4    5    6    7    8
    Head flit:    RC   VA   SA   SA   ST
    Body flit 1:                      SA   ST
    Body flit 2:                           SA   ST
    Tail flit:                                  SA   ST
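
The tables above can be derived mechanically; here is a small sketch of that derivation (my own illustration, with hypothetical function and parameter names): the head flit goes through RC, VA, and SA (repeating SA while it loses arbitration), then ST, and every later flit arbitrates one cycle after the previous flit's successful SA.

    def packet_schedule(num_flits: int, head_sa_stalls: int = 0):
        """Return {flit_name: {cycle: stage}} for one packet crossing one router.

        Head flit: RC, VA, SA (repeated on a stall), ST.
        Body/tail flits: SA one cycle after the previous flit's successful SA, then ST.
        """
        head = {1: "RC", 2: "VA"}
        cycle = 3
        for _ in range(head_sa_stalls + 1):     # SA repeats while arbitration is lost
            head[cycle] = "SA"
            cycle += 1
        head[cycle] = "ST"
        schedule = {"head": head}
        prev_sa = cycle - 1                     # cycle of the head's successful SA
        for i in range(1, num_flits):
            name = "tail" if i == num_flits - 1 else f"body{i}"
            sa = prev_sa + 1
            schedule[name] = {sa: "SA", sa + 1: "ST"}
            prev_sa = sa
        return schedule

    print(packet_schedule(4))                    # matches the table without a stall
    print(packet_schedule(4, head_sa_stalls=1))  # matches the table with one SA stall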

Speculative Pipelines. Perform VA and SA in parallel; note that SA only requires knowledge of the output physical channel, not the VC. If VA fails, the successfully allocated channel goes unutilized.

    Cycle:        1    2       3    4    5    6
    Head flit:    RC   VA+SA   ST
    Body flit 1:                SA   ST
    Body flit 2:                     SA   ST
    Tail flit:                            SA   ST

More aggressively, perform VA, SA, and ST in parallel (this can cause collisions and re-tries); typically VA is the critical path, so SA and ST can possibly be performed sequentially after it.

    Cycle:        1    2          3        4        5
    Head flit:    RC   VA+SA+ST
    Body flit 1:                  SA+ST
    Body flit 2:                           SA+ST
    Tail flit:                                      SA+ST

Router pipeline latency is a greater bottleneck when there is little contention, and when there is little contention, speculation will likely work well. Single-stage pipeline?

Recent Intel Router. Used for a 6x6 mesh; 16 B, > 3 GHz; wormhole with VC flow control. Source: Partha Kundu, "On-Die Interconnects for Next-Generation CMPs," talk at the On-Chip Interconnection Networks Workshop, Dec 2006.

Recent Intel Router (two further slides show figures from the same talk). Source: Partha Kundu, "On-Die Interconnects for Next-Generation CMPs," talk at the On-Chip Interconnection Networks Workshop, Dec 2006.

Magnetic Disks. A magnetic disk consists of 1-12 platters (metal or glass disks covered with magnetic recording material on both sides), with diameters between 1 and 3.5 inches. Each platter is comprised of concentric tracks (5,000-30,000 per surface), and each track is divided into sectors (100-500 per track, each about 512 bytes). A movable arm holds the read/write heads for each disk surface and moves them all in tandem; a cylinder of data is accessible at a time.
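
A quick capacity estimate using illustrative values drawn from the ranges above (the specific numbers are my own):

    # Illustrative disk-capacity estimate (values picked from the ranges above).
    platters = 4
    surfaces = platters * 2            # both sides of each platter are recorded
    tracks_per_surface = 30_000
    sectors_per_track = 500
    bytes_per_sector = 512

    capacity = surfaces * tracks_per_surface * sectors_per_track * bytes_per_sector
    print(capacity / 1e9, "GB")        # about 61 GB for these assumed values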

Disk Latency. To read or write data, the arm has to be placed on the correct track; this seek time usually takes 5 to 12 ms on average, and can take less if there is spatial locality. Rotational latency is the time taken to rotate the correct sector under the head; the average is typically 2 ms or more (a full rotation takes 4 ms at 15,000 RPM). Transfer time is the time taken to transfer a block of bits out of the disk, typically at 3-65 MB/second. A disk controller maintains a disk cache (so spatial locality can be exploited) and sets up the transfer on the bus (controller overhead).
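
A worked example of total access time for a 4 KB read; the specific numbers are illustrative choices within the ranges above:

    # Average time to service a 4 KB read (illustrative numbers).
    seek_ms       = 6.0                        # average seek time
    rotation_ms   = 0.5 * (60_000 / 15_000)    # half a rotation at 15,000 RPM = 2 ms
    transfer_ms   = 4 / 50_000 * 1000          # 4 KB at 50 MB/s = 0.08 ms
    controller_ms = 0.2                        # controller overhead

    total_ms = seek_ms + rotation_ms + transfer_ms + controller_ms
    print(round(total_ms, 2), "ms")            # ~8.28 ms, dominated by seek and rotation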

RAID. Reliability and availability are important metrics for disks. RAID: redundant array of inexpensive (independent) disks. Redundancy can deal with one or more failures. Each sector of a disk records check information that allows the disk to determine whether that sector has an error (in other words, redundancy already exists within a disk); when a read flags an error, we turn elsewhere for correct data.

RAID 0 and RAID 1. RAID 0 has no additional redundancy (the name is a misnomer); it uses an array of disks and stripes (interleaves) data across them to improve parallelism and throughput. RAID 1 mirrors or shadows every disk: every write happens to two disks. Reads may go to the mirror only when the primary disk fails, or you may read both copies and accept the quicker response. An expensive solution: high reliability at twice the cost.
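
For the striping mentioned above, a tiny sketch (hypothetical function name) of how a logical block number maps to a disk and an offset under RAID 0:

    def raid0_location(logical_block: int, num_disks: int):
        """RAID 0 striping: logical blocks are interleaved round-robin across the disks."""
        disk = logical_block % num_disks
        offset = logical_block // num_disks    # block index within that disk
        return disk, offset

    # With 4 disks, blocks 0-3 land on disks 0-3, block 4 wraps back to disk 0, etc.
    print([raid0_location(b, 4) for b in range(6)])
    # [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1)]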

RAID 3. Data is bit-interleaved across several disks, and a separate disk maintains parity information for each set of bits. For example, with 8 data disks, bit 0 is on disk 0, bit 1 is on disk 1, ..., bit 7 is on disk 7, and disk 8 maintains parity over all 8 bits. For any read, all 8 data disks must be accessed (as we usually read more than a byte at a time), and for any write, all 9 disks must be accessed, as parity has to be re-calculated. High throughput for a single request, low cost for redundancy (overhead: 12.5%), low task-level parallelism.
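
The parity disk simply stores the XOR of the corresponding bits on the data disks, which is enough to reconstruct any single failed disk. A small sketch (illustrative, hypothetical names):

    from functools import reduce

    def parity(blocks):
        """Bitwise XOR across equal-sized blocks -- the RAID parity function."""
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

    data = [bytes([i] * 4) for i in range(8)]    # contents of 8 data disks (4 bytes each)
    p = parity(data)                             # contents of the parity disk

    # Reconstruct a failed disk (say disk 3) by XOR-ing the surviving disks and parity.
    recovered = parity([d for i, d in enumerate(data) if i != 3] + [p])
    assert recovered == data[3]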

RAID 4 and RAID 5. Data is block-interleaved; this allows us to get all our data from a single disk on a read (in case of a disk error, read all 9 disks). Block interleaving reduces throughput for a single request (as only a single disk drive services the request), but improves task-level parallelism, since the other disk drives are free to service other requests. On a write, we access the disk that stores the data and the parity disk; parity information can be updated simply by checking how the new data differs from the old data.
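
That small-write update works because parity is XOR: the new parity is the old parity XORed with the bits that changed. A sketch with hypothetical names:

    def updated_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
        """RAID 4/5 small-write parity update:
        new_parity = old_parity XOR old_data XOR new_data.
        Only the target data disk and the parity disk are read and written."""
        return bytes(p ^ od ^ nd for p, od, nd in zip(old_parity, old_data, new_data))

    old_data, new_data = b"\x0f\x00", b"\x0a\x01"
    old_parity = b"\x33\x44"
    print(updated_parity(old_data, new_data, old_parity).hex())   # '3645'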

RAID 5. If we have a single disk for parity, multiple writes cannot happen in parallel (as all writes must update the parity information). RAID 5 distributes the parity blocks across the disks to allow simultaneous writes.
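
One common way to distribute parity (my illustration; the lecture does not specify a particular layout) is to rotate the parity block across the disks from stripe to stripe:

    def raid5_parity_disk(stripe: int, num_disks: int) -> int:
        """Rotate the parity block across disks so that writes to different
        stripes update different parity disks and can proceed in parallel."""
        return (num_disks - 1 - stripe) % num_disks

    # With 5 disks, stripes 0-4 place parity on disks 4, 3, 2, 1, 0, then repeat.
    print([raid5_parity_disk(s, 5) for s in range(6)])   # [4, 3, 2, 1, 0, 4]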

RAID Summary. RAID 1-5 can tolerate a single fault; mirroring (RAID 1) has a 100% overhead, while parity (RAID 3, 4, 5) has modest overhead. Multiple faults can be tolerated by having multiple check functions; each additional check can cost an additional disk (RAID 6). RAID 6 and RAID 2 (memory-style ECC) are not commercially employed.
