Where We Are in This Course Right Now. ECE 152 Introduction to Computer Architecture Input/Output (I/O) Copyright 2012 Daniel J. Sorin Duke University

Similar documents
Where We Are in This Course Right Now. ECE 152 Introduction to Computer Architecture Input/Output (I/O) Copyright 2011 Daniel J. Sorin Duke University

ECE 250 / CPS 250 Computer Architecture. Input/Output

ECE/CS 250 Computer Architecture. Summer 2018

INPUT/OUTPUT DEVICES Dr. Bill Yi Santa Clara University

Readings. Storage Hierarchy III: I/O System. I/O (Disk) Performance. I/O Device Characteristics. often boring, but still quite important

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Computer Science 146. Computer Architecture

Chapter 8. A Typical collection of I/O devices. Interrupts. Processor. Cache. Memory I/O bus. I/O controller I/O I/O. Main memory.

Lecture 13. Storage, Network and Other Peripherals

Storage. Hwansoo Han

Lecture 23: Storage Systems. Topics: disk access, bus design, evaluation metrics, RAID (Sections )

Chapter Seven Morgan Kaufmann Publishers

Storage Systems. Storage Systems

Chapter 6 Storage and Other I/O Topics

Virtual Memory. Reading. Sections 5.4, 5.5, 5.6, 5.8, 5.10 (2) Lecture notes from MKP and S. Yalamanchili

Administrivia. CMSC 411 Computer Systems Architecture Lecture 19 Storage Systems, cont. Disks (cont.) Disks - review

Buses. Disks PCI RDRAM RDRAM LAN. Some slides adapted from lecture by David Culler. Pentium 4 Processor. Memory Controller Hub.

Reading and References. Input / Output. Why Input and Output? A typical organization. CSE 410, Spring 2004 Computer Systems

Chapter 6. Storage and Other I/O Topics

Input/Output. Today. Next. Principles of I/O hardware & software I/O software layers Disks. Protection & Security

This Unit: Main Memory. Building a Memory System. First Memory System Design. An Example Memory System

Introduction to Input and Output

ECE 152 Introduction to Computer Architecture

I/O CANNOT BE IGNORED

Thomas Polzer Institut für Technische Informatik

ECE 250 / CS250 Introduction to Computer Architecture

Chapter 6. I/O issues

Where We Are in This Course Right Now. ECE 152 Introduction to Computer Architecture. This Unit: Main Memory. Readings

IO System. CP-226: Computer Architecture. Lecture 25 (24 April 2013) CADSL

Computer Architecture Computer Science & Engineering. Chapter 6. Storage and Other I/O Topics BK TP.HCM

Computer Architecture. Hebrew University Spring Chapter 8 Input/Output. Big Picture: Where are We Now?

Chapter 6. Storage & Other I/O

Bus System. Bus Lines. Bus Systems. Chapter 8. Common connection between the CPU, the memory, and the peripheral devices.

Lecture 23. Finish-up buses Storage

Chapter 6. Storage and Other I/O Topics. ICE3003: Computer Architecture Fall 2012 Euiseong Seo

Introduction I/O 1. I/O devices can be characterized by Behavior: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec

CSE 120. Overview. July 27, Day 8 Input/Output. Instructor: Neil Rhodes. Hardware. Hardware. Hardware

CSE 120. Operating Systems. March 27, 2014 Lecture 17. Mass Storage. Instructor: Neil Rhodes. Wednesday, March 26, 14

CS152 Computer Architecture and Engineering Lecture 20: Busses and OS s Responsibilities. Recap: IO Benchmarks and I/O Devices

Chapter 6. Storage and Other I/O Topics

Storage systems. Computer Systems Architecture CMSC 411 Unit 6 Storage Systems. (Hard) Disks. Disk and Tape Technologies. Disks (cont.

I/O CANNOT BE IGNORED

Key Points. Rotational delay vs seek delay Disks are slow. Techniques for making disks faster. Flash and SSDs

CPS104 Computer Organization and Programming Lecture 18: Input-Output. Outline of Today s Lecture. The Big Picture: Where are We Now?

EE108B Lecture 17 I/O Buses and Interfacing to CPU. Christos Kozyrakis Stanford University

Input/Output Problems. External Devices. Input/Output Module. I/O Steps. I/O Module Function Computer Architecture

UC Santa Barbara. Operating Systems. Christopher Kruegel Department of Computer Science UC Santa Barbara

Lectures More I/O

Chapter 6. Storage and Other I/O Topics. Jiang Jiang

Chapter 6. Storage and Other I/O Topics

Where We Are in This Course Right Now. ECE 152 Introduction to Computer Architecture. This Unit: Caches and Memory Hierarchies.

I/O, Disks, and RAID Yi Shi Fall Xi an Jiaotong University

Input/Output Introduction

Introduction. Motivation Performance metrics Processor interface issues Buses

Adapted from instructor s supplementary material from Computer. Patterson & Hennessy, 2008, MK]

Appendix D: Storage Systems

CS 341l Fall 2008 Test #4 NAME: Key

Digital Design Laboratory Lecture 6 I/O

ECE468 Computer Organization and Architecture. OS s Responsibilities

Computer Architecture CS 355 Busses & I/O System

COMP283-Lecture 3 Applied Database Management

Introduction to I/O. April 30, Howard Huang 1

Computer Systems Laboratory Sungkyunkwan University

Chapter 6. Storage and Other I/O Topics

Storage Technologies and the Memory Hierarchy

CS429: Computer Organization and Architecture

CS429: Computer Organization and Architecture

Computer Organization

Virtual Memory Input/Output. Admin

INPUT/OUTPUT ORGANIZATION

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Rest of the Semester. CIS 371 Computer Organization and Design. Readings. This Unit: Virtualization & I/O

Mass-Storage Structure

CSCI-GA Database Systems Lecture 8: Physical Schema: Storage

INPUT/OUTPUT ORGANIZATION

Random-Access Memory (RAM) CS429: Computer Organization and Architecture. SRAM and DRAM. Flash / RAM Summary. Storage Technologies

Module 6: INPUT - OUTPUT (I/O)

Chapter 3. Top Level View of Computer Function and Interconnection. Yonsei University

Lecture 23: I/O Redundant Arrays of Inexpensive Disks Professor Randy H. Katz Computer Science 252 Spring 1996

ECE232: Hardware Organization and Design

Organisasi Sistem Komputer

Outline. Operating Systems: Devices and I/O p. 1/18

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Page 1. Magnetic Disk Purpose Long term, nonvolatile storage Lowest level in the memory hierarchy. Typical Disk Access Time

CIT 668: System Architecture. Computer Systems Architecture

ECE331: Hardware Organization and Design

PC-based data acquisition II

Input/Output. Today. Next. Principles of I/O hardware & software I/O software layers Secondary storage. File systems

Chapter 6. Storage and Other I/O Topics. ICE3003: Computer Architecture Spring 2014 Euiseong Seo

Chapter 11. I/O Management and Disk Scheduling

Computer Science 61C Spring Friedland and Weaver. Input/Output

Announcement. Computer Architecture (CSC-3501) Lecture 23 (17 April 2008) Chapter 7 Objectives. 7.1 Introduction. 7.2 I/O and Performance

Lecture 25: Busses. A Typical Computer Organization

A+ Guide to Hardware: Managing, Maintaining, and Troubleshooting, 5e. Chapter 6 Supporting Hard Drives

CS 201 The Memory Hierarchy. Gerson Robboy Portland State University

Computer System Overview

This page intentionally left blank

Input/Output. Chapter 5: I/O Systems. How fast is I/O hardware? Device controllers. Memory-mapped I/O. How is memory-mapped I/O done?

Recap: Machine Organization

PC I/O. May 7, Howard Huang 1

Transcription:

Introduction to Computer Architecture Input/Output () Copyright 2012 Daniel J. Sorin Duke University Slides are derived from work by Amir Roth (Penn) Spring 2012 Where We Are in This Course Right Now So far: We know how to design a processor that can fetch, decode, and execute the instructions in an ISA We understand how to design caches and memory Now: We learn about the lowest level of storage (disks) We learn about input/output in general Next: Multicore processors Evaluating and improving performance 1 2 This Unit: Readings Application OS Compiler Firmware CPU Digital Circuits Gates & Transistors system structure Devices, controllers, and buses Device characteristics Disks Bus characteristics control Polling and interrupts Patterson and Hennessy Chapter 6 3 4

Computers Interact with Outside World One Instance of Input/output () Otherwise, how will we ever tell a computer what to do or exploit the results of its work? Computers without are not useful ICQ: What kinds of do computers have? CPU I$ D$ L2 Have briefly seen one instance of Disk: bottom of memory hierarchy Holds whatever can t fit in memory ICQ: What else do disks hold? Main Disk(swap) 5 6 A More General/Realistic System A computer system CPU, including cache(s) (DRAM) peripherals: disks, input devices, displays, network cards,... With built-in or separate (or ) controllers All connected by a system bus CPU ($) System (memory-) bus ctrl : Control + Data Transfer devices have two ports Control: commands and status reports How we tell what to do How tells us about itself Control is the tricky part (especially status reports) Data Labor-intensive part Interesting devices do data transfers (to/from memory) Display: video memory monitor Disk: memory disk Network interface: memory network Main kbd Disk display NIC 7 8

Operating System (OS) Plays a Big Role interface is typically under OS control User applications access devices indirectly (e.g., SYSCALL) Why? Device drivers are programs that OS uses to manage devices Virtualization: same argument as for memory Physical devices shared among multiple programs Direct access could lead to conflicts example? Synchronization Most have asynchronous interfaces, require unbounded waiting OS handles asynchrony internally, presents synchronous interface Standardization Devices of a certain type (disks) can/will have different interfaces OS handles differences (via drivers), presents uniform interface 9 Device Characteristics Primary characteristic Data rate (aka bandwidth) Contributing factors Partner: humans have slower output data rates than machines Input or output or both (input/output) Device Partner I? O? Data Rate (KB/s) Keyboard Human Input 0.01 Mouse Human Input 0.02 Speaker Human Output 0.60 Printer Human Output 200 Display Human Output 240,000 Modem Machine 7 Ethernet card Machine ~1,000,000 Disk Machine ~10,000 10 Device Bandwidth: Some Examples Device: Disk Keyboard 1 B/key * 10 keys/s = 10 B/s Mouse 2 B/transfer * 10 transfers/s = 20 B/s Display 4 B/pixel * 1M pixel/display * 60 displays/s = 240 MB/s platter sector track head Disk: like stack of record players Collection of platters Each with read/write head Platters divided into concentric tracks Head seeks (forward/backward) to track All heads move in unison Each track divided into sectors ZBR (zone bit recording) More sectors on outer tracks Sectors rotate under head Controller Seeks heads, waits for sectors Turns heads on/off May have its own cache (made w/dram) 11 12

Disk Parameters Slightly newer disk from Toshiba 0.85, 4 GB drives, used in ipod-mini Seagate ST3200 Seagate Savvio Toshiba MK1003 Diameter 3.5 2.5 1.8 Capacity 200 GB 73 GB 10 GB RPM 7200 RPM 10000 RPM 4200 RPM Cache 8 MB? 512 KB Disks/Heads 2/4 2/4 1/2 Average Seek 8 ms 4.5 ms 7 ms Peak Data Rate 150 MB/s 200 MB/s 200 MB/s Sustained Data Rate 58 MB/s 94 MB/s 16 MB/s Interface ATA SCSI ATA Use Desktop Laptop ipod Disk Read/Write Latency Disk read/write latency has four components Seek delay (t seek ): head seeks to right track Rotational delay (t rotation ): right sector rotates under head On average: time to go halfway around disk Transfer time (t transfer ): data actually being transferred Controller delay (t controller ): controller overhead (on either side) Example: time to read a 4KB page assuming 128 sectors/track, 512 B/sector, 6000 RPM, 10 ms t seek, 1 ms t controller 6000 RPM 100 R/s 10 ms/r t rotation = 10 ms / 2 = 5 ms 4 KB page 8 sectors t transfer = 10 ms * 8/128 = 0.6 ms t disk = t seek + t rotation + t transfer + t controller = 10 + 5 + 0.6 + 1 = 16.6 ms 13 14 Disk Bandwidth Disk is bandwidth-inefficient for page-sized transfers Actual data transfer (t transfer ) a small part of disk access (and cycle) Increase bandwidth: stripe data across multiple disks Striping strategy depends on disk usage model File System or web server : many small files Map entire files to disks Supercomputer or database : several large files Stripe single file across multiple disks Both bandwidth and individual transaction latency important Error Correction: RAID Error correction: more important for disk than for memory Mechanical disk failures (entire disk lost) is common failure mode Entire file system can be lost if files striped across multiple disks RAID (redundant array of inexpensive disks) Similar to DRAM error correction, but Major difference: which disk failed is known Even parity can be used to recover from single failures Parity disk can be used to reconstruct data faulty disk RAID design balances bandwidth and fault-tolerance Many flavors of RAID exist Tradeoff: extra disks (cost) vs. performance vs. reliability Deeper discussion of RAID in ECE 252 and ECE 254 RAID doesn t solve all problems can you think of any examples? 15 16

The System Bus System bus: connects system components together Important: insufficient bandwidth can bottleneck entire system Performance factors Physical length Number and type of connected devices (taps) CPU ($) Main System (memory-) bus kbd Disk display ctrl NIC 17 CPU Three Buses Mem Proc-Mem CPU adapter Mem Backplane Processor-memory bus Connects CPU and memory, no direct interface + Short, few taps fast, high-bandwidth System specific bus Connects devices, no direct P-M interface Longer, more taps slower, lower-bandwidth + Industry standard Connect P-M bus to bus using adapter Backplane bus CPU, memory, connected to same bus + Industry standard, cheap (no adapters needed) Processor-memory performance compromised 18 Bus Design data lines address lines control lines Goals High Performance: low latency and high bandwidth Standardization: flexibility in dealing with many devices Low Cost Processor-memory bus emphasizes performance, then cost & backplane emphasize standardization, then performance Design issues 1. Width/multiplexing: are wires shared or separate? 2. Clocking: is bus clocked or not? 3. Switching: how/when is bus control acquired and released? 4. Arbitration: how do we decide who gets the bus next? (1) Bus Width and Multiplexing Wider + More bandwidth More expensive and more susceptible to skew Multiplexed: address and data share same lines + Cheaper Less bandwidth Burst transfers (bus parking) Multiple sequential data transactions for single address + Increase bandwidth at relatively little cost 19 20

(2) Bus Clocking Synchronous: clocked + Fast Bus must be short to minimize clock skew Asynchronous: un-clocked + Can be longer: no clock skew, deals with devices of different speeds Slower: requires hand-shaking protocol For example, asynchronous read Multiplexed data/address lines, 3 control lines 1. Processor drives address onto bus, asserts Request line 2. asserts Ack line, processor stops driving 3. drives data on bus, asserts DataReady line 4. Processor asserts Ack line, memory stops driving (3) Bus Switching Atomic: bus busy between request and reply + Simple Low utilization Split-transaction: requests/replies can be interleaved + Higher utilization higher throughput Complex, requires sending IDs to match replies to request P-M buses are synchronous and backplane buses asynchronous or slow-clock synchronous 21 22 (4) Bus Arbitration Bus master: component that can initiate a bus request Bus typically has several masters, including processor devices can also be masters (Why? See in a bit) Arbitration: choosing a master among multiple requests Try to implement priority and fairness (no device starves ) Daisy-chain: devices connect to bus in priority order High-priority devices intercept/deny requests by low-priority ones ± Simple, but slow and can t ensure fairness Centralized: special arbiter chip collects requests, decides ± Ensures fairness, but arbiter chip may itself become bottleneck Distributed: everyone sees all requests simultaneously Back off and retry if not the highest priority request ± No bottlenecks and fair, but needs a lot of control lines Standard Bus Examples PCI/PCIe SCSI USB Type Backplane Width 32 64 bits 8 32 bits 1 Multiplexed? Yes Yes Yes Clocking 33 (66) MHz 5 (10) MHz Asynchronous Data rate 133 (266) MB/s 10 (20) MB/s 0.2, 1.5, 60 MB/s Arbitration Distributed Distributed Daisy-chain Maximum masters 1024 7 31 127 Maximum length 0.5 m 2.5 m USB (universal serial bus) Popular for low/moderate bandwidth external peripherals + Packetized interface (like TCP), extremely flexible + Also supplies power to the peripheral 23 24