CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 36: IO Basics. Instructor: Dan Garcia h<p://inst.eecs.berkeley.

Similar documents
Review. Motivation for Input/Output. What do we need to make I/O work?

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 36: IO Basics

61C In the News. Review. Memory Management Today. Impact of Paging on AMAT

CS61C : Machine Structures

CS 61C: Great Ideas in Computer Architecture. Input/Output

CS61C : Machine Structures

Review. Manage memory to disk? Treat as cache. Lecture #26 Virtual Memory II & I/O Intro

CS 61C: Great Ideas in Computer Architecture Lecture 22: Opera&ng Systems, Interrupts, and Virtual Memory Part 1

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 38: IO and Networking

ECE331: Hardware Organization and Design

Raspberry Pi ($40 on Amazon)

Outline. A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Opera&ng Systems, Interrupts, Virtual Memory

ECE232: Hardware Organization and Design

The Processor: Improving the performance - Control Hazards

Full Datapath. Chapter 4 The Processor 2

LECTURE 3: THE PROCESSOR

CS61C : Machine Structures

EECS150 - Digital Design Lecture 9 Project Introduction (I), Serial I/O. Announcements

CSEE 3827: Fundamentals of Computer Systems

Raspberry Pi ($40 on Amazon)

11/13/17. CS61C so far. Adding I/O C Programs. So How is a Laptop Any Different? It s a Real Computer! Raspberry Pi (Less than $40 on Amazon in 2017)

EECS150 - Digital Design Lecture 08 - Project Introduction Part 1

CS 61C: Great Ideas in Computer Architecture. Lecture 22: Operating Systems. Krste Asanović & Randy H. Katz

There are different characteristics for exceptions. They are as follows:

COMPUTER ORGANIZATION AND DESI

COMPUTER ORGANIZATION AND DESIGN

LECTURE 10. Pipelining: Advanced ILP

UCB CS61C : Machine Structures

Motivation for Input/Output

Systems Architecture II

Computer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM

Exceptions and Interrupts

Determined by ISA and compiler. We will examine two MIPS implementations. A simplified version A more realistic pipelined version

Chapter 4. The Processor

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri

Chapter 4. The Processor

Traps, Exceptions, System Calls, & Privileged Mode!

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 (2) Lecture notes from MKP, H. H. Lee and S.

CPS104 Computer Organization and Programming Lecture 17: Interrupts and Exceptions. Interrupts Exceptions and Traps. Visualizing an Interrupt

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

ICS 233 COMPUTER ARCHITECTURE. MIPS Processor Design Multicycle Implementation

Input / Output. School of Computer Science G51CSA

Chapter 4 The Processor (Part 4)

Faculty of Science FINAL EXAMINATION

CS61C : Machine Structures

EIE/ENE 334 Microprocessors

Homework 5. Start date: March 24 Due date: 11:59PM on April 10, Monday night. CSCI 402: Computer Architectures

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 27: Single- Cycle CPU Datapath Design

ECE 250 / CPS 250 Computer Architecture. Processor Design Datapath and Control

5.b Principles of I/O Hardware CPU-I/O communication

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12

CS3350B Computer Architecture Winter 2015

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

EE 457 Unit 8. Exceptions What Happens When Things Go Wrong

Computer Architecture CS 355 Busses & I/O System

Thomas Polzer Institut für Technische Informatik

ECE 486/586. Computer Architecture. Lecture # 12

EN1640: Design of Computing Systems Topic 07: I/O

Suggested Readings! Recap: Pipelining improves throughput! Processor comparison! Lecture 17" Short Pipelining Review! ! Readings!

What are Exceptions? EE 457 Unit 8. Exception Processing. Exception Examples 1. Exceptions What Happens When Things Go Wrong

Anne Bracy CS 3410 Computer Science Cornell University

Microprogramming. Microprogramming

Module 6: INPUT - OUTPUT (I/O)

CS 134. Operating Systems. April 8, 2013 Lecture 20. Input/Output. Instructor: Neil Rhodes. Monday, April 7, 14

COMP I/O, interrupts, exceptions April 3, 2016

Chapter 4. The Processor

Lecture 9 Pipeline and Cache

2 GHz = 500 picosec frequency. Vars declared outside of main() are in static. 2 # oset bits = block size Put starting arrow in FSM diagrams

Lecture 11: Interrupt and Exception. James C. Hoe Department of ECE Carnegie Mellon University

CS 537 Lecture 2 Computer Architecture and Operating Systems. OS Tasks

Computer System Overview

Computer System Overview OPERATING SYSTEM TOP-LEVEL COMPONENTS. Simplified view: Operating Systems. Slide 1. Slide /S2. Slide 2.

RISC Architecture: Multi-Cycle Implementation

Digital Design Laboratory Lecture 6 I/O

CS510 Operating System Foundations. Jonathan Walpole

Pipelining. CSC Friday, November 6, 2015

UC Berkeley CS61C : Machine Structures

UC Berkeley CS61C : Machine Structures

Computer Architecture Review CS 595

CSE 141 Computer Architecture Spring Lectures 11 Exceptions and Introduction to Pipelining. Announcements

Computer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM

Clever Signed Adder/Subtractor. Five Components of a Computer. The CPU. Stages of the Datapath (1/5) Stages of the Datapath : Overview

Unresolved data hazards. CS2504, Spring'2007 Dimitris Nikolopoulos

EE108B Lecture 17 I/O Buses and Interfacing to CPU. Christos Kozyrakis Stanford University

The control of I/O devices is a major concern for OS designers

POLITECNICO DI MILANO. Exception handling. Donatella Sciuto:

Lecture 10: Simple Data Path

Chapter 4. The Processor

Computer Architecture 2/26/01 Lecture #

Chapter 6. Storage & Other I/O

ECEN 651: Microprogrammed Control of Digital Systems Department of Electrical and Computer Engineering Texas A&M University

UC Berkeley CS61C : Machine Structures

1 Hazards COMP2611 Fall 2015 Pipelined Processor

Computer Organization and Structure

Lecture 13: Bus and I/O. James C. Hoe Department of ECE Carnegie Mellon University

Computer System Architecture Midterm Examination Spring 2002

Chapter 5 - Input / Output

I/O Management Intro. Chapter 5

COSC121: Computer Systems. Exceptions and Interrupts

Transcription:

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 36: IO Basics Instructor: Dan Garcia h<p://inst.eecs.berkeley.edu/~cs61c/

Recall : 5 components of any Computer Earlier Lectures Current Lectures Computer Processor (acgve) Control ( brain ) Datapath ( brawn ) Memory (passive) (where programs, data live when running) Devices Input Output Keyboard, Mouse Disk, Network Display, Printer

MoGvaGon for Input/Output I/O is how humans interact with computers I/O gives computers long- term memory. I/O lets computers do amazing things: MIT Media Lab Sixth Sense h<p://youtu.be/zfv4r4x2sk0 Computer without I/O like a car w/no wheels; great technology, but gets you nowhere

I/O Device Examples and Speeds I/O Speed: bytes transferred per second (from mouse to Gigabit LAN: 7 orders of magnitude!) Device Behavior Partner Data Rate (KBytes/s) Keyboard Input Human 0.01 Mouse Input Human 0.02 Voice output Output Human 5.00 Floppy disk Storage Machine 50.00 Laser Printer Output Human 100.00 MagneGc Disk Storage Machine 10,000.00 Wireless Network I or O Machine 10,000.00 Graphics Display Output Human 30,000.00 Wired LAN Network I or O Machine 125,000.00 When discussing transfer rates, use 10 x

What do we need to make I/O work? A way to connect many types of devices A way to control these devices, respond to them, and transfer data A way to present them to user programs so they are useful SCSI Bus Files APIs OperaCng System Proc PCI Bus Mem cmd reg. data reg.

InstrucGon Set Architecture for I/O What must the processor do for I/O? Input: reads a sequence of bytes Output: writes a sequence of bytes Some processors have special input and output instrucgons AlternaGve model (used by MIPS): Use loads for input, stores for output (in small pieces) Called Memory Mapped Input/Output A porgon of the address space dedicated to communicagon paths to Input or Output devices (no memory there)

Memory Mapped I/O Certain addresses are not regular memory Instead, they correspond to registers in I/O devices address 0xFFFFFFFF 0xFFFF0000 cntrl reg. data reg. 0

Processor- I/O Speed Mismatch 1GHz microprocessor can execute 1 billion load or store instrucgons per second, or 4,000,000 KB/s data rate I/O devices data rates range from 0.01 KB/s to 125,000 KB/s Input: device may not be ready to send data as fast as the processor loads it Also, might be waigng for human to act Output: device not be ready to accept data as fast as processor stores it What to do?

Processor Checks Status before AcGng Path to a device generally has 2 registers: Control Register, says it s OK to read/write (I/O ready) [think of a flagman on a road] Data Register, contains data Processor reads from Control Register in loop, waigng for device to set Ready bit in Control reg (0 1) to say its OK Processor then loads from (input) or writes to (output) data register Load from or Store into Data Register resets Ready bit (1 0) of Control Register This is called Polling

I/O Example (polling) Input: Read from keyboard into $v0 lui $t0, 0xffff #ffff0000 Waitloop: lw $t1, 0($t0) #control andi $t1,$t1,0x1 beq $t1,$zero, Waitloop lw $v0, 4($t0) #data Output: Write to display from $a0 lui $t0, 0xffff #ffff0000 Waitloop: lw $t1, 8($t0) #control andi $t1,$t1,0x1 beq $t1,$zero, Waitloop sw $a0, 12($t0) #data Ready bit is from processor s point of view!

Cost of Polling a Mouse? Assume for a processor with a 1GHz clock it takes 400 clock cycles for a polling operagon (call polling rougne, accessing the device, and returning). Determine % of processor Gme for polling Mouse: polled 30 Gmes/sec so as not to miss user movement Mouse Polling [clocks/sec] = 30 [polls/s] * 400 [clocks/poll] = 12K [clocks/s] % Processor for polling: 12*10 3 [clocks/s] / 1*10 9 [clocks/s] = 0.0012% Polling mouse li<le impact on processor

% Processor Gme to poll hard disk Hard disk: transfers data in 16- Byte chunks and can transfer at 16 MB/second. No transfer can be missed. (we ll come up with a be<er way to do this) Frequency of Polling Disk = 16 [MB/s] / 16 [B/poll] = 1M [polls/s] Disk Polling, Clocks/sec = 1M [polls/s] * 400 [clocks/poll] = 400M [clocks/s] % Processor for polling: 400*10 6 [clocks/s] / 1*10 9 [clocks/s] = 40% Unacceptable (Polling is only part of the problem main problem is that accessing in small chunks is inefficient)

What is the alternagve to polling? Wasteful to have processor spend most of its Gme spin- waigng for I/O to be ready Would like an unplanned procedure call that would be invoked only when I/O device is ready SoluGon: use excepgon mechanism to help I/O. Interrupt program when I/O ready, return when done with data transfer

ExcepGons and Interrupts Unexpected events requiring change in flow of control Different ISAs use the terms differently ExcepGon Arises within the CPU e.g., Undefined opcode, overflow, syscall, TLB Miss, Interrupt From an external I/O controller Dealing with them without sacrificing performance is difficult 4.9 ExcepGons 15

Handling ExcepGons In MIPS, excepgons managed by a System Control Coprocessor (CP0) Save PC of offending (or interrupted) instrucgon In MIPS: save in special register called Excep8on Program Counter (EPC) Save indicagon of the problem In MIPS: saved in special register called Cause register We ll assume 1- bit 0 for undefined opcode, 1 for overflow Jump to excepgon handler code at address 8000 0180 hex 16

ExcepGon ProperGes Restartable excepgons Pipeline can flush the instrucgon Handler executes, then returns to the instrucgon Refetched and executed from scratch PC saved in EPC register IdenGfies causing instrucgon Actually PC + 4 is saved because of pipelined implementagon Handler must adjust PC to get right address 17

Handler AcGons Read Cause register, and transfer to relevant handler Determine acgon required If restartable excepgon Take correcgve acgon use EPC to return to program Otherwise Terminate program Report error using EPC, cause, 18

ExcepGons in a Pipeline Another kind of control hazard Consider overflow on add in EX stage add $1, $2, $1 Prevent $1 from being clobbered Complete previous instrucgons Flush add and subsequent instrucgons Set Cause and EPC register values Transfer control to handler Similar to mispredicted branch Use much of the same hardware 19

I n s t r. and or add ExcepGon Example Time (clock cycles) ALU I$ Reg D$ Reg ALU I$ Reg D$ Reg ALU I$ Reg D$ Reg O r d e r slt lw lui I$ ALU Reg D$ Reg ALU I$ Reg D$ Reg ALU I$ Reg D$ Reg 20

ExcepGon Example Time (clock cycles) I n s t r. and or Flush add, slt, lw O r d e r (bubble) (bubble) (bubble) sw ALU I$ Reg D$ Reg ALU I$ Reg D$ Reg I$ 1 st instruc8on of handler ALU I$ Reg D$ Reg ALU Save PC+4 into EPC Reg D$ Reg ALU I$ Reg D$ Reg ALU I$ Reg D$ Reg 21

I/O Interrupt An I/O interrupt is like an excepgon except: An I/O interrupt is asynchronous More informagon needs to be conveyed An I/O interrupt is asynchronous with respect to instrucgon execugon: I/O interrupt is not associated with any instrucgon, but it can happen in the middle of any given instrucgon I/O interrupt does not prevent any instrucgon from complegon 22

Interrupt- Driven Data Transfer (1) I/O interrupt (2) save PC Memory add sub and or user program (3) jump to interrupt service routine (5) (4) perform transfer read store... jr interrupt service routine 23

Benefit of Interrupt- Driven I/O Find the % of processor consumed if the hard disk is only acgve 5% of the Gme. Assuming 500 clock cycle overhead for each transfer, including interrupt: Disk Interrupts/s = 5% * 16 [MB/s] / 16 [B/interrupt] = 50,000 [interrupts/s] Disk Interrupts [clocks/s] = 50,000 [interrupts/s] * 500 [clocks/interrupt] = 25,000,000 [clocks/s] % Processor for during transfer: 2.5*10 7 [clocks/s] / 1*10 9 [clocks/s] = 2.5% Busy DMA (Direct Memory Access) even be<er only one interrupt for an engre page! 24

And in conclusion I/O gives computers their 5 senses + long term memory I/O speed range is 7 Orders of Magnitude (or more!) Processor speed means must synchronize with I/O devices before use Polling works, but expensive processor repeatedly queries devices Interrupts work, more complex we ll talk about these next I/O control leads to OperaGng Systems