CS61C : Machine Structures

inst.eecs.berkeley.edu/~cs61c
CS61C : Machine Structures
Lecture #24: Input / Output, Networks I
2005-11-28
Lecturer PSOE, new dad Dan Garcia
www.cs.berkeley.edu/~ddgarcia

CPS today! There is one handout today at the front and back of the room!

The ultimate in I/O: Robots! There's a revolution going on in Japan to design the most useful, lifelike robot; the US is far behind! This one has a projector, WiFi, cellphone, speech recognition, and can speak Japanese. $85K
news.3yen.com/2005-11-26/japanese-robot-gets-kissie-kissie/

CS61C L24 Input/Output, Networks I (1)

Review
- Manage memory to disk? Treat as cache
  - Included protection as bonus, now critical
  - Use Page Table of mappings for each user vs. tag/data in cache
  - TLB is cache of Virtual → Physical address translations
- Virtual Memory allows protected sharing of memory between processes
- Spatial Locality means Working Set of Pages is all that must be in memory for process to run fairly well

Recall: 5 components of any Computer
- Processor (active) [Earlier Lectures]
  - Control ("brain")
  - Datapath ("brawn")
- Memory (passive): where programs, data live when running [Earlier Lectures]
- Devices [Current Lectures]
  - Input
  - Output
  - Examples: Keyboard, Mouse; Disk, Network; Display, Printer

Motivation for Input/Output
- I/O is how humans interact with computers
- I/O gives computers long-term memory
- I/O lets computers do amazing things:
  - Read pressure of synthetic hand and control synthetic arm and hand of fireman
  - Control propellers, fins, communicate in BOB (Breathable Observable Bubble)
- Computer without I/O like a car without wheels; great technology, but won't get you anywhere

I/O Device Examples and Speeds
I/O Speed: bytes transferred per second (from keyboard to Gigabit LAN: 12.5-million-to-1)

Device             Behavior  Partner  Data Rate (KBytes/s)
Keyboard           Input     Human              0.01
Mouse              Input     Human              0.02
Voice output       Output    Human              5.00
Floppy disk        Storage   Machine           50.00
Laser Printer      Output    Human            100.00
Magnetic Disk      Storage   Machine       10,000.00
Wireless Network   I or O    Machine       10,000.00
Graphics Display   Output    Human         30,000.00
Wired LAN Network  I or O    Machine      125,000.00

When discussing transfer rates, use 10^x (powers of ten, so 1 KByte/s = 10^3 bytes/s)
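
The 12.5-million-to-1 range can be checked directly against the table. This is a minimal sketch, assuming decimal units (1 KByte/s = 1,000 bytes/s) and using whole bytes per second to avoid rounding; the dictionary and function names are illustrative only.

```python
# Data rates from the table, converted to bytes/s (assumes 1 KB = 10**3 bytes).
RATES_BYTES_PER_S = {
    "Keyboard": 10,
    "Mouse": 20,
    "Voice output": 5_000,
    "Floppy disk": 50_000,
    "Laser Printer": 100_000,
    "Magnetic Disk": 10_000_000,
    "Wireless Network": 10_000_000,
    "Graphics Display": 30_000_000,
    "Wired LAN Network": 125_000_000,
}

def speed_ratio(fast, slow):
    """Return how many times faster `fast` is than `slow`."""
    return RATES_BYTES_PER_S[fast] // RATES_BYTES_PER_S[slow]

slowest = min(RATES_BYTES_PER_S, key=RATES_BYTES_PER_S.get)
fastest = max(RATES_BYTES_PER_S, key=RATES_BYTES_PER_S.get)
print(slowest, fastest, speed_ratio(fastest, slowest))
# prints: Keyboard Wired LAN Network 12500000
```

The 12.5-million figure corresponds to the keyboard row; measured from the mouse row the ratio is 6.25 million.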

What do we need to make I/O work?
- A way to connect many types of devices to the Proc-Mem
- A way to control these devices, respond to them, and transfer data
- A way to present them to user programs so they are useful
[Figure labels: Files, APIs, Operating System; Proc, Mem, PCI Bus, SCSI Bus; cmd reg., data reg.]

Instruction Set Architecture for I/O
- What must the processor do for I/O?
  - Input: reads a sequence of bytes
  - Output: writes a sequence of bytes
- Some processors have special input and output instructions
- Alternative model (used by MIPS):
  - Use loads for input, stores for output
  - Called "Memory Mapped Input/Output"
  - A portion of the address space dedicated to communication paths to Input or Output devices (no memory there)

Memory Mapped I/O
- Certain addresses are not regular memory
- Instead, they correspond to registers in I/O devices
[Figure: address space running from 0 to 0xFFFFFFFF; the region starting at 0xFFFF0000 holds a device's cntrl reg. and data reg.]
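
One way to picture the idea is a toy address decoder: the same load and store operations are used everywhere, and only the address decides whether regular memory or a device register responds. This is a sketch under assumptions; the `Bus` class, its fields, and the example addresses below 0xFFFF0000 are hypothetical, and only the 0xFFFF0000 boundary comes from the slide.

```python
IO_BASE = 0xFFFF0000  # start of the I/O region shown on the slide

class Bus:
    """Toy address decoder: hypothetical, not a model of any real chip."""

    def __init__(self):
        self.memory = {}       # regular memory: address -> word
        self.device_regs = {}  # device registers: address -> word

    def store(self, addr, value):
        # Same "store" operation; the address alone picks the target.
        target = self.device_regs if addr >= IO_BASE else self.memory
        target[addr] = value & 0xFFFFFFFF

    def load(self, addr):
        # Same "load" operation; unwritten locations read as 0 here.
        target = self.device_regs if addr >= IO_BASE else self.memory
        return target.get(addr, 0)

bus = Bus()
bus.store(0x10000000, 42)        # ordinary memory write
bus.store(0xFFFF000C, ord("A"))  # lands in a device register instead
```

A real device would react to the store (for example, by displaying the character); here the register simply records the value.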

Processor-I/O Speed Mismatch
- 1GHz microprocessor can execute 1 billion load or store instructions per second, or 4,000,000 KB/s data rate (4 bytes per load or store)
  - I/O devices data rates range from 0.01 KB/s to 125,000 KB/s
- Input: device may not be ready to send data as fast as the processor loads it
  - Also, might be waiting for human to act
- Output: device may not be ready to accept data as fast as processor stores it
- What to do?
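
The 4,000,000 KB/s figure follows from the clock rate. A minimal sketch, assuming one load or store per clock cycle, one 4-byte word per access, and 1 KB = 10^3 bytes; the constant and function names are illustrative.

```python
CLOCK_HZ = 1_000_000_000   # 1 GHz, from the slide
BYTES_PER_ACCESS = 4       # assumption: one 32-bit word per lw/sw
ACCESSES_PER_CYCLE = 1     # assumption: one load or store per clock

def peak_kb_per_s(clock_hz, bytes_per_access, accesses_per_cycle=1):
    """Peak data rate in KB/s, with 1 KB = 10**3 bytes."""
    return clock_hz * accesses_per_cycle * bytes_per_access // 1000

processor_rate = peak_kb_per_s(CLOCK_HZ, BYTES_PER_ACCESS, ACCESSES_PER_CYCLE)
fastest_device = 125_000   # KB/s, the fastest I/O rate quoted on the slide
print(processor_rate)                    # prints: 4000000
print(processor_rate // fastest_device)  # prints: 32
```

Under these assumptions even the fastest device quoted is 32 times slower than the processor's peak rate.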

Processor Checks Status before Acting
- Path to device generally has 2 registers:
  - Control Register, says it's OK to read/write (I/O ready) [think of a flagman on a road]
  - Data Register, contains data
- Processor reads from Control Register in loop, waiting for device to set Ready bit in Control reg (0 → 1) to say it's OK
- Processor then loads from (input) or writes to (output) data register
  - Load from or Store into Data Register resets Ready bit (1 → 0) of Control Register

SPIM I/O Simulation
- SPIM simulates 1 I/O device: memory-mapped terminal (keyboard + display)
  - Read from keyboard (receiver); 2 device regs
  - Writes to terminal (transmitter); 2 device regs

Register             Address     Fields (left to right)
Receiver Control     0xffff0000  Unused (00...00), (I.E.), Ready
Receiver Data        0xffff0004  Unused (00...00), Received Byte
Transmitter Control  0xffff0008  Unused (00...00), (I.E.), Ready
Transmitter Data     0xffff000c  Unused, Transmitted Byte

SPIM I/O
- Control register rightmost bit (0): Ready
  - Receiver: Ready==1 means character in Data Register has not yet been read; 1 → 0 when data is read from Data Reg
  - Transmitter: Ready==1 means transmitter is ready to accept a new character; 0 means transmitter is still busy writing last char
  - I.E. bit discussed later
- Data register rightmost byte has data
  - Receiver: last char from keyboard; rest = 0
  - Transmitter: when write rightmost byte, writes char to display

I/O Example
- Input: Read from keyboard into $v0

              lui   $t0, 0xffff           # $t0 = 0xffff0000 (device base)
    Waitloop: lw    $t1, 0($t0)           # read receiver control
              andi  $t1, $t1, 0x1         # isolate Ready bit
              beq   $t1, $zero, Waitloop  # not ready: keep polling
              lw    $v0, 4($t0)           # read receiver data

- Output: Write to display from $a0

              lui   $t0, 0xffff           # $t0 = 0xffff0000 (device base)
    Waitloop: lw    $t1, 8($t0)           # read transmitter control
              andi  $t1, $t1, 0x1         # isolate Ready bit
              beq   $t1, $zero, Waitloop  # not ready: keep polling
              sw    $a0, 12($t0)          # write transmitter data

- Processor waiting for I/O called "Polling"
- "Ready" bit is from processor's point of view!
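
The Ready-bit handshake behind the input loop can be simulated in a few lines. This is a sketch under assumptions, not SPIM itself: the `Receiver` class, `key_pressed`, and the `arrivals` schedule are hypothetical stand-ins for the keyboard device and the human typing.

```python
READY = 0x1  # rightmost bit of the control register

class Receiver:
    """Toy stand-in for the SPIM keyboard receiver (hypothetical class)."""

    def __init__(self):
        self.control = 0
        self._data = 0

    def key_pressed(self, ch):
        # Device side: latch the byte and raise Ready (0 -> 1).
        self._data = ord(ch) & 0xFF
        self.control |= READY

    def read_data(self):
        # Processor side: loading the data register clears Ready (1 -> 0).
        self.control &= ~READY
        return self._data

def poll_read(dev, arrivals):
    """Spin on Ready, like Waitloop; `arrivals` maps poll count -> key."""
    polls = 0
    while not (dev.control & READY):      # lw / andi / beq
        polls += 1
        if polls in arrivals:             # simulate the human typing
            dev.key_pressed(arrivals[polls])
    return chr(dev.read_data()), polls    # lw $v0, 4($t0)

kbd = Receiver()
char, wasted_polls = poll_read(kbd, {1000: "x"})
print(char, wasted_polls)  # prints: x 1000
```

The 1000 polls before the key arrives are pure waiting, which is the cost the following slides quantify. Note that `poll_read` never returns if `arrivals` is empty, just as the assembly loop spins until the device sets Ready.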

Administrivia
- Only 3 lectures to go (after this one)! :-(
- Project 4 (Cache simulator) due Friday
- Performance contest rules up today
  - Deadline is Mon, 2005-12-12 @ 11:59pm, two weeks from today
- HW4 and HW5 are done
  - Regrade requests are due by 2005-12-05
- Project 3 will be graded face-to-face, check web page for scheduling
- Final: 2005-12-17 @ 12:30pm in 2050 VLSB!

Upcoming Calendar
Week #14 (This week)
  Mon: I/O Basics & Networks I
  Wed: I/O Networks II & Disks
  Thu Lab: I/O Polling
  Sat: Cache project due yesterday
Week #15 (Last Week o' Classes)
  Mon: Performance
  Wed: LAST CLASS: Summary, Review, & HKN Evals
  Thu Lab: I/O Networking & 61C Feedback Survey
Week #16
  Sun: 2pm Review, 10 Evans
  Mon: Performance competition due tonight @ midnight
  Sat: FINAL EXAM SAT 12-17 @ 12:30pm-3:30pm, 2050 VLSB; Performance awards

Cost of Polling?
- Assume for a processor with a 1GHz clock it takes 400 clock cycles for a polling operation (call polling routine, accessing the device, and returning). Determine % of processor time for polling
  - Mouse: polled 30 times/sec so as not to miss user movement
  - Floppy disk: transfers data in 2-Byte units and has a data rate of 50 KB/second. No data transfer can be missed.
  - Hard disk: transfers data in 16-Byte chunks and can transfer at 16 MB/second. Again, no transfer can be missed.

% Processor time to poll [p. 677 in book]
Mouse Polling, Clocks/sec
  = 30 [polls/s] * 400 [clocks/poll] = 12K [clocks/s]
% Processor for polling:
  12*10^3 [clocks/s] / 1*10^9 [clocks/s] = 0.0012%
  ⇒ Polling mouse little impact on processor
Frequency of Polling Floppy
  = 50 [KB/s] / 2 [B/poll] = 25K [polls/s]
Floppy Polling, Clocks/sec
  = 25K [polls/s] * 400 [clocks/poll] = 10M [clocks/s]
% Processor for polling:
  10*10^6 [clocks/s] / 1*10^9 [clocks/s] = 1%
  ⇒ OK if not too many I/O devices

% Processor time to poll hard disk
Frequency of Polling Disk
  = 16 [MB/s] / 16 [B/poll] = 1M [polls/s]
Disk Polling, Clocks/sec
  = 1M [polls/s] * 400 [clocks/poll] = 400M [clocks/s]
% Processor for polling:
  400*10^6 [clocks/s] / 1*10^9 [clocks/s] = 40%
  ⇒ Unacceptable
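
The three polling-cost calculations share one formula, so they can be reproduced with a small helper. A minimal sketch using the slide's numbers (1 GHz clock, 400 cycles per poll, decimal KB and MB); the function names are illustrative.

```python
CLOCK_HZ = 10**9        # 1 GHz
CYCLES_PER_POLL = 400

def polling_percent(polls_per_s, cycles_per_poll=CYCLES_PER_POLL, clock_hz=CLOCK_HZ):
    """Percent of processor cycles spent polling."""
    return polls_per_s * cycles_per_poll * 100 / clock_hz

def polls_needed(bytes_per_s, bytes_per_poll):
    """Minimum poll rate so that no transfer unit is missed."""
    return bytes_per_s // bytes_per_poll

mouse = polling_percent(30)
floppy = polling_percent(polls_needed(50_000, 2))
disk = polling_percent(polls_needed(16_000_000, 16))
print(mouse, floppy, disk)  # prints: 0.0012 1.0 40.0
```

The cost scales linearly with the required poll rate, which is why the hard disk, at one million polls per second, is the case that breaks polling.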

What is the alternative to polling?
- Wasteful to have processor spend most of its time "spin-waiting" for I/O to be ready
- Would like an unplanned procedure call that would be invoked only when I/O device is ready
- Solution: use exception mechanism to help I/O. Interrupt program when I/O ready, return when done with data transfer

I/O Interrupt
- An I/O interrupt is like overflow exceptions except:
  - An I/O interrupt is "asynchronous"
  - More information needs to be conveyed
- An I/O interrupt is asynchronous with respect to instruction execution:
  - I/O interrupt is not associated with any instruction, but it can happen in the middle of any given instruction
  - I/O interrupt does not prevent any instruction from completion

Definitions for Clarification
- Exception: signal marking that something "out of the ordinary" has happened and needs to be handled
- Interrupt: asynchronous exception
- Trap: synchronous exception
- Note: Many systems folks say "interrupt" to mean what we mean when we say "exception".

Interrupt-Driven Data Transfer
(1) I/O interrupt
(2) save PC
(3) jump to interrupt service routine
(4) perform transfer
(5) return to user program
[Figure: Memory holds the user program (add, sub, and, or, ...) and the interrupt service routine (read, store, ..., jr)]
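
The control flow in the figure can be traced with a toy interpreter. This is a sketch under assumptions: instructions are just names, the interrupt is taken between instructions, and step (5) is taken to be the return to the user program; `run` and its parameters are hypothetical.

```python
def run(program, isr, interrupt_at):
    """Execute `program`; divert to `isr` once step `interrupt_at` finishes.

    Returns the trace of executed instruction names. Toy model, assuming
    the interrupt is taken between instructions.
    """
    trace = []
    pc = 0
    pending = False
    while pc < len(program):
        trace.append(program[pc])        # current instruction completes
        if pc == interrupt_at:
            pending = True               # (1) device raises I/O interrupt
        pc += 1
        if pending:
            saved_pc = pc                # (2) save PC
            trace.extend(isr)            # (3) jump to ISR, (4) do transfer
            pc = saved_pc                # (5) return to user program
            pending = False
    return trace

user_program = ["add", "sub", "and", "or"]
service_routine = ["read", "store", "jr"]
print(run(user_program, service_routine, interrupt_at=1))
# prints: ['add', 'sub', 'read', 'store', 'jr', 'and', 'or']
```

The user program's instructions all still execute in order; the service routine is simply spliced in at the point where the device became ready.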

SPIM I/O Simulation: Interrupt Driven I/O
- I.E. stands for Interrupt Enable
- Set Interrupt Enable bit to 1 to have interrupt occur whenever Ready bit is set

Register             Address     Fields (left to right)
Receiver Control     0xffff0000  Unused (00...00), (I.E.), Ready
Receiver Data        0xffff0004  Unused (00...00), Received Byte
Transmitter Control  0xffff0008  Unused (00...00), (I.E.), Ready
Transmitter Data     0xffff000c  Unused, Transmitted Byte

Benefit of Interrupt-Driven I/O
- Find the % of processor consumed if the hard disk is only active 5% of the time. Assuming 500 clock cycle overhead for each transfer, including interrupt:
  - Disk Interrupts/s = 16 MB/s / 16 B/interrupt = 1M interrupts/s
  - Disk Interrupts, clocks/s = 1M interrupts/s * 500 clocks/interrupt = 500,000,000 clocks/s
  - % Processor during transfer: 500*10^6 / 1*10^9 = 50%
- Disk active 5% ⇒ 5% * 50% ⇒ 2.5% busy
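
The comparison with polling can be made explicit. A minimal sketch using the slide's numbers (1 GHz clock, 500 cycles per interrupt-driven transfer, disk active 5% of the time) and the 400-cycle polling cost from the earlier polling example; variable names are illustrative.

```python
CLOCK_HZ = 10**9            # 1 GHz
CYCLES_PER_INTERRUPT = 500  # overhead per transfer, including the interrupt
CYCLES_PER_POLL = 400       # polling cost from the earlier polling example
ACTIVE_PERCENT = 5          # disk is active 5% of the time

transfers_per_s = 16_000_000 // 16   # 16 MB/s in 16-byte chunks

# Processor load while the disk is actually transferring.
busy_during_transfer = transfers_per_s * CYCLES_PER_INTERRUPT * 100 / CLOCK_HZ
# Averaged over time: interrupts only cost cycles when the disk is active.
interrupt_cost = busy_during_transfer * ACTIVE_PERCENT / 100
# Polling must run at full rate all the time so no transfer is missed.
polling_cost = transfers_per_s * CYCLES_PER_POLL * 100 / CLOCK_HZ

print(busy_during_transfer, interrupt_cost, polling_cost)
# prints: 50.0 2.5 40.0
```

Each interrupt-driven transfer costs more cycles than a poll (500 vs. 400), but the processor pays only while the disk is active, so the average drops from 40% to 2.5%.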

Peer Instruction
A. A faster CPU will result in faster I/O.
B. Hardware designers handle mouse input with interrupts since it is better than polling in almost all cases.
C. Low-level I/O is actually quite simple, as it's really only reading and writing bytes.

Answer choices (ABC): 1: FFF  2: FFT  3: FTF  4: FTT  5: TFF  6: TFT  7: TTF  8: TTT

Why Networks?
- Originally sharing I/O devices between computers (e.g., printers)
- Then communicating between computers (e.g., file transfer protocol)
- Then communicating between people (e.g., email)
- Then communicating between networks of computers ⇒ File sharing, WWW, ...

How Big is the Network (2005)?
~30            Computers in 273 Soda
~525           in inst.cs.berkeley.edu
~6,400 (1999)  in eecs&cs.berkeley.edu
~50,000        in berkeley.edu (100,000+?)
~9,000,000     in .edu
~217,000,000   in US (.net .com .edu .arpa .us .mil .org .gov)
~318,000,000   in the world
Source: Internet Software Consortium: www.isc.org

Growth Rate
Ethernet Bandwidth
1983  3 Mb/s
1990  10 Mb/s
1997  100 Mb/s
1999  1000 Mb/s
2006  10 Gig E (to come!)
en.wikipedia.org/wiki/10_gigabit_ethernet

Buses in a PC: connect a few devices (2002)
- Bus: shared medium of communication that can connect to many devices. Hierarchy!
- Memory bus: connects CPU and Memory
- PCI: Internal (Backplane) I/O bus
- SCSI: External I/O bus (1 to 15 disks)
- Ethernet: Local Area Network
[Figure labels: CPU, Memory bus, Memory, PCI Interface, Ethernet Interface, SCSI Interface]

Data rates (P4)
- Memory: 400 MHz, 8 bytes ⇒ 3.2 GB/s (peak)
- PCI: 100 MHz, 8 bytes wide ⇒ 0.8 GB/s (peak)
- SCSI: "Ultra4" (160 MHz), "Wide" (2 bytes) ⇒ 0.3 GB/s (peak)
- Gigabit Ethernet ⇒ 0.125 GB/s (peak)
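
The peak figures above are clock rate times bus width. A minimal sketch, assuming one full-width transfer per clock and decimal units (1 GB = 10^9 bytes); the helper and dictionary names are illustrative.

```python
def peak_bytes_per_s(clock_hz, width_bytes):
    """Peak bandwidth assuming one transfer of `width_bytes` per clock."""
    return clock_hz * width_bytes

MHZ = 10**6
buses = {
    "Memory": peak_bytes_per_s(400 * MHZ, 8),
    "PCI": peak_bytes_per_s(100 * MHZ, 8),
    "SCSI Ultra4 Wide": peak_bytes_per_s(160 * MHZ, 2),
    "Gigabit Ethernet": 10**9 // 8,   # 1 Gbit/s expressed in bytes/s
}
for name, rate in buses.items():
    print(f"{name}: {rate / 10**9} GB/s")
```

The SCSI result comes out as 0.32 GB/s, which the slide rounds to 0.3 GB/s. These are peak numbers; sustained rates on a shared bus are lower.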

Shared vs. Switched Based Networks
- Shared Media vs. Switched: in switched, pairs ("point-to-point" connections) communicate at same time; in shared, 1 at a time
- Aggregate bandwidth (BW) in switched network is many times that of shared:
  - point-to-point faster since no arbitration, simpler interface
[Figure: Nodes attached to a Shared medium vs. Nodes attached to a Crossbar Switch]
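
A back-of-the-envelope model shows why aggregate bandwidth grows with a switch. This is a sketch under strong assumptions: an ideal non-blocking crossbar, each node in at most one conversation at a time, and no arbitration overhead on the shared medium. The link speed and function names are hypothetical.

```python
def shared_aggregate(link_bw, nodes):
    """Shared medium: only one sender at a time, so total BW is one link."""
    return link_bw if nodes >= 2 else 0

def switched_aggregate(link_bw, nodes):
    """Crossbar: assume every disjoint pair of nodes can talk at once."""
    return link_bw * (nodes // 2)

LINK_MBPS = 100   # hypothetical link speed
for n in (2, 8, 64):
    print(n, shared_aggregate(LINK_MBPS, n), switched_aggregate(LINK_MBPS, n))
```

In this model the shared network stays at one link's worth of bandwidth no matter how many nodes are added, while the switched network scales with the number of node pairs.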

What makes networks work?
- links connecting switches to each other and to computers or devices
- ability to name the components and to route packets of information (messages) from a source to a destination
- Layering, protocols, and encapsulation as means of abstraction (61C big idea)
[Figure labels: Computer, network interface, switch, switch, switch]

Typical Types of Networks
- Local Area Network (LAN): Ethernet
  - Inside a building: Up to 1 km
  - (peak) Data Rate: 10 Mbits/sec, 100 Mbits/sec, 1000 Mbits/sec (1.25, 12.5, 125 MBytes/s)
  - Run, installed by network administrators
- Wide Area Network (WAN)
  - Across a continent (10 km to 10000 km)
  - (peak) Data Rate: 1.5 Mb/s to 10000 Mb/s
  - Run, installed by telecommunications companies (Sprint, UUNet[MCI], AT&T)
- Wireless Networks (LAN), ...

Example: Network Media
Twisted Pair ("Cat 5"):
- Copper, 1mm thick, twisted to avoid antenna effect
Fiber Optics:
- Transmitter is L.E.D. or Laser Diode light source
- Receiver is Photodiode
- Total internal reflection
- Light: 3 parts are cable, light source, light detector
- Silica: glass or plastic; actually < 1/10 diameter of copper
[Figure labels: Air, Buffer, Cladding]

The Sprint U.S. Topology (2001)

And in conclusion
- I/O gives computers their 5 senses
- I/O speed range is 12.5-million to one
- Processor speed means must synchronize with I/O devices before use
- Polling works, but expensive
  - processor repeatedly queries devices
- Interrupts work, more complex
  - devices cause exception, OS runs and deals with the device
- I/O control leads to Operating Systems
- Integrated circuit ("Moore's Law") revolutionizing network switches as well as processors
  - Switch just a specialized computer
- Trend from shared to switched networks to get faster links and scalable bandwidth