ECE 485/585 Microprocessor System Design


Microprocessor System Design
Lecture 17: Serial Buses, USB, Disks and Other I/O
Zeshan Chishti
Electrical and Computer Engineering Dept.
Maseeh College of Engineering and Computer Science
Source: Lecture based on materials provided by Mark F.

Bus Trends
- Parallel to serial
- Lower voltage
  - Can't accomplish large voltage swings at high speed
  - Power consumption
- Differential signaling
- Clock forwarding and clock embedding
- Encoding
  - Line codes: 8b/10b

Why Serial?
- Parallel: Device A and Device B connected by 10 bidirectional wires at 250 Mbps each
- Serial: Device A and Device B connected by a pair of unidirectional (+/-) wires at 2500 Mbps (2.5 Gbps)

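Traditional Parallel Bus
- Used in the low to medium (100 MHz) range
- Issues
  - Board trace length mismatch affects skew at a device
  - Clock skew across devices
  - Faster data rate squeezes the eye
[Figure: Device A and Device B on a shared bus; timing diagram showing CLK, setup time t_SU, hold time t_H, and the data eye]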

Source Synchronous Bus
- Used in the 200 MHz to 1.6 GHz range
- Clock signal is forwarded with data
- Design impact:
  - Board layout track length mismatch still adds to skew
  - Eliminates skew error term caused by clock domain skew
  - Allows faster cycle times than a traditional parallel bus

Source Synchronous: Clock Forward
[Figure: forwarded clock alongside Data 1 through Data 4]
- Clock is transmitted continuously from Tx to Rx
- Removes clock distribution skew
- Examples: HyperTransport, Parallel RapidIO

Embedded Clock
- Clock signal is embedded with data
- Edge density guaranteed by encoding scheme
  - Edge density? Well, you have to have enough transitions to recover the clock
- Examples: PCI Express, USB, Serial RapidIO, InfiniBand
[Figure: clock signal embedded with data]

SerDes (Serializer/Deserializer)
- Serialize (Tx): 10 x 250 Mbps low-speed parallel data becomes 1 x 2.5 Gbps high-speed serial data
- Deserialize (Rx): the serial stream is converted back to parallel data

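SerDes: Parallel and Serial Conversion
[Figure: transmit path with a 10-bit parallel interface feeding a serializer (SysClk, XmitClk) that drives TxP/TxN; receive path with RxP/RxN feeding clock recovery and a data deserializer that produces the recovered clock and a 10-bit parallel interface]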

Differential Signaling
- Differential, point to point
- Complementary signals transmitted
- Receiver detects voltage difference between lines
- Low amplitudes (200 mV to 400 mV typical), high speeds
- Good noise immunity
  - Pair routed together, so noise cancels out
- Examples
  - LVDS (Low Voltage Differential Signaling): 3.125 Gbps, +/- 350 mV
  - Gbps data rates at mW power levels: high speed and low power consumption
  - FibreChannel, Gigabit Ethernet, HDMI, DVI

Clock Recovery for Embedded Clock
- Problem: when the clock signal is embedded with data, there must be enough transitions to recover the clock from the data (high edge density)
- Solution: line codes
  - Use codes which ensure that there are always enough transitions (0->1 and 1->0) in a short amount of time, so that the clock signal can be easily recovered from the embedded clock/data signal
- Other advantage: avoid DC bias
[Figure: clock signal embedded with data]

Ensuring Edge Density: m-of-n codes
- Some 8-bit code words have too few 1s (or 0s) to ensure edge density sufficient to recover the clock
- 8b/10b encoding (developed by IBM in 1983)
  - Use a subset of 10-bit code words having a balanced number of 0s and 1s
  - Each 8-bit byte maps to a 10-bit code by table lookup
- Benefits
  - Ensure edge density
  - Avoid DC bias at receiver from imbalance
[Figure: Device A and Device B linked in both directions; each direction runs parallel data, TX FIFO, 8B/10B encoder, serializer, differential pair, deserializer, 8B/10B decoder, RX FIFO, parallel data]
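To make "edge density" and "DC bias" concrete, the sketch below counts transitions, the longest run of identical bits, and the 1s-minus-0s imbalance for a raw byte and for a 10-bit code word. The helper names are illustrative, and the code word used for the all-zero byte is the negative-running-disparity entry of the standard 8b/10b table, quoted here as an assumption rather than taken from the slide.

```python
def transitions(bits: str) -> int:
    """Count 0->1 and 1->0 edges in a bit string."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)


def longest_run(bits: str) -> int:
    """Length of the longest run of identical bits."""
    best = run = 1
    for a, b in zip(bits, bits[1:]):
        run = run + 1 if a == b else 1
        best = max(best, run)
    return best


def disparity(bits: str) -> int:
    """Number of 1s minus number of 0s (0 means DC balanced)."""
    return bits.count("1") - bits.count("0")


raw = "00000000"      # 8-bit byte with no edges at all
coded = "1001110100"  # assumed 8b/10b code word for that byte (RD- entry)

for name, word in (("raw", raw), ("coded", coded)):
    print(name, transitions(word), longest_run(word), disparity(word))
```

The raw byte gives the receiver nothing to lock onto, while the coded word has several edges and equal numbers of 1s and 0s.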

8b/10b Transmission Code
http://en.wikipedia.org/wiki/8b/10b_encoding

Name   8-bit binary (HGF EDCBA)   10-bit code (abcdei fghj)
D0.0   000 00000                  100111 0100
D1.0   000 00001                  011101 0100
D2.0   000 00010                  101101 0100
D3.0   000 00011                  110001 1011

Reasons for using 8B/10B encoding/decoding
- Guarantees transition density to ensure correct PLL operation
- Error detection for signaling errors
- Ensures signal is DC balanced: no DC offset develops over time
- Support of special characters that can be used as delimiters for control, such as sync or framing, or other generalized commands

USB (Universal Serial Bus)

USB: Universal Serial Bus
- Need for new peripheral bus
  - Inexpensive
  - Capable of supporting wide range of devices
  - Simple (no IRQ contention, etc.)
  - Hot-pluggable (plug and play)
- USB introduced 1995
  - Compaq, Digital, IBM, Intel, Microsoft, NEC, Northern Telecom; later 25 companies
- USB 2.0
  - Finalized April 2000
  - Fully backward compatible: implications!

Version   Mode              Rate
USB 1.X   LS (Low Speed)    1.5 Mbps
USB 1.X   FS (Full Speed)   12 Mbps
USB 2.0   HS (High Speed)   480 Mbps

- Low speed devices: keyboards, mice
- Higher speed devices: scanners, disks, CD/RW, flash storage

USB Features
- No arbitration! Everything controlled by host (master)
- Point-to-point with host as one point
  - No device-to-device communication
- USB devices don't use I/O or memory address space
  - Memory used for buffers, descriptors
  - Devices have addresses, but not from I/O or memory space
- Error detection and correction
  - CRC (16-bit for data packets, 5-bit for others)
  - ACK
  - Resend bad packets

USB
- Flexible peripheral interface
  - Tree topology
    - Point-to-point links
    - Hubs (7 peripherals/hub)
    - 127 devices maximum
  - 5 m maximum cable length
- Serial interface
- Low power devices can be powered through cable
  - 100 mA
  - 500 mA if permitted by host
- Isochronous
  - Device can request guaranteed bandwidth
  - Mission critical apps: audio, streaming video

Tree Topology
- Host Controller
  - Generates transactions
- Hubs
  - Control power to their ports
  - Enable/disable ports
  - Recognize devices attached to ports
  - Report status events associated with ports
- Devices
  - Respond when requested by host

Layers and Logical Communication

Endpoints
- Drivers communicate with device endpoints
- All devices must support endpoint 0 (control)
- Except for endpoint 0, all endpoints are unidirectional
- Devices can implement multiple endpoints
- Example: CD-ROM
  - Endpoint for control
  - Endpoint for writing data
  - Endpoint for reading data
  - Endpoint for reading audio (isochronous)

USB Device Descriptors
Each device attached to the USB describes itself using a hierarchy of descriptors:
- Device descriptor (1 per device): specifies number of supported configurations, default communication pipe, general info
- Configuration descriptor (1 per configuration): specifies info on the configuration and its number of interfaces
- Interface descriptor (1 per interface): device class, number of endpoints, etc.
- Endpoint descriptor (1 per endpoint): each descriptor defines a point of communication, incl. max transfer rate, transfer type, etc.
- String descriptor: human-readable information about device, configuration, interface (Unicode characters)
- Class-specific descriptor: product-specific info or requirements not covered by the USB standard

USB Enumeration
USB devices are enumerated at power up and when first attached. Enumeration is carried out by the host software and the root hub and consists of (roughly) the following steps:
1. Host configures root hub and requests that power be applied to all ports
2. Root hub enables power to each port and detects presence of devices and downstream hubs
3. Root hub resets all of the devices it has detected; this forces every USB device to respond as address 0 so the root hub can read each device's descriptor
4. Root hub assigns a unique device address to each USB device
5. Host software probes each device to determine if the endpoints can be accommodated with the remaining bandwidth and power
   - If insufficient power or bandwidth is available, that device configuration is not allowed on the USB
   - There may be other configurations offered by the USB device
   - If no configurations work, the device is not allowed on the USB
6. If an acceptable configuration is found, the device is fully enabled in that configuration and the client software is notified that the device is installed
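The configuration check at the end of enumeration can be sketched as a simple budget test. Everything below is hypothetical: the `Configuration` record, the percentage-based bandwidth figure, and the first-fit policy are illustrative assumptions, not the behavior of any particular host stack.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Configuration:
    name: str
    max_power_ma: int     # current the device draws in this configuration
    bandwidth_pct: float  # hypothetical share of frame time its endpoints need


def pick_configuration(configs: Sequence[Configuration],
                       power_left_ma: int,
                       bandwidth_left_pct: float) -> Optional[Configuration]:
    """Return the first configuration that fits the remaining budget, else None."""
    for cfg in configs:
        if (cfg.max_power_ma <= power_left_ma
                and cfg.bandwidth_pct <= bandwidth_left_pct):
            return cfg
    return None  # no configuration works: device is not allowed on the bus


offered = [
    Configuration("full-feature", max_power_ma=500, bandwidth_pct=40.0),
    Configuration("low-power", max_power_ma=100, bandwidth_pct=10.0),
]

chosen = pick_configuration(offered, power_left_ma=100, bandwidth_left_pct=50.0)
print(chosen.name if chosen else "rejected")
```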

USB Transactions
A typical USB transaction comprises 3 phases:
- Token packet phase: identifies type of transaction, device address, and whether there are additional phases
- Data packet phase: carries payload (<= 1023 bytes)
- Handshake packet phase: status of transaction (e.g., errors?)
All transfers except isochronous guarantee data delivery.
Packet format:
- Synchronization data (seven 0s followed by a single 1)
- Packet ID: purpose and content of packet (Token, Data, Handshake or Special, plus inverted ID for check)
- CRC and end-of-packet indicator
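The "inverted ID for check" can be sketched as follows: the 4-bit packet ID occupies the low nibble of the PID byte and its bitwise complement occupies the high nibble, so a receiver can reject a corrupted PID without a CRC. The PID values listed are the ones defined by the USB 2.0 specification as best recalled; the function names are illustrative.

```python
# 4-bit packet IDs (assumed from the USB 2.0 specification).
PIDS = {
    "OUT": 0b0001, "IN": 0b1001, "SOF": 0b0101, "SETUP": 0b1101,  # token
    "DATA0": 0b0011, "DATA1": 0b1011,                             # data
    "ACK": 0b0010, "NAK": 0b1010, "STALL": 0b1110,                # handshake
}


def encode_pid(pid4: int) -> int:
    """Build the PID byte: low nibble is the ID, high nibble its complement."""
    if not 0 <= pid4 <= 0xF:
        raise ValueError("PID must fit in 4 bits")
    return ((~pid4 & 0xF) << 4) | pid4


def pid_is_valid(pid_byte: int) -> bool:
    """Check that the high nibble is the complement of the low nibble."""
    return (pid_byte & 0xF) == (~(pid_byte >> 4) & 0xF)


for name, pid in PIDS.items():
    print(f"{name:6s} 0x{encode_pid(pid):02X}")
```

Flipping any single bit of a PID byte makes the two nibbles disagree, so `pid_is_valid` returns False.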

Packets
- Token packets
  - Sent at beginning of transaction to specify target endpoint address
  - Define type of transaction
    - SOF (Start of Frame)
    - IN (transfer from target USB device to system)
    - OUT (transfer from system to target USB device)
    - SETUP (start of control transfer)
- Data packets
  - For data payload (follow token packets)
- Handshake packets
  - Sent by device to acknowledge receipt of data
  - ACK/NAK/NYET/STALL
- Special packets
  - Preamble packets to enable low-speed ports prior to sending low-speed packets

USB Devices Share Frames

Transactions
- Transaction consists of multiple packets
  - Token/Data/ACK
  - Must complete in a (micro)frame
- USB Client Driver
  - Initiates transfers
  - Using IRP (I/O Request Packet)
- USB Driver
  - Breaks IRPs into transactions
  - Achievable in single frame
  - Transaction descriptors created
- Host Controller Driver
  - Schedules transaction descriptors
  - Using knowledge of devices
  - Allocates to frames
- USB Host Controller
  - Traverses transaction descriptors
  - Serializes data
  - Creates USB packets

Frames and µframes (1 ms and 125 µs)
- Low Speed Mode: 1 ms frames, 1,500 bits/frame, 187.5 KB/s
- Full Speed Mode: 1 ms frames, 12,000 bits/frame, 1.5 MB/s
- High Speed Mode (USB 2.0): 125 µs microframes, 60,000 bits/µframe, 60 MB/s
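The per-frame figures follow directly from the signaling rate multiplied by the frame duration. The sketch below reproduces them, assuming the high-speed interval is 125 microseconds (which is what makes 480 Mbps yield 60,000 bits); the function and table names are illustrative.

```python
def bits_per_frame(rate_bps: int, frame_us: int) -> int:
    """Raw bits that fit in one frame at the given signaling rate."""
    return rate_bps * frame_us // 1_000_000


def bytes_per_second(rate_bps: int) -> int:
    """Raw byte rate, ignoring protocol overhead."""
    return rate_bps // 8


# mode: (signaling rate in bits/s, frame length in microseconds)
MODES = {
    "low": (1_500_000, 1000),
    "full": (12_000_000, 1000),
    "high": (480_000_000, 125),
}

for mode, (rate, frame_us) in MODES.items():
    print(mode, bits_per_frame(rate, frame_us), bytes_per_second(rate))
```

These are raw link capacities; sync fields, packet IDs, CRCs, and handshakes consume part of every frame.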

Disk Drives and other I/O

Computer Memory Hierarchy: Disks
[Figure: memory hierarchy from fastest to slowest, with typical contents]
- Processor (datapath, control, registers): intermediate results
- On-chip cache, second-level cache (SRAM), third-level cache (SRAM): cached DRAM
- Main memory (DRAM): instructions, data, [cached files]
- Secondary storage (disk): file system, paging
- Tertiary storage (tape): archive, backup

Typical Disk Drive
- A device called a head reads and writes data in a hard drive by manipulating the magnetic medium that composes the surface of an associated disk platter
- A platter has 2 surfaces, so there are 2 heads per platter
- A cylinder is the stack of tracks on all surfaces at the same distance from the spindle
- Sectors are fixed-length (typically standardized on 512 bytes)*
- Need addressing scheme to uniquely identify sectors
* This is moving (has moved) to 4 KB sectors

Disk Geometry
- Cylinder/Head/Sector Addressing (CHS)
  - Limitations due to PC BIOS bit field sizes
  - 1024 cylinders, 256 heads, 64 sectors
  - Sectors numbered from 1; cylinders and heads numbered from 0
  - 2^10 x 2^8 x 2^6 x 2^9 = 2^33 bytes = 8 GB
- Logical Block Addressing (LBA)
  - Sectors just numbered consecutively from 0
  - Drive maps from LBA to physical location on disk
    - Optimization
    - Bad sector management
    - Caching
[Figure: operating system and disk drive]
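The relationship between the two addressing schemes can be sketched with the conventional conversion formula, LBA = (C x heads + H) x sectors_per_track + (S - 1). The formula is the commonly used one rather than something stated on the slide, and the 16-head, 63-sector geometry in the example is an arbitrary assumption.

```python
def chs_to_lba(c: int, h: int, s: int, heads: int, sectors_per_track: int) -> int:
    """Convert a CHS address to an LBA (sectors count from 1, C and H from 0)."""
    if c < 0 or not 0 <= h < heads or not 1 <= s <= sectors_per_track:
        raise ValueError("CHS address out of range")
    return (c * heads + h) * sectors_per_track + (s - 1)


def lba_to_chs(lba: int, heads: int, sectors_per_track: int):
    """Inverse mapping from an LBA back to (cylinder, head, sector)."""
    c, rem = divmod(lba, heads * sectors_per_track)
    h, s0 = divmod(rem, sectors_per_track)
    return c, h, s0 + 1


# Capacity limit from the field widths: 10 + 8 + 6 address bits, 512-byte sectors.
chs_limit_bytes = 2**10 * 2**8 * 2**6 * 2**9

print(chs_to_lba(1, 0, 1, heads=16, sectors_per_track=63))
print(lba_to_chs(1008, heads=16, sectors_per_track=63))
print(chs_limit_bytes // 2**30, "GB")
```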

Issues
- Fragmentation
  - Internal fragmentation: files are unlikely to be exact multiples of sector size, so there will be unused space in the last sector of every file (natural consequence of fixed-size sectors)
  - External fragmentation: over time, with additions and deletions, there will be gaps in allocated sectors. These must be managed by the file system and unused sectors allocated to new files (often allocated multiple sectors at a time, in blocks)
- Bad block handling
  - Increasing density, lower signal-to-noise ratio
  - Increase in SER (soft error rate)
    - SER ~ 1 in 10^5 bits, or about 1 in 24 sectors; ECC handles most of these
  - For those which can't be corrected, drive maintains internal tables which mark them as bad and never uses them
  - Drive still presents a contiguous LBA address space to the user
  - Two techniques
    - Sector slipping: if a sector is bad, use the next non-defective sequential sector for that LBA
    - Sector sparing: reserve some spare sectors at one or more locations on the disk
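Two of the points above lend themselves to a quick check: the "1 in 24 sectors" figure (reading the error rate as 1 in 10^5 bits with 512-byte sectors) and the way sector slipping keeps the LBA space contiguous. The `slip_map` helper and the tiny 8-sector disk are illustrative assumptions.

```python
from typing import Iterable, List


def slip_map(num_physical: int, bad: Iterable[int]) -> List[int]:
    """Sector slipping: entry i is the physical sector that backs LBA i."""
    bad_set = set(bad)
    return [p for p in range(num_physical) if p not in bad_set]


# Soft error rate: 1 error per 10^5 bits, 512-byte (4096-bit) sectors.
bits_per_sector = 512 * 8
sectors_per_error = 10**5 / bits_per_sector

# A toy disk with 8 physical sectors, of which sectors 2 and 5 are defective.
lba_to_physical = slip_map(8, bad=[2, 5])

print(round(sectors_per_error, 1))
print(lba_to_physical)
```

The host sees six consecutive LBAs even though the physical sectors behind them skip over the defects.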

Key Disk Performance Parameters
- Seek time
  - Time required to move head to the desired track
  - Depends on where the head is currently (less for adjacent or nearby tracks)
  - 5 ms to 15 ms typical
- Rotational latency
  - Time required for sector to rotate under head
  - One full rotation takes 1/(rotational speed of disk)
  - Average is ½ of a full rotation
  - 5,400 to 12,000 RPM typical
- Transfer time
  - Time to transfer a sector
  - 20 to 160 MB/s typical
- Controller time (controller overhead)

Disk Performance
- Average access time = seek time + rotational latency + transfer time + controller time
  - Fair to assume that average rotational latency = ½ x (1 / rotational speed)
  - Controller time is often negligible (compared to seek time and rotational latency)
- Buffers
  - DRAM
  - Write buffer allows actual disk write to occur later
  - Read buffer necessary because transfer rate varies
- Pre-fetching
  - Anticipate reads beyond current sector
  - Read and cache sectors during rotational latency
  - Exploit spatial locality
  - No additional latency for sectors in same track
- Caching (on disk and in OS)
- Command queuing and scheduling
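A worked example of the access-time sum, taking average rotational latency as half of one full rotation. The drive parameters (8 ms seek, 7,200 RPM, 100 MB/s, 0.2 ms controller overhead) are hypothetical values chosen from within the typical ranges quoted earlier.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Half of one full rotation, in milliseconds."""
    return 0.5 * 60_000.0 / rpm


def transfer_time_ms(num_bytes: int, rate_mb_per_s: float) -> float:
    """Time to move num_bytes at the sustained transfer rate (1 MB = 10^6 bytes)."""
    return num_bytes / (rate_mb_per_s * 1e6) * 1e3


def avg_access_time_ms(seek_ms: float, rpm: float, num_bytes: int,
                       rate_mb_per_s: float, controller_ms: float = 0.0) -> float:
    """Seek + average rotational latency + transfer + controller overhead."""
    return (seek_ms
            + avg_rotational_latency_ms(rpm)
            + transfer_time_ms(num_bytes, rate_mb_per_s)
            + controller_ms)


# Hypothetical drive: one 512-byte sector read.
total = avg_access_time_ms(seek_ms=8.0, rpm=7200, num_bytes=512,
                           rate_mb_per_s=100.0, controller_ms=0.2)
print(f"rotational latency: {avg_rotational_latency_ms(7200):.3f} ms")
print(f"transfer time:      {transfer_time_ms(512, 100.0):.5f} ms")
print(f"average access:     {total:.3f} ms")
```

For a single sector the mechanical terms dominate: the transfer itself takes microseconds, while seek and rotation take milliseconds.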

RAID: Redundant Array of Inexpensive (Independent) Drives/Disks
- Idea: use many smaller, cheaper disks to improve I/O operations/sec and data transfer rates
- Weakness: lower reliability (more disks, more failures)
- Solution: add redundancy
- Important: RAID is concerned with entire disk failure(s), not with errors within a sector (those are handled by ECC on disk)
- Ken Ouchi, IBM (1978) applied parity scheme to disks
- UC Berkeley RAID paper (1988) proposed taxonomy and described RAID 1 through RAID 5
- Widely adopted (and extended) by industry

RAID striping

RAID
- Striping (block level)
  - No redundancy
- Mirroring
  - Requires 2x disks
- Striping (bit or byte level)
  - Parity
  - Inexpensive (1 redundant disk)
  - Read all disks to recover
  - Issue: all reads/writes access all disks
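"Read all disks to recover" can be shown with XOR parity: the parity block is the XOR of the data blocks, so any single missing block equals the XOR of all the survivors. This is a minimal sketch with made-up two-byte blocks; `xor_blocks` is an illustrative helper.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks (made-up contents) and one parity disk.
data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
parity = xor_blocks(data)

# Disk 1 fails: rebuild its block from the surviving data disks and the parity.
recovered = xor_blocks([data[0], data[2], parity])

print(parity.hex(), recovered.hex())
```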

RAID
- Small reads (block or smaller) require only one disk read
  - Small reads can occur in parallel
- Small writes are more complicated
  - But an optimization is possible: instead of reading all disks and performing two writes, read two disks (old data and old parity) and perform two writes
- Problem: contention for the disk with parity
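The small-write optimization works because XOR parity can be updated incrementally: new parity = old parity XOR old data XOR new data. The sketch below checks that shortcut against a full recomputation; the block contents and helper names are illustrative.

```python
from typing import Sequence


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(blocks: Sequence[bytes]) -> bytes:
    """Parity computed the slow way, by reading every data block."""
    result = bytes(len(blocks[0]))
    for block in blocks:
        result = xor_bytes(result, block)
    return result


def small_write_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Parity update that needs only the old data block and the old parity block."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]  # made-up data blocks
parity = full_parity(stripe)

new_block = b"\x7e\x7e"
updated = small_write_parity(stripe[1], new_block, parity)  # 2 reads
stripe[1] = new_block                                       # 2 writes: data + parity

print(updated == full_parity(stripe))
```

Both writes still touch the parity disk, which is the contention problem noted above.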

RAID
- Stripe parity blocks across disks
  - Writes proceed in parallel if parity is on different disks
- If two drives fail, previous schemes fail
  - Two (striped) parity checks
  - Row and diagonal