Appendix D. Controller Implementation

COMPUTER ORGANIZATION AND DESIGN: The Hardware/Software Interface, 5th Edition
Appendix D: Controller Implementation
Appendix D - Controllers

Controller Implementations
Combinational logic (single-cycle).
Finite state machine (multi-cycle, pipelined).
ROM.
PLA.
Micro-code.

Single Cycle Implementation
All instructions take one clock cycle.
Clock rate is determined by the slowest instruction (LW).

Combinational Implementation

Pipelined Implementation

Implementing Control
The value of the control signals depends on:
  What instruction is being executed.
  Which step is being performed.
Use the information we've accumulated to specify a finite state machine:
  Specify the finite state machine graphically, or
  Use micro-programming.

Graphical Specification of FSM
Control signals:
  Are don't-care if they are not mentioned.
  Are asserted if only the name is given.
  Otherwise the value is stated.
How many state bits will we need?

Finite State Machine for Control
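The question of how many state bits are needed has a simple answer when the states are binary-encoded: S states need ceil(log2(S)) bits. The sketch below illustrates this; the function name `state_bits` and the example state counts are assumptions for illustration, not figures taken from the slides.

```python
def state_bits(num_states: int) -> int:
    """Minimum number of bits needed to give each FSM state a unique binary code."""
    if num_states < 1:
        raise ValueError("an FSM needs at least one state")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, using integer math only.
    return (num_states - 1).bit_length()


if __name__ == "__main__":
    # Hypothetical state counts; any count from 9 to 16 needs 4 state bits.
    for states in (2, 9, 10, 16, 17):
        print(states, "states ->", state_bits(states), "state bits")
```

A one-hot encoding would instead use one bit per state, trading more flip-flops for simpler next-state logic.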

PLA Implementation
If I picked a horizontal or vertical line, could you explain it?

ROM Implementation
ROM = "Read Only Memory".
Values of memory locations are fixed ahead of time.
A ROM can be used to implement a truth table.
If the address is m bits, we can address 2^m entries in the ROM.
Outputs are the bits of data that the address points to.
m sets the "height" (2^m entries), and n is the "width" (bits per entry).

Example with m = 3 and n = 4:

  address (m)   data (n)
  0 0 0         0 0 1 1
  0 0 1         1 1 0 0
  0 1 0         1 1 0 0
  0 1 1         1 0 0 0
  1 0 0         0 0 0 0
  1 0 1         0 0 0 1
  1 1 0         0 1 1 0
  1 1 1         0 1 1 1
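To make the "ROM as truth table" idea concrete, here is a minimal sketch that stores the eight 4-bit entries of the example and reads them back by address. The names `ROM` and `rom_read` are illustrative assumptions; the data values are the ones in the example.

```python
ADDRESS_BITS = 3  # m: number of address lines
DATA_BITS = 4     # n: number of output bits per entry

# One entry per address, 000 through 111, copied from the example truth table.
ROM = [0b0011, 0b1100, 0b1100, 0b1000, 0b0000, 0b0001, 0b0110, 0b0111]


def rom_read(address: int) -> int:
    """Return the n-bit word stored at the given m-bit address."""
    if not 0 <= address < 2 ** ADDRESS_BITS:
        raise IndexError("address does not fit in the ROM")
    return ROM[address]


if __name__ == "__main__":
    for address in range(2 ** ADDRESS_BITS):
        word = rom_read(address)
        print(format(address, "03b"), "->", format(word, "04b"))
```

Because every address has an entry, the ROM implements any function of its m inputs, which is also why it spends storage on rows whose outputs repeat.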

ROM Implementation
How many inputs are there?
  6 bits for opcode, 4 bits for state = 10 address lines,
  i.e. 2^10 = 1024 different addresses.
How many outputs are there?
  16 datapath-control, 4 state bits = 20 outputs.
ROM is 2^10 x 20 bits = 20K bits (a rather unusual size).
Rather wasteful, since for lots of entries the outputs are the same.

ROM vs. PLA
PLA is much smaller:
  Can share product terms.
  Only need entries that produce an active output.
  Can take into account don't cares.
Size is (#inputs x #product-terms) + (#outputs x #product-terms).
For this example = (10 x 17) + (20 x 17) = 510 PLA cells.
A PLA cell is slightly larger than a ROM cell.
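The two size formulas above can be checked with a few lines of arithmetic. This is only a sketch; the function names are assumptions, and the numbers (10 inputs, 20 outputs, 17 product terms) are the ones quoted in the slides.

```python
def rom_bits(address_bits: int, output_bits: int) -> int:
    """A ROM stores one output word for every possible address."""
    return (2 ** address_bits) * output_bits


def pla_cells(inputs: int, outputs: int, product_terms: int) -> int:
    """AND plane (inputs x terms) plus OR plane (outputs x terms)."""
    return inputs * product_terms + outputs * product_terms


if __name__ == "__main__":
    print("ROM bits: ", rom_bits(10, 20))       # 1024 entries x 20 bits
    print("PLA cells:", pla_cells(10, 20, 17))  # 170 + 340
```

Note that the two results are in different units (ROM bits vs. PLA cells), and a PLA cell is somewhat larger, so the comparison is approximate.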

Micro-programming Prelude
The controller is easy to specify graphically for the few instructions we are implementing.
What about a full MIPS instruction set with over 100 instructions ranging from 1 clock cycle to over 20 clock cycles?
  Use VHDL or Verilog, but this is inefficient from a hardware perspective.
Consider an instruction set with several hundred instructions of widely varying classes, such as the IA-32 architecture.
  The control unit could easily require thousands of states with hundreds of different sequences.
  Specifying the control unit with a graphical representation would be impossible.

Micro-programming
Suppose we think of the set of control signals that must be asserted in a state as an instruction to be executed by the datapath.
To avoid confusing the instructions of the MIPS instruction set with these low-level control instructions, the latter are called micro-instructions.
Each micro-instruction defines the set of datapath control signals that must be asserted in a given state.
Executing a micro-instruction has the effect of asserting the control signals specified by the micro-instruction.

Micro-programming
In addition to defining which control signals must be asserted, we must also specify the sequencing: what micro-instruction should be executed next?
If the micro-instruction requirements become large, then a micro-instruction assembler is usually used, including such abilities as subroutine calls.
In summary, designing the control as a program that implements the machine instructions in terms of simpler micro-instructions is called micro-programming.

Micro-programming (from Wikipedia)
Microcode is stored in SRAM or flash memory. This is traditionally denoted a "writeable control store" in the context of computers.
Complex digital processors may also employ more than one (possibly microcode based) control unit in order to delegate sub-tasks which must be performed (more or less) asynchronously in parallel.
Microcode is generally not visible or changeable by a normal programmer, not even by an assembly programmer.
Unlike machine code, which often retains some compatibility among different processors in a family, microcode only runs on the exact electronic circuitry for which it is designed.

Micro-programming (from Wikipedia, continued)
More extensive micro-coding has also been used to allow small and simple microarchitectures to emulate more powerful architectures with wider word length, more execution units and so on; a relatively simple way to achieve software compatibility between different products in a processor family.
Some hardware vendors, especially IBM, use the term as a synonym for firmware, so that all code in a device, whether microcode or machine code, is termed microcode (such as in a hard drive, for instance, which typically contains both).

Micro-program Controller

Maximally vs. Minimally Encoded
No encoding:
  Basis for VLIW.
  1 bit for each datapath control signal.
  Faster, requires more memory.
  Used for the VAX 780 in the 1980s: 400K of memory.
Lots of encoding:
  Send the micro-instructions through logic to get control signals.
  Uses less memory, slower.

Vocabulary
Micro-instruction
  Contains a control word and a sequencing word.
  Control Word - all the control information required for one clock cycle.
  Sequencing Word - information needed to decide the next micro-instruction to be executed.
Control Memory or Control Storage
  Writable storage in the micro-programmed control unit to store the micro-program.
  Allows the micro-program to be modified, thereby providing a means to change or modify the instruction set.
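The vocabulary above (control word, sequencing word, control store) can be illustrated with a tiny micro-sequencer. This is a minimal sketch under stated assumptions: the signal names, the three sequencing codes ("next", "dispatch", "fetch"), the dispatch table, and the five-entry control store are all hypothetical and do not reproduce the micro-program of any real machine.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MicroInstruction:
    control_word: Tuple[str, ...]  # signals asserted during this clock cycle
    sequencing_word: str           # "next", "dispatch", or "fetch"


# Hypothetical control store; signal names and addresses are illustrative only.
CONTROL_STORE: List[MicroInstruction] = [
    MicroInstruction(("MemRead", "IRWrite", "PCWrite"), "next"),  # 0: fetch
    MicroInstruction((), "dispatch"),                             # 1: decode
    MicroInstruction(("ALUOp",), "next"),                         # 2: add, execute
    MicroInstruction(("RegWrite",), "fetch"),                     # 3: add, write back
    MicroInstruction(("PCWrite",), "fetch"),                      # 4: jump
]

# Hypothetical dispatch table: opcode -> address of its first micro-instruction.
DISPATCH: Dict[str, int] = {"add": 2, "jump": 4}


def run_instruction(opcode: str) -> List[Tuple[str, ...]]:
    """Return the control word asserted in each cycle of one machine instruction."""
    micro_pc = 0
    trace = []
    while True:
        mi = CONTROL_STORE[micro_pc]
        trace.append(mi.control_word)
        if mi.sequencing_word == "next":
            micro_pc += 1                 # fall through to the next micro-instruction
        elif mi.sequencing_word == "dispatch":
            micro_pc = DISPATCH[opcode]   # branch on the machine opcode
        else:                             # "fetch": this machine instruction is done
            return trace


if __name__ == "__main__":
    for op in ("add", "jump"):
        print(op, run_instruction(op))
```

Changing the instruction set here means editing `CONTROL_STORE` and `DISPATCH` only, which mirrors the point that a writable control store lets the micro-program be modified.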

Micro-instruction Classification
Micro-instructions can be classified in a variety of ways, in which the designer must choose the parallel power of each instruction.
There are two main types:
  Vertical micro-programming - each micro-instruction specifies a single (or few) micro-operations to be performed.
  Horizontal micro-programming - each micro-instruction specifies many different micro-operations to be performed in parallel.

Micro-programming
Vertical
  Width is narrow - N control signals can be encoded into log2(N) control bits.
  Limited ability to express parallelism.
  Considerable encoding of control information requires an external memory-word decoder to identify the exact control lines being manipulated.
Horizontal
  Wide memory word.
  High degree of parallel operations is possible.
  Little to no encoding of control information.
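The difference between the two formats can be sketched in a few lines: a horizontal word holds one bit per control signal, while a fully vertical field holds only the index of a single signal and needs a decoder to recover the control line. The signal list and function names below are hypothetical, chosen only for illustration.

```python
# Hypothetical set of N = 8 control signals.
SIGNALS = ["PCWrite", "MemRead", "MemWrite", "IRWrite",
           "RegWrite", "ALUSrcA", "MemtoReg", "RegDst"]


def horizontal_word(asserted):
    """Horizontal format: one bit per signal, so any subset can be asserted at once."""
    word = 0
    for name in asserted:
        word |= 1 << SIGNALS.index(name)
    return word


def vertical_field(name):
    """Vertical format: a log2(N)-bit field that names a single signal."""
    return SIGNALS.index(name)


def decode_vertical(field):
    """The external decoder: turn the encoded field back into a one-hot control line."""
    return 1 << field


def field_width(num_signals):
    """Bits needed to name one of num_signals signals."""
    return max(1, (num_signals - 1).bit_length())


if __name__ == "__main__":
    print("horizontal width:", len(SIGNALS), "bits")
    print("vertical width:  ", field_width(len(SIGNALS)), "bits")
    print(format(horizontal_word(["PCWrite", "MemRead"]), "08b"))
    print(format(decode_vertical(vertical_field("IRWrite")), "08b"))
```

In this sketch the vertical field can assert only one signal per micro-instruction, which is the "limited ability to express parallelism" noted above; practical vertical formats usually group signals into several encoded fields.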

Compromise Technique
Nano-programming
  Use a 2-level control storage organization.
  Top level is a vertical format memory.
  Output of the top-level memory drives the address register of the bottom (nano-level) memory.
  Nano-memory uses the horizontal format, which produces the actual control signal outputs.
Main advantage is a significant saving in control memory size.
Main disadvantage is more complexity and slower operation (doing 2 memory accesses for each micro-instruction).

Historical Perspective
In the 60s and 70s micro-programming was used frequently for implementation.
This led to more sophisticated ISAs and the VAX.
In the 80s RISC processors based on pipelining became popular.
Pipelining the micro-instructions is also possible.
Implementations of IA-32 processors since the 486 use:
  hardwired control for simpler instructions (few cycles, FSM control implemented using PLA or random logic).
  micro-coded control for more complex instructions (large numbers of cycles).
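Returning to the two-level nano-programming organization, the following sketch shows the lookup path and the storage saving. It assumes, for simplicity, that each top-level word holds only a nano-address; the memory contents, sizes, and function names are all hypothetical.

```python
# Bottom level: each distinct wide (horizontal) control word is stored once.
NANO_MEMORY = [0b00000001, 0b00110000, 0b11000010]

# Top level: one narrow word per micro-instruction, holding a nano-address.
MICRO_MEMORY = [0, 1, 1, 2, 0, 1, 2, 2]


def control_signals(micro_address):
    """Two memory accesses: micro-memory gives a nano-address, nano-memory gives the signals."""
    nano_address = MICRO_MEMORY[micro_address]
    return NANO_MEMORY[nano_address]


def storage_bits(micro_words, unique_words, control_width):
    """Return (one_level_bits, two_level_bits) for the same micro-program."""
    pointer_width = max(1, (unique_words - 1).bit_length())
    one_level = micro_words * control_width
    two_level = micro_words * pointer_width + unique_words * control_width
    return one_level, two_level


if __name__ == "__main__":
    print(format(control_signals(3), "08b"))
    print(storage_bits(len(MICRO_MEMORY), len(NANO_MEMORY), 8))
```

The saving grows when many micro-instructions share few distinct control words; the cost is the second memory access on every micro-instruction.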

Pentium 4
[Figure: Pentium 4 die layout. Labeled blocks: Control (four regions), I/O interface, Instruction cache, Enhanced floating point and multimedia, Data cache, Integer datapath, Secondary cache and memory interface, Advanced pipelining hyperthreading support.]
Processor executes simple microinstructions, 70 bits wide (hardwired).
120 control lines for integer datapath, 400 for floating point.
If an instruction requires more than 4 microinstructions to implement, control comes from the micro-code ROM (8000 micro-instructions).
It can become complicated.

Summary
Finite-state machines give the most flexibility.
PLA.
ROM.
Micro-programming.
Modern processors are complicated.
Micro-coded.
Make the common case fast.
Make the simple instructions fast.
Take the performance hits on the complex instructions.