Outline. Digital Systems. C.2: Gates, Truth Tables and Logic Equations. Truth Tables. Logic Gates 9/8/2011

TDT4255 Computer Design
Lecture 2
Magnus Jahre

Outline
- Appendix C: The Basics of Logic Design
- Case Study: TDT4255 Communication Module

C.2: Gates, Truth Tables and Logic Equations

Digital Systems
- All signals are either logically 0 or 1
  - Asserted (1) or deasserted (0)
  - Implemented with voltage on (VDD) or ground
- Combinatorial circuits: output only depends on current inputs
- Sequential circuits: output depends on current inputs and state

Truth Tables
- A truth table can be used to describe any combinatorial circuit

    A  B  A + B
    0  0    0
    0  1    1
    1  0    1
    1  1    1

- Grow quickly in size
- Most useful for small circuits
- Alternative: Boolean logic

Logic Gates
- Basic building block of digital systems
- Constructed from transistors
- Transistors are analog, gates are digital (abstraction!)
- Rings on inputs or outputs describe negation
- Two ways of implementing ~(~A + B)
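The last bullet can be checked by enumerating a truth table. The sketch below is a small Python model, not part of the original slides. It assumes the two implementations are an OR gate with an inverted A input and an inverted output, and its De Morgan equivalent, an AND gate with an inverted B input. The function names are made up for illustration.

```python
from itertools import product


def nor_form(a: int, b: int) -> int:
    """~(~A + B): OR gate with a ring on the A input and a ring on the output."""
    return 1 - ((1 - a) | b)


def and_form(a: int, b: int) -> int:
    """A * ~B: AND gate with a ring on the B input (De Morgan equivalent)."""
    return a & (1 - b)


def truth_table(fn):
    """List (A, B, output) for every combination of two one-bit inputs."""
    return [(a, b, fn(a, b)) for a, b in product((0, 1), repeat=2)]


if __name__ == "__main__":
    for row_nor, row_and in zip(truth_table(nor_form), truth_table(and_form)):
        print(row_nor, row_and)
```

Both functions give the same table, so the two gate-level forms are interchangeable.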

C.3: Combinatorial Logic

Combinatorial logic
- Combinatorial logic only depends on current inputs
- We don't need a clock!
- There might be inputs that are irrelevant to our circuit
  - Don't cares
  - Room for optimization

Combinatorial VHDL Example

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;

    entity toplevel is
        Port ( data_in_1 : in  STD_LOGIC_VECTOR (3 downto );
               data_in_2 : in  STD_LOGIC_VECTOR (3 downto );
               data_out  : out STD_LOGIC_VECTOR (3 downto ));
    end toplevel;

    architecture Behavioral of toplevel is
    begin
        -- Bitwise AND of the two inputs; no clock is involved
        combinatorial: process(data_in_1, data_in_2)
        begin
            data_out <= data_in_1 and data_in_2;
        end process combinatorial;
    end Behavioral;

[Figure: Combinatorial Schematic]

Multiplexor
- Combinatorial circuit that selects one output from many inputs
- A decoder is a reverse multiplexor

    mux: process(data_in_1, data_in_2, input_select)
    begin
        if input_select = '' then
            data_out <= data_in_1;
        else
            data_out <= data_in_2;
        end if;
    end process mux;

Other Combinatorial Circuits
- Read Only Memories (ROMs)
  - Fixed content memory
  - Can be used to implement logic: use the input pattern as address and store the logic output
- Lookup Tables (LUTs)
  - Can implement any logic function
  - Stores output in memory locations corresponding to input address
  - Key building block for reconfigurable chips (e.g. FPGAs)
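The LUT idea, using the input pattern as an address into a table of stored outputs, can be sketched in a few lines of Python. This is an illustrative model only. The names `build_lut`, `lut_read` and `mux2` are hypothetical, and the multiplexor polarity (select 0 picks the first data input) is an assumption.

```python
from itertools import product


def address_of(bits):
    """Interpret a tuple of bits (most significant first) as a memory address."""
    address = 0
    for bit in bits:
        address = (address << 1) | bit
    return address


def build_lut(fn, n_inputs):
    """Precompute fn for every input pattern and store it at that pattern's address."""
    table = [0] * (1 << n_inputs)
    for bits in product((0, 1), repeat=n_inputs):
        table[address_of(bits)] = fn(*bits)
    return table


def lut_read(table, *bits):
    """Evaluate the stored function with a single memory read."""
    return table[address_of(bits)]


def mux2(sel, d0, d1):
    """Two-input multiplexor; assumes sel = 0 selects d0."""
    return d1 if sel else d0


if __name__ == "__main__":
    lut = build_lut(mux2, 3)
    print(lut)
    print(lut_read(lut, 1, 0, 1))
```

Any three-input function fits in the same eight-entry table. Only the stored contents change, which is why LUTs suit reconfigurable chips.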

C.5: Constructing a Basic ALU

1 Bit Arithmetic Logic Unit (ALU)
- The ALU is the unit that does the work in all processors
- Supports many operations: add, subtract, and, or, etc.
- Idea: add hardware for all operations and use a multiplexor to select the result
- Optimizations possible (e.g. subtraction is 2's complement addition)

Computing Carry
- Correct addition depends on correct propagation of carry
- CarryOut = (b * CarryIn) + (a * CarryIn) + (a * b)

32 Bit ALU
- Exploit the 1 bit ALU abstraction to create a wide ALU
- Called a ripple carry adder
- Ripple carry adders are slow
- Carry propagation through the circuit is the critical path

C.6: Faster Addition: Carry Lookahead

Carry Lookahead
- Idea: we can use more logic to shorten the critical path of a ripple carry adder
- Each carry bit uses all previous carries and inputs
- We can compute each carry directly by applying the formulas recursively
- But: logic overhead grows quickly
- Two bit carry lookahead example:

    c_1 = (b_0 * c_0) + (a_0 * c_0) + (a_0 * b_0)
    c_2 = (b_1 * c_1) + (a_1 * c_1) + (a_1 * b_1)
        = (b_1 * [(b_0 * c_0) + (a_0 * c_0) + (a_0 * b_0)])
          + (a_1 * [(b_0 * c_0) + (a_0 * c_0) + (a_0 * b_0)])
          + (a_1 * b_1)
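The CarryOut formula and the two bit lookahead expansion can be tried out with a small Python model. This is a sketch, not the lecture's implementation. The function names are hypothetical, and the sum bit uses the standard full-adder XOR, which the slides do not spell out.

```python
def full_adder(a, b, carry_in):
    """One bit of the adder: sum bit plus the CarryOut formula from the slide."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (b & carry_in) | (a & carry_in) | (a & b)
    return sum_bit, carry_out


def ripple_carry_add(x, y, width=32, carry_in=0):
    """Chain `width` full adders; each carry waits for the previous one."""
    carry = carry_in
    result = 0
    for i in range(width):
        sum_bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= sum_bit << i
    return result, carry


def lookahead_c2(a0, b0, a1, b1, c0):
    """c_2 written directly in terms of the inputs and c_0 (two bit example)."""
    c1 = (b0 & c0) | (a0 & c0) | (a0 & b0)
    return (b1 & c1) | (a1 & c1) | (a1 & b1)


if __name__ == "__main__":
    print(ripple_carry_add(5, 7))
    # Subtraction as 2's complement addition: invert y and set the carry in.
    mask = (1 << 32) - 1
    print(ripple_carry_add(10, ~3 & mask, carry_in=1)[0])
```

The loop in `ripple_carry_add` shows why the carry chain is the critical path: bit i cannot finish before bit i - 1 has produced its carry.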

Exploiting Overlap
- General lookahead equation:

    c_(i+1) = (b_i * c_i) + (a_i * c_i) + (a_i * b_i)
            = (a_i * b_i) + (a_i + b_i) * c_i

- Each carry computation recomputes a + b and a * b
- Idea: rename these and compute once:

    g_i = a_i * b_i
    p_i = a_i + b_i

- g is known as generate and p as propagate
  - g_i decides if a carry should be generated for level i
  - p_i decides if previous carries should be propagated for level i

Exploiting Abstraction
- Use a 4-bit carry lookahead adder as building block for larger adders
- We need to generate propagate and generate signals for each block
  - Reapply the same principles as for the 1-level design
  - The resulting signals are referred to as super-propagate P and super-generate G
- 16 bit adder latency estimation
  - Two-level carry lookahead: worst case latency 5 gate delays
  - Ripple carry: worst case latency 32 gate delays

Two-Level Carry Lookahead
- 4 bit carry lookahead unit as basic building block
- Carry-lookahead unit takes care of generating intermodule carry signals

C.7: Clocks

Clocks
- Clocks are used in sequential systems
- Clocking methodologies
  - Edge triggered: state elements are updated on clock transitions
  - Level triggered: state elements are updated continuously while the clock is either 0 or 1
- Choose one or the other
- Different methodologies may be appropriate for different production technologies

C.8: Flip-flops, Latches and Registers
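Looking back at the generate and propagate signals from the carry lookahead slides: with g_i and p_i in hand, the general equation becomes c_(i+1) = g_i + p_i * c_i. The Python sketch below models a 4-bit lookahead block and a 16-bit adder built from four such blocks. It is an illustrative model under those definitions. The function names are hypothetical, and the block carries are evaluated in a software loop, whereas a hardware lookahead unit would produce them in parallel from G and P.

```python
def carry_lookahead_add4(x, y, c0=0):
    """Add two 4-bit numbers with every carry computed directly from g, p and c0."""
    a = [(x >> i) & 1 for i in range(4)]
    b = [(y >> i) & 1 for i in range(4)]
    g = [a[i] & b[i] for i in range(4)]  # generate:  g_i = a_i * b_i
    p = [a[i] | b[i] for i in range(4)]  # propagate: p_i = a_i + b_i

    # Block-level signals: super-generate G and super-propagate P.
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    P = p[3] & p[2] & p[1] & p[0]

    # Each carry is a two-level expression; none waits for the previous carry.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = G | (P & c0)

    carries = (c0, c1, c2, c3)
    total = sum((a[i] ^ b[i] ^ carries[i]) << i for i in range(4))
    return total, c4, P, G


def two_level_add16(x, y, c0=0):
    """16-bit adder from four 4-bit blocks; block carries come from G and P."""
    carry = c0
    total = 0
    for k in range(4):
        xb = (x >> (4 * k)) & 0xF
        yb = (y >> (4 * k)) & 0xF
        block_sum, _, P, G = carry_lookahead_add4(xb, yb, carry)
        total |= block_sum << (4 * k)
        carry = G | (P & carry)  # intermodule carry, same form as c_(i+1)
    return total, carry


if __name__ == "__main__":
    print(carry_lookahead_add4(9, 8))
    print(two_level_add16(0x1234, 0x0FFF))
```

G and P do not depend on the incoming carry, which is what lets the second level compute all block carries without waiting for the sums.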

Latch
- A latch is the basic memory element
- The output is locked due to the cross-coupling
- Level triggered

    latch: process(clk, input_select)
    begin
        if clk = '' then
            output_select <= input_select;
        end if;
    end process latch;

Flip-Flop
- Edge triggered extension of the latch
- Figure triggers on falling edge, VHDL code on rising edge

    flipflop: process(clk)
    begin
        if rising_edge(clk) then
            output_select <= input_select;
        end if;
    end process flipflop;

Register
- Collection of flip-flops or latches that store multi-bit values

    reg: process(clk)
    begin
        if rising_edge(clk) then
            data_out <= data_in_1;
        end if;
    end process reg;

- VHDL code is identical to latch/flip-flop except that the signals are vectors and not scalars

Register File Example
- Register files contain multiple registers and access logic
- 2 Port Read logic
- 1 Port Write logic

[Figure: Register File Abstraction]

C.9: SRAMs and DRAMs

Static RAM
- Built from transistors in a logic process
- Retains charge over time (static)
- Uses more area than Dynamic RAM

Dynamic RAM
- Separate production process to achieve high density
  - This is why the DRAM is almost always on a separate chip
- Loses charge over time (dynamic)
  - Needs refresh

[Figure: Single DRAM cell]

C.10: Finite State Machines

Finite State Machines
- Commonly synchronous
  - Changes state on clock tick
- Two types
  - Moore: output depends only on the current state
  - Mealy: output depends on the current state and the inputs
- Almost all electronic systems contain a number of state machines

Traffic Light Controller
- State machine for traffic lights in a 4-way junction
- Signals
  - Output signals:
    - NSgreen: North-South green light
    - Wegreen: West-East green light
  - Input:
    - NScar: North-South car waiting
    - EWcar: East-West car waiting
- What happens if there are cars in both directions?
- Moore or Mealy?

FSM Block Diagram
- Logic gates
- Flip-flops or latches
- Abstraction!
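The traffic light controller can be modelled in a few lines of Python to explore the question about cars in both directions. The transition rule here is an assumption, since the slide's state diagram is not in the transcript: the light stays green until a car waits on the other road, then switches on the next clock tick. The outputs are a function of the state only, so this sketch is a Moore machine. All function and constant names are hypothetical.

```python
NS_GREEN = "NSgreen"
WE_GREEN = "Wegreen"


def next_state(state, ns_car, ew_car):
    """Next-state logic (assumed rule): switch when a car waits on the red road."""
    if state == NS_GREEN:
        return WE_GREEN if ew_car else NS_GREEN
    return NS_GREEN if ns_car else WE_GREEN


def outputs(state):
    """Output logic: depends only on the current state (Moore style)."""
    return {NS_GREEN: int(state == NS_GREEN), WE_GREEN: int(state == WE_GREEN)}


def run(inputs, state=NS_GREEN):
    """Apply one (NScar, EWcar) pair per clock tick; return the state seen each tick."""
    trace = []
    for ns_car, ew_car in inputs:
        trace.append(state)
        state = next_state(state, ns_car, ew_car)
    return trace


if __name__ == "__main__":
    # Cars waiting in both directions on every tick.
    print(run([(1, 1)] * 4))
```

Under this rule, cars waiting in both directions make the lights alternate on every clock tick. A real controller would add a timer or a yellow state.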

C.11: Timing Methodologies

Edge Triggered Methodology
- Need to account for clock skew
  - The clock might arrive at different flip-flops at slightly different times
  - Limits the maximum clock frequency
  - Designers try to limit clock skew with clever clock distribution strategies
- Advantage: easier to achieve correct operation than with level triggered
- Drawback: edge detection requires extra logic

Level Sensitive Methodology
- Less area and possibly less delay than edge triggered
- But: prone to race conditions
- Solution: two phase clocking

[Figure: Metastability]

Synchronizers
- Idea: reduce the probability of a metastable output by adding an extra flip-flop
- Gives the input signal time to stabilize before we allow it into our system
- If the clock cycle is long compared to the metastable period (common), the probability of failure will be low but never 0

C.12: Field Programmable Devices

Field Programmable Devices
- Programmable Logic Devices (PLDs)
  - Only combinatorial logic
- Field Programmable Gate Arrays (FPGAs)
  - Combinatorial and sequential
  - Key building block: LUTs
  - Many special purpose resources: ALUs, Block RAM, etc.

Case Study: TDT4255 Exercise Communication Module

Exercise System Architecture
- Xilinx FPGA-based embedded system
  - MicroBlaze softcore
  - Peripheral bus
  - Custom peripheral
- UART over USB communication with host computer
- Host-to-peripheral communication provided

Communication Features
- Write arbitrary data memory address
- Read arbitrary instruction memory address
- Write arbitrary instruction memory address

[Figure: Interface Specification]

[Figure: Peripheral Block Diagram]

Write Enable MUX

    IMEM_WRITE_ENABLE_MUX : process(com_write_mem, com_write_en)
    begin
        if com_write_mem = '' then
            mem_com_write_en <= com_write_en;
        else
            dmem_write_en_mux_com <= com_write_en;
        end if;
    end process;

DMEM MUX

    DMEM_MUX : process( )
    begin
        if processor_enable = '' then
            -- processor drives the data memory
            dmem_read_addr_mux_out  <= dmem_read_addr_mux_proc;
            dmem_write_addr_mux_out <= dmem_write_addr_mux_proc;
            dmem_write_data_mux_out <= dmem_write_data_mux_proc;
            dmem_write_en_mux_out   <= dmem_write_en_mux_proc;
        else
            -- communication module drives the data memory
            dmem_read_addr_mux_out  <= dmem_read_addr_mux_com;
            dmem_write_addr_mux_out <= dmem_write_addr_mux_com;
            dmem_write_data_mux_out <= dmem_write_data_mux_com;
            dmem_write_en_mux_out   <= dmem_write_en_mux_com;
        end if;
    end process;

Note: sensitivity list removed to make the code fit the slide

[Figure: Communication State Machine]

Communication Implementation

    STATE_MACHINE : process(clk, reset)
        constant STATE_IDLE : std_logic_vector(2 downto ) := "";
        -- more constants
    begin
        if rising_edge(clk) then
            if reset = '' then
                -- reset all signals
            else
                case state is
                    -- idle
                    when STATE_IDLE =>
                        -- set signals
                    when STATE_WI =>
                        -- set signals
                    -- more states here
                end case;
            end if;
        end if;
    end process;

Run State Signals

    -- processor running
    when STATE_RUN =>
        status <= STATUS_RUN;
        bus_data_out <= (others => '');
        read_addr <= (others => '');
        write_addr <= (others => '');
        write_data <= (others => '');
        write_enable <= '';
        processor_enable <= '';
        write_mem <= '';
        internal_data_out <= (others => '');
        if command = CMD_RUN then
            state <= STATE_RUN;
        else
            state <= STATE_IDLE;
        end if;