Asynchronous FIFO Architectures (Part 1)


Vijay A. Nebhrajani

Designing a FIFO is one of the most common problems an ASIC designer comes across. This series of articles looks at how FIFOs may be designed, a task that is not as simple as it seems.

At the outset, note that FIFOs are usually used for clock domain crossing, and are therefore dual clock designs. In other words, the design works off two clocks, and so the most general case is a FIFO that is designed with no relationship assumed between these two clocks. However, we will not start with such an architecture. We will start with the trivial case of a FIFO that runs on just one clock. I imagine that such a circuit will have limited use in practice, but it serves the very useful purpose of setting the stage for more complex designs.

The series follows the development of FIFO designs from the trivial to the general, and consists of the following articles:

1. Single clock architecture
2. Dual clock architectures - Architecture 1
3. Dual clock architectures - Architecture 2
4. Dual clock architectures - Architecture 3
5. Pulse mode FIFO

The trivial case of a single clock FIFO

Several architectures are possible for a FIFO, including ripple FIFOs and shift registers, which we will not consider here. We will concentrate on architectures that involve random access memory arrays. Such an architecture is shown in Fig 1.

[Figure 1. A FIFO Architecture. Labels in the figure: DPRAM Array (array_size deep), wr_data, rd_data, wr_addr, rd_addr, wr_ptr, rd_ptr, wr_en, rd_en, valid_wr, valid_rd, Status, empty, full, wcnt.]

Analyzing the figure, we see that there is a RAM array with separate read and write ports. This is chosen for convenience: with a single-port memory, you would have to include an arbiter that grants access to only one operation (read or write) at a time. We choose dual-port RAMs (not necessarily true dual port, since we simply want a separate read port and write port) because these illustrate more realistic situations.

The read and write ports have separate read and write addresses, generated by two counters of width log2(array_size). We don't care much about the data width right now, but this does become an important parameter in choosing architectures later. For consistency we will refer to these counters as the "read pointer" and the "write pointer". The write pointer points to the location that will be written next, and the read pointer points to the location that will be read next. A write increments the write pointer, and a read increments the read pointer.

The last block we see is the "status" block. The responsibility of this block is to generate the "empty" and "full" signals of the FIFO. These signals tell the outside world that the FIFO has reached a terminal condition: if "full" is active, the FIFO has reached a terminal condition for write, and if "empty" is active, the FIFO has reached a terminal condition for read. A terminal condition for write implies that the FIFO has no space to accommodate more data, and a terminal condition for read implies that the FIFO has no more data available for readout.

The status block may also report the number of empty or full locations in the FIFO, and this is accomplished by an arithmetic operation on the pointers. The actual count of empty or full locations does not play much of a part in the FIFO itself; it is used as a reporting mechanism to the outside world. However, the empty and full signals play a very important role within the FIFO: they block access to further reads and writes respectively. The importance of this blocking does not lie in the fact that data may be overwritten (or read out twice); the critical importance lies in the fact that pointer positions are the only control we have over the FIFO, and writes or reads change the pointers. If we do not block the pointers from changing state at terminal conditions, we will have a FIFO that either "eats" data or "generates" data, and that is quite unacceptable.

Further analysis: the DPRAM could have "registered" reads, meaning that the output data from the array is registered. If this is so, the read pointer will have to be designed to "read and increment". This means that you will have to provide an explicit read signal before the data at the output of the FIFO is valid.
On the other hand, if the DPRAM does not have registered outputs, valid data is available as soon as it is written; you read this data first, and increment the pointer later. This affects the logic that reads data out from the FIFO and the logic that performs the empty/full calculation. For the sake of simplicity we will only treat the case where the DPRAM does not provide registered outputs. It is not very complex to extend the same reasoning to the case of a registered output DPRAM.

Functionally the FIFO works as follows. At reset, the pointers are both 0. This is the empty condition of the FIFO, so empty is pulled high (we will use the active high convention) and full is low. At empty, reads are blocked and so the only operation possible is a write. A write loads location 0 of the array and increments the write pointer to 1. This causes the empty signal to go low.

Assuming that there are no reads and subsequent cycles only write to the FIFO, there will be a time when the write pointer equals array_size - 1. This means that the last location in the array is the next location that will be written to. At this condition, a write will cause the write pointer to become 0, and set full. Note that in this condition the write and read pointers are equal, but the FIFO is full, and not empty. This implies that the full/empty decision is not based on the pointer values alone, but on the operation that caused the pointers to become equal. If the cause of pointer equality is a reset or a read, the FIFO is deemed empty; if the cause is a write, the FIFO is full.
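As an illustration of the unregistered-output memory assumed in this discussion, here is a minimal sketch of a simple dual-port RAM with a clocked write port and a combinational read port. The entity name `dpram`, the generics and the 8-bit default data width are assumptions made for this example; the port names follow the labels of Figure 1.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical simple dual-port RAM: one synchronous write port and one
-- asynchronous (unregistered) read port. Entity name, generics and
-- default widths are assumptions for illustration only.
entity dpram is
  generic (
    ADDR_WIDTH : positive := 5;  -- 2**5 = 32 locations
    DATA_WIDTH : positive := 8   -- assumed; the text leaves the width open
  );
  port (
    clk     : in  std_logic;
    wr_en   : in  std_logic;
    wr_addr : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    wr_data : in  std_logic_vector(DATA_WIDTH - 1 downto 0);
    rd_addr : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    rd_data : out std_logic_vector(DATA_WIDTH - 1 downto 0)
  );
end entity dpram;

architecture rtl of dpram is
  type mem_t is array (0 to 2**ADDR_WIDTH - 1)
    of std_logic_vector(DATA_WIDTH - 1 downto 0);
  signal mem : mem_t;
begin

  -- Write port: the addressed location is updated on the rising clock edge.
  write_p : process (clk)
  begin
    if rising_edge(clk) then
      if wr_en = '1' then
        mem(to_integer(unsigned(wr_addr))) <= wr_data;
      end if;
    end if;
  end process write_p;

  -- Read port: no output register, so rd_data follows rd_addr directly.
  -- This is the "read first, increment the pointer later" case.
  rd_data <= mem(to_integer(unsigned(rd_addr)));

end architecture rtl;
```

In this sketch `wr_en` would be driven by the status block's valid write signal, so that writes are blocked when the FIFO is full. Whether a synthesis tool maps a memory with a combinational read onto a dedicated RAM resource depends on the target technology.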

Now assume that we perform a series of reads. Each read will increment the read pointer, to the point where the read pointer equals array_size - 1. At this point, the data from this location is available on the output bus of the FIFO. Succeeding logic reads this data and provides a read signal (active for one clock). This causes the read pointer to become equal to the write pointer again (after both pointers have completed one cycle through the array). However, since this equality came about because of a read, empty is set.

Thus, we have the following for the empty flag:

    A write unconditionally clears empty.
    read_pointer = (array_size - 1) and a read sets empty.

and the following for the full flag:

    A read unconditionally clears full.
    write_pointer = (array_size - 1) and a write sets full.

However, this is a special case, since in general reads may start as soon as the FIFO is not empty (the reading logic need not wait for the FIFO to become full), so these conditions have to be modified to accommodate any read_pointer and write_pointer values. A little thought shows that we have organized the array as a circular list. Thus, if the write pointer is numerically greater than the read pointer by 1 and a read occurs, the FIFO becomes empty. This works just fine for the boundary case described above as long as we use unsigned (n-bit) arithmetic, where array_size - 1 plus 1 wraps around to 0. Similarly, if the read pointer is numerically greater than the write pointer by 1 and a write occurs, the FIFO becomes full. This leads to the conditions:

    A write unconditionally clears empty.
    (write_pointer = read_pointer + 1) and a read sets empty.
    A read unconditionally clears full.
    (read_pointer = write_pointer + 1) and a write sets full.

Note that a simultaneous read and write increments both pointers, but does not alter the state of the empty and full flags. A simultaneous read and write is not allowed at the full and empty boundaries.

With this we can now define the status block of the FIFO.
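Before writing the block out in full, the wrap-around claim can be checked in isolation. The following is a small simulation-only sketch, assuming 5-bit pointers (a 32-deep array) and the standard `ieee.numeric_std` package, in which adding 1 to an `unsigned` value of 31 wraps to 0. The entity name `wrap_check` is hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical simulation-only check of the pointer comparisons at the
-- array boundary, assuming 5-bit pointers (array_size = 32).
entity wrap_check is
end entity wrap_check;

architecture sim of wrap_check is
begin
  check_p : process
    variable rd_ptr : unsigned(4 downto 0);
    variable wr_ptr : unsigned(4 downto 0);
  begin
    -- Last read before empty at the boundary: the read pointer is at
    -- array_size - 1 and the write pointer has already wrapped to 0.
    rd_ptr := to_unsigned(31, 5);
    wr_ptr := to_unsigned(0, 5);
    assert wr_ptr = rd_ptr + 1
      report "empty condition not detected at wrap-around"
      severity failure;

    -- Last write before full at the boundary: the write pointer is at
    -- array_size - 1 and the read pointer is still at 0.
    wr_ptr := to_unsigned(31, 5);
    rd_ptr := to_unsigned(0, 5);
    assert rd_ptr = wr_ptr + 1
      report "full condition not detected at wrap-around"
      severity failure;

    -- A mid-array case: one word stored at location 9.
    rd_ptr := to_unsigned(9, 5);
    wr_ptr := to_unsigned(10, 5);
    assert wr_ptr = rd_ptr + 1
      report "empty condition not detected mid-array"
      severity failure;

    report "wrap-around checks completed";
    wait;  -- stop the process
  end process check_p;
end architecture sim;
```

The same comparison therefore covers both the boundary case and the general case, which is why the special-case conditions on array_size - 1 are no longer needed.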
I am providing the code in VHDL, but since this is synthesizable, translation to Verilog is easy.

    library IEEE;
    use IEEE.std_logic_1164.all;
    use IEEE.numeric_std.all;

    entity status is
      port (
        reset    : in  std_logic;  -- asynchronous reset, assumed active high
        clk      : in  std_logic;  -- single clock for reads and writes
        fifo_wr  : in  std_logic;
        fifo_rd  : in  std_logic;
        valid_rd : out std_logic;
        valid_wr : out std_logic;
        rd_ptr   : out std_logic_vector(4 downto 0);
        wr_ptr   : out std_logic_vector(4 downto 0);
        empty    : out std_logic;
        full     : out std_logic
      );
    end status;

    architecture status_a of status is
      signal rd_ptr_s   : unsigned(4 downto 0);
      signal wr_ptr_s   : unsigned(4 downto 0);
      signal valid_rd_s : std_logic;
      signal valid_wr_s : std_logic;
      -- internal copies of the flags, so that they can be read back
      signal empty_s    : std_logic;
      signal full_s     : std_logic;
    begin

      empty_p : process(clk, reset)
      begin
        if (reset = '1') then
          empty_s <= '1';
        elsif (clk'event and clk = '1') then
          if (fifo_wr = '1' and fifo_rd = '1') then
            -- simultaneous read and write: do nothing
            null;
          elsif (fifo_wr = '1') then
            -- write unconditionally clears empty
            empty_s <= '0';
          elsif (fifo_rd = '1' and (wr_ptr_s = rd_ptr_s + 1)) then
            -- reading the last stored word: set empty
            empty_s <= '1';
          end if;
        end if;
      end process;

      full_p : process(clk, reset)
      begin
        if (reset = '1') then
          full_s <= '0';
        elsif (clk'event and clk = '1') then
          if (fifo_rd = '1' and fifo_wr = '1') then
            -- simultaneous read and write: do nothing
            null;
          elsif (fifo_rd = '1') then
            -- read unconditionally clears full
            full_s <= '0';
          elsif (fifo_wr = '1' and (rd_ptr_s = wr_ptr_s + 1)) then
            -- writing the last free location: set full
            full_s <= '1';
          end if;
        end if;
      end process;

      -- reads are blocked at empty, writes are blocked at full
      valid_rd_s <= '1' when (empty_s = '0' and fifo_rd = '1') else '0';
      valid_wr_s <= '1' when (full_s = '0' and fifo_wr = '1') else '0';

      wr_ptr_s_p : process(clk, reset)
      begin
        if (reset = '1') then
          wr_ptr_s <= (others => '0');
        elsif (clk'event and clk = '1') then
          if (valid_wr_s = '1') then
            wr_ptr_s <= wr_ptr_s + 1;  -- wraps from 31 to 0
          end if;
        end if;
      end process;

      rd_ptr_s_p : process(clk, reset)
      begin
        if (reset = '1') then
          rd_ptr_s <= (others => '0');
        elsif (clk'event and clk = '1') then
          if (valid_rd_s = '1') then
            rd_ptr_s <= rd_ptr_s + 1;  -- wraps from 31 to 0
          end if;
        end if;
      end process;

      rd_ptr   <= std_logic_vector(rd_ptr_s);
      wr_ptr   <= std_logic_vector(wr_ptr_s);
      valid_rd <= valid_rd_s;
      valid_wr <= valid_wr_s;
      empty    <= empty_s;
      full     <= full_s;

    end status_a;
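The status block above does not compute the count of occupied locations that the earlier discussion says can be derived arithmetically from the pointers. One possible way to do so is sketched below, assuming the same 5-bit pointers and an active-high full flag. The entity name `fifo_count` is hypothetical, and the output name `wcnt` is borrowed from the label in Figure 1 on the assumption that it denotes a word count.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical occupancy counter for a 32-deep FIFO. Not part of the
-- status block in the text; a sketch under the assumptions stated above.
entity fifo_count is
  port (
    rd_ptr : in  std_logic_vector(4 downto 0);
    wr_ptr : in  std_logic_vector(4 downto 0);
    full   : in  std_logic;
    wcnt   : out std_logic_vector(5 downto 0)  -- 0 to 32 occupied locations
  );
end entity fifo_count;

architecture rtl of fifo_count is
  signal diff : unsigned(4 downto 0);
begin

  -- Modulo-32 difference of the pointers: the number of stored words,
  -- except that a full FIFO also yields 0 because the pointers are equal.
  diff <= unsigned(wr_ptr) - unsigned(rd_ptr);

  -- The full flag resolves the ambiguity of equal pointers. Placed as the
  -- most significant bit, it turns a difference of 0 into a count of 32.
  wcnt <= full & std_logic_vector(diff);

end architecture rtl;
```

The number of empty locations follows as 32 minus this count. As with the flags, the calculation needs both pointers at once, which is straightforward only while both are driven from the same clock.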

The circuit for this is shown in Fig 2.

[Figure 2: Circuit for Status block and pointers of FIFO of Figure 1. The figure is divided into an RD_CLK domain and a WR_CLK domain. Each side shows a pointer register (rd_ptr or wr_ptr) with a +1 incrementer, a comparator (wr_ptr_s = rd_ptr_s + 1 on the read side, rd_ptr_s = wr_ptr_s + 1 on the write side), the fifo_rd and fifo_wr inputs, and the valid_rd_s or valid_wr_s signal.]

The observant reader will notice that generating the full or empty flags requires the use of both pointers. In the case of a dual clock design, the read pointer is expected to work off the read clock and the write pointer off the write clock. This raises unexpectedly thorny issues, as you may try and see for yourself. These issues and some solutions will be covered in future articles in this series.