EN2911X: Reconfigurable Computing Topic 01: Programmable Logic


Prof. Sherief Reda, School of Engineering, Brown University, Fall 2012

FPGA architecture
- Figure labels [Maxfield 04]: programmable interconnect, programmable logic blocks, programmable logic element
- Objective of this lecture: study the organization of programmable logic blocks and interconnects

Basic logic element (BLE) [Rose 04] [Maxfield 04]
- How many bits are in a K-input lookup table (LUT)?
- How many Boolean functions can a K-input LUT implement?
- What is the best LUT size?
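The first two questions can be checked with a minimal sketch, assuming the usual model of a K-input LUT as a 2^K-entry truth table whose inputs select one stored bit. The function names are hypothetical and chosen only for illustration.

```python
def lut_config_bits(k: int) -> int:
    """Bits stored in a K-input LUT: one per input combination."""
    return 2 ** k


def lut_function_count(k: int) -> int:
    """Distinct Boolean functions of K inputs: one per way of filling the table."""
    return 2 ** lut_config_bits(k)


def lut_eval(config: int, inputs: tuple) -> int:
    """Read the LUT: the inputs form an index into the configuration word."""
    index = sum(bit << i for i, bit in enumerate(inputs))
    return (config >> index) & 1


if __name__ == "__main__":
    for k in range(2, 5):
        print(k, lut_config_bits(k), lut_function_count(k))
```

For example, a 2-input LUT configured with the word 0b1000 behaves as an AND gate, since only index 3 (both inputs high) stores a 1.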

A closer look at the BLE

Example [from J. Zambreno]
F = A0 A1 A3 + A1 A2 Ā3 + Ā0 Ā1 Ā2
Total LUT configuration bits needed to implement F:
- 4-input LUT: 16 bits
- 3-input LUT: 24 bits
- 2-input LUT: 28 bits
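The bit counts in this example can be reproduced with a short sketch. The 4-input case is a direct truth table of F. The three-LUT decomposition below (a Shannon expansion on A1) is one possible mapping chosen for illustration, not necessarily the one drawn on the slide, and all function names are hypothetical.

```python
from itertools import product


def f(a0, a1, a2, a3):
    """F = A0 A1 A3 + A1 A2 ~A3 + ~A0 ~A1 ~A2."""
    return (a0 & a1 & a3) | (a1 & a2 & (1 - a3)) | ((1 - a0) & (1 - a1) & (1 - a2))


def truth_table(func, k):
    """Pack a k-input function into a 2**k-bit LUT word (input i is index bit i)."""
    word = 0
    for index in range(2 ** k):
        bits = [(index >> i) & 1 for i in range(k)]
        word |= func(*bits) << index
    return word


# Assumed decomposition into three 3-input LUTs via Shannon expansion on A1.
def g(a0, a2, a3):
    """Cofactor of F for A1 = 1: A0 A3 + A2 ~A3."""
    return (a0 & a3) | (a2 & (1 - a3))


def h(a0, a2, unused):
    """Cofactor of F for A1 = 0: ~A0 ~A2 (third LUT input left unused)."""
    return (1 - a0) & (1 - a2)


def select(a1, g_out, h_out):
    """Third LUT: pick the cofactor that matches A1."""
    return g_out if a1 else h_out


def f_three_luts(a0, a1, a2, a3):
    return select(a1, g(a0, a2, a3), h(a0, a2, 0))


def total_bits(num_luts, k):
    """Configuration bits used by num_luts LUTs of k inputs each."""
    return num_luts * 2 ** k


lut4_word = truth_table(f, 4)
matches = all(f(*v) == f_three_luts(*v) for v in product((0, 1), repeat=4))
```

Here `total_bits(1, 4)` gives 16 and `total_bits(3, 3)` gives 24, matching the slide. The 28-bit figure for 2-input LUTs corresponds to seven LUTs of 4 bits each.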

Logic block clusters (logic array block, LAB; configurable logic block, CLB)
- Assume a K-input LUT in each BLE, N BLEs per logic cluster, and I inputs per cluster
- The BLEs in each logic cluster are fully or nearly fully connected
- What are the best values for I, K, and N? [Betz-Rose 97]

- To be implemented in an FPGA, a design needs to be decomposed and mapped to logic blocks (LBs)
- Figure: mapping to a LUT in a CLB [Figure from Cong FPGA 01]

Heterogeneous reconfigurable logic
- The reconfigurable fabric might contain non-reconfigurable elements that interface to the logic blocks through the programmable interconnect fabric
- Examples:
  - Embedded memory
  - Embedded multipliers, adders, MAC
  - Embedded processors

Embedded memory blocks
- Memory is costly to implement with configurable logic blocks, so FPGAs add hard blocks of RAM
- Position and size vary depending on the FPGA device
- Size varies from a few thousand to tens of thousands of bits per RAM block [Maxfield 04]
- Each block can be used independently or combined with others to form larger RAM blocks
- Blocks can be single- or dual-port RAMs

Embedded multipliers and adders
- Multipliers are inherently slow if implemented by connecting a large number of programmable logic blocks, so FPGAs add hard-wired multiplier blocks
- Typically located close to the embedded RAM blocks
- Some FPGAs use multiply-and-accumulate (MAC) blocks (useful in DSP applications)

Programmable routing
- Wires provide the communication fabric that routes the output of one computational node to the inputs of another
- Why is routing so crucial?
  - Routing resources occupy a larger area than logic resources in an FPGA
  - The delay of a wire grows quadratically with its length
  - Technology scaling reduces device delay but increases wire delay
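The quadratic growth of wire delay can be illustrated with a minimal sketch, assuming an unbuffered wire modeled as a ladder of identical lumped RC sections and using the Elmore delay as the delay estimate. The function name and the unit R and C values are hypothetical.

```python
def elmore_delay(num_segments, r_per_segment=1.0, c_per_segment=1.0):
    """Elmore delay at the far end of a uniform RC ladder.

    The k-th capacitor (counting from the driver) charges through k series
    resistors, so the total is R*C*(1 + 2 + ... + n) = R*C*n*(n + 1)/2.
    """
    return sum(k * r_per_segment * c_per_segment
               for k in range(1, num_segments + 1))


if __name__ == "__main__":
    for n in (100, 200, 400):
        print(n, elmore_delay(n))
```

Under this model, doubling the number of segments roughly quadruples the delay, which is why long connections are usually broken up with buffers or switches.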

General routing definitions
- Figure labels: track, channel, segment, CLB
- A wire segment is a wire unbroken by programmable switches
- A track is a sequence of one or more wire segments in a line; the segments could be connected by switches at their ends
- A routing channel is a group of parallel tracks; the channel width is the number of tracks in the channel

Connection blocks
- Formed where CLB input or output pins connect to the routing channels
- Life would be easy if only logic blocks within the same column or row needed to communicate!

Segment-to-segment switch design for bidirectional wires [Lemieux 04]
- Figure labels: track, channel, segment, CLB

Switch blocks
- Formed wherever horizontal and vertical channels intersect
- Switch box size grows quadratically as a function of the number of its input wires
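A rough sketch of why switch box size grows quadratically, assuming a four-sided box with W wires per side in which every wire can reach every wire on each other side. For contrast, it also counts switches when each wire connects to only a fixed number of others. The parameter `fs` and both function names are hypothetical.

```python
def fully_connected_switches(w):
    """Bidirectional switches if each wire reaches every wire on the other sides.

    A four-sided box has 6 pairs of sides, and each pair needs w * w switches.
    """
    return 6 * w * w


def limited_switches(w, fs=3):
    """Switches if each of the 4*w wires connects to only fs other wires.

    Each bidirectional switch is shared by two wires, hence the division by 2.
    """
    return 4 * w * fs // 2


if __name__ == "__main__":
    for w in (10, 20, 40):
        print(w, fully_connected_switches(w), limited_switches(w))
```

The fully connected count quadruples when W doubles, while the limited count only doubles; this is the area argument for restricting switch box connectivity.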

Bidirectional switch details [Lemieux 04, Tessier]

Segmented and hierarchical routing
Segmented routing:
- Short wires accommodate local traffic
- Short wires can be connected together using switch boxes to emulate longer wires
- Long wires are also included to allow efficient communication without passing through switches
Hierarchical routing:
- Routing within a group of logic blocks occurs at the local level
- Longer hierarchical wires connect different groups

Programming the FPGA
- Figure labels: configuration data in, configuration data out; legend: I/O pin/pad, SRAM cell
- The configuration memory determines how the logic blocks and interconnects are programmed
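The "configuration data in" and "configuration data out" labels suggest that the SRAM cells are loaded serially. The sketch below assumes the cells form a single shift chain, which is a simplification; real devices use device-specific configuration schemes, and the class name `ConfigChain` is hypothetical.

```python
class ConfigChain:
    """Configuration SRAM cells modeled as one long shift register (assumed)."""

    def __init__(self, length):
        self.cells = [0] * length

    def shift_in(self, bit):
        """Clock one bit in and return the bit that falls out the far end."""
        out = self.cells[-1]
        self.cells = [bit] + self.cells[:-1]
        return out

    def load(self, bitstream):
        """Shift in a whole bitstream; returns the bits seen on data out."""
        return [self.shift_in(bit) for bit in bitstream]


chain = ConfigChain(4)
shifted_out = chain.load([1, 0, 1, 1])
```

After loading, the first bit of the bitstream sits in the cell farthest from the input, so a bitstream has to be ordered with the chain layout in mind.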

Programmable switch technology
Anti-fuse:
- Switch by default is OFF; when programmed it is ON
- Advantages: negligible delay; small area overhead
- Disadvantages: not really reconfigurable; one-time programmable
Flash:
- Switch by default is ON; when programmed it is OFF
- Advantages: programming is not lost when the device is turned off
- Disadvantages: requires more manufacturing steps
SRAM:
- An SRAM bit cell stores the programmed state of the device
- Advantages: can be reconfigured quickly and as often as required; no special fabrication steps
- Disadvantages: takes more area; volatile, so the configuration is lost when power is turned off

Case study: Altera's Cyclone II device
- Two-dimensional array of logic array blocks (LABs), with 16 logic elements (LEs) in each LAB
- Embedded memory blocks (M4K) and multipliers (18x18)
- Phase-locked loops (PLLs) are used to generate clock signals over a range of frequencies
- The EP2C35 (on the DE2 board) has 60 columns and 45 rows, for a total of 33,216 LEs, plus 105 M4K blocks and 35 embedded multipliers

Logic element organization (normal mode)
- The LE has two operating modes: normal and arithmetic
- Normal mode is suitable for general logic implementation
- 4-input LUT
- 6 input connections
- 3 output connections
- LAB-wide synchronous/asynchronous clear and load signals
- Clock signal

Logic element organization (arithmetic mode)
- Arithmetic mode is suitable for implementing adders, counters, accumulators, and comparators
- The LUT is split into two 3-input LUTs (ideal for implementing 2-bit full adders) and a basic carry chain
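The idea behind arithmetic mode can be sketched with two 3-input LUTs per bit position, one producing the sum and one producing the carry that feeds the next position. This is a simplified model under that assumption and does not reproduce the exact Cyclone II LE wiring; the constant and function names are hypothetical.

```python
# LUT index = a | (b << 1) | (cin << 2); bit i of the word is the output for index i.
SUM_LUT = 0b10010110    # a XOR b XOR cin (1 where the index has odd parity)
CARRY_LUT = 0b11101000  # majority(a, b, cin) (1 where at least two inputs are 1)


def lut3(config, a, b, c):
    """Read one bit of an 8-bit LUT word selected by three inputs."""
    return (config >> (a | (b << 1) | (c << 2))) & 1


def ripple_add(x, y, width):
    """Add two width-bit numbers by chaining the carry from bit to bit."""
    carry = 0
    result = 0
    for i in range(width):
        a, b = (x >> i) & 1, (y >> i) & 1
        result |= lut3(SUM_LUT, a, b, carry) << i
        carry = lut3(CARRY_LUT, a, b, carry)
    return result, carry
```

For example, `ripple_add(5, 3, 4)` returns `(8, 0)`, and `ripple_add(15, 1, 4)` returns `(0, 1)` because the sum overflows four bits. The carry is the only signal passed between positions, which is what a dedicated carry chain speeds up.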

Logic array block organization
- Each LAB consists of 16 LEs, LAB control signals, LE carry chains, register chains, and local interconnects
- Local interconnects transfer signals between LEs in the same LAB and are driven by column and row interconnects and by LE outputs within the same LAB
- Neighboring LABs, PLLs, M4K RAM blocks, and multipliers to the left and right can also drive a LAB's local interconnect
- Each LE can drive 48 LEs through fast local and direct link interconnects

Register/carry chain connections within a LAB

Multitrack interconnects
- The multitrack interconnect consists of row interconnects (direct link, R4, R24) and column interconnects (register chain, C4, C16)
- R4/C4 interconnects span 4 blocks (right or left / up or down)
- R24/C16 interconnects span 24/16 blocks and connect to R4/C4 interconnects
- R4 and C4 interconnects can drive each other to extend their range

C4 interconnections
- C4 interconnects drive local and R4 interconnects for up to 4 rows
- C16 column interconnects span 16 LABs and provide long column connections
- C16 column interconnects drive LAB local interconnects indirectly, via C4 and R4 interconnects

Embedded RAMs and multipliers
M4K memory blocks:
- 4608 RAM bits (with or without parity)
- 250 MHz performance
- Either single- or dual-port memory
- Can also be configured as a FIFO
Embedded multipliers:
- Ideal for DSP applications
- 250 MHz performance
- Configured either as one 18-bit multiplier or as two independent 9-bit multipliers
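As a quick worked check, the per-block figures here can be combined with the EP2C35 totals quoted in the case-study slide (33,216 LEs, 16 LEs per LAB, 105 M4K blocks, 35 multipliers). The variable names are hypothetical; the inputs are the numbers stated in these notes.

```python
LES_PER_LAB = 16
EP2C35_LES = 33216
EP2C35_M4K_BLOCKS = 105
M4K_BITS_PER_BLOCK = 4608
EP2C35_MULTIPLIERS = 35

# Number of LABs implied by the LE count.
lab_count = EP2C35_LES // LES_PER_LAB

# Total embedded RAM if every M4K block is used at full capacity.
total_ram_bits = EP2C35_M4K_BLOCKS * M4K_BITS_PER_BLOCK

# Each embedded multiplier can be split into two independent 9-bit multipliers.
max_9bit_multipliers = 2 * EP2C35_MULTIPLIERS

if __name__ == "__main__":
    print(lab_count, total_ram_bits, max_9bit_multipliers)
```

This gives 2,076 LABs, 483,840 embedded RAM bits, and up to 70 9-bit multipliers for the EP2C35.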

I/O element (IOE) structure
- The IOE structure allows bidirectional signals
- 5 IOEs per row I/O block
- Row I/O blocks drive C4, R4, R24, or direct link interconnects
- Column I/O blocks drive C4 and C16 interconnects