L2: FPGA HARDWARE 18-545: ADVANCED DIGITAL DESIGN PROJECT FALL 2015 BRANDON LUCIA

Transcription:

L2: FPGA HARDWARE 18-545: ADVANCED DIGITAL DESIGN PROJECT FALL 2015 BRANDON LUCIA

18-545: FALL 2014
Admin stuff
- Project Proposals happen on Monday
  - Be prepared to give an in-class presentation
- Lab 1 is due Wednesday, Sept. 16th
- Reading Assignment #1 due today
  - Submit a PDF/text file, don't fill in the web form
- Team assignments are done

Admin Stuff
- Status reports due today
  - No Word docs, please!
  - Be specific about what happened/is going to happen
  - Talk about what YOU did/will do, not just what your group did
- Grades on the way, as general feedback

Game Plan
- Overview
- Why use FPGAs?
- FPGA Internals
Caveat: I will use Xilinx-specific terminology since that's the FPGA company you will be using. Beware that other companies use different terms.

FPGA Overview
- Field Programmable Gate Array
  - Array of generic logic gates
  - Gates where logic function can be programmed
  - Programmable interconnection between gates
  - Fielded systems can be programmed, i.e. post-fabrication

Xilinx Virtex-5 FPGA

Design Platform
- Virtex-5 Development System
  - Xilinx XC5VLX110T FPGA
    - 17280 slices of CLB goodness
  - 256MB DDR2 (SODIMM)
  - DVI Video port
    - VGA port is for input
  - 10/100/1000 Ethernet port
  - Audio Codec (AC97)
  - USB2 port
  - 16x2 LCD, RS-232
  - Compact Flash card slot
  - Expansion connectors

Game Plan
- Overview
- Why use FPGAs?
- FPGA Internals

Why use FPGAs?
- System designers have a Goldilocks problem
  - Off-the-shelf parts are not efficient enough
  - Custom ASICs cost too much
  - Need a "just right" solution

ASIC Design
- Difficult to design
  - Large and complex
  - Issues in advanced processes
    - Interconnect delay
    - Device leakage
    - Power density constraints
- Expensive to design / fabricate
  - Mask set costs
  - Non-recurring engineering costs
- Need a high-volume, high-profit market to justify costs!

Efficiency View
[Figure: energy efficiency (MOPS/mW) and area efficiency (MOPS/mm2) on a logarithmic scale, comparing microprocessors, DSPs and ASICs]
- An efficiency gap exists between ASICs and CPUs!
Source: N. Zhang et al., "The Cost of Flexibility in Systems on a Chip Design for Signal Processing Applications"

Economic View
[Figure: total cost (development cost + device cost) versus total units, with an ASIC trend line and an FPGA trend line. At low volumes the FPGA solution has a lower total cost; at high volumes the ASIC solution has a lower total cost. Decreasing FPGA unit cost is pushing the crossover point to the right. (Courtesy Xilinx, Inc.)]
- Additional ASIC costs:
  - Increasing NRE charge
  - 58% are late to market -- impacts total volumes shipped
  - ASIC cycle longer than some market windows
  - Over 50% need to be respun
- FPGAs: High package costs ($300+), low NRE costs
- ASICs: Low package costs (pennies), high NRE costs ($600K+)
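The crossover point on the Economic View slide can be worked out with a simple linear cost model. The sketch below is illustrative only: the function names are made up for this example, and the FPGA NRE of $0 and ASIC unit cost of $0 are simplifying assumptions standing in for "low NRE" and "pennies". Only the $300 and $600K figures come from the slide.

```python
def total_cost(nre: float, unit_cost: float, units: int) -> float:
    """Total cost = one-time NRE plus per-unit cost times volume."""
    return nre + unit_cost * units


def crossover_units(fpga_nre: float, fpga_unit: float,
                    asic_nre: float, asic_unit: float) -> float:
    """Volume at which the FPGA and ASIC total costs are equal."""
    if fpga_unit <= asic_unit:
        raise ValueError("no crossover: FPGA unit cost must exceed ASIC unit cost")
    return (asic_nre - fpga_nre) / (fpga_unit - asic_unit)


# From the slide: $300 per FPGA package, $600K ASIC NRE.
# Assumed for illustration: FPGA NRE = $0, ASIC unit cost = $0.
FPGA_NRE, FPGA_UNIT = 0.0, 300.0
ASIC_NRE, ASIC_UNIT = 600_000.0, 0.0

volume = crossover_units(FPGA_NRE, FPGA_UNIT, ASIC_NRE, ASIC_UNIT)
print(f"FPGA is cheaper below about {volume:.0f} units")
```

Under these assumptions a lower FPGA unit cost raises the crossover volume, which is the "pushing the crossover point to the right" effect noted on the slide.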

FPGA Advantages
- Higher performance than CPU solution
- Lower power than CPU solution (usually)
- Low NRE costs
  - Off-the-shelf part designed by FPGA vendor
  - You are sharing NRE costs with all other customers
- Fast design time
  - Low time-to-market
- Fast re-design / re-fabrication time
  - Easy to correct an error, to add functionality, in response to spec change
  - Can even change product after deployment

FPGA Disadvantages
- High per-part costs
  - Good for low to middle volume applications
  - High volume applications should consider ASICs
    - Perhaps use FPGA for prototyping
- Lower performance than ASIC
- Higher power than ASIC
- More specialized design skills than programming a CPU

Example uses of FPGAs
- Rapid Prototyping
  - Emulation of ASIC design
  - Design exploration
  - Verification
- Shipping product
  - Networking
  - Military
  - Microsoft Bing Datacenters
- Reconfigurable Computing

Game Plan
- Overview
- Why use FPGAs?
- FPGA Internals

FPGA Breakdown
- 3 Basic components
  - Configurable Logic Blocks
  - General purpose interconnect
  - I/O Blocks
- Advanced components
  - Hard macros
    - CPUs
    - Block RAM
    - Multipliers
  - Specialized components
[Figure: Virtex-II Pro]

Xilinx XC3020
[Figure: device layout labeled with CLB (64 total), I/O block (64 total), general purpose interconnect and switch matrix. (Courtesy Xilinx, Inc.)]
- IOBs have direct access to adjacent CLBs

Routing
[Figure: zoomed-in view of the CLB matrix of the FPGA, with an even more zoomed-in view alongside. (Courtesy Xilinx, Inc.)]
- Specific ingress and egress connection options (black dots) are available


Routing: The Switch Matrix
- Each matrix has 5 connections per side
- Only certain connection patterns are possible
(Courtesy Xilinx, Inc.)

Hierarchical Routing
- Spartan-2 and more recent have different length connections between switch matrices
  - Local roads, limited access roads, interstate highways
  - Routes across entire chip don't burn lots of short connections

Configurable Logic Blocks
- CLBs get more and more stuff crammed in them over time
- XC3K family had LUT (5 variable input, 2 FF values, 2 outputs), 2 FFs, clock enable, FF reset (direct / global) and 9 muxes
  - ~51 bits of configuration SRAM per CLB
(Courtesy Xilinx, Inc.)

What's a Look-up-table (LUT)?
- A direct implementation of a truth table, using memory
  - LUT inputs are memory address values
  - LUT outputs are the memory data value
[Figure: LUT block with inputs A, B, C, D and output F]

Truth table 1 (shown next to a gate-level circuit with inputs A, B and output F):
A B C D | F
0 0 0 0 | 1
0 0 0 1 | 1
0 0 1 0 | 1
0 0 1 1 | 1
0 1 0 0 | 1
0 1 0 1 | 1
0 1 1 0 | 1
0 1 1 1 | 1
1 0 0 0 | 1
1 0 0 1 | 1
1 0 1 0 | 1
1 0 1 1 | 1
1 1 0 0 | 0
1 1 0 1 | 0
1 1 1 0 | 0
1 1 1 1 | 0

Truth table 2 (shown next to a gate-level circuit with inputs A, C, D, B and output F):
A B C D | F
0 0 0 0 | 0
0 0 0 1 | 1
0 0 1 0 | 0
0 0 1 1 | 0
0 1 0 0 | 0
0 1 0 1 | 1
0 1 1 0 | 0
0 1 1 1 | 1
1 0 0 0 | 0
1 0 0 1 | 1
1 0 1 0 | 0
1 0 1 1 | 0
1 1 0 0 | 1
1 1 0 1 | 1
1 1 1 0 | 1
1 1 1 1 | 0
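As a rough software model of the idea on this slide, a 4-input LUT can be treated as a 16-entry memory whose address is formed from the inputs. The class name `Lut4` and the choice of A as the most significant address bit are assumptions made for this sketch; the stored bits are the F column of the slide's first truth table.

```python
class Lut4:
    """4-input look-up table: 16 configuration bits addressed by (A, B, C, D)."""

    def __init__(self, truth_table):
        bits = list(truth_table)
        if len(bits) != 16 or any(b not in (0, 1) for b in bits):
            raise ValueError("a 4-input LUT needs exactly 16 bits, each 0 or 1")
        self.bits = bits

    def __call__(self, a, b, c, d):
        # The inputs form the memory address; A is taken as the MSB,
        # matching the row order of the truth table.
        address = (a << 3) | (b << 2) | (c << 1) | d
        return self.bits[address]


# F column of the first truth table, rows 0000 through 1111:
# twelve 1s followed by four 0s.
first_table = Lut4([1] * 12 + [0] * 4)

print(first_table(0, 1, 1, 0))  # row 0110
print(first_table(1, 1, 0, 1))  # row 1101
```

Changing the 16 stored bits changes the logic function without changing the hardware, which is what makes the block programmable.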

Another View of LUTs
[Figure: 16 configuration flip-flops (D/Q), programmed as part of the configuration bitstream, feeding the data inputs of a 16 x 1 mux; the LUT inputs drive the mux select lines and the mux drives the output]
- Can view LUT as 16:1 mux
  - Inputs are mux select
  - Config sets mux data inputs
  - Logically same as 16x1 memory
- Can compact logic if you can route inputs to mux data inputs
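The mux view can be sketched as a tree of 2:1 muxes whose data inputs are the 16 configuration bits and whose select lines are the LUT inputs. The helper names `mux2` and `mux16` and the example configuration bits are hypothetical, chosen only for this illustration.

```python
def mux2(d0, d1, sel):
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def mux16(data, select):
    """16:1 mux built from four levels of 2:1 muxes.

    data   -- 16 configuration bits (the mux data inputs)
    select -- (s3, s2, s1, s0), the LUT inputs, s3 being the MSB
    """
    level = list(data)
    if len(level) != 16 or len(select) != 4:
        raise ValueError("expected 16 data bits and 4 select bits")
    # The least significant select bit chooses within adjacent pairs first.
    for sel in reversed(select):
        level = [mux2(level[i], level[i + 1], sel)
                 for i in range(0, len(level), 2)]
    return level[0]


config = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1]  # arbitrary example bits

# The mux output equals a plain memory read at the same address.
for address in range(16):
    select = tuple((address >> k) & 1 for k in (3, 2, 1, 0))
    assert mux16(config, select) == config[address]
```

The loop at the end checks the slide's claim that the mux is logically the same as a 16x1 memory.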

Look Up Table Additional Functionality
- Can be configured as:
  - Shift register (16 regs)
  - Small memory (16 bits)
    - Distributed RAM!
- Some other FPGAs use muxes instead of memories to implement the core combinational logic
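A behavioral sketch of the shift-register mode: the same 16 storage bits are chained so that each clock shifts a new bit in, and the 4 LUT inputs select which tap to read. The class name `ShiftLut16` and its methods are invented for this example and do not correspond to a vendor primitive's interface.

```python
class ShiftLut16:
    """16 LUT storage bits reused as a shift register with a selectable tap."""

    def __init__(self):
        self.regs = [0] * 16

    def clock(self, data_in):
        """Shift one bit in; the oldest bit falls off the end."""
        self.regs = [data_in & 1] + self.regs[:-1]

    def read(self, tap):
        """Read the bit that was shifted in (tap + 1) clocks ago."""
        return self.regs[tap]


srl = ShiftLut16()
for bit in (1, 0, 0):
    srl.clock(bit)

print(srl.read(2))  # the 1 shifted in three clocks ago
print(srl.read(0))  # the most recent bit
```

Reading a different tap gives a different delay from the same storage, so one LUT can stand in for up to 16 chained flip-flops.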

Spartan-2 CLB
- Spartan-2 has 2 LUTs (4 input each) feeding a 3rd LUT, 2 FFs (with Preset/Reset, Enable, posedge or negedge clocks) and 16 muxes
- 12 inputs (plus clock), 4 outputs
(Courtesy Xilinx, Inc.)

Spartan-3
- CLBs are composed of 4 slices
  - Organized as 2 pairs, one of which is optimized for memory access
- Each slice has 2 FFs and 2 LUTs
(Courtesy Xilinx, Inc.)

FPGA Families extend Architecture
- Devices are built, with more capability, but around the same basic architecture
- Some additional capabilities
  - Low voltage versions
  - Faster clock rates
  - Different packaging options
(Courtesy Xilinx, Inc.)

The need for more stuff
- CompEs cannot design on logic, routing, I/O alone
- Extreme case from early 90s
  - 16 port ATM switch, designed on a single board!
  - FPGAs (XC3Ks)
  - FIFO memory chips
- Design is limited by I/O to memory chips -- bring them on-chip

Other Stuff
- Clock managers
  - Global clock buffering, distribution
  - DCM: eliminate skew, phase shifts, multiply or divide clock
- Memory
  - Block RAM
  - Distributed RAM (repurposed LUTs)
  - Shift Registers
- Dedicated Multiplexers
- Carry Look-Ahead Generators
- I/O Blocks
  - SelectIO supports 18 standards (single, differential, various voltage levels, ...)
- Embedded Multipliers

Hard Macros
- Hard macros
  - Block RAMs
  - Multipliers
  - CPUs
- Soft macros
  - HDL

Block RAMs
- Distributed RAM
  - Use LUTs as memories
  - Low density
  - Poor performance!
- Block RAM
  - Large-ish dedicated memory blocks
  - Xilinx BRAMs = 18Kb
  - Some configurability
    - Dual-port
    - Data width / depth
    - FIFO, CAM, etc.

Multipliers
- 18x18 signed 2's-complement multiplier
  - Two 18b inputs
  - One 36b output
- 18b enough for many DSP applications
  - Can gang multiple units together for wider data
- Faster and lower power than multiplier from CLBs
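One way to picture ganging multipliers together is to split each wide operand into a signed upper part and an unsigned 17-bit lower part, so that every partial product fits an 18x18 signed multiplier. The sketch below uses that decomposition for 35-bit signed operands. It is a common textbook technique assumed here for illustration, not something the slide specifies, and the function names are hypothetical.

```python
def mul18(a, b):
    """Model of one 18x18 signed multiplier: operands must fit in 18 bits."""
    lo, hi = -(1 << 17), (1 << 17)
    if not (lo <= a < hi and lo <= b < hi):
        raise ValueError("operand does not fit in 18-bit two's complement")
    return a * b  # result fits in 36 bits


def split35(x):
    """Split a 35-bit signed value into (signed upper 18 bits, unsigned lower 17 bits)."""
    if not (-(1 << 34) <= x < (1 << 34)):
        raise ValueError("operand does not fit in 35-bit two's complement")
    return x >> 17, x & 0x1FFFF  # x == upper * 2**17 + lower


def mul35(a, b):
    """35x35 signed multiply assembled from four 18x18 partial products."""
    ah, al = split35(a)
    bh, bl = split35(b)
    return ((mul18(ah, bh) << 34)
            + ((mul18(ah, bl) + mul18(al, bh)) << 17)
            + mul18(al, bl))


a, b = -12_345_678_901, 9_876_543_210
assert mul35(a, b) == a * b
```

The lower halves are non-negative 17-bit values, so they occupy an 18-bit signed input with the sign bit at zero; that is why the wide operand is 35 bits rather than 36.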

CPUs
- PowerPC 405
  - XC2VP30 has 2 embedded PowerPC 405 cores
  - Embedded L1 I and D caches
  - No FPU

CPU Connectivity: PLB and OPB
- IBM CoreConnect
  - Processor Local Bus (PLB) - fast on-chip communication
  - On-Chip Peripheral Bus (OPB) - optimized for peripherals (UART, etc.)
  - Device Control Register bus (DCR) - used to send and set configuration

CPU Connectivity: PLB and OPB (cont.)

CPU Connectivity: OCM
- On-Chip Memory controller
  - CPU <-> block RAM
  - 2 OCMs: I and D
- Direct, fast interface
- Can use dual-port BRAMs for producer-consumer link to FPGA fabric

CPU Links
A lot more details on the embedded CPU:
http://www.xilinx.com/bvdocs/userguides/ppc_ref_guide.pdf
http://direct.xilinx.com/bvdocs/userguides/ug018.pdf
http://www-3.ibm.com/chips/techlib/techlib.nsf/productfamilies/CoreConnect_Bus_Architecture

Zynq 7000
- Advanced Microcontroller Bus Architecture (AMBA) + Advanced eXtensible Interface (AXI)
  - To memory, FPGA fabric, I/O & peripherals
  - AMBA = ARM's attempt at The One True Interface

Configuration Storage
- Lots of configuration bits
  - LUTs, routing, I/O configuration
  - Xilinx XC2VP30 has >11Mb
- Configuration storage technologies
  - Volatile
    - SRAM cells
  - Non-volatile
    - FLASH, EEPROM
    - Anti-fuse
[Figures: 6T SRAM cell with word line WL and bit lines bit and bit_b; Actel anti-fuse]

Configuration
- How to load (scan) configuration bits (bitstream)
  - Connect all configuration registers into single long shift register
  - Serially clock in configuration bits
  - Most designs use standard scan interface (JTAG) developed for test
- Bitstream source
  - Non-volatile memory
    - On-board FLASH, EEPROM, serial memory
  - External media (CF card)
  - Attached workstation
- Can encrypt bitstream to conceal configuration
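The serial loading scheme described on this slide can be modeled as one long shift register that receives the bitstream one bit per clock. This is a toy model under stated assumptions: the class name `ConfigChain`, the 4-cell chain length and the example bitstream are all hypothetical, and real devices add framing, addressing and error checking that are not shown.

```python
class ConfigChain:
    """All configuration cells connected into a single long shift register."""

    def __init__(self, length):
        self.cells = [0] * length

    def clock(self, bit_in):
        """Shift one bit in at the head; return the bit leaving the tail."""
        bit_out = self.cells[-1]
        self.cells = [bit_in & 1] + self.cells[:-1]
        return bit_out

    def load(self, bitstream):
        """Serially clock in a whole bitstream, one bit per clock."""
        for bit in bitstream:
            self.clock(bit)


chain = ConfigChain(4)
chain.load([1, 0, 1, 1])

# The first bit clocked in ends up in the cell farthest from the input.
print(chain.cells)
```

Because the first bit travels the farthest, a bitstream for a chain like this is ordered so that the cell at the far end of the chain is sent first.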

Major FPGA Vendors
- SRAM-based FPGAs
  - Xilinx
  - Altera
  - Atmel
  - Lattice Semiconductor
  - (Slide annotation on this list: "Share over 60% of the market")
- Flash & antifuse FPGAs
  - Actel Corp.
  - QuickLogic Corp.
  - Lattice Semiconductor
  - Xilinx (system-in-a-package solution)