FPGA VHDL Design Flow AES128 Implementation


Sakinder Ali


Field Programmable Gate Array

Basic idea: a two-dimensional array of logic blocks and flip-flops, with a means for the user to configure:
1. The interconnection between the logic blocks
2. The function of each block

The FPGA LUTs are configured to implement gates. Example LUT contents (C = A AND B):

A B | C
0 0 | 0
0 1 | 0
1 0 | 0
1 1 | 1

Adding more functionality: a DFF next to a 4-input LUT forms a "Configurable Logic Block" (CLB), the basic FPGA functional unit.

[Figure: empty look-up tables, then LUTs configured as INV, AND and OR gates with inputs A, B, C and output Out.]

The interconnect switches are then programmed to implement the net connections.
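As a rough illustration of the idea above, a LUT can be modeled as a small memory addressed by its inputs, and a CLB as a LUT followed by a D flip-flop. The sketch below uses Python rather than an HDL. The class names `Lut` and `Clb` and the bit ordering are assumptions made for this example, not something taken from the slides or from any vendor's architecture.

```python
from itertools import product


class Lut:
    """A k-input look-up table: the inputs form an address into a list of bits."""

    def __init__(self, num_inputs, contents):
        if len(contents) != 2 ** num_inputs:
            raise ValueError("contents must have 2**num_inputs entries")
        self.num_inputs = num_inputs
        self.contents = list(contents)

    def read(self, *inputs):
        if len(inputs) != self.num_inputs:
            raise ValueError("wrong number of inputs")
        # Assumption: the first input is the most significant address bit.
        address = 0
        for bit in inputs:
            address = (address << 1) | (bit & 1)
        return self.contents[address]


class Clb:
    """Hypothetical, simplified CLB: one LUT feeding one D flip-flop."""

    def __init__(self, lut):
        self.lut = lut
        self.q = 0  # flip-flop output, changes only when clock() is called

    def clock(self, *inputs):
        self.q = self.lut.read(*inputs)
        return self.q


# "Configuring" a 2-input LUT as an AND gate (the truth table on the slide).
and_lut = Lut(2, [0, 0, 0, 1])
and_table = [(a, b, and_lut.read(a, b)) for a, b in product((0, 1), repeat=2)]

# Loading different contents turns the same structure into an OR gate.
or_lut = Lut(2, [0, 1, 1, 1])
or_table = [(a, b, or_lut.read(a, b)) for a, b in product((0, 1), repeat=2)]

# The flip-flop holds the LUT result until the next clock.
clb = Clb(and_lut)
registered = [clb.clock(1, 1), clb.clock(1, 0)]

if __name__ == "__main__":
    print(and_table)
    print(or_table)
    print(registered)
```

The point of the sketch is that the "gate" is nothing more than the stored contents: changing the list changes the function without changing the structure.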

Gate count:
- 1987: 9,000 gates (Xilinx)
- 1992: 600,000 gates (Naval Surface Warfare Department)
- Early 2000s: millions of gates

Market size:
- 1985: first commercial FPGA technology invented by Xilinx
- 1987: $14 million
- ~1993: >$385 million
- 2005: $1.9 billion
- 2010 (estimate): $2.75 billion

- NASA/Lockheed X-33 launch vehicle.
- SpaceWire: the communications system is driven by low-power chips.
- VMC: the Venus Monitoring Camera is a wide-angle, multi-channel CCD camera. It includes an FPGA that pre-processes image data, reducing the amount transmitted to Earth.
- NASA MEMS microshutter array: an FPGA is used to modify the control algorithm to improve testing, explore shutter issues, and further the array's development.

Over the last decade, the company's FPGAs have been onboard more than 100 launches and flown on over 300 satellites and spacecraft, including GPS, the Mars Reconnaissance Orbiter, and the Mars Exploration Rovers (Spirit and Opportunity).

[Figure: block diagram of the data path from an ADC channel to a SONET transceiver inside the FPGA. Labeled blocks and quoted maximum rates:
- 8-bit parallel input (up to 3.5 GHz)
- 1:8 demuxes (up to 437.5 MHz)
- 8:48 deserializer
- Bit-ordering FIFO (up to 72.92 MHz)
- Bit aligner
- Packetizer and data frame generator (up to 77.76 MHz)
- SONET transceiver (up to 77.76 MHz parallel, up to 2.489 GHz serial)
Bus widths labeled between blocks: 2 channels of 8 bits, 2 and 8 channels of 48 bits, 12 and 3 channels of 32 bits. A second panel on the bit aligner labels inputs 0 to 7, 32-bit words, a 48-bit SONET frame, and channels 0 to 11.]
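The clock rates quoted in this diagram are consistent with each stage dividing (or multiplying) the clock by its width ratio. The short Python check below works through that arithmetic. The assumptions are that the 1:8 demux divides the clock by 8, that the 8:48 deserializer divides it by 6, and that the transceiver serializes 32-bit words; the variable names are made up for this sketch.

```python
# Rates in MHz, taken from the labels in the diagram.
input_clock_mhz = 3500.0        # 8-bit parallel input, up to 3.5 GHz
sonet_word_clock_mhz = 77.76    # parallel side of the SONET transceiver

# Assumption: a 1:8 demux spreads each bit over 8 lanes, so the clock drops by 8.
after_demux_mhz = input_clock_mhz / 8

# Assumption: an 8:48 deserializer widens the bus by 48 / 8 = 6, so the clock drops by 6.
after_deserializer_mhz = after_demux_mhz / (48 / 8)

# Assumption: the transceiver serializes one 32-bit word per parallel clock.
serial_line_rate_mhz = sonet_word_clock_mhz * 32

if __name__ == "__main__":
    print(f"after 1:8 demux:      {after_demux_mhz:.2f} MHz")
    print(f"after 8:48 stage:     {after_deserializer_mhz:.2f} MHz")
    print(f"SONET serial rate:    {serial_line_rate_mhz:.2f} Mb/s")
```

Under these assumptions the first two results reproduce the 437.5 MHz and 72.92 MHz labels, and the last one comes out at about 2.488 Gb/s, close to the 2.489 GHz figure in the diagram.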

Xilinx FPGA families: Virtex-7 HT, Virtex-6, Virtex-5, Virtex-4, Virtex-II Pro, Spartan-6, Extended Spartan-3A, Spartan-3E, Spartan-3

28 Gbps serial transceiver technology: Xilinx Virtex-7 HT FPGA highlights
- Enables communication equipment vendors to develop integrated, high-bandwidth-efficient systems
- Built with four to sixteen 28 Gbps transceivers complying with OIF CEI-28G (the Optical Internetworking Forum's Common Electrical I/O)
- Up to seventy-two 13.1 Gbps transceivers
- Up to 2.8 Tbps full-duplex throughput

Feature                                    Artix-7      Kintex-7     Virtex-7     Spartan-6    Virtex-6
Logic cells                                352,000      480,000      2,000,000    150,000      760,000
Block RAM                                  19 Mb        34 Mb        85 Mb        4.8 Mb       38 Mb
DSP slices                                 1,040        1,920        5,280        180          2,016
DSP performance (symmetric FIR)            1,129 GMACS  2,450 GMACS  6,737 GMACS  140 GMACS    2,419 GMACS
Transceiver count                          16           32           96           8            72
Transceiver speed                          6.6 Gb/s     12.5 Gb/s    28.05 Gb/s   3.125 Gb/s   11.18 Gb/s
Total transceiver bandwidth (full duplex)  211 Gb/s     800 Gb/s     2,784 Gb/s   50 Gb/s      536 Gb/s
Memory interface (DDR3)                    1,066 Mb/s   1,866 Mb/s   1,866 Mb/s   800 Mb/s     1,066 Mb/s
PCI Express interface                      Gen2 x4      Gen2 x8      Gen3 x8      Gen1 x1      Gen2 x8
Configuration AES                          Yes          Yes          Yes          Yes          Yes
I/O pins                                   600          500          1,200        576          1,200

Agile Mixed Signal (AMS)/XADC: listed as "Yes" for four of the five families; the transcription does not show which column was blank.

The company claims it is the fastest product rollout of next-generation programmable logic devices built with 28 nm technology.
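Several of the "total transceiver bandwidth" figures quoted here can be reproduced from the transceiver counts and line rates. The Python sketch below assumes that full-duplex bandwidth is simply twice the sum of count times line rate, and that the Virtex-7 figure corresponds to the Virtex-7 HT mix of sixteen 28.05 Gb/s and seventy-two 13.1 Gb/s transceivers from the highlights. The function name is hypothetical.

```python
def full_duplex_gbps(groups):
    """Return full-duplex bandwidth in Gb/s.

    `groups` is an iterable of (transceiver_count, line_rate_gbps) pairs.
    Assumption: every transceiver runs at its maximum rate in both directions.
    """
    one_direction = sum(count * rate for count, rate in groups)
    return 2 * one_direction


artix7 = full_duplex_gbps([(16, 6.6)])
kintex7 = full_duplex_gbps([(32, 12.5)])
spartan6 = full_duplex_gbps([(8, 3.125)])
# Virtex-7 HT mix taken from the highlights: 16 x 28.05 Gb/s plus 72 x 13.1 Gb/s.
virtex7_ht = full_duplex_gbps([(16, 28.05), (72, 13.1)])

if __name__ == "__main__":
    for name, value in [("Artix-7", artix7), ("Kintex-7", kintex7),
                        ("Spartan-6", spartan6), ("Virtex-7 HT", virtex7_ht)]:
        print(f"{name:12s} {value:8.1f} Gb/s")
```

Under these assumptions the results match the quoted 211, 800, 50 and 2,784 Gb/s figures to within rounding, and the last one is the "up to 2.8 Tbps" in the highlights.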

~$15,000

According to Altera, the technology leadership continues with the transceivers in 28 nm Stratix V FPGAs: providing more than 930 Gbps of transceiver bandwidth, Stratix V FPGAs deliver the highest system bandwidth at the lowest power consumption for a wide range of applications and protocols.

~$200-$500

One-time programmable:
- Fuses (destroy internal links with current)
- Antifuses (grow internal links)

Reprogrammable:
- EPROM
- EEPROM
- Flash
- SRAM (volatile)

Programmable Function Fusible Link Technology

Antifuse-based configuration uses a two-terminal device that is electrically programmed to change from an open circuit to a short circuit, the inverse of a fuse. Programming is a one-time, permanent process: once a link has been made, it cannot be undone.

In the unprogrammed state (i), no connection exists between the two metal layers. Once programmed, a low-resistance link (ii) exists between the metal layers and connects them together.

Disadvantages:
- Not reprogrammable; links made are permanent

Advantages:
- Small size
- Relatively low series resistance
- Low parasitic capacitance

[Figure: floating-gate memory cell shown storing a 1 and a 0, with bit line, word line, control gate, oxide layer, floating gate, drain and source labeled.]

Encryption/decryption system used by Xilinx to protect the configuration of Virtex-II devices: the Xilinx CAD software encrypts the bitstream with the Triple Data Encryption Standard (Triple DES) algorithm before the configuration is downloaded into the FPGA. Triple DES is a standard used by many governments for secure communication and by banks around the world for money transfers. The algorithm uses three 56-bit secret keys. The designer can use random keys or choose their own keys.

[Figure: chain of SRAM configuration cells from "configuration data in" to "configuration data out"; legend: I/O pin/pad, SRAM cell.]

Disadvantages:
- Volatile
- External permanent memory required
- Large area required

Advantages:
- Reprogrammable, easily and quickly
- Requires only standard integrated circuit process technology (as opposed to antifuse)

SRAM-based FPGAs are volatile devices: upon power-up, they must be programmed from an external source. Most FPGA vendors do not publish the definition of the bitstream, so it is very difficult to reverse engineer a design from a configuration bitstream.

- Level 0 (Zero): No special security features added to the system.
- Level 1 (Low): Some security features in place.
- Level 2 (Medium Low): More expensive tools are required, as well as specialized knowledge.
- Level 3 (Medium): Special tools and equipment are required, as well as some special skills and knowledge. The attack may become time-consuming but will eventually be successful.
- Level 4 (Medium High): Equipment is available but is expensive to buy and operate.
- Level 5 (High): All known attacks have been unsuccessful.

Basic components:
- LUT (look-up table)
- Flip-flops
- Multiplexers
- I/O blocks
- Programmable switching matrices
- Interconnect
- Clocks

Create a circuit design:
- Graphic circuit tool
- Verilog
- VHDL
- AHDL
- Vendor tools

VHDL: VHSIC Hardware Description Language
VHSIC: Very High Speed Integrated Circuit
- Developed originally by DARPA for specifying digital systems
- International IEEE standard (IEEE 1076-1993)

Practical benefits:
- A mechanism for digital design and reusable design documentation
- Model interoperability among vendors
- Third-party vendor support
- Design reuse

VHDL allows the behavior of the required system to be described (modeled) and verified (simulated) before synthesis tools translate the design into real hardware (gates and wires).

VHDL allows the description of a concurrent system. It is a dataflow language, unlike procedural computing languages such as BASIC, C, and assembly code, which all run sequentially, one instruction at a time.

A VHDL project is multipurpose. Once created, a calculation block can be reused in many other projects, and many of its structural and functional parameters can still be tuned (capacity parameters, memory size, element base, block composition and interconnection structure).

A VHDL project is portable. A computing device project created for one element base can be ported to another, for example VLSI with various technologies.
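The difference between concurrent and sequential semantics can be sketched in Python. The example models two signal assignments, `a <= b` and `b <= a`, taking effect on the same clock edge, and compares them with the same two statements run one after the other. The helper names are hypothetical, and the one-step update is a deliberate simplification of how an HDL simulator schedules signal updates.

```python
def sequential_swap(a, b):
    """Procedural semantics: the second statement sees the first one's result."""
    a = b
    b = a
    return a, b


def concurrent_step(signals, assignments):
    """Apply all assignments at once: every right-hand side reads the OLD values.

    `signals` maps signal names to values; `assignments` maps a target signal
    name to a function of the old signal values.
    """
    old = dict(signals)
    new = dict(old)
    for target, expression in assignments.items():
        new[target] = expression(old)
    return new


# Two assignments that take effect together, as on one clock edge.
swap = {
    "a": lambda s: s["b"],  # a <= b
    "b": lambda s: s["a"],  # b <= a
}

seq_result = sequential_swap(1, 0)
conc_result = concurrent_step({"a": 1, "b": 0}, swap)

if __name__ == "__main__":
    print("sequential:", seq_result)   # both values end up equal
    print("concurrent:", conc_result)  # the two values are exchanged
```

Run sequentially, the original value of `a` is lost; applied together, the two signals exchange values, which is what two registers wired to each other's inputs would do.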

- Interfaces
- Behavior
- Structure
- Test benches
- Analysis, elaboration, simulation
- Synthesis

[Figure: annotated VHDL entity declaration, labeling the entity name, port names, port mode (direction), port type, punctuation, and reserved words.]

1. Design entry / RTL coding
   - Behavioral or structural description of the design
2. RTL simulation
   - Functional simulation
   - Verify logic model and data flow (no timing delays)
3. Synthesis
   - Translate the design into device-specific primitives
   - Optimize to meet the required area and performance constraints
4. Place and route
   - Map primitives to specific locations inside the target technology with reference to area and performance constraints
   - Specify the routing resources to be used
5. Timing analysis
   - Verify that performance specifications were met
   - Static timing analysis
6. Gate-level simulation
   - Timing simulation
   - Verify that the design will work in the target technology
7. PC board simulation and test
   - Simulate the board design
   - Program and test the device on the board
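The timing analysis step can be made concrete with the usual setup-time check on a register-to-register path: the data must arrive before the next clock edge minus the setup time. The Python sketch below uses that simplified model. The function names, the single-path model and the delay values are illustrative assumptions, not figures from the slides or from any device data sheet.

```python
def setup_slack_ns(clock_period_ns, clk_to_q_ns, path_delay_ns, setup_ns, skew_ns=0.0):
    """Setup slack for one register-to-register path.

    Positive slack means the path meets timing; negative slack is a violation.
    Assumption: `skew_ns` is the capture clock arriving later than the launch clock.
    """
    arrival = clk_to_q_ns + path_delay_ns
    required = clock_period_ns - setup_ns + skew_ns
    return required - arrival


def max_frequency_mhz(clk_to_q_ns, path_delay_ns, setup_ns, skew_ns=0.0):
    """Highest clock frequency at which the path still has non-negative slack."""
    min_period_ns = clk_to_q_ns + path_delay_ns + setup_ns - skew_ns
    return 1000.0 / min_period_ns


# Made-up example path: 0.5 ns clock-to-Q, 7.0 ns of logic and routing, 0.5 ns setup.
slack_at_100mhz = setup_slack_ns(10.0, 0.5, 7.0, 0.5)
slack_at_200mhz = setup_slack_ns(5.0, 0.5, 7.0, 0.5)
fmax = max_frequency_mhz(0.5, 7.0, 0.5)

if __name__ == "__main__":
    print(f"slack at 100 MHz: {slack_at_100mhz:+.1f} ns")
    print(f"slack at 200 MHz: {slack_at_200mhz:+.1f} ns")
    print(f"fmax: {fmax:.1f} MHz")
```

A static timing analyzer performs this kind of check on every path in the placed and routed design and reports the worst slack.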

A logic analysis system provides high-performance, system-level debugging of digital designs. Track down cross-domain analog and digital problems by time-correlating logic analysis systems.

[Figures: advanced FPGA design flow; typical FPGA design flow.]

The RTAX-DSP family offers high performance at densities of up to 4 million equivalent system gates and 840 user I/Os for space-based applications. It features up to 120 mathblocks, each capable of operating at 125 MHz over the full military temperature range (-55 °C to 125 °C), for a total throughput of 15 billion multiply-accumulates per second (15 GMACS).

Key features:
- Highly reliable, nonvolatile antifuse technology
- 250,000 to 500,000 ASIC gates (2 to 4 million system gates)
- Up to 120 DSP mathblocks with 125 MHz 18-bit x 18-bit multiply-accumulate
- Up to 540 kbits of embedded memory with optional EDAC protection
- Up to 840 user-programmable I/Os
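The 15 GMACS figure follows directly from the mathblock count and clock rate. The short Python check below shows the arithmetic, assuming each mathblock completes one multiply-accumulate per clock cycle. The variable names are made up for this sketch.

```python
mathblocks = 120          # DSP mathblocks, from the key features
clock_hz = 125e6          # 125 MHz per mathblock

# Assumption: one multiply-accumulate per mathblock per clock cycle.
macs_per_second = mathblocks * clock_hz
gmacs = macs_per_second / 1e9

# The quoted gate counts imply a fixed system-gate to ASIC-gate ratio.
system_gates_per_asic_gate = 4_000_000 / 500_000

if __name__ == "__main__":
    print(f"throughput: {gmacs:.0f} GMACS")
    print(f"system gates per ASIC gate: {system_gates_per_asic_gate:.0f}")
```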

- Interfaces to the system (data, clock, control, and status), including width (serial or parallel)
- Functional blocks in the system
- Interfaces between blocks, including data width
- Data rates into the system and between blocks (data rates are tied to data widths)
- Clock domains into the system and between blocks
- Memory (size, speed, and width)
- Standards/specs
- Timing diagrams (setup, hold, latency tolerance)
- Schedule

The FPGA is one of the most popular logic circuit components and has revolutionized the way digital systems are designed. Some FPGA advantages include:
- Low cost
- Fast-turnaround prototype implementation
- Supported by CAD/EDA tools
- High density
- High speed
- Programmable and versatile
- Flexible
- Reusable
- Large amounts of logic gates, registers, RAM and routing resources
- Quick time-to-market
- SRAM FPGAs provide the benefits of custom CMOS

[Figures: two state transition tables with columns "Current State" and "Next State"; the table contents are not present in the transcription.]
