FPGAs in a Nutshell
- Introduction to Embedded Systems -

Dipl.-Ing. Falk Salewski
Lehrstuhl Informatik, RWTH Aachen
salewski@informatik.rwth-aachen.de
Winter term 06/07

Contents
- History
- FPGA architecture
- Hardware description languages
- VHDL
- Microcontroller vs. FPGA
- FPGA application areas
- Soft cores
- Outlook: next lecture

Slide 2
Programmable logic basics

You can do a lot with just AND and OR gates!

In the late 70s, systems were built with standard discrete logic (fixed-function devices that were connected together to implement a system).

Idea to reduce space and increase flexibility:
- One chip with two programmable planes
- Provide any combination of AND and OR gates, as well as sharing of AND terms across multiple ORs
- Umbrella term: PLDs

Slide 3

PLD (CPLD & FPGA)

PLDs = devices that can be re-programmed to implement any function within the device's resources
- FPGA = Field Programmable Gate Array
- CPLD = Complex Programmable Logic Device

Slide 4
Xilinx Spartan-3 Architecture

[Figure: FPGA block diagram]
- IOB = Input/Output Block (interface between the package pins and the internal logic)
- CLB = Configurable Logic Block (provides functional elements for constructing logic)
- DCM = Digital Clock Manager (clock domain control)

Slide 5

[Figure: Configurable Logic Block (CLB)]

Slide 6
Spartan-3 Architecture (3)

Slice:
- Look-up tables (LUTs) for combinatorial logic
- Flip-flops for clocked logic
- Control logic such as multiplexers, carry logic, ...

Slide 7

Look-Up Tables (LUTs)

LUT = small RAM

Example: AND gate (2-input LUT)

    Address | Data
    --------+-----
      00    |  0
      01    |  0
      10    |  0
      11    |  1

- 2-bit input (address), 1-bit output (data)
- Data: the function is stored in SRAM (other devices with Flash or antifuse are available)
- LUTs in the Spartan family have 4 inputs; the LUTs of two slices can be combined.
- A 4-input LUT can implement 2^(2^4) = 65536 different functions.

Slide 8
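The "LUT = small RAM" idea can be sketched in a few lines of Python. This is only an illustrative model, not how the device is actually configured: the helper name make_lut and the choice of input 0 as the least significant address bit are assumptions made for this sketch.

```python
# Sketch: a LUT modeled as a tiny RAM. The inputs form the address,
# and the bit stored at that address is the output.

def make_lut(truth_table):
    """Return a function that looks up its inputs in truth_table.

    truth_table[address] holds the output bit. Input 0 is used as the
    least significant address bit (an arbitrary choice for this sketch).
    """
    def lut(*inputs):
        address = 0
        for position, bit in enumerate(inputs):
            address |= (bit & 1) << position
        return truth_table[address]
    return lut

# 2-input AND gate: only address 3 (both inputs 1) stores a 1.
and_gate = make_lut([0, 0, 0, 1])

# "Reconfiguring" means storing different bits, e.g. for an XOR gate.
xor_gate = make_lut([0, 1, 1, 0])
```

The same lookup structure serves as AND or XOR; only the stored data differs, which is what makes the logic block configurable.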
Look-Up Tables (cont.)

- 2-input LUT: 2^2 = 4 input combinations, 2^4 = 16 different functions possible
- 4-input LUT: 2^4 = 16 input combinations, 2^16 = 65536 different functions possible

[Table: outputs of the possible functions for each input combination]

Slide 9

Spartan-3 Architecture (4)

- Up to 8320 CLBs
- How many slices? 4 x 8320 = 33280 slices!

Slide 10
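The function counts on the LUT slide can be checked by brute force. The following Python sketch (function names are made up for this illustration) counts the functions of an n-input LUT and enumerates every truth table for the 2-input case.

```python
from itertools import product

def count_functions(num_inputs):
    """Number of distinct Boolean functions of num_inputs inputs."""
    rows = 2 ** num_inputs   # input combinations (truth-table rows)
    return 2 ** rows         # each row can independently store 0 or 1

def all_truth_tables(num_inputs):
    """Enumerate every possible truth table as a tuple of output bits."""
    return list(product((0, 1), repeat=2 ** num_inputs))

# Every possible content of a 2-input LUT.
tables_2 = all_truth_tables(2)
```

The AND gate is just one of these tables, namely (0, 0, 0, 1).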
Configure the FPGA

Does every component have to be configured at this low level? No!
- FPGAs can be programmed at a higher level with various hardware description languages (HDLs).
- The translation to gate level is done automatically by tools.

Slide 11

Principle: design hardware as if it were software

Software design (microcontroller):
1. Specification
2. Implementation (e.g. C or assembly)
3. Compilation to machine code
4. Load the code into the program memory of the target
5. Functionality is realized by the CPU executing the code (the CPU can use certain peripherals such as timers)

Hardware design (CPLD / FPGA):
1. Specification
2. Implementation in an HDL
   - Structural description
   - Behavioral description
3. Automatic transformation into a gate-level description (synthesis)
4. Load the configuration into the target
5. Functionality is implemented in hardware

Slide 12
Hardware Description Languages (HDLs)

Most important HDLs:
- VHDL: syntax similar to Ada; mostly used in Europe
- Verilog: syntax similar to C; mostly used in the USA
- SystemC: C++ library for hardware-specific constructs; quite young, good for simulation, synthesis still problematic

Slide 13

VHDL

VHDL = VHSIC Hardware Description Language
VHSIC = Very High Speed Integrated Circuits

- Full VHDL allows description and simulation of hardware (original purpose).
- A VHDL subset allows automatic synthesis to a gate-level description. This VHDL subset is not standardized!

Slide 14
VHDL crash course

Basic constructs:
- Entity: specifies the inputs and outputs of each module
- Architecture: specifies the structure or the behavior of a module
- Process: can be used to describe the behavior
- Signal: can be understood as a physical connection
- Variable: can be understood as a memory cell

Control structures as in other high-level programming languages are available.

Slide 15

VHDL Example

A simple 4-bit timer

[Figure: block "Timer 4bit" with inputs clk and reset and output countervalue(3:0)]

Slide 16
VHDL Example

[Figure: block "Timer 4bit" with inputs clk and reset and output countervalue(3:0)]

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;

    -- entity
    entity Timer is
        Port ( clk          : in    std_logic;
               reset        : in    std_logic;
               countervalue : inout std_logic_vector(3 downto 0));
    end Timer;

    -- architecture
    architecture Behavioral of Timer is
    begin
        -- process
        process (clk, reset)
        begin
            if reset = '1' then
                countervalue <= "0000";
            elsif rising_edge(clk) then
                countervalue <= countervalue + 1;
            end if;
        end process;
    end Behavioral;

Slide 17

VHDL Example (Timer listing repeated)

Slide 18
VHDL Entity

[Figure: block "Timer 4bit" with inputs clk and reset and output countervalue(3:0)]

    entity Timer is
        Port ( clk          : in    std_logic;
               reset        : in    std_logic;
               countervalue : inout std_logic_vector(3 downto 0));
    end Timer;

The entity defines the I/O signals (ports) of the module:
- in: read only
- out: write only
- inout: write and read (bidirectional)
- buffer: write and read back

Ports can be
- binary signals (such as clk: 1 bit)
- vectors of signals (such as countervalue: 4 bit)

Data type of ports: usually std_logic (more on this later)

Slide 19

VHDL Example (Timer listing repeated)

Slide 20
VHDL Process

A process is used to describe the behavior of the timer.

    architecture Behavioral of Timer is
    begin
        process (clk, reset)
        -- sensitivity list: a change of clk or reset starts the process
        -- (only for clarity, all input values start the process)
        begin
            if reset = '1' then
                countervalue <= "0000";            -- assign initial value
            elsif rising_edge(clk) then            -- on a rising edge of the clk signal
                countervalue <= countervalue + 1;  -- increment
            end if;
        end process;
    end Behavioral;

Slide 21

VHDL Example

A name can be given to a process:

    architecture Behavioral of Timer is
    begin
        Timer: process (clk, reset)
        begin
            if reset = '1' then
                countervalue <= "0000";
            elsif rising_edge(clk) then
                countervalue <= countervalue + 1;
            end if;
        end process;
    end Behavioral;

Slide 22
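To see what the Timer process does cycle by cycle, here is a small behavioral model in Python. It is a sketch, not a VHDL simulator: the class name Timer4Bit and the step interface are invented for this illustration, and reset is assumed to be active high.

```python
class Timer4Bit:
    """Cycle-level model of the 4-bit Timer (a sketch, not a VHDL simulator)."""

    def __init__(self):
        self.countervalue = 0

    def step(self, reset, rising_edge_clk):
        # Reset has priority and does not wait for a clock edge,
        # mirroring the "if reset ... elsif rising_edge(clk)" structure.
        if reset:
            self.countervalue = 0
        elif rising_edge_clk:
            # A 4-bit vector wraps from 15 back to 0.
            self.countervalue = (self.countervalue + 1) % 16
        return self.countervalue

timer = Timer4Bit()
values = [timer.step(reset=False, rising_edge_clk=True) for _ in range(17)]
```

After 15 clock edges the counter reaches 15, and the next edge wraps it to 0, just as the 4-bit std_logic_vector does.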
VHDL Data types

Data type std_logic (9-value logic type)
- Possible values:
    'U'  uninitialized
    'X'  undefined
    '0'  forcing 0
    '1'  forcing 1
    'Z'  high impedance (needed if bus structures have to be realized)
    'W'  weak undefined
    'L'  weak 0
    'H'  weak 1
    '-'  don't care
- Signals can be grouped into a vector: std_logic_vector(3 downto 0)

Other data types such as boolean or integer are available
- Can be used for arithmetic only
- Usually, subtypes are useful (an unconstrained integer is typically 32 bits wide!)

Slide 23

VHDL Architecture

    architecture Behavioral of Example_Modul is
    begin
        TaskA: process (clk, reset)        -- process 1
        -- ...
        TaskB: process (clk, reset, in1)   -- process 2
        -- ...
        out1 <= in1 and in2;               -- process 3
    end Behavioral;

Execution
- outside a process: parallel
- inside a process: sequential

Slide 24
Questions?

What is an entity?
- What is std_logic?
- What is std_logic_vector?
What is an architecture?
- How are processes executed within an architecture?
What is a process?
- When is a process executed?
- How is a process executed?

Let's have a closer look at sequential execution.

Slide 25

Sequential execution (blink two LEDs)

Microcontroller code (probably working):

    void blink_led(int delay)
    {
        int i = 0;
        PORTA1 = 1;
        for (i = 0; i < delay; i++);
        PORTA1 = 0;
        PORTA2 = 1;
        for (i = 0; i < delay; i++);
        PORTA2 = 0;
    }

VHDL code (a direct translation that does not work):

    Blink_LED: process (delay)
        variable i : integer := 0;
    begin
        PORTA1 <= '1';
        i := 0;
        while i < delay loop i := i + 1; end loop;
        PORTA1 <= '0';
        PORTA2 <= '1';
        i := 0;
        while i < delay loop i := i + 1; end loop;
        PORTA2 <= '0';
    end process;

- The last signal assignment is taken
- No delay (no clk)
- LEDs stay off

Note: variables are assigned with := and signals with <=

Slide 26
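Why do the LEDs stay off? A rough Python model may help. It is a strong simplification of VHDL signal semantics, and all names in it (Process, assign, suspend, LED1, LED2) are made up for this sketch: a signal assignment inside a process only schedules a new value, and the value scheduled last is the one that becomes visible when the process suspends.

```python
class Process:
    """Toy model of one process activation (a simplified sketch)."""

    def __init__(self, signals):
        self.signals = dict(signals)   # currently visible signal values
        self.pending = {}              # values scheduled by "<="

    def assign(self, name, value):
        # "<=" only schedules; a later assignment to the same signal
        # in the same activation overrides the earlier one.
        self.pending[name] = value

    def suspend(self):
        # When the process suspends, the scheduled values become visible.
        self.signals.update(self.pending)
        self.pending.clear()
        return self.signals

def blink_led(proc, delay):
    proc.assign("LED1", 1)
    for _ in range(delay):   # consumes no simulated time at all
        pass
    proc.assign("LED1", 0)
    proc.assign("LED2", 1)
    for _ in range(delay):
        pass
    proc.assign("LED2", 0)

proc = Process({"LED1": 0, "LED2": 0})
blink_led(proc, delay=1000)
result = proc.suspend()
```

No matter how large delay is, both LEDs end up at 0: the loops take no time, and only the last scheduled value per signal survives.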
Sequential Execution (2)

State machines have to be used to realize clocked sequential behavior!

    Blink_LED: process (clk, reset)
        variable state : std_logic_vector(1 downto 0);
    begin
        if reset = '1' then
            state := "00";
            PORTA1 <= '0';
            PORTA2 <= '0';
        elsif rising_edge(clk) then
            case state is
                when "00"   => PORTA1 <= '1';
                when "01"   => PORTA1 <= '0'; PORTA2 <= '1';
                when "10"   => PORTA2 <= '0';
                when others => null;
            end case;
            state := state + 1;
        end if;
    end process;

Execution speed can be adjusted by clk, e.g. with a timer used as a clock divider.

Slide 27

Parallel processes

Example: frequency measurement (impulses per time unit)

[Figure: block "Frequ. measure" with inputs Clk, Reset, Count_in and output Frequ]

    TIMER: process (clk, reset)
        variable countervalue : std_logic_vector(3 downto 0);
    begin
        if reset = '1' then
            countervalue := "0000";
        elsif rising_edge(clk) then
            countervalue := countervalue + 1;
            if countervalue = "0000" then        -- comparison value assumed
                Frequ       <= globalcount;
                globalcount <= (others => '0');  -- second driver of globalcount
            end if;
        end if;
    end process;

    COUNTER: process (count_in, reset)
    begin
        if reset = '1' then
            globalcount <= (others => '0');
        elsif rising_edge(count_in) then
            globalcount <= globalcount + 1;
        end if;
    end process;

Conflict: globalcount could be modified in both processes at the same time! This is not allowed.

Slide 28
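What happens in simulation when two processes drive the same std_logic signal can be sketched with the resolution table of the nine-valued logic. The Python model below transcribes that table from the std_logic_1164 package as best recalled, so it is worth checking against the package source; the function name resolve is an assumption of this sketch. Synthesis tools typically reject such multiple drivers outright instead of resolving them.

```python
VALUES = "UX01ZWLH-"

# Resolution table: row = value resolved so far, column = next driver.
RESOLUTION = [
    "UUUUUUUUU",  # U
    "UXXXXXXXX",  # X
    "UX0X0000X",  # 0
    "UXX11111X",  # 1
    "UX01ZWLHX",  # Z
    "UX01WWWWX",  # W
    "UX01LWLWX",  # L
    "UX01HWWHX",  # H
    "UXXXXXXXX",  # -
]

def resolve(*drivers):
    """Combine the values of all drivers of one signal into a single value."""
    if len(drivers) == 1:
        return drivers[0]      # a single driver is passed through unchanged
    result = "Z"               # start from the weakest value
    for value in drivers:
        result = RESOLUTION[VALUES.index(result)][VALUES.index(value)]
    return result
```

Two processes forcing '0' and '1' onto the same bit yield 'X' (undefined), whereas a driver at 'Z' gives way to the other one. That second case is what makes bus structures possible.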
Parallel processes

Example (corrected): frequency measurement

[Figure: block "Frequ. measure" with inputs Clk, Reset, Count_in and output Frequ]

    TIMER: process (clk, reset)
        variable countervalue : std_logic_vector(3 downto 0);
    begin
        if reset = '1' then
            countervalue  := "0000";
            counter_reset <= '1';
        elsif rising_edge(clk) then
            countervalue := countervalue + 1;
            if countervalue = "0000" then        -- comparison values assumed
                Frequ <= globalcount;
            elsif countervalue = "0001" then
                counter_reset <= '1';
            else
                counter_reset <= '0';
            end if;
        end if;
    end process;

    COUNTER: process (count_in, counter_reset)
    begin
        if counter_reset = '1' then
            globalcount <= (others => '0');
        elsif rising_edge(count_in) then
            globalcount <= globalcount + 1;
        end if;
    end process;

An additional global signal counter_reset is used, so globalcount is now written by the COUNTER process only.

Slide 29

Design in VHDL: forget the hardware details?

Knowledge of the FPGA architecture is needed for
- Optimization of execution speed
- Optimization of the chip resources needed (area)
- Optimization of power consumption
- The design of highly reliable applications
- Complex designs with several clock domains etc.

Tools constantly improve in order to
- Support the designer with these issues
- Automate different types of optimizations

Slide 30
ALTERA Design Flow

[Figure: Altera design flow. Source: www.altera.com]

Slide 31

MCU vs. FPGA (functional)

MCU
- Sequential execution is easy
  - A state machine with many states is no problem
- Limited to on-chip or on-board peripherals
  - An MCU typically has 3 to 5 on-chip timers
- Understanding the MCU hardware might be a challenge

FPGA
- Parallel execution is easy
  - Changing many outputs at the same time is no problem
- All needed digital hardware in one device
  - A large number of timers is no problem
- Configuring the FPGA hardware might be a challenge

The decision for one or the other hardware is application dependent.

Slide 32
FPGA application areas

Applications of FPGAs include
- Digital signal processing
- Software defined radio (SDR)
- Space, aerospace and defense systems
- ASIC prototyping
- Medical imaging, bioinformatics
- Computer vision, speech recognition
- Cryptography
- Computer hardware emulation
- High-speed communication
and a growing range of other areas.

Slide 33

Soft Cores

In some cases a combination of MCU and FPGA features would be nice: HW/SW co-design.

Today's FPGAs have enough resources to
- synthesize CPU cores (soft cores)
- together with parallel logic

[Figure: FPGA containing several CPU and logic blocks]

These soft cores are usually available as VHDL or Verilog code.

Slide 34
Example: 8-bit Xilinx Soft Core

For further information see: www.xilinx.com/picoblaze

Slide 35

Xilinx PicoBlaze

- 57 instructions
- 16 registers (8 bit)
- 64 byte data memory
- On Spartan-3 up to 44 MIPS
- Resources needed: 96 Spartan-3 slices!

Theoretically, 346 soft cores would fit into the largest Spartan-3 (additional resources are needed for interconnection).

Usually, the chip-internal memory is the bottleneck, so more and more FPGAs have additional block RAM.

VHDL source code is available at www.xilinx.com/picoblaze

Slide 36
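The figure of 346 soft cores can be reproduced with simple arithmetic. The sketch below assumes that the largest Spartan-3 device offers 8320 CLBs with 4 slices each; the constant names are invented for this example, and interconnect and memory needs are ignored.

```python
SLICES_PER_CLB = 4
CLBS_LARGEST_SPARTAN3 = 8320   # assumption: CLB count of the largest device
PICOBLAZE_SLICES = 96          # slices needed per PicoBlaze core

total_slices = SLICES_PER_CLB * CLBS_LARGEST_SPARTAN3

# Integer division: only whole cores count.
max_cores = total_slices // PICOBLAZE_SLICES
unused_slices = total_slices % PICOBLAZE_SLICES
```

This is an upper bound on slices alone; as noted above, the internal memory usually runs out well before the slices do.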
Further Soft Cores

Altera: Nios II (www.altera.com)
- 32-bit Harvard RISC
- Optional FPU (floating point unit)
- Up to 200 MHz

Xilinx: MicroBlaze (www.xilinx.com)
- 32-bit Harvard RISC
- Optional FPU (floating point unit)
- 900-2600 LUTs (450-1300 slices)
- Up to 200 MHz (up to 100 MHz, 92 DMIPS on Spartan-3)

Lattice: LatticeMico8 (www.latticesemi.com)
- 8-bit Harvard RISC
- 275 LUTs

Chip-independent cores from third-party suppliers (e.g. 8051 derivatives)

Slide 37

Questions?

Next:
- Contents of the next exercise
- A comparison: something to think about until next week

Slide 38
In the exercise

- Create your own VHDL program
- Simulate this program
- Test the program on a Spartan-3 FPGA
- Use schematics as an alternative programming method

Slide 39

[Figure: exercise board with the following annotations]
- 6 V
- Jumper: M&M2: open, M: closed
- LEDs connected to FPGA
- Power supply for external boards
- Jumper: closed
- Programming cable to parallel port
- Access to FPGA pins
- CAN board

Slide 40
VHDL sources

- Free VHDL online tutorial: http://www.aldec.com/downloads/
- The Hamburg VHDL archive: http://tech-www.informatik.uni-hamburg.de/vhdl/
- VHDL tutorial, Uni Erlangen-Nürnberg: http://www.vhdl-online.de/tutorial/
- Online support for the book "VHDL Eine Einführung": http://nirvana.informatik.uni-halle.de/pearson/

Slide 41

A comparison

During this semester you learned about different hardware platforms:
- Microcontrollers (MCUs)
- Programmable Logic Controllers (PLCs)
- Field Programmable Gate Arrays (FPGAs)

What could influence your choice of hardware platform?

Slide 42
Think about the pros and cons of the different hardware platforms.

Slide 43