Energy scalability and the RESUME scalable video codec


1 Energy scalability and the RESUME scalable video codec. Harald Devos, Hendrik Eeckhaut, Mark Christiaens. ELIS/PARIS, Ghent University.

2 Outline: Introduction (Scalable Video; Reconfigurable HW: FPGAs); Implementation details; Energy measurements.

3 Scalable Video: Server -> Intelligent Network (nodes) -> Clients. Encode once; rescale video stream; quality ~ deployed hardware resources.

4 Overview Video Codec: Original frames -> Motion Estim. -> Wavelet Transform -> Entropy Encoding -> Pack -> Pull bit stream -> Unpack -> Entropy Decoding -> Inverse Wavelet T. -> Motion Comp. -> Decompressed frames.

5 Overview Video Codec: same pipeline, with Motion Estim. / Motion Comp. annotated: Temporal + Temporal Scalability.

6 Overview Video Codec: same pipeline, with Motion Estim. / Motion Comp. annotated: Temporal + Temporal Scalability; Wavelet Transform / Inverse Wavelet T. annotated: Spatial + Resolution Scalability.

7 Overview Video Codec: same pipeline, with Motion Estim. / Motion Comp. annotated: Temporal + Temporal Scalability; Wavelet Transform / Inverse Wavelet T. annotated: Spatial + Resolution Scalability; Entropy Encoding / Entropy Decoding annotated: Statistical + Resolution & Quality Scalability.

8 FPGA: Field Programmable Gate Array, e.g. Altera Stratix. Building blocks: IO, logic elements (LE), memory blocks (Mem, M-RAM), DSP blocks.

9 Development Board: Altera Stratix S60; 256 MiB PC333 DDR SDRAM; PCI interface.

10 Introduction: RESUME. RESUME project (Reconfigurable Embedded Systems for Use in scalable Multimedia Environments). Build a real-time decoder for scalable video. Software profiling: hardware acceleration needed. Scalable video -> scalable hardware and energy?

11 Outline: Introduction; Implementation details (System Overview; 2D-IDWT); Energy measurements.

12 System Overview: Enc. video stream -> Unpack -> Entropy Decoding -> Inverse Wavelet T. -> Motion Comp. -> Decoded frames. Platform labels: Software, FPGA, VGA card; links: Control, PCI (DMA), PCI.

13 System Overview

14 System Overview: WED -> Bitplane assembl. -> Inverse Wavelet T. -> Motion Comp. -> Color Conv. Data objects are too large to store in the FPGA -> DDR. DDR is the bottleneck.

15 2D-IDWT: Inverse Discrete Wavelet Transform; resolution scalability.

16 2D-IDWT: 1st Design. Made manually (SystemC, VHDL). Results: simulation (cycles/frame), synthesis (MHz). Expectation: 79 frames/s. Measurements on hardware: 29 frames/s: memory bottleneck!
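The gap between the expected and the measured frame rate can be read as a time budget per frame. The following is a minimal sketch, assuming the expected 79 frames/s reflects computation only and that all extra time in the measured 29 frames/s is spent waiting on memory; the function names are illustrative and do not come from the slides.

```python
def frame_time_ms(frames_per_second: float) -> float:
    """Convert a frame rate into the time spent on one frame, in ms."""
    return 1000.0 / frames_per_second


def stall_fraction(expected_fps: float, measured_fps: float) -> float:
    """Fraction of the measured frame time not explained by computation.

    Assumption: the expected rate is compute-only, the rest is stall time.
    """
    return 1.0 - measured_fps / expected_fps


expected_fps = 79.0  # expectation from simulation and synthesis (slide value)
measured_fps = 29.0  # measured on hardware (slide value)

compute_ms = frame_time_ms(expected_fps)
total_ms = frame_time_ms(measured_fps)
lost = stall_fraction(expected_fps, measured_fps)
print(f"compute: {compute_ms:.1f} ms/frame, measured: {total_ms:.1f} ms/frame")
print(f"share of frame time attributed to stalls: {lost:.0%}")
```

Under that assumption roughly 63% of each frame period is not spent computing, which is consistent with the slide's diagnosis of a memory bottleneck.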

17 2D-IDWT: 2nd Design. Loop transformations improve spatial and temporal locality of data accesses; polyhedral model; common practice for software (cf. cache optimization). Hardware generation from the polyhedral model (CLooGVHDL).

18 Loop Transformations: Original algorithm in, e.g., C -> representation in the polyhedral model -> loop transformations -> optimized algorithm in, e.g., C, or optimized algorithm in HW (VHDL).

19 2D-IDWT: Loop transformations. Data flow to external memory, per variant (1st design, 2nd design): RC-based: data flow 5.25 RC, burst usage 50%; Line-based: data flow RC, burst usage 100%; Stripe-based: data flow 2 RC, burst usage 100%.

20 Outline: Introduction; Implementation details; Energy measurements (Method; Results and problems).

21 Power supply: measuring the FPGA alone is not possible -> measure the entire board.

22 PCI extender: 3.3 V, 5 V.

23 TCP Ampere AC/DC current probe. Accuracy: ~20 mA.


26 Steady-state current of the FPGA board: 1.8 A x 3.3 V ≈ 6 W when idle.


28-30 Line-Based IDWT

31 Energy. Plots: I (A) vs. time (s), with the steady-state level marked; P (W) vs. time (s). $P(t) = 3.3\,\mathrm{V} \times I(t)$; $E = \int P(t)\,dt$.
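The two formulas on this slide translate directly into a numerical integration over the sampled current trace. The sketch below is not the Matlab script mentioned in the talk. It is a minimal Python illustration, assuming the trace is available as two equally long lists of time stamps and currents, and assuming the steady-state current is subtracted so that only the energy above idle is counted. The synthetic trace values are made up for the example.

```python
from typing import Sequence

SUPPLY_VOLTAGE = 3.3  # volts, the supply rail used in P(t) = 3.3 V x I(t)


def energy_joules(times: Sequence[float],
                  currents: Sequence[float],
                  steady_current: float = 0.0) -> float:
    """Integrate P(t) = V * (I(t) - I_steady) with the trapezoidal rule."""
    if len(times) != len(currents):
        raise ValueError("times and currents must have the same length")
    energy = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        p_prev = SUPPLY_VOLTAGE * (currents[k - 1] - steady_current)
        p_next = SUPPLY_VOLTAGE * (currents[k] - steady_current)
        energy += 0.5 * (p_prev + p_next) * dt
    return energy


# Synthetic trace: 1.8 A idle, 2.0 A during one second of activity, idle again.
times = [0.0, 1.0, 1.0, 2.0, 2.0, 3.0]
currents = [1.8, 1.8, 2.0, 2.0, 1.8, 1.8]
print(round(energy_joules(times, currents, steady_current=1.8), 3))  # 0.66
```

With `steady_current=0.0` the same function returns the total board energy, idle power included.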

32 Automation. Measurement (PC is master): trigger scope; save wave trace (GPIB). Processing (Matlab script): steady-state current determination; energy calculation.

33 Energy for increasing quality: Foreman CIF, 10 GOPs (161 frames); 32 different image quality settings; 20 identical runs.

34 Noise: steady-state current (after - before); impact of temperature -> add heat sink; steady-state current calculation.
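One way to quantify the "after - before" difference is to average the idle samples on either side of a run. The sketch below assumes the active window is known by sample index. It also interpolates the idle level linearly across the run, which is an assumption made for illustration and not a method stated on the slides. All names and sample values are hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple


def steady_state(currents: Sequence[float],
                 start: int,
                 stop: int) -> Tuple[float, float, float]:
    """Return (before, after, drift) for the idle samples around a run.

    currents[start:stop] is taken to be the active part of the trace.
    """
    before = mean(currents[:start])
    after = mean(currents[stop:])
    return before, after, after - before


def baseline_at(index: int, start: int, stop: int,
                before: float, after: float) -> float:
    """Linearly interpolate the idle current across the active window."""
    if index <= start:
        return before
    if index >= stop:
        return after
    return before + (after - before) * (index - start) / (stop - start)


trace = [1.80, 1.80, 2.00, 2.05, 2.00, 1.82, 1.82]  # amperes, synthetic
before, after, drift = steady_state(trace, start=2, stop=5)
print(f"before={before:.2f} A, after={after:.2f} A, drift={drift * 1000:.0f} mA")
```

In this synthetic trace the drift is as large as the probe accuracy quoted earlier (~20 mA), which illustrates why the steady-state level needs care.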

35 Different sequences: > 1 J.

36 Measure per component? Components: CPU, WED, PCI, MS, AS, IDWT, MC, CC, AD, VGA, DDR. Log commands to components and replay per component. Keep all (intermediate) data in DDR: 256 MiB -> 5 GOPs.

37 Per component: 10 -> 5 GOPs; energy = ~1/2.

38 Wavelet entropy decoder. Plot: E (J) vs. PSNR (dB); x 10.

39 Inverse wavelet transform: # calculations = constant! Plot: E (J) vs. PSNR (dB); x 1.5.

40 Whole = sum of components? VGA component; interaction.

41 2 variants of IDWT: RC-IDWT, made manually; LB-IDWT, generated semi-automatically. Energy of total decoder, 10 GOPs (= 161 frames).

42 2 variants of IDWT: RC-IDWT: T = 40 s, Pmean = W, E = 25.4 J; LB-IDWT: T = 10 s, Pmean = 1.16 W, E = 11.6 J.
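The figures for the two variants are tied together by E = Pmean x T. The short check below uses only the numbers given on the slide. The mean power it derives for the RC-IDWT is an inferred value that assumes T = 40 s is exact. It is not a number taken from the talk.

```python
def energy(mean_power_w: float, duration_s: float) -> float:
    """Energy in joules from mean power and duration."""
    return mean_power_w * duration_s


def mean_power(energy_j: float, duration_s: float) -> float:
    """Mean power in watts from energy and duration."""
    return energy_j / duration_s


lb_energy = energy(1.16, 10.0)      # LB-IDWT: reproduces the quoted 11.6 J
rc_power = mean_power(25.4, 40.0)   # RC-IDWT: inferred, not quoted on the slide
print(f"LB-IDWT energy: {lb_energy:.1f} J")
print(f"RC-IDWT implied mean power: {rc_power:.3f} W")
print(f"energy ratio RC/LB: {25.4 / 11.6:.2f}")
```

The line-based variant draws more power while it runs but finishes four times sooner, so it ends up with less than half the energy.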

43 Future work: resolution and temporal scalability; try a different approach for measurement per component; measure temperature of FPGA (MAX1619); predict energy consumption; steady-state current?

44 Conclusions: energy measurement is feasible; sufficient accuracy is not trivial; scalability has a significant impact on energy consumption; external memory has a large impact.

45 References: H. Devos et al., "From loop transformation to hardware generation", ProRISC 06, Veldhoven, The Netherlands. H. Devos et al., "Finding and applying loop transformations for optimized FPGA implementations", Transactions on HiPEAC, to appear.


47 Reconfigurable computing: spectrum CPU, DSP, VLIW, FPGA, ASIC; axes: flexibility, efficiency, development effort.

48 Infrastructure: SOPC-builder. Board: PCI, DDR. FPGA: PCI core with DMA, DDR core and custom components (WED, IDWT, ...) connected by the Avalon switch fabric. SOPC-builder (Quartus, Altera): automatic generation of the Avalon switch fabric.

49 Plot: frame rate (frames/s) vs. available bandwidth to external memory (MB/s), with a bandwidth-limited region and a calculation-limited region.
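The two regions of this plot can be captured by a simple two-bound model: the frame rate is whichever is lower, the rate the datapath can compute or the rate the memory traffic allows. The sketch below is such a model with hypothetical parameters. `mb_per_frame` and `compute_fps` are illustrative values and are not taken from the measurements.

```python
def frame_rate(bandwidth_mb_s: float,
               mb_per_frame: float,
               compute_fps: float) -> float:
    """Frame rate as the minimum of the bandwidth bound and the compute bound."""
    bandwidth_fps = bandwidth_mb_s / mb_per_frame
    return min(compute_fps, bandwidth_fps)


def regime(bandwidth_mb_s: float,
           mb_per_frame: float,
           compute_fps: float) -> str:
    """Name the limiting factor at a given available bandwidth."""
    if bandwidth_mb_s / mb_per_frame < compute_fps:
        return "bandwidth limited"
    return "calculation limited"


MB_PER_FRAME = 2.0   # hypothetical external-memory traffic per frame
COMPUTE_FPS = 80.0   # hypothetical compute-bound frame rate

for bw in (40.0, 80.0, 160.0, 320.0):
    fps = frame_rate(bw, MB_PER_FRAME, COMPUTE_FPS)
    print(f"{bw:6.0f} MB/s -> {fps:5.1f} frames/s ({regime(bw, MB_PER_FRAME, COMPUTE_FPS)})")
```

Reducing the traffic per frame, which is what the loop transformations aim for, moves the knee of the curve to lower bandwidths.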

50 Plot: frame rate (frames/s) vs. available bandwidth to external memory (MB/s). Curves: Row-column = untransformed, Line-based, Stripe-based. Implementations: Manual, CLooGVHDL + manual opt., CLooGVHDL, Impulse C.

51 The Polyhedral Model. SCoP (Static Control Part): part of a program with data-independent control flow; typically a set of nested loops (hot code); loop bounds are linear expressions of parameters and other iterators.
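A small example may make the SCoP definition concrete. The sketch below is a generic illustration and is not taken from the RESUME code. It enumerates a two-deep loop nest whose inner bound is a linear expression of the outer iterator, then visits the same iteration domain in interchanged order, which is the kind of reordering the polyhedral model makes systematic.

```python
from typing import Iterator, Tuple

Point = Tuple[int, int]


def original_order(n: int, m: int) -> Iterator[Point]:
    """Domain {(i, j) : 0 <= i < n, i <= j < m}, with i as the outer loop."""
    for i in range(n):
        for j in range(i, m):  # lower bound is linear in the iterator i
            yield (i, j)


def interchanged_order(n: int, m: int) -> Iterator[Point]:
    """Same domain with j as the outer loop (loop interchange)."""
    for j in range(m):
        for i in range(min(j + 1, n)):  # bound is a minimum of affine terms
            yield (i, j)


n, m = 3, 4  # parameters of the loop nest
assert sorted(original_order(n, m)) == sorted(interchanged_order(n, m))
print(list(original_order(n, m)))
print(list(interchanged_order(n, m)))
```

Both orders visit exactly the same points. Only the sequence changes, and with it the memory access pattern.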

52 2D-IDWT: Memory bottleneck. Memory hierarchy: large but slow external memory; fast but smaller (parallel) on-chip memory. External memory is often the bottleneck: minimize accesses to external memory; increase reuse of data stored in on-chip buffers.

53 2D-IDWT: Problem. Memory bottleneck. Design automation needed: manual design process = slow, error-prone, ...; lots of designs to be made: reconfigurable HW (QoS), different platforms.

54 2D-IDWT: memory bottleneck -> loop transformations: SW techniques can be reused for HW; polyhedral model eases transformations. Design automation: CLooGVHDL, hardware generation from the polyhedral model.



System-level simulation (HW/SW co-simulation) Outline. EE290A: Design of Embedded System ASV/LL 9/10 System-level simulation (/SW co-simulation) Outline Problem statement Simulation and embedded system design functional simulation performance simulation POLIS implementation partitioning example implementation

More information

Building Data Path for the Custom Instruction. Yong ZHU *

Building Data Path for the Custom Instruction. Yong ZHU * 2017 2nd International Conference on Computer, Mechatronics and Electronic Engineering (CMEE 2017) ISBN: 978-1-60595-532-2 Building Data Path for the Custom Instruction Yong ZHU * School of Computer Engineering,

More information

9. Verification and Board Bring-Up

9. Verification and Board Bring-Up 9. Verification and Board Bring-Up July 2011 ED51010-1.3 ED51010-1.3 Introduction This chapter provides an overview of the tools available in the Quartus II software and the Nios II Embedded Design Suite

More information

ESE Back End 2.0. D. Gajski, S. Abdi. (with contributions from H. Cho, D. Shin, A. Gerstlauer)

ESE Back End 2.0. D. Gajski, S. Abdi. (with contributions from H. Cho, D. Shin, A. Gerstlauer) ESE Back End 2.0 D. Gajski, S. Abdi (with contributions from H. Cho, D. Shin, A. Gerstlauer) Center for Embedded Computer Systems University of California, Irvine http://www.cecs.uci.edu 1 Technology advantages

More information

Model-Based Design for effective HW/SW Co-Design Alexander Schreiber Senior Application Engineer MathWorks, Germany

Model-Based Design for effective HW/SW Co-Design Alexander Schreiber Senior Application Engineer MathWorks, Germany Model-Based Design for effective HW/SW Co-Design Alexander Schreiber Senior Application Engineer MathWorks, Germany 2013 The MathWorks, Inc. 1 Agenda Model-Based Design of embedded Systems Software Implementation

More information

INT G bit TCP Offload Engine SOC

INT G bit TCP Offload Engine SOC INT 10011 10 G bit TCP Offload Engine SOC Product brief, features and benefits summary: Highly customizable hardware IP block. Easily portable to ASIC flow, Xilinx/Altera FPGAs or Structured ASIC flow.

More information

FPGA Adaptive Software Debug and Performance Analysis

FPGA Adaptive Software Debug and Performance Analysis white paper Intel Adaptive Software Debug and Performance Analysis Authors Javier Orensanz Director of Product Management, System Design Division ARM Stefano Zammattio Product Manager Intel Corporation

More information

SoC Design Lecture 11: SoC Bus Architectures. Shaahin Hessabi Department of Computer Engineering Sharif University of Technology

SoC Design Lecture 11: SoC Bus Architectures. Shaahin Hessabi Department of Computer Engineering Sharif University of Technology SoC Design Lecture 11: SoC Bus Architectures Shaahin Hessabi Department of Computer Engineering Sharif University of Technology On-Chip bus topologies Shared bus: Several masters and slaves connected to

More information

ECE 111 ECE 111. Advanced Digital Design. Advanced Digital Design Winter, Sujit Dey. Sujit Dey. ECE Department UC San Diego

ECE 111 ECE 111. Advanced Digital Design. Advanced Digital Design Winter, Sujit Dey. Sujit Dey. ECE Department UC San Diego Advanced Digital Winter, 2009 ECE Department UC San Diego dey@ece.ucsd.edu http://esdat.ucsd.edu Winter 2009 Advanced Digital Objective: of a hardware-software embedded system using advanced design methodologies

More information

INT 1011 TCP Offload Engine (Full Offload)

INT 1011 TCP Offload Engine (Full Offload) INT 1011 TCP Offload Engine (Full Offload) Product brief, features and benefits summary Provides lowest Latency and highest bandwidth. Highly customizable hardware IP block. Easily portable to ASIC flow,

More information

Introduction of the Research Based on FPGA at NICS

Introduction of the Research Based on FPGA at NICS Introduction of the Research Based on FPGA at NICS Rong Luo Nano Integrated Circuits and Systems Lab, Department of Electronic Engineering, Tsinghua University Beijing, 100084, China 1 luorong@tsinghua.edu.cn

More information

MAX 10 FPGA Device Overview

MAX 10 FPGA Device Overview 2016.05.02 M10-OVERVIEW Subscribe MAX 10 devices are single-chip, non-volatile low-cost programmable logic devices (PLDs) to integrate the optimal set of system components. The highlights of the MAX 10

More information

Digital Systems Design. System on a Programmable Chip

Digital Systems Design. System on a Programmable Chip Digital Systems Design Introduction to System on a Programmable Chip Dr. D. J. Jackson Lecture 11-1 System on a Programmable Chip Generally involves utilization of a large FPGA Large number of logic elements

More information

System Level Design with IBM PowerPC Models

System Level Design with IBM PowerPC Models September 2005 System Level Design with IBM PowerPC Models A view of system level design SLE-m3 The System-Level Challenges Verification escapes cost design success There is a 45% chance of committing

More information

Platform-based Design

Platform-based Design Platform-based Design The New System Design Paradigm IEEE1394 Software Content CPU Core DSP Core Glue Logic Memory Hardware BlueTooth I/O Block-Based Design Memory Orthogonalization of concerns: the separation

More information

Turbo Encoder Co-processor Reference Design

Turbo Encoder Co-processor Reference Design Turbo Encoder Co-processor Reference Design AN-317-1.2 Application Note Introduction The turbo encoder co-processor reference design is for implemention in an Stratix DSP development board that is connected

More information

VLSI Design Automation. Maurizio Palesi

VLSI Design Automation. Maurizio Palesi VLSI Design Automation 1 Outline Technology trends VLSI Design flow (an overview) 2 Outline Technology trends VLSI Design flow (an overview) 3 IC Products Processors CPU, DSP, Controllers Memory chips

More information

Design Space Exploration Using Parameterized Cores

Design Space Exploration Using Parameterized Cores RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR Design Space Exploration Using Parameterized Cores Ian D. L. Anderson M.A.Sc. Candidate March 31, 2006 Supervisor: Dr. M. Khalid 1 OUTLINE

More information

Intel MAX 10 FPGA Device Overview

Intel MAX 10 FPGA Device Overview Intel MAX 10 FPGA Device Overview Subscribe Send Feedback Latest document on the web: PDF HTML Contents Contents...3 Key Advantages of Intel MAX 10 Devices... 3 Summary of Intel MAX 10 Device Features...

More information

Designing Embedded Processors in FPGAs

Designing Embedded Processors in FPGAs Designing Embedded Processors in FPGAs 2002 Agenda Industrial Control Systems Concept Implementation Summary & Conclusions Industrial Control Systems Typically Low Volume Many Variations Required High

More information

EN2911X: Reconfigurable Computing Lecture 01: Introduction

EN2911X: Reconfigurable Computing Lecture 01: Introduction EN2911X: Reconfigurable Computing Lecture 01: Introduction Prof. Sherief Reda Division of Engineering, Brown University Fall 2009 Methods for executing computations Hardware (Application Specific Integrated

More information

Exploration of Cache Coherent CPU- FPGA Heterogeneous System

Exploration of Cache Coherent CPU- FPGA Heterogeneous System Exploration of Cache Coherent CPU- FPGA Heterogeneous System Wei Zhang Department of Electronic and Computer Engineering Hong Kong University of Science and Technology 1 Outline ointroduction to FPGA-based

More information

Towards a Dynamically Reconfigurable System-on-Chip Platform for Video Signal Processing

Towards a Dynamically Reconfigurable System-on-Chip Platform for Video Signal Processing Towards a Dynamically Reconfigurable System-on-Chip Platform for Video Signal Processing Walter Stechele, Stephan Herrmann, Andreas Herkersdorf Technische Universität München 80290 München Germany Walter.Stechele@ei.tum.de

More information

Jumping Hurdles. High Expectations in a Low Power Environment. Christopher Fadeley Software Engineering Manager EIZO Rugged Solutions

Jumping Hurdles. High Expectations in a Low Power Environment. Christopher Fadeley Software Engineering Manager EIZO Rugged Solutions Jumping Hurdles High Expectations in a Low Power Environment Christopher Fadeley Software Engineering Manager EIZO Rugged Solutions Biggest Challenges Embedded/Rugged Environment High performance expectations

More information

VIDEO COMPRESSION STANDARDS

VIDEO COMPRESSION STANDARDS VIDEO COMPRESSION STANDARDS Family of standards: the evolution of the coding model state of the art (and implementation technology support): H.261: videoconference x64 (1988) MPEG-1: CD storage (up to

More information

FPGA for Complex System Implementation. National Chiao Tung University Chun-Jen Tsai 04/14/2011

FPGA for Complex System Implementation. National Chiao Tung University Chun-Jen Tsai 04/14/2011 FPGA for Complex System Implementation National Chiao Tung University Chun-Jen Tsai 04/14/2011 About FPGA FPGA was invented by Ross Freeman in 1989 SRAM-based FPGA properties Standard parts Allowing multi-level

More information