Functional modeling style for efficient SW code generation of video codec applications

Size: px
Start display at page:

Download "Functional modeling style for efficient SW code generation of video codec applications"

Transcription

1 Functional modeling style for efficient SW code generation of video codec applications Sang-Il Han 1)2) Soo-Ik Chae 1) Ahmed. A. Jerraya 2) SD Group 1) SLS Group 2) Seoul National Univ., Korea TIMA laboratory, France

2 Summary Challenges in video codec system design Complex High-performance within short design time Classical RTL design methodology is too slow Requirements in video codec system design System level design methodology with functional model Explicit parallelism and conditional in functional model Existing modeling styles Data-driven model lacks explicit conditional (ex: SDF lacks conditionals) Event-driven model lacks explicit parallelism (ex: SM lacks distributed imp.) Proposed modeling style Abstract clock synchronous model (ACSM): An extension of CSM for RTL Both parallelism and conditional with an abstract global clock 02/01/06 2

3 The scope of this work System-level design methodology Target Application Target Architecture Modeling ACSM Model Scope Mapping Evaluation Mixed Algo./Arch. Model Automatic Generation Scope Modeling style to generate efficient SW code. Implementation of System 02/01/06 3

4 Introduction Proposed modeling style Contents Comparison with existing modeling styles Experiment Conclusion and Perspective 02/01/06 4

5 Target application Macroblock-based video codecs MPEG-2, H.263, MPEG-4, H.264, and WMV9 W Frame/Slice VLD Macroblock VLD Inverse Scan Quantization Inverse Transform + Deblocking Filter H Video Bit Stream Spatial Prediction Modes Decoded Frame Store Multiple Previous Frame Store Spatial Compensation Process Motion Compensation Process Switch Intra/ Inter MB A frame = WxH macroblocks Exploit temporal redundancy Exploit spatial redundancy Current Frame Store Motion Vectors Block diagram of an H.264 decoder 02/01/06 5

6 Challenges and Requirements (1) 1) Computation Complexity Ex: 10 Gops/sec for 4VGA decoding Explicit parallelism Parallel and pipelined computation execution 2) Communication Complexity Ex: 500 MB/sec external memory bandwidth for 4VGA decoding Explicit and predictable communication Parallel computation and communication execution Efficient communication scheme e.g. burst transfer 02/01/06 6

7 Challenges and Requirements (2) 3) Specification Complexity Ex: Curr. MB has dependency with prev., upper., upper prev. MB Higher level synchronization Function- and thread-level synchronization 4) Control Complexity Ex: 4x4 intra, 16x16 intra, 4x4 inter, 8x8 inter, MB prediction modes Explicit data-dependent comp./comm. Sophisticated control operations e.g. memory management 02/01/06 7

8 Contribution Propose modeling style for video codecs Abstract clock synchronous model (ACSM) An extension of clocked synchronous model for RTL Coarser clock: Abstract clock 1)Partially ordered dependency at function level Computation Complexity, 2)Explicit comm. channel with fixed buffer size Communication Complexity 3)Coarser clock granularity than physical clock Specification Complexity 4)Explicit conditionals through global abstract clock Control Complexity 02/01/06 8

9 Introduction Contents Proposed modeling style Comparison with existing modeling styles Experiment Conclusion and Perspective 02/01/06 9

10 Proposed modeling style Abstract clock synchronous model Structure Network of state-less functions Memory elements Abstract clock F 0 F 2 Delay F 1 F 3 Abstract clock Hypothesis : Lock step execution model In every abstract clock cycle, new values propagate in the network 02/01/06 10

11 RTL model vs ACSM Difference: Granularity of clock and components RTL model ACSM * - / + Delay F 0 F 2 F 1 F 3 Delay clk Abstract clock Network of combinational gates Memory elements (delay) No loops in combinational logics Synch: Lock step model Synch. interval: clock Network of state-less functions Memory elements (delay) No loops in network of functions Synch: Lock step model Synch. interval: abstract clock 02/01/06 11

12 Basic components i 0,,, i n If/ else F 0 o 0,,, Z -K F 0 F 1 o m Z -K (a) module (b) delay (c) arc IAS 1 Data D Demux F 2 Control F 1 F M Mux 3 token = (tag, value) Merge 0..3 FIS 1 IAS i F 4 F 5 D F 0 M (d) If-action subsystem (IAS) (e) For-iterator subsystem (FIS) Rule on execution: If one token on all input ports, fired. Except Merge block Rule on absent token: absent token with global abstract clock. absent token = (tag, empty value) 02/01/06 12

13 Abstract clock for video codecs Hierarchical iterations in video decoder and encoder Frame level index: too coarse to express parallelism. Sub-macroblock level index: irregular delay depth Macroblock level index: proper granularity and regular delay depth QCIF image (a) Decoding order of macroblock PrevMB UperMB Delay depth : 166 Delay depth : (b) Decoding order of 4x4 block Abstract clock for video codecs: Macroblock index 02/01/06 13

14 Implementations of RTL model and ACSM RTL model ACSM model + - / * Delay F 0 F 2 F 1 F 3 Delay clk Abstract clock 16bit Carry lookahead Adder RISC processor add div sub mul register F 0 F 2 IP 1 IP 3 SRAM 02/01/06 clk Abstract clock 14

15 A simplified ACSM of H.264 decoder Fr. VLD Video Bit Stream Z -1 Z -W Z -1 Z -W MB VLD x8 IQ 8x8 IT + FIS 1 M 8x8 DF Video Out Intra/ Inter MB D If/ else 4 MC Types D 8x8 SC IAS 1 In the full model 16MVs D 0..3 D 8x8 IF 4x4 IF 8x8 MC 4x4 MC IAS 2 IAS 3 FIS 1 M 83 Modules 24 delays 286 arcs 43 IASs 5 FISs Previous Frames Current Frame 02/01/06 15

16 Introduction Proposed modeling style Contents Comparison with existing modeling styles Experiment Conclusion and Perspective 02/01/06 16

17 Synchronous Dataflow (SDF) If/else Z -j If/else Z -j F 2 F 2 F 1 F 4 F 1 F mux F 6 F 3 F 3 Z -k Z -k (a) ACSM (b) SDF SDF: Concurrent actors communicate with one-way FIFO channels Rule on execution: If sufficient non-zero tokens on all input ports, fired Rule on absent token: No concept of absent token. Difficulty: No explicit conditionals (!req. 4) Redundant computation : One of F 2 and F 3 Redundant commutation : One of (Z -j F 2 ) and (Z -k F 3 ) 02/01/06 17

18 Boolean Dataflow (BDF) If/else Z -j If/else Z -j F 1 F 2 F 3 F 4 F 1 SWITCH T F F 2 F 3 T F SELECT F 4 Z -k Z -k (a) ACSM (b) BDF BDF: Concurrent actors communicate with one-way FIFO channels Rule on execution: If sufficient non-zero tokens on all input ports, fired Except SWITCH and SELECT blocks Rule on absent token: Absent token without global clock. Difficulty: Buffer overflow problem. (!req. 4) Accumulate tokens between (Z -j F 2 ) and (Z -k F 3 ) To solve this problem, a lot of additional switches: 138 switches/83 modules in case of H.264 decoder 02/01/06 18

19 Other modeling styles Khan Process Network (KPN) Difficulty1: Task-level coarse granularity (!req 1) Difficulty2: Comp./comm./control mixed (!req 2,!req4) Synchronous Dataflow with embedded control (SDF+eCtrl) Difficulty1: Coarser granularity. (!req. 1) Difficulty2: Still Redundant communication (!req. 4) Synchronous model (SM) Rule on execution: If one present token on one or more input ports, fired Rule on absent token: Absent tokens with virtual clock. Difficulty: distributed implementation (!req 1) 02/01/06 19

20 Summary of Modeling styles Models KPN SDF SDF BDF SM RTL ACSM Requirements +ectrl Explicit fine-grain parallelism Explicit communication Higher level synchronization Explicit datadependent op. Model Type Data-driven Event-driven Clock-driven Clock No global clock Virtual clks Physic clk Abstract clk Analysis No absent token (or without clock) Explicit conditionals Unrestricted absent token Restricted absent token with global clock Distributed Execution Data-driven 02/01/06 system Absent token Event-driven 20

21 Contents Introduction Proposed modeling style Comparison with existing modeling styles Experiment Conclusion and Perspective 02/01/06 21

22 Experiment environment Target application H.264 baseline decoder Foreman QCIF format with QP=28 and IntraPeriod=5 All search modes are supported. Xtensa Target architecture Tensilica Xtensa single processor DMA + memory I/F Mapping Frames are stored in global memory. Image fetch process is executed on DMA. All other processes are executed on Xtensa. In Out Global Mem0 Interconnection DMA/ Mem I/F Proc0 local mem 02/01/06 22

23 Experiment Handed code H.264 decoder SDF (4x4 w/o IAS) Modeling SW code by RTW C codes ACSM (4x4) Profile with Xtensa ISS Cycle and Memory access Xtensa Single proc. Simulink models Opt. ACSM (4x4, 8x8) 02/01/ Execution cycle SDF ACSM Opt. ACSM Handed code 100 External memory access SDF ACSM Opt. ACSM Handed code Etc Rec IT VLD Chrom DF Luma DF Chrom MC Luma MC Top control Write Chrom Write Luma Read Chrom Read Luma

24 Experiment result: Comparison Compared to SDF Reduce 32% execution time Due to Redundant computation of SDF Reduce 46% external memory access Due to redundant communication of SDF Compared to SDF+eCtrl Similar execution time, but SDF+eCtrl limits design space. Reduce 46% external memory access Due to redundant communication of SDF+eCtrl Compared to BDF Buffer overflow, or a lot of switches required (138 switches/83 modules) 02/01/06 24

25 Contents Introduction Contribution Basics of target application and architecture Proposed modeling style Comparison with existing modeling styles Experiment Conclusion and Perspective 02/01/06 25

26 Conclusion Conclusion and perspective ACSM suitable for video codec application An high level RTL model Satisfies the requirements of functional model Efficient SW code generation from Simulink with RTW Reduce 32% execution time than SDF Reduce 46% external memory access than SDF Perspective SW generation tool for distributed implementation. 02/01/06 26

27 Thank you!!! Question? 02/01/06 27

Memory-efficient multithreaded code generation from Simulink for heterogeneous MPSoC

Memory-efficient multithreaded code generation from Simulink for heterogeneous MPSoC Des Autom Embed Syst (2007) 11: 249 283 DOI 10.1007/s10617-007-9009-4 Memory-efficient multithreaded code generation from Simulink for heterogeneous MPSoC Sang-Il Han Soo-Ik Chae Lisane Brisolara Luigi

More information

Reducing fine-grain communication overhead in multithread code generation for heterogeneous MPSoC

Reducing fine-grain communication overhead in multithread code generation for heterogeneous MPSoC Reducing fine-grain communication overhead in multithread code generation for heterogeneous MPSoC Lisane Brisolara 1, Sang-il Han 2,3, Xavier Guerin 2, Luigi Carro 1, Ricardo Reis 1, Soo-Ik Chae 3, Ahmed

More information

Multimedia Decoder Using the Nios II Processor

Multimedia Decoder Using the Nios II Processor Multimedia Decoder Using the Nios II Processor Third Prize Multimedia Decoder Using the Nios II Processor Institution: Participants: Instructor: Indian Institute of Science Mythri Alle, Naresh K. V., Svatantra

More information

Venezia: a Scalable Multicore Subsystem for Multimedia Applications

Venezia: a Scalable Multicore Subsystem for Multimedia Applications Venezia: a Scalable Multicore Subsystem for Multimedia Applications Takashi Miyamori Toshiba Corporation Outline Background Venezia Hardware Architecture Venezia Software Architecture Evaluation Chip and

More information

MPEG-4: Simple Profile (SP)

MPEG-4: Simple Profile (SP) MPEG-4: Simple Profile (SP) I-VOP (Intra-coded rectangular VOP, progressive video format) P-VOP (Inter-coded rectangular VOP, progressive video format) Short Header mode (compatibility with H.263 codec)

More information

Mapping the AVS Video Decoder on a Heterogeneous Dual-Core SIMD Processor. NikolaosBellas, IoannisKatsavounidis, Maria Koziri, Dimitris Zacharis

Mapping the AVS Video Decoder on a Heterogeneous Dual-Core SIMD Processor. NikolaosBellas, IoannisKatsavounidis, Maria Koziri, Dimitris Zacharis Mapping the AVS Video Decoder on a Heterogeneous Dual-Core SIMD Processor NikolaosBellas, IoannisKatsavounidis, Maria Koziri, Dimitris Zacharis University of Thessaly Greece 1 Outline Introduction to AVS

More information

In the name of Allah. the compassionate, the merciful

In the name of Allah. the compassionate, the merciful In the name of Allah the compassionate, the merciful Digital Video Systems S. Kasaei Room: CE 315 Department of Computer Engineering Sharif University of Technology E-Mail: skasaei@sharif.edu Webpage:

More information

BANDWIDTH-EFFICIENT ENCODER FRAMEWORK FOR H.264/AVC SCALABLE EXTENSION. Yi-Hau Chen, Tzu-Der Chuang, Yu-Jen Chen, and Liang-Gee Chen

BANDWIDTH-EFFICIENT ENCODER FRAMEWORK FOR H.264/AVC SCALABLE EXTENSION. Yi-Hau Chen, Tzu-Der Chuang, Yu-Jen Chen, and Liang-Gee Chen BANDWIDTH-EFFICIENT ENCODER FRAMEWORK FOR H.264/AVC SCALABLE EXTENSION Yi-Hau Chen, Tzu-Der Chuang, Yu-Jen Chen, and Liang-Gee Chen DSP/IC Design Lab., Graduate Institute of Electronics Engineering, National

More information

MPEG-2. And Scalability Support. Nimrod Peleg Update: July.2004

MPEG-2. And Scalability Support. Nimrod Peleg Update: July.2004 MPEG-2 And Scalability Support Nimrod Peleg Update: July.2004 MPEG-2 Target...Generic coding method of moving pictures and associated sound for...digital storage, TV broadcasting and communication... Dedicated

More information

PREFACE...XIII ACKNOWLEDGEMENTS...XV

PREFACE...XIII ACKNOWLEDGEMENTS...XV Contents PREFACE...XIII ACKNOWLEDGEMENTS...XV 1. MULTIMEDIA SYSTEMS...1 1.1 OVERVIEW OF MPEG-2 SYSTEMS...1 SYSTEMS AND SYNCHRONIZATION...1 TRANSPORT SYNCHRONIZATION...2 INTER-MEDIA SYNCHRONIZATION WITH

More information

System Modeling and Implementation of MPEG-4. Encoder under Fine-Granular-Scalability Framework

System Modeling and Implementation of MPEG-4. Encoder under Fine-Granular-Scalability Framework System Modeling and Implementation of MPEG-4 Encoder under Fine-Granular-Scalability Framework Literature Survey Embedded Software Systems Prof. B. L. Evans by Wei Li and Zhenxun Xiao March 25, 2002 Abstract

More information

High Performance Computing. University questions with solution

High Performance Computing. University questions with solution High Performance Computing University questions with solution Q1) Explain the basic working principle of VLIW processor. (6 marks) The following points are basic working principle of VLIW processor. The

More information

10.2 Video Compression with Motion Compensation 10.4 H H.263

10.2 Video Compression with Motion Compensation 10.4 H H.263 Chapter 10 Basic Video Compression Techniques 10.11 Introduction to Video Compression 10.2 Video Compression with Motion Compensation 10.3 Search for Motion Vectors 10.4 H.261 10.5 H.263 10.6 Further Exploration

More information

Modeling and Simulation of H.26L Encoder. Literature Survey. For. EE382C Embedded Software Systems. Prof. B.L. Evans

Modeling and Simulation of H.26L Encoder. Literature Survey. For. EE382C Embedded Software Systems. Prof. B.L. Evans Modeling and Simulation of H.26L Encoder Literature Survey For EE382C Embedded Software Systems Prof. B.L. Evans By Mrudula Yadav and Gayathri Venkat March 25, 2002 Abstract The H.26L standard is targeted

More information

Comparing Memory Systems for Chip Multiprocessors

Comparing Memory Systems for Chip Multiprocessors Comparing Memory Systems for Chip Multiprocessors Jacob Leverich Hideho Arakida, Alex Solomatnikov, Amin Firoozshahian, Mark Horowitz, Christos Kozyrakis Computer Systems Laboratory Stanford University

More information

The Scope of Picture and Video Coding Standardization

The Scope of Picture and Video Coding Standardization H.120 H.261 Video Coding Standards MPEG-1 and MPEG-2/H.262 H.263 MPEG-4 H.264 / MPEG-4 AVC Thomas Wiegand: Digital Image Communication Video Coding Standards 1 The Scope of Picture and Video Coding Standardization

More information

4G WIRELESS VIDEO COMMUNICATIONS

4G WIRELESS VIDEO COMMUNICATIONS 4G WIRELESS VIDEO COMMUNICATIONS Haohong Wang Marvell Semiconductors, USA Lisimachos P. Kondi University of Ioannina, Greece Ajay Luthra Motorola, USA Song Ci University of Nebraska-Lincoln, USA WILEY

More information

VIDEO COMPRESSION STANDARDS

VIDEO COMPRESSION STANDARDS VIDEO COMPRESSION STANDARDS Family of standards: the evolution of the coding model state of the art (and implementation technology support): H.261: videoconference x64 (1988) MPEG-1: CD storage (up to

More information

THE H.264 ADVANCED VIDEO COMPRESSION STANDARD

THE H.264 ADVANCED VIDEO COMPRESSION STANDARD THE H.264 ADVANCED VIDEO COMPRESSION STANDARD Second Edition Iain E. Richardson Vcodex Limited, UK WILEY A John Wiley and Sons, Ltd., Publication About the Author Preface Glossary List of Figures List

More information

The VC-1 and H.264 Video Compression Standards for Broadband Video Services

The VC-1 and H.264 Video Compression Standards for Broadband Video Services The VC-1 and H.264 Video Compression Standards for Broadband Video Services by Jae-Beom Lee Sarnoff Corporation USA Hari Kalva Florida Atlantic University USA 4y Sprin ger Contents PREFACE ACKNOWLEDGEMENTS

More information

Digital Video Processing

Digital Video Processing Video signal is basically any sequence of time varying images. In a digital video, the picture information is digitized both spatially and temporally and the resultant pixel intensities are quantized.

More information

Simulink -based Programming Environment for Heterogeneous MPSoC

Simulink -based Programming Environment for Heterogeneous MPSoC Simulink -based Programming Environment for Heterogeneous MPSoC Katalin Popovici katalin.popovici@mathworks.com Software Engineer, The MathWorks DATE 2009, Nice, France 2009 The MathWorks, Inc. Summary

More information

A Unified HW/SW Interface Model to Remove Discontinuities between HW and SW Design

A Unified HW/SW Interface Model to Remove Discontinuities between HW and SW Design A Unified HW/SW Interface Model to Remove Discontinuities between HW and SW Design Ahmed Amine JERRAYA EPFL November 2005 TIMA Laboratory 46 Avenue Felix Viallet 38031 Grenoble CEDEX, France Email: Ahmed.Jerraya@imag.fr

More information

Laboratoire d'informatique, de Robotique et de Microélectronique de Montpellier Montpellier Cedex 5 France

Laboratoire d'informatique, de Robotique et de Microélectronique de Montpellier Montpellier Cedex 5 France Video Compression Zafar Javed SHAHID, Marc CHAUMONT and William PUECH Laboratoire LIRMM VOODDO project Laboratoire d'informatique, de Robotique et de Microélectronique de Montpellier LIRMM UMR 5506 Université

More information

MPEG-2. ISO/IEC (or ITU-T H.262)

MPEG-2. ISO/IEC (or ITU-T H.262) MPEG-2 1 MPEG-2 ISO/IEC 13818-2 (or ITU-T H.262) High quality encoding of interlaced video at 4-15 Mbps for digital video broadcast TV and digital storage media Applications Broadcast TV, Satellite TV,

More information

Welcome Back to Fundamentals of Multimedia (MR412) Fall, 2012 Chapter 10 ZHU Yongxin, Winson

Welcome Back to Fundamentals of Multimedia (MR412) Fall, 2012 Chapter 10 ZHU Yongxin, Winson Welcome Back to Fundamentals of Multimedia (MR412) Fall, 2012 Chapter 10 ZHU Yongxin, Winson zhuyongxin@sjtu.edu.cn Basic Video Compression Techniques Chapter 10 10.1 Introduction to Video Compression

More information

IMPLEMENTATION OF H.264 DECODER ON SANDBLASTER DSP Vaidyanathan Ramadurai, Sanjay Jinturkar, Mayan Moudgill, John Glossner

IMPLEMENTATION OF H.264 DECODER ON SANDBLASTER DSP Vaidyanathan Ramadurai, Sanjay Jinturkar, Mayan Moudgill, John Glossner IMPLEMENTATION OF H.264 DECODER ON SANDBLASTER DSP Vaidyanathan Ramadurai, Sanjay Jinturkar, Mayan Moudgill, John Glossner Sandbridge Technologies, 1 North Lexington Avenue, White Plains, NY 10601 sjinturkar@sandbridgetech.com

More information

Energy scalability and the RESUME scalable video codec

Energy scalability and the RESUME scalable video codec Energy scalability and the RESUME scalable video codec Harald Devos, Hendrik Eeckhaut, Mark Christiaens ELIS/PARIS Ghent University pag. 1 Outline Introduction Scalable Video Reconfigurable HW: FPGAs Implementation

More information

Video Encoding with. Multicore Processors. March 29, 2007 REAL TIME HD

Video Encoding with. Multicore Processors. March 29, 2007 REAL TIME HD Video Encoding with Multicore Processors March 29, 2007 Video is Ubiquitous... Demand for Any Content Any Time Any Where Resolution ranges from 128x96 pixels for mobile to 1920x1080 pixels for full HD

More information

H.264/AVC und MPEG-4 SVC - die nächsten Generationen der Videokompression

H.264/AVC und MPEG-4 SVC - die nächsten Generationen der Videokompression Fraunhofer Institut für Nachrichtentechnik Heinrich-Hertz-Institut Ralf Schäfer schaefer@hhi.de http://bs.hhi.de H.264/AVC und MPEG-4 SVC - die nächsten Generationen der Videokompression Introduction H.264/AVC:

More information

Upcoming Video Standards. Madhukar Budagavi, Ph.D. DSPS R&D Center, Dallas Texas Instruments Inc.

Upcoming Video Standards. Madhukar Budagavi, Ph.D. DSPS R&D Center, Dallas Texas Instruments Inc. Upcoming Video Standards Madhukar Budagavi, Ph.D. DSPS R&D Center, Dallas Texas Instruments Inc. Outline Brief history of Video Coding standards Scalable Video Coding (SVC) standard Multiview Video Coding

More information

Fast frame memory access method for H.264/AVC

Fast frame memory access method for H.264/AVC Fast frame memory access method for H.264/AVC Tian Song 1a), Tomoyuki Kishida 2, and Takashi Shimamoto 1 1 Computer Systems Engineering, Department of Institute of Technology and Science, Graduate School

More information

High Efficiency Video Coding: The Next Gen Codec. Matthew Goldman Senior Vice President TV Compression Technology Ericsson

High Efficiency Video Coding: The Next Gen Codec. Matthew Goldman Senior Vice President TV Compression Technology Ericsson High Efficiency Video Coding: The Next Gen Codec Matthew Goldman Senior Vice President TV Compression Technology Ericsson High Efficiency Video Coding Compression Bitrate Targets Bitrate MPEG-2 VIDEO 1994

More information

ESE532: System-on-a-Chip Architecture. Today. Process. Message FIFO. Thread. Dataflow Process Model Motivation Issues Abstraction Recommended Approach

ESE532: System-on-a-Chip Architecture. Today. Process. Message FIFO. Thread. Dataflow Process Model Motivation Issues Abstraction Recommended Approach ESE53: System-on-a-Chip Architecture Day 5: January 30, 07 Dataflow Process Model Today Dataflow Process Model Motivation Issues Abstraction Recommended Approach Message Parallelism can be natural Discipline

More information

Flexible Architecture Research Machine (FARM)

Flexible Architecture Research Machine (FARM) Flexible Architecture Research Machine (FARM) RAMP Retreat June 25, 2009 Jared Casper, Tayo Oguntebi, Sungpack Hong, Nathan Bronson Christos Kozyrakis, Kunle Olukotun Motivation Why CPUs + FPGAs make sense

More information

Software Design and Integration for Embedded Multimedia Applications by Successive Refinement

Software Design and Integration for Embedded Multimedia Applications by Successive Refinement Software Design and Integration for Embedded Multimedia Applications by Successive Refinement Katalin Popovici katalin.popovici@mathworks.com The MathWorks, France 2008 The MathWorks, Inc. Acknowledgement

More information

Multimedia Standards

Multimedia Standards Multimedia Standards SS 2017 Lecture 5 Prof. Dr.-Ing. Karlheinz Brandenburg Karlheinz.Brandenburg@tu-ilmenau.de Contact: Dipl.-Inf. Thomas Köllmer thomas.koellmer@tu-ilmenau.de 1 Organisational issues

More information

Interframe coding A video scene captured as a sequence of frames can be efficiently coded by estimating and compensating for motion between frames pri

Interframe coding A video scene captured as a sequence of frames can be efficiently coded by estimating and compensating for motion between frames pri MPEG MPEG video is broken up into a hierarchy of layer From the top level, the first layer is known as the video sequence layer, and is any self contained bitstream, for example a coded movie. The second

More information

Overview of Dataflow Languages. Waheed Ahmad

Overview of Dataflow Languages. Waheed Ahmad Overview of Dataflow Languages Waheed Ahmad w.ahmad@utwente.nl The purpose of models is not to fit the data but to sharpen the questions. Samuel Karlins 11 th R.A Fisher Memorial Lecture Royal Society

More information

LECTURE VIII: BASIC VIDEO COMPRESSION TECHNIQUE DR. OUIEM BCHIR

LECTURE VIII: BASIC VIDEO COMPRESSION TECHNIQUE DR. OUIEM BCHIR 1 LECTURE VIII: BASIC VIDEO COMPRESSION TECHNIQUE DR. OUIEM BCHIR 2 VIDEO COMPRESSION A video consists of a time-ordered sequence of frames, i.e., images. Trivial solution to video compression Predictive

More information

Video Compression An Introduction

Video Compression An Introduction Video Compression An Introduction The increasing demand to incorporate video data into telecommunications services, the corporate environment, the entertainment industry, and even at home has made digital

More information

An H.264/AVC Main Profile Video Decoder Accelerator in a Multimedia SOC Platform

An H.264/AVC Main Profile Video Decoder Accelerator in a Multimedia SOC Platform An H.264/AVC Main Profile Video Decoder Accelerator in a Multimedia SOC Platform Youn-Long Lin Department of Computer Science National Tsing Hua University Hsin-Chu, TAIWAN 300 ylin@cs.nthu.edu.tw 2006/08/16

More information

H.264 STANDARD BASED SIDE INFORMATION GENERATION IN WYNER-ZIV CODING

H.264 STANDARD BASED SIDE INFORMATION GENERATION IN WYNER-ZIV CODING H.264 STANDARD BASED SIDE INFORMATION GENERATION IN WYNER-ZIV CODING SUBRAHMANYA MAIRA VENKATRAV Supervising Professor: Dr. K. R. Rao 1 TABLE OF CONTENTS 1. Introduction 1.1. Wyner-Ziv video coding 1.2.

More information

Chapter 10. Basic Video Compression Techniques Introduction to Video Compression 10.2 Video Compression with Motion Compensation

Chapter 10. Basic Video Compression Techniques Introduction to Video Compression 10.2 Video Compression with Motion Compensation Chapter 10 Basic Video Compression Techniques 10.1 Introduction to Video Compression 10.2 Video Compression with Motion Compensation 10.3 Search for Motion Vectors 10.4 H.261 10.5 H.263 10.6 Further Exploration

More information

Using animation to motivate motion

Using animation to motivate motion Using animation to motivate motion In computer generated animation, we take an object and mathematically render where it will be in the different frames Courtesy: Wikipedia Given the rendered frames (or

More information

Dataflow Architectures. Karin Strauss

Dataflow Architectures. Karin Strauss Dataflow Architectures Karin Strauss Introduction Dataflow machines: programmable computers with hardware optimized for fine grain data-driven parallel computation fine grain: at the instruction granularity

More information

H.264 Decoding. University of Central Florida

H.264 Decoding. University of Central Florida 1 Optimization Example: H.264 inverse transform Interprediction Intraprediction In-Loop Deblocking Render Interprediction filter data from previously decoded frames Deblocking filter out block edges Today:

More information

Reliable Embedded Multimedia Systems?

Reliable Embedded Multimedia Systems? 2 Overview Reliable Embedded Multimedia Systems? Twan Basten Joint work with Marc Geilen, AmirHossein Ghamarian, Hamid Shojaei, Sander Stuijk, Bart Theelen, and others Embedded Multi-media Analysis of

More information

Error Control Techniques for Interactive Low-bit Rate Video Transmission over the Internet.

Error Control Techniques for Interactive Low-bit Rate Video Transmission over the Internet. Error Control Techniques for Interactive Low-bit Rate Video Transmission over the Internet. Injong Rhee Department of Computer Science North Carolina State University Video Conferencing over Packet- Switching

More information

FPGA based High Performance CAVLC Implementation for H.264 Video Coding

FPGA based High Performance CAVLC Implementation for H.264 Video Coding FPGA based High Performance CAVLC Implementation for H.264 Video Coding Arun Kumar Pradhan Trident Academy of Technology Bhubaneswar,India Lalit Kumar Kanoje Trident Academy of Technology Bhubaneswar,India

More information

Computer and Hardware Architecture I. Benny Thörnberg Associate Professor in Electronics

Computer and Hardware Architecture I. Benny Thörnberg Associate Professor in Electronics Computer and Hardware Architecture I Benny Thörnberg Associate Professor in Electronics Hardware architecture Computer architecture The functionality of a modern computer is so complex that no human can

More information

The Use Of Virtual Platforms In MP-SoC Design. Eshel Haritan, VP Engineering CoWare Inc. MPSoC 2006

The Use Of Virtual Platforms In MP-SoC Design. Eshel Haritan, VP Engineering CoWare Inc. MPSoC 2006 The Use Of Virtual Platforms In MP-SoC Design Eshel Haritan, VP Engineering CoWare Inc. MPSoC 2006 1 MPSoC Is MP SoC design happening? Why? Consumer Electronics Complexity Cost of ASIC Increased SW Content

More information

Cross Layer Protocol Design

Cross Layer Protocol Design Cross Layer Protocol Design Radio Communication III The layered world of protocols Video Compression for Mobile Communication » Image formats» Pixel representation Overview» Still image compression Introduction»

More information

LIST OF TABLES. Table 5.1 Specification of mapping of idx to cij for zig-zag scan 46. Table 5.2 Macroblock types 46

LIST OF TABLES. Table 5.1 Specification of mapping of idx to cij for zig-zag scan 46. Table 5.2 Macroblock types 46 LIST OF TABLES TABLE Table 5.1 Specification of mapping of idx to cij for zig-zag scan 46 Table 5.2 Macroblock types 46 Table 5.3 Inverse Scaling Matrix values 48 Table 5.4 Specification of QPC as function

More information

Modelling and simulation of guaranteed throughput channels of a hard real-time multiprocessor system

Modelling and simulation of guaranteed throughput channels of a hard real-time multiprocessor system Modelling and simulation of guaranteed throughput channels of a hard real-time multiprocessor system A.J.M. Moonen Information and Communication Systems Department of Electrical Engineering Eindhoven University

More information

Week 14. Video Compression. Ref: Fundamentals of Multimedia

Week 14. Video Compression. Ref: Fundamentals of Multimedia Week 14 Video Compression Ref: Fundamentals of Multimedia Last lecture review Prediction from the previous frame is called forward prediction Prediction from the next frame is called forward prediction

More information

CENG 3531 Computer Architecture Spring a. T / F A processor can have different CPIs for different programs.

CENG 3531 Computer Architecture Spring a. T / F A processor can have different CPIs for different programs. Exam 2 April 12, 2012 You have 80 minutes to complete the exam. Please write your answers clearly and legibly on this exam paper. GRADE: Name. Class ID. 1. (22 pts) Circle the selected answer for T/F and

More information

Lecture 14. Synthesizing Parallel Programs IAP 2007 MIT

Lecture 14. Synthesizing Parallel Programs IAP 2007 MIT 6.189 IAP 2007 Lecture 14 Synthesizing Parallel Programs Prof. Arvind, MIT. 6.189 IAP 2007 MIT Synthesizing parallel programs (or borrowing some ideas from hardware design) Arvind Computer Science & Artificial

More information

EE 5359 H.264 to VC 1 Transcoding

EE 5359 H.264 to VC 1 Transcoding EE 5359 H.264 to VC 1 Transcoding Vidhya Vijayakumar Multimedia Processing Lab MSEE, University of Texas @ Arlington vidhya.vijayakumar@mavs.uta.edu Guided by Dr.K.R. Rao Goals Goals The goal of this project

More information

Comparative Study of Partial Closed-loop Versus Open-loop Motion Estimation for Coding of HDTV

Comparative Study of Partial Closed-loop Versus Open-loop Motion Estimation for Coding of HDTV Comparative Study of Partial Closed-loop Versus Open-loop Motion Estimation for Coding of HDTV Jeffrey S. McVeigh 1 and Siu-Wai Wu 2 1 Carnegie Mellon University Department of Electrical and Computer Engineering

More information

Real-Time Course. Video Streaming Over network. June Peter van der TU/e Computer Science, System Architecture and Networking

Real-Time Course. Video Streaming Over network. June Peter van der TU/e Computer Science, System Architecture and Networking Real-Time Course Video Streaming Over network 1 Home network example Internet Internet Internet in Ethernet switch 2 QoS chains Quality of video Size of video bit/s network Quality of network Bandwidth,

More information

Research on Transcoding of MPEG-2/H.264 Video Compression

Research on Transcoding of MPEG-2/H.264 Video Compression Research on Transcoding of MPEG-2/H.264 Video Compression WEI, Xianghui Graduate School of Information, Production and Systems Waseda University February 2009 Abstract Video transcoding performs one or

More information

A Motion Vector Predictor Architecture for AVS and MPEG-2 HDTV Decoder

A Motion Vector Predictor Architecture for AVS and MPEG-2 HDTV Decoder A Motion Vector Predictor Architecture for AVS and MPEG-2 HDTV Decoder Junhao Zheng 1,3, Di Wu 1, Lei Deng 2, Don Xie 4, and Wen Gao 1,2,3 1 Institute of Computing Technology, Chinese Academy of Sciences,

More information

Processor Architecture and Interconnect

Processor Architecture and Interconnect Processor Architecture and Interconnect What is Parallelism? Parallel processing is a term used to denote simultaneous computation in CPU for the purpose of measuring its computation speeds. Parallel Processing

More information

Smoooth Streaming over wireless Networks Sreya Chakraborty Final Report EE-5359 under the guidance of Dr. K.R.Rao

Smoooth Streaming over wireless Networks Sreya Chakraborty Final Report EE-5359 under the guidance of Dr. K.R.Rao Smoooth Streaming over wireless Networks Sreya Chakraborty Final Report EE-5359 under the guidance of Dr. K.R.Rao 28th April 2011 LIST OF ACRONYMS AND ABBREVIATIONS AVC: Advanced Video Coding DVD: Digital

More information

Parallel Scalability of Video Decoders

Parallel Scalability of Video Decoders J Sign Process Syst (2009) 57:173 194 DOI 10.1007/s11265-008-0256-9 Parallel Scalability of Video Decoders Cor Meenderinck Arnaldo Azevedo Ben Juurlink Mauricio Alvarez Mesa Alex Ramirez Received: 14 September

More information

CMPT 365 Multimedia Systems. Media Compression - Video

CMPT 365 Multimedia Systems. Media Compression - Video CMPT 365 Multimedia Systems Media Compression - Video Spring 2017 Edited from slides by Dr. Jiangchuan Liu CMPT365 Multimedia Systems 1 Introduction What s video? a time-ordered sequence of frames, i.e.,

More information

A macroblock-level analysis on the dynamic behaviour of an H.264 decoder

A macroblock-level analysis on the dynamic behaviour of an H.264 decoder A macroblock-level analysis on the dynamic behaviour of an H.264 decoder Florian H. Seitner, Ralf M. Schreier, Member, IEEE Michael Bleyer, Margrit Gelautz, Member, IEEE Abstract This work targets the

More information

ESE532 Spring University of Pennsylvania Department of Electrical and System Engineering System-on-a-Chip Architecture

ESE532 Spring University of Pennsylvania Department of Electrical and System Engineering System-on-a-Chip Architecture University of Pennsylvania Department of Electrical and System Engineering System-on-a-Chip Architecture ESE532, Spring 2017 HW2: Profiling Wednesday, January 18 Due: Friday, January 27, 5:00pm In this

More information

Performance Tuning on the Blackfin Processor

Performance Tuning on the Blackfin Processor 1 Performance Tuning on the Blackfin Processor Outline Introduction Building a Framework Memory Considerations Benchmarks Managing Shared Resources Interrupt Management An Example Summary 2 Introduction

More information

5LSE0 - Mod 10 Part 1. MPEG Motion Compensation and Video Coding. MPEG Video / Temporal Prediction (1)

5LSE0 - Mod 10 Part 1. MPEG Motion Compensation and Video Coding. MPEG Video / Temporal Prediction (1) 1 Multimedia Video Coding & Architectures (5LSE), Module 1 MPEG-1/ Standards: Motioncompensated video coding 5LSE - Mod 1 Part 1 MPEG Motion Compensation and Video Coding Peter H.N. de With (p.h.n.de.with@tue.nl

More information

Scalable Extension of HEVC 한종기

Scalable Extension of HEVC 한종기 Scalable Extension of HEVC 한종기 Contents 0. Overview for Scalable Extension of HEVC 1. Requirements and Test Points 2. Coding Gain/Efficiency 3. Complexity 4. System Level Considerations 5. Related Contributions

More information

A hardware operating system kernel for multi-processor systems

A hardware operating system kernel for multi-processor systems A hardware operating system kernel for multi-processor systems Sanggyu Park a), Do-sun Hong, and Soo-Ik Chae School of EECS, Seoul National University, Building 104 1, Seoul National University, Gwanakgu,

More information

ISSCC 2006 / SESSION 22 / LOW POWER MULTIMEDIA / 22.1

ISSCC 2006 / SESSION 22 / LOW POWER MULTIMEDIA / 22.1 ISSCC 26 / SESSION 22 / LOW POWER MULTIMEDIA / 22.1 22.1 A 125µW, Fully Scalable MPEG-2 and H.264/AVC Video Decoder for Mobile Applications Tsu-Ming Liu 1, Ting-An Lin 2, Sheng-Zen Wang 2, Wen-Ping Lee

More information

Portland State University ECE 588/688. Dataflow Architectures

Portland State University ECE 588/688. Dataflow Architectures Portland State University ECE 588/688 Dataflow Architectures Copyright by Alaa Alameldeen and Haitham Akkary 2018 Hazards in von Neumann Architectures Pipeline hazards limit performance Structural hazards

More information

A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications

A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications Ju-Ho Sohn, Jeong-Ho Woo, Min-Wuk Lee, Hye-Jung Kim, Ramchan Woo, Hoi-Jun Yoo Semiconductor System

More information

ESE532: System-on-a-Chip Architecture. Today. Programmable SoC. Message. Process. Reminder

ESE532: System-on-a-Chip Architecture. Today. Programmable SoC. Message. Process. Reminder ESE532: System-on-a-Chip Architecture Day 5: September 18, 2017 Dataflow Process Model Today Dataflow Process Model Motivation Issues Abstraction Basic Approach Dataflow variants Motivations/demands for

More information

Automated Space/Time Scaling of Streaming Task Graphs. Hossein Omidian Supervisor: Guy Lemieux

Automated Space/Time Scaling of Streaming Task Graphs. Hossein Omidian Supervisor: Guy Lemieux Automated Space/Time Scaling of Streaming Task Graphs Hossein Omidian Supervisor: Guy Lemieux 1 Contents Introduction KPN-based HLS Tool for MPPA overlay Experimental Results Future Work Conclusion 2 Introduction

More information

Overview of SOC Architecture design

Overview of SOC Architecture design Computer Architectures Overview of SOC Architecture design Tien-Fu Chen National Chung Cheng Univ. SOC - 0 SOC design Issues SOC architecture Reconfigurable System-level Programmable processors Low-level

More information

Parallel Scalability of Video Decoders

Parallel Scalability of Video Decoders Parallel Scalability of Video Decoders Technical Report June 30, 2008 Cor Meenderinck, Arnaldo Azevedo and Ben Juurlink Computer Engineering Department Delft University of Technology {Cor, Azevedo, Benj}@ce.et.tudelft.nl

More information

JPEG 2000 vs. JPEG in MPEG Encoding

JPEG 2000 vs. JPEG in MPEG Encoding JPEG 2000 vs. JPEG in MPEG Encoding V.G. Ruiz, M.F. López, I. García and E.M.T. Hendrix Dept. Computer Architecture and Electronics University of Almería. 04120 Almería. Spain. E-mail: vruiz@ual.es, mflopez@ace.ual.es,

More information

A 1-GHz Configurable Processor Core MeP-h1

A 1-GHz Configurable Processor Core MeP-h1 A 1-GHz Configurable Processor Core MeP-h1 Takashi Miyamori, Takanori Tamai, and Masato Uchiyama SoC Research & Development Center, TOSHIBA Corporation Outline Background Pipeline Structure Bus Interface

More information

Multi-Grain Parallel Accelerate System for H.264 Encoder on ULTRASPARC T2

Multi-Grain Parallel Accelerate System for H.264 Encoder on ULTRASPARC T2 JOURNAL OF COMPUTERS, VOL 8, NO 12, DECEMBER 2013 3293 Multi-Grain Parallel Accelerate System for H264 Encoder on ULTRASPARC T2 Yu Wang, Linda Wu, and Jing Guo Key Lab of the Academy of Equipment, Beijing,

More information

MODELING OF BLOCK-BASED DSP SYSTEMS

MODELING OF BLOCK-BASED DSP SYSTEMS MODELING OF BLOCK-BASED DSP SYSTEMS Dong-Ik Ko and Shuvra S. Bhattacharyya Department of Electrical and Computer Engineering, and Institute for Advanced Computer Studies University of Maryland, College

More information

How to Add Low-Power, Multi-Codec, Digital Video and Audio to Your Next ASIC or SOC Design

How to Add Low-Power, Multi-Codec, Digital Video and Audio to Your Next ASIC or SOC Design WHITE PAPER How to Add Low-Power, Multi-Codec, Digital Video and Audio to Your Next ASIC or SOC Design Since the early 1990s, video compression has grown increasingly important for the design of modern

More information

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING UNIT-1

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING UNIT-1 DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING Year & Semester : III/VI Section : CSE-1 & CSE-2 Subject Code : CS2354 Subject Name : Advanced Computer Architecture Degree & Branch : B.E C.S.E. UNIT-1 1.

More information

Computer Architecture: Dataflow/Systolic Arrays

Computer Architecture: Dataflow/Systolic Arrays Data Flow Computer Architecture: Dataflow/Systolic Arrays he models we have examined all assumed Instructions are fetched and retired in sequential, control flow order his is part of the Von-Neumann model

More information

DHANALAKSHMI SRINIVASAN INSTITUTE OF RESEARCH AND TECHNOLOGY. Department of Computer science and engineering

DHANALAKSHMI SRINIVASAN INSTITUTE OF RESEARCH AND TECHNOLOGY. Department of Computer science and engineering DHANALAKSHMI SRINIVASAN INSTITUTE OF RESEARCH AND TECHNOLOGY Department of Computer science and engineering Year :II year CS6303 COMPUTER ARCHITECTURE Question Bank UNIT-1OVERVIEW AND INSTRUCTIONS PART-B

More information

Modelling, Analysis and Scheduling with Dataflow Models

Modelling, Analysis and Scheduling with Dataflow Models technische universiteit eindhoven Modelling, Analysis and Scheduling with Dataflow Models Marc Geilen, Bart Theelen, Twan Basten, Sander Stuijk, AmirHossein Ghamarian, Jeroen Voeten Eindhoven University

More information

Implementing Tile-based Chip Multiprocessors with GALS Clocking Styles

Implementing Tile-based Chip Multiprocessors with GALS Clocking Styles Implementing Tile-based Chip Multiprocessors with GALS Clocking Styles Zhiyi Yu, Bevan Baas VLSI Computation Lab, ECE Department University of California, Davis, USA Outline Introduction Timing issues

More information

15: OS Scheduling and Buffering

15: OS Scheduling and Buffering 15: OS Scheduling and ing Mark Handley Typical Audio Pipeline (sender) Sending Host Audio Device Application A->D Device Kernel App Compress Encode for net RTP ed pending DMA to host (~10ms according to

More information

Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks

Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks Zhining Huang, Sharad Malik Electrical Engineering Department

More information

Automated Design Flow for Coarse-Grained Reconfigurable Platforms: an RVC-CAL Multi-Standard Decoder Use-Case

Automated Design Flow for Coarse-Grained Reconfigurable Platforms: an RVC-CAL Multi-Standard Decoder Use-Case XIV International Conference on Embedded Computer and Systems: Architectures, MOdeling and Simulation SAMOS XIV - 2014 July 14 th - Samos Island (Greece) Carlo Sau, Luigi Raffo DIEE Università degli Studi

More information

EE Low Complexity H.264 encoder for mobile applications

EE Low Complexity H.264 encoder for mobile applications EE 5359 Low Complexity H.264 encoder for mobile applications Thejaswini Purushotham Student I.D.: 1000-616 811 Date: February 18,2010 Objective The objective of the project is to implement a low-complexity

More information

Caches. Samira Khan March 23, 2017

Caches. Samira Khan March 23, 2017 Caches Samira Khan March 23, 2017 Agenda Review from last lecture Data flow model Memory hierarchy More Caches The Dataflow Model (of a Computer) Von Neumann model: An instruction is fetched and executed

More information

IEEE 1857 Standard Empowering Smart Video Surveillance Systems

IEEE 1857 Standard Empowering Smart Video Surveillance Systems IEEE 1857 Standard Empowering Smart Video Surveillance Systems Wen Gao, Yonghong Tian, Tiejun Huang, Siwei Ma, Xianguo Zhang to be published in IEEE Intelligent Systems (released in 2013). Effrosyni Doutsi

More information

Introduction to Video Compression

Introduction to Video Compression Insight, Analysis, and Advice on Signal Processing Technology Introduction to Video Compression Jeff Bier Berkeley Design Technology, Inc. info@bdti.com http://www.bdti.com Outline Motivation and scope

More information

High Efficiency Video Coding. Li Li 2016/10/18

High Efficiency Video Coding. Li Li 2016/10/18 High Efficiency Video Coding Li Li 2016/10/18 Email: lili90th@gmail.com Outline Video coding basics High Efficiency Video Coding Conclusion Digital Video A video is nothing but a number of frames Attributes

More information

Multimedia Systems Video II (Video Coding) Mahdi Amiri April 2012 Sharif University of Technology

Multimedia Systems Video II (Video Coding) Mahdi Amiri April 2012 Sharif University of Technology Course Presentation Multimedia Systems Video II (Video Coding) Mahdi Amiri April 2012 Sharif University of Technology Video Coding Correlation in Video Sequence Spatial correlation Similar pixels seem

More information

Reliable Dynamic Embedded Data Processing Systems

Reliable Dynamic Embedded Data Processing Systems 2 Embedded Data Processing Systems Reliable Dynamic Embedded Data Processing Systems sony Twan Basten thales Joint work with Marc Geilen, AmirHossein Ghamarian, Hamid Shojaei, Sander Stuijk, Bart Theelen,

More information