Self-Adaptive FPGA-Based Image Processing Filters Using Approximate Arithmetics
Transcription
1 Self-Adaptive FPGA-Based Image Processing Filters Using Approximate Arithmetics
Jutta Pirkl, Andreas Becher, Jorge Echavarria, Jürgen Teich, and Stefan Wildermann
Hardware/Software Co-Design, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU)
SCOPES 17, St. Goar, Germany, June 1, 17
2 Approximate Computing: A New Design Paradigm
Drivers: portable battery-powered devices, rapid workload increase.
Underlying idea: Trading accuracy of computations against disproportionate improvements with respect to power consumption and/or performance and/or circuit area.
Jutta Pirkl Hardware/Software Co-Design (FAU) Self-Adaptive FPGA-Based Image Processing SCOPES 17
3 Motivation
Problem: Error tolerance depends on both input data and application context.
Quality-configurability is the key principle of prospective AC platforms [1].
Building blocks: approximate computing and dynamic partial reconfiguration for the adaptation of the approximation level.
Approach: Dynamic autonomous swapping of filters with different degrees of approximation utilizing reconfigurable hardware.
[1] S. Venkataramani et al. Approximate computing and the quest for computing efficiency. In: 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC). June 2015.
4 Outline
Concepts of Self-Adaptive Image Processing
  Approximate 2D-Convolution Filters
  Quality Evaluation
  Reconfiguration Management
Experimental Results
  Quality-Configurable Control Mechanism
  Partial Reconfiguration Overhead
Summary
5 Concepts of Self-Adaptive Image Processing
6 Approximate 2D-Convolution Filters
Basic filter building block: 2D-convolution filter wrapper with a kernel size of 3 × 3.
$$Y[m, n] = \sum_{i} \sum_{j} H[i, j] \, X[m - i, n - j]$$
Here X is the input, H the filter kernel, and Y the output.
Parallel multiply-accumulate (MAC) operation in a pipelined adder tree structure.
Replacement of all adders by the same approximate version.
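The convolution sum and the adder tree can be sketched in software as follows. This is a minimal behavioral illustration, not the authors' hardware description: the function names, the centered kernel indexing (i, j in −1..1), and the zero padding at the image border are assumptions made for the example. The `add` parameter is a hypothetical hook for swapping every adder in the tree for the same approximate version.

```python
def adder_tree(values, add=lambda x, y: x + y):
    """Reduce a list pairwise, level by level, like a hardware adder tree.

    `add` is a hypothetical hook: pass an approximate adder to replace
    every adder in the tree by the same approximate version.
    """
    level = list(values)
    while len(level) > 1:
        nxt = [add(level[k], level[k + 1]) for k in range(0, len(level) - 1, 2)]
        if len(level) % 2:           # odd element is forwarded to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]


def conv2d_3x3(image, kernel, add=lambda x, y: x + y):
    """Y[m][n] = sum_i sum_j H[i][j] * X[m-i][n-j] for i, j in {-1, 0, 1}.

    Assumptions of this sketch: kernel[i + 1][j + 1] stores H[i][j], and
    pixels outside the image count as zero.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for m in range(rows):
        for n in range(cols):
            products = []
            for i in (-1, 0, 1):
                for j in (-1, 0, 1):
                    r, c = m - i, n - j
                    inside = 0 <= r < rows and 0 <= c < cols
                    pixel = image[r][c] if inside else 0
                    products.append(kernel[i + 1][j + 1] * pixel)
            out[m][n] = adder_tree(products, add)   # nine products, one tree
    return out
```

With an identity kernel (a single 1 in the center) the function returns the input image unchanged, which is a quick sanity check for the index convention.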
7 Approximate Adder Structures on FPGAs
[Figure: LUT-based n-bit adder whose carry chain is split at bit position m into a Most Significant Part (MSP) and a Least Significant Part (LSP); inputs a and b, carries c, sum bits s, carry-out cout, and an all1 signal.]
Case 1: Carry suppression between LSP and MSP: error reduction mechanism.
Case 2: Carry prediction between LSP and MSP: no approximation error.
[2] A. Becher et al. A LUT-based approximate adder. In: Proceedings of the 24th Annual IEEE International Symposium on Field-Programmable Custom Computing Machines. FCCM 16. Washington DC, USA, May 2016.
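One way to picture the split carry chain is the behavioral model below. It is only a sketch under stated assumptions and does not reproduce the LUT-level design of the cited adder. The carry into the MSP is assumed to be predicted from bit m−1 of both operands alone, and the error reduction is assumed to force all LSP sum bits to one (the role attributed here to the all1 signal) whenever a real carry was suppressed. The function name `approx_add` is hypothetical.

```python
def approx_add(a, b, m):
    """Behavioral model of an adder with its carry chain split at bit m.

    Assumptions of this sketch (not taken from the slides):
      * the carry into the MSP is predicted as a[m-1] AND b[m-1];
      * if the LSP really produces a carry that was not predicted,
        the LSP sum bits are forced to all ones (error reduction).
    """
    low_mask = (1 << m) - 1
    low_sum = (a & low_mask) + (b & low_mask)
    true_carry = low_sum >> m                      # carry the exact adder would pass on
    predicted = ((a >> (m - 1)) & (b >> (m - 1)) & 1) if m > 0 else 0

    high_sum = (a >> m) + (b >> m) + predicted     # MSP never waits for the LSP
    if true_carry and not predicted:
        low_out = low_mask                         # Case 1: carry suppressed, saturate LSP
    else:
        low_out = low_sum & low_mask               # Case 2: prediction correct, exact result
    return (high_sum << m) | low_out
```

Under these assumptions the model never overestimates: a predicted carry implies a real one, and a suppressed carry yields a result whose lower m bits are all ones. For example, `approx_add(7, 1, 2)` gives 7 instead of 8.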
10 Case Study: Approximate Gaussian Lowpass Filter
[Figure: filter outputs for different carry chain splitting positions m.]
Artifacts:
Brightness decrease (underestimating adder)
Cartoon effect
11 Impact of the Carry Chain Splitting Point on the Output Quality
Dependency of the average Peak Signal-to-Noise Ratio (PSNR) on m among the Kodak Lossless True Color Image Suite [3].
[Plot: average PSNR [dB] over the splitting position of the carry chain (m).]
[3] R. Franzen. True Color Kodak Images.
12 Quality Evaluation
Problem: Requirement of a no-reference metric to assess the quality at runtime.
Approach: Feature extraction from the histograms of in- and output images.
[Figure: histograms (frequency over gray level) of the input image and of the outputs for several splitting positions m.]
More and more pixels are mapped onto exactly the same brightness values ("cartoon" effect).
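For reference, PSNR between an exact and an approximate filter output can be computed as sketched below. This uses the standard definition, PSNR = 10·log10(MAX² / MSE). The function name and the list-of-rows image representation are choices made for this example, and 8-bit grayscale (MAX = 255) is assumed as the default.

```python
import math


def psnr(reference, test, max_value=255):
    """Peak Signal-to-Noise Ratio in dB between two equally sized images.

    Images are lists of rows of integer gray levels (an assumption of
    this sketch). Identical images yield infinity.
    """
    squared_error = 0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            squared_error += (r - t) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)
```

A lower PSNR means a larger deviation from the exact result, so the metric falls as the approximation becomes more aggressive.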
13 Quality Evaluation
[Figure: output histograms (frequency over gray level) for three splitting positions m, Gauss kernel.]
Distinctive peaks created by the all1 signal: erroneous sums are mapped onto $2^{n-m}$ values.
Example: m = 9, output bit width after normalization: 8
After normalization by 16, the smallest collection bin lies at gray level 31 (decimal).
Further peaks follow at a distance of $2^5 = 32$.
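The peak positions in this example can be reproduced with a few lines of arithmetic. The sketch assumes that normalization by 16 is a truncating right shift by 4 bits, so that of the m = 9 saturated low bits, 9 − 4 = 5 remain set in the 8-bit output. The function and parameter names are hypothetical.

```python
def peak_gray_levels(m, norm_shift, out_bits):
    """Gray levels onto which saturated (all-ones) sums collapse.

    Assumes the erroneous sums have their lower m bits set to one and
    that normalization is a right shift by norm_shift bits.
    """
    low_bits = m - norm_shift                 # saturated bits that survive the shift
    low_ones = (1 << low_bits) - 1            # 0b11111 = 31 for low_bits = 5
    return [(k << low_bits) | low_ones for k in range(1 << (out_bits - low_bits))]


print(peak_gray_levels(9, 4, 8))
# [31, 63, 95, 127, 159, 191, 223, 255]
```

The first entry matches the smallest collection bin at 31, and consecutive entries are 32 apart.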
15 Quality Evaluation
The number of counters used to sample the histograms constitutes a trade-off between overhead and fidelity.
Counting of the pixels with gray levels 31, 63, 127 and 191 in both in- and output image.
Definition (QM): Ratio of the maximum peak height of the four bins and the corresponding amount in the input image.
[Plot: progression of QM with increasing splitting position of the carry chain (m).]
A large metric value indicates bad quality.
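A software sketch of the metric, following the definition above, might look like this. The monitored gray levels are a parameter. The default tuple (31, 63, 127, 191) is an assumption based on the 32-level peak spacing, and clamping the input count to at least 1 to avoid a division by zero is also an assumption of this example, not something the slides specify.

```python
MONITORED_LEVELS = (31, 63, 127, 191)   # assumed default, see lead-in


def count_levels(image, levels):
    """Count how many pixels of `image` hit each monitored gray level."""
    counts = {level: 0 for level in levels}
    for row in image:
        for pixel in row:
            if pixel in counts:
                counts[pixel] += 1
    return counts


def quality_metric(input_image, output_image, levels=MONITORED_LEVELS):
    """QM: highest output peak among the monitored bins, divided by the
    count of the same gray level in the input image."""
    in_counts = count_levels(input_image, levels)
    out_counts = count_levels(output_image, levels)
    peak_level = max(levels, key=lambda level: out_counts[level])
    # Clamp the denominator to 1 (assumption) so an empty input bin is safe.
    return out_counts[peak_level] / max(in_counts[peak_level], 1)
```

In hardware this corresponds to four counters per image; the division can be left to the software side of the control loop.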
17 Reconfiguration Management
Objective: Minimization of the critical path while maintaining a given quality boundary, i.e., successive approximation of m to the fastest configuration $m_{\mathrm{opt}}$.
Based on a bang-bang controller with integrated hysteresis.
Decision logic for setting the degree of approximation:
Case 1: Quality is still acceptable: in- or decrement the splitting position m in the direction of $m_{\mathrm{opt}}$.
Case 2: Quality boundary is exceeded: in- or decrement m in the opposite direction of $m_{\mathrm{opt}}$.
Case 3: Quality metric is within dead zone: keep configuration.
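The three cases can be sketched as a single decision function. This is an illustrative model, not the authors' implementation: the threshold `tau`, the dead-zone width `hysteresis`, and the most accurate fallback position `m_safe` are hypothetical parameters introduced here, and the dead zone is assumed to lie directly below the quality boundary.

```python
def step_toward(m, target):
    """Move the splitting position one step toward `target`."""
    if m < target:
        return m + 1
    if m > target:
        return m - 1
    return m


def next_split(m, qm, tau, hysteresis, m_opt, m_safe):
    """Bang-bang decision with hysteresis for the next splitting position.

    qm > tau                  -> Case 2: boundary exceeded, back off toward m_safe
    qm < tau - hysteresis     -> Case 1: quality acceptable, approach m_opt
    otherwise                 -> Case 3: dead zone, keep the configuration
    (A large QM means bad quality; tau, hysteresis and m_safe are assumptions.)
    """
    if qm > tau:
        return step_toward(m, m_safe)
    if qm < tau - hysteresis:
        return step_toward(m, m_opt)
    return m
```

Called once per adaptation interval, the function either keeps the current partial bitstream or selects the neighboring one, so at most one reconfiguration is triggered per decision.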
18 Experimental Results
19 Results: Input-Based Adaptivity
System behavior at runtime for the approximate Gaussian filter.
[Plots over frames: quality metric QM with threshold $\tau_{QM}$ and splitting position m, comparing the dynamic configuration with static splitting positions, including static m = 5 and static m = 6.]
24 Results: Requirement-Based Adaptivity
System evaluation results
[Plots: degree of approximation (DoA) as $m_{\mathrm{avg}}$ and output quality as $\mathrm{PSNR}_{\mathrm{avg}}$ [dB], each over the quality requirement (low, medium, high), for the test videos aspen, redkayak, snowmnt, touchdownpass, pedestrian area, and demo video.]
Test videos: Derf's collection + self-shot demo video, resolution of 6 8, grayscale 8 bits/pixel
Evaluation Parameters: Adaptation rate of at a frame rate of 3 fps sec
25 Analysis of the Partial Reconfiguration Overhead
Reconfiguration time for the partial bitstreams [6]
XC7Z, partial bitstream:
  Bitstream Size [KB]:
  Reconfiguration Time [ms]: 1.6
  Download Rate [MB/s]: 37.8
Approximately linear correlation between configuration time and bitstream size.
Remaining time slot for the filtering process at 3 fps: ms ms = ms
Partial reconfiguration requires .86 % of the time frame.
[6] This table contains only the largest bitstream among the approximate variants for the Gaussian filter, which determines the slowest transfer.
[7] Full binary bitstream size for the xc7z device: ,5,56 Bytes
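The overhead calculation follows from the linear relation between bitstream size and configuration time. The sketch below only shows the arithmetic; all function names are hypothetical, and the numbers in the usage lines are made-up placeholders, not the measured values from the table.

```python
def reconfiguration_time_ms(bitstream_bytes, rate_bytes_per_s):
    """Transfer time of a partial bitstream, assuming time = size / rate."""
    return 1000.0 * bitstream_bytes / rate_bytes_per_s


def frame_period_ms(fps):
    """Time slot available per frame at a given frame rate."""
    return 1000.0 / fps


def overhead_fraction(bitstream_bytes, rate_bytes_per_s, fps):
    """Share of one frame period consumed by a partial reconfiguration."""
    return reconfiguration_time_ms(bitstream_bytes, rate_bytes_per_s) / frame_period_ms(fps)


# Placeholder values for illustration only (not from the slides):
# a 100,000-byte partial bitstream at 50 MB/s and 25 fps.
t = reconfiguration_time_ms(100_000, 50_000_000)      # 2.0 ms
share = overhead_fraction(100_000, 50_000_000, 25)     # 0.05 of the frame period
remaining = frame_period_ms(25) - t                    # 38.0 ms left for filtering
```

Because the largest partial bitstream determines the slowest transfer, evaluating the worst case once bounds the overhead for all approximate variants.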
26 Summary
Challenge: Input-dependent approximation error behavior requires self-adaptive methods.
Proposition of a no-reference metric for online output quality monitoring based on histogram information.
Our concept offers
  better exploitation of a given error tolerance than static approximation
  a user control knob to select the desired output quality at runtime
Thank you for listening! Any questions?
31 Backup Slides
32 System Level Overview
[Figure: SoC block diagram. Software side (processing system with main memory): user application + reconfiguration manager, driver modules, Linux kernel with /dev/image_filter and /dev/xdevcfg. Hardware side (programmable logic): HW/SW interface, filter wrapper with a reconfigurable partition (partial reconfiguration among m = 1, m = 2, m = 3), quality evaluation, filter controller.]
Reconfiguration Manager: Quality-control loop; Linux device drivers as hardware interfaces.
Approximate Filter Operators: Partial bitstreams for various degrees of approximation.
Quality Evaluation: Online quality monitoring.
33 Correlation Between the Proposed Quality Metric and PSNR
[Plot: QM over PSNR [dB].]
Inverse relation: Increasing tendency of the metric with decreasing PSNR.
More informationDeveloping Applications for HPRCs
Developing Applications for HPRCs Esam El-Araby The George Washington University Acknowledgement Prof.\ Tarek El-Ghazawi Mohamed Taher ARSC SRC SGI Cray 2 Outline Background Methodology A Case Studies
More informationFerre, PL., Doufexi, A., Chung How, J. T. H., Nix, AR., & Bull, D. (2003). Link adaptation for video transmission over COFDM based WLANs.
Ferre, PL., Doufexi, A., Chung How, J. T. H., Nix, AR., & Bull, D. (2003). Link adaptation for video transmission over COFDM based WLANs. Peer reviewed version Link to publication record in Explore Bristol
More informationAn FPGA Based Adaptive Viterbi Decoder
An FPGA Based Adaptive Viterbi Decoder Sriram Swaminathan Russell Tessier Department of ECE University of Massachusetts Amherst Overview Introduction Objectives Background Adaptive Viterbi Algorithm Architecture
More informationSiggraph Course 2017 Path Tracing in Production Part 1 Manuka: Weta Digital's Spectral Renderer
Siggraph Course 2017 Path Tracing in Production Part 1 Manuka: Weta Digital's Spectral Renderer Johannes Hanika, Weta Digital 1 Motivation Weta Digital is a VFX house we care about matching plate a lot
More informationDesign and Implementation of an Eight Bit Multiplier Using Twin Precision Technique and Baugh-Wooley Algorithm
International Journal of Scientific and Research Publications, Volume 3, Issue 4, April 2013 1 Design and Implementation of an Eight Bit Multiplier Using Twin Precision Technique and Baugh-Wooley Algorithm
More informationVHDL Implementation of Multiplierless, High Performance DWT Filter Bank
VHDL Implementation of Multiplierless, High Performance DWT Filter Bank Mr. M.M. Aswale 1, Prof. Ms. R.B Patil 2,Member ISTE Abstract The JPEG 2000 image coding standard employs the biorthogonal 9/7 wavelet
More informationSDA: Software-Defined Accelerator for Large- Scale DNN Systems
SDA: Software-Defined Accelerator for Large- Scale DNN Systems Jian Ouyang, 1 Shiding Lin, 1 Wei Qi, 1 Yong Wang, 1 Bo Yu, 1 Song Jiang, 2 1 Baidu, Inc. 2 Wayne State University Introduction of Baidu A
More informationA HIGH PERFORMANCE FIR FILTER ARCHITECTURE FOR FIXED AND RECONFIGURABLE APPLICATIONS
A HIGH PERFORMANCE FIR FILTER ARCHITECTURE FOR FIXED AND RECONFIGURABLE APPLICATIONS Saba Gouhar 1 G. Aruna 2 gouhar.saba@gmail.com 1 arunastefen@gmail.com 2 1 PG Scholar, Department of ECE, Shadan Women
More informationEnergy Consumption in Mobile Phones: A Measurement Study and Implications for Network Applications (IMC09)
Energy Consumption in Mobile Phones: A Measurement Study and Implications for Network Applications (IMC09) Niranjan Balasubramanian Aruna Balasubramanian Arun Venkataramani University of Massachusetts
More informationComputer Vision I. Announcement. Corners. Edges. Numerical Derivatives f(x) Edge and Corner Detection. CSE252A Lecture 11
Announcement Edge and Corner Detection Slides are posted HW due Friday CSE5A Lecture 11 Edges Corners Edge is Where Change Occurs: 1-D Change is measured by derivative in 1D Numerical Derivatives f(x)
More informationDESIGN AND IMPLEMENTATION OF DA- BASED RECONFIGURABLE FIR DIGITAL FILTER USING VERILOGHDL
DESIGN AND IMPLEMENTATION OF DA- BASED RECONFIGURABLE FIR DIGITAL FILTER USING VERILOGHDL [1] J.SOUJANYA,P.G.SCHOLAR, KSHATRIYA COLLEGE OF ENGINEERING,NIZAMABAD [2] MR. DEVENDHER KANOOR,M.TECH,ASSISTANT
More informationPotentials and Limitations for Energy Efficiency Auto-Tuning
Center for Information Services and High Performance Computing (ZIH) Potentials and Limitations for Energy Efficiency Auto-Tuning Parco Symposium Application Autotuning for HPC (Architectures) Robert Schöne
More informationDigital Image Fundamentals
Digital Image Fundamentals Image Quality Objective/ subjective Machine/human beings Mathematical and Probabilistic/ human intuition and perception 6 Structure of the Human Eye photoreceptor cells 75~50
More informationSTUDY AND IMPLEMENTATION OF VIDEO COMPRESSION STANDARDS (H.264/AVC, DIRAC)
STUDY AND IMPLEMENTATION OF VIDEO COMPRESSION STANDARDS (H.264/AVC, DIRAC) EE 5359-Multimedia Processing Spring 2012 Dr. K.R Rao By: Sumedha Phatak(1000731131) OBJECTIVE A study, implementation and comparison
More informationEnhancing Resource Utilization with Design Alternatives in Runtime Reconfigurable Systems
Enhancing Resource Utilization with Design Alternatives in Runtime Reconfigurable Systems Alexander Wold, Dirk Koch, Jim Torresen Department of Informatics, University of Oslo, Norway Email: {alexawo,koch,jimtoer}@ifi.uio.no
More informationDesign Tradeoffs for Data Deduplication Performance in Backup Workloads
Design Tradeoffs for Data Deduplication Performance in Backup Workloads Min Fu,DanFeng,YuHua,XubinHe, Zuoning Chen *, Wen Xia,YuchengZhang,YujuanTan Huazhong University of Science and Technology Virginia
More informationTag a Tiny Aggregation Service for Ad-Hoc Sensor Networks. Samuel Madden, Michael Franklin, Joseph Hellerstein,Wei Hong UC Berkeley Usinex OSDI 02
Tag a Tiny Aggregation Service for Ad-Hoc Sensor Networks Samuel Madden, Michael Franklin, Joseph Hellerstein,Wei Hong UC Berkeley Usinex OSDI 02 Outline Introduction The Tiny AGgregation Approach Aggregate
More informationEfficient Hardware Context- Switch for Task Migration between Heterogeneous FPGAs
Efficient Hardware Context- Switch for Task Migration between Heterogeneous FPGAs Frédéric Rousseau TIMA lab University of Grenoble Alpes A join work with Alban Bourge, Arief Wicaksana and Olivier Muller
More informationCompressed Swap for Embedded Linux. Alexander Belyakov, Intel Corp.
Compressed Swap for Embedded Linux Alexander Belyakov, Intel Corp. Outline. 1. Motivation 2. Underlying media types 3. Related works 4. MTD compression layer driver place in kernel architecture swap-in/out
More informationImage Processing: Final Exam November 10, :30 10:30
Image Processing: Final Exam November 10, 2017-8:30 10:30 Student name: Student number: Put your name and student number on all of the papers you hand in (if you take out the staple). There are always
More informationA Image Comparative Study using DCT, Fast Fourier, Wavelet Transforms and Huffman Algorithm
International Journal of Engineering Research and General Science Volume 3, Issue 4, July-August, 15 ISSN 91-2730 A Image Comparative Study using DCT, Fast Fourier, Wavelet Transforms and Huffman Algorithm
More informationFiltering Applications & Edge Detection. GV12/3072 Image Processing.
Filtering Applications & Edge Detection GV12/3072 1 Outline Sampling & Reconstruction Revisited Anti-Aliasing Edges Edge detection Simple edge detector Canny edge detector Performance analysis Hough Transform
More informationPresented at the FIG Congress 2018, May 6-11, 2018 in Istanbul, Turkey
Presented at the FIG Congress 2018, May 6-11, 2018 in Istanbul, Turkey Evangelos MALTEZOS, Charalabos IOANNIDIS, Anastasios DOULAMIS and Nikolaos DOULAMIS Laboratory of Photogrammetry, School of Rural
More informationECG782: Multidimensional Digital Signal Processing
Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu ECG782: Multidimensional Digital Signal Processing Spring 2014 TTh 14:30-15:45 CBC C313 Lecture 03 Image Processing Basics 13/01/28 http://www.ee.unlv.edu/~b1morris/ecg782/
More informationISSN (ONLINE): , VOLUME-3, ISSUE-1,
PERFORMANCE ANALYSIS OF LOSSLESS COMPRESSION TECHNIQUES TO INVESTIGATE THE OPTIMUM IMAGE COMPRESSION TECHNIQUE Dr. S. Swapna Rani Associate Professor, ECE Department M.V.S.R Engineering College, Nadergul,
More informationECG782: Multidimensional Digital Signal Processing
Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu ECG782: Multidimensional Digital Signal Processing Spatial Domain Filtering http://www.ee.unlv.edu/~b1morris/ecg782/ 2 Outline Background Intensity
More informationData parallel algorithms, algorithmic building blocks, precision vs. accuracy
Data parallel algorithms, algorithmic building blocks, precision vs. accuracy Robert Strzodka Architecture of Computing Systems GPGPU and CUDA Tutorials Dresden, Germany, February 25 2008 2 Overview Parallel
More information