QUKU: A Fast Run Time Reconfigurable Platform for Image Edge Detection
|
|
- Franklin Jordan
- 5 years ago
- Views:
Transcription
1 QUKU: A Fast Run Time Reconfigurable Platform for Image Edge Detection Sunil Shukla 1,2, Neil W. Bergmann 1, Jürgen Becker 2 1 ITEE, University of Queensland, Brisbane, QLD 4072, Australia {sunil, n.bergmann}@itee.uq.edu.au 2 ITIV, Universität Karlsruhe, Karlsruhe, Germany {shukla, becker}@itiv.uni-karlsruhe.de Abstract. To fill the gap between increasing demand for reconfigurability and performance efficiency, CGRAs are seen to be an emerging platform. In this paper, a new architecture, QUKU, is described which uses a coarse-grained reconfigurable PE array (CGRA) overlaid on an FPGA. The low-speed reconfigurability of the FPGA is used to optimize the CGRA for different applications, whilst the high-speed CGRA reconfiguration is used within an application for operator re-use. We will demonstrate the dynamic reconfigurability of QUKU by porting Sobel and Laplacian kernel for edge detection in an image frame. 1 Introduction With the rapid emergence of multi-protocol standards, the need for a platform which is able to reconfigure itself rapidly and on the fly has grown tremendously. This situation demands a high level of flexibility, as seen in GPP (General Purpose Processor), performance and power efficiency. GPP has failed to show its mark in terms of processing efficiency. To fill the gap between flexibility and efficiency, several architectures have been proposed. Generally, ASICs are seen as the only low-power solution which can meet this demand for fast computational efficiency and high I/O bandwidth. However, the NRE cost of ASICs are a strong motivation for some sort of general purpose computing chip with better power efficiency than microprocessors. Recently, coarse grained reconfigurable architectures (CGRA) have been developed to bridge the gap between power-hungry microprocessors and single-purpose ASICs. Coarse-grained arrays have the advantages of power-efficiency for their intended application domain, and also they are designed with efficient implementation of dynamic reconfiguration supported in hardware and in design software. Furthermore, this dynamic reconfiguration can be very fast, since configuration codes are very short. But their use remains in question due to the lack of a well defined design flow and commercial availability. Traditionally, FPGAs have been thought of as a prototyping device for small scale
2 digital systems. However current generation mega-gate FPGA chips have changed this perception. FPGAs can be reconfigured for an unlimited number of times to implement any circuit. FPGAs have proved their usability in control intensive as well as computational intensive tasks. An attractive feature of FPGAs is the ability to partially and dynamically reconfigure the FPGA function. The next section gives a short overview of FPGA based reconfigurable architectures. Section 3 describes the QUKU architecture. Section 4 describes the run time reconfigurability of QUKU. Section 5 discusses the implementation results followed by conclusion. 2 Related Work Some other systems consisting of MPU and FPGAs have been proposed in past [3, 4, 5, 6, 7, 8]. Guccione [9] provides a detailed list of FPGA based reconfigurable architectures. YARDS [5] is a hybrid system of MPU connected with an array of FPGA and has been targeted for telecommunication applications. Spyder [7] consists of a processor with 3 reconfigurable execution units working in parallel. The configuration of the execution units is determined from the set of applications. DISC [6] is an FPGA based processor which loads application specific instruction from a library of image processing elements. It takes advantage of the partial reconfiguration feature of FPGA and implements a system analogous to memory paging in software. PRISM [8] is a reconfigurable architecture built using hardware-software co-design concepts. For each application, new instructions are compiled, synthesized and loaded. GARP [4] combines a MIPS processor with a reconfigurable array to achieve accelerated performance in certain applications. 3 QUKU Architecture QUKU [1, 2] is a merger of two technologies: CGRAs and FPGAs. Fig 1 shows the system level description of QUKU. It consists of a dynamically reconfigurable PE array, configuration controller and address controller for data and result memory along with a Microblaze based soft processor. QUKU is a coarse-grained PE matrix overlaid on a conventional FPGA. The aim is to develop a system which is based on commercially available and affordable technologies, but at the same time provides active support for fast and efficient dynamic reconfiguration. Our system is unique in that it provides two levels of application-specific reconfigurability. The operation of each PE, and the interconnections between PEs can be reconfigured on a cycle-bycycle basis, giving maximum reuse of arithmetic and logical operators without long reconfiguration delays. However, the CGRA does not suffer from the normal CGRA problem of trying to identify the optimal mix of operators for all present and future applications to be used on that array. Rather, the structure of the CGRA can be periodically re-optimized for each new application at a cost of several milliseconds of FPGA reconfiguration time.
3 Configuration Contoller Module Configuration BRAM Configuration controller Address Manager Module Processor for reconfiguration control and also for running software application processes. Fig. 1. System level diagram of QUKU A commonly perceived obstacle to provide algorithm speedup is the difficulty of designing custom hardware for each new algorithm. The QUKU architecture moves the design problem from a complex hardware design problem to a simpler problem of programming a PE array. QUKU compiles these PEs to a heterogeneous array, with each PE optimized for just the range of operations it requires to implement for one application set. It is our conjecture that such a PE array has the potential to provide area and power efficiencies not normally associated with FPGAs, and this paper outlines some of our initial investigations into this potential. 4 Run Time Reconfiguration In [2] we have described the design process and shown that QUKU performance is much better than a FPGA based soft processor implementation. In this section, we will describe the configuration memory organization to support run time reconfiguration of whole or partial PE array. As shown in fig. 1, the configuration controller is responsible for the loading of different modules on the PE array. The configuration BRAM contains configuration codes for up to four modules. Fig 2 shows the address space of modules and the configuration word structure. The first location of a module configuration space is reserved for specifying the address of PEs involved in the module mapping. Address of PE is written in decoded form say, if a module is to be loaded on PE 0, 1, 2, 4, 7, 8 and 10 then it is written in the address location as This ensures that the corresponding PEs can be reset before they are loaded with the new configuration. Actual PE configuration word starts after the PE reset information. A PE array consists of 16 PEs. Each PE may have three configuration layers: start, mid-
4 dle and end layer. Each configuration layer is programmed to run for a certain number of iterations as a 10 bit value in iteration counter. The iteration counter value is written with the configuration bits in odd numbered locations. The even numbered locations contain the address of PEs for which the configuration is valid. Address Module 1 '1' Reserved[30:16] PE address[15:0] Configuration[31:16] Iteration Cntr[15:0] '1' Reserved[30:16] PE address[15:0] 00XXXXXX1 00XXXXXX0 Configuration[31:16] '0' Reserved[30:16] Iteration Cntr[15:0] PE address[15:0] Module 2 Module 3 Module 4 Fig. 2. Configuration BRAM memory structure Depth of the address space is decided by the worst case in which no two PEs share the same configuration. For this worst case, each PE requires 3 configuration word (start, middle and end layer) and each configuration word along with address occupies two locations leading to a total of 96 (16X3X2) locations, plus one location for the reset information. Hence a total of 97 locations are required. To keep the memory map simple, this depth is extended to the nearest power of 2, making it 128. To mark the end of configuration, we have reserved bit 31 of the even numbered location. The last configuration word is indicated by a 0 at bit 31 in the even numbered location. 4.1 Run Time Reconfiguration for Image Processing Kernel In this section, we will be concentrating on edge detection algorithms that find widespread application in image processing [10,11]. There are a lot of methods for edge detection. But most of them can be grouped into either gradient or Laplacian. Gradient method looks for finding out minima or maxima in the first derivative while Laplacian method looks for zero crossings in the second derivative. We have mapped Sobel and Laplace edge detection algorithms on QUKU. The Sobel and Laplacian masks are shown in Fig. 3. Sobel edge detector uses a pair of 3X3 convolution masks, one estimating the gradient in X direction and the other in Y direction. The approximate gradient magnitude is then obtained by adding the X and Y gradients The Laplacian technique uses a 5X5 convoluted mask to approximate the second derivative in both X and Y directions. The idea of generating gradient from image frame, utilizes the concept of sliding window over the image frame. The mask
5 is slid over an area of the input image. The new value is calculated for the middle pixel, then the mask is slid one pixel to the right. This continues from left to right and top to bottom Sobel X Mask Sobel Y Mask Laplacian Mask Fig. 3. Sobel & Laplacian filter mask for edge detection in image frame 5 Result Analysis For edge detection purpose, we have chosen an image with 336 rows and 341 columns (fig. 4(a)) of pixels. Fig. 4(b) and 4(c) represents the image obtained after applying Laplacian and Sobel edge detection algorithms respectively. Initially, Laplacian kernel was mapped followed by Sobel kernel. Table 1. Result comparison for Laplacian and Sobel kernel implementation on QUKU Application Execution time/frame Configuration time Laplacian Kernel 33.5 ms 105 ns Sobel kernel 54.4 ms 105 ns a) Original Image b) Image filtered using Laplacian algorithm c) Image filtered using Sobel algorithm Fig. 4. Original image and image after processing with Laplacian and Sobel filter
6 6 Conclusion In this paper we showed how QUKU can be dynamically reconfigured at the application level within a few clock cycles. The configuration time was found to be a very small fraction (10-7 ) of the execution time. The preliminary results support the fact that QUKU can be used in real time image processing applications and can be dynamically reconfigured at a very fast speed to switch between different kernels Development of QUKU is still in progress. This first set of experiments is mostly to confirm the APEX design flow, and to establish performance and reconfigurability features. We are yet to develop a Microblaze based soft processor interface to QUKU for system and software control. 7 Acknowledgement This project is proudly supported by the International Science Linkages programme established under the Australian Government s innovation statement, Backing Australia s Ability. References 1. Sunil Shukla, Neil Bergmann, Jürgen Becker, APEX A Coarse Grained Reconfigurable Overlay for FPGA, Proceedings of the IFIP VLSI SoC 2005, pp Sunil Shukla, Neil Bergmann, Jürgen Becker, QUKU: A Two-Level Reconfigurable Architecture, accepted for presentation in ISVLSI Alexandro M. S. Adário, E. L. Roehe and S. Bampi, Dynamically Reconfigurable Architecture for Image Processor Applications, DAC 99, New Orleans, Louisiana 4. Hauser, J. R.; Wawrzyneck J., GARP: A MIPS Processor with a Reconfigurable Coprocessor Proceedings of FCCM (1997), pp A. Tsutsui, T. Miyazaki, YARDS: FPGA/MPU Hybrid Architecture for Telecommunication Data Processing. Proceedings of FPGA 1997, pp M. J. Wirthlin, B. L. Hutchings, DISC: The dynamic instruction set compiler, in FPGAs for Fast Board Development and Reconfigurable Computing, Proc. SPIE 2607, pp (1995) 7. C. Iseli, E. Sanchez, Spyder: A Reconfigurable VLIW processor using FPGAs in proceedings of FCCM, 1993, pp P. Athanas, H. F. Silverman, Processor Reconfiguration through Instruction Set Metamorphosis, IEEE Computer, March 1993, pp S. Guccione, List of FPGA based Computing Machine, online at L. S. Davis, A survey of edge detection techniques, Computer Graphics and Image Processing, vol. 4, no. 3, pp , September V. S. Nalwa and T. O. Binford, On detecting edges, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-8, no. 6, pp , November 1986
FPGA IMPLEMENTATION FOR REAL TIME SOBEL EDGE DETECTOR BLOCK USING 3-LINE BUFFERS
FPGA IMPLEMENTATION FOR REAL TIME SOBEL EDGE DETECTOR BLOCK USING 3-LINE BUFFERS 1 RONNIE O. SERFA JUAN, 2 CHAN SU PARK, 3 HI SEOK KIM, 4 HYEONG WOO CHA 1,2,3,4 CheongJu University E-maul: 1 engr_serfs@yahoo.com,
More informationSession: Configurable Systems. Tailored SoC building using reconfigurable IP blocks
IP 08 Session: Configurable Systems Tailored SoC building using reconfigurable IP blocks Lodewijk T. Smit, Gerard K. Rauwerda, Jochem H. Rutgers, Maciej Portalski and Reinier Kuipers Recore Systems www.recoresystems.com
More informationModeling Arbitrator Delay-Area Dependencies in Customizable Instruction Set Processors
Modeling Arbitrator Delay-Area Dependencies in Customizable Instruction Set Processors Siew-Kei Lam Centre for High Performance Embedded Systems, Nanyang Technological University, Singapore (assklam@ntu.edu.sg)
More informationCoarse Grained Reconfigurable Architecture
Coarse Grained Reconfigurable Architecture Akeem Edwards July 29 2012 Abstract: This paper examines the challenges of mapping applications on to a Coarsegrained reconfigurable architecture (CGRA). Through
More informationReconfigurable Computing. Introduction
Reconfigurable Computing Tony Givargis and Nikil Dutt Introduction! Reconfigurable computing, a new paradigm for system design Post fabrication software personalization for hardware computation Traditionally
More informationHardware Acceleration of Edge Detection Algorithm on FPGAs
Hardware Acceleration of Edge Detection Algorithm on FPGAs Muthukumar Venkatesan and Daggu Venkateshwar Rao Department of Electrical and Computer Engineering University of Nevada Las Vegas. Las Vegas NV
More informationCoarse Grain Reconfigurable Arrays are Signal Processing Engines!
Coarse Grain Reconfigurable Arrays are Signal Processing Engines! Advanced Topics in Telecommunications, Algorithms and Implementation Platforms for Wireless Communications, TLT-9707 Waqar Hussain Researcher
More informationII. LITERATURE SURVEY
Hardware Co-Simulation of Sobel Edge Detection Using FPGA and System Generator Sneha Moon 1, Prof Meena Chavan 2 1,2 Department of Electronics BVUCOE Pune India Abstract: This paper implements an image
More informationManaging Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks
Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks Zhining Huang, Sharad Malik Electrical Engineering Department
More informationCost-and Power Optimized FPGA based System Integration: Methodologies and Integration of a Lo
Cost-and Power Optimized FPGA based System Integration: Methodologies and Integration of a Low-Power Capacity- based Measurement Application on Xilinx FPGAs Abstract The application of Field Programmable
More informationA Partitioning Flow for Accelerating Applications in Processor-FPGA Systems
A Partitioning Flow for Accelerating Applications in Processor-FPGA Systems MICHALIS D. GALANIS 1, GREGORY DIMITROULAKOS 2, COSTAS E. GOUTIS 3 VLSI Design Laboratory, Electrical & Computer Engineering
More informationRUN-TIME RECONFIGURABLE IMPLEMENTATION OF DSP ALGORITHMS USING DISTRIBUTED ARITHMETIC. Zoltan Baruch
RUN-TIME RECONFIGURABLE IMPLEMENTATION OF DSP ALGORITHMS USING DISTRIBUTED ARITHMETIC Zoltan Baruch Computer Science Department, Technical University of Cluj-Napoca, 26-28, Bariţiu St., 3400 Cluj-Napoca,
More informationA Low Power and High Speed MPSOC Architecture for Reconfigurable Application
ISSN (Online) : 2319-8753 ISSN (Print) : 2347-6710 International Journal of Innovative Research in Science, Engineering and Technology Volume 3, Special Issue 3, March 2014 2014 International Conference
More informationHRL: Efficient and Flexible Reconfigurable Logic for Near-Data Processing
HRL: Efficient and Flexible Reconfigurable Logic for Near-Data Processing Mingyu Gao and Christos Kozyrakis Stanford University http://mast.stanford.edu HPCA March 14, 2016 PIM is Coming Back End of Dennard
More informationTowards a Dynamically Reconfigurable System-on-Chip Platform for Video Signal Processing
Towards a Dynamically Reconfigurable System-on-Chip Platform for Video Signal Processing Walter Stechele, Stephan Herrmann, Andreas Herkersdorf Technische Universität München 80290 München Germany Walter.Stechele@ei.tum.de
More informationController Synthesis for Hardware Accelerator Design
ler Synthesis for Hardware Accelerator Design Jiang, Hongtu; Öwall, Viktor 2002 Link to publication Citation for published version (APA): Jiang, H., & Öwall, V. (2002). ler Synthesis for Hardware Accelerator
More informatione-issn: p-issn:
Available online at www.ijiere.com International Journal of Innovative and Emerging Research in Engineering e-issn: 2394-3343 p-issn: 2394-5494 Edge Detection Using Canny Algorithm on FPGA Ms. AASIYA ANJUM1
More informationMemory-Aware Loop Mapping on Coarse-Grained Reconfigurable Architectures
Memory-Aware Loop Mapping on Coarse-Grained Reconfigurable Architectures Abstract: The coarse-grained reconfigurable architectures (CGRAs) are a promising class of architectures with the advantages of
More informationRuntime Adaptation of Application Execution under Thermal and Power Constraints in Massively Parallel Processor Arrays
Runtime Adaptation of Application Execution under Thermal and Power Constraints in Massively Parallel Processor Arrays Éricles Sousa 1, Frank Hannig 1, Jürgen Teich 1, Qingqing Chen 2, and Ulf Schlichtmann
More informationEmbedded Real-Time Video Processing System on FPGA
Embedded Real-Time Video Processing System on FPGA Yahia Said 1, Taoufik Saidani 1, Fethi Smach 2, Mohamed Atri 1, and Hichem Snoussi 3 1 Laboratory of Electronics and Microelectronics (EμE), Faculty of
More informationSystem Verification of Hardware Optimization Based on Edge Detection
Circuits and Systems, 2013, 4, 293-298 http://dx.doi.org/10.4236/cs.2013.43040 Published Online July 2013 (http://www.scirp.org/journal/cs) System Verification of Hardware Optimization Based on Edge Detection
More informationPS2 VGA Peripheral Based Arithmetic Application Using Micro Blaze Processor
PS2 VGA Peripheral Based Arithmetic Application Using Micro Blaze Processor K.Rani Rudramma 1, B.Murali Krihna 2 1 Assosiate Professor,Dept of E.C.E, Lakireddy Bali Reddy Engineering College, Mylavaram
More informationSynthetic Benchmark Generator for the MOLEN Processor
Synthetic Benchmark Generator for the MOLEN Processor Stephan Wong, Guanzhou Luo, and Sorin Cotofana Computer Engineering Laboratory, Electrical Engineering Department, Delft University of Technology,
More informationDesign of Reusable Context Pipelining for Coarse Grained Reconfigurable Architecture
Design of Reusable Context Pipelining for Coarse Grained Reconfigurable Architecture P. Murali 1 (M. Tech), Dr. S. Tamilselvan 2, S. Yazhinian (Research Scholar) 3 1, 2, 3 Dept of Electronics and Communication
More informationPerformance Improvements of Microprocessor Platforms with a Coarse-Grained Reconfigurable Data-Path
Performance Improvements of Microprocessor Platforms with a Coarse-Grained Reconfigurable Data-Path MICHALIS D. GALANIS 1, GREGORY DIMITROULAKOS 2, COSTAS E. GOUTIS 3 VLSI Design Laboratory, Electrical
More informationA Reconfigurable Multifunction Computing Cache Architecture
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 9, NO. 4, AUGUST 2001 509 A Reconfigurable Multifunction Computing Cache Architecture Huesung Kim, Student Member, IEEE, Arun K. Somani,
More informationAccelerating DSP Applications in Embedded Systems with a Coprocessor Data-Path
Accelerating DSP Applications in Embedded Systems with a Coprocessor Data-Path Michalis D. Galanis, Gregory Dimitroulakos, and Costas E. Goutis VLSI Design Laboratory, Electrical and Computer Engineering
More informationThe S6000 Family of Processors
The S6000 Family of Processors Today s Design Challenges The advent of software configurable processors In recent years, the widespread adoption of digital technologies has revolutionized the way in which
More informationEfficient Interfacing of Partially Reconfigurable Instruction Set Extensions for Softcore CPUs on FPGAs
04-Dirk-AF:04-Dirk-AF 8/19/11 6:17 AM Page 35 Efficient Interfacing of Partially Reconfigurable Instruction Set Extensions for Softcore CPUs on FPGAs Dirk Koch 1, Christian Beckhoff 2, and Jim Torresen
More informationChapter 5: ASICs Vs. PLDs
Chapter 5: ASICs Vs. PLDs 5.1 Introduction A general definition of the term Application Specific Integrated Circuit (ASIC) is virtually every type of chip that is designed to perform a dedicated task.
More informationIssues in FPGA-based Configurable Computer Design for Operating Systems and Multimedia Applications
Issues in -based Configurable Computer Design for Operating Systems and Multimedia Applications 3UHGUDJ.QHåHYLü%RåLGDU5DGXQRYLü$EREDNHU(OKRXQL Department of Computer Engineering School of Electrical Engineering
More informationEdge Detection. Today s reading. Cipolla & Gee on edge detection (available online) From Sandlot Science
Edge Detection From Sandlot Science Today s reading Cipolla & Gee on edge detection (available online) Project 1a assigned last Friday due this Friday Last time: Cross-correlation Let be the image, be
More informationA Distributed Canny Edge Detector and Its Implementation on FPGA
A Distributed Canny Edge Detector and Its Implementation on FPGA 1, Chandrashekar N.S., 2, Dr. K.R. Nataraj 1, Department of ECE, Don Bosco Institute of Technology, Bangalore. 2, Department of ECE, SJB
More informationDynamically Programmable Cache for Multimedia Applications
International Journal of Applied Engineering Research ISSN 0973-4562 Vol.1 No.1 (2006) pp. 23-35 Research India Publications http://www.ripublication.com/ijaer.htm Dynamically Programmable Cache for Multimedia
More informationInternational Journal of Computer Sciences and Engineering. Research Paper Volume-6, Issue-2 E-ISSN:
International Journal of Computer Sciences and Engineering Open Access Research Paper Volume-6, Issue-2 E-ISSN: 2347-2693 Implementation Sobel Edge Detector on FPGA S. Nandy 1*, B. Datta 2, D. Datta 3
More informationThe Use of Field-Programmable Gate Arrays for the
VLSI DESIGN 1996, Vol. 4, No. 2, pp. 135-139 Reprints available directly from the publisher Photocopying permitted by license only (C) 1996 OPA (Overseas Publishers Association) Amsterdam B.V. Published
More informationHardware/Software Co-design
Hardware/Software Co-design Zebo Peng, Department of Computer and Information Science (IDA) Linköping University Course page: http://www.ida.liu.se/~petel/codesign/ 1 of 52 Lecture 1/2: Outline : an Introduction
More informationIMPLEMENTATION OF DISTRIBUTED CANNY EDGE DETECTOR ON FPGA
IMPLEMENTATION OF DISTRIBUTED CANNY EDGE DETECTOR ON FPGA T. Rupalatha 1, Mr.C.Leelamohan 2, Mrs.M.Sreelakshmi 3 P.G. Student, Department of ECE, C R Engineering College, Tirupati, India 1 Associate Professor,
More informationCARUSO Project Goals and Principal Approach
CARUSO Project Goals and Principal Approach Uwe Brinkschulte *, Jürgen Becker #, Klaus Dorfmüller-Ulhaas +, Ralf König #, Sascha Uhrig +, and Theo Ungerer + * Department of Computer Science, University
More informationHardware Software Codesign of Embedded Systems
Hardware Software Codesign of Embedded Systems Rabi Mahapatra Texas A&M University Today s topics Course Organization Introduction to HS-CODES Codesign Motivation Some Issues on Codesign of Embedded System
More informationHardware Description of Multi-Directional Fast Sobel Edge Detection Processor by VHDL for Implementing on FPGA
Hardware Description of Multi-Directional Fast Sobel Edge Detection Processor by VHDL for Implementing on FPGA Arash Nosrat Faculty of Engineering Shahid Chamran University Ahvaz, Iran Yousef S. Kavian
More informationHardware/Software Partitioning and Minimizing Memory Interface Traffic
Hardware/Software Partitioning and Minimizing Memory Interface Traffic Axel Jantsch, Peeter Ellervee, Johnny Öberg, Ahmed Hemani and Hannu Tenhunen Electronic System Design Laboratory, Royal Institute
More informationDigital Electronics 27. Digital System Design using PLDs
1 Module -27 Digital System Design 1. Introduction 2. Digital System Design 2.1 Standard (Fixed function) ICs based approach 2.2 Programmable ICs based approach 3. Comparison of Digital System Design approaches
More informationDesign of Feature Extraction Circuit for Speech Recognition Applications
Design of Feature Extraction Circuit for Speech Recognition Applications SaambhaviVB, SSSPRao and PRajalakshmi Indian Institute of Technology Hyderabad Email: ee10m09@iithacin Email: sssprao@cmcltdcom
More informationFast dynamic and partial reconfiguration Data Path
Fast dynamic and partial reconfiguration Data Path with low Michael Hübner 1, Diana Göhringer 2, Juanjo Noguera 3, Jürgen Becker 1 1 Karlsruhe Institute t of Technology (KIT), Germany 2 Fraunhofer IOSB,
More informationAnalysis of Image and Video Using Color, Texture and Shape Features for Object Identification
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 16, Issue 6, Ver. VI (Nov Dec. 2014), PP 29-33 Analysis of Image and Video Using Color, Texture and Shape Features
More informationISSN Vol.03, Issue.02, March-2015, Pages:
ISSN 2322-0929 Vol.03, Issue.02, March-2015, Pages:0122-0126 www.ijvdcs.org Design and Simulation Five Port Router using Verilog HDL CH.KARTHIK 1, R.S.UMA SUSEELA 2 1 PG Scholar, Dept of VLSI, Gokaraju
More informationIs There A Tradeoff Between Programmability and Performance?
Is There A Tradeoff Between Programmability and Performance? Robert Halstead Jason Villarreal Jacquard Computing, Inc. Roger Moussalli Walid Najjar Abstract While the computational power of Field Programmable
More informationThe Design of Sobel Edge Extraction System on FPGA
The Design of Sobel Edge Extraction System on FPGA Yu ZHENG 1, * 1 School of software, Beijing University of technology, Beijing 100124, China; Abstract. Edge is a basic feature of an image, the purpose
More informationEmbedded Systems: Hardware Components (part I) Todor Stefanov
Embedded Systems: Hardware Components (part I) Todor Stefanov Leiden Embedded Research Center Leiden Institute of Advanced Computer Science Leiden University, The Netherlands Outline Generic Embedded System
More informationA Methodology for Energy Efficient FPGA Designs Using Malleable Algorithms
A Methodology for Energy Efficient FPGA Designs Using Malleable Algorithms Jingzhao Ou and Viktor K. Prasanna Department of Electrical Engineering, University of Southern California Los Angeles, California,
More informationA Coprocessor for Accelerating Visual Information Processing
A Coprocessor for Accelerating Visual Information Processing W. Stechele*), L. Alvado Cárcel**), S. Herrmann*), J. Lidón Simón**) *) Technische Universität München **) Universidad Politecnica de Valencia
More informationIntegrating MRPSOC with multigrain parallelism for improvement of performance
Integrating MRPSOC with multigrain parallelism for improvement of performance 1 Swathi S T, 2 Kavitha V 1 PG Student [VLSI], Dept. of ECE, CMRIT, Bangalore, Karnataka, India 2 Ph.D Scholar, Jain University,
More informationA Reconfigurable Functional Unit for an Adaptive Extensible Processor
A Reconfigurable Functional Unit for an Adaptive Extensible Processor Hamid Noori Farhad Mehdipour Kazuaki Murakami Koji Inoue and Morteza SahebZamani Department of Informatics, Graduate School of Information
More informationDigital Integrated Circuits
Digital Integrated Circuits Lecture 9 Jaeyong Chung Robust Systems Laboratory Incheon National University DIGITAL DESIGN FLOW Chung EPC6055 2 FPGA vs. ASIC FPGA (A programmable Logic Device) Faster time-to-market
More informationEECS150 - Digital Design Lecture 13 - Accelerators. Recap and Outline
EECS150 - Digital Design Lecture 13 - Accelerators Oct. 10, 2013 Prof. Ronald Fearing Electrical Engineering and Computer Sciences University of California, Berkeley (slides courtesy of Prof. John Wawrzynek)
More informationDesign and Verification Point-to-Point Architecture of WISHBONE Bus for System-on-Chip
International Journal of Emerging Engineering Research and Technology Volume 2, Issue 2, May 2014, PP 155-159 Design and Verification Point-to-Point Architecture of WISHBONE Bus for System-on-Chip Chandrala
More informationSystems Development Tools for Embedded Systems and SOC s
Systems Development Tools for Embedded Systems and SOC s Óscar R. Ribeiro Departamento de Informática, Universidade do Minho 4710 057 Braga, Portugal oscar.rafael@di.uminho.pt Abstract. A new approach
More informationCoarse-grained reconfigurable architecture with hierarchical context cache structure and management approach
LETTER IEICE Electronics Express, Vol.14, No.6, 1 10 Coarse-grained reconfigurable architecture with hierarchical context cache structure and management approach Chao Wang, Peng Cao, Bo Liu, and Jun Yang
More informationInterfacing a High Speed Crypto Accelerator to an Embedded CPU
Interfacing a High Speed Crypto Accelerator to an Embedded CPU Alireza Hodjat ahodjat @ee.ucla.edu Electrical Engineering Department University of California, Los Angeles Ingrid Verbauwhede ingrid @ee.ucla.edu
More informationA Pipelined Fast 2D-DCT Accelerator for FPGA-based SoCs
A Pipelined Fast 2D-DCT Accelerator for FPGA-based SoCs Antonino Tumeo, Matteo Monchiero, Gianluca Palermo, Fabrizio Ferrandi, Donatella Sciuto Politecnico di Milano, Dipartimento di Elettronica e Informazione
More informationEfficient Self-Reconfigurable Implementations Using On-Chip Memory
10th International Conference on Field Programmable Logic and Applications, August 2000. Efficient Self-Reconfigurable Implementations Using On-Chip Memory Sameer Wadhwa and Andreas Dandalis University
More informationDesign and Implementation of Programmable Logic Controller (PLC) Using System on Programmable Chip (SOPC)
Design and Implementation of Programmable Logic Controller (PLC) Using System on Programmable Chip (SOPC) Vilas V Deotare 1, Arjun J. Khetia 2, Dinesh V Padole 3 Professor, Dept. of Electronics and Telecommunication,
More informationhigh performance medical reconstruction using stream programming paradigms
high performance medical reconstruction using stream programming paradigms This Paper describes the implementation and results of CT reconstruction using Filtered Back Projection on various stream programming
More informationA VARIETY OF ICS ARE POSSIBLE DESIGNING FPGAS & ASICS. APPLICATIONS MAY USE STANDARD ICs or FPGAs/ASICs FAB FOUNDRIES COST BILLIONS
architecture behavior of control is if left_paddle then n_state
More informationDesign and Implementation of Buffer Loan Algorithm for BiNoC Router
Design and Implementation of Buffer Loan Algorithm for BiNoC Router Deepa S Dev Student, Department of Electronics and Communication, Sree Buddha College of Engineering, University of Kerala, Kerala, India
More informationA Study of the Speedups and Competitiveness of FPGA Soft Processor Cores using Dynamic Hardware/Software Partitioning
A Study of the Speedups and Competitiveness of FPGA Soft Processor Cores using Dynamic Hardware/Software Partitioning By: Roman Lysecky and Frank Vahid Presented By: Anton Kiriwas Disclaimer This specific
More informationRECONFIGURABLE computing (RC) [5] is an interesting
730 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 14, NO. 7, JULY 2006 System-Level Power-Performance Tradeoffs for Reconfigurable Computing Juanjo Noguera and Rosa M. Badia Abstract
More informationImplementation of a Unified DSP Coprocessor
Vol. (), Jan,, pp 3-43, ISS: 35-543 Implementation of a Unified DSP Coprocessor Mojdeh Mahdavi Department of Electronics, Shahr-e-Qods Branch, Islamic Azad University, Tehran, Iran *Corresponding author's
More informationProgram-Driven Fine-Grained Power Management for the Reconfigurable Mesh
Program-Driven Fine-Grained Power Management for the Reconfigurable Mesh Heiner Giefers, Marco Platzner Computer Engineering Group University of Paderborn {hgiefers, platzner}@upb.de Outline 1. Introduction
More informationFPGA Design Challenge :Techkriti 14 Digital Design using Verilog Part 1
FPGA Design Challenge :Techkriti 14 Digital Design using Verilog Part 1 Anurag Dwivedi Digital Design : Bottom Up Approach Basic Block - Gates Digital Design : Bottom Up Approach Gates -> Flip Flops Digital
More informationFPGA-Based Rapid Prototyping of Digital Signal Processing Systems
FPGA-Based Rapid Prototyping of Digital Signal Processing Systems Kevin Banovic, Mohammed A. S. Khalid, and Esam Abdel-Raheem Presented By Kevin Banovic July 29, 2005 To be presented at the 48 th Midwest
More informationDesign and Implementation of Low Complexity Router for 2D Mesh Topology using FPGA
Design and Implementation of Low Complexity Router for 2D Mesh Topology using FPGA Maheswari Murali * and Seetharaman Gopalakrishnan # * Assistant professor, J. J. College of Engineering and Technology,
More informationCOPROCESSOR APPROACH TO ACCELERATING MULTIMEDIA APPLICATION [CLAUDIO BRUNELLI, JARI NURMI ] Processor Design
COPROCESSOR APPROACH TO ACCELERATING MULTIMEDIA APPLICATION [CLAUDIO BRUNELLI, JARI NURMI ] Processor Design Lecture Objectives Background Need for Accelerator Accelerators and different type of parallelizm
More informationA Complete Data Scheduler for Multi-Context Reconfigurable Architectures
A Complete Data Scheduler for Multi-Context Reconfigurable Architectures M. Sanchez-Elez, M. Fernandez, R. Maestre, R. Hermida, N. Bagherzadeh, F. J. Kurdahi Departamento de Arquitectura de Computadores
More informationMemory Systems IRAM. Principle of IRAM
Memory Systems 165 other devices of the module will be in the Standby state (which is the primary state of all RDRAM devices) or another state with low-power consumption. The RDRAM devices provide several
More informationXPU A Programmable FPGA Accelerator for Diverse Workloads
XPU A Programmable FPGA Accelerator for Diverse Workloads Jian Ouyang, 1 (ouyangjian@baidu.com) Ephrem Wu, 2 Jing Wang, 1 Yupeng Li, 1 Hanlin Xie 1 1 Baidu, Inc. 2 Xilinx Outlines Background - FPGA for
More informationRED: A Reconfigurable Datapath
RED: A Reconfigurable Datapath Fernando Rincón, José M. Moya, Juan Carlos López Universidad de Castilla-La Mancha Departamento de Informática {frincon,fmoya,lopez}@inf-cr.uclm.es Abstract The popularity
More informationDesign and Implementation of Effective Architecture for DCT with Reduced Multipliers
Design and Implementation of Effective Architecture for DCT with Reduced Multipliers Susmitha. Remmanapudi & Panguluri Sindhura Dept. of Electronics and Communications Engineering, SVECW Bhimavaram, Andhra
More informationOverview of ROCCC 2.0
Overview of ROCCC 2.0 Walid Najjar and Jason Villarreal SUMMARY FPGAs have been shown to be powerful platforms for hardware code acceleration. However, their poor programmability is the main impediment
More informationImage gradients and edges
Image gradients and edges April 7 th, 2015 Yong Jae Lee UC Davis Announcements PS0 due this Friday Questions? 2 Last time Image formation Linear filters and convolution useful for Image smoothing, removing
More informationImage gradients and edges April 10 th, 2018
Image gradients and edges April th, 28 Yong Jae Lee UC Davis PS due this Friday Announcements Questions? 2 Last time Image formation Linear filters and convolution useful for Image smoothing, removing
More informationMatrix Manipulation Using High Computing Field Programmable Gate Arrays
Matrix Manipulation Using High Computing Field Programmable Gate Arrays 1 Mr.Rounak R. Gupta, 2 Prof. Atul S. Joshi Department of Electronics and Telecommunication Engineering, Sipna College of Engineering
More informationDesign of a Pipelined 32 Bit MIPS Processor with Floating Point Unit
Design of a Pipelined 32 Bit MIPS Processor with Floating Point Unit P Ajith Kumar 1, M Vijaya Lakshmi 2 P.G. Student, Department of Electronics and Communication Engineering, St.Martin s Engineering College,
More informationA Massively Parallel Virtual Machine for. SIMD Architectures
Advanced Studies in Theoretical Physics Vol. 9, 15, no. 5, 37-3 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/1.19/astp.15.519 A Massively Parallel Virtual Machine for SIMD Architectures M. Youssfi and
More informationRFU Based Computational Unit Design For Reconfigurable Processors
RFU Based Computational Unit Design For Reconfigurable Processors M. Aqeel Iqbal Faculty of Engineering and Information Technology Foundation University, Institute of Engineering and Management Sciences
More informationFPGA IMPLEMENTATION OF FLOATING POINT ADDER AND MULTIPLIER UNDER ROUND TO NEAREST
FPGA IMPLEMENTATION OF FLOATING POINT ADDER AND MULTIPLIER UNDER ROUND TO NEAREST SAKTHIVEL Assistant Professor, Department of ECE, Coimbatore Institute of Engineering and Technology Abstract- FPGA is
More informationChapter 1 Overview of Digital Systems Design
Chapter 1 Overview of Digital Systems Design SKEE2263 Digital Systems Mun im/ismahani/izam {munim@utm.my,e-izam@utm.my,ismahani@fke.utm.my} February 8, 2017 Why Digital Design? Many times, microcontrollers
More informationFPGA CAN BE IMPLEMENTED BY USING ADVANCED ENCRYPTION STANDARD ALGORITHM
FPGA CAN BE IMPLEMENTED BY USING ADVANCED ENCRYPTION STANDARD ALGORITHM P. Aatheeswaran 1, Dr.R.Suresh Babu 2 PG Scholar, Department of ECE, Jaya Engineering College, Chennai, Tamilnadu, India 1 Associate
More informationReconfigurable Computing
Reconfigurable Computing Kees A. Vissers CTO Chameleon Systems, Inc. kees@cmln.com Research Fellow, UC Berkeley vissers@eecs.berkeley.edu www.eecs.berkeley.edu/~vissers MP-SOC, July, 2002 Silicon, 20 years
More informationIntroduction to reconfigurable systems
Introduction to reconfigurable systems Reconfigurable system (RS)= any system whose sub-system configurations can be changed or modified after fabrication Reconfigurable computing (RC) is commonly used
More informationSignal Processing Algorithms into Fixed Point FPGA Hardware Dennis Silage ECE Temple University
Signal Processing Algorithms into Fixed Point FPGA Hardware Dennis Silage silage@temple.edu ECE Temple University www.temple.edu/scdl Signal Processing Algorithms into Fixed Point FPGA Hardware Motivation
More informationEmbedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institute of Technology, Delhi. Lecture - 10 System on Chip (SOC)
Embedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institute of Technology, Delhi Lecture - 10 System on Chip (SOC) In the last class, we had discussed digital signal processors.
More informationDSP Co-Processing in FPGAs: Embedding High-Performance, Low-Cost DSP Functions
White Paper: Spartan-3 FPGAs WP212 (v1.0) March 18, 2004 DSP Co-Processing in FPGAs: Embedding High-Performance, Low-Cost DSP Functions By: Steve Zack, Signal Processing Engineer Suhel Dhanani, Senior
More informationFPGA Based Intelligent Co-operative Processor in Memory Architecture
FPGA Based Intelligent Co-operative Processor in Memory Architecture Zaki Ahmad 1, Reza Sotudeh 2, D. M. Akbar Hussain 3, Shahab-ud-din 4 1 Pakistan Institute of Laser and Optics, Rawalpindi, Pakistan
More informationISSN Vol.05,Issue.09, September-2017, Pages:
WWW.IJITECH.ORG ISSN 2321-8665 Vol.05,Issue.09, September-2017, Pages:1693-1697 AJJAM PUSHPA 1, C. H. RAMA MOHAN 2 1 PG Scholar, Dept of ECE(DECS), Shirdi Sai Institute of Science and Technology, Anantapuramu,
More informationDesign and Analysis of Kogge-Stone and Han-Carlson Adders in 130nm CMOS Technology
Design and Analysis of Kogge-Stone and Han-Carlson Adders in 130nm CMOS Technology Senthil Ganesh R & R. Kalaimathi 1 Assistant Professor, Electronics and Communication Engineering, Info Institute of Engineering,
More informationSimulation & Synthesis of FPGA Based & Resource Efficient Matrix Coprocessor Architecture
Simulation & Synthesis of FPGA Based & Resource Efficient Matrix Coprocessor Architecture Jai Prakash Mishra 1, Mukesh Maheshwari 2 1 M.Tech Scholar, Electronics & Communication Engineering, JNU Jaipur,
More informationARM Processors for Embedded Applications
ARM Processors for Embedded Applications Roadmap for ARM Processors ARM Architecture Basics ARM Families AMBA Architecture 1 Current ARM Core Families ARM7: Hard cores and Soft cores Cache with MPU or
More informationThe Xilinx XC6200 chip, the software tools and the board development tools
The Xilinx XC6200 chip, the software tools and the board development tools What is an FPGA? Field Programmable Gate Array Fully programmable alternative to a customized chip Used to implement functions
More information