A Pipelined Fast 2D-DCT Accelerator for FPGA-based SoCs
Antonino Tumeo, Matteo Monchiero, Gianluca Palermo, Fabrizio Ferrandi, Donatella Sciuto
Politecnico di Milano, Dipartimento di Elettronica e Informazione
Via Ponzio 34/ Milano, Italy
E-mails: {tumeo, monchier, gpalermo, ferrandi, sciuto}@elet.polimi.it

Abstract

Multimedia applications, and in particular the encoding and decoding of standard image and video formats, are a typical target for Systems-on-Chip (SoC). The bi-dimensional Discrete Cosine Transform (2D-DCT) is a commonly used frequency transformation in graphic compression algorithms. Many hardware implementations, adopting disparate algorithms, have been proposed for Field Programmable Gate Arrays (FPGA). These designs focus either on performance or on area, and often do not succeed in balancing the two aspects. In this paper, we present the design of a fast 2D-DCT hardware accelerator for an FPGA-based SoC. The accelerator uses a single seven-stage 1D-DCT pipeline able to alternate the computation of the even and odd coefficients in every cycle. In addition, it uses special memories to perform the transpose operations. Our hardware takes 80 clock cycles at 107 MHz to generate a complete 8x8 2D-DCT, from the writing of the first input sample to the reading of the last result (including the overhead of the interface logic). We show that this architecture provides an optimal performance/area ratio with respect to several alternative designs.

1 Introduction

Reconfigurable platforms have recently emerged as an important alternative to ASIC design, offering significant flexibility and time-to-market improvements with respect to the conventional digital design flow [1]. In this context, several toolchains for the design and prototyping of Systems-on-Chip (SoC) have been presented [2, 3].
These tools make it possible to rapidly create systems composed of hard and soft core processors and a set of standard IP cores that interface with internal and external peripherals. In addition, the system can be tailored to the target application by including ad hoc coprocessors that accelerate the critical kernels.

This paper presents a novel hardware architecture for a fast 2D Discrete Cosine Transform accelerator. The basic idea is to exploit the symmetries of the algorithm to save area while still ensuring high performance. The architecture is targeted to work as a hardware accelerator for the Xilinx MicroBlaze soft core processor, and builds on the specifications of the connection with the processor to further optimize its operations. This design is a component of a complete HW/SW implementation of the JPEG encoding algorithm. The 2D-DCT is one of the most computationally intensive phases of the encoding process, and its acceleration noticeably reduces the overall execution time of the application.

The structure of this paper is the following. Section 2 discusses some related works. The 2D-DCT and the Fast DCT algorithm are briefly discussed in Section 3. The proposed architecture is described in Section 4. Results are discussed in Section 5. Finally, Section 6 concludes the paper.

2 Related Work

Several works proposing the architecture and high-level design of 2D-DCT cores have appeared. Xilinx [4] and Altera [5] offer, in their libraries, specific cores optimized for their programmable devices in terms of occupation. Nevertheless, these cores feature relatively low performance and, furthermore, they are not easy to integrate in System-on-Chip designs realized with the vendors' own toolchains.

F(u, v) = \Lambda(u)\,\Lambda(v) \sum_{i=0}^{7} \sum_{j=0}^{7} \cos\left[\frac{(2i+1)u\pi}{16}\right] \cos\left[\frac{(2j+1)v\pi}{16}\right] f(i, j)    (1)

\Lambda(k) = \begin{cases} \frac{1}{\sqrt{2}} & \text{if } k = 0 \\ 1 & \text{otherwise} \end{cases}    (2)

Figure 1. Equations for the 2D-DCT

Many custom designs for FPGA have also been presented. Among them, Trainor et al. [6] propose an architecture with distributed arithmetic that exploits parallelism and pipelining. Agostini et al. [7] propose a 2D-DCT architecture based on the previous work of Kovac et al. [8]. Thanks to the separability property, the authors decompose the transform into two 1D-DCT calculations with a transpose buffer. This design is based on the Fast DCT algorithm. It uses a six-stage Wallace tree multiplier that decomposes each multiplication into shift and add operations. Nevertheless, since multipliers are nowadays embedded in FPGAs, this approach is no longer effective in reducing occupation. The global latency of the 2D-DCT is 160 clock cycles, and a complete 8x8 matrix is processed every 64 clock cycles. Our proposal is loosely inspired by this work. Nevertheless, we propose several optimizations that achieve important advantages in terms of area and performance. In addition, Agostini's design is conceived for a fully HW implementation of the JPEG encoder, whereas our work targets a mixed HW/SW design, stressing the role of the interfaces to and from the processor. Yusof et al. [9] present a similar DCT architecture, integrated in a complex SoC targeted at image encoding. Finally, Bukhari et al. [10] present an architecture that implements a modified Loeffler algorithm (resulting in a faster but significantly larger implementation w.r.t. our proposal).
In addition, the authors show how the occupation of the accelerators can vary greatly when implemented on FPGAs from different vendors.

3 2D-DCT Overview

The DCT is a frequency transformation commonly adopted in compression algorithms, since it concentrates most of the information in a few low-frequency coefficients. Slightly different definitions of the transform exist. The bi-dimensional version, in its most widely used form for a block of 8x8 input samples, is shown in Figure 1.

This equation has a high computational complexity. For instance, an 8x8 block requires 4096 multiplications and 4096 additions. Many optimizations have been proposed and, among them, the Fast DCT has been widely adopted in the field of image compression. According to the Fast DCT algorithm, since the cosines depend only on the position of the samples in the 8x8 block, their values can be precomputed and the transform can be rewritten as a matrix multiplication in which the last matrix is the transpose of the first: T = C X C^T, where C is the matrix of the cosine values and X is the 8x8 block of input samples.

In addition, since the 2D-DCT is a separable operation, it can be computed by applying a 1D-DCT in one dimension (row-wise) and then applying another 1D-DCT to the results in the other dimension (column-wise). This decomposition reduces the complexity of the calculation by a factor of four. Applying both the 1D decomposition and the Fast DCT algorithm, only 80 multiplications and 464 additions are needed to compute the 2D-DCT of an 8x8 block, since each 1D-DCT on a vector of 8 elements requires 29 sums or subtractions and 5 multiplications, and 16 such 1D-DCTs are needed.

It is important to stress that the result of the Fast DCT algorithm is scaled. In the JPEG algorithm, for example, the scaling is corrected in the quantization phase, where the correction can be performed in one step with the quantization itself.
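The operation counts quoted above can be checked with a short Python sketch. It is only an illustration, not the authors' implementation: the function names are hypothetical, the transforms are the naive (non-fast) ones, the scaling uses only the Λ(k) factor of Figure 1 with no further normalization (an assumption), and the Fast DCT totals are obtained purely from the per-vector figures given in the text (5 multiplications and 29 additions per 8-point 1D-DCT).

```python
import math

N = 8  # block size used throughout the paper


def scale(k):
    """Lambda(k) from Figure 1: 1/sqrt(2) for k == 0, otherwise 1."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0


def dct_1d(vec):
    """Naive 8-point 1D-DCT; returns (coefficients, multiplications by cosines)."""
    out, mults = [], 0
    for u in range(N):
        acc = 0.0
        for i in range(N):
            acc += math.cos((2 * i + 1) * u * math.pi / 16) * vec[i]
            mults += 1
        out.append(scale(u) * acc)  # Lambda scaling is not counted
    return out, mults


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct_2d_direct(block):
    """Evaluate the double sum of Figure 1 term by term."""
    out = [[0.0] * N for _ in range(N)]
    mults = 0
    for u in range(N):
        for v in range(N):
            acc = 0.0
            for i in range(N):
                for j in range(N):
                    # The cosine product stands for one precomputed coefficient.
                    coeff = (math.cos((2 * i + 1) * u * math.pi / 16)
                             * math.cos((2 * j + 1) * v * math.pi / 16))
                    acc += coeff * block[i][j]
                    mults += 1
            out[u][v] = scale(u) * scale(v) * acc
    return out, mults


def dct_2d_separable(block):
    """Row-wise 1D-DCT, transpose, column-wise 1D-DCT, transpose back."""
    mults = 0
    row_pass = []
    for row in block:
        coeffs, m = dct_1d(row)
        row_pass.append(coeffs)
        mults += m
    col_pass = []
    for col in transpose(row_pass):
        coeffs, m = dct_1d(col)
        col_pass.append(coeffs)
        mults += m
    return transpose(col_pass), mults


# Arbitrary test block of 8-bit values (hypothetical data, for illustration only).
block = [[(13 * i + 7 * j) % 256 for j in range(N)] for i in range(N)]
direct, direct_mults = dct_2d_direct(block)
separable, separable_mults = dct_2d_separable(block)
max_diff = max(abs(direct[u][v] - separable[u][v])
               for u in range(N) for v in range(N))

# Fast DCT totals: 16 one-dimensional transforms (8 rows + 8 columns).
fast_mults = 2 * N * 5
fast_adds = 2 * N * 29
print(direct_mults, separable_mults, fast_mults, fast_adds)  # 4096 1024 80 464
```

The direct evaluation uses 4096 multiplications and the row-column version 1024, which is the factor of four mentioned in the text, while both produce the same coefficients up to floating-point rounding.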
4 Architecture

The decomposition into two 1D computations leads to an architecture composed of two pipelined 1D architectures and an intermediate buffer for the transposition, as proposed in [7]. Nevertheless, this solution is not area efficient, since each 1D pipeline performs exactly the same operations. In addition, to allow the use of a global 2D-DCT pipeline, a special transpose buffer must be designed, since the first DCT produces row results and the second DCT needs column values as input. This memory should have ping-pong features (see footnote 1) so that the first 1D architecture can write new values while the second 1D architecture reads the previous ones. This leads to even more space occupation on the FPGA. In particular, if the latency is critical, these memories cannot be implemented with internal BRAMs and must be implemented as registers, which take a lot of logic cells. The solution proposed in [7], which uses BRAMs, has a latency of 64 cycles to generate a full transposed matrix. BRAMs can also become a limiting factor, in particular if the 2D-DCT architecture needs to be integrated in a System-on-Chip with soft core processors, which need the BRAMs as fast data and instruction memories.

Our architecture has been designed considering that the resulting accelerator should be connected to a soft core processor, the MicroBlaze [11] from Xilinx. Our DCT module is meant to be part of a complete System-on-Chip that performs image encoding. Thanks to the Fast Simplex Links (FSL) [12], the MicroBlaze can be connected to application-specific hardware accelerators using a point-to-point communication protocol via master and slave ports. Each communication primitive can transmit 32 bits from the register file of the MicroBlaze to the accelerator and vice versa. Since the values of the input samples in image compression are constrained to a range covered by 8 bits, a single FSL command can transmit up to 4 values per cycle.
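As an illustration of the packing implied above, the following Python sketch groups four 8-bit samples into one 32-bit word, so that an 8x8 block fits in 16 transfers. The function names are hypothetical, and both the byte order (first sample in the most significant byte) and the unsigned 0..255 representation are assumptions; the text only states that a 32-bit transfer can carry up to four 8-bit values.

```python
def pack4(samples):
    """Pack four 8-bit samples into one 32-bit word.

    Assumption: the first sample goes into the most significant byte.
    """
    if len(samples) != 4 or any(not 0 <= s <= 0xFF for s in samples):
        raise ValueError("expected four values in the range 0..255")
    word = 0
    for s in samples:
        word = (word << 8) | s
    return word


def unpack4(word):
    """Inverse of pack4: recover the four 8-bit samples from a 32-bit word."""
    return [(word >> shift) & 0xFF for shift in (24, 16, 8, 0)]


def words_for_block(block):
    """Flatten an 8x8 block row by row and pack it into 32-bit words."""
    flat = [s for row in block for s in row]
    return [pack4(flat[k:k + 4]) for k in range(0, len(flat), 4)]


# Hypothetical 8x8 block of 8-bit samples, for illustration only.
block = [[(8 * i + j) & 0xFF for j in range(8)] for i in range(8)]
words = words_for_block(block)
print(len(words))             # 16 words for 64 samples
print(hex(pack4([0x12, 0x34, 0x56, 0x78])))  # 0x12345678
```

With this grouping, each row of eight samples arrives as two consecutive words, which matches the need, described in the implementation section, to hold two groups of four values before a full input vector is available.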
The next section provides more details on the architecture implementation.

Footnote 1: We call ping-pong memory a memory interposed between two blocks (A and B) that can alternately be written by A and read by B, or written by B and read by A.

Figure 2. The 2D-DCT architecture with a single 1D-DCT component

4.1 Implementation

We decided to implement an architecture that uses a single 1D-DCT pipeline, fed by a master FSL port, and a transpose memory that, as soon as the first mono-dimensional transformation has been completed, feeds the transposed results back to the same pipeline. By giving up the option of a global 2D-DCT pipeline (as in [7]), we could implement this memory as a simple memory that is written by rows and read by columns. Then, the second 1D-DCT is performed, and the final results are stored in a secondary buffer before being transposed again and output to the slave FSL. Figure 2 shows an overview of the architecture.

As explained before, a single pipeline would require 29 sums/subtractions and 5 multiplications. Observing that the odd and even coefficients of the resulting 8-sample transformed vector require different types of computations, we organized the pipeline in seven stages. In this way, we reduced the number of adders/subtractors to 19 and the number of multipliers to 4. This means that the pipeline alternates, each cycle, the values needed to compute the odd and the even coefficients of the resulting vector. The organization of our seven-stage pipeline is shown in Figure 3.

The FSL connection can feed four 8-bit values per cycle, and all the input samples (8 values) are needed for both the odd and the even output samples. For these reasons, we implemented a pseudo ping-pong buffer (now at the input) partitioned in two parts of four values each, in order to hold the same values for two consecutive clock cycles. It is also important to stress that the DCT extends the range of the output values.
Thus, the initial 8-bit values become, at the end of a 1D-DCT, values that are valid on 16 bits. However, in order not to lose precision across the multiple passes of a 2D-DCT, it is important to represent the intermediate results between the first and the second 1D-DCT in a fixed-point format with at least 24 bits (8 bits for the fractional part). Our 1D-DCT pipeline accounts for this. Each computation is performed at 24-bit precision, and the transpose memory stores 24-bit values. The final results buffer stores, instead, only the integer part of the numbers in 16-bit format. Therefore, the effective output rate of the complete 2D-DCT is two 16-bit values per clock cycle.

Figure 3. The seven stages of the 1D-DCT pipeline, with 19 adders/subtractors and 4 multipliers. Note that the latches between stages are not drawn, in order to show how the different functional units are connected.

4.2 Interfaces

The input logic starts receiving data from the processor master port and feeds the ping-pong buffer and the pipeline as soon as the first group of four samples is available. The output logic waits until the full 8x8 block has completed the two 1D phases and the result has been stored in the memory. Then, it starts sending the results to the processor, grouped as two 16-bit values per transfer. The MicroBlaze, which after sending the input samples is waiting on a blocking read (MicroBlaze block read), finally starts reading the results.

Resource            Used    Available    Utilization
Slices                                   %
Slice Flip Flops                         %
4-input LUTs                             %

Table 1. Resource utilization of the Optimized Fast 2D-DCT hardware accelerator on the Xilinx XC2VP30 FPGA

From the loading of the first group of four input samples to the reading of the last group of two results, the IP core takes 80 cycles. Of these, 48 cycles are used to manage the interfaces and the ping-pong buffer, while 32 cycles are used for the actual computation.

5 Evaluation

Table 1 shows the occupation of our 2D-DCT accelerator on a Xilinx XC2VP30 device. With Xilinx ISE 8.2, our IP core is synthesized at 107 MHz.
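The 80-cycle figure and the 107 MHz clock can be combined into a per-block latency with a little arithmetic. The Python sketch below also shows one accounting that happens to reproduce the reported 48/32 split; this breakdown is an assumption made for illustration (the paper gives only the totals), and it ignores details such as pipeline fill and any overlap between transfers and computation.

```python
# Figures taken from the text.
SAMPLES_PER_BLOCK = 64      # 8x8 block
INPUTS_PER_TRANSFER = 4     # four 8-bit samples per 32-bit FSL word
OUTPUTS_PER_TRANSFER = 2    # two 16-bit results per 32-bit FSL word
CLOCK_HZ = 107e6            # synthesis result reported for the XC2VP30

# Assumed accounting (not stated by the authors): one cycle per transfer,
# and two pipeline cycles (even, then odd coefficients) per 8-sample vector.
VECTORS_PER_PASS = 8
CYCLES_PER_VECTOR = 2
PASSES = 2                  # row-wise pass, then column-wise pass

input_cycles = SAMPLES_PER_BLOCK // INPUTS_PER_TRANSFER      # 16
output_cycles = SAMPLES_PER_BLOCK // OUTPUTS_PER_TRANSFER    # 32
interface_cycles = input_cycles + output_cycles              # 48
compute_cycles = VECTORS_PER_PASS * CYCLES_PER_VECTOR * PASSES  # 32
total_cycles = interface_cycles + compute_cycles             # 80

latency_us = total_cycles / CLOCK_HZ * 1e6
print(interface_cycles, compute_cycles, total_cycles)
print(f"{latency_us:.3f} us per 8x8 block")
```

Under these assumptions a complete 8x8 block, including the transfers to and from the processor, takes roughly 0.75 microseconds.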
Compared to the Xilinx [4] solution, our core has an occupation around 2.5 times higher, but the Xilinx IP core does not include input and output logic for a standard bus and it is much slower, since it has an initial latency of 92 cycles and then produces just one sample every cycle. This is due to the fact that it is realized by combining 8 FIR filters to produce a single sample. Also, the area values refer to a standard, non-customized core, and so they are relative to an 8-bit input and 9-bit output range, clearly not ready for JPEG encoding.

Compared to Agostini's [7] architecture, which uses full Fast 1D-DCT components, our solution uses fewer multipliers and adders/subtractors at the cost of a single additional pipeline stage (seven compared to six). In addition, they adopt a solution with two 1D-DCT elements, while our IP core has a single one that gets reused. They try to use less area by implementing the multiplications with a Wallace tree, but since new FPGAs have embedded multipliers this is no longer an interesting solution, and it can even lead to more occupation. Moreover, each stage of their pipeline needs eight clock cycles to complete, so the initial latency is 48 cycles for a single 1D-DCT. The transpose memory requires 64 more cycles to complete the transpose operation, which leads to a global latency of 160 cycles. After the pipeline has been filled, however, one 8x8 block comes out every 64 cycles.

Finally, the IP core of Bukhari et al. [10] uses fewer adders/subtractors but many more multipliers (11) than our solution for a single 1D-DCT element, due to the adoption of the Loeffler algorithm. A single 1D-DCT is computed for 8 input samples in a single clock cycle, so the full 2D-DCT needs 16 cycles to complete. However, the complexity of each stage of the core does not allow more than 54 MHz in synthesis, and the area occupied, without the logic to interface to a standard processor bus, is already higher.

Figure 4 shows the area/delay scatter plot for the four solutions, normalized with respect to the standard Xilinx IP core. It can be seen that the Xilinx solution, our Optimized Fast 2D-DCT architecture, and Bukhari's solution are Pareto-optimal, lying on the same constant area/delay curve.
Nevertheless, our proposal balances area and delay well, unlike the Xilinx and Bukhari solutions. Agostini's architecture, which uses an organization similar to ours, features larger delay and area. Our work effectively optimizes this architecture for both area and delay.

Figure 4. Area/Delay comparison of the four IP cores

Table 2 reports the results obtained by executing the full JPEG encoding algorithm (including the reading of the input file and the saving of the output) on two different architectures for a 160x120 pixel image. The first solution executes the encoding completely in software, and it is easy to see that the DCT calculation, performed with a Fast DCT software implementation, accounts for almost 20% of the application. The second architecture uses instead our Optimized 2D-DCT core to execute the transform. The numbers show that the 2D-DCT hardware accelerator is two orders of magnitude faster than the software implementation, giving a speed-up of about 138 according to the DCT cycle counts in Table 2. It is also interesting to note that, with the MicroBlaze architecture and the JPEG implementation adopted, the DCT phase is the second most computationally intensive phase of the algorithm. Since this work focuses only on the 2D-DCT hardware accelerator implementation, we did not optimize the RGB to YUV phase. The inclusion of the IP core nullifies the weight of the DCT phase in the application, giving a global speed-up of about 1.23 according to the totals in Table 2.

Phase               Full SW          HW/SW
File reading        133,375,         ,566,414
RGB to YUV          1,575,687,380    1,586,965,423
Exp & Downsample    2,013,185        2,013,435
Set quant. table    74,711           98,242
DCT                 585,084,357      4,227,699
Quantization        354,084,         ,500,870
Entropic coding     461,738,         ,292,474
Total               3,112,057,809    2,535,664,559

Table 2. Comparison, in clock cycles, of the JPEG algorithm executed on a MicroBlaze architecture with and without the Optimized Fast 2D-DCT hardware accelerator

6 Conclusions

In this paper we presented a novel architecture for the Fast 2D-DCT algorithm. The proposed solution is optimized from the area/performance point of view. It uses the symmetries of the algorithm to minimize the number of functional units. Furthermore, the core has been designed to act as an application-specific IP for the MicroBlaze soft core processor and, by taking into account the features and the limitations of its communication system, the architecture has been optimized even further. Our Fast 2D-DCT hardware accelerator adopts a single 1D-DCT element with a seven-stage pipeline that encompasses 19 adders/subtractors and 4 multipliers. Compared to other designs in the literature, it satisfies the requirement of low occupation without sacrificing performance. When introduced in a complete System-on-Chip architecture, it executes two orders of magnitude faster than a software implementation. Overall, it can make the execution of the full JPEG encoding algorithm 20% faster on a standard MicroBlaze system with a reduced impact on occupation.

References

[1] Frank Vahid. The softening of hardware. Computer, 36(4):27-34.
[2] Altera System-on-a-Programmable-Chip (SOPC) Builder. Altera Corporation.
[3] Xilinx Embedded Developer Kit (EDK). Xilinx Corporation.
[4] Xilinx XAPP610: Video compression using DCT. Application note, Xilinx Corporation.
[5] Altera MegaCore Digital Library. Altera Corporation.
[6] D.W. Trainor, J.P. Heron, and R.F. Woods. Implementation of the 2D DCT using a Xilinx XC6264 FPGA. In Signal Processing Systems, SIPS 97 - Design and Implementation, 1997 IEEE Workshop on, Leicester, UK, November.
[7] L.V. Agostini, I.S. Silva, and S. Bampi. Pipelined fast 2D DCT architecture for JPEG image compression. In Integrated Circuits and Systems Design, 2001, 14th Symposium on, Pirenopolis, Brazil.
[8] M. Kovac and N. Ranganathan. JAGUAR: a fully pipelined VLSI architecture for JPEG image compression standard. Proceedings of the IEEE, 83(2), February.
[9] Z.M. Yusof, Z. Aspar, and I. Suleiman. Field programmable gate array (FPGA) based baseline JPEG decoder. In TENCON Proceedings, volume 3, Kuala Lumpur, Malaysia.
[10] K.Z. Bukhari, G.K. Kuzmanov, and S. Vassiliadis. DCT and IDCT implementations on different FPGA technologies. In Proceedings of ProRISC 2002, November.
[11] MicroBlaze Processor Reference Guide. Xilinx Corporation.
[12] Fast Simplex Link (FSL) Bus (v2.00a). Reference Guide. Xilinx Corporation.
More informationA Novel Design Framework for the Design of Reconfigurable Systems based on NoCs
Politecnico di Milano & EPFL A Novel Design Framework for the Design of Reconfigurable Systems based on NoCs Vincenzo Rana, Ivan Beretta, Donatella Sciuto Donatella Sciuto sciuto@elet.polimi.it Introduction
More informationFPGA for Complex System Implementation. National Chiao Tung University Chun-Jen Tsai 04/14/2011
FPGA for Complex System Implementation National Chiao Tung University Chun-Jen Tsai 04/14/2011 About FPGA FPGA was invented by Ross Freeman in 1989 SRAM-based FPGA properties Standard parts Allowing multi-level
More informationAC : INCORPORATING SYSTEM-LEVEL DESIGN TOOLS INTO UPPER-LEVEL DIGITAL DESIGN AND CAPSTONE COURSES
AC 2007-2290: ICORPORATIG SYSTEM-LEVEL DESIG TOOLS ITO UPPER-LEVEL DIGITAL DESIG AD CAPSTOE COURSES Wagdy Mahmoud, University of the District of Columbia IEEE Senior Member American Society for Engineering
More informationPS2 VGA Peripheral Based Arithmetic Application Using Micro Blaze Processor
PS2 VGA Peripheral Based Arithmetic Application Using Micro Blaze Processor K.Rani Rudramma 1, B.Murali Krihna 2 1 Assosiate Professor,Dept of E.C.E, Lakireddy Bali Reddy Engineering College, Mylavaram
More informationModeling and Simulation of System-on. Platorms. Politecnico di Milano. Donatella Sciuto. Piazza Leonardo da Vinci 32, 20131, Milano
Modeling and Simulation of System-on on-chip Platorms Donatella Sciuto 10/01/2007 Politecnico di Milano Dipartimento di Elettronica e Informazione Piazza Leonardo da Vinci 32, 20131, Milano Key SoC Market
More informationCo-synthesis and Accelerator based Embedded System Design
Co-synthesis and Accelerator based Embedded System Design COE838: Embedded Computer System http://www.ee.ryerson.ca/~courses/coe838/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer
More informationSyCERS: a SystemC design exploration framework for SoC reconfigurable architecture
SyCERS: a SystemC design exploration framework for SoC reconfigurable architecture Carlo Amicucci Fabrizio Ferrandi Marco Santambrogio Donatella Sciuto Politecnico di Milano Dipartimento di Elettronica
More informationFPGA Implementation of Multiplier for Floating- Point Numbers Based on IEEE Standard
FPGA Implementation of Multiplier for Floating- Point Numbers Based on IEEE 754-2008 Standard M. Shyamsi, M. I. Ibrahimy, S. M. A. Motakabber and M. R. Ahsan Dept. of Electrical and Computer Engineering
More informationESE Back End 2.0. D. Gajski, S. Abdi. (with contributions from H. Cho, D. Shin, A. Gerstlauer)
ESE Back End 2.0 D. Gajski, S. Abdi (with contributions from H. Cho, D. Shin, A. Gerstlauer) Center for Embedded Computer Systems University of California, Irvine http://www.cecs.uci.edu 1 Technology advantages
More informationPolitecnico di Milano
Politecnico di Milano Automatic parallelization of sequential specifications for symmetric MPSoCs [Full text is available at https://re.public.polimi.it/retrieve/handle/11311/240811/92308/iess.pdf] Fabrizio
More informationA DYNAMICALLY RECONFIGURABLE PARALLEL PIXEL PROCESSING SYSTEM. Daniel Llamocca, Marios Pattichis, and Alonzo Vera
A DYNAMICALLY RECONFIGURABLE PARALLEL PIXEL PROCESSING SYSTEM Daniel Llamocca, Marios Pattichis, and Alonzo Vera Electrical and Computer Engineering Department The University of New Mexico, Albuquerque,
More informationFPGA Implementation of 4-Point and 8-Point Fast Hadamard Transform
FPGA Implementation of 4-Point and 8-Point Fast Hadamard Transform Ankit Agrawal M.Tech Electronics engineering department, MNIT, Jaipur Rajasthan, INDIA. Rakesh Bairathi Associate Professor Electronics
More informationFPGA Implementation of Multiplierless 2D DWT Architecture for Image Compression
FPGA Implementation of Multiplierless 2D DWT Architecture for Image Compression Divakara.S.S, Research Scholar, J.S.S. Research Foundation, Mysore Cyril Prasanna Raj P Dean(R&D), MSEC, Bangalore Thejas
More informationLecture 7: Introduction to Co-synthesis Algorithms
Design & Co-design of Embedded Systems Lecture 7: Introduction to Co-synthesis Algorithms Sharif University of Technology Computer Engineering Dept. Winter-Spring 2008 Mehdi Modarressi Topics for today
More informationHardware Description of Multi-Directional Fast Sobel Edge Detection Processor by VHDL for Implementing on FPGA
Hardware Description of Multi-Directional Fast Sobel Edge Detection Processor by VHDL for Implementing on FPGA Arash Nosrat Faculty of Engineering Shahid Chamran University Ahvaz, Iran Yousef S. Kavian
More informationDESIGN OF DCT ARCHITECTURE USING ARAI ALGORITHMS
DESIGN OF DCT ARCHITECTURE USING ARAI ALGORITHMS Prerana Ajmire 1, A.B Thatere 2, Shubhangi Rathkanthivar 3 1,2,3 Y C College of Engineering, Nagpur, (India) ABSTRACT Nowadays the demand for applications
More informationPipelined Quadratic Equation based Novel Multiplication Method for Cryptographic Applications
, Vol 7(4S), 34 39, April 204 ISSN (Print): 0974-6846 ISSN (Online) : 0974-5645 Pipelined Quadratic Equation based Novel Multiplication Method for Cryptographic Applications B. Vignesh *, K. P. Sridhar
More informationFault Tolerant Parallel Filters Based On Bch Codes
RESEARCH ARTICLE OPEN ACCESS Fault Tolerant Parallel Filters Based On Bch Codes K.Mohana Krishna 1, Mrs.A.Maria Jossy 2 1 Student, M-TECH(VLSI Design) SRM UniversityChennai, India 2 Assistant Professor
More informationImplementation of Lifting-Based Two Dimensional Discrete Wavelet Transform on FPGA Using Pipeline Architecture
International Journal of Computer Trends and Technology (IJCTT) volume 5 number 5 Nov 2013 Implementation of Lifting-Based Two Dimensional Discrete Wavelet Transform on FPGA Using Pipeline Architecture
More informationA Hardware Task-Graph Scheduler for Reconfigurable Multi-tasking Systems
A Hardware Task-Graph Scheduler for Reconfigurable Multi-tasking Systems Abstract Reconfigurable hardware can be used to build a multitasking system where tasks are assigned to HW resources at run-time
More informationCOPY RIGHT. To Secure Your Paper As Per UGC Guidelines We Are Providing A Electronic Bar Code
COPY RIGHT 2018IJIEMR.Personal use of this material is permitted. Permission from IJIEMR must be obtained for all other uses, in any current or future media, including reprinting/republishing this material
More informationFPGA Provides Speedy Data Compression for Hyperspectral Imagery
FPGA Provides Speedy Data Compression for Hyperspectral Imagery Engineers implement the Fast Lossless compression algorithm on a Virtex-5 FPGA; this implementation provides the ability to keep up with
More informationHigh Speed Systolic Montgomery Modular Multipliers for RSA Cryptosystems
High Speed Systolic Montgomery Modular Multipliers for RSA Cryptosystems RAVI KUMAR SATZODA, CHIP-HONG CHANG and CHING-CHUEN JONG Centre for High Performance Embedded Systems Nanyang Technological University
More informationMapping real-life applications on run-time reconfigurable NoC-based MPSoC on FPGA. Singh, A.K.; Kumar, A.; Srikanthan, Th.; Ha, Y.
Mapping real-life applications on run-time reconfigurable NoC-based MPSoC on FPGA. Singh, A.K.; Kumar, A.; Srikanthan, Th.; Ha, Y. Published in: Proceedings of the 2010 International Conference on Field-programmable
More informationFPGA Polyphase Filter Bank Study & Implementation
FPGA Polyphase Filter Bank Study & Implementation Raghu Rao Matthieu Tisserand Mike Severa Prof. John Villasenor Image Communications/. Electrical Engineering Dept. UCLA 1 Introduction This document describes
More informationSimulation & Synthesis of FPGA Based & Resource Efficient Matrix Coprocessor Architecture
Simulation & Synthesis of FPGA Based & Resource Efficient Matrix Coprocessor Architecture Jai Prakash Mishra 1, Mukesh Maheshwari 2 1 M.Tech Scholar, Electronics & Communication Engineering, JNU Jaipur,
More informationSupporting the Linux Operating System on the MOLEN Processor Prototype
1 Supporting the Linux Operating System on the MOLEN Processor Prototype Filipa Duarte, Bas Breijer and Stephan Wong Computer Engineering Delft University of Technology F.Duarte@ce.et.tudelft.nl, Bas@zeelandnet.nl,
More informationPERFORMANCE ANALYSIS OF HIGH EFFICIENCY LOW DENSITY PARITY-CHECK CODE DECODER FOR LOW POWER APPLICATIONS
American Journal of Applied Sciences 11 (4): 558-563, 2014 ISSN: 1546-9239 2014 Science Publication doi:10.3844/ajassp.2014.558.563 Published Online 11 (4) 2014 (http://www.thescipub.com/ajas.toc) PERFORMANCE
More informationFPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC)
FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC) D.Udhayasheela, pg student [Communication system],dept.ofece,,as-salam engineering and technology, N.MageshwariAssistant Professor
More informationFPGA Implementation of an Efficient Two-dimensional Wavelet Decomposing Algorithm
FPGA Implementation of an Efficient Two-dimensional Wavelet Decomposing Algorithm # Chuanyu Zhang, * Chunling Yang, # Zhenpeng Zuo # School of Electrical Engineering, Harbin Institute of Technology Harbin,
More informationHIGH-PERFORMANCE RECONFIGURABLE FIR FILTER USING PIPELINE TECHNIQUE
HIGH-PERFORMANCE RECONFIGURABLE FIR FILTER USING PIPELINE TECHNIQUE Anni Benitta.M #1 and Felcy Jeba Malar.M *2 1# Centre for excellence in VLSI Design, ECE, KCG College of Technology, Chennai, Tamilnadu
More informationUsing Streaming SIMD Extensions in a Fast DCT Algorithm for MPEG Encoding
Using Streaming SIMD Extensions in a Fast DCT Algorithm for MPEG Encoding Version 1.2 01/99 Order Number: 243651-002 02/04/99 Information in this document is provided in connection with Intel products.
More informationAN EFFICIENT VLSI IMPLEMENTATION OF IMAGE ENCRYPTION WITH MINIMAL OPERATION
AN EFFICIENT VLSI IMPLEMENTATION OF IMAGE ENCRYPTION WITH MINIMAL OPERATION 1, S.Lakshmana kiran, 2, P.Sunitha 1, M.Tech Student, 2, Associate Professor,Dept.of ECE 1,2, Pragati Engineering college,surampalem(a.p,ind)
More informationASIC Implementation of one level 2D DWT and 2D DWT in Hybrid Wave-Pipelining & Pipelining
Journal of Scientific & Industrial Research Vol. 74, November 2015, pp. 609-613 ASIC Implementation of one level 2D DWT and 2D DWT in Hybrid Wave-Pipelining & Pipelining V Adhinarayanan 1 *, S Gopalakrishnan
More informationEFFICIENT RECURSIVE IMPLEMENTATION OF A QUADRATIC PERMUTATION POLYNOMIAL INTERLEAVER FOR LONG TERM EVOLUTION SYSTEMS
Rev. Roum. Sci. Techn. Électrotechn. et Énerg. Vol. 61, 1, pp. 53 57, Bucarest, 016 Électronique et transmission de l information EFFICIENT RECURSIVE IMPLEMENTATION OF A QUADRATIC PERMUTATION POLYNOMIAL
More informationA LOW-COMPLEXITY AND LOSSLESS REFERENCE FRAME ENCODER ALGORITHM FOR VIDEO CODING
2014 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP) A LOW-COMPLEXITY AND LOSSLESS REFERENCE FRAME ENCODER ALGORITHM FOR VIDEO CODING Dieison Silveira, Guilherme Povala,
More informationEfficient Self-Reconfigurable Implementations Using On-Chip Memory
10th International Conference on Field Programmable Logic and Applications, August 2000. Efficient Self-Reconfigurable Implementations Using On-Chip Memory Sameer Wadhwa and Andreas Dandalis University
More informationFPGA Implementation of Discrete Fourier Transform Using CORDIC Algorithm
AMSE JOURNALS-AMSE IIETA publication-2017-series: Advances B; Vol. 60; N 2; pp 332-337 Submitted Apr. 04, 2017; Revised Sept. 25, 2017; Accepted Sept. 30, 2017 FPGA Implementation of Discrete Fourier Transform
More informationReconfigurable PLL for Digital System
International Journal of Engineering Research and Technology. ISSN 0974-3154 Volume 6, Number 3 (2013), pp. 285-291 International Research Publication House http://www.irphouse.com Reconfigurable PLL for
More informationDesign of 2-D DWT VLSI Architecture for Image Processing
Design of 2-D DWT VLSI Architecture for Image Processing Betsy Jose 1 1 ME VLSI Design student Sri Ramakrishna Engineering College, Coimbatore B. Sathish Kumar 2 2 Assistant Professor, ECE Sri Ramakrishna
More informationOrthogonal Approximation of DCT in Video Compressing Using Generalized Algorithm
International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2017 IJSRCSEIT Volume 2 Issue 1 ISSN : 2456-3307 Orthogonal Approximation of DCT in Video Compressing
More informationFPGA-Based Rapid Prototyping of Digital Signal Processing Systems
FPGA-Based Rapid Prototyping of Digital Signal Processing Systems Kevin Banovic, Mohammed A. S. Khalid, and Esam Abdel-Raheem Presented By Kevin Banovic July 29, 2005 To be presented at the 48 th Midwest
More informationA flexible memory shuffling unit for image processing accelerators
Eindhoven University of Technology MASTER A flexible memory shuffling unit for image processing accelerators Xie, R.Z. Award date: 2013 Disclaimer This document contains a student thesis (bachelor's or
More informationINTRODUCTION TO FPGA ARCHITECTURE
3/3/25 INTRODUCTION TO FPGA ARCHITECTURE DIGITAL LOGIC DESIGN (BASIC TECHNIQUES) a b a y 2input Black Box y b Functional Schematic a b y a b y a b y 2 Truth Table (AND) Truth Table (OR) Truth Table (XOR)
More informationA Light Weight Network on Chip Architecture for Dynamically Reconfigurable Systems
A Light Weight Network on Chip Architecture for Dynamically Reconfigurable Systems Simone Corbetta, Vincenzo Rana, Marco Domenico Santambrogio and Donatella Sciuto Dipartimento di Elettronica e Informazione
More informationInternational Research Journal of Engineering and Technology (IRJET) e-issn:
Implementation of Image Compression algorithm on FPGA S.A.Gore 1, S.N.Kore 2 1 PG Student, Department of Electronics Engineering, Walchand College of Engineering, Sangli, Maharashtra, 2Associate Professor,
More informationTKT-2431 SoC design. Introduction to exercises
TKT-2431 SoC design Introduction to exercises Assistants: Exercises Jussi Raasakka jussi.raasakka@tut.fi Otto Esko otto.esko@tut.fi In the project work, a simplified H.263 video encoder is implemented
More informationDesign Space Exploration Using Parameterized Cores
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR Design Space Exploration Using Parameterized Cores Ian D. L. Anderson M.A.Sc. Candidate March 31, 2006 Supervisor: Dr. M. Khalid 1 OUTLINE
More informationLecture 8 JPEG Compression (Part 3)
CS 414 Multimedia Systems Design Lecture 8 JPEG Compression (Part 3) Klara Nahrstedt Spring 2012 Administrative MP1 is posted Today Covered Topics Hybrid Coding: JPEG Coding Reading: Section 7.5 out of
More informationThe Efficient Implementation of Numerical Integration for FPGA Platforms
Website: www.ijeee.in (ISSN: 2348-4748, Volume 2, Issue 7, July 2015) The Efficient Implementation of Numerical Integration for FPGA Platforms Hemavathi H Department of Electronics and Communication Engineering
More information