Design Exploration and Implementation of Simplex Algorithm over Reconfigurable Computing Platforms


Sparsh Mittal, Department of Electrical and Computer Engg., ISU Ames, USA
Lizhi Wang, Department of Electrical and Computer Engg., ISU Ames, USA
Amit Pande, Department of Computer Science, UC Davis, California, USA, amit@cs.ucdavis.edu
Praveen Kumar, Department of Computer Science, GRIET Hyderabad, India, praveen.kverma@gmail.com

This work is partially supported by the National Science Foundation under Grant # to the Computing Research Association for the CIFellows Project.

Abstract

Linear programming (LP) is an important tool for many inter-disciplinary optimization problems. The Simplex method is the most widely used algorithm for solving LP problems and has had immense impact on developments in various fields. With the development of public domain and commercial software solvers, it has been automated and made widely available. A serious open challenge is the efficient implementation of the Simplex algorithm on application-specific processors and parallel hardware platforms such as Field Programmable Gate Arrays. Such an implementation could result in a drastic speed-up in the execution of linear programming models. In this paper, we implement the Simplex algorithm on an FPGA with both a low-level design language, namely VHDL, and high-level design and modeling packages for hardware generation. In addition, we have modeled the design in Simulink to serve as an intermediate design for migration from software to hardware. A comparison with existing works promises large speed-ups.

Keywords: Linear Programming; Simplex; Simulink; Xilinx System Generator; VHDL programming

I. INTRODUCTION

Linear programming refers to optimization techniques in which both the objective function and the constraints are linear. Linear programming started in 1947 with the discovery of the Simplex method by Dantzig [1]. It allows mechanical solution of optimization problems with a large number of constraints and variables. The Simplex method is a simple, elegant, yet powerful tool for solving linear programming problems. It requires only function evaluations, not derivatives, and can be implemented efficiently in software.

Although different algorithms have been proposed for solving LP problems, Simplex remains a popular choice. With the availability of many Simplex-based solvers on many general-purpose processing platforms, it is used extensively in diverse engineering domains. However, the computation-intensive nature of the problem and the algorithm calls for greater processing power and greater speed for efficient computation in solving real-time problems. Recently, great speed-ups have been achieved for several algorithms by efficient implementation in dedicated hardware such as Application-Specific Integrated Circuits (ASICs). However, long time-to-market has been a bottleneck for ASICs. The evolution of Field Programmable Gate Arrays (FPGAs), along with high-level design tools such as those from Altera and Xilinx System Generator, has given high-level programmers a valuable and effective way to achieve better execution times on this reconfigurable hardware. FPGAs cut the lag between hardware design and shipping of the circuit from 2-3 years to a few weeks.

In this paper, we implement the Simplex algorithm on an FPGA using both VHDL (a low-level design language) and XSG (a high-level visual tool for hardware generation) for small-sized problems, and we also model and simulate the algorithm in Simulink. The key contributions of our work are as follows:

1) To the best of our knowledge, this is the first model of Simplex in Simulink for ease of visualization and simulation.
2) We are also the first to implement Simplex in System Generator for FPGA design.
3) We have also developed Simplex on an FPGA, using direct design in VHDL to achieve a fast implementation.
4) We discuss the parallelization obtained by an efficient tableau-based representation. The clock frequency achieved by this design is compared with that of general-purpose software.

The paper is organized as follows: Section II discusses the basics of the Simplex method. Section III reviews the existing literature, while Section IV discusses some design languages for hardware implementation on FPGAs. Two such implementations are then discussed: the Simulink-based design in Section V and the VHDL-based coding in Section VI. Section VII gives conclusions and future work in this direction.

II. BACKGROUND

A linear program is represented in the standard form in matrix notation:

$$\max\; C^T x \quad \text{s.t.} \quad A x + w = b, \quad x, w \ge 0$$

Here $x \in \mathbb{R}^n$ and $w \in \mathbb{R}^m$ are decision variables, and $C \in \mathbb{R}^n$, $b \in \mathbb{R}^m$ and $A \in \mathbb{R}^{m \times n}$ are parameters. For the special case of three variables (n=3) and three constraints (m=3), it can be explicitly written as:

$$\begin{aligned} \max\;\; & C_1 x_1 + C_2 x_2 + C_3 x_3 \\ \text{s.t.}\;\; & a_{i1} x_1 + a_{i2} x_2 + a_{i3} x_3 + w_i = b_i, \quad i = 1, 2, 3 \\ & x_1, x_2, x_3, w_1, w_2, w_3 \ge 0 \end{aligned}$$

where $a_{ij}$ denotes the entries of $A$.

In what follows, we briefly explain the working of Simplex. The basic idea of Simplex is based on the observation that the optimal solution to an LP, if it exists, occurs at an extreme point of the feasible region (called a "basic solution"). Based on this observation, we can find the optimal solution by (i) starting from a feasible corner point, and (ii) moving to a better corner point until the current one is already optimal. If we cannot find a starting point, then the LP is infeasible; if we can increase the objective value to infinity, then the LP is unbounded. We write the problem in the following matrix form:

$$\max\,\{\, \bar{C}^T \bar{x} : \bar{A} \bar{x} = b,\; \bar{x} \ge 0 \,\}$$

Here,

$$\bar{C} = \begin{bmatrix} C \\ 0_{m \times 1} \end{bmatrix}, \qquad \bar{A} = \begin{bmatrix} A & I_{m \times m} \end{bmatrix}, \qquad \bar{x} = \begin{bmatrix} x \\ w \end{bmatrix} \in \mathbb{R}^{n+m}$$

We need $n + m$ linearly independent active constraints to uniquely determine a basic solution. Define $N$ as the indices of constraints in $\bar{x} \ge 0$ that are set to hold at equality, and $B$ as the indices of the other constraints in $\bar{x} \ge 0$. Such a pair $(B, N)$ is an exclusive and exhaustive partition of the set $\{1, 2, \ldots, m + n\}$. The above conditions are only necessary for a basic solution. To find the sufficient condition for a basic solution, we rewrite the problem using the definition of $N$ and $B$:

$$\max\; \bar{C}_B^T \bar{x}_B + \bar{C}_N^T \bar{x}_N \quad \text{s.t.} \quad \bar{A}_B \bar{x}_B + \bar{A}_N \bar{x}_N = b, \quad \bar{x}_B \ge 0,\; \bar{x}_N \ge 0$$

where $\bar{A}_B$ is the collection of columns in $\bar{A}$ whose indices are in the set $B$, and $\bar{x}_N$ and $\bar{C}_N$ are the collections of elements in $\bar{x}$ and $\bar{C}$, respectively, whose indices are in the set $N$. Then, the necessary and sufficient conditions for a basic solution are: the $m$ elements in the set $B$ should be chosen such that $\bar{A}_B$ is invertible, and the $n$ elements in the set $N$ are then determined by $N = \{1, 2, \ldots, m + n\} \setminus B$. Such a partition is called a basic partition. The corresponding basic solution is obtained by setting $\bar{x}_N = 0$ and solving $\bar{A}_B \bar{x}_B = b$.
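As an illustration of the basic-partition condition above, the following minimal Python sketch enumerates every choice of m columns of the augmented matrix [A I], discards choices whose submatrix is singular, and keeps the feasible basic solution with the best objective value. The function name `basic_solution`, the tolerances, and the 2-variable, 2-constraint data are hypothetical choices made for this sketch; they do not come from the paper, and brute-force enumeration is shown only to make the definition concrete, not as a practical solver.

```python
import itertools
import numpy as np

def basic_solution(A_aug, b, basis):
    """Basic solution for the column indices in `basis`, or None if A_B is singular."""
    cols = list(basis)
    A_B = A_aug[:, cols]
    if abs(np.linalg.det(A_B)) < 1e-12:   # A_B must be invertible
        return None
    x = np.zeros(A_aug.shape[1])          # nonbasic variables stay at zero
    x[cols] = np.linalg.solve(A_B, b)     # basic variables solve A_B x_B = b
    return x

# Hypothetical example data (n = 2 variables, m = 2 constraints).
A = np.array([[1.0, 1.0], [1.0, 3.0]])
b = np.array([4.0, 6.0])
C = np.array([3.0, 2.0])

m, n = A.shape
A_aug = np.hstack([A, np.eye(m)])         # [A I]
C_aug = np.concatenate([C, np.zeros(m)])  # [C; 0]

best = None
for basis in itertools.combinations(range(n + m), m):
    x = basic_solution(A_aug, b, basis)
    if x is None or np.any(x < -1e-9):    # skip singular or infeasible partitions
        continue
    value = float(C_aug @ x)
    if best is None or value > best[0]:
        best = (value, basis, x)

best_value, best_basis, best_x = best
print(best_value, best_basis, best_x)
```

The number of candidate partitions grows combinatorially with m and n, which is why Simplex moves between neighbouring basic partitions instead of enumerating them all.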
In any iteration with a feasible basic partition $(B, N)$ that is not optimal, the partition is updated by selecting an entering variable and a leaving variable. The rule for selecting the entering and leaving variables is called a pivoting rule. We have used Bland's pivoting rule, which picks the lowest-indexed eligible variable at each choice and thereby avoids cycling. After the entering and leaving variables are chosen, we get an updated partition. This process is repeated until an optimal partition and solution are found. In the worst case, Simplex may require exponentially many iterations to examine each of the basic solutions, and other methods (such as the ellipsoid method) exist that are theoretically guaranteed to run in polynomial time. However, the practical performance of the Simplex algorithm is in general much better than that of the ellipsoid method, which is one of the reasons the Simplex algorithm is widely used.

III. LITERATURE REVIEW

In the literature, several techniques have been proposed for the solution of linear programs. For example, feasible direction methods are proposed by Brown and Koopmans [2], as well as by Murty and Fathi [3], among others. Megiddo [4] reduces the number of constraints through a multidimensional search technique. The ellipsoid algorithm [5] first established that linear programming problems can be solved in polynomial time, but it performs poorly in practice. Karmarkar [6] developed a polynomial projection approach that is used in some applications. However, the Simplex algorithm remains the underlying algorithm used by most commercial linear programming packages. Even though the Simplex algorithm is not polynomial, in practice it is found to be efficient enough to be used, and Borgwardt [7] proved that its expected number of iterations is polynomial when it is applied to practical problems. The main computational disadvantage of the Simplex algorithm is that the total number of iterations cannot be predicted. As the dimension n increases, the computational time can rise exponentially in the worst case.
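To make the pivoting iteration of Section II concrete, here is a minimal software sketch of a tableau Simplex loop using Bland's rule. It is an illustrative reference model, not the authors' Simulink or VHDL design: the function name `simplex_bland`, the assumption that b is non-negative (so the slack variables give a feasible starting partition), the use of exact `Fraction` arithmetic, and the 3-variable, 3-constraint example data are all assumptions made for this sketch.

```python
from fractions import Fraction

def simplex_bland(C, A, b):
    """Maximize C.x subject to A x <= b, x >= 0, assuming every b[i] >= 0.

    Returns (status, x, objective, iterations).
    """
    m, n = len(A), len(C)
    # Constraint rows of the tableau: [A | I | b]. Slacks form the first basis.
    T = [[Fraction(v) for v in A[i]]
         + [Fraction(int(i == j)) for j in range(m)]
         + [Fraction(b[i])] for i in range(m)]
    # Objective row: reduced costs, then minus the current objective value.
    z = [Fraction(c) for c in C] + [Fraction(0)] * (m + 1)
    basis = list(range(n, n + m))
    iterations = 0
    while True:
        # Bland's rule: entering variable = lowest index with positive reduced cost.
        entering = next((j for j in range(n + m) if z[j] > 0), None)
        if entering is None:
            x = [Fraction(0)] * (n + m)
            for i, j in enumerate(basis):
                x[j] = T[i][-1]
            return "optimal", x[:n], -z[-1], iterations
        # Ratio test; ties are broken by the lowest basic-variable index.
        ratios = [(T[i][-1] / T[i][entering], basis[i], i)
                  for i in range(m) if T[i][entering] > 0]
        if not ratios:
            return "unbounded", None, None, iterations
        r = min(ratios)[2]
        # Pivot: normalize row r, then eliminate the entering column elsewhere.
        p = T[r][entering]
        T[r] = [v / p for v in T[r]]
        for i in range(m):
            if i != r and T[i][entering] != 0:
                f = T[i][entering]
                T[i] = [a - f * c for a, c in zip(T[i], T[r])]
        f = z[entering]
        z = [a - f * c for a, c in zip(z, T[r])]
        basis[r] = entering
        iterations += 1

# Hypothetical example with 3 variables and 3 constraints.
status, x, objective, iterations = simplex_bland(
    C=[5, 4, 3],
    A=[[2, 3, 1], [4, 1, 2], [3, 4, 2]],
    b=[5, 11, 8],
)
print(status, x, objective, iterations)
```

Exact rational arithmetic is used here only so that the result can be checked by hand; a hardware design would work with fixed-width numbers instead.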
To improve efficiency, parallel implementations of linear programming algorithms have been studied extensively in recent years [8,9,10]. Linear programming is applied to a large variety of scientific and industrial computing applications that involve optimization problems. A few application areas include real-time motion analysis [11] and MIMO detection and decoding [12]. In these applications, linear programming is preferred over nonlinear programming because of its efficiency and other problem-specific advantages. Many variants of Simplex have been developed that are more efficient than naïve Simplex, such as Cosine Simplex, which offer improvements such as a reduction in the number of Simplex iterations and in the number of computations per iteration.

We also discuss the work done on implementing Simplex in hardware. Majumdar [13] implements integer linear programming on an FPGA and shows a speed-up over a software implementation. The design is composed of both a software unit and a hardware unit. The software unit accepts the input file, scans it for the problem size, the objective and the other components of the input, and sends them to the hardware unit, where they are stored in the Zero Bus Turnaround (ZBT) memory of the Virtex-II and passed to the processing module. The processing module processes the data and sends the solution to the output module, which stores it in the ZBT memory. That design uses a dictionary-based representation of the problem, whereas we use a tableau-based representation for efficient computation. Due to large hardware requirements and the lack of pipelining, that implementation is slow compared to ours, as shown by its very poor clock frequency.

Klindworth and Schutz [14] present a hardware realization of Simplex. They discuss the solution of problems in which many operands (coefficients in A and b) are zero. The hardware is based on a parallel architecture and employs standard FPUs, RAMs and custom VLSI chips. They use a VLSI chip model that is somewhat like a multicore chip. However, implementation on an FPGA has its own advantages. They use eight processing units to obtain parallelism, whereas by efficiently exploiting the parallelism of the FPGA we promise much higher parallelism (such as 28, or 100 or more, as shown in Section VI).
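The parallelism available in a tableau-based representation can be seen from the pivot update itself. In the sketch below, written in Python with NumPy as a software stand-in for hardware, the whole update is a single rank-one correction: each new entry depends only on the old entry, one element of the pivot column, one element of the pivot row, and the pivot itself, so all entries can in principle be computed at the same time. The function name `pivot_step` and the example tableau are hypothetical, and the code makes no claim about how the authors' VHDL design schedules these operations.

```python
import numpy as np

def pivot_step(T, r, c):
    """Return the tableau obtained by pivoting T on entry (r, c).

    New entry (i, j) uses only T[i, j], T[i, c], T[r, j] and T[r, c], so no
    entry has to wait for another one to be updated first.
    """
    pivot_row = T[r] / T[r, c]           # one division per column
    col = T[:, c].copy()
    col[r] = 0.0                         # the pivot row is handled separately
    out = T - np.outer(col, pivot_row)   # one multiply and one subtract per entry
    out[r] = pivot_row
    return out

# Hypothetical tableau for 3 variables and 3 constraints:
# three constraint rows [A | I | b] followed by the objective row.
T = np.array([
    [2.0, 3.0, 1.0, 1.0, 0.0, 0.0, 5.0],
    [4.0, 1.0, 2.0, 0.0, 1.0, 0.0, 11.0],
    [3.0, 4.0, 2.0, 0.0, 0.0, 1.0, 8.0],
    [5.0, 4.0, 3.0, 0.0, 0.0, 0.0, 0.0],
])
T_next = pivot_step(T, r=0, c=0)
print(T_next)
```

For this example the tableau has 4 x 7 = 28 entries, so a fully unrolled update would use on the order of 28 independent multiply-subtract units; this count is a property of the sketch and an assumption about how such a design might be sized.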
The shorter time-to-market of FPGAs compared with VLSI designs is the reason FPGAs are a popular choice in the current market. Besides, none of the existing systems uses any modeling or simulation language for visualization and demonstration of this algorithm to enhance learning. Moreover, even though commercial and public domain software packages for Simplex exist and are widely used, the immense potential of hardware has hardly been used to enhance the performance of this computation-intensive algorithm. In this paper, we address these limitations.

IV. DESIGN LANGUAGES FOR IMPLEMENTATION

The salient features of FPGAs that make them faster than conventional general-purpose hardware such as Pentiums are their greater I/O bandwidth to local memory, pipelining, parallelism and the availability of optimizing compilers. Complex tasks that involve multiple image operators run much faster on FPGAs than on Pentiums; in fact, Bruce (2003) reports an 800-times speed-up on an FPGA using SA-C. There are several reasons for the large speed-up that FPGAs have over PCs. In comparison to an FPGA, hardware such as a Pentium runs at memory speed, not at cache speed. So, even when running at a much higher clock frequency and with the benefit of cache memory, it responds much more slowly than a comparable FPGA. The frequency of operation in hardware such as a Pentium can be increased up to a certain extent to raise performance or the data rate required to process the image data, but increasing the frequency beyond certain limits causes system-level and board-level issues that become a bottleneck in the design.

Choosing an appropriate tool for FPGA design is of crucial importance, as it affects the cost, development time and various other aspects of the design. Simulink is a platform for multi-domain simulation and Model-Based Design of dynamic systems. It provides an interactive graphical environment and a set of block libraries, and can be extended for different specialized applications.
Using Simulink, one can quickly build up models from libraries of pre-built blocks. For high-level design we have chosen Xilinx System Generator. It is a DSP design tool from Xilinx that enables the use of the MathWorks model-based design environment Simulink for FPGA design. Xilinx System Generator (XSG) for DSP offers block libraries that plug into Simulink and contain bit-true and cycle-accurate models of the FPGA's particular math, logic and DSP functions. It is a system-level modeling tool in which designs are captured in the DSP-friendly Simulink modeling environment using a Xilinx-specific blockset. All of the downstream FPGA implementation steps, including synthesis and place and route, are performed automatically to generate an FPGA programming file. Over 90 DSP building blocks are provided in the Xilinx DSP blockset for Simulink.

V. SYSTEM DESIGN

Figure 1 shows our model of the Simplex solver in Simulink. Simplex checks the vertices of the feasible region and searches iteratively until an optimal solution is found. The values of the coefficients at the end of one step act as the starting point for the next step of pivot computation. Thus, in a visual data-flow environment, this is represented by a feedback network or by a memory element that remembers the previous value of the coefficients in the Simplex tableau. We have implemented models using both approaches. For the sake of brevity, we omit the figure that employs a feedback network to update the values. The model in Figure 1 uses persistent variables for this purpose. They have the special property that they need to be initialized only once, during the first function call, and they retain their values during subsequent function calls. The value of the objective function can be read from the display for both the current step and the optimal (final) step. The simulation stops automatically on finding the optimal value. Note that, since this design does not use Xilinx blocksets, it cannot be directly implemented in hardware.

Figure 1: Simplex Model in Simulink

Figure 2: Simplex Model in Xilinx System Generator

Figure 2 shows the model of Simplex in System Generator. The input and output interface blocks convert between the signals produced by Simulink sources and those used by Xilinx blocks, and vice versa. Except for sources and sinks (for display of results), this design is composed entirely of Xilinx blocks and hence can be used to generate the hardware at the click of a button.

VI. VHDL IMPLEMENTATION

VHDL, or Very High Speed Integrated Circuits Hardware Description Language, has long been the choice of commercial and military consumers for digital hardware design (Kief, 2008) and continues to dominate the commercial market due to optimized implementation on hardware and the availability of a large number of free IP cores. After studying the solution of the Simplex method using Simulink, we demonstrated its hardware feasibility and visual interface through Xilinx System Generator. However, many blocks of custom Matlab code (.m files) were needed for the design, and the hardware generated for these blocks was not optimized. In this section we present the details of the design implemented in VHDL and later synthesized in Xilinx ISE. Xilinx ISE is a design tool provided by Xilinx to help build bit streams that are ported directly onto FPGA boards. The Xilinx ISE tool performs several optimizations before synthesizing the design. We targeted the Xilinx Virtex-5 XCVLX330-LX board. The hardware usage of the FPGA is presented in Table 1. The hardware was pipelined to shorten the critical path of the design and increase the clock frequency. The multipliers were implemented in hardware with the help of XtremeDSP slices, while the divider IP core was generated using the Xilinx Core Generator software. The hardware implementation details are presented in Table 2. A clock frequency of 644 MHz was achieved with a latency of 3 cycles. This implies that we can move from one optimal solution to another in roughly 4.5 ns.

Table 1. Hardware usage statistics of the FPGA
Slice Logic Utilization:
  Number of Slice Registers: 1029 /
  Number of Slice LUTs: 1018 /
    Number used as Logic: 1012 /
    Number used as Memory: 6 /
Slice Logic Distribution:
  Number of LUT Flip Flop pairs used: 1591

Table 2. Hardware implementation details on the Xilinx FPGA
# Multipliers : 27
  16x16-bit multiplier : 27
# Adders/Subtractors :
  bit adder :
  bit subtractor : 28
# Registers :
  bit register : 91
  3-bit register : 1
# Latches : 20
  1-bit latch : 20
# Comparators :
  bit comparator greater : 3
  16-bit comparator less : 26
# Multiplexers : 3
  16-bit 8-to-1 multiplexer : 3
# Xors : 10
  1-bit xor2 : 10
# Dividers : 8

It can be observed from Table 1 that the implementation (a standard LP with 3 variables and 3 constraints) leaves most of the FPGA hardware unused. Therefore, we can increase the number of variables to a very large value and still get a reasonably good implementation. As the number of variables and constraints increases, there is a quadratic increase in the usage of hardware resources (slice registers). However, since most of the multiplication operations are done in parallel and in a row/column-wise manner, the clock frequency decreases only linearly. As we increase the number of variables, the clock frequency of the FPGA-based implementation will decrease, owing to the longer signal propagation time through the interconnects. However, we expect that the performance will still be better than that of software-based implementations, where an increase in the number of variables cannot be accompanied by increased resource utilization.

VII.
CONCLUSIONS AND FUTURE WORK

Advances in FPGA technology, along with the development of elaborate and efficient tools for modeling, simulation and synthesis, have made FPGAs a highly useful platform. With a graphical environment based on Simulink and a predefined blockset of Xilinx DSP cores, System Generator meets the needs of both system architects who need to integrate the components of a complete design and hardware designers who need to optimize implementations. We have implemented Simplex in Simulink and on an FPGA using Xilinx System Generator for a problem size of three variables and three constraints. We presented the synthesis results for the implementation on the Virtex-5 XCVLX330 FPGA board. A high clock frequency of 644 MHz was obtained. Future work will focus on the development of a visually enhanced implementation of Simplex in Simulink and its generalization to an arbitrarily large number of variables, using the powerful graphical functions of Simulink. We also plan to conduct a survey among undergraduate and graduate students learning the Simplex algorithm, to assess how a graphical implementation of Simplex assists and augments their learning process.

REFERENCES

[1] Dantzig, G.B. Maximization of a linear function of variables subject to linear inequalities. In T.C. Koopmans, editor, Activity Analysis of Production and Allocation, number 13 in Cowles Commission Monographs, pages 339-347, John Wiley & Sons, Inc.
[2] Brown, G.W. and Koopmans, T.C. Computational suggestions for maximizing a linear function subject to linear inequalities. In T.C. Koopmans, editor, Activity Analysis of Production and Allocation, John Wiley, New York (1951).
[3] Murty, K.G. and Fathi, Y. A feasible direction method for linear programming. Operations Research Letters 3 (1984).
[4] Megiddo, N. Linear programming in linear time when the dimension is fixed. Journal of the Association of Computing Machinery 31 (1984).
[5] Bland, R.G., Goldfarb, D. and Todd, M.J. The ellipsoid method: a survey. Operations Research 29 (1981).
[6] Karmarkar, N. A new polynomial-time algorithm for linear programming. Combinatorica 4 (1984).
[7] Borgwardt, K.H. Some distribution independent results about the asymptotic order of the average number of pivot steps in the simplex method. Mathematics of Operations Research, vol. 7, no. 3.
[8] Maros, I. and Mitra, G. Investigating the sparse simplex method on a distributed memory multiprocessor. Parallel Computing, vol. 26.
[9] Klabjan, D., Johnson, L.E. and Nemhauser, L.G. A parallel primal-dual simplex algorithm. Operations Research Letters, vol. 27, no. 2.
[10] Eckstein, J., Bodurglu, I., Polymenakos, L. and Goldfarb, D. Data-Parallel Implementations of Dense Simplex Methods on the Connection Machine CM-2. ORSA Journal on Computing, vol. 7, no. 4.
[11] Ben-Ezra, M., Peleg, S. and Werman, M. Real-time motion analysis with linear programming. Computer Vision and Image Understanding, vol. 78, no. 1, pp. 32-52, April.
[12] Cui, T., Ho, T. and Tellambura, C. Linear Programming Detection and Decoding for MIMO Systems. IEEE International Symposium on Information Theory, July.
[13] Majumdar, A. FPGA Implementation of Integer Linear Programming Accelerator. International Conference on Systemics, Cybernetics and Informatics (ICSCI), Jan.
[14] Klindworth, A. and Schutz, B. A VLSI-Chip-Set for a Hardware-Accelerator for the Simplex-Method. Proc. 5th Ann. IEEE International ASIC Conference, Rochester, NY, Sept. 1992.


Parallel graph traversal for FPGA LETTER IEICE Electronics Express, Vol.11, No.7, 1 6 Parallel graph traversal for FPGA Shice Ni a), Yong Dou, Dan Zou, Rongchun Li, and Qiang Wang National Laboratory for Parallel and Distributed Processing,

More information

FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC)

FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC) FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC) D.Udhayasheela, pg student [Communication system],dept.ofece,,as-salam engineering and technology, N.MageshwariAssistant Professor

More information

DESIGN AND IMPLEMENTATION OF VLSI SYSTOLIC ARRAY MULTIPLIER FOR DSP APPLICATIONS

DESIGN AND IMPLEMENTATION OF VLSI SYSTOLIC ARRAY MULTIPLIER FOR DSP APPLICATIONS International Journal of Computing Academic Research (IJCAR) ISSN 2305-9184 Volume 2, Number 4 (August 2013), pp. 140-146 MEACSE Publications http://www.meacse.org/ijcar DESIGN AND IMPLEMENTATION OF VLSI

More information

Prachi Sharma 1, Rama Laxmi 2, Arun Kumar Mishra 3 1 Student, 2,3 Assistant Professor, EC Department, Bhabha College of Engineering

Prachi Sharma 1, Rama Laxmi 2, Arun Kumar Mishra 3 1 Student, 2,3 Assistant Professor, EC Department, Bhabha College of Engineering A Review: Design of 16 bit Arithmetic and Logical unit using Vivado 14.7 and Implementation on Basys 3 FPGA Board Prachi Sharma 1, Rama Laxmi 2, Arun Kumar Mishra 3 1 Student, 2,3 Assistant Professor,

More information

A Novel Energy Efficient Source Routing for Mesh NoCs

A Novel Energy Efficient Source Routing for Mesh NoCs 2014 Fourth International Conference on Advances in Computing and Communications A ovel Energy Efficient Source Routing for Mesh ocs Meril Rani John, Reenu James, John Jose, Elizabeth Isaac, Jobin K. Antony

More information

The Xilinx XC6200 chip, the software tools and the board development tools

The Xilinx XC6200 chip, the software tools and the board development tools The Xilinx XC6200 chip, the software tools and the board development tools What is an FPGA? Field Programmable Gate Array Fully programmable alternative to a customized chip Used to implement functions

More information
