International Journal of Scientific & Engineering Research, Volume 4, Issue 10, October ISSN

Similar documents
Designing and Characterization of koggestone, Sparse Kogge stone, Spanning tree and Brentkung Adders

ANALYZING THE PERFORMANCE OF CARRY TREE ADDERS BASED ON FPGA S

Design of Parallel Prefix Adders using FPGAs

Srinivasasamanoj.R et al., International Journal of Wireless Communications and Network Technologies, 1(1), August-September 2012, 4-9

Implementation of Ripple Carry and Carry Skip Adders with Speed and Area Efficient

High Speed Han Carlson Adder Using Modified SQRT CSLA

Design and Implementation of High Performance Parallel Prefix Adders

Improved Fault Tolerant Sparse KOGGE Stone ADDER

Design and Characterization of High Speed Carry Select Adder

Design and Analysis of Kogge-Stone and Han-Carlson Adders in 130nm CMOS Technology

DESIGN AND IMPLEMENTATION 0F 64-BIT PARALLEL PREFIX BRENTKUNG ADDER

Implementation of 32-Bit Wave Pipelining Sparse Tree Adders

the main limitations of the work is that wiring increases with 1. INTRODUCTION

An Efficient Hybrid Parallel Prefix Adders for Reverse Converters using QCA Technology

Performance of Constant Addition Using Enhanced Flagged Binary Adder

An Efficient Carry Select Adder with Less Delay and Reduced Area Application

OPTIMIZING THE POWER USING FUSED ADD MULTIPLIER

Parallel-Prefix Adders Implementation Using Reverse Converter Design. Department of ECE

Design of Delay Efficient Distributed Arithmetic Based Split Radix FFT

PARALLEL PREFIX TREE 32-BIT COMPARATOR AND ADDER BY USING SCALABLE DIGITAL CMOS

Design and Implementation of Fault Tolerant Adders on Field Programmable Gate Arrays

Implementation of 64-Bit Kogge Stone Carry Select Adder with ZFC for Efficient Area

DESIGN AND ANALYSIS OF COMPETENT ARITHMETIC AND LOGIC UNIT FOR RISC PROCESSOR

ISSN (Online)

FPGA Implementation of a High Speed Multiplier Employing Carry Lookahead Adders in Reduction Phase

DESIGN AND PERFORMANCE ANALYSIS OF CARRY SELECT ADDER

Low-Area Low-Power Parallel Prefix Adder Based on Modified Ling Equations

High Performance and Area Efficient DSP Architecture using Dadda Multiplier

A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding

Design and Implementation of CVNS Based Low Power 64-Bit Adder

VLSI Implementation of Adders for High Speed ALU

Reduced Delay BCD Adder

Area Delay Power Efficient Carry-Select Adder

Performance Analysis of 64-Bit Carry Look Ahead Adder

Area Delay Power Efficient Carry-Select Adder

Design of Two Different 128-bit Adders. Project Report

Performance Analysis of CORDIC Architectures Targeted by FPGA Devices

Keywords: Soft Core Processor, Arithmetic and Logical Unit, Back End Implementation and Front End Implementation.

Design of Delay Efficient Carry Save Adder

AN EFFICIENT REVERSE CONVERTER DESIGN VIA PARALLEL PREFIX ADDER

DESIGN AND IMPLEMENTATION OF ADDER ARCHITECTURES AND ANALYSIS OF PERFORMANCE METRICS

AN IMPROVED FUSED FLOATING-POINT THREE-TERM ADDER. Mohyiddin K, Nithin Jose, Mitha Raj, Muhamed Jasim TK, Bijith PS, Mohamed Waseem P

A New Architecture Designed for Implementing Area Efficient Carry-Select Adder

High-Performance Carry Chains for FPGA s

An Efficient Fused Add Multiplier With MWT Multiplier And Spanning Tree Adder

1. Introduction. Raj Kishore Kumar 1, Vikram Kumar 2

Design and Verification of Area Efficient High-Speed Carry Select Adder

Design and Implementation of Signed, Rounded and Truncated Multipliers using Modified Booth Algorithm for Dsp Systems.

Power Optimized Programmable Truncated Multiplier and Accumulator Using Reversible Adder

University, Patiala, Punjab, India 1 2

Keywords: throughput, power consumption, area, pipeline, fast adders, vedic multiplier. GJRE-F Classification : FOR Code:

An FPGA based Implementation of Floating-point Multiplier

AUTONOMOUS RECONFIGURATION OF IP CORE UNITS USING BLRB ALGORITHM

DESIGN OF HYBRID PARALLEL PREFIX ADDERS

Design of an Efficient 128-Bit Carry Select Adder Using Bec and Variable csla Techniques

Implementation of Efficient Modified Booth Recoder for Fused Sum-Product Operator

A Unified Addition Structure for Moduli Set {2 n -1, 2 n,2 n +1} Based on a Novel RNS Representation

32 bit Arithmetic Logical Unit (ALU) using VHDL

Implementation of Optimized ALU for Digital System Applications using Partial Reconfiguration

Design of High Speed Modulo 2 n +1 Adder

HIGH-PERFORMANCE RECONFIGURABLE FIR FILTER USING PIPELINE TECHNIQUE

16 BIT IMPLEMENTATION OF ASYNCHRONOUS TWOS COMPLEMENT ARRAY MULTIPLIER USING MODIFIED BAUGH-WOOLEY ALGORITHM AND ARCHITECTURE.

Parallel, Single-Rail Self-Timed Adder. Formulation for Performing Multi Bit Binary Addition. Without Any Carry Chain Propagation

International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering

DESIGN OF RADIX-8 BOOTH MULTIPLIER USING KOGGESTONE ADDER FOR HIGH SPEED ARITHMETIC APPLICATIONS

A High Performance Reconfigurable Data Path Architecture For Flexible Accelerator

Design and Implementation of Parallel Prefix Adder for Improving the Performance of Carry Lookahead Adder

A Low-Power Field Programmable VLSI Based on Autonomous Fine-Grain Power Gating Technique

International Journal Of Global Innovations -Vol.6, Issue.I Paper Id: SP-V6-I1-P14 ISSN Online:

Multi-Modulus Adder Implementation and its Applications

A Novel Carry-look ahead approach to an Unified BCD and Binary Adder/Subtractor

DESIGN AND IMPLEMENTATION OF APPLICATION SPECIFIC 32-BITALU USING XILINX FPGA

OPTIMIZATION OF AREA COMPLEXITY AND DELAY USING PRE-ENCODED NR4SD MULTIPLIER.

International Journal of Innovative Research in Computer Science & Technology (IJIRCST) ISSN: , Volume-3, Issue-5, September-2015

Compound Adder Design Using Carry-Lookahead / Carry Select Adders

FPGA Implementation of Efficient Carry-Select Adder Using Verilog HDL

DESIGN AND SIMULATION OF 1 BIT ARITHMETIC LOGIC UNIT DESIGN USING PASS-TRANSISTOR LOGIC FAMILIES

FPGA for Complex System Implementation. National Chiao Tung University Chun-Jen Tsai 04/14/2011

Area Delay Power Efficient Carry Select Adder

A Ripple Carry Adder based Low Power Architecture of LMS Adaptive Filter

IMPLEMENTATION OF DOUBLE PRECISION FLOATING POINT RADIX-2 FFT USING VHDL

Low Power Circuits using Modified Gate Diffusion Input (GDI)

Non-Heuristic Optimization and Synthesis of Parallel-Prefix Adders

A Taxonomy of Parallel Prefix Networks

VLSI Arithmetic Lecture 6

VHDL-MODELING OF A GAS LASER S GAS DISCHARGE CIRCUIT Nataliya Golian, Vera Golian, Olga Kalynychenko

Implementation of Floating Point Multiplier Using Dadda Algorithm

DESIGN AND IMPLEMENTATION OF VLSI SYSTOLIC ARRAY MULTIPLIER FOR DSP APPLICATIONS

Digital Logic & Computer Design CS Professor Dan Moldovan Spring 2010

Area Efficient, Low Power Array Multiplier for Signed and Unsigned Number. Chapter 3

Design and Development of Vedic Mathematics based BCD Adder

A High Speed Design of 32 Bit Multiplier Using Modified CSLA

Study, Implementation and Survey of Different VLSI Architectures for Multipliers

RUN-TIME RECONFIGURABLE IMPLEMENTATION OF DSP ALGORITHMS USING DISTRIBUTED ARITHMETIC. Zoltan Baruch

High Speed Special Function Unit for Graphics Processing Unit

Parallelized Radix-4 Scalable Montgomery Multipliers

VLSI Design Of a Novel Pre Encoding Multiplier Using DADDA Multiplier. Guntur(Dt),Pin:522017

ECE 645: Lecture 1. Basic Adders and Counters. Implementation of Adders in FPGAs

Available online at ScienceDirect. Procedia Technology 24 (2016 )

FPGA architecture and design technology

Transcription:

International Journal of Scientific & Engineering Research, Volume 4, Issue 10, October-2013 1502 Design and Characterization of Koggestone, Sparse Koggestone, Spanning tree and Brentkung Adders V. Krishna Kumari, Y. Sri Chakrapani, Dr. M. Kamaraju Abstract-- In various VLSI designs, the adders are frequently used. The most commonly used adder is the Ripple Carry Adder (RCA), which can be implemented by using half adders and full adders.this RCA is a serial adder which is used to perform any number of additions, but it has propagation delay problem due to carry propagation from stage to stage which leads to more delay. To overcome this delay, parallel adders (parallel prefix adders) are preferred as they pre-compute the carry. The parallel prefix adders are KS adder (kogge-stone), SKS adder (sparse kogge-stone), Spanning tree and Brentkung adders. These adders are designed and compared by using power and delay constraints. Simulation and Synthesis process is performed on these adders using by Model sim6.4b, Xilinx ISE9.2i. Keywords Adders, KS adder, RCA, Simulation, SKS adder, Synthesis. I. INTRODUCTION In processors (DSP) and microprocessor data path units, adder is an important element. As such, extensive research continues to be focused on improving the power-delay performance of the adder. In VLSI implementations, parallel adders are known to have the best performance. Reconfigurable logic like Field Programmable Gate Arrays (FPGAs) has been gaining more popularity in recent years because it offers improved performance in terms of speed and power over DSP-based and microprocessor-based solutions, for many practical designs involving mobile DSP and telecommunications applications and a significant reduction in development time and cost over Application Specific Integrated Circuit (ASIC) designs. The power advantage is important with the growing popularity of mobile and portable electronics, which make extensive use of DSP functions. However, because of the structure of the configurable logic and routing resources in FPGAs, parallel-prefix adders will have a different performance than VLSI implementations [1]. In particular, most modern FPGAs employ a fast-carry chain which optimizes the carry path for the simple Ripple Carry Adder (RCA). V.Krishna Kumari is currently pursuing M.Tech, Embedded systems in department of electronics and communication engineering in JNTUK University, India, PH-9032247973. E-mail: kkecm@yahoo.com Y.Sri Chakrapani is currently working as Associate Professor in department of electronics and communication engineering,gudlavalleru engineering College,E-mail: srichakrapani@gmail.com Dr. M.Kamaraju is currently working as Pprofessor & HOD in electronics and communication engineering,gudlavalleru engineering College, E-mail:madduraju@yahoo.com Ripple carry adder is the cascade of full adders which performs the addition operation but the only drawback is propagation delay as the carry has to ripple from stage to stage. So in order to reduce this delay many adders came into existence. Some of them are Carry skip and Carry look ahead adders which reduce the delay by precomputing the carry. Later Parallel Prefix adders are preferred in order to reduce the delay as they will perform pre-computation and postcomputation. In this paper, some of the tree based adder structures are characterized and compared with the Ripple Carry Adder (RCA) and the Carry Skip Adder (CSA). Finally, some conclusions and suggestions for improving FPGA designs to enable better tree-based adder performance are given to meet the present day constraints to reduce the delay and speed up the computation. II. CARRY-TREE ADDER DESIGNS Parallel-prefix adders are also known as carry-tree adders. They pre-compute the propagate and generate signals. These signals are variously combined using the fundamental carry operator (fco)[2]. (gl, pl) ο (gr, pr) = (gl + pl gr, pl pr) (1) These operators can be combined in different ways to form various adder structures by the associative property of the fco,. For, example the four-bit carry-look ahead (CLA) generator is given by: c4 = (g4, p4) ο [ (g3, p3) ο [(g1, p1) ο (g2, p2)] ] (2) A simple rearrangement of the order of operations allows parallel operation, which results in a most efficient tree structure for this four bit example: c4 = [(g4, p4) ο (g3, p3)] ο [(g2, p2 ) ο (g1, p1)] (3) It is readily apparent that a key advantage of the treestructured adder is that the critical path due to the carry delay is on the order of log 2N for an N-bit wide adder. Various

International Journal of Scientific & Engineering Research, Volume 4, Issue 10, October-2013 1503 families of adders arise from the arrangement of the prefix network. For this study, the focus is on the Kogge-Stone adder, known for having minimal logic depth and fanout (see Fig 1). Here we designate BC as the black cell which generates the ordered pair in equation (1); the gray cell (GC) generates the left signal only. The interconnect area is known to be high, but for an FPGA with large routing overhead to begin with, this is not as important as in a VLSI implementation. The regularity of the Kogge Stone prefix tree network has built in redundancy which has implications for fault-tolerant designs. The sparse Kogge-Stone adder, shown in Fig 1, is also studied. This hybrid design fulfills the summation process with a 4 bit RCA allowing the carry prefix network to be simplified[3].the dark shaded square box is considered as black and node and the dotted square box is considered as gray node of parallel adder. This usage of a fast carry-chain for the RCA in FPGA, it is interesting to compare the performance of this adder with the Sparse Kogge -Stone and regular Kogge-Stone adders. 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15:14 14:13 13:12 12:11 11:10 10:9 9:8 8:7 7:6 6:5 5:4 4:3 3:2 2:1 1:0 15:12 14:11 13:10 12:9 11:8 10:7 9:6 8:5 7:4 6:3 5:2 4:1 3:0 2:0 15:8 14:7 13:6 12:5 11:4 10:3 9:2 8:1 7:0 6:0 5:0 4:0 15:014:013:0 12:011:010:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 Fig. 2. Spanning Tree Carry Look ahead Adder (16 bit) 16 bit Brent Kung adder. III. RELATED WORK Fig. 1. 16 bit Kogge-Stone adder and Sparse 16-bit Kogge-Stone adder Another carry-tree adder known as the spanning tree and brent kung adders as shown in Fig 2 and Fig 2 are examined. On the Xilinx 4000 series FPGAs, the ripple carry adder and carry-skip adders, only an optimized form of the carry-skip adder performance is more better than the ripple carry adder when the adder operands were above 56 bits [4]. A study of adders yielded similar results when implemented on the Xilinx Virtex II. In the authors considered several parallel prefix adders implemented on a Xilinx Virtex 5 FPGA. It is found that the normal RCA adder is superior to the parallel prefix designs because the RCA can take advantage of the fast carry chain This study focuses on carry-tree adders implemented on a FPGA of Xilinx Spartan 3E. The distinctive contributions of this paper are two-folded. In the first, we consider tree-based adders and a hybrid form which combines a tree structure with a ripple-carry design. The Kogge-Stone adder is taken as a representative of the former type and the sparse Kogge- Stone and spanning tree adder are representative of the latter category[5]. Second, this paper considers the practical issues involved in testing the adders and provides actual measurement data to compare with simulation results. The previous works mentioned above all rely upon the synthesis reports from the FPGA place and route software for their

International Journal of Scientific & Engineering Research, Volume 4, Issue 10, October-2013 1504 results. In addition with being able to compare the simulation data with measured data using a high-speed logic analyzer, our results present a different perspective in terms of both results and types of adders. The adders to be studied were designed with varied bit widths up to 128 bits and coded in VHDL. The functionality of the designs was verified via simulation with Model Sim 6.4b. The Xilinx ISE 10.1i software was used to synthesize the designs onto the Spartan 3E FPGA. In order to effectively test for the critical delay, two steps can be taken. First, a memory block (labeled as ROM in the figure below) can be instantiated on the FPGA using the Core Generator to allow arbitrary patterns of inputs to be applied to the adder design. At each adder output, the multiplexer selects whether to include or not to include the adder in the measured results,. A switch on the FPGA board was wired to the select pin of the multiplexers. This allows measurements to be made to deduct out the delay due to the memory, the multiplexers, and interconnect (both external cabling and internal routing) [6]. bits are taken as inputs and S and C are taken as output Intermediate signals are considered as G, P, X. In the simulation process of spanning adder, a and b of 16 bits are taken as inputs and Sum and C are taken as output and the intermediate signals are g1,p1,x4,x8 and x12. In the simulation process of brent kung adder, a and b of 16 bits are taken as inputs and Sum and C. Finally adders like carryskip and ripple carry adders are compared with koggestone, sparse kogge, spanning tree and brent kung adders in terms of delay and power and are tabulated in the table1 given below. V. SIMULATION AND SYNTHESIS REPORT IV. IMPLEMENTATION Xing and Yu [7] noted that delay models and cost analysis for designs developed for VLSI technology does match the second, the parallel prefix network was analyzed to directly to FPGA designs. They compared the design of determine if a specific pattern could be used to extract the worst case delay. Utilizatio Considering the structure of the Generate-Propagate (GP) Logic Utilization Used Available n blocks (i.e., the BC and GC cells), we were able to develop the Number of Slices 18 256 7% following scheme. 32 512 6% If we arbitrarily assign the (g, p) ordered pairs the values (1, 0) = True and (0, 1) = False, then the table is self-contained and forms an OR truth table. If both inputs to the GP block are 50 88 56% false, then the output will be false; conversely, the output is true if both the inputs are true, then. Hence, an input pattern that alternates between generating the (g, p) pairs of (1, 0) and Fig.3: Ripple carry adder simulated wave form RCA (0, 1) will force its GP pair block to alternate states. Likewise, it is easily seen that the GP blocks being fed by its predecessors will also change their states. Therefore, this scheme will ensure that a worse case delay will be generated in the parallel prefix network since every block will be active. In order to ensure this scheme works, the parallel prefix adders were synthesized with the Keep Hierarchy design setting turned on (otherwise, the FPGA compiler attempts to reorganize the logic assigned to each LUT). With this option turned on, it ensures that each GP block is mapped to one LUT, preserving the basic parallel prefix structure, and ensuring that this test strategy is effective for determining the critical delay. The designs were also synthesized for speed rather than area optimization [8]. In the simulation process of kogge stone adder, a and b of 16 bits are taken as inputs and S and Cout are taken as output Intermediate signals are G,C, l, m, q, r, s, t, v. In the simulation process of sparse kogge stone adder, a and b of 16 Logic Utilization Used Available Utilizatio

International Journal of Scientific & Engineering Research, Volume 4, Issue 10, October-2013 1505 n Number of Slices 22 256 8% 39 512 7% 50 88 56% 51 512 9% 65 88 73% Fig.6: Sparse kogge stone simulated wave form SKS Fig.4: Carry Skip adder simulated wave form Carry skip Number of Slices 18 256 7% Number of Slices 21 256 8% 32 512 6% 36 512 7% 65 88 73% 50 88 56% Fig.5: Kogge stone simulated wave form KS device utilization. Fig.7: Spanning tree simulated wave form Spanning tree Number of Slices 29 256 11% Logic Utilization Used Available Utilization Number of Slices 24 256 9% 43 512 8%

International Journal of Scientific & Engineering Research, Volume 4, Issue 10, October-2013 1506 50 88 56% Fig.8: Brent kung simulated wave form Brent kung ACKNOWLEDGMENT It is very glad to take this opportunity to thank the Professor and Head of Department., Dr. M. Kamaraju, Gudlavalleru Engineering College for taking keen interest and for providing encouragement in my project work and I wish to express my deep gratitude and sincere thanks to my mentor guide Mr. Y. Sri Chakrapani for taking interest and for providing his cooperation and guidance, which has helped me immensely, in completing this research work. I would like to thank the teaching and non-teaching Staff of Gudlavalleru engineering college for encouraging us for this research work. REFERENCES VI DELAY AND POWER ANALYSIS The synthesis of the above mentioned adders is done in Xilinx9.2i. The delay & power parameters are being observed and they are being compared between adders and parallel adders and then tabulated. Table.1: Comparison of adders in terms of delay and power ADDER DELAY POWER Ripple carry adder 24.68ns 28mw Carry skip adder 25.006ns 29mw Kogge stone adder 20.262ns 20mw Sparse kogge stone adder 22.56ns 19mw Spanning tree adder 20.16ns 19mw Brent kung adder 18.05ns 20mw VII CONCLUSION Computers, pp. 2213 7, 2003. From the results, it is concluded that the Brent kung adder provides better performance than Ripple Carry adder. and Carry Skip adder. The power-delay performance can be increased with Sparse koggestone adder and Spanning tree adder. This is important where large number of adders to be used in precision arithmetic and cryptographic applications where the addition of numbers on the order of a thousand bits is very common. Because the adder is often the critical element which determines to a large part the cycle time and power dissipation for many digital signal processing and cryptographical implementations, For future FPGA designs it would be worthwhile to include an optimized carry path to enable tree-based adder. So these parallel prefix adders are the best choice in many VLSI application where power is the main constraint [1] K. Vitoroulis and A. J. Al-Khalili, Performance of Parallel Prefix Adders Implemented with FPGA technology, IEEE Northeast Workshop on Circuits and Systems, pp. 498-501, Aug. 2007. [2] P. M. Kogge and H. S. Stone, A Parallel Algorithm for the Efficient Solution of a General Class of Recurrence Equations, IEEE Trans. on Computers, Vol. C-22, No 8, August 1973. [3] R. P. Brent and H. T. Kung, A regular layout for parallel adders, IEEE Trans. Comput., vol. C-31, pp. 260-264, 1982. [4] N. H. E. Weste and D. Harris, CMOS VLSI Design, 4th edition, Pearson Addison-Wesley, 2011. [5] T. Lynch and E. E. Swartzlander, A Spanning Tree Carry Lookahead Adder, IEEE Trans. on Computers, vol. 41, no. 8, pp. 931-939, Aug. 1992 [6] P. Ndai, S. Lu, D. Somesekhar, and K. Roy, Fine- Grained Redundancy in Adders, Int. Symp. on Quality Electronic Design, pp. 317-321, March 2007. [7] S. Xing and W. W. H. Yu, FPGA Adders: Performance Evaluation and Optimal Design, IEEE Design & Test of Computers, vol. 15, no. 1, pp. 24-29, Jan. 1998. [8] D. Harris, A Taxonomy of Parallel Prefix Networks, in Proc. 37th Asilomar Conf. Signals Systems and