An Efficient Carry Select Adder with Less Delay and Reduced Area Application

Similar documents
Design of an Efficient 128-Bit Carry Select Adder Using Bec and Variable csla Techniques

Design and Verification of Area Efficient High-Speed Carry Select Adder

A High Speed Design of 32 Bit Multiplier Using Modified CSLA

Design and Characterization of High Speed Carry Select Adder

Low-Power And Area-Efficient Of 128-Bit Carry Select Adder

Hard ware implementation of area and power efficient Carry Select Adder using reconfigurable adder structures

Implementation of CMOS Adder for Area & Energy Efficient Arithmetic Applications

Implementation of CMOS Adder for Area & Energy Efficient Arithmetic Applications

Area Delay Power Efficient Carry-Select Adder

Implementation of Regular Linear Carry Select Adder with Binary to Excess-1 Converter

ASIC IMPLEMENTATION OF 16 BIT CARRY SELECT ADDER

Design of Delay Efficient Carry Save Adder

Area Delay Power Efficient Carry-Select Adder

FPGA Implementation of Efficient Carry-Select Adder Using Verilog HDL

Area Delay Power Efficient Carry Select Adder

Area-Delay-Power Efficient Carry-Select Adder

Anisha Rani et al., International Journal of Computer Engineering In Research Trends Volume 2, Issue 11, November-2015, pp.

Carry Select Adder with High Speed and Power Efficiency

Implementation of 64-Bit Kogge Stone Carry Select Adder with ZFC for Efficient Area

International Journal of Innovative Research in Computer Science & Technology (IJIRCST) ISSN: , Volume-3, Issue-5, September-2015

DESIGN AND IMPLEMENTATION OF APPLICATION SPECIFIC 32-BITALU USING XILINX FPGA

Implementation of Ripple Carry and Carry Skip Adders with Speed and Area Efficient

A New Architecture Designed for Implementing Area Efficient Carry-Select Adder

DESIGN AND PERFORMANCE ANALYSIS OF CARRY SELECT ADDER

A Novel Design of 32 Bit Unsigned Multiplier Using Modified CSLA

A Novel Efficient VLSI Architecture for IEEE 754 Floating point multiplier using Modified CSA

An Optimized Montgomery Modular Multiplication Algorithm for Cryptography

Implimentation of A 16-bit RISC Processor for Convolution Application

Implementation of Efficient Modified Booth Recoder for Fused Sum-Product Operator

Design and Implementation of CVNS Based Low Power 64-Bit Adder

16 BIT IMPLEMENTATION OF ASYNCHRONOUS TWOS COMPLEMENT ARRAY MULTIPLIER USING MODIFIED BAUGH-WOOLEY ALGORITHM AND ARCHITECTURE.

High Speed Han Carlson Adder Using Modified SQRT CSLA

Performance of Constant Addition Using Enhanced Flagged Binary Adder

DESIGN AND IMPLEMENTATION OF ADDER ARCHITECTURES AND ANALYSIS OF PERFORMANCE METRICS

FPGA Implementation of a High Speed Multiplier Employing Carry Lookahead Adders in Reduction Phase

AREA-DELAY-POWER EFFICIENT CARRY SELECT ADDER

VLSI Implementation of Adders for High Speed ALU

Keywords: Soft Core Processor, Arithmetic and Logical Unit, Back End Implementation and Front End Implementation.

Power Optimized Programmable Truncated Multiplier and Accumulator Using Reversible Adder

16 Bit Low Power High Speed RCA Using Various Adder Configurations

International Journal of Research in Computer and Communication Technology, Vol 4, Issue 11, November- 2015

DESIGN AND IMPLEMENTATION 0F 64-BIT PARALLEL PREFIX BRENTKUNG ADDER

Design and Implementation of VLSI 8 Bit Systolic Array Multiplier

Designing and Characterization of koggestone, Sparse Kogge stone, Spanning tree and Brentkung Adders

High Performance and Area Efficient DSP Architecture using Dadda Multiplier

Design of Delay Efficient Distributed Arithmetic Based Split Radix FFT

Low Power Circuits using Modified Gate Diffusion Input (GDI)

Study, Implementation and Survey of Different VLSI Architectures for Multipliers

International Journal of Scientific & Engineering Research, Volume 4, Issue 10, October ISSN

A High Performance Reconfigurable Data Path Architecture For Flexible Accelerator

VLSI Arithmetic Lecture 6

Digital Circuit Design and Language. Datapath Design. Chang, Ik Joon Kyunghee University

HIGH-PERFORMANCE RECONFIGURABLE FIR FILTER USING PIPELINE TECHNIQUE

An Efficient Flexible Architecture for Error Tolerant Applications

An Efficient Design of Sum-Modified Booth Recoder for Fused Add-Multiply Operator

IMPLEMENTATION OF AN ADAPTIVE FIR FILTER USING HIGH SPEED DISTRIBUTED ARITHMETIC

AN EFFICIENT REVERSE CONVERTER DESIGN VIA PARALLEL PREFIX ADDER

Detection Of Fault In Self Checking Carry Select Adder

II. MOTIVATION AND IMPLEMENTATION

Date Performed: Marks Obtained: /10. Group Members (ID):. Experiment # 09 MULTIPLEXERS

PROJECT REPORT IMPLEMENTATION OF LOGARITHM COMPUTATION DEVICE AS PART OF VLSI TOOLS COURSE

DESIGN AND IMPLEMENTATION OF VLSI SYSTOLIC ARRAY MULTIPLIER FOR DSP APPLICATIONS

CHAPTER 3 METHODOLOGY. 3.1 Analysis of the Conventional High Speed 8-bits x 8-bits Wallace Tree Multiplier

ARITHMETIC operations based on residue number systems

Design and Analysis of Kogge-Stone and Han-Carlson Adders in 130nm CMOS Technology

An Efficient Fused Add Multiplier With MWT Multiplier And Spanning Tree Adder

International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering

High Speed Radix 8 CORDIC Processor

Arithmetic Circuits. Nurul Hazlina Adder 2. Multiplier 3. Arithmetic Logic Unit (ALU) 4. HDL for Arithmetic Circuit

Power and Area Efficient Implementation for Parallel FIR Filters Using FFAs and DA

Implementation of Floating Point Multiplier Using Dadda Algorithm

the main limitations of the work is that wiring increases with 1. INTRODUCTION

VARUN AGGARWAL

Case Study on DiaHDL: A Web-based Electronic Design Automation Tool for Education Purpose

An Efficient Hybrid Parallel Prefix Adders for Reverse Converters using QCA Technology

I. Introduction. India; 2 Assistant Professor, Department of Electronics & Communication Engineering, SRIT, Jabalpur (M.P.

Implementation of a Fast Sign Detection Algoritm for the RNS Moduli Set {2 N+1-1, 2 N -1, 2 N }, N = 16, 64

An Area Efficient Mixed Decimation MDF Architecture for Radix. Parallel FFT

A High Speed Binary Floating Point Multiplier Using Dadda Algorithm

A High-Speed FPGA Implementation of an RSD-Based ECC Processor

Parallel-Prefix Adders Implementation Using Reverse Converter Design. Department of ECE

IMPLEMENTATION OF HIGH SPEED AND LOW POWER RADIX-4 8*8 BOOTH MULTIPLIER IN CMOS 32nm TECHNOLOGY

VHDL IMPLEMENTATION OF FLOATING POINT MULTIPLIER USING VEDIC MATHEMATICS

OPTIMIZING THE POWER USING FUSED ADD MULTIPLIER

Area And Power Optimized One-Dimensional Median Filter

HIGH PERFORMANCE FUSED ADD MULTIPLY OPERATOR

An FPGA based Implementation of Floating-point Multiplier

An RNS Based Montgomery Modular Multiplication Algorithm For Cryptography

Paper ID # IC In the last decade many research have been carried

Optimized Design and Implementation of a 16-bit Iterative Logarithmic Multiplier

Implementation of Discrete Wavelet Transform for Image Compression Using Enhanced Half Ripple Carry Adder

International Journal for Research in Applied Science & Engineering Technology (IJRASET) IIR filter design using CSA for DSP applications

High Speed Multiplication Using BCD Codes For DSP Applications

Pipelined Quadratic Equation based Novel Multiplication Method for Cryptographic Applications

High-Performance Carry Chains for FPGA s

Design of Two Different 128-bit Adders. Project Report

FPGA Implementation of ALU Based Address Generation for Memory

Srinivasasamanoj.R et al., International Journal of Wireless Communications and Network Technologies, 1(1), August-September 2012, 4-9

Sum to Modified Booth Recoding Techniques For Efficient Design of the Fused Add-Multiply Operator

Online Testing of Word-oriented RAMs by an Accumulator-based Compaction Scheme in Symmetric Transparent Built-In Self Test (BIST)

Transcription:

An Efficient Carry Select Adder with Less Delay and Reduced Area Application Pandu Ranga Rao #1 Priyanka Halle #2 # Associate Professor Department of ECE Sreyas Institute of Engineering and Technology, Hyderabad, India. Abstract Design of area, high speed and power-efficient data path logic systems forms the largest areas of research in VLSI system design. The addition speed is limited by the time necessary to transmit a carry through the adder. Carry Select Adder (CSLA) is one of the fastest adders used in several dataprocessing processors to perform fast arithmetic purpose. From the configuration of the CSLA, it is clear that there is scope for decreasing the area and delay in the CSLA. This work uses a simple and an efficient gate-level modification which drastically reduces the area and delay of the CSLA. Based on this modification 16, 32, 64 and 128-bit square-root Carry Select Adder (SQRT CSLA) architectures have been improved and compared with the regular SQRT CSLA architecture. The proposed design has compact area and delay to a great extent when compared with the regular SQRT CSLA. This work estimates the performance of the planned designs with the regular designs in terms of delay, area and synthesis are implemented in Xilinx FPGA. The results analysis shows that the proposed CSLA structure is better than the regular SQRT CSLA. Keywords (ASIC), area-efficient, CSLA, low delay. I. INTRODUCTION Reduced area and high speed data path logic systems are the major areas of research in VLSI system design. High-speed addition and multiplication has always been a fundamental necessity of high-performance processors and systems. In digital adders, the speed of addition is partial by the time necessary to propagate a carry through the adder. The sum for each bit position in an elementary adder is generated serially only after the previous bit position has been summed and a carry propagated into the next position. There are several types of adder designs available (RCA, CLAA, CSA, CSA) which have its own advantages and disadvantages. The main speed limitation in any adder is in the production of carries and many authors considered the addition problem. To solve the carry propagation delay CSLA is developed which drastically decreases the area and delay to a great extent. The CSLA is used in many computational systems design to moderate the problem of carry propagation delay by separately generating multiple carries and then select a carry to generate the sum. It uses independent ripple carry adders (for Cin=0 and Cin=1) to generate the resultant sum. However, the Regular CSLA is not area and speed efficient because it uses multiple pairs of Ripple Carry Adders (RCA) to generate partial sum and carry by considering carry input. The final sum and carry are selected by the multiplexers (mux). Due to the use of two independent RCA the area will enlarge which leads an increase in delay. To overcome the above problem, the basic idea of the planned work is to use n-bit binary to excess-1 code converters (BEC) to improve the speed of addition. This logic can be replaced in RCA for Cin=1 to further improves the speed and thus decreases the delay. Using Binary to Excess-1 Converter (BEC) instead of RCA in the regular CSLA will achieve lower area, delay which speeds up the addition operation. The major advantage of this BEC logic comes from the lesser number of logic gates than the Full Adder (FA) structure because the number of gates used will be decreased. This work in brief is structured as follows. Section II deals with the delay and area evaluation methodology of the basic adder blocks and its corresponding delay and area values. Section III deals with the structure and function of BEC logic and its corresponding function table and logic equations. Section IV presents the architecture of the Regular CSLA of 128-bits. This SQRT CSLA has been developed using ripple carry adders and multiplexers. The architecture of the Modified SQRT CSLA is presented in Sections V. In section ISSN: 2231-5381 http://www.ijettjournal.org Page 3766

VI implementation methodologies and corresponding design tools are explained and finally the paper is concluded in section VIII. II. BASIC ADDER BLOCK The adder block with a Ripple carry adder, BEC and Mux is explained in this section. In this we calculate and explain the delay & area using the theoretical approach and explain how the delay and area effect the total implementation. The AND, OR, and Inverter (AOI) implementation of an XOR gate is shown in Fig. 1. The delay and area evaluation methodology considers all gates to be made up of AND, OR, and Inverter, each having delay equal to 1 unit and area equal to 1 unit. We then add up the number of gates in the longest path of a logic block that contributes to the maximum delay. The area evaluation is done by counting the total number of AOI gates necessary for each logic block. Based on this approach, the blocks of 2:1 mux, Half Adder (HA), and FA are evaluated in Table I. Fig 1: Delay and area evalution of xor TABLE 1: DELAY AND AREA EVALUATION OF CSLA Design Delay Area XOR 3 5 2:1 MUX 3 4 Half Adder 3 6 Full Adder 6 13 Fig 2: 8-bit BEC with 16:8 mux Fig 2 shows the basic 8-bit addition procedure which includes 8-bit data, a 8-bit BEC logic and 16:8 mux. The addition operation is performed for Cin=0 and for Cin=1.For Cin=0 the addition is performed using ripple carry adder and for Cin=1 the operation is performed using 8-bit BEC (replacing the RCA for Cin=1). The resultant is selected based on Carry in signal from the earlier group. The total delay depends on mux delay and Cin signal from earlier group. III. BINARY TO EXCESS-1 CONVERTER (BEC) The basic work is to use Binary to Excess-1 Converter (BEC) in the regular CSLA to get lower area and improved speed of operation. This logic is replaced in RCA with Cin=1. This logic can be implemented for different bits which are used in the modified design. The major advantage of this BEC logic comes from the fact that it uses lesser number of logic gates than the n-bit Full Adder (FA) structure. As stated above the main idea of this work is to use BEC instead of the RCA with Cin=1 in order to decrease the area and increase the speed of operation in the regular CSLA to obtain modified CSLA. To replace the n-bit RCA, an n+1 bit BEC logic is required. The structure and the function table of a 8-bit BEC are shown in Figure 2 respectively. ISSN: 2231-5381 http://www.ijettjournal.org Page 3767

Figure 3: 6-binary to excess-1 converter TABLE 2: FUNCTION TABLE OF THE 6-BIT BEC B[5:0] X[5:0] 000000 000001 000001 000010 000010 000011 111111 000000 The Boolean expressions for the 6-bit BEC logic are expressed below X0 = ~B0 X1 = B0^B1 X2 = B2^ (B0 & B1) X3 = B3^ (B0 & B1 & B2) X4 = B4^ (B0 & B1 & B2 & B3) X5 = B5^ (B0 & B1 & B2 & B3 & B4). X6 = B6^ (B0 & B1 & B2 & B3 & B4 & B5). X7 = B^ (B0 & B1 & B2 & B3 & B4 & B5 & B6). Figure 4: Architecture of Regular 128-bit SQRT CSLA IV. ARCHITECTURE OF REGULAR 128-BIT SQRT CSLA A 16-bit carry select adder can be designed in two different sizes namely uniform block size and variable block size. Similarly a 32, 64 and 128-bit can also be developed in two modes of different block sizes. Ripple-carry adders are the simplest and most compact full adders, but their performance is limited by a carry that must spread from the least-significant bit to the most-significant bit. The various 16, 32, 64 and 128-bit CSLA can also be developed by using ripple carry adders. The speed of a carry-select adder can be enhanced upto 40% to 90%, by performing the additions in parallel, and reducing the maximum carry delay. Fig 4 shows the Regular structure of 128-bit SQRT CSLA. It includes many ripple carry adders of variable sizes which are separated into groups. Group 0 contains 2-bit RCA which contains only one ripple carry adder which adds the input bits and the input carry and results to sum [1:0] and the carry out. The carry out of the Group 0 which acts as the selection input to mux which is in group 1, selects the result from the consequent RCA (Cin=0) or RCA (Cin=1). Similarly the remain groups will be selected depending on the Cout from the previous groups. In Regular CSLA, there is only one RCA to achieve the addition of the least significant bits [1:0]. The remaining bits (other than LSBs), the addition is performed by using two RCAs corresponding to the one assuming a carry-in of 0, the other a carry-in of 1 within a group. In a group, there are two RCAs that receive the same data inputs but different Cin. The upper adder has a carry-in of 0, the lower adder a carry-in of 1. The actual Cin from the previous sector selects one of the two RCAs. That is, as shown in the Fig.4, if the carry-in is 0, the sum and carry-out of the upper RCA is selected, and if the carry-in is 1, the sum and carry-out of the lower RCA is selected. For this Regular CSLA architecture, the execution code, for the Full Adders and Multiplexers of different sizes (6:3, 8:4, 10:5 up to 24:11) were designed initially. The regular 64-bit, 128-bit CSLA were implemented by calling the ripple carry adders and all multiplexers. ISSN: 2231-5381 http://www.ijettjournal.org Page 3768

V. ARCHITECTURE OF MODIFIED 128-BIT SQRT CSLA This architecture is alike to regular 128-bit SQRT CSLA, the only change is that, we change RCA with Cin=1 among the two available RCAs in a group with a BEC. This BEC has a feature that it can perform the alike operation as that of the replaced RCA with Cin=1. Fig 5 shows the Modified block diagram of 128-bit SQRT CSLA. The number of bits necessary for BEC logic is 1 bit more than the RCA bits. The modified block diagram is also separated into various groups of variable sizes of bits with each group having the ripple carry adders, BEC and corresponding mux. As shown in the Fig.5, Group 0 include one RCA only which is having input of lower significant bit and carry in bit and produces result of sum[1:0] and carry out which is acting as mux selection line for the next group, also the procedure continues for higher groups but they includes BEC logic instead of RCA with Cin=1. Based on the consideration of delay values, the arrival time of selection input C1 of 8:3 mux is before than the sum of RCA and BEC. For remain groups the selection input arrival is later than the RCA and BEC. Thus, the sum1 and c1 (output from mux) are depending on mux and results computed by RCA and BEC respectively. The sum 2 depends on c1 and mux. For the remain parts the arrival time of mux selection input is always larger than the arrival time of data inputs from the BEC s. Thus, the delay of the remaining MUX depends on the arrival time of mux selection input and the mux delay. In this Modified CSLA architecture, the implementation code for Full Adder and Multiplexers of 6:3, 8:4, and 10:5 up to 24:11 were designed. The design code for the BEC was designed by using NOT, XOR and AND gates. Then 2, 3, 4, 5 up to 11-bit ripple carry adder was designed. Figure 5: Architecture of Modified 128-bit SQRT CSLA TABLE 3: COMPARISON VALUES OF 16 BIT, 32 BIT,64 BIT AND128 BIT CSLA Sr. Delay Area (No. Adders No (ns) of LUTs) 1 16 bit Regular 18.42 43 Modified 16.56 47 2 32 bit Regular 25.53 92 Modified 21.41 101 3 64 bit Regular 42.17 189 Modified 25.95 206 4 128 bit Regular 56.37 422 Modified 40.56 439 VI. RESULTS The implemented design in this work has been simulated using Verilog-HDL. The adders (of various size 16, 32, 64 and 128) are designed and simulated with Modelsim. All the V files (Regular and Modified) are also simulated in Modelsim and corresponding results are compared. After simulation the different size codes are synthesized using Xilinx ISE 9.1i. The simulated V files are imported into the synthesized tool and resultant values of delay and area are noted. The synthesized reports include area and delay values for different sized adders. The similar design flow is followed for both the regular and modified SQRT CSLA of different sizes. Table 3 shows the comparison of regular and modified CSLA of various bits which includes Delay and area comparisons. From the table it is clear that the delay decreases for 16-bit modified method when compared with regular method. Similarly the table also shows the comparison for the various 32, 64, and 128 bits. The relative values of areas shows that the number of LUT will be more for modified the 16, 32 and 64. This value reduces gradually for 128 bits. For 256 bits the value almost equal to regular method ISSN: 2231-5381 http://www.ijettjournal.org Page 3769

which will decreases more for still higher order bits. Thus the modified method reduces the delay and also area to a great extent. VII. CONCLUSION An efficient approach is planned in this paper to decrease the area and delay of SQRT CSLA architecture. The reduction in the number of gates is obtained by simply replacing the RCA with BEC in the structure. The compared results shows that the modified SQRT CSLA has a slightly bigger area for lower order bits which further reduces for higher order bits. The delay is reduced to a great extent with the modified SQRT CSLA. Thus the results shows that using modified method the area and delay will reduce thus leads to good alternative for adder implementation for many processors. The modified CSLA architecture is therefore low area and high speed approaches for VLSI hardware implementation. ACKNOWLEDGMENT Mrs. Priyanka Halle would like to thank Mr. Panduranga rao, Associate professor and HOD of E&TC Department who had been guiding throughout the project and supporting me in giving technical ideas about the paper and motivating me to complete the work effectively and successfully. REFERENCES [1] Bedrij, O. J., (1962), Carry-select adder, IRE Trans. Electron. Comput. Pp.340 344. [2] Ramkumar,B., Kittur, H.M. and Kannan,P. M.,(2010 ), ASIC implementation of modified faster carry save adder, Eur. J. Sci. Res., vol. 42, no. 1,pp.53 58. [3] Kim,Y. and Kim,L.-S.,(May2001), 64-bit carry-select adder with reduced area, Electron Lett., vol. 37, no. 10, pp. 614 615. [5] Ceiang, T. Y. and Hsiao. J., (Oct 1998), Carry-select adder using single ripple carry adder, Electron. Lett., vol. 34, no. 22, pp. 2101 2103 [6] He, Y., Chang, C. H. and Gu, J., (2005), An A rea efficient 64-bit square root carry-select adder for low power application, in Proc. IEEE Int. Symp.Circuits Syst. vol. 4, pp. 4082 4085. [7] E. Abu-Shama and M. Bayoumi, A New cell for low power adders, in Proc.Int.Midwest Symp. Circuits and Systems, 1995, pp. 1014 1017 [8] Samir Palnitkar, Verilog Hdl, A guide to Digital Design and Synthesis ISSN: 2231-5381 http://www.ijettjournal.org Page 3770