Area-Delay-Power Efficient Carry-Select Adder

Similar documents
Area Delay Power Efficient Carry-Select Adder

ASIC IMPLEMENTATION OF 16 BIT CARRY SELECT ADDER

Area Delay Power Efficient Carry Select Adder

Carry Select Adder with High Speed and Power Efficiency

FPGA Implementation of Efficient Carry-Select Adder Using Verilog HDL

Design of Delay Efficient Carry Save Adder

Area Delay Power Efficient Carry-Select Adder

A New Architecture Designed for Implementing Area Efficient Carry-Select Adder

International Journal of Innovative Research in Computer Science & Technology (IJIRCST) ISSN: , Volume-3, Issue-5, September-2015

AREA-DELAY-POWER EFFICIENT CARRY SELECT ADDER

An Efficient Carry Select Adder with Less Delay and Reduced Area Application

Design and Characterization of High Speed Carry Select Adder

Hard ware implementation of area and power efficient Carry Select Adder using reconfigurable adder structures

A High Speed Design of 32 Bit Multiplier Using Modified CSLA

Design of an Efficient 128-Bit Carry Select Adder Using Bec and Variable csla Techniques

Design and Verification of Area Efficient High-Speed Carry Select Adder

Anisha Rani et al., International Journal of Computer Engineering In Research Trends Volume 2, Issue 11, November-2015, pp.

Low-Power And Area-Efficient Of 128-Bit Carry Select Adder

Implementation of CMOS Adder for Area & Energy Efficient Arithmetic Applications

A Novel Design of 32 Bit Unsigned Multiplier Using Modified CSLA

Implementation of CMOS Adder for Area & Energy Efficient Arithmetic Applications

DESIGN OF HYBRID PARALLEL PREFIX ADDERS

VARUN AGGARWAL

Implementation of Efficient Modified Booth Recoder for Fused Sum-Product Operator

Implementation of Ripple Carry and Carry Skip Adders with Speed and Area Efficient

Implementation of Regular Linear Carry Select Adder with Binary to Excess-1 Converter

OPTIMIZING THE POWER USING FUSED ADD MULTIPLIER

the main limitations of the work is that wiring increases with 1. INTRODUCTION

An Efficient Hybrid Parallel Prefix Adders for Reverse Converters using QCA Technology

DESIGN AND PERFORMANCE ANALYSIS OF CARRY SELECT ADDER

Design and Implementation of Signed, Rounded and Truncated Multipliers using Modified Booth Algorithm for Dsp Systems.

II. MOTIVATION AND IMPLEMENTATION

Sum to Modified Booth Recoding Techniques For Efficient Design of the Fused Add-Multiply Operator

VLSI Implementation of Adders for High Speed ALU

Paper ID # IC In the last decade many research have been carried

IMPLEMENTATION OF AN ADAPTIVE FIR FILTER USING HIGH SPEED DISTRIBUTED ARITHMETIC

An Efficient Design of Sum-Modified Booth Recoder for Fused Add-Multiply Operator

FPGA Implementation of a High Speed Multiplier Employing Carry Lookahead Adders in Reduction Phase

FPGA Implementation of Multiplier for Floating- Point Numbers Based on IEEE Standard

An Efficient Fused Add Multiplier With MWT Multiplier And Spanning Tree Adder

A Novel Efficient VLSI Architecture for IEEE 754 Floating point multiplier using Modified CSA

Parallel-Prefix Adders Implementation Using Reverse Converter Design. Department of ECE

High Performance and Area Efficient DSP Architecture using Dadda Multiplier

DESIGN AND IMPLEMENTATION 0F 64-BIT PARALLEL PREFIX BRENTKUNG ADDER

A Ripple Carry Adder based Low Power Architecture of LMS Adaptive Filter

High Speed Han Carlson Adder Using Modified SQRT CSLA

R. Solomon Roach 1, N. Nirmal Singh 2.

Effective Improvement of Carry save Adder

IMPLEMENTATION OF TWIN PRECISION TECHNIQUE FOR MULTIPLICATION

ISSN (Online), Volume 1, Special Issue 2(ICITET 15), March 2015 International Journal of Innovative Trends and Emerging Technologies

AN EFFICIENT REVERSE CONVERTER DESIGN VIA PARALLEL PREFIX ADDER

DIGITAL CIRCUIT LOGIC UNIT 9: MULTIPLEXERS, DECODERS, AND PROGRAMMABLE LOGIC DEVICES

Performance Analysis of 64-Bit Carry Look Ahead Adder

A Review of Various Adders for Fast ALU

Implementation of 64-Bit Kogge Stone Carry Select Adder with ZFC for Efficient Area

VLSI Implementation of Low Power Area Efficient FIR Digital Filter Structures Shaila Khan 1 Uma Sharma 2

16 Bit Low Power High Speed RCA Using Various Adder Configurations

Power and Area Efficient Implementation for Parallel FIR Filters Using FFAs and DA

Designing and Characterization of koggestone, Sparse Kogge stone, Spanning tree and Brentkung Adders

Injntu.com Injntu.com Injntu.com R16

COPY RIGHT. To Secure Your Paper As Per UGC Guidelines We Are Providing A Electronic Bar Code

Low-Area Low-Power Parallel Prefix Adder Based on Modified Ling Equations

Computer Architecture, IFE CS and T&CS, 4 th sem. Fast Adders

Parallel, Single-Rail Self-Timed Adder. Formulation for Performing Multi Bit Binary Addition. Without Any Carry Chain Propagation

ECE 341. Lecture # 6

HIGH PERFORMANCE FUSED ADD MULTIPLY OPERATOR

ISSN (Online)

CHAPTER 9 MULTIPLEXERS, DECODERS, AND PROGRAMMABLE LOGIC DEVICES

Delay Optimised 16 Bit Twin Precision Baugh Wooley Multiplier

An FPGA based Implementation of Floating-point Multiplier

ECE 331 Digital System Design

International Journal for Research in Applied Science & Engineering Technology (IJRASET) IIR filter design using CSA for DSP applications

OPTIMIZATION OF AREA COMPLEXITY AND DELAY USING PRE-ENCODED NR4SD MULTIPLIER.

Power Optimized Programmable Truncated Multiplier and Accumulator Using Reversible Adder

UNIT - V MEMORY P.VIDYA SAGAR ( ASSOCIATE PROFESSOR) Department of Electronics and Communication Engineering, VBIT

DESIGN AND IMPLEMENTATION OF APPLICATION SPECIFIC 32-BITALU USING XILINX FPGA

Digital Circuit Design and Language. Datapath Design. Chang, Ik Joon Kyunghee University

DUE to the high computational complexity and real-time

Overview. CSE372 Digital Systems Organization and Design Lab. Hardware CAD. Two Types of Chips

HIGH-PERFORMANCE RECONFIGURABLE FIR FILTER USING PIPELINE TECHNIQUE

Arithmetic Logic Unit. Digital Computer Design

JOURNAL OF INTERNATIONAL ACADEMIC RESEARCH FOR MULTIDISCIPLINARY Impact Factor 1.393, ISSN: , Volume 2, Issue 7, August 2014

Formulation for Performing Multi Bit Binary Addition using Parallel, Single-Rail Self-Timed Adder without Any Carry Chain Propagation

CHAPTER 3 METHODOLOGY. 3.1 Analysis of the Conventional High Speed 8-bits x 8-bits Wallace Tree Multiplier

ECE 341 Midterm Exam

Design of Low Power Digital CMOS Comparator

Programmable Logic Devices

ARITHMETIC operations based on residue number systems

An Optimized Montgomery Modular Multiplication Algorithm for Cryptography

A 65nm Parallel Prefix High Speed Tree based 64-Bit Binary Comparator

Code No: R Set No. 1

VLSI Design Of a Novel Pre Encoding Multiplier Using DADDA Multiplier. Guntur(Dt),Pin:522017

HIGH PERFORMANCE QUATERNARY ARITHMETIC LOGIC UNIT ON PROGRAMMABLE LOGIC DEVICE

Area Efficient, Low Power Array Multiplier for Signed and Unsigned Number. Chapter 3

Efficient Radix-10 Multiplication Using BCD Codes

An Efficient Flexible Architecture for Error Tolerant Applications

Study, Implementation and Survey of Different VLSI Architectures for Multipliers

Srinivasasamanoj.R et al., International Journal of Wireless Communications and Network Technologies, 1(1), August-September 2012, 4-9

Henry Lin, Department of Electrical and Computer Engineering, California State University, Bakersfield Lecture 7 (Digital Logic) July 24 th, 2012

Chapter 2 Basic Logic Circuits and VHDL Description

Transcription:

Area-Delay-Power Efficient Carry-Select Adder Shruthi Nataraj 1, Karthik.L 2 1 M-Tech Student, Karavali Institute of Technology, Neermarga, Mangalore, Karnataka 2 Assistant professor, Karavali Institute of Technology, Neermarga, Mangalore, Karnataka Abstract: This paper proposes on the logic operations in conventional carry select adder (CSLA) and binary to excess-1 converter (BEC)-based CSLA to study the data dependency and to identify redundant logic operations. The new logic formulations have been proposed by eliminating all the redundant logic operations present in conventional CSLA. In the proposed scheme, the carry-select operation is scheduled before calculation of the final-sum. Anticipating carry-words (corresponding to C in =0 and 1) and fixed C in bits are the two Bit- patterns used for logic optimization of carry-select and general units. An efficient CSLA design is obtained using optimized logic units. The proposed CSLA design involves significantly less area and delay than the proposed BEC-based CSLA. Due to small carry-output delay, the proposed CSLA design is good for square root (SQRT)-CSLA. As per the theoretical estimation the proposed SQRT-CSLA involves nearly 35% less area-delay-product (ADP) than the BEC-based SQRT-CSLA which is the best amongst the existing SQRT-CSLA designs on average for different bit-widths. FPGA synthesis result shows that, the BEC-based SQRT-CSLA design involves more Area-Delay Product and consumes more energy than the proposed SQRT-CLSA on average for different bitwidths. Keywords: Adder, BEC, Arithmetic unit, Area-Delay Product, MSB, Low power design. 1. Introduction High-speed data path logic systems are one of the most substantial areas of research in VLSI system design. High performance, low power and area-efficient systems are increasingly used in portable devices. A complex digital signal processing (DSP) system involves several adders. An adder is the main component of an arithmetic unit. A ripple carry adder uses a simple design, but carry propagation delay (CPD) is the main concern in this adder. Carry lookahead and carry select (CS) methods have been used to reduce the CPD of adders. A conventional carry select adder (CLSA) is an RCA-RCA configuration that generates a pair of sum words and outputcarry bits corresponding anticipated input-carry (C in =0 and 1) and selects one out of each pair for final-sum and finaloutput-carry. The CSLA is used in many computational systems to alleviate the problem of carry propagation delay by independently generating multiple carries and then select a carry to generate the sum. The CSLA is not area efficient because it uses multiple pairs of Ripple Carry Adders (RCA) to generate partial sum and carry. The basic idea of the proposed work is using n-bit binary to excess-1code converters (BEC) to improve the speed of addition. The proposed 16, 32 and 64-bit adders are compared with the conventional fast adders such as (CSA) and (CLA). The performance of the CSA can be improved by using BEC logic through customizing design and layout. The decrease in carry propagation delay will result in major enhancement of the speed of the adder and multiplier. CSL is implemented with dual ripple-carry adder (RCA) with the carry-in of 0 and 1, respectively. Based on this approach a 16, 32 and 64-bit adder architecture has been developed and compared with conventional fast adder architectures. This work identifies the performance of proposed designs in terms of delay-areapower through custom design and layout in process technology. The result analysis shows that the proposed architectures have better performance in reduction of carry propagation delay than contemporary architectures. 2. Block Diagram of Conventional CSLA and BEC- based CLSA The main objective is to identify redundant logic operations and data dependence. The CLSA has two units: 1) the sum and carry generator unit (SCG) and 2) the sum and carry selection unit. Various logic designs have been suggested for efficient implementation. Figure 1: (a)conventional CSLA (b)logical operations of RCA Block diagram of BEC-Based CSLA BEC-based CSLA provides the best area-delay-power efficiency among various CSLA s The final carry propagation adder (CPA) structure of many adders constitutes high carry propagation delay and this delay reduces the overall performance of the DSP processor. Paper ID: SUB155187 222

4.3) CG0 & CG1 Unit The logic circuits of CG0 and CG1 are optimized to take advantage of the fixed input-carry bits. The optimized designs of CG0 and CG1 are shown in Figures. It consist of logical or and logical and gates. Figure 2: BEC-Based CSLA 3. Module s Proposed CS Adder Design HSG Unit CG0 & CG1 Unit CS Unit FSG Unit Figure 5(a): CG0 & CG1 Unit 4. Module Description 4.1) Proposed CS Adder Design. The proposed CSLA is based on the logic formulation and its structure is shown in Fig. 3. It consists of one HSG unit, one FSG unit, one CG unit, and one CS unit. The CG unit is composed of two CGs (CG0 and CG1) corresponding to input-carry 0 and 1. The HSG receives two n-bit operands (A and B) and generate half-sum word s0 and halfcarry word c0 of width n bits each. Both CG0 and CG1 receive s0 and c0 from the HSG unit and generate two n-bit full-carry words c01 and c11 corresponding to input-carry 0 and 1, respectively. 4.4) CS Unit Figure 5(b): CG0 & CG1 Unit CS Unit is referred as Carry Select unit and it consist of combination of logical OR and Logical AND gates. The CS unit selects one final carry word from the two carry words available at its input line using the control signal cin. It selects c01 when cin = 0; otherwise, it selects c11. The LSB of s0 is XORed with cin to obtain the LSB of s. Figure 6: CS unit 4.2) HSG Unit Figure 3: Proposed CSLA Design. The logic diagram of the HSG unit is shown in figure.4. It consist of Combinational Logical AND Gate and OR Gate. 4.5) FSG Unit FSG Unit Referred as final-sum generation unit. It consists of Combination of logical XOR gates. It performs Sum operation. Figure 7: FSG Unit Figure 4: HSG Unit Paper ID: SUB155187 223

5. Relevant Terminology 5.1 Field-Programmable Device (FPD) A general term that refers to any type of integrated circuit used for implementing digital hardware, where the chip can be configured by the end user to realize different designs. Programming of such a device often involves placing the chip into a special programming unit, but some chips can also be configured in-system. Another name for FPDs is programmable logic devices (PLDs) 5.2 Programmable Logic Array (PLA) A Programmable Logic Array (PLA) is a relatively small FPD that contains two levels of logic, an AND-plane and an OR-plane, where both levels are programmable. 5.3 Programmable Array Logic (PAL) A Programmable Array Logic (PAL) is a relatively small FPD that has a programmable AND-plane followed by a fixed OR-plane. 5.4 Field-Programmable Gate Array (FPGA) A Field-Programmable Gate Array is an FPD featuring a general structure that allows very high logic capacity. Whereas CPLDs feature logic resources with a wide number of inputs (AND planes), FPGAs offer more narrow logic resources. FPGAs also offer a higher ratio of flip-flops to logic resources than do CPLDs. 5.5 High-Capacity PLDs (HCPLD) High-capacity PLDs: a single acronym that refers to both CPLDs and FPGAs. This term has been coined in trade literature for providing an easy way to refer to both types of devices. PAL is a trademark of Advanced Micro Devices. 5.6 Logic Block A relatively small circuit block that is replicated in an array in an FPD. When a circuit is implemented in an FPD, it is first decomposed into smaller sub-circuits that can each be mapped into a logic block. 5.7 Logic Capacity The amount of digital logic that can be mapped into a single FPD. This is usually measured in units of equivalent number of gates in a traditional gate array. In other words, the capacity of an FPD is measured by the size of gate array that it is comparable to. In simpler terms, logic capacity can be thought of as number of 2-input NAND gates. 5.8 Proposed Adder Design The proposed CSLA is based on the logic formulations and its structures. It consists of one HSG unit, one FSG unit and one CS unit. The CG unit is composed of two CGs (CG 0 and CG 1 ) corresponding to input-carry 0 and 1. The HSG receives two n-bit operands (A and B) and generate half-sum word s 0 and half carry word c 0 of width n bits each. Both CG 0 and CG 1 receive s 0 and c 0 from the HSG unit and generate two n-bit full carry words c 1 0 and c 1 1 corresponding to input-carry 0 and 1, respectively. The CS unit selects one final carry word from the two carry. Words available at its input line using the control signal c in. It selects c 1 0 when c in = 0; otherwise, it selects c 1 1. The CS unit can be implemented using an n-bit 2-to-l MUX. However, we find from the truth table of the CS unit that carry words c 1 0 and c 1 1 follow a specific bit pattern. If c 1 0 (i) = 1, then c 1 1 (i) = 1, irrespective of s 0 (i) and c(i), for 0 i n 1. This feature is used for logic optimization of the CS unit. The optimized design of the CS unit is shown which is composed of n AND OR gates. The final carry word c is obtained from the CS unit. The MSB of c is sent to output as c out, and (n 1) LSBs are XOR ed with (n 1) MSBs of half-sum (s0) in the FSG [ to obtain (n 1) MSBs of finalsum (s). The LSB of s 0 is XORed with c in to obtain the LSB of s. 6. Tables - 1 Area-Delay of standard Cell Library Datasheet AND-gate OR- gate Area (um 2 ) 7.37 7.37 Delay (ps) 180 170 The area and delay of each design are calculated from the AOI gate counts (N a, N o, N i ), ( n a, n o, n i ), and the cell details of Table I. 7. Result Analysis The simulation results are carried out for Area-delay-power carry select adder to find out the area-delay. The simulation is carried out by Modelsim 6.4c as a simulator tool. The simulation result is shown as a snapshots. 7.1 Various Snapshots Figure 8(a): BEC-Based CSLA n=8 Paper ID: SUB155187 224

Figure 8(b): BEC-Based CSLA n=16 Figure 8(e): Proposed SQRT CSLA n=64 8. Device Utilization Summary Figure 8(c): Proposed SQRT CSLA n=16 Figure 9(a): Area utilization for Existing Figure 8(d): Proposed SQRT CSLA n=32 Figure 9(b): Area utilization for Proposed Paper ID: SUB155187 225

9. Conclusion The logic operations involved in the conventional and BECbased CSLAs to study the data dependence and to identify redundant logic operations. The proposed CSLA design involves significantly less area and delay than the recently proposed BEC-based CSLA. Due to the small carry output delay, the proposed CSLA design is a good candidate for the SQRT adder. The FPGA synthesis result shows that the existing BEC-based SQRT-CSLA design involves more Area Delay Product and consumes energy than the proposed SQRT CSLA References [1] K. K. Parhi, VLSI Digital Signal Processing. New York, NY, USA: Wiley, 1998. [2] A. P. Chandrakasan, N. Verma, and D. C. Daly, Ultralow-power electronics for biomedical applications, Annu. Rev. Biomed. Eng., vol. 10, pp. 247 274, Aug. 2008. [3] O. J. Bedrij, Carry-select adder, IRE Trans. Electron. Comput.,vol. EC-11, no. 3, pp. 340 344, Jun. 1962. [4] Y. Kim and L.-S. Kim, 64-bit carry-select adder with reduced area, Electron. Lett., vol. 37, no. 10, pp. 614 615, May 2001. [5] Y. He, C. H. Chang, and J. Gu, An area-efficient 64- bit square root carry select adder for low power application, in Proc. IEEE Int. Symp. Circuits Syst., 2005, vol. 4, pp. 4082 4085. [6] B. Ramkumar and H.M. Kittur, Low-power and areaefficient carry-select adder, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 2, pp. 371 375, Feb. 2012. [7] I.-C. Wey, C.-C. Ho, Y.-S. Lin, and C. C. Peng, An area-efficient carry select adder design by sharing the common Boolean logic term, in Proc.IMECS, 2012, pp. 1 4. [8] S. Manju and V. Sornagopal, An efficient SQRT architecture of carry select adder design by common Boolean logic, in Proc. VLSI ICEVENT, 2013,pp. 1 5. [9] B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs, 2nd ed. New York, NY, USA: Oxford Univ. Press, 2010. [10] B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs, 2nd ed. New York, NY, USA: Oxford Univ. Press, 2010. Paper ID: SUB155187 226