Implementation of ALU Using Asynchronous Design
|
|
- Ambrose Fleming
- 6 years ago
- Views:
Transcription
1 IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) ISSN: , ISBN: Volume 3, Issue 6 (Nov. - Dec. 2012), PP Implementation of ALU Using Asynchronous Design P. Amrutha 1 and G. Hanumantha Reddy 2 1&2 Department of Electronics & Communication Engineering Chaitanya Engineering College Visakhapatnam, A.P, India Abstract: Power consumption has become one of the biggest challenges in design of high performance microprocessors. In this paper we present a design technique using GALs (Globally-Asynchronous Locally- Synchronous) for implementing asynchronous ALUs, which aims to eliminate the global clock. Here ALUs are designed with delay insensitive dual rail four phase logic and CMOS domino logic. It ensures economy in silicon area and potentially for low power consumption. This has been described and implemented in order to achieve a high performance in comparison with synchronous and available asynchronous design. Also simulation results, show significant reduction in the number of transistors as well as delay. I. Introduction The ALU (Arithmetic Logic Unit) is implemented by using asynchronous pipeline architecture. The architecture is having simple handshake cells and the handshake cells are embedded in the pipeline stage as normal logic cells. As a result, the speed of the ALU can be very fast. Pipeline is an important methodology to speed up the design [1]. It allows many operations to occur in parallel. Most pipeline systems use synchronous pipeline, in which a global clock is used to clock all the stages in the synchronous pipeline system. As a result the global clock signal is always connected to many logic cells and so the time of the clock signal to reach different logic cells may be different due to different length of connection from the global clock to different logic cells and hence this a problem. Asynchronous pipeline may be a solution. In asynchronous pipeline global clock is removed. Instead of global clock, handshake control signals are used to govern the operation of the pipeline [1&2]. One approach which promises high performance with low power consumption is the use of asynchronous computing techniques. To investigate this self-timed implementation of the ARM microprocessor - a 32-bit RISC architecture developed by Advanced RISC Machines Limited - is being produced as a commercially realistic technology demonstrator. Other researchers [3] have demonstrated the feasibility of building a complete asynchronous microprocessor; the current project addresses the problems associated with the re-implementation of an existing commercial architecture with the specific goal of minimizing power consumption several approaches to asynchronous circuit design are currently being investigated [4]. Some propose by using gated clocks to reduce the switching activity of logic in redundant cycles [5]. Although it minimizes the skew, this methodology is limited when operating frequency is very high. On the other hand, Globally Asynchronous and Locally Synchronous (GALS) technique aims to eliminate the global clock, by partitioning the system into several synchronous blocks and communicating asynchronously among blocks [7]. However, the global signaling protocol increases the total area power penalty and affects performance of the system. Several researchers propose asynchronous approaches to cope with performance and timing issue. Designed a 16-bit asynchronous ALU with an asynchronous pipeline architecture [6]. In this approach, simple handshake cells embedded in pipeline stages make the ALU run fast. Increases total area power penalty and affects performance of the system. However, large power has consumed by this design while waiting for the incoming data. Since the inappropriate rotate-wire concept of data buses, the time required for each multiplication operation becomes larger. In reducing the time dependency of an asynchronous design of Quantum dot Cellular Automata (QCA), Paper [8] used GALS with delay-insensitive data encoding scheme. Here each gate has locally synchronized by corresponding clocking zone(s). We focus in this asynchronous design to reduce the transistors count, power consumption and delay significantly by using delay insensitive dual rail logic and bundled data bounded delay model. II. Introduction of Asynchronous Circuits Asynchronous circuits are fundamentally different; they also assume binary signals, but there is no common and discrete time. Instead the circuits use handshaking between their components in order to perform the necessary synchronization, communication, and sequencing of operations. Expressed in synchronous terms this results in a behavior that is similar to systematic fine-grain clock gating and local clocks that are not in phase and whose period is determined by actual circuit delays registers are only clocked where and when 7 Page
2 needed. In all protocols, Muller pipeline is used. The 4-phase bundled data and 2 phase bundled data are pipelined designs in which matching delay elements needed to be inserted between latches to maintain correct behavior in the request signal path. On the other hand, 4-phase dual rail has designed to combine encoding of data and request. [10]. Request Acknowledgment Data n (a) Request Acknowledgment Data (b) Request Acknowledgment Data Figure 1. (a) A bundled-data channel (b) A 4-phase bundled-data protocol (c) A 2-phase bundled-data protocol 2.1. Bundled-data Protocols The term bundled-data refers to a situation where the data signals use normal Boolean levels to encode information, and where separate request and acknowledge wires are bundled with the data signals, Figure. 1(a). In the 4- phase protocol illustrated in Figure. 1(b) the request and acknowledge wires also use normal Boolean levels to encode information, and the term 4-phase refers to the number of communication actions: (1) the sender issues data and sets request high, (2) the receiver absorbs the data and sets acknowledge high, (3) the sender responds by taking request low (at which point data is no longer guaranteed to be valid) and (4) the receiver acknowledges this by taking acknowledge low. At this point the sender may initiate the next communication cycle The 4-phase dual-rail protocol concept This protocol makes the reliable communication between blocks of designed architecture regardless of delays in the wires connecting two blocks and also which is delay insensitive [9&10]. It uses a single wire for each data bit and one extra control line for each data word. It provides the reliable communication between blocks by combined encoding of data and request with an acknowledge signal after completion detection. This logic uses two request wires per bit of information d:; one wire d.t is used for signaling a logic 1(or true), and another wire d.f is used for signaling logic 0(or false). In a single bit channel with 4 phase dual rail logic, the request signal can be either of d.t or d.f for handshaking purpose. Viewed together the {x.f, x.t}={1,0} and {x.f, x.t}={0,1} represent valid data (logic 0and logic 1 respectively) and {x.f, x.t}={0,0} represents no data (or empty value or E ). The codeword {x.f, x.t}={1,1} is not used, and a transition from one valid codeword to another valid codeword is not allowed, as illustrated in Figure. 2(a). (c) 8 Page
3 Figure 2. (a) The 4-phase dual rail logic (b) Muller C-element and indication concept and (c) Muller Pipeline 2.3. Muller C-element and Indication Concept The concept of indication or acknowledgement plays an important role in the design of asynchronous circuits for synchronization. Muller C-element is a state-holding element much like an asynchronous set-reset latch [1]. When both inputs are 0 the output is set to 0, and when both inputs are 1 the output is set to 1. For other input combinations {(0, 1) or (1, 0)} the output does not change. Consequently, one can see the output changes from 0 to 1 can conclude that both inputs are now at 1, similarly one can see the output changes from 1 to 0 may conclude that both inputs are now 0. In this circuit design, the absence of a clock means that, in many circumstances, signals are required to be valid data: {x.f, x.t}={1, 0} and {x.f, x.t}={0, 1} all the time that every signal transition has a meaning and, consequently, that hazards and races must be avoided. Signal transitions are not indicated (acknowledged) for the other signal transitions such as {(0, 1), (1, 0)} and that are used to avoid the source of hazards. A circuit accomplish this requirement with Muller C-element is as shown in Figure. 2(b). The Muller C-element is indeed a fundamental component that is extensively used in asynchronous circuit design [1] Muller Pipeline A 4-phase dual-rail pipeline is based on the Muller pipeline that relays handshakes in Figure. 2(c). In the Muller pipeline, there is a 1-bit wide and 3-stage deep pipeline that uses a common acknowledge signal per stage to synchronize. Here the pipeline stage can store empty codeword {d.t, d.f}={0, 0}, causing the acknowledge signal out of that stage to be logic 0 or one of the two valid code words {0, 1} and {1, 0}, causing the acknowledge signal out of that stage to be logic 1. Initially all of the C-elements have been initialized to 0 and during the operation, according to the successor value; the current C-element transfers its predecessor s value for handshaking [11]. To understand what happens let s consider the i th C-element, c[i]: It will propagate a 1 from its predecessor c[i-1], only if its successor c[i+1] is 0. In a similar way it will propagate a 0 from its predecessor if its successor is 1. Figure 3. The Adder Circuit in ALU 9 Page
4 This ALU has no special fast carry logic and performs addition with a chain of thirty two full adders. The only concession to the asynchronous nature of the unit made at the design stage was that the carries between the individual bits are encoded onto pairs of wires signaling the 0 and 1 states of the carry bit respectively. In this fashion it is possible to detect that a carry signal has arrived at a given bit position by observing a change in state of one of these signals. ALU completion is signaled when a carry has been transmitted to all 32 bits in the word [10]. III. Design and Implementation of ALU Using the logics and principles outlined in Section II, an ALU has designed at the transistor level for single bit operation as shown in the Figure. 3 to demonstrate our design concept. A single bit-slice ALU uses only 53 transistors and is capability of operations in Table 1. Since we emphasize on the design of asynchronous component, there is no hardware implementation for 4- phase dual rail with Muller C. However, the proposed circuit assumes the signaling from such logic blocks. For example, c0 out and c1 out act as two wires of 4- phase logic, which makes reliable operation between its predecessor and successor blocks Architecture design & Implementation: We designed a 32-bit ALU, which requires 1696 transistors. The basic principle of Bundled data Bounded delay model of Sutherland s micro pipelines is used here [1]. The timing characteristics of all data busses of this architecture are bundled together. Here the statuses of the data busses are indicated by 4 phasedual rail hand shake signals. The clock power reduction at the architectural level is mainly due to pipeline technique. The dynamic logic of completion detection unit ensures precise internal operation, because of its 4- phase dual logic. It is also carrying the timing information because it uses common timing characteristics. Table 1. Functions available for the proposed ALU Logic Function Basic Operation a-input b-input and AND True True add AND True True add with carry AND True True subtract AND True complement reverse subtract AND complement True subtract with carry AND True Complement reverse subtract with carry AND complement True test bits AND True True compare AND True Complement compare negative AND True True bit clear AND True Complement xor XOR True True test equal XOR True True or OR True True move OR Zero True move NOT OR Zero complement 10 Page
5 Data and control path for a data dependent transistor level Figure 4. A CMOS level Asynchronous ALU circuit for 1-bit operation IV. Analysis Addition is one of fundamental functions of the ALU. We start by analyzing the number of transistors used in the addition. About 80% of the operations require some form of addition [11]. If we improve the processing time of addition operation, the performance of complete ALU can also be improved. The latency required by our design is depended upon the operation, the input data at that incident and the carry flow across the whole word length, i.e. it needs to propagate carry until it has predicted by the completion detection stage. The average length of the mean carry propagation distance is varying according to input data. In this 32-bit operation, a sum of 140 transistors has used for precharging (domino logic) and buffer purposes to meet the specifications at the layout. The simulated output waveform for the addition operation performed by this ALU presented in Figure. 5. It is performed with V DD =1.8V, input sequence c1 in =1111, c0 in =0000, a=0011, b=0101 and the simulated output sequence is output=1001, c0 out =1000, c1 out =0111 which coincides the expected specification. This simulation has done by HSPICE with 10ns local clock period at room temperature. The simulation results for the power consumption of typical addition operation with different supply voltages are shown in the Figure. 5. The simulation results of HSPICE 0.18um technology shows, the average power consumption for typical addition operation is 1.02x10-4 W under 1.8V supply with 1000 sample inputs at room temperature and average time delay is 2.5ns. A comparison of the simulated time performance and transistor count of this design with other published alternatives as shown in Table 2 and Table 3. They clearly indicate a significant reduction in transistors count. Our design has much reduction in silicon area. In addition, this architecture enables to have reduced switching capacitance because of missing master clock in the transistors of the ALU circuit design. It gives reduced switching actions for every arithmetic operation and reduction in silicon area. 11 Page
6 Figure 5. Simulated waveforms for the 1 bit addition operation Table 2. Simulated Results for Time Performance Time (ns) Best Case Worst Case Average Our Design Synchronous Ref [12] Ref [13] Table 3. Comparison of Reported Design Discussed Design and Existing Designs Comparison Synchronous ARM Asynchronous ARM ALU [12] ALU [12] Our Design Technology 1.2 um CMOS 1.2 um CMOS 0.18 um CMOS Supply Voltage ~5V (CO/SS) ~5V (CO/SS) ~1.8V (CO/MS) Data Width 32 bit 32 bit 32 bit Timing Purpose (Transistors) 140 (Transistors) Self Time Unit 3000 (Transistors) 2300 (Transistors) 1696 (Transistors) V. Conclusion The proposed design can reduce silicon area and improve the performance on average. This is mainly due to the use of the asynchronous design concepts with CMOS dynamic domino logic. They result in reduction in the transistor count & parasitic capacitances. Subsequently, power consumption can be reduced. Furthermore, this architecture provides glitch free operation, which is an important key factor to better reliability and performance. Effective reduction in area, power consumption and reasonable performance improvement are the special features of our designed architecture. References [1] I. E. Sutherland, Micropipelines, Communications of the ACM, pp , vol. 32, No. 6, [2] Scott Hauck, Asynchronous Design Mythologies: An Overview, in 1995, Proc. of the IEEE, vol. 83, no. 1, pp [3] A. J. Martin, S. Burns, T. K. Lee, D. Borkovie, and P. J. Hazewindus, The Design of an Asynchronous Microprocessor, in 1989, Proc. of the Decennial CalTech Conference on VLSI, MIT Press, pp [4] G. Gopalakrishnan, and P. Jain Some Recent Asynchronous System Design Methodologies, Technical Report Number UU-CS- TR , University of Utah, [5] L. Benini, and G. De Micheli, Transformations and synthesis of FSM s for low power gated clock implementation, IEEE Trans. Computer- Aided Design, vol. 15, [6] J. M. Rabaey, and M. Pedram, Low Power Design Methodologies, (Kluwer Academic Publishers, 1996, ISBNO ). [7] A. Hemani, T. Meincke, S. Kumar, A. Postula, P. Nilsson, J. Oberg, P. Ellervee, and D. Lundqvist, Lowering power consumption in clock by using globally asynchronous locally synchronous design style, in 1999, Proc. IEEE Design Automat. Conf., pp [8] M. Choi, and N. Park, Locally synchronous, globally asynchronous design for quantum-dot cellular automata (LSGA QCA), in 2005, Proc. IEEE NANO, pp [9] S. Hauck, Asynchronous design methodologies: an overview, in 1995, Proc. IEEE, vol. 83, pp [10] J. Sparso, and S. Furber, Principles of Asynchronous Circuit Design: A Systems Perspective, Kluwer Academic Publishers, [11] V. Tiwari, D. Singh, S. Rajgopal, G. Mehta, R. Patel and F. Baez, Reducing power in high performance microprocessors, in 1998, Proc. IEEE Design Automat. Conf., pp [12] J. D. Garside, A CMOS VLSI implementation of an asynchronous ALU, in 1993, Proc. IFIP working conference on Asynchronous Design Methodologies, Manchester, M139PL. [13] D. Kearney, and N. W. Bergmann, Bounded data asynchronous multipliers with data dependent computation times, in 1997, Proc. IEEE Advanced research in asynchronous circuits and systems, pp Page
CHAPTER 3 ASYNCHRONOUS PIPELINE CONTROLLER
84 CHAPTER 3 ASYNCHRONOUS PIPELINE CONTROLLER 3.1 INTRODUCTION The introduction of several new asynchronous designs which provides high throughput and low latency is the significance of this chapter. The
More informationThe design of a simple asynchronous processor
The design of a simple asynchronous processor SUN-YEN TAN 1, WEN-TZENG HUANG 2 1 Department of Electronic Engineering National Taipei University of Technology No. 1, Sec. 3, Chung-hsiao E. Rd., Taipei,10608,
More informationA Novel Pseudo 4 Phase Dual Rail Asynchronous Protocol with Self Reset Logic & Multiple Reset
A Novel Pseudo 4 Phase Dual Rail Asynchronous Protocol with Self Reset Logic & Multiple Reset M.Santhi, Arun Kumar S, G S Praveen Kalish, Siddharth Sarangan, G Lakshminarayanan Dept of ECE, National Institute
More informationA Synthesizable RTL Design of Asynchronous FIFO Interfaced with SRAM
A Synthesizable RTL Design of Asynchronous FIFO Interfaced with SRAM Mansi Jhamb, Sugam Kapoor USIT, GGSIPU Sector 16-C, Dwarka, New Delhi-110078, India Abstract This paper demonstrates an asynchronous
More informationImplementation of Asynchronous Topology using SAPTL
Implementation of Asynchronous Topology using SAPTL NARESH NAGULA *, S. V. DEVIKA **, SK. KHAMURUDDEEN *** *(senior software Engineer & Technical Lead, Xilinx India) ** (Associate Professor, Department
More informationINTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 3 / OCT 2017
Design of Low Power Adder in ALU Using Flexible Charge Recycling Dynamic Circuit Pallavi Mamidala 1 K. Anil kumar 2 mamidalapallavi@gmail.com 1 anilkumar10436@gmail.com 2 1 Assistant Professor, Dept of
More informationISSN Vol.08,Issue.07, July-2016, Pages:
ISSN 2348 2370 Vol.08,Issue.07, July-2016, Pages:1312-1317 www.ijatir.org Low Power Asynchronous Domino Logic Pipeline Strategy Using Synchronization Logic Gates H. NASEEMA BEGUM PG Scholar, Dept of ECE,
More informationDYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech)
DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech) K.Prasad Babu 2 M.tech (Ph.d) hanumanthurao19@gmail.com 1 kprasadbabuece433@gmail.com 2 1 PG scholar, VLSI, St.JOHNS
More informationDesign of 8 bit Pipelined Adder using Xilinx ISE
Design of 8 bit Pipelined Adder using Xilinx ISE 1 Jayesh Diwan, 2 Rutul Patel Assistant Professor EEE Department, Indus University, Ahmedabad, India Abstract An asynchronous circuit, or self-timed circuit,
More informationIntroduction to Asynchronous Circuits and Systems
RCIM Presentation Introduction to Asynchronous Circuits and Systems Kristofer Perta April 02 / 2004 University of Windsor Computer and Electrical Engineering Dept. Presentation Outline Section - Introduction
More informationTEMPLATE BASED ASYNCHRONOUS DESIGN
TEMPLATE BASED ASYNCHRONOUS DESIGN By Recep Ozgur Ozdag A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the
More informationIntroduction to asynchronous circuit design. Motivation
Introduction to asynchronous circuit design Using slides from: Jordi Cortadella, Universitat Politècnica de Catalunya, Spain Michael Kishinevsky, Intel Corporation, USA Alex Kondratyev, Theseus Logic,
More informationA Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding
A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding N.Rajagopala krishnan, k.sivasuparamanyan, G.Ramadoss Abstract Field Programmable Gate Arrays (FPGAs) are widely
More informationDesign of Parallel Self-Timed Adder
Design of Parallel Self-Timed Adder P.S.PAWAR 1, K.N.KASAT 2 1PG, Dept of EEE, PRMCEAM, Badnera, Amravati, MS, India. 2Assistant Professor, Dept of EXTC, PRMCEAM, Badnera, Amravati, MS, India. ---------------------------------------------------------------------***---------------------------------------------------------------------
More informationthe main limitations of the work is that wiring increases with 1. INTRODUCTION
Design of Low Power Speculative Han-Carlson Adder S.Sangeetha II ME - VLSI Design, Akshaya College of Engineering and Technology, Coimbatore sangeethasoctober@gmail.com S.Kamatchi Assistant Professor,
More informationOUTLINE Introduction Power Components Dynamic Power Optimization Conclusions
OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions 04/15/14 1 Introduction: Low Power Technology Process Hardware Architecture Software Multi VTH Low-power circuits Parallelism
More informationDESIGN AND PERFORMANCE ANALYSIS OF CARRY SELECT ADDER
DESIGN AND PERFORMANCE ANALYSIS OF CARRY SELECT ADDER Bhuvaneswaran.M 1, Elamathi.K 2 Assistant Professor, Muthayammal Engineering college, Rasipuram, Tamil Nadu, India 1 Assistant Professor, Muthayammal
More informationLow Power Circuits using Modified Gate Diffusion Input (GDI)
IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 4, Issue 5, Ver. II (Sep-Oct. 2014), PP 70-76 e-issn: 2319 4200, p-issn No. : 2319 4197 Low Power Circuits using Modified Gate Diffusion Input
More informationHigh-Performance Full Adders Using an Alternative Logic Structure
Term Project EE619 High-Performance Full Adders Using an Alternative Logic Structure by Atulya Shivam Shree (10327172) Raghav Gupta (10327553) Department of Electrical Engineering, Indian Institure Technology,
More informationDesign of Low Power Asynchronous Parallel Adder Benedicta Roseline. R 1 Kamatchi. S 2
IJSRD - International Journal for Scientific Research & Development Vol. 3, Issue 04, 2015 ISSN (online): 2321-0613 Design of Low Power Asynchronous Parallel Adder Benedicta Roseline. R 1 Kamatchi. S 2
More informationPOWER consumption has become one of the most important
704 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 39, NO. 4, APRIL 2004 Brief Papers High-Throughput Asynchronous Datapath With Software-Controlled Voltage Scaling Yee William Li, Student Member, IEEE, George
More informationKeywords: Soft Core Processor, Arithmetic and Logical Unit, Back End Implementation and Front End Implementation.
ISSN 2319-8885 Vol.03,Issue.32 October-2014, Pages:6436-6440 www.ijsetr.com Design and Modeling of Arithmetic and Logical Unit with the Platform of VLSI N. AMRUTHA BINDU 1, M. SAILAJA 2 1 Dept of ECE,
More information16x16 Multiplier Design Using Asynchronous Pipeline Based On Constructed Critical Data Path
Volume 4 Issue 01 Pages-4786-4792 January-2016 ISSN (e): 2321-7545 Website: http://ijsae.in 16x16 Multiplier Design Using Asynchronous Pipeline Based On Constructed Critical Data Path Authors Channa.sravya
More informationDESIGN AND SIMULATION OF 1 BIT ARITHMETIC LOGIC UNIT DESIGN USING PASS-TRANSISTOR LOGIC FAMILIES
Volume 120 No. 6 2018, 4453-4466 ISSN: 1314-3395 (on-line version) url: http://www.acadpubl.eu/hub/ http://www.acadpubl.eu/hub/ DESIGN AND SIMULATION OF 1 BIT ARITHMETIC LOGIC UNIT DESIGN USING PASS-TRANSISTOR
More informationDESIGN AND IMPLEMENTATION OF APPLICATION SPECIFIC 32-BITALU USING XILINX FPGA
DESIGN AND IMPLEMENTATION OF APPLICATION SPECIFIC 32-BITALU USING XILINX FPGA T.MALLIKARJUNA 1 *,K.SREENIVASA RAO 2 1 PG Scholar, Annamacharya Institute of Technology & Sciences, Rajampet, A.P, India.
More informationIMPLEMENTATION OF DIGITAL CMOS COMPARATOR USING PARALLEL PREFIX TREE
Int. J. Engg. Res. & Sci. & Tech. 2014 Nagaswapna Manukonda and H Raghunadh Rao, 2014 Research Paper ISSN 2319-5991 www.ijerst.com Vol. 3, No. 4, November 2014 2014 IJERST. All Rights Reserved IMPLEMENTATION
More informationParallel, Single-Rail Self-Timed Adder. Formulation for Performing Multi Bit Binary Addition. Without Any Carry Chain Propagation
Parallel, Single-Rail Self-Timed Adder. Formulation for Performing Multi Bit Binary Addition. Without Any Carry Chain Propagation Y.Gowthami PG Scholar, Dept of ECE, MJR College of Engineering & Technology,
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 5, Sep-Oct 2014
RESEARCH ARTICLE OPEN ACCESS A Survey on Efficient Low Power Asynchronous Pipeline Design Based on the Data Path Logic D. Nandhini 1, K. Kalirajan 2 ME 1 VLSI Design, Assistant Professor 2 Department of
More informationDesigning NULL Convention Combinational Circuits to Fully Utilize Gate-Level Pipelining for Maximum Throughput
Designing NULL Convention Combinational Circuits to Fully Utilize Gate-Level Pipelining for Maximum Throughput Scott C. Smith University of Missouri Rolla, Department of Electrical and Computer Engineering
More informationPNS-FCR: Flexible Charge Recycling Dynamic Circuit Technique for Low- Power Microprocessors
PNS-FCR: Flexible Charge Recycling Dynamic Circuit Technique for Low- Power Microprocessors N.Lakshmi Tejaswani Devi Department of Electronics & Communication Engineering Sanketika Vidya Parishad Engineering
More information16 BIT IMPLEMENTATION OF ASYNCHRONOUS TWOS COMPLEMENT ARRAY MULTIPLIER USING MODIFIED BAUGH-WOOLEY ALGORITHM AND ARCHITECTURE.
16 BIT IMPLEMENTATION OF ASYNCHRONOUS TWOS COMPLEMENT ARRAY MULTIPLIER USING MODIFIED BAUGH-WOOLEY ALGORITHM AND ARCHITECTURE. AditiPandey* Electronics & Communication,University Institute of Technology,
More informationLow Power PLAs. Reginaldo Tavares, Michel Berkelaar, Jochen Jess. Information and Communication Systems Section, Eindhoven University of Technology,
Low Power PLAs Reginaldo Tavares, Michel Berkelaar, Jochen Jess Information and Communication Systems Section, Eindhoven University of Technology, P.O. Box 513, 5600 MB Eindhoven, The Netherlands {regi,michel,jess}@ics.ele.tue.nl
More informationAnalysis of Different Multiplication Algorithms & FPGA Implementation
IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 4, Issue 2, Ver. I (Mar-Apr. 2014), PP 29-35 e-issn: 2319 4200, p-issn No. : 2319 4197 Analysis of Different Multiplication Algorithms & FPGA
More informationA Low-Power Field Programmable VLSI Based on Autonomous Fine-Grain Power Gating Technique
A Low-Power Field Programmable VLSI Based on Autonomous Fine-Grain Power Gating Technique P. Durga Prasad, M. Tech Scholar, C. Ravi Shankar Reddy, Lecturer, V. Sumalatha, Associate Professor Department
More informationGlobally Asynchronous Locally Synchronous FPGA Architectures
Globally Asynchronous Locally Synchronous FPGA Architectures Andrew Royal and Peter Y. K. Cheung Department of Electrical & Electronic Engineering, Imperial College, London, UK {a.royal, p.cheung}@imperial.ac.uk
More informationDesign of Low Power Wide Gates used in Register File and Tag Comparator
www..org 1 Design of Low Power Wide Gates used in Register File and Tag Comparator Isac Daimary 1, Mohammed Aneesh 2 1,2 Department of Electronics Engineering, Pondicherry University Pondicherry, 605014,
More informationPerformance of Constant Addition Using Enhanced Flagged Binary Adder
Performance of Constant Addition Using Enhanced Flagged Binary Adder Sangeetha A UG Student, Department of Electronics and Communication Engineering Bannari Amman Institute of Technology, Sathyamangalam,
More informationDYNAMIC ASYNCHRONOUS LOGIC FOR HIGH SPEED CMOS SYSTEMS. I. Introduction
DYNAMIC ASYNCHRONOUS LOGIC FOR HIGH SEED CMOS SYSTEMS Anthony J. McAuley Bellcore (Room MRE-2N263), 445 South Street, Morristown, NJ 07962-1910, USA hone: (201)-829-4698; E-mail: mcauley@bellcore.com Abstract
More informationA Energy-Efficient Pipeline Templates for High-Performance Asynchronous Circuits
A Energy-Efficient Pipeline Templates for High-Performance Asynchronous Circuits Basit Riaz Sheikh and Rajit Manohar, Cornell University We present two novel energy-efficient pipeline templates for high
More informationInternational Journal of Engineering and Techniques - Volume 4 Issue 2, April-2018
RESEARCH ARTICLE DESIGN AND ANALYSIS OF RADIX-16 BOOTH PARTIAL PRODUCT GENERATOR FOR 64-BIT BINARY MULTIPLIERS K.Deepthi 1, Dr.T.Lalith Kumar 2 OPEN ACCESS 1 PG Scholar,Dept. Of ECE,Annamacharya Institute
More informationRedundant States in Sequential Circuits
Redundant States in Sequential Circuits Removal of redundant states is important because Cost: the number of memory elements is directly related to the number of states Complexity: the more states the
More informationMinimizing Power Dissipation during. University of Southern California Los Angeles CA August 28 th, 2007
Minimizing Power Dissipation during Write Operation to Register Files Kimish Patel, Wonbok Lee, Massoud Pedram University of Southern California Los Angeles CA August 28 th, 2007 Introduction Outline Conditional
More informationLOGIC EFFORT OF CMOS BASED DUAL MODE LOGIC GATES
LOGIC EFFORT OF CMOS BASED DUAL MODE LOGIC GATES D.Rani, R.Mallikarjuna Reddy ABSTRACT This logic allows operation in two modes: 1) static and2) dynamic modes. DML gates, which can be switched between
More informationTIMA Lab. Research Reports
ISSN 1292-862 TIMA Lab. Research Reports TIMA Laboratory, 46 avenue Félix Viallet, 38000 Grenoble France Session 1.2 - Hop Topics for SoC Design Asynchronous System Design Prof. Marc RENAUDIN TIMA, Grenoble,
More informationPower and Delay Analysis of Critical Path Delay Design Using Domino Logic Multiplier
Power and Delay Analysis of Critical Path Delay Design Using Domino Logic Multiplier K.Anjaneya Reddy PG Scholar, Dept of ECE, AITS, Kadapa, AP-India ABSTRACT This paper presents a high-throughput and
More informationDesign and Analysis of Kogge-Stone and Han-Carlson Adders in 130nm CMOS Technology
Design and Analysis of Kogge-Stone and Han-Carlson Adders in 130nm CMOS Technology Senthil Ganesh R & R. Kalaimathi 1 Assistant Professor, Electronics and Communication Engineering, Info Institute of Engineering,
More informationPOWER ANALYSIS OF CRITICAL PATH DELAY DESIGN USING DOMINO LOGIC
181 POWER ANALYSIS OF CRITICAL PATH DELAY DESIGN USING DOMINO LOGIC R.Yamini, V.Kavitha, S.Sarmila, Anila Ramachandran,, Assistant Professor, ECE Dept, M.E Student, M.E. Student, M.E. Student Sri Eshwar
More informationIMPLEMENTATION OF LOW POWER AREA EFFICIENT ALU WITH LOW POWER FULL ADDER USING MICROWIND DSCH3
IMPLEMENTATION OF LOW POWER AREA EFFICIENT ALU WITH LOW POWER FULL ADDER USING MICROWIND DSCH3 Ritafaria D 1, Thallapalli Saibaba 2 Assistant Professor, CJITS, Janagoan, T.S, India Abstract In this paper
More informationImplimentation of A 16-bit RISC Processor for Convolution Application
Advance in Electronic and Electric Engineering. ISSN 2231-1297, Volume 4, Number 5 (2014), pp. 441-446 Research India Publications http://www.ripublication.com/aeee.htm Implimentation of A 16-bit RISC
More informationArea-Efficient Design of Asynchronous Circuits Based on Balsa Framework for Synchronous FPGAs
Area-Efficient Design of Asynchronous ircuits Based on Balsa Framework for Synchronous FPGAs ERSA 12 Distinguished Paper Yoshiya Komatsu, Masanori Hariyama, and Michitaka Kameyama Graduate School of Information
More informationGated-Demultiplexer Tree Buffer for Low Power Using Clock Tree Based Gated Driver
Gated-Demultiplexer Tree Buffer for Low Power Using Clock Tree Based Gated Driver E.Kanniga 1, N. Imocha Singh 2,K.Selva Rama Rathnam 3 Professor Department of Electronics and Telecommunication, Bharath
More informationDesign of Low Power Digital CMOS Comparator
Design of Low Power Digital CMOS Comparator 1 A. Ramesh, 2 A.N.P.S Gupta, 3 D.Raghava Reddy 1 Student of LSI&ES, 2 Assistant Professor, 3 Associate Professor E.C.E Department, Narasaraopeta Institute of
More informationIEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS 1
TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS 1 The Design of High-Performance Dynamic Asynchronous Pipelines: Lookahead Style Montek Singh and Steven M. Nowick Abstract A new class of asynchronous
More informationDesign of a Pipelined 32 Bit MIPS Processor with Floating Point Unit
Design of a Pipelined 32 Bit MIPS Processor with Floating Point Unit P Ajith Kumar 1, M Vijaya Lakshmi 2 P.G. Student, Department of Electronics and Communication Engineering, St.Martin s Engineering College,
More informationHigh Performance Interconnect and NoC Router Design
High Performance Interconnect and NoC Router Design Brinda M M.E Student, Dept. of ECE (VLSI Design) K.Ramakrishnan College of Technology Samayapuram, Trichy 621 112 brinda18th@gmail.com Devipoonguzhali
More informationSingle-Track Asynchronous Pipeline Templates Using 1-of-N Encoding
Single-Track Asynchronous Pipeline Templates Using 1-of-N Encoding Marcos Ferretti, Peter A. Beerel Department of Electrical Engineering Systems University of Southern California Los Angeles, CA 90089
More informationA Comparative Power Analysis of an Asynchronous Processor
A Comparative Power Analysis of an Asynchronous Processor Aristides Efthymiou, Jim D. Garside, and Steve Temple Department of Computer Science,University of Manchester Oxford Road, Manchester, M13 9PL
More informationHigh speed Integrated Circuit Hardware Description Language), RTL (Register transfer level). Abstract:
based implementation of 8-bit ALU of a RISC processor using Booth algorithm written in VHDL language Paresh Kumar Pasayat, Manoranjan Pradhan, Bhupesh Kumar Pasayat Abstract: This paper explains the design
More informationFormulation for Performing Multi Bit Binary Addition using Parallel, Single-Rail Self-Timed Adder without Any Carry Chain Propagation
Formulation for Performing Multi Bit Binary Addition using Parallel, Single-Rail Self-Timed Adder without Any Carry Chain Propagation Y. Gouthami PG Scholar, Department of ECE, MJR College of Engineering
More informationINTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume VII /Issue 2 / OCT 2016
NEW VLSI ARCHITECTURE FOR EXPLOITING CARRY- SAVE ARITHMETIC USING VERILOG HDL B.Anusha 1 Ch.Ramesh 2 shivajeehul@gmail.com 1 chintala12271@rediffmail.com 2 1 PG Scholar, Dept of ECE, Ganapathy Engineering
More informationEmbedded Soc using High Performance Arm Core Processor D.sridhar raja Assistant professor, Dept. of E&I, Bharath university, Chennai
Embedded Soc using High Performance Arm Core Processor D.sridhar raja Assistant professor, Dept. of E&I, Bharath university, Chennai Abstract: ARM is one of the most licensed and thus widespread processor
More informationMassively Parallel Computing on Silicon: SIMD Implementations. V.M.. Brea Univ. of Santiago de Compostela Spain
Massively Parallel Computing on Silicon: SIMD Implementations V.M.. Brea Univ. of Santiago de Compostela Spain GOAL Give an overview on the state-of of-the- art of Digital on-chip CMOS SIMD Solutions,
More informationFPGA. Logic Block. Plessey FPGA: basic building block here is 2-input NAND gate which is connected to each other to implement desired function.
FPGA Logic block of an FPGA can be configured in such a way that it can provide functionality as simple as that of transistor or as complex as that of a microprocessor. It can used to implement different
More informationARCHITECTURAL DESIGN OF 8 BIT FLOATING POINT MULTIPLICATION UNIT
ARCHITECTURAL DESIGN OF 8 BIT FLOATING POINT MULTIPLICATION UNIT Usha S. 1 and Vijaya Kumar V. 2 1 VLSI Design, Sathyabama University, Chennai, India 2 Department of Electronics and Communication Engineering,
More informationModeling Asynchronous Circuits in ACL2 Using the Link-Joint Interface
Modeling Asynchronous Circuits in ACL2 Using the Link-Joint Interface Cuong Chau ckcuong@cs.utexas.edu Department of Computer Science The University of Texas at Austin April 19, 2016 Cuong Chau (UT Austin)
More informationDelay Optimised 16 Bit Twin Precision Baugh Wooley Multiplier
Delay Optimised 16 Bit Twin Precision Baugh Wooley Multiplier Vivek. V. Babu 1, S. Mary Vijaya Lense 2 1 II ME-VLSI DESIGN & The Rajaas Engineering College Vadakkangulam, Tirunelveli 2 Assistant Professor
More informationArea Efficient, Low Power Array Multiplier for Signed and Unsigned Number. Chapter 3
Area Efficient, Low Power Array Multiplier for Signed and Unsigned Number Chapter 3 Area Efficient, Low Power Array Multiplier for Signed and Unsigned Number Chapter 3 3.1 Introduction The various sections
More informationOPTIMIZING THE POWER USING FUSED ADD MULTIPLIER
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 11, November 2014,
More informationDesigning and Characterization of koggestone, Sparse Kogge stone, Spanning tree and Brentkung Adders
Vol. 3, Issue. 4, July-august. 2013 pp-2266-2270 ISSN: 2249-6645 Designing and Characterization of koggestone, Sparse Kogge stone, Spanning tree and Brentkung Adders V.Krishna Kumari (1), Y.Sri Chakrapani
More informationCAD4 The ALU Fall 2009 Assignment. Description
CAD4 The ALU Fall 2009 Assignment To design a 16-bit ALU which will be used in the datapath of the microprocessor. This ALU must support two s complement arithmetic and the instructions in the baseline
More informationCluster-based approach eases clock tree synthesis
Page 1 of 5 EE Times: Design News Cluster-based approach eases clock tree synthesis Udhaya Kumar (11/14/2005 9:00 AM EST) URL: http://www.eetimes.com/showarticle.jhtml?articleid=173601961 Clock network
More informationAsynchronous Behavior Related Retiming in Gated-Clock GALS Systems
Asynchronous Behavior Related Retiming in Gated-Clock GALS Systems Sam Farrokhi, Masoud Zamani, Hossein Pedram, Mehdi Sedighi Amirkabir University of Technology Department of Computer Eng. & IT E-mail:
More informationA Transistor-Level Placement Tool for Asynchronous Circuits
A Transistor-Level Placement Tool for Asynchronous Circuits M Salehi, H Pedram, M Saheb Zamani, M Naderi, N Araghi Department of Computer Engineering, Amirkabir University of Technology 424, Hafez Ave,
More informationPerformance Evaluation of Guarded Static CMOS Logic based Arithmetic and Logic Unit Design
International Journal of Engineering Research and General Science Volume 2, Issue 3, April-May 2014 Performance Evaluation of Guarded Static CMOS Logic based Arithmetic and Logic Unit Design FelcyJeba
More informationDynamic Logic ALU Design with Reduced Switching Power
Indian Journal of Science and Technology, Vol 8(20), DOI:10.17485/ijst/2015/v8i20/79080, August 2015 ISSN (Print) : 0974-6846 ISSN (Online) : 0974-5645 Dynamic Logic ALU Design with Reduced Switching Power
More informationECE 637 Integrated VLSI Circuits. Introduction. Introduction EE141
ECE 637 Integrated VLSI Circuits Introduction EE141 1 Introduction Course Details Instructor Mohab Anis; manis@vlsi.uwaterloo.ca Text Digital Integrated Circuits, Jan Rabaey, Prentice Hall, 2 nd edition
More informationII. MOTIVATION AND IMPLEMENTATION
An Efficient Design of Modified Booth Recoder for Fused Add-Multiply operator Dhanalakshmi.G Applied Electronics PSN College of Engineering and Technology Tirunelveli dhanamgovind20@gmail.com Prof.V.Gopi
More informationIMPLEMENTATION OF TWIN PRECISION TECHNIQUE FOR MULTIPLICATION
IMPLEMENTATION OF TWIN PRECISION TECHNIQUE FOR MULTIPLICATION SUNITH KUMAR BANDI #1, M.VINODH KUMAR *2 # ECE department, M.V.G.R College of Engineering, Vizianagaram, Andhra Pradesh, INDIA. 1 sunithjc@gmail.com
More informationA TWO-PHASE MICROPIPELINE CONTROL WRAPPER FOR WITH EARLY EVALUATION
A TWO-PHASE MIROPIPELINE ONTROL WRAPPER FOR WITH EARLY EVALUATION R. B. Reese, Mitchell A. Thornton, and herrice Traver A two-phase control wrapper for a micropipeline is presented. The wrapper is implemented
More informationEE577A FINAL PROJECT REPORT Design of a General Purpose CPU
EE577A FINAL PROJECT REPORT Design of a General Purpose CPU Submitted By Youngseok Lee - 4930239194 Narayana Reddy Lekkala - 9623274062 Chirag Ahuja - 5920609598 Phase 2 Part 1 A. Introduction The core
More informationTIMA Lab. Research Reports
ISSN 292-862 CNRS INPG UJF TIMA Lab. Research Reports TIMA Laboratory, 46 avenue Félix Viallet, 38 Grenoble France Technology Mapping for Area Optimized Quasi Delay Insensitive Circuits Bertrand Folco,
More informationImplementation of Ripple Carry and Carry Skip Adders with Speed and Area Efficient
ISSN (Online) : 2278-1021 Implementation of Ripple Carry and Carry Skip Adders with Speed and Area Efficient PUSHPALATHA CHOPPA 1, B.N. SRINIVASA RAO 2 PG Scholar (VLSI Design), Department of ECE, Avanthi
More informationPipelined Quadratic Equation based Novel Multiplication Method for Cryptographic Applications
, Vol 7(4S), 34 39, April 204 ISSN (Print): 0974-6846 ISSN (Online) : 0974-5645 Pipelined Quadratic Equation based Novel Multiplication Method for Cryptographic Applications B. Vignesh *, K. P. Sridhar
More informationVERY large scale integration (VLSI) design for power
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 7, NO. 1, MARCH 1999 25 Short Papers Segmented Bus Design for Low-Power Systems J. Y. Chen, W. B. Jone, Member, IEEE, J. S. Wang,
More informationBUILDING BLOCKS OF A BASIC MICROPROCESSOR. Part 1 PowerPoint Format of Lecture 3 of Book
BUILDING BLOCKS OF A BASIC MICROPROCESSOR Part PowerPoint Format of Lecture 3 of Book Decoder Tri-state device Full adder, full subtractor Arithmetic Logic Unit (ALU) Memories Example showing how to write
More informationLearning Outcomes. Spiral 2-2. Digital System Design DATAPATH COMPONENTS
2-2. 2-2.2 Learning Outcomes piral 2-2 Arithmetic Components and Their Efficient Implementations I understand the control inputs to counters I can design logic to control the inputs of counters to create
More information1. Designing a 64-word Content Addressable Memory Background
UNIVERSITY OF CALIFORNIA College of Engineering Department of Electrical Engineering and Computer Sciences Project Phase I Specification NTU IC541CA (Spring 2004) 1. Designing a 64-word Content Addressable
More informationChapter 6. CMOS Functional Cells
Chapter 6 CMOS Functional Cells In the previous chapter we discussed methods of designing layout of logic gates and building blocks like transmission gates, multiplexers and tri-state inverters. In this
More informationInvestigation and Comparison of Thermal Distribution in Synchronous and Asynchronous 3D ICs Abstract -This paper presents an analysis and comparison
Investigation and Comparison of Thermal Distribution in Synchronous and Asynchronous 3D ICs Brent Hollosi 1, Tao Zhang 2, Ravi S. P. Nair 3, Yuan Xie 2, Jia Di 1, and Scott Smith 3 1 Computer Science &
More informationReal Time NoC Based Pipelined Architectonics With Efficient TDM Schema
Real Time NoC Based Pipelined Architectonics With Efficient TDM Schema [1] Laila A, [2] Ajeesh R V [1] PG Student [VLSI & ES] [2] Assistant professor, Department of ECE, TKM Institute of Technology, Kollam
More informationDesign of 2-Bit ALU using CMOS & GDI Logic Architectures.
Design of 2-Bit ALU using CMOS & GDI Logic Architectures. Sachin R 1, Sachin R M 2, Sanjay S Nayak 3, Rajiv Gopal 4 1, 2, 3 UG Students, Dept. of ECE New Horizon College of Engineering, Bengaluru 4 Asst.
More informationINTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK DESIGN OF QUATERNARY ADDER FOR HIGH SPEED APPLICATIONS MS. PRITI S. KAPSE 1, DR.
More informationHardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University
Hardware Design Environments Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University Outline Welcome to COE 405 Digital System Design Design Domains and Levels of Abstractions Synthesis
More informationVLSI DESIGN OF REDUCED INSTRUCTION SET COMPUTER PROCESSOR CORE USING VHDL
International Journal of Electronics, Communication & Instrumentation Engineering Research and Development (IJECIERD) ISSN 2249-684X Vol.2, Issue 3 (Spl.) Sep 2012 42-47 TJPRC Pvt. Ltd., VLSI DESIGN OF
More informationAnalysis of Performance and Designing of Bi-Quad Filter using Hybrid Signed digit Number System
International Journal of Electronics and Computer Science Engineering 173 Available Online at www.ijecse.org ISSN: 2277-1956 Analysis of Performance and Designing of Bi-Quad Filter using Hybrid Signed
More informationof Soft Core Processor Clock Synchronization DDR Controller and SDRAM by Using RISC Architecture
Enhancement of Soft Core Processor Clock Synchronization DDR Controller and SDRAM by Using RISC Architecture Sushmita Bilani Department of Electronics and Communication (Embedded System & VLSI Design),
More informationOptimization of Robust Asynchronous Circuits by Local Input Completeness Relaxation. Computer Science Department Columbia University
Optimization of Robust Asynchronous ircuits by Local Input ompleteness Relaxation heoljoo Jeong Steven M. Nowick omputer Science Department olumbia University Outline 1. Introduction 2. Background: Hazard
More informationDesign And Implementation Of Reversible Logic Alu With 4 Operations
IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p-ISSN: 2278-8735 PP 55-59 www.iosrjournals.org Design And Implementation Of Reversible Logic Alu With 4 Operations
More informationCOE 561 Digital System Design & Synthesis Introduction
1 COE 561 Digital System Design & Synthesis Introduction Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum & Minerals Outline Course Topics Microelectronics Design
More informationLecture #21 March 31, 2004 Introduction to Gates and Circuits
Lecture #21 March 31, 2004 Introduction to Gates and Circuits To this point we have looked at computers strictly from the perspective of assembly language programming. While it is possible to go a great
More information