Implimentation of A 16-bit RISC Processor for Convolution Application

Similar documents
Implementation of RISC Processor for Convolution Application

Design & Analysis of 16 bit RISC Processor Using low Power Pipelining

Design of a Pipelined 32 Bit MIPS Processor with Floating Point Unit

A High Speed Design of 32 Bit Multiplier Using Modified CSLA

An Efficient Carry Select Adder with Less Delay and Reduced Area Application

FPGA Based Implementation of Pipelined 32-bit RISC Processor with Floating Point Unit

DESIGN OF HIGH PERFORMANCE LOW POWER 32 BIT RISC PROCESSOR

Design and Verification of Area Efficient High-Speed Carry Select Adder

DESIGN AND IMPLEMENTATION OF APPLICATION SPECIFIC 32-BITALU USING XILINX FPGA

Embedded Soc using High Performance Arm Core Processor D.sridhar raja Assistant professor, Dept. of E&I, Bharath university, Chennai

VLSI DESIGN OF REDUCED INSTRUCTION SET COMPUTER PROCESSOR CORE USING VHDL

Design of 16-bit RISC Processor Supraj Gaonkar 1, Anitha M. 2

VHDL Design and Implementation of ASIC Processor Core by Using MIPS Pipelining

Design of an Efficient 128-Bit Carry Select Adder Using Bec and Variable csla Techniques

An FPGA Implementation of 8-bit RISC Microcontroller

Novel Design of Dual Core RISC Architecture Implementation

16 BIT IMPLEMENTATION OF ASYNCHRONOUS TWOS COMPLEMENT ARRAY MULTIPLIER USING MODIFIED BAUGH-WOOLEY ALGORITHM AND ARCHITECTURE.

Implementation of Ripple Carry and Carry Skip Adders with Speed and Area Efficient

FPGA Implementation of A Pipelined MIPS Soft Core Processor

Design and Characterization of High Speed Carry Select Adder

A Novel Design of 32 Bit Unsigned Multiplier Using Modified CSLA

Introduction to Microcontrollers

Implementation of CMOS Adder for Area & Energy Efficient Arithmetic Applications

Keywords: Soft Core Processor, Arithmetic and Logical Unit, Back End Implementation and Front End Implementation.

Functional Verification of Enhanced RISC Processor

Study, Implementation and Survey of Different VLSI Architectures for Multipliers

International Journal of Digital Application & Contemporary research Website: (Volume 1, Issue 2, September 2012)

Design, Analysis and Processing of Efficient RISC Processor

Carry Select Adder with High Speed and Power Efficiency

Design and Implementation of 5 Stages Pipelined Architecture in 32 Bit RISC Processor

High Speed Han Carlson Adder Using Modified SQRT CSLA

An Efficient Fused Add Multiplier With MWT Multiplier And Spanning Tree Adder

An Optimized Montgomery Modular Multiplication Algorithm for Cryptography

FPGA Implementation of a High Speed Multiplier Employing Carry Lookahead Adders in Reduction Phase

A Review on RISC Processor

Designing an Improved 64 Bit Arithmetic and Logical Unit for Digital Signaling Processing Purposes

International Journal Of Global Innovations -Vol.6, Issue.II Paper Id: SP-V6-I1-P01 ISSN Online:

Anisha Rani et al., International Journal of Computer Engineering In Research Trends Volume 2, Issue 11, November-2015, pp.

Hardware Implementation High Speed RISC Processor for Convolution

EE 3170 Microcontroller Applications

Implementation of FFT Processor using Urdhva Tiryakbhyam Sutra of Vedic Mathematics

FPGA Implementation of MIPS RISC Processor

RECONFIGURABLE SPI DRIVER FOR MIPS SOFT-CORE PROCESSOR USING FPGA

Power Optimized Programmable Truncated Multiplier and Accumulator Using Reversible Adder

An Efficient Design of Sum-Modified Booth Recoder for Fused Add-Multiply Operator

Area Delay Power Efficient Carry-Select Adder

Chapter 4. The Processor

Area Delay Power Efficient Carry Select Adder

Vertical-Horizontal Binary Common Sub- Expression Elimination for Reconfigurable Transposed Form FIR Filter

Implementation of CMOS Adder for Area & Energy Efficient Arithmetic Applications

ASSEMBLY LANGUAGE MACHINE ORGANIZATION

REALIZATION OF AN 8-BIT PROCESSOR USING XILINX

COPY RIGHT. To Secure Your Paper As Per UGC Guidelines We Are Providing A Electronic Bar Code

High Performance and Area Efficient DSP Architecture using Dadda Multiplier

A 32-Bit RISC Processor for Convolution Application

Area Delay Power Efficient Carry-Select Adder

High speed Integrated Circuit Hardware Description Language), RTL (Register transfer level). Abstract:

DESIGN AND IMPLEMENTATION OF ADDER ARCHITECTURES AND ANALYSIS OF PERFORMANCE METRICS

2. BLOCK DIAGRAM Figure 1 shows the block diagram of an Asynchronous FIFO and the signals associated with it.

COMPUTER STRUCTURE AND ORGANIZATION

Low Power Circuits using Modified Gate Diffusion Input (GDI)

Implementation of Optimized ALU for Digital System Applications using Partial Reconfiguration

DC57 COMPUTER ORGANIZATION JUNE 2013

ECE260: Fundamentals of Computer Engineering

Design and Implementation of CVNS Based Low Power 64-Bit Adder

Implementation of Low Power Pipelined 64-bit RISC Processor with Unbiased FPU on CPLD

Chapter 4 Design of Function Specific Arithmetic Circuits

Let s put together a Manual Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

32 bit Arithmetic Logical Unit (ALU) using VHDL

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

CAD for VLSI 2 Pro ject - Superscalar Processor Implementation

Chapter 4. The Processor. Instruction count Determined by ISA and compiler. We will examine two MIPS implementations

VHDL IMPLEMENTATION OF FLOATING POINT MULTIPLIER USING VEDIC MATHEMATICS

Design of Delay Efficient Carry Save Adder

Chapter 4. The Processor

ARCHITECTURAL DESIGN OF 8 BIT FLOATING POINT MULTIPLICATION UNIT

Electronics Engineering, DBACER, Nagpur, Maharashtra, India 5. Electronics Engineering, RGCER, Nagpur, Maharashtra, India.

A Modified Radix2, Radix4 Algorithms and Modified Adder for Parallel Multiplication

Keywords: throughput, power consumption, area, pipeline, fast adders, vedic multiplier. GJRE-F Classification : FOR Code:

DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech)

ECE 341 Midterm Exam

Design of 2-Bit ALU using CMOS & GDI Logic Architectures.

DESIGN AND ANALYSIS OF COMPETENT ARITHMETIC AND LOGIC UNIT FOR RISC PROCESSOR

DESIGN AND PERFORMANCE ANALYSIS OF CARRY SELECT ADDER

Prachi Sharma 1, Rama Laxmi 2, Arun Kumar Mishra 3 1 Student, 2,3 Assistant Professor, EC Department, Bhabha College of Engineering

VLSI Implementation of Low Power Area Efficient FIR Digital Filter Structures Shaila Khan 1 Uma Sharma 2

Chapter 4. The Processor Designing the datapath

A High Performance Reconfigurable Data Path Architecture For Flexible Accelerator

VLSI Design and Implementation of High Speed and High Throughput DADDA Multiplier

Analysis of Radix- SDF Pipeline FFT Architecture in VLSI Using Chip Scope

INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET)

FPGA based Simulation of Clock Gated ALU Architecture with Multiplexed Logic Enable for Low Power Applications

University, Patiala, Punjab, India 1 2

REGISTER TRANSFER LANGUAGE

Implementation of Floating Point Multiplier Using Dadda Algorithm

The Processor: Datapath and Control. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

FPGA Implementation of Single Precision Floating Point Multiplier Using High Speed Compressors

DESIGN AND SIMULATION OF 1 BIT ARITHMETIC LOGIC UNIT DESIGN USING PASS-TRANSISTOR LOGIC FAMILIES

Transcription:

Advance in Electronic and Electric Engineering. ISSN 2231-1297, Volume 4, Number 5 (2014), pp. 441-446 Research India Publications http://www.ripublication.com/aeee.htm Implimentation of A 16-bit RISC Processor for Convolution Application B. Rajesh Kumar 1, Ravisaketh 2 and Santha Kumar 3 1,2 M.Tech Student, Dept of ECE, Audisankara College of Engineering & Technology, Jawaharlal Nehru University Guduru, Nellore, Andhra Pradesh, India. 3 Dept of ECE, Audisankara College of Engineering & Technology, Guduru, A.P, India. Abstract RISC architecture is used across a wide range of platforms from Cellular phones to super computers.in this paper,a 16- bit RISC processor is designed, which utilizes minimum functional units without compromising in performance. The design is based on architectural modification made in the incrementer circuit which is used in program counter.a Low Power Area Efficient carry select adder and a high speed low power modified Wallace tree multiplier has been designed to improving perfomance of ALU in RISC processor. The RISC processor has been realized using Verilog HDL.The individual modules are designed and tested at each level and finally integrated in the top level module.individual modules, toplevel module are simulated by using Xilinx ISE14.2. Synthesis, power estimation and area estimation is done by using Cadence.The power consumption obtanied is 1174 nw and area is 15041 nm 2.As against of referace RISC processor which is used Normal Carry select adder and Wallace tree multipler. Keywords: RISC, Lopower, modified Wallace tree multiplier, Carry select adder. 1. Introduction The trend in the recent past shows the RISC processors clearly outsmarting the earlier CISC processor architectures. The reasons have been the advantages, such as its simple, flexible and fixed instruction format and the RISC processor is its ability to support single cycle operation, meaning that the instruction is fetched from the

442 B. Rajesh Kumar et al instruction memory at the maximum speed of the memory. RISC processors in general, are designed to achieve this by pipelining, where there is a possibility of stalling of clock cycles due to wrong instruction fetch when jump type instructions are encountered. This reduces the efficiency of the processors. This paper describes a RISC architecture in which, single cycle operation is obtained without using a pipelined design [1][2]. The development of CMOS technology provides very high density and high performance integrated circuits. The performance provided by the existing devices has created a never-ending greed for increasingly better performing devices. This predicts the use of a whole RISC processor as a basic device by the year 2020. However, as the density of IC increases, the power consumption becomes a major threatening issue along with the complexity of the circuits. Hence, it becomes necessary to implement less complex, low power processor designs[3]. Program counter is one of the most complex building blocks of the processor design. It performs mainly two operations, namely, incrementing and loading. In order to address this issue, the present work establishes a novel design of an incrementer structure. The second part of this work concentrates on the complexity reduction in ALU by optimizing the design of arithmetic circuits. In this work, we have designed and developed a 16-bit single cycle RISC processor. In order to improve the performance, modification on incrementer circuit and Low power area efficent carry select adder circuit and modified wallace tree multipler have been done and modified structure has been integrated into the design and the performance is validated[5],[8]. In this paper we are main focus on ALU design. Section II presents the design of the RISC CPU. Section III presents the implementation of Low power area efficent carry select adder circuit and modified Wallace tree multipler. Section IV gives the ASIC implementation results and analysis. Section IV concludes. Fig. 1: Proposed Block Diagram of RISC processor. 2. Design of 16-BIT RISC CPU 2.1 Architecture The architecture of the proposed RISC CPU is a uniform 16-bit instruction format, single cycle processor. It has a load/store architecture, where the operations will only

Implimentation of A 16-bit RISC Processor for Convolution Application 443 be performed on registers, and not on memory locations. It follows the classical von- Neumann architecture with just one common memory bus for both instructions and data The instruction set consists of Load, store and HALT type of instructions. The Halt instruction acts as a border line between the instruction and data memory.each of the register is of 16-bits width capacity. 2.2 Program Counter The Program Counter (PC) is a 16-bit latch that holds the memory address of location, from which the next machine language instruction will be fetched by the processor. The proposed PC is the largest sub-block and second to the control unit in complexity. 2.3 Arithmetic and Logic unit The arithmetic and logic unit (ALU) performs arithmetic and logic operations. It also performs the bit operations such as rotate and shift by a defined number of bit positions. The proposed ALU contains three sub-modules, viz. arithmetic, logic and shift modules. The arithmetic unit involves the execution of addition operations and generates Sign flag and Zero flag as per the result shown in the process. In order to reduce the complexity of the adder circuits used in the arithmetic unit of the RISC CPU, a very fast and low power carry select adder circuit has been introduced. The ALU also consists of a modified Wallace tree multiplier, which uses compressor circuits to achieve low power and improved speed of operation. The multiplier is designed to execute in a single cycle. Hence, it satisfies the requirement of the RISC design, to execute single cycle instructions 3. ALU Design Mainly ALU design have LHI, LLI,Xor, left shift, right shift, adder,multiplier and halt. ALU by optimizing the design of arithmetic circuits. The previous works in literature focus on energy efficient arithmetic circuits. In order to increase the operating speed and power efficiency of the processor, we have come out with A Low Power Carry Select Adder, A Modified Wallace tree multiplier is proposed in ALU. 3.1 Low power area efficent carry select adder circuit CSA adder, like ripple-carry adders, is the carry has to to travel through every full adder block. There is a way to improve the speed by duplicating the hardware due to the fact that the carry can only be either 0 or 1. The method is based on the conditional sum adder and extended to a carry-select adder. With one RCA.each computing the case of the one polarity of the carry-in, the sum can be obtained with a 2x1 multiplexer with the carry-in as the select signal. The basic idea of this work is to use Binary to Excess-1 Converter (BEC) instead of RCA with in the regular CSLA to achieve lower area and power consumption. The main advantage of this BEC logic comes from the lesser number of logic gates than the n-bit Full Adder (FA) structure.this work is to use BEC instead of the RCA with cin=1

444 B. Rajesh Kumar et al in order to reduce power consumption of the regular CSA. To replace the n-bit RCA, an n+1bit BEC is required[5]. Fig. 2: Low power area efficent carry select adder circuit. 3.2 Modified Wallace tree multiplier Wallace tree multiplier is used in ALU in order to perform the multiplication operations, The purpose of defining the Wallace tree multiplier in this ALU is because of the full adders used in the above carry select adder, the Wallace tree multiplier comprises of full adders the full adders used in CSA can be called to the Wallace tree multiplier to perform the multiplication operation which in turn leads to reduction of powe [7][6]. Fig. 3: Modified Wallace tree Multiplier.

Implimentation of A 16-bit RISC Processor for Convolution Application 445 3.3 Register File The register file consists of 8 general purpose registers of 16-bits capacity each. These register files are utilized during the execution of arithmetic and datacentric instructions. It is fully visible to the programmer. It can be addressed as both source and destination using a 3-bit identifier. The register addresses are of 3-bit length, with the range of 000 to 111. The load instruction is used to load the values into the registers and store instruction is used to retrieve the values back to the memory to obtain the processed outputs back from the processor. 3.4 Instruction Decoder Unit (IDU) Program Counter output is input to the IDU, the purpose of IDU is to load and store the data depending on the WRENA (write enable) signal,and also to generate the opcode which is given to the ALU for its operations. 00000 Des addre Datain while ALU operations the status of the counter is as given below Opcode S add Des add 00000 4. ASIC Implementation and Results The RISC processor has been realized using Verilog HDL.The individual modules are designed and tested at each level and finally integrated in the top level module. Individual modules,toplevel module are simulated by using Xilinx ISE14.2. Synthesis, power estimation and area estimation is done by using Cadence. Fig. 5: RISC Simulation output.

446 B. Rajesh Kumar et al 5. Conclusion The design of a single cycle 16-bit RISC processor has been presented. A Low power adder and multiplier structures have been employed in the RISC architecture. The processor has been designed for executing based on the user requirements. 6. Acknowledgment We place our gratitude on record to the Department of Electronics and Communication Engineering, Audisankara college of Engineering & Technology. References [1] A True Single Cycle RISC Processor without Pipelining. ESS Design White Paper RISC Embedded Controller. [2] A Low Power 16-Bit RISC Microprocessor Using ECRL Circuits [3] Springer.Guide to RISC Processors- for Programmers and Engineers. [4] Samiappa Sakthikumaran et al., A Very Fast and Low Power Incrementer and Decrementer Circuits, International Journal of Computer Communication and Information System (IJCCIS) Vol2.No.1 2011, pp. 200-203 [5] Low-Power and Area-Efficient Carry Select Adder: IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 20, NO. 2, FEBRUARY 2012. [6] A Novel VLSI Architecture for Low power FIR Filter: Published in International Journal of Advanced Engineering & Application, Jan 2011 Issue. [7] Wallace Tree Multiplier for RISC Processor, 3rd InternationalConference on Electronics Computer Technology- ICECT 2011. [8] K. Nishimura, T. Kudo, and H. Amano, Educational 16-bit microprocessor PICO-16, Proc. 3rd Japanese FPGA/PLD design conference and exhibit (Japanese Edition), Tokyo, July 19 21, 1995, pp. 589 595.