Design and Implementation of 64 bit RISC Processor on FPGA

Mr. Mohammad Gousuddin H Maniyar (1), Department of Electronics & Communication Engineering, R V College of Engineering, Bengaluru
Mrs. Sujatha Hiremath (2), Assistant Professor, Department of Electronics & Communication Engineering, R V College of Engineering, Bengaluru

ABSTRACT

In the early years this computer was a stack architecture, later replaced by a RISC architecture. Now the intent is to replace the hypothetical, emulated computer by a real one. This idea was made realistic by the advent of programmable hardware components called field programmable gate arrays (FPGA). Processor architectures fall into two broad classes: Complex Instruction Set Computer (CISC) and Reduced Instruction Set Computer (RISC). The CISC concept is an approach to Instruction Set Architecture (ISA) design that emphasizes doing more with each instruction, using a wide variety of addressing modes and a varying number of operands in various locations. As a result, the instructions have widely varying lengths and execution times, demanding a very complex control unit that occupies a large area on chip. On the other hand, the RISC processor has a reduced number of instructions, fixed instruction length, more general purpose registers, a load-store architecture and simplified addressing modes, which make individual instructions execute faster, achieve a net gain in performance, and give an overall simpler design with less silicon consumption compared to CISC. Real processors are far more complex than the one presented here. This work concentrates on the fundamental concepts rather than on their elaboration. It strives for a fair degree of completeness of facilities, but refrains from their optimization. In fact, the dominant part of the vast size and complexity of modern processors and software is due to speed-up, called optimization (improvement would be a more honest word).
It is the main culprit in obfuscating the basic principles, making them hard, if not impossible, to study. In this light, the choice of a RISC (Reduced Instruction Set Computer) is obvious. The use of an FPGA provides a substantial amount of freedom for design. Yet the hardware designer must be much more aware of the availability of resources and of limitations than the software developer. Also, timing is a concern that usually does not occur in software, but arises unavoidably in circuit design.

I. INTRODUCTION

RISC processors are increasingly widely used in every field. Most microprocessors in today's market are based on either RISC or CISC architecture. The RISC architecture boosts computer speed and is also used in control algorithms. With a RISC processor, the time required to execute each instruction can be shortened and the number of cycles reduced. Reference [1] describes a RISC design in which 2-cycle operation is obtained using a pipelined design. The pipelined architecture minimizes latency and increases speed; the innovation is a 2-stage pipeline that works on both the positive and the negative clock edge, which minimizes latency, increases speed and also reduces instruction stalls. The whole architecture of that RISC processor works in 2 cycles. The fixed instruction size allows instructions to be pipelined easily. The RISC processor has a flexible architecture. The clock gating technique is used to minimize power.

The CISC concept is an approach to Instruction Set Architecture (ISA) design that emphasizes doing more with each instruction, using a wide variety of addressing modes and a varying number of operands in various locations. As a result, the instructions have widely varying lengths and execution times, demanding a very complex control unit that occupies a large area on chip.
NCICT-2016 Special Issue 2 Page 167

On the other hand, the RISC processor has a reduced number of instructions, fixed instruction length, more general purpose registers, a load-store architecture and simplified addressing modes, which make individual instructions execute faster, achieve a net gain in performance, and give an overall simpler design with less silicon consumption compared to CISC. Real processors are far more complex than the one presented here. This work concentrates on the fundamental concepts rather than on their elaboration. It strives for a fair degree of completeness of facilities, but refrains from their optimization.

Clock gating is one of the most popular methods to reduce dynamic power consumption. Power is consumed by combinational logic whose values change on every clock edge, so gating logic is introduced and the clock is turned off. In the work of [2], the design of a 16-bit RISC processor is introduced and implemented for high efficiency and low power. That architecture supports 33 instructions. Its instruction cycle consists of 2-stage pipelining and performs the fetch, decode, execute and write-back operations simultaneously. The control unit generates signals from the given instructions. The architecture supports arithmetic, logical, shift and rotate operations.

A RISC processor requires very few data types and performs simple operations. It also supports very few addressing modes, which are mostly register-based. Many of the instructions operate on data held in internal registers. LOAD and STORE are the only instructions that access data in external memory. Decoding becomes easier because the instruction length is fixed. Executing instructions in parallel in pipelined stages improves the overall throughput of the processor, but it introduces some hazards into its operation.
Data hazards are generated when succeeding instructions share source and destination resources; this happens when the source of an instruction is the destination of the previous instruction [1]. They can be prevented by using the forwarding method. Structural hazards are generated when program and data memory are shared. By designing a prefetch queue into the processor, structural hazards can be removed.

Analysis of the RISC architecture raises many important issues in computer architecture. Most RISC computers have the same key elements:
A limited and simple instruction set.
A large number of GPRs (General Purpose Registers).
Optimization of the instruction pipeline.

The organization of RISC computers can be represented by three main components:
ALU (Arithmetic Logic Unit): performs the actual computation and processing of data.
CU (Control Unit): controls the movement of data and instructions into and out of the CPU and controls the operation of the ALU.
RS (Register Set): a minimal internal memory, which consists of a set of storage locations.

Compared to CISC, a RISC CPU has more advantages, such as faster speed, simplified structure and easier implementation. RISC CPUs are in extensive use in embedded systems. Therefore, designing a RISC CPU based on MIPS is the necessary choice.

II. LITERATURE SURVEY

Most microprocessors in today's market are based on either RISC or CISC architecture. The RISC architecture boosts computer speed and is also used in control algorithms. With a RISC processor, the time required to execute each instruction can be shortened and the number of cycles reduced. The paper in [1] describes a RISC design in which 2-cycle operation is obtained using a pipelined design.
The pipelined architecture minimizes latency and increases speed; the innovation is a 2-stage pipeline that works on both the positive and the negative clock edge, which minimizes latency, increases speed and also reduces instruction stalls. The whole architecture of that RISC processor works in 2 cycles. The fixed instruction size allows instructions to be pipelined easily. The RISC processor has a flexible architecture.

In [2], the second approach, the clock gating technique is used to minimize power. This is one of the most popular methods to reduce dynamic power consumption. Power is consumed by combinational logic whose values change on every clock edge, so gating logic is introduced and the clock is turned off. In that work, the design of a 16-bit RISC processor is introduced and implemented for high efficiency and low power. The architecture supports 33 instructions. The instruction cycle consists of 2-stage pipelining and performs the fetch, decode, execute and write-back operations simultaneously. The control unit generates signals from the given instructions. The architecture supports arithmetic, logical, shift and rotate operations.

In [3], the proposed RISC processor based on MIPS is an effort towards an efficient processor suitable for various applications. CISC processors have dominated the marketplace over the years. They support various addressing modes and various data types, like other complex processors. The instruction length varies from instruction to instruction. They generally access data from external memory. They are basically implemented using microprogrammed control. There is little gap between the instructions of a CISC processor and higher-level language statements. Although they may save memory space, their design is complicated and their instructions are variable in length, so special hardware is required to mark instruction boundaries. Detailed study has shown that simple instructions are used about 80% of the time and that complex instructions can be replaced by groups of simple instructions.

As noted in [4], a large number of new processors have entered the market. A few of these were designed as processor cores, i.e. a hardware description language such as Verilog HDL or VHDL (Very High Speed Integrated Circuit Hardware Description Language) is used to write a particular version of the processor. This helps the designer to use them in any embedded application, simply by embedding a particular application in the processor. RISC (Reduced Instruction Set Computer) is an efficient computer architecture that can be used for low power and high speed applications. RISC processors are important in the application of pipelining. The crux of the processor is the Instruction Set Architecture used for developing it. The overall merit of the processor depends on how the Instruction Set Architecture is utilized.
Instruction Set Architecture is an abstract interface between the low-level system of the machine and the hardware that contains all the required information about the machine. A lot of research is being carried out in the field of processors to address performance issues. Nowadays it is mandatory to use a machine that is efficient in terms of speed, power, performance and size. Though there are tradeoffs between all the performance parameters, research is being carried out to satisfy all of them. Though many instruction sets are available in the market, an instruction set that offers low power, high speed and low area needs to be selected for efficient functioning. To satisfy all the above requirements we consider the MIPS (Microprocessor without Interlocked Pipelined Stages) Instruction Set Architecture. This paper also concentrates on reducing the power used by the processor in order to satisfy the low power constraint of the developed processor.

III. PROBLEM STATEMENT

The RISC architecture boosts computer speed and is also used in control algorithms. With a RISC processor, the time required to execute each instruction can be shortened and the number of cycles reduced. The RISC processor consists mainly of the ALU, universal shift register and barrel shifter blocks. We have used a modified Harvard architecture that uses separate memories for instructions and data. The crux of the processor is the Instruction Set Architecture used for developing it. The overall merit of the processor depends on how the Instruction Set Architecture is utilized. A lot of research is being carried out in the field of processors to address performance issues. Nowadays it is mandatory to use a machine that is efficient in terms of speed, power, performance and size.
Though there are tradeoffs between all the performance parameters, research is being carried out to satisfy all of them.

IV. OBJECTIVE

A large number of new processors have entered the market. A few of these were designed as processor cores, i.e. a hardware description language such as Verilog HDL or VHDL (Very High Speed Integrated Circuit Hardware Description Language) is used to write a particular version of the processor. This helps the designer to use them in any embedded application, simply by embedding a particular application in the processor. RISC (Reduced Instruction Set Computer) is an efficient computer architecture that can be used for low power and high speed applications. RISC processors are important in the application of pipelining. The crux of the processor is the Instruction Set Architecture used for developing it. The overall merit of the processor depends on how the Instruction Set Architecture is utilized. Instruction Set Architecture is an abstract interface between the low-level system of the machine and the hardware that contains all the required information about the machine. A lot of research is being carried out in the field of processors to address performance issues. The objectives of this work are:
1) To develop and implement a RISC processor algorithm with high efficiency and low power consumption.
2) To maximize the performance and operations of the processor for many applications.
3) To maximize the synergy between hardware and software, as software scheduling and optimizing compilers for RISC architectures try to do.
4) To develop the design with a suitable technique and download it onto an FPGA board.
5) To study RISC processors for many further applications.

V. METHODOLOGY

In the present research work, the design of a 64-bit RISC processor is introduced and implemented for high efficiency and low power. The architecture supports a minimum of 35 instructions. The instruction cycle consists of single-stage pipelining and performs the fetch, decode, execute and write-back operations simultaneously. The control unit generates signals from the given instructions. The architecture supports arithmetic, logical, shift and rotate operations. A RISC processor requires very few data types and performs simple operations. It also supports very few addressing modes, which are mostly register-based. Many of the instructions operate on data held in internal registers. LOAD and STORE are the only instructions that access data in external memory. Decoding becomes easier because the instruction length is fixed. Executing instructions in parallel in pipelined stages improves the overall throughput of the processor, but it introduces some hazards into its operation.
Data hazards are generated when succeeding instructions share source and destination resources; this happens when the source of an instruction is the destination of the previous instruction. They can be prevented by using the forwarding method. Structural hazards are generated when program and data memory are shared. By designing a prefetch queue into the processor, structural hazards can be removed. Control hazards arise when instructions execute non-sequentially; flushing is used to remove them.

Analysis of the RISC architecture raises many important issues in computer architecture. Most RISC computers have the same key elements:
A limited and simple instruction set.
A large number of GPRs (General Purpose Registers).
Optimization of the instruction pipeline.

The organization of RISC computers can be represented by three main components:
ALU (Arithmetic Logic Unit): performs the actual computation and processing of data.
CU (Control Unit): controls the movement of data and instructions into and out of the CPU and controls the operation of the ALU.
RS (Register Set): a minimal internal memory, which consists of a set of storage locations.

FLOATING POINT UNIT: FPGA structures are optimized for fixed point computations. Implementing floating point (FP) arithmetic requires a large amount of resources, so the maximum clock frequency becomes relatively low. To overcome this in part, efficient implementations of FP arithmetic have been designed, for example a sequential and pipelined divider and square root block [14], logarithm and exponential functions [8], and a set of specific libraries [9]. The ALU of the CPU mentioned in the previous section handles integer numbers only. REALs in the CPD tool are single precision floating point numbers of the IEEE 754 standard.
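As a small illustration of the single precision IEEE 754 format mentioned above, the following Python sketch splits a value into its sign, biased exponent and fraction fields. It is only a software model for checking bit patterns, not part of the hardware design, and the function names `decode_single` and `encode_single` are assumptions made for this example.

```python
import struct


def decode_single(x):
    """Split x into IEEE 754 single precision fields (sign, exponent, fraction)."""
    # Pack as a 32-bit float, then reinterpret the same bytes as an unsigned integer.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF        # 23 stored bits; the hidden leading 1 is not stored
    return sign, exponent, fraction


def encode_single(sign, exponent, fraction):
    """Rebuild a Python float from single precision fields."""
    bits = (sign << 31) | (exponent << 23) | fraction
    return struct.unpack(">f", struct.pack(">I", bits))[0]
```

For example, `decode_single(1.0)` returns `(0, 127, 0)`: sign 0, biased exponent 127 (true exponent 0) and an all-zero fraction.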
Apart from basic operations like addition, subtraction, multiplication and division, the controller FPU also performs comparisons (equal, more than, etc.), two-way conversion between integer and floating point with rounding and truncation, determination of absolute value, and sign inversion. Multiplication of two floating point numbers is executed by multiplying the mantissas and adding the exponents. A simplified architecture of the floating point multiplier is shown in Fig. 1. The normalization block on the right side is a simple shift register, which conditionally right-shifts the result of the mantissa multiplication. The result is rounded using round to nearest even mode. Rounding requires three additional bits determined in the A3 block, i.e. the guard, round and sticky bits. Four versions of the multiplier have been investigated. They differ in the design of the fixed point multiplier block A3, whose structure has an essential influence on final implementation parameters such as FPGA resource requirement and calculation speed. The standard serial paper-and-pencil multiplication algorithm as well as parallel, pipelined, fast array multipliers with or without FPGA embedded hardware multipliers have been examined. Performance tests of the multiplier versions and evaluations of the implementation parameters have shown that the paper-and-pencil algorithm is a good compromise between speed and resource requirements. Therefore, this version has been chosen for the FPU floating point multiplier. Similar evaluations have been carried out for the other components of the FPU before final implementation.

Fig. 1. Architecture of floating point multiplier.

In this project the architecture of the proposed RISC CPU is a single cycle, non-pipelined processor with a uniform 64-bit instruction format. It has a load/store architecture, where operations are performed only on registers and not on memory locations. It follows the classical von Neumann architecture with just one common memory bus for both instructions and data. A total of 35 instructions are designed as a first step in the development of the processor. The instruction set consists of Logical, Immediate, Jump, Load, Store and HALT types of instructions. The HALT instruction acts as a border line between the instruction and data memory. This gives the programmer who uses this processor core the flexibility to define their own instruction and data memory within the allotted 128 memory registers. Each register is 64 bits wide.

Program Counter: The Program Counter (PC) is a 64-bit latch that holds the memory address of the location from which the next machine language instruction will be fetched by the processor. It uses a 12-bit pointer to index the instruction memory. It additionally uses a 12-bit pointer to point to the data memory, which is used only when a Load/Store instruction is encountered for execution.

Pre-fetching: The process of fetching the next instruction or instructions into an event queue before the current instruction is complete is called pre-fetching.
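The pre-fetching idea can be sketched in software as a small first-in first-out queue that fetches ahead of the decoder. This is a minimal behavioural sketch under stated assumptions: the class name `PrefetchQueue`, the queue depth of 4 and the list-based instruction memory are hypothetical choices for illustration and are not taken from the design described in this paper.

```python
from collections import deque


class PrefetchQueue:
    """Behavioural model of an instruction prefetch queue (illustrative only)."""

    def __init__(self, memory, depth=4):
        self.memory = memory      # instruction memory modelled as a Python list
        self.depth = depth        # assumed queue depth
        self.queue = deque()
        self.fetch_addr = 0       # address of the next item to prefetch

    def fill(self):
        """Fetch ahead until the queue is full or memory is exhausted."""
        while len(self.queue) < self.depth and self.fetch_addr < len(self.memory):
            self.queue.append(self.memory[self.fetch_addr])
            self.fetch_addr += 1

    def next_item(self):
        """Hand the oldest prefetched item to the decoder, or None if empty."""
        return self.queue.popleft() if self.queue else None

    def flush(self, target):
        """Discard prefetched items after a jump and restart fetching at target."""
        self.queue.clear()
        self.fetch_addr = target
```

In this model the decoder calls `next_item()` while `fill()` keeps the queue topped up, so the next items are already available when the current one finishes. A jump calls `flush()`, which discards the items that were fetched along the wrong path.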
An early 16-bit microprocessor, the Intel 8086/8, pre-fetches into an on-board queue up to six bytes following the byte currently being executed, thereby making them immediately available for decoding and execution, without latency.

Pipelining: Pipelining instructions means starting or issuing an instruction prior to the completion of the currently executing one. The current generation of machines carries this to a considerable extent. The PowerPC 601 has 20 separate pipeline stages in which various portions of various instructions are executing simultaneously.

VI. WORKING

Arithmetic and Logic unit: The arithmetic and logic unit (ALU) performs arithmetic and logic operations. It also performs bit operations such as rotate and shift by a defined number of bit positions. The proposed ALU contains three submodules, viz. arithmetic, logic and shift modules. The arithmetic unit executes addition and multiplication operations and generates the Sign flag and Zero flag. The shift module is used for executing instructions such as rotation and shift operations [9].

Register File: The register file consists of 8 general purpose registers of 64 bits each. These registers are used during the execution of arithmetic and data-centric instructions. The load instruction loads values into the registers, and the store instruction writes the values back to memory so that the processed outputs can be obtained from the processor.

Instruction fetch unit: The function of the instruction fetch unit is to obtain an instruction from the instruction memory using the current value of the PC and to increment the PC value for the next instruction, as shown in Figure 2. Since this design uses a 64-bit data width, byte addressing had to be implemented to access the registers and word addressing to access the instruction memory. The instruction fetch component contains the following logic elements, which are implemented in VHDL: a 64-bit program counter (PC) register, an adder to increment the PC by four, the instruction memory, a multiplexer, and an AND gate used to select the value of the next PC.

Fig. 2. Architecture of 64 bit RISC Processor

Floating Point Unit: A floating point unit (FPU), also known as a math co-processor or numeric processor, is a specialized co-processor that manipulates numbers more quickly than the basic microprocessor circuitry. The FPU does this by means of instructions that focus entirely on large mathematical operations. An FPGA implementation of a 64-bit double precision floating point unit is presented, which performs addition, subtraction, multiplication and division. This kind of unit can be tremendously useful in the FPGA implementation of complex systems that benefit from the parallelism of the FPGA device [9].

FP Add: In the module FP Add, the input operands are separated into their mantissa and exponent components. The exponents are then compared to check which operand is larger. The larger operand goes into "mantissa_large" and "exponent_large". Similarly, the smaller operand goes into "mantissa_small" and "exponent_small". Once the sign and exponent of the output have been determined, the mantissa with the smaller exponent is right-shifted before the addition is performed.

FP Sub: The input operands are separated into two components, namely mantissa and exponent. Subtraction is similar to addition in that the mantissa with the smaller exponent is shifted to the right before the subtraction is performed [10].
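The alignment step described for FP Add can be sketched in Python as follows. This is a simplified model under stated assumptions: both operands are positive, normalized double precision values, the bits shifted out during alignment are truncated rather than rounded, and the helper names `to_fields`, `from_fields` and `fp_add_magnitudes` are hypothetical. Only the variable names `mantissa_large`, `exponent_large`, `mantissa_small` and `exponent_small` come from the description above.

```python
import struct

FRACTION_BITS = 52                 # stored fraction bits in double precision
MANT_BITS = FRACTION_BITS + 1      # 53 bits including the hidden leading 1


def to_fields(x):
    """Return (biased exponent, 53-bit mantissa) of a positive normalized double."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    exponent = (bits >> FRACTION_BITS) & 0x7FF
    mantissa = (bits & ((1 << FRACTION_BITS) - 1)) | (1 << FRACTION_BITS)
    return exponent, mantissa


def from_fields(exponent, mantissa):
    """Rebuild a positive double from (biased exponent, 53-bit mantissa)."""
    bits = (exponent << FRACTION_BITS) | (mantissa & ((1 << FRACTION_BITS) - 1))
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


def fp_add_magnitudes(exp_a, mant_a, exp_b, mant_b):
    """Add two positive operands: align the smaller one, add, renormalize."""
    # Decide which operand is larger (exponent first, mantissa as tie-breaker).
    if (exp_a, mant_a) >= (exp_b, mant_b):
        exponent_large, mantissa_large = exp_a, mant_a
        exponent_small, mantissa_small = exp_b, mant_b
    else:
        exponent_large, mantissa_large = exp_b, mant_b
        exponent_small, mantissa_small = exp_a, mant_a

    # Right-shift the smaller mantissa by the exponent difference (truncating).
    mantissa_small >>= exponent_large - exponent_small

    total = mantissa_large + mantissa_small
    exponent = exponent_large
    if total >> MANT_BITS:         # carry out of bit 52: shift back into range
        total >>= 1
        exponent += 1
    return exponent, total
```

As a usage example, `from_fields(*fp_add_magnitudes(*to_fields(1.5), *to_fields(0.25)))` evaluates to `1.75`. A hardware implementation would additionally keep guard, round and sticky bits for rounding and would handle signs, zeros, denormals, infinities and NaNs, all of which this sketch leaves out.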
FP Mul: Multiplying all 53 bits of var1 by all 53 bits of var2 results in a 106-bit product. 53-bit by 53-bit multipliers are not available in Altera FPGAs, so the multiplication is broken down into smaller multiplications and the results are added together to give the final 106-bit product. The module FP Mul breaks up the multiplication into smaller 24-bit by 17-bit multiplications.

FP Div: Division is performed in FP Div. The exponent is obtained by adding 1023 to the exponent of var1 and then subtracting the exponent of var2 from this sum. The mantissa of var1 is the dividend and the mantissa of var2 is the divisor.

F. Memory Unit: The load and store instructions are used to access this module. The memory access stage is where, if necessary, system memory is accessed for data. If the instruction requires a write to the data memory, it is also done in this stage. To avoid additional complications, it is assumed that a single read or write is accomplished within a single CPU clock cycle.

G. Instruction Set: The instruction set used in this architecture consists of arithmetic, logical, memory and branch instructions. It has short (8-bit) and long (16-bit) instructions, which are shown in Table I. 8-bit instructions are used for all arithmetic and logical operations. 16-bit instructions are used for all memory transactions and jump instructions. There are special instructions to access external ports. The architecture also has 64-bit general purpose registers that can be used in all operations. For every jump instruction, the processor automatically flushes the data in the pipeline, so as to avoid any misbehavior.

TABLE I. INSTRUCTION SET (Short Instruction Format; Long Instruction Format)

H. Low Power Technique: There are several different RTL and gate-level design strategies for reducing power. In the present work, clock gating is used to reduce dynamic power. In this method, the clock is applied only to the modules that are working at that instant [11]. Clock gating is a dynamic power reduction method in which the clock signals are stopped for selected register banks during the time when the stored logic values are not changing. The clock pulse for the low power technique is shown in Fig. 3. The input to the low power unit is the global clock and its output is the gated clock; the module blocks the main clock under the following conditions.
1. When the instruction is halt.
2. When there is a continuous NOP operation.
3. When the program counter fails to increment.

Fig. 3. Clock Pulses of Low Power Unit

VII. SIMULATION RESULTS

Fig. 4. Simulation Waveform of Double Precision Floating Point
Fig. 5. Simulation Waveforms of 64-bit RISC Processor

VIII. CONCLUSION

In this paper, an FPGA-based low power pipelined 64-bit RISC processor with a double precision floating point unit is designed. ModelSim is used to verify the simulation results. The design is implemented on a Virtex-5 FPGA, on which arithmetic, branch and logical functions are verified. The pipeline does not flush when a branch instruction occurs, as it is implemented using dynamic branch prediction. Branch prediction increases the flow in the instruction pipeline and achieves high effective performance. The proposed architecture is able to prevent the pipeline from executing a single instruction multiple times. Whenever the processor enters sleep mode, it disables the clock enable signal, which saves power through the low power technique. The proposed design can handle more data processing for data-intensive applications like packet processing. For a 64-bit operation this processor needs only one instruction, whereas a 32-bit RISC processor needs more than one. Carry select adder structures are employed and verified through exhaustive simulation; they lower power dissipation and increase speed. The design is expandable up to 35 instructions. The clock gating technique disables the portions that are not required, so the flip-flops do not change state and power is saved. This processor with floating point operations can be used in many applications such as signal processing, graphics, medical equipment and mathematical computation. It can also run any control algorithm.
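The clock gating behaviour summarized in this paper (the main clock is blocked on a halt instruction, on continuous NOP operation, and when the program counter fails to increment) can be sketched as a small Python model. This is an illustrative sketch only: the opcode names `HALT` and `NOP`, the function name `gated_clock`, and the reading of "continuous NOP" as two consecutive NOP instructions are assumptions, not details taken from the implemented design.

```python
HALT = "HALT"   # hypothetical opcode name
NOP = "NOP"     # hypothetical opcode name


def gated_clock(global_clock, opcode, prev_opcode, pc, prev_pc):
    """Return the gated clock level for one evaluation of the low power unit."""
    halted = opcode == HALT
    # Assumption: "continuous NOP" means the current and previous opcodes are both NOP.
    continuous_nop = opcode == NOP and prev_opcode == NOP
    pc_stalled = pc == prev_pc          # program counter failed to increment
    enable = not (halted or continuous_nop or pc_stalled)
    # The global clock passes through only while the enable is asserted.
    return bool(global_clock and enable)
```

For instance, `gated_clock(True, "ADD", "LOAD", 4, 0)` returns `True`, while `gated_clock(True, "HALT", "ADD", 8, 4)` returns `False`. In real hardware the enable is normally latched while the clock is low so that the gated clock does not glitch; the model above ignores such timing details.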
REFERENCES

[1] Preetam Bhosle, Hari Krishna Moorthy, "FPGA Implementation of Low Power Pipelined 32-bit RISC Processor", Proceedings of International Journal of Innovative Technology and Exploring Engineering (IJITEE), ISSN: , Vol-1, Issue-3, August
[2] Galani Tina G, Riya Saini and R. D. Daruwala, "Design and Implementation of 32-bit RISC Processor using Xilinx", International Journal of Emerging Trends in Electrical and Electronics (IJETEE), ISSN: , Vol-5, Issue 1, July
[3] Imran Mohammad, Ramananjaneyulu, "FPGA Implementation of a 64-bit RISC Processor Using VHDL", Proceedings of International Journal of Reconfigurable and Embedded Systems (IJRES), ISSN: , Vol-1, No. 2, July
[4] Aboobacker Sidheeq V. M, "Four Stage Pipelined 16 bit RISC on Xilinx Spartan 3AN FPGA", Proceedings of International Journal of Computer Applications, ISSN: , Vol-48, June
[5] Tashfia Afreen, Minhaz Uddin Md Ikram, Aqib Al Azad, and Iqbalur Rahman Rokon, "Efficient FPGA Implementation of Double Precision Floating Point Unit Using Verilog HDL", International Conference on Innovations in Electrical and Electronics Engineering (ICIEE'2012), October 2012, Dubai (UAE).
[6] Addanki Purna Ramesh, Ch. Pradeep, "FPGA Based Implementation of Double Precision Floating Point Adder/Subtractor Using Verilog", Proceedings of International Journal of Emerging Technology and Advanced Engineering, ISSN , Vol-2, Issue 7, July
[7] Indu, Arun Kumar, "Design of Low Power Pipelined RISC Processor", International Journal of Advanced Research in Electrical & Electronics & Instrumentation Engineering, vol. 2, no. 3, pp , August
[8] Priyanka Trivedi, Rajan Prasad Tripathi, "Low Power Pipelined RISC Processor: A Review", IJSRD, vol. 2, no. 4, pp , July
[9] Jagrit Kathuria, M. Ayoubkhan, Arti Noor, "A Review of Clock Gating Techniques", MIT International Journal of Electronics and Communication Engineering, vol. 1, no. 2, August
[10] B. Ramkumar and Harish M Kittur, "Low-Power and Area-Efficient Carry Select Adder", IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 2, 2012, pp
[11] J. Ravindra, T. Anuradha, "Design of Low Power RISC Processor by Applying Clock Gating Technique", International Journal of Engineering Research and Applications, ISSN , Vol-2, Issue-3, May-Jun
[12] R. Uma, "Design and Performance Analysis of 8 bit RISC Processor Using Xilinx Tool", International Journal of Engineering Research and Application, vol. 2, no. 2, pp , April
[13] Samiappa Sakthikumaran, S. Salivahanan, V. S. Kanchan Bhaaskaran, "16 bit RISC Processor Design For Convolution Application", IEEE International Conference on Recent Trends in Information Technology, pp , June
[14] Li Li and Ken Choi, "SeSCG: Selective Sequential Clock Gating for Ultra-low-power Multimedia Mobile Processor Design", IEEE EIT Conference, May
[15] Arora H, Gupta A, Singhai R, Purwar D, "Design Space Exploration of RISC Architecture Using Retargetability", IEEE, pp. 1-3, Jan
[16] Geun-young Jeong, Ju-sung Park, "Design of 32-BIT RISC Processor and Efficient Verification", Proceedings of the 7th Korea-Russia International Symposium, KORUS, 2003.
[17] Hai Li, Swarup Bhunia, Yiran Chen, Kaushik Roy, "DCG: Deterministic Clock-Gating for Low-Power Microprocessor Design", IEEE Trans. on VLSI Systems, vol. 12, no. 3, March
[18] Kumar J. V, Nagaraju B, Swapana C, Ramanjappa T, "Design & Development of FPGA Based Low Power Pipelined 64 Bit RISC Processor with Double Precision Floating Point Unit", IEEE, pp
