
Microprocessors

Von Neumann architecture
The first computers used a single fixed program (like a numeric calculator). To change the program, one had to re-wire, re-structure, or even re-design the computer. The people who did this were not called computer programmers, as they are today, but computer architects. A von Neumann computer instead uses a single memory to hold both instructions and data.

The program is written in an appropriate language and is not hardwired into the computer itself; the computer is re-programmable. In a von Neumann computer, programs can be seen as data; as a consequence, a malfunctioning program can crash the computer.

In a von Neumann processor an instruction is read from memory and decoded, any memory operands it refers to are fetched, the operation is performed, and the results are written back to memory. The term von Neumann architecture dates from June 1945 and is named after the mathematician John von Neumann, although the architecture was not designed by von Neumann alone.
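As a rough illustration of this cycle (a minimal sketch, not taken from the original text; the instruction format and the tiny program are invented for the example), a von Neumann machine can be mimicked in Python with a single array that holds both code and data:

    # Minimal sketch of the von Neumann fetch-decode-execute cycle.
    # One memory holds both the program and its data.
    memory = [
        ("LOAD", 6),     # 0: acc <- memory[6]
        ("ADD",  7),     # 1: acc <- acc + memory[7]
        ("STORE", 8),    # 2: memory[8] <- acc
        ("HALT", None),  # 3: stop
        None, None,      # 4-5: unused
        5, 7, 0,         # 6-8: data lives in the same memory as the code
    ]

    pc, acc = 0, 0                     # program counter and accumulator
    while True:
        opcode, operand = memory[pc]   # fetch and decode
        pc += 1
        if opcode == "LOAD":           # operand fetch, execute, write-back
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "HALT":
            break
    print(memory[8])                   # 12

Because the data at addresses 6-8 sits in the same array as the instructions, a faulty program could just as easily overwrite its own code, which is one way a malfunctioning program can crash the machine.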

The Von Neumann bottleneck
The separation between the CPU and memory leads to what is known as the von Neumann bottleneck. The throughput (data transfer rate) between the CPU and memory is very small in comparison with the amount of memory available and the rate at which the CPU can work. As a result, the CPU is continuously forced to wait for data to be transferred to or from memory. Since CPU speed and memory size have increased much faster than the throughput between the two, the bottleneck has become more severe. A cache memory placed between the CPU and main memory helps to alleviate the problem.
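How a cache helps can be sketched in a few lines (the sizes, the direct mapping and the access pattern below are assumptions chosen only for illustration): a small, fast memory near the CPU answers most requests, so the slow path to main memory is used only on misses.

    # Sketch of a direct-mapped cache between the CPU and a slow main memory.
    MAIN_MEMORY = list(range(1024))     # stand-in for main memory, 1 word per address

    CACHE_LINES = 16
    cache_tags = [None] * CACHE_LINES   # which address each cache line currently holds
    cache_data = [0] * CACHE_LINES
    hits = misses = 0

    def read(addr):
        """Return the word at addr, going to main memory only on a miss."""
        global hits, misses
        line = addr % CACHE_LINES       # direct mapping: address -> cache line
        if cache_tags[line] == addr:    # hit: no slow memory access needed
            hits += 1
        else:                           # miss: fetch from main memory (slow)
            misses += 1
            cache_tags[line] = addr
            cache_data[line] = MAIN_MEMORY[addr]
        return cache_data[line]

    # Repeatedly reading a small working set mostly hits in the cache,
    # so the CPU waits on the bottleneck far less often.
    for _ in range(100):
        for a in range(8):
            read(a)
    print(hits, misses)                 # 792 8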

The Harvard architecture
In the Harvard architecture there are separate storage and signal pathways for instructions and data. In this architecture the word width, timing, implementation technology, and memory address structure can differ for program and data. Instruction memory is often wider than data memory. In some systems instructions can be stored in read-only memory, while data memory generally requires random-access memory. Typically there is much more instruction memory than data memory, so instruction addresses are much wider than data addresses.

In a von Neumann architecture the CPU can be either reading an instruction or reading/writing data from/to memory, but not both at the same time, since the instructions and data use the same signal pathways and memory. A computer following the Harvard architecture can be faster because it is able to fetch the next instruction at the same time as it completes the current instruction (an early form of pipelining). Speed is gained at the expense of more complex electrical circuitry. Modern high-performance CPU designs incorporate aspects of both the Harvard and von Neumann architectures: on-chip cache memory is divided into an instruction cache and a data cache.
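A minimal sketch of the Harvard layout (the memory sizes, word widths and helper names below are assumptions, loosely modelled on a small microcontroller rather than taken from the text):

    # Sketch of a Harvard-style memory map: program and data sit in separate
    # memories that may differ in width and size, reached over separate buses.
    instruction_memory = bytearray(16 * 1024)   # e.g. 16 KB flash, 16-bit instruction words
    data_memory = bytearray(1 * 1024)           # e.g. 1 KB RAM, 8-bit data words

    def fetch_instruction(pc):
        """Read a 16-bit instruction word over the program bus."""
        return int.from_bytes(instruction_memory[2 * pc: 2 * pc + 2], "little")

    def read_data(addr):
        """Read an 8-bit data byte over the separate data bus."""
        return data_memory[addr]

    # Because the two memories have their own pathways, fetch_instruction()
    # and read_data() can be serviced in the same clock cycle; in a von
    # Neumann machine both requests would compete for the single memory bus.
    print(fetch_instruction(0), read_data(0))   # 0 0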

Complex instruction set computer CISC
A complex instruction set computer (CISC) is a microprocessor instruction set architecture in which each instruction can execute several low-level operations, such as a load from memory, an arithmetic operation, and a memory store. The terms register-memory and memory-memory also apply to the same concept. In the early days of computers, compilers did not exist; programming was done in either machine code or assembly language. To make programming easier, computer architects created more and more complex instructions, which were direct representations of high-level functions of high-level programming languages. The attitude at the time was that hardware design was easier than compiler design, so the complexity went into the hardware.

Another force that encouraged complexity was the lack of large memories. Since every byte of memory was precious (an entire system might have only a few kilobytes of storage), the industry moved to features such as highly encoded instructions, variable-sized instructions, instructions that performed multiple operations, and instructions that did both data movement and data calculation. For these reasons, CPU designers tried to make instructions that would do as much work as possible. This led to single instructions that would do all of the work: load the two numbers to be added, add them, and then store the result back directly to memory. The compact nature of CISC results in smaller program sizes and fewer calls to main memory.

While many designs achieved the aim of higher throughput at lower cost and also allowed high-level language constructs to be expressed by fewer instructions, it was observed that programs did not actually take advantage of this. This is the point of departure from CISC to RISC. Examples of CISC processors are the Intel x86 CPUs and the Intel 8051. The terms RISC and CISC have become less meaningful with the continued evolution of both CISC and RISC designs and implementations.

Reduced instruction set computer RISC
The reduced instruction set computer, or RISC, is a CPU design philosophy that favors a reduced and simpler instruction set. The term load-store applies to the same concept. The idea was originally inspired by the discovery that many of the features included in traditional CPU designs (i.e., CISC) to facilitate coding were being ignored by programs and programmers. In the late 1970s researchers demonstrated that the majority of the many addressing modes present in CISC microprocessors were ignored by most programs. This was a side effect of the increasing use of compilers to generate programs, as opposed to writing them in assembly language; in other words, compilers were not able to exploit the features of a CISC assembly language.

At about the same time CPUs started to run faster than the memory they talked to. It became apparent that more registers (and later caches) would be needed to support these higher operating frequencies. These additional registers and cache memories would require sizeable chip or board area, which could be made available if the complexity of the CPU was reduced. Since real-world programs spend most of their time executing very simple operations, some researchers decided to focus on making those common operations as simple and as fast as possible. The goal of RISC was to make instructions so simple that each one could be executed in a single clock cycle.

However, RISC also has its drawbacks. Since a series of instructions is needed to complete even simple tasks, the total number of instructions read from memory is larger, so fetching a program takes longer (see the von Neumann bottleneck). In the early 1980s it was thought that existing designs were reaching their theoretical limits. Future improvements in speed would come primarily from improved semiconductor processes, that is, smaller features (transistors and wires) on the chip. The complexity of the chip would remain largely the same, but the smaller size would allow it to run at higher clock rates (Moore's law). The RISC-style CDC 6600 supercomputer, designed in 1964 by Jim Thornton and Seymour Cray, has 74 op-codes, while Intel's 8086 has about 400.
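To put the trade-off in rough numbers (all figures below are assumptions chosen for illustration, not measurements of any real processor), compare the cycle counts for the same task:

    # Back-of-the-envelope comparison of the RISC trade-off:
    # more instructions per task, but fewer cycles per instruction.
    cisc_instructions = 1_000_000
    cisc_cpi = 4.0                   # assumed average cycles per CISC instruction

    risc_instructions = 1_600_000    # assumed: ~60% more instructions for the same task
    risc_cpi = 1.0                   # the RISC goal: one instruction per clock cycle

    print("CISC cycles:", cisc_instructions * cisc_cpi)   # 4000000.0
    print("RISC cycles:", risc_instructions * risc_cpi)   # 1600000.0
    # RISC wins on cycles here, but it also reads more instructions from
    # memory, which is exactly where the von Neumann bottleneck bites.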

RISC designs have led to a number of successful platforms and architectures, some of the larger ones being the PlayStation, PlayStation 2, PlayStation Portable and PlayStation 3, the Nintendo 64, Nintendo's GameCube and Wii, Microsoft's Xbox 360, and Palm PDAs.

Pipeline
An instruction is made up of micro-instructions. In a pipelined processor, the processor works on one micro-instruction of several different instructions at the same time. For example, the classic RISC pipeline is broken into five stages:
1. Instruction fetch
2. Instruction decode and register fetch
3. Execute
4. Memory access
5. Register write back

The key to pipelining is the observation that the processor can start reading the next instruction as soon as it finishes reading the last, meaning that it works on two instructions simultaneously: one is being read while the previous one is being decoded (a two-stage pipeline). While no single instruction is completed any faster, the next instruction completes right after the previous one. The result is a much more efficient utilization of processor resources. Pipelining reduces the cycle time of a processor and hence increases instruction throughput, the number of instructions that can be executed per unit of time.
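The overlap can be visualised with a small sketch (illustrative only; it ignores hazards, stalls and branches). Each row is an instruction, each column a clock cycle, and in the steady state one instruction completes every cycle:

    # Print a timing diagram for the classic five-stage RISC pipeline.
    STAGES = ["IF", "ID", "EX", "MEM", "WB"]

    def pipeline_diagram(n_instructions):
        for i in range(n_instructions):
            row = ["   "] * i + [f"{s:<3}" for s in STAGES]
            print(f"i{i}: " + " ".join(row))

    pipeline_diagram(5)
    # i0: IF  ID  EX  MEM WB
    # i1:     IF  ID  EX  MEM WB
    # i2:         IF  ID  EX  MEM WB
    # ...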

A typical CISC instruction to add two numbers might be ADD A, B, C, which adds the values found in memory locations A and B and then puts the result in memory location C. In a pipelined processor the pipeline controller would break this into a series of instructions similar to:
LOAD A, R1
LOAD B, R2
ADD R1, R2, R3
STORE R3, C
LOAD next instruction
The R locations are registers, temporary memory inside the CPU that is quick to access. The end result is the same: the numbers are added and the result placed in C, and the time taken to drive the addition to completion is no different from the non-pipelined case (it may even be greater than for the CISC case).

The key to understanding the advantage of pipelining is to consider what happens when this sequence is half-way done, at the ADD instruction for instance. At this point the circuitry responsible for loading data from memory is no longer being used and would normally sit idle. In this case the pipeline controller fetches the next instruction from memory and starts loading the data it needs into registers. That way, when the ADD instruction is complete, the data needed for the next ADD is already loaded and ready to go. The overall effective speed of the machine can be greatly increased because no part of the CPU sits idle. Every microprocessor manufactured today uses at least two pipeline stages (the Atmel AVR and the PIC microcontrollers each have a two-stage pipeline). Advantages of pipelining: the cycle time of the processor is reduced, thus increasing instruction bandwidth in most cases.

Advantages of not pipelining: the processor executes only a single instruction at a time. This prevents branch delays (in effect, every branch is delayed) and problems with serial instructions being executed concurrently. Consequently the design is simpler and cheaper to manufacture. The instruction latency in a non-pipelined processor is also slightly lower than in a pipelined equivalent, because extra flip-flops must be added to the data path of a pipelined processor. Finally, a non-pipelined processor has a stable instruction bandwidth, whereas the performance of a pipelined processor is much harder to predict and may vary more widely between different programs.

Many designs include pipelines as long as 7, 10 or even 31 stages (as in the Intel Pentium 4). The downside of a long pipeline is that when a program branches, the entire pipeline must be flushed, a problem that branch prediction helps to alleviate. The higher throughput of pipelines falls short when the executed code contains many branches: the processor cannot know where to read the next instruction from, and must wait for the branch instruction to finish, leaving the pipeline behind it empty. After the branch is resolved, the next instruction has to travel all the way through the pipeline before its result becomes available and the processor appears to be working again. In the extreme case, the performance of a pipelined processor could theoretically approach that of an unpipelined processor, or even be slightly worse, when all but one pipeline stage are idle and a small overhead is present between stages.
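A rough back-of-the-envelope estimate (all figures are assumptions, not measurements of any particular processor) shows how quickly branch flushes eat into the ideal throughput of one instruction per cycle:

    # Estimate the effective cycles-per-instruction (CPI) once branch
    # mispredictions and pipeline flushes are taken into account.
    pipeline_depth = 20                  # assumed: a long pipeline, Pentium 4 era
    branch_fraction = 0.20               # assumed: 1 in 5 instructions is a branch
    mispredict_rate = 0.10               # assumed: 10% of branches predicted wrongly
    flush_penalty = pipeline_depth - 1   # cycles lost refilling the pipeline

    effective_cpi = 1 + branch_fraction * mispredict_rate * flush_penalty
    print(effective_cpi)                 # about 1.38 instead of the ideal 1.0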

Bibliography
http://www.wikipedia.com