Intel instruction set architecture, 32-bit (IA-32)

1 Intel instruction set architecture, 32-bit (IA-32) Outline: ISA Background; ISA evolution; IA-32 Overview; Pentium 4 / Netburst µarchitecture; SSE2; Hyper Pipeline Overview; Branch Prediction; Execution Types; Rapid Execution Engine; Advanced Dynamic Execution; Memory Management; Segmentation; Paging; Virtual Memory; Address Modes / Instruction Format; Address Translation; Cache; Levels of Cache (L1 & L2) / Execution Trace Cache; Instruction Decoder; System Bus; Register Files; Enhanced Floating Point & Multi-Media Unit

2 Instruction Set Architecture (ISA) Serves as an interface between software and hardware. Provides a mechanism by which the software tells the hardware what should be done. High level language code (C, C++, Java, Fortran) -> compiler -> Assembly language code (architecture specific statements) -> assembler -> Machine language code (architecture specific bit patterns). Layers: software / instruction set / hardware

3 Evolution of Instruction Sets Single Accumulator (EDSAC 1950, Maurice Wilkes) -> Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953) -> Separation of Programming Model from Implementation: High-level Language Based (B ), Concept of a Family (IBM ) -> General Purpose Register Machines: Complex Instruction Sets (Vax, Intel ), Load/Store Architecture (CDC 6600, Cray ) -> CISC (Intel x86, Pentium), RISC (MIPS, Sparc, HP-PA, IBM RS6000, PowerPC )

4 IA-32 Overview Traced to 1969 Intel 4004. P4: 1st IA-32 processor based on the Intel Netburst microarchitecture. Netburst allows higher performance levels: performance at higher clock speeds. Compatible with existing applications and operating systems written to run on Intel IA-32 architecture processors

5 IA-32 Overview Rapid Execution Engine; Hyper Pipelined Technology; Advanced Dynamic Execution; Innovative Cache Subsystem; Streaming SIMD Extensions 2 (SSE2); 400 MHz System Bus

6 Hyper Pipelined What is hyper pipeline technology? Deeper pipeline Fewer gates per pipeline stage What are the benefits of hyper pipeline? Increased clock rate Increased performance

7 Hyper Pipelined
Typical P6 pipeline (10 stages): 1 Fetch, 2 Fetch, 3 Decode, 4 Decode, 5 Decode, 6 Rename, 7 ROB Rd, 8 Rdy/Sch, 9 Dispatch, 10 Exec
Typical Pentium 4 pipeline (20 stages): 1-2 TC Nxt IP, 3-4 TC Fetch, 5 Drive, 6 Alloc, 7-8 Rename, 9 Que, 10-12 Sch, 13-14 Disp, 15-16 RF, 17 Ex, 18 Flgs, 19 BrCk, 20 Drive

8 Hyper Pipelined
Pentium 4 pipeline: 1-2 TC Nxt IP, 3-4 TC Fetch, 5 Drive, 6 Alloc, 7-8 Rename, 9 Que, 10-12 Sch, 13-14 Disp, 15-16 RF, 17 Ex, 18 Flgs, 19 BrCk, 20 Drive
Block diagram: 3.2 GB/s System Interface; L2 Cache and Control; BTB & I-TLB; Decoder; BTB; Trace Cache; µcode ROM; Rename/Alloc; µop Queues; Schedulers; Integer RF; FP RF; Store AGU; Load AGU; ALU (x4); FP move; FP store; Fmul; Fadd; MMX; SSE; L1 D-Cache and D-TLB

9 Branch Prediction

10 Branch Prediction Centerpiece of dynamic execution. Delivers high performance in a pipelined µ-architecture. Allows continuous fetching and execution. Predicts the next instruction address. A branch is predictable within 4 or fewer iterations. Benefit: branch prediction decreases the number of instructions that would otherwise be flushed from the pipeline

11 Branch Prediction Examples. Predictable: if (a == 5) a = 7; else a = 5; Not predictable: L1: lpcnt++; if ((lpcnt % 5) == 0) printf("Loop count is divisible by 5\n");
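
The contrast between the two examples can be reproduced with a toy model. The sketch below is only an illustration under stated assumptions and is not the Pentium 4's actual predictor: it assumes 2 bits of branch history and one 2-bit saturating counter per history pattern. The names `simulate` and `HISTORY_BITS` are hypothetical.

```python
HISTORY_BITS = 2  # assumed history length for this toy model


def simulate(outcomes, history_bits=HISTORY_BITS):
    """Return the fraction of branches the toy predictor gets right."""
    counters = {}   # history pattern -> 2-bit saturating counter (0..3)
    history = 0     # most recent outcomes, newest in bit 0
    mask = (1 << history_bits) - 1
    correct = 0
    for taken in outcomes:
        counter = counters.get(history, 1)   # start at "weakly not taken"
        prediction = counter >= 2
        correct += (prediction == taken)
        # Train the counter toward the actual outcome.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
        counters[history] = counter
        history = ((history << 1) | int(taken)) & mask
    return correct / len(outcomes)


# Example 1: `a` toggles between 5 and 7, so the branch alternates.
alternating = [i % 2 == 0 for i in range(1000)]
# Example 2: the branch is taken only when lpcnt is divisible by 5.
every_fifth = [(i % 5) == 0 for i in range(1, 1001)]

print(simulate(alternating))
print(simulate(every_fifth))
```

In this model the alternating branch is mispredicted only while the counters warm up, whereas the every-fifth branch keeps being mispredicted each time it is taken, because the same short history precedes both outcomes.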

12 Out-of-Order Execution Logic; Retirement Logic; Branch History Update

13 Rapid Execution Engine Contains 2 ALUs running at twice the core processor frequency. Allows basic integer instructions to execute in ½ a clock cycle. Up to 126 instructions, 48 loads, and 24 stores can be in flight at the same time. Example: at what frequency does the Rapid Execution Engine on a 1.50 GHz P4 processor run?
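
The example question follows directly from the "twice core frequency" statement. A minimal sketch of the arithmetic, with `alu_frequency_hz` as a hypothetical helper name:

```python
def alu_frequency_hz(core_hz, multiplier=2):
    """ALU clock, given that the ALUs run at twice the core frequency."""
    return core_hz * multiplier


core_hz = 1.50e9
alu_hz = alu_frequency_hz(core_hz)
print(alu_hz / 1e9)   # ALU clock in GHz
print(1e9 / alu_hz)   # one ALU cycle in nanoseconds (half a core cycle)
```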

14 Advanced Dynamic Execution Out-of-Order Engine: reorders instructions; executes as input operands are ready; ALUs kept busy. Reports branch history information. Increases overall speed

15 Memory Management (PVAM)* The protection mechanism is divided into two parts: Segmentation - isolates individual processes so that multiple programs can run on the same processor without interfering with each other. Demand Paging - provides a mechanism for implementing a virtual memory that is much larger than the actual memory, seemingly infinite. * PVAM = protected virtual addressing mode

16 Underlying Concepts The following slides will illustrate the underlying principles of the x86 PVAM operation: 1) Memory Management 2) Operating modes 3) Paging 4) Segmentation 5) Address translation 6) Privilege levels 7) addressing modes

17 Multitasking memory management (Protected Virtual Addressing)
Relocation: In systems with virtual memory, programs in memory must be able to reside in different parts of the memory at different times. This is because when a program is swapped back into memory after being swapped out for a while, it cannot always be placed in the same location. Memory management in the operating system should therefore be able to relocate programs in memory and handle memory references in the code of the program so that they always point to the right location in memory.
Protection: Processes should not be able to reference the memory of another process without permission. This is called memory protection, and it prevents malicious or malfunctioning code in one program from interfering with the operation of other running programs.
Sharing: Even though the memory of different processes is protected from each other, different processes should be able to share information and therefore access the same part of memory.
Logical organization: Programs are often organized in modules. Some of these modules could be shared between different programs, some are read only, and some contain data that can be modified. Memory management is responsible for handling this logical organization, which is different from the physical linear address space. One way to arrange this organization is segmentation.
Physical organization: Memory is usually divided into fast primary storage and slow secondary storage. Memory management in the operating system handles moving information between these two levels of memory.

18 Modes of Operation Concentration on: Protected mode - native operating mode of the processor. All features available, providing highest performance and capability. Must use segmentation; paging optional. Other modes: Real-address mode - the 8086 processor programming environment. System management mode (SMM) - standard architectural feature in all later IA-32 processors; power management, OEM differentiation features. Virtual-8086 mode - used while in protected mode; allows the processor to execute 8086 software in a protected, multitasked environment.

19 Virtual Memory Main memory acts as a cache for secondary storage. Allows memory to be shared. Makes memory appear to be larger than it physically is. Each program has its own address space. Enforces protection. A virtual memory block is called a page; a miss is called a page fault. Virtual addresses are translated into physical addresses

20 Memory Management Address Translation Ex: Instruction Address Instruction Instruction Decoder Control Word Memory IA-32 (Virtual Address) Logical Address Segmentation & Paging Physical Address Control Word Memory

21 Converting a Logical to Linear Address The segment selector (16-bit) points to a segment descriptor, which contains the base address of a memory segment. The 32-bit offset from the logical address is added to the segment's base address, generating a 32-bit linear address. Diagram: logical address = Selector + Offset; GDTR/LDTR (contains base address of descriptor table) -> descriptor table -> Segment Descriptor; segment base + Offset -> linear address
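
The conversion described on this slide can be sketched in a few lines. This is a simplified model under stated assumptions: the descriptor table is a plain dictionary, the choice between global and local tables is omitted, and the limit check is reduced to a single comparison. The names `logical_to_linear` and `descriptor_table` are hypothetical.

```python
def logical_to_linear(selector, offset, descriptor_table):
    """Translate (selector, offset) into a 32-bit linear address."""
    index = selector >> 3                  # upper 13 bits index the table
    descriptor = descriptor_table[index]
    if offset > descriptor["limit"]:       # simplified limit check
        raise ValueError("offset exceeds segment limit")
    return (descriptor["base"] + offset) & 0xFFFFFFFF


# Hypothetical table: descriptor 1 describes a segment based at 0x00400000.
descriptor_table = {1: {"base": 0x00400000, "limit": 0xFFFF}}
linear = logical_to_linear(0x0008, 0x1234, descriptor_table)
print(hex(linear))
```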

22 Address Translation Logical address: Segment selector (Index | TI | RPL) + Offset -> Segment Table -> Linear Address (Dir | Page | Offset) -> Paging (Page Directory, Page Table) -> Physical Address -> Main Memory -> Control Word. Index: the number of the segment; serves as an index into the segment table. TI (one bit): table indicator; indicates whether the global or the local segment table is used for translation. RPL (two bits): requested privilege level, 0 = high privilege, 3 = low
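
The selector fields can be extracted with shifts and masks. The field widths for TI and RPL come from the slide; the exact bit positions used here (RPL in bits 1..0, TI in bit 2, Index in bits 15..3) are stated as an assumption, and `decode_selector` is a hypothetical helper name.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into (index, ti, rpl)."""
    rpl = selector & 0b11        # bits 1..0: requested privilege level
    ti = (selector >> 2) & 0b1   # bit 2: 0 = global table, 1 = local table
    index = selector >> 3        # bits 15..3: index into the segment table
    return index, ti, rpl


print(decode_selector(0x002B))
```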

25 Paging Subdivide memory into small fixed-size chunks called frames or page frames Divide programs into same sized chunks, called pages Loading a program in memory requires the allocation of the required number of pages Limits wasted memory to a fraction of the last page Page frames used in loading process need not be contiguous - Each program has a page table associated with it that maps each program page to a memory page frame

26 Paging Virtual Memory: Only program pages required for execution of the program are actually loaded (Demand Paging). Only a few pages of any one program might be in memory at a time. Possible to run a program consisting of more pages than can fit in memory. Diagram: Logical Address -> Segmentation -> Linear Address (Dir | Page | Offset) -> Paging (Page Directory, Page Table) -> Physical Address -> Main Memory -> Control Word
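
A toy two-level walk illustrates how the Dir, Page, and Offset fields are used. The sketch assumes 4 KB pages and a 10/10/12-bit split of the linear address, which the slide does not state, and models a missing page as a Python exception. All names (`split_linear`, `translate`, `page_directory`) are hypothetical.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages


def split_linear(linear):
    """Split a 32-bit linear address into (dir, page, offset)."""
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF


def translate(linear, page_directory):
    """Walk a toy two-level table; LookupError models a page fault."""
    d, p, offset = split_linear(linear)
    page_table = page_directory.get(d)
    if page_table is None or p not in page_table:
        raise LookupError("page fault")
    return page_table[p] * PAGE_SIZE + offset


# Hypothetical mapping: directory entry 1, page 2 -> physical frame 0x55.
page_directory = {1: {2: 0x55}}
print(hex(translate(0x00402ABC, page_directory)))
```

Only pages present in the table translate successfully, which mirrors the demand-paging idea that pages are loaded when first needed.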

27 Page Faults Main memory is 100,000 times faster than disk, so page faults are expensive. To reduce the page fault rate: fully associative placement of pages in memory. Each process has a page table that maps virtual addresses to physical addresses. The OS creates space on disk for all the process's pages (swap space). The OS maintains another table that keeps track of each page in main memory. During a page fault, the OS must decide which page to replace: least recently used (LRU). Write-back is used for writes
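
The LRU replacement decision can be illustrated by counting faults for a short reference string. This is a sketch of exact LRU bookkeeping, not a claim about how any particular OS approximates it; `count_page_faults` is a hypothetical name.

```python
from collections import OrderedDict


def count_page_faults(references, num_frames):
    """Count page faults for a reference string under LRU replacement."""
    frames = OrderedDict()   # resident pages, least recently used first
    faults = 0
    for page in references:
        if page in frames:
            frames.move_to_end(page)         # mark as most recently used
        else:
            faults += 1
            if len(frames) >= num_frames:
                frames.popitem(last=False)   # evict least recently used
            frames[page] = None
    return faults


print(count_page_faults([1, 2, 3, 1, 4, 2], num_frames=3))
```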

28 TLB (translation-lookaside buffer): a memory cache that stores recent translations of virtual to physical addresses for faster retrieval. When a virtual memory address is referenced by a program, the search starts in the CPU. First, instruction caches are checked. Page lookups must be performed in hardware. The page table is cached on-chip in the translation-lookaside buffer, which is either small and fully associative or large with limited associativity

29 Segmentation Programmer subdivides the program into logical units called segments - Programs subdivided by function - Data array items grouped together as a unit Paging - invisible to programmer, Segmentation - usually visible to programmer - Convenience for organizing programs and data, and a means for associating access and usage rights with instructions and data - Sharing, segment could be addressed by other processes, ex: table of data - Dynamic size, growing data structure

30 Processor Privilege-Levels The usefulness of protected mode derives from its ability to enforce restrictions upon software's freedom to take certain actions. Four distinct privilege-levels are supported. The organizing concept is concentric rings: the innermost ring has the greatest privileges, and privileges diminish as rings move outward

31 Four Privilege Rings Ring 3: least-trusted level; Ring 2; Ring 1; Ring 0: most-trusted level

32 Suggested ring purposes Ring0: operating system kernel Ring1: operating system services Ring2: custom extensions Ring3: ordinary user applications

33 Data Isolation To guard against unintentional sharing of privileged information, different stacks are provided at each distinct privilege-level. Accordingly, any transition from one ring to another must necessarily be accompanied by a mandatory stack-switch operation. The CPU provides for automatic switching of stacks and copying of parameter-values

34 Descriptors

35 Selectors

36 Call-Gate Descriptors
Layout (two 32-bit words, bits 31..0): upper word: offset[ ] | P | DPL | 0 | gate type | parameter count; lower word: code-selector | offset[ ]
Legend: P = present (1 = yes, 0 = no); DPL = Descriptor Privilege Level (0, 1, 2, 3); code-selector (specifies memory-segment containing procedure code); offset (specifies the procedure's entry-point within its code-segment); parameter count (specifies how many parameter-values will be copied); gate-type (0x4 means a 16-bit call-gate, 0xC means a 32-bit call-gate)
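
Packing and unpacking the descriptor fields makes the layout concrete. The bit positions below are an assumption based on the usual IA-32 call-gate layout (offset low half in bits 15..0, selector in 31..16, parameter count in 36..32, gate type in 43..40, DPL in 46..45, P in bit 47, offset high half in 63..48); the function names are hypothetical.

```python
def pack_call_gate(offset, selector, param_count, dpl,
                   present=True, gate_type=0xC):
    """Build a 64-bit call-gate descriptor as an integer."""
    low = (offset & 0xFFFF) | ((selector & 0xFFFF) << 16)
    high = ((param_count & 0x1F)
            | ((gate_type & 0xF) << 8)
            | ((dpl & 0x3) << 13)
            | (int(present) << 15)
            | (offset & 0xFFFF0000))
    return (high << 32) | low


def unpack_call_gate(descriptor):
    """Recover the fields from a 64-bit call-gate descriptor."""
    low, high = descriptor & 0xFFFFFFFF, descriptor >> 32
    return {
        "offset": (low & 0xFFFF) | (high & 0xFFFF0000),
        "selector": low >> 16,
        "param_count": high & 0x1F,
        "gate_type": (high >> 8) & 0xF,
        "dpl": (high >> 13) & 0x3,
        "present": bool((high >> 15) & 0x1),
    }


gate = pack_call_gate(offset=0x00123456, selector=0x0008,
                      param_count=2, dpl=3)
print(hex(gate))
print(unpack_call_gate(gate))
```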

37 An Interprivilege Call When a lesser privileged routine wants to invoke a more privileged routine, it does so by using a far call machine-instruction (also known as a long call in the GNU assembler's terminology). Instruction layout: opcode 0x9A | offset-field (ignored) | segment-field (callgate-selector). In as (GNU assembler) assembly language: lcall $callgate-selector, $0
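
The byte layout of this instruction can be sketched with `struct`. The sketch assumes a 32-bit operand size, so the ignored offset field occupies 4 bytes followed by the 2-byte selector; with a 16-bit operand size the offset field would be shorter. `encode_far_call` and the selector value 0x0028 are hypothetical.

```python
import struct


def encode_far_call(selector, offset=0):
    """Encode a direct far call: opcode, 4-byte offset, 2-byte selector."""
    return bytes([0x9A]) + struct.pack("<IH", offset, selector)


encoded = encode_far_call(0x0028)
print(encoded.hex())
```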

38 Sequence of CPU's actions - pushes the current SS:SP register-values onto a new stack-segment - copies the specified number of parameters from the old stack onto the new stack - pushes the updated CS:IP register-values onto the new stack - loads new values into registers CS:IP (from the callgate-descriptor) and into SS:SP

39 Diagram of the relationships old code-segment (CS:IP, call-instruction); new code-segment (called procedure); OLD STACK SEGMENT (params; SS:SP stack-pointer); NEW STACK SEGMENT (params); TASK STATE SEGMENT; Descriptor-Table (gate-descriptor, TSS-descriptor); TR; GDTR

40 Addressing Modes Offset: determine the technique for offset generation. Effective Address (Offset) = Base Register + Index Register x Scale (1, 2, 4, or 8) + Displacement (in instruction; 0, 8, or 32 bits). Segment register -> Descriptor Registers (Access Rights, Limit, Base Address). Linear Address = Segment Base Address + Effective Address. Paging (invisible to programmer) -> Main Memory

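41 Addressing Modes
Mode: Algorithm
Immediate: Operand = A
Register Operand: LA = R
Displacement: LA = (SR) + A
Base: LA = (SR) + (B)
Base with Displacement: LA = (SR) + (B) + A
Scaled Index with Displacement: LA = (SR) + (I) x S + A
Base with Index and Displacement: LA = (SR) + (B) + (I) + A
Base with Scaled Index and Displacement: LA = (SR) + (I) x S + (B) + A
Relative: LA = (PC) + A
Legend: LA = linear address; (X) = contents of X; SR = segment register; PC = program counter; A = contents of an address field in the instruction; R = register; B = base register; I = index register; S = scaling factor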

42 Ex: scaled index with displacement Effective Address (Offset) = Index Register x Scale (1, 2, 4, or 8) + Displacement (in instruction; 0, 8, or 32 bits). Segment register -> Descriptor Registers (Access Rights, Limit, Base Address). Linear Address = Segment Base Address + Effective Address
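
The scaled-index-with-displacement calculation can be written out directly. This is a minimal sketch with hypothetical names and values: it models element 3 of an array of 4-byte items starting at offset 0x1000 in a segment based at 0x00400000.

```python
def effective_address(base=0, index=0, scale=1, displacement=0):
    """Offset = base + index * scale + displacement, truncated to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF


def linear_address(segment_base, offset):
    """Linear address = segment base + effective address."""
    return (segment_base + offset) & 0xFFFFFFFF


ea = effective_address(index=3, scale=4, displacement=0x1000)
la = linear_address(0x00400000, ea)
print(hex(ea), hex(la))
```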

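43 Instruction Format
Instruction Prefixes (0 to 4 bytes in total): Instruction Prefix (0 or 1 byte); Segment Override (0 or 1 byte); Operand Size Override (0 or 1 byte); Address Size Override (0 or 1 byte)
Instruction: Instruction Prefixes (0 to 4 bytes); Opcode (1 or 2 bytes); Mod R/M (0 or 1 byte); SIB (0 or 1 byte); Displacement (0, 1, 2, or 4 bytes); Immediate (0, 1, 2, or 4 bytes)
Mod R/M fields: Mod | Reg/Opcode | R/M
SIB fields: Scale | Index | Base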
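
The Mod R/M and SIB bytes are each split into three fields. The sketch below assumes the usual 2/3/3-bit widths (two high bits, then two 3-bit fields), which the slide names but does not size; the function names and the sample byte values are hypothetical.

```python
def decode_modrm(byte):
    """Split a Mod R/M byte into (mod, reg_or_opcode, rm)."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def decode_sib(byte):
    """Split a SIB byte into (scale_factor, index, base)."""
    scale_factor = 1 << ((byte >> 6) & 0b11)   # 2-bit field encodes 1, 2, 4, 8
    return scale_factor, (byte >> 3) & 0b111, byte & 0b111


print(decode_modrm(0x44))
print(decode_sib(0x98))
```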

44 Cache Organization Physical Memory; System Bus (External); L2 Cache; Data Cache Unit (L1); Bus Interface Unit; Instruction TLBs; Data TLBs; Instruction Decoder; Trace Cache; Store Buffer

45 Enhanced FP & multi-media Unit Expands registers to 128-bit (from 64-bit). Adds one additional register for data movement. Improves performance on floating-point and multi-media applications

47 Important ISA concepts: ISA evolution; Netburst features; Virtual addressing; Protection; Real mode; Protected mode; Virtual 8086 mode; Virtual address translation (logical-linear-physical); Paging; Segmentation; Privilege levels; Descriptors; Selectors; Addressing modes

Next Generation Technology from Intel Intel Pentium 4 Processor

Next Generation Technology from Intel Intel Pentium 4 Processor Next Generation Technology from Intel Intel Pentium 4 Processor 1 The Intel Pentium 4 Processor Platform Intel s highest performance processor for desktop PCs Targeted at consumer enthusiasts and business

More information

EC 513 Computer Architecture

EC 513 Computer Architecture EC 513 Computer Architecture Complex Pipelining: Superscalar Prof. Michel A. Kinsy Summary Concepts Von Neumann architecture = stored-program computer architecture Self-Modifying Code Princeton architecture

More information

Pentium 4 Processor Block Diagram

Pentium 4 Processor Block Diagram FP FP Pentium 4 Processor Block Diagram FP move FP store FMul FAdd MMX SSE 3.2 GB/s 3.2 GB/s L D-Cache and D-TLB Store Load edulers Integer Integer & I-TLB ucode Netburst TM Micro-architecture Pipeline

More information

Pentium IV-XEON. Computer architectures M

Pentium IV-XEON. Computer architectures M Pentium IV-XEON Computer architectures M 1 Pentium IV block scheme 4 32 bytes parallel Four access ports to the EU 2 Pentium IV block scheme Address Generation Unit BTB Branch Target Buffer I-TLB Instruction

More information

IA-32 Architecture COE 205. Computer Organization and Assembly Language. Computer Engineering Department

IA-32 Architecture COE 205. Computer Organization and Assembly Language. Computer Engineering Department IA-32 Architecture COE 205 Computer Organization and Assembly Language Computer Engineering Department King Fahd University of Petroleum and Minerals Presentation Outline Basic Computer Organization Intel

More information

CS450/650 Notes Winter 2013 A Morton. Superscalar Pipelines

CS450/650 Notes Winter 2013 A Morton. Superscalar Pipelines CS450/650 Notes Winter 2013 A Morton Superscalar Pipelines 1 Scalar Pipeline Limitations (Shen + Lipasti 4.1) 1. Bounded Performance P = 1 T = IC CPI 1 cycletime = IPC frequency IC IPC = instructions per

More information

Operating System Support

Operating System Support Operating System Support Objectives and Functions Convenience Making the computer easier to use Efficiency Allowing better use of computer resources Layers and Views of a Computer System Operating System

More information

IA32 Intel 32-bit Architecture

IA32 Intel 32-bit Architecture 1 2 IA32 Intel 32-bit Architecture Intel 32-bit Architecture (IA32) 32-bit machine CISC: 32-bit internal and external data bus 32-bit external address bus 8086 general registers extended to 32 bit width

More information

William Stallings Computer Organization and Architecture. Chapter 11 CPU Structure and Function

William Stallings Computer Organization and Architecture. Chapter 11 CPU Structure and Function William Stallings Computer Organization and Architecture Chapter 11 CPU Structure and Function CPU Structure CPU must: Fetch instructions Interpret instructions Fetch data Process data Write data Registers

More information

Objectives and Functions Convenience. William Stallings Computer Organization and Architecture 7 th Edition. Efficiency

Objectives and Functions Convenience. William Stallings Computer Organization and Architecture 7 th Edition. Efficiency William Stallings Computer Organization and Architecture 7 th Edition Chapter 8 Operating System Support Objectives and Functions Convenience Making the computer easier to use Efficiency Allowing better

More information

EN164: Design of Computing Systems Topic 06.b: Superscalar Processor Design

EN164: Design of Computing Systems Topic 06.b: Superscalar Processor Design EN164: Design of Computing Systems Topic 06.b: Superscalar Processor Design Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown

More information

UNIT- 5. Chapter 12 Processor Structure and Function

UNIT- 5. Chapter 12 Processor Structure and Function UNIT- 5 Chapter 12 Processor Structure and Function CPU Structure CPU must: Fetch instructions Interpret instructions Fetch data Process data Write data CPU With Systems Bus CPU Internal Structure Registers

More information

Instruction Set Architecture (ISA)

Instruction Set Architecture (ISA) Instruction Set Architecture (ISA)... the attributes of a [computing] system as seen by the programmer, i.e. the conceptual structure and functional behavior, as distinct from the organization of the data

More information

CS 465 Final Review. Fall 2017 Prof. Daniel Menasce

CS 465 Final Review. Fall 2017 Prof. Daniel Menasce CS 465 Final Review Fall 2017 Prof. Daniel Menasce Ques@ons What are the types of hazards in a datapath and how each of them can be mi@gated? State and explain some of the methods used to deal with branch

More information

EN164: Design of Computing Systems Lecture 24: Processor / ILP 5

EN164: Design of Computing Systems Lecture 24: Processor / ILP 5 EN164: Design of Computing Systems Lecture 24: Processor / ILP 5 Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University

More information

INTEL Architectures GOPALAKRISHNAN IYER FALL 2009 ELEC : Computer Architecture and Design

INTEL Architectures GOPALAKRISHNAN IYER FALL 2009 ELEC : Computer Architecture and Design INTEL Architectures GOPALAKRISHNAN IYER FALL 2009 GBI0001@AUBURN.EDU ELEC 6200-001: Computer Architecture and Design Silicon Technology Moore s law Moore's Law describes a long-term trend in the history

More information

Alternate definition: Instruction Set Architecture (ISA) What is Computer Architecture? Computer Organization. Computer structure: Von Neumann model

Alternate definition: Instruction Set Architecture (ISA) What is Computer Architecture? Computer Organization. Computer structure: Von Neumann model What is Computer Architecture? Structure: static arrangement of the parts Organization: dynamic interaction of the parts and their control Implementation: design of specific building blocks Performance:

More information

Superscalar Processors

Superscalar Processors Superscalar Processors Increasing pipeline length eventually leads to diminishing returns longer pipelines take longer to re-fill data and control hazards lead to increased overheads, removing any a performance

More information

Superscalar Machines. Characteristics of superscalar processors

Superscalar Machines. Characteristics of superscalar processors Superscalar Machines Increasing pipeline length eventually leads to diminishing returns longer pipelines take longer to re-fill data and control hazards lead to increased overheads, removing any performance

More information

EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture

EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Instruction Set Principles The Role of Compilers MIPS 2 Main Content Computer

More information

Superscalar Processors Ch 14

Superscalar Processors Ch 14 Superscalar Processors Ch 14 Limitations, Hazards Instruction Issue Policy Register Renaming Branch Prediction PowerPC, Pentium 4 1 Superscalar Processing (5) Basic idea: more than one instruction completion

More information

XT Node Architecture

XT Node Architecture XT Node Architecture Let s Review: Dual Core v. Quad Core Core Dual Core 2.6Ghz clock frequency SSE SIMD FPU (2flops/cycle = 5.2GF peak) Cache Hierarchy L1 Dcache/Icache: 64k/core L2 D/I cache: 1M/core

More information

Superscalar Processing (5) Superscalar Processors Ch 14. New dependency for superscalar case? (8) Output Dependency?

Superscalar Processing (5) Superscalar Processors Ch 14. New dependency for superscalar case? (8) Output Dependency? Superscalar Processors Ch 14 Limitations, Hazards Instruction Issue Policy Register Renaming Branch Prediction PowerPC, Pentium 4 1 Superscalar Processing (5) Basic idea: more than one instruction completion

More information

Opera&ng Systems ECE344

Opera&ng Systems ECE344 Opera&ng Systems ECE344 Lecture 8: Paging Ding Yuan Lecture Overview Today we ll cover more paging mechanisms: Op&miza&ons Managing page tables (space) Efficient transla&ons (TLBs) (&me) Demand paged virtual

More information

William Stallings Computer Organization and Architecture

William Stallings Computer Organization and Architecture William Stallings Computer Organization and Architecture Chapter 11 CPU Structure and Function Rev. 3.2.1 (2005-06) by Enrico Nardelli 11-1 CPU Functions CPU must: Fetch instructions Decode instructions

More information

Computer Organization and Architecture. OS Objectives and Functions Convenience Making the computer easier to use

Computer Organization and Architecture. OS Objectives and Functions Convenience Making the computer easier to use Computer Organization and Architecture Chapter 8 Operating System Support 1. Processes and Scheduling 2. Memory Management OS Objectives and Functions Convenience Making the computer easier to use Efficiency

More information

CPU Structure and Function

CPU Structure and Function Computer Architecture Computer Architecture Prof. Dr. Nizamettin AYDIN naydin@yildiz.edu.tr nizamettinaydin@gmail.com http://www.yildiz.edu.tr/~naydin CPU Structure and Function 1 2 CPU Structure Registers

More information

CS399 New Beginnings. Jonathan Walpole

CS399 New Beginnings. Jonathan Walpole CS399 New Beginnings Jonathan Walpole OS-Related Hardware & Software The Process Concept 2 Lecture 2 Overview OS-Related Hardware & Software - complications in real systems - brief introduction to memory

More information

Chapter 12. CPU Structure and Function. Yonsei University

Chapter 12. CPU Structure and Function. Yonsei University Chapter 12 CPU Structure and Function Contents Processor organization Register organization Instruction cycle Instruction pipelining The Pentium processor The PowerPC processor 12-2 CPU Structures Processor

More information

Basic Computer Architecture

Basic Computer Architecture Basic Computer Architecture CSCE 496/896: Embedded Systems Witawas Srisa-an Review of Computer Architecture Credit: Most of the slides are made by Prof. Wayne Wolf who is the author of the textbook. I

More information

Embedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institute of Technology, Delhi

Embedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institute of Technology, Delhi Embedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institute of Technology, Delhi Lecture - 13 Virtual memory and memory management unit In the last class, we had discussed

More information

CPU Structure and Function. Chapter 12, William Stallings Computer Organization and Architecture 7 th Edition

CPU Structure and Function. Chapter 12, William Stallings Computer Organization and Architecture 7 th Edition CPU Structure and Function Chapter 12, William Stallings Computer Organization and Architecture 7 th Edition CPU must: CPU Function Fetch instructions Interpret/decode instructions Fetch data Process data

More information

Chapter 13 Reduced Instruction Set Computers

Chapter 13 Reduced Instruction Set Computers Chapter 13 Reduced Instruction Set Computers Contents Instruction execution characteristics Use of a large register file Compiler-based register optimization Reduced instruction set architecture RISC pipelining

More information

CS 333 Introduction to Operating Systems Class 2 OS-Related Hardware & Software The Process Concept

CS 333 Introduction to Operating Systems Class 2 OS-Related Hardware & Software The Process Concept CS 333 Introduction to Operating Systems Class 2 OS-Related Hardware & Software The Process Concept Jonathan Walpole Computer Science Portland State University 1 Lecture 2 overview OS-Related Hardware

More information

Real instruction set architectures. Part 2: a representative sample

Real instruction set architectures. Part 2: a representative sample Real instruction set architectures Part 2: a representative sample Some historical architectures VAX: Digital s line of midsize computers, dominant in academia in the 70s and 80s Characteristics: Variable-length

More information

RISC & Superscalar. COMP 212 Computer Organization & Architecture. COMP 212 Fall Lecture 12. Instruction Pipeline no hazard.

RISC & Superscalar. COMP 212 Computer Organization & Architecture. COMP 212 Fall Lecture 12. Instruction Pipeline no hazard. COMP 212 Computer Organization & Architecture Pipeline Re-Cap Pipeline is ILP -Instruction Level Parallelism COMP 212 Fall 2008 Lecture 12 RISC & Superscalar Divide instruction cycles into stages, overlapped

More information

Lecture 8: Memory Management

Lecture 8: Memory Management Lecture 8: Memory Management CSE 120: Principles of Opera>ng Systems UC San Diego: Summer Session I, 2009 Frank Uyeda Announcements PeerWise ques>ons due tomorrow. Project 2 is due on Friday. Milestone

More information

Understand the factors involved in instruction set

Understand the factors involved in instruction set A Closer Look at Instruction Set Architectures Objectives Understand the factors involved in instruction set architecture design. Look at different instruction formats, operand types, and memory access

More information

Chapter 5. A Closer Look at Instruction Set Architectures

Chapter 5. A Closer Look at Instruction Set Architectures Chapter 5 A Closer Look at Instruction Set Architectures Chapter 5 Objectives Understand the factors involved in instruction set architecture design. Gain familiarity with memory addressing modes. Understand

More information

Assembly Language for Intel-Based Computers, 4 th Edition. Chapter 2: IA-32 Processor Architecture. Chapter Overview.

Assembly Language for Intel-Based Computers, 4 th Edition. Chapter 2: IA-32 Processor Architecture. Chapter Overview. Assembly Language for Intel-Based Computers, 4 th Edition Kip R. Irvine Chapter 2: IA-32 Processor Architecture Slides prepared by Kip R. Irvine Revision date: 09/25/2002 Chapter corrections (Web) Printing

More information

Computer Systems Laboratory Sungkyunkwan University

Computer Systems Laboratory Sungkyunkwan University ARM & IA-32 Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu ARM (1) ARM & MIPS similarities ARM: the most popular embedded core Similar basic set

More information

COMPUTER ORGANIZATION & ARCHITECTURE

COMPUTER ORGANIZATION & ARCHITECTURE COMPUTER ORGANIZATION & ARCHITECTURE Instructions Sets Architecture Lesson 5a 1 What are Instruction Sets The complete collection of instructions that are understood by a CPU Can be considered as a functional

More information

Move back and forth between memory and disk. Memory Hierarchy. Two Classes. Don t

Move back and forth between memory and disk. Memory Hierarchy. Two Classes. Don t Memory Management Ch. 3 Memory Hierarchy Cache RAM Disk Compromise between speed and cost. Hardware manages the cache. OS has to manage disk. Memory Manager Memory Hierarchy Cache CPU Main Swap Area Memory

More information

Memory Management Ch. 3

Memory Management Ch. 3 Memory Management Ch. 3 Ë ¾¾ Ì Ï ÒÒØ Å ÔÔ ÓÐÐ 1 Memory Hierarchy Cache RAM Disk Compromise between speed and cost. Hardware manages the cache. OS has to manage disk. Memory Manager Ë ¾¾ Ì Ï ÒÒØ Å ÔÔ ÓÐÐ

More information

ECE4680 Computer Organization and Architecture. Virtual Memory
