Comparison Instructions


Status Flags
Now it is time to discuss what status flags are available. These five status flags are kept in a special register called the Program Status Register (PSR). The PSR also contains other important bits that control the processor.
  N - set if the result of an operation is negative (its Most Significant Bit (MSB) is 1)
  Z - set if the result of an operation is 0
  C - set if the result of an operation has a carry out of its MSB
  V - set if a sum of two positive operands gives a negative result, or if the sum of two negative operands gives a positive result
  Q - a sticky version of overflow created by instructions that generate multiple results (more on this later on)
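To make the flag definitions concrete, here is a minimal Python sketch of how N, Z, C and V could be derived from a 32-bit addition. The function name add_with_flags is hypothetical, and the Q flag is not modeled.

```python
MASK32 = 0xFFFFFFFF

def add_with_flags(a, b):
    """Add two 32-bit values; return (result, flags) in the style of ADDS."""
    a &= MASK32
    b &= MASK32
    full = a + b                       # unbounded Python sum
    result = full & MASK32             # keep only the low 32 bits
    n = (result >> 31) & 1             # N: MSB of the result
    z = 1 if result == 0 else 0        # Z: result is zero
    c = 1 if full > MASK32 else 0      # C: carry out of the MSB
    # V: operands have the same sign, but the result's sign differs
    v = ((~(a ^ b)) & (a ^ result)) >> 31 & 1
    return result, {"N": n, "Z": z, "C": c, "V": v}
```

For example, add_with_flags(0x7FFFFFFF, 1) sets N and V (two positive operands giving a negative result), while add_with_flags(0xFFFFFFFF, 1) sets Z and C.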

Comparison Instructions
These instructions modify the status flags, but leave the contents of the registers unchanged. They are used to test register contents, and they must have their S bit set to 1. They also don't modify their Rd, and by convention, Rd is set to 0000.
  CMP R0,R1       PSR flags set for the result R0 - R1
  CMN R2,R3       PSR flags set for the result R2 + R3
  TST R4,#8       PSR flags set for the result R4 & 8
  TEQ R5,#1024    PSR flags set for the result R5 ^ 1024
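The sketch below models what CMP, TST and TEQ leave in the flags without writing any register. It assumes ARM's convention that a subtraction is computed as a + NOT(b) + 1, so C = 1 means "no borrow". The function names are hypothetical, and only N and Z are modeled for TST and TEQ.

```python
MASK32 = 0xFFFFFFFF

def nz(result):
    """N and Z flags for a 32-bit result."""
    return {"N": (result >> 31) & 1, "Z": int(result == 0)}

def cmp_flags(a, b):
    """Flags for a - b, computed as a + ~b + 1 (the result itself is discarded)."""
    a &= MASK32
    b &= MASK32
    full = a + ((~b) & MASK32) + 1
    result = full & MASK32
    flags = nz(result)
    flags["C"] = int(full > MASK32)                    # 1 means no borrow
    flags["V"] = ((a ^ b) & (a ^ result)) >> 31 & 1    # signed overflow
    return flags

def tst_flags(a, b):
    """Flags for a & b."""
    return nz(a & b & MASK32)

def teq_flags(a, b):
    """Flags for a ^ b."""
    return nz((a ^ b) & MASK32)
```

With these definitions, cmp_flags(5, 5) sets Z, and teq_flags(x, x) sets Z for any x, which is why TEQ is a convenient equality test.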

Register Transfer
These instructions are used to transfer the contents of one register to another, or simply to initialize the contents of a register. They make use of only one operand, and, by convention, have their Rn field set to 0000.
  MOV R0,R3       R0 ← R3
  MOV R1,#4096    R1 ← 4096
  MVN R2,R4       R2 ← ~R4
  MVNS R3,#1      R3 ← ~1, and set PSR (Z = 0, N = 1, V = 0, C = 0)

ARM Shift Operations
A novel feature of ARM is that all data-processing instructions can include an optional shift, whereas most other architectures have separate shift instructions. This is actually very useful, as we will see later on. The key to shifting is the 8-bit field between Rd and Rm.
  R type:
  bits:    4     3    4       1  4   4   5             2           1  4
  fields:  1110  000  Opcode  S  Rn  Rd  Shift Amount  Shift Type  0  Rm
Shift Amount: 0-31 bits
Shift Type:
  00 - logical left
  01 - logical right
  10 - arithmetic right
  11 - rotate right

Left Shifts
Left shifts effectively multiply the contents of a register by 2^s, where s is the shift amount.
  MOV R0,R0,LSL 7
  R0 before: 0000 0000 0000 0000 0000 0000 0000 0111 = 7
  R0 after:  0000 0000 0000 0000 0000 0011 1000 0000 = 7 * 2^7 = 896
Shifts can also be applied to the second operand of any data-processing instruction:
  ADD R1,R1,R0,LSL 7

Right Shifts
Right shifts behave like dividing the contents of a register by 2^s, where s is the shift amount, if you assume the contents of the register are unsigned.
  MOV R0,R0,LSR 2
  R0 before: 0000 0000 0000 0000 0000 0100 0000 0000 = 1024
  R0 after:  0000 0000 0000 0000 0000 0001 0000 0000 = 1024 / 2^2 = 256

Arithmetic Right Shifts
Arithmetic right shifts behave like dividing the contents of a register by 2^s, where s is the shift amount, if you assume the contents of the register are signed.
  MOV R0,R0,ASR 2
  R0 before: 1111 1111 1111 1111 1111 1100 0000 0000 = -1024
  R0 after:  1111 1111 1111 1111 1111 1111 0000 0000 = -1024 / 2^2 = -256
This is Java's >> operator; LSR is >>> and LSL is <<.

Rotate Right Shifts
Rotating shifts have no arithmetic analogy. However, unlike logical and arithmetic shifts, they don't lose bits: the bits shifted out of the low end re-enter at the high end. We saw a rotate right shift used for the I-type immediate value earlier.
  MOV R0,R0,ROR 2
  R0 before: 0000 0000 0000 0000 0000 0000 0000 0111 = 7
  R0 after:  1100 0000 0000 0000 0000 0000 0000 0001 = -1,073,741,823
Why no rotate left shift? Java doesn't have an operator for this one. Ran out of encodings? Almost anything a rotate left can do, ROR can do as well: rotating left by n bits is the same as rotating right by 32 - n bits.
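The four shift types can be checked against the slide examples with a small Python model. This is only a sketch on 32-bit values; the function names are hypothetical, and to_signed is a helper added for printing two's-complement results.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def lsl(x, s):
    """Logical shift left: zeros enter from the right."""
    return (x << s) & MASK32

def lsr(x, s):
    """Logical shift right: zeros enter from the left."""
    return (x & MASK32) >> s

def asr(x, s):
    """Arithmetic shift right: copies of the sign bit enter from the left."""
    return (to_signed(x) >> s) & MASK32

def ror(x, s):
    """Rotate right: bits leaving the low end re-enter at the high end."""
    x &= MASK32
    s %= 32
    return ((x >> s) | (x << (32 - s))) & MASK32
```

For instance, lsl(7, 7) gives 896 and to_signed(ror(7, 2)) gives -1073741823, matching the register contents shown above. A rotate left by 3 can be written as ror(x, 29).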

Addressing Modes and Branches
  More on Immediates
  Reading and Writing Memory
  Registers holding addresses
  Pointers
  Changing the PC
  Loops
  Labels
  Calling Functions

Why Built-in Constant Operands? (Immediates)
  I type:
  bits:    4     3    4       1  4   4   4       8
  fields:  1110  001  Opcode  S  Rn  Rd  Rotate  Imm8
Alternatives? Why not? Do we have a choice?
  Put constants in memory (was common in older instruction sets).
SMALL constants are used frequently (50% of operands):
  In a C compiler (gcc), 52% of ALU operations involve a constant.
  In a circuit simulator (spice), 69% involve constants.
  e.g., B = B + 1; C = W & 0xff; A = B - 1;
ISA Design Principle:
  Make the common case easy.
  Make the common case fast.
How large should we allow constants to be? If they are too big, we won't have enough bits left over for the instructions or operands.

Rotations to Make Constants
Recall that immediate constants are encoded in two parts: a 4-bit Rotate field and an 8-bit imm field. The 8-bit value is rotated right by twice the Rotate field. Thus, only a subset of 4096 32-bit numbers can be used directly as an operand. There are actually only 3073 distinct constants. There are 16 ways to represent 0 and 4 ways to represent every power of 2.
From last time, how might you encode 256?
  Rotate  imm
  1100    00000001
  1101    00000100
  1110    00010000
  1111    01000000
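The counts above can be reproduced by brute force. The following Python sketch assumes the decoding rule implied by the 256 example (the 8-bit value rotated right by twice the 4-bit Rotate field); the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def ror32(x, s):
    """Rotate a 32-bit value right by s bits."""
    s %= 32
    return ((x >> s) | (x << (32 - s))) & MASK32

def decode_immediate(rotate, imm8):
    """Value of an I-type immediate: imm8 rotated right by 2 * rotate."""
    return ror32(imm8, 2 * rotate)

def encodings_of(value):
    """All (rotate, imm8) pairs that decode to the given 32-bit value."""
    return [(r, i) for r in range(16) for i in range(256)
            if decode_immediate(r, i) == value]

# Every value reachable from the 16 * 256 = 4096 possible encodings.
distinct_constants = {decode_immediate(r, i)
                      for r in range(16) for i in range(256)}
```

Here encodings_of(256) returns the four (rotate, imm8) pairs listed above, and len(distinct_constants) confirms how many different constants are available.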

Moves and ORs
We can load any 32-bit constant using a series of instructions, one byte at a time.
  MOV R0,#85               ; 0x55 in hex
  ORR R0,R0,#21760         ; 0x5500 in hex
  ORR R0,R0,#5570560       ; 0x550000 in hex
  ORR R0,R0,#1426063360    ; 0x55000000 in hex
But there are often better, faster ways to load constants, and the assembler can figure out how for you, even if it needs to generate multiple instructions.
  MOV R0,=1431655765       ; 0x55555555 in hex
Note that an equal sign is used here rather than a hashtag.
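As a small illustration of the byte-at-a-time approach, this Python sketch splits a 32-bit constant into the four operands that a MOV followed by three ORRs would use. The function name byte_pieces is hypothetical.

```python
def byte_pieces(value):
    """Split a 32-bit constant into its four byte-aligned pieces, low byte first."""
    value &= 0xFFFFFFFF
    return [value & (0xFF << shift) for shift in (0, 8, 16, 24)]

def rebuild(pieces):
    """OR the pieces back together, as MOV followed by a chain of ORRs would."""
    r0 = 0
    for piece in pieces:
        r0 |= piece
    return r0
```

Calling byte_pieces(0x55555555) yields the four immediates used in the listing above, and rebuild recovers the original constant.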

Load and Store Instructions
ARM is a Load/Store architecture. That means that only a special class of instructions is used to reference data in memory. As a rule, data is loaded into registers first, then processed, and the results are written back using stores. Load and Store instructions have their own formats:
  D type:
  bits:    4     3    1  1  1  1  1  4   4   12
  fields:  1110  010  1  U  0  0  L  Rn  Rd  Imm12
Why does a 1 imply an immediate operand for ALU types, but 0 for Loads and Stores?
  X type:
  bits:    4     3    1  1  1  1  1  4   4   5             2           1  4
  fields:  1110  011  1  U  0  0  L  Rn  Rd  Shift Amount  Shift Type  0  Rm
If U is 0, subtract the offset from the base; otherwise add them.
L is 1 for a Load and 0 for a Store.
The X type offers the same shift options that we saw for the data-processing instructions.

Load and Store Options
ARM's load and store instructions are versatile. They provide a wide range of addressing modes. Only a subset is shown here.
  LDR Rd,[Rn,#imm12]      Rd ← Memory[Rn + imm12]
    Rd is loaded with the contents of memory at the address found by adding the contents of the base register to the supplied constant.
  STR R0,[R1,#-4]         Memory[R1 - 4] ← R0
    Offsets can be either added or subtracted, as indicated by a negative sign.
  LDR R2,[R3]
    If no offset is specified, it is assumed to be zero.
  STR R4,[R5,R6]
    The contents of a second register can be used as an offset rather than a constant (using the X-type format).
  LDR R4,[R5,-R6]
    Register offsets can be either added or subtracted, like constants.
  STR R4,[R5,R4,LSL 2]
    Register offsets can also be optionally shifted, which is great for indexing arrays!
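The address arithmetic behind these addressing modes can be sketched in a few lines of Python. The function effective_address and its parameters are hypothetical; only a left shift (LSL) of the offset is modeled.

```python
MASK32 = 0xFFFFFFFF

def effective_address(base, offset=0, shift=0, subtract=False):
    """Address used by LDR/STR: base plus or minus (offset << shift)."""
    scaled = (offset << shift) & MASK32
    if subtract:
        return (base - scaled) & MASK32   # U = 0: subtract offset from base
    return (base + scaled) & MASK32       # U = 1: add offset to base
```

For example, STR R0,[R1,#-4] with R1 = 0x1000 writes to effective_address(0x1000, 4, subtract=True), and an access of the form [R5,R4,LSL 2] with R5 = 0x2000 and R4 = 3 addresses element 3 of a word array at effective_address(0x2000, 3, shift=2).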

Changing the PC
The Program Counter is a special register (R15) that tracks the address of the next instruction to be fetched. There are special instructions for changing the PC.
  B type:
  bits:    4     3    1  24
  fields:  Cond  101  L  Imm24
Condition codes:
  0000 - EQ - equals
  0001 - NE - not equals
  0010 - CS - carry set
  0011 - CC - carry clear
  0100 - MI - negative
  0101 - PL - positive or zero
  0110 - VS - overflow
  0111 - VC - no overflow
  1000 - HI - higher (unsigned)
  1001 - LS - lower or same (unsigned)
  1010 - GE - greater or equal (signed)
  1011 - LT - less than (signed)
  1100 - GT - greater than (signed)
  1101 - LE - less than or equal (signed)
  1110 - (none) - always
The L bit causes PC+4 to be saved in LR (R14).
Branches are often executed conditionally based on the PSR state set by some previous instruction like CMP or TST.
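How each condition code is decided from the PSR flags can be written as a lookup table. The Python sketch below uses the standard ARM flag tests for each suffix; the dictionary and function names are hypothetical, and the flags are passed as a plain dict with keys N, Z, C and V.

```python
CONDITIONS = {
    "EQ": lambda f: f["Z"] == 1,
    "NE": lambda f: f["Z"] == 0,
    "CS": lambda f: f["C"] == 1,
    "CC": lambda f: f["C"] == 0,
    "MI": lambda f: f["N"] == 1,
    "PL": lambda f: f["N"] == 0,
    "VS": lambda f: f["V"] == 1,
    "VC": lambda f: f["V"] == 0,
    "HI": lambda f: f["C"] == 1 and f["Z"] == 0,    # unsigned >
    "LS": lambda f: f["C"] == 0 or f["Z"] == 1,     # unsigned <=
    "GE": lambda f: f["N"] == f["V"],               # signed >=
    "LT": lambda f: f["N"] != f["V"],               # signed <
    "GT": lambda f: f["Z"] == 0 and f["N"] == f["V"],
    "LE": lambda f: f["Z"] == 1 or f["N"] != f["V"],
    "AL": lambda f: True,                           # always
}

def condition_passes(cond, flags):
    """Return True if a branch with this condition suffix would be taken."""
    return CONDITIONS[cond](flags)
```

After a CMP of 3 against 5 the flags are N = 1, Z = 0, C = 0, V = 0, so condition_passes("LT", ...) is True and a BLT would be taken. The unsigned tests HI and LS rely on C = 1 meaning "no borrow" after a comparison.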

Branch Using Registers
The standard Branch instruction has a limited range: the 24-bit signed 2's complement immediate value is multiplied by 4 and added to PC+8, giving a range of +/- 32 Mbytes. Larger branches make use of addresses previously loaded into a register, using the BX instruction.
  R type:
  bits:    4     3    21                         4
  fields:  Cond  000  1 0010 1111 1111 1111 0001  Rn
Cond uses the same condition codes as the B-type format.
If the condition is true, the PC is loaded with the contents of Rn.
BTW, BX is encoded as a TEQ instruction with its S field set to 0.
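The range limit follows directly from the arithmetic described here. This Python sketch computes a branch target from Imm24 and goes the other way to encode an offset; the function names are hypothetical, and pc stands for the address of the branch instruction itself.

```python
def sign_extend_24(imm24):
    """Interpret a 24-bit field as a signed two's-complement number."""
    imm24 &= 0xFFFFFF
    return imm24 - (1 << 24) if imm24 & 0x800000 else imm24

def branch_target(pc, imm24):
    """Target address: (PC + 8) plus the signed immediate times 4."""
    return (pc + 8 + 4 * sign_extend_24(imm24)) & 0xFFFFFFFF

def encode_branch_offset(pc, target):
    """Imm24 field for a branch at pc to target; raises if out of range."""
    offset = target - (pc + 8)
    if offset % 4 != 0:
        raise ValueError("target must be word aligned")
    words = offset // 4
    if not -(1 << 23) <= words < (1 << 23):
        raise ValueError("target out of range, use BX with a register")
    return words & 0xFFFFFF
```

A branch to itself, as in an infinite loop, encodes as encode_branch_offset(pc, pc), which is the 24-bit pattern for -2. The largest forward offset is 4 * (2^23 - 1) bytes, just under 32 Mbytes.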

Branch Examples
  BNE else
    If some previous CMP instruction had a non-zero result (i.e., making the Z bit 0 in the PSR), then this instruction will cause the PC to be loaded with the address having the label else.
  BEQL func
    If some previous CMP instruction set the Z bit in the PSR, then this instruction will cause the PC to be loaded with the address having the label func, and the address of the following instruction will be saved in R14.
  BX LR
    Loads the PC with the contents of R14.
  loop: B loop
    An infinite loop.

A Simple Program
  ; Assembly code for
  ;   sum = 0;
  ;   for (i = 0; i <= 10; i++)
  ;       sum = sum + i;

        MOV R1,#0        ; R1 is i
        MOV R0,#0        ; R0 is sum
  loop: ADD R0,R0,R1     ; sum = sum + i
        ADD R1,R1,#1     ; i++
        CMP R1,#10       ; i <= 10
        BLE loop
  halt: B halt
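To follow the control flow of this program, here is a Python sketch that mirrors it register by register. It is a hand translation, not an emulator: the function name run_sum_loop is hypothetical, and the CMP/BLE pair is modeled as a single signed comparison at the bottom of the loop.

```python
def run_sum_loop(limit=10):
    """Mirror the assembly loop; returns (R0, R1) when it reaches halt."""
    r1 = 0                    # MOV R1,#0      ; i
    r0 = 0                    # MOV R0,#0      ; sum
    while True:
        r0 = r0 + r1          # ADD R0,R0,R1   ; sum = sum + i
        r1 = r1 + 1           # ADD R1,R1,#1   ; i++
        if not r1 <= limit:   # CMP R1,#10 / BLE loop
            break
    return r0, r1
```

Note that the test sits at the bottom of the loop, so the body always runs at least once. That matches the C for loop only because the initial value of i already satisfies i <= 10.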

Loads and Stores in Action
An example of how loads and stores are used to access arrays.
Java/C:
  int x[10];
  int sum = 0;
  for (int i = 0; i < 10; i++)
      sum += x[i];
Assembly:
        .align 4
  x:    .space 40
  sum:  .word 0

        MOV R0,=x              ; base of x
        MOV R1,=sum
        LDR R2,[R1]
        MOV R3,#0              ; R3 is i
  for:  LDR R4,[R0,R3,LSL 2]   ; x[i], at address x + 4*i
        ADD R2,R2,R4
        ADD R3,R3,#1
        CMP R3,#10
        BLT for
        STR R2,[R1]
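The same array walk can be traced in Python with a byte-addressed memory, which shows why the index is shifted left by 2 (each int occupies 4 bytes). Everything concrete here is an assumption made for illustration: the addresses X_BASE and SUM_ADDR, the little-endian byte order, and the array contents 0, 3, 6, ..., 27.

```python
memory = bytearray(64)   # hypothetical byte-addressed memory
X_BASE = 0               # assumed address of x (10 words = 40 bytes)
SUM_ADDR = 40            # assumed address of sum, right after x

def load_word(addr):
    """LDR: read a 32-bit word (little-endian is assumed here)."""
    return int.from_bytes(memory[addr:addr + 4], "little")

def store_word(addr, value):
    """STR: write a 32-bit word."""
    memory[addr:addr + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")

# Fill x with made-up sample data: x[i] = 3 * i.
for i in range(10):
    store_word(X_BASE + 4 * i, 3 * i)

r0, r1 = X_BASE, SUM_ADDR              # MOV R0,=x / MOV R1,=sum
r2 = load_word(r1)                     # LDR R2,[R1]
r3 = 0                                 # MOV R3,#0
while True:
    r4 = load_word(r0 + (r3 << 2))     # LDR R4,[R0,R3,LSL 2]
    r2 = (r2 + r4) & 0xFFFFFFFF        # ADD R2,R2,R4
    r3 += 1                            # ADD R3,R3,#1
    if not r3 < 10:                    # CMP R3,#10 / BLT for
        break
store_word(r1, r2)                     # STR R2,[R1]
```

After the loop, the word at SUM_ADDR holds the total of the ten array elements, and R3 has stepped through byte offsets 0, 4, 8, ..., 36.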

Next Time
We'll write more Assembly programs.
Still some loose ends:
  Multiplication?
  Division?
  Floating point?