ARM Architecture and Assembly Programming Intro

Similar documents
CprE 288 Introduction to Embedded Systems Course Review for Exam 3. Instructors: Dr. Phillip Jones

OUTLINE. STM32F0 Architecture Overview STM32F0 Core Motivation for RISC and Pipelining Cortex-M0 Programming Model Toolchain and Project Structure

The ARM Cortex-M0 Processor Architecture Part-1

ARM Cortex-M4 Programming Model

EE4144: ARM Cortex-M Processor

ARM Processor Architecture

Introduction to C. Write a main() function that swaps the contents of two integer variables x and y.

CprE 288 Translating C Control Statements and Function Calls, Loops, Interrupt Processing. Instructors: Dr. Phillip Jones Dr.

ECE254 Lab3 Tutorial. Introduction to MCB1700 Hardware Programming. Irene Huang

AND SOLUTION FIRST INTERNAL TEST

ARM Cortex-M4 Architecture and Instruction Set 1: Architecture Overview

CprE 288 Introduction to Embedded Systems ARM Assembly Programming: Translating C Control Statements and Function Calls

ARM Cortex core microcontrollers

ECE 471 Embedded Systems Lecture 2

Embedded System Design

AVR Microcontrollers Architecture

Computer Organization Laboratory. Class Notes. Instructor: Ken Q. Yang Dept. of ECE, URI

ECE254 Lab3 Tutorial. Introduction to Keil LPC1768 Hardware and Programmers Model. Irene Huang

Design and Implementation Interrupt Mechanism

Microcontrollers. Microcontroller

ARM Interrupts. EE383: Introduction to Embedded Systems University of Kentucky. James E. Lumpp

EE 354 Fall 2015 Lecture 1 Architecture and Introduction

A block of memory (FlashROM) starts at address 0x and it is 256 KB long. What is the last address in the block?

CprE 288 Introduction to Embedded Systems

ECE251: Thursday September 27

ARM Cortex-A9 ARM v7-a. A programmer s perspective Part1

Choosing a Micro for an Embedded System Application

Multitasking on Cortex-M(0) class MCU A deepdive into the Chromium-EC scheduler

Hercules ARM Cortex -R4 System Architecture. Processor Overview

ARM Ltd. ! Founded in November 1990! Spun out of Acorn Computers

Micro computer Organization

ECE 471 Embedded Systems Lecture 2

EE 308: Microcontrollers

ARM architecture road map. NuMicro Overview of Cortex M. Cortex M Processor Family (2/3) All binary upwards compatible

Stacks and Subroutines

ARM Cortex core microcontrollers 3. Cortex-M0, M4, M7

ARM Processors for Embedded Applications

ECE251: Tuesday September 11

Interrupts and Low Power Features

IoT and Security: ARM v8-m Architecture. Robert Boys Product Marketing DSG, ARM. Spring 2017: V 3.1

Microcomputer Architecture and Programming

COMP2121: Microprocessors and Interfacing. Instruction Set Architecture (ISA)

ECE251: Tuesday September 12

COEN-4720 Embedded Systems Design Lecture 3 Intro to ARM Cortex-M3 (CM3) and LPC17xx MCU

Program SoC using C Language

ARM Processor. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

L2 - C language for Embedded MCUs

F28HS2 Hardware-Software Interfaces. Lecture 6: ARM Assembly Language 1

ARM Cortex A9. ARM Cortex A9

Wed. Aug 23 Announcements

ELC4438: Embedded System Design ARM Cortex-M Architecture II

Interconnects, Memory, GPIO

ARM Cortex-M and RTOSs Are Meant for Each Other

ECE2049 E17 Lecture 4 MSP430 Architecture & Intro to Digital I/O

Automation Engineers AB Pvt Ltd, NOIDA Job-Oriented Course on Embedded Microcontrollers & Related Software Stack

Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan

CPU: SOFTWARE ARCHITECTURE INSTRUCTION SET (PART

TEVATRON TECHNOLOGIES PVT. LTD Embedded! Robotics! IoT! VLSI Design! Projects! Technical Consultancy! Education! STEM! Software!

Raspberry Pi / ARM Assembly. OrgArch / Fall 2018


8051 Microcontrollers

MICROPROCESSOR BASED SYSTEM DESIGN

ECE251: Thursday September 13

CN310 Microprocessor Systems Design

ARM Processors ARM ISA. ARM 1 in 1985 By 2001, more than 1 billion ARM processors shipped Widely used in many successful 32-bit embedded systems

Agenda. ARM Core Data Flow Model Registers Program Status Register Pipeline Exceptions Core Extensions ARM Architecture Revision

E85 Lab 8: Assembly Language

Announcements. Class 7: Intro to SRC Simulator Procedure Calls HLL -> Assembly. Agenda. SRC Procedure Calls. SRC Memory Layout. High Level Program

Chapter 4. Enhancing ARM7 architecture by embedding RTOS

Introduction to ARM LPC2148 Microcontroller

V8-uRISC 8-bit RISC Microprocessor AllianceCORE Facts Core Specifics VAutomation, Inc. Supported Devices/Resources Remaining I/O CLBs

ECE 471 Embedded Systems Lecture 3

AVR ISA & AVR Programming (I) Lecturer: Sri Parameswaran Notes by: Annie Guo

Arm Cortex -M33 Devices

AVR ISA & AVR Programming (I)

Cortex-M4 Processor Overview. with ARM Processors and Architectures

Cortex-M3/M4 Software Development

Department of Electronics and Instrumentation Engineering Question Bank

COEN-4720 Embedded Systems Design Lecture 4 Interrupts (Part 1) Cristinel Ababei Dept. of Electrical and Computer Engineering Marquette University

Chapter 7 Central Processor Unit (S08CPUV2)

Implementing Secure Software Systems on ARMv8-M Microcontrollers

Microcontroller systems Lec 2 PIC18LF8722 Microcontroller s s core

Description of the Simulator

突破 8-/16-/32- 位和 DSP 界限的 ARM MCU 解决方案

x86 assembly CS449 Fall 2017

acret Ameya Centre for Robotics & Embedded Technology Syllabus for Diploma in Embedded Systems (Total Eight Modules-4 Months -320 Hrs.

ARM Cortex -M for Beginners

AVR ISA & AVR Programming (I) Lecturer: Sri Parameswaran Notes by: Annie Guo

Universität Dortmund. ARM Architecture

Lecture 14. Ali Karimpour Associate Professor Ferdowsi University of Mashhad

CprE 288 Introduction to Embedded Systems Exam 1 Review. 1

ARM Cortex-M4 Architecture and Instruction Set 4: The Stack and subroutines

The Definitive Guide to the ARM Cortex-M3

Lecture notes Lectures 1 through 5 (up through lecture 5 slide 63) Book Chapters 1-4

Computer Systems Lecture 9

CN310 Microprocessor Systems Design

Diploma in Embedded Systems

Embedded Systems. PIC16F84A Internal Architecture. Eng. Anis Nazer First Semester

Computer Architecture. Chapter 2-2. Instructions: Language of the Computer

Chapter 2 Sections 1 8 Dr. Iyad Jafar

Transcription:

ARM Architecture and Assembly Programming Intro Instructors: Dr. Phillip Jones http://class.ece.iastate.edu/cpre288 1

Announcements HW9: Due Sunday 11/5 (midnight) Lab 9: object detection lab Give TAs Creative Team name, and verified list of team members Exam 2: In class Thursday 11/9 HW5 HW8 Labs 1-8 Reading for remainder of semester Chapter 2.1-2.3, and 2.6.1-2.6.2 Chapter 4.1 4.3 Assemble ARM instruction set manual. Preface, Chapter 3, Chapter 4 ARM Procedure Call Standard Sections: 5, 7.1.1, 7.2. 2

ARM ARCHITECTURE OVERVIEW http://class.ece.iastate.edu/cpre288 3

Why use assembly programming? Full access to hardware features Compiler limits a programmers access to the hardware features that the compiler writer decided to implement Writing time critical portions of code Allows tight control over what the CPU is doing on every clock cycle Debugging It in not uncommon when trying to debug odd system behavior to have to look at disassembled code http://class.ece.iastate.edu/cpre288 4

Why learn the ARM Hardware Architecture? Helps give intuition to why the assembly instructions were created the way the were Help understand what special feature may be available for you to make use of. http://class.ece.iastate.edu/cpre288 5

ARM (Cortex-M4) Architecture Overview Table 2.1 Development history of the ARM MCUs family. Architecture Bit Width Cores Designed by ARM Holdings Cores Designed by Third Parties Cortex Profile ARMv1 32/26 ARM1 ARMv2 32/26 ARM2, ARM3 Amber, STORM Open Soft Core ARMv3 32 ARM6, ARM7 ARMv4 32 ARM8 StrongARM, FA526 ARMv4T 32 ARM7TDMI, ARM9TDMI ARMv5 32 ARM7EJ, ARM9E, ARM10E XScale, FA626TE, Feroceon, PJ1/Mohawk ARMv6 32 ARM11 ARMv6-M 32 ARM Cortex-M0, ARM Cortex-M0+, ARM Cortex-M1 Microcontroller ARMv7-M 32 ARM Cortex-M3 Microcontroller ARMv7E-M 32 ARM Cortex-M4 Microcontroller ARMv7-R 32 ARM Cortex-R4, ARM Cortex-R5, ARM Cortex-R7 Real-time ARMv7-A 32 ARM Cortex-A5, ARM Cortex-A7, ARM Cortex-A8, ARM Cortex-A9, ARM Cortex-A12, ARM Cortex-A15, ARM Cortex-A17 ARMv8-A 64/32 ARM Cortex-A53, ARM Cortex-A57 Krait, Scorpion, PJ4/Sheeva, Apple A6/A6X X-Gene, Denver, Apple A7 (Cyclone), K12 Application Application ARMv8-R 32 No announcements yet Real-time http://class.ece.iastate.edu/cpre288 6

ARM (Cortex-M4) Architecture Overview 32 bit processor size of bus is 32 bits size of registers is 32 bits Size of instructions 16/32 bits RISC architecture Harvard architecture separate data and instruction memory 203 instructions System Controller System Clock Control-PLL Memory Protection Unit (MPU) Float Point Unit (FPU) Watchdog Timers Reset Control MAC UARTs 0 ~ 7 SPI Analog Comparator ADCs 0 ~ 7 Peripheral Bridge Peripheral Data Flash Program Application Specific Logics CAN USB PWM I2C General Timers Figure 2.1 The block diagram for a general MCU. http://class.ece.iastate.edu/cpre288 7

ARM Block Diagram System Controller System Clock Control-PLL Memory Protection Unit (MPU) Peripheral Bridge Float Point Unit (FPU) Watchdog Timers Reset Control Peripheral Data Flash Program Application Specific Logics MAC UARTs 0 ~ 7 SPI Analog Comparator ADCs 0 ~ 7 CAN USB PWM I2C General Timers Figure 2.1 The block diagram for a general MCU. http://class.ece.iastate.edu/cpre288 8

ARM: RISC CPU Architecture What is RISC? Reduced Instruction Set Computing (with respect to CISC, Complex Instruction set Computing) Typical RISC LD/Store based: ALU to memory transaction via registers Most instruction are the same length Typically many less instructions than a CISC architecture Typically many more registers than CISC since Data must be moved into a register before it can be operated on Low number of instruction typically makes hardware design simpler (as compared to CISC) Ref: http://www.seas.upenn.edu/~palsetia/cit595s07/riscvscisc.pdf (Diana Palsetia) http://class.ece.iastate.edu/cpre288 9

ARM: RISC vs CISC example Size / time 1 byte, 1clk 1 byte, 1clk 4 byte, 30 clk CISC mov R1, 10 mov R2 5 mul R2, R1 Size / time 1 byte, 1clk 1 byte, 1clk 1 byte, 1 clk 1 byte, 1clk 1 byte, 1 clk RISC mov R1, 0 mov R2, 10 mov R3, 5 Begin: add R1, R2 loop Begin CISC: Instructions often variable length and variable time RISC: Instruction typically constant length and time Simpler hardware logic for decoding instructions (thus typically faster) Ref: http://www.seas.upenn.edu/~palsetia/cit595s07/riscvscisc.pdf (Diana Palsetia) http://class.ece.iastate.edu/cpre288 10

ARM CPU Core Summary Instructions are 16-bit or 32-bit Simple three-stage pipeline Registers are 32-bit and addresses are 32-bit http://class.ece.iastate.edu/cpre288 11

Cortex-M4 Architecture 32 bit processor size of data bus is 32 bits size of registers is 32 bits Harvard architecture RISC architecture ~203 instructions Interrupts Debug Interface JTAG/SWD Interrupt Controller SysTick Trace Interface Trace Interface ICode & DCode (Flash) & (SRAM) RAM + Peripherals Figure 2.3 The functional block diagram for the ARM Cortex -M4 MCU. http://class.ece.iastate.edu/cpre288 12

Cortex-M4 Processor Architecture 32 bit processor size of data bus is 32 bits size of registers is 32 bits Harvard architecture RISC architecture ~203 instructions Interrupts Debug Interface JTAG/SWD Interrupt Controller SysTick Trace Interface Trace Interface ICode & DCode (Flash) & (SRAM) RAM + Peripherals Figure 2.3 The functional block diagram for the ARM Cortex -M4 MCU. http://class.ece.iastate.edu/cpre288 13

ARM: Harvard Architecture 64K rows Program Memory (256KB) CPU Data Memory (32KB) 8K rows 32-bits Program memory Flash based: Program stays even if power turned off (non-volatile) 16-bits wide, instructions are 16-bit or 32-bit wide. Data Memory SRAM based: Data disappears if power is turned off (volatile) 32-bits wide: 32-bits 14 http://class.ece.iastate.edu/cpre288

ARM: Logical Data Memory organization CPU 16 32-bit Registers ALU Data Memory (32KB) 8K rows Registers: 32-bits wide Directly accessible by ALU Data Memory: 32-bit wide Must use a register to move to/from the ALU http://class.ece.iastate.edu/cpre288 32-bits 15

ARM: Memory Map organization CPU http://class.ece.iastate.edu/cpre288 16

Understanding Data Stack Stores data related to function variables, function calls, parameters, return variables, etc. Data on the stack can go out of scope, and is then automatically deallocated Starts at the top of the program s data memory space, and addresses move down as more variables are allocated Heap Stores dynamically allocated data Dynamically allocated data usually calls the functions alloc or malloc (or uses new in C++) to allocate memory, and free to (or delete in C++) deallocate There s no garbage collector! Starts at bottom of program s data memory space, and addresses move up as more variables are allocated http://class.ece.iastate.edu/cpre288 17

ARM: Typical Logical Memory Layout Static Data (e.g. Global vars) Stack (e.g. local vars, function args & return value, return address) Heap (e.g. dynamically allocated memory: malloc, new) Code (Code may be in the same or a separate memory device) 0xHigh address Stack Pointer (SP): Address of top of the stack Data Stack Heap Static Data Stack Base Address Stack grows downward Heap grows upward 0x0000_0000 Code Segment http://class.ece.iastate.edu/cpre288 Code 18

Example Memory Layout static char[] greeting = Hello world! ; main() { int i; char bval; } Memory Mapped I/O, Devices LCD_init(); LCD_PutString(greetin); High end Address I/O addresses Stack (grows down) Heap (grows up) Static Data Code Low end Address http://class.ece.iastate.edu/cpre288 19

ARM: Typical Function Stack-Frame Stack is typically divided into a number of function stack-frames. Stack-Frame: Contains information associated with a function call Function args, local vars, return address, preserved values Function 1 Frame Stack Function Return Address Preserved Values Function Args Local Vars Stack Base Address Stack grows downward Function 2 Frame Function Return Address Preserved Values Function Args Local Vars http://class.ece.iastate.edu/cpre288 20

Function and Stack Conventional program stack grows downwards: New items are put at the top, and the top grows down high address stack top Occupied stack growth low address

Function and Stack Auto, local variables have their storage in stack Why stack? The LIFO order matches perfectly with functions call/return order LIFO: Last In, First Out Function: Last called, first returned Efficient memory allocation and de-allocation Allocation: Decrease SP (stack top): PUSH De-allocation: Increase SP: POP

Function and Stack Function Frame: Local storage for a function Example: 1. A is called; 2. A calls B; 3. B calls C; 4. C returns

Function and Stack What can be put in a stack frame? Function return address Parameter values Return value Local variables Saved register values May 18, 2011 http://class.ece.iastate.edu/cpre288 24

Example: Stack The following example shows the execution of a simple program (left) and the memory map of the stack (right) http://class.ece.iastate.edu/cpre288 25

Example: Stack void donothing() { char c; } int main() { char x, y, z; short i; for (i = 0; i < 10; i++) { donothing(); } return 0; }. 1000 999 998 997 996 995 994 993. x y z http://class.ece.iastate.edu/cpre288 26

Example: Stack void donothing() { char c; } int main() { char x, y, z; short i; for (i = 0; i < 10; i++) { donothing(); } return 0; }. 1000 999 998 997 996 995 994 993. x y z i http://class.ece.iastate.edu/cpre288 27

Example: Stack void donothing() { char c; } int main() { char x, y, z; short i; for (i = 0; i < 10; i++) { donothing(); } return 0; }. 1000 999 998 997 996 995 994 993. x y z i http://class.ece.iastate.edu/cpre288 28

Example: Stack void donothing() { char c; } int main() { char x, y, z; short i; for (i = 0; i < 10; i++) { donothing(); } return 0; }. 1000 999 998 997 996 995 994 993. x y z i PC Address for line 9 http://class.ece.iastate.edu/cpre288 29

Example: Stack void donothing() { char c; } int main() { char x, y, z; short i; for (i = 0; i < 10; i++) { donothing(); } return 0; }. 1000 999 998 997 996 995 994 993. x y z i PC Address for line 9 http://class.ece.iastate.edu/cpre288 30

Example: Stack void donothing() { char c; } int main() { char x, y, z; short i; for (i = 0; i < 10; i++) { donothing(); } return 0; }. 1000 999 998 997 996 995 994 993. x y z i PC Address for line 9 c http://class.ece.iastate.edu/cpre288 31

Example: Stack void donothing() { char c; } int main() { char x, y, z; short i; for (i = 0; i < 10; i++) { donothing(); } return 0; }. 1000 999 998 997 996 995 994 993. x y z i http://class.ece.iastate.edu/cpre288 32

General Purpose Register File 13 General Purpose Registers (R0 ~ R12) Low Register High Register Stack Pointer Register Link Register Program Counter R0 R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 R12 R13 (MSP-PSP) R14 (LR) R15 (PC) Application PSR APSR EPSR IPSR XPSR PRIMASK FAULTMASK BASEPRI CONTROL Execution PSR Main Stack Pointer (MSP) Process Stack Pointer (PSP) Interrupt PSR Program Status Register Interrupt Exception Mask Registers Control Register Figure 2.5 A structure block diagram of 21 registers in the Cortex -M4 Core. http://class.ece.iastate.edu/cpre288 33

ARM: GP Registers (Cont.) 16 32-bit general purpose registers Used for accessing SRAM Used for storing function parameters Used for instructions to execute operations on What is an 32-bit register. Basically just 32 D-Flips connected together Bit 31 Bit 30 Bit 5 Bit 2 Bit 1 Bit 0 http://class.ece.iastate.edu/cpre288 34

STATUS REGISTER (SREG) http://class.ece.iastate.edu/cpre288 35

Status Register (SREG) Bits 31 30 29 28 27 26:25 24 23:20 19:16 15:10 9 8 7 6 5 4 3 2 1 0 APSR N Z C V Q GE * Reserved IPSR Reserved Exception Number EPSR Reserved ICI/IT T Reserved ICI/IT Reserved (a) Three individual register APSR, IPSR and EPSR. Bits 31 30 29 28 27 26:25 24 23:20 19:16 15:10 9 8 7 6 5 4 3 2 1 0 PSR N Z C V Q ICI/IT T GE * ICI/IT Exception Number (b) The combined register PSR. Figure 2.6 Structure and bit functions in special registers. http://class.ece.iastate.edu/cpre288 36

Status Register (SREG): Common flags Z: Zero flag Set to 1 when the result of an instruction is 0 C: Carry flag Set to 1 when the result of an instruction causes a carry to occur N: Negative Set to 1 to indicate the result of an instruction is negative V: Overflow Set to 1 to indicate the result of an instruction caused an overflow http://class.ece.iastate.edu/cpre288 37

Example ARM Assembly int x, a, b; x = (a + b); MOVW r4,address-of-a; get address for a LDR r0,[r4]; get value of a MOVW r4,address-of-b; get address for b LDR r1,[r4]; get value of b ADD r3,r0,r1; compute a+b MOVW r4,address-of-x; get address for x STR r3,[r4]; store value of x http://class.ece.iastate.edu/cpre288 38

How to Study Assembly 1. Get to know CPU registers Memorize rules for usage 2. Know basic types of Instruction Memory load and store Arithmetic/Logic Compare and branch http://class.ece.iastate.edu/cpre288 39

How to Study Assembly 3. Translate C statements Memory accesses Simple arithmetic statements If statement Loop statements 4. Translate C functions Function Linkage Making a function call http://class.ece.iastate.edu/cpre288 40

How to Study Assembly 5. Interrupt System Principle of interrupt and exception Interrupt vector table Saving and restoring context http://class.ece.iastate.edu/cpre288 41

Challenges Challenges in learning Assembly Must understand how program works cycle by cycle Have to memorize some notations before fully understanding them http://class.ece.iastate.edu/cpre288 42

Announcements Final Projects Project Teams: Demo Thursday 7/6 +5 bonus if Demoed by Wednesday 7/5 Reminder lab attendance is mandatory: -10 points from final project for each lab session you miss Exam 2: In class Thursday 11/9 Reading for remainder of semester Chapter 2.1-2.3, and 2.6.1-2.6.2 Chapter 4.1 4.3 Assemble ARM instruction set manual. Preface, Chapter 3, Chapter 4 ARM Procedure Call Standard Sections: 5, 7.1.1, 7.2. 43

http://class.ece.iastate.edu/cpre288 44