A QUICK INTRO TO PRACTICAL OPTIMIZATION TECHNIQUES
- Chester Bennett
- 6 years ago
SW OPTIMIZATION (Courtesy M. Romdhane, Conexant)

0. NO SILVER BULLETS HERE.

1. Set Compiler Options Appropriately:

Select processor architecture:
- Enables the compiler to make full use of the instructions supported by the processor.
- The compiler performs processor-specific optimizations.
- E.g., use -proc ARM7.
- Also select the most appropriate architecture (-arch N); architectures 4 and 4T enable the halfword instructions.

Debugging options:
- Debugging options affect both code size and performance significantly.
- To allow efficient debugging, a varying level of optimization is disabled. (Some optimizations produce code that cannot be described in debugging tables.)
- Switch off debugging when code size and/or performance is important.
- Debug options: -g, -gr, -go
- -go increases code size by 7-15%.

Optimization options:
- For time/speed: -Otime
- For code size: -Ospace

Use the appropriate ARM Procedure Call Standard (APCS) options:
- -apcs /wide
- -apcs /fp
- -apcs /swst/fp

Recommendation: experiment with these options for the project:

    -Ospace
    -Otime
    -Otime -apcs /wide
    -Otime -go
    -Otime -g
    -Otime -gr
    -Otime -apcs /fp
    -Otime -apcs /swst/fp

2. Division & Remainder

- Divisions are typically implemented by calling a C library function: __rt_sdiv and __rt_udiv.
- Divisions are very expensive in cycles (rule of thumb: N cycles).
- Avoid division when possible by using algebraic substitutions: (x/y) > z can be replaced by x > (z*y), provided y is known to be positive and z*y does not overflow.
- Combine division and remainder when both are needed:

    int combined_div_mod(int a, int b)
    {
        return (a / b) + (a % b);
    }

- Use power-of-2 division when possible (e.g., use a/16 instead of a/15, or 128 instead of 100):

    typedef unsigned int uint;

    uint div16u(uint a)
    {
        return a / 16;
    }

- Avoid modulo arithmetic (which uses the remainder) when possible:

    uint counter1(uint count)
    {
        return (++count % 60);
    }
Instead use this:

    uint counter2(uint count)
    {
        if (++count >= 60)
            count = 0;
        return (count);
    }

- Division by a constant? Use lookup tables.

3. Conditional Execution

All ARM instructions are conditional. Each instruction contains a 4-bit condition code field, and the instruction is executed only if the ARM flag bits indicate that the specified condition is true.

Conditional execution is applied mostly in the body of if statements and while evaluating complex expressions with relational (<, ==, >) and Boolean operators (&&, !, ||). Typically it starts with a compare instruction followed by a few conditional instructions. It reduces the number of branch instructions and therefore improves code size and performance. One branch instruction takes about 2.5 ARM7 cycles.

Recommendation: to enable the compiler to use conditional instructions, keep the bodies of if/else statements as simple as possible. Relational expressions should be grouped into blocks of similar conditions: generate all flags and stream through the code without branch instructions.

4. Compare with zero

A compare with zero can be avoided if the code can directly test the N and Z flags (Z: result is zero, N: result is negative).

    ADD  R0, R0, R1
    CMP  R0, #0

produces the same N and Z flags as

    ADDS R0, R0, R1

However, the C language has no concept of a carry flag or overflow flag, so it is not possible to test the C or V flag bits directly without inline assembler. The compiler does support the carry flag.

5. LOOPS

The loop termination condition can cause significant overhead if written without caution. Always use count-down-to-zero loops and use simple termination conditions.

Example: n!

    int fact1(int n)
    {
        int i, fact = 1;
        for (i = 1; i <= n; i++)
            fact *= i;
        return (fact);
    }

    int fact2(int n)
    {
        int i, fact = 1;
        for (i = n; i != 0; i--)
            fact *= i;
        return (fact);
    }

fact2 can use SUBS instead of ADD/CMP, because the compare with zero can be optimized away. This saves one instruction in the loop.
Also, in fact2 the variable n does not need to be saved across the loop, so a register is saved as well. This eases register allocation and leads to more efficient code elsewhere in the function (2 more instructions saved).

This observation (initializing the loop counter to the number of iterations required and then decrementing down to zero) also applies to while and do statements.

6. LOOP UNROLLING

Small loops can be unrolled for higher performance at the expense of increased code size. When a loop is unrolled, the loop counter needs to be updated less often and fewer branches are executed. If the number of iterations is small, a loop can be fully unrolled and the loop overhead disappears completely. Unrolling is not supported by the compiler and should be done manually.

    int countbit1(uint n)
    {
        int bits = 0;
        while (n != 0) {
            if (n & 1) bits++;
            n >>= 1;
        }
        return bits;
    }

Assuming ARM7, checking a single bit takes 6 cycles, and the code size is only 9 instructions.

    int countbit2(uint n)
    {
        int bits = 0;
        while (n != 0) {
            if (n & 1) bits++;
            if (n & 2) bits++;
            if (n & 4) bits++;
            if (n & 8) bits++;
            n >>= 4;    /* shift right by 4 */
        }
        return bits;
    }

The above code checks 4 bits at a time, taking on average 3 cycles per bit. However, the code size is 15 instructions.

7. REGISTER ALLOCATION

This is the process by which the compiler allocates variables to ARM registers rather than to memory. It has a dramatic effect on both speed and memory, as these variables can be accessed quickly without needing instructions to transfer them to and from memory. You can write code which enables the compiler to achieve a more optimal register allocation.

All basic integer, pointer and floating-point types, fields of structures and complete structures can be allocated to registers.

A variable may be allocated to a register if:
- it is a local variable or a function parameter, and
- its address is never taken, or its address is taken but not assigned to another variable.
A field in a structure may be allocated to a register if:
- the structure is declared locally or is a function parameter, and
- the structure is not assigned directly with the result of a function call, and
- neither the address of the structure nor the address of any of its fields is taken, or, if any of these addresses is taken, it is not assigned to another variable.

8. ALIASING/POINTERs

Pointers must be used carefully, or poor code can be produced. If the address of a variable is taken, the compiler must assume that the variable can be changed by any assignment through a pointer or by any function call. That makes it impossible to put the variable into a register. This is also true for global variables, as they might have their address taken in some other function.

This is the pointer aliasing problem. Some other compilers ignore it, but the ARM compiler does not, because this rule is part of the ANSI/ISO standard (ignoring it can produce untraceable bugs).

Rules of thumb on local and global variables:
- Avoid taking the address of local variables.
- Avoid global variables.
- Avoid pointer chains.

9. LOCAL VARIABLES:

Sometimes it is necessary to take the address of a variable, for example if it is passed as a reference parameter to a function. This means that the variable cannot be allocated to a register.

SOLUTION: Make a copy of the variable and pass the address of that copy instead.

EXAMPLE:

    void f(int *a);
    int g(int a);

    int test1(int i)
    {
        f(&i);          /* address of i is taken: cannot allocate to a register */
        /* now use 'i' extensively */
        i += g(i);
        i += g(i);
        return i;
    }

    int test2(int i)
    {
        int dummy = i;
        f(&dummy);
        i = dummy;
        i += g(i);
        i += g(i);
        return i;
    }

10. GLOBAL VARIABLES

Global variables are never allocated to registers (unless the __global_reg feature is used). Global variables can be changed by assigning to them indirectly through a pointer, or by a function call. Hence the compiler cannot cache the variable in a register, which results in extra (often unnecessary) loads and stores when globals are used.

Rule of thumb: NEVER USE GLOBAL VARIABLES IN CRITICAL LOOPS.
If a function uses global variables heavily, then when possible and when it makes sense, copy those global variables into local variables so that they can be assigned to registers.
Of course, this is possible only if those globals are not used by any of the functions which are called.

EXAMPLE:

    int f(void);
    int g(void);
    int errs;

    void test1(void)
    {
        errs += f();
        errs += g();
    }

    void test2(void)
    {
        int localerrs = errs;
        localerrs += f();
        localerrs += g();
        errs = localerrs;
    }

Here, test1 must load and store the global errs value each time it is incremented, whereas test2 stores localerrs in a register and needs only a single instruction.

11. POINTER CHAINS

Example:

    typedef struct { int x, y, z; } Point3;
    typedef struct { Point3 *pos, *direction; } Object;

    void InitPos1(Object *p)
    {
        p->pos->x = 0;
        p->pos->y = 0;
        p->pos->z = 0;
    }

This code must reload p->pos for each assignment. Instead, cache p->pos in a local variable:

    void InitPos2(Object *p)
    {
        Point3 *pos = p->pos;
        pos->x = 0;
        pos->y = 0;
        pos->z = 0;
    }

An alternative is to avoid pointers in the first place by including the Point3 structure in the Object structure.

12. LIVE VARIABLES & SPILLING

These affect the quality of register allocation. ARM has 14 integer registers available. The ARM compiler supports live-range spilling.

LIVE RANGE OF A VARIABLE: all statements between the last assignment to the variable and its last usage before the next assignment. In this range the value of the variable is valid, thus alive. In between live ranges the value of a variable is not needed: it is dead. Its register can then be reused by other variables, resulting in the allocation of more variables to registers.

The number of registers needed for register-allocatable variables is at least the number of overlapping live ranges at each point in a function. If this exceeds the number of registers available, some variables must be stored to memory temporarily. This process is called SPILLING. The compiler spills the least frequently used variables first.

SPILLING CAN BE AVOIDED BY
1. Limiting the maximum number of variables.
2. Keeping expressions SIMPLE and SMALL.
3. Minimizing the number of variables in a function.
4. Subdividing large functions into SMALLER, SIMPLER ones.
5. Using register for frequently used variables, etc.

13. DECLARING VARIABLE TYPES

Use the most appropriate variable types: char, short, int, long, signed and unsigned, float, double.

When possible, avoid char and short for local variables. The compiler needs to reduce the size of the local variable to 8 bits (shift right by 24) or 16 bits (shift right by 16). These operations can be avoided if int is used, thus optimizing both speed and code size.

14. FUNCTION DESIGN

General rule: keep functions small and simple. This enables the compiler to perform optimizations, such as register allocation, more efficiently.

Function call overhead: relatively small.
- The minimal call-return sequence is BL ... MOV pc, lr (~6 cycles).
- The multiple load and store instructions LDM and STM (PUSH and POP in the Thumb instruction set) reduce the cost of function entry and exit when some registers need to be saved.
- Under the APCS, up to 4 words of arguments can be passed to a function in registers. If more are needed (e.g., the 5th and 6th are passed on the stack), there is the additional cost of storing these words in the calling function and reloading them in the called function.
    int f1(int a, int b, int c, int d)
    {
        return a + b + c + d;
    }

    int g1(void)
    {
        return f1(1, 2, 3, 4);
    }

    int f2(int a, int b, int c, int d, int e, int f)
    {
        return a + b + c + d + e + f;
    }

    int g2(void)
    {
        return f2(1, 2, 3, 4, 5, 6);
    }

Here, the 5th and 6th parameters are stored on the stack in g2 and reloaded in f2, costing 2 memory accesses per parameter.

Therefore:
- Use four or fewer arguments for small functions.
- If more arguments are needed, make sure that the function does a significant amount of work.
- Pass pointers to structures instead of passing the structure itself.
- Use the __value_in_regs specifier, which can be used to return structures of up to 4 words in registers. Otherwise, structures are normally returned on the stack. Example:
    typedef struct { int hi; uint lo; } int64;    // low word unsigned

    __value_in_regs int64 add64(int64 x, int64 y)    // see specifier
    {
        int64 res;
        res.lo = x.lo + y.lo;
        res.hi = x.hi + y.hi;
        if (res.lo < y.lo)
            res.hi++;    // carry from low word
        return res;
    }

    void test(void)
    {
        int64 a, b, c, sum;
        a.hi = 0x ; a.lo = 0xF ;
        b.hi = 0x ; b.lo = 0x ;
        sum = add64(a, b);
        c.hi = 0x ; c.lo = 0xffffffff;
        sum = add64(sum, c);
    }

Here, by using __value_in_regs, the code size is 52 bytes. Otherwise it would have been 160 bytes!

LEAF FUNCTIONS
These are functions that do not call any other function. They can be compiled efficiently by the ARM compiler, since the usual saving and restoring of registers is not needed. Maximize the use of leaf functions.

TAIL CONTINUED FUNCTION
When a function ends with a call to another function, the call can be converted to a branch to that function. This is called tail continuation. It usually saves stack space and a branch.

PURE FUNCTION
The result a pure function returns depends only on its arguments (e.g., math functions). Such functions have no side effects: they cannot read or write global state, whether through global variables or by indirecting through pointers. Use __pure, which lets the compiler evaluate square(n) only once in f below:

    __pure int square(int x)
    {
        return x * x;
    }

    int f(int n)
    {
        return square(n) + square(n);
    }

INLINE FUNCTIONS
Inline functions have no call overhead and a lower argument evaluation overhead. More compiler optimizations therefore become possible (e.g., combining ADD + MUL into MLA, the multiply-accumulate instruction).
__inline: each call to an inline function is substituted by its body instead of a normal call. This gives faster code but larger code size.

FUNCTION DEFINITION
Placing function definitions before their use can produce better code, because it allows the compiler to analyze the register usage of the called function. This is a simple form of interprocedural optimization (where optimizations are carried out between functions).

Finally, where possible, use table lookup approximations rather than function calls.

15. USE MACROS FOR PORTABILITY:

    #ifdef __arm
    #  define INLINE        __inline
    #  define VALUE_IN_REGS __value_in_regs
    #  define PURE          __pure
    #else
    #  define INLINE
    #  define VALUE_IN_REGS
    #  define PURE
    #endif
More informationProcedure Calling. Procedure Calling. Register Usage. 25 September CSE2021 Computer Organization
CSE2021 Computer Organization Chapter 2: Part 2 Procedure Calling Procedure (function) performs a specific task and return results to caller. Supporting Procedures Procedure Calling Calling program place
More informationWeek 4 Lecture 1. Expressions and Functions
Lecture 1 Expressions and Functions Expressions A representation of a value Expressions have a type Expressions have a value Examples 1 + 2: type int; value 3 1.2 + 3: type float; value 4.2 2 More expression
More informationCOMP322 - Introduction to C++ Lecture 02 - Basics of C++
COMP322 - Introduction to C++ Lecture 02 - Basics of C++ School of Computer Science 16 January 2012 C++ basics - Arithmetic operators Where possible, C++ will automatically convert among the basic types.
More informationControl Instructions. Computer Organization Architectures for Embedded Computing. Thursday, 26 September Summary
Control Instructions Computer Organization Architectures for Embedded Computing Thursday, 26 September 2013 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition,
More informationControl Instructions
Control Instructions Tuesday 22 September 15 Many slides adapted from: and Design, Patterson & Hennessy 5th Edition, 2014, MK and from Prof. Mary Jane Irwin, PSU Summary Previous Class Instruction Set
More informationECE 2035 Programming HW/SW Systems Spring problems, 6 pages Exam Two 11 March Your Name (please print) total
Instructions: This is a closed book, closed note exam. Calculators are not permitted. If you have a question, raise your hand and I will come to you. Please work the exam in pencil and do not separate
More informationCSCI 171 Chapter Outlines
Contents CSCI 171 Chapter 1 Overview... 2 CSCI 171 Chapter 2 Programming Components... 3 CSCI 171 Chapter 3 (Sections 1 4) Selection Structures... 5 CSCI 171 Chapter 3 (Sections 5 & 6) Iteration Structures
More informationThe CPU and Memory. How does a computer work? How does a computer interact with data? How are instructions performed? Recall schematic diagram:
The CPU and Memory How does a computer work? How does a computer interact with data? How are instructions performed? Recall schematic diagram: 1 Registers A register is a permanent storage location within
More informationEECS 213 Introduction to Computer Systems Dinda, Spring Homework 3. Memory and Cache
Homework 3 Memory and Cache 1. Reorder the fields in this structure so that the structure will (a) consume the most space and (b) consume the least space on an IA32 machine on Linux. struct foo { double
More informationSOURCE LANGUAGE DESCRIPTION
1. Simple Integer Language (SIL) SOURCE LANGUAGE DESCRIPTION The language specification given here is informal and gives a lot of flexibility for the designer to write the grammatical specifications to
More informationCNIT 127: Exploit Development. Ch 1: Before you begin. Updated
CNIT 127: Exploit Development Ch 1: Before you begin Updated 1-14-16 Basic Concepts Vulnerability A flaw in a system that allows an attacker to do something the designer did not intend, such as Denial
More informationCPE300: Digital System Architecture and Design
CPE300: Digital System Architecture and Design Fall 2011 MW 17:30-18:45 CBC C316 Arithmetic Unit 10032011 http://www.egr.unlv.edu/~b1morris/cpe300/ 2 Outline Recap Chapter 3 Number Systems Fixed Point
More informationComputer Architecture and System Programming Laboratory. TA Session 3
Computer Architecture and System Programming Laboratory TA Session 3 Stack - LIFO word-size data structure STACK is temporary storage memory area register points on top of stack (by default, it is highest
More informationbutton.c The little button that wouldn t
Goals for today The little button that wouldn't :( the volatile keyword Pointer operations => ARM addressing modes Implementation of C function calls Management of runtime stack, register use button.c
More informationShort Notes of CS201
#includes: Short Notes of CS201 The #include directive instructs the preprocessor to read and include a file into a source code file. The file name is typically enclosed with < and > if the file is a system
More informationARM Instruction Set Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
ARM Instruction Set Architecture Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Condition Field (1) Most ARM instructions can be conditionally
More informationCS558 Programming Languages
CS558 Programming Languages Fall 2016 Lecture 4a Andrew Tolmach Portland State University 1994-2016 Pragmatics of Large Values Real machines are very efficient at handling word-size chunks of data (e.g.
More informationSummary: Direct Code Generation
Summary: Direct Code Generation 1 Direct Code Generation Code generation involves the generation of the target representation (object code) from the annotated parse tree (or Abstract Syntactic Tree, AST)
More informationChapter 2. Computer Abstractions and Technology. Lesson 4: MIPS (cont )
Chapter 2 Computer Abstractions and Technology Lesson 4: MIPS (cont ) Logical Operations Instructions for bitwise manipulation Operation C Java MIPS Shift left >>> srl Bitwise
More informationOverview. Introduction to the MIPS ISA. MIPS ISA Overview. Overview (2)
Introduction to the MIPS ISA Overview Remember that the machine only understands very basic instructions (machine instructions) It is the compiler s job to translate your high-level (e.g. C program) into
More informationOptimization Prof. James L. Frankel Harvard University
Optimization Prof. James L. Frankel Harvard University Version of 4:24 PM 1-May-2018 Copyright 2018, 2016, 2015 James L. Frankel. All rights reserved. Reasons to Optimize Reduce execution time Reduce memory
More informationCS201 - Introduction to Programming Glossary By
CS201 - Introduction to Programming Glossary By #include : The #include directive instructs the preprocessor to read and include a file into a source code file. The file name is typically enclosed with
More informationProgramming Language Implementation
A Practical Introduction to Programming Language Implementation 2014: Week 12 Optimisation College of Information Science and Engineering Ritsumeikan University 1 review of last week s topics why primitives
More informationEEM870 Embedded System and Experiment Lecture 4: ARM Instruction Sets
EEM870 Embedded System and Experiment Lecture 4 ARM Instruction Sets Wen-Yen Lin, Ph.D. Department of Electrical Engineering Chang Gung University Email wylin@mail.cgu.edu.tw March 2014 Introduction Embedded
More informationLecture 4: Instruction Set Architecture
Lecture 4: Instruction Set Architecture ISA types, register usage, memory addressing, endian and alignment, quantitative evaluation Reading: Textbook (5 th edition) Appendix A Appendix B (4 th edition)
More informationReview of the C Programming Language for Principles of Operating Systems
Review of the C Programming Language for Principles of Operating Systems Prof. James L. Frankel Harvard University Version of 7:26 PM 4-Sep-2018 Copyright 2018, 2016, 2015 James L. Frankel. All rights
More informationExercise Session 2 Simon Gerber
Exercise Session 2 Simon Gerber CASP 2014 Exercise 2: Binary search tree Implement and test a binary search tree in C: Implement key insert() and lookup() functions Implement as C module: bst.c, bst.h
More informationProgramming Fundamentals - A Modular Structured Approach using C++ By: Kenneth Leroy Busbee
1 0 1 0 Foundation Topics 1 0 Chapter 1 - Introduction to Programming 1 1 Systems Development Life Cycle N/A N/A N/A N/A N/A N/A 1-8 12-13 1 2 Bloodshed Dev-C++ 5 Compiler/IDE N/A N/A N/A N/A N/A N/A N/A
More informationProgramming in C - Part 2
Programming in C - Part 2 CPSC 457 Mohammad Reza Zakerinasab May 11, 2016 These slides are forked from slides created by Mike Clark Where to find these slides and related source code? http://goo.gl/k1qixb
More informationComputer Architecture. Chapter 2-2. Instructions: Language of the Computer
Computer Architecture Chapter 2-2 Instructions: Language of the Computer 1 Procedures A major program structuring mechanism Calling & returning from a procedure requires a protocol. The protocol is a sequence
More informationUtilizing Tools to Effectively Code for the Architectural Features of an ARM Platform. Chris Shore Training Manager
Utilizing Tools to Effectively Code for the Architectural Features of an ARM Platform Chris Shore Training Manager Have the right tools... Many tool sets are available This presentation assumes that you
More information