Assemblers and Linkers

Similar documents
Assemblers and Compilers

ECE 15B COMPUTER ORGANIZATION

Review (1/2) IEEE 754 Floating Point Standard: Kahan pack as much in as could get away with. CS61C - Machine Structures

CS 61C: Great Ideas in Computer Architecture CALL continued ( Linking and Loading)

From Code to Program: CALL Con'nued (Linking, and Loading)

CS 110 Computer Architecture Running a Program - CALL (Compiling, Assembling, Linking, and Loading)

Assembler. #13 Running a Program II

Architecture II. Computer Systems Laboratory Sungkyunkwan University

Review C program: foo.c Compiler Assembly program: foo.s Assembler Object(mach lang module): foo.o

Lecture 10: Program Development versus Execution Environment

CS61C : Machine Structures

Review. Disassembly is simple and starts by decoding opcode field. Lecture #13 Compiling, Assembly, Linking, Loader I

ecture 12 From Code to Program: CALL (Compiling, Assembling, Linking, and Loading) Friedland and Weaver Computer Science 61C Spring 2017

MIPS Instruction Set Architecture (2)

Review C program: foo.c Compiler Assembly program: foo.s Assembler Object(mach lang module): foo.o

Mips Code Examples Peter Rounce

CS 61C: Great Ideas in Computer Architecture Running a Program - CALL (Compiling, Assembling, Linking, and Loading)

CS61C Machine Structures. Lecture 18 - Running a Program I. 3/1/2006 John Wawrzynek. www-inst.eecs.berkeley.edu/~cs61c/

CS 61c: Great Ideas in Computer Architecture

CS 61C: Great Ideas in Computer Architecture Running a Program - CALL (Compiling, Assembling, Linking, and Loading)

Topic Notes: MIPS Instruction Set Architecture

Machine Language Instructions Introduction. Instructions Words of a language understood by machine. Instruction set Vocabulary of the machine

CS61C : Machine Structures

ECE 15B COMPUTER ORGANIZATION

UC Berkeley CS61C : Machine Structures

CS 61c: Great Ideas in Computer Architecture

We will study the MIPS assembly language as an exemplar of the concept.

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: MIPS Instruction Set Architecture

Two processors sharing an area of memory. P1 writes, then P2 reads Data race if P1 and P2 don t synchronize. Result depends of order of accesses

Machine Instructions - II. Hwansoo Han

Review. Lecture 18 Running a Program I aka Compiling, Assembling, Linking, Loading

MIPS (SPIM) Assembler Syntax

ECE 331 Hardware Organization and Design. Professor Jay Taneja UMass ECE - Discussion 3 2/8/2018

UCB CS61C : Machine Structures

Where Are We Now? Linker (1/3) Linker (2/3) Four Types of Addresses we ll discuss. Linker (3/3)

CS 61C: Great Ideas in Computer Architecture. MIPS Instruction Formats

CS3350B Computer Architecture MIPS Procedures and Compilation

Computer Science and Engineering 331. Midterm Examination #1. Fall Name: Solutions S.S.#:

COMPUTER ORGANIZATION AND DESIGN

1 5. Addressing Modes COMP2611 Fall 2015 Instruction: Language of the Computer

Lecture 6: Assembly Programs

CSE 2021: Computer Organization

COMP MIPS instructions 2 Feb. 8, f = g + h i;

Lecture 4: MIPS Instruction Set

Lecture 7: Examples, MARS, Arithmetic

Chapter 2. Instructions:

EEM 486: Computer Architecture. Lecture 2. MIPS Instruction Set Architecture

ECE260: Fundamentals of Computer Engineering

comp 180 Lecture 10 Outline of Lecture Procedure calls Saving and restoring registers Summary of MIPS instructions

UCB CS61C : Machine Structures

Compilers and Interpreters

MIPS Assembly Language Programming

CS 61c: Great Ideas in Computer Architecture

Orange Coast College. Business Division. Computer Science Department CS 116- Computer Architecture. The Instructions

CSE 2021: Computer Organization

ECE232: Hardware Organization and Design. Computer Organization - Previously covered

Branch Addressing. Jump Addressing. Target Addressing Example. The University of Adelaide, School of Computer Science 28 September 2015

LECTURE 2: INSTRUCTIONS

Rui Wang, Assistant professor Dept. of Information and Communication Tongji University.

Chapter 2. Instruction Set Architecture (ISA)

Computer Architecture

Computer Architecture I Midterm I

Stored Program Concept. Instructions: Characteristics of Instruction Set. Architecture Specification. Example of multiple operands

Review of Last Lecture. CS 61C: Great Ideas in Computer Architecture. MIPS Instruction Representation II. Agenda. Dealing With Large Immediates

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Slide Set 4. for ENCM 369 Winter 2018 Section 01. Steve Norman, PhD, PEng

CS61C : Machine Structures

CALL (Compiler/Assembler/Linker/ Loader)

EEC 581 Computer Architecture Lecture 1 Review MIPS

Computer Architecture. Chapter 2-2. Instructions: Language of the Computer

Lectures 3-4: MIPS instructions

Computer Architecture

Lecture 2. Instructions: Language of the Computer (Chapter 2 of the textbook)

Overview. Introduction to the MIPS ISA. MIPS ISA Overview. Overview (2)

Administrivia. Midterm Exam - You get to bring. What you don t need to bring. Conflicts? DSP accomodations? Head TA

Concocting an Instruction Set

CS61C Machine Structures. Lecture 12 - MIPS Procedures II & Logical Ops. 2/13/2006 John Wawrzynek. www-inst.eecs.berkeley.

ECE 473 Computer Architecture and Organization Project: Design of a Five Stage Pipelined MIPS-like Processor Project Team TWO Objectives

MIPS Coding Continued

Administrivia. Translation. Interpretation. Compiler. Steps to Starting a Program (translation)

MACHINE LANGUAGE. To work with the machine, we need a translator.

MIPS Assembly Language Programming

UCB CS61C : Machine Structures

MIPS Assembly: More about Memory and Instructions CS 64: Computer Organization and Design Logic Lecture #7

UCB CS61C : Machine Structures

Chapter 2. Instructions: Language of the Computer. Adapted by Paulo Lopes

MIPS R-format Instructions. Representing Instructions. Hexadecimal. R-format Example. MIPS I-format Example. MIPS I-format Instructions

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: MIPS Instruction Set Architecture

ENCM 369 Winter 2017 Lab 3 for the Week of January 30

Behind the Curtain. Introduce TA Friday 1 st lab! (1:30-3:30) (Do Prelab before) 1 st Problem set on Thurs. Comp 411 Fall /16/13

ELEC / Computer Architecture and Design Fall 2013 Instruction Set Architecture (Chapter 2)

Lecture 5: Procedure Calls

Thomas Polzer Institut für Technische Informatik

Instructions: MIPS arithmetic. MIPS arithmetic. Chapter 3 : MIPS Downloaded from:

5/17/2012. Recap from Last Time. CSE 2021: Computer Organization. The RISC Philosophy. Levels of Programming. Stored Program Computers

Chapter 3. Instructions:

Recap from Last Time. CSE 2021: Computer Organization. Levels of Programming. The RISC Philosophy 5/19/2011

Computer Science 2500 Computer Organization Rensselaer Polytechnic Institute Spring Topic Notes: MIPS Programming

ENCM 369 Winter 2013: Reference Material for Midterm #2 page 1 of 5

Instruction Set Architecture part 1 (Introduction) Mehran Rezaei

Transcription:

Assemblers and Linkers Long, long, time ago, I can still remember How mnemonics used to make me smile... Cause I knew that with those opcode names that I could play some assembly games and I d be hacking kernels in just awhile. But Comp 411 made me shiver, With every new lecture that was delivered, There was bad news at the door step, I just didn t get the problem sets. I can t remember if I cried, When inspecting my stack frame s insides, All I know is that it crushed my pride, On the day the joy of software died. And I was singing When I find my code in tons of trouble, Friends and colleagues come to me, Speaking words of wisdom: "Write in C." L07 Assemblers and Compilers 1

Path from Programs to Bits Traditional Compilation High-level, portable (architecture independent) program description C or C++ program Compiler Linker Library Routines A collection of precompiled object code modules Architecture dependent mnemonic program description with symbolic memory references Assembly Code Executable Machine language with all memory references resolved Assembler Loader Machine language with symbolic memory references Object Code Memory Program and data bits loaded into memory L07 Assemblers and Compilers 2

How an Assembler Works Three major components of assembly 1) Allocating and initialing data storage 2) Conversion of mnemonics to binary instructions 3) Resolving addresses.data array:.space 40 total:.word 0.text.globl main main: la $t1,array move $t2,$0 move $t3,$0 beq $0,$0,test loop: sll $t0,$t3,2 add $t0,$t1,$t0 sw $t3,($t0) add $t2,$t2,$t3 addi $t3,$t3,1 test: slti $t0,$t3,10 bne $t0,$0,loop sw $t2,total jr $ra lui ori $9, arrayhi $9,$9,arraylo 0x3c09???? 0x3529???? L07 Assemblers and Compilers 3

Resolving Addresses- 1 st Pass Old-style 2-pass assembler approach Pass 1 Segment offset 0 4 8 12 Code 0x3c090000 0x35290000 0x00005021 0x00005821 Instruction la $t1,array move $t2,$ move $t3,$0 16 0x10000000 beq $0,$0,test 20 0x000b4080 loop: sll $t0,$t3,2 24 28 32 36 0x01284020 0xad0b0000 0x014b5020 0x216b0001 add $t0,$t1,$t0 sw $t0,($t0) add $t0,$t1,$t0 addi $t3,$t3,1 40 0x2968000a test: slti $t0,$t3,10 44 0x15000000 bne $t0,$0,loop 48 52 0x3c010000 0xac2a0000 56 0x03e00008 j $ra sw $t2,total - In the first pass, data and instructions are encoded and assigned offsets within their segment, while the symbol table is constructed. - Unresolved address references are set to 0 Symbol table after Pass 1 Symbol Segment Location pointer offset array data 0 total data 40 main text 0 loop text 20 test text 40 L07 Assemblers and Compilers 4

Resolving Addresses in 2 Passes Old-style 2-pass assembler approach Pass 2 Segment offset 0 4 8 12 Code 0x3c091001 0x35290000 0x00005021 0x00005821 Instruction la $t1,array move $t2,$ move $t3,$0 16 0x10000006 beq $0,$0,test 20 0x000b4080 loop: sll $t0,$t3,2 24 28 32 36 0x01284020 0xad0b0000 0x014b5020 0x216b0001 add $t0,$t1,$t0 sw $t0,($t0) add $t0,$t1,$t0 addi $t3,$t3,1 40 0x2968000a test: slti $t0,$t3,10 44 0x1500fffa bne $t0,$0,loop 48 52 0x3c011001 0xac2a0028 56 0x03e00008 j $ra sw $t2,total In the second pass, the appropriate fields of those instructions that reference memory are filled in with the correct values if possible. Symbol table after Pass 1 Symbol Segment Location pointer offset array data 0 total data 40 main text 0 loop text 20 test text 40 L07 Assemblers and Compilers 5

Modern Way 1-Pass Assemblers Modern assemblers keep more information in their symbol table which allows them to resolve addresses in a single pass. Known addresses (backward references) are immediately resolved. Unknown addresses (forward references) are back-filled once they are resolved. State of the symbol table after the instruction sw $t0, ($t0) is assembled SYMBOL SEGMENT Location pointer offset Resolved? Reference list array data 0 y null total data 40 y null main text 0 y null loop text 20 y null test text? n 16 L07 Assemblers and Compilers 6

The Role of a Linker Some aspects of address resolution cannot be handled by the assembler alone. 1) References to data or routines in other object modules 2) The layout of all segments (.text,.data) in memory 3) Support for REUSABLE code modules 4) Support for RELOCATABLE code modules This final step of resolution is the job of a LINKER Source file Assembler Object file Source file Assembler Object file Linker Executable File Source file Assembler Object file Libraries L07 Assemblers and Compilers 7

Static and Dynamic Libraries LIBRARIES are commonly used routines stored as a concatenation of Object files. A global symbol table is maintained for the entire library with entry points for each routine. When a routine in a LIBRARY is referenced by an assembly module, the routine s address is resolved by the LINKER, and the appropriate code is added to the executable. This sort of linking is called STATIC linking. Many programs use common libraries. It is wasteful of both memory and disk space to include the same code in multiple executables. The modern alternative to STATIC linking is to allow the LOADER and THE PROGRAM ITSELF to resolve the addresses of libraries routines. This form of lining is called DYNAMIC linking (e.x..dll). L07 Assemblers and Compilers 8

Dynamically Linked Libraries C call to library function: printf( sqr[%d] = %d\n, x, y); Assembly code Maps to: addi la lw lw call addi lui ori lui lw lw lui lw jalr $a0,$0,1 $a1,ctrlstring $a2,x $a3,y fprintf Yet another pseudoinstruction $a0,$0,1 $a1,ctrlstringhi $a1,ctrlstringlo $at,globaldata $a2,x($at) $a3,y($at) $at,fprintfhi $at,fprintflo($at) $at,$31 How does dynamic linking work? Why are we loading the function s address into a register first, and then calling it? L07 Assemblers and Compilers 9

Dynamically Linked Libraries Lazy address resolution: sysload: addui $sp,$sp,16. Because, the entry points to dynamic library routines are stored in a TABLE. And the contents of this table are loaded on an as needed basis!. # check if stdio module # is loaded, if not load it.. # backpatch jump table la $t1,stdio la $t0,$dfopen sw $t0,($t1) la $t0,$dfclose sw $t0,4($t1) la $t0,$dfputc sw $t0,8($t1) la $t0,$dfgetc sw $t0,12($t1) la $t0,$dfprintf sw $t0,16($t1) Before any call is made to a procedure in stdio.dll.globl stdio: stdio: fopen:.word sysload fclose:.word sysload fgetc:.word sysload fputc:.word sysload fprintf:.word sysload After first call is made to any procedure in stdio.dll.globl stdio: stdio: fopen: dfopen fclose: dclose fgetc: dfgetc fputc: dfputc fprintf: dprintf L07 Assemblers and Compilers 10

Modern Languages Intermediate object code language High-level, portable (architecture independent) program description Java program Compiler PORTABLE mnemonic program description with symbolic memory references JVM bytecodes Library Routines An application that EMULATES a virtual machine. Can be written for any Instruction Set Architecture. In the end, machine language instructions must be executed for each JVM bytecode Interpreter L07 Assemblers and Compilers 11

Modern Languages Intermediate object code language High-level, portable (architecture independent) program description Java program Compiler PORTABLE mnemonic program description with symbolic memory references JVM bytecodes Library Routines While interpreting on the first pass it keeps a copy of the machine language instructions used. Future references access machine language code, avoiding further interpretation JIT Compiler Memory Today s JITs are nearly as fast as a native compiled code (ex..net). L07 Assemblers and Compilers 12

Assembly? Really? In the early days compilers were dumb literal line-by-line generation of assembly code of C source This was efficient in terms of S/W development time C is portable, ISA independent, write once run anywhere C is easier to read and understand Details of stack allocation and memory management are hidden However, a savvy programmer could nearly always generate code that would execute faster Enter the modern era of Compilers Focused on optimized code-generation Captured the common tricks that low-level programmers used Meticulous bookkeeping (i.e. will I ever use this variable again?) It is hard for even the best hacker to improve on code generated by good optimizing compilers L07 Assemblers and Compilers 13

Example Compiler Optimizations Example C Code: int array[10]; int total; int main( ) { int i; } total = 0; for (i = 0; i < 10; i++) { array[i] = i; total = total + i; } L07 Assemblers and Compilers 14

Unoptimized Assembly Output With debug flags set:.globl main.text main: addiu $sp,$sp,-8 sw $0,total # total = 0 sw $0,0($sp) # i = 0 lw $8,0($sp) # copy i to $t0 # allocates space for ra and i b L.3 # goto test L.2: # for(...) { sll $24,$8,2 # make i a word offset sw $8,array($24) # array[i] = i lw $24,total # total = total + i addu $24,$24,$8 sw $24,total addi $8,$8,1 # i = i + 1 L.3: Why does turning on debugging generate the worse code? Ans: Because the complier reverts back to line-by-line translation. sw $8,0($sp) # update i in memory slti $1,$8,10 # (i < 10)? bne $1,$0,L.2 #} if TRUE loop addiu $sp,$sp,8 jr $31 103, that s not so bad L07 Assemblers and Compilers 15

Register Allocation Assign local variable i to a register Two instructions outside the loop are replaced with one.globl main.text main: addiu $sp,$sp,-4 #allocates space for ra sw $0,total #total = 0 move $8,$0 #i = 0 b L.3 #goto test L.2: #for(...) { sll $24,$8,2 # make i a word offset sw $8,array($24) # array[i] = i lw $24,total # total = total + i addu $24,$24,$8 sw $24,total addi $8,$8,1 # i = i + 1 L.3: slti $1,$8,10 # (i < 10)? bne $1,$0,L.2 #} if TRUE loop addiu $sp,$sp,4 jr $31 91, I can play in public. L07 Assemblers and Compilers 16

Loop-Invariant Code Motion Temporarily allocate temp registers to hold global values to avoid loads inside the loop, yet mirroring changes We ve added an instruction here outside of the loop and eliminated an lw inside of loop.globl main.text main: addiu $sp,$sp,-4 #allocates space for ra sw $0,total #total = 0 move $9,$0 #temp for total move $8,$0 #i = 0 b L.3 #goto test L.2: #for(...) { sll $24,$8,2 # make i a word offset sw $8,array($24) # array[i] = i addu $9,$9,$8 sw $9,total addi $8,$8,1 # i = i + 1 L.3: slti $1,$8,10 # (i < 10)? bne $1,$0,L.2 #} if TRUE loop addiu $sp,$sp,4 jr $31 82! Side-bets anyone? L07 Assemblers and Compilers 17

Remove Unnecessary Tests Since i is initially set to 0, we already know it is less than 10, so why bother testing it the first time? Eliminated a branch here and the label it referenced.globl main.text main: addiu $sp,$sp,-4 #allocates space for ra sw $0,total #total = 0 move $9,$0 #temp for total move $8,$0 #i = 0 L.2: #for(...) { sll $24,$8,2 # make i a word offset sw $8,array($24) # array[i] = i addu $9,$9,$8 sw $9,total addi $8,$8,1 # i = i + 1 slti $1,$8,10 # loads const 10 bne $1,$0,L.2 #} loops while i < 10 addiu $sp,$sp,4 jr $31 79, almost scratch! L07 Assemblers and Compilers 18

Remove Unnecessary Stores All we care about it the value of total after the loop finishes, so there is no need to update it on each pass Moved this instruction outside the loop.globl main.text main: addiu $sp,$sp,-4 #allocates space for ra and i sw $0,total #total = 0 move $9,$0 #temp for total move $8,$0 #i = 0 L.2: sll $24,$8,2 #for(...) { sw $8,array($24) # array[i] = i addu $9,$9,$8 addi $8,$8,1 # i = i + 1 slti $1,$8,10 # loads const 10 bne $1,$0,L.2 #} loops while i < 10 sw $9,total addiu $sp,$sp,4 jr $31 70, ready for the PGA! L07 Assemblers and Compilers 19

Unrolling Loops By examining the function we can see it is always executed 10 times. Thus, we can make 2, 5, or 10 copies of the inner loop reduce the branching overhead..globl main.text Added a second copy of these four lines. main: addiu $sp,$sp,-4 #allocates space for ra and i sw $0,total #total = 0 move $9,$0 #temp for total move $8,$0 #i = 0 L.2: sll $24,$8,2 #for(...) { sw $8,array($24) # array[i] = i addu $9,$9,$8 addi $8,$8,1 # i = i + 1 sll $24,$8,2 # sw $8,array($24) # array[i] = i addu $9,$9,$8 addi $8,$8,1 # i = i + 1 slti $24,$8,10 # loads const 10 bne $24,$0,L.2 #} loops while i < 10 sw $9,total addiu $sp,$sp,4 jr $31 60, watch out Tiger! L07 Assemblers and Compilers 20

Next Time We go deeper into the rabbit hole Lab on Friday Play with a compiler to see the assembly code generated Problem Set 1 is due next Tuesday! Prof. McMillan will hold extend office hours Monday night L07 Assemblers and Compilers 21