The Software Stack: From Assembly Language to Machine Code

Similar documents
The So'ware Stack: From Assembly Language to Machine Code

The Processor Memory Hierarchy

Naming in OOLs and Storage Layout Comp 412

Generating Code for Assignment Statements back to work. Comp 412 COMP 412 FALL Chapters 4, 6 & 7 in EaC2e. source code. IR IR target.

Arrays and Functions

Optimizer. Defining and preserving the meaning of the program

Runtime Support for OOLs Object Records, Code Vectors, Inheritance Comp 412

Procedure and Function Calls, Part II. Comp 412 COMP 412 FALL Chapter 6 in EaC2e. target code. source code Front End Optimizer Back End

Intermediate Representations

Intermediate Representations

Instruction Selection: Preliminaries. Comp 412

Handling Assignment Comp 412

CS415 Compilers. Procedure Abstractions. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers. Intermediate Represeation & Code Generation

Code Shape Comp 412 COMP 412 FALL Chapters 4, 5, 6 & 7 in EaC2e. source code. IR IR target. code. Front End Optimizer Back End

Computing Inside The Parser Syntax-Directed Translation. Comp 412

Runtime Support for Algol-Like Languages Comp 412

Just-In-Time Compilers & Runtime Optimizers

Local Optimization: Value Numbering The Desert Island Optimization. Comp 412 COMP 412 FALL Chapter 8 in EaC2e. target code

Computing Inside The Parser Syntax-Directed Translation. Comp 412 COMP 412 FALL Chapter 4 in EaC2e. source code. IR IR target.

Fall 2015 COMP Operating Systems. Lab #3

Computing Inside The Parser Syntax-Directed Translation, II. Comp 412 COMP 412 FALL Chapter 4 in EaC2e. source code. IR IR target.

CS415 Compilers Register Allocation. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Lec 13: Linking and Memory. Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University. Announcements

Register Allocation. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Introduction to OS Processes in Unix, Linux, and Windows MOS 2.1 Mahmoud El-Gayyar

Implementing Control Flow Constructs Comp 412

The Procedure Abstraction

Computing Inside The Parser Syntax-Directed Translation, II. Comp 412

Processes. Operating System CS 217. Supports virtual machines. Provides services: User Process. User Process. OS Kernel. Hardware

CSci 4061 Introduction to Operating Systems. Processes in C/Unix

Instruction Selection, II Tree-pattern matching

Parsing II Top-down parsing. Comp 412

Machine Language Instructions Introduction. Instructions Words of a language understood by machine. Instruction set Vocabulary of the machine

Building a Parser Part III

The ILOC Virtual Machine (Lab 1 Background Material) Comp 412

Operating System Structure

Reading Assignment 4. n Chapter 4 Threads, due 2/7. 1/31/13 CSE325 - Processes 1

by Marina Cholakyan, Hyduke Noshadi, Sepehr Sahba and Young Cha

Compiler, Assembler, and Linker

Code Shape II Expressions & Assignment

Operating Systems. II. Processes

CSC 1600 Unix Processes. Goals of This Lecture

Introduction to Optimization Local Value Numbering

The Procedure Abstraction Part I: Basics

Main Memory. CISC3595, Spring 2015 X. Zhang Fordham University

Syntax Analysis, V Bottom-up Parsing & The Magic of Handles Comp 412

Separate compilation. Topic 6: Runtime Environments p.1/21. CS 526 Topic 6: Runtime Environments The linkage convention

What is a Process? Processes and Process Management Details for running a program

Operating Systems CMPSC 473. Process Management January 29, Lecture 4 Instructor: Trent Jaeger

Lexical Analysis - An Introduction. Lecture 4 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Operating systems and concurrency - B03

OS Structure, Processes & Process Management. Don Porter Portions courtesy Emmett Witchel

OS lpr. www. nfsd gcc emacs ls 1/27/09. Process Management. CS 537 Lecture 3: Processes. Example OS in operation. Why Processes? Simplicity + Speed

Syntax Analysis, III Comp 412

Intermediate Representations Part II

Syntax Analysis, III Comp 412

Local Register Allocation (critical content for Lab 2) Comp 412

OS Lab Tutorial 1. Spawning processes Shared memory

Lecture 4: Process Management

Short Notes of CS201

From Code to Program: CALL Con'nued (Linking, and Loading)

OS lpr. www. nfsd gcc emacs ls 9/18/11. Process Management. CS 537 Lecture 4: Processes. The Process. Why Processes? Simplicity + Speed

Rui Wang, Assistant professor Dept. of Information and Communication Tongji University.

CS201 - Introduction to Programming Glossary By

Princeton University. Computer Science 217: Introduction to Programming Systems. Process Management

CS61 Scribe Notes Date: Topic: Fork, Advanced Virtual Memory. Scribes: Mitchel Cole Emily Lawton Jefferson Lee Wentao Xu

Lexical Analysis. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Linking and Loading. ICS312 - Spring 2010 Machine-Level and Systems Programming. Henri Casanova

Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit

CS 406/534 Compiler Construction Intermediate Representation and Procedure Abstraction

Princeton University Computer Science 217: Introduction to Programming Systems. Process Management

Mon Sep 17, 2007 Lecture 3: Process Management

CSCE 313 Introduction to Computer Systems. Instructor: Dezhen Song

Process Management! Goals of this Lecture!

CSE P 501 Compilers. Java Implementation JVMs, JITs &c Hal Perkins Winter /11/ Hal Perkins & UW CSE V-1

CSCE 313: Intro to Computer Systems

Parents and Children

Runtime management. CS Compiler Design. The procedure abstraction. The procedure abstraction. Runtime management. V.

CISC Processor Design

www nfsd emacs lpr Process Management CS 537 Lecture 4: Processes Example OS in operation Why Processes? Simplicity + Speed

CS 111. Operating Systems Peter Reiher

Preview. Process Control. What is process? Process identifier The fork() System Call File Sharing Race Condition. COSC350 System Software, Fall

Instruction Scheduling Beyond Basic Blocks Extended Basic Blocks, Superblock Cloning, & Traces, with a quick introduction to Dominators.

Sustainable Memory Use Allocation & (Implicit) Deallocation (mostly in Java)

Fall 2017 :: CSE 306. Process Abstraction. Nima Honarmand

ECE 598 Advanced Operating Systems Lecture 10

SOFTWARE ARCHITECTURE 3. SHELL

Combining Optimizations: Sparse Conditional Constant Propagation

Processes. Dr. Yingwu Zhu

Princeton University Computer Science 217: Introduction to Programming Systems. Process Management

A process. the stack

Branch Addressing. Jump Addressing. Target Addressing Example. The University of Adelaide, School of Computer Science 28 September 2015

CS 550 Operating Systems Spring Process II

(In columns, of course.)

Lab 3, Tutorial 1 Comp 412

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2018 Lecture 20

PROCESS CONTROL BLOCK TWO-STATE MODEL (CONT D)

Agenda. CSE P 501 Compilers. Java Implementation Overview. JVM Architecture. JVM Runtime Data Areas (1) JVM Data Types. CSE P 501 Su04 T-1

CS 550 Operating Systems Spring Process III

Transcription:

COMP 506 Rice University Spring 2018 The Software Stack: From Assembly Language to Machine Code source code IR Front End Optimizer Back End IR target code Somewhere Out Here Copyright 2018, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 506 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved

The Hardware Execution Model Computers execute individual operations, encoded into binary form Basic cycle of execution is: (fetch, decode, execute)* Fetch brings an operation from memory into the processor s decode unit Decode breaks the operation into its fields, based on the opcode, & sets up the operation s execution writes values to appropriate control registers Execute clocks the processor s functional unit to carry out the computation Each operation has a fixed format A processor will support several formats Opcode determines format People are not particularly good at reading & writing binary operations Productivity & error rate are better with higher level of abstraction People need a tool to translate some symbolic representation to binary We invented assemblers to perform that translation opcode reg 1 reg 2 reg 2 or constant 0 10 15 20 31 op constant 0 1 31 COMP Bits in 506, the Spring instruction 2018 format are a serious architectural limitation 2

The Hardware Execution Model Computers execute individual operations, encoded into binary form Basic cycle of execution is: (fetch, decode, execute)* Fetch brings an operation from memory into the processor s decode unit Decode breaks the operation into its fields, based on the opcode, & sets up the operation s execution writes values to appropriate control registers Execute clocks the processor s functional unit to carry out the computation Each operation has a fixed format A processor will support several formats Opcode determines format Instruction formats constrain hardware design Number of registers is limited by the number of bits in an operation Why not four address operations? Number of bits Can have a couple, but it wipes out a lot of opcodes opcode reg 1 reg 2 reg 2 or constant 0 10 15 20 31 op constant 0 1 31 COMP Bits in 506, the Spring instruction 2018 format are a serious architectural limitation 3

Symbolic Assembly Code The ILOC that we have seen in examples is a symbolic assembly code for a simplified fictional RISC processor ILOC Code: Symbolic names for operations load, store, add, mult, jump, Labels for addresses @a, @b, @c,... Register names specified with integers r 1, r 2, r 3,... ILOC has symbolic labels and virtual registers load @b Þ r 1 load @c Þ r 2 mult r 1,r 2 Þ r 3 load @d Þ r 4 add r 3,r 4 Þ r 5 store r 5 Þ @a load @f Þ r 6 add r 5,r 6 Þ r 7 store r 7 Þ @e Machine code needs addresses and physical registers Compiler maps virtual registers to physical registers Job of the compiler s regsiter allocator Software stack (assembler, linker, loader) converts labels to virtual addresses and mnemonics to opcodes COMP 506, Spring 2018 5

Symbolic Assembly Code assembly code Assembler object file In assembly code, the fields in an operation are alphanumeric symbols Opcodes are represented with mnemonics character strings add instead of 0110 Registers and constants are written using base 10 or base 16 numbers r15 instead of 01111 To prepare an assembly program for execution, it must be translated Mnemonics map directly to opcodes bit patterns Symbolic labels map directly to addresses bit patterns Base 10 numbers convert to base 2 numbers bit patterns For the most part, these translations are simple and direct COMP 506, Spring 2018 6

Building an Assembler assembly code Assembler object file What does an object file contain? A list of operations First operation (almost always) has an exported label Complete operations are in final binary form Operations with a symbolic reference have a hole & an index into a symbol table A symbol table Each symbol defined in the module and exported to the rest of the program has a symbol name, a type, and an offset from the start of the module Each symbol imported from outside the module has a symbol name, a storage class, and a list of operations in the module that reference this symbol Storage class might be static or dynamic, local or global COMP 506, Spring 2018 7

Building an Assembler assembly code Assembler object file Assemblers convert assembly code to object code Typical assemblers operate in two passes Pass 1 builds up a symbol table that maps names to addresses Use a hash table to record the map Linear pass over the input to find symbols & record <name,address> pairs Pass 2 rewrites the operations into binary form All symbols defined in the module should be defined after pass 1 Linear pass over the input to rewrite operations into binary form References to imported symbols are marked with their name and table entry At the end, the assembler writes out an object file COMP 506, Spring 2018 8

Building an Assembler The algorithm for Pass 1 is simple Initialize a counter to zero Initialize an empty map (a hash table 1 ) At each operation, from the first to the last If the operation is labeled, enter the label in the map with the current counter Enter any undefined labels used as operands in the map, with an invalid counter Compute the length of the current operation Increment the counter by the operation s length Iterate over the map If any symbol has an invalid counter, mark it as an external symbol or signal error Choice of action depends on the rules of the particular assembly language Pass 1 should operate in O( operations ) time Number of symbols is O( operations ) (if hash lookup is O(1)) 1 COMP See discussion 506, Spring of 2018 hash tables in Chapter 5 and Appendix B in EaC2e 9

Building an Assembler Pass 2 iterates over the input again and rewrites it Pass 1 resolved all the symbols that are not imported Pass 2 constructs the object code For each operation Mnemonic becomes a bit pattern for the opcode Opcode determines instruction format Operands become bit fields The exception is a reference to an externally defined symbol Replace externally defined symbol with a reference into table of such symbols Format for that reference and table is a system-wide convention Write out the finished binary operation At the end, write out the symbol table information Scalability: External reference might be as simple as an invalid value, with an outside table that lists lines and position. Again, it is all O( operations ) (if hash lookup is O(1)) COMP 506, Spring 2018 10

What happens to an internal label? Many labels in an assembly module are internal Most of the labels generated by AHSDT, for example Those labels can be resolved, in pass 1, to either An offset from the start of the assembly module, or An offset from the PC in the executing operation (loop header) Addresses relative to start of the assembly module can be adjusted later Addresses relative to the PC can be used directly in some opcode formats Limited range (due to the number of bits in the offset) PC-relative branches and jumps are common with reasonably large offsets Keeping those offsets small allows the compiler to use cheaper operations Jump to an external symbol probably needs to load address from constant pool COMP 506, Spring 2018 11

Efficient Building an Assembler Atypical assembler? assembly code Assembler object file An assembler can do most of its translation in the first pass Can translate most operations directly in pass 1 E.g., add r1, r2 => r3 contains all the information that it needs Exception arises from a reference to a symbol that is not yet defined When the assembler finds an (as yet) undefined label, it can: Enter that symbol into the symbol table and mark it as undefined Add the current operation to a list of operations containing symbolic references It can process any operation with an (already) defined label At the end of pass 1, the assembler can: Traverse the list of unfinished instructions Translate them in place After that half pass, it can output the object code Classic Pass& a Half Assembler COMP 506, Spring 2018 12

Assembler Pseudo-Operations Almost all assemblers provide pseudo-operations to manipulate layout Pseudo-operations define storage, define labels, initialize space, and mark symbols as external To create space for a global array A, define storage for it Label that pseudo-operation with a known label, say _@A The pseudo-operations serves two functions It advances the assembler s internal counter for addresses (by its argument) It provides a place to hang the label, at the start of that block of space If we define _@A as an external symbol, other code will be able to access it The linker will tie the symbols and addresses together The linker will match them by name, so only one object file can define _@A The process of creating unique labels from names is called mangling Procedure fee() might create something like _.fee_ Static fee() in file foe() might create something like _.foe.fee @A: dss 1024 _@B: bss 128 COMP 506, Spring 2018 13

So, What Happens With All Those Object Modules? The compiler produces a.asm file for each compilation unit. The assembler produces a.o file for each.asm file..o.o.o.o.o a.out Linker What happens next? A collection of.o files can be combined to form an executable file (a.out) The program that performs this task is called a linker The linker takes a collection of object files and libraries of object files It selects out the pieces that it needs to build a complete executable It lays out the executable, computes addresses for all external symbols, and rewrites the executable, replacing the external symbols with virtual addresses COMP 506, Spring 2018 14

Building a Linker.o.o.o.o.o a.out Linker A linker composes one or more object modules into an executable program Finds all of the entry points and labels in the object code Both exported and imported names Finds the main entry point Determines the spatial layout of the executable Resolves all imported names in an appropriate way Names that are statically linked Imported names must match an exported name Imported names must be rewritten with the exported name s address Names that are dynamically linked Imported names must be rewritted with a stub to load & link the name at runtime COMP You will 506, sometimes Spring 2018hear the term linkage editor rather than linker. 15

Building a Linker.o.o.o.o.o a.out Linker How does a linker work? Pass 1: Build a map of all symbols (exported & imported) by the object modules and libraries, and find the the main entry point Pass 2: Add object modules until all static symbols are resolved Starting with the module containing main: Add the module to the end of the code Assign addresses to its internal labels Pass 3: Rewrite the code with resolved symbols Static symbols are rewritten with addresses Dynamic symbols are rewritten with a jump to a stub that finds, loads, and links the needed object code Operates over the symbol tables Operates over the object code Operates over the object code Granularity of inclusion is an object file COMP Library 506, can Spring be one 2018 big file or a collection of smaller modules 16

Building a Linker.o.o.o.o.o a.out Linker How does a linker work? Pass 1: Build a map of all symbols (exported & imported) by the object modules and libraries, and find the the main entry point Pass 2: Add object modules until all static symbols are resolved Starting with the module containing main: Add the module to the end of the code Assign addresses to its internal labels Pass 3: Rewrite the code with resolved symbols Static symbols are rewritten with addresses Dynamic symbols are rewritten with a jump to a stub that finds, loads, and links the needed object code Does order matter? Actually, it does See Pettis & Hansen 90 or Section 8.7.2 in EaC2e Granularity of inclusion is an object file COMP Library 506, can Spring be one 2018 big file or a collection of smaller modules 17

What Happens at Runtime? How does an a.out execute? Some running process 1 (the parent ) starts the execution In a shell, the user types the name of an executable at an input prompt The parent process creates (or spawns ) a new process The new process executes a.out and returns In a shell, the parent process waits for the child to complete In a parallel computation, the parent may spawn many children and wait for them to complete, or it may simply continue without waiting Obviously, this high-level view obscures some of the detail... 1 The first running process is the boot process, created by a hardware COMP signal 506, and Spring the code 2018in the boot sequence. 18

Creating a New Process (Unix/Linux model) To create a new process, the parent calls fork() fork() clones the parent process New child has its own virtual address space New child has its own copies of all data Includes file descriptors, program counter, New child has a distinct process id (PID) Fork() returns child s PID to the parent and 0 to the child The parent waits on the child The child executes the new command pid_t vfork(void); pid_t process; process = vfork(); if (process < 0) fprintf(stderr, vfork() error.\n ); else if (process == 0) { execute new command } else { wait() } If the child only executes the new command It does not need the parent s full address space vfork() copies just enough of the address space to allow an exec() call It clones file descriptors, process status information, and the code vfork() is safe if the process immediately calls exec() COMP 506, Spring 2018 19

Replacing the Address (Unix/Linux model) To replace the child process s address space, it calls exec() 1 execl() takes a path to the executable file and a list of arguments exec() constructs the address space of the executable file If the file begins with #! interpreter, it runs the interpreter on the rest of the file If the file is an executable, it reads the file s header information and: Loads the program text, starting at the specified address Loads any initialized data areas, starting at their specified addresses Zeros any pages that are so specified Branches to the start address of main(), as specified in the file s header The original address space overwritten, except for parts of its environment File descriptors remain open (unless modified by fcntl()) The contents of the global environ remain intact Arguments passed through exec() are passed to main as argc, argv, and envp int main( int argc, char **argv, char **envp) { } 1 COMP execve() 506, is Spring the general 2018 form; execl() is a front end to execve() 20

A Couple More Points The linker describes the address space The executable file is fully resolved, except for dynamically linked labels exec(), in effect, inflates the address space based on header info in the a.out It creates the code region of the address space, along with static data areas and global data areas main() must create the stack and the heap main() starts the language s runtime system Allocates and initializes space for the stack and the heap Takes care of other critical details The compiler inserts a call to the runtime initialization code at the start of main() Special case for the main entry point In c, the compiler recognizes main() by name. In other languages, the compiler may recognize it by declaration On exit, main() must shut down the environment Compiler inserts a call to the runtime finalization code at the end of main() COMP 506, Spring 2018 21

What Happened to _@A? We went off on this tale to learn how a compile-time label becomes a physical address. What did happen to_ @A? Two cases: _@A was a label in the code The assembler converted _@A to an offset from the start of the object module _@A was a label on a pseudo-operation The assembler converted _@A to an address of a data area (initialized or not) Then, The linker computed _@A s virtual address as it laid out memory The linker replaced occurrences of _@A with its actual virtual address exec() built the address space and initialized the memory at _@A The relevant instructions executed with the _@A s virtual address Load, store, branch, jump, call, etc. see a virtual address rather than _@A COMP 506, Spring 2018 22

What Happens to the Virtual Address? The hardware uses physical address (for the most part) At runtime, the operating system and the hardware must translate the virtual address of _@A to a physical address The process requires cooperation between the processor and the OS The process requires both hardware and software support COMP 506, Spring 2018 23

Virtual Addresses to Physical Addresses Processes Memory Hierarchy Core Functional unit Functional unit Data & Code Registers Operating System s Paging Mechanism Functional unit Functional unit Each process has its own virtual address space (slide 18) Provides a safe, secure, clean environment for new process Allows compiler, assembler, and linker to work without context The OS maps multiple virtual address spaces into one physical address space Granularity is a page of memory OS manages which pages are in physical memory (& which are not) Modern hardware assists the various address translations COMP 506, Spring 2018 24

Virtual Addresses to Physical Addresses Processes Memory Hierarchy Core Functional unit Functional unit Data & Code Registers Operating System s Paging Mechanism Functional unit Functional unit Back to _@A Assembler turned it into either A resolved known constant An external symbol Linker turned it into a resolved address relative to either the PC or virtual address zero fork() and exec() created a new virtual address space and moved the entire executable into it Operating system mapped the page with _@A into some page frame in physical memory Hardware & software (OS) translated the virtual address to a physical address COMP 506, Spring 2018 25

Virtual Addresses to Physical Addresses Processes Back to _@A All of this mechanism allows the compiler to emit a label, such as _@A, and to know that, at runtime, it will just work Memory Hierarchy Data & Code Operating System s Paging Mechanism Next Lecture That box labelled Memory Hierarchy Core Functional unit Functional unit Registers Functional unit Functional unit COMP 506, Spring 2018 26