Linking February 24, 2005

Similar documents
Example C Program The course that gives CMU its Zip! Linking March 2, Static Linking. Why Linkers? Page # Topics

Example C program. 11: Linking. Why linkers? Modularity! Static linking. Why linkers? Efficiency! What do linkers do? 10/28/2013

Lecture 16: Linking Computer Architecture and Systems Programming ( )

A Simplistic Program Translation Scheme

Linking Oct. 15, 2002

Systems Programming and Computer Architecture ( ) Timothy Roscoe

Systems I. Linking II

Example C Program. Linking CS Instructor: Sanjeev Se(a. int buf[2] = {1, 2}; extern int buf[]; int main() { swap(); return 0; }

LINKING. Jo, Heeseung

Computer Organization: A Programmer's Perspective

Computer Systems. Linking. Han, Hwansoo

CSE 2421: Systems I Low-Level Programming and Computer Organization. Linking. Presentation N. Introduction to Linkers

Linking Oct. 26, 2009"

Linking. Computer Systems Organization (Spring 2017) CSCI-UA 201, Section 3. Instructor: Joanna Klukowska

Exercise Session 7 Computer Architecture and Systems Programming

CS429: Computer Organization and Architecture

Relocating Symbols and Resolving External References. CS429: Computer Organization and Architecture. m.o Relocation Info

E = 2 e lines per set. S = 2 s sets tag. valid bit B = 2 b bytes per cache block (the data) CSE351 Inaugural EdiNon Spring

CS 201 Linking Gerson Robboy Portland State University

Linker Puzzles The course that gives CMU its Zip! Linking Mar 4, A Simplistic Program Translation Scheme. A Better Scheme Using a Linker

Lecture 12-13: Linking

(Extract from the slides by Terrance E. Boult

Outline. 1 Background. 2 ELF Linking. 3 Static Linking. 4 Dynamic Linking. 5 Summary. Linker. Various Stages. 1 Linking can be done at compile.

Revealing Internals of Linkers. Zhiqiang Lin

Sungkyunkwan University

Linking and Loading. CS61, Lecture 16. Prof. Stephen Chong October 25, 2011

Linking. Today. Next time. Static linking Object files Static & dynamically linked libraries. Exceptional control flows

Systemprogrammering och operativsystem Laborationer. Linker Puzzles. Systemprogrammering 2007 Föreläsning 1 Compilation and Linking

u Linking u Case study: Library interpositioning

CS241 Computer Organization Spring Buffer Overflow

Today. Linking. Example C Program. Sta7c Linking. Linking Case study: Library interposi7oning

Carnegie Mellon. Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition

COMPILING OBJECTS AND OTHER LANGUAGE IMPLEMENTATION ISSUES. Credit: Mostly Bryant & O Hallaron

Linking. CS 485 Systems Programming Fall Instructor: James Griffioen

A SimplisHc Program TranslaHon Scheme. TranslaHng the Example Program. Example C Program. Why Linkers? - Modularity. Linking

Link 7. Dynamic Linking

Link 7. Static Linking

Link 7.A Static Linking

CSE2421 Systems1 Introduction to Low-Level Programming and Computer Organization

Link 8.A Dynamic Linking

Linking : Introduc on to Computer Systems 13 th Lecture, October 11th, Instructor: Randy Bryant. Carnegie Mellon

CS 550 Operating Systems Spring Process I

Systemprogrammering och operativsystem Linker Puzzles. Systemprogrammering 2009 Före lä s n in g 1 Compilation and Linking

Link 2. Object Files

Linking. Explain what ELF format is. Explain what an executable is and how it got that way. With huge thanks to Steve Chong for his notes from CS61.

Link 2. Object Files

Executables and Linking. CS449 Spring 2016

238P: Operating Systems. Lecture 7: Basic Architecture of a Program. Anton Burtsev January, 2018

Link 8. Dynamic Linking

Computer Systems Organization

Link 4. Relocation. Young W. Lim Wed. Young W. Lim Link 4. Relocation Wed 1 / 22

Executables and Linking. CS449 Fall 2017

Link 4. Relocation. Young W. Lim Thr. Young W. Lim Link 4. Relocation Thr 1 / 26

Compiler Drivers = GCC

Generating Programs and Linking. Professor Rick Han Department of Computer Science University of Colorado at Boulder

CIT 595 Spring System Software: Programming Tools. Assembly Process Example: First Pass. Assembly Process Example: Second Pass.

Linkers and Loaders. CS 167 VI 1 Copyright 2008 Thomas W. Doeppner. All rights reserved.

Link Edits and Relocatable Code

A software view. Computer Systems. The Compilation system. How it works. 1. Preprocesser. 1. Preprocessor (cpp)

Link 3. Symbols. Young W. Lim Mon. Young W. Lim Link 3. Symbols Mon 1 / 42

Binghamton University. CS-220 Spring Loading Code. Computer Systems Chapter 7.5, 7.8, 7.9

Assembly I: Basic Operations. Jo, Heeseung

ASSEMBLY I: BASIC OPERATIONS. Jo, Heeseung

CS 33. Linkers. CS33 Intro to Computer Systems XXV 1 Copyright 2017 Thomas W. Doeppner. All rights reserved.

Introduction Presentation A

ELF (1A) Young Won Lim 10/22/14

Turning C into Object Code Code in files p1.c p2.c Compile with command: gcc -O p1.c p2.c -o p Use optimizations (-O) Put resulting binary in file p

Lecture 8: linking CS 140. Dawson Engler Stanford CS department

Midterm. Median: 56, Mean: "midterm.data" using 1:2 1 / 37

Machine Programming 1: Introduction

CS 3214 Computer Systems. Do not start the test until instructed to do so! printed

An Experience Like No Other. Stack Discipline Aug. 30, 2006

Assembly Language Programming Linkers

CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING PREVIEW SLIDES 16, SPRING 2013

CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING

CS3214 Spring 2017 Exercise 2

Linking and Loading. ICS312 - Spring 2010 Machine-Level and Systems Programming. Henri Casanova

Introduction to Computer Systems , fall th Lecture, Sep. 28 th

CMSC 313 Lecture 12 [draft] How C functions pass parameters

CSL373: Operating Systems Linking

Shell Code For Beginners

Process Environment. Pradipta De

Memory and C/C++ modules

CNIT 127: Exploit Development. Ch 3: Shellcode. Updated

Link 4. Relocation. Young W. Lim Sat. Young W. Lim Link 4. Relocation Sat 1 / 33

CSC 1600 Memory Layout for Unix Processes"

Assembly Programmer s View Lecture 4A Machine-Level Programming I: Introduction

Machine Language, Assemblers and Linkers"

o Code, executable, and process o Main memory vs. virtual memory

1 /* file cpuid2.s */ 4.asciz "The processor Vendor ID is %s \n" 5.section.bss. 6.lcomm buffer, section.text. 8.globl _start.

Today s Big Adventure

Obtained the source code to gcc, one can just follow the instructions given in the INSTALL file for GCC.

Buffer Overflow Attacks

Advanced Buffer Overflow

Linux Memory Layout. Lecture 6B Machine-Level Programming V: Miscellaneous Topics. Linux Memory Allocation. Text & Stack Example. Topics.

Outline. Unresolved references

Link 4. Relocation. Young W. Lim Tue. Young W. Lim Link 4. Relocation Tue 1 / 38

Instruction Set Architectures

Stack Discipline Jan. 19, 2018

From Code to Program: CALL Con'nued (Linking, and Loading)

Transcription:

15-213 The course that gives CMU its Zip! Linking February 24, 2005 Topics Static linking Dynamic linking Case study: Library interpositioning 13-linking.ppt

Example C Program main.c int buf[2] = {1, 2}; int main() { swap(); return 0; } swap.c extern int buf[]; int *bufp0 = &buf[0]; int *bufp1; void swap() { int temp; } bufp1 = &buf[1]; temp = *bufp0; *bufp0 = *bufp1; *bufp1 = temp; 2 15-213, S 05

Static Linking Programs are translated and linked using a compiler driver: unix> gcc -O2 -g -o p main.c swap.c unix>./p main.c swap.c Source files Translators (cpp, cc1, as) main.o Translators (cpp, cc1, as) swap.o Separately compiled relocatable object files Linker (ld) p Fully linked executable object file (contains code and data for all functions defined in main.c and swap.c 3 15-213, S 05

Why Linkers? Modularity Program can be written as a collection of smaller source files, rather than one monolithic mass. Efficiency Can build libraries of common functions (more on this later) e.g., Math library, standard C library Time: Change one source file, compile, and then relink. No need to recompile other source files. Space: Libraries of common functions can be aggregated into a single file... Yet executable files and running memory images contain only code for the functions they actually use. 4 15-213, S 05

What Does a Linker Do? Symbol resolution Programs define and reference symbols (variables and functions): code: a(); /* reference to symbol a */ data: int *xp=&x; /* reference to symbol x */ Symbol definitions are stored (by compiler) in symbol table. Each entry includes name, size, and location of object. Linker associates each symbol reference with exactly one symbol definition. Relocation Merges separate code and data sections into single sections Relocates symbols from their relative locations in the.o files to new absolute positions in the executable. Updates all references to these symbols to reflect their new positions. 5 15-213, S 05

Object Files (Modules) Relocatable object file (.o( file) Contains code and data in a form that can be combined with other relocatable object files at compile time to form an executable. Executable object file Contains code and data in a form that can be copied directly into memory and then executed. Shared object file (.so( file) Special type of relocatable object file that can be loaded into memory and linked dynamically, at either load time or runtime. Called Dynamic Link Libraries (DLLs) by Windows 6 15-213, S 05

Executable and Linkable Format (ELF) Standard binary format for object files Originally proposed by AT&T System V Unix Later adopted by BSD Unix variants and Linux One unified format for Relocatable object files (.o), Executable object files Shared object files (.so) Generic name: ELF binaries 7 15-213, S 05

ELF Object File Format Elf header Magic number, type (.o, exec,.so), machine, byte ordering, etc. Program header table Page size, virtual addresses memory segments (sections), segment sizes..text section Code.data section Initialized (static) data.bss section Uninitialized (static) data Block Started by Symbol Better Save Space Has section header but occupies no space ELF header Program header table (required for executables).text section.data section.bss section.symtab.rel.txt.rel.data.debug Section header table (required for relocatables) 0 8 15-213, S 05

ELF Object File Format (cont).symtab section Symbol table Procedure and static variable names Section names and locations.rel.text section Relocation info for.text section Addresses of instructions that will need to be modified in the executable Instructions for modifying..rel.data section Relocation info for.data section Addresses of pointer data that will need to be modified in the merged executable.debug section Info for symbolic debugging (gcc -g) ELF header Program header table (required for executables).text section.data section.bss section.symtab.rel.text.rel.data.debug Section header table (required for relocatables) 0 9 15-213, S 05

Linker Symbols Global symbols Symbols defined by module m that can be referenced by other modules. Ex: non-static C functions and non-static global variables. External symbols Global symbols that are referenced by module m but defined by some other module. Local symbols Symbols that are defined and referenced exclusively by module m. Ex: C functions and variables that are defined with the static attribute. Key Point: Local linker symbols are not local program variables 10 15-213, S 05

Resolving Symbols Def of global symbol buf main.c int buf[2] = {1,2}; int main() { swap(); return 0; } swap.c extern int buf[]; static int *bufp0 = &buf[0]; static int *bufp1; void swap() { int temp; Ref to external symbol buf Linker knows nothing of temp Ref to external symbol swap Def of local symbol bufp0 } bufp1 = &buf[1]; temp = *bufp0; *bufp0 = *bufp1; *bufp1 = temp; 11 15-213, S 05

Relocating Symbol Definitions and References Relocatable Object Files Executable Object File system code system data.text.data 0 headers system code main().text m.o a.o main() int e = 7 a() int *ep = &e int x = 15 int y.text.data.text.data.bss system data int e = 7 int *ep = &e int x = 15 uninitialized data.symtab.debug 12 15-213, S 05 a() more system code.data.bss

main.o Relocation Info int buf[2] = {1,2}; int main() { swap(); return 0; } 0000000 <main>: 0: 55 push %ebp 1: 89 e5 mov %esp,%ebp 3: 83 ec 08 sub $0x8,%esp 6: e8 fc ff ff ff call 7 <main+0x7> 7: R_386_PC32 swap b: 31 c0 xor %eax,%eax d: 89 ec mov %ebp,%esp f: 5d pop %ebp 10: c3 ret Disassembly of section.data: 00000000 <buf>: 0: 01 00 00 00 02 00 00 00 Source: objdump 13 15-213, S 05

swap.o Relocation Info (.text) extern int buf[]; static int *bufp0 = &buf[0]; static int *bufp1; void swap() { int temp; } bufp1 = &buf[1]; temp = *bufp0; *bufp0 = *bufp1; *bufp1 = temp; Disassembly of section.text: 00000000 <swap>: 0: 55 push %ebp 1: 8b 15 00 00 00 00 mov 0x0,%edx 3: R_386_32 bufp0 7: a1 00 00 00 00 mov 0x4,%eax 8: R_386_32 buf c: 89 e5 mov %esp,%ebp e: c7 05 00 00 00 00 04movl $0x4,0x0 15: 00 00 00 10: R_386_32 bufp1 14: R_386_32 buf 18: 89 ec mov %ebp,%esp 1a: 8b 0a mov (%edx),%ecx 1c: 89 02 mov %eax,(%edx) 1e: a1 00 00 00 00 mov 0x0,%eax 1f: R_386_32 bufp1 23: 89 08 mov %ecx,(%eax) 25: 5d pop %ebp 26: c3 ret 14 15-213, S 05

a.o Relocation Info (.data) extern int buf[]; static int *bufp0 = &buf[0]; static int *bufp1; void swap() { int temp; Disassembly of section.data: 00000000 <bufp0>: 0: 00 00 00 00 0: R_386_32 buf } bufp1 = &buf[1]; temp = *bufp0; *bufp0 = *bufp1; *bufp1 = temp; 15 15-213, S 05

Executable After Relocation (.text) 080483b4 <main>: 80483b4: 55 push %ebp 80483b5: 89 e5 mov %esp,%ebp 80483b7: 83 ec 08 sub $0x8,%esp 80483ba: e8 09 00 00 00 call 80483c8 <swap> 80483bf: 31 c0 xor %eax,%eax 80483c1: 89 ec mov %ebp,%esp 80483c3: 5d pop %ebp 80483c4: c3 ret 080483c8 <swap>: 80483c8: 55 push %ebp 80483c9: 8b 15 5c 94 04 08 mov 0x804945c,%edx 80483cf: a1 58 94 04 08 mov 0x8049458,%eax 80483d4: 89 e5 mov %esp,%ebp 80483d6: c7 05 48 95 04 08 58 movl $0x8049458,0x8049548 80483dd: 94 04 08 80483e0: 89 ec mov %ebp,%esp 80483e2: 8b 0a mov (%edx),%ecx 80483e4: 89 02 mov %eax,(%edx) 80483e6: a1 48 95 04 08 mov 0x8049548,%eax 80483eb: 89 08 mov %ecx,(%eax) 80483ed: 5d pop %ebp 80483ee: c3 ret 16 15-213, S 05

Executable After Relocation (.data) Disassembly of section.data: 08049454 <buf>: 8049454: 01 00 00 00 02 00 00 00 0804945c <bufp0>: 804945c: 54 94 04 08 17 15-213, S 05

Strong and Weak Symbols Program symbols are either strong or weak strong: procedures and initialized globals weak: uninitialized globals p1.c p2.c strong int foo=5; int foo; weak strong p1() { } p2() { } strong 18 15-213, S 05

Linker s Symbol Rules Rule 1. A strong symbol can only appear once. Rule 2. A weak symbol can be overridden by a strong symbol of the same name. references to the weak symbol resolve to the strong symbol. Rule 3. If there are multiple weak symbols, the linker can pick an arbitrary one. 19 15-213, S 05

Linker Puzzles int x; p1() {} p1() {} Link time error: two strong symbols (p1) int x; p1() {} int x; p2() {} References to x will refer to the same uninitialized int. Is this what you really want? int x; int y; p1() {} double x; p2() {} Writes to x in p2 might overwrite y! Evil! int x=7; int y=5; p1() {} double x; p2() {} Writes to x in p2 will overwrite y! Nasty! int x=7; p1() {} int x; p2() {} References to x will refer to the same initialized variable. Nightmare scenario: two identical weak structs, compiled by different compilers with different alignment rules. 20 15-213, S 05

Packaging Commonly Functions How to package functions commonly used by programmers? Math, I/O, memory management, string manipulation, etc. Awkward, given the linker framework so far: Option 1: Put all functions in a single source file Programmers link big object file into their programs Space and time inefficient Option 2: Put each function in a separate source file Programmers explicitly link appropriate binaries into their programs More efficient, but burdensome on the programmer Solution: static libraries (.a archive files) Concatenate related relocatable object files into a single file with an index (called an archive). Enhance linker so that it tries to resolve unresolved external references by looking for the symbols in one or more archives. If an archive member file resolves reference, link into executable. 21 15-213, S 05

Creating Static Libraries atoi.c printf.c random.c Translator Translator... Translator atoi.o printf.o random.o Archiver (ar) unix> ar rs libc.a \ atoi.o printf.o random.o libc.a C standard library Archiver allows incremental updates: Recompile function that changes and replace.o file in archive. 22 15-213, S 05

Commonly Used Libraries libc.a (the C standard library) 8 MB archive of 900 object files. I/O, memory allocation, signal handling, string handling, data and time, random numbers, integer math libm.a (the C math library) 1 MB archive of 226 object files. floating point math (sin, cos, tan, log, exp, sqrt, ) % ar -t /usr/lib/libc.a sort fork.o fprintf.o fpu_control.o fputc.o freopen.o fscanf.o fseek.o fstab.o % ar -t /usr/lib/libm.a sort e_acos.o e_acosf.o e_acosh.o e_acoshf.o e_acoshl.o e_acosl.o e_asin.o e_asinf.o e_asinl.o 23 15-213, S 05

Linking with Static Libraries addvec.o multvec.o main2.c vector.h Archiver (ar) Translators (cpp, cc1, as) libvector.a libc.a Static libraries Relocatable object files main2.o addvec.o printf.o and any other modules called by printf.o Linker (ld) p2 Fully linked executable object file 24 15-213, S 05

Using Static Libraries Linker s algorithm for resolving external references: Scan.o files and.a files in the command line order. During the scan, keep a list of the current unresolved references. As each new.o or.a file, obj, is encountered, try to resolve each unresolved reference in the list against the symbols defined in obj. If any entries in the unresolved list at end of scan, then error. Problem: Command line order matters! Moral: put libraries at the end of the command line. bass> gcc -L. libtest.o -lmine bass> gcc -L. -lmine libtest.o libtest.o: In function `main': libtest.o(.text+0x4): undefined reference to `libfun' 25 15-213, S 05

Loading Executable Object Files Executable Object File ELF header Program header table (required for executables).text section.data section.bss section.symtab.rel.text.rel.data.debug Section header table (required for relocatables) 0 0xc0000000 0x40000000 0x08048000 Kernel virtual memory User stack (created at runtime) Memory-mapped region for shared libraries Run-time heap (created by malloc) Unused 26 15-213, S 05 0 Read/write segment (.data,.bss) Read-only segment (.init,.text,.rodata) Memory invisible to user code %esp (stack pointer) brk Loaded from the executable file

Shared Libraries Static libraries have the following disadvantages: Potential for duplicating lots of common code in the executable files on a filesystem. e.g., every C program needs the standard C library Potential for duplicating lots of code in the virtual memory space of many processes. Minor bug fixes of system libraries require each application to explicitly relink Modern Solution: Shared Libraries Object files that contain code and data that are loaded and linked into an application dynamically, at either load-time or run-time Dynamic link libraries, DLLs,.so files 27 15-213, S 05

Shared Libraries (cont) Dynamic linking can occur when executable is first loaded and run (load-time linking). Common case for Linux, handled automatically by the dynamic linker (ld-linux.so). Standard C library (libc.so) usually dynamically linked. Dynamic linking can also occur after program has begun (run-time linking). In Unix, this is done by calls to the dlopen() interface. High-performance web wervers. Runtime library interpositioning Shared library routines can be shared by multiple processes. More on this when we learn about virtual memory. 28 15-213, S 05

Dynamic Linking at Load-time main2.c vector.h unix> gcc -shared -o libvector.so \ addvec.c multvec.c Relocatable object file Translators (cpp, cc1, as) main2.o libc.so libvector.so Relocation and symbol table info Linker (ld) Partially linked executable object file p2 Loader (execve) libc.so libvector.so Fully linked executable in memory Dynamic linker (ld-linux.so) Code and data 29 15-213, S 05

Dynamic Linking at Run-time #include <stdio.h> #include <dlfcn.h> int x[2] = {1, 2}; int y[2] = {3, 4}; int z[2]; int main() { void *handle; void (*addvec)(int *, int *, int *, int); char *error; /* dynamically load the shared lib that contains addvec() */ handle = dlopen("./libvector.so", RTLD_LAZY); if (!handle) { fprintf(stderr, "%s\n", dlerror()); exit(1); } 30 15-213, S 05

Dynamic Linking at Run-time... /* get a pointer to the addvec() function we just loaded */ addvec = dlsym(handle, "addvec"); if ((error = dlerror())!= NULL) { fprintf(stderr, "%s\n", error); exit(1); } /* Now we can call addvec() it just like any other function */ addvec(x, y, z, 2); printf("z = [%d %d]\n", z[0], z[1]); } /* unload the shared library */ if (dlclose(handle) < 0) { fprintf(stderr, "%s\n", dlerror()); exit(1); } return 0; 31 15-213, S 05

Some Interpositioning Applications Security Confinement (sandboxing) Autolab interposes calls to libc functions. Running /bin/xterm from student code on server a no-no. Behind the scenes encryption Automatically encrypt otherwise unencrypted network connections. Monitoring and Profiling Count number of calls to functions Characterize call sites and arguments to functions Malloc tracing Detecting memory leaks Generating malloc traces 32 15-213, S 05

Case Study: Library Interpositioning Library interpositioning is a powerful linking technique that allows programmers to intercept calls to arbitrary functions Interpositioning can occur at: compile time When the source code is compiled link time When the relocatable object files are linked to form an executable object file load/run time When an executable object file is loaded into memory, dynamically linked, and then executed. 33 15-213, S 05

Example: malloc() Statistics Find out how much memory is allocated by a function void *malloc(size_t size){ static void *(*fp)(size_t) = 0; void *mp; char *errorstr; if (!fp) { /* get a pointer to the real malloc() */ fp = dlsym(rtld_next, "malloc"); if ((errorstr = dlerror())!= NULL) { fprintf(stderr, "%s(): %s\n", fname, errorstr); exit(exit_failure); } } mp = fp(size); if (!mp) { /* report malloc failure */ } } mem_used += size; return mp; 34 15-213, S 05