Linking. CS 485 Systems Programming Fall Instructor: James Griffioen

Similar documents
Sungkyunkwan University

Lecture 12-13: Linking

Today. Linking. Example C Program. Sta7c Linking. Linking Case study: Library interposi7oning

Example C Program. Linking CS Instructor: Sanjeev Se(a. int buf[2] = {1, 2}; extern int buf[]; int main() { swap(); return 0; }

Example C program. 11: Linking. Why linkers? Modularity! Static linking. Why linkers? Efficiency! What do linkers do? 10/28/2013

Example C Program The course that gives CMU its Zip! Linking March 2, Static Linking. Why Linkers? Page # Topics

Systems Programming and Computer Architecture ( ) Timothy Roscoe

Lecture 16: Linking Computer Architecture and Systems Programming ( )

Linking February 24, 2005

Outline. 1 Background. 2 ELF Linking. 3 Static Linking. 4 Dynamic Linking. 5 Summary. Linker. Various Stages. 1 Linking can be done at compile.

A Simplistic Program Translation Scheme

u Linking u Case study: Library interpositioning

Linking Oct. 15, 2002

Computer Systems. Linking. Han, Hwansoo

Systems I. Linking II

Revealing Internals of Linkers. Zhiqiang Lin

LINKING. Jo, Heeseung

CSE 2421: Systems I Low-Level Programming and Computer Organization. Linking. Presentation N. Introduction to Linkers

E = 2 e lines per set. S = 2 s sets tag. valid bit B = 2 b bytes per cache block (the data) CSE351 Inaugural EdiNon Spring

Computer Organization: A Programmer's Perspective

Exercise Session 7 Computer Architecture and Systems Programming

Linking. Computer Systems Organization (Spring 2017) CSCI-UA 201, Section 3. Instructor: Joanna Klukowska

Carnegie Mellon. Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition

Linking Oct. 26, 2009"

CS429: Computer Organization and Architecture

Relocating Symbols and Resolving External References. CS429: Computer Organization and Architecture. m.o Relocation Info

CS 201 Linking Gerson Robboy Portland State University

Linking : Introduc on to Computer Systems 13 th Lecture, October 11th, Instructor: Randy Bryant. Carnegie Mellon

Linking and Loading. CS61, Lecture 16. Prof. Stephen Chong October 25, 2011

Linker Puzzles The course that gives CMU its Zip! Linking Mar 4, A Simplistic Program Translation Scheme. A Better Scheme Using a Linker

(Extract from the slides by Terrance E. Boult

A SimplisHc Program TranslaHon Scheme. TranslaHng the Example Program. Example C Program. Why Linkers? - Modularity. Linking

Linking. Today. Next time. Static linking Object files Static & dynamically linked libraries. Exceptional control flows

COMPILING OBJECTS AND OTHER LANGUAGE IMPLEMENTATION ISSUES. Credit: Mostly Bryant & O Hallaron

CS241 Computer Organization Spring Buffer Overflow

Systemprogrammering och operativsystem Laborationer. Linker Puzzles. Systemprogrammering 2007 Föreläsning 1 Compilation and Linking

Link 7. Dynamic Linking

CSE2421 Systems1 Introduction to Low-Level Programming and Computer Organization

Link 8.A Dynamic Linking

Link 7. Static Linking

Link 7.A Static Linking

CS 550 Operating Systems Spring Process I

Link 2. Object Files

Link 2. Object Files

Executables and Linking. CS449 Spring 2016

Executables and Linking. CS449 Fall 2017

Link 4. Relocation. Young W. Lim Wed. Young W. Lim Link 4. Relocation Wed 1 / 22

Linking. Explain what ELF format is. Explain what an executable is and how it got that way. With huge thanks to Steve Chong for his notes from CS61.

Link 8. Dynamic Linking

238P: Operating Systems. Lecture 7: Basic Architecture of a Program. Anton Burtsev January, 2018

Link 4. Relocation. Young W. Lim Thr. Young W. Lim Link 4. Relocation Thr 1 / 26

Compiler Drivers = GCC

Computer Systems Organization

CIT 595 Spring System Software: Programming Tools. Assembly Process Example: First Pass. Assembly Process Example: Second Pass.

Linkers and Loaders. CS 167 VI 1 Copyright 2008 Thomas W. Doeppner. All rights reserved.

Link 3. Symbols. Young W. Lim Mon. Young W. Lim Link 3. Symbols Mon 1 / 42

Systemprogrammering och operativsystem Linker Puzzles. Systemprogrammering 2009 Före lä s n in g 1 Compilation and Linking

A software view. Computer Systems. The Compilation system. How it works. 1. Preprocesser. 1. Preprocessor (cpp)

Compila(on, Disassembly, and Profiling

Generating Programs and Linking. Professor Rick Han Department of Computer Science University of Colorado at Boulder

Introduction Presentation A

Obtained the source code to gcc, one can just follow the instructions given in the INSTALL file for GCC.

Midterm. Median: 56, Mean: "midterm.data" using 1:2 1 / 37

Comprehensive Static Instrumentation for Dynamic-Analysis Tools

Link Edits and Relocatable Code

CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING PREVIEW SLIDES 16, SPRING 2013

Draft. Chapter 1 Program Structure. 1.1 Introduction. 1.2 The 0s and the 1s. 1.3 Bits and Bytes. 1.4 Representation of Numbers in Memory

Linking and Loading. ICS312 - Spring 2010 Machine-Level and Systems Programming. Henri Casanova

Program Translation. text. text. binary. binary. C program (p1.c) Compiler (gcc -S) Asm code (p1.s) Assembler (gcc or as) Object code (p1.

CMSC 313 Lecture 12 [draft] How C functions pass parameters

Memory Management. CS449 Fall 2017

Introduction to Computer Systems , fall th Lecture, Sep. 28 th

CS5460/6460: Operating Systems. Lecture 21: Shared libraries. Anton Burtsev March, 2014

The Preprocessor. CS 2022: Introduction to C. Instructor: Hussam Abu-Libdeh. Cornell University (based on slides by Saikat Guha) Fall 2011, Lecture 9

Outline. Unresolved references

Memory and C/C++ modules

Link 4. Relocation. Young W. Lim Sat. Young W. Lim Link 4. Relocation Sat 1 / 33

Binghamton University. CS-220 Spring Loading Code. Computer Systems Chapter 7.5, 7.8, 7.9

Link 4. Relocation. Young W. Lim Tue. Young W. Lim Link 4. Relocation Tue 1 / 38

Machine- Level Programming III: Switch Statements and IA32 Procedures

ELF (1A) Young Won Lim 10/22/14

CS 3214 Computer Systems. Do not start the test until instructed to do so! printed

Link 4. Relocation. Young W. Lim Mon. Young W. Lim Link 4. Relocation Mon 1 / 35

CMSC 216 Introduction to Computer Systems Lecture 23 Libraries

238P: Operating Systems. Lecture 4: Linking and Loading (Basic architecture of a program) Anton Burtsev October, 2018

Intro x86 Part 3: Linux Tools & Analysis

o Code, executable, and process o Main memory vs. virtual memory

CS3214 Spring 2017 Exercise 2

CSC 405 Computer Security Stack Canaries & ASLR

High Performance Computing Lecture 1. Matthew Jacob Indian Institute of Science

CS 33. Libraries. CS33 Intro to Computer Systems XXVIII 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.

Link 4. Relocation. Young W. Lim Thr. Young W. Lim Link 4. Relocation Thr 1 / 48

Advanced Buffer Overflow

Assembly Programmer s View Lecture 4A Machine-Level Programming I: Introduction

CS165 Computer Security. Understanding low-level program execution Oct 1 st, 2015

Follow us on Twitter for important news and Compiling Programs

Practical C Issues:! Preprocessor Directives, Multi-file Development, Makefiles. CS449 Fall 2017

C Review. MaxMSP Developers Workshop Summer 2009 CNMAT

143A: Principles of Operating Systems. Lecture 4: Linking and Loading (Basic architecture of a program) Anton Burtsev October, 2018

CS2141 Software Development using C/C++ Libraries

Transcription:

Linking CS 485 Systems Programming Fall 2015 Instructor: James Griffioen Adapted from slides by R. Bryant and D. O Hallaron (hip://csapp.cs.cmu.edu/public/instructors.html) 1

Today Linking Case study: Library interposi7oning 2

Example C Program main.c int buf[2] = {1, 2; int main() { swap(); return 0; swap.c extern int buf[]; int *bufp0 = &buf[0]; static int *bufp1; void swap() { int temp; bufp1 = &buf[1]; temp = *bufp0; *bufp0 = *bufp1; *bufp1 = temp; 3

Compiling C files Programs are translated and linked using a compiler driver: unix> gcc -O2 -g -o p main.c swap.c unix>./p main.c swap.c Source files Compiler (gcc) p Executable program file (contains code and data for all func9ons defined in main.c and swap.c) The compiler (gcc) hides several important steps in the compila7on process. (See the hello world example from chapter 1 notes). 4

A more detailed look: Sta7c Linking Programs are translated and linked using a compiler driver: unix> gcc -O2 -g -o p main.c swap.c unix>./p main.c swap.c Source files Translators (cpp, cc1, as) main.o Translators (cpp, cc1, as) swap.o Separately compiled relocatable object files Linker (ld) Let s start with the linker. p Fully linked executable object file (contains code and data for all func9ons defined in main.c and swap.c) 5

Why Linkers? Reason 1: Modularity Program can be wriien as a collecpon of smaller source files, rather than one monolithic mass. Can build libraries of common funcpons (more on this later) e.g., Math library, standard C library 6

Why Linkers? (cont) Reason 2: Efficiency Time: Separate compilapon Change one source file, compile, and then relink. No need to recompile other source files. Space: Libraries Common funcpons can be aggregated into a single file... Yet executable files and running memory images contain only code for the funcpons they actually use. 7

Linking Programs are translated and linked using a compiler driver: unix> gcc -O2 -g -o p main.c swap.c unix>./p main.c Translators (cpp, cc1, as) main.o swap.c Translators (cpp, cc1, as) swap.o Note: the translator does not see the en7re program. In fact it may encounter func7ons and variables that are allocated in other source files. Linker (ld) p 8

What Do Linkers Do? Step 1. Symbol resolu7on Programs define and reference symbols (variables and funcpons): void swap() { /* define symbol swap */ swap(); /* reference symbol a */ int *xp = &x; /* define symbol xp, reference x */ Symbol definipons are stored (by compiler) in symbol table. Symbol table is an array of structs Each entry includes name, size, and locapon of symbol. Linker associates each symbol reference with exactly one symbol definipon. 9

What Do Linkers Do? (cont) Step 2. Reloca7on Merges separate code and data secpons into single secpons Relocates symbols from their relapve locapons in the.o files to their final absolute memory locapons in the executable. Updates all references to these symbols to reflect their new posipons. 10

Three Kinds of Object Files (Modules) Relocatable object file (.o file) Contains code and data in a form that can be combined with other relocatable object files to form executable object file. Each.o file is produced from exactly one source (.c) file Executable object file (a.out file) Contains code and data in a form that can be copied directly into memory and then executed. Shared object file (.so file) Special type of relocatable object file that can be loaded into memory and linked dynamically, at either load Pme or run- Pme. Called Dynamic Link Libraries (DLLs) by Windows 11

Executable and Linkable Format (ELF) Standard binary format for object files Originally proposed by AT&T System V Unix Later adopted by BSD Unix variants and Linux One unified format for Relocatable object files (.o), Executable object files (a.out) Shared object files (.so) Generic name: ELF binaries 12

ELF Object File Format Elf header Word size, byte ordering, file type (.o, exec,.so), machine type, etc. Segment header table Page size, virtual addresses memory segments (secpons), segment sizes..text sec7on Code.rodata sec7on Read only data: jump tables,....data sec7on IniPalized global variables.bss sec7on UniniPalized global variables Block Started by Symbol BeIer Save Space Has secpon header but occupies no space ELF header Segment header table (required for executables).text sec7on.rodata sec7on.data sec7on.bss sec7on.symtab sec7on.rel.txt sec7on.rel.data sec7on.debug sec7on Sec7on header table 0 13

ELF Object File Format (cont.).symtab sec7on Symbol table Procedure and stapc variable names SecPon names and locapons.rel.text sec7on RelocaPon info for.text secpon Addresses of instrucpons that will need to be modified in the executable InstrucPons for modifying..rel.data sec7on RelocaPon info for.data secpon Addresses of pointer data that will need to be modified in the merged executable.debug sec7on Info for symbolic debugging (gcc -g) ELF header Segment header table (required for executables).text sec7on.rodata sec7on.data sec7on.bss sec7on.symtab sec7on.rel.txt sec7on.rel.data sec7on.debug sec7on 0 Sec7on header table Offsets and sizes of each secpon Sec7on header table 14

Linker Symbols Global symbols Symbols defined by module m that can be referenced by other modules. E.g.: non- static C funcpons and non- static global variables. External symbols Global symbols that are referenced by module m but defined by some other module. Local symbols Symbols that are defined and referenced exclusively by module m. E.g.: C funcpons and variables defined with the static airibute. Local linker symbols are not local program variables 15

Resolving Symbols Global Global External Local int buf[2] = {1, 2; extern int buf[]; int main() { swap(); return 0; main.c int *bufp0 = &buf[0]; static int *bufp1; void swap() { int temp; Global External Linker knows nothing of temp bufp1 = &buf[1]; temp = *bufp0; *bufp0 = *bufp1; *bufp1 = temp; swap.c 16

Reloca7ng Code and Data Relocatable Object Files Executable Object File System code System data.text.data 0 Headers System code main.o main().text int buf[2]={1,2.data swap.o swap().text int *bufp0=&buf[0].data static int *bufp1.bss main() swap() More system code System data int buf[2]={1,2 int *bufp0=&buf[0] int *bufp1.symtab.debug.text.data.bss Even though private to swap, requires alloca7on in.bss 17

Reloca7on Info (main) main.c int buf[2] = {1,2; int main() { swap(); return 0; main.o 0000000 <main>: 0: 8d 4c 24 04 lea 0x4(%esp),%ecx 4: 83 e4 f0 and $0xfffffff0,%esp 7: ff 71 fc pushl 0xfffffffc(%ecx) a: 55 push %ebp b: 89 e5 mov %esp,%ebp d: 51 push %ecx e: 83 ec 04 sub $0x4,%esp 11: e8 fc ff ff ff call 12 <main+0x12> 12: R_386_PC32 swap 16: 83 c4 04 add $0x4,%esp 19: 31 c0 xor %eax,%eax 1b: 59 pop %ecx 1c: 5d pop %ebp 1d: 8d 61 fc lea 0xfffffffc(%ecx),%esp 20: c3 ret Source: objdump r -d Disassembly of section.data: 00000000 <buf>: 0: 01 00 00 00 02 00 00 00 18

Reloca7on Info (swap,.text) swap.c extern int buf[]; int *bufp0 = &buf[0]; static int *bufp1; void swap() { int temp; bufp1 = &buf[1]; temp = *bufp0; *bufp0 = *bufp1; *bufp1 = temp; swap.o Disassembly of section.text: 00000000 <swap>: 0: 8b 15 00 00 00 00 mov 0x0,%edx 2: R_386_32 buf 6: a1 04 00 00 00 mov 0x4,%eax 7: R_386_32 buf b: 55 push %ebp c: 89 e5 mov %esp,%ebp e: c7 05 00 00 00 00 04 movl $0x4,0x0 15: 00 00 00 10: R_386_32.bss 14: R_386_32 buf 18: 8b 08 mov (%eax),%ecx 1a: 89 10 mov %edx,(%eax) 1c: 5d pop %ebp 1d: 89 0d 04 00 00 00 mov %ecx,0x4 1f: R_386_32 buf 23: c3 ret 19

Reloca7on Info (swap,.data) swap.c extern int buf[]; int *bufp0 = &buf[0]; static int *bufp1; Disassembly of section.data: 00000000 <bufp0>: 0: 00 00 00 00 0: R_386_32 buf void swap() { int temp; bufp1 = &buf[1]; temp = *bufp0; *bufp0 = *bufp1; *bufp1 = temp; 20

Executable Before/Aaer Reloca7on (.text) 0000000 <main>:... e: 83 ec 04 sub $0x4,%esp 11: e8 fc ff ff ff call 12 <main+0x12> 12: R_386_PC32 swap 16: 83 c4 04 add $0x4,%esp... 0x8048396 + 0x1a = 0x80483b0 08048380 <main>: 8048380: 8d 4c 24 04 lea 0x4(%esp),%ecx 8048384: 83 e4 f0 and $0xfffffff0,%esp 8048387: ff 71 fc pushl 0xfffffffc(%ecx) 804838a: 55 push %ebp 804838b: 89 e5 mov %esp,%ebp 804838d: 51 push %ecx 804838e: 83 ec 04 sub $0x4,%esp 8048391: e8 1a 00 00 00 call 80483b0 <swap> 8048396: 83 c4 04 add $0x4,%esp 8048399: 31 c0 xor %eax,%eax 804839b: 59 pop %ecx 804839c: 5d pop %ebp 804839d: 8d 61 fc lea 0xfffffffc(%ecx),%esp 80483a0: c3 ret 21

0: 8b 15 00 00 00 00 mov 0x0,%edx 2: R_386_32 buf 6: a1 04 00 00 00 mov 0x4,%eax 7: R_386_32 buf... e: c7 05 00 00 00 00 04 movl $0x4,0x0 15: 00 00 00 10: R_386_32.bss 14: R_386_32 buf... 1d: 89 0d 04 00 00 00 mov %ecx,0x4 1f: R_386_32 buf 23: c3 ret 080483b0 <swap>: 80483b0: 8b 15 20 96 04 08 mov 0x8049620,%edx 80483b6: a1 24 96 04 08 mov 0x8049624,%eax 80483bb: 55 push %ebp 80483bc: 89 e5 mov %esp,%ebp 80483be: c7 05 30 96 04 08 24 movl $0x8049624,0x8049630 80483c5: 96 04 08 80483c8: 8b 08 mov (%eax),%ecx 80483ca: 89 10 mov %edx,(%eax) 80483cc: 5d pop %ebp 80483cd: 89 0d 24 96 04 08 mov %ecx,0x8049624 80483d3: c3 ret 22

Executable Aaer Reloca7on (.data) Disassembly of section.data: 08049620 <buf>: 8049620: 01 00 00 00 02 00 00 00 08049628 <bufp0>: 8049628: 20 96 04 08 23

Tools for Manipula7ng Object Files ar creates sta7c libraries strings lists all of the printable strings contained in a file strip deletes symbol table informa7on from an object file nm list the symbols defined in the symbol table size list the names and sizes of the sec7ons of an object file readelf displays the complete structure of an object file objdump can display all the informa7on from an object file in lots of different ways ldd list the shared libraries that a program needs at run7me 24

Strong and Weak Symbols p1.c int foo=5; p1() { p2.c extern int foo; p2() { 25

Strong and Weak Symbols What if the keyword extern is missing? p1.c int foo=5; p1() { p2.c int foo; p2() { 26

Strong and Weak Symbols What if foo is define with a different value? p1.c int foo=5; p1() { p2.c int foo=2; p2() { 27

Strong and Weak Symbols What if foo is define with the same value? p1.c int foo=5; p1() { p2.c int foo=5; p2() { 28

Strong and Weak Symbols Program symbols are either strong or weak Strong: procedures and inipalized globals Weak: uninipalized globals p1.c p2.c strong int foo=5; int foo; weak strong p1() { p2() { strong 29

Linker s Symbol Rules Rule 1: Mul7ple strong symbols are not allowed Each item can be defined only once Otherwise: Linker error Rule 2: Given a strong symbol and mul7ple weak symbol, choose the strong symbol References to the weak symbol resolve to the strong symbol Rule 3: If there are mul7ple weak symbols, pick an arbitrary one Can override this with gcc fno-common 30

Linker Puzzles int x; p1() { p1() { Link Pme error: two strong symbols (p1) int x; p1() { int x; p2() { References to x will refer to the same uninipalized int. Is this what you really want? int x; int y; p1() { double x; p2() { Writes to x in p2 might overwrite y! Evil! int x=7; int y=5; p1() { double x; p2() { Writes to x in p2 will overwrite y! Nasty! int x=7; p1() { int x; p2() { References to x will refer to the same inipalized variable. Nightmare scenario: two iden7cal weak structs, compiled by different compilers with different alignment rules. 31

Role of.h Files c1.c #include "global.h" int f() { return g+1; c2.c #include <stdio.h> #include "global.h" global.h #ifdef INITIALIZE int g = 23; static int init = 1; #else int g; static int init = 0; #endif int main() { if (!init) g = 37; int t = f(); printf("calling f yields %d\n", t); return 0; 32

Running Preprocessor c1.c #include "global.h" int f() { return g+1; global.h #ifdef INITIALIZE int g = 23; static int init = 1; #else int g; static int init = 0; #endif -DINITIALIZE int g = 23; static int init = 1; int f() { return g+1; no ini7aliza7on int g; static int init = 0; int f() { return g+1; #include causes C preprocessor to insert file verba7m 33

Role of.h Files (cont.) c2.c c1.c #include "global.h" int f() { return g+1; #include <stdio.h> #include "global.h" global.h int main() { if (!init) g = 37; int t = f(); printf("calling f yields %d\n", t); return 0; #ifdef INITIALIZE int g = 23; static int init = 1; #else int g; static int init = 0; #endif What happens: gcc -o p c1.c c2.c?? gcc -o p c1.c c2.c \ -DINITIALIZE?? 34

Global Variables Avoid if you can Otherwise Use static if you can IniPalize if you define a global variable Use extern if you use external global variable 35

Packaging Commonly Used Func7ons How to package func7ons commonly used by programmers? Math, I/O, memory management, string manipulapon, etc. Awkward, given the linker framework so far: Op7on 1: Put all funcpons into a single source file Programmers link big object file into their programs Space and Pme inefficient Op7on 2: Put each funcpon in a separate source file Programmers explicitly link appropriate binaries into their programs More efficient, but burdensome on the programmer 36

Solu7on: Sta7c Libraries Sta7c libraries (.a archive files) Concatenate related relocatable object files into a single file with an index (called an archive). Enhance linker so that it tries to resolve unresolved external references by looking for the symbols in one or more archives. If an archive member file resolves reference, link it into the executable. 37

Crea7ng Sta7c Libraries atoi.c printf.c random.c Translator Translator... Translator atoi.o printf.o random.o Archiver (ar) unix> ar rs libc.a \ atoi.o printf.o random.o libc.a C standard library Archiver allows incremental updates Recompile func7on that changes and replace.o file in archive. 38

Commonly Used Libraries libc.a (the C standard library) 8 MB archive of 1392 object files. I/O, memory allocapon, signal handling, string handling, data and Pme, random numbers, integer math libm.a (the C math library) 1 MB archive of 401 object files. floapng point math (sin, cos, tan, log, exp, sqrt, ) % ar -t /usr/lib/libc.a sort fork.o fprintf.o fpu_control.o fputc.o freopen.o fscanf.o fseek.o fstab.o % ar -t /usr/lib/libm.a sort e_acos.o e_acosf.o e_acosh.o e_acoshf.o e_acoshl.o e_acosl.o e_asin.o e_asinf.o e_asinl.o 39

Linking with Sta7c Libraries addvec.o multvec.o main2.c vector.h Archiver (ar) Translators (cpp, cc1, as) libvector.a libc.a Sta9c libraries Relocatable object files main2.o addvec.o printf.o and any other modules called by printf.o Linker (ld) p2 Fully linked executable object file 40

Using Sta7c Libraries Linker s algorithm for resolving external references: Scan.o files and.a files in the command line order. During the scan, keep a list of the current unresolved references. As each new.o or.a file, obj, is encountered, try to resolve each unresolved reference in the list against the symbols defined in obj. If any entries in the unresolved list at end of scan, then error. Problem: Command line order maiers! Moral: put libraries at the end of the command line. unix> gcc -L. libtest.o -lmine unix> gcc -L. -lmine libtest.o libtest.o: In function `main': libtest.o(.text+0x4): undefined reference to `libfun' 41

Loading Executable Object Files File Start Executable Object File ELF header Program header table (required for executables).init sec7on High Memory 0 0x100000000 Kernel virtual memory User stack (created at run7me) Memory outside 32- bit address space %esp (stack pointer).text sec7on.rodata sec7on.data sec7on 0xf7e9ddc0 Memory- mapped region for shared libraries.bss sec7on.symtab.debug.line.strtab Sec7on header table (required for relocatables) File End 0x08048000 Low Memory 0 Run- 7me heap (created by malloc) Read/write segment (.data,.bss) Read- only segment (.init,.text,.rodata) Unused brk Loaded from the executable file 42

Shared Libraries Sta7c libraries have the following disadvantages: DuplicaPon in the stored executables (every funcpon need std libc) DuplicaPon in the running executables Minor bug fixes of system libraries require each applicapon to explicitly relink Modern solu7on: Shared Libraries Object files that contain code and data that are loaded and linked into an applicapon dynamically, at either load- 8me or run- 8me Also called: dynamic link libraries, DLLs,.so files 43

Shared Libraries (cont.) Dynamic linking can occur when executable is first loaded and run (load- 7me linking). Common case for Linux, handled automapcally by the dynamic linker (ld-linux.so). Standard C library (libc.so) usually dynamically linked. Dynamic linking can also occur aaer program has begun (run- 7me linking). In Linux, this is done by calls to the dlopen() interface. DistribuPng solware. High- performance web servers. RunPme library interposiponing. Shared library rou7nes can be shared by mul7ple processes. More on this when we learn about virtual memory 44

Dynamic Linking at Load- 7me main2.c vector.h CS 485 Systems Programming unix> gcc -shared -o libvector.so \ addvec.c multvec.c Relocatable object file Translators (cpp, cc1, as) main2.o libc.so libvector.so Reloca9on and symbol table info Linker (ld) Par9ally linked executable object file p2 Loader (execve) libc.so libvector.so Fully linked executable in memory Dynamic linker (ld-linux.so) Code and data 45

Dynamic Linking at Run- 7me #include <stdio.h> #include <dlfcn.h> int x[2] = {1, 2; int y[2] = {3, 4; int z[2]; int main() { void *handle; void (*addvec)(int *, int *, int *, int); char *error; /* dynamically load the shared lib that contains addvec() */ handle = dlopen("./libvector.so", RTLD_LAZY); if (!handle) { fprintf(stderr, "%s\n", dlerror()); exit(1); 46

Dynamic Linking at Run- 7me... /* get a pointer to the addvec() function we just loaded */ addvec = dlsym(handle, "addvec"); if ((error = dlerror())!= NULL) { fprintf(stderr, "%s\n", error); exit(1); /* Now we can call addvec() just like any other function */ addvec(x, y, z, 2); printf("z = [%d %d]\n", z[0], z[1]); /* unload the shared library */ if (dlclose(handle) < 0) { fprintf(stderr, "%s\n", dlerror()); exit(1); return 0; 47

Today Linking Case study: Library interposi7oning 48

Case Study: Library Interposi7oning Library interposi7oning : powerful linking technique that allows programmers to intercept calls to arbitrary func7ons Interposi7oning can occur at: Compile Pme: When the source code is compiled Link Pme: When the relocatable object files are stapcally linked to form an executable object file Load/run Pme: When an executable object file is loaded into memory, dynamically linked, and then executed. 49

Some Interposi7oning Applica7ons Security Confinement (sandboxing) Interpose calls to libc funcpons. Behind the scenes encryppon AutomaPcally encrypt otherwise unencrypted network connecpons. Monitoring and Profiling Count number of calls to funcpons Characterize call sites and arguments to funcpons Malloc tracing DetecPng memory leaks Genera7ng address traces 50

Example program #include <stdio.h> #include <stdlib.h> #include <malloc.h> int main() { free(malloc(10)); printf("hello, world\n"); exit(0); hello.c Goal: trace the addresses and sizes of the allocated and freed blocks, without modifying the source code. Three solu7ons: interpose on the lib malloc and free func7ons at compile 7me, link 7me, and load/ run 7me. 51

Compile- 7me Interposi7oning #ifdef COMPILETIME /* Compile-time interposition of malloc and free using C * preprocessor. A local malloc.h file defines malloc (free) * as wrappers mymalloc (myfree) respectively. */ #include <stdio.h> /* * mymalloc - malloc wrapper function */ void *mymalloc(size_t size, char *file, int line) { void *ptr = malloc(size); printf("%s:%d: malloc(%d)=%p\n", file, line, (int)size, ptr); return ptr; mymalloc.c 52

Compile- 7me Interposi7oning #define malloc(size) mymalloc(size, FILE, LINE ) #define free(ptr) myfree(ptr, FILE, LINE ) void *mymalloc(size_t size, char *file, int line); void myfree(void *ptr, char *file, int line); malloc.h linux> make helloc gcc -O2 -Wall -DCOMPILETIME -c mymalloc.c gcc -O2 -Wall -I. -o helloc hello.c mymalloc.o linux> make runc./helloc hello.c:7: malloc(10)=0x501010 hello.c:7: free(0x501010) hello, world 53

Link- 7me Interposi7oning #ifdef LINKTIME /* Link-time interposition of malloc and free using the static linker's (ld) "--wrap symbol" flag. */ #include <stdio.h> void * real_malloc(size_t size); void real_free(void *ptr); /* * wrap_malloc - malloc wrapper function */ void * wrap_malloc(size_t size) { void *ptr = real_malloc(size); printf("malloc(%d) = %p\n", (int)size, ptr); return ptr; mymalloc.c 54

Link- 7me Interposi7oning linux> make hellol gcc -O2 -Wall -DLINKTIME -c mymalloc.c gcc -O2 -Wall -Wl,--wrap,malloc -Wl,--wrap,free \ -o hellol hello.c mymalloc.o linux> make runl./hellol malloc(10) = 0x501010 free(0x501010) hello, world The -Wl flag passes argument to linker Telling linker --wrap,malloc tells it to resolve references in a special way: Refs to malloc should be resolved as wrap_malloc Refs to real_malloc should be resolved as malloc 55

#ifdef RUNTIME /* Run-time interposition of malloc and free based on * dynamic linker's (ld-linux.so) LD_PRELOAD mechanism */ #define _GNU_SOURCE #include <stdio.h> #include <stdlib.h> #include <dlfcn.h> void *malloc(size_t size) { static void *(*mallocp)(size_t size); char *error; void *ptr; CS 485 Systems Programming Load/Run- 7me Interposi7oning /* get address of libc malloc */ if (!mallocp) { mallocp = dlsym(rtld_next, "malloc"); if ((error = dlerror())!= NULL) { fputs(error, stderr); exit(1); ptr = mallocp(size); printf("malloc(%d) = %p\n", (int)size, ptr); return ptr; mymalloc.c 56

Load/Run- 7me Interposi7oning linux> make hellor gcc -O2 -Wall -DRUNTIME -shared -fpic -o mymalloc.so mymalloc.c gcc -O2 -Wall -o hellor hello.c linux> make runr (LD_PRELOAD="/usr/lib64/libdl.so./mymalloc.so"./hellor) malloc(10) = 0x501010 free(0x501010) hello, world The LD_PRELOAD environment variable tells the dynamic linker to resolve unresolved refs (e.g., to malloc)by looking in libdl.so and mymalloc.so first. libdl.so necessary to resolve references to the dlopen funcpons. 57

Interposi7oning Recap Compile Time Apparent calls to malloc/free get macro- expanded into calls to mymalloc/myfree Link Time Use linker trick to have special name resolupons malloc à wrap_malloc real_malloc à malloc Compile Time Implement custom version of malloc/free that use dynamic linking to load library malloc/free under different names 58