Memory and C/C++ modules

Similar documents
Reminder: compiling & linking

Compilation/linking revisited. Memory and C/C++ modules. Layout of C/C++ programs. A sample C program demo.c. A possible structure of demo.

Memory and C/C++ modules

o Code, executable, and process o Main memory vs. virtual memory

CSC 1600 Memory Layout for Unix Processes"

Lecture 8 Dynamic Memory Allocation

C Programming. Course Outline. C Programming. Code: MBD101. Duration: 10 Hours. Prerequisites:

Short Notes of CS201

CS201 - Introduction to Programming Glossary By

High Performance Computing Lecture 1. Matthew Jacob Indian Institute of Science

Memory Allocation in C C Programming and Software Tools. N.C. State Department of Computer Science

Draft. Chapter 1 Program Structure. 1.1 Introduction. 1.2 The 0s and the 1s. 1.3 Bits and Bytes. 1.4 Representation of Numbers in Memory

Secure Software Programming and Vulnerability Analysis

In Java we have the keyword null, which is the value of an uninitialized reference type

Reminder of midterm 1. Next Thursday, Feb. 14, in class Look at Piazza announcement for rules and preparations

C Review. MaxMSP Developers Workshop Summer 2009 CNMAT

Limitations of the stack

CSE351 Spring 2018, Final Exam June 6, 2018

Dynamic memory allocation

POINTER AND ARRAY SUNU WIBIRAMA

C Structures & Dynamic Memory Management

CSci 4061 Introduction to Operating Systems. Programs in C/Unix

CS 31: Intro to Systems Pointers and Memory. Kevin Webb Swarthmore College October 2, 2018

advanced data types (2) typedef. today advanced data types (3) enum. mon 23 sep 2002 defining your own types using typedef

Programming. Pointers, Multi-dimensional Arrays and Memory Management

Dynamic Memory Allocation

14. Memory API. Operating System: Three Easy Pieces

Data Structure Series

CS 11 C track: lecture 5

Quiz Start Time: 09:34 PM Time Left 82 sec(s)

CS201 Latest Solved MCQs

Malloc Lab & Midterm Solutions. Recitation 11: Tuesday: 11/08/2016

Vector and Free Store (Pointers and Memory Allocation)

Introduction to C. Robert Escriva. Cornell CS 4411, August 30, Geared toward programmers

Programming in C - Part 2

Memory Management. CS449 Fall 2017

Memory Management. CSC215 Lecture

CS201 Some Important Definitions

Heap Arrays. Steven R. Bagley

C: Pointers. C: Pointers. Department of Computer Science College of Engineering Boise State University. September 11, /21

unsigned char memory[] STACK ¼ 0x xC of address space globals function KERNEL code local variables

CS C Primer. Tyler Szepesi. January 16, 2013

ME964 High Performance Computing for Engineering Applications

XMEM. Extended C/C++ Dynamic Memory Control and Debug Library. XMEM Manual. Version 1.7 / September 2010

Deep C. Multifile projects Getting it running Data types Typecasting Memory management Pointers. CS-343 Operating Systems

Recitation #11 Malloc Lab. November 7th, 2017

Dynamic Memory Management

Section Notes - Week 1 (9/17)

Princeton University. Computer Science 217: Introduction to Programming Systems. Dynamic Memory Management

Memory Allocation. General Questions

Technical Questions. Q 1) What are the key features in C programming language?

Programs in memory. The layout of memory is roughly:

Memory (Stack and Heap)

Dynamic Memory Allocation (and Multi-Dimensional Arrays)

CS Programming In C

IMPORTANT QUESTIONS IN C FOR THE INTERVIEW

Dynamic memory. EECS 211 Winter 2019

For Teacher's Use Only Q No Total Q No Q No

Memory Allocation I. CSE 351 Autumn Instructor: Justin Hsia

Processes. Johan Montelius KTH

Generating Programs and Linking. Professor Rick Han Department of Computer Science University of Colorado at Boulder

Intermediate Programming, Spring 2017*

Stack overflow exploitation

CMSC 341 Lecture 2 Dynamic Memory and Pointers

CSCI 171 Chapter Outlines

A process. the stack

COSC Software Engineering. Lecture 16: Managing Memory Managers

Lecture 4 September Required reading materials for this class

Important From Last Time

Data Types and Computer Storage Arrays and Pointers. K&R, chapter 5

A software view. Computer Systems. The Compilation system. How it works. 1. Preprocesser. 1. Preprocessor (cpp)

Pointers. Memory. void foo() { }//return

Quiz 0 Review Session. October 13th, 2014

QUIZ. What are 3 differences between C and C++ const variables?

Dynamic Data Structures. CSCI 112: Programming in C

A Fast Review of C Essentials Part I

Hacking in C. Memory layout. Radboud University, Nijmegen, The Netherlands. Spring 2018

Content. In this chapter, you will learn:

Vector and Free Store (Vectors and Arrays)

Lecture 03 Bits, Bytes and Data Types

CSCI-243 Exam 1 Review February 22, 2015 Presented by the RIT Computer Science Community

C: Pointers, Arrays, and strings. Department of Computer Science College of Engineering Boise State University. August 25, /36

CMPS 105 Systems Programming. Prof. Darrell Long E2.371

Dynamic Memory Management. Bin Li Assistant Professor Dept. of Electrical, Computer and Biomedical Engineering University of Rhode Island

United States Naval Academy Electrical and Computer Engineering Department EC310-6 Week Midterm Spring AY2017

Motivation was to facilitate development of systems software, especially OS development.

C Pointers. Abdelghani Bellaachia, CSCI 1121 Page: 1

printf( Please enter another number: ); scanf( %d, &num2);

Lecture 3: C Programm

CMPE-013/L. Introduction to C Programming

Memory Management I. two kinds of memory: stack and heap

ECE 551D Spring 2018 Midterm Exam

EL2310 Scientific Programming

Executables and Linking. CS449 Spring 2016

3.Constructors and Destructors. Develop cpp program to implement constructor and destructor.

Page 1. Today. Important From Last Time. Is the assembly code right? Is the assembly code right? Which compiler is right?

CSE 333 Midterm Exam 5/10/13

Lectures 5-6: Introduction to C

Dynamically Allocated Memory in C

Operating systems. Lecture 9

Transcription:

Memory and C/C++ modules From Reading #5 and mostly #6 More OOP topics (templates; libraries) as time permits later

Program building l Have: source code human readable instructions l Need: machine language program binary instructions and associated data regions, ready to be executed l g++/gcc does two basic steps: compile, then link To compile means translate to object code To link means to combine with other object code (including library code) into an executable program mypgm.cpp (source code) Compile mypgm.o (object code) Link mypgm (executable)

Link combines object codes l From multiple source files and/or libraries e.g., always libc.a mypgm.c (source code) Compile mypgm.o (object code) Link mypgm (executable) libc.a (library file) Link l Use -c option with gcc/g++ to stop after creating.o file -bash-4.2$ gcc -c mypgm.c ; ls mypgm* mypgm.c mypgm.o Is necessary to compile a file without a main function l Later link it to libraries alone or with other object files: -bash-4.2$ gcc -o mypgm mypgm.o ; ls mypgm* mypgm mypgm.c mypgm.o

Compiling: 3 steps with C/C++ mypgm.c (source code) "Compile" mypgm.o (object code) Preprocess (source code with preproc. subsitutions) Compile mypgm.s (assembly code) Assemble l First the preprocessor runs Creates temporary source code with text substitutions as directed Use gcc -E (or just cpp) to run it alone output goes to stdout l Then the source is actually compiled to assembly code Use gcc -S to stop at this step and save code in.s file l Last, assembler produces the object code (machine language)

More about the C preprocessor l Create a symbolic constant by #define e.g., #define KPM 1.609344 l Create a macro by #define with parameters e.g., #define TRIPLE(a) (3 * (a)) l Notice macro text not just 3 * a Then evaluate TRIPLE(2 + 4) for instance l Also #a in macro text converts a to a string, while a ## b concatenates a and b to a string l See/try demos/macros

Compiling and linking source file 1 object file 1 source file 2 compilation object file 2 library object file 1 linking (relocation + linking) load file source file N object file N library object file M Usually performed by gcc/g++ in one uninterrupted sequence

Layout of C/C++ programs object 1 definition object 2 definiton function 1 static object 5 definition object 3 definition function 2 object 4 definition function 3 static object 5 definition... Source code ß becomes Object module à Header section Machine code section (a.k.a. text section) Initialized data section Symbol table section Relocation information section

A sample C program demo.c #include <stdio.h> int a[10]={0,1,2,3,4,5,6,7,8,9}; int b[10]; void main(){ int i; static int k = 3; } for(i = 0; i < 10; i++) { printf("%d\n",a[i]); b[i] = k*a[i]; } l Has text section of course: the machine code l Has initialized global data: a l Uninitialized global data: b l Static data: k l Has a local variable: i

A possible structure of demo.o Offset Contents Comment Header section 0 124 number of bytes of Machine code section 4 44 number of bytes of initialized data section 8 40 number of bytes of Uninitialized data section (array b[]) (not part of this object module) 12 60 number of bytes of Symbol table section 16 44 number of bytes of Relocation information section Machine code section (124 bytes) 20 X code for the top of the for loop (36 bytes) 56 X code for call to printf() (22 bytes) 68 X code for the assignment statement (10 bytes) 88 X code for the bottom of the for loop (4 bytes) 92 X code for exiting main() (52 bytes) Initialized data section (44 bytes) 144 0 beginning of array a[] 148 1 : 176 8 180 9 end of array a[] (40 bytes) 184 3 variable k (4 bytes) Symbol table section (60 bytes) 188 X array a[] : offset 0 in Initialized data section (12 bytes) 200 X variable k : offset 40 in Initialized data section (10 bytes) 210 X array b[] : offset 0 in Uninitialized data section (12 bytes) 222 X main : offset 0 in Machine code section (12 bytes) 234 X printf : external, used at offset 56 of Machine code section (14 bytes) Relocation information section (44 bytes) 248 X relocation information Object module contains neither uninitialized data (b), nor any local variables (i)

Reminder: compiling & linking source file 1 object file 1 source file 2 compilation object file 2 library object file 1 linking (relocation + linking) load file source file N object file N library object file M Usually performed by gcc/g++ in one uninterrupted sequence

Linux object file format l ELF stands for Executable and Linking Format A 4-byte magic number followed by a series of named sections l Addresses assume the object file is placed at memory 0 When multiple object files are linked together, we must update the offsets (relocation) l Tools to read contents: objdump and readelf not available on all systems \177ELF.text.rodata.data.bss.symtab.rel.text.rel.data.debug.line Sec<on header table

ELF sections l.text = machine code (compiled program instructions) l.rodata = read-only data l.data = initialized global variables l.bss = block storage start for uninitialized global variables actually just a placeholder that occupies no space in the object file l.symtab = symbol table with information about functions and global variables defined and referenced in the program \177ELF.text.rodata.data.bss.symtab.rel.text.rel.data.debug.line Sec<on header table

ELF Sections (cont.) l.rel.text = list of locations in.text section that need to be modified when linked with other object files l.rel.data = relocation information for global variables referenced but not defined l.debug = debugging symbol table; only created if compiled with -g option l.line = mapping between line numbers in source and machine code in.text; used by debugger programs \177ELF.text.rodata.data.bss.symtab.rel.text.rel.data.debug.line Sec<on header table

Reminder again: linking source file 1 object file 1 source file 2 compilation object file 2 library object file 1 linking (relocation + linking) load file source file N object file N library object file M

Creation of a load module Object Module A Header Section Machine Code Section Initialized data Section Symbol table Section Header Section Machine Code Section Initialized data Section Symbol table Section Object Module B Load Module Header Section Machine Code Section Initialized data Section Symbol table Section l Interleaved from multiple object modules Sections must be relocated l Addresses relative to beginning of a module Necessary to translate from beginnings of object modules l When loaded OS will translate again to absolute es

Loading and memory mapping Header Section Machine Code Section Initialized data Section Symbol table Section load module (logical) space of program 1 loading Code initialized Static data uninitialized Dynamic data Unused logical space Stack memory mapping OPERATING SYSTEM PHYSICAL MEMORY Code Static data Dynamic data Unused logical space Stack memory mapping (logical) space of program 2 Code Static data Dynamic data Unused Logical space Stack (logical) space of program 3 l Includes memory for stack, dynamic data (i.e., free store), and uninitialized global data l Physical memory is shared by multiple programs

From source program to placement in memory during execution source program int a[10]={0,1,2,3,4,5,6,7,8,9}; int b[10]; void main() { int i; static int k = 3; for(i = 0; i < 10; i++) { printf("%d\n",a[i]); b[i] = k*a[i]; }/*endfor*/ }/*end main*/ physical memory code for printf() code for top of for loop code for call to printf() code for b[i] = k*a[i] array a[] array b[] variable k

Dynamic memory allocation Code Code initialized initialized Static data Static data uninitialized uninitialized Dynamic data Dynamic data Unused logical space increment of dynamic data Unused logical space Stack Stack (logical) space of the program OPERATING SYSTEM (logical) space of the program OPERATING SYSTEM PHYSICAL MEMORY Before dynamic memory allocation PHYSICAL MEMORY After dynamic memory allocation

Sections of an executable file Segments:

Variables and objects in memory 'A' 16916 01000001 0100001000010100 l Variables and data objects are data containers with names l The value of the variable is the code stored in the container l To evaluate a variable is to fetch the code from the container and interpret it properly l To store a value in a variable is to code the value and store the code in the container l The size of a variable is the size of its container

Overflow is when a data code is larger than the size of its container l e.g., char i; // just 1 byte int *p = (int*)&i; // legal *p = 1673579060; // result if "big endian" storage: variable i 01001001100101100000001011010100 l If whole space (X) belongs to this program: Seems OK if X does not contain important data for rest of the program s execution Bad results or crash if important data are overwritten l If all or part of X belongs to another process, the program is terminated by the OS for a memory access violation (i.e., segmentation fault) X

More about overflow l Previous slide showed example of "right overflow" result truncated (also warning) 01000001010001 l Compilers handle "left overflow" by truncating too (usually without any warning) Easily happens: unsigned char i = 255; 11111111 i++; // What is the result of this increment? 100000000

Placement & padding word l Compiler places data at word boundaries e.g., word = 4 bytes l Imagine: struct { char a; int b; } x; Classes too x.a data completely ignored, junk padding x.a variable x 01001001 10010110000000101101010001101101 a machine word variable x 0100100110010110000000101101010001101101 a machine word x.b Compilers do it this way x.b Not like this! a machine word a machine word See/try ~mikec/cs32/demos/padding.cpp

Pointers are data containers too l As its value is a memory, we say it "points" to a place in memory l It points at just 1 byte, so it must "know" what data type starts at that How many bytes? How to interpret the bits? l Question: What is stored in the 4 bytes at es 802340..802343 in the diagram at right? Continued next slide 8090346 8090346 int* p byte with 8090346 byte with 8090346 integer "data container"...0101 010000010100001001000011010001001100... 802340 802341 802342 802343

...0101 010000010100001001000011010001001100... What is? 802340 802341 802342 802343 l Could be four chars: A, B, C, D l Or it could be two shorts: 16961, 17475 All numerical values shown here are for a "little endian" machine (more about endian next slide) l Maybe it s a long or an int: 1145258561 l It could be a floating point number too: 781.035217 802340...0101010000010100001001000011010001001100... 802340 char* b ASCII code for 'A' 802340...0101 010000010100001001000011010001001100... 802340 short* s 802340 binary code for short 16916 (on a little endian machine)...010101000001010000100100001101000100 1100... 802340 int* p 802340 binary code for int 1145258561 (on a little endian machine)...0101 01000001010000100100001101000100 1100... 802340 float* f binary code for float 781.035217 (on a little endian machine)

Beware: two different byte orders l Matters to actual value of anything but chars l Say: short int x = 1; l On a big endian machine it looks like this: 00000000 00000001 Some Macs, JVM, TCP/IP "Network Byte Order" l On a little endian machine it looks like this: 00000001 00000000 Intel, most communication hardware l Only important when dereferencing pointers See/try ~mikec/cs32/demos/endian.c

Dynamic memory allocation l OS memory manager (OSMM) allocates large blocks at a time to individual processes l A process memory manager (PMM) then takes over Operating System large memory blocks OS memory manager large memory blocks Process memory management Process 1 Process memory management Process 2

Memory management by OSMM l Essentially, a simple "accounting" of what process owns what part(s) of the memory l Memory allocation like making an entry in the accounting "book" that this segment is given to this process for keeps l Memory deallocation an entry that this segment is no longer needed (process died), so it s "free" l OSMM usually keeps track of allocated memory blocks in a binary heap, to quickly search for suitable free blocks hence the name "system heap" (traditionally called "free store" in C++)

PMM handles a process s memory l A "middle manager" intermediary to OSMM l Usually keeps a dynamic list of free segments l When program requests more memory PMM searches its list for a suitable segment l If none found, asks OSMM for another block OSMM searches its heap and delivers a block Then PMM carves out a suitable segment l Can be a significant time delay while all this goes on which can slow performance if a program makes many allocation requests

Dynamic memory in C programs l Use C standard functions all in <stdlib.h> All use void* means "any type" no dereferencing void *malloc(size_t size); Get at least size bytes; contents are arbitrary! void *calloc(size_t n, size_t elsize); Get at least n*elsize bytes; contents cleared! void *realloc(void *ptr, size_t size); Changes size of existing segment (at ptr) IMPORTANT: ptr must have come by malloc or calloc And beware dangling pointers if data must be moved l To deallocate, use void free(void *ptr);

Easier, better in C++ programs l Allocate memory by operator new Easier than malloc and other C functions: just need to specify type object s size is known Better than the C functions: also calls a constructor to create the object properly l Operator delete returns memory to the free store that was allocated by new Also calls class destructor to keep things neat Use delete[] if deallocating an array

Dynamic arrays of C++ objects l MyClass *array = new MyClass[5]; Creates an array of 5 MyClass objects Returns a pointer to the first object l Default ctor is called for every object l No way to call a different constructor So class must have a no-argument ctor l delete [] array; Calls dtor on all 5 objects ~mikec/cs32/demos/ dynarray.cpp

Using memory all over the place! l Fairly simple in C: an object is either in static memory, or on stack, or on heap l C++ objects can "be" more than one place! l So important in C++ to manage memory even for stack objects (with dynamic parts) (about program on Reader p. 190) static memory sample activation frame of doit() sample salutation salutation dynamic memory (heap) h e y '\0' dynamic memory (heap) h e y '\0'

Don t corrupt the PMM: guidelines l Never pass an to free that was not returned by malloc, calloc, or realloc l Deallocate segments allocated by malloc, calloc, or realloc only by using free l Never pass to delete (or delete[]) that was not previously returned by new l Deallocate segments allocated by new using exclusively delete And exclusively delete[] if array allocated BTW: in general, don t mix C and C++ ways to do things.