Machine Programming 4: Structured Data

Similar documents
Floating Point Stored & operated on in floating point registers Intel GAS Bytes C Single s 4 float Double l 8 double Extended t 10/12/16 long double

Machine-Level Prog. IV - Structured Data

Systems I. Machine-Level Programming VII: Structured Data

Introduction to Computer Systems , fall th Lecture, Sep. 14 th

Assembly IV: Complex Data Types. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Basic Data Types. Lecture 6A Machine-Level Programming IV: Structured Data. Array Allocation. Array Access Basic Principle

CS241 Computer Organization Spring Loops & Arrays

Data Structures in Memory!

CISC 360. Machine-Level Programming IV: Structured Data Sept. 24, 2008

Machine-Level Programming IV: Structured Data

Basic Data Types The course that gives CMU its Zip! Machine-Level Programming IV: Data Feb. 5, 2008

The Hardware/So=ware Interface CSE351 Winter 2013

Integral. ! Stored & operated on in general registers. ! Signed vs. unsigned depends on instructions used

Page 1. Basic Data Types CISC 360. Machine-Level Programming IV: Structured Data Sept. 24, Array Access. Array Allocation.

Basic Data Types The course that gives CMU its Zip! Machine-Level Programming IV: Structured Data Sept. 18, 2003.

Referencing Examples

Giving credit where credit is due

Giving credit where credit is due

Lecture 7: More procedures; array access Computer Architecture and Systems Programming ( )

h"p://news.illinois.edu/news/12/0927transient_electronics_johnrogers.html

Computer Systems CEN591(502) Fall 2011

2 Systems Group Department of Computer Science ETH Zürich. Argument Build 3. %r8d %rax. %r9d %rbx. Argument #6 %rcx. %r10d %rcx. Static chain ptr %rdx

Machine-Level Programming IV: Structured Data

Machine Programming 3: Procedures

Machine Representa/on of Programs: Arrays, Structs and Unions. This lecture. Arrays Structures Alignment Unions CS Instructor: Sanjeev Se(a

Basic Data Types. CS429: Computer Organization and Architecture. Array Allocation. Array Access

CS , Fall 2001 Exam 1

CS429: Computer Organization and Architecture

Memory Organization and Addressing

Machine- level Programming IV: Data Structures. Topics Arrays Structs Unions

Assembly IV: Complex Data Types. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Assembly Programmer s View Lecture 4A Machine-Level Programming I: Introduction

CS , Fall 2002 Exam 1

Assembly IV: Complex Data Types

Computer Organization: A Programmer's Perspective

Machine Programming 1: Introduction

Prac*ce problem on webpage

Arrays. CSE 351 Autumn Instructor: Justin Hsia

1.'(*2#3%* 2+6.)* !!"##$%$&'()&*$! +',-.$+'/-$! #0!"#$1"&2"/#',$! &'!)&,21'$3)*!40*,$ ! 5*'6728'*,20*"#$! 9)#46728'*,20*"#$:*',('7;$!

Arrays and Structs. CSE 351 Summer Instructor: Justin Hsia. Teaching Assistants: Josie Lee Natalie Andreeva Teagan Horkan.

Arrays. CSE 351 Autumn Instructor: Justin Hsia

Assembly I: Basic Operations. Computer Systems Laboratory Sungkyunkwan University

Introduction to Computer Systems. Exam 1. February 22, This is an open-book exam. Notes are permitted, but not computers.

Carnegie Mellon. Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition

Machine-Level Programming I: Introduction Jan. 30, 2001

Assembly I: Basic Operations. Jo, Heeseung

ASSEMBLY I: BASIC OPERATIONS. Jo, Heeseung

Introduction to Computer Systems. Exam 1. February 22, Model Solution fp

EECS 213 Fall 2007 Midterm Exam

u Arrays One-dimensional Multi-dimensional (nested) Multi-level u Structures Allocation Access Alignment u Floating Point

CMSC 313 Fall2009 Midterm Exam 2 Section 01 Nov 11, 2009

CAS CS Computer Systems Spring 2015 Solutions to Problem Set #2 (Intel Instructions) Due: Friday, March 20, 1:00 pm

ASSEMBLY III: PROCEDURES. Jo, Heeseung

Assembly III: Procedures. Jo, Heeseung

Turning C into Object Code Code in files p1.c p2.c Compile with command: gcc -O p1.c p2.c -o p Use optimizations (-O) Put resulting binary in file p

1 /* file cpuid2.s */ 4.asciz "The processor Vendor ID is %s \n" 5.section.bss. 6.lcomm buffer, section.text. 8.globl _start.

Homework. In-line Assembly Code Machine Language Program Efficiency Tricks Reading PAL, pp 3-6, Practice Exam 1

4) C = 96 * B 5) 1 and 3 only 6) 2 and 4 only

CS241 Computer Organization Spring 2015 IA

Process Layout and Function Calls

UW CSE 351, Winter 2013 Midterm Exam

CS , Fall 2004 Exam 1

CISC 360. Machine-Level Programming I: Introduction Sept. 18, 2008

Assembly level Programming. 198:211 Computer Architecture. (recall) Von Neumann Architecture. Simplified hardware view. Lecture 10 Fall 2012

Representation of Information

CS213. Machine-Level Programming III: Procedures

Assembly III: Procedures. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

CMSC 313 Lecture 12 [draft] How C functions pass parameters

Machine- Level Programming IV: x86-64 Procedures, Data

CPEG421/621 Tutorial

The course that gives CMU its Zip! Machine-Level Programming I: Introduction Sept. 10, 2002

Machine-level Programs Data

CSC 252: Computer Organization Spring 2018: Lecture 9

Procedure Calls. Young W. Lim Sat. Young W. Lim Procedure Calls Sat 1 / 27

Assembly Language: Function Calls" Goals of this Lecture"

Machine-level Programming (3)

15-213/18-243, Fall 2010 Exam 1 - Version A

Process Layout, Function Calls, and the Heap

Machine Level Programming: Arrays, Structures and More

Arrays. Young W. Lim Mon. Young W. Lim Arrays Mon 1 / 17

CMSC 313 Lecture 12. Project 3 Questions. How C functions pass parameters. UMBC, CMSC313, Richard Chang

CS 33: Week 3 Discussion. x86 Assembly (v1.0) Section 1G

CIS 2107 Computer Systems and Low-Level Programming Spring 2012 Final

Assembly Language: Function Calls" Goals of this Lecture"

CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING

Systems I. Machine-Level Programming I: Introduction

Machine-Level Programming Introduction

IA32 Processors The course that gives CMU its Zip! Machine-Level Programming I: Introduction Sept. 10, X86 Evolution: Programmer s View

Assembly Language: Function Calls

administrivia today start assembly probably won t finish all these slides Assignment 4 due tomorrow any questions?

You may work with a partner on this quiz; both of you must submit your answers.

The course that gives CMU its Zip! Machine-Level Programming III: Procedures Sept. 17, 2002

CS 201 Winter 2014 (Karavanic) Final Exam

Question 4.2 2: (Solution, p 5) Suppose that the HYMN CPU begins with the following in memory. addr data (translation) LOAD 11110

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 21: Generating Pentium Code 10 March 08

Building an Executable

Meet & Greet! Come hang out with your TAs and Fellow Students (& eat free insomnia cookies) When : TODAY!! 5-6 pm Where : 3rd Floor Atrium, CIT

Machine-Level Programming III: Procedures

Lab 10: Introduction to x86 Assembly

CS 31: Intro to Systems Functions and the Stack. Martin Gagne Swarthmore College February 23, 2016

Transcription:

Machine Programming 4: Structured Data CS61, Lecture 6 Prof. Stephen Chong September 20, 2011

Announcements Assignment 2 (Binary bomb) due Thursday We are trying out Piazza to allow class-wide questions and discussions Go to http://piazza.com/harvard/fall2011/cs61 to sign up and join in Google form to help you form a study group or find a partner See course web page for links Welcome to discuss problems with your classmates, and to collaborate in planning and thinking through solutions. For Assignments 1 3, graded individually For Assignments 4 6, allowed to work in pairs; pair shares grade More info on Course Policies page 2

Topics for today Structured data Arrays Multidimensional arrays Multi-level arrays Structs Memory alignment Arrays of structs 3

Basic data types Integer data types Stored & operated on in general (integer) registers Signed vs. unsigned depends on instructions used Floating Point Stored & operated on in floating point registers C declaration Intel data type Assembly code suffix 32-bit 64-bit char Byte b 1 1 short int Word w 2 2 int Double word l 4 4 long int Quad word q 4 8 float Single precision s 4 4 double Double precision d 8 8 long double Extended precision t 10/12 10/16 4

Array allocation T A[L]; Declares array of data type T and length L e.g., int foo[5] Represented as contiguously allocated region of L sizeof(t) bytes int foo[5]; foo[0] = 0; foo[1] = 2; foo[2] = 1; foo[3] = 3; foo[4] = 8; 0 2 1 3 8 x x+4 x+8 x+12 x+16 x+20 5

Array access Given T A[L]; A[i] accesses the ith element of the array Identifier A can be used as a pointer to array element 0 Type of A is T * E.g., int foo[5]; *foo = 4; is equivalent to foo[0] = 4; 6

Array notation int foo[5]; 0 2 1 3 8 x x+4 x+8 x+12 x+16 x+20 Expression Type Value foo[2] int 1 foo int * x foo+1 int * x+4 &foo[2] int * x+8 foo[5] int??? *(foo+1) int 2 foo + 3 int * x + 4 3 7

Array notation int foo[5]; 0 2 1 3 8 x x+4 x+8 x+12 x+16 x+20 Expression Type Value foo[2] int 1 foo int * x foo+1 int * x+4 &foo[2] int * x+8 foo[5] int??? *(foo+1) int 2 foo + 3 int * x + 4 3 8

Array example typedef int zip_dig[5]; zip_dig hvd = { 0, 2, 1, 3, 8 }; zip_dig cmu = { 1, 5, 2, 1, 3 }; zip_dig cor = { 1, 4, 8, 5, 3 }; zip_dig hvd; 0 2 1 3 8 16 20 24 28 32 36 zip_dig cmu; 1 5 2 1 3 36 40 44 48 52 56 zip_dig cor; 1 4 8 5 3 Note: declaration zip_dig hvd equivalent to int hvd[5] Example arrays were allocated in successive 20 byte blocks Not guaranteed to be true! 56 60 64 68 72 76 9

Accessing arrays typedef int zip_dig[5]; int get_digit(zip_dig z, int dig) { return z[dig]; } push mov %ebp %esp,%ebp mov 0xc(%ebp),%eax # %eax = dig mov 0x8(%ebp),%edx # %edx = z mov (%edx,%eax,4),%eax # %eax = z[dig] pop ret %ebp %edx contains starting address of array %eax contains the array index Desired value at %edx + (%eax * 4) 10

Referencing examples zip_dig hvd; 0 2 1 3 8 zip_dig cmu; 16 20 24 28 32 36 1 5 2 1 3 36 40 44 48 52 56 zip_dig cor; 1 4 8 5 3 56 60 64 68 72 76 Expression Address Value Guaranteed to work? cmu[3] 36 + 4*3 = 48 1 Yes cmu[5] 36 + 4*5 = 56 1 No cmu[-1] 36 + 4*-1 = 32 8 No hvd[15] 16+4*15 = 76?? No 11

Referencing examples zip_dig hvd; 0 2 1 3 8 zip_dig cmu; 16 20 24 28 32 36 1 5 2 1 3 36 40 44 48 52 56 zip_dig cor; 1 4 8 5 3 56 60 64 68 72 76 Expression Address Value Guaranteed to work? cmu[3] 36 + 4*3 = 48 1 Yes cmu[5] 36 + 4*5 = 56 1 No cmu[-1] 36 + 4*-1 = 32 8 No hvd[15] 16+4*15 = 76?? No 12

Looping over an array Original source int zd2int(zip_dig z) { int i; int zi = 0; for (i = 0; i < 5; i++) { zi = 10 * zi + z[i]; } return zi; } What gcc produces (pseudocode) Eliminate loop variable i Convert array code to pointer code Rewrite as do-while loop int zd2int(zip_dig z) { int zi = 0; int *zend = z + 4; do { zi = 10 * zi + *z; z++; } while (z <= zend); return zi; } 13

Looping over array in machine code int zd2int(zip_dig z) { int zi = 0; int *zend = z + 4; do { zi = 10 * zi + *z; z++; } while (z <= zend); return zi; } Notes: xorl %eax,%eax sets %eax to zero leal used for arithmetic 4 added to %ecx each iteration... # %ecx = z xorl %eax,%eax # zi = 0 leal 16(%ecx),%ebx # zend = z+4.l59: leal (%eax,%eax,4),%edx # edx = zi + 4*zi = 5*zi movl (%ecx),%eax # zi = *z addl $4,%ecx # z++ leal (%eax,%edx,2),%eax # zi = *z + 2*(5*zi) cmpl %ebx,%ecx # compare jle.l59 # if z <= zend, loop 14

Strings x86 doesn t have an internal notion of a string datatype. In C, a string is represented as an array of char, terminated by a 0 byte Each character is one ASCII encoded byte E.g., this is a string 0x74 0x68 0x69 0x73 0x20 0x69 0x73 0x20 0x61 0x20 0x73 0x74 0x72 0x69 0x6e 0x67 0x00 0x74 is ASCII encoding of t 0x0 terminates string Different languages use different string representations E.g., Java stores strings in Unicode format (1-4 bytes per character) 15

Topics for today Structured data Arrays Multidimensional arrays Multi-level arrays Structs Memory alignment Arrays of structs 16

Multidimensional arrays T A[R][C]; Two dimensional array of datatype T R rows and C columns Represented as contiguously allocated region of R C sizeof(t) bytes Row major ordering: rows stored one after the other int cambridge_zips[4][5] = {{0, 2, 1, 3, 8}, {0, 2, 1, 3, 9 }, {0, 2, 1, 4, 0 }, {0, 2, 1, 4, 1 }}; 0 2 1 3 8 0 2 1 3 9 0 2 1 4 0 0 2 1 4 1 x x+20 x+40 x+60 x+80 17

Multidimensional arrays typedef int zip_dig[5]; zip_dig cambridge_zips[4] is equivalent to int cambridge_zips[4][5] zip_dig cambridge_zips[4]: array of 4 elements, allocated continuously Each element is an array of 5 ints, allocated continuously int cambridge_zips[4][5] = {{0, 2, 1, 3, 8}, {0, 2, 1, 3, 9 }, {0, 2, 1, 4, 0 }, {0, 2, 1, 4, 1 }}; 0 2 1 3 8 0 2 1 3 9 0 2 1 4 0 0 2 1 4 1 x x+20 x+40 x+60 x+80 18

Accessing a single array element T A[R][C]; A[i][j] is a single element of type T Address is A + i * C * sizeof(t) + j*sizeof(t) E.g., int foo[3][4]; x x+4 x+8 x+12 x+16 x+20 x+24 x+28 x+32 x+36 x+40 x+44 &foo[2][1] evaluates to x + 2 * 4 * sizeof(int) + 1 * sizeof(int) = x + 32 + 4 = x + 36 19

Machine code for array access int cambridge_zips[4][5] = {... }; int get_zip_digit(int index, int digit) { return cambridge_zips[index][digit]; }... # %ecx = digit # %eax = index leal 0(,%ecx,4),%edx leal (%eax,%eax,4),%eax movl cambridge_zips(%edx,%eax,4),%eax... # %edx = 4*digit # %eax = 5*index # return contents of # cambridge_zips + 4*digit # + 4*5*index 20

Strange referencing examples int zips[4][5]; 0 2 1 3 8 0 2 1 3 9 0 2 1 4 0 0 2 1 4 1 x x+20 x+40 x+60 x+80 Expression Address Value Guaranteed to work? zips[3][3] x + 20*3 + 4*3 = x+72 4 Yes zips[2][5] x + 20*2 + 4*5 = x+60 0 Yes zips[2][-1] x + 20*2 + 4*-1 = x+36 9 zips[4][-1] x + 20*4 + 4*-1 = x+76 1 zips[0][19] x + 20*0 + 4*19 = x+76 1 Yes Yes Yes zips[0][-1] x + 0*2 + 4*-1 = x-4??? No 21

Strange referencing examples int zips[4][5]; 0 2 1 3 8 0 2 1 3 9 0 2 1 4 0 0 2 1 4 1 x x+20 x+40 x+60 x+80 Expression Address Value Guaranteed to work? zips[3][3] x + 20*3 + 4*3 = x+72 4 Yes zips[2][5] x + 20*2 + 4*5 = x+60 0 Yes zips[2][-1] x + 20*2 + 4*-1 = x+36 9 zips[4][-1] x + 20*4 + 4*-1 = x+76 1 zips[0][19] x + 20*0 + 4*19 = x+76 1 Yes Yes Yes zips[0][-1] x + 0*2 + 4*-1 = x-4??? No 22

Topics for today Structured data Arrays Multidimensional arrays Multi-level arrays Structs Memory alignment Arrays of structs 23

Multi-level array typedef int zip_dig[5]; zip_dig hvd = { 0, 2, 1, 3, 8 }; zip_dig cmu = { 1, 5, 2, 1, 3 }; zip_dig cor = { 1, 4, 8, 5, 3 }; int *univ[3] = {cmu, hvd, cor}; univ is array of 3 elements Each element is a pointer (4 bytes) Each pointer points to an array of ints univ 36 16 56 160 164 168 172 zip_dig hvd; zip_dig cmu; 0 2 1 3 8 16 20 24 28 32 36 1 5 2 1 3 36 40 44 48 52 56 zip_dig cor; 1 4 8 5 3 56 60 64 68 72 76 24

Multilevel array element access int cambridge_zips[4][5] = {... }; int get_digit(int index, int digit) { return cambridge_zips[index][digit]; } int *univ[3] = {cmu, hvd, cor}; int get_univ_digit (int index, int dig) { return univ[index][dig]; } Access looks similar, but evaluation is quite different!

Multilevel array element access int cambridge_zips[4][5] = {... }; int get_zip_digit(int index, int digit) { return cambridge_zips[index][digit];... } # %ecx = digit One memory read # %eax = index leal 0(,%ecx,4),%edx # %edx = 4*digit leal (%eax,%eax,4),%eax # %eax = 5*index movl cambridge_zips(%edx,%eax,4),%eax # return Mem[cambridge_zips + 20*index + 4*dig]... int *univ[3] = {cmu, hvd, cor}; int get_univ_digit (int index, int dig) { return univ[index][dig]; } Two memory reads - First get pointer to array - Then access array... # %ecx = index # %eax = dig leal 0(,%ecx,4),%edx # %edx = 4*index movl univ(%edx),%edx # %edx = Mem[univ + 4*index] movl (%edx,%eax,4),%eax # Mem[%edx + 4*dig]... 26

Strange referencing examples Expression Address Value Guaranteed to work? univ[2][3] 56 + 4*3 = 68 5 Yes univ[1][5] 16 + 4*5 = 36 1 No univ[2][-1] 56 + 4*-1 = 52 3 No univ[3][-1]?????? No univ[1][12] 16 + 4*12 = 64 8 No 27

Strange referencing examples Expression Address Value Guaranteed to work? univ[2][3] 56 + 4*3 = 68 5 Yes univ[1][5] 16 + 4*5 = 36 1 No univ[2][-1] 56 + 4*-1 = 52 3 No univ[3][-1]?????? No univ[1][12] 16 + 4*12 = 64 8 No 28

Topics for today Structured data Arrays Multidimensional arrays Multi-level arrays Structs Memory alignment Arrays of structs 29

Structs A struct groups objects into a single object Each field accessed by name Each field can have a different type Declare a struct datatype Use fields of a struct Convenient way to initialize fields of struct Equivalent struct rect { int llx; int lly; int width; int height; char *name; }; struct rect r; r.llx = r.lly = 0; struct rect s = {0, 0, 10, 20, Rodney }; struct rect *rp = &s; rp->height = rp->width; (*rp).height = (*rp).width; 30

Implementing structs Struct represented as contiguous region of memory E.g., struct rec { int i; int a[3]; int *p; }; i a p 0 4 8 12 16 20 Compiler generates code using appropriate offsets struct rec *myrec; int someint; void demo_struct() { myrec->i = 42; myrec->a[0] = 43; myrec->p = &someint; } demo_struct: pushl %ebp movl %esp, %ebp movl myrec, %eax movl $42, (%eax) movl $43, 4(%eax) movl $someint, 16(%eax) popl %ebp ret 31

Memory alignment Many systems require data to be aligned in memory E.g., Linux/x86 requires integers to be stored at memory address that is multiple of 4 bytes Why? 32-bit machines typically read and write 4 bytes of memory at a time from word aligned addresses If not aligned, reading an int from memory may require two memory accesses! 32

Memory alignment Some processors require aligned access If you try to access an int not stored on an aligned address, would get an alignment fault Intel x86 does not require memory access to be aligned However, Intel strongly recommends that compilers generate code that uses aligned access, to get the best performance. Different OS's and compilers have different rules Linux: 2-byte types aligned to 2 byte boundaries, 4 byte and larger types aligned to 4 byte boundaries Windows: primitive object of K bytes must have an address that is a multiple of K 33

Alignment within structs Each field in struct must be aligned correctly Compiler may have to insert padding within struct E.g., using Windows alignment rules Field i must be aligned on 4 byte address Field v must be aligned on 8 byte address struct S { char c; int i[2]; double v; }; c i[0] i[1] v x x+4 x+8 x+16 x+24 sizeof(struct S) is 24, not 1+4+4+8 = 17 34

Save memory! struct T1 { char c; double v; char d; int i; }; 10 bytes wasted space (Windows alignment) c v d i x x+1 x+8 x+16 x+20 x+24 struct T2 { double v; char c; char d; int i; }; 2 bytes wasted space (Windows alignment) v c d i x x+8 x+12 x+16 35

Alignment of whole structs E.g., struct U { float x[2]; int i[2]; char c; }; x[0] x[1] i[0] i[1] x x+4 x+8 x+12 x+16 c No padding needed within struct But what about struct U foo[2]? x[0] x[1] i[0] i[1] c x[0] x[1] x x+17 Misaligned! 36

Padding at end of struct To ensure that arrays of structs are correctly aligned, may need to pad at end of struct E.g., struct U { float x[2]; int i[2]; char c; }; sizeof(u) must be multiple of 4 (Windows and Linux) x[0] x[1] i[0] i[1] c x x+4 x+8 x+12 x+16 x+20 E.g., struct V { double x; int i[2]; char c; }; sizeof(v) must be multiple of 8 for Windows, 4 for Linux x i[0] i[1] x x+4 x+8 x+12 x+16 c Windows Linux x+24 x+20 37

Next lecture Buffer overruns and stack exploits B3 l33t H4x0r and pwnzorz the interwebz (Use your powers wisely) 38