CSC 2400: Computer Systems. Towards the Hardware: Machine-Level Representation of Programs

Similar documents
CSC 8400: Computer Systems. Machine-Level Representation of Programs

Towards the Hardware"

X86 Addressing Modes Chapter 3" Review: Instructions to Recognize"

Second Part of the Course

Assembly Language: Overview!

Assembly Language: Overview

Assembly Language: IA-32 Instructions

CS 31: Intro to Systems ISAs and Assembly. Kevin Webb Swarthmore College February 9, 2016

CS 31: Intro to Systems ISAs and Assembly. Kevin Webb Swarthmore College September 25, 2018

Program Exploitation Intro

The Hardware/Software Interface CSE351 Spring 2013

Lecture 15 Intel Manual, Vol. 1, Chapter 3. Fri, Mar 6, Hampden-Sydney College. The x86 Architecture. Robb T. Koether. Overview of the x86

administrivia today start assembly probably won t finish all these slides Assignment 4 due tomorrow any questions?

Sungkyunkwan University

The x86 Architecture

ASSEMBLY II: CONTROL FLOW. Jo, Heeseung

Assembly II: Control Flow. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

CSE351 Spring 2018, Midterm Exam April 27, 2018

COMPUTER ENGINEERING DEPARTMENT

Practical Malware Analysis

CS 31: Intro to Systems ISAs and Assembly. Martin Gagné Swarthmore College February 7, 2017

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 21: Generating Pentium Code 10 March 08

Module 3 Instruction Set Architecture (ISA)

complement) Multiply Unsigned: MUL (all operands are nonnegative) AX = BH * AL IMUL BH IMUL CX (DX,AX) = CX * AX Arithmetic MUL DWORD PTR [0x10]

Machine-Level Programming II: Control and Arithmetic

Ex: Write a piece of code that transfers a block of 256 bytes stored at locations starting at 34000H to locations starting at 36000H. Ans.

Machine Programming 2: Control flow

Lab 3. The Art of Assembly Language (II)

Machine Level Programming II: Arithmetic &Control

Machine-Level Programming II: Arithmetic & Control. Complete Memory Addressing Modes

Process Layout and Function Calls

CPS104 Recitation: Assembly Programming

CS241 Computer Organization Spring 2015 IA

Credits to Randy Bryant & Dave O Hallaron

Basic Assembly Instructions

CS241 Computer Organization Spring Addresses & Pointers

Assembly II: Control Flow. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

CS61 Section Solutions 3

Machine-Level Programming II: Arithmetic & Control /18-243: Introduction to Computer Systems 6th Lecture, 5 June 2012

Machine-Level Programming II: Control Flow

SPRING TERM BM 310E MICROPROCESSORS LABORATORY PRELIMINARY STUDY

Assembly II: Control Flow

mith College Computer Science CSC231 Assembly Week #9 Spring 2017 Dominique Thiébaut

CSE2421 FINAL EXAM SPRING Name KEY. Instructions: Signature

EECE.3170: Microprocessor Systems Design I Summer 2017 Homework 4 Solution

SOEN228, Winter Revision 1.2 Date: October 25,

Function Calls COS 217. Reading: Chapter 4 of Programming From the Ground Up (available online from the course Web site)

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2016 Lecture 12

Basic Pentium Instructions. October 18

CNIT 127: Exploit Development. Ch 1: Before you begin. Updated

Lab 6: Conditional Processing

Introduction to 8086 Assembly

Assembly Language: Part 2

Introduction to IA-32. Jo, Heeseung

Lecture (08) x86 programming 7

x86 Assembly Tutorial COS 318: Fall 2017

Reverse Engineering II: The Basics

INTRODUCTION TO IA-32. Jo, Heeseung

Turning C into Object Code Code in files p1.c p2.c Compile with command: gcc -O p1.c p2.c -o p Use optimizations (-O) Put resulting binary in file p

Reverse Engineering II: Basics. Gergely Erdélyi Senior Antivirus Researcher

An Introduction to x86 ASM

CSE P 501 Compilers. x86 Lite for Compiler Writers Hal Perkins Autumn /25/ Hal Perkins & UW CSE J-1

Where We Are. Optimizations. Assembly code. generation. Lexical, Syntax, and Semantic Analysis IR Generation. Low-level IR code.

16.317: Microprocessor Systems Design I Fall 2014

Complex Instruction Set Computer (CISC)

UMBC. A register, an immediate or a memory address holding the values on. Stores a symbolic name for the memory location that it represents.

16.317: Microprocessor Systems Design I Fall 2015

9/25/ Software & Hardware Architecture

Reverse Engineering II: The Basics

ADDRESSING MODES. Operands specify the data to be used by an instruction

Credits and Disclaimers

Instruction Set Architectures

Review addressing modes

Lecture 8: Control Structures. Comparing Values. Flags Set by CMP. Example. What can we compare? CMP Examples

Inline Assembler. Willi-Hans Steeb and Yorick Hardy. International School for Scientific Computing

PESIT Bangalore South Campus

Chapter 3 Machine-Level Programming II Control Flow

Instruction Set Architectures

System Programming and Computer Architecture (Fall 2009)

Assembly level Programming. 198:211 Computer Architecture. (recall) Von Neumann Architecture. Simplified hardware view. Lecture 10 Fall 2012

Credits and Disclaimers

EXPERIMENT WRITE UP. LEARNING OBJECTIVES: 1. Get hands on experience with Assembly Language Programming 2. Write and debug programs in TASM/MASM

Binghamton University. CS-220 Spring x86 Assembler. Computer Systems: Sections

Q1: Multiple choice / 20 Q2: Data transfers and memory addressing

Machine-level Representation of Programs. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Reverse Engineering Low Level Software. CS5375 Software Reverse Engineering Dr. Jaime C. Acosta

Low-Level Essentials for Understanding Security Problems Aurélien Francillon

Conditional Processing

mith College Computer Science CSC231 Assembly Week #10 Fall 2017 Dominique Thiébaut

Control flow. Condition codes Conditional and unconditional jumps Loops Switch statements

Chapter 4 Processor Architecture: Y86 (Sections 4.1 & 4.3) with material from Dr. Bin Ren, College of William & Mary

System calls and assembler

CS165 Computer Security. Understanding low-level program execution Oct 1 st, 2015

Rev101. spritzers - CTF team. spritz.math.unipd.it/spritzers.html

EXPERIMENT WRITE UP. LEARNING OBJECTIVES: 1. Get hands on experience with Assembly Language Programming 2. Write and debug programs in TASM/MASM

Computer Architecture and Assembly Language. Practical Session 3

Lecture 2 Assembly Language

We can study computer architectures by starting with the basic building blocks. Adders, decoders, multiplexors, flip-flops, registers,...

Marking Scheme. Examination Paper Department of CE. Module: Microprocessors (630313)

Intel Instruction Set (gas)

Transcription:

CSC 2400: Computer Systems Towards the Hardware: Machine-Level Representation of Programs Towards the Hardware High-level language (Java) High-level language (C) assembly language machine language (IA-32) 1

Compilation Stages count = 0; while (n > 1) { count++; if (n & 1) n = n*3 + 1; n = n/2; High level language mov ECX, 0.loop: cmp EDX, 1 jle.endloop add ECX, 1 mov EAX, EDX and EAX, 1 je. mov EAX, EDX add EDX, EAX add EDX, EAX add EDX, 1 jmp.endif.: sar EDX, 1.endif: jmp.loop.endloop: Assembly language 00000000000000000000000000000000 00000000000000000000000000000000 01100011010101110110001101010111 00101011101011010010101110101101 11010001110111011101000111011101 00101110100111000010111010011100 11010010001111001101001000111100 11010000011111011101000001111101 11010011101010011101001110101001 11010000111111101101000011111110 11010001010000101101000101000010 Machine language Building and Running To build an executable $ gcc program.c o program Result! Complete executable binary file! Machine language 1 0 0 1 0 1 1 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 1 0 0 1 0 1 1 0 1 0 0 1 0 1 1 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 1 0 0 1 0 1 1 0 1 0 0 1 0 1 1 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 1 0 0 1 0 1 1 0 1 0 0 1 0 1 1 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 1 0 0 1 0 1 1 0 1 1 0 1 0 0 1 0 1 1 0 0 1 0 0 1 0 1 1 0 1 0 0 To run: $./program 2

Compiling Into Assembly C Code (code.c) int sum(int x, int y) { int t = x+y; accum += t; return t; Generated Assembly (code.s) sum: push ebp mov ebp, esp mov edx, DWORD PTR [ebp+12] mov eax, DWORD PTR [ebp+8] add eax, edx pop ebp ret Obtained with command gcc masm=intel -S code.c Produces file code.s Lecture Goals Help you learn! Relationship between C and assembly language! IA-32 assembly language through an example Why do you need to know this? 3

Computer Architecture Background CPU and Memory Example of a dual core architecture Address 0 Address 1 Address 2 Address 3 10100111 11010010 11000010 00101001. Memory - The only large storage area that the CPU can access directly - Hence, any program executing must be in memory 4

Central Processing Unit (CPU) CPU Memory Control Unit ALU Addresses Data TEXT DATA HEAP Instructions Registers Control unit! Fetch, decode Arithmetic and logic unit! Execute low-level operations Registers! High-speed temporary storage STACK Central Processing Unit (CPU) Runs the loop Fetch-Decode-Execute START START Fetch Next Instruction Decode Execute Instruction Instruction Execute Execute Instruction Instruction HALT q Fetch the next instruction from memory - Where from? A CPU register called the Instruction Pointer (EIP) holds the address of the instruction to be fetched next q Decode the instruction and fetch the operands q Execute the instruction and store the result 5

Control Unit: Instruction Decoder Determines what operations need to take place! Translate the machine-language instruction Control what operations are done on what data! E.g., control what data are fed to the ALU! E.g., enable the ALU to do multiplication or addition! E.g., read from a particular address in memory src1 src2 operation ALU flag/carry dst IA-32 Architecture CPU Registers Registers EIP EAX EBX ECX EDX ESI EDI ESP EBP Addresses Data Instructions Memory TEXT DATA HEAP STACK Condition Codes CF ZF SF OF EFLAGS EIP ESP, EBP EAX Instruction Pointer Reserved for special use Always contains return value 6

Registers Small amount of storage on the CPU! Can be accessed more quickly than main memory Instructions move data in and out of registers! Loading registers from main memory! Storing registers to main memory Instructions manipulate the register contents! Registers essentially act as temporary variables! For efficient manipulation of the data Registers are the top of the memory hierarchy! Ahead of main memory, disk, tape, C Code vs. Assembly Code 7

Kinds of Instructions count = 0; while (n > 1) { count++; if (n & 1) n = n*3 + 1; n = n/2; Reading and writing data! count = 0! n Arithmetic and logic operations! Increment: count++! Multiply: n * 3! Divide: n/2! Logical AND: n & 1 Checking results of comparisons! Is (n > 1) true or false?! Is (n & 1) non-zero or zero? Changing the flow of control! To the end of the while loop (if n 1 )! Back to the beginning of the loop! To the clause (if n & 1 is 0) Variables in Registers count = 0; while (n > 1) { count++; if (n & 1) n = n*3 + 1; n = n/2; Registers count ECX n EDX mov ECX, DWORD PTR count mov EDX, DWORD PTR n 8

Immediate and Register Addressing count ECX n EDX count=0; while (n>1) { count++; if (n&1) n = n*3+1; n = n/2; Read directly from the instruction mov ECX, 0 add ECX, 1 written to a register General Syntax op Dest, Src Perform operation op on Src and Dest Save result in Dest 9

Immediate and Register Addressing count ECX n EDX count=0; while (n>1) { count++; if (n&1) n = n*3+1; n = n/2; mov EAX, EDX and EAX, 1 Computing intermediate value in register EAX Immediate and Register Addressing count ECX n EDX count=0; while (n>1) { count++; if (n&1) n = n*3+1; n = n/2; mov EAX, EDX add EDX, EAX add EDX, EAX add EDX, 1 Adding n twice is cheaper than multiplication! 10

Immediate and Register Addressing count ECX n EDX count=0; while (n>1) { count++; if (n&1) n = n*3+1; n = n/2; sar EDX, 1 Shifting right by 1 bit is cheaper than division! Changing Program Flow count=0; while (n>1) { count++; if (n&1) n = n*3+1; n = n/2; Cannot simply run next instruction! Check result of a previous operation! Jump to appropriate next instruction Jump instructions! Load new address in instruction pointer Example jump instructions! Jump unconditionally (e.g., )! Jump if zero (e.g., n & 1 )! Jump if greater/less (e.g., n > 1 ) 11

Jump Instructions Jump depends on the result of previous arithmetic instruction Jump jmp je jne js jns jg jge jl jle ja jb Description Unconditional Equal / Zero Registers EIP EAX EBX ECX EDX ESI EDI ESP EBP Condition Codes CF ZF SF OF EFLAGS Conditional and Unconditional Jumps Comparison cmp compares two integers! Done by subtracting the first number from the second! Discards the results, but sets flags as a side effect: cmp EDX, 1 (computes EDX 1) jle.endloop (checks whether result was 0 or negative) Logical operation and compares two integers: and EAX, 1 (bit-wise AND of EAX with 1) je. (checks whether result was 0) Also, can do an unconditional branch jmp jmp.endif jmp.loop 12

Jump and Labels: While Loop count ECX n EDX while (n>1) {.loop: cmp EDX, 1 jle.endloop Checking if EDX is less than or equal to 1. jmp.loop.endloop: Jump and Labels: While Loop count ECX n EDX mov ECX, 0 count=0; while (n>1) { count++; if (n&1) n = n*3+1; n = n/2;.loop: cmp EDX, 1 jle.endloop add ECX, 1 mov EAX, EDX and EAX, 1 je. mov EAX, EDX add EDX, EAX add EDX, EAX add EDX, 1 jmp.endif.: sar EDX, 1.endif: jmp.loop.endloop: 13

Jump and Labels: If-Then-Else count ECX n EDX if (n&1)...... block then block mov EAX, EDX and EAX, 1 je. jmp.endif.:.endif: Jump and Labels: If-Then-Else count ECX n EDX count=0; while(n>1) { count++; if (n&1) n = n*3+1; n = n/2; mov ECX, 0.loop: cmp EDX, 1 jle.endloop add ECX, 1 mov EAX, EDX and EAX, 1 je. mov EAX, EDX add EDX, EAX then block add EDX, EAX add EDX, 1 jmp.endif.: sar EDX, 1.endif: jmp.loop.endloop: block 14

Code More Efficient count ECX n EDX mov ECX, 0 count=0; while(n>1) { count++; if (n&1) n = n*3+1; n = n/2; Replace with jmp loop.loop: cmp EDX, 1 jle.endloop add ECX, 1 mov EAX, EDX and EAX, 1 je. mov EAX, EDX add EDX, EAX add EDX, EAX add EDX, 1 jmp.endif.: sar EDX, 1.endif: jmp.loop.endloop: Complete Example count ECX n EDX mov ECX, 0 count=0; while (n>1) { count++; if (n&1) n = n*3+1; n = n/2;.loop: cmp EDX, 1 jle.endloop add ECX, 1 mov EAX, EDX and EAX, 1 je. mov EAX, EDX add EDX, EAX add EDX, EAX add EDX, 1 jmp.endif.: sar EDX, 1.endif: jmp.loop.endloop: 15

Reading IA-32 Assembly Language Referring to a register:! EAX, eax, EBX, ebx, etc. Result stored in the first argument! E.g. mov EAX, EDX moves EDX into EAX! E.g., add ECX, 1 increments register ECX Assembler directives: starting with a period (. )! E.g.,.section.text to start the text section of memory Comment: pound sign ( # )! E.g., # Purpose: Convert lower to upper case X86 à C Example int Example(int x){??? push EBP mov EBP, ESP mov ECX, [EBP+8] xor EAX, EAX xor EDX, EDX cmp EDX, ECX jge L2 L1: add EAX, EDX inc EDX cmp EDX, ECX jl L1 L2: mov ESP,EBP pop EBP ret L1: L2: Write comments # ECX = x # EAX = # EDX = # if # goto L2 # EAX = # EDX = # if # goto L1 16

Name the Variables int Example(int x){??? push EBP mov EBP, ESP mov ECX, [EBP+8] xor EAX, EAX xor EDX, EDX cmp EDX, ECX jge L2 L1: add EAX, EDX inc EDX cmp EDX, ECX jl L1 L2: mov ESP,EBP pop EBP ret L1: L2: EAX à result, EDX à i # ECX = x # result = # i = # if # goto L2 # result = # i = # if # goto L1 Identify the Loop result = 0; i = 0; if (i >= x) goto L2; L1:result += i; i++; if (i < x) goto L1; L2: 17

Identify the Loop result = 0; i = 0; if (i >= x) goto L2; L1:result += i; i++; if (i < x) goto L1; L2: result = 0; i = 0; if (i >= x) goto L2; do { result += i; i++; while (i < x); L2: result = 0; i = 0; while (i < x){ result += i; i++; result = 0; for (i = 0; i < x; i++) result += i; C Code int Example(int x) { int result=0; int i; for (i=0; i<x; i++) result += i; return result; 18

Exercise int F(int x, int y){???.l2: push EBP mov EBP,ESP mov ECX, [EBP+8] mov EDX, [EBP+12] xor EAX, EAX cmp ECX, EDX jle.l1 dec ECX inc EDX inc EAX cmp ECX,EDX jg.l2.l1: inc EAX mov ESP,EBP pop EBP ret.l2:.l1: # ECX = x # EDX = y C Code int F(int x, int y) 19

Instructions to Recognize Arithmetic Instructions (1) Two Operand Instructions ADD Dest, Src Dest = Dest + Src SUB Dest, Src Dest = Dest - Src MUL Dest, Src Dest = Dest * Src SAL Dest, Src Dest = Dest << Src SAR Dest, Src Dest = Dest >> Src Arithmetic SHR Dest, Src Dest = Dest >> Src Logical XOR Dest, Src Dest = Dest ^ Src AND Dest, Src Dest = Dest & Src OR Dest, Src Dest = Dest Src 20

Arithmetic Instructions (2) One Operand Instructions INC Dest Dest = Dest + 1 DEC Dest Dest = Dest 1 NEG Dest Dest = -Dest NOT Dest Dest = ~Dest Compare and Test Instructions CMP Dest, Src Compute Dest - Src without setting Dest TEST Dest, Src Compute Dest & Src without setting Dest 21

Jump Instructions Jump depending on the result of the previous arithmetic instruction: Jump Description jmp je jne js jns jg jge jl jle ja jb Unconditional Equal / Zero Not Equal / Not Zero Negative Nonnegative Greater (Signed) Greater or Equal (Signed) Less (Signed) Less or Equal (Signed) Above (unsigned) Below (unsigned) Loading and Storing Data Memory Addressing Modes 22

IA-32 General Purpose Registers 31 15 8 7 0 AH AL BH BL CH CL DH DL SI DI 16-bit 32-bit AX EAX BX EBX CX ECX DX EDX ESI EDI General-purpose registers C Example: One-Byte Data Global char variable i is in AL, the lower byte of the A register. char i; if (i > 5) { i++; i--; CMP AL, 5 JLE INC AL jmp endif : DEC AL endif: 23

C Example: Four-Byte Data Global int variable i is in EAX, the full 32 bits of the A register. int i; if (i > 5) { i++; i--; CMP EAX, 5 JLE INC EAX JMP endif : DEC EAX endif: Loading and Storing Data Processors have many ways to access data! Known as addressing modes! Two simple ways seen in previous examples Addressing Modes! Register addressing! Immediate addressing! Pointer addressing 24

Register Addressing Registers embedded in the instruction Examples! XOR EAX, EAX! MOV EBX, ECX! INC BH Immediate Addressing Data embedded in the instruction Examples! ADD EAX, 3! MOV AX, -40 25

Pointer Addressing (direct) Memory address embedded in the instruction! Direct memory addressing Examples! MOV AL, [2000]! Read the byte from memory address 2000! Load the byte value in the register AL Note that! 2000 refers to the constant value 2000! [2000] refers to the memory location at address 2000 Pointer Addressing (indirect) Load or store from a previously-computed address! Register with the base address is in the instruction! Indirect memory addressing Examples! MOV AX, [BX] (register addressing)! CMP DL, [BX+8] (base+displacement)! MOV EAX,[EBX+ESI*4+8] (base+index+displacement) 26

Indexed Addressing Example int a[20]; int i, sum=0; for (i=0; i<20; i++) sum += a[i]; EAX: i EBX: sum ECX: address of a[0] sumloop: global variable MOV EAX, 0 MOV EDX, OFFSET FLAT:a ADD EBX, [ECX+EAX*4] INC EAX CMP EAX, 20 jne sumloop Data Access Methods: Summary Immediate addressing: data stored in the instruction itself! MOV ECX, 10 Register addressing: data stored in a register! MOV ECX, EAX Direct addressing: address stored in instruction! MOV ECX, [200] Indirect addressing: address stored in a register! MOV ECX, [EAX]! MOV ECX, [EAX+4]! MOV ECX, [EAX + ESI*4 + 12] 27

LEA: Load Effective Address LEA Dest, Src! Src is address mode expression! Set Dest to address denoted by expression Example! LEA EAX, [EBX+4*ESI]! Load into EAX the value EBX+4*ESI Compare to! MOV EAX, [EBX+4*ESI]! Load into EAX the value stored in memory at address EBX+4*ESI Summary Hardware! Memory is the only storage area CPU can access directly! Executables are stored on the disk! Fetch-Decode-Execute Cycle for running executables Assembly code! Programming the bare metal of the hardware! Loading and storing data, arithmetic and logic operations, checking results, and changing control flow To get more familiar with IA-32 assembly! Generate your own assembly-language code gcc masm=intel S code.c 28