Towards the Hardware"

Similar documents
CSC 2400: Computer Systems. Towards the Hardware: Machine-Level Representation of Programs

CSC 8400: Computer Systems. Machine-Level Representation of Programs

Second Part of the Course

Assembly Language: Overview!

Assembly Language: Overview

X86 Addressing Modes Chapter 3" Review: Instructions to Recognize"

Program Exploitation Intro

Program Building. Goals for this Lecture. Help you learn about: Program structure. Detecting and reporting failure

Practical Malware Analysis

Function Calls COS 217. Reading: Chapter 4 of Programming From the Ground Up (available online from the course Web site)

CS 31: Intro to Systems ISAs and Assembly. Kevin Webb Swarthmore College February 9, 2016

CS 31: Intro to Systems ISAs and Assembly. Kevin Webb Swarthmore College September 25, 2018

EECE.3170: Microprocessor Systems Design I Summer 2017 Homework 4 Solution

administrivia today start assembly probably won t finish all these slides Assignment 4 due tomorrow any questions?

C Fundamentals! C vs. Java: Design Goals. q Java design goals! q Implications for Java! language, machine language, hardware!

CS61 Section Solutions 3

CSC 2400: Computer Systems. Key to Success

CPS104 Recitation: Assembly Programming

Process Layout and Function Calls

SOEN228, Winter Revision 1.2 Date: October 25,

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 21: Generating Pentium Code 10 March 08

Digital Forensics Lecture 3 - Reverse Engineering

Credits and Disclaimers

Assembly Language: IA-32 Instructions

Low-Level Essentials for Understanding Security Problems Aurélien Francillon

Chapter 4 Processor Architecture: Y86 (Sections 4.1 & 4.3) with material from Dr. Bin Ren, College of William & Mary

The Hardware/Software Interface CSE351 Spring 2013

Instruction Set Architecture

Instruction Set Architecture

Overview of Compiler. A. Introduction

mith College Computer Science CSC231 Assembly Week #9 Spring 2017 Dominique Thiébaut

Lab 3. The Art of Assembly Language (II)

CSE P 501 Compilers. x86 Lite for Compiler Writers Hal Perkins Autumn /25/ Hal Perkins & UW CSE J-1

ASSEMBLY II: CONTROL FLOW. Jo, Heeseung

Y86 Processor State. Instruction Example. Encoding Registers. Lecture 7A. Computer Architecture I Instruction Set Architecture Assembly Language View

Assembly II: Control Flow. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

CNIT 127: Exploit Development. Ch 1: Before you begin. Updated

System Programming and Computer Architecture (Fall 2009)

Introduction to 8086 Assembly

CISC 360 Instruction Set Architecture

An Introduction to x86 ASM

Credits and Disclaimers

Instruction Set Architecture

Assembly II: Control Flow. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Princeton University Computer Science 217: Introduction to Programming Systems. Assembly Language: Part 1

CS 31: Intro to Systems ISAs and Assembly. Martin Gagné Swarthmore College February 7, 2017

Question 4.2 2: (Solution, p 5) Suppose that the HYMN CPU begins with the following in memory. addr data (translation) LOAD 11110

Machine-Level Programming II: Control Flow

CSE351 Spring 2018, Midterm Exam April 27, 2018

Assembly II: Control Flow

16.317: Microprocessor Systems Design I Fall 2015

Lecture 15 Intel Manual, Vol. 1, Chapter 3. Fri, Mar 6, Hampden-Sydney College. The x86 Architecture. Robb T. Koether. Overview of the x86

Instruction Set Architecture

EECE.3170: Microprocessor Systems Design I Summer 2017

16.317: Microprocessor Systems Design I Fall 2014

CSC 2400: Computing Systems

Intel x86 Jump Instructions. Part 5. JMP address. Operations: Program Flow Control. Operations: Program Flow Control.

Process Layout, Function Calls, and the Heap

mith College Computer Science CSC231 Assembly Week #10 Fall 2017 Dominique Thiébaut

Machine Language CS 3330 Samira Khan

Compiler Construction D7011E

How Software Executes

Intel x86 Jump Instructions. Part 5. JMP address. Operations: Program Flow Control. Operations: Program Flow Control.

Control flow. Condition codes Conditional and unconditional jumps Loops Switch statements

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2016 Lecture 12

Lecture 2 Assembly Language

The x86 Architecture

System calls and assembler

Machine Programming 2: Control flow

Machine-Level Programming II: Control and Arithmetic

Von Neumann Architecture. Machine Languages. Context of this Lecture. High-Level Languages. Assembly Language: Part 1. Princeton University.

Computer Science 104:! Y86 & Single Cycle Processor Design!

complement) Multiply Unsigned: MUL (all operands are nonnegative) AX = BH * AL IMUL BH IMUL CX (DX,AX) = CX * AX Arithmetic MUL DWORD PTR [0x10]

Intel x86-64 and Y86-64 Instruction Set Architecture

Computer Organization Chapter 4. Prof. Qi Tian Fall 2013

Assembly Language II: Addressing Modes & Control Flow

CSCI 192 Engineering Programming 2. Assembly Language

Basic Assembly Instructions

Computer Systems Organization V Fall 2009

Credits to Randy Bryant & Dave O Hallaron

mith College Computer Science CSC231 Assembly Week #11 Fall 2017 Dominique Thiébaut

W4118: PC Hardware and x86. Junfeng Yang

Review addressing modes

Reverse Engineering Low Level Software. CS5375 Software Reverse Engineering Dr. Jaime C. Acosta

Machine Level Programming II: Arithmetic &Control

Machine-Level Programming II: Arithmetic & Control /18-243: Introduction to Computer Systems 6th Lecture, 5 June 2012

CS241 Computer Organization Spring Addresses & Pointers

Computer Science 104:! Y86 & Single Cycle Processor Design!

CS429: Computer Organization and Architecture

What is a Compiler? Compiler Construction SMD163. Why Translation is Needed: Know your Target: Lecture 8: Introduction to code generation

Reading assignment. Chapter 3.1, 3.2 Chapter 4.1, 4.3

Machine-level Representation of Programs. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Credits and Disclaimers

Machine-Level Programming II: Arithmetic & Control. Complete Memory Addressing Modes

Chapter 4! Processor Architecture!

CS 33: Week 3 Discussion. x86 Assembly (v1.0) Section 1G

Computer Systems Lecture 9

Winter Compiler Construction T11 Activation records + Introduction to x86 assembly. Today. Tips for PA4. Today:

Module 3 Instruction Set Architecture (ISA)

Sample Exam I PAC II ANSWERS

Transcription:

CSC 2400: Computer Systems Towards the Hardware Chapter 2 Towards the Hardware High-level language (Java) High-level language (C) assembly language machine language (IA-32) 1

High-Level Language Make programming easier by describing operations in a semi-natural language Increase the portability of the code One line may involve many low-level operations Examples: C, C++, Java count = 0; while (n > 1) { count++; if (n & 1) n = n*3 + 1; else n = n/2; Assembly Language mov ECX, DWORD PTR count mov EDX, DWORD PTR n Tied to the specifics of the underlying machine Commands and names to make the code readable and writeable by humans E.g., IA-32 from Intel mov ECX, 0.loop: cmp EDX, 1 jle.endloop add ECX, 1 mov EAX, EDX and EAX, 1 je.else mov EAX, EDX add EDX, EAX add EDX, EAX add EDX, 1 jmp.endif.else: sar EDX, 1.endif: jmp.loop.endloop: 4 2

Machine Language Also tied to the underlying machine What the computer sees and deals with Every instruction is a sequence of one or more numbers All stored in memory on the computer, and read and executed Unreadable by humans 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 9222 9120 1121 A120 1121 A121 7211 0000 0000 0001 0002 0003 0004 0005 0006 0007 0008 0009 000A 000B 000C 000D 000E 000F 0000 0000 0000 FE10 FACE CAFE ACED CEDE 1234 5678 9ABC DEF0 0000 0000 F00D 0000 0000 0000 EEEE 1111 EEEE 1111 0000 0000 B1B2 F1F5 0000 0000 0000 0000 0000 0000 Building and Running To build an executable $ gcc program.c o program To run: $./program Building involves 4 stages: 1. Preprocess 2. Compile 3. Assemble 4. Link 3

Steps in the Build Process Preprocess: program.c program.i Compile: program.i program.s Assemble: program.s program.o Link: program.o program To build one step at a time: $ gcc E program.c > program.i $ gcc S program.i $ gcc c program.s $ gcc program.o o program Why build one step at a time? Helpful for learning how to interpret error messages Permits partial builds (described later in course) The Preprocessor s View File program.c: #include <stdio.h> int main(void) /* Read a circle's radius from stdin, and compute and write its diameter and circumference to stdout. Return 0. */ { Preprocessor directive! Preprocessor replaces with contents of file /usr/include/stdio.h int radius; int diam; double circum; printf(enter the circle's radius:\n); scanf(%d, &radius); diam = 2 * radius; circum = 3.14159 * (double)diam; printf(a circle with radius %d has diameter %d\n, radius, diam); printf(and circumference %f.\n, circum); return 0; Comment! Preprocessor removes 4

Results of Preprocessing File program.i: int printf(char*, ); int scanf(char*, ); Declarations of printf(), scanf(), and other functions; compiler will have enough information to check subsequent function calls int main(void) { int radius; int diam; double circum; printf(enter the circle's radius:\n); scanf(%d, &radius); diam = 2 * radius; circum = 3.14159 * (double)diam; printf(a circle with radius %d has diameter %d\n, radius, diam); printf(and circumference %f.\n, circum); return 0; Where are the DEFINITIONS of printf() and scanf()? The Compiler s View File program.i: int printf(char*, ); int scanf(char*, ); int main(void) { int radius; int diam; double circum; printf(enter the circle's radius:\n); scanf(%d, &radius); diam = 2 * radius; circum = 3.14159 * (double)diam; printf(a circle with radius %d has diameter %d\n, radius, diam); printf(and circumference %f.\n, circum); return 0; Function declarations! Compiler notes return types and parameter types so it can check your function calls 5

Results of Compiling File program.s:.lc0:.lc1: main:.section.data.string Enter the circle's radius:\n.string %d.section.text.globl main push EBP mov EBP, ESP push.lc0 call printf add ESP, 16 push EAX push.lc1 call scanf Assembly language printf() and scanf() still missing This is the assembler view Assembler translates assembly language into machine language Results of Assembling File program.o: Listing omitted Not human-readable Machine language Object file Still missing definitions of printf() and scanf() 6

The Linker s View File program.o: Listing omitted Not human-readable Machine language The linker: Observes that Code in circle.o calls printf() and scanf() Code in circle.o does not define printf() or scanf() Fetches machine language definitions of printf() and scanf() from standard C library (/usr/lib/libc.a on your machine) Merges those definitions with circle.o to create Results of Linking File program: Listing omitted Not human-readable Complete executable binary file Machine language 1 0 0 1 0 0 1 0 1 1 0 1 0 0 1 0 1 1 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 1 0 0 1 0 0 1 0 1 1 0 1 0 0 1 0 1 1 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 1 0 0 1 0 0 1 0 1 1 0 1 0 0 1 0 1 1 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 1 0 0 1 0 1 1 0 1 0 0 1 0 1 1 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 1 1 0 0 1 0 0 1 0 1 1 0 1 0 0 1 0 1 1 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 0 1 0 1 1 1 0 1 0 0 1 0 1 7

Lecture Goals Help you learn Basics of computer architecture Relationship between C and assembly language IA-32 assembly language through an example Why do you need to know this? Computer Architecture Chapter 2 8

General Architecture for a Computer A Typical Computer This model provides a good study framework. 9

Memory The only large storage area that CPU can access directly Hence, any program executing must be in memory 20 Memory Hierarchy (read sec. 2.4.3) A typical memory hierarchy. The numbers are very rough approximations. Cache Principle The more frequently data is accessed, the faster the access should be. 10

Memory (Unix Process Layout) Memory Central Processing Unit (CPU) Addresses Data Instructions TEXT DATA HEAP STACK Stores executable machine-language instructions (TEXT) Stores data (DATA, HEAP and STACK sections) Central Processing Unit (CPU) CPU Control Unit Registers ALU Control unit Fetch, decode, and execute Arithmetic and logic unit Execution of low-level operations Registers High-speed temporary storage Addresses Data Instructions Memory TEXT DATA HEAP STACK 11

Registers (sec. 2.6) Small amount of storage on the CPU Can be accessed more quickly than main memory Instructions move data in and out of registers Loading registers from main memory Storing registers to main memory Instructions manipulate the register contents Registers essentially act as temporary variables For efficient manipulation of the data Registers are the top of the memory hierarchy Ahead of main memory, disk, tape, IA-32 Architecture CPU Registers Registers EIP EAX EBX ECX EDX ESI EDI ESP EBP Addresses Data Instructions Memory TEXT DATA HEAP STACK Condition Codes CF ZF SF OF EFLAGS EIP Instruction Pointer ESP, EBP Reserved for special use EAX Always contains return value 12

CPU Control Unit and ALU (sec. 2.2, 2.3) CPU Memory Control Unit Registers ALU Addresses Data Instructions TEXT DATA HEAP Control unit Fetch, decode, and execute Arithmetic and logic unit (ALU) Execution of low-level operations STACK Control Unit: Instruction Decoder Determines what operations need to take place Translate the machine-language instruction Control what operations are done on what data E.g., control what data are fed to the ALU E.g., enable the ALU to do multiplication or addition E.g., read from a particular address in memory src1 src2 operation ALU flag/carry dst 13

Central Processing Unit (CPU) Runs the loop Fetch-Decode-Execute Fetch Cycle Decode Cycle Execute Cycle START START Fetch Next Instruction Decode Execute Instruction Instruction Execute Execute Instruction Instruction HALT q Fetch the next instruction from memory q Decode the instruction and fetch the operands q Execute the instruction and store the result Fetch-Decode-Execute Cycle Where is the next instruction held in the machine? a CPU register called the Instruction Pointer(EIP) holds the address of the instruction to be fetched next Fetch cycle Copy instruction from memory into a register Decode cycle Decode instruction and fetch operands, if necessary Execute cycle Execute the instruction Increment PC by the instruction length after execution 14

C Code vs. Assembly Code Chapter 5 Overview Kinds of Instructions count = 0; while (n > 1) { count++; if (n & 1) n = n*3 + 1; else n = n/2; Reading and writing data count = 0 n Arithmetic and logic operations Increment: count++ Multiply: n * 3 Divide: n/2 Logical AND: n & 1 Checking results of comparisons Is (n > 1) true or false? Is (n & 1) non-zero or zero? Changing the flow of control To the end of the while loop (if n 1 ) Back to the beginning of the loop To the else clause (if n & 1 is 0) 15

Variables in Registers count = 0; while (n > 1) { count++; if (n & 1) n = n*3 + 1; else n = n/2; Registers count ECX n EDX mov ECX, DWORD PTR count mov EDX, DWORD PTR n Immediate and Register Addressing count ECX n EDX count=0; while (n>1) { count++; if (n&1) n = n*3+1; else n = n/2; Read directly from the instruction mov ECX, 0 add ECX, 1 written to a register 16

General Syntax op Dest, Src Perform operation op on Src and Dest Save result in Dest Immediate and Register Addressing count ECX n EDX count=0; while (n>1) { count++; if (n&1) n = n*3+1; else n = n/2; mov EAX, EDX and EAX, 1 Computing intermediate value in register EAX 17

Immediate and Register Addressing count ECX n EDX count=0; while (n>1) { count++; if (n&1) n = n*3+1; else n = n/2; mov EAX, EDX add EDX, EAX add EDX, EAX add EDX, 1 Adding n twice is cheaper than multiplication! Immediate and Register Addressing count ECX n EDX count=0; while (n>1) { count++; if (n&1) n = n*3+1; else n = n/2; sar EDX, 1 Shifting right by 1 bit is cheaper than division! 18

Changing Program Flow count=0; while (n>1) { count++; if (n&1) n = n*3+1; else n = n/2; Cannot simply run next instruction Check result of a previous operation Jump to appropriate next instruction Jump instructions Load new address in instruction pointer Example jump instructions Jump unconditionally (e.g., ) Jump if zero (e.g., n & 1 ) Jump if greater/less (e.g., n > 1 ) Jump Instructions Jump depends on the result of previous arithmetic instruction. Jump Description jmp Unconditional je Equal / Zero jne js jns jg jge jl jle 19

Conditional and Unconditional Jumps Comparison cmp compares two integers Done by subtracting the first number from the second Discards the results, but sets flags as a side effect: cmp EDX, 1 (computes EDX 1) jle.endloop (checks whether result was 0 or negative) Logical operation and compares two integers: and EAX, 1 (bit-wise AND of EAX with 1) je.else (checks whether result was 0) Also, can do an unconditional branch jmp jmp.endif jmp.loop Jump and Labels: While Loop count ECX n EDX while (n>1) {.loop: cmp EDX, 1 jle.endloop Checking if EDX is less than or equal to 1. jmp.loop.endloop: 20

Jump and Labels: While Loop count ECX n EDX mov ECX, 0 count=0; while (n>1) { count++; if (n&1) n = n*3+1; else n = n/2;.loop: cmp EDX, 1 jle.endloop add ECX, 1 mov EAX, EDX and EAX, 1 je.else mov EAX, EDX add EDX, EAX add EDX, EAX add EDX, 1 jmp.endif.else: sar EDX, 1.endif: jmp.loop.endloop: Jump and Labels: If-Then-Else count ECX n EDX if (n&1)... else... else block then block mov EAX, EDX and EAX, 1 je.else jmp.endif.else:.endif: 21

Jump and Labels: If-Then-Else count ECX n EDX count=0; while(n>1) { count++; if (n&1) n = n*3+1; else n = n/2; mov ECX, 0.loop: cmp EDX, 1 jle.endloop add ECX, 1 mov EAX, EDX and EAX, 1 je.else mov EAX, EDX add EDX, EAX then block add EDX, EAX add EDX, 1 jmp.endif.else: sar EDX, 1.endif: jmp.loop.endloop: else block Code More Efficient count ECX n EDX mov ECX, 0 count=0; while(n>1) { count++; if (n&1) n = n*3+1; else n = n/2; Replace with jmp loop.loop: cmp EDX, 1 jle.endloop add ECX, 1 mov EAX, EDX and EAX, 1 je.else mov EAX, EDX add EDX, EAX add EDX, EAX add EDX, 1 jmp.endif.else: sar EDX, 1.endif: jmp.loop.endloop: 22

Complete Example count ECX n EDX mov ECX, 0 count=0; while (n>1) { count++; if (n&1) n = n*3+1; else n = n/2;.loop: cmp EDX, 1 jle.endloop add ECX, 1 mov EAX, EDX and EAX, 1 je.else mov EAX, EDX add EDX, EAX add EDX, EAX add EDX, 1 jmp.endif.else: sar EDX, 1.endif: jmp.loop.endloop: Reading IA-32 Assembly Language Referring to a register: EAX, eax, EBX, ebx, etc. Result stored in the first argument E.g. mov EAX, EDX moves EDX into EAX E.g., add ECX, 1 increments register ECX Assembler directives: starting with a period (. ) E.g.,.section.text to start the text section of memory Comment: pound sign ( # ) E.g., # Purpose: Convert lower to upper case 23

X86 à C Example int Example(int x){??? push EBP mov EBP, ESP mov ECX, [EBP+8] xor EAX, EAX xor EDX, EDX cmp EDX, ECX jge L2 L1: add EAX, EDX inc EDX cmp EDX, ECX jl L1 L2: mov ESP,EBP pop EBP ret L1: L2: Write comments # ECX = x # EAX = # EDX = # if # goto L2 # EAX = # EDX = # if # goto L1 Name the Variables int Example(int x){??? push EBP mov EBP, ESP mov ECX, [EBP+8] xor EAX, EAX xor EDX, EDX cmp EDX, ECX jge L2 L1: add EAX, EDX inc EDX cmp EDX, ECX jl L1 L2: mov ESP,EBP pop EBP ret L1: L2: EAX à result, EDX à i # ECX = x # result = # i = # if # goto L2 # result = # i = # if # goto L1 24

Identify the Loop result = 0; i = 0; if (i >= x) goto L2; L1:result += i; i++; if (i < x) goto L1; L2: Identify the Loop result = 0; i = 0; if (i >= x) goto L2; L1:result += i; i++; if (i < x) goto L1; L2: result = 0; i = 0; if (i >= x) goto L1; do { result += i; i++; while (i < x); L1: result = 0; i = 0; while (i < x){ result += i; i++; result = 0; for (i = 0; i < x; i++) result += i; 25

C Code int Example(int x) { int result=0; int i; for (i=0; i<x; i++) result += i; return result; Exercise int F(int x, int y){???.l2: push EBP mov EBP,ESP mov ECX, [EBP+8] mov EDX, [EBP+12] xor EAX, EAX cmp ECX, EDX jle.l1 dec ECX inc EDX inc EAX cmp ECX,EDX jg.l2.l1: inc EAX mov ESP,EBP pop EBP ret.l2:.l1: # ECX = x # EDX = y 26

C Code int F(int x, int y) Conclusions Hardware Memory is the only storage area CPU can access directly Executables are stored on the disk Fetch-Decode-Execute Cycle for running executables Assembly language In between high-level language and machine code Programming the bare metal of the hardware Loading and storing data, arithmetic and logic operations, checking results, and changing control flow To get more familiar with IA-32 assembly Generate your own assembly-language code gcc masm=intel S code.c 27