CSCE 212H, Spring 2008 Lab Assignment 3: Assembly Language Assigned: Feb. 7, Due: Feb. 14, 11:59PM

Similar documents
Lecture 15 Intel Manual, Vol. 1, Chapter 3. Fri, Mar 6, Hampden-Sydney College. The x86 Architecture. Robb T. Koether. Overview of the x86

Machine-level Representation of Programs. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Program Exploitation Intro

The x86 Architecture

CS 31: Intro to Systems ISAs and Assembly. Martin Gagné Swarthmore College February 7, 2017

X86 Addressing Modes Chapter 3" Review: Instructions to Recognize"

Instruction Set Architectures

The Hardware/Software Interface CSE351 Spring 2013

Complex Instruction Set Computer (CISC)

CS 31: Intro to Systems ISAs and Assembly. Kevin Webb Swarthmore College February 9, 2016

CS 31: Intro to Systems ISAs and Assembly. Kevin Webb Swarthmore College September 25, 2018

Machine-Level Programming II: Control Flow

Homework 0: Given: k-bit exponent, n-bit fraction Find: Exponent E, Significand M, Fraction f, Value V, Bit representation

Instruction Set Architectures

Practical Malware Analysis

UMBC. A register, an immediate or a memory address holding the values on. Stores a symbolic name for the memory location that it represents.

Introduction to IA-32. Jo, Heeseung

INTRODUCTION TO IA-32. Jo, Heeseung

Condition Codes The course that gives CMU its Zip! Machine-Level Programming II Control Flow Sept. 13, 2001 Topics

ASSEMBLY II: CONTROL FLOW. Jo, Heeseung

Assembly II: Control Flow. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Condition Codes. Lecture 4B Machine-Level Programming II: Control Flow. Setting Condition Codes (cont.) Setting Condition Codes (cont.

Control flow. Condition codes Conditional and unconditional jumps Loops Switch statements

Machine Level Programming II: Arithmetic &Control

CF Carry Flag SF Sign Flag ZF Zero Flag OF Overflow Flag. ! CF set if carry out from most significant bit. "Used to detect unsigned overflow

EXPERIMENT WRITE UP. LEARNING OBJECTIVES: 1. Get hands on experience with Assembly Language Programming 2. Write and debug programs in TASM/MASM

Turning C into Object Code Code in files p1.c p2.c Compile with command: gcc -O p1.c p2.c -o p Use optimizations (-O) Put resulting binary in file p

Giving credit where credit is due

Sungkyunkwan University

MACHINE-LEVEL PROGRAMMING I: BASICS COMPUTER ARCHITECTURE AND ORGANIZATION

Module 3 Instruction Set Architecture (ISA)

CS241 Computer Organization Spring 2015 IA

Machine-Level Programming II: Control and Arithmetic

CSC 8400: Computer Systems. Machine-Level Representation of Programs

CISC 360. Machine-Level Programming II: Control Flow Sep 23, 2008

Page # CISC 360. Machine-Level Programming II: Control Flow Sep 23, Condition Codes. Setting Condition Codes (cont.)

administrivia today start assembly probably won t finish all these slides Assignment 4 due tomorrow any questions?

CNIT 127: Exploit Development. Ch 1: Before you begin. Updated

CSE2421 FINAL EXAM SPRING Name KEY. Instructions: Signature

CSC 2400: Computer Systems. Towards the Hardware: Machine-Level Representation of Programs

Assembly Language: IA-32 Instructions

The Instruction Set. Chapter 5

CS241 Computer Organization Spring Addresses & Pointers

Machine-Level Programming II: Arithmetic & Control. Complete Memory Addressing Modes

Reverse Engineering II: Basics. Gergely Erdélyi Senior Antivirus Researcher

Compiler Construction D7011E

Second Part of the Course

We can study computer architectures by starting with the basic building blocks. Adders, decoders, multiplexors, flip-flops, registers,...

Basic Execution Environment

Credits to Randy Bryant & Dave O Hallaron

Reverse Engineering II: The Basics

Lab 3. The Art of Assembly Language (II)

Lab 2: Introduction to Assembly Language Programming

CISC 360. Machine-Level Programming II: Control Flow Sep 17, class06

Machine-Level Programming II: Arithmetic & Control /18-243: Introduction to Computer Systems 6th Lecture, 5 June 2012

Page 1. Condition Codes CISC 360. Machine-Level Programming II: Control Flow Sep 17, Setting Condition Codes (cont.)

Addressing Modes on the x86

Chapter 3 Machine-Level Programming II Control Flow

Scott M. Lewandowski CS295-2: Advanced Topics in Debugging September 21, 1998

Dr. Ramesh K. Karne Department of Computer and Information Sciences, Towson University, Towson, MD /12/2014 Slide 1

What is a Compiler? Compiler Construction SMD163. Why Translation is Needed: Know your Target: Lecture 8: Introduction to code generation

System calls and assembler

Machine Programming 2: Control flow

Computer System Architecture

Binghamton University. CS-220 Spring x86 Assembler. Computer Systems: Sections

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 21: Generating Pentium Code 10 March 08

MODE (mod) FIELD CODES. mod MEMORY MODE: 8-BIT DISPLACEMENT MEMORY MODE: 16- OR 32- BIT DISPLACEMENT REGISTER MODE

Reverse Engineering II: The Basics

MACHINE-LEVEL PROGRAMMING I: BASICS

Instruction Set Architectures

x86 Assembly Tutorial COS 318: Fall 2017

CMSC Lecture 03. UMBC, CMSC313, Richard Chang

SPRING TERM BM 310E MICROPROCESSORS LABORATORY PRELIMINARY STUDY

Credits and Disclaimers

Assembler Programming. Lecture 2

Computer Architecture and Assembly Language. Practical Session 3

Registers. Ray Seyfarth. September 8, Bit Intel Assembly Language c 2011 Ray Seyfarth

CS165 Computer Security. Understanding low-level program execution Oct 1 st, 2015

Assembly Programmer s View Lecture 4A Machine-Level Programming I: Introduction

The von Neumann Machine

Assembly Language Each statement in an assembly language program consists of four parts or fields.

Instruction Set Architecture (ISA) Data Types

CS241 Computer Organization Spring Introduction to Assembly

x86 assembly CS449 Fall 2017

Lecture 2 Assembly Language

Credits and Disclaimers

Y86 Processor State. Instruction Example. Encoding Registers. Lecture 7A. Computer Architecture I Instruction Set Architecture Assembly Language View

Buffer Overflow Attack

Stack -- Memory which holds register contents. Will keep the EIP of the next address after the call

complement) Multiply Unsigned: MUL (all operands are nonnegative) AX = BH * AL IMUL BH IMUL CX (DX,AX) = CX * AX Arithmetic MUL DWORD PTR [0x10]

Assembly level Programming. 198:211 Computer Architecture. (recall) Von Neumann Architecture. Simplified hardware view. Lecture 10 Fall 2012

Machine- level Programming II: Control Flow

CPS104 Recitation: Assembly Programming

X86 Review Process Layout, ISA, etc. CS642: Computer Security. Drew Davidson

Low-Level Essentials for Understanding Security Problems Aurélien Francillon

EECE.3170: Microprocessor Systems Design I Summer 2017 Homework 4 Solution

Rev101. spritzers - CTF team. spritz.math.unipd.it/spritzers.html

Access. Young W. Lim Fri. Young W. Lim Access Fri 1 / 18

Islamic University Gaza Engineering Faculty Department of Computer Engineering ECOM 2125: Assembly Language LAB. Lab # 7. Procedures and the Stack

The von Neumann Machine

Transcription:

CSCE 212H, Spring 2008 Lab Assignment 3: Assembly Language Assigned: Feb. 7, Due: Feb. 14, 11:59PM February 7, 2008 1 Overview The purpose of this assignment is to introduce you to the assembly language model of the Pentium processors and to tools for analysis of programs at that level, disassemblers and debuggers. The portions of this assignment are: 1. Registers of the Pentium processor 2. Set up for lab3 3. Disassembling Code objdump Gdb GUI under CYGWIN 4. Condition Codes and Control Flow 5. Decompiling - from assembly to C 6. Logistics - submitting this lab 2 Registers of the Pentium processor The register names for the Pentium are modification of earlier names for registers of the Intel 8080. In the 8080 there were 8 bit registers: A, B, C, D, E, H, L and 16 bit registers SP (Stack Pointer) and PC (Program Counter). For certain operations pairs (HL, DE) of registers were used for 16 bit operations. 1

Flags A B C D E H L SP PC Intel 8080 Registers As the Intel processors grew up, 8 bit registers became 16 bits and the name A (accumulator) became AH (accumulator high) and AL (accumulator low). Then 16 bits were extended to 32 bit registers and an E for Extended was prepended to the names. 31 16 15 8 7 0 AH AL BH BL CH CL DH DL SI DI BP IP SP Flags Full Name EAX EBX ECX EDX ESI EDI EBP EIP ESP EFlags Basic Registers of Intel 80x86/Pentium There are additional collections of registers: the segment select registers, the Multi-Media Extension (MMX) group and that we will discuss later in the semester. EAX - sometimes still referred to as the accumulator. It can be used in 32-bit EAX, 16-bit AX, and 8-bit chunks AH and AL. EBX - the base register. It can hold the base address of data structures. ECX - the Counter register, has special role with loop counters. EDX - the Data register, is used in I/O operations and also in integer multiplications and divisions. ESI - the Source Index register. EDI - the Destination Index register. EBP - the Base Pointer register. EIP - the Instruction Pointer, holds the address of the next instruction to be executed. 2

ESP - the Stack Pointer, holds the address of the next avaiable item on the stack. EFlag - the CPU status flags. 3 Setup for Lab3 Check your engineering domain email and save the attachment as Lab3.tar. The next step is to transfer this to the correct folder so that CYGWIN can access it. The use the command tar xvf Lab3.tar to unpack the files. It should contain the files: sumsquares.c testsumsquares.c Makefile maxp7.c while.c switch.c 4 Disassembling Code In this section we will examine tools for investigating the behavior of executable programs. In particular we will start with disassemblers. A disassembler examines the text of an executable program and generates the assembly language program. Objdump -d The first tool that we examine is objdump. Executing objdump with no arguments will list the various command line options that objdump expects. In the directory Lab3 start by executing the commands: gcc -g -c sumsquares.c gcc -g -c testsumsquares.c objdump -d sumsquares.o The ouput of this last command should look something like: 3

sumsquares.o: file format pe-i386 Disassembly of section.text: 00000000 <_sumsquares>: 0: 55 push %ebp 1: 89 e5 mov %esp,%ebp 3: 83 ec 18 sub $0x18,%esp 6: c7 45 f8 00 00 00 00 movl $0x0,0xfffffff8(%ebp) d: c7 45 fc 01 00 00 00 movl $0x1,0xfffffffc(%ebp) 14: 8b 45 fc mov 0xfffffffc(%ebp),%eax 17: 3b 45 08 cmp 0x8(%ebp),%eax 1a: 7c 04 jl 20 <_sumsquares+0x20> 1c: eb 12 jmp 30 <_sumsquares+0x30> 1e: 89 f6 mov %esi,%esi 20: 8b 45 fc mov 0xfffffffc(%ebp),%eax 23: 0f af 45 fc imul 0xfffffffc(%ebp),%eax 27: 01 45 f8 add %eax,0xfffffff8(%ebp) 2a: ff 45 fc incl 0xfffffffc(%ebp) 2d: eb e5 jmp 14 <_sumsquares+0x14> 2f: 90 nop 30: 8b 55 f8 mov 0xfffffff8(%ebp),%edx 33: 89 d0 mov %edx,%eax 35: eb 01 jmp 38 <_sumsquares+0x38> 37: 90 nop 38: 89 ec mov %ebp,%esp 3a: 5d pop %ebp 3b: c3 ret $ Each line shows the disassembly of a series of bytes. Note that the number of bytes in each instruction varies in this example from 1 to 7 bytes. The length of the instruction is a function of the opcode which is contained in the first couple of bytes of the instruction. The disassmebler decodes this information just as the CPU would to determine how long this instruction is. Question 1 In testsumsquares.o what is the assembly language statement that corresponds to the assignment s=0x1234? Question 2 From the order of the bytes is this machine Big-Endian or Little-Endian? Note there is a lot of setup for the main that we will come back and discuss in a later lab. For the present we are not going to concentrate of this. Now in sumsquares.o (gcc -g -c sumsquares.c) note that n would be a parameter and thus the corresponding argument value would be stored in the activation record of the function sumsquares. 4

Question 3 In the address 0xfffffffc(%ebp) what is the displacement? i.e. how many bytes from the value of the Base pointer does this address refer? Question 4 In this code there are several NOP (no operation) instructions. When one of these is executed it effectively does nothing. Do the NOPs in this code slow it down any? Now compile sumsquares.c using the optimization flag (-O). Question 5 What major differences are there between the unoptimized compile and the optimized compile? 1. Where is the sum variable stored in each case? 2. Why would the optimized code be faster? 3. How long is each compiled function? In instructions and in bytes? 4. How many memory references are there in one iteration of the loop? Gdb GUI under CYGWIN The Gnu Debugger (GDB) is a valuable tools for studying the execution of Unix programs. In the CYGWIN environment there is a GUI interface to gdb. First compile testsumsquares as follows: gcc -g -O testsumsquares.c sumsquares.c -o test gdb test There are three pulldowns across the top. The gdb starts off in the SOURCE mode. This may be changed to ASSEMBLY, MIXED and SRC+ASM. Try all of these to see the various views of the code. The other pulldowns are for the file you are dealing with and then for the function within that file. Using the menu item View you can bring up windows that will show: the contents of registers, local variables, memory, breakpoints, etc. Setting a Break Point Set the file to sumsquares.c, the function to sumsquares and the mode to SRC+ASM. Highlight the statement sum = sum + i*i; and then click on the RUN icon (the running man). Bring up the registers under the View menu and you may have to rearrange windows a little to make it where you can see what you need to. The icon next to the RUN icon lets you single step, i.e., execute a single instruction. Question 6 What is held in register edx? in register eax? Question 7 What is the address of n? 5

5 Condition Codes There are a collection of single bit registers called condition codes. Some of the most useful are given in the table below. They are set implicitly as a result of arithmetic operations and comparisons. CF Carry Flag the most recent arithmetic operation generated a carry ZF Zero Flag the most recent arithmetic operation yielded 0 SF Sign Flag the most recent arithmetic operation generated a negative OF Overflow Flag the most recent arithmetic operation caused a two s complement overflow The SetX instructions can then be used to save the values of flags in a byte. int gt (int x, int y) { return x > y; } translates to movl 12(%ebp),%eax # eax = y cmpl %eax,8(%ebp) # Compare x : y setg %al # al = x > y movzbl %al,%eax # Zero rest of %eax Control Flow Compile maxp7.c using the flags:-g -O -c and disassemble the code. Now compile maxp7.c again but this time using the flags: -g -O -S. This should produce the assembly language program max.s. But there is a lot more produced also. Question 8 Describe the other information that is supplied in the maxp7.s file. Now examine the code generated for while.c. Question 9 What does this function do? note the decrement. Now examine the code generated for switch.c. Question 10 What happens when there is no break? 6 Decompiling A disassembler translates object code back into assembly code and thus effectively reverses the action of the assembler. We can carry this one step further and translate assembly language program back into 6

C, effectively reversing the action of the compiler. Such a tool is called a decompiler. We are going to decompile some assembly language programs by hand. The URL http://www.itee.uq.edu.au/ csmweb/decompilation/ethics.html has a few comments on the ethics or decompiling. Now try your hand at decompiling 3.31 on page 232 and 3.33 on page 233 of your text. 7 CYGWIN CYGWIN =, is a Unix emulator that runs under Windows. It was developed by Red Hat and is distributed for free. There is a directory /usr/doc that contains documentation on a number of packages but perhaps the best documentation is on the web Start by signing on to your engineering domain account and then executing CYGWIN from the programs menu. If you get the message command not found in response to a standard Unix command then you probably have a path problem. The path is a list of directories, separated by colons, in which the system will look for the command. The path can be set as in PATH= /usr/local/bin:/usr/bin:/bin:$path This command is usually in a setup file. The system should have /etc/profile and maybe there is a.profile in your home directory. First lets do a test. From the bash window (CYGWIN runs a bash shell), do the commands ls -a env // print the environment The first of these should list all the files in your HOME directory. In particular it will tell if you have a.profile file. The second command will print the environment and in particular show the setting of the PATH variable. Question 11 Do you have a.bash profile file? Question 12 Are all of the directories /usr/local/bin, /usr/bin and /bin on your path? In either case add the variable FOO= bar by adding the following to your.bash profile file. FOO= bar export FOO The you can either exit the bash shell and reopen another or use the source.profile command. 7

8 Secure Web Site The department s secure web site (https://www.cse.sc.edu) allows acess to a number of functions and access to crucial information. Logistics All labs will be submitted electronically using dropbox. If there are any corrections or modifications to assignments these will be sent out via email and posted to the website. To turn your files in you should create a directory Lab3, place all files in this directory then use the command tar xvf Lab3.tar Lab3 in the parent directory to create the file Lab3.tar. 9 Logistics Place the following files in the Lab3 directory: Questions.txt // this should have answers to the questions c331.c // the answer to problem number 31 on page 232 c333.c // the answer to problem number 33 on page 233 Then the following sequence of commands should be executed: rm Lab3.tar // remove the handout tar file make clean // to remove some of the extraneous files cd.. // change to the parent directory tar cvf Lab3.tar // create the tar file // then dropbox using the secure website The due date is Wednesday Feb 2. 8