Computer Architecture and Assembly Language. Practical Session 5

Similar documents
Computer Architecture and System Programming Laboratory. TA Session 5

Computer Architecture and Assembly Language. Practical Session 3

Machine and Assembly Language Principles

X86-NASM STANDARD COMMANDS. Comment your code with a semicolon (;)! The assembler won t read anything after it.

Lecture 3. Introduction to Unix Systems Programming: Unix File I/O System Calls

mith College Computer Science CSC231 Assembly Week #11 Fall 2017 Dominique Thiébaut

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer

Islamic University Gaza Engineering Faculty Department of Computer Engineering ECOM 2125: Assembly Language LAB

ASSEMBLY - QUICK GUIDE ASSEMBLY - INTRODUCTION

EECE.3170: Microprocessor Systems Design I Summer 2017 Homework 4 Solution

Practical Malware Analysis

X86 Addressing Modes Chapter 3" Review: Instructions to Recognize"

Name: CMSC 313 Fall 2001 Computer Organization & Assembly Language Programming Exam 1. Question Points I. /34 II. /30 III.

mith College Computer Science CSC231 Assembly Week #12 Thanksgiving 2017 Dominique Thiébaut

Assembly basics CS 2XA3. Term I, 2017/18

Lecture 2 Assembly Language

UMBC. A register, an immediate or a memory address holding the values on. Stores a symbolic name for the memory location that it represents.

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 21: Generating Pentium Code 10 March 08

Lab 3. The Art of Assembly Language (II)

CSC 8400: Computer Systems. Machine-Level Representation of Programs

CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING PREVIEW SLIDES 16, SPRING 2013

CSC 2400: Computer Systems. Towards the Hardware: Machine-Level Representation of Programs

Summary: Direct Code Generation

CS61, Fall 2012 Midterm Review Section

Program Exploitation Intro

16.317: Microprocessor Systems Design I Spring 2015

x86 assembly CS449 Fall 2017

Subprograms: Arguments

Function Calls COS 217. Reading: Chapter 4 of Programming From the Ground Up (available online from the course Web site)

SOEN228, Winter Revision 1.2 Date: October 25,

Module 3 Instruction Set Architecture (ISA)

Sistemi Operativi. Lez. 16 Elementi del linguaggio Assembler AT&T

CPS104 Recitation: Assembly Programming

Computer Architecture and System Programming Laboratory. TA Session 3

COMPUTER ENGINEERING DEPARTMENT

1 /* file cpuid2.s */ 4.asciz "The processor Vendor ID is %s \n" 5.section.bss. 6.lcomm buffer, section.text. 8.globl _start.

Chapter 3: Addressing Modes

mith College Computer Science CSC231 Assembly Week #9 Fall 2017 Dominique Thiébaut

CS Basics 8) Strings. Emmanuel Benoist. Fall Term Berner Fachhochschule Haute cole spcialise bernoise Berne University of Applied Sciences 1

Università Ca Foscari Venezia

How do we define pointers? Memory allocation. Syntax. Notes. Pointers to variables. Pointers to structures. Pointers to functions. Notes.

CS61 Section Solutions 3

22 Assembly Language for Intel-Based Computers, 4th Edition. 3. Each edge is a transition from one state to another, caused by some input.

ADDRESSING MODES. Operands specify the data to be used by an instruction

CMSC 313 Lecture 08 Project 2 Questions Recap Indexed Addressing Examples Some i386 string instructions A Bigger Example: Escape Sequence Project

We will first study the basic instructions for doing multiplications and divisions

Assembly Language Each statement in an assembly language program consists of four parts or fields.

EECE.3170: Microprocessor Systems Design I Summer 2017

Operating System Labs. Yuanbin Wu

CNIT 127: Exploit Development. Ch 1: Before you begin. Updated

EC 333 Microprocessor and Interfacing Techniques (3+1)

Winter Compiler Construction T11 Activation records + Introduction to x86 assembly. Today. Tips for PA4. Today:

Marking Scheme. Examination Paper Department of CE. Module: Microprocessors (630313)

CSE2421 FINAL EXAM SPRING Name KEY. Instructions: Signature

administrivia today start assembly probably won t finish all these slides Assignment 4 due tomorrow any questions?

CSE351 Spring 2018, Midterm Exam April 27, 2018

Stack -- Memory which holds register contents. Will keep the EIP of the next address after the call

3.1 DATA MOVEMENT INSTRUCTIONS 45

The Instruction Set. Chapter 5

CS 3843 Final Exam Fall 2012

CMSC 313 Lecture 12 [draft] How C functions pass parameters

CSE P 501 Compilers. x86 Lite for Compiler Writers Hal Perkins Autumn /25/ Hal Perkins & UW CSE J-1

CMPS 105 Systems Programming. Prof. Darrell Long E2.371

Assembly Language: Function Calls" Goals of this Lecture"

mith College Computer Science Fixed-Point & Floating-Point Number Formats CSC231 Dominique Thiébaut

System calls and assembler

Lecture 15 Intel Manual, Vol. 1, Chapter 3. Fri, Mar 6, Hampden-Sydney College. The x86 Architecture. Robb T. Koether. Overview of the x86

Scott M. Lewandowski CS295-2: Advanced Topics in Debugging September 21, 1998

16.317: Microprocessor Systems Design I Fall 2015

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2016 Lecture 12

Assembly Language: Function Calls" Goals of this Lecture"

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto

CS Basics 5) Programming in Assembly Language

An Introduction to x86 ASM

Ethical Hacking. Assembly Language Tutorial

Assembly Language: Function Calls

The x86 Architecture

Reverse Engineering Low Level Software. CS5375 Software Reverse Engineering Dr. Jaime C. Acosta

CS240: Programming in C

CSE 410: Systems Programming

Question 4.2 2: (Solution, p 5) Suppose that the HYMN CPU begins with the following in memory. addr data (translation) LOAD 11110

Functions. Ray Seyfarth. August 4, Bit Intel Assembly Language c 2011 Ray Seyfarth

mith College Computer Science CSC231 Assembly Week #5 Fall 2017 Dominique Thiébaut

CPEG421/621 Tutorial

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2018 Lecture 6

16.317: Microprocessor Systems Design I Fall 2014

CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING

CSE351 Autumn 2012 Midterm Exam (5 Nov 2012)

File IO and command line input CSE 2451

x86 assembly CS449 Spring 2016

SYSTEM CALL IMPLEMENTATION. CS124 Operating Systems Fall , Lecture 14

Low-Level Essentials for Understanding Security Problems Aurélien Francillon

mith College Computer Science CSC231 - Assembly Week #9 Dominique Thiébaut

mith College Computer Science Fixed & Floating Point Formats CSC231 Fall 2017 Week #15 Dominique Thiébaut

Assembly Language Programming Debugging programs

Procedure Calls. Young W. Lim Sat. Young W. Lim Procedure Calls Sat 1 / 27

Assembler lecture 4 S.Šimoňák, DCI FEEI TU of Košice

Assembly Language: Function Calls. Goals of this Lecture. Function Call Problems

Towards the Hardware"

MODE (mod) FIELD CODES. mod MEMORY MODE: 8-BIT DISPLACEMENT MEMORY MODE: 16- OR 32- BIT DISPLACEMENT REGISTER MODE

Transcription:

Computer Architecture and Assembly Language Practical Session 5

Addressing Mode - "memory address calculation mode" An addressing mode specifies how to calculate the effective memory address of an operand. x86 addressing mode rule: up to two of the 32-bit registers and a 32-bit signed constant can be added together to compute a memory address. One of the registers can be optionally pre-multiplied by 2, 4, or 8. Example of right usage mov dword [myarray + ebx*4 + eax], ecx Examples of wrong usage

Addressing Mode - "operand accessing mode" Register Addressing operate on a variable or intermediate variable inc eax Immediate Addressing operate on a CONSTANT add ax, 0x4501 Absolute (Direct) Addressing specify the address as a number or label inc word [mystring] inc word [0x1000] Relative Addressing Effective Address =[PC]+ displacement jnz next ; jnz $+4 inc eax next: neg eax Indexed Addressing (with Displacement) useful for array indices inc dword [ebx*4] inc dword [ebx*4+eax] inc dword [myarray + ebx*4] inc dword [myarray + ebx*4+eax] Register Indirect Addressing inc byte [ebx] Displacement Addressing Effective Address=[register]+displacement inc byte [ebx+0x10]

Addressing Modes - Example #include <stdio.h> #define VECTOR_SIZE 5 extern long long* DotProduct (int V1[VECTOR_SIZE], int V2[VECTOR_SIZE], int size); int main () } int V1[VECTOR_SIZE] = {1,0,1,0,2}; int V2[VECTOR_SIZE] = {1,0,1,0,-2}; long long* result = DotProduct(V1,V2,VECTOR_SIZE); printf ("Dot product result: %#llx \n ", result); return 0; {

int V1[VECTOR_SIZE] = {1,0,1,0,2}; int V2[VECTOR_SIZE] = {1,0,1,0,-2}; long long* result = DotProduct(V1,V2,VECTOR_SIZE); EBP + 16 EBP + 12 EBP + 8 EBP ESP stack VECTOR_SIZE V2 V1 return address EBP previous value old ebx old ecx old edx next element of vector V1 next element of vector V2 MUL / IMUL - multiply unsigned / singed numbers Why are two different instructions needed? Example: MUL: 0xFF x 0xFF = 255 x 255 = 65025 (0xFE01) IMUL: 0xFF x 0xFF = ( 1) x ( 1) = 1 (0x0001) section.data result: dd 0,0 section.text global DotProduct DotProduct: push mov push push push mov ecx, 0 DotProduct_start: mov edx, 0 cmp je mov mov mov imul add adc inc jmp DotProduct_end: mov pop pop pop mov pop ret Since the function is called from C, the function is allowed to mess up the values of EAX, ECX and EDX registers. EBX, ESI, and EDI registers values should be preserved and restored by the function. ebp ebp, esp ebx ecx edx ecx, dword [ebp+16] DotProduct_end ebx, dword [ebp+8] eax, dword [ebx + (4*ecx)] ebx, dword [ebp+12] dword [ebx + (4*ecx)] dword [result], eax dword [result+4], edx ecx DotProduct_start eax, result ; return value edx ecx ebx esp, ebp ebp But in fact it is not 100% sure that all C code obeys this (may be compiler implementation dependent), so it is a good idea to save/restore all registers used by the function

Linux System Calls System calls are low level functions the operating system makes available to applications via a defined API (Application Programming Interface) System calls represent the interface the kernel presents to user applications. In Linux all low-level I/O is done by reading and writing file handles, regardless of what particular peripheral device is being accessed - a tape, a socket, even your terminal, they are all files. Files are referenced by an integer file descriptor. Low level I/O is performed by making system calls

Linux System Call format A system call is explicit request to the kernel, made via a software interrupt Put the system call number in EAX register Set up the arguments to the system call. The first argument goes in EBX, the second in ECX, then EDX, ESI, EDI, and finally EBP. If more then 6 arguments needed (not likely), the EBX register must contain the memory location where the list of arguments is stored. Call the relevant interrupt (for Linux it is 0x80) The result is usually returned in EAX

sys_read read from a file system call number (in EAX): 3 arguments: EBX: file descriptor (to read from it) ECX: pointer to input buffer (to keep a read data into it) EDX: maximal number of bytes to receive (maximal buffer size) return value (in EAX): number of bytes received On errors: 0 (no bits read) or negative number section.bss buffer: resb 1 section.text global _start _start: mov eax, 3 mov ebx, 0 mov ecx, buffer mov edx, 1 mov eax,1 mov ebx, 0 ; system call number (sys_read) ; file descriptor (stdin) ; buffer to keep the read data ; bytes to read ;call kernel ;system call number (sys_exit) ;exit status ;call kernel

sys_write write into a file system call number (in EAX): 4 arguments: EBX: file descriptor (to write to it) ECX: pointer to the first byte to write (beginning of the string) EDX: number of bytes (characters) to write return value (in EAX): number of bytes written On errors: negative number section.data msg: db Message',0xa len: equ $ - msg section.text global _start _start: mov ebx,1 mov ecx, msg mov edx, len mov eax,4 mov eax,1 mov ebx, 0 ;our string, 0xA is newline ;length of our string ;file descriptor (stdout) ;message to write ;message length ;system call number (sys_write) ;call kernel ;system call number (sys_exit) ;exit status ;call kernel

sys_open - open a file system call number (in EAX): 5 arguments: EBX: pathname of the file to open/create ECX: set file access bits (can be bitwise OR ed together) O_RDONLY (0x000) open for reading only O_WRONLY (0x001) open for writing only O_RDRW (0x002) open for both reading and writing O_APPEND (0x008) open for appending to the end of file O_TRUNC (0x200) truncate to 0 length if file exists O_CREAT (0x100) create the file if it doesn t exist EDX: set file permissions (in a case O_CREAT is set; can be bitwise OR ed together) S_IRWXU 0x700 ; RWX mask for owner S_IRUSR 0x400 ; R(read) for owner USR(user) S_IWUSR 0x200 ; W(write) for owner S_IXUSR 0x100 ; X(execute) for owner return value (in EAX): file descriptor On errors: negative number must choose (at least) one of these section.data filename: db file.txt", 0 handle: dd 0 section.text global _start _start: file_open: mov eax, 5 mov ebx, filename mov ecx, 0x101 mov edx, S_IRWXU mov [handle], eax mov eax, 1 mov ebx, 0 may add some of these ; system call number (sys_open) ; set file name ; set file access bits (O_WRONLY O_CREAT) ; set file permissions ;call kernel ; move file handle to memory ;system call number (sys_exit) ;exit status ;call kernel

sys_lseek change a file pointer system call number (in EAX): 19 arguments: EBX: file descriptor ECX: offset (number of bytes to move) EDX: where to move from SEEK_SET (0) ; beginning of the file SEEK_CUR (1) ; current position of the file pointer SEEK_END (2) ; end of file return value (in EAX): Current position of the file pointer On errors: SEEK_SET section.data filename: db file.txt", 0 handle: dd 0 section.text global _start _start: file_open: mov eax, 5 ; system call number (sys_open) mov ecx, 0 ; set file access bits (O_RDONLY ) mov ebx, filename ; set file name ;call kernel mov [handle], eax ; move file handle to memory mov ebx, [handle] mov ecx,15 mov edx,0 mov eax,19 ; file descriptor ; number of byte to move ; move from beginning of the file ; system call number (lseek) ;call kernel mov eax, 1 mov ebx, 0 ;system call number (sys_exit) ;exit status ;call kernel

sys_close close a file system call number (in EAX): 6 arguments: EBX: file descriptor section.data filename: db file.txt", 0 handle: dd 0 return value (in EAX): nothing meaningful On errors: negative number section.text global _start _start: file_open: mov eax, 5 mov ecx, 0 mov ebx, filename mov [handle], eax mov ebx, [handle] mov eax, 6 mov eax, 1 mov ebx, 0 ; system call number (sys_open) ; set file access bits (O_RDONLY) ; set file name ;call kernel ; move file handle to memory ;system call number (sys_exit) ;exit status ;call kernel

Linux System Calls - Example section.data filename: db file.txt", 0 handle: dd 0 section.bss buffer: resb 1 section.text global _start _exit: mov ebx, [handle] mov eax, 6 ; system call (sys_close) ; call kernel mov eax, 1 ; system call (sys_exit) mov ebx, 0 ; exit status ; call kernel _start: mov eax, 5 ; system call (sys_open) mov ebx, filename ; set file name mov ecx, O_RDONLY ; set file access bits (O_RDONLY) ; call kernel mov [handle], eax ; move file handle to memory _read: mov eax, 3 mov ebx, [handle] mov ecx, buffer mov edx, 1 cmp eax, 0 je _exit ; system call (sys_read) ; file handle ; buffer ; read byte count ; call kernel mov eax, 4 mov ebx, 1 mov ecx, buffer mov edx, 1 ; system call (sys_write) ; stdout ; buffer ; write byte count ; call kernel jmp _read

Writing a simple calculator for unlimited-precision integers. Operators: Addition unsigned (+) (BCD decimal) Pop-and-print (p) Duplicate (d) Shift right/left(r / l) (Hexadecimal) Quit (q) Operands in the input and output will be in decimal. Your program should allow -d command line argument, which means a debug option. When "-d" option is set, you should print out to stderr various debugging messages (as a minimum, print out every number read from the user, and every result pushed onto the operand stack).

The calculator uses Reverse Polish Notation (RPN) i.e. every operator follows all of its operands Example: 1 2 + 1+2 10 6 r shift right of 10 with 6 Operands in the calculator are implicit taken from a stack that is implemented by you

>>calc: 9 >>calc: 1 >>calc: d >>calc: p >>1 >>calc: + >>calc: d >>calc: p >>10 ESP -> ESP -> ESP -> ESP-> Stack 910 10 1

>>calc: 1 ESP -> Stack 10

>>calc: 1 >>calc: r ESP -> Stack 10 1

>>calc: 1 >>calc: r >>calc: 3 ESP -> Stack 5

>>calc: 1 >>calc: r >>calc: 3 ESP -> Stack 5 3

>>calc: 1 >>calc: r >>calc: 3 >>calc: l ; shift left ESP -> Stack 5 3

>>calc: 1 >>calc: r >>calc: 3 >>calc: l ; shift left ESP -> Stack 40

>>calc: 1 >>calc: r >>calc: 3 >>calc: l ; shift left >>calc: p ESP -> Stack 40

>>calc: 1 >>calc: r >>calc: 3 >>calc: l ; shift left >>calc: p >>40 ESP -> Stack

Your program should be able to handle an unbounded operand size This may be implemented as a linked list: each element represents 2 decimal digits in the number, in BCD (binary-coded decimal) representation an element block is composed of a 4 byte pointer to the next element, and a byte data BCD (binary-coded decimal) is a class of binary encodings of decimal numbers where each decimal digit is represented by 4 bits. For example, 56 decimal is represented by 0 1 0 1 0 1 1 0 Indeed, we get the value of 0x56, so in BCD a decimal number is represented by hexadecimal number with the same digits. Example: 123456 decimal number could be represented by the following linked list: 0x56 0x34 0x12 Null 0x56 0x34 0x12 0 Using this representation, each stack element is simply a list head pointer.

The stack is implemented by you The stack is of size 5, i.e. you can insert only 5 inputs in it We insert into stack only pointers to elements >>calc:123456 >>calc:1885 Our Stack 0x56 0x34 0x12 0 ESP -> 0x85 0x18 0

Add two elements byte by byte using add instruction After addition, use daa instruction to decimal adjust the result after addition Example: 123456 + 1885= 125341 0x56 0x34 0x12 0 0x85 0x18 0 AL 0x56 + 0x85, then adjust to decimal AL 0x34 + 0x18 + carry of the addition from the previous byte, then adjust to decimal AL 0x12 + carry of the addition from the previous byte, then adjust to decimal AL 0x56 + 0x85 = 0xDB daa ; AL 0x41, CF = 1 since the right result is 0x141 AL 0x34 + 0x18 + CF = 0x4D daa ; AL 0x53, CF = 0 AL 0x12 + CF = 0x12 daa ; AL 0x12 0x41 0x53 0x12 0

(Packed) BCD representation - saves space by packing two digits into a byte (one decimal digit (0..9) per nibble). Example: 1234 is stored as 12 34H Two instructions to process packed BCD numbers daa (decimal adjust after addition) decimal adjust after addition das (decimal adjust after subtraction) decimal adjust after subtraction Algorithm of daa: if low nibble of AL > 9 or AF = 1 then: AL = AL + 6 AF = 1 if AL > 9Fh or CF = 1 then: AL = AL + 60h CF = 1 Algorithm of das: if low nibble of AL > 9 or AF = 1 then: AL = AL - 6 AF = 1 if AL > 9Fh or CF = 1 then: AL = AL - 60h CF = 1

Packed BCD addition examples: mov AL, 05h add AL, 09h ; AL = 0Fh (15) daa ; AL = 15h mov AL,71h add AL,43h ; AL := B4h daa ; AL := 14h and CF := 1 The result including the carry (i.e., 114h) is the correct answer Packed BCD subtraction example: mov AL,71h sub AL,43h ; AL := 2Eh das ; AL := 28h

C functions you may use in your assembly code: char *fgets(char *str, int n, FILE *stream) // use fgets(buffer, BUFFERSIZE, stdin) to read from standard input int fprintf(file *stream, const char *format, arg list ) // use fprintf(stderr, ) to print to standard error (usually same as stdout) int printf(char *format, arg list ) void* malloc(size_t size) // size_t is unsigned int for our purpose void free(void *ptr) If you use those functions the beginning of your text section will be as follows (no _start label): extern printf extern fprintf extern malloc extern free extern fgets section.text align 16 global main main: ; your code Compile and link your assembly file calc.s as follows: nasm -f elf calc.s -o calc.o gcc -m32 -Wall -g calc.o -o calc.bin ; -Wall enables all warnings Note: there is no need in c file. gcc will connect external c functions to your assembly program.