X86-NASM STANDARD COMMANDS. Comment your code with a semicolon (;)! The assembler won't read anything after it.

Move

    mov ax, bx                  ; ax = bx

Use this command when you want to move a value around. You can use it to move from a variable to a register. You can also use it to move a constant (i.e. a number) into a register. Mov works both ways: you can also move the contents of a register into memory. Usually, you use a variable as the base, then add to it. For example, [output + ecx] is the memory location ecx bytes from output.

Some examples:

    mov bl, 5                   ; move the value 5 into the register bl
    mov rsi, output             ; move the address of the variable output into register rsi
    mov eax, ebx                ; move the value in ebx to eax
    mov [output + ecx], bh      ; move the value in bh to the memory location [output + ecx]

Load Effective Address

    lea eax, [output + esi + 0x10]  ; eax = output + esi + 0x10

An effective address is just a location in memory. You can think of it as an address on your street. output would be the location of your house. esi would be a number, let's say 0x5. Added to 0x10, that's 0x15. So the address from this computation would be 0x15 houses down from yours. Unlike mov, lea does not read the memory at that address; it only puts the computed address in the register.

Some examples (you won't use lea in the programs you write, but it will be useful for our final project):

    lea eax, [ebx + esi*4]      ; load ebx + esi*4 into eax

Add

    add eax, ebx                ; eax = eax + ebx

Add two values. You can add the values in two registers, a register and a constant (ex. 0x5), or dereference an address to access a location in memory.

Some examples:

    add eax, ebx                ; if eax is 2 and ebx is 3, eax will now be 5
    add eax, 5                  ; if eax is 2 then eax will now be 7
    add eax, [0x400ee4]         ; if eax is 2 and 3 is at the address 0x400ee4, then eax will now be 5

You can subtract by adding a negative number.

Shift right

    shr eax, 7                  ; eax = eax >> 7

    Before:       eax = 0110 0001 (decimal 97)
    Instruction:  shr eax, 3
    After:        eax = 0000 1100 (decimal 12, which is 97 / 2^3 rounded down)

Shifting is the act of moving the bits in a binary number left or right. It can be used to multiply and divide numbers. Right shifting is used to divide by powers of 2. shr eax, 7 divides the value in eax by 2^7, or 128.

Example:

    shr eax, 2                  ; 1000 becomes 0010 (decimal 8 becomes decimal 2)

Shift left

    shl eax, 7                  ; eax = eax << 7

    Before:       eax = 0100 0011 (decimal 67)
    Instruction:  shl eax, 3
    After:        eax = 0010 0001 1000 (decimal 536, which is 67 * 2^3)

Left shifting is used to multiply by powers of 2.

BITWISE OPERATIONS

You probably won't use these much in the class, but they are important to know and you can use them to do some cool math!

And

    and eax, ebx                ; eax = eax & ebx

A bitwise instruction, i.e. each bit is handled individually.

    1 & 1 = 1
    1 & 0 = 0
    0 & 0 = 0
    Eg: 0101 & 1100 = 0100

Or

    or eax, ebx                 ; eax = eax | ebx

A bitwise instruction, i.e. each bit is handled individually.

    1 | 1 = 1
    1 | 0 = 1
    0 | 0 = 0
    Eg: 0101 | 1100 = 1101

Not

    not eax                     ; eax = ~eax

A bitwise instruction, i.e. each bit is handled individually.

    ~1 = 0
    ~0 = 1
    Eg: ~0101 = 1010

CONDITIONALS

Compare

    cmp eax, ebx

Comparing is used to jump around the program. Say you only want to do the next few lines if the value of eax is 5 or greater. You must use the compare instruction first to set the necessary flags that tell the computer what to do next.

Unconditional Jump

    jmp label                   ; jump to label

Conditional jump

    j(cc) label                 ; jump to label if the condition is true (see condition codes below)

cc = condition code. For example, jle = jump if less than or equal.

Important!! You must use the cmp command first. Then the condition will be applied to whatever you compared.

    Condition code    Meaning
    E                 Equal
    Z                 Zero
    NE                Not equal
    NZ                Not zero
    B / NAE           Below / Not above or equal
    NB / AE           Not below / Above or equal
    BE / NA           Below or equal / Not above
    NBE / A           Not below or equal / Above
    L / NGE           Less than / Not greater than or equal
    NL / GE           Not less than / Greater than or equal
    LE / NG           Less than or equal / Not greater than
    NLE / G           Not less than or equal / Greater than
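The worked numbers in the cheat sheet can be checked with a few lines of C. This is only a sketch: the function names are made up for illustration, and it assumes 32-bit values like eax. It also shows one thing the condition code table does not spell out: the below/above codes (B, A, BE, AE) compare the operands as unsigned numbers, while the less/greater codes (L, G, LE, GE) compare the same bits as signed numbers.

```c
#include <stdint.h>

/* shr: divides an unsigned value by 2^count, rounding down (count must be under 32). */
uint32_t shift_right(uint32_t value, unsigned count) { return value >> count; }

/* shl: multiplies by 2^count; bits pushed past bit 31 are lost (count must be under 32). */
uint32_t shift_left(uint32_t value, unsigned count) { return value << count; }

/* and / or / not work on every bit of the value at once. */
uint32_t bit_and(uint32_t a, uint32_t b) { return a & b; }
uint32_t bit_or(uint32_t a, uint32_t b) { return a | b; }
uint32_t bit_not(uint32_t a) { return ~a; }

/* What jb would decide after cmp a, b: the operands are read as unsigned. */
int is_below(uint32_t a, uint32_t b) { return a < b; }

/* What jl would decide after cmp a, b: the same bits are read as signed.
   The casts assume the usual two's complement conversion. */
int is_less(uint32_t a, uint32_t b) { return (int32_t)a < (int32_t)b; }
```

For example, shift_right(97, 3) gives 12 and shift_left(67, 3) gives 536, matching the before/after examples. With a = 1 and b = 0xFFFFFFFF, is_below is true (1 is below 4294967295) but is_less is false (1 is not less than -1), which is why picking the right condition code matters.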

Declaration (initialized)

Goto C:

    string word = "hangman";
    int a = 5;
    char c = 'a';

X86 NASM:

Initialized variable declarations go in the .data section. Initialized means that they are immediately set equal to a value, like "hangman" or the number 5. Since you already know the value of all the variables you want to initialize, you can put their values in memory before the program even starts.

    section .data
    word:   db 'hangman'
    a:      db 0x05
    c:      db 'a'

db stands for declare byte. You're simply telling the assembler that you want to declare some bytes with the following data inside of them. The labels (word, a, and c) are how you can access this data anywhere in the program later. If you want a little more space, you can use dw instead. This stands for declare word. A word is made up of two bytes, or 16 bits. If you need four bytes (32 bits), use dd, which declares a doubleword.

Declaration (uninitialized)

Goto C:

    char guess;
    int i;

X86 NASM:

Uninitialized variables go in the .bss section. Uninitialized means that you don't know what value you want to put in that variable yet (like if the user wants to input a guess), but you know you're going to need a name and space for it later. You can use the .bss section to tell the assembler that you want to reserve space in memory for some future data. Similar to initialized data, you use a label so you can refer back to it later.

    section .bss
    guess:  resb 1
    i:      resb 1

resb stands for reserve byte. The 1 after it is simply telling the assembler to reserve 1 byte. You can use any number here. If you wrote resb 4, you would be reserving 4 bytes.
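For comparison, here is roughly how the same declarations look in real C, as a sketch rather than an exact equivalent. The fixed-width types are an assumption chosen to match the one-byte db and resb declarations above, and the enum names are made up. Global variables with a starting value typically end up in .data, and globals without one typically end up in .bss, where they start out as zero.

```c
#include <stdint.h>

/* Initialized data, like the .data section. */
char word[7] = {'h', 'a', 'n', 'g', 'm', 'a', 'n'};  /* word: db 'hangman' (7 bytes, no terminator) */
uint8_t a = 0x05;                                    /* a:    db 0x05 */
char c = 'a';                                        /* c:    db 'a'  */

/* Uninitialized data, like the .bss section. */
char guess;                                          /* guess: resb 1 */
uint8_t i;                                           /* i:     resb 1 */

/* Sizes in bytes of what db, dw and dd each declare. */
enum {
    DB_SIZE = sizeof(uint8_t),   /* byte       */
    DW_SIZE = sizeof(uint16_t),  /* word       */
    DD_SIZE = sizeof(uint32_t)   /* doubleword */
};
```

Here sizeof word is 7, the same number that an expression like $ - word would give right after the declaration in NASM.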

Conditionals and jumps (converting the goto part)

Goto C:

    if (i == 5) goto endwhile;
    if (x >= 2) goto done;

X86 NASM:

Writing a conditional in assembly is actually done in 2-3 steps. The first thing you need to do, if you haven't already, is put the value you want to compare (i, x) into a register. Let's say you already declared i and x way back in your .data section, each as a single byte with db. i and x were the labels you used when you declared them. A label by itself is an address, so put it in square brackets to load the value stored there, and use a one-byte register to match the one-byte variable.

    mov al, [i]             ; put the value of i into the register al
    mov bl, [x]             ; put the value of x into the register bl
    cmp al, 5               ; compare al and 5
    je endwhile             ; if they were equal, jump to (goto) endwhile
    cmp bl, 2               ; compare bl and 2
    jge done                ; if bl was >= 2, jump to (goto) done

The jump command is always in the format jxx, where xx is the condition code. A list of condition codes is in the x86 cheat sheet. To execute a conditional goto statement (a goto statement with an if before it), you MUST use the cmp command before you jump. If you just have a statement that has goto without a conditional, you can use jmp to jump no matter what.

Printing and getting input in X86

Goto C:

    print(output)
    inputchar(guess)

X86 NASM:

To print output and read input in NASM assembly, you must use system calls. Here is a chart of useful system calls and how to use them:

    Name    eax    edi             rsi                            edx
    read    0      0 (stdin)       label you want to read into    length of input
    write   1      1 (stdout)      label you want to print out    length of output
    exit    60     0 (no error)    N/A                            N/A
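The read and write rows of the chart line up with the C library functions read and write from unistd.h, which take the same three arguments in the same order: the file descriptor (edi), the address of the data (rsi) and the length (edx). The sketch below assumes a POSIX system such as Linux; the wrapper names print_bytes and read_bytes are made up for illustration.

```c
#include <unistd.h>

/* "write" row: fd plays the role of edi (1 = stdout), label of rsi, length of edx.
   Returns the number of bytes written, or -1 on error. */
ssize_t print_bytes(int fd, const char *label, size_t length) {
    return write(fd, label, length);
}

/* "read" row: fd plays the role of edi (0 = stdin), label of rsi, length of edx.
   The bytes are stored at the address passed in label.
   Returns the number of bytes read, or -1 on error. */
ssize_t read_bytes(int fd, char *label, size_t length) {
    return read(fd, label, length);
}
```

A call such as print_bytes(1, "-------\n", 8) prints the dashes to the screen, and read_bytes(0, &guess, 1) stores one typed character in a char variable named guess.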

Below is a full program to help you understand how to use syscalls. It's worth noting that the syscall assumes that output and guess (basically any value in rsi) will hold the address of where you want to put the data. It won't change the address that guess stands for; it changes the data stored at that address.

    section .data
    output:     db '-------', 0xa, 0xd  ; declare output with endline (0xa, 0xd)
    outputlen:  equ $ - output          ; declare the length of the output

    section .bss
    guess:      resb 1                  ; declare the uninitialized variable guess (1 byte)

    section .text
    global _start
    _start:
        mov eax, 1                      ; eax = 1 = sys_write
        mov edi, 1                      ; edi = 1 = stdout
        mov rsi, output                 ; rsi = output = label you want to print
        mov edx, outputlen              ; edx = outputlen = length of output
        syscall

        mov eax, 0                      ; eax = 0 = sys_read
        mov edi, 0                      ; edi = 0 = stdin
        mov rsi, guess                  ; rsi = guess = label you want to read into
        mov edx, 1                      ; edx = 1 = size of input (guess = 1 byte)
        syscall

        mov eax, 60                     ; eax = 60 = sys_exit (exit the program correctly)
        mov edi, 0                      ; edi = 0 = no error (error code)
        syscall

Array Access

Goto C:

    word[i] = 'b'
    third = word[3]

X86 NASM:

An array is a series of characters or numbers that are all related and right next to each other in memory. To access different pieces of an array, most programming languages use the square bracket operator ([]). The number in the brackets is called the index. Let's say you have an array called number:

number[0] would be 10, the first element. number[2] would be 30. So how does this relate to our word? In C and assembly language, strings are actually just arrays of characters, like this:

    Index   0   1   2   3   4   5   6
    Value   h   a   n   g   m   a   n

If you have the statement word[i] = 'b', and i was equal to 2, then the array would change to this:

    Index   0   1   2   3   4   5   6
    Value   h   a   b   g   m   a   n

In assembly, a label only stands for the address of the first value in memory (in this case, the h). So, if you declared word like this:

    section .data
    word:       db 'hangman'
    wordlen:    equ $ - word

word would technically hold only the address of the h. That's why you need wordlen to remember how long the word is.

While there is a square bracket operator in assembly, it is used a bit differently. The square brackets tell the assembler that the value inside of them is an address and that it needs to dereference it (that is, go to the address) to access the value. When you store a constant through square brackets, you also have to say how big the value is (here, byte), because the assembler cannot tell from the constant alone.

    mov eax, word               ; put the address of word into eax
    mov ebx, 3                  ; put 3 into ebx. this is your index
    mov byte [word + ebx], 0x62 ; the ascii value for 'b' is 0x62. put that in word + ebx (3)

These three lines are equivalent to word[3] = 'b'.
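The same idea can be written in C, where word[index] is just shorthand for "go to the address word + index". This is a small sketch with made-up helper names, store_byte and load_byte, each of which moves exactly one byte.

```c
#include <stddef.h>

/* Like  mov byte [word + ebx], value : store one byte at the address word + index. */
void store_byte(char *word, size_t index, char value) {
    *(word + index) = value;    /* identical to word[index] = value */
}

/* Like  mov al, [word + ebx] : load the one byte stored at the address word + index. */
char load_byte(const char *word, size_t index) {
    return *(word + index);     /* identical to word[index] */
}
```

Calling store_byte(word, 2, 0x62) on the characters of hangman changes them to habgman, as in the second table above.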

Registers in x86

Each register is 64 bits, with a 32 bit part and a 16 bit part. Some registers also have two 8 bit parts. The idea is that since there are only a limited number of general purpose registers (16 in 64-bit mode), you can use the smaller pieces to store more information when the data you want to store is smaller. Here is a chart with the sizes of data you'll probably be working with in this class. It will help you decide which size register to use.

    Type             Examples                      Size                 Register examples
    Small number     -128 to 127                   8 bits (1 byte)      ah, al, bh, bl
    Medium number    -32768 to 32767               16 bits (2 bytes)    ax, bx, cx, dx
    Large number     -2147483648 to 2147483647     32 bits (4 bytes)    eax, ebx, ecx
    Character        a, b, c                       8 bits (1 byte)      ah, al, bh, bl

***WARNING*** The pieces of each register are still technically the same register. That's why in the diagram the larger ones overlap with the smaller ones. This means, for example, that if you are using register ah and decide to put something in rax, eax or ax, the value in ah will be overwritten, since the larger pieces use the same space. That means you can't use ah and ax, eax or rax at the same time (or any two or more registers that overlap in the diagram).
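One way to picture the overlap is a C union, where every member shares the same bytes. This is only a model with made-up names (reg_a, ah_after_writing_ax), and it assumes a little-endian machine such as x86, where the lowest byte comes first in memory. It shows only the overlap between the pieces, not every rule the real processor follows when you write to part of a register.

```c
#include <stdint.h>

/* All members start at the same address, like rax, eax, ax, al and ah. */
typedef union {
    uint64_t rax;                               /* all 64 bits    */
    uint32_t eax;                               /* lowest 32 bits */
    uint16_t ax;                                /* lowest 16 bits */
    struct { uint8_t al; uint8_t ah; } bytes;   /* al = bits 0-7, ah = bits 8-15 */
} reg_a;

/* Put a value in ah, then write ax, and report what ah holds afterwards. */
uint8_t ah_after_writing_ax(uint8_t old_ah, uint16_t new_ax) {
    reg_a r = { .rax = 0 };
    r.bytes.ah = old_ah;    /* "using" ah */
    r.ax = new_ax;          /* writing ax reuses the same space */
    return r.bytes.ah;      /* now the high byte of new_ax, not old_ah */
}
```

For instance, ah_after_writing_ax(0x12, 0x00FF) returns 0x00: the 0x12 that was in ah is gone as soon as ax is written.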

Hangman Goto C Answer Key

C Code

    int main() {
        string word = "hangman";
        string output = "-------";
        int guessed = 0;
        while (guessed != len(word)) {
            char guess = 'a';
            print(output);
            inputchar(guess);
            for (int i = 0; i < len(word); i++) {
                if (guess == word[i]) {
                    output[i] = guess;
                    print(output);
                    guessed++;
                }
            }
        }
        print("You won!");
    }

Goto C

        string word = "hangman";
        string output = "-------";
        int guessed = 0;
    while:
        if (guessed == len(word)) goto endwhile;
        char guess = 'a';
        print(output);
        inputchar(guess);
        int i = 0;
    for:
        if (i >= len(word)) goto endfor;
        if (guess != word[i]) goto endif;
        output[i] = guess;
        print(output);
        guessed++;
    endif:
        i++;
        goto for;
    endfor:
        goto while;
    endwhile:
        print("You won!");
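The Goto C version can be turned into C that actually compiles with a few changes, sketched below under some stated assumptions. The labels are renamed (while and for are reserved words in real C), len becomes strlen, and printing and keyboard input are replaced by a scripted string of guesses so the function can run on its own. The function name play_hangman and its parameters are made up for illustration, and the guesses are assumed not to repeat a letter.

```c
#include <string.h>

/* Plays one game using the characters of `guesses` in place of keyboard input.
   `output` must start as dashes, one per letter of `word`.
   Returns how many guesses were used; `output` holds the letters found so far. */
int play_hangman(const char *word, char *output, const char *guesses) {
    size_t len = strlen(word);
    size_t guessed = 0;
    size_t i;
    int used = 0;
    char guess;

while_top:
    if (guessed == len) goto end_while;
    if (guesses[used] == '\0') goto end_while;  /* no scripted guesses left */
    guess = guesses[used];                      /* stands in for inputchar(guess) */
    used++;
    i = 0;
for_top:
    if (i >= len) goto end_for;
    if (guess != word[i]) goto end_if;
    output[i] = guess;
    guessed++;
end_if:
    i++;                                        /* the i++ of the for loop */
    goto for_top;
end_for:
    goto while_top;
end_while:
    return used;
}
```

Starting from output = "-------", the guesses "hangm" fill in the whole word in five rounds, because a, n each appear twice in hangman.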

Arrays and Strings in x86

Let's say you write the following assembly code:

    section .data
    word:       db 'hangman'
    wordlen:    equ $ - word

    section .text
    global _start
    _start:
        mov eax, word           ; put the address of word into eax

Then, you print the value of eax. The value that prints is 0x400ee4. Remember the 0x means that this value is in hexadecimal.

Fill in the bold cells of the table. If you don't think we can know what's in that location, write a ?.

    Address in memory                  Content
    0x300fe2 (location of word)        0x400ee4
    ...
    0x308e42 (location of wordlen)     7
    ...
    0x400ee3
    0x400ee4
    0x400ee5
    0x400ee6
    0x400ee7
    0x400ee8
    0x400ee9
    0x400eea
    0x400eeb

You now add the following lines to your assembly code. Write in the blank what the value of al (the lowest byte of eax) will be after each pair of lines. Each character is one byte, so a one-byte register is used to load it. It will be very helpful to look back at the filled out table.

Hint: Remember that the square brackets ([]) in assembly mean to go to the address inside of them. Find out the value inside the square brackets first and then try to figure out what is at that location.

Example:

    mov ebx, 0
    mov al, [word + ebx]        ; al = h

    mov ebx, 3
    mov al, [word + ebx]        ; al = ____

    mov ebx, 6
    mov al, [word + ebx]        ; al = ____

    mov ebx, 4
    mov al, [word + ebx]        ; al = ____

Then, you add these lines:

    mov ebx, 2
    mov byte [word + ebx], 'y'

If you print word to the screen, what will be printed?