Assembler, lecture 4
S. Šimoňák, DCI FEEI TU of Košice

Addressing
- specifying how data is accessed
- arrays: specification and manipulation
- impact of addressing on performance

Processor architecture
- CISC: more addressing modes
- RISC: limited number of addressing modes
  - instructions work with operands in CPU registers
  - load/store instructions transfer data between registers and memory

Addressing on Pentium processors (x86, CISC)
- Register addressing mode
  - operands in CPU registers, fast access
- Immediate addressing mode
  - only one operand can be immediate; the operand is part of the instruction
- Memory addressing mode
  - several modes, each a different way of specifying the effective address (offset)

Memory addressing modes
- motivation: efficient support of HLL constructs and data structures
- the available addressing modes depend on the address size
  - 16-bit addresses (as on the i8086)
  - 32-bit addresses (more flexible)
[Figures: addressing modes available with 16-bit addresses [1] and with 32-bit addresses [1]]
Comparing 16-bit and 32-bit modes
- 32-bit mode gives more flexibility in register usage
- 32-bit mode can take the operand size into account (scale factor)
- which addressing mode does the CPU use?
  - bit D of the CS segment descriptor decides (D = 1: 32-bit default)
  - the default can be overridden explicitly with a size override prefix
    - 66H (operand size override prefix)
    - 67H (address size override prefix)
  - with these prefixes a mixed mode is available (16/32-bit data and addresses)
- in this course, 32-bit data and addresses are generally used

Example: machine code generated by the assembler
    mov eax, 123      B8 0000007B
    mov ax, 123       66 B8 007B      (prefix inserted automatically)
    mov al, 123       B0 7B           (different operation code!)
    mov EAX, [BX]     67 8B 07
    mov AX, [BX]      66 67 8B 07     (both prefixes in use)
Based addressing
- one register serves as the base when the operand address is calculated
- effective address: sum of the register content and a signed displacement
    Base + disp                ; signed displacement
- typical use: access to structure elements

Example: number of free places in a given course [1]
- let EBX contain SSA
    ...
    mov AX, [EBX + 48]
    sub AX, [EBX + 46]
    ...
Indexed addressing
- effective address calculation
    (Index * scale) + disp     ; signed displacement
- typical use: access to array elements
  - start of the array (disp)
  - array element (index register)
  - element size (scale 2, 4, 8; 32-bit mode only)

Example (elements of marks_table and table1 are 4 bytes each):
    mov EAX, [marks_table + ESI*4]   ; ESI holds the element index, scaled by 4
    add EAX, [table1 + ESI]          ; ESI holds the offset in bytes (e.g. 36 for the 10th element)

Based indexed addressing
- two types
- without scale factor (operand size not taken into account)
    Base + Index + disp        ; signed displacement (8/16 bits in 16-bit mode, 8/32 bits in 32-bit mode)
  - two-dimensional arrays (disp: start of the array)
  - arrays of records (disp: offset of the record element)
- with scale factor (operand size taken into account)
    Base + (Index * scale) + disp
  - efficient way of accessing elements of two-dimensional arrays (element size 2, 4, 8 B)
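The effective-address formulas above can be modelled in a few lines of C. This is only an illustrative sketch: the function name effective_address and its parameters are hypothetical, and the model assumes 32-bit addressing, where the arithmetic wraps modulo 2^32.

```c
#include <stdint.h>

/* Hypothetical model of the 32-bit effective-address calculation
 *     Base + (Index * scale) + disp
 * Pass base = 0 for plain indexed addressing, or scale = 1 for
 * based indexed addressing without a scale factor.
 * Unsigned arithmetic wraps modulo 2^32, like the address adder. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t disp)
{
    /* Converting a negative disp to uint32_t yields its two's
     * complement bit pattern, so the addition behaves as signed. */
    return base + index * scale + (uint32_t)disp;
}
```

With marks_table assumed to start at address 1000, effective_address(0, 9, 4, 1000) gives 1036, which is the byte offset 36 of the 10th element mentioned in the example.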
Example: Insertion sort
- the program reads a sequence of integers and prints them in sorted order
- how the algorithm works (each new element is inserted into the sorted array at its correct position)
  - we start with an empty array
  - after the first element is inserted, the array is sorted
  - insert the next element at its correct position
  - repeat the process until all elements are inserted
- pseudocode
  - index i: the element being inserted
  - elements to the left of i: already sorted
  - elements to the right of i (including i): still to be inserted
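The algorithm described above can be sketched in C as follows. The variable names temp, i and j follow the mapping used in the comments of the assembly procedure; everything else (the signature, the use of int indexes) is an assumption made for illustration.

```c
/* Insertion sort sketch: sorts array[0..size-1] in ascending order. */
void insertion_sort(int array[], int size)
{
    for (int i = 1; i < size; i++) {
        int temp = array[i];      /* element being inserted        */
        int j = i - 1;            /* array[0..i-1] is sorted       */
        while (j >= 0 && temp < array[j]) {
            array[j + 1] = array[j];  /* shift larger element right */
            j = j - 1;
        }
        array[j + 1] = temp;      /* insert at the correct position */
    }
}
```

Unlike the assembly version, this sketch tests j >= 0 before reading array[j], which also makes it safe for arrays with zero or one element.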
Main program:

    01  %include "asm_io.inc"
    02  MAX_SIZE      EQU   100
    03
    04  segment .data
    05  input_prompt  db    "Enter an input array: "
    06                db    "(negative terminates input)",0
    07  out_msg       db    "The array sorted:",0
    08
    09  segment .bss
    10  array         resd  MAX_SIZE
    11
    12  segment .text
    13        global  _asm_main
    14  _asm_main:
    15        enter   0,0
    16        pusha
    17        mov     EAX, input_prompt
    18        call    print_string
    19        mov     EBX, array          ; EBX = address of the next free element
    20        mov     ECX, MAX_SIZE       ; read at most MAX_SIZE numbers
    21  array_loop:
    22        call    read_int
    23        call    print_nl
    24        cmp     EAX,0
    25        jl      exit_loop           ; a negative number terminates the input
    26        mov     [EBX], EAX
    27        add     EBX, 4
    28        loop    array_loop
    29  exit_loop:
    30        mov     EDX, EBX
    31        sub     EDX, array          ; EDX = number of bytes stored
    32        shr     EDX, 2              ; EDX = number of elements (bytes / 4)
    33        push    EDX                 ; parameter: array size
    34        push    array               ; parameter: array address
    35        call    insertion_sort
    36        mov     EAX, out_msg
    37        call    print_string
    38        call    print_nl
    39        mov     ECX, EDX
    40        mov     EBX, array
    41  display_loop:
    42        mov     EAX, [EBX]
    43        call    print_int
    44        call    print_nl
    45        add     EBX, 4
    46        loop    display_loop
    47  done:
    48        popa
    49        mov     EAX, 0
    50        leave
    51        ret
Procedure insertion_sort:

    52  %define SORT_ARRAY EBX
    53  insertion_sort:
    54        pushad
    55        mov     EBP, ESP
    56
    57        mov     EBX, [EBP+36]
    58        mov     ECX, [EBP+40]
    59        mov     ESI, 4
    60  for_loop:
    61        ; variable mapping:
    62        ; EDX = temp, ESI = i, and EDI = j
    63        mov     EDX, [SORT_ARRAY+ESI]
    64        mov     EDI, ESI                ; j = i-1
    65        sub     EDI, 4
    66  while_loop:
    67        cmp     EDX, [SORT_ARRAY+EDI]   ; temp < array[j]
    68        jge     exit_while_loop
    69
    70        ; array[j+1] = array[j]
    71        mov     EAX, [SORT_ARRAY+EDI]
    72        mov     [SORT_ARRAY+EDI+4], EAX
    73        sub     EDI, 4                  ; j = j-1
    74        cmp     EDI, 0                  ; j >= 0
    75        jge     while_loop
    76  exit_while_loop:
    77        ; array[j+1] = temp
    78        mov     [SORT_ARRAY+EDI+4], EDX
    79        add     ESI, 4                  ; i = i+1
    80        dec     ECX
    81        cmp     ECX, 1
    82        jne     for_loop
    83  sort_done:
    84        popad
    85        ret     8

- procedure without a return value (all registers saved and restored by pushad/popad)
- access to parameters: pushad stores 32 B and the return address takes another 4 B, so the first parameter is at EBP+36
- while loop (lines 66-76)
- for loop (lines 60-82)
- based addressing (lines 57, 58)
- based-indexed addressing (lines 63, 67, 71, 72, 78)
Arrays

One-dimensional arrays
- one-dimensional array in C (indexes start at 0)
    int test_marks[10];
- the HLL declaration (size: 40 B) specifies
  - array name
  - number of elements (10)
  - element size (4)
  - element type (int)
  - indexes (0 to 9)
- array in assembly language
  - space allocation
      test_marks resd 10
  - correct access to elements is the programmer's task (indexes, element size)
  - elements are stored linearly; each is located by its offset from the beginning of the array (offset = index * element size)
Multidimensional arrays
- two-dimensional array in C (5 rows x 3 columns)
    int class_marks[5][3];
- memory representation (linear array of bytes)
  - row-major ordering, e.g. C
  - column-major ordering, e.g. Fortran
- two-dimensional array in assembly language [1]
  - knowing the memory representation is essential
  - allocation (60 B)
      class_marks resd 5*3
  - index to offset translation (row-major)
      offset = (i * COLS + j) * ELM_SIZE
    where COLS is the number of columns, i the row and j the column
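The index to offset translation can be checked with a small C helper. This is a sketch only; the function name row_major_offset and its parameter names are hypothetical.

```c
#include <stddef.h>

/* Byte offset of element [i][j] in a row-major two-dimensional array:
 *     offset = (i * COLS + j) * ELM_SIZE
 * With cols = 1 and i = 0 this reduces to the one-dimensional case
 * offset = index * element size. */
size_t row_major_offset(size_t i, size_t j, size_t cols, size_t elm_size)
{
    return (i * cols + j) * elm_size;
}
```

For class_marks (5 rows x 3 columns, 4-byte elements) the last element [4][2] lies at offset (4 * 3 + 2) * 4 = 56, the final 4 bytes of the 60 B allocated.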
Integer arithmetic
- impact of arithmetic and logic instructions on the status flags
- multiplication and division
- multiword arithmetic

Status flags
- 6 flags monitoring the results of operations: ZF, CF, OF, SF, AF, PF
- once a flag is updated, it remains unchanged until another instruction changes its state
- not all instructions affect the status flags (add, sub: all 6; inc, dec: all except CF; mov, push: none)
- flags can be tested (individually or in combinations) to control program execution

Zero flag
- if the result of the last operation affecting ZF was 0, then ZF = 1, otherwise ZF = 0
- intuitive for sub, sometimes a bit less intuitive for other instructions

Example:
    mov AL,0FH
    add AL,0F1H     (sets ZF = 1, all 8 bits of AL are 0)

    mov AX,0FFFFH
    inc AX          (sets ZF = 1)

    mov EAX,1
    dec EAX         (sets ZF = 1)

- conditional jump instructions: jz (if ZF = 1), jnz (if ZF = 0)
- using ZF
  - test of equality (often with the cmp instruction)
      cmp char,'$'
      cmp EAX, EBX
  - counting to a given value (M, N ≥ 1)
    - inner loop: ECX with loop (loop does not affect the flags)
    - outer loop: EDX with dec and jnz

Carry flag
- set when the result of an arithmetic operation on unsigned numbers is out of the range of the destination (register or memory)

Example:
    mov AL,0FH           00001111
    add AL,0F1H          11110001
                         --------
                        100000000
- with an 8-bit register (AL has 8 bits) a 9th bit would be required
- value ranges of unsigned integers (8 bits: 0 to 255; 16 bits: 0 to 65535; 32 bits: 0 to 4294967295)
- an operation producing a result out of range sets CF; a negative result is therefore out of range as well

Example:
    mov EAX,12AEH
    sub EAX,12AFH   (4782 - 4783 = -1, CF = 1)

    mov EAX,0
    dec EAX         (result is -1, but CF is not set: inc and dec do not affect CF)

- conditional jump instructions: jc (if CF = 1), jnc (if CF = 0)
- using CF
  - carry/borrow propagation in multiword addition/subtraction
    - instruction operands are 8, 16 or 32 bits wide; larger operands are processed step by step, taking the carry into account
  - overflow/underflow detection
    - indicates a result out of range (handling the situation is up to the programmer)
  - testing a bit using shifts/rotations
    - the bit shifted out (MSb or LSb) is captured in CF
    - conditional jumps can then be used (conditional code execution)
- why inc and dec do not affect CF
  - they are often used to count loop iterations, and a 32-bit value is enough for most applications
  - the condition CF would detect can also be detected with ZF (setting CF would be redundant)
    - if ECX = FFFFFFFFH and inc ECX is executed, we would expect CF = 1, but the condition can also be detected by ZF (ECX = 0)
Overflow flag
- like CF, but for operations on signed numbers
- indicates a result out of the valid range
- ranges for signed numbers (8 bits: -128 to 127; 16 bits: -32768 to 32767; 32 bits: -2147483648 to 2147483647)

Example:
    mov AL,72H
    add AL,0EH      (114 + 14 = 128, OF = 1)
- 128 (80H) is the correct sum when the numbers are interpreted as unsigned
- with the signed interpretation the result is incorrect: 80H means -128

Signed/unsigned interpretation
- how does the system know how the program interprets a string of bits? (it does not)
- the processor takes both interpretations into account and sets CF and OF accordingly
    mov AL,72H
    add AL,0EH      (114 + 14 = 128: CF = 0, OF = 1)
- checking the appropriate flag is the programmer's task
- conditional jump instructions: jo (if OF = 1), jno (if OF = 0)
- software interrupt instruction: into (interrupt on overflow, generates INT 4)
Sign flag
- the sign of the operation result
- useful only with the signed interpretation
- copy of the most significant bit of the result

Example:
    mov EAX,15
    add EAX,97      (15 + 97 = 112, SF = 0)

    mov EAX,15
    sub EAX,97      (15 - 97 = -82, SF = 1)

    15 + (-97), shown on 8 bits:
        00001111    (15)
        10011111    (-97, two's complement)
        --------
        10101110    (-82, two's complement)

- conditional jump instructions: js (if SF = 1), jns (if SF = 0)
- usage
  - testing the sign of a result
  - loops whose control variable decreases down to zero (inclusive)
Auxiliary carry flag
- carry from (or borrow into) the lower 4 bits (nibble) of the operand

    mov AL,43                 00101011  (43)
    add AL,94   (AF = 1)      01011110  (94)
                              --------
                              10001001  (137)

    mov AL,43                 00101011  (43)
    add AL,84   (AF = 0)      01010100  (84)
                              --------
                              01111111  (127)

- related instructions
  - there are no conditional jumps testing AF
  - arithmetic operations on BCD numbers
    - aaa, aas, aam, aad (ASCII adjust for addition, subtraction, ...)
    - daa, das (decimal adjust for addition, ...)

Parity flag
- parity of the 8-bit result of an operation (only the lower 8 bits affect PF)
- even number of 1 bits: PF = 1; odd number: PF = 0

    mov AL,53                 00110101  (53)
    add AL,89   (PF = 1)      01011001  (89)
                              --------
                              10001110  (142)

- related instructions
  - jumps: jp (if PF = 1), jnp (if PF = 0)
- usage (e.g. data encoding)
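The six status flags can be reproduced for an 8-bit add with a small C model, which makes it easy to check the examples above by hand. This is a sketch under stated assumptions: the type struct flags and the function add8_flags are hypothetical names, and the model covers only 8-bit addition.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical container for the six status flags. */
struct flags { int zf, cf, of, sf, af, pf; };

/* Models "add dst,src" on 8-bit operands and returns the flags.
 * The 8-bit result is stored through result if it is not NULL. */
struct flags add8_flags(uint8_t a, uint8_t b, uint8_t *result)
{
    unsigned wide = (unsigned)a + (unsigned)b;   /* up to 9 bits */
    uint8_t r = (uint8_t)wide;
    struct flags f;
    int ones = 0;

    f.zf = (r == 0);                              /* result is zero        */
    f.cf = (wide > 0xFF);                         /* unsigned out of range */
    /* signed overflow: operands have equal signs, result sign differs */
    f.of = ((~(a ^ b) & (a ^ r)) & 0x80) != 0;
    f.sf = (r & 0x80) != 0;                       /* copy of the MSb       */
    f.af = ((a & 0x0F) + (b & 0x0F)) > 0x0F;      /* carry out of bit 3    */
    for (int bit = 0; bit < 8; bit++)
        ones += (r >> bit) & 1;
    f.pf = (ones % 2 == 0);                       /* even number of ones   */

    if (result != NULL)
        *result = r;
    return f;
}
```

For instance, add8_flags(0x72, 0x0E, &r) leaves r = 80H with CF = 0 and OF = 1, and add8_flags(0x0F, 0xF1, &r) leaves r = 0 with ZF = 1 and CF = 1, matching the lecture examples.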
Example: modem transfer using 7-bit ASCII code
- simple detection of transfer errors by adding a parity bit to each 7-bit datum
- suppose even parity encoding (the 8th bit is set when needed)
- the receiver counts the ones in the received byte (error if their number is odd)
    A   41H   (code: 01000001, MSb 0)
    C   43H   (code: 11000011, MSb 1, set)

Example: effects of arithmetic operations on the flags [1]
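The even parity scheme from the modem example can be sketched in C. The helper names add_even_parity and parity_ok are hypothetical and chosen only for this illustration.

```c
#include <stdint.h>

/* Counts the 1 bits in a byte. */
static int count_ones(uint8_t byte)
{
    int ones = 0;
    for (int bit = 0; bit < 8; bit++)
        ones += (byte >> bit) & 1;
    return ones;
}

/* Sender: sets the 8th bit (MSb) of a 7-bit code when needed,
 * so that the transmitted byte contains an even number of ones. */
uint8_t add_even_parity(uint8_t code7)
{
    code7 &= 0x7F;                         /* keep the 7 data bits only */
    if (count_ones(code7) % 2 != 0)
        code7 |= 0x80;                     /* odd so far: set parity bit */
    return code7;
}

/* Receiver: an odd number of ones indicates a transfer error. */
int parity_ok(uint8_t received)
{
    return count_ones(received) % 2 == 0;
}
```

Here add_even_parity(0x41) returns 41H unchanged for 'A', while add_even_parity(0x43) returns C3H for 'C', as in the example. A single flipped bit makes parity_ok fail; two flipped bits go undetected.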
Arithmetic instructions
- addition (add, adc, inc)
- subtraction (sub, sbb, dec, neg, cmp)
- multiplication and division (mul, imul, div, idiv)
- related instructions (cbw, cwd, cdq, cwde, movsx, movzx)
- addition and subtraction have already been discussed

Multiplication instructions
- properties of multiplication
  - result size: 2n bits when multiplying two n-bit numbers
  - multiplication of signed numbers differs from that of unsigned numbers (hence 2 multiplication instructions)
- multiplication of unsigned numbers (mul)
  - syntax
      mul src         (src: 8, 16 or 32-bit GPR or memory)
  - semantics (according to the size of src)
      8 bits:   AX      <- src * AL
      16 bits:  DX:AX   <- src * AX
      32 bits:  EDX:EAX <- src * EAX
  - the instruction affects all 6 status flags, but only CF and OF are defined; the rest are undefined
  - CF and OF are set if the upper half of the result (AH, DX, EDX) is not zero

Example:
    mov AL,10                   mov AL,10
    mov DL,25                   mov DL,26
    mul DL  ; CF = OF = 0       mul DL  ; CF = OF = 1
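The 8-bit form of mul can be modelled in C to see when CF and OF are set. This is a sketch; the function name mul8 and the out-parameter cf_of are hypothetical.

```c
#include <stdint.h>

/* Models "mul src" with an 8-bit source: AX <- src * AL.
 * Returns the 16-bit product (AX); *cf_of receives the common value
 * of CF and OF, which is 1 when the upper half (AH) is not zero. */
uint16_t mul8(uint8_t al, uint8_t src, int *cf_of)
{
    uint16_t ax = (uint16_t)((unsigned)al * (unsigned)src);
    *cf_of = (ax >> 8) != 0;
    return ax;
}
```

mul8(10, 25, &c) returns 250 with c = 0, because the product still fits into AL; mul8(10, 26, &c) returns 260 with c = 1, because AH becomes 1.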
- multiplication of signed numbers (imul)
  - syntax like mul; additional formats are supported, e.g. with an immediate operand
  - CF and OF are set if the upper half of the result is not the sign extension of the lower half

Example: sign extension of the value -66
    (-66)10 = (10111110)2            8-bit
    (-66)10 = (1111111110111110)2    16-bit

Example:
    mov DL,0FFH     ; DL = -1
    mov AL,42H      ; AL = 66
    imul DL         ; AX = -66 (1111111110111110)2, CF = OF = 0

Division instructions
- properties of division
  - two values as a result: quotient and remainder
  - multiplication cannot overflow (the result has double the length of the operands), but division can (divide overflow)
- syntax
    div src         (unsigned; src: 8, 16 or 32-bit GPR or memory)
    idiv src        (signed)
- semantics of div (according to the size of the divisor src)
    8 bits:   AL  <- quot(AX/src),       AH  <- rem(AX/src)
    16 bits:  AX  <- quot(DX:AX/src),    DX  <- rem(DX:AX/src)
    32 bits:  EAX <- quot(EDX:EAX/src),  EDX <- rem(EDX:EAX/src)
  - the status flags are undefined after the instruction
- semantics of idiv
  - same format and behaviour as div
  - complication: if the dividend is negative, sign extension is required
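The 8-bit form of div, including the divide overflow case, can be modelled in C. This is a sketch only: the function name div8 is hypothetical, and the return value -1 stands in for the divide error exception the processor raises.

```c
#include <stdint.h>

/* Models "div src" with an 8-bit divisor:
 *     AL <- quot(AX/src), AH <- rem(AX/src)
 * Returns 0 on success, or -1 where the processor would raise a
 * divide error (division by zero, or a quotient that does not fit
 * into AL, i.e. divide overflow). */
int div8(uint16_t ax, uint8_t src, uint8_t *al, uint8_t *ah)
{
    if (src == 0)
        return -1;                 /* division by zero          */
    unsigned quot = ax / src;
    if (quot > 0xFF)
        return -1;                 /* quotient too large for AL */
    *al = (uint8_t)quot;
    *ah = (uint8_t)(ax % src);
    return 0;
}
```

div8(251, 12, &al, &ah) yields quotient 20 and remainder 11, whereas div8(1000, 2, &al, &ah) fails, since the quotient 500 does not fit into 8 bits.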
Example: division -251/12 (16-bit)
- (-251) = FF05H in AX, thus DX must be initialized to FFFFH
- if DX were initialized to 0000H (as for div), DX:AX would represent a positive number!
- if the dividend is positive, DX should be 0000H

Instructions for sign extension
    cbw (convert byte to word)              - extends AL into AH (8-bit idiv)
    cwd (convert word to doubleword)        - extends AX into DX (16-bit idiv)
    cdq (convert doubleword to quadword)    - extends EAX into EDX (32-bit idiv)

Other related instructions
- cwde: sign-extends AX into EAX
- movsx dst,src (move sign-extended src to dst)
  - dst: register; src: register or memory
  - if src is 8-bit, dst is 16 or 32-bit; if src is 16-bit, dst is 32-bit
- movzx dst,src (move zero-extended src to dst)
  - same operand rules as movsx

Example: 16-bit signed division
    mov AX,-5147
    cwd             ; DX = FFFFH
    mov CX,300
    idiv CX         ; AX = FFEFH (-17) quotient, DX = FFD1H (-47) remainder

Using shifts for multiplying and dividing
- an efficient alternative when multiplying or dividing by a power of 2; use it whenever possible

Example: AX * 32 (multiplicand in AX), 2 alternatives (b is better in both speed and space)
    a)  mov CX,32           b)  sal AX,5
        mul CX
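The role of sign extension in the 16-bit signed division examples above can be checked with a C model of cwd and idiv. This is a sketch: the function names cwd and idiv16 merely mirror the instruction names, the return value -1 stands in for the divide error exception, and the conversion from uint32_t to int32_t assumes the usual two's complement behaviour.

```c
#include <stdint.h>

/* Models cwd: returns the value DX gets when AX is sign-extended
 * into DX:AX (FFFFH for a negative AX, 0000H otherwise). */
uint16_t cwd(uint16_t ax)
{
    return (ax & 0x8000u) ? 0xFFFFu : 0x0000u;
}

/* Models "idiv src" with a 16-bit divisor; the dividend is DX:AX.
 * Returns 0 on success, -1 where a divide error would be raised. */
int idiv16(uint16_t dx, uint16_t ax, int16_t src,
           int16_t *quot, int16_t *rem)
{
    /* assumption: two's complement conversion to a signed value */
    int32_t dividend = (int32_t)(((uint32_t)dx << 16) | ax);

    if (src == 0)
        return -1;                          /* division by zero      */
    if (dividend == INT32_MIN && src == -1)
        return -1;                          /* quotient cannot fit   */
    int32_t q = dividend / src;             /* truncates toward zero */
    if (q < INT16_MIN || q > INT16_MAX)
        return -1;                          /* divide overflow       */
    *quot = (int16_t)q;
    *rem = (int16_t)(dividend % src);       /* sign follows dividend */
    return 0;
}
```

With AX = -5147 and DX = cwd(AX) = FFFFH, idiv16 gives quotient -17 and remainder -47. With DX left at 0000H the same AX is read as the positive number 60389, and the result is 201 with remainder 89, which is the mistake the -251/12 example warns about.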
Arithmetic operations on multiword data (multiword arithmetic)
- arithmetic instructions work with 8, 16 and 32-bit data (larger data are a problem)
- basics of multiword arithmetic

Addition and subtraction (64-bit, unsigned)
- relatively simple
- addition: we add the lower (right) 32 bits first, then the upper (left) 32 bits together with the carry from the previous step

Example: addition of two 64-bit numbers in EBX:EAX and EDX:ECX, result in EBX:EAX. Overflow is indicated by CF.
    add64:
        add EAX,ECX     ; subtraction works similarly (add -> sub, adc -> sbb)
        adc EBX,EDX
        ret

Multiplication and division
- detailed information on multiword multiplication and division can be found in [1]

Study literature:
[1] Dandamudi, S. P.: Introduction to Assembly Language Programming, Springer Science+Business Media, Inc., 2005.
[2] Carter, P. A.: PC Assembly Language, 2006, http://www.drpaulcarter.com/pcasm/
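As a supplement to the add64 example from the multiword arithmetic part, the same two-step addition can be sketched in C using only 32-bit operations. The C function below is a hypothetical model, not a translation produced by any tool; hi:lo plays the role of EBX:EAX and add_hi:add_lo the role of EDX:ECX.

```c
#include <stdint.h>

/* Adds the 64-bit number add_hi:add_lo to hi:lo using 32-bit steps.
 * Returns the final carry (the CF left by adc), which indicates
 * unsigned overflow of the 64-bit result. */
int add64(uint32_t *hi, uint32_t *lo, uint32_t add_hi, uint32_t add_lo)
{
    uint32_t low = *lo + add_lo;            /* add EAX,ECX             */
    uint32_t carry = (low < add_lo);        /* CF after the first step */

    uint32_t high = *hi + add_hi;           /* adc EBX,EDX: operands   */
    int carry_out = (high < add_hi);
    high += carry;                          /* ... plus the carry in   */
    carry_out |= (high < carry);

    *lo = low;
    *hi = high;
    return carry_out;
}
```

Adding 1 to 00000000:FFFFFFFFH gives 00000001:00000000H without a final carry; adding 1 to FFFFFFFF:FFFFFFFFH gives zero with the carry set.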