Lab Session 4

Objective:
Learn how data is represented in assembly language.
Introduction to data types and using different data types in assembly language programs.

Theory:
The basic machine data types are bit, byte, word, doubleword, quadword, tword (ten bytes), and double quadword. These data types can be signed or unsigned, integer or floating-point, packed or unpacked, and their values can be written in one of four formats: binary (base 2), octal (base 8), decimal (base 10), or hexadecimal (base 16). Hexadecimal is often referred to as hex.

Integer and Floating-point Data Types
Integers are whole numbers, that is, numbers without a fractional part. They are stored as either signed or unsigned values. Signed numbers can be positive or negative; unsigned numbers are never negative.

Floating-point numbers are numbers with a fractional part. Unlike integers, they always carry a sign bit, so there is no unsigned floating-point type. Floating-point describes a method of representing real numbers to a fixed number of significant digits, scaled using an exponent. The "floating" part of the term refers to the fact that the radix point can float (be positioned) anywhere in the significant part of the number. Typical floating-point numbers are represented in the following scientific notation:

    significant digits x base^exponent

The advantage of floating-point numbers over fixed-point and integer representations is that they can support a much wider range of values. However, floating-point numbers achieve their wider range at the expense of precision.

Single-precision floating-point numbers are represented by 32 bits (DWORD). Double-precision floating-point numbers are represented by 64 bits (QWORD). Double-extended precision floating-point numbers are represented by 80 bits (TWORD).

Unsigned Data Types
The basic unsigned data types used in assembly coding to store data values include byte, word, doubleword, quadword, and double quadword.
Unsigned integers are non-negative and can be written in binary, octal, decimal, and hex formats. Their values range from 0 to 255 for unsigned bytes, from 0 to 65,535 for unsigned words, from 0 to 2^32-1 for unsigned doublewords, and from 0 to 2^64-1 for unsigned quadwords. Unsigned integers are sometimes referred to as ordinal numbers.

Figure: Unsigned Data Types

As an example, the number 2 would be represented in a byte as 00000010. Byte-sized unsigned integers can range in value from 0 to 255 because all eight bits are available to represent the integer; if they were all 1s, they would represent the number 255.

Byte-size integers contain 8 contiguous bits starting at any logical address in your allocated program space. The bits are numbered 0 through 7; bit 0 is the least significant bit and bit 7 is the most significant bit. Bytes are useful for storing character, string, and Boolean information. In assembly, bytes are designated as db (define byte) or BYTE depending on the context.

Example of defining BYTE types:
    db 65                          ; Define the integer 65, which is the ASCII code of 'A'
    db 'a'                         ; Define storage for one character
    db 'abc'                       ; Define a string of characters
    db 'a', 'b', 'c'               ; Same string, one character at a time
    byte 'This is a string.', 0    ; Define a zero-delimited string of character bytes
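The 0 to 255 range of an unsigned byte can be observed in the emulator. The following is a minimal sketch in Emu8086 syntax; the labels maxbyte and letter are illustrative names, not part of the lab programs.

```asm
org 100h

mov al, maxbyte   ; AL = 0FFh (255), the largest unsigned byte
add al, 1         ; 255 + 1 does not fit in 8 bits: AL = 00h and CF = 1
mov bl, letter    ; BL = 41h, the ASCII code of 'A'
ret

; illustrative data, placed after ret so it is never executed as code
maxbyte db 255
letter  db 'A'
```

Single-stepping through the sketch shows the carry flag being set when the byte wraps around to zero.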
Unsigned WORDs are 16-bit integers (2 bytes) in the range 0 to 65,535. A WORD consists of two contiguous bytes starting at any logical address in your allocated program space. A WORD contains 16 bits numbered 0 through 15, with bit 0 being the least significant bit and bit 15 the most significant bit. Each byte in a word has its own memory address, and the smaller of the two addresses is the address of the word. The byte at the lower address contains the eight least significant bits of the WORD. The byte at the higher address contains the eight most significant bits.

A common use for the WORD data type is storing wide (16-bit Unicode) characters. A WORD can hold 65,536 different character codes, allowing the use of non-Roman character sets in computer programs. In assembly, WORDs are designated as either dw (define word) or WORD depending on the context.

Example of defining WORD types:
    dw 'a', 'b', 'c'                    ; Define a Unicode string
    word 'This is a Unicode string.'    ; Another form of a Unicode string
    word 16500                          ; Define a 16-bit decimal integer value

Unsigned doublewords are 32-bit (4-byte) integers in the range 0 to 4,294,967,295. Doubleword values have a larger positive range than word-sized (16-bit) integers. In assembly, doublewords are designated as dd (define doubleword) or DWORD depending on the context. The DWORD is the standard data size and the most efficient integer type in 32-bit programming environments. Pointer values make extensive use of the DWORD data type.

DWORDs consist of two contiguous WORDs starting at any logical address in your allocated program space. The bits of a DWORD are numbered from 0 through 31, with bit 0 being the least significant bit and bit 31 the most significant bit. Each byte within a DWORD has its own address. The byte at the lowest address contains the eight least significant bits, while the byte at the highest address contains the eight most significant bits.
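The byte ordering described above for WORDs and DWORDs can be checked by reading a word one byte at a time. This is a small sketch in Emu8086 syntax, assuming an illustrative variable worda that holds 1234h.

```asm
org 100h

mov ax, worda     ; AX = 1234h, the whole word
lea si, worda     ; SI = address of the word (the address of its lower byte)
mov bl, [si]      ; BL = 34h, least significant byte at the lower address
mov bh, [si+1]    ; BH = 12h, most significant byte at the higher address
ret

; illustrative data, placed after ret so it is never executed as code
worda dw 1234h
```

The memory view in the emulator shows the same thing: the bytes appear as 34 12, in that order.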
Example of defining DWORD types:
    dd 12345678          ; Define a 32-bit integer
    dword 123_456_789    ; Define a 32-bit integer

Binary-Coded Decimal (BCD) numbers are stored as a series of decimal digits in the range 0 through 9. BCD numbers come in two forms: packed and unpacked. The packed form stores two decimal digits per byte, while the unpacked form stores only one decimal digit per byte. Packed BCD constants use the ten-byte (TWORD) data type and are handled through the 80-bit FPU registers. BCD constants are suffixed with p or prefixed with 0p and can include up to 18 decimal digits. You can use underscores to separate digits.

Example of defining TWORD types:
    dt 312_345_678_901_245_678p    ; Packed BCD constant
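The difference between packed and unpacked BCD can be seen with byte-sized values. The sketch below uses Emu8086 syntax and the 8086 DAA (decimal adjust after addition) instruction; the values are chosen only for illustration.

```asm
org 100h

mov al, 19h       ; packed BCD: high nibble = 1, low nibble = 9 (decimal 19)
add al, 1         ; plain binary addition gives 1Ah, which is not valid BCD
daa               ; decimal adjust: AL = 20h, that is, packed BCD 20
mov bx, 0109h     ; unpacked BCD 19: BH = 01h and BL = 09h, one digit per byte
ret
```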
Signed Data Types
Signed data types use the same storage sizes as unsigned types, but their bit patterns are interpreted differently, and separate instructions exist for signed arithmetic and comparisons. The most significant (highest) bit is 0 for a positive number and 1 for a negative number. Data stored in BCD format are represented differently and are discussed here only to familiarize you with the format. We will not delve into BCD as this format does not port to the 64-bit programming environment.

x86 processors store signed integers in two's complement form. As an example, the number -2 is represented in a byte as 11111110, where the most significant bit (bit 7) is 1, indicating that the number is negative. A +2 is represented in a byte as 00000010, where bit 7 is 0, indicating that the number is positive. Because bit 7 carries the sign, signed byte-sized integers can only have values ranging from -128 to +127. Signed WORD integers can have values from -32,768 to 32,767, signed DWORDs can have values ranging from -2,147,483,648 to 2,147,483,647, and signed QWORDs can range from approximately -9.22x10^18 to 9.22x10^18.

Figure 2-2 shows how signed BYTE, WORD, DWORD, and QWORD integers are stored. The figure also depicts how single-, double-, and double-extended precision floating-point values are stored in registers. Notice that the most significant bit of each signed data type represents the sign, where a zero (0) means a positive number and a one (1) means a negative number.

Single-precision floating-point values are 32-bit (4-byte) data types. All floating-point numbers are stored in three binary parts: the sign, the exponent, and the significand. In single-precision floating-point numbers, bits 0 through 22 represent the significand (fraction) part of the number. Bits 23 through 30 represent the exponent of the value. Bit 31 holds the sign.
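For the signed integers described earlier in this section, the two's complement encoding used by x86 processors can be confirmed in the emulator. The following is a minimal sketch in Emu8086 syntax; the register choices are arbitrary.

```asm
org 100h

mov al, 2         ; AL = 00000010b (+2)
neg al            ; two's complement negation: AL = 11111110b = 0FEh (-2)
mov bl, -2        ; the assembler encodes -2 as the same pattern: BL = 0FEh
mov cl, 127       ; CL = 7Fh, the largest signed byte
add cl, 1         ; CL = 80h, which reads as -128 when signed, so OF = 1
ret
```

The overflow flag is what signals that a signed result has left the -128 to +127 range.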
Figure: Signed Data Types

Example of defining a DWORD floating-point type:
    dd 1.234567e20    ; Define a 4-byte single-precision floating-point value

Double-precision floating-point values are 64-bit (8-byte) data types. Bits 0 through 51 contain the significand, while bits 52 through 62 hold the exponent. Bit 63 contains the sign, represented by either a 0 or a 1.

Example of defining QWORD floating-point types:
    dq 1.234567e20    ; Define an 8-byte double-precision floating-point value
    qword +1.5        ; Define a 64-bit double-precision floating-point value

Double-extended precision floating-point types require 80 bits (10 bytes) of storage. The same 10-byte storage size is also used for packed BCD numbers. Utilizing these values requires special (FPU) instructions.

Example of defining TWORDs:
    dt 1.234567e20                  ; Define a 10-byte extended-precision float value
    tword 3.141592653589793238462   ; Pi
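As a worked example of the sign, exponent, and significand fields described above, the value +1.5 from the qword example can be encoded by hand. This sketch assumes the IEEE 754 formats used by the x86 FPU, in which the exponent is stored with a bias (127 for single precision, 1023 for double precision) and the leading 1 of the significand is implied rather than stored.

```latex
% Worked example: encoding +1.5 (IEEE 754 formats assumed)
\begin{align*}
1.5 &= (-1)^{0} \times 1.1_2 \times 2^{0} \\
\text{single: } & s = 0,\quad e = 0 + 127 = 01111111_2,\quad f = 100\ldots0_2 \ (23\ \text{bits}) \\
 & \Rightarrow\ 0\;01111111\;10000000000000000000000_2 = \mathrm{3FC00000h} \\
\text{double: } & s = 0,\quad e = 0 + 1023 = 01111111111_2,\quad f = 100\ldots0_2 \ (52\ \text{bits}) \\
 & \Rightarrow\ \mathrm{3FF8000000000000h}
\end{align*}
```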
Sign Extension
Since integers are stored in registers using two's complement format, a problem develops when we need to transfer an 8-bit or 16-bit integer value into a 16-bit or 32-bit register. Without getting into a lengthy number theory discussion, the solution is simple. If the number is positive, fill the higher-order bits with zeros; if the number is negative, fill the higher-order bits with ones. For example, if the number in a byte register is 00110111b, then it is positive, as denoted by bit 7 (the most significant bit is 0). To transfer this byte to a word register, just fill the high-order byte of the 16-bit register with zeros to form the number 00000000_00110111b in that register. Now we have a 16-bit signed integer with the same value. For a negative byte such as 11111110b (-2), the high-order byte is filled with ones instead, giving 11111111_11111110b, which is still -2.

Zero Extension
Similarly, if you want to put an 8-bit unsigned number into a 16-bit or 32-bit register, you first need to zero out all the binary ones in the most significant bytes of the larger register. If you don't, you could have some ones left over in the higher-order bytes when you transfer the 8-bit value into the lower-order byte. Then, when you read the number as a word or doubleword value, you may get some surprises. There are assembly code instructions that can clear out the high-order bits as you transfer an 8-bit or 16-bit number into a larger register.

Character Values
Computers store characters using a standard character mapping referred to as the American Standard Code for Information Interchange (ASCII). In ASCII, a unique 7-bit integer is assigned to each character in the alphabet and to selected symbols. On some computers, the eighth bit is used to form proprietary characters such as line-drawing symbols and Greek letters. As an example, the integer value 65d stands for the capital letter A. UTF-8, the encoding most widely used on the web, encodes the ASCII characters with the same single-byte values, so plain ASCII text is also valid UTF-8. Unicode originally extended the ASCII character set to 16 bits (WORD) in order to encompass additional European and Arabic characters.
A 16-bit code allows up to 65,536 different characters. UTF-16 is the standard encoding for 16-bit Unicode and is the encoding used by Microsoft in designing its Windows OS and application programming interface (API). Additional encodings such as UTF-32 exist; they use 32 bits per character, which provides sufficient space for the large Japanese and Chinese character sets.

Procedure:
1. Start Emu8086 by selecting its icon.
2. Write the following codes in the text editor.
3. Observe values in the registers and fill the observation tables.
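As an optional warm-up before Program 1, the sign- and zero-extension rules from the theory section can be tried in the emulator. This is a minimal sketch in Emu8086 syntax; it relies on the 8086 CBW instruction, which sign-extends AL into AX, and the register choices are otherwise arbitrary.

```asm
org 100h

mov al, 37h       ; AL = 00110111b, positive (bit 7 = 0)
cbw               ; sign-extend AL into AX: AX = 0037h
mov bx, ax        ; keep the positive result in BX

mov al, 0FEh      ; AL = 11111110b, -2 in two's complement (bit 7 = 1)
cbw               ; high byte is filled with ones: AX = 0FFFEh, still -2

mov cl, 0FEh      ; treat the same bits as the unsigned value 254
mov ch, 0         ; zero-extend by clearing the high byte: CX = 00FEh
ret
```

The same byte 0FEh therefore becomes 0FFFEh or 00FEh depending on whether it is meant as a signed or an unsigned value.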
Program 1:
    org 100h
    mov al, bytea
    ret
    ; variables (placed after ret so they are not executed as code):
    bytea db 15d

    Register    Value    Value
    AX          AH=      AL=

Program 2:
    org 100h
    mov al, bytea
    mov ah, byteb
    ret
    ; variables:
    bytea db 15d
    byteb db 06d

    Register    Value    Value
    AX          AH=      AL=

Program 3:
    org 100h
    mov ax, worda
    mov bx, wordb
    ret
    ; variables:
    worda dw 12d
    wordb dw 13d

    Register    AX    BX
    Value

Program 4:
    org 100h
    mov ax, worda
    mov bx, wordb
    ret
    ; variables:
    worda dw 120d
    wordb dw 121d

    Register    AX    BX
    Value

Exercise:
1. What is the major advantage of floating-point numbers over fixed-point and integer representations?
2. Modify Programs 1 and 2 for DX and fill the observation table:

    Register    Value    Value
    DX          DH=      DL=
    DX          DH=      DL=

3. Modify Programs 3 and 4 for AX and DX and fill the observation tables:

    Register    AX    DX
    Value

    Register    AX    DX
    Value

4. Write a program to add two byte-size integers. Name the register in which the result is stored.
5. Write a program to subtract two byte-size integers. Name the register in which the result is stored.
6. What is the range of values for an unsigned byte integer?
7. What do DB, DW, DD, DQ, and DT stand for?
8. What is the most efficient data storage size in the 32-bit programming environment?
9. What must you do when transferring an 8-bit integer value from a byte-size register to a word-size register?