COE538 Lecture Notes Week 3 (Week of Sept 17, 2012)

Announcements

My lecture sections should now be on Blackboard. I've also created a discussion forum (anonymous submissions are allowed).

Topics

    More Addressing Modes
    Arithmetic and logic
    Code Warrior
    Questions and Answers

Indexed Addressing Mode

Consider now another problem:

    We've got a bunch of bytes stored in sequential places in memory.
    We want to add them all up.

This calls for some kind of loop. But, unlike the previous situation, the location of the data changes each time through the loop. Extended addressing works when only the data changes (variable i in the previous example) but the address of where to find the data does not change.

Indexed addressing solves this problem. First let's phrase the solution to the problem in C:

    byte data[] = {3, 1, 4, 5, 2, 7, 8};
    byte sum = 0;
    byte *nextdata = &data[0];
    while (nextdata != &data[0] + 7) {
        sum += *nextdata++;
    }
    //all done

We start by defining our data using a new assembler directive fcb (form constant byte), which reserves a byte of memory and initializes its contents with the specified value:

            org   $3000
    data    fcb   3        ;initialize the contents of $3000 to 03
            fcb   1        ;initialize the contents of $3001 to 01
            fcb   4
            fcb   5
            fcb   2
            fcb   7
            fcb   8
    sum     ds.b  1        ;reserve 1 byte (at $3007), contents UNDEFINED

We also introduce a few new instructions:

    clr address
        Uses extended addressing to set to zero the contents of the byte at the specified address.

    ldx operand
        Loads the 16-bit X register with the operand, which can be specified using immediate, extended or indexed addressing modes.

    adda 0,x
        Adds the operand to A. The operand here (0,x) uses indexed addressing, which specifies the operand as the contents of the address contained in the X register.

    cpx operand
        Compares X with the operand to determine if X is smaller than, equal to or bigger than the operand, which can be specified using immediate, extended or indexed addressing modes. Compare instructions are usually followed by a conditional branch instruction.

We can now write:

            ldx   #data     ;initialize X as address of first byte
            clr   sum       ;set sum = 0
    loop    cpx   #data+7   ;Compare X with end of data address+1
            beq   done      ;Exit loop if X points past data end
            ldaa  sum       ;set sum = sum + *X
            adda  0,x
            staa  sum
            inx             ;set X = X + 1 (increment X)
            bra   loop      ;go back to loop start
    done:   ...

And, yes, we could have made the loop shorter. But again, the important thing at this point is understanding how assembly language programming works. We'll leave optimization until you have a decent grasp of the basic ideas. (Donald Knuth, an influential computer scientist, once remarked: "Premature optimization is the root of all evil.")

Addressing mode summary

    Inherent mode is used when no operand is required (e.g. inca, clra, inx, incb...).
    Immediate mode (specified by #) is used when the operand is a constant.
    Extended mode is used when the operand is a variable but is always found at the same address.

    Indexed mode is used when the address of the operand can change.

Arithmetic and logic programming

Adding and subtracting and the Condition Code register

You can add (subtract) an operand to A, B or D with adda (suba), addb (subb) or addd (subd). Any addressing mode that supplies an operand (immediate, extended or indexed) can be used. You can also increment (decrement) by one any of the registers A, B, SP, X or Y with instructions inca (deca), incb (decb), ins (des), inx (dex), iny (dey). (But you can only increment D with addd #1.)

Suppose you have three variables p, q and sum declared as:

    p     ds.b  1
    q     ds.b  1
    sum   ds.b  1

How do you get the contents of sum to be p + q? How about this?

          ldaa  p      ;load p into A
          adda  q      ;A = p + q
          staa  sum

Seems OK. But what if the answer is wrong! At this point you do not have enough information to know if the answer is right or wrong. For example, if p is 0x7f and q is 0x02, the answer will be 0x81. Is this the right answer? It depends! If p and q are meant to represent unsigned integers, then the answer is correct (127 + 2 = 129). But if p and q are meant to represent signed integers, the answer is wrong! (The answer would be interpreted as meaning 127 + 2 = -127!) Similarly, 0xff + 0xff yields 0xfe, which is wrong if the numbers are unsigned (it says 255 + 255 = 254) but correct if the numbers are signed (where it says -1 + -1 = -2).

Ultimately, the problem occurs because when two 8-bit numbers (signed or unsigned) are added, the result may not fit into 8 bits.

Unlike in high-level languages (HLL) like C, the assembly language programmer cannot declare variables to be signed or unsigned and let the compiler take care of the details. Rather, assembly language programmers have to keep clear in their own heads whether a variable (the contents of a byte) is to be treated as signed or unsigned.

Fortunately, there is hardware in the ALU that detects errors in addition. In particular, a bit in the Condition Code Register (CCR) called the Carry (C) bit indicates if the result is incorrect when unsigned numbers are added. Another CCR bit, the Overflow (V) bit, indicates if the result is incorrect when signed numbers are added.

In short, here are the two ways to add p and q and determine if the result is correct:

    ;version for unsigned numbers
          ldaa  p
          adda  q
          staa  sum
          bcs   unsigned_oops   ;Branch if C is set (staa does not change C)

    ;version for signed numbers
          ldaa  p
          adda  q
          bvs   signed_oops     ;Branch if V is set
          staa  sum

Note that the signed version tests V before storing the result: staa clears V (it leaves C alone), so a bvs placed after staa would never branch.

The N Z V and C bits

We have now seen conditional branch instructions that use each of these CCR bits. (We will learn about the other 4 bits in the CCR later on in the course.) To summarize:

    N (Negative) is a copy of the most significant bit of the result of an ALU operation. It is used by instructions such as bmi or bpl.
    Z (Zero) is the NOR of all the result bits. (Hence it is 0 if any of the bits are 1.) It is used in instructions like beq or bne.
    C (Carry) is a copy of the carry-out of the most significant bit of the adder (ALU). It is used by instructions like bcs (branch if carry set) or bcc (branch if carry clear).
    V (Overflow) is the exclusive-or between the carry-in and the carry-out of the most significant bit. It is used by instructions like bvs (branch if overflow set) or bvc (branch if overflow clear).

Conditionals and loops

All programs involve conditions and loops. Consider the pseudo code:

    if (something is true) {
        do one thing
    } else {
        do other thing
    }

In general this is translated into assembler as:

                      branch if the "something" is false to else_clause_label
                      do the "then" clause
                      branch to endif_label
    else_clause_label:
                      do the "else" clause
    endif_label:

Example:

    if (p == q) p++ else q--

is translated to:

            ldaa  p       ;load p into A
            cmpa  q       ;compare A to q, setting CCR bits
                          ;this works by computing A - q and
                          ;throwing away the answer
            bne   else    ;branch if NOT equal to else part
            inc   p       ;Increment p using extended addressing
            bra   endif   ;branch around the "else" clause
    else:   dec   q       ;Decrement q using extended addressing
    endif:                ;program continues here

It is common for the conditional to involve comparing the sizes of integers. In these cases it is essential that the programmer be clear in their mind whether the numbers are signed or unsigned.

Example:

    unsigned byte p, q;
    if (p > q) p = p + q

is translated to:

            ldaa  p       ;load p into A
            cmpa  q
            bls   endif   ;branch if lower or the same
            adda  q       ;AccA now p + q
            staa  p
    endif:                ;program continues

Notes:

    We branch if (p > q) is false; the opposite of > is <=.
    To branch if <= for unsigned numbers, we use the conditional branch bls. Had p and q been signed numbers, we would have to use ble (branch if less than or equal).

The following table summarizes how the various signed and unsigned comparisons are used.

    Comparison  Signed  Signed Meaning                Unsigned  Unsigned Meaning
    >           bgt     Branch if greater             bhi       Branch if higher
    <           blt     Branch if less than           blo       Branch if lower
    >=          bge     Branch if greater or equal    bhs       Branch if higher or same
    <=          ble     Branch if less than or equal  bls       Branch if lower or same

Loops are basically:

    while (condition) {     //while_label: if (condition)
        loop body           //                 loop body
    }                       //             goto while_label

Example:

    signed byte p, q;
    while (p > q) {
        q++;
    }

    ;translation to assembler
    p          ds.b  1
    q          ds.b  1

    while:     ldaa  p           ;load p into A
               cmpa  q
               ble   end_while   ;exit when p <= q (signed compare)
               inc   q
               bra   while
    end_while:                   ;program continues

Example: Counting number of characters in a null-terminated String

    char *cp = "blah blah";
    strlen = 0;
    while (*cp) {
        strlen++;
        cp++;
    }

    ;translation to assembler
    ;NOTE: the address of the character changes
    ;each time through the loop --> indexed addressing
    ;mode is required
    ;Rather than keep "cp" in 2 memory bytes,
    ;we just maintain it in index register X
    ;Also, we just maintain "strlen" in Acc B.
    string     fcc   "blah blah"
               fcb   0           ;the null terminator

               ldx   #string
               clrb              ;Set Acc B (strlen) to zero
    while:     ldaa  0,x
               beq   end_while
               incb              ;increment strlen
               leax  1,x         ;add 1 to x
               bra   while
    end_while:

Notes:

    The fcc directive is new. It means "form constant character(s)". It places the ASCII code for each character in a string delimited by double quotes in sequential memory locations.
    We have only used indexed addressing in the form 0,x so far. This is the most common way to use it, but the 0 can be replaced by a signed constant. In the form offset,x the effective address of the operand is offset + x. Thus if x is $1000, then ldaa 3,x would load A with the contents of address 0x1003.
    The instruction leax offset,x (load effective address) loads the effective address (NOT the contents) into X. In short, it adds offset to X.
    It is also possible to use any of the accumulators (A, B or D) as a variable offset. For example:

               ldd   #$12
               addd  #2
               ldx   #$1000
               leay  d,x
               ldab  2,y

    would load Y with 0x1014 (D holds $14 after the addd) and load B with the contents of 0x1016.
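The effective-address arithmetic in the last example can be checked with a small C model. This is only a sketch under stated assumptions: the memory array and the functions effective_address, load_indexed and lecture_example are hypothetical names introduced here, and the different offset sizes of the real indexed modes are not modelled.

```c
#include <stdint.h>

/* Hypothetical model of the 64 KiB address space (not from the lecture). */
uint8_t memory[0x10000];

/* Effective address for "offset,reg": offset + register, wrapping at 16 bits. */
uint16_t effective_address(uint16_t reg, int16_t offset)
{
    return (uint16_t)(reg + offset);
}

/* Models "ldab offset,reg": fetch the byte stored AT the effective address. */
uint8_t load_indexed(uint16_t reg, int16_t offset)
{
    return memory[effective_address(reg, offset)];
}

/* Walks through the lecture's five-instruction example.
   Returns the final value of Y and stores the final value of B in *b_out. */
uint16_t lecture_example(uint8_t *b_out)
{
    uint16_t d = 0x12;                              /* ldd  #$12   */
    d = (uint16_t)(d + 2);                          /* addd #2     */
    uint16_t x = 0x1000;                            /* ldx  #$1000 */
    uint16_t y = effective_address(x, (int16_t)d);  /* leay d,x: address only */
    *b_out = load_indexed(y, 2);                    /* ldab 2,y: contents     */
    return y;
}
```

The split between effective_address and load_indexed mirrors the difference between leay (which keeps the address) and ldab (which fetches what is stored there).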

Boolean (logic) operations and shifts/rotates

In addition to arithmetic, the CPU's ALU (Arithmetic and Logic Unit) can also perform logical operations. Instructions like oraa, anda and eora perform the bitwise logical OR, AND or XOR between A and the specified operand. The coma instruction inverts each bit in A.

An operand (memory or any Accumulator) can be shifted one position to the left or right. The bit shifted out is copied into the Carry bit and is otherwise lost. For left shifts, the bit shifted in is 0. But there are two kinds of right shifts:

    logical right shift, in which a 0 is shifted in.
    arithmetic right shift, in which a copy of the most significant bit is shifted in. (This ensures that a signed number has the same sign after shifting. To divide a signed number by 2, do a right arithmetic shift; for an unsigned one, do a logical shift.)

No bits are lost with a (left or right) rotate. The bit shifted in is the old Carry bit and the bit shifted out becomes the new Carry bit.

Multiply and divide

There are 3 multiply instructions (8-bit unsigned, 16-bit signed and 16-bit unsigned) and 5 divide instructions. We will only use mul and idiv at this point.

How to interpret the Instruction Set Reference

Figure 1: Instruction Set Reference for ldaa instruction

The Instruction Set Reference (Appendix A of the text and available here) contains a wealth of information presented in a compact fashion. (You will be provided a copy of this during the midterm and final; it is imperative that you learn how to use this reference.)

The figure above shows the entry for the ldaa instruction. The first column gives the assembly language syntax for its use in 8 different addressing modes. The Machine Coding column (4th column) gives the machine language in hexadecimal. The Access Detail column indicates how the bus is used. For our purposes, the number of letters is all that matters for now: this is the number of bus cycles required to fetch and execute the instruction. For example, in immediate addressing mode, 1 bus cycle is required while 3 bus cycles are required in the extended addressing mode.

The last column (NZVC) indicates if and how these 4 CCR bits are affected by the instruction. In the case of ldaa, the N bit is set to 0 or 1 depending on the most significant bit of the value loaded. Similarly, the Z bit is set to 1 if zero is loaded and set to 0 otherwise. The V bit is cleared to 0 unconditionally and the C bit is unaffected.

Using Code Warrior

Demonstration in class.

Subroutines: a first look

It's hard to imagine writing even a simple C program as one big "main" function. Programmers split the task into various components (functions). Functions in C also have the huge advantage of being usable many different times in the program.

The equivalent of functions at the assembly language level are called subroutines. Like functions, subroutines may require parameters to be passed to them and may return some value.

A first look at the Stack

In addition to the X and Y index registers, the SP register can also be used in indexed addressing. However, the SP has a more important role: it is a pointer to a Stack maintained in volatile memory (RAM). The Stack is usually accessed at the Top of Stack. Bytes can be pushed onto the stack at its top and later pulled (or "popped") off the Stack to retrieve them.

To push a register, the instructions psha, pshb, pshc, pshd, pshx or pshy are used for registers A, B, CCR, D, X or Y respectively. They can then be retrieved with pula, pulb, pulc, puld, pulx or puly.

How to use a subroutine

The simplest kind of function to use in C is one that requires no passed parameters and returns no value. The following is an example in C of using such a function called foo() twice in a program:

    foo();
    i++;    //assume i is an 8-bit integer
    foo();

This would be translated to assembler as follows:

            jsr   foo     ;"jump to subroutine" foo
            inc   i       ;increment i
            jsr   foo     ;invoke foo again
                          ;after foo completes, control is returned here

How does the "jump to subroutine" instruction work? First, of course, the instruction is fetched. During the fetch phase the program counter is automatically incremented to point to the next sequential instruction. In the above example, after fetching (and before executing) the first jsr foo instruction, the PC would be the address of the inc i instruction.

During the execute phase, the CPU does two things in the following order:

    First, the PC (which points to the next instruction) is pushed onto the stack.
    Then, the PC is replaced with the target address specified as the jsr operand.

Once the jsr is executed, the next instruction is the first instruction of the subroutine.

When the subroutine has done its work, it returns to the calling program with the rts instruction, which simply pops the top of the stack into the PC. Consequently, the next instruction executed following the rts is the instruction immediately following the jsr instruction that invoked the subroutine.

How to write a simple subroutine

The simplest type of subroutine has no passed parameters and does not return anything. A common example is a software delay loop.

Basic I/O

HCS12 Parallel Ports

Programming the Liquid Crystal Display (LCD) interface

Questions

1. Convert the following pseudo-C into assembler:

    byte bytes[] = {2, 7, 1, 8, 3, 1, 4, 8};
    int i = 0;
    int sum = 0;
    byte *bp = bytes;    //same as bp = &bytes[0]
    while (bp <= bytes + 7) {
        sum = sum + (*bp) * 2;
        bp = bp + 2;
    }

2. Assuming that the N, Z, V and C bits are all zero, show how they evolve with each instruction:

            org   $3000
            ldaa  #$0f
            oraa  $3000    ;Note: some disassembly required!
            ldaa  #$f0
            adda  #$e0

Answers
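One way to check hand-worked answers for the adda step of question 2, or the 0x7f + 0x02 and 0xff + 0xff cases from the Condition Code register section, is to model an 8-bit add in C. This is a sketch, not the course's answer key: the Flags struct and the add8 function are names made up here, and only the N, Z, V and C bits are modelled.

```c
#include <stdint.h>
#include <stdbool.h>

/* The four CCR bits discussed in these notes (hypothetical type). */
typedef struct {
    bool n, z, v, c;
} Flags;

/* Hypothetical helper: adds two bytes the way adda would and
   reports the resulting N, Z, V and C bits through *f. */
uint8_t add8(uint8_t a, uint8_t m, Flags *f)
{
    uint16_t wide = (uint16_t)a + (uint16_t)m;  /* keep the 9th bit */
    uint8_t r = (uint8_t)wide;                  /* 8-bit result stored in A */

    f->c = wide > 0xFF;          /* carry out of bit 7: unsigned result too big */
    f->n = (r & 0x80) != 0;      /* copy of the result's most significant bit */
    f->z = (r == 0);             /* set only when every result bit is 0 */
    /* Signed overflow: both operands have the same sign
       and the result's sign differs from it. */
    f->v = ((a ^ r) & (m ^ r) & 0x80) != 0;
    return r;
}
```

For example, add8(0x7f, 0x02, &f) returns 0x81 with V set and C clear, which matches the earlier observation that the sum is right for unsigned numbers and wrong for signed ones.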