IWKS 2300/5300 Fall 2017 John K. Bennett Machine Language
Assembly Language / Machine Code

Abstraction-implementation duality:
  Assembly language (the instruction set architecture) can be viewed as a (sort-of) programmer-oriented abstraction of the hardware platform.
  The hardware platform represents a physical means for realizing the assembly language / machine code abstraction.

Assembly language and machine code statements are generally one-to-one:
  Symbolic assembly language statement, e.g., D=A
  That statement translates into executable binary, e.g., 1110110000010000

Assembly language / machine code:
  Roughly speaking, machine code represents an agreed-upon formalism for manipulating data in memory using a processor and (usually) a set of registers.
  Assembly language syntax can differ widely across different hardware platforms.
Machine Code vs. Assembly
  Machine code: 1010 0001 0010 1011
  Assembly:     ADD R1, R2, R3
  Assembly requires translation.
Evolution:
  Physical coding
  Symbolic documentation
  Symbolic coding
  Translation and execution
[Images: Jacquard loom (1801); Augusta Ada King, Countess of Lovelace (1815-1852)]
Some Typical Assembly Language Commands
// In what follows R1, R2, R3 are registers, PC is the program counter,
// and addr is some address in memory.
ADD   R1,R2,R3     // R1 <- R2 + R3
ADDI  R1,R2,addr   // R1 <- R2 + addr
AND   R1,R1,R2     // R1 <- R1 and R2 (bit-wise)
JMP   addr         // PC <- addr
JEQ   R1,R2,addr   // IF R1 == R2 THEN PC <- addr ELSE PC++
LOAD  R1, addr     // R1 <- RAM[addr]
STORE R1, addr     // RAM[addr] <- R1
NOP                // Do nothing
// Etc.; there are *many* variants
Three Address Architecture
Consider the following code fragment: X = (A-B) / (C+(D*E))
Load  R1, A        // R1 <- Mem[A]
Load  R2, B        // R2 <- Mem[B]
Sub   R3, R1, R2   // R3 <- R1 - R2
Load  R1, D        // R1 <- Mem[D]
Load  R2, E        // R2 <- Mem[E]
Mpy   R4, R1, R2   // R4 <- R1 * R2
Load  R1, C        // R1 <- Mem[C]
Add   R2, R1, R4   // R2 <- R1 + R4
Div   R1, R3, R2   // R1 <- R3 / R2
Store X, R1        // Mem[X] <- R1
There are typically a finite number of registers, on the order of 16-32.
This code: 6 memory references; code is not compact.
Two Address Architecture
Consider the same code fragment: X = (A-B) / (C+(D*E))
Load  R1, B        // R1 <- Mem[B]
Load  R2, A        // R2 <- Mem[A]
Sub   R2, R1       // R2 <- R2 - R1
Load  R1, D        // R1 <- Mem[D]
Load  R3, E        // R3 <- Mem[E]
Mpy   R1, R3       // R1 <- R1 * R3
Load  R4, C        // R4 <- Mem[C]
Add   R1, R4       // R1 <- R1 + R4
Div   R2, R1       // R2 <- R2 / R1
Store X, R2        // Mem[X] <- R2
There are typically a finite number of registers, on the order of 16-32.
This code: 6 memory references; code is not compact.
One Address Architecture
Consider the following code fragment: X = (A-B) / (C+(D*E))
Load  A            // Acc <- Mem[A]
Sub   B            // Acc <- Acc - Mem[B]
Store Temp1        // Mem[Temp1] <- Acc
Load  D            // Acc <- Mem[D]
Mpy   E            // Acc <- Acc * Mem[E]
Add   C            // Acc <- Acc + Mem[C]
Store Temp2        // Mem[Temp2] <- Acc
Load  Temp1        // Acc <- Mem[Temp1]
Div   Temp2        // Acc <- Acc / Mem[Temp2]
Store X            // Mem[X] <- Acc
There is one register, called the Accumulator.
This code: 10 memory references; code is more compact.
Zero Address Architecture
Consider the following code fragment: X = (A-B) / (C+(D*E))
Push D    // SP = SP + 1; Mem[SP] <- Mem[D]
Push E    // SP = SP + 1; Mem[SP] <- Mem[E]
Mpy       // Mem[SP-1] <- Mem[SP] * Mem[SP-1]; SP = SP - 1
Push C    // SP = SP + 1; Mem[SP] <- Mem[C]
Add       // Mem[SP-1] <- Mem[SP] + Mem[SP-1]; SP = SP - 1
Push B    // SP = SP + 1; Mem[SP] <- Mem[B]
Push A    // SP = SP + 1; Mem[SP] <- Mem[A]
Sub       // Mem[SP-1] <- Mem[SP] - Mem[SP-1]; SP = SP - 1   (A - B)
Div       // Mem[SP-1] <- Mem[SP] / Mem[SP-1]; SP = SP - 1
Pop X     // Mem[X] <- Mem[SP]; SP = SP - 1
This code: 24 memory references; code is very compact.
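The stack discipline above can be checked with a small simulation. The following is a minimal sketch, assuming a hypothetical helper named run_stack_program, integer division for Div, and the slide's "top op next" operand convention (so B is pushed before A to obtain A - B). It is an illustration only, not part of the Hack platform.

```python
# Hypothetical zero-address (stack) machine; opcode names follow the slide.
def run_stack_program(program, memory):
    """Execute Push/Pop/Add/Sub/Mpy/Div and return the updated memory."""
    stack = []
    mem = dict(memory)
    # Each binary op computes "top op next", as in Mem[SP] op Mem[SP-1].
    ops = {
        "Add": lambda top, nxt: top + nxt,
        "Sub": lambda top, nxt: top - nxt,
        "Mpy": lambda top, nxt: top * nxt,
        "Div": lambda top, nxt: top // nxt,  # integer division (assumption)
    }
    for instr in program:
        parts = instr.split()
        op = parts[0]
        if op == "Push":
            stack.append(mem[parts[1]])
        elif op == "Pop":
            mem[parts[1]] = stack.pop()
        else:
            top = stack.pop()
            nxt = stack.pop()
            stack.append(ops[op](top, nxt))
    return mem


program = ["Push D", "Push E", "Mpy", "Push C", "Add",
           "Push B", "Push A", "Sub", "Div", "Pop X"]
result = run_stack_program(program, {"A": 20, "B": 6, "C": 1, "D": 2, "E": 3})
print(result["X"])  # (20 - 6) // (1 + 2 * 3) = 2
```

No instruction other than Push and Pop names an operand, which is why the code is compact even though every operation touches memory.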
The Hack Computer
A 16-bit machine (this means that the ALU and memory width are 16 bits) consisting of the following elements:
  Data memory: RAM, an addressable sequence of registers*
  Instruction memory: ROM, an addressable sequence of registers*
  Registers: D, A, M, where M stands for RAM[A]
  Processing: ALU, capable of computing various functions
  Program counter: PC, holding an address
  Control: The ROM is loaded with a sequence of 16-bit instructions, one per memory location, beginning at address 0. These instructions are fetched and executed in sequence until a jump instruction is encountered, at which time the PC is set to the jump address.
  Simple instruction set: only two instructions, the A-instruction and the C-instruction; a fair amount of complexity is packed into C-instructions.
* Hack computer only; recall that real memory is not implemented using registers.
The Hack CPU (no control shown) Think of the D Register as the Accumulator
The Hack Computer
Hack Assembly and VM Code
  Hack Assembly is a hybrid (1/2 address) code, but has only two kinds of instructions.
  Hack VM is zero address (see IWKS 3300).
A Hack Machine Language Example
Jack Code
// Adds 1+...+100.
var int i, sum;
let i = 1;
let sum = 0;
while (i < 101) {
    let sum = sum + i;
    let i = i + 1;
}
A-Instructions
Symbolic: @value   // Where value is either a non-negative decimal number
                   // or a symbol referring to such a number.
Binary:   0 v v v v v v v v v v v v v v v   (v = 0 or 1; the 15 v bits hold value)
Translation to binary: if value represents a non-negative number, see next slide.
We will handle the case when value is a symbol in Chapter 6 (Assembler).
A-Instructions
@value   // A <- value
Where value is either a non-negative number or a symbol referring to a number.
Used for:
  Entering a constant value (A = value). Coding example:
    @17     // A = 17
    D=A     // D = 17
  Selecting a RAM location (register = RAM[A]):
    @17     // A = 17
    D=M     // D = RAM[17]
  Selecting a ROM location (PC = A):
    @17     // A = 17
    0;JMP   // branch to and fetch the instruction stored in ROM[17]
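The binary form of a numeric A-instruction is a leading 0 followed by the value as a 15-bit binary number. A minimal sketch, assuming a hypothetical function name encode_a_instruction (symbols are not handled here):

```python
def encode_a_instruction(value: int) -> str:
    """Return the 16-bit binary string for @value (numeric values only)."""
    if not 0 <= value <= 0x7FFF:
        raise ValueError("value must fit in 15 bits (0..32767)")
    # Leading 0 marks an A-instruction; the remaining 15 bits hold the value.
    return "0" + format(value, "015b")


print(encode_a_instruction(17))  # 0000000000010001
```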
C-Instructions - Lots Going On
Symbolic: dest=comp;jump   // Either the dest or jump fields may be empty.
                           // If dest is empty, the "=" is omitted;
                           // If jump is empty, the ";" is omitted.
Binary:   1 1 1 a c1 c2 c3 c4 c5 c6 d1 d2 d3 j1 j2 j3
          (comp = a, c1-c6; dest = d1-d3; jump = j1-j3)
The Comp Field Determines ALU Output (a and c1-c6)
  c1-c6 of the comp field are identical to the ALU control inputs.
  The a bit chooses A or M as the Y input to the ALU (the X ALU input is always D).
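The A-Register/M Multiplexor (and ALU Out Data Path)
[Figure: the D register drives the ALU x input; the A/M Mux, selected by the a-bit, passes either the A register or M (the RAM register selected by A) to the ALU y input; the PC drives the ROM address input, and the selected instruction feeds the control logic, which produces the control signals.]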
The Hack CPU (where the control signals go)
The Dest Field
Each bit designates a destination (A, D, or M): d1 = A, d2 = D, d3 = M.
The Jump Field
The effects of j1-j3 are related to the zr and ng ALU output bits:
  j1 (out < 0): ng
  j2 (out = 0): zr
  j3 (out > 0): ng NOR zr
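Putting the three fields together, a symbolic C-instruction can be translated to its 16-bit form. The sketch below is a hedged illustration: the function name encode_c_instruction is hypothetical, and the comp bit patterns are taken from the standard Hack specification rather than from the slides here. M-based computations reuse the A-based patterns with the a bit set to 1.

```python
# comp bit patterns (c1..c6) for a = 0, per the standard Hack specification.
COMP = {
    "0": "101010", "1": "111111", "-1": "111010",
    "D": "001100", "A": "110000", "!D": "001101", "!A": "110001",
    "-D": "001111", "-A": "110011", "D+1": "011111", "A+1": "110111",
    "D-1": "001110", "A-1": "110010", "D+A": "000010", "D-A": "010011",
    "A-D": "000111", "D&A": "000000", "D|A": "010101",
}
DEST_ORDER = "ADM"  # d1 = A, d2 = D, d3 = M
JUMP = {"": "000", "JGT": "001", "JEQ": "010", "JGE": "011",
        "JLT": "100", "JNE": "101", "JLE": "110", "JMP": "111"}


def encode_c_instruction(text: str) -> str:
    """Translate 'dest=comp;jump' (dest and jump optional) to 16 bits."""
    text = text.replace(" ", "")
    dest, _, rest = text.rpartition("=")  # dest is "" when "=" is absent
    comp, _, jump = rest.partition(";")   # jump is "" when ";" is absent
    a_bit = "1" if "M" in comp else "0"   # a = 1 selects M as the ALU y input
    comp_bits = COMP[comp.replace("M", "A")]
    dest_bits = "".join("1" if reg in dest else "0" for reg in DEST_ORDER)
    return "111" + a_bit + comp_bits + dest_bits + JUMP[jump]


print(encode_c_instruction("D=A"))    # 1110110000010000
print(encode_c_instruction("0;JMP"))  # 1110101010000111
```

The first printed value matches the D=A encoding quoted at the start of these notes.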
The C-Instruction
Possible computations and destinations:
  dest = n + s
  dest = n - s
  dest = n
  dest = 0
  dest = 1
  dest = -1
  n = {A, D, M}
  s = {A, D, M, 1}
  dest = {M, D, MD, A, AM, AD, AMD, null}
Exercise 1: Implement the following tasks using Hack Assembly and Machine Code:
  Set D to A-1
  Set both A and D to A + 1
  Set D to 19
  Set both A and D to A + D
  Set RAM[5034] to D - 1
  Set RAM[53] to 171
  Add 1 to RAM[7], and store the result in D.
The C-Instruction
  dest = x + y
  dest = x - y
  dest = x
  dest = 0
  dest = 1
  dest = -1
  x = {A, D, M}
  y = {A, D, M, 1}
  dest = {M, D, MD, A, AM, AD, AMD, null}
Symbol table (symbols and values are arbitrary examples):
  j    3012
  sum  4500
  q    3812
  arr  20561
Exercise 2: Implement the following code using Hack Assembly Language:
  sum = 0
  j = j + 1
  q = sum + 12 - j
  arr[3] = -1
  arr[j] = 0
  arr[j] = 17
Control (focus on the yellow chips)
[Figure: D register, A register, A/M Mux (a-bit), ALU, RAM (address input, selected register M), PC, ROM (address input, selected register), instruction.]
In the Hack architecture:
  ROM = instruction memory
  Program = sequence of 16-bit numbers, starting at ROM[0]
  Current instruction = ROM[PC]
  To select instruction n from the ROM, we set A to n, using the instruction @n
Handling A-Register Conflicts
The Hack programmer can use the A register to select either a data memory location for a subsequent C-instruction involving M, or an instruction memory location for a subsequent C-instruction involving a jump. Thus, to prevent conflicting use of the A register, in well-written Hack programs a C-instruction may contain nonzero j bits (indicating a jump), OR may reference M (the d3 bit set to write M, or the a bit set so that comp reads M), but not both.
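This rule is easy to check mechanically on a binary instruction. A minimal sketch, assuming a hypothetical helper has_a_register_conflict and treating either the d3 bit (M is written) or the a bit (M is read by comp) as a reference to M:

```python
def has_a_register_conflict(instruction: str) -> bool:
    """True if a 16-bit C-instruction both jumps and references M."""
    if len(instruction) != 16 or instruction[0] != "1":
        return False  # A-instructions (leading 0) cannot conflict
    reads_m = instruction[3] == "1"    # a bit: comp uses M instead of A
    writes_m = instruction[12] == "1"  # d3 bit: result is stored in M
    jumps = instruction[13:] != "000"  # any of j1-j3 set
    return jumps and (reads_m or writes_m)


print(has_a_register_conflict("1110001100000001"))  # D;JGT -> False
print(has_a_register_conflict("1111110000000001"))  # M;JGT -> True
```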
Coding Example
Exercise 3: Implement the following code using Hack Assembly Language:
  goto 50
  if D==0 goto 112
  if D<9 goto 507
  if RAM[12] > 0 goto 50
  if sum>0 goto END
  if x[i]<=0 goto NEXT
Symbol table (all symbols and values are arbitrary examples):
  sum   2200
  x     4000
  i     15
  END   50
  NEXT  120
Hack convention: True is represented by -1; False is represented by 0.
Hack commands:
  A-command: @value                // set A to value
  C-command: dest = comp ; jump    // "dest =" and ";jump" are optional
Where:
  comp = 0, 1, -1, D, A, !D, !A, -D, -A, D+1, A+1, D-1, A-1, D+A, D-A, A-D, D&A, D|A, M, !M, -M, M+1, M-1, D+M, D-M, M-D, D&M, D|M
  dest = M, D, MD, A, AM, AD, AMD, or null
  jump = JGT, JEQ, JGE, JLT, JNE, JLE, JMP, or null
In the command dest = comp ; jump, the jump materializes if the comparison named by jump holds between comp and 0. For example, in D=D+1;JLT, we jump if D+1 < 0.
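The jump rule can be expressed directly in terms of the j bits and the ALU status bits zr and ng. A minimal sketch, assuming a hypothetical function jump_taken that receives the jump mnemonic and the (signed) value computed by comp:

```python
# Jump mnemonics mapped to their (j1, j2, j3) bits.
JUMP_BITS = {
    "null": (0, 0, 0), "JGT": (0, 0, 1), "JEQ": (0, 1, 0), "JGE": (0, 1, 1),
    "JLT": (1, 0, 0), "JNE": (1, 0, 1), "JLE": (1, 1, 0), "JMP": (1, 1, 1),
}


def jump_taken(jump: str, comp_value: int) -> bool:
    """Decide whether the jump materializes for a given comp result."""
    j1, j2, j3 = JUMP_BITS[jump]
    ng = comp_value < 0        # ALU output is negative
    zr = comp_value == 0       # ALU output is zero
    positive = not (ng or zr)  # ng NOR zr
    return bool((j1 and ng) or (j2 and zr) or (j3 and positive))


print(jump_taken("JLT", -1))  # True: D+1 < 0, so D=D+1;JLT jumps
print(jump_taken("JLT", 0))   # False
```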
Hack If Logic
High level:
  if condition { code block 1 }
  else { code block 2 }
  code block 3
Hack:
    D <- not condition
    @IF_TRUE
    D;JEQ            // (not condition) == 0 means condition is true
    code block 2
    @END
    0;JMP
  (IF_TRUE)
    code block 1
  (END)
    code block 3
Hack convention: True is represented by -1; False is represented by 0.
Hack While Logic
High level:
  while condition { code block 1 }
  code block 2
Hack:
  (LOOP)
    D <- not condition
    @END
    D;JNE            // (not condition) != 0 means condition is false: exit the loop
    code block 1
    @LOOP
    0;JMP
  (END)
    code block 2
Hack convention: True is represented by -1; False is represented by 0.
Complete program example

C language code:
  // Adds 1+...+100.
  int i = 1;
  int sum = 0;
  while (i <= 100) {
      sum += i;
      i++;
  }

Hack assembly convention:
  Variables: lower-case
  Labels: upper-case
  Commands: upper-case

Hack assembly code:
  // Adds 1+...+100.
      @i       // i refers to some RAM location
      M=1      // i=1
      @sum     // sum refers to some RAM location
      M=0      // sum=0
  (LOOP)
      @i
      D=M      // D = i
      @100
      D=D-A    // D = i - 100
      @END
      D;JGT    // If (i-100) > 0 goto END
      @i
      D=M      // D = i
      @sum
      M=D+M    // sum += i
      @i
      M=M+1    // i++
      @LOOP
      0;JMP    // Goto LOOP
  (END)
      @END
      0;JMP    // Infinite loop
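To see the program above actually produce 5050, it can be run on a tiny interpreter for symbolic Hack assembly. This is a rough sketch under stated assumptions: run_hack, eval_comp, and to_signed16 are hypothetical helper names, variables are allocated from RAM[16] upward, and execution simply stops after a fixed number of steps because the program ends in an infinite loop. It is not the official CPU emulator.

```python
import operator

SUM_PROGRAM = """
    @i
    M=1        // i = 1
    @sum
    M=0        // sum = 0
(LOOP)
    @i
    D=M
    @100
    D=D-A
    @END
    D;JGT      // if (i - 100) > 0 goto END
    @i
    D=M
    @sum
    M=D+M      // sum += i
    @i
    M=M+1      // i++
    @LOOP
    0;JMP
(END)
    @END
    0;JMP      // infinite loop
"""

BINARY_OPS = {"+": operator.add, "-": operator.sub,
              "&": operator.and_, "|": operator.or_}
JUMPS = {"JGT": lambda v: v > 0, "JEQ": lambda v: v == 0,
         "JGE": lambda v: v >= 0, "JLT": lambda v: v < 0,
         "JNE": lambda v: v != 0, "JLE": lambda v: v <= 0,
         "JMP": lambda v: True}


def to_signed16(x):
    """Wrap an integer to a 16-bit two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def eval_comp(comp, env):
    """Evaluate a comp mnemonic such as D+M, !A, -1 or M-1."""
    def operand(tok):
        return env[tok] if tok in env else int(tok)
    if comp[0] in "-!":
        value = operand(comp[1:])
        return -value if comp[0] == "-" else ~value
    for symbol, fn in BINARY_OPS.items():
        if symbol in comp:
            left, right = comp.split(symbol)
            return fn(operand(left), operand(right))
    return operand(comp)


def run_hack(source, steps=5000):
    """Run symbolic Hack assembly for a bounded number of steps."""
    lines = [ln.split("//")[0].replace(" ", "") for ln in source.splitlines()]
    symbols, program = {}, []
    for ln in filter(None, lines):            # pass 1: record label addresses
        if ln.startswith("("):
            symbols[ln[1:-1]] = len(program)
        else:
            program.append(ln)
    next_var = 16
    for ins in program:                       # pass 2: allocate variables
        name = ins[1:]
        if ins.startswith("@") and not name.isdigit() and name not in symbols:
            symbols[name] = next_var
            next_var += 1
    ram, a, d, pc = {}, 0, 0, 0
    for _ in range(steps):
        if pc >= len(program):
            break
        ins = program[pc]
        pc += 1
        if ins.startswith("@"):               # A-instruction
            name = ins[1:]
            a = int(name) if name.isdigit() else symbols[name]
            continue
        dest, _, rest = ins.rpartition("=")   # C-instruction
        comp, _, jump = rest.partition(";")
        addr = a                              # M and the jump use the old A
        env = {"A": a, "D": d, "M": ram.get(a, 0)}
        value = to_signed16(eval_comp(comp, env))
        if "M" in dest:
            ram[addr] = value
        if "D" in dest:
            d = value
        if "A" in dest:
            a = value & 0xFFFF
        if jump and JUMPS[jump](value):
            pc = addr
    return ram, symbols


ram, symbols = run_hack(SUM_PROGRAM)
print(ram[symbols["sum"]])  # 5050
```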
add.jack
// JKB
/** Computes the sum of the first 100 integers. */
class Main {
    function void main() {
        var int i, sum;
        let i = 1;
        let sum = 0;
        while (i < 101) {
            let sum = sum + i;
            let i = i + 1;
        }
        do Output.printString("THE SUM IS: ");
        do Output.printInt(sum);
        do Output.println();
        return;
    }
}
Symbols in Hack assembly programs

Symbols created by Hack programmers and code generators:
  Label symbols: Used to label destinations of goto commands. Declared by the pseudo command (label). This directive defines the symbol label to refer to the instruction memory location holding the next command in the program.
  Variable symbols: Any user-defined symbol appearing in an assembly program that is not defined elsewhere using the (label) directive is treated as a variable, and is automatically assigned a unique RAM address, starting at RAM address 16.
  By convention, Hack programmers use lower-case and upper-case letters for variable names and labels, respectively.

Predefined symbols:
  I/O pointers: The symbols SCREEN and KBD are automatically predefined to refer to RAM addresses 16384 and 24576, respectively (base addresses of the Hack platform's screen and keyboard memory maps).
  Virtual registers: covered in future lectures.
  VM control registers: covered in future lectures.

Q: Who does the assignment of symbols to RAM addresses?
A: The assembler, which is the program that translates symbolic Hack programs into binary Hack programs. As part of the translation process, the symbols are resolved to RAM addresses.

// Typical symbolic Hack code, meaning not important
    @R0
    D=M
    @INFINITE_LOOP
    D;JLE
    @counter
    M=D
    @SCREEN
    D=A
    @addr
    M=D
(LOOP)
    @addr
    A=M
    M=-1
    @addr
    D=M
    @32
    D=D+A
    @addr
    M=D
    @counter
    MD=M-1
    @LOOP
    D;JGT
(INFINITE_LOOP)
    @INFINITE_LOOP
    0;JMP
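The symbol rules above amount to two passes over the program: one to record labels, one to allocate variables. The sketch below illustrates this with a hypothetical function build_symbol_table; it is not the Chapter 6 assembler. The mapping of R0-R15 to RAM addresses 0-15 is taken from the standard Hack specification and is an assumption with respect to these slides, which defer virtual registers to later lectures.

```python
def build_symbol_table(lines):
    """Resolve labels to ROM addresses and variables to RAM addresses."""
    symbols = {"SCREEN": 16384, "KBD": 24576}
    symbols.update({f"R{n}": n for n in range(16)})  # assumption: R0..R15

    # Pass 1: a (LABEL) line names the address of the next real instruction.
    instructions = []
    for line in lines:
        if line.startswith("(") and line.endswith(")"):
            symbols[line[1:-1]] = len(instructions)
        else:
            instructions.append(line)

    # Pass 2: any remaining symbol in an @-command is a variable, from RAM[16].
    next_ram = 16
    for ins in instructions:
        if ins.startswith("@"):
            name = ins[1:]
            if not name.isdigit() and name not in symbols:
                symbols[name] = next_ram
                next_ram += 1
    return symbols


code = """@R0 D=M @INFINITE_LOOP D;JLE @counter M=D @SCREEN D=A @addr M=D
(LOOP) @addr A=M M=-1 @addr D=M @32 D=D+A @addr M=D @counter MD=M-1
@LOOP D;JGT (INFINITE_LOOP) @INFINITE_LOOP 0;JMP""".split()
table = build_symbol_table(code)
print(table["LOOP"], table["INFINITE_LOOP"], table["counter"], table["addr"])
# 10 23 16 17
```

Labels must be collected before variables are allocated; otherwise a forward reference such as @INFINITE_LOOP would be mistaken for a variable.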
Perspective
  Hack is a simple machine language.
  User-friendly syntax: D=D+A instead of ADD D,D,A
  Hack is a "½-address machine": any operation that needs to operate on the RAM must be specified using two commands: an A-command to address the RAM, and a subsequent C-command to operate on it.
  The Hack assembler will be discussed and developed in Chapter 6.
  A macro-language could be developed.
Exercise 1
Implement the following operations using both Hack Assembly and Hack Machine (binary) code:
  Set D to A-1
  Set both A and D to A + 1
  Set D to 19
  Set both A and D to A + D
  Set RAM[5034] to D - 1
  Set RAM[53] to 171
  Add 1 to RAM[7], and store the result in D.
Exercise 2
Implement the following high-level code snippets using Hack Assembly Language:
  sum = 0
  j = j + 1
  q = sum + 12 - j
  arr[3] = -1
  arr[j] = 0
  arr[j] = 17
Symbol table (symbols and values are arbitrary examples):
  j    3012
  sum  4500
  q    3812
  arr  20561
Hack convention: True is represented by -1; False is represented by 0; R13-R15 are temps.
Exercise 3
Implement the following pseudo code snippets using Hack Assembly Language:
  goto 50
  if (D==0) goto 112
  if (D<9) goto 507
  if (RAM[12] > 0) goto 50
  if (sum>0) goto END
  if (x[i]<=0) goto NEXT
Symbol table (all symbols and values are arbitrary examples):
  sum   2200
  x     4000
  i     15
  END   50
  NEXT  120
Hack convention: True is represented by -1; False is represented by 0; R13-R15 are temps.