IWKS 3300: NAND to Tetris Spring John K. Bennett. Assembler

Similar documents
Assembler. Building a Modern Computer From First Principles.

Assembler Human Thought

Chapter 6: Assembler

Machine (Assembly) Language

Chapter 6: The Assembler The Assembler Hack Assembly-to-Binary Translation Specification

IWKS 2300/5300 Fall John K. Bennett. Machine Language

Machine (Assembly) Language Human Thought

Computer Architecture

STEVEN R. BAGLEY THE ASSEMBLER

Assembler. Building a Modern Computer From First Principles.

Virtual Machine. Part I: Stack Arithmetic. Building a Modern Computer From First Principles.

Introduction: From Nand to Tetris

Virtual Machine Where we are at: Part I: Stack Arithmetic. Motivation. Compilation models. direct compilation:... 2-tier compilation:

Introduction: Hello, World Below

Virtual Machine. Part II: Program Control. Building a Modern Computer From First Principles.

Chapter 11: Compiler II: Code Generation

Chapter 10: Compiler I: Syntax Analysis

Compiler II: Code Generation Human Thought

Chapter 8: Virtual Machine II: Program Control

Course overview. Introduction to Computer Yung-Yu Chuang. with slides by Nisan & Schocken (

Course overview. Introduction to Computer Yung-Yu Chuang. with slides by Nisan & Schocken (

Motivation. Compiler. Our ultimate goal: Hack code. Jack code (example) Translate high-level programs into executable code. return; } } return

Compiler I: Sytnax Analysis

Course overview. Introduction to Computer Yung-Yu Chuang. with slides by Nisan & Schocken (

Boolean Arithmetic. From Nand to Tetris Building a Modern Computer from First Principles. Chapter 2

EP1200 Introduktion till datorsystemteknik Tentamen tisdagen den 3 juni 2014, till 18.00

Computer Architecture

Computer Architecture

4 Programing Paradigms in 45 Minutes

Compiler I: Syntax Analysis

4. Computer Architecture 1

4 Programing Paradigms in 45 Minutes

//converts a string to a non-negative number string2int(string s){

Harvard University CS 101 Fall 2005, Shimon Schocken. Assembler. Elements of Computing Systems 1 Assembler (Ch. 6)

EP1200 Introduction to Computing Systems Engineering. Computer Architecture

Virtual Machine (Part II)

COMP 250: Java Programming I. Carlos G. Oliver, Jérôme Waldispühl January 17-18, 2018 Slides adapted from M. Blanchette

Assembler. Shimon Schocken. Spring Elements of Computing Systems 1 Assembler (Ch. 6) Compiler. abstract interface.

Creating a C++ Program

Introduction to C# Applications

Full file at

7. The Virtual Machine II: Flow Control 1

Intel x86 Jump Instructions. Part 5. JMP address. Operations: Program Flow Control. Operations: Program Flow Control.

6.096 Introduction to C++ January (IAP) 2009

CS107 Handout 13 Spring 2008 April 18, 2008 Computer Architecture: Take II

Intel x86 Jump Instructions. Part 5. JMP address. Operations: Program Flow Control. Operations: Program Flow Control.

This lecture presents ordered lists. An ordered list is one which is maintained in some predefined order, such as alphabetical or numerical order.

We will study the MIPS assembly language as an exemplar of the concept.

BASIC ELEMENTS OF A COMPUTER PROGRAM

CS2141 Software Development using C/C++ C++ Basics

Chapter 1 Getting Started

1 Lexical Considerations

INTRODUCTION 1 AND REVIEW

The MARIE Architecture

Compiler Code Generation COMP360

Expressions and Data Types CSC 121 Spring 2015 Howard Rosenthal

x = 3 * y + 1; // x becomes 3 * y + 1 a = b = 0; // multiple assignment: a and b both get the value 0

Chapter. Focus of the Course. Object-Oriented Software Development. program design, implementation, and testing

McGill University School of Computer Science COMP-202A Introduction to Computing 1

EMBEDDED SYSTEMS PROGRAMMING Language Basics

Chapter 9: High Level Language

2 rd class Department of Programming. OOP with Java Programming

Hash Table and Hashing

CS 11 C track: lecture 8

Memory Addressing, Binary, and Hexadecimal Review

COMP 202 Java in one week

12/22/11. Java How to Program, 9/e. Help you get started with Eclipse and NetBeans integrated development environments.

Assoc. Prof. Dr. Marenglen Biba. (C) 2010 Pearson Education, Inc. All rights reserved.

What the CPU Sees Basic Flow Control Conditional Flow Control Structured Flow Control Functions and Scope. C Flow Control.

Programming (1.0hour)

Maciej Sobieraj. Lecture 1

BASIC COMPUTER ORGANIZATION. Operating System Concepts 8 th Edition

Intermediate Code Generation

Memory, Data, & Addressing II CSE 351 Spring

C Review. MaxMSP Developers Workshop Summer 2009 CNMAT

Decaf Language Reference Manual

1- Write a single C++ statement that: A. Calculates the sum of the two integrates 11 and 12 and outputs the sum to the consol.

Assembly Language Manual for the STACK Computer

CS 6353 Compiler Construction Project Assignments

B.V. Patel Institute of Business Management, Computer & Information Technology, Uka Tarsadia University

Real instruction set architectures. Part 2: a representative sample

Sequential Logic. Building a Modern Computer From First Principles.

GraphQuil Language Reference Manual COMS W4115

Lecture 03 Bits, Bytes and Data Types

Building a Virtual Computer

A Short Summary of Javali

DEPARTMENT OF MATHS, MJ COLLEGE

Machine Language Instructions Introduction. Instructions Words of a language understood by machine. Instruction set Vocabulary of the machine

Objectives. Problem Solving. Introduction. An overview of object-oriented concepts. Programming and programming languages An introduction to Java

CHAPTER ASSEMBLY LANGUAGE PROGRAMMING

Tail Calls. CMSC 330: Organization of Programming Languages. Tail Recursion. Tail Recursion (cont d) Names and Binding. Tail Recursion (cont d)

10. The Compiler II: Code Generation 1

last time Assembly part 2 / C part 1 condition codes reminder: quiz

Computer Organization & Systems Exam I Example Questions

8. The C++ language, 1. Programming and Algorithms II Degree in Bioinformatics Fall 2017

WYSE Academic Challenge Computer Science Test (State) 2013 Solution Set

CMPE-013/L. Introduction to C Programming

Objectives. Chapter 2: Basic Elements of C++ Introduction. Objectives (cont d.) A C++ Program (cont d.) A C++ Program

Computer Systems and Networks. ECPE 170 Jeff Shafer University of the Pacific. MIPS Assembly

Chapter 2: Basic Elements of C++

Transcription:

IWKS 3300: NAND to Tetris Spring 2018 John K. Bennett Assembler Foundations of Global Networked Computing: Building a Modern Computer From First Principles This course is based upon the work of Noam Nisan and Shimon Schocken. More information can be found at (www.nand2tetris.org).

Where We Are Human Thought Abstract design Chapters 9, 12 abstract interface H.L. Language & Operating Sys. Compiler Chapters 10-11 abstract interface Virtual Machine Software Hierarchy VM Translator Chapters 7-8 abstract interface Assembly Language Assembler Chapter 6 abstract interface Machine Code Computer Architecture Chapters 4-5 Hardware Hierarchy abstract interface Hardware Platform Gate Logic Chapters 1-3 abstract interface Chips & Logic Gates Electrical Engineering Physics

Because Why Am I Making You Write an Assembler? Writing an assembler helps cement our understanding of the hardware architecture Assemblers demonstrate several programming techniques useful in language translation Assemblers are relatively simple to write Assemblers are the gateway drug to compiler writing

Assembly Example Assembly Source Code // Compute 1+...+RAM[0] // And store the sum in RAM[1] @i M=1 // i = 1 M=0 // sum = 0 (LOOP) @i // if i>ram[0] goto WRITE @R0 D=D-M @WRITE D;JGT... // Etc. assemble Machine Code 0000000000010000 1110111111001000 0000000000010001 1110101010001000 0000000000010000 1111110000010000 0000000000000000 1111010011010000 0000000000010010 1110001100000001 0000000000010000 1111110000010000 0000000000010001... execute The program translation challenge (for any language translator) Extract the program s semantics from the source program, using the syntax rules of the source language Re-express the program s semantics in the target language, using the syntax rules of the target language An assembler generally translates line by line, making it relatively easy to write Translates each assembly command into one or more binary machine instructions Handles symbols (e.g. i, sum, LOOP, ).

Revisiting Hack Low-level Programming Assembly program (sum.asm) // Computes 1+...+RAM[0] // And stores the sum in RAM[1]. @i M=1 // i = 1 M=0 // sum = 0 (LOOP) @i // if i>ram[0] goto WRITE @0 D=D-M @WRITE D;JGT @i // sum += i M=D+M @i // i++ M=M+1 @LOOP // goto LOOP 0;JMP (WRITE) @1 M=D // RAM[1] = the sum (END) @END 0;JMP CPU emulator screen shot after running this program user supplied input program generated output The CPU emulator allows loading and executing symbolic Hack code. It resolves all the symbolic symbols to memory locations, and executes the code.

The Assembler s View of an Assembly Program Assembly Program // Computes 1+...+RAM[0] // And stores the sum in RAM[1]. @i M=1 // i = 1 M=0 // sum = 0 (LOOP) @i // if i>ram[0] goto WRITE @0 D=D-M @WRITE D;JGT @i // sum += i M=D+M @i // i++ M=M+1 @LOOP // goto LOOP 0;JMP (WRITE) @1 M=D // RAM[1] = the sum (END) @END 0;JMP The Hack assembly language program is a sequence of text lines, where each line is one of the following: A-instruction ( @nnn or @SYMBOL ) C-instruction ( dest=comp;jmp ; where dest= and ;jmp are optional) Symbol declaration: (SYMBOL) Comment or white space: // comment The Challenge: Translate the program into a sequence of 16-bit machine code instructions that can be executed by the target hardware platform.

Translating / Assembling A-instructions Symbolic: @value // Where value is either a non-negative decimal number // or a symbol referring to such number. value (v = 0 or 1) Binary: 0 v v v v v v v v v v v v v v v Translation to binary: If value is a non-negative decimal number, just translate to binary If value is a symbol, requires two passes (more later).

Translating / Assembling C-instructions Symbolic: dest=comp;jump // Either the dest or jump fields may be empty. // If dest is empty, the "=" is ommitted; // If jump is empty, the ";" is omitted. comp dest jump Binary: 1 1 1 a c1 c2 c3 c4 c5 c6 d1 d2 d3 j1 j2 j3

Translation by Table Lookup: Example D=D+1;JLE n = 0xE000; // The first 3 bits are 1 int compvalue = 0; if (CTables.compTable.ContainsKey(comp)) compvalue = CTables.compTable[comp]; compvalue = compvalue << 6; } int destvalue = 0; if (dest!= null) if (CTables.destTable.ContainsKey(dest)) destvalue = CTables.destTable[dest]; destvalue = destvalue << 3; } int jumpvalue = 0; if (jump!= null) if (CTables.jumpTable.ContainsKey(jump)) jumpvalue = CTables.jumpTable[jump]; } comptable // a= 0 "D+1", 0x1f}, desttable "D", 2}, jumptable "JLE", 6},

Ctables More Detail namespace csasm public class CTables public static Dictionary<string, int> comptable = new Dictionary<string, int>() // a= 0 "0", 0x2a}, "1", 0x3f}, "-1", 0x3a}, "D", 0x0c}, "!D", 0x0d}, "-D", 0x0f}, "D+1", 0x1f}, "1+D", 0x1f}, }; "D&A", 0x00}, "A&D", 0x00}, "D A", 0x15}, "A D", 0x15}, // a = 1 "M", 0x30+0x40}, "!M", 0x31+0x40}, "-M", 0x33+0x40}, "M+1", 0x37+0x40}, "1+M", 0x37+0x40}, "M-1", 0x32+0x40}, "D M", 0x15+0x40}, "M D", 0x15+0x40}

Ctables More Detail public static Dictionary<string, int> desttable = new Dictionary<string, int>() // Allow all combinations of these letters (i.e., don't care about order); //this is not done in the book "", 0}, "M", 1}, "D", 2}, "A", 4}, "MD", 3}, "DM", 3}, "AM", 5}, "MA", 5}, "AD", 6}, "DA", 6}, "ADM", 7}, "AMD", 7}, "DAM", 7}, "DMA", 7}, "MAD", 7}, "MDA", 7} };

Ctables More Detail public static Dictionary<string, int> jumptable = new Dictionary<string, int>() "", 0}, "JGT", 1}, "JEQ", 2}, "JGE", 3}, "JLT", 4}, "JNE", 5}, "JLE", 6}, "JMP", 7}, }; } }

Translation by Table lookup: Example D=D+1;JLE Symbolic: dest=comp;jump // Either the dest or jump fields may be empty. // Assemble the whole instruction (as an integer) // If dest is empty, the "=" is ommitted; // If jump is empty, the ";" is omitted. n = compvalue destvalue jumpvalue; comp dest jump Binary: n = compvalue = destvalue = 1 1 1 a c1 c2 c3 c4 c5 c6 d1 d2 d3 j1 j2 j3 1 1 1 0 0 1 1 1 1 1 0 1 0 jumpvalue = 1 1 0 n = 1 1 1 0 0 1 1 1 1 1 0 1 0 1 1 0 = D=D+1;JLE

The Overall Assembly Logic Assembly program // Computes 1+...+RAM[0] // And stores the sum in RAM[1]. @i M=1 // i = 1 M=0 // sum = 0 (LOOP) @i // if i>ram[0] goto WRITE @0 D=D-M @WRITE D;JGT @i // sum += i M=D+M @i // i++ M=M+1 @LOOP // goto LOOP 0;JMP (WRITE) @1 M=D // RAM[1] = the sum (END) @END 0;JMP For each (real) command Parse the command, i.e. break it into its underlying fields A-instruction: replace the symbolic reference (if any) with the corresponding memory address, which is a number (how to do this in a moment) C-instruction: for each field in the instruction, generate the corresponding binary code Assemble the translated binary codes into a complete 16-bit machine instruction Write the 16-bit instruction to the output (text) file.

Handling Symbols (symbol resolution) Assembly programs typically have many symbols: Labels that mark destinations of jump commands Labels that mark special memory locations (e.g., SCREEN) Variables These symbols fall into two categories: User defined symbols (created by programmers) Pre-defined symbols (used by the Hack platform) Typical symbolic Hack assembly code: @R0 @END D;JLE @counter M=D @SCREEN D=A @x M=D (LOOP) @x A=M M=-1 @x @32 D=D+A @x M=D @counter M-1 @LOOP D;JGT (END) @END 0;JMP

Handling Symbols: User-defined Symbols Label symbols: Used to label destinations of jump commands. Declared by the pseudo-command (XXX). This directive defines the symbol XXX to refer to the instruction memory (ROM) location holding the next command in the program Variable symbols: Any user-defined symbol xxx appearing in an assembly program that is not defined elsewhere using the (xxx) directive is treated as a variable, and is automatically assigned a unique RAM address, starting at RAM address 16 (RAM addresses 0-15 are reserved) By convention, Hack programs use lower-case and upper-case to represent variable and label names, respectively (nothing enforces this convention) These assignments are the responsibility of the assembler, requiring the construction of a symbol table. Typical symbolic Hack assembly code: @R0 @END D;JLE @counter M=D @SCREEN D=A @x M=D (LOOP) @x A=M M=-1 @x @32 D=D+A @x M=D @counter M-1 @LOOP D;JGT (END) @END 0;JMP

Handling symbols: Pre-defined Symbols Virtual registers: The symbols R0,, R15 are predefined to refer to RAM addresses 0,,15 I/O pointers: The symbols SCREEN and KBD are predefined to refer to RAM addresses 16384 and 24576, respectively (base addresses of the screen and keyboard memory maps) VM control pointers: the symbols SP, LCL, ARG, THIS, and THAT (these do not appear in the code example on the right) are predefined to refer to RAM addresses 0 to 4, respectively (The VM control pointers, which overlap R0,, R4, will come to play in the virtual machine implementation, covered in the next chapter) These assignments are also the responsibility of the assembler, requiring the construction of a symbol table. Typical symbolic Hack assembly code: @R0 @END D;JLE @counter M=D @SCREEN D=A @x M=D (LOOP) @x A=M M=-1 @x @32 D=D+A @x M=D @counter M-1 @LOOP D;JGT (END) @END 0;JMP

Handling Symbols: Symbol Table Source code (example) // Computes 1+...+RAM[0] // And stored the sum in RAM[1] @i M=1 // i = 1 M=0 // sum = 0 (LOOP) @i // if i>ram[0] goto WRITE @R0 D=D-M @WRITE D;JGT @i // sum += i M=D+M @i // i++ M=M+1 @LOOP // goto LOOP 0;JMP (WRITE) @R1 M=D // RAM[1] = the sum (END) @END 0;JMP Symbol Table R0 0 R1 1 R2 2...... R15 15 SCREEN 16384 KBD 24576 SP 0 LCL 1 ARG 2 THIS 3 THAT 4 WRITE 18 END 22 i 16 sum 17 (Note: LOOP is missing) This symbol table is generated by the assembler, and used to translate the symbolic code into binary code.

Handling symbols: Constructing the Symbol Table Source code (example) // Computes 1+...+RAM[0] // And stored the sum in RAM[1] @i M=1 // i = 1 M=0 // sum = 0 (LOOP) @i // if i>ram[0] goto WRITE @R0 D=D-M @WRITE D;JGT @i // sum += i M=D+M @i // i++ M=M+1 @LOOP // goto LOOP 0;JMP (WRITE) @R1 M=D // RAM[1] = the sum (END) @END 0;JMP Symbol table R0 0 R1 1 R2 2... R15 15 SCREEN 16384 KBD 24576 SP 0 LCL 1 ARG 2 THIS 3 THAT 4 WRITE 18 END 22 i 16 sum 17 Initialization: create an empty symbol table and populate it with all the pre-defined symbols First pass: go through the entire source code, and add all the user-defined label symbols to the symbol table (without generating any code) Second pass: go again through the source code, and use the symbol table to translate all the commands. In the process, handle all the userdefined variable symbols.

Symbol Table Initialization (C#) using System.Collections.Generic; // This class creates a symbol table and initializes it. // We also implement the book ST functions. // This is straight from the book, except where noted // Dictionaries in C# make this really easy namespace assembler class SymbolTable public static Dictionary<string, int> symboltable = new Dictionary<string, int>() "SP", 0}, "LCL", 1}, "ARG", 2}, "THIS", 3}, "THAT", 4}, "R0", 0}, "R1", 1}, "R2", 2}, "R3", 3}, "R4", 4}, "R5", 5}, "R6", 6}, "R7", 7}, "R8", 8}, "R9", 9}, "R10", 10}, "R11", 11}, "R12", 12}, "R13", 13}, "R14", 14}, "R15", 15}, "SCREEN", 0x4000}, "KBD", 0x6000} };

Symbol Table Functions (C#) public static bool contains(string symbol) return (symboltable.containskey(symbol)); } public static void addentry(string symbol, int address) symboltable.add(symbol, address); } public static int getaddress(string symbol) // We only call this after we check to see that the symbol is present return symboltable[symbol]; }

The Assembly Process (in more detail) Initialization: create the symbol table and initialize it with the pre-defined symbols First pass: walk through the source code without generating any code. For each label declaration (LABEL) that appears in the source code, add the pair <LABEL,n > to the symbol table, where n is the current ROM address Second pass: walk again through the source code, and process each line: If the line is a C-instruction, translate as previously described If the line is @nnn where nnn is a number, create an A instruction If the line is @xxx and xxx is a symbol, look up xxx in the symbol table and proceed as follows: If xxx is found, create an A instruction using xxx as the numeric value If the symbol is not found, then it must represent a new variable: add the pair <xxx,n > to the symbol table, where n is the next available RAM address, and complete the command s translation. (Platform design decision: the allocated RAM addresses are continuous, starting at address 16).

The Result... Source code (example) // Computes 1+...+RAM[0] // And stored the sum in RAM[1] @i M=1 // i = 1 M=0 // sum = 0 (LOOP) @i // if i>ram[0] goto WRITE @R0 D=D-M @WRITE D;JGT @i // sum += i M=D+M @i // i++ M=M+1 @LOOP // goto LOOP 0;JMP (WRITE) @R1 M=D // RAM[1] = the sum (END) @END 0;JMP assemble Target code 0000000000010000 1110111111001000 0000000000010001 1110101010001000 0000000000010000 1111110000010000 0000000000000000 1111010011010000 0000000000010010 1110001100000001 0000000000010000 1111110000010000 0000000000010001 1111000010001000 0000000000010000 1111110111001000 0000000000000100 1110101010000111 0000000000010001 1111110000010000 0000000000000001 1110001100001000 0000000000010110 1110101010000111 Note that comment lines and pseudo-commands (label declarations) generate no code.

Suggested Assembler Implementation An assembler program can be written in any high-level language. The book suggests a language-independent design, as follows. Software modules: Parser: Unpacks each command into its underlying fields Code: Translates each field into its corresponding binary value, and assembles the resulting values SymbolTable: Manages the symbol table Main: Initializes I/O files and drives the show. Suggested implementation stages Stage I: Build a basic assembler for programs with no symbols Stage II: Extend the basic assembler with symbol handling capabilities.

Parser (a software module in the assembler program) enum Parser_CommandType Parser_NO_COMMAND=0, Parser_A_COMMAND, Parser_C_COMMAND, Parser_L_COMMAND }; while ((line = file.readline())!= null) // encapsulates the book s "hasmorecommands" and "advance"

Parser (a software module in the assembler program) / continued In Pass 1: In Pass 2: switch (commandtype) case Parser_CommandType.Parser_A_COMMAND: case Parser_CommandType.Parser_C_COMMAND: // Command will require one word in the ROM ++romaddr; break; case Parser_CommandType.Parser_L_COMMAND: // The label's address is the current ROM address. // Labels don't require any space in the ROM. SymbolTable.addEntry(symbol, romaddr); break; case Parser_CommandType.Parser_NO_COMMAND: // Do nothing. break; switch (commandtype) case Parser_CommandType.Parser_A_COMMAND: if (SymbolTable.contains(symbol)) n = SymbolTable.getAddress(symbol); } else // Allocate a new RAM variable n = ramaddr++; SymbolTable.addEntry(symbol, n); }

Code (a software module in the assembler program) Let s revisit how to translate C Instructions

Translating / Assembling C-instructions Symbolic: dest=comp;jump // Either the dest or jump fields may be empty. // If dest is empty, the "=" is ommitted; // If jump is empty, the ";" is omitted. comp dest jump Binary: 1 1 1 a c1 c2 c3 c4 c5 c6 d1 d2 d3 j1 j2 j3

Translation by Table Lookup: Example D=D+1;JLE n = 0xE000; // The first 3 bits are 1 int compvalue = 0; if (CTables.compTable.ContainsKey(comp)) compvalue = CTables.compTable[comp]; compvalue = compvalue << 6; } int destvalue = 0; if (dest!= null) if (CTables.destTable.ContainsKey(dest)) destvalue = CTables.destTable[dest]; destvalue = destvalue << 3; } int jumpvalue = 0; if (jump!= null) if (CTables.jumpTable.ContainsKey(jump)) jumpvalue = CTables.jumpTable[jump]; } comptable // a= 0 "D+1", 0x1f}, desttable "D", 2}, jumptable "JLE", 6},

Translation by Table lookup: Example D=D+1;JLE Symbolic: dest=comp;jump // Either the dest or jump fields may be empty. // Assemble the whole instruction (as an integer) // If dest is empty, the "=" is ommitted; // If jump is empty, the ";" is omitted. n = compvalue destvalue jumpvalue; comp dest jump Binary: n = compvalue = destvalue = 1 1 1 a c1 c2 c3 c4 c5 c6 d1 d2 d3 j1 j2 j3 1 1 1 0 0 1 1 1 1 1 0 1 0 jumpvalue = 1 1 0 n = 1 1 1 0 0 1 1 1 1 1 0 1 0 1 1 0 = D=D+1;JLE

SymbolTable (a software module in the assembler program) Note that these in-class examples omit some important logic, e.g., deleting white space and comments, checking that symbols are valid, the numbers in A instructions are within range, how to write a hack file, etc.

Example: What Constitutes a Valid Symbol? // Alphabetics, numerics, _. $ : allowed. // Numerics not allowed as first character. // Example code to test symbol validity bool first = true; bool valid = true; int i = 0; int symlen = sym.length; while (valid && (i < symlen)) valid = false; if (!first && sym[i] >= '0' && sym[i] <= '9') valid = true; else if (!first && sym[i] >= '0' && sym[i] <= '9') valid = true; else if (sym[i] >= 'A' && sym[i] <= 'Z') valid = true; else if (sym[i] >= 'a' && sym[i] <= 'z') valid = true; else if (sym[i] == '_' sym[i] == '.' sym[i] == '$' sym[i] == ':') valid = true; i++; first = false;

Example: Deal With White Space // Strip comments and white space. char [] outtemp = new char [line.length]; // need a char array since strings are read-only // j is our non-whitespace index int j = 0; for (int i=0; i < line.length; i++) if (line[i] == '/') if (i == line.length - 2) // have to do this first so we don't get an out of bounds //error if the "//" is at the end of the line // end of line "//"; done with this line break; } else if (line[i+1] == '/') // not at end of line but still a comment, so ignore the rest of the line break; } } // not a comment, keep going with this line if (!char.iswhitespace(line,i)) outtemp[j] = line[i]; // only copy if it's not whitespace j++; } } string temp = new string(outtemp); temp = temp.substring(0, j);

Perspective Simple machine language, simple assembler Most assemblers are not stand-alone, but rather encapsulated in a higher level language translator Modern compilers have good to excellent optimizers, so we need to code in assembly only in very rare cases. Examples of such cases: real-time systems, which need guaranteed response times; systems with limited memory (e.g., deep space probes); critical I/O functions (e.g., disk drivers); interrupt handlers (requiring careful coordination of multiple tasks). Writing an assembler is good practice for writing more challenging translators, for example, the VM Translator and compiler that we will tackle next.