Assembler
Building a Modern Computer From First Principles
www.nand2tetris.org
Elements of Computing Systems, Nisan & Schocken, MIT Press, www.nand2tetris.org, Chapter 6: Assembler, slide 1
Where we are at:
Software hierarchy:
- Human Thought: abstract design (Chapters 9, 12)
- High-Level Language & Operating System: translated by the Compiler (Chapters 10-11)
- Virtual Machine: translated by the VM Translator (Chapters 7-8)
- Assembly Language: translated by the Assembler (Chapter 6)
- Machine Language
Hardware hierarchy:
- Machine Language: realized by the Computer Architecture (Chapters 4-5)
- Hardware Platform: built with Gate Logic (Chapters 1-3)
- Chips & Logic Gates: built with Electrical Engineering
- Physics
Each level exposes an abstract interface to the level above it.
Why care about assemblers?
Because:
- Assemblers employ nifty programming tricks
- Assemblers are the first rung up the software hierarchy ladder
- An assembler is a translator of a simple language
- Writing an assembler is low-impact practice for writing compilers.
Assembly example
Source code (example):
    // Computes 1+...+RAM[0]
    // and stores the sum in RAM[1]
    @i
    M=1      // i = 1
    @sum
    M=0      // sum = 0
  (LOOP)
    @i       // if i>RAM[0] goto WRITE
    D=M
    @R0
    D=D-M
    @WRITE
    D;JGT
    ...      // Etc.
The source code is assembled into target code (...), which is then executed. For now, ignore all details!
The program translation challenge:
- Extract the program's semantics from the source program, using the syntax rules of the source language
- Re-express the program's semantics in the target language, using the syntax rules of the target language
Assembler = simple translator:
- Translates each assembly command into one or more binary machine instructions
- Handles symbols (e.g., i, sum, LOOP, ...).
Revisiting Hack low-level programming: an example
Assembly program (sum.asm):
    // Computes 1+...+RAM[0]
    // and stores the sum in RAM[1].
    @i
    M=1      // i = 1
    @sum
    M=0      // sum = 0
  (LOOP)
    @i       // if i>RAM[0] goto WRITE
    D=M
    @0
    D=D-M
    @WRITE
    D;JGT
    @i       // sum += i
    D=M
    @sum
    M=D+M
    @i       // i++
    M=M+1
    @LOOP    // goto LOOP
    0;JMP
  (WRITE)
    @sum
    D=M
    @1
    M=D      // RAM[1] = the sum
  (END)
    @END
    0;JMP
CPU emulator screen shot after running this program (the screen shot marks the user-supplied input and the program-generated output).
The CPU emulator allows loading and executing symbolic Hack code. It resolves all symbols to memory locations, and executes the code.
The assembler's view of an assembly program
Assembly program: the sum example (computes 1+...+RAM[0] and stores the sum in RAM[1]).
Assembly program = a stream of text lines, each being one of the following:
- A-instruction
- C-instruction
- Symbol declaration: (SYMBOL)
- Comment or white space: // comment
The challenge: translate the program into a sequence of 16-bit instructions that can be executed by the target hardware platform.
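The four line kinds listed above can be told apart with a few string checks. The following is a minimal sketch, not the book's Parser API; the function name `classify` and its return labels are assumptions made for illustration.

```python
def classify(line: str) -> str:
    """Return 'A', 'C', 'LABEL', or 'SKIP' for one line of Hack assembly.

    Hypothetical helper: the name and the return labels are illustrative.
    """
    # Drop a trailing // comment, then surrounding white space.
    text = line.split("//", 1)[0].strip()
    if not text:
        return "SKIP"            # comment-only or blank line: generates no code
    if text.startswith("@"):
        return "A"               # A-instruction: @value
    if text.startswith("(") and text.endswith(")"):
        return "LABEL"           # symbol declaration: (SYMBOL)
    return "C"                   # anything else is treated as a C-instruction
```

Only 'A' and 'C' lines produce machine instructions; 'LABEL' and 'SKIP' lines do not.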
Translating / assembling A-instructions
Symbolic: @value
  where value is either a non-negative decimal number or a symbol referring to such a number.
Binary: 0 v v v v v v v v v v v v v v v   (each v is 0 or 1; the 15 v bits hold value)
Translation to binary:
- If value is a non-negative decimal number: simple
- If value is a symbol: later.
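The "simple" numeric case amounts to writing the value as a 15-bit binary number behind a leading 0. A minimal sketch, assuming the instruction text has already been stripped of comments and white space; the name `assemble_a_numeric` is hypothetical.

```python
def assemble_a_numeric(instruction: str) -> str:
    """Translate '@<decimal>' into a 16-character string of 0s and 1s.

    Hypothetical helper; symbols such as '@sum' are not handled here.
    """
    value = int(instruction[1:])          # text after the '@'
    if not 0 <= value < 2 ** 15:
        raise ValueError(f"value out of 15-bit range: {value}")
    # Zero-padded to 16 bits; the top bit is always 0 because value < 2**15.
    return format(value, "016b")
```

For example, `assemble_a_numeric("@21")` yields `0000000000010101`.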
Translating / assembling C-instructions
Symbolic: dest=comp;jump
  Either the dest or the jump field may be empty. If dest is empty, the "=" is omitted; if jump is empty, the ";" is omitted.
Binary: 1 1 1 a c1 c2 c3 c4 c5 c6 d1 d2 d3 j1 j2 j3
  (comp = a c1 c2 c3 c4 c5 c6, dest = d1 d2 d3, jump = j1 j2 j3)
Translation to binary: simple!
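A sketch of why this translation is simple: split the text on "=" and ";", then look each field up in a table. The bit patterns below are taken from the Hack machine language specification (Chapter 4), not from this slide's text; the names `COMP`, `DEST`, `JUMP` and `assemble_c` are assumptions.

```python
# comp codes for a=0; the a=1 variants use M wherever A appears.
COMP = {
    "0": "101010", "1": "111111", "-1": "111010",
    "D": "001100", "A": "110000", "!D": "001101", "!A": "110001",
    "-D": "001111", "-A": "110011", "D+1": "011111", "A+1": "110111",
    "D-1": "001110", "A-1": "110010", "D+A": "000010", "D-A": "010011",
    "A-D": "000111", "D&A": "000000", "D|A": "010101",
}
DEST = {"": "000", "M": "001", "D": "010", "MD": "011",
        "A": "100", "AM": "101", "AD": "110", "AMD": "111"}
JUMP = {"": "000", "JGT": "001", "JEQ": "010", "JGE": "011",
        "JLT": "100", "JNE": "101", "JLE": "110", "JMP": "111"}


def assemble_c(instruction: str) -> str:
    """Translate 'dest=comp;jump' (dest and jump optional) into 16 bits."""
    dest, comp, jump = "", instruction, ""
    if "=" in comp:
        dest, comp = comp.split("=", 1)
    if ";" in comp:
        comp, jump = comp.split(";", 1)
    a = "1" if "M" in comp else "0"        # a-bit selects M instead of A
    c = COMP[comp.replace("M", "A")]       # reuse the a=0 table for M forms
    return "111" + a + c + DEST[dest] + JUMP[jump]
```

An unknown mnemonic raises `KeyError`, which is a reasonable failure mode for a sketch; a fuller assembler would report the offending line.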
The overall assembly logic
Assembly program: the sum example (computes 1+...+RAM[0] and stores the sum in RAM[1]).
For each (real) command:
- Parse the command, i.e. break it into its underlying fields
- A-instruction: replace the symbolic reference (if any) with the corresponding memory address, which is a number (how to do it, later)
- C-instruction: for each field in the instruction, generate the corresponding binary code
- Assemble the translated binary codes into a complete 16-bit machine instruction
- Write the 16-bit instruction to the output file.
Handling symbols (aka symbol resolution)
Assembly programs typically have many symbols:
- Labels that mark destinations of goto commands
- Labels that mark special memory locations
- Variables
These symbols fall into two categories:
- User-defined symbols (created by programmers)
- Pre-defined symbols (used by the Hack platform).
Typical symbolic Hack assembly code:
    @R0
    D=M
    @END
    D;JLE
    @counter
    M=D
    @SCREEN
    D=A
    @x
    M=D
  (LOOP)
    @x
    A=M
    M=-1
    @x
    D=M
    @32
    D=D+A
    @x
    M=D
    @counter
    MD=M-1
    @LOOP
    D;JGT
  (END)
    @END
    0;JMP
Handling symbols: user-defined symbols
Label symbols: used to label destinations of goto commands. Declared by the pseudo-command (XXX). This directive defines the symbol XXX to refer to the instruction memory location holding the next command in the program.
Variable symbols: any user-defined symbol xxx appearing in an assembly program that is not defined elsewhere using the (xxx) directive is treated as a variable, and is automatically assigned a unique RAM address, starting at RAM address 16 (why start at 16? Later.)
By convention, Hack programmers use lower-case and upper-case to represent variable and label names, respectively.
Q: Who does all the automatic assignments of symbols to RAM addresses?
A: As part of the program translation process, the assembler resolves all the symbols into RAM addresses.
Code example: the typical symbolic Hack assembly code from the slide "Handling symbols (aka symbol resolution)", which uses the labels LOOP and END and the variables counter and x.
Handling symbols: pre-defined symbols
Virtual registers: the symbols R0,..., R15 are automatically predefined to refer to RAM addresses 0,...,15.
I/O pointers: the symbols SCREEN and KBD are automatically predefined to refer to RAM addresses 16384 and 24576, respectively (the base addresses of the screen and keyboard memory maps).
VM control pointers: the symbols SP, LCL, ARG, THIS, and THAT (that don't appear in the code example) are automatically predefined to refer to RAM addresses 0 to 4, respectively. (The VM control pointers, which overlap R0,..., R4, will come into play in the virtual machine implementation, covered in the next lecture.)
Q: Who does all the automatic assignments of symbols to RAM addresses?
A: As part of the program translation process, the assembler resolves all the symbols into RAM addresses.
Code example: the typical symbolic Hack assembly code from the slide "Handling symbols (aka symbol resolution)", which uses the pre-defined symbols R0 and SCREEN.
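The pre-defined symbols described above fit in a small dictionary. A minimal sketch; the function name `predefined_symbols` is an assumption, while the names and addresses are the ones listed on this slide.

```python
def predefined_symbols() -> dict:
    """Build the table of pre-defined Hack symbols (name -> RAM address)."""
    table = {f"R{n}": n for n in range(16)}          # virtual registers R0..R15
    table.update({"SCREEN": 16384, "KBD": 24576})     # I/O memory-map bases
    table.update({"SP": 0, "LCL": 1, "ARG": 2,        # VM control pointers,
                  "THIS": 3, "THAT": 4})              # overlapping R0..R4
    return table
```

The table holds 23 names but only 18 distinct addresses, because SP through THAT alias R0 through R4.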
Handling symbols: symbol table
Source code (example): the sum program (computes 1+...+RAM[0] and stores the sum in RAM[1]), here using @R0 and @R1 for the two registers.
Symbol table:
    R0        0
    R1        1
    R2        2
    ...       ...
    R15       15
    SCREEN    16384
    KBD       24576
    SP        0
    LCL       1
    ARG       2
    THIS      3
    THAT      4
    LOOP      4
    WRITE     18
    END       22
    i         16
    sum       17
This symbol table is generated by the assembler, and used to translate the symbolic code into binary code.
Handling symbols: constructing the symbol table
Source code (example) and symbol table: the sum program and the table it yields (R0 to R15, SCREEN, KBD, SP, LCL, ARG, THIS, THAT, followed by LOOP, WRITE, END, i, sum).
Initialization: create an empty symbol table and populate it with all the pre-defined symbols.
First pass: go through the entire source code, and add all the user-defined label symbols to the symbol table (without generating any code).
Second pass: go again through the source code, and use the symbol table to translate all the commands. In the process, handle all the user-defined variable symbols.
The assembly process (detailed)
Initialization: create the symbol table and initialize it with the pre-defined symbols.
First pass: march through the source code without generating any code. For each label declaration (LABEL) that appears in the source code, add the pair <LABEL, n> to the symbol table, where n is the address of the next instruction in the program.
Second pass: march again through the source code, and process each line:
- If the line is a C-instruction: simple
- If the line is @xxx where xxx is a number: simple
- If the line is @xxx and xxx is a symbol, look it up in the symbol table and proceed as follows:
    - If the symbol is found, replace it with its numeric value and complete the command's translation
    - If the symbol is not found, then it must represent a new variable: add the pair <xxx, n> to the symbol table, where n is the next available RAM address, and complete the command's translation.
(Platform design decision: the allocated RAM addresses are running, starting at address 16.)
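The two passes above can be sketched as a function that turns symbolic code into symbol-free code, leaving the binary encoding to a later step. This is an illustrative sketch, not the book's SymbolTable API; the name `resolve_symbols` and its return shape are assumptions.

```python
def resolve_symbols(source_lines):
    """Return (commands with every @symbol replaced by @number, symbol table)."""
    # Initialization: pre-defined symbols.
    symbols = {f"R{n}": n for n in range(16)}
    symbols.update(SCREEN=16384, KBD=24576, SP=0, LCL=1, ARG=2, THIS=3, THAT=4)

    # Strip comments and white space; keep only non-empty lines.
    commands = []
    for line in source_lines:
        text = line.split("//", 1)[0].strip()
        if text:
            commands.append(text)

    # First pass: record <LABEL, n>, n = address of the next real instruction.
    rom_address = 0
    for cmd in commands:
        if cmd.startswith("("):
            symbols[cmd[1:-1]] = rom_address
        else:
            rom_address += 1

    # Second pass: replace symbols; unseen ones are new variables from RAM 16.
    next_ram = 16
    resolved = []
    for cmd in commands:
        if cmd.startswith("("):
            continue                       # label declarations generate no code
        if cmd.startswith("@") and not cmd[1:].isdigit():
            name = cmd[1:]
            if name not in symbols:
                symbols[name] = next_ram
                next_ram += 1
            cmd = f"@{symbols[name]}"
        resolved.append(cmd)
    return resolved, symbols
```

Labels must be collected in a separate first pass because a command such as `@WRITE` can appear before the `(WRITE)` declaration; without that pass, the forward reference would be mistaken for a new variable.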
The result...
Source code (example): the sum program (computes 1+...+RAM[0] and stores the sum in RAM[1]).
The source code is assembled into target code: one 16-bit binary instruction per real command.
Note that comment lines and pseudo-commands (label declarations) generate no code.
Proposed assembler implementation
An assembler program can be written in any high-level language. We propose a language-independent design, as follows.
Software modules:
- Parser: unpacks each command into its underlying fields
- Code: translates each field into its corresponding binary value, and assembles the resulting values
- SymbolTable: manages the symbol table
- Main: initializes I/O files and drives the show.
Proposed implementation stages:
- Stage I: build a basic assembler for programs with no symbols
- Stage II: extend the basic assembler with symbol handling capabilities.
Parser (a software module in the assembler program)
Parser (a software module in the assembler program) / continued
Code (a software module in the assembler program)
SymbolTable (a software module in the assembler program)
Perspective
- Simple machine language, simple assembler
- Most assemblers are not stand-alone, but rather encapsulated in a translator of a higher order
- C programmers who understand the code generated by a C compiler can improve their code considerably
- C programming (e.g. for real-time systems) may involve re-writing critical segments in assembly, for optimization
- Writing an assembler is excellent practice for writing more challenging translators, e.g. a VM Translator and a compiler, as we will do in the next lectures.