Elements of Computing Systems, Nisan & Schocken, MIT Press, 2005
Chapter 11: Compiler II: Code Generation
www.idc.ac.il/tecs

Usage and Copyright Notice: Copyright 2005 Noam Nisan and Shimon Schocken

This presentation contains lecture materials that accompany the textbook The Elements of Computing Systems by Noam Nisan & Shimon Schocken, MIT Press, 2005. The book web site, www.idc.ac.il/tecs, features 13 such presentations, one for each book chapter. Each presentation is designed to support about 3 hours of classroom or self-study instruction. You are welcome to use or edit this presentation for instructional and non-commercial purposes. If you use our materials, we will appreciate it if you will include in them a reference to the book's web site. And, if you have any comments, you can reach us at tecs.ta@gmail.com
Course map

Software hierarchy (adjacent levels meet at an abstract interface):
  Human Thought
    -> Abstract design (Chapters 9, 12)
  H.L. Language & Operating Sys.
    -> Compiler (Chapters 10-11)
  Virtual Machine
    -> VM Translator (Chapters 7-8)
  Assembly Language
    -> Assembler (Chapter 6)
  Machine Language

Hardware hierarchy (adjacent levels meet at an abstract interface):
  Machine Language
    -> Computer Architecture (Chapters 4-5)
  Hardware Platform
    -> Gate Logic (Chapters 1-3)
  Chips & Logic Gates
    -> Electrical Engineering
  Physics
The big picture

Syntax analysis: understanding the code
Code generation: constructing semantics

Jack Compiler:
  Jack Program -> Syntax Analyzer (Tokenizer, Parser) (Chapter 10) -> XML code
  Jack Program -> Syntax Analyzer (Tokenizer, Parser) -> Code Generation (Chapter 11) -> VM code
Syntax analysis (review)

Source code:

    class Bar {
        method Fraction foo(int y) {
            var int temp;   // a variable
            let temp = (xxx+12)*-63;
            ...

Syntax analyzer output (excerpt):

    <vardec>
      <keyword> var </keyword>
      <keyword> int </keyword>
      <identifier> temp </identifier>
      <symbol> ; </symbol>
    </vardec>
    <statements>
      <letstatement>
        <keyword> let </keyword>
        <identifier> temp </identifier>
        <symbol> = </symbol>
        <expression>
          <term>
            <symbol> ( </symbol>
            <expression>
              <term>
                <identifier> xxx </identifier>
              </term>
              <symbol> + </symbol>
              <term>
                <int.const.> 12 </int.const.>
              </term>
            </expression>
            ...

The code generation challenge:
- Extend the syntax analyzer into a full-blown compiler.
- Program = a series of operations that manipulate data.
- The compiler should convert each understood (parsed) source operation and data item into corresponding operations and data items in the target language.
- So we have to generate code for handling data and for handling operations.
Handling data

When dealing with a variable, say x, we have to know:
- What is x's data type? Primitive, or ADT (class name)? (We need to know this in order to allocate RAM resources to it properly.)
- What kind of variable is x? Local, static, field, or argument? (We need to know this in order to manage its life cycle properly.)
Symbol table

Classical implementation:
- A list of hash tables, each reflecting a single scope nested within the next one in the list.
- Identifier lookup starts at the table of the current scope and works upwards through the enclosing scopes.
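The list-of-hash-tables idea can be sketched in a few lines of Python. This is an illustrative sketch only, not the book's API: the class name SymbolTable, its method names, and the (type, kind, index) entry layout are assumptions made for this example. The field and argument names are borrowed from the Complex class used later in the slides.

```python
class SymbolTable:
    """A chain of per-scope hash tables; the innermost scope is last in the list.

    Hypothetical sketch: names and entry layout are assumptions, not the book's API.
    """

    def __init__(self):
        self.scopes = [{}]   # scopes[0] is the outermost (class-level) scope
        self.counts = [{}]   # per scope: next running index for each kind

    def enter_scope(self):
        """Open a nested scope, e.g. on subroutine entry."""
        self.scopes.append({})
        self.counts.append({})

    def exit_scope(self):
        """Discard the innermost scope, e.g. on subroutine exit."""
        self.scopes.pop()
        self.counts.pop()

    def define(self, name, type_, kind):
        """Record name in the current scope with the next free index of its kind."""
        index = self.counts[-1].get(kind, 0)
        self.counts[-1][kind] = index + 1
        self.scopes[-1][name] = (type_, kind, index)

    def lookup(self, name):
        """Search from the current scope upwards through the enclosing scopes."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise KeyError(name)


table = SymbolTable()
table.define("re", "int", "field")      # class-level scope
table.define("im", "int", "field")
table.enter_scope()                     # subroutine-level scope
table.define("c", "int", "argument")
im_entry = table.lookup("im")           # not in the inner scope, found one level up
c_entry = table.lookup("c")             # found in the current scope
```

Keeping the running index per kind and per scope means that every new subroutine scope starts numbering its arguments and locals from 0 again, while the class-level numbering of fields is left untouched.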
Life cycle

- Static: a single copy must be kept alive throughout the program's duration
- Field: a separate copy must be kept for each object
- Local: created on subroutine entry, killed on exit
- Argument: similar to local

Good news: the VM handles all these details! Hurray!
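Because the VM manages these life cycles, the compiler's job for a variable access reduces to choosing the right memory segment. The sketch below assumes the usual mapping of variable kinds to VM segments (in particular, fields are reached through the this segment); the dictionary and function names are hypothetical.

```python
# Assumed mapping from variable kind to VM memory segment.
SEGMENT_OF_KIND = {
    "static": "static",
    "field": "this",        # fields live in the current object, addressed via "this"
    "local": "local",
    "argument": "argument",
}


def push_variable(kind, index):
    """VM command that pushes the variable's value onto the stack."""
    return f"push {SEGMENT_OF_KIND[kind]} {index}"


def pop_variable(kind, index):
    """VM command that pops the stack's top value into the variable."""
    return f"pop {SEGMENT_OF_KIND[kind]} {index}"


example = [push_variable("local", 2), pop_variable("field", 1)]
```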
Handling arrays

Java code:

    class Complex {
        ...
        void foo(int k) {
            int x, y;
            int[] bar;          // declare an array
            ...
            // Construct the array:
            bar = new int[10];
            ...
            bar[k]=19;
        }
        ...
        Main.foo(2);            // Call the foo method
        ...

Following compilation, the RAM state just after executing bar[k]=19:

    Address       Value   Meaning
    275                   x (local 0)
    276                   y (local 1)
    277           4315    bar (local 2)
    504           2       k (argument 0)
    4315..4324            bar array
    4317          19      bar[2]

bar = new int[n] is typically handled by having the compiler generate code that effects: bar = Mem.alloc(n)

VM code (pseudo):

    // bar[k]=19, or *(bar+k)=19
    push bar
    push k
    add
    // Use a pointer to access bar[k]
    pop addr        // addr points to bar[k]
    push 19
    pop *addr       // Set bar[k] to 19

VM code (final):

    // bar[k]=19, or *(bar+k)=19
    push local 2
    push argument 0
    add
    // Use the that segment to access bar[k]
    pop pointer 1
    push constant 19
    pop that 0
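The final VM code above follows a fixed pattern, so it can be emitted by a small helper. The following Python sketch is hypothetical (the function name and its string-based arguments are assumptions) and only covers the case shown on the slide, where the stored value is an integer constant.

```python
def compile_array_store(array_var, index_var, value):
    """Emit VM code for array_var[index_var] = value.

    array_var and index_var are "segment index" strings such as "local 2";
    value is an integer constant. Hypothetical helper for illustration.
    """
    return [
        f"push {array_var}",      # base address of the array
        f"push {index_var}",      # offset into the array
        "add",                    # address of the target cell
        "pop pointer 1",          # anchor the that segment at that address
        f"push constant {value}",
        "pop that 0",             # store the value in the target cell
    ]


# bar[k]=19 with bar = local 2 and k = argument 0, as on the slide
code = compile_array_store("local 2", "argument 0", 19)
```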
Handling objects: memory allocation

Java code:

    class Complex {
        // Properties (fields):
        int re;   // Real part
        int im;   // Imaginary part

        /** Constructs a new Complex object. */
        public Complex(int are, int aim) {
            re = are;
            im = aim;
        }
    }

    // The following code can be in any class:
    public void bla() {
        Complex a, b, c;
        a = new Complex(5,17);
        b = new Complex(12,192);
        c = a;   // Only the reference is copied
    }

[Figure: RAM state following compilation]

foo = new ClassName(...) is typically handled by having the compiler generate code that effects: foo = Mem.alloc(n)
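A rough sketch of the code a compiler might emit for the Complex constructor is shown below. It rests on several assumptions not stated on the slide: the allocator is called Memory.alloc (the slide abbreviates it as Mem.alloc), the new object's base address is anchored with pop pointer 0 so that fields can be reached through the this segment, and the constructor returns that address to its caller. The helper name and its arguments are hypothetical.

```python
def compile_constructor(n_fields, field_inits):
    """Emit VM code for a constructor that copies arguments into fields.

    n_fields: number of words to allocate for the new object.
    field_inits: (field_index, argument_index) pairs, one per assignment.
    Hypothetical helper; the allocation pattern is an assumption.
    """
    code = [
        f"push constant {n_fields}",
        "call Memory.alloc 1",   # returns the base address of the new block
        "pop pointer 0",         # make the this segment point at the new object
    ]
    for field_index, arg_index in field_inits:
        code += [f"push argument {arg_index}", f"pop this {field_index}"]
    code += ["push pointer 0", "return"]   # hand the object's address to the caller
    return code


# Complex(int are, int aim) { re = are; im = aim; }
complex_new = compile_constructor(2, [(0, 0), (1, 1)])
```

The caller then stores the returned address in the target variable, which is why c = a copies only a reference.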
Handling objects: operations

Java code:

    class Complex {
        // Properties (fields):
        int re;   // Real part
        int im;   // Imaginary part

        /** Constructs a new Complex object. */
        public Complex(int are, int aim) {
            re = are;
            im = aim;
        }

        // Multiplication:
        public void mult (int c) {
            re = re * c;
            im = im * c;
        }
    }

Translating im = im * c:
- Look up the symbol table
- Resulting semantics:

    // im = im * c:
    *(this+1) = *(this+1) times (argument 0)

Of course, this should be written in the target language.
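Written in the target language, the semantics above might come out as in the following sketch. The symbol table contents follow the slide's numbering (im is field 1, c is argument 0); a compiler that passes the object reference as a hidden first argument would number c as argument 1 instead. The use of Math.multiply as the multiplication routine, and the helper and dictionary names, are assumptions made for this example.

```python
# Assumed lookup results: name -> (segment, index), using the slide's numbering.
SYMBOLS = {
    "re": ("this", 0),
    "im": ("this", 1),
    "c": ("argument", 0),
}


def compile_multiply_assign(target, left, right, symbols):
    """Emit VM code for: target = left * right (all three are variables)."""
    def ref(name):
        segment, index = symbols[name]
        return f"{segment} {index}"

    return [
        f"push {ref(left)}",
        f"push {ref(right)}",
        "call Math.multiply 2",   # assumed OS routine taking two arguments
        f"pop {ref(target)}",
    ]


im_code = compile_multiply_assign("im", "im", "c", SYMBOLS)
```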
Handling objects: method calls

Java code:

    class Complex {
        // Properties (fields):
        int re;   // Real part
        int im;   // Imaginary part

        /** Constructs a new Complex object. */
        public Complex(int are, int aim) {
            re = are;
            im = aim;
        }
    }

    class Foo {
        public void foo() {
            Complex x;
            x = new Complex(1,2);
            x.mult(5);
        }
    }

Translating x.mult(5):
- Can also be viewed as mult(x,5)
- Generated code:

    // x.mult(5):
    push x
    push 5
    call mult

General rule: each method call foo.bar(v1,v2,...) can be translated into:

    push foo
    push v1
    push v2
    ...
    call bar
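The general rule maps directly onto a short helper. The sketch below emits the same pseudo VM code as the slide (symbolic operands rather than segment/index pairs); the function name is hypothetical.

```python
def compile_method_call(receiver, method, args):
    """Emit pseudo VM code for receiver.method(args...).

    The receiver object is pushed first, so the callee sees it as its
    first argument; the explicit arguments follow in order.
    """
    code = [f"push {receiver}"]
    code += [f"push {arg}" for arg in args]
    code.append(f"call {method}")
    return code


call_code = compile_method_call("x", "mult", [5])   # x.mult(5)
```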
Generating code for expressions

Source expression: x+g(2,y,-z)*5

Syntax analysis produces the parse tree of the expression; code generation then emits:

    push x
    push 2
    push y
    push z
    neg
    call g
    push 5
    call mult
    add

The codewrite(exp) algorithm:
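One way to obtain the code above is a recursive, postfix traversal of the expression tree: emit code for the operands first, then the operation. The Python sketch below is an assumption about how such an algorithm could look, not a transcription of the slide's codewrite(exp). The tuple-based tree encoding, the operator tables, and the function name code_write are all invented for this example.

```python
# Assumed encodings: ("+", left, right), ("neg", operand), ("call", name, [args]);
# leaves are integers or variable names.
BINARY_OPS = {"+": "add", "-": "sub", "*": "call mult"}
UNARY_OPS = {"neg": "neg", "not": "not"}


def code_write(exp):
    """Return pseudo VM code for exp, operands before operators."""
    if isinstance(exp, (int, str)):          # number or variable
        return [f"push {exp}"]
    tag = exp[0]
    if tag == "call":                        # f(exp1, ..., expN)
        _, name, args = exp
        code = []
        for arg in args:
            code += code_write(arg)
        return code + [f"call {name}"]
    if tag in UNARY_OPS and len(exp) == 2:   # op exp
        return code_write(exp[1]) + [UNARY_OPS[tag]]
    if tag in BINARY_OPS and len(exp) == 3:  # exp1 op exp2
        return code_write(exp[1]) + code_write(exp[2]) + [BINARY_OPS[tag]]
    raise ValueError(f"unknown expression: {exp!r}")


# x+g(2,y,-z)*5, with * binding tighter than +
tree = ("+", "x", ("*", ("call", "g", [2, "y", ("neg", "z")]), 5))
vm_code = code_write(tree)
```

Operator precedence is settled by the parser when it builds the tree, so the code generator itself never has to reason about it.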
Handling control flow (e.g. IF, WHILE)
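The slide's diagram is not reproduced in this transcript. As a hedged illustration only, the sketch below shows one common way to flatten if and while statements into the VM's label, goto, and if-goto commands: negate the condition and jump over the code that should be skipped. The helper names, label prefixes, and the pseudo operands are assumptions made for this example.

```python
import itertools

_label_ids = itertools.count()


def new_label(prefix):
    """Return a label that is unique within this compilation run."""
    return f"{prefix}{next(_label_ids)}"


def compile_while(cond_code, body_code):
    """while (cond) body"""
    top = new_label("WHILE_TOP")
    end = new_label("WHILE_END")
    return ([f"label {top}"] + cond_code
            + ["not", f"if-goto {end}"]      # leave the loop when cond is false
            + body_code
            + [f"goto {top}", f"label {end}"])


def compile_if(cond_code, then_code, else_code):
    """if (cond) then_code else else_code"""
    else_label = new_label("IF_ELSE")
    end_label = new_label("IF_END")
    return (cond_code
            + ["not", f"if-goto {else_label}"]   # skip the then-part when cond is false
            + then_code
            + [f"goto {end_label}", f"label {else_label}"]
            + else_code
            + [f"label {end_label}"])


# while (x > 0) { x = x - 1; }   (pseudo operands)
while_code = compile_while(["push x", "push 0", "gt"],
                           ["push x", "push 1", "sub", "pop x"])
# if (x = 0) { y = 1; } else { y = 2; }
if_code = compile_if(["push x", "push 0", "eq"],
                     ["push 1", "pop y"],
                     ["push 2", "pop y"])
```

Generating a fresh label for every statement keeps nested and repeated control structures from colliding with one another.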
Program flow
Final example
Perspective

Hard Jack simplifications:
- Primitive type system
- No inheritance
- No public class fields (e.g. must use r=c.getradius() rather than r=c.radius)

Soft Jack simplifications:
- Limited control structures (no for, switch, ...)
- Cumbersome handling of char types (cannot use let x='c')

Optimization:
- For example, c++ will be translated into push c, push 1, add, pop c.
- Parallel processing
- Many other examples of possible improvements