Week 3: Compilers and Interpreters

1 CS320 Principles of Programming Languages Week 3: Compilers and Interpreters Jingke Li Portland State University Fall 2017 PSU CS320 Fall 17 Week 3: Compilers and Interpreters 1/ 52

Programming Language Implementation Framework

  high-level programs -> compiler/interpreter -> low-level execution

- Programming languages enable people to express tasks at a high level.
- However, to perform a program's actions, we need to execute it on a low-level machine.
- Compilers and interpreters provide the bridge between these two parts.

High-Level vs. Low-Level

Features of high-level languages:
- Declarations and nested scopes
- Many data types
- Many forms of expressions and statements
- Many levels of program abstraction
- Type inference, exceptions, concurrency, ...

Some other forms of high-level description can also be included in this framework: speech, written text, images, videos, ...

Characteristics of low-level languages:
- Explicit registers and explicit memory management
- Limited operation forms: machine instructions
- Limited control mechanisms: only labels and conditional branches

Compiler vs. Interpreter

An interpreter implements a program in a single step. It runs the program directly:

  source program -> interpreter -> execution

A compiler implements a program in two steps. It first translates the program; the translated program is then executed:

  source program -> compiler -> target program -> execution

A JIT (just-in-time) compiler performs compilation after the source program has been loaded into memory for execution. To the user it looks like an interpreter, since no explicit target program is produced.
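The one-step versus two-step distinction can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy expression language (integers, + and *), the class name InterpretVsCompile, and the made-up stack machine with PUSH, ADD and MUL instructions. The interpreter evaluates the tree directly, while the compiler first emits a target program that is executed afterwards.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical toy language: integer expressions built from numbers, + and *.
final class InterpretVsCompile {
    // AST node: a leaf (op == 0) holding a value, or a binary node.
    static final class Expr {
        final char op; final int value; final Expr left, right;
        Expr(int value) { this.op = 0; this.value = value; this.left = null; this.right = null; }
        Expr(char op, Expr left, Expr right) { this.op = op; this.value = 0; this.left = left; this.right = right; }
    }

    // Interpreter: a single step, evaluating the source tree directly.
    static int interpret(Expr e) {
        if (e.op == 0) return e.value;
        int l = interpret(e.left), r = interpret(e.right);
        return e.op == '+' ? l + r : l * r;
    }

    // Compiler, step one: translate the tree into a target program
    // (instructions for a made-up stack machine).
    static List<String> compile(Expr e) {
        List<String> code = new ArrayList<>();
        emit(e, code);
        return code;
    }

    private static void emit(Expr e, List<String> code) {
        if (e.op == 0) { code.add("PUSH " + e.value); return; }
        emit(e.left, code);
        emit(e.right, code);
        code.add(e.op == '+' ? "ADD" : "MUL");
    }

    // Step two: execute the target program on the stack machine.
    static int execute(List<String> code) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String insn : code) {
            if (insn.startsWith("PUSH ")) {
                stack.push(Integer.parseInt(insn.substring(5)));
            } else {
                int r = stack.pop(), l = stack.pop();
                stack.push(insn.equals("ADD") ? l + r : l * r);
            }
        }
        return stack.pop();
    }
}
```

For the tree of 2 + 3 * 4, interpret returns the value immediately, whereas compile returns a list of stack instructions that execute then runs to obtain the same value.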

Compiler vs. Interpreter

Any programming language can be implemented through either interpretation or compilation. However, because languages differ in their features:

- Some languages are more suitable for compilation, e.g. languages with many static features: Fortran, C, Ada, ... They are sometimes referred to as compiled languages.
- Some are more suitable for interpretation, e.g.
  - Very simple languages: BASIC, Logo, ...
  - Scripting languages: PHP, Python, Ruby, Perl, JavaScript, ...
  - Declarative languages: Lisp, Scheme, ML, Haskell, Prolog, ...
  They are sometimes referred to as interpreted languages.

Compiler vs. Interpreter

- Some languages use a combined compilation-interpretation approach, e.g. through an intermediate representation: Pascal (p-code), Java (bytecode), VB (p-code), C#, ...
- Some languages have both forms of implementation: Pascal, Lisp, C/C++, ...

Compiler vs. Interpreter

Since a compiler is a language-to-language translator, compilers can be used in a chain:

  L1 program -> L1 compiler -> L2 program -> L2 compiler -> L3 program

Example: the classical Unix C compiler, cc, which turns proc.c into proc.o, is in fact three compilers chained together.

Compiler vs. Interpreter

  cc:  proc.c -> cpp -> proc.i -> cc1 -> proc.s -> as -> proc.o

- cpp: the C preprocessor, which expands macros and processes compiler directives in the source program
- cc1: the main C compiler, which translates C code to the assembly language for a particular machine
- as: the assembler, which translates assembly language programs into machine code

Compiler Overview

  source program -> compiler -> target program (plus diagnostics)

- A compiler translates a program: it reads a source program as input, analyzes it, and then outputs a semantically equivalent target program.
- In a typical setting, the source language is high-level, while the target language is low-level.

Compiler Overview

  source program -> front-end -> AST -> back-end -> target program (plus diagnostics)

- Front-end: its main task is to understand the input program's syntax and validate its static semantics.
- Back-end: its main task is to synthesize a semantically equivalent target program.
- AST: the internal program representation, carrying the essential syntax information.

Basic Requirement for a Compiler

- A compiler needs to ensure that the source program's semantics are preserved in the target program.
- In today's practice, a compiler's correctness is largely established through informal validation approaches.
- Provably correct compilers are still an active research topic.

Desirable Properties of a Compiler

- Performance: of both the compiler itself and the compiled code
- Diagnostics: high-quality error messages and warnings enable early diagnosis and resolution of programming errors
- Convenient development environment: IDEs, tools for profiling and debugging, etc.
- Separate compilation

The Compiler Pipeline

The compilation process is typically broken down into a sequence of phases.

Front-end:
  source program -> Lexical Analysis (Lexer) -> Syntax Analysis (Parser) -> Static Analysis (Checker) -> abstract syntax tree

Back-end:
  abstract syntax tree -> IR Code Generator -> IR Code Optimizer -> Target Code Generator -> target program

There are many variations on the phase sequence of the back-end, e.g. extra phases or iterated phases.

Lexical Analysis

  character stream -> Lexer -> token stream

Tasks:
- Looking for patterns in the input and converting them to tokens
- Skipping comments and white space characters
- Detecting lexical errors

Lexer Implementation

A lexer is basically a finite automaton. Every step in converting a regular expression (RE) to a DFA and then to a lexer can be automated. As such, many lexer generators exist: lex, flex, flex++, jflex, JavaCC, ...

"mini.jflex":

  // minijava keywords and ID
  // in jflex specification
  %%
  %%
  "class"
  "extends"
  "static"
  "public"
  "void"
  "int"
  "boolean"
  "new"
  "if"
  "else"
  "while"
  "return"
  "main"
  "true"
  "false"
  "String"
  "System"
  "out"
  "println"
  [A-Za-z]*
  [ \t\n]+ { /* ignore */ }

Lexer Implementation

It's possible to implement a lexer manually by following the RE-to-DFA conversion steps. But the process can be very tedious, and the resulting DFA can be very large:

  linux> jflex mini.jflex
  Reading "mini.jflex"
  Constructing NFA : 118 states in NFA
  Converting NFA to DFA :
  states before minimization, 91 states in minimized DFA
  Writing code to "Yylex.java"

Alternative manual approaches exist. They generally process the token RE patterns directly, without converting them to DFAs, and use techniques such as buffering, lookahead, and post-processing to simplify the task.

Lexer Implementation

Sample manual code (for keywords and IDs):

  // Treat keywords as IDs first, then tell them apart
  int c = nextchar();
  int c2 = peeknextchar();
  if (isletter(c)) {
    // identifying an ID token
    while (isletter(c2) || isdigit(c2)) {
      c = nextchar();
      c2 = peeknextchar();
    }
    // assume lexeme is buffered in a String
    if (lexeme.equals("class"))
      return CLASS;
    else if (lexeme.equals("extends"))
      return EXTENDS;
    ...
    else
      return ID;
  }
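The keyword-as-ID technique can be tried out with a small self-contained sketch. The class name TinyLexer, the string-based token names, and the reduced keyword set are assumptions made for illustration; the sample code above relies on helpers (nextchar, peeknextchar, a buffered lexeme) that are not shown in the slides, so this sketch scans a String by index instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical hand-written lexer for keywords and identifiers only.
final class TinyLexer {
    // A subset of the keywords listed in the jflex specification.
    private static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
        "class", "extends", "static", "public", "void", "int", "if", "else", "while", "return"));

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (Character.isWhitespace(c)) {          // skip white space
                pos++;
            } else if (Character.isLetter(c)) {
                // Scan a letter followed by letters or digits as an ID first...
                int start = pos;
                while (pos < input.length() && Character.isLetterOrDigit(input.charAt(pos))) pos++;
                String lexeme = input.substring(start, pos);
                // ...then check whether the lexeme is actually a keyword.
                tokens.add(KEYWORDS.contains(lexeme)
                    ? lexeme.toUpperCase(Locale.ROOT)
                    : "ID(" + lexeme + ")");
            } else {
                // Anything else is a lexical error in this reduced sketch.
                throw new IllegalArgumentException("lexical error at position " + pos + ": '" + c + "'");
            }
        }
        return tokens;
    }
}
```

Calling TinyLexer.tokenize("class Foo extends Bar2") yields the tokens CLASS, ID(Foo), EXTENDS, ID(Bar2). A keyword table lookup after scanning keeps the scanning loop free of per-keyword states, which is the point of treating keywords as IDs first.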

Syntax Analysis

  token stream -> Parser -> abstract syntax tree

Tasks:
- Recognizing the hierarchical syntactic structure of the input program and representing it in an internal data structure, typically a syntax tree
- Detecting syntax errors

Parser Implementation

A parser is basically a push-down automaton, i.e. an automaton with stack storage. On each input token it can not only move from one state to another, but also store information for later use.

Steps to implement a parser:
1. Describe the input language's syntax with a context-free grammar.
2. Convert the grammar into a form that is suitable for parsing, e.g. unambiguous and with a restricted form of recursion.
3. Build a parser based on the transformed grammar.

A Context-Free Grammar Example

  Program    -> "begin" StmtList "end"
  StmtList   -> Stmt {Stmt}
  Stmt       -> Assignment | ReadStmt | WriteStmt
  Assignment -> id ":=" Expr ";"
  ReadStmt   -> "read" "(" IdList ")" ";"
  WriteStmt  -> "write" "(" ExprList ")" ";"
  IdList     -> id {"," id}
  ExprList   -> Expr {"," Expr}
  Expr       -> Expr Op Expr | "(" Expr ")" | id | intlit
  Op         -> "+" | "-" | "*" | "/"

Grammar Transformation

A programming language's official grammar is not always suitable as the base for parser construction, e.g.
- it might be ambiguous
- it might contain the wrong forms of recursion
- it might require multiple lookahead tokens

Transformation is often required to prepare a grammar for parsing.

Example: eliminating ambiguity in the expression grammar.

Before:
  Expr -> Expr Op Expr | "(" Expr ")" | id | intlit
  Op   -> "+" | "-" | "*" | "/"

After:
  Expr    -> Expr ("+" | "-") Factor | Factor
  Factor  -> Factor ("*" | "/") Primary | Primary
  Primary -> "(" Expr ")" | id | intlit
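To see what the layered Expr / Factor / Primary grammar buys, here is a minimal recursive-descent sketch that evaluates integer expressions. It is an illustration under stated assumptions, not the course's parser: the class name ExprParser is made up, id operands are left out, and the left-recursive productions are replaced by the equivalent iterative forms Expr -> Factor { ("+" | "-") Factor } and Factor -> Primary { ("*" | "/") Primary }, since a recursive-descent parser cannot use left recursion directly.

```java
// Hypothetical recursive-descent evaluator for the unambiguous expression grammar.
final class ExprParser {
    private final String src;
    private int pos = 0;

    private ExprParser(String src) { this.src = src.replace(" ", ""); }

    static int eval(String src) {
        ExprParser p = new ExprParser(src);
        int v = p.expr();
        if (p.pos != p.src.length()) throw new IllegalArgumentException("syntax error at " + p.pos);
        return v;
    }

    // One character of lookahead; '\0' marks the end of input.
    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }

    // Expr -> Factor { ("+" | "-") Factor }   (loop gives left associativity)
    private int expr() {
        int v = factor();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);
            int r = factor();
            v = (op == '+') ? v + r : v - r;
        }
        return v;
    }

    // Factor -> Primary { ("*" | "/") Primary }   (binds tighter than + and -)
    private int factor() {
        int v = primary();
        while (peek() == '*' || peek() == '/') {
            char op = src.charAt(pos++);
            int r = primary();
            v = (op == '*') ? v * r : v / r;
        }
        return v;
    }

    // Primary -> "(" Expr ")" | intlit   (id is omitted in this sketch)
    private int primary() {
        if (peek() == '(') {
            pos++;
            int v = expr();
            if (peek() != ')') throw new IllegalArgumentException("expected ')' at " + pos);
            pos++;
            return v;
        }
        int start = pos;
        while (Character.isDigit(peek())) pos++;
        if (start == pos) throw new IllegalArgumentException("expected intlit at " + pos);
        return Integer.parseInt(src.substring(start, pos));
    }
}
```

With this structure, ExprParser.eval("2+3*4") gives 14 and ExprParser.eval("10-4-3") gives 3: precedence comes from the nesting of the grammar levels, and left associativity from the loops.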

Parsing Techniques: Top-Down Parsing

Build a syntax tree from the top down. Use lookahead to predict the next production to apply.

Grammar:
  1. S -> aBcDe
  2. B -> Bb
  3. B -> b
  4. D -> Dd
  5. D -> d

Input: abbcde

Parsing steps (the number is the production applied):
  S =(1)=> aBcDe =(2)=> aBbcDe =(3)=> abbcDe =(5)=> abbcde

Resulting syntax tree:

          S
      / / | \ \
     a  B  c  D  e
       / \    |
      B   b   d
      |
      b

Parsing Techniques: Bottom-Up Parsing

Build a syntax tree from the bottom up: find a sequence on the stack that matches a production's right-hand side, and replace it with the left-hand side.

Grammar:
  1. S -> aBcDe
  2. B -> Bb
  3. B -> b
  4. D -> Dd
  5. D -> d

Input: abbcde

Parsing steps ("in" reads the next input symbol onto the stack; a number reduces by that production):

  Action   Stack
  in       a
  in       ab
  3        aB
  in       aBb
  2        aB
  in       aBc
  in       aBcd
  5        aBcD
  in       aBcDe
  1        S
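The bottom-up steps for this grammar can be replayed with a small sketch. This is a hand-coded illustration, not a general LR parser: the class name ShiftReduceDemo is made up, and the decision of when to reduce is written out by hand for this one grammar (for example, a b is reduced by production 3 only when it directly follows a), whereas a real bottom-up parser takes these decisions from generated tables.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shift-reduce recognizer for the grammar
//   1. S -> aBcDe   2. B -> Bb   3. B -> b   4. D -> Dd   5. D -> d
final class ShiftReduceDemo {
    // Returns the production numbers in the order they are used for reductions.
    static List<Integer> parse(String input) {
        StringBuilder stack = new StringBuilder();
        List<Integer> reductions = new ArrayList<>();
        for (char c : input.toCharArray()) {
            stack.append(c);                            // shift the next input symbol
            while (reduceOnce(stack, reductions)) { }   // reduce while a handle is on top
        }
        if (!stack.toString().equals("S"))
            throw new IllegalArgumentException("syntax error; stack = " + stack);
        return reductions;
    }

    // Tries one reduction; the context checks are specific to this grammar.
    private static boolean reduceOnce(StringBuilder stack, List<Integer> out) {
        String s = stack.toString();
        if (s.equals("aBcDe")) return replace(stack, 5, "S", 1, out);
        if (s.endsWith("Bb"))  return replace(stack, 2, "B", 2, out);
        if (s.endsWith("ab"))  return replace(stack, 1, "B", 3, out);
        if (s.endsWith("Dd"))  return replace(stack, 2, "D", 4, out);
        if (s.endsWith("cd"))  return replace(stack, 1, "D", 5, out);
        return false;
    }

    // Pops n symbols (the right-hand side) and pushes the left-hand side.
    private static boolean replace(StringBuilder stack, int n, String lhs, int rule, List<Integer> out) {
        stack.setLength(stack.length() - n);
        stack.append(lhs);
        out.add(rule);
        return true;
    }
}
```

For the input abbcde the reductions come out in the order 3, 2, 5, 1, matching the parsing steps listed for the example.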

Static Analysis

  abstract syntax tree -> Static Checker -> validated abstract syntax tree

Task: check that the input program is valid according to the language's static semantics.

Static Analysis Implementation

Traverse the AST and validate every node with respect to the semantic rules.

Non-local information (such as variables' types) is often needed for the validation; this information is typically maintained in global data structures (i.e. environments).

Example:

  // Make sure operands' types are legal with respect to the operator.
  static Ast.Type check(Ast.Binop n) throws Exception {
    Ast.Type t1 = check(n.e1);
    Ast.Type t2 = check(n.e2);
    if (n.op == Ast.BOP.ADD || n.op == Ast.BOP.SUB ||
        n.op == Ast.BOP.MUL || n.op == Ast.BOP.DIV) {
      if ((t1 instanceof Ast.IntType) && (t2 instanceof Ast.IntType))
        return Ast.IntType;
    } else if (n.op == Ast.BOP.AND || n.op == Ast.BOP.OR) {
      if ((t1 instanceof Ast.BoolType) && (t2 instanceof Ast.BoolType))
        return Ast.BoolType;
    }
    ...
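The same operand check can be run end to end with a reduced, self-contained sketch. The Ast classes used in the example above are not shown in the slides, so this version defines its own hypothetical stand-ins (TypeCheckDemo, Exp, IntLit, BoolLit, Binop, and the Type and Op enums) and handles literals and binary operations only; it needs no environment because there are no variables.

```java
// Hypothetical miniature AST and checker; not the course's Ast package.
final class TypeCheckDemo {
    enum Type { INT, BOOL }
    enum Op { ADD, SUB, MUL, DIV, AND, OR }

    static abstract class Exp { }
    static final class IntLit extends Exp { final int v; IntLit(int v) { this.v = v; } }
    static final class BoolLit extends Exp { final boolean v; BoolLit(boolean v) { this.v = v; } }
    static final class Binop extends Exp {
        final Op op; final Exp e1, e2;
        Binop(Op op, Exp e1, Exp e2) { this.op = op; this.e1 = e1; this.e2 = e2; }
    }

    // Traverses the tree bottom-up and returns the type of each node,
    // rejecting operands that do not fit the operator.
    static Type check(Exp e) {
        if (e instanceof IntLit) return Type.INT;
        if (e instanceof BoolLit) return Type.BOOL;
        Binop n = (Binop) e;
        Type t1 = check(n.e1);
        Type t2 = check(n.e2);
        boolean arithmetic = n.op == Op.ADD || n.op == Op.SUB || n.op == Op.MUL || n.op == Op.DIV;
        Type expected = arithmetic ? Type.INT : Type.BOOL;
        if (t1 != expected || t2 != expected)
            throw new IllegalStateException(
                "operands of " + n.op + " must be " + expected + ", found " + t1 + " and " + t2);
        return expected;
    }
}
```

Checking 1 + 2 returns INT, checking true && false returns BOOL, and checking 1 + true throws, which is the static error the checker exists to report.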

IR Code Generation

  validated abstract syntax tree -> IR Code Generator -> IR code

Task: translating the input program into IR code.

An IR (Intermediate Representation) is an internal program representation used by a compiler.

Reasons for using an IR:
- It enables a compiler to analyze and manipulate a program independently of both input-language and target-language constraints.
- It provides a favorable environment for optimization.

IR Code Generation Implementation

Traverse the AST and generate IR code for every node.

The rules for IR code generation can be formally specified with attribute grammars.

Example:

  Exp -> (Binop OP Exp1 Exp2)        OP -> + | - | * | /
      NewTemp t
      Exp.c := Exp1.c ++ Exp2.c ++ "t = Exp1.v OP Exp2.v"
      Exp.v := t

  Exp.c: the generated code
  Exp.v: Exp's value, or the temp or id holding the value

The operator "++" denotes IR code concatenation.
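The Binop rule can be turned into a short generator. This is a sketch under assumptions: the class IrGenDemo and its Exp, Id and Binop node classes are made up for illustration, the IR is plain text in three-address form, and temporaries are named t1, t2, ... in the order they are created. The list code plays the role of Exp.c and the returned string plays the role of Exp.v.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical AST-to-IR generator for binary expressions over identifiers.
final class IrGenDemo {
    static abstract class Exp { }
    static final class Id extends Exp { final String name; Id(String name) { this.name = name; } }
    static final class Binop extends Exp {
        final char op; final Exp e1, e2;
        Binop(char op, Exp e1, Exp e2) { this.op = op; this.e1 = e1; this.e2 = e2; }
    }

    private final List<String> code = new ArrayList<>();   // accumulated IR code (Exp.c)
    private int tempCount = 0;

    private String newTemp() { return "t" + (++tempCount); }

    // Emits code for e and returns the id or temp holding its value (Exp.v).
    private String gen(Exp e) {
        if (e instanceof Id) return ((Id) e).name;   // an id needs no code
        Binop b = (Binop) e;
        String v1 = gen(b.e1);                       // Exp1.c is emitted first
        String v2 = gen(b.e2);                       // then Exp2.c
        String t = newTemp();
        code.add(t + " = " + v1 + " " + b.op + " " + v2);
        return t;
    }

    static List<String> generate(Exp e) {
        IrGenDemo g = new IrGenDemo();
        g.gen(e);
        return g.code;
    }
}
```

For the expression a*a + b*b, generate produces the three instructions t1 = a * a, t2 = b * b and t3 = t1 + t2. Appending to one shared list in traversal order is an implementation shortcut for the concatenation in the rule.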

IR Code Optimization

  IR code -> Optimizer -> improved IR code

Task: transforming IR code into a functionally equivalent, but more efficient form.

Techniques:
- Pattern matching for local optimization
- Dataflow analysis for global optimization
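As a rough illustration of pattern matching for local optimization, the sketch below rewrites single IR instructions that match one of three patterns: an operation on two constants, an addition of 0, and a multiplication by 1. The textual IR format ("t = x OP y"), the class name PeepholeDemo, and the choice of patterns are assumptions made for this example; integer overflow in constant folding is ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical local optimizer working on textual three-address instructions.
final class PeepholeDemo {
    private static final Pattern CONST_OP = Pattern.compile("(\\w+) = (\\d+) ([+*]) (\\d+)");
    private static final Pattern ADD_ZERO = Pattern.compile("(\\w+) = (\\w+) \\+ 0");
    private static final Pattern MUL_ONE  = Pattern.compile("(\\w+) = (\\w+) \\* 1");

    static List<String> optimize(List<String> code) {
        List<String> out = new ArrayList<>();
        for (String insn : code) {
            Matcher m;
            if ((m = CONST_OP.matcher(insn)).matches()) {
                // Constant folding: compute the result at compile time.
                int l = Integer.parseInt(m.group(2));
                int r = Integer.parseInt(m.group(4));
                int v = m.group(3).equals("+") ? l + r : l * r;
                out.add(m.group(1) + " = " + v);
            } else if ((m = ADD_ZERO.matcher(insn)).matches()
                    || (m = MUL_ONE.matcher(insn)).matches()) {
                // Algebraic simplification: x + 0 and x * 1 are just x.
                out.add(m.group(1) + " = " + m.group(2));
            } else {
                out.add(insn);   // no pattern applies; keep the instruction
            }
        }
        return out;
    }
}
```

Each rewrite looks at one instruction in isolation, which is what makes it local; deciding, say, that a temp is never used again would need the dataflow analysis mentioned for global optimization.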

Target Code Generation

  IR code -> Code Generator -> target code

Tasks:
- Generating target machine code
- Performing machine-specific optimization

Compilation Process Example

The source program (toy.c):

  /* A toy C program */
  int main(void) {
    int a, b, s;
    printf("Enter two integers: ");
    scanf("%d %d", &a, &b);
    s = a*a + b*b;
    printf("%d^2 + %d^2 = %d\n", a, b, s);
  }

Actual Memory Content of toy.c

In units of bytes (dumped with the Linux od utility):

  2F 2A F F D 20 2A 2F 0A 69 6E D E F B 0A E C C B 0A E E F E A B 0A E C C B 0A D A B A 62 3B A E E B E D C 6E 22 2C C C B 0A 7D 0A

Binary sequences are used to represent all types of information in a computer. We have to assume an encoding scheme in order to interpret the content of any file.
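The dependence on an assumed encoding can be shown in a few lines. The class DecodeDemo and its helper are made up for illustration, and the eight byte values in the usage note are simply the ASCII codes of the first eight characters of toy.c (they are written out here from the ASCII table, not copied from the dump).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: interpret raw byte values under the ASCII encoding.
final class DecodeDemo {
    static String decodeAscii(int... bytes) {
        byte[] raw = new byte[bytes.length];
        for (int i = 0; i < bytes.length; i++) raw[i] = (byte) bytes[i];
        // The same bytes would read differently under a different charset.
        return new String(raw, StandardCharsets.US_ASCII);
    }
}
```

Calling DecodeDemo.decodeAscii(0x2F, 0x2A, 0x20, 0x41, 0x20, 0x74, 0x6F, 0x79) returns the text "/* A toy", the opening of the comment in toy.c.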

Interpreting the Content as ASCII Characters

    / * A t o y C p r o g r a m * / \n
    i n t m a i n ( v o i d ) { \n
    i n t a, b, s ; \n
    p r i n t f ( " E n t e r t w o i n t e g e r s : " ) ; \n
    s c a n f ( " % d % d ", & a, & b ) ; \n
    s = a * a + b * b ; \n
    p r i n t f ( " % d ^ 2 + % d ^ 2 = % d \ n ", a, b, s ) ; \n
    } \n

This is the actual input to a compiler. The compiler reads an input program file one character at a time.

Lexing the Toy Program

Input: the character stream of toy.c shown above.

Processing:

    Input chars          Action
    / * A t o y ... * /  skip
    \n                   skip
    i n t                return token INT
    (space)              skip
    m a i n              return token MAIN
    (                    return token LPAREN
    v o i d              return token VOID
    )                    return token RPAREN
    (space)              skip
    {                    return token LBRACE
    \n                   skip
    (space)              skip
    ...

Output:

    INT MAIN ( VOID ) {
    INT ID(a) , ID(b) , ID(s) ;
    ID(printf) ( STRLIT("Ent..") ) ;
    ID(scanf) ( STRLIT("%d %d") , & ID(a) , & ID(b) ) ;
    ID(s) = ID(a) * ID(a) + ID(b) * ID(b) ;
    ID(printf) ( STRLIT("%d^2..") , ID(a) , ID(b) , ID(s) ) ;
    }
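The skip/return-token behaviour in the table can be sketched with a small regular-expression lexer in Python. This is an illustrative sketch, not the lexer used in the course: the names TOKEN_SPEC, KEYWORDS and lex are hypothetical, punctuation is returned as the character itself rather than as LPAREN-style names, and main is treated as a keyword only because the slide shows a MAIN token.

```python
import re

# Token classes, tried in order. COMMENT and SPACE are matched but discarded.
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/"),
    ("SPACE",   r"\s+"),
    ("STRLIT",  r'"(?:\\.|[^"\\])*"'),
    ("WORD",    r"[A-Za-z_]\w*"),
    ("PUNCT",   r"[(){};,&=*+]"),
]
# Assumption: `main` is a keyword, mirroring the MAIN token on the slide.
KEYWORDS = {"int": "INT", "void": "VOID", "main": "MAIN"}
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)

def lex(source):
    """Turn a character string into a list of token strings."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        if kind in ("COMMENT", "SPACE"):
            pass                                  # "skip" in the table
        elif kind == "WORD":
            tokens.append(KEYWORDS.get(text, f"ID({text})"))
        elif kind == "STRLIT":
            tokens.append(f"STRLIT({text})")
        else:
            tokens.append(text)                   # punctuation stands for itself
        pos = match.end()
    return tokens

print(lex("s = a*a + b*b;"))
```

A production lexer would usually be generated from the token specification or hand-written as a finite automaton; the regex loop above only mirrors the input-to-action mapping.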

Parsing the Toy Program

Input: the token stream produced by the lexer (shown above).

Output:

    program
      decls
        func-decl
          INT
          main
          null
          decls
            var-decl
          stmts
            call-stmt
              ID(printf)
              args
            call-stmt
              ID(scanf)
              args
            assign
              lvalue
              expr
            call-stmt
              ID(printf)
              args
                STRLIT(...)
                expr: ID(a)
                expr: ID(b)
                expr: ID(s)
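How a token stream becomes a tree can be sketched with a small recursive-descent parser in Python. It handles only a single assignment statement such as the one in toy.c, over an assumed grammar (assign -> ID = expr ; with + and * at the usual precedences); the function name parse_assign and the tuple-shaped tree are hypothetical choices for illustration, not the course's parser.

```python
def parse_assign(tokens):
    """Parse `ID = expr ;` and return a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, found {peek()!r}")
        pos += 1

    def identifier():
        nonlocal pos
        tok = peek()
        if tok is None or not tok.startswith("ID("):
            raise SyntaxError(f"expected an identifier, found {tok!r}")
        pos += 1
        return tok

    def factor():                    # factor -> ID | ( expr )
        if peek() == "(":
            expect("(")
            node = expr()
            expect(")")
            return node
        return identifier()

    def term():                      # term -> factor { * factor }
        node = factor()
        while peek() == "*":
            expect("*")
            node = ("*", node, factor())
        return node

    def expr():                      # expr -> term { + term }
        node = term()
        while peek() == "+":
            expect("+")
            node = ("+", node, term())
        return node

    target = identifier()
    expect("=")
    value = expr()
    expect(";")
    return ("assign", target, value)

tree = parse_assign(
    ["ID(s)", "=", "ID(a)", "*", "ID(a)", "+", "ID(b)", "*", "ID(b)", ";"]
)
print(tree)
```

The nesting of the result reflects precedence: both multiplications sit below the addition, just as the assign node in the tree above has an lvalue and an expr child.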

Performing Static Analysis

Input: the syntax tree produced by the parser (shown above).

Result: Verified!

IR Code Example

Register-machine IR code:

        t1 = malloc (8)
        a = t1
        t2 = -2
        t3 = t2 * 3
        t4 = 1 + t3
        t5 = 0 * 4
        t6 = a + t5
        [t6] = t4
        t7 = false
        if true == false goto L0
        t8 = 0 * 4
        t9 = a + t8
        t10 = [t9]
        t11 = true
        if t10 < 0 goto L1
        t11 = false
    L1:
        if t11 == false goto L0
        t7 = true
    L0:
        flag = t7
        if flag == false goto L2
        t12 = 0 * 4
        t13 = a + t12
        t14 = [t13]
        t15 = 1 * 4
        t16 = a + t15
        [t16] = t14
        goto L3
    L2:
        t17 = 1 * 4
        t18 = a + t17
        [t18] = 0
    L3:
        t19 = 1 * 4
        t20 = a + t19
        t21 = [t20]
        print (t21)

IR Code Example

Stack-machine IR code (instruction numbers and opcodes; operands omitted):

     0. CONST        14. CONST       27. LOAD
     1. NEWARRAY     15. ALOAD       28. CONST
     2. STORE        16. CONST       29. ALOAD
     3. LOAD         17. IFLT        30. ASTORE
     4. CONST        18. CONST       31. GOTO
     5. CONST        19. GOTO        32. LOAD
     6. CONST        20. CONST       33. CONST
     7. NEG          21. AND         34. CONST
     8. CONST        22. STORE       35. ASTORE
     9. MUL          23. LOAD        36. LOAD
    10. ADD          24. IFZ         37. CONST
    11. ASTORE       25. LOAD        38. ALOAD
    12. CONST        26. CONST       39. PRINT
    13. LOAD
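To show how stack-machine code executes, here is a minimal Python sketch of an evaluator for a handful of the opcodes listed above. The operand conventions (CONST pushes its operand; NEG, MUL and ADD pop their inputs and push the result; PRINT pops one value) are assumptions made for illustration, and the sample program is hypothetical: it computes 1 + (-2) * 3, the same expression the register-machine code evaluates into t4.

```python
def run(program):
    """Execute straight-line stack code and return the printed values."""
    stack = []
    output = []
    for opcode, *operands in program:
        if opcode == "CONST":
            stack.append(operands[0])
        elif opcode == "NEG":
            stack.append(-stack.pop())
        elif opcode in ("ADD", "MUL"):
            right = stack.pop()          # top of stack is the right operand
            left = stack.pop()
            stack.append(left + right if opcode == "ADD" else left * right)
        elif opcode == "PRINT":
            output.append(stack.pop())
        else:
            raise ValueError(f"unsupported opcode {opcode!r}")
    return output

# Hypothetical program for 1 + (-2) * 3.
program = [
    ("CONST", 1),
    ("CONST", 2),
    ("NEG",),
    ("CONST", 3),
    ("MUL",),
    ("ADD",),
    ("PRINT",),
]
result = run(program)
print(result)
```

Unlike the register form, no temporaries are named: every intermediate value lives on the operand stack, which is why the instruction stream is longer but each instruction is simpler.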

Local Optimization Example

Analyze and transform a few adjacent IR instructions at a time.

Example:

Original:

    t1 = malloc (8)
    a = t1
    t2 = -2
    t3 = t2 * 3
    t4 = 1 + t3
    t5 = 0 * 4
    t6 = a + t5
    [t6] = t4

Optimized:

    a = malloc (8)
    [a] = -5

Optimizations performed: constant folding, constant propagation, copy instruction elimination.
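Two of the named optimizations, constant propagation and constant folding, can be sketched in a few lines of Python. The tuple encoding (dst, op, lhs, rhs) and the function name fold_constants are hypothetical, and the sketch assumes straight-line code. It does not perform copy elimination or the algebraic simplification a + 0 = a that the optimized listing also relies on.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def fold_constants(code):
    """Return (known, residual): constant-valued temporaries and the
    instructions that could not be evaluated at compile time."""
    known = {}
    residual = []
    for dst, op, lhs, rhs in code:
        lhs = known.get(lhs, lhs)            # constant propagation
        rhs = known.get(rhs, rhs)
        if op == "const":
            known[dst] = lhs
        elif isinstance(lhs, int) and isinstance(rhs, int):
            known[dst] = OPS[op](lhs, rhs)   # constant folding
        else:
            known.pop(dst, None)             # dst is no longer a known constant
            residual.append((dst, op, lhs, rhs))
    return known, residual

# The arithmetic part of the original listing.
code = [
    ("t2", "const", -2, None),
    ("t3", "*", "t2", 3),
    ("t4", "+", 1, "t3"),
    ("t5", "*", 0, 4),
    ("t6", "+", "a", "t5"),
]
known, residual = fold_constants(code)
print(known)
print(residual)
```

After the pass, t4 is known to be -5 and the only remaining instruction is t6 = a + 0; a further simplification step would reduce that to a copy of a, giving the two-instruction result shown in the optimized listing.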

Global Optimization Example

Perform dataflow analysis over the program's control-flow graph.

Blocks B1 to B4 (identical before and after):

    B1: i := m-1
        j := n
        t1 := 4*n
        v := a[t1]
    B2: i := i+1
        t2 := 4*i
        t3 := a[t2]
        if t3<v goto B2
    B3: j := j-1
        t4 := 4*j
        t5 := a[t4]
        if t5>v goto B3
    B4: if i>=j goto B6

Before:

    B5: t6 := 4*i
        x := a[t6]
        t8 := 4*j
        t9 := a[t8]
        a[t6] := t9
        a[t8] := x
        goto B2
    B6: t11 := 4*i
        x := a[t11]
        t13 := 4*n
        t14 := a[t13]
        a[t11] := t14
        a[t13] := x

After:

    B5: x := t3
        a[t2] := t5
        a[t4] := x
        goto B2
    B6: x := t3
        t14 := a[t1]
        a[t2] := t14
        a[t1] := x

Target Code for the Toy Program (SPARC)

        .file "toy.c"
    gcc2_compiled.:
        .section ".rodata"
        .align 8
    .LLC0:
        .asciz "Enter two integers: "
        .align 8
    .LLC1:
        .asciz "%d %d"
        .global .umul
        .align 8
    .LLC2:
        .asciz "%d^2 + %d^2 = %d\n"
        .section ".text"
        .align 4
        .global main
        .type main,#function
        .proc 04
    main:
        !#PROLOGUE# 0
        save %sp, -128, %sp
        !#PROLOGUE# 1
        sethi %hi(.LLC0), %o1
        or %o1, %lo(.LLC0), %o0
        call printf, 0
        nop
        add %fp, -20, %o1
        add %fp, -24, %o2
        sethi %hi(.LLC1), %o3
        or %o3, %lo(.LLC1), %o0
        call scanf, 0
        nop
        ld [%fp-20], %o0
        ld [%fp-20], %o1
        call .umul, 0
        nop
        mov %o0, %l0
        ld [%fp-24], %o0
        ld [%fp-24], %o1
        call .umul, 0
        nop
        add %l0, %o0, %o1
        st %o1, [%fp-28]
        sethi %hi(.LLC2), %o1
        or %o1, %lo(.LLC2), %o0
        ld [%fp-20], %o1
        ld [%fp-24], %o2
        ld [%fp-28], %o3
        call printf, 0
        nop
    .LL2:
        ret
        restore
    .LLfe1:
        .size main,.LLfe1-main
        .ident "GCC: (GNU) (release)"

Target Code for the Toy Program (IA32)

        .file "toy.c"
        .def ___main; .scl 2; .type 32; .endef
        .section .rdata,"dr"
    LC0:
        .ascii "Enter two integers: \0"
    LC1:
        .ascii "%d %d\0"
    LC2:
        .ascii "%d^2 + %d^2 = %d\12\0"
        .text
    .globl _main
        .def _main; .scl 2; .type 32; .endef
    _main:
        pushl %ebp
        movl %esp, %ebp
        subl $40, %esp
        andl $-16, %esp
        movl $0, %eax
        addl $15, %eax
        addl $15, %eax
        shrl $4, %eax
        sall $4, %eax
        movl %eax, -16(%ebp)
        movl -16(%ebp), %eax
        call __alloca
        call ___main
        movl $LC0, (%esp)
        call _printf
        leal -8(%ebp), %eax
        movl %eax, 8(%esp)
        leal -4(%ebp), %eax
        movl %eax, 4(%esp)
        movl $LC1, (%esp)
        call _scanf
        movl -4(%ebp), %eax
        movl %eax, %edx
        imull -4(%ebp), %edx
        movl -8(%ebp), %eax
        imull -8(%ebp), %eax
        leal (%edx,%eax), %eax
        movl %eax, -12(%ebp)
        movl -12(%ebp), %eax
        movl %eax, 12(%esp)
        movl -8(%ebp), %eax
        movl %eax, 8(%esp)
        movl -4(%ebp), %eax
        movl %eax, 4(%esp)
        movl $LC2, (%esp)
        call _printf
        leave
        ret
        .def _scanf; .scl 3; .type 32; .endef
        .def _printf; .scl 3; .type 32; .endef

Final Executable Code

[Hex dump of the executable file, beginning 7F45 4C...]

A full cycle: the executable file's content is just another binary sequence.

Back to the Top-Level...

Source Program -> Compiler -> Target Program (the compiler also emits diagnostics)

Question: How is the compiler program itself written and compiled? In particular, in what language?

Writing and Compiling the Compiler

Approach 1. Use an existing language and compiler.

    Source Program (in L) -> L Compiler -> Target Program

    L Compiler (written in C) -> GCC -> L Compiler.exe

Writing and Compiling the Compiler

Approach 2. Cross Compiling

Use an existing compiler to generate executable code for a different target machine.

    Source Program (in L) -> L Compiler x86-64.exe -> Target Program x86-64.exe

    L Compiler (written in L) -> L Compiler IA-32.exe -> L Compiler x86-64.exe

Writing and Compiling the Compiler

Approach 3. Bootstrapping

Use an existing compiler for a simpler version of the source language.

    Source Program (in L) -> L Compiler -> Target Program

    L Compiler (written in L') -> L' Compiler -> L Compiler.exe
    L' is a simpler version of L.

    L' Compiler (written in L'') -> L'' Compiler -> L' Compiler.exe
    L'' is a simpler version of L'.

Bootstrapping (cont.)

Following the chain of languages and compilers, L, L', L'', ..., the compiler for the first version of the language (i.e., a minimal core) is then written in a different language, such as an assembly language.

Compilers for many programming languages are bootstrapped: BASIC, Lisp, Algol, C, Pascal, PL/I, Scheme, Java, Python, Modula-2, Oberon, Haskell, OCaml, Go, Rust, Scala, ...

Interpreter Overview

Source Program -> Interpreter -> execution (the interpreter also emits diagnostics)

An interpreter runs a program. It reads and analyzes a source program, then performs the operations implied by the program.

Interpreter Overview

For simple languages, an interpreter reads and executes one statement at a time:

    Source Program -> Interpreter -> execution

    while (more stmts) {
      read next stmt;
      execute this stmt;
    }

For complex languages, an interpreter may read and convert the source program into an internal AST, then execute from the AST:

    Source Program -> Interpreter [Parser -> AST] -> execution
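The AST-based scheme can be sketched as a small tree-walking interpreter in Python. The tuple-shaped AST, the node kinds ("num", "var", "add", "mul", "assign") and the function names are hypothetical choices for illustration; a real interpreter would build this tree with its parser rather than by hand.

```python
def evaluate(node, env):
    """Evaluate an expression node in the variable environment `env`."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "var":
        return env[node[1]]
    if kind == "add":
        return evaluate(node[1], env) + evaluate(node[2], env)
    if kind == "mul":
        return evaluate(node[1], env) * evaluate(node[2], env)
    raise ValueError(f"unknown expression node {kind!r}")

def execute(statements, env):
    """Execute statement nodes one at a time, updating `env` in place."""
    for kind, *parts in statements:
        if kind == "assign":
            name, expr = parts
            env[name] = evaluate(expr, env)
        else:
            raise ValueError(f"unknown statement node {kind!r}")
    return env

# Hand-built AST for: a = 3; b = 4; s = a*a + b*b
program = [
    ("assign", "a", ("num", 3)),
    ("assign", "b", ("num", 4)),
    ("assign", "s", ("add",
                     ("mul", ("var", "a"), ("var", "a")),
                     ("mul", ("var", "b"), ("var", "b")))),
]
env = execute(program, {})
print(env["s"])
```

The loop in execute is the same "read next statement, execute it" cycle as in the simple case; the difference is that it walks a tree built once instead of re-reading source text.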

Common Interpreter Characteristics

Interpreters are generally easier to write and more portable than compilers, while program execution through compiled code is generally faster than through interpretation.

Interpreters put more emphasis on interactive use. Most interpreters support the read-eval-print loop (REPL).

Interpreters can be used to specify programming language semantics.

The REPL Environment

Example: A Haskell interpreter session:

    linux> ghci
    GHCi, version ...: :? for help
    Prelude> 1+2*3
    7
    Prelude> let x = 5
    Prelude> x
    5
    Prelude> let x = 3
    Prelude> x
    3
    Prelude> let y = x * x
    Prelude> y
    9
    Prelude> reverse "abcd"
    "dcba"
    Prelude> :q
    Leaving GHCi.
    linux>
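The read-eval-print cycle itself is short enough to sketch in Python. This is not how GHCi is implemented: the function repl is hypothetical, it accepts only `let name = expr` bindings, plain expressions and `:q`, and it delegates evaluation to Python's built-in eval, which should not be used on untrusted input.

```python
def repl(lines):
    """Process input lines as a REPL would and return the printed results."""
    env = {}                               # bindings made with `let`
    printed = []
    for line in lines:                     # read
        line = line.strip()
        if line == ":q":
            break
        if line.startswith("let "):
            name, expr = line[4:].split("=", 1)
            env[name.strip()] = eval(expr, {"__builtins__": {}}, env)   # eval
        else:
            value = eval(line, {"__builtins__": {}}, env)               # eval
            printed.append(repr(value))    # print
    return printed                         # loop ends at :q or end of input

session = ["1+2*3", "let x = 5", "x", "let x = 3", "x", "let y = x * x", "y", ":q"]
results = repl(session)
print(results)
```

As in the session above, rebinding x replaces the earlier value, and y is computed from the binding of x that is current when the let is evaluated.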


CS412/413. Introduction to Compilers Tim Teitelbaum. Lecture 2: Lexical Analysis 23 Jan 08 CS412/413 Introduction to Compilers Tim Teitelbaum Lecture 2: Lexical Analysis 23 Jan 08 Outline Review compiler structure What is lexical analysis? Writing a lexer Specifying tokens: regular expressions

More information

Formal Languages and Grammars. Chapter 2: Sections 2.1 and 2.2

Formal Languages and Grammars. Chapter 2: Sections 2.1 and 2.2 Formal Languages and Grammars Chapter 2: Sections 2.1 and 2.2 Formal Languages Basis for the design and implementation of programming languages Alphabet: finite set Σ of symbols String: finite sequence

More information

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILING

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILING PRINCIPLES OF COMPILER DESIGN 2 MARKS UNIT I INTRODUCTION TO COMPILING 1. Define compiler? A compiler is a program that reads a program written in one language (source language) and translates it into

More information

CS164: Programming Assignment 5 Decaf Semantic Analysis and Code Generation

CS164: Programming Assignment 5 Decaf Semantic Analysis and Code Generation CS164: Programming Assignment 5 Decaf Semantic Analysis and Code Generation Assigned: Sunday, November 14, 2004 Due: Thursday, Dec 9, 2004, at 11:59pm No solution will be accepted after Sunday, Dec 12,

More information

Compiler Construction D7011E

Compiler Construction D7011E Compiler Construction D7011E Lecture 8: Introduction to code generation Viktor Leijon Slides largely by Johan Nordlander with material generously provided by Mark P. Jones. 1 What is a Compiler? Compilers

More information

Lexical Analysis. Chapter 2

Lexical Analysis. Chapter 2 Lexical Analysis Chapter 2 1 Outline Informal sketch of lexical analysis Identifies tokens in input string Issues in lexical analysis Lookahead Ambiguities Specifying lexers Regular expressions Examples

More information

Compiling and Interpreting Programming. Overview of Compilers and Interpreters

Compiling and Interpreting Programming. Overview of Compilers and Interpreters Copyright R.A. van Engelen, FSU Department of Computer Science, 2000 Overview of Compilers and Interpreters Common compiler and interpreter configurations Virtual machines Integrated programming environments

More information

CS152 Programming Language Paradigms Prof. Tom Austin, Fall Syntax & Semantics, and Language Design Criteria

CS152 Programming Language Paradigms Prof. Tom Austin, Fall Syntax & Semantics, and Language Design Criteria CS152 Programming Language Paradigms Prof. Tom Austin, Fall 2014 Syntax & Semantics, and Language Design Criteria Lab 1 solution (in class) Formally defining a language When we define a language, we need

More information

CS606- compiler instruction Solved MCQS From Midterm Papers

CS606- compiler instruction Solved MCQS From Midterm Papers CS606- compiler instruction Solved MCQS From Midterm Papers March 06,2014 MC100401285 Moaaz.pk@gmail.com Mc100401285@gmail.com PSMD01 Final Term MCQ s and Quizzes CS606- compiler instruction If X is a

More information

CSE 401 Midterm Exam Sample Solution 11/4/11

CSE 401 Midterm Exam Sample Solution 11/4/11 Question 1. (12 points, 2 each) The front end of a compiler consists of three parts: scanner, parser, and (static) semantics. Collectively these need to analyze the input program and decide if it is correctly

More information

Scanners. Xiaokang Qiu Purdue University. August 24, ECE 468 Adapted from Kulkarni 2012

Scanners. Xiaokang Qiu Purdue University. August 24, ECE 468 Adapted from Kulkarni 2012 Scanners Xiaokang Qiu Purdue University ECE 468 Adapted from Kulkarni 2012 August 24, 2016 Scanners Sometimes called lexers Recall: scanners break input stream up into a set of tokens Identifiers, reserved

More information

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

SEMANTIC ANALYSIS TYPES AND DECLARATIONS SEMANTIC ANALYSIS CS 403: Type Checking Stefan D. Bruda Winter 2015 Parsing only verifies that the program consists of tokens arranged in a syntactically valid combination now we move to check whether

More information

Lecture Outline. Code Generation. Lecture 30. Example of a Stack Machine Program. Stack Machines

Lecture Outline. Code Generation. Lecture 30. Example of a Stack Machine Program. Stack Machines Lecture Outline Code Generation Lecture 30 (based on slides by R. Bodik) Stack machines The MIPS assembly language The x86 assembly language A simple source language Stack-machine implementation of the

More information

COMPILER DESIGN. For COMPUTER SCIENCE

COMPILER DESIGN. For COMPUTER SCIENCE COMPILER DESIGN For COMPUTER SCIENCE . COMPILER DESIGN SYLLABUS Lexical analysis, parsing, syntax-directed translation. Runtime environments. Intermediate code generation. ANALYSIS OF GATE PAPERS Exam

More information

Lexical Analysis. Lecture 2-4

Lexical Analysis. Lecture 2-4 Lexical Analysis Lecture 2-4 Notes by G. Necula, with additions by P. Hilfinger Prof. Hilfinger CS 164 Lecture 2 1 Administrivia Moving to 60 Evans on Wednesday HW1 available Pyth manual available on line.

More information

Why are there so many programming languages? Why do we have programming languages? What is a language for? What makes a language successful?

Why are there so many programming languages? Why do we have programming languages? What is a language for? What makes a language successful? Chapter 1 :: Introduction Introduction Programming Language Pragmatics Michael L. Scott Why are there so many programming languages? evolution -- we've learned better ways of doing things over time socio-economic

More information

Compiler Construction D7011E

Compiler Construction D7011E Compiler Construction D7011E Lecture 2: Lexical analysis Viktor Leijon Slides largely by Johan Nordlander with material generously provided by Mark P. Jones. 1 Basics of Lexical Analysis: 2 Some definitions:

More information

Introduction to Compiler Design

Introduction to Compiler Design Introduction to Compiler Design Lecture 1 Chapters 1 and 2 Robb T. Koether Hampden-Sydney College Wed, Jan 14, 2015 Robb T. Koether (Hampden-Sydney College) Introduction to Compiler Design Wed, Jan 14,

More information

Programming Languages (CS 550) Lecture 4 Summary Scanner and Parser Generators. Jeremy R. Johnson

Programming Languages (CS 550) Lecture 4 Summary Scanner and Parser Generators. Jeremy R. Johnson Programming Languages (CS 550) Lecture 4 Summary Scanner and Parser Generators Jeremy R. Johnson 1 Theme We have now seen how to describe syntax using regular expressions and grammars and how to create

More information

Crafting a Compiler with C (II) Compiler V. S. Interpreter

Crafting a Compiler with C (II) Compiler V. S. Interpreter Crafting a Compiler with C (II) 資科系 林偉川 Compiler V S Interpreter Compilation - Translate high-level program to machine code Lexical Analyzer, Syntax Analyzer, Intermediate code generator(semantics Analyzer),

More information

Compilation I. Hwansoo Han

Compilation I. Hwansoo Han Compilation I Hwansoo Han Language Groups Imperative von Neumann (Fortran, Pascal, Basic, C) Object-oriented (Smalltalk, Eiffel, C++) Scripting languages (Perl, Python, JavaScript, PHP) Declarative Functional

More information

Monday, August 26, 13. Scanners

Monday, August 26, 13. Scanners Scanners Scanners Sometimes called lexers Recall: scanners break input stream up into a set of tokens Identifiers, reserved words, literals, etc. What do we need to know? How do we define tokens? How can

More information

Programming. translate our algorithm into set of instructions machine can execute

Programming. translate our algorithm into set of instructions machine can execute Programming translate our algorithm into set of instructions machine can execute Programming it's hard to do the programming to get something done details are hard to get right, very complicated, finicky

More information

Parsing and Pattern Recognition

Parsing and Pattern Recognition Topics in IT 1 Parsing and Pattern Recognition Week 10 Lexical analysis College of Information Science and Engineering Ritsumeikan University 1 this week mid-term evaluation review lexical analysis its

More information

A Simple Syntax-Directed Translator

A Simple Syntax-Directed Translator Chapter 2 A Simple Syntax-Directed Translator 1-1 Introduction The analysis phase of a compiler breaks up a source program into constituent pieces and produces an internal representation for it, called

More information

Code Generation. Lecture 30

Code Generation. Lecture 30 Code Generation Lecture 30 (based on slides by R. Bodik) 11/14/06 Prof. Hilfinger CS164 Lecture 30 1 Lecture Outline Stack machines The MIPS assembly language The x86 assembly language A simple source

More information

10/4/18. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntactic Analysis

10/4/18. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntactic Analysis Lexical and Syntactic Analysis Lexical and Syntax Analysis In Text: Chapter 4 Two steps to discover the syntactic structure of a program Lexical analysis (Scanner): to read the input characters and output

More information

Data in Memory. variables have multiple attributes. variable

Data in Memory. variables have multiple attributes. variable Data in Memory variables have multiple attributes variable symbolic name data type (perhaps with qualifier) allocated in data area, stack, or heap duration (lifetime or extent) storage class scope (visibility

More information

Wednesday, September 3, 14. Scanners

Wednesday, September 3, 14. Scanners Scanners Scanners Sometimes called lexers Recall: scanners break input stream up into a set of tokens Identifiers, reserved words, literals, etc. What do we need to know? How do we define tokens? How can

More information

General Concepts. Abstraction Computational Paradigms Implementation Application Domains Influence on Success Influences on Design

General Concepts. Abstraction Computational Paradigms Implementation Application Domains Influence on Success Influences on Design General Concepts Abstraction Computational Paradigms Implementation Application Domains Influence on Success Influences on Design 1 Abstractions in Programming Languages Abstractions hide details that

More information

COMP 181 Compilers. Administrative. Last time. Prelude. Compilation strategy. Translation strategy. Lecture 2 Overview

COMP 181 Compilers. Administrative. Last time. Prelude. Compilation strategy. Translation strategy. Lecture 2 Overview COMP 181 Compilers Lecture 2 Overview September 7, 2006 Administrative Book? Hopefully: Compilers by Aho, Lam, Sethi, Ullman Mailing list Handouts? Programming assignments For next time, write a hello,

More information

A simple syntax-directed

A simple syntax-directed Syntax-directed is a grammaroriented compiling technique Programming languages: Syntax: what its programs look like? Semantic: what its programs mean? 1 A simple syntax-directed Lexical Syntax Character

More information

CSE450. Translation of Programming Languages. Lecture 11: Semantic Analysis: Types & Type Checking

CSE450. Translation of Programming Languages. Lecture 11: Semantic Analysis: Types & Type Checking CSE450 Translation of Programming Languages Lecture 11: Semantic Analysis: Types & Type Checking Structure Project 1 - of a Project 2 - Compiler Today! Project 3 - Source Language Lexical Analyzer Syntax

More information

G Programming Languages - Fall 2012

G Programming Languages - Fall 2012 G22.2110-003 Programming Languages - Fall 2012 Lecture 3 Thomas Wies New York University Review Last week Names and Bindings Lifetimes and Allocation Garbage Collection Scope Outline Control Flow Sequencing

More information

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1 CSEP 501 Compilers Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter 2008 1/8/2008 2002-08 Hal Perkins & UW CSE B-1 Agenda Basic concepts of formal grammars (review) Regular expressions

More information

Part 5 Program Analysis Principles and Techniques

Part 5 Program Analysis Principles and Techniques 1 Part 5 Program Analysis Principles and Techniques Front end 2 source code scanner tokens parser il errors Responsibilities: Recognize legal programs Report errors Produce il Preliminary storage map Shape

More information

Implementation of Lexical Analysis

Implementation of Lexical Analysis Implementation of Lexical Analysis Outline Specifying lexical structure using regular expressions Finite automata Deterministic Finite Automata (DFAs) Non-deterministic Finite Automata (NFAs) Implementation

More information

What is a Compiler? Compiler Construction SMD163. Why Translation is Needed: Know your Target: Lecture 8: Introduction to code generation

What is a Compiler? Compiler Construction SMD163. Why Translation is Needed: Know your Target: Lecture 8: Introduction to code generation Compiler Construction SMD163 Lecture 8: Introduction to code generation Viktor Leijon & Peter Jonsson with slides by Johan Nordlander Contains material generously provided by Mark P. Jones What is a Compiler?

More information

Compiler Design IIIT Kalyani, West Bengal 1. Introduction. Goutam Biswas. Lect 1

Compiler Design IIIT Kalyani, West Bengal 1. Introduction. Goutam Biswas. Lect 1 Compiler Design IIIT Kalyani, West Bengal 1 Introduction Compiler Design IIIT Kalyani, West Bengal 2 Programming a Computer High level language program Assembly language program Machine language program

More information

Software II: Principles of Programming Languages

Software II: Principles of Programming Languages Software II: Principles of Programming Languages Lecture 4 Language Translation: Lexical and Syntactic Analysis Translation A translator transforms source code (a program written in one language) into

More information

Front End. Hwansoo Han

Front End. Hwansoo Han Front nd Hwansoo Han Traditional Two-pass Compiler Source code Front nd IR Back nd Machine code rrors High level functions Recognize legal program, generate correct code (OS & linker can accept) Manage

More information

The role of semantic analysis in a compiler

The role of semantic analysis in a compiler Semantic Analysis Outline The role of semantic analysis in a compiler A laundry list of tasks Scope Static vs. Dynamic scoping Implementation: symbol tables Types Static analyses that detect type errors

More information

Compiler Design. Computer Science & Information Technology (CS) Rank under AIR 100

Compiler Design. Computer Science & Information Technology (CS) Rank under AIR 100 GATE- 2016-17 Postal Correspondence 1 Compiler Design Computer Science & Information Technology (CS) 20 Rank under AIR 100 Postal Correspondence Examination Oriented Theory, Practice Set Key concepts,

More information

UNIT -1 1.1 OVERVIEW OF LANGUAGE PROCESSING SYSTEM 1.2 Preprocessor A preprocessor produce input to compilers. They may perform the following functions. 1. Macro processing: A preprocessor may allow a

More information

CSE 3302 Programming Languages Lecture 2: Syntax

CSE 3302 Programming Languages Lecture 2: Syntax CSE 3302 Programming Languages Lecture 2: Syntax (based on slides by Chengkai Li) Leonidas Fegaras University of Texas at Arlington CSE 3302 L2 Spring 2011 1 How do we define a PL? Specifying a PL: Syntax:

More information

Writing a Lexer. CS F331 Programming Languages CSCE A331 Programming Language Concepts Lecture Slides Monday, February 6, Glenn G.

Writing a Lexer. CS F331 Programming Languages CSCE A331 Programming Language Concepts Lecture Slides Monday, February 6, Glenn G. Writing a Lexer CS F331 Programming Languages CSCE A331 Programming Language Concepts Lecture Slides Monday, February 6, 2017 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks

More information

Lecture 9 CIS 341: COMPILERS

Lecture 9 CIS 341: COMPILERS Lecture 9 CIS 341: COMPILERS Announcements HW3: LLVM lite Available on the course web pages. Due: Monday, Feb. 26th at 11:59:59pm Only one group member needs to submit Three submissions per group START

More information