Lecture 09: Data Abstraction ++

Parsing

Parsing is the process of translating a sequence of characters (a string) into an abstract syntax tree.

    program text -> Parser -> AST -> Processor

- Compilers (and some interpreters) analyse abstract syntax trees.
- Compilers are very well understood.
- In fact, there exist compiler compilers.
- Example: YACC (yet another compiler compiler) http://dinosaur.compilertools.net/

Copyright Bill Havens

Expression Syntax Extended

An abstract syntax expresses the structure of the concrete syntax without the details.

Example: extended syntax for Scheme expressions
- Includes literals (constant numbers)

    <exp> ::= <number>
            | <symbol>
            | (lambda (<symbol>) <exp>)
            | (<exp> <exp>)

We would define an abstract datatype for this grammar as follows:

    (define-datatype expression expression?
      (lit-exp (datum number?))
      (var-exp (id symbol?))
      (lambda-exp (id symbol?) (body expression?))
      (app-exp (rator expression?) (rand expression?)))

Now let's parse sentences in the language specified by the BNF grammar into abstract syntax trees.

Parser for Extended Expressions

Here is an example parser for the lambda expression language above:

    (define parse-expression
      (lambda (datum)
        (cond
          ((number? datum) (lit-exp datum))
          ((symbol? datum) (var-exp datum))
          ((pair? datum)
           (if (eqv? (car datum) 'lambda)
               (lambda-exp (caadr datum)
                           (parse-expression (caddr datum)))
               (app-exp (parse-expression (car datum))
                        (parse-expression (cadr datum)))))
          (else (eopl:error 'parse-expression
                            "Invalid concrete syntax ~s" datum)))))

Fortunately, Scheme's (read) function does most of the hard work:
- Identifies tokens in the input stream (e.g. "(", ")", identifiers, strings)
- Converts parenthesized structures into dotted pairs and proper lists

Example

All we have to do is convert the lists that read produces into abstract syntax trees:

    (parse-expression '(foo x))
    (parse-expression 3)
    (parse-expression '(lambda (x) (lambda (y) (cons x y))))

Let's see how these work in DrScheme... The printed trees are hard to read!

Let's implement an un-parse procedure to take the abstract syntax tree apart.
- Turns the tree back into concrete Scheme syntax

    (define unparse-expression
      (lambda (exp)
        (cases expression exp
          (lit-exp (datum) datum)
          (var-exp (id) id)
          (lambda-exp (id body)
            (list 'lambda (list id) (unparse-expression body)))
          (app-exp (rator rand)
            (list (unparse-expression rator)
                  (unparse-expression rand))))))

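A minimal sketch of the parse/un-parse round trip, written so that it runs without the EOPL library. It assumes tagged lists in place of the define-datatype variants, uses the hypothetical names parse and unparse in place of parse-expression and unparse-expression, and assumes the error procedure found in R7RS and most Scheme systems in place of eopl:error. The input (f x) is used because the grammar allows exactly one operand per application.

```scheme
;; Tagged lists stand in for the define-datatype variants (assumption).
(define parse
  (lambda (datum)
    (cond
      ((number? datum) (list 'lit-exp datum))
      ((symbol? datum) (list 'var-exp datum))
      ((pair? datum)
       (if (eqv? (car datum) 'lambda)
           (list 'lambda-exp (caadr datum) (parse (caddr datum)))
           (list 'app-exp (parse (car datum)) (parse (cadr datum)))))
      (else (error "Invalid concrete syntax" datum)))))

;; Dispatch on the tag, mirroring the cases form on the slide.
(define unparse
  (lambda (exp)
    (case (car exp)
      ((lit-exp) (cadr exp))
      ((var-exp) (cadr exp))
      ((lambda-exp)
       (list 'lambda (list (cadr exp)) (unparse (caddr exp))))
      ((app-exp)
       (list (unparse (cadr exp)) (unparse (caddr exp))))
      (else (error "Not an expression" exp)))))

(define source '(lambda (x) (f x)))
(define tree (parse source))
;; tree is (lambda-exp x (app-exp (var-exp f) (var-exp x)))
(define round-trip (unparse tree))
;; round-trip is (lambda (x) (f x)), equal to source
```

Un-parsing a parsed expression gives back the original list, which is a quick sanity check for any parser of this shape.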
Environments
Introduction

Consider evaluating the following expression:

    (+ x 3)

How does the interpreter know the value of variable x?
- Variable bindings are stored in a dictionary (aka symbol table) called an environment.

Definition: An environment is a function that maps variable names to their current bindings.
- Interpreter: mapping variables to their current values
- Compiler: mapping variables to their lexical addresses

We denote an environment by a finite set of bindings, each having the form s -> v.

    Example: env = { x -> 3, y -> (a b c), z -> hello }

The environment function can be applied to an argument symbol, returning its binding in that environment.

    Example: env(y) = (a b c)
    But env(w) = error - undefined variable

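One way to make the "environment as a finite function" idea concrete is an association list. This is only a sketch: sample-env and lookup are hypothetical names, z is assumed to be bound to the symbol hello (the slide does not say whether it is a symbol or a string), and error is assumed to be the procedure available in R7RS and most Scheme systems.

```scheme
;; An association list plays the role of the finite set of bindings
;; { x -> 3, y -> (a b c), z -> hello }.
(define sample-env
  (list (cons 'x 3)
        (cons 'y '(a b c))
        (cons 'z 'hello)))

;; lookup applies the environment to a symbol: it returns the binding,
;; or signals an error for an undefined variable.
(define lookup
  (lambda (env sym)
    (let ((binding (assq sym env)))
      (if binding
          (cdr binding)
          (error "undefined variable" sym)))))

(define y-value (lookup sample-env 'y))   ; (a b c)
(define x-value (lookup sample-env 'x))   ; 3
;; (lookup sample-env 'w) would signal "undefined variable"
```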
Nested Environments

Block structured languages (e.g. C, Java, Scheme) allow nested blocks.

Example in C:

    int foo (int z) {
      int x; float y;
      if (x < z) { float y = 3.0; ... }
      else { print x; ... };
      for (int x = 0; x < z; x++) { print x+y; ... }
    }

- How many different environments are there in this example?
- What are the variable bindings in each environment?
- Are there any holes in any environment?

Since blocks can be nested, we need to nest environments as well.
- Each scoping block needs to have its own environment.
- The environment of a nested block must refer to the environments of the enclosing blocks recursively.

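The same nesting can be sketched in Scheme, where each let opens a new block. The procedure scope-demo is hypothetical and is not a translation of the C example; it only shows an inner binding of y leaving a hole in the scope of the outer y while x and z remain visible.

```scheme
;; Each let introduces a nested block whose environment extends the
;; environment of the enclosing block.
(define scope-demo
  (lambda (z)
    (let ((x 1) (y 2))            ; outer block binds x and y
      (list
        (let ((y 30))             ; inner block shadows y; x and z come from outside
          (+ x y z))
        (+ x y z)))))             ; back in the outer block, y is 2 again

(define result (scope-demo 100))  ; (131 103)
```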
Introduction
Recursive Environments

We can think of an environment as extending its enclosing environment.
- If you look something up in the environment and don't find it, look in the enclosing environment.
- Recursion has to stop, so we need the concept of an empty environment.

We can think of an environment, then, as an inductively defined type.

The BNF for <environment>:

    <environment> ::= ( )
                    | ( {<variable>, <value>}* <environment> )

An environment is either empty, or it is a set of (variable, value) pairs that extends an existing enclosing environment.

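The inductive definition translates directly into a recursive lookup. The sketch below assumes one particular encoding: an environment is either '() or a pair whose car is a frame (a list of (variable . value) pairs) and whose cdr is the enclosing environment. The names lookup-nested, outer and inner are hypothetical, and error is assumed to be the procedure available in R7RS and most Scheme systems.

```scheme
;; One case per alternative of the BNF: empty, or a frame plus an
;; enclosing environment.
(define lookup-nested
  (lambda (env var)
    (if (null? env)
        (error "No binding for" var)          ; recursion stops at the empty environment
        (let ((binding (assq var (car env))))
          (if binding
              (cdr binding)                   ; found in this block
              (lookup-nested (cdr env) var))))))  ; otherwise try the enclosing block

(define outer (cons (list (cons 'x 1) (cons 'y 2)) '()))
(define inner (cons (list (cons 'y 30)) outer))

(define inner-y (lookup-nested inner 'y))   ; 30, the inner binding shadows the outer one
(define inner-x (lookup-nested inner 'x))   ; 1, found in the enclosing environment
(define outer-y (lookup-nested outer 'y))   ; 2
```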
Abstract Environments
Basic Idea

An abstract environment requires:
1. A function for creating an empty environment
2. An operator for extending an environment with new bindings
3. An operator for accessing the binding of a variable in an environment

Operator interfaces:

    ;; returns an empty environment
    (define empty-env
      (lambda () ... ))

    ;; returns an environment that extends env
    (define extend-env
      (lambda (vars vals env) ... ))

    ;; returns the value of the variable "var" in "env"
    (define apply-env
      (lambda (env var) ... ))

Usage

    (define first-env (empty-env))
    (define second-env (extend-env '(a b) '(1 2) first-env))
    (define third-env (extend-env '(c d b) '(3 4 5) second-env))

    (apply-env first-env 'a)    ; returns an error
    (apply-env second-env 'a)   ; returns 1
    (apply-env third-env 'a)    ; returns 1 (same "a")
    (apply-env second-env 'b)   ; returns 2
    (apply-env third-env 'b)    ; returns 5 (different "b")

Implementing Environments

Procedural versus Datatype implementations
- Basis for the Object-Oriented Programming Language (OOPL) concept
- Data is only accessible via a procedural interface (e.g. methods in Java)
- Actual implementation of data is hidden by the interface
- Implementation can be changed without breaking code

Procedural Implementation
- An environment is a function: f(variable) = value.
- The empty environment is a function that, when called, always returns an error!
- The method (empty-env) returns an environment with no bindings:

    (define empty-env
      (lambda ()
        (lambda (sym)
          (eopl:error 'apply-env "No binding for ~s" sym))))

Extending an Environment

An extended environment is also a function which returns a value for a specified variable.
- If the variable you're looking for is one defined in that environment, it returns the corresponding value; otherwise, it calls the environment function which it extends.

Here is an implementation:

    (define extend-env
      (lambda (syms vals env)
        (lambda (sym)
          (let ((pos (list-find-position sym syms)))
            (if (number? pos)
                (list-ref vals pos)
                (apply-env env sym))))))

- Note that the bindings are represented as corresponding lists of variables and values.
- Method extend-env returns a function that, when called, searches these lists.
- If the desired variable is not found in this environment, then apply-env is called on the enclosing environment, which repeats the search there.

Helper functions

Method (list-find-position sym syms) searches the list of variable names (syms) to find the desired variable (sym).
- Returns a zero-based index on success and #f on failure (Scheme convention)

    (define list-find-position
      (lambda (sym los)
        (list-index (lambda (sym1) (eqv? sym1 sym)) los)))

Method (list-index pred ls) is a higher-order function which applies a specified predicate pred to a list ls.

    (define list-index
      (lambda (pred ls)
        (cond
          ((null? ls) #f)
          ((pred (car ls)) 0)
          (else (let ((list-index-r (list-index pred (cdr ls))))
                  (if (number? list-index-r)
                      (+ list-index-r 1)
                      #f))))))

Comment: what is peculiar about this method implementation?

Recoding function list-index

- Very inefficient implementation when an error is detected: the failure value #f has to be checked and passed back at every level of the recursion
- Need an exception thrown instead
- Why not use the call/cc mechanism?

Reminder: (call/cc <lambda>) calls <lambda>, a function of one argument; that argument is the continuation of the call/cc expression.
- Applying the continuation function immediately exits from the call/cc.

Recoding using call/cc:

    (define list-index
      (lambda (pred ls)
        (call/cc
          (lambda (exit)
            (list-index1 pred ls exit)))))

    (define list-index1
      (lambda (pred ls exit)
        (cond
          ((null? ls) (exit #f))
          ((pred (car ls)) 0)
          (else (+ 1 (list-index1 pred (cdr ls) exit))))))

Much simpler! But can you improve it further? Use tail recursion!

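One possible tail-recursive version, offered as a sketch rather than as the intended solution: the index is carried in an accumulator, so no additions are pending when the search fails and no escape continuation is needed. The loop variable names and the example lists are illustrative.

```scheme
;; Tail-recursive search: the named let carries the current index.
(define list-index
  (lambda (pred ls)
    (let loop ((rest ls) (index 0))
      (cond
        ((null? rest) #f)                        ; not found
        ((pred (car rest)) index)                ; found at this position
        (else (loop (cdr rest) (+ index 1)))))))

(define found (list-index (lambda (s) (eqv? s 'c)) '(a b c d)))    ; 2
(define missing (list-index (lambda (s) (eqv? s 'z)) '(a b c d)))  ; #f
```

Because the recursive call is in tail position, the search runs in constant stack space.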
Applying a procedural environment

Just call the function on the variable to be accessed:

    (define apply-env
      (lambda (env sym)
        (env sym)))

Example:

    (apply-env
      (extend-env '(x z) '(1 3)
        (extend-env '(y z) '(2 2)
          (extend-env '(x y) '(4 7)
            (empty-env))))
      'y)

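To trace the example by hand, the procedural representation is restated below so that it runs on its own. Two things are assumptions made for self-containment: error (as in R7RS and most Scheme systems) replaces eopl:error, and position is a hypothetical helper standing in for list-find-position.

```scheme
;; position: zero-based index of sym in syms, or #f if absent.
(define position
  (lambda (sym syms)
    (let loop ((rest syms) (index 0))
      (cond
        ((null? rest) #f)
        ((eqv? (car rest) sym) index)
        (else (loop (cdr rest) (+ index 1)))))))

(define empty-env
  (lambda ()
    (lambda (sym) (error "No binding for" sym))))

(define extend-env
  (lambda (syms vals env)
    (lambda (sym)
      (let ((pos (position sym syms)))
        (if (number? pos)
            (list-ref vals pos)
            (apply-env env sym))))))

(define apply-env
  (lambda (env sym) (env sym)))

(define env
  (extend-env '(x z) '(1 3)
    (extend-env '(y z) '(2 2)
      (extend-env '(x y) '(4 7)
        (empty-env)))))

(define y-value (apply-env env 'y))   ; 2: not in (x z), found in (y z)
(define x-value (apply-env env 'x))   ; 1: the newest x shadows the x bound to 4
(define z-value (apply-env env 'z))   ; 3: the newest z shadows the z bound to 2
```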
Overview
Datatype Implementation

Abstract datatype representation
- Declarative view of data (cf. procedural view) using the define-datatype construct:

    (define-datatype environment environment?
      (empty-env-record)
      (extended-env-record
        (syms (list-of symbol?))
        (vals (list-of scheme-value?))
        (env environment?)))

Defining a trivial predicate for any Scheme value:

    (define scheme-value? (lambda (v) #t))

Empty Environment

Creating an empty environment is pretty simple: just use the empty-env-record constructor.

    (define empty-env
      (lambda ()
        (empty-env-record)))

Extending an Environment

Extending an environment is also straightforward.
- Use the extended-env-record constructor:

    (define extend-env
      (lambda (syms vals env)
        (extended-env-record syms vals env)))

Applying the Environment

Applying the environment is a little more complicated.
- Need to traverse the data structures instead of just calling a function.

    (define apply-env
      (lambda (env sym)
        (cases environment env
          (empty-env-record ()
            (eopl:error 'apply-env "No binding for ~s" sym))
          (extended-env-record (syms vals env)
            (let ((pos (list-find-position sym syms)))
              (if (number? pos)
                  (list-ref vals pos)
                  (apply-env env sym)))))))

Chapter 3 will use the abstract notion of an environment.
- Implementation is irrelevant!

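To see that client code does not depend on the representation, here is a sketch of the same three-procedure interface over plain data. It assumes tagged lists in place of the records that define-datatype would generate, so it runs without the EOPL library, and it assumes error (as in R7RS and most Scheme systems) in place of eopl:error. The client lines at the bottom use only empty-env, extend-env and apply-env.

```scheme
;; Tagged lists stand in for empty-env-record and extended-env-record.
(define empty-env
  (lambda () (list 'empty-env-record)))

(define extend-env
  (lambda (syms vals env)
    (list 'extended-env-record syms vals env)))

(define apply-env
  (lambda (env sym)
    (if (eq? (car env) 'empty-env-record)
        (error "No binding for" sym)
        (let ((syms (cadr env))
              (vals (caddr env))
              (enclosing (cadddr env)))
          (let loop ((ss syms) (vs vals))
            (cond
              ((null? ss) (apply-env enclosing sym))   ; not here: try the enclosing record
              ((eqv? (car ss) sym) (car vs))           ; found: return the matching value
              (else (loop (cdr ss) (cdr vs)))))))))

;; Client code uses only the three interface procedures.
(define first-env (empty-env))
(define second-env (extend-env '(a b) '(1 2) first-env))
(define third-env (extend-env '(c d b) '(3 4 5) second-env))

(define a-in-third (apply-env third-env 'a))    ; 1
(define b-in-second (apply-env second-env 'b))  ; 2
(define b-in-third (apply-env third-env 'b))    ; 5
```

The client lines give the same answers under the procedural representation, which is the point of hiding the implementation behind the interface.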