Algorithms for AI and NLP (INF4820 Lisp & FSAs)


[Title slide figure: syntax tree and semantic representation for "The dog barked".]
{baa!, baaa!, baaaa!, ...}
Stephan Oepen and Jonathon Read
Universitetet i Oslo
{oe|jread}@ifi.uio.no

Vectors and Arrays
Multidimensional grids of data can be represented as vectors or arrays;
(make-array '(rank_1 ... rank_n)) creates an array with n dimensions;
  ? (setf *foo* (make-array '(2 5) :initial-element 0))
  → #2A((0 0 0 0 0) (0 0 0 0 0))
  ? (setf (aref *foo* 1 2) 42)
  → 42
       0   1   2   3   4
  0    0   0   0   0   0
  1    0   0  42   0   0
all dimensions count from zero; aref() accesses one individual cell;
one-dimensional arrays are called vectors (abstractly similar to lists).
More Common Lisp & Finite-State Automata (2)
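The array operations above can be tried out with a short sketch like the following. The variable names *grid* and *vec* are illustrative only and do not come from the slides.

```lisp
;; A 2 x 5 grid with every cell initially 0 (*grid* is a hypothetical name).
(defparameter *grid* (make-array '(2 5) :initial-element 0))

;; Store 42 in row 1, column 2; indices count from zero.
(setf (aref *grid* 1 2) 42)

;; array-dimensions returns the size along each dimension.
(array-dimensions *grid*)   ; => (2 5)
(aref *grid* 1 2)           ; => 42

;; A vector is simply a one-dimensional array.
(defparameter *vec* (make-array 3 :initial-element 'x))
(length *vec*)              ; => 3
```

Note that a quoted list of dimensions is needed: without the quote, (2 5) would be evaluated as a function call.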

Some Objects are More Equal Than Others
Plethora of equality tests: eq(), eql(), equal(), equalp(), and more;
eq() tests object identity; it is not useful for numbers or characters;
eql() is much like eq(), but well-defined on numbers and characters;
equal() tests structural equivalence (recursively for lists and strings);
  ? (eq (cons 1 (cons 2 (cons 3 nil))) '(1 2 3))
  → nil
  ? (equal (cons 1 (cons 2 (cons 3 nil))) '(1 2 3))
  → t
  ? (eq 42 42)
  → dependent on implementation
  ? (eql 42 42)
  → t
  ? (eql 42 42.0)
  → nil
  ? (equalp 42 42.0)
  → t
  ? (equal "foo" "foo")
  → t
  ? (equalp "FOO" "foo")
  → t

Functions as First-Class Citizens
Built-in functions (member(), position(), et al.) use eql() by default;
but almost all take an optional :test argument to provide a predicate:
  ? (position "foo" '("foo" "bar"))
  → nil
  ? (position "foo" '("foo" "bar") :test #'equal)
  → 0
#'foo is a short-hand for (function foo), yielding a function object;
function objects are first-class citizens, i.e. can be treated just like data;
funcall() invokes a function object on the arguments that follow it:
  ? (funcall #'+ 1 2 3)
  → 6
many built-in functions also take a :key argument, an accessor function:
  ? (sort (list '(47 :bar) '(11 :foo)) #'< :key #'first)
  → ((11 :FOO) (47 :BAR))
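A few more ways of passing function objects around, as a minimal sketch using only standard Common Lisp functions. The sample data is made up for illustration.

```lisp
;; remove-if-not keeps the elements for which the predicate returns true.
(remove-if-not #'evenp '(1 2 3 4 5 6))      ; => (2 4 6)

;; mapcar applies a function object to every element of a list.
(mapcar #'first '((47 :bar) (11 :foo)))     ; => (47 11)

;; An anonymous function (lambda) is a function object like any other.
(funcall (lambda (x) (* x x)) 7)            ; => 49

;; :test and :key can be combined: find the entry whose first element is "bar".
(find "bar" '(("foo" 1) ("bar" 2)) :test #'equal :key #'first)
;; => ("bar" 2)
```

In the last call, :key extracts the part of each element to compare, and :test decides how that part is compared with the item sought.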

Hash Tables: Yet Another Container Data Structure
Lists are inefficient for indexing data, arrays restricted to numeric keys;
hash table is a (one-dimensional) associative array; free-choice keys;
any of the four (built-in) equality tests can be used for key comparison;
  ? (defparameter *map* (make-hash-table :test #'equal))
  → *map*
  ? (gethash "foo" *map*)
  → nil
  ? (setf (gethash "foo" *map*) 4711)
  → 4711
  ? (gethash "foo" *map*)
  → 4711
the more restricted the test, the more efficient will be hash table access;
to iterate over hash table entries, use specialized loop() directives:
  ? (loop
        for key being each hash-key in *map*
        collect (gethash key *map*))
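A typical use of a hash table with string keys is counting tokens. The sketch below assumes a hypothetical helper called count-tokens; it relies on the optional third argument of gethash, which supplies a default value for missing keys.

```lisp
;; Count how often each token occurs (count-tokens is a hypothetical name).
(defun count-tokens (tokens)
  (let ((counts (make-hash-table :test #'equal)))
    (dolist (token tokens counts)
      ;; Missing keys start from the default 0 before being incremented.
      (incf (gethash token counts 0)))))

(defparameter *counts* (count-tokens '("the" "dog" "the" "cat")))

(gethash "the" *counts*)      ; => 2, T
(gethash "bird" *counts*)     ; => NIL, NIL
(hash-table-count *counts*)   ; => 3
```

gethash returns two values: the stored value and a flag saying whether the key was present, which distinguishes a missing key from a key whose value is nil.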

Some Remarks on Choosing Among Container Types

Input and Output Side Effects
Input and output, to files or the terminal, is mediated through streams;
the symbol t can be used to refer to the default stream, the terminal:
  ? (format t "line: ~a; token ~a.~%" 42 "foo")
  line: 42; token foo.
  → nil
(read stream nil) reads one well-formed s-expression from stream;
(read-line stream nil) reads one line of text, returning it as a string;
the second argument to reader functions asks to return nil on end-of-file.
  (with-open-file (stream "sample.txt" :direction :input)
    (loop
        for line = (read-line stream nil)
        while line do (format t "~a~%" line)))
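The same reading loop can be tested without a file by reading from a string stream. In this sketch, read-all-lines is a hypothetical helper name; with-input-from-string and format with a nil destination are standard.

```lisp
;; Collect all lines from a stream into a list (read-all-lines is hypothetical).
(defun read-all-lines (stream)
  (loop for line = (read-line stream nil)   ; nil: return nil at end-of-file
        while line
        collect line))

;; with-input-from-string creates a stream over a string, so no file is needed.
(with-input-from-string (stream (format nil "first line~%second line"))
  (read-all-lines stream))
;; => ("first line" "second line")

;; With nil as destination, format returns the string instead of printing it.
(format nil "line: ~a; token ~a." 42 "foo")
;; => "line: 42; token foo."
```

Because read-all-lines only sees a stream, it works unchanged on a stream opened with with-open-file.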

Local Variables
Sometimes intermediate results need to be accessed more than once;
let() and let*() create temporary value bindings for symbols, e.g.:
  ? (defparameter *foo* 42)
  → *FOO*
  ? (let ((bar (+ *foo* 1))) bar)
  → 43
  ? bar
  → error
  (let ((variable_1 sexp_1) ... (variable_n sexp_n)) sexp ... sexp)
bindings valid only in the body of let() (other bindings are shadowed);
let*() binds sequentially, i.e. variable_i is accessible when computing the value of variable_i+1.
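The difference between let() and let*() shows up when one initial value refers to another variable. A minimal sketch with made-up variable names:

```lisp
;; let* binds sequentially: the value form for y may refer to x.
(let* ((x 2)
       (y (* x 10)))
  (+ x y))            ; => 22

;; let evaluates all value forms before binding anything, so the
;; inner (* x 10) still sees the OUTER x, which is 1.
(let ((x 1))
  (let ((x 2)
        (y (* x 10)))
    (+ x y)))         ; => 12
```

In the second form the inner x shadows the outer one only inside the body of the inner let().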

Abstract Data Types
defstruct() creates a new abstract data type, encapsulating a structure:
  ? (defstruct cd artist title)
  → CD
defstruct() defines a new constructor, accessors, and a type predicate:
  ? (setf *foo* (make-cd :artist "Dixie Chicks" :title "Home"))
  → #S(CD :ARTIST "Dixie Chicks" :TITLE "Home")
  ? (listp *foo*)
  → nil
  ? (cd-p *foo*)
  → t
  ? (setf (cd-title *foo*) "Fly")
  → "Fly"
  ? *foo*
  → #S(CD :ARTIST "Dixie Chicks" :TITLE "Fly")
abstract data types encapsulate a group of related data (i.e. an "object").

Background: A Bit of Formal Language Theory
Languages as Sets of Utterances
What is a language? And how can one characterize it (precisely)?
simplifying assumption: language as a set of strings ("utterances");
well-formed utterances are set members, ill-formed ones are not;
- provides no account of utterance-internal structure, e.g. "subject";
+ mathematically very simple, hence computationally straightforward.
Regular Expressions
Even simple languages (e.g. arithmetic expressions) can be infinite;
to obtain a finite description of an infinite set → regular expressions.


Brushing Up our Knowledge of Regular Expressions
  /[wW]oodchucks?/
      woodchuck  Woodchuck  woodgrubs  woodchucks  wood
  /baa+!/
      ba!  baa!  baah!  baaaa!  baaaaaaaaa!
  (further candidate strings)
      aa  aaa  aaaa  aaaaaa  aaaaaaaa  aaaaaaaaa  ...


Pattern Matching: Finite-State Automata
  /baa+!/
      ba!  baa!  baah!  baaaa!  baaaaaaaaa!
Recognizing Regular Languages
Finite-State Automata (FSAs) are very restricted Turing machines;
states and transitions: read one symbol at a time from input tape;
accept utterance when no more input, in a final state; else reject.


Tracing the Recognition of a Simple Input
  /baa+!/
      ba!  baa!  baah!  baaaa!  baaaaaaaaa!
[Figure: the automaton for /baa+!/ stepping through an input tape; recoverable labels: 0 1 2 3 4; tape contents: b a a a !]
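The trace can be reproduced with a small deterministic recognizer. The sketch below is not taken from the slides: the state numbering (0 to 4, with 4 as the only final state) and the names *sheep-transitions*, *sheep-final-states* and d-recognize are assumptions made for illustration.

```lisp
;; Transition table for /baa+!/ as an association list mapping
;; (state . character) to the next state. The numbering is an assumption.
(defparameter *sheep-transitions*
  '(((0 . #\b) . 1)
    ((1 . #\a) . 2)
    ((2 . #\a) . 3)
    ((3 . #\a) . 3)     ; loop: any number of further a's
    ((3 . #\!) . 4)))

(defparameter *sheep-final-states* '(4))

(defun d-recognize (tape)
  "Return T if the string TAPE is accepted by the automaton, else NIL."
  (let ((state 0))
    (loop for ch across tape
          do (let ((next (cdr (assoc (cons state ch) *sheep-transitions*
                                     :test #'equal))))
               ;; No transition for this symbol: reject immediately.
               (unless next (return-from d-recognize nil))
               (setf state next)))
    ;; Input exhausted: accept only when in a final state.
    (if (member state *sheep-final-states*) t nil)))

(d-recognize "baaa!")   ; => T
(d-recognize "ba!")     ; => NIL
(d-recognize "baah!")   ; => NIL
```

For the tape b a a a ! the recognizer passes through states 0, 1, 2, 3, 3 and ends in 4, so the input is accepted.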

A Rather More Complex Example
  /(aa)+|(aaa)+/
      aa  aaa  aaaa  aaaaaa  aaaaaaaa  aaaaaaaaa  ...
Non-Deterministic FSAs (NFSAs): multiple transitions per symbol;
a search space of possible solutions: decisions no longer obvious.


Quite Abstractly: Three Approaches to Search
(Heuristic) Look-Ahead
Peek at input tape one or more positions beyond the current symbol;
try to work out (or "guess") which branch to take for current symbol.
Parallel Computation
Assume unlimited computational resources, i.e. any number of CPUs;
copy FSA, remaining input, and current state → multiple branches.
Backtracking (Or Back-Up)
Keep track of possibilities (choice points) and remaining candidates;
leave a "bread crumb", go down one branch; eventually come back.
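On a single processor, the parallel approach can be approximated by tracking the set of all states the automaton could be in after each symbol. The following is only a sketch under stated assumptions: the automaton encodes /(aa)+|(aaa)+/ with a state numbering chosen for this example, and the names *par-transitions*, *par-final-states*, step-states and parallel-recognize are hypothetical.

```lisp
;; NFSA for /(aa)+|(aaa)+/ as (source symbol target) triples.
;; The state numbering is an assumption made for this sketch.
(defparameter *par-transitions*
  '((0 #\a 1) (1 #\a 2) (2 #\a 1)                ; the (aa)+ branch
    (0 #\a 3) (3 #\a 4) (4 #\a 5) (5 #\a 3)))    ; the (aaa)+ branch

(defparameter *par-final-states* '(2 5))

(defun step-states (states ch)
  "All states reachable from any member of STATES by reading CH."
  (remove-duplicates
   (loop for (source label target) in *par-transitions*
         when (and (member source states) (eql label ch))
           collect target)))

(defun parallel-recognize (tape)
  "Return T if some branch of the automaton accepts the string TAPE."
  (let ((states (list 0)))
    ;; Advance all branches in lockstep, one input symbol at a time.
    (loop for ch across tape
          do (setf states (step-states states ch)))
    (if (intersection states *par-final-states*) t nil)))

(parallel-recognize "aaaa")    ; => T
(parallel-recognize "aaaaa")   ; => NIL
```

No choice points need to be remembered here, at the cost of carrying a whole set of states along the tape.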

NFSA Recognition (From Jurafsky & Martin, 2008)
  procedure nd-recognize(tape, fsa)
    agenda ← {⟨0, 0⟩};
    do
      current ← pop(agenda);
      state ← first(current);
      index ← second(current);
      if (index = length(tape) and state is final state) then
        return accept;
      fi
      for (next in fsa.transitions[state, tape[index]]) do
        agenda ← agenda ∪ {⟨next, index + 1⟩}
      od
      if agenda is empty then return reject; fi
    od
  end
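The pseudocode translates fairly directly into Common Lisp. The sketch below is one possible rendering, not the course's reference implementation: the automaton is the one for /(aa)+|(aaa)+/ with an assumed state numbering, the names *nfsa-transitions*, *nfsa-final-states* and next-states are hypothetical, and the agenda is a plain list used as a stack (so the search is depth-first). Unlike the pseudocode, it tests for an empty agenda at the top of the loop and never reads past the end of the tape.

```lisp
;; Transitions as (source symbol target) triples; numbering is an assumption.
(defparameter *nfsa-transitions*
  '((0 #\a 1) (1 #\a 2) (2 #\a 1)                ; the (aa)+ branch
    (0 #\a 3) (3 #\a 4) (4 #\a 5) (5 #\a 3)))    ; the (aaa)+ branch

(defparameter *nfsa-final-states* '(2 5))

(defun next-states (state ch)
  "All states reachable from STATE by reading the character CH."
  (loop for (source label target) in *nfsa-transitions*
        when (and (eql source state) (eql label ch))
          collect target))

(defun nd-recognize (tape)
  "Return T if the string TAPE is accepted, searching with an agenda."
  (let ((agenda (list (cons 0 0))))       ; entries are (state . index)
    (loop while agenda
          do (let* ((current (pop agenda))
                    (state (car current))
                    (index (cdr current)))
               (if (= index (length tape))
                   ;; Input exhausted: accept if this path is in a final state.
                   (when (member state *nfsa-final-states*)
                     (return-from nd-recognize t))
                   ;; Otherwise add one agenda entry per possible transition.
                   (dolist (next (next-states state (char tape index)))
                     (push (cons next (1+ index)) agenda)))))
    ;; Agenda empty: every path has failed.
    nil))

(nd-recognize "aaaa")    ; => T
(nd-recognize "aaaaa")   ; => NIL
```

Replacing push with code that appends new entries at the end of the agenda would turn the same procedure into a breadth-first search.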