Introduction to AI Spring 2006 Dan Klein Midterm Solutions

1. (20 pts.) True/False

Each problem is worth 2 points. Incorrect answers are worth 0 points. Skipped questions are worth 1 point.

(a) True/False: The action taken by a rational agent will always be a deterministic function of the agent's current percepts.
False. First, non-deterministic behavior is useful in many settings, such as multi-agent environments. Second, the agent must often consider its percept history and other sources of information.

(b) True/False: An optimal solution path for a search problem with positive costs will never have repeated states.
True. For any solution path with repeated states, removing the cycle gives a solution path with lower cost.

(c) True/False: Depth-first graph search is always complete on problems with finite search graphs.
True. It will reach all reachable states in finite time, because graph search does not repeat states.

(d) True/False: If two search heuristics h1(s) and h2(s) have the same average value, the heuristic h3(s) = max(h1(s), h2(s)) could give better A* efficiency than h1 or h2.
True. h1 could have accurate values for early states and h2 accurate values for later states; h3 could then have more accurate values than either component heuristic.

(e) True/False: For robots with only rotational joints, polygonal obstacles will have polygonal images in configuration space.
False. See the example in homework 1.

(f) True/False: There exist constraint satisfaction problems which can be expressed using trinary (3-variable) constraints, but not binary constraints.
False. A proof that all CSPs have a binary formulation was also given in homework 1.

(g) True/False: For all random variables W, X, Y and Z,

    P(Y, Z | W, X) P(W, X) / P(Y, Z) = P(W, X | Y, Z)

True. This is Bayes' rule if you replace (Y, Z) with A and (W, X) with B.

(h) True/False: For any set of attributes, and any training set generated by a deterministic function f of those attributes, there is a decision tree which is consistent with that training set.
True. Decision trees can represent any deterministic function.

(i) True/False: There is no hypothesis space which can be PAC-learned with one example.
False. A hypothesis space containing only one or two values (e.g., {T, F}) can be PAC-learned from one example.

(j) True/False: Smoothing is needed in a Naive Bayes classifier to compute the maximum likelihood estimates of P(feature | class).
False. Smoothing converts maximum likelihood estimates into smoothed estimates.
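
As a quick sanity check of (g): the identity is exactly Bayes' rule with A = (Y, Z) and B = (W, X). The short Python sketch below (not part of the original exam; all names are illustrative) verifies the identity numerically on a random joint distribution over four binary variables.

import itertools
import random

# Build a random joint distribution P(W, X, Y, Z) over four binary variables.
atoms = list(itertools.product([0, 1], repeat=4))
weights = [random.random() for _ in atoms]
joint = {a: w / sum(weights) for a, w in zip(atoms, weights)}

def marginal(keep):
    """Probability of an assignment to a subset of the variables.
    keep maps variable positions (0=W, 1=X, 2=Y, 3=Z) to required values."""
    return sum(p for a, p in joint.items() if all(a[i] == v for i, v in keep.items()))

for w, x, y, z in atoms:
    p_wxyz = joint[(w, x, y, z)]
    p_wx = marginal({0: w, 1: x})
    p_yz = marginal({2: y, 3: z})
    lhs = (p_wxyz / p_wx) * p_wx / p_yz   # P(Y,Z | W,X) P(W,X) / P(Y,Z)
    rhs = p_wxyz / p_yz                   # P(W,X | Y,Z)
    assert abs(lhs - rhs) < 1e-12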

2. (24 pts.) Search

Consider the following search problem. (The search-graph figure, with states S, A, B, C, D, G and labeled edge costs, is not reproduced in this transcription.) The heuristic values for each state are:

Node  h0  h1  h2
S     0   5   6
A     0   3   5
B     0   4   2
C     0   2   5
D     0   5   3
G     0   0   0

(a) (3 points) Which of the heuristics shown are admissible?
h0 and h1 are admissible. h2(C) > h*(C), so h2 is not admissible.

(b) (8 points) Give the path A* search will return using each of the three heuristics. (Hint: you should not actually have to execute A* to know the answer in some cases.)
h0: S-B-C-G
h1: S-B-C-G
h2: S-B-D-G

(c) (3 points) What path will greedy best-first search return using h1?
S-A-G
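
For part (b), the following minimal graph-A* sketch (illustrative code, not from the exam or Project 1) shows how the returned path depends on the heuristic; the edge dictionary would be filled in from the exam's figure, which is not reproduced here, and the h1 table above is shown as an example.

import heapq

def astar(edges, h, start, goal):
    """Graph A* search. edges: {node: [(neighbor, step_cost), ...]}; h: {node: heuristic value}."""
    frontier = [(h[start], 0, start, [start])]     # entries are (f = g + h, g, node, path)
    closed = set()
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in closed:
            continue
        closed.add(node)
        for nbr, cost in edges.get(node, []):
            if nbr not in closed:
                heapq.heappush(frontier, (g + cost + h[nbr], g + cost, nbr, path + [nbr]))
    return None

# Heuristic h1 from the table above; the exam's edge costs are needed to actually run the search.
h1 = {'S': 5, 'A': 3, 'B': 4, 'C': 2, 'D': 5, 'G': 0}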

Mazeworld: From the MazeWorld domain of Project 1, consider the following maze configuration. (The maze figure is not faithfully reproduced in this transcription; the agent must travel from the start S to the goal E across several rows of ~ cells.)

Recall that states are cells of this maze. The agent starts at S and tries to reach the goal E. Valid actions are moves into adjacent cells that are either clear (step cost of 1) or marked as ~ (step cost of 5).

Below are the outputs of the function mazeworld.testagentonmaze, tested with several agents that implemented graph search algorithms. Overlaid on the original maze, an x is placed on each cell used along the solution path, and an o denotes a cell that was expanded but was not part of the solution path. For this problem, DFS and BFS explore successors according to a fixed ordering of moves: right, down, left, up. Greedy and A* search use the Manhattan distance heuristic.

A different search strategy (BFS, DFS, UCS, Greedy, or A*) was used to produce each of the outputs below. Specify which strategies were used (2 points each). (The five overlaid maze printouts are not recoverable from this transcription; in order, they were produced by:)

Output 1: UCS
Output 2: Greedy
Output 3: DFS
Output 4: A*
Output 5: BFS
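
One way to tell the five outputs apart is by the priority each strategy assigns to a frontier node. The sketch below (illustrative helper names, not the Project 1 code) summarizes the expansion orders, with g(n) the path cost so far and h(n) the Manhattan distance used by Greedy and A*.

def manhattan(cell, goal):
    """Manhattan distance heuristic between grid cells given as (row, col)."""
    return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

def priority(strategy, g, h):
    """Frontier priority (lower is expanded first) for the cost-sensitive strategies."""
    if strategy == 'UCS':
        return g            # cheapest path cost so far; goes around the cost-5 ~ cells when possible
    if strategy == 'Greedy':
        return h            # heads straight toward E, even through cost-5 ~ cells
    if strategy == 'A*':
        return g + h        # optimal like UCS, but typically expands fewer cells
    raise ValueError('BFS uses a FIFO queue and DFS a LIFO stack rather than a priority.')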

3. (16 pts.) CSPs

Two trains, the A-line and the B-line, use a simple rail network. (The rail-network figure is not reproduced in this transcription; the shared section of track passes through the nodes X and Z and the intersection Y, and the terminal stations are drawn as squares.) We must schedule a single daily departure for each train. Each departure will be on the hour, between 1pm and 7pm, inclusive. The trains run at identical, constant speeds, and each segment takes a train one hour to cover. We can model this problem as a CSP in which the variables represent the departure times of the trains from their source stations:

Variables: A, B
Domains: {1, 2, 3, 4, 5, 6, 7}

For example, if A = 1 and B = 1, then both trains leave at 1pm. The complication is that the trains cannot pass each other in the region of the track that they share, and will collide if improperly scheduled. The only points in the shared region where trains can pass or touch without a collision are the terminal stations (squares) and the intersection Y.

Example 1: The A-line and B-line both depart at 4pm. At 6pm, the A-line will have reached Y, clearing the shared section of the track. The B-line will have only reached Z. No collision will occur.

Example 2: The A-line leaves at 4pm and the B-line at 2pm. At 5pm, the A-line will be at node X and the B-line at the intersection Y. They will then collide around 5:30pm.

Example 3: The A-line leaves at 4pm and the B-line at 3pm. At 5pm, the A-line will be at node X and the B-line at node Z. At 6pm they will pass each other safely at the intersection Y with no collision.

(a) (2 points) If the B-line leaves at 1pm, list the times that the A-line can safely leave without causing a collision.
1, 2, 6, 7

(b) (6 points) Implicitly state the binary constraint between the variables A and B for this CSP. Your statement should be precise, involving variables and inequalities, not a vague assertion that the trains should not collide.
A < B + 2 or A > B + 4
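
The constraint in (b) can be checked directly against the worked examples and part (a); the snippet below (illustrative, not part of the exam) enumerates the safe departure times.

# Binary constraint from part (b): no collision iff A < B + 2 or A > B + 4.
def safe(a, b):
    return a < b + 2 or a > b + 4

print([a for a in range(1, 8) if safe(a, 1)])   # part (a): [1, 2, 6, 7]

assert safe(4, 4)        # Example 1: both leave at 4pm, no collision
assert not safe(4, 2)    # Example 2: A-line at 4pm, B-line at 2pm, collision
assert safe(4, 3)        # Example 3: A-line at 4pm, B-line at 3pm, no collision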

(c) (2 points) Imagine the A-line must leave between 4pm and 5pm, inclusive, and the B-line must leave between 1pm and 7pm, inclusive. State the variables' domains after these unary constraints have been imposed.

A: {4, 5}
B: {1, 2, 3, 4, 5, 6, 7}

(d) (6 points) Put an X through any domain value above that is removed by applying arc consistency to these domains.

The values 1 and 2 are removed from B's domain: neither B = 1 nor B = 2 satisfies the constraint with A = 4 or A = 5. Every other value has at least one consistent partner, so the arc-consistent domains are A: {4, 5} and B: {3, 4, 5, 6, 7}.
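
Part (d) can likewise be reproduced by a single arc-consistency revision of B's domain against A's; the sketch below (illustrative names, reusing the safe constraint defined above) prunes exactly the values 1 and 2.

def revise_b(domain_a, domain_b):
    """Keep only the values of B that are consistent with at least one value of A."""
    return {b for b in domain_b if any(safe(a, b) for a in domain_a)}

print(revise_b({4, 5}, set(range(1, 8))))   # {3, 4, 5, 6, 7}: the values 1 and 2 are removed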

4. (20 pts.) Probability and Naive Bayes

You've been hired to build a quality control system to decide whether car engines coming off an assembly line are bad or ok. However, this decision must be based on three noisy Boolean observations: the engine may be wobbly (motion sensor), rumbly (sound sensor), or hot (heat sensor). Each sensor gives a Boolean observation: true or false. You formalize the problem using the following variables and domains:

Cause: B in {bad, ok}
Evidence: W, R, H in {true, false}

You reason that a naive Bayes classifier is appropriate, and build one which predicts P(B | W, R, H).

(a) (4 points) Under the naive Bayes assumption, write an expression for P(b | w, r, h) in terms of only probabilities of the forms P(b), P(w | b), P(r | b), and P(h | b).

P(b | w, r, h) = P(w | b) P(r | b) P(h | b) P(b) / Σ_b' P(w | b') P(r | b') P(h | b') P(b')

(b) (3 points) The company has records for several engines which recently came off the assembly line:

B    W      R      H
ok   false  false  true
ok   false  true   false
ok   false  true   false
ok   true   false  false
ok   false  false  false
bad  true   false  false
bad  true   true   false
bad  false  true   true

Given this data, circle EITHER the top row of tables OR the bottom row (both rows are repeated below, under "Tables repeated for convenience") to indicate which are the correct maximum likelihood parameters for the naive Bayes model of this domain. The correct estimates, in which each conditional table sums to 1, are:

P̂(B):     bad 3/8,  ok 5/8
P̂(W | B): true|bad 2/3,  false|bad 1/3,  true|ok 1/5,  false|ok 4/5
P̂(R | B): true|bad 2/3,  false|bad 1/3,  true|ok 2/5,  false|ok 3/5
P̂(H | B): true|bad 1/3,  false|bad 2/3,  true|ok 1/5,  false|ok 4/5
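
The circled tables can be recomputed from the eight records; the sketch below (illustrative code, not part of the exam) produces the maximum likelihood conditionals.

from collections import Counter

# The eight assembly-line records from part (b), as (B, W, R, H).
records = [
    ('ok',  False, False, True),
    ('ok',  False, True,  False),
    ('ok',  False, True,  False),
    ('ok',  True,  False, False),
    ('ok',  False, False, False),
    ('bad', True,  False, False),
    ('bad', True,  True,  False),
    ('bad', False, True,  True),
]

def ml_estimate(column):
    """Maximum likelihood estimate of P(column = true | B) for column 1 (W), 2 (R), or 3 (H)."""
    totals, trues = Counter(), Counter()
    for row in records:
        totals[row[0]] += 1
        trues[row[0]] += int(row[column])
    return {b: trues[b] / totals[b] for b in totals}

print(ml_estimate(1))   # P(W=true | B): bad 2/3, ok 1/5
print(ml_estimate(2))   # P(R=true | B): bad 2/3, ok 2/5
print(ml_estimate(3))   # P(H=true | B): bad 1/3, ok 1/5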

Tables repeated for convenience:

Top row (denominators of 8; the conditional tables do not sum to 1):
P̂(B):     bad 3/8,  ok 5/8
P̂(W | B): true|bad 2/8,  false|bad 1/8,  true|ok 1/8,  false|ok 4/8
P̂(R | B): true|bad 2/8,  false|bad 1/8,  true|ok 2/8,  false|ok 3/8
P̂(H | B): true|bad 1/8,  false|bad 2/8,  true|ok 1/8,  false|ok 4/8

Bottom row (the correct maximum likelihood estimates):
P̂(B):     bad 3/8,  ok 5/8
P̂(W | B): true|bad 2/3,  false|bad 1/3,  true|ok 1/5,  false|ok 4/5
P̂(R | B): true|bad 2/3,  false|bad 1/3,  true|ok 2/5,  false|ok 3/5
P̂(H | B): true|bad 1/3,  false|bad 2/3,  true|ok 1/5,  false|ok 4/5

(c) (6 points) Suppose you are given an observation of a new engine: W = false, R = true, H = false. What prediction would your naive Bayes classifier make: bad or ok?

Let b signify B = bad and ¬b signify B = ok. Then

P(b | W=false, R=true, H=false) ∝ 3/8 · 1/3 · 2/3 · 2/3 = 1/18
P(¬b | W=false, R=true, H=false) ∝ 5/8 · 4/5 · 2/5 · 4/5 = 4/25

Since 4/25 > 1/18, the classifier predicts ok.

(d) (3 points) What would the posterior probability of that prediction be?

Normalizing, we have

P(ok | W=false, R=true, H=false) = (4/25) / (1/18 + 4/25) = (4 · 18) / (4 · 18 + 25) = 72/97.

(e) (4 points) The company informs you that, according to their utility function, the utility of rejecting an engine is -1, regardless of whether or not it is actually bad. However, the utility of accepting an engine is +2 for ok engines and -20 for bad engines. Given these utilities, what is the minimum posterior probability your agent needs to have for B = ok before accepting an engine becomes the rational action?

Let p = P(B = ok | W, R, H). Then EU(reject) = -1 and EU(accept) = 2p + (-20)(1 - p). Accepting is a rational decision when EU(accept) ≥ EU(reject), and the expected utilities are equal when

-1 = 2p - 20(1 - p),

that is, when p = 19/22.
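
Parts (c) through (e) can be checked with exact arithmetic; the sketch below (illustrative, not part of the exam) reproduces the 72/97 posterior and the 19/22 break-even probability.

from fractions import Fraction as F

# Part (c): scores for the observation W=false, R=true, H=false, using the correct tables.
score_bad = F(3, 8) * F(1, 3) * F(2, 3) * F(2, 3)   # 1/18
score_ok  = F(5, 8) * F(4, 5) * F(2, 5) * F(4, 5)   # 4/25
assert score_ok > score_bad                          # so the classifier predicts ok

# Part (d): posterior probability of the prediction.
print(score_ok / (score_ok + score_bad))             # 72/97

# Part (e): at p = 19/22, accepting and rejecting have equal expected utility.
p = F(19, 22)
assert 2 * p + (-20) * (1 - p) == -1                 # EU(accept) = EU(reject) = -1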

5. (20 pts.) Classification

Consider the following data, which shows observations of three Boolean variables: A indicates whether or not your friend aces his exam, S indicates whether or not he studied, and H indicates whether he wore his lucky hat. In this problem, "true" indicates the variable was true in that observation. You decide to build a decision tree to see which variable is actually more predictive of his grades.

A      S      H
true   true   false
true   true   true
true   true   false
true   true   true
true   true   false
true   false  true
false  false  false
false  false  true

(a) (3 points) Draw the decision stump (single-level decision tree) which will be chosen using the information gain criterion presented in class.

The entropy is decreased more by splitting on S, so the stump tests S at its root.

(b) (4 points) Draw the full decision tree which will be learned using the information gain-guided recursive splitting learner presented in class.

The root again tests S. A second recursive split expands the right (S = false) subtree, which tests H. (The tree diagram from the original solutions is not reproduced in this transcription.)

(c) (2 points) Which, if either, of the two trees is consistent with the training data? Justify your answer.

Neither. Some training example is misclassified by each tree: two of the examples have identical features (S = false, H = true) but different labels, so no tree over S and H can classify both correctly.
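
The choice of S at the root can be verified by computing the information gain of each feature on the eight examples; the sketch below (illustrative code, not part of the exam) gives a gain of about 0.47 bits for S and 0 bits for H.

import math

# Training data from Q5 as (A, S, H); A is the label being predicted.
data = [
    (True,  True,  False), (True,  True,  True),  (True,  True,  False),
    (True,  True,  True),  (True,  True,  False), (True,  False, True),
    (False, False, False), (False, False, True),
]

def entropy(labels):
    """Entropy (in bits) of a list of Boolean labels."""
    h = 0.0
    for value in set(labels):
        p = labels.count(value) / len(labels)
        h -= p * math.log2(p)
    return h

def information_gain(feature):
    """Information gain of splitting on feature index 1 (S) or 2 (H)."""
    labels = [row[0] for row in data]
    gain = entropy(labels)
    for value in (True, False):
        subset = [row[0] for row in data if row[feature] == value]
        gain -= len(subset) / len(data) * entropy(subset)
    return gain

print(information_gain(1), information_gain(2))   # roughly 0.47 and 0.0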

For the following questions, your answers should fit in a few sentences, possibly one.

Scenario: Imagine you grow a very large, complex decision tree from a training set which contains many features (attributes).

(d) (4 points) You find that the training set accuracy for your tree is very high, but the test set accuracy (as measured on held-out data) is very low. Explain why the accuracy on the training data could be so much higher than on the test data.

Decision trees are very expressive, and so often overfit the training data.

(e) (4 points) You decide to use pruning to create a series of successively smaller subtrees from your full tree (e.g., χ-squared pruning, or truncating the tree at smaller depths). As you prune more and more of the tree, the test accuracy goes up at first, but then starts to drop again after extensive pruning. Explain the rise and then fall of test accuracy in terms of specific terminology and concepts discussed in class.

Pruning the full tree reduces overfitting, yielding a tree that generalizes more accurately to unseen data. Continuing to prune, however, eventually yields a classifier that is too simple to represent the patterns in the training data, so accuracy will eventually decrease.

(f) (3 points) Why is it important to use accuracy on held-out data rather than accuracy on training data in order to decide when to stop pruning a decision tree?

Pruning using the training data would typically not have any effect, because the recursive algorithm generates a tree that classifies the training data well; deciding to stop as soon as training accuracy decreased would cause an immediate stop. When pruning is stopped at a decline in held-out accuracy, on the other hand, the overfitting subtrees have been pruned away but the classifier is still expressive enough to model the trends in the data.

End of Exam