
CS Summer Introduction to Artificial Intelligence Midterm

You have approximately minutes. The exam is closed book, closed calculator, and closed notes except your one-page crib sheet.

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation. All short answer sections can be successfully answered in a few sentences AT MOST.

First name:
Last name:
SID:
edx username:
Name of person on your left:
Name of person on your right:

For staff use only:
Q1. Search                      /
Q2. Game Trees and Pruning      /9
Q3. Encrypted Knowledge Base    /6
Q4. MDPs - Farmland             /
Q5. Reinforcement Learning      /9
Total                           /


Q1. [ pts] Search

Each True/False question is worth points. Leaving a question blank is worth points. Answering incorrectly is worth points.

(a) Consider a graph search problem where for every action, the cost is at least ɛ, with ɛ > 0. Assume the heuristic used is consistent.

(i) [ pt] [true or false] Depth-first graph search is guaranteed to return an optimal solution.
(ii) [ pt] [true or false] Breadth-first graph search is guaranteed to return an optimal solution.
(iii) [ pt] [true or false] Uniform-cost graph search is guaranteed to return an optimal solution.
(iv) [ pt] [true or false] Greedy graph search is guaranteed to return an optimal solution.
(v) [ pt] [true or false] A* graph search is guaranteed to return an optimal solution.
(vi) [ pt] [true or false] A* graph search is guaranteed to expand no more nodes than depth-first graph search.
(vii) [ pt] [true or false] A* graph search is guaranteed to expand no more nodes than uniform-cost graph search.

(b) Let h1(s) be an admissible A* heuristic, and let h2(s) = 2 · h1(s). Then:

(i) [ pt] [true or false] The solution found by A* tree search with h2 is guaranteed to be an optimal solution.
(ii) [ pt] [true or false] The solution found by A* tree search with h2 is guaranteed to have a cost at most twice as much as the optimal path.
(iii) [ pt] [true or false] The solution found by A* graph search with h2 is guaranteed to be an optimal solution.

(c) [ pts] The heuristic values for the graph below are not correct. For which single state (S, A, B, C, D, or G) could you change the heuristic value to make everything admissible and consistent? What range of values is possible to make this correction?

State: ________    Range: ________

[Figure: search graph over the states S, A, B, C, D, and G, with a heuristic value shown at each node (S has h = 6; the remaining values were lost in transcription).]
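
As a study aid for parts (a) and (b) (not part of the exam), here is a minimal sketch of A* graph search in Python; the start/goal/successors/h interface is an assumption made for illustration. With a consistent heuristic, the first time a state is popped off the frontier its cost is already optimal, which is why the closed set is safe; with h = 0 the same code behaves exactly like uniform-cost search.

```python
import heapq, itertools

def a_star_graph_search(start, goal, successors, h):
    """Minimal A* graph search sketch.

    successors(state) yields (next_state, step_cost) pairs; h(state) is the
    heuristic. Returns (path, cost), or (None, inf) if no path exists."""
    counter = itertools.count()                  # tie-breaker so states are never compared
    frontier = [(h(start), next(counter), 0, start, [start])]
    closed = set()
    while frontier:
        _, _, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return path, g
        if state in closed:
            continue
        closed.add(state)
        for nxt, cost in successors(state):
            if nxt not in closed:
                heapq.heappush(frontier, (g + cost + h(nxt), next(counter),
                                          g + cost, nxt, path + [nxt]))
    return None, float("inf")
```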

Q2. [9 pts] Game Trees and Pruning

You and one of the robots are playing a game where you both have your own score. The maximum possible score for either player is . You are trying to maximize your score, and you do not care what score the robot gets. The robot is trying to minimize the absolute difference between the two scores. In the case of a tie, the robot prefers a lower score. For example, the robot prefers (,) to (6,); it prefers (,) to (,); and it prefers (,) to (,).

The figure below shows the game tree of your max node followed by the robot's nodes for your four different actions. The scores are shown at the leaf nodes, with your score always on top and the robot's score on the bottom.

(a) [ pts] Fill in the dashed rectangles with the pair of scores preferred by each node of the game tree.

[Figure: game tree with a score pair at each leaf; most leaf values were lost in transcription.]

(b) [ pts] You can save computation time by using pruning in your game tree search. On the game tree above, put an X on the lines of the branches that do not need to be explored. Assume that branches are explored from left to right.

(c) [ pts] You now have access to an oracle that tells you the order of branches to explore that maximizes pruning. On the copy of the game tree below, put an X on the lines of the branches that do not need to be explored, given this new information from the oracle.

[Figure: a second copy of the same game tree; most leaf values were lost in transcription.]
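
As a reference for parts (b) and (c), here is a generic sketch of textbook alpha-beta pruning in Python. It is not the exam's procedure: the robot above is not a plain minimizer (it minimizes the absolute score difference), so pruning in the exam requires reasoning about which robot outcomes can still affect your own score. The sketch only shows the standard cut-off mechanism, and the children/value interface is an assumption.

```python
def alphabeta(node, alpha, beta, maximizing, children, value):
    """Textbook alpha-beta pruning sketch.
    children(node) lists a node's successors; value(node) gives a leaf's utility."""
    kids = children(node)
    if not kids:                       # leaf node
        return value(node)
    if maximizing:
        best = float("-inf")
        for child in kids:
            best = max(best, alphabeta(child, alpha, beta, False, children, value))
            alpha = max(alpha, best)
            if alpha >= beta:          # remaining siblings cannot improve the result: prune
                break
        return best
    best = float("inf")
    for child in kids:
        best = min(best, alphabeta(child, alpha, beta, True, children, value))
        beta = min(beta, best)
        if alpha >= beta:
            break
    return best
```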

Q3. [6 pts] Encrypted Knowledge Base

We have a propositional logic knowledge base, but unfortunately, it is encrypted. The only information we have is that:

- Each of the following boxes contains a propositional logic symbol (A, B, C, D, or E) or a propositional logic operator, and
- Each line is a valid propositional logic sentence.

[Figure: three lines of numbered boxes making up the encrypted sentences; the box contents and most box numbers were lost in transcription.]

(a) [ pts] We are going to implement a constraint satisfaction problem solver to find a valid assignment to each box from the domain {A, B, C, D, E, ¬, ∧, ∨, ⇒, ⇔}. Propositional logic syntax imposes constraints on what can go in each box. What values are in the domain of boxes 1-6 after enforcing the unary syntax constraints?

Box              | 1 | 2 | 3 | 4 | 5 | 6
Remaining Values |   |   |   |   |   |
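
One way to picture part (a): enforcing unary constraints is just node consistency, dropping from each box's domain any value a unary rule forbids. The sketch below is hypothetical (the exam's actual per-box syntax rules are not recoverable from this transcription); the names and interfaces are assumptions.

```python
def enforce_unary_constraints(domains, unary_constraints):
    """Node consistency: keep only values allowed by every unary constraint.

    domains: dict mapping a box name to its set of candidate values
    unary_constraints: dict mapping a box name to a list of predicates value -> bool
    """
    pruned = {}
    for box, values in domains.items():
        checks = unary_constraints.get(box, [])
        pruned[box] = {v for v in values if all(check(v) for check in checks)}
    return pruned

# Hypothetical usage: a box that syntactically must hold a symbol, not an operator.
symbols = {"A", "B", "C", "D", "E"}
domains = {"box1": symbols | {"not", "and", "or", "implies", "iff"}}
print(enforce_unary_constraints(domains, {"box1": [lambda v: v in symbols]}))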

(b) [ pts] You are given the following assignment as a solution to the knowledge base CSP on the previous page:

[Decrypted sentences built from the symbols A, B, A, D, C, B, D, E; the operator boxes are not recoverable from the transcription.]

Now that the encryption CSP is solved, we have an entirely new CSP to work on: finding a model. In this new CSP the variables are the symbols {A, B, C, D, E}, and each variable can be assigned true or false. We are going to run CSP backtracking search with forward checking to find a propositional logic model M that makes all of the sentences in this knowledge base true. After choosing to assign C to false, what values are removed by running forward checking? On the table of remaining values below, cross off the values that were removed.

Symbol           | A   | B   | C   | D   | E
Remaining Values | T F | T F | T F | T F | T F

(c) [ pts] We eventually arrive at the model M = {A = False, B = False, C = True, D = True, E = True} that causes all of the knowledge base sentences to be true. We have a query sentence α, specified as (A C) E. Our model M also causes α to be true. Can we say that the knowledge base entails α? Explain briefly (in one sentence) why or why not.
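
For part (b), here is a minimal, hypothetical sketch of one forward-checking step over boolean variables. The exam's actual decrypted sentences cannot be recovered from this transcription, so the constraint interface below is an assumption rather than the exam's knowledge base.

```python
def forward_check(domains, constraints, var, value):
    """Prune neighbouring domains after assigning var = value.

    domains: dict symbol -> set of remaining truth values, e.g. {True, False}
    constraints: list of functions; each takes a partial assignment (dict) and
                 returns False only if that partial assignment already violates
                 a sentence, True otherwise.
    Returns the pruned domains, or None if some domain empties out (dead end)."""
    pruned = {s: set(vals) for s, vals in domains.items()}
    pruned[var] = {value}
    for other in pruned:
        if other == var:
            continue
        for candidate in list(pruned[other]):
            if not all(c({var: value, other: candidate}) for c in constraints):
                pruned[other].discard(candidate)
        if not pruned[other]:
            return None
    return pruned
```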

Q4. [ pts] MDPs - Farmland

In the game FARMLAND, players alternate taking turns drawing a card from one of two piles, PIG and COW. PIG cards have equal probability of being worth points or point, and COW cards are always worth points. Players are trying to be the first to reach points or more.

We are designing an agent to play FARMLAND. We will use a modified MDP to come up with a policy to play this game. States will be represented as tuples (x, y), where x is our score and y is the opponent's score. The value V(x, y) is an estimate of the probability that we will win at the state (x, y) when it is our turn to play and both players are playing optimally. Unless otherwise stated, assume both players play optimally.

First, suppose we work out by hand V, the table of actual probabilities.

[Table: V(x, y) indexed by our score (rows, labeled "You") and the opponent's score (columns, labeled "Opponent"); the entries were lost in transcription.]

According to this table, V(, ) = ., so with both players playing optimally, the probability that we will win if our score is , the opponent's score is , and it is our turn to play is .

(a) [ pts] At the beginning of the game, would you choose to go first or second? Justify your answer using the table.

(b) [ pts] If our current state is (x, y) (so our score is x and the opponent's score is y) but it is the opponent's turn to play, what is the probability that we will win if both players play optimally, in terms of V?

(c) [ pts] As FARMLAND is a very simple game, you quickly grow tired of playing it. You decide to buy the FARMLAND expansion, BOVINE BONANZA, which adds loads of exciting cards to the COW pile! Of course, this changes the transition function for our MDP, so the table V above is no longer correct. We need to come up with an update equation that will ultimately make V converge on the actual probabilities that we will win.

You are given the transition function T((x, y), a, (x', y')) and the reward function R((x, y), a, (x', y')). The transition function T((x, y), a, (x', y')) is the probability of transitioning from state (x, y) to state (x', y') when action a is taken (i.e. the probability that the card drawn gives x' - x points). Since we are only trying to find the probability of winning and we don't care about the margin of victory, the reward function R((x, y), a, (x', y')) is 1 whenever (x', y') is a winning state and 0 everywhere else. As in normal value iteration, all values will be initialized to 0 (i.e. V_0(x, y) = 0 for all states (x, y)). Write an update equation for V_{k+1}(x, y) in terms of T, R, and V_k. Hint: you will need to use your answer from part (b).
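
As background for part (c) (and only as background: the exam's intended update must also fold in the opponent's turn from part (b)), here is a sketch of one sweep of standard value iteration. The states/actions/T/R interfaces are hypothetical placeholders, not the FARMLAND model.

```python
def value_iteration_step(V, states, actions, T, R, gamma=1.0):
    """One sweep of the standard Bellman update:
        V_{k+1}(s) = max_a sum_{s'} T(s, a, s') * ( R(s, a, s') + gamma * V_k(s') )

    V: dict state -> current estimate V_k(state)
    actions(s): iterable of actions available in s
    T(s, a): list of (s_prime, probability) pairs
    R(s, a, s_prime): reward for that transition
    """
    V_next = {}
    for s in states:
        V_next[s] = max(
            (sum(p * (R(s, a, sp) + gamma * V[sp]) for sp, p in T(s, a))
             for a in actions(s)),
            default=0.0,  # terminal states with no actions keep value 0
        )
    return V_next
```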

Q5. [9 pts] Reinforcement Learning

(a) Each True/False question is worth points. Leaving a question blank is worth points. Answering incorrectly is worth points.

(i) [ pt] [true or false] Temporal difference learning is an online learning method.
(ii) [ pt] [true or false] Q-learning: Using an optimal exploration function leads to no regret while learning the optimal policy.
(iii) [ pt] [true or false] In a deterministic MDP (i.e. one in which each state / action leads to a single deterministic next state), the Q-learning update with a learning rate of α = 1 will correctly learn the optimal q-values (assume that all state/action pairs are visited sufficiently often).
(iv) [ pt] [true or false] A small discount (close to 0) encourages greedy behavior.
(v) [ pt] [true or false] A large, negative living reward ( ) encourages greedy behavior.
(vi) [ pt] [true or false] A negative living reward can always be expressed using a discount < 1.
(vii) [ pt] [true or false] A discount < 1 can always be expressed as a negative living reward.

(b) [ pts] Given the following table of Q-values for the state A and the set of actions {Forward, Reverse, Stop}, what is the probability that we will take each action on our next move when we are following an ɛ-greedy exploration policy (assuming any random movements are chosen uniformly from all actions)?

Q(A, Forward) = .
Q(A, Reverse) = .
Q(A, Stop) = .

Action   | Probability (in terms of ɛ)
Forward  |
Reverse  |
Stop     |
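
For part (b), here is a small sketch of one common ɛ-greedy convention, the one the question's parenthetical suggests (the random move is uniform over all actions, including the greedy one); the dictionary interface is an assumption.

```python
import random

def epsilon_greedy_action(q_values, epsilon):
    """Pick an action epsilon-greedily from a dict mapping action -> Q(state, action):
    with probability epsilon choose uniformly among ALL actions, otherwise
    choose the action with the highest Q-value."""
    actions = list(q_values)
    if random.random() < epsilon:
        return random.choice(actions)        # exploration: uniform over all actions
    return max(actions, key=q_values.get)    # exploitation: greedy action

# Under this convention the greedy action is chosen with probability
# 1 - epsilon + epsilon/N and every other action with probability epsilon/N,
# where N = len(actions).
```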