CSE 473: Artificial Intelligence. Uncertainty and Expectimax Tree Search

CSE 473: Artificial Intelligence. Uncertainty and Expectimax Tree Search. Instructor: Luke Zettlemoyer, University of Washington. [These slides were adapted from Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.]

Uncertain Outcomes

Worst-Case vs. Average Case

[Diagram: a max node over min nodes vs. a max node over chance nodes, with leaf values 10, 10, 9, 100.]

Idea: Uncertain outcomes are controlled by chance, not an adversary!

Expectimax Search

Why wouldn't we know what the result of an action will be?
- Explicit randomness: rolling dice
- Unpredictable opponents: the ghosts respond randomly
- Actions can fail: when moving a robot, the wheels might slip

Values should now reflect average-case (expectimax) outcomes, not worst-case (minimax) outcomes.

Expectimax search: compute the average score under optimal play
- Max nodes as in minimax search
- Chance nodes are like min nodes, but the outcome is uncertain
- Calculate their expected utilities, i.e. take the weighted average (expectation) of the children

[Expectimax tree diagram with max and chance layers.]

Later, we'll learn how to formalize the underlying uncertain-result problems as Markov Decision Processes.

[Demo: min vs exp (L7D1,2)]

Video of Demo Min vs. Exp (Min)

Video of Demo Min vs. Exp (Exp)

Expectimax Pseudocode

def value(state):
    if the state is a terminal state: return the state's utility
    if the next agent is MAX: return max-value(state)
    if the next agent is EXP: return exp-value(state)

def max-value(state):
    initialize v = -∞
    for each successor of state:
        v = max(v, value(successor))
    return v

def exp-value(state):
    initialize v = 0
    for each successor of state:
        p = probability(successor)
        v += p * value(successor)
    return v

Expectimax Pseudocode

def exp-value(state):
    initialize v = 0
    for each successor of state:
        p = probability(successor)
        v += p * value(successor)
    return v

Example: a chance node whose successors have values 8, 24, and -12, reached with probabilities 1/2, 1/3, and 1/6:
v = (1/2)(8) + (1/3)(24) + (1/6)(-12) = 10
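The pseudocode above can be turned directly into running code. Below is a minimal Python sketch (not the course's project API): it assumes a small hypothetical Node class in which chance children carry their probabilities, and it reproduces the worked example above.

import math
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                     # "max", "chance", or "leaf"
    utility: float = 0.0          # used only by leaf nodes
    children: list = field(default_factory=list)   # Nodes for max; (probability, Node) pairs for chance

def value(node):
    # Dispatch on the kind of node, as in the pseudocode.
    if node.kind == "leaf":
        return node.utility
    if node.kind == "max":
        return max_value(node)
    return exp_value(node)

def max_value(node):
    # MAX node: best value over children.
    v = -math.inf
    for child in node.children:
        v = max(v, value(child))
    return v

def exp_value(node):
    # Chance node: probability-weighted average of child values.
    v = 0.0
    for p, child in node.children:
        v += p * value(child)
    return v

# Worked example from the slide: probabilities 1/2, 1/3, 1/6 over values 8, 24, -12.
chance = Node("chance", children=[(1/2, Node("leaf", 8)),
                                  (1/3, Node("leaf", 24)),
                                  (1/6, Node("leaf", -12))])
print(value(chance))   # 10.0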

Expectimax Example

[Tree diagram with leaf values 3, 12, 9, 2, 4, 6, 15, 6, 0.]

Expectimax Pruning?

[Tree diagram with leaf values 3, 12, 9, 2.]

Depth-Limited Expectimax

[Diagram with node values 400, 300 and 492, 362: a depth-limited search returns an estimate of the true expectimax value, which would require a lot of work to compute.]
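A common way to implement the depth limit is to cut off the recursion and fall back to an evaluation function. The sketch below reuses the hypothetical Node class from the earlier expectimax sketch; evaluation_function is a placeholder, not a real heuristic from the course.

def value(node, depth):
    # Depth-limited expectimax: stop at the limit and estimate with an evaluation function.
    if node.kind == "leaf":
        return node.utility
    if depth == 0:
        return evaluation_function(node)   # heuristic estimate, not the true expectimax value
    if node.kind == "max":
        return max(value(child, depth - 1) for child in node.children)
    # Chance node: children are (probability, Node) pairs.
    return sum(p * value(child, depth - 1) for p, child in node.children)

def evaluation_function(node):
    # Placeholder heuristic; a real one would score features of the game state.
    return 0.0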

Probabilities

Reminder: Probabilities

A random variable represents an event whose outcome is unknown. A probability distribution is an assignment of weights to outcomes.

Example: Traffic on the freeway
- Random variable: T = whether there's traffic
- Outcomes: T in {none, light, heavy}
- Distribution: P(T=none) = 0.25, P(T=light) = 0.50, P(T=heavy) = 0.25

Some laws of probability (more later):
- Probabilities are always non-negative
- Probabilities over all possible outcomes sum to one

As we get more evidence, probabilities may change: P(T=heavy) = 0.25, P(T=heavy | Hour=8am) = 0.60. We'll talk about methods for reasoning about and updating probabilities later.

Reminder: Expectations

The expected value of a function of a random variable is the average, weighted by the probability distribution over outcomes.

Example: How long to get to the airport?
Time:        20 min   30 min   60 min
Probability: 0.25     0.50     0.25
Expected time: 20 x 0.25 + 30 x 0.50 + 60 x 0.25 = 35 min
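As a quick check, the same expectation can be computed in a few lines of Python (the numbers are taken from the slide):

times = [20, 30, 60]                # minutes
probs = [0.25, 0.50, 0.25]          # must sum to one
expected_time = sum(t * p for t, p in zip(times, probs))
print(expected_time)                # 35.0 minutes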

What Probabilities to Use?

In expectimax search, we have a probabilistic model of how the opponent (or environment) will behave in any state.
- The model could be a simple uniform distribution (roll a die)
- The model could be sophisticated and require a great deal of computation
- We have a chance node for any outcome out of our control: opponent or environment
- The model might say that adversarial actions are likely!

For now, assume each chance node magically comes along with probabilities that specify the distribution over its outcomes.

Having a probabilistic belief about another agent's action does not mean that the agent is flipping any coins!

Quiz: Informed Probabilities

Let's say you know that your opponent is actually running a depth-2 minimax, using the result 80% of the time, and moving randomly otherwise.

Question: What tree search should you use?

Answer: Expectimax!
- To figure out EACH chance node's probabilities, you have to run a simulation of your opponent
- This kind of thing gets very slow very quickly
- Even worse if you have to simulate your opponent simulating you...
- ...except for minimax, which has the nice property that it all collapses into one game tree
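A minimal sketch of what "running a simulation of your opponent" looks like for this quiz, using hypothetical helpers legal_moves(state) and minimax_move(state, depth) (these are assumptions, not the course's project API): the chance node's distribution puts 0.8 on the opponent's depth-2 minimax move and spreads the remaining 0.2 uniformly over all legal moves.

def opponent_move_distribution(state, legal_moves, minimax_move):
    # Opponent model: plays the depth-2 minimax move 80% of the time,
    # otherwise moves uniformly at random.
    moves = legal_moves(state)
    best = minimax_move(state, depth=2)    # requires running a minimax search per chance node
    uniform_share = 0.2 / len(moves)       # the random 20% spread over every legal move
    return {m: uniform_share + (0.8 if m == best else 0.0) for m in moves}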

Modeling Assumptions

The Dangers of Optimism and Pessimism

Dangerous optimism: assuming chance when the world is adversarial.
Dangerous pessimism: assuming the worst case when it's not likely.

Assumptions vs. Reality

                     Adversarial Ghost             Random Ghost
Minimax Pacman       Won 5/5, Avg. Score: 483      Won 5/5, Avg. Score: 493
Expectimax Pacman    Won 1/5, Avg. Score: -303     Won 5/5, Avg. Score: 503

Results from playing 5 games. Pacman used depth-4 search with an eval function that avoids trouble; the ghost used depth-2 search with an eval function that seeks Pacman.

[Demos: world assumptions (L7D3,4,5,6)]

Video of Demo World Assumptions: Random Ghost, Expectimax Pacman

Video of Demo World Assumptions: Adversarial Ghost, Minimax Pacman

Video of Demo World Assumptions: Adversarial Ghost, Expectimax Pacman

Video of Demo World Assumptions: Random Ghost, Minimax Pacman

Other Game Types

Mixed Layer Types

E.g. Backgammon. Expectiminimax:
- The environment is an extra "random agent" player that moves after each min/max agent
- Each node computes the appropriate combination of its children
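A short sketch of the expectiminimax recursion, using the same hypothetical node representation as the earlier expectimax sketch (leaf nodes carry a utility, max/min children are plain nodes, chance children are (probability, node) pairs); this is an illustration, not the course's implementation.

def expectiminimax(node):
    # Value of a game tree that mixes max, min, and chance layers.
    if node.kind == "leaf":
        return node.utility
    if node.kind == "max":
        return max(expectiminimax(child) for child in node.children)
    if node.kind == "min":
        return min(expectiminimax(child) for child in node.children)
    # Chance node: probability-weighted average of child values.
    return sum(p * expectiminimax(child) for p, child in node.children)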

Example: Backgammon

Dice rolls increase b: 21 possible rolls with 2 dice. Backgammon has about 20 legal moves per position.
Depth 2 = 20 x (21 x 20)^3 = 1.2 x 10^9

As depth increases, the probability of reaching a given search node shrinks:
- So the usefulness of search is diminished
- So limiting depth is less damaging
- But pruning is trickier

Historic AI: TD-Gammon used depth-2 search + a very good evaluation function + reinforcement learning: world-champion level play. First AI world champion in any game!

Image: Wikipedia

Multi-Agent Utilities

What if the game is not zero-sum, or has multiple players?

Generalization of minimax:
- Terminals have utility tuples
- Node values are also utility tuples
- Each player maximizes its own component
- Can give rise to cooperation and competition dynamically

[Tree diagram with leaf utility tuples: 1,6,6  7,1,2  6,1,2  7,2,1  5,1,7  1,5,2  7,7,1  5,2,5]
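A minimal sketch of this generalization, with a hypothetical node class (an illustration, not the course's code): values are utility tuples, and the player to move picks the child whose tuple is largest in that player's own component.

from dataclasses import dataclass, field

@dataclass
class MultiNode:
    player: int = -1                              # index of the player to move; -1 at leaves
    utilities: tuple = ()                         # utility tuple at leaf nodes
    children: list = field(default_factory=list)  # child MultiNodes

def multi_value(node):
    # Value (a utility tuple) of a multi-player game tree node.
    if not node.children:
        return node.utilities
    child_values = [multi_value(child) for child in node.children]
    return max(child_values, key=lambda u: u[node.player])

# Tiny example: player 0 chooses between two leaves with 3-player utility tuples.
root = MultiNode(player=0, children=[MultiNode(utilities=(1, 6, 6)),
                                     MultiNode(utilities=(7, 1, 2))])
print(multi_value(root))   # (7, 1, 2): player 0 prefers utility 7 over 1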