Two-player Games ZUI 2016/2017


Two-player Games ZUI 2016/2017 Branislav Bošanský bosansky@fel.cvut.cz

Two Player Games
An important test environment for AI algorithms and a benchmark of AI:
- Chinook (1994/96): world champion in checkers
- Deep Blue (1997): beats G. Kasparov in chess (3.5:2.5)
- AlphaGo (2016): beats Lee Sedol in Go (4:1)
- DeepStack (2016/2017): beats poker pros (https://www.deepstack.ai/)

Game-tree Search / Adversarial Search
Until now, only the searching player acted in the environment. There can be others:
- Nature: a stochastic environment (MDP, POMDP, ...)
- other agents: rational opponents
Game Theory: a mathematical framework that describes the optimal behavior of rational self-interested agents (Mag. OI: B4M36MAS, Multi-agent Systems).

Game-tree Search / Adversarial Search
What are the basic game categories?
- perfect / imperfect information
- deterministic / stochastic
- zero-sum / general-sum
- finite / infinite
- two-player / n-player
What is the goal? Finding an optimal strategy (i.e., what to play in which situation).

Game-tree Search / Adversarial Search
Players are rational: each player wants to maximize her/his utility value.

Minimax
function minimax(node, Player)
    if (node is a terminal node)
        return utility value of node
    if (Player = MaxPlayer)
        α := -∞
        for each child of node
            α := max(α, minimax(child, switch(Player)))
        return α
    else
        β := +∞
        for each child of node
            β := min(β, minimax(child, switch(Player)))
        return β

Minimax in Real Games
- the search space in games is typically extremely large: exponential in the depth d with branching factor b (b^d)
- e.g., b ≈ 35 in chess, up to 360 in Go, up to 45,000 in Arimaa; no-limit heads-up poker has about 10^160 decision points
- we have to limit the depth of the search
- we need an evaluation function

Minimax
function minimax(node, depth, Player)
    if (depth = 0 or node is a terminal node)
        return evaluation value of node
    if (Player = MaxPlayer)
        α := -∞
        for each child of node
            α := max(α, minimax(child, depth-1, switch(Player)))
        return α
    else
        β := +∞
        for each child of node
            β := min(β, minimax(child, depth-1, switch(Player)))
        return β
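Depth-limited minimax maps directly to a short runnable sketch. The tuple-tree encoding, the `evaluate` stub, and all names below are illustrative assumptions, not code from the lecture:

```python
def evaluate(node):
    # Stand-in evaluation function: the average of the leaf utilities
    # below `node`. A real engine would use domain knowledge instead.
    if isinstance(node, int):
        return node
    vals = [evaluate(c) for c in node]
    return sum(vals) / len(vals)

def minimax(node, depth, maximizing):
    # Leaves are exact utilities (ints); internal nodes are tuples of children.
    if isinstance(node, int):
        return node
    if depth == 0:
        return evaluate(node)  # depth cut-off: fall back to the heuristic
    values = [minimax(c, depth - 1, not maximizing) for c in node]
    return max(values) if maximizing else min(values)

# Classic 2-ply example: MAX to move, three MIN nodes with three leaves each.
tree = ((3, 12, 8), (2, 4, 6), (14, 5, 2))
print(minimax(tree, 2, True))   # -> 3
```

Searching with `depth=1` instead returns the heuristic maximum, which illustrates how the cut-off trades accuracy for speed.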

Minimax in Real Games - Problems
- a good evaluation function
- which depth? the horizon problem; iterative deepening
- searching deeper does not always improve the results
- online vs. offline decision making: game playing vs. equilibrium computation
- these problems also appear in robotics / planning / decision making

Alpha-Beta Pruning

Alpha-Beta Pruning
function alphabeta(node, depth, α, β, Player)
    if (depth = 0 or node is a terminal node)
        return evaluation value of node
    if (Player = MaxPlayer)
        for each child of node
            α := max(α, alphabeta(child, depth-1, α, β, switch(Player)))
            if (β ≤ α) break
        return α
    else
        for each child of node
            β := min(β, alphabeta(child, depth-1, α, β, switch(Player)))
            if (β ≤ α) break
        return β
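The pruning rule can be sketched in runnable form; the tuple-tree encoding, the `evaluate` stub, and all names are illustrative assumptions:

```python
import math

def evaluate(node):
    # Stand-in heuristic: average of the leaf utilities below `node`.
    if isinstance(node, int):
        return node
    vals = [evaluate(c) for c in node]
    return sum(vals) / len(vals)

def alphabeta(node, depth, alpha, beta, maximizing):
    if isinstance(node, int):
        return node
    if depth == 0:
        return evaluate(node)
    if maximizing:
        for child in node:
            alpha = max(alpha, alphabeta(child, depth - 1, alpha, beta, False))
            if beta <= alpha:
                break          # beta cut-off: MIN would never allow this line
        return alpha
    for child in node:
        beta = min(beta, alphabeta(child, depth - 1, alpha, beta, True))
        if beta <= alpha:
            break              # alpha cut-off: MAX already has something better
    return beta

tree = ((3, 12, 8), (2, 4, 6), (14, 5, 2))
print(alphabeta(tree, 2, -math.inf, math.inf, True))  # -> 3, same as plain minimax
```

On this tree, alpha-beta skips the last two leaves of the second MIN node and the last leaf of the third, without changing the minimax value.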

Negamax
function negamax(node, depth, α, β, color)
    if (depth = 0 or node is a terminal node)
        return color × (heuristic value of node)
    v := -∞
    for each child of node
        v := max(v, -negamax(child, depth-1, -β, -α, -color))
        α := max(α, v)
        if (α ≥ β) break
    return v
There is no separate MIN branch: negating the returned value and swapping the negated window handles both players with a single piece of code.
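Negamax collapses the two minimax branches into one by negating both the value and the search window at each level. A minimal runnable sketch, where the tuple-tree encoding and the `evaluate` stub are illustrative assumptions:

```python
import math

def evaluate(node):
    # Stand-in heuristic from MAX's point of view: average of leaf utilities.
    if isinstance(node, int):
        return node
    vals = [evaluate(c) for c in node]
    return sum(vals) / len(vals)

def negamax(node, depth, alpha, beta, color):
    # `color` is +1 when MAX is to move, -1 when MIN is to move.
    # Leaves store the utility from MAX's point of view.
    if isinstance(node, int):
        return color * node
    if depth == 0:
        return color * evaluate(node)
    value = -math.inf
    for child in node:
        value = max(value, -negamax(child, depth - 1, -beta, -alpha, -color))
        alpha = max(alpha, value)
        if alpha >= beta:
            break              # the same cut-off as alpha-beta's β ≤ α
    return value

tree = ((3, 12, 8), (2, 4, 6), (14, 5, 2))
print(negamax(tree, 2, -math.inf, math.inf, 1))  # -> 3
```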

Aspiration Search
- alpha-beta searches an [α, β] interval (window); the initialization is [-∞, +∞]
- what if we use a narrower window [α₀, β₀]?
  x = alphabeta(node, depth, α₀, β₀, Player)
  - α₀ < x < β₀: we found the solution x
  - x ≤ α₀: failing low (run again with [-∞, x])
  - x ≥ β₀: failing high (run again with [x, +∞])
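The fail-low / fail-high driver can be sketched as runnable code. The tuple-tree encoding, the depth-free alpha-beta (a simplification: only terminal leaves, no evaluation function), and all names are illustrative assumptions:

```python
import math

def alphabeta(node, alpha, beta, maximizing):
    # Simplified alpha-beta: leaves are exact utilities, no depth cut-off.
    if isinstance(node, int):
        return node
    if maximizing:
        for child in node:
            alpha = max(alpha, alphabeta(child, alpha, beta, False))
            if beta <= alpha:
                break
        return alpha
    for child in node:
        beta = min(beta, alphabeta(child, alpha, beta, True))
        if beta <= alpha:
            break
    return beta

def aspiration(node, guess, window):
    # Search a narrow window around `guess` first.
    alpha0, beta0 = guess - window, guess + window
    x = alphabeta(node, alpha0, beta0, True)
    if x <= alpha0:                       # failed low: true value is <= x
        return alphabeta(node, -math.inf, x, True)
    if x >= beta0:                        # failed high: true value is >= x
        return alphabeta(node, x, math.inf, True)
    return x                              # alpha0 < x < beta0: exact value

tree = ((3, 12, 8), (2, 4, 6), (14, 5, 2))
print(aspiration(tree, 10, 2))  # -> 3, after a fail-low re-search
```

With a good first guess (e.g., from iterative deepening) the narrow window prunes aggressively; a wrong guess costs one re-search.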

NegaScout Main Idea
- an enhancement of the alpha-beta algorithm
- assume some heuristic that determines the move ordering
- the algorithm assumes that the first action is the best one
- after evaluating the first action, the algorithm checks whether the remaining actions are worse
- the test is performed via a null-window search

Scout Test
What we really need at that moment is a bound (not the precise value).
Remember Aspiration Search?
- x ≤ α₀: failing low (we know that the solution is ≤ x)
- x ≥ β₀: failing high (we know that the solution is ≥ x)
What if we use a null window [α, α+1] (or [α, α])? We obtain a bound.

NegaScout
function negascout(node, depth, α, β, Player)
    if (depth = 0 or node is a terminal node)
        return eval(node)
    b := β
    for each child of node
        v := -negascout(child, depth-1, -b, -α, switch(Player))
        if (α < v < β and child is not the first child)
            v := -negascout(child, depth-1, -β, -α, switch(Player))    // re-search with the full window
        α := max(α, v)
        if (α ≥ β) break
        b := α + 1    // null window for the remaining children
    return α
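A runnable NegaScout sketch in negamax form. The tuple-tree encoding is an illustrative assumption, and it is simplified to terminal leaves (no depth cut-off or evaluation function); integer utilities make the [α, α+1] null window exact:

```python
import math

def negascout(node, alpha, beta, color):
    # Leaves store MAX's utility; `color` is +1 on MAX's turns, -1 on MIN's.
    if isinstance(node, int):
        return color * node
    b = beta                               # full window only for the first child
    for i, child in enumerate(node):
        v = -negascout(child, -b, -alpha, -color)
        if i > 0 and alpha < v < beta:
            # the null-window probe failed high: re-search with the full window
            v = -negascout(child, -beta, -alpha, -color)
        alpha = max(alpha, v)
        if alpha >= beta:
            break                          # cut-off, as in alpha-beta
        b = alpha + 1                      # null window for the remaining children
    return alpha

tree = ((3, 12, 8), (2, 4, 6), (14, 5, 2))
print(negascout(tree, -math.inf, math.inf, 1))  # -> 3, the same value as alpha-beta
```

With good move ordering the null-window probes almost always fail low, so the expensive full-window re-search rarely triggers.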

Alpha-Beta vs. NegaScout
[Worked example on the slides: a small game tree with leaf values 4, 2, 0, 5, 7, 2, 3, 0 is searched first by alpha-beta and then by NegaScout, whose null-window searches prune additional nodes. A variant with the leaf value 3 replaced by 9 shows the case where a null-window search fails high and a re-search with the full window is needed.]

NegaScout
- also termed Principal Variation Search (PVS)
- dominates alpha-beta: it never evaluates more distinct nodes than alpha-beta
- can evaluate some nodes more than once (depends on the move ordering)
- can benefit from transposition tables
- generally 10-20% faster compared to alpha-beta

MTD - Memory-enhanced Test Driver
Plaat et al.: Best-first fixed-depth minimax algorithms. Artificial Intelligence, Volume 87, Issues 1-2, November 1996, Pages 255-293.

Further Topics
- more advanced algorithms: Monte-Carlo Tree Search (the leading algorithm for many games), further enhanced with many heuristics and techniques
- more complex games: games with uncertainty
  - chance (a Nature player): calculating expected utilities
  - imperfect information: players cannot distinguish certain states

Other Games - Chance nodes
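For chance nodes, minimax generalizes to expectiminimax: a chance node returns the probability-weighted average of its children instead of a max or min. A minimal sketch, where the node encoding and names are illustrative assumptions:

```python
def expectiminimax(node):
    # Node encoding (an assumption for this sketch):
    #   int                              -> terminal utility
    #   ('max', [children])              -> MAX node
    #   ('min', [children])              -> MIN node
    #   ('chance', [(prob, child), ...]) -> chance node
    if isinstance(node, int):
        return node
    kind, children = node
    if kind == 'max':
        return max(expectiminimax(c) for c in children)
    if kind == 'min':
        return min(expectiminimax(c) for c in children)
    # chance node: expected value over the random outcomes
    return sum(p * expectiminimax(c) for p, c in children)

# MAX chooses between two gambles; the second has the higher expected utility.
tree = ('max', [('chance', [(0.5, 10), (0.5, -10)]),
                ('chance', [(0.9, 2), (0.1, 12)])])
print(expectiminimax(tree))  # -> 3.0 (0.9*2 + 0.1*12, vs. 0.0 for the coin flip)
```

Because values are expectations, alpha-beta-style pruning below chance nodes needs bounds on the utilities (as in *-minimax), which is one reason these games are harder to search.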

Games and Game Theory in AIC
- more fundamental research
- general algorithms for solving sequential games with imperfect information
- implementation of domain-independent algorithms

Invitation