Artificial Intelligence, CS, Nanjing University Spring, 2016, Yang Yu Lecture 5: Search 4 http://cs.nju.edu.cn/yuy/course_ai16.ashx

Previously... Path-based search. Uninformed search: depth-first, breadth-first, uniform-cost search. Informed search: best-first, A* search. Adversarial search: alpha-beta search.

Beyond classical search. Bandit search. Tree search: Monte-Carlo Tree Search. General search: gradient descent, meta-heuristic search.

Bandit. Multiple arms. Each arm has an expected reward, but it is unknown, with an unknown distribution. Maximize your reward within a fixed number of trials.

Simplest strategies. Two simplest strategies. Exploration-only: for T trials and K arms, try each arm T/K times. Problem? It wastes trials on suboptimal arms. Exploitation-only: 1. try each arm once; 2. try the observed best arm T-K times. Problem? It risks committing to the wrong best arm. (Both are sketched in code below.)
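To make the two strategies concrete, here is a minimal Python sketch; the Bandit class and all names are illustrative assumptions, not from the course materials.

```python
import random

class Bandit:
    """K arms; pulling arm k returns a noisy reward with hidden mean means[k]."""
    def __init__(self, means, sigma=1.0):
        self.means, self.sigma = means, sigma

    def pull(self, k):
        return random.gauss(self.means[k], self.sigma)

def exploration_only(bandit, K, T):
    # Try each arm T/K times: wastes trials on clearly suboptimal arms.
    return sum(bandit.pull(k) for k in range(K) for _ in range(T // K))

def exploitation_only(bandit, K, T):
    # Try each arm once, then commit to the observed best: risks the wrong arm.
    first = [bandit.pull(k) for k in range(K)]
    best = max(range(K), key=lambda k: first[k])
    return sum(first) + sum(bandit.pull(best) for _ in range(T - K))
```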

ε-greedy. Balance exploration and exploitation: with probability ε, try a random arm; with probability 1-ε, try the best arm so far. ε controls the balance.
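A minimal ε-greedy sketch, reusing the hypothetical Bandit class above; Q[k] holds the empirical mean reward of arm k and N[k] its pull count.

```python
import random

def epsilon_greedy(bandit, K, T, eps=0.1):
    Q, N, total = [0.0] * K, [0] * K, 0.0
    for _ in range(T):
        if random.random() < eps:
            k = random.randrange(K)                # explore: a random arm
        else:
            k = max(range(K), key=lambda a: Q[a])  # exploit: best arm so far
        r = bandit.pull(k)
        N[k] += 1
        Q[k] += (r - Q[k]) / N[k]                  # incremental mean update
        total += r
    return total
```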

Softmax. Balance exploration and exploitation: choose arm k with probability P(k) = e^{Q(k)/τ} / Σ_{i=1..K} e^{Q(i)/τ}, where Q(k) is the average reward of arm k (Boltzmann exploration). τ controls the balance.
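A sketch of the softmax choice rule under the formula above; for numerical safety it subtracts max(Q) before exponentiating, which leaves the probabilities unchanged.

```python
import math, random

def softmax_pick(Q, tau):
    """Pick arm k with probability exp(Q[k]/tau) / sum_i exp(Q[i]/tau)."""
    m = max(Q)                                      # shift for numerical stability
    weights = [math.exp((q - m) / tau) for q in Q]
    return random.choices(range(len(Q)), weights=weights, k=1)[0]
```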

Upper-confidence bound. Balance exploration and exploitation: choose the arm with the largest value of average reward plus an upper confidence bound, e.g. UCB1 picks arg max_k Q(k) + sqrt(2 ln n / n_k), where n is the total number of trials so far and n_k the number of times arm k was tried. [figure: the reward distribution of one arm, with its average Q(k) and its upper confidence bound marked; the tail above the bound has probability 2.5%]
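A sketch of arm selection under the standard UCB1 rule (Auer et al.); Q and N are as in the ε-greedy sketch, and t is the total number of pulls so far.

```python
import math

def ucb1_pick(Q, N, t):
    for k, n in enumerate(N):
        if n == 0:
            return k  # each arm must be tried once before its bound is defined
    # Average reward plus an exploration bonus that shrinks as N[k] grows.
    return max(range(len(Q)),
               key=lambda k: Q[k] + math.sqrt(2 * math.log(t) / N[k]))
```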

Monte-Carlo Tree Search. Gradually grow the search tree by iterating tree-walks built from these blocks: select the next action (bandit phase); add a node (grow a leaf of the search tree); select the next action again (random phase, roll-out); compute the instant reward (evaluate); update information in the visited nodes (propagate). Returned solution: the path visited most often. (Kocsis & Szepesvári, 06) [figure: the explored search tree, with the bandit-based phase inside the tree, a new node added at a leaf, and the random phase below it]

Monte-Carlo Tree Search. Example: picture from https://en.wikipedia.org/wiki/Monte_Carlo_tree_search#cite_note-kocsis-szepesvari-5 How to select the leaf? As a bandit; then evaluate it by a random rollout.

Monte-Carlo Tree Search. Code from http://mcts.ai/code/java.html
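The referenced code is in Java; as a language-neutral companion, here is a compact single-agent UCT sketch under assumed problem interfaces (actions, result, is_terminal, reward are all hypothetical names). It follows the building blocks above: select, expand, roll out, propagate.

```python
import math, random

class Node:
    def __init__(self, state, parent=None, action=None):
        self.state, self.parent, self.action = state, parent, action
        self.children, self.untried = [], None     # untried actions, filled lazily
        self.visits, self.value = 0, 0.0

def uct_search(problem, root_state, iters=1000, c=math.sqrt(2)):
    root = Node(root_state)
    for _ in range(iters):
        node = root
        # 1. Select: descend by the UCB1 rule while the node is fully expanded.
        while node.untried == [] and node.children:
            node = max(node.children,
                       key=lambda ch: ch.value / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))
        # 2. Expand: grow one new leaf of the search tree.
        if node.untried is None:
            node.untried = list(problem.actions(node.state))
        if node.untried:
            a = node.untried.pop(random.randrange(len(node.untried)))
            child = Node(problem.result(node.state, a), node, a)
            node.children.append(child)
            node = child
        # 3. Random phase: roll out to a terminal state, compute instant reward.
        s = node.state
        while not problem.is_terminal(s):
            s = problem.result(s, random.choice(list(problem.actions(s))))
        r = problem.reward(s)
        # 4. Propagate: update information in all visited nodes.
        while node is not None:
            node.visits += 1
            node.value += r
            node = node.parent
    # Returned solution: the action whose subtree was visited most often.
    return max(root.children, key=lambda ch: ch.visits).action
```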

Monte-Carlo Tree Search: optimal? Yes, after infinitely many tree-walks. Compared with alpha-beta pruning: no need for a heuristic evaluation function.

Monte-Carlo Tree Search. Improving the random rollout. Monte-Carlo-based (Brügmann 93): 1. Until the goban is filled, add a stone (black or white, in turn) at a uniformly selected empty position. 2. Compute r = Win(black). 3. The outcome of the tree-walk is r. Improvements? Put stones randomly in the neighborhood of a previous stone; put stones matching patterns (prior knowledge); put stones optimizing a value function (Silver et al. 07).

General search

Greedy idea in continuous space. Suppose we want to site three airports in Romania: 6-D state space defined by (x1, y1), (x2, y2), (x3, y3); objective function f(x1, y1, x2, y2, x3, y3) = sum of squared distances from each city to the nearest airport. [figure: map of Romania with road distances between cities]

Greedy idea in continuous space: discretize and use hill climbing. [figure: the same map of Romania]

Greedy idea in continuous space: gradient descent. 6-D state space defined by (x1, y1), (x2, y2), (x3, y3); objective function f(x1, y1, x2, y2, x3, y3) = sum of squared distances from each city to the nearest airport. Gradient methods compute ∇f = (∂f/∂x1, ∂f/∂y1, ∂f/∂x2, ∂f/∂y2, ∂f/∂x3, ∂f/∂y3) to increase/reduce f, e.g., by x ← x ± α∇f(x). A first-order method.
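A finite-difference sketch of this first-order update for the airport-siting objective; airport_cost and all other names are illustrative, and the slide's x ← x ± α∇f(x) becomes a minus here because we minimize the cost.

```python
def airport_cost(xs, cities):
    """Sum of squared distances from each city to its nearest airport.
    xs = [x1, y1, x2, y2, x3, y3]; cities = list of (x, y) pairs."""
    airports = [(xs[0], xs[1]), (xs[2], xs[3]), (xs[4], xs[5])]
    return sum(min((cx - ax) ** 2 + (cy - ay) ** 2 for ax, ay in airports)
               for cx, cy in cities)

def gradient_descent(f, x, alpha=0.01, steps=1000, h=1e-6):
    """x <- x - alpha * grad f(x), with the gradient from finite differences."""
    for _ in range(steps):
        fx = f(x)
        grad = []
        for i in range(len(x)):
            xp = list(x)
            xp[i] += h
            grad.append((f(xp) - fx) / h)
        x = [xi - alpha * g for xi, g in zip(x, grad)]
    return x

# e.g. gradient_descent(lambda xs: airport_cost(xs, cities), x0)
```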

Greedy idea in continuous space: gradient descent, continued. Sometimes we can solve ∇f(x) = 0 exactly (e.g., with one city). Newton-Raphson (1664, 1690) iterates x ← x − H⁻¹(x)∇f(x) to solve ∇f(x) = 0, where H_ij = ∂²f/∂x_i∂x_j. A second-order method. Taylor series: f(x) = f(a) + (x−a)f′(a) + ((x−a)²/2)f″(a) + ⋯ = Σ_{i=0}^{∞} ((x−a)^i / i!) f^(i)(a).
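A one-dimensional illustration of the Newton-Raphson iteration: in 1-D the Hessian reduces to f″, so x ← x − H⁻¹∇f becomes x ← x − f′(x)/f″(x). The example function is illustrative.

```python
def newton_raphson(g, dg, x, iters=20):
    """Solve g(x) = 0; to optimize f, pass g = f' and dg = f''."""
    for _ in range(iters):
        x = x - g(x) / dg(x)
    return x

# Example: minimize f(x) = (x - 3)^2 by solving f'(x) = 2(x - 3) = 0.
x_star = newton_raphson(lambda x: 2 * (x - 3), lambda x: 2.0, x=0.0)  # -> 3.0
```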

Greedy idea. First- and second-order methods may not find globally optimal solutions; they work for convex functions. [figure: a 1-D objective function over the state space, showing the current state, a "flat" local maximum, a shoulder, a local maximum, and the global maximum]

Meta-heuristics: problem-independent, black-box, zeroth-order methods... and usually inspired by natural phenomena.

Simulated annealing: the temperature goes from high to low. At high temperature, form the rough shape; at low temperature, polish the details.

Simulated annealing. Idea: escape local maxima by allowing some bad moves, but gradually decrease their size and frequency.

function SIMULATED-ANNEALING(problem, schedule) returns a solution state
  inputs: problem, a problem; schedule, a mapping from time to temperature
  local variables: current, a node; next, a node; T, a temperature controlling the probability of downward steps
  current ← MAKE-NODE(INITIAL-STATE[problem])
  for t ← 1 to ∞ do
    T ← schedule[t]
    if T = 0 then return current
    next ← a randomly selected successor of current
    ΔE ← VALUE[next] − VALUE[current]
    if ΔE > 0 then current ← next
    else current ← next only with probability e^{ΔE/T}

The neighborhood range shrinks with T; the probability of accepting a bad solution decreases with T.
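A direct Python transcription of the pseudocode above; a sketch in which the successor generator, value function, and cooling schedule are placeholders the caller supplies.

```python
import math, random

def simulated_annealing(initial, successors, value, schedule, max_steps=100000):
    current = initial
    for t in range(1, max_steps + 1):
        T = schedule(t)
        if T <= 0:
            return current
        nxt = random.choice(successors(current))
        delta_e = value(nxt) - value(current)
        # Uphill moves are always taken; downhill only with probability e^{dE/T}.
        if delta_e > 0 or random.random() < math.exp(delta_e / T):
            current = nxt
    return current

# Example cooling schedule (illustrative): exponential decay of the temperature.
cooling = lambda t: 100.0 * (0.95 ** t)
```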

Simulated annealing: a demo. Graphic from http://en.wikipedia.org/wiki/Simulated_annealing

Local beam search. Idea: keep k states instead of 1; choose the top k of all their successors. Not the same as k searches run in parallel! Searches that find good states recruit other searches to join them. Problem: quite often, all k states end up on the same local hill. Idea: choose k successors randomly, biased towards good ones (sketched below). Observe the close analogy to natural selection!
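A sketch of the randomized variant (stochastic beam search): resample the beam with probability biased towards good successors. The successors and value functions are placeholders, and values are assumed non-negative.

```python
import random

def stochastic_beam_search(beam, successors, value, iterations=100):
    """Keep k states; resample k successors with probability proportional to value."""
    k = len(beam)
    for _ in range(iterations):
        pool = [s2 for s in beam for s2 in successors(s)]
        if not pool:
            break
        weights = [max(value(s), 1e-9) for s in pool]  # bias towards good states
        beam = random.choices(pool, weights=weights, k=k)
    return max(beam, key=value)
```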

Genetic algorithm: a simulation of Darwin's evolutionary theory; over-reproduction with diversity, plus natural selection. [figure: the evolutionary cycle: random initialization → parent population → reproduction → offspring solutions → evaluation → evaluated offspring solutions → selection → parent population]

Genetic algorithm. Encode a solution as a vector.
1: Pop ← n randomly drawn solutions from X
2: for t = 1, 2, ... do
3:   Pop_m ← {mutate(s) : ∀s ∈ Pop}, the mutated solutions
4:   Pop_c ← {crossover(s1, s2) : ∃s1, s2 ∈ Pop_m}, the recombined solutions
5:   evaluate every solution in Pop_c by f(s) (∀s ∈ Pop_c)
6:   Pop_s ← selected solutions from Pop and Pop_c
7:   Pop ← Pop_s
8:   terminate if a stopping criterion is met
9: end for

Genetic algorithm. [figure: digit strings such as 24748552 with their fitness values and selection percentages, flowing through the stages Fitness, Selection, Pairs, Cross-Over, and Mutation; two board positions are combined by crossover] GAs require states encoded as strings (GPs use programs). Crossover helps iff substrings are meaningful components.

Example. Encode a solution as a vector of length n; each element of the vector can be chosen from {1,...,V}. Parameters: mutation probability pm, crossover probability pc.
1: Pop = randomly generate n solutions from {1,...,V}^n
2: for t = 1, 2, ... do
3:   Pop_m = empty set, Pop_c = empty set
4:   for i = 1 to n
5:     let x be the i-th solution in Pop
6:     for j = 1 to n: with probability pm, change x_j to a random value from {1,...,V}
7:     add x into Pop_m
8:   end for
9:   for i = 1 to n
10:    let x be the i-th solution in Pop_m
11:    let x′ be a randomly selected solution from Pop_m
12:    with probability pc, exchange a random part of x with x′
13:    add x into Pop_c
14:  end for
15:  evaluate the solutions in Pop_c; select the best n solutions from Pop and Pop_c into Pop
16:  terminate if a good solution is found
17: end for
(A runnable sketch follows.)
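A runnable Python sketch of this scheme; the fitness function f is a placeholder (higher is better), and the population size and solution length, both n in the slide, are separated here for clarity.

```python
import random

def genetic_algorithm(f, pop_size, length, V, pm=0.05, pc=0.5, generations=100):
    """Maximize f over vectors with entries in {1, ..., V}."""
    pop = [[random.randint(1, V) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Mutation: each position changes to a random value with probability pm.
        pop_m = [[random.randint(1, V) if random.random() < pm else v for v in s]
                 for s in pop]
        # Crossover: with probability pc, exchange a random part with a random mate.
        pop_c = []
        for s in pop_m:
            child = list(s)
            if random.random() < pc:
                mate = random.choice(pop_m)
                cut = random.randrange(length)
                child[cut:] = mate[cut:]
            pop_c.append(child)
        # Selection: keep the best pop_size solutions from parents and offspring.
        pop = sorted(pop + pop_c, key=f, reverse=True)[:pop_size]
    return pop[0]
```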

An evolution of virtual life.

Properties of meta-heuristics. Zeroth-order: they do not need differentiable functions. Convergence: they will find an optimal solution if P(x → x1 → ... → xk → x*) > 0, or P(x* | x) > 0. A missing link: these methods jump from observation directly to simulation, skipping the usual route of observation → principle → application.

Properties of meta-heuristics. [figure: a timeline of meta-heuristics spanning the 1960s to the 2010s, including genetic algorithms, evolutionary programming, evolutionary strategies, simulated annealing, tabu search, artificial immune systems, ant colony optimization algorithms, particle swarm optimization algorithms, artificial bee colony algorithms, intelligent water drops algorithm, bat algorithm, brainstorm algorithm, fireworks algorithm, cultural algorithms, memetic algorithms, differential evolution, river formation dynamics, gravitational search algorithm, and the grey wolf optimizer]

Example: hard to apply traditional optimization methods, but easy to test a given solution. Representation: parameterize the design as a vector of parameters x_i. Fitness: f(x_i), tested by simulation/experiment.

Example: "this nose... has been newly developed... using the latest analytical technique (i.e. genetic algorithms)". "N700 cars save 19% energy... 30% increase in the output... This is a result of adopting the... nose shape."

Example: the NASA ST5 antenna. Hard to apply traditional optimization methods, but easy to test a given solution. The conventionally designed quadrifilar helix antennas (QHAs) reached 38% efficiency; the evolved antennas resulted in 93% efficiency.

Different Environment Properties

Nondeterministic actions. Suppose that we introduce nondeterminism in the form of a powerful but erratic vacuum cleaner. In the erratic vacuum world, the Suck action works as follows: when applied to a dirty square, the action cleans the square and sometimes cleans up dirt in an adjacent square too; when applied to a clean square, the action sometimes deposits dirt on the carpet. [figure: the eight states of the vacuum world, numbered 1-8] Almost all real-world problems are nondeterministic. How do you solve this problem?

AND-OR tree search. OR node: different actions (as usual). AND node: different transitions. [figure: part of the AND-OR search tree for the erratic vacuum world starting at state 1: Suck leads to states 5 or 7, Right leads to state 2; branches terminate at nodes marked GOAL or LOOP] A solution is not a path but a tree.

Depth-first AND-OR tree search.

function AND-OR-GRAPH-SEARCH(problem) returns a conditional plan, or failure
  return OR-SEARCH(problem.INITIAL-STATE, problem, [])

function OR-SEARCH(state, problem, path) returns a conditional plan, or failure
  if problem.GOAL-TEST(state) then return the empty plan
  if state is on path then return failure
  for each action in problem.ACTIONS(state) do
    plan ← AND-SEARCH(RESULTS(state, action), problem, [state | path])
    if plan ≠ failure then return [action | plan]
  return failure

function AND-SEARCH(states, problem, path) returns a conditional plan, or failure
  for each s_i in states do
    plan_i ← OR-SEARCH(s_i, problem, path)
    if plan_i = failure then return failure
  return [if s_1 then plan_1 else if s_2 then plan_2 else ... if s_{n−1} then plan_{n−1} else plan_n]
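A Python sketch of this pseudocode. A conditional plan is represented here as a list [action, {state: subplan, ...}] with [] as the empty plan; that representation, and the problem interface, are assumptions for illustration.

```python
def and_or_graph_search(problem):
    return or_search(problem.initial_state, problem, [])

def or_search(state, problem, path):
    if problem.goal_test(state):
        return []                                  # the empty plan
    if state in path:
        return None                                # failure: a loop
    for action in problem.actions(state):
        plan = and_search(problem.results(state, action), problem, [state] + path)
        if plan is not None:
            return [action, plan]
    return None

def and_search(states, problem, path):
    # Build "if s_1 then plan_1 else if s_2 then plan_2 ..." as a dict.
    plans = {}
    for s in states:
        plan = or_search(s, problem, path)
        if plan is None:
            return None                            # one branch fails => AND fails
        plans[s] = plan
    return plans
```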

Search with no observations: search in belief states (in the agent's mind). [figure: the reachable portion of the belief-state space for the deterministic, sensorless vacuum world; each belief state is a set of physical states, such as {1,3,5,7}, {4,5,7,8}, {5}, or {8}, and the actions L, R, S map belief states to belief states]
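Since this sensorless vacuum world is deterministic, belief-state search is just graph search over sets of physical states. A minimal BFS sketch; the per-state transition function step is a hypothetical encoding of L/R/S, not taken from the slides.

```python
from collections import deque

def belief_state_bfs(initial_belief, actions, step, goal_states):
    """BFS in belief space; a belief state is a frozenset of physical states."""
    start = frozenset(initial_belief)
    parents = {start: None}
    frontier = deque([start])
    while frontier:
        b = frontier.popleft()
        if b <= goal_states:                       # every possible state is a goal
            plan = []
            while parents[b] is not None:
                b, a = parents[b]
                plan.append(a)
            return plan[::-1]
        for a in actions:
            nb = frozenset(step(s, a) for s in b)  # image of b under action a
            if nb not in parents:
                parents[nb] = (b, a)
                frontier.append(nb)
    return None                                    # no conformant plan exists
```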