
CS 188: Artificial Intelligence, Spring 2011
Advanced Applications: Robotics
Pieter Abbeel, UC Berkeley
A few slides from Sebastian Thrun, Dan Klein

Announcements
  Practice Final is out (optional)
    Similar extra credit system as for the practice midterm
  Contest (optional): deadline for final submission is tomorrow night, 11pm
  Project 5 (Classification) is out: due next week Friday

So Far: Mostly Foundational Methods

Advanced Applications

Robotic Control Tasks
  Perception / Tracking
    Where exactly am I?
    What's around me?
  Low-Level Control
    How to move the robot and/or objects from position A to position B
  High-Level Control
    What are my goals?
    What are the optimal high-level actions?

Robot folds towels
  [pile of 5 video]
  [Maitin-Shepard, Cusumano-Towner, Lei & Abbeel, 2010]

Low-Level Planning
  Low-level: move from configuration A to configuration B

A Simple Robot Arm
  Configuration Space
    What are the natural coordinates for specifying the robot's configuration?
    These are the configuration space coordinates
    Can't necessarily control all degrees of freedom directly
  Work Space
    What are the natural coordinates for specifying the effector tip's position?
    These are the work space coordinates

Coordinate Systems
  Workspace
    The world's (x, y) system
    Obstacles specified here
  Configuration space
    The robot's state
    Planning happens here
    Obstacles can be projected to here

Obstacles in C-Space
  What / where are the obstacles?
  Remaining space is free space

Example: Two-link manipulator
  [figure: two-link manipulator in the X-Y plane with link lengths d1, d2, joint angles α1, α2, and effector tip at (x, y)]

Example: Obstacles in C-Space
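The mapping from configuration space to work space for the two-link manipulator, and the projection of a workspace obstacle into C-space, can be sketched as follows. This is a minimal illustration, not the course's code: it assumes α2 is measured relative to the first link, that the base sits at the origin, that both links have length 1, and that the obstacle is a disc. The function names (`forward_kinematics`, `in_collision`, `cspace_grid`) are hypothetical.

```python
import math

def forward_kinematics(alpha1, alpha2, d1=1.0, d2=1.0):
    """Map joint angles (configuration space) to elbow and tip positions (work space).

    Assumption: alpha2 is measured relative to the direction of the first link.
    """
    elbow = (d1 * math.cos(alpha1), d1 * math.sin(alpha1))
    tip = (elbow[0] + d2 * math.cos(alpha1 + alpha2),
           elbow[1] + d2 * math.sin(alpha1 + alpha2))
    return elbow, tip

def in_collision(alpha1, alpha2, obstacle, samples=20):
    """Check whether either link touches a disc obstacle (cx, cy, r).

    Points are sampled along each link, so very small obstacles could be
    missed; this is an approximation chosen for brevity.
    """
    cx, cy, r = obstacle
    elbow, tip = forward_kinematics(alpha1, alpha2)
    base = (0.0, 0.0)
    for (ax, ay), (bx, by) in ((base, elbow), (elbow, tip)):
        for k in range(samples + 1):
            t = k / samples
            px, py = ax + t * (bx - ax), ay + t * (by - ay)
            if math.hypot(px - cx, py - cy) <= r:
                return True
    return False

def cspace_grid(obstacle, n=36):
    """Project the workspace obstacle into C-space on an n x n grid of joint angles."""
    step = 2 * math.pi / n
    return [[in_collision(i * step, j * step, obstacle) for j in range(n)]
            for i in range(n)]

grid = cspace_grid(obstacle=(1.5, 0.0, 0.2))
blocked = sum(cell for row in grid for cell in row)
print(f"{blocked} of {len(grid) ** 2} sampled configurations are in collision")
```

Cells marked `True` form the C-space obstacle; the remaining cells are free space, which is where planning would take place.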

Two-link manipulator

Probabilistic Roadmaps
  Idea: sample random points as nodes in a visibility graph
  This gives probabilistic roadmaps
    Very successful in practice
    Lets you add points where you need them
    If insufficient points, incomplete or weird paths
  Demo: http://www-inst.eecs.berkeley.edu/~cs188/fa08/demos/robot.html

Robotic Control Tasks
  Perception / Tracking
    Where exactly am I?
    What's around me?
  Low-Level Control
    How to move the robot and/or objects from position A to position B
  High-Level Control
    What are my goals?
    What are the optimal high-level actions?

Perception
  1. Find a point seen in two camera views
  2. Find 3D coordinates by finding the intersection of the rays
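Step 2 of the perception procedure above can be sketched in a few lines. With noisy measurements the two rays rarely intersect exactly, so this sketch returns the midpoint of the shortest segment between them, which is a common stand-in for "the intersection". The function name `triangulate` and the ray parameterization (camera center plus direction vector) are assumptions made for illustration; they are not taken from the slides.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def triangulate(c1, d1, c2, d2):
    """Estimate a 3D point from two camera rays.

    Each ray is given by a camera center c and a direction d (hypothetical
    parameterization). Returns the midpoint of the closest points on the
    two rays, which equals the intersection when the rays truly meet.
    """
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; no unique closest point")
    # Ray parameters that minimize the distance between the two rays
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    p1 = [ci + t1 * di for ci, di in zip(c1, d1)]
    p2 = [ci + t2 * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2 for u, v in zip(p1, p2)]

# Two cameras one unit apart, both looking at the point (0.5, 0, 2)
point = triangulate((0.0, 0.0, 0.0), (0.5, 0.0, 2.0),
                    (1.0, 0.0, 0.0), (-0.5, 0.0, 2.0))
print(point)
```

In practice the ray directions would come from the matched pixel in each view and the camera calibration, which this sketch takes as given.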



Motivating Example
  How do we specify a task like this?
  [demo: autorotate / tictoc]

Autonomous Helicopter Flight
  Key challenges:
    Track helicopter position and orientation during flight
    Decide on control inputs to send to helicopter

Autonomous Helicopter Setup
  Position
  On-board inertial measurement unit (IMU)
  Send out controls to helicopter

HMM for Tracking the Helicopter
  State: $s = (x, y, z, \phi, \theta, \psi, \dot{x}, \dot{y}, \dot{z}, \dot{\phi}, \dot{\theta}, \dot{\psi})$
  Measurements:
    3-D coordinates from vision, 3-axis magnetometer, 3-axis gyro, 3-axis accelerometer
  Transitions (dynamics): [time elapse update]
    $s_{t+1} = f(s_t, a_t) + w_t$
    [f encodes helicopter dynamics]
    [w is a probabilistic noise model]

Helicopter MDP
  State: $s = (x, y, z, \phi, \theta, \psi, \dot{x}, \dot{y}, \dot{z}, \dot{\phi}, \dot{\theta}, \dot{\psi})$
  Actions (control inputs):
    $a_{lon}$: Main rotor longitudinal cyclic pitch control (affects pitch rate)
    $a_{lat}$: Main rotor latitudinal cyclic pitch control (affects roll rate)
    $a_{coll}$: Main rotor collective pitch (affects main rotor thrust)
    $a_{rud}$: Tail rotor collective pitch (affects tail rotor thrust)
  Transitions (dynamics):
    $s_{t+1} = f(s_t, a_t) + w_t$
    [f encodes helicopter dynamics]
    [w is a probabilistic noise model]
  Can we solve the MDP yet?

Problem: What's the Reward?
  Rewards for hovering: [demo: hover]
  Rewards for Tic-Toc?
    Problem: what's the target trajectory?
    Just write it down by hand? [demo: bad]

Helicopter Apprenticeship?
  [demo: unaligned]

Probabilistic Alignment using a Bayes Net
  [figure: Bayes net over the intended trajectory, the expert demonstrations, and the time indices]
  Intended trajectory satisfies dynamics.
  Expert trajectory is a noisy observation of one of the hidden states.
    But we don't know exactly which one.
  [Coates, Abbeel & Ng, 2008]
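To give a feel for the alignment problem above (each expert sample corresponds to some unknown time index of the intended trajectory), here is a small sketch using dynamic time warping. This is a deliberately simpler, deterministic stand-in and not the Bayes-net inference described on the slide: it ignores the dynamics model and the noise model, and works on 1-D toy sequences. The function name `dtw_align` and the sample data are hypothetical.

```python
def dtw_align(reference, demo):
    """Align a demonstration to a reference sequence with dynamic time warping.

    Returns (total_cost, path), where path is a list of
    (reference_index, demo_index) pairs in increasing order.
    """
    n, m = len(reference), len(demo)
    inf = float("inf")
    # cost[i][j] = best cost of aligning reference[:i] with demo[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(reference[i - 1] - demo[j - 1])
            cost[i][j] = step + min(cost[i - 1][j - 1],  # advance both
                                    cost[i - 1][j],      # advance reference only
                                    cost[i][j - 1])      # advance demo only
    # Walk back from the end to recover which time indices were matched
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path

# Toy example: the demo lingers on some states of the reference
total, path = dtw_align([0, 1, 2, 3], [0, 0, 1, 2, 2, 3])
print(total, path)
```

The recovered path plays the role of the time indices: it says which point of the reference each demonstration sample is taken to observe. The probabilistic version on the slide additionally infers the intended trajectory itself, which this sketch treats as known.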

Alignment of Samples
  [demo: alignment]
  Result: inferred sequence is much cleaner!

Final Behavior
  [demo: airshow]

Advanced Applications

Quadruped
  Low-level control problem: moving a foot into a new location → similar search as for moving robot arm
  High-level control problem: where should we place the feet?
    Reward function $R(s) = w \cdot f(s)$ [25 features]
  [Kolter, Abbeel & Ng, 2008]

Without learning

Apprenticeship Learning
  Goal: learn reward function from expert demonstration
  Assume the reward is linear in the features: $R(s) = w \cdot f(s)$
  Get expert demonstrations
  Guess initial policy
  Repeat:
    Find w that makes the expert better than the policies found so far
    Solve MDP for new weights w to get a new policy
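The apprenticeship learning loop above can be sketched on a toy problem. Everything here is a simplifying assumption made for illustration: each "policy" is represented only by its feature vector, "solving the MDP" is replaced by picking the best of a small fixed set of candidates, and the "find w" step is a perceptron-style nudge instead of a proper optimization. The names `solve_mdp` and `learn_reward_weights`, and the feature values, are hypothetical.

```python
def dot(u, v):
    """Dot product, used for the linear reward w . f."""
    return sum(a * b for a, b in zip(u, v))

def solve_mdp(w, candidates):
    """Stand-in for an MDP solver: return the candidate with the highest reward w . f."""
    return max(candidates, key=lambda f: dot(w, f))

def learn_reward_weights(expert_f, candidates, step=0.1, max_iters=1000):
    """Learn weights w under which the expert looks at least as good as every candidate."""
    w = [0.0] * len(expert_f)
    found = [candidates[0]]  # guess an initial policy
    for _ in range(max_iters):
        # Nudge w toward making the expert better than the policies found so far
        for f in found:
            if dot(w, expert_f) <= dot(w, f):
                w = [wi + step * (e - fi) for wi, e, fi in zip(w, expert_f, f)]
        # Solve the (toy) MDP for the new weights
        new_policy = solve_mdp(w, candidates)
        if dot(w, new_policy) <= dot(w, expert_f):
            return w  # no candidate beats the expert under w
        if new_policy not in found:
            found.append(new_policy)
    return w

# Hypothetical two-feature example: (terrain slope, path length) per footstep plan
candidates = [(3.0, 1.0), (1.0, 2.0), (2.0, 1.5)]
expert = (1.0, 2.0)  # the expert accepts a longer path to avoid steep terrain
w = learn_reward_weights(expert, candidates)
print(w, solve_mdp(w, candidates))
```

In this toy run the learned weight on slope comes out negative, so planning with the learned reward reproduces the expert's preference for flatter terrain.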

With learned reward function

Advanced Applications

Autonomous Vehicles
  Autonomous vehicle slides adapted from Sebastian Thrun

Grand Challenge: Barstow, CA, to Primm, NV
  150-mile off-road robot race across the Mojave desert
  Natural and manmade hazards
  No driver, no remote control
  No dynamic passing

Inside an Autonomous Car
  [figure labels: GPS, GPS compass, 6 Computers, IMU, E-stop, 5 Lasers, Camera, Radar, Control Screen, Steering motor]

Readings: No Obstacles

Readings: Obstacles

Obstacle Detection
  Trigger if $|Z_i - Z_j| > 15$ cm for nearby $z_i$, $z_j$
  [figure: height difference ΔZ between nearby readings]
  Raw Measurements: 12.6% false positives

Probabilistic Error Model
  [figure: HMM with hidden states $x_t, x_{t+1}, x_{t+2}$, observations $z_t, z_{t+1}, z_{t+2}$, and GPS / IMU inputs at each time step]

HMMs for Detection
  Raw Measurements: 12.6% false positives
  HMM Inference: 0.02% false positives
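The raw trigger rule from the Obstacle Detection slide (before any HMM filtering) can be sketched as a pairwise height check. The 15 cm threshold is from the slide; the definition of "nearby" as a horizontal radius of 0.5 m, the point format (x, y, Z) in meters, and the function name `detect_obstacle` are assumptions made for this illustration.

```python
import math

def detect_obstacle(points, max_dz=0.15, radius=0.5):
    """Return True if any two nearby readings differ in height by more than max_dz.

    points: list of (x, y, Z) laser readings in meters.
    radius: assumed horizontal distance within which two readings count as nearby.
    """
    for i, (xi, yi, zi) in enumerate(points):
        for xj, yj, zj in points[i + 1:]:
            nearby = math.hypot(xi - xj, yi - yj) <= radius
            if nearby and abs(zi - zj) > max_dz:
                return True
    return False

flat_road = [(0.0, 0.0, 0.00), (0.2, 0.0, 0.02), (0.4, 0.0, 0.01)]
with_rock = [(0.0, 0.0, 0.00), (0.2, 0.0, 0.30), (0.4, 0.0, 0.01)]
print(detect_obstacle(flat_road), detect_obstacle(with_rock))
```

A rule like this fires on any single pair of readings, so errors in the estimated pose of the vehicle translate directly into spurious height differences. That is the kind of error the probabilistic model on the slides is meant to account for, which is consistent with the drop in false positives reported there.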