Learning to Shoot a Goal Lecture 8: Learning Models and Skills

How do we acquire skill at shooting goals?
CS 344R/393R: Robotics
Benjamin Kuipers

Learning to Shoot a Goal
The robot needs to shoot the ball in the goal. How can it learn skill from practice?
Parameterize the task and result:
  x describes the egocentric ball position
  y is a parameter of the kick action
  z describes where the ball goes
Learn a forward model: $z = a + bx + cy$
A forward model predicts the result of an action.
An inverse model predicts the action that will give the result.

Practice, Practice, Practice
To learn a predictive model $z = a + bx + cy$:
  Collect a lot of data $(x_i, y_i, z_i)$.
  Find a, b, and c to best fit the data.
To decide how to shoot, invert the model to get $y = (z - a - bx)/c$.
More generally:
  Learn result = f(situation, action)
  Invert to get action = g(situation, result)
Don't ignore the uncertainty.

Regression
Regression finds the best function from a given class to fit the available data.
Linear regression finds a linear function, like $z = a + bx + cy$.
Other regressions: polynomial, logistic, memory-based, kernel-based, etc.
We can add terms like $x^2$, $y^2$, and $xy$ and still use linear regression, because the model stays linear in its coefficients:
  $z = a + bx + dx^2 + cy + ey^2 + fxy$

Simple example: $z = a + bx$
The residual is the remaining error.
Stochastic equation: $z_i = a + bx_i + \varepsilon_i$
Set a and b to ensure that $E[\varepsilon] = 0$ and minimize $E[\varepsilon^2]$.
Not quite identical to: $\varepsilon_i \sim N(0, \sigma)$ with minimal $\sigma$.
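The forward and inverse models described above can be sketched in a few lines. This is only an illustration: the coefficient values A, B, C and the function names `forward` and `inverse` are hypothetical, and real coefficients would come from regression on practice data.

```python
# Hypothetical coefficients, for illustration only; real values are learned.
A, B, C = 0.1, 0.8, -1.5

def forward(x, y, a=A, b=B, c=C):
    """Forward model: predict where the ball goes (z) given ball position x and kick parameter y."""
    return a + b * x + c * y

def inverse(x, z, a=A, b=B, c=C):
    """Inverse model: the kick parameter y predicted to send the ball to z from position x."""
    if abs(c) < 1e-9:
        # If c is (nearly) zero, y has no effect on z and the model cannot be inverted.
        raise ValueError("c is ~0: the kick parameter does not influence z")
    return (z - a - b * x) / c

x_seen, z_goal = 0.3, 0.0          # made-up situation and desired result
y_kick = inverse(x_seen, z_goal)   # action chosen by the inverse model
z_check = forward(x_seen, y_kick)  # forward model applied to that action
```

Feeding the inverse model's answer back through the forward model should reproduce the requested z, up to floating-point rounding. The guard on c reflects the fact that the inversion divides by c.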

First, the simple case
Suppose we have data points $(x_i, z_i)$ and we want to learn a model $z = a + bx + \varepsilon$.
We need to find a and b to minimize the squared error:
  $$E = \sum_i (z_i - a - b x_i)^2$$
Shift the mean to the origin: $(x_i - \bar{x},\; z_i - \bar{z})$
First we find b such that $(z - \bar{z}) = b\,(x - \bar{x})$.
Once we have b, we will get $a = \bar{z} - b\bar{x}$.

After shifting the data to the origin
Given the data set $(x_i - \bar{x},\; z_i - \bar{z})$, look for the best fit b for $(z - \bar{z}) = b\,(x - \bar{x})$ that minimizes
  $$E = \sum_i \bigl(z_i - \bar{z} - b\,(x_i - \bar{x})\bigr)^2$$
Look for a local minimum of E:
  $$\frac{dE}{db} = -2 \sum_i \bigl(z_i - \bar{z} - b\,(x_i - \bar{x})\bigr)(x_i - \bar{x}) = 0$$
  $$= -2 \Bigl( \sum_i (z_i - \bar{z})(x_i - \bar{x}) - b \sum_i (x_i - \bar{x})^2 \Bigr) = 0$$
  $$b = \frac{\sum_i (z_i - \bar{z})(x_i - \bar{x})}{\sum_i (x_i - \bar{x})^2}$$
It's a minimum because
  $$\frac{d^2E}{db^2} = +2 \sum_i (x_i - \bar{x})^2 > 0$$

Summary
Given data points $(x_i, z_i)$, the best fitting line $z = a + bx$ is given by
  $$b = \frac{\sum_i (z_i - \bar{z})(x_i - \bar{x})}{\sum_i (x_i - \bar{x})^2}, \qquad a = \bar{z} - b\bar{x}$$

Up to more dimensions
Suppose our dataset is $(x_i, y_i, z_i)$. (For simplicity, assume data centered at origin.)
We want to fit the plane $z = bx + cy$.
The error term is
  $$E = \sum_i (z_i - b x_i - c y_i)^2$$
Find a minimum:
  $$\frac{\partial E}{\partial b} = -2 \sum_i x_i (z_i - b x_i - c y_i) = 0$$
  $$\frac{\partial E}{\partial c} = -2 \sum_i y_i (z_i - b x_i - c y_i) = 0$$
Solve for b and c:
  $$b \sum_i x_i^2 + c \sum_i x_i y_i = \sum_i x_i z_i$$
  $$b \sum_i x_i y_i + c \sum_i y_i^2 = \sum_i y_i z_i$$

Caution!
It's easy to listen to this, and even read it carefully, and think it all makes sense. But you still don't understand it!
Your dataset $(x_i, y_i, z_i)$ will not be centered around the origin.
You need to fit the plane $z = a + bx + cy$.
Work through the math for this, by hand.

Cleaning the data
Given the data $(x_i, y_i, z_i)$, find the best-fitting plane $z = a + bx + cy$.
Compute the residuals: $r_i = z_i - a - b x_i - c y_i$
The mean of the $r_i$ should be zero.
Compute the standard deviation $\sigma_r$.
A data point is an outlier if $|r_i| > 3\sigma_r$.
Discard the outliers.
Recompute the regression, using only inliers.
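The fit-then-clean procedure above can be sketched as follows. This is a minimal sketch under stated assumptions, not the course's implementation: it handles the intercept by centering the data on its means, solves the 2x2 normal equations by Cramer's rule, and then applies the 3σ rule once. The function names `fit_plane` and `fit_plane_robust` and the synthetic dataset are made up for illustration.

```python
from math import sqrt

def fit_plane(data):
    """Least-squares fit of z = a + b*x + c*y to a list of (x, y, z) tuples."""
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    mz = sum(p[2] for p in data) / n
    # Sums over the centered data; these are the terms of the normal equations.
    sxx = sum((x - mx) ** 2 for x, y, z in data)
    syy = sum((y - my) ** 2 for x, y, z in data)
    sxy = sum((x - mx) * (y - my) for x, y, z in data)
    sxz = sum((x - mx) * (z - mz) for x, y, z in data)
    syz = sum((y - my) * (z - mz) for x, y, z in data)
    det = sxx * syy - sxy ** 2
    if abs(det) < 1e-12:
        raise ValueError("x and y are collinear; the plane is not determined")
    b = (sxz * syy - syz * sxy) / det   # Cramer's rule on the 2x2 system
    c = (syz * sxx - sxz * sxy) / det
    a = mz - b * mx - c * my            # intercept recovered from the means
    return a, b, c

def fit_plane_robust(data, k=3.0):
    """Fit, discard points with |r_i| > k * sigma_r, then refit on the inliers."""
    a, b, c = fit_plane(data)
    residuals = [z - a - b * x - c * y for x, y, z in data]
    # The residual mean is zero for a fit with an intercept, so this is sigma_r.
    sigma = sqrt(sum(r * r for r in residuals) / len(residuals))
    inliers = [p for p, r in zip(data, residuals) if abs(r) <= k * sigma]
    return fit_plane(inliers), inliers

# Synthetic data: an exact plane z = 1 + 2x - 3y plus one gross outlier.
data = [(x, y, 1 + 2 * x - 3 * y) for x in range(5) for y in range(5)]
data.append((2, 2, 1 + 2 * 2 - 3 * 2 + 50))
(a, b, c), inliers = fit_plane_robust(data)
```

On this synthetic set the single outlier sits far beyond 3σ_r, so the second fit sees only the 25 exact points. With real, noisy data the cleaning pass may need to be repeated, which this sketch does not do.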

Discarding Outliers
Why is it OK to discard outliers if $|r_i| > 3\sigma_r$? Outliers are still data, aren't they?
A model explains data by saying that some causes are relevant, and others are negligible.
If a data point has very low probability according to the model (for Gaussian residuals, $|r_i| > 3\sigma_r$ occurs with probability of about 0.003), it is more likely explained as a modeling error than as an unlikely outcome.
An unlikely outcome isn't helpful in fitting model parameters, anyway.

Representing a line
  $x \cos\theta + y \sin\theta = r$
  $(x, y) \cdot (\cos\theta, \sin\theta) = r$
For fixed $(r, \theta)$, represent all points $(x, y)$ on a given line.
For fixed $(x, y)$, represent all lines $(r, \theta)$ through $(x, y)$.

Hough Transform
Hough Space: $(r, \theta)$ representations.
Each observed point $(x, y)$ votes for all lines $(r, \theta)$ passing through it.

Votes From Three Points
Each point contributes a curve of votes.

Lines Get the Most Votes
Votes from three very strong lines.
Identify local max in Hough Space to define a line in Image Space.
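The voting scheme above can be sketched with a dictionary of accumulator cells. This is an illustrative sketch only: the function name `hough_lines`, the cell sizes (1 degree in θ, `r_step` in r), and the synthetic points are all assumptions, not values from the lecture.

```python
from math import cos, sin, pi
from collections import Counter

def hough_lines(points, n_theta=180, r_step=1.0):
    """Accumulate votes in (r, theta) space; keys are (r_bin, theta_bin) cells."""
    votes = Counter()
    for x, y in points:
        # Each point votes once per theta cell, for the line at that angle
        # passing through it: r = x cos(theta) + y sin(theta).
        for t in range(n_theta):
            theta = pi * t / n_theta            # theta sampled over [0, pi)
            r = x * cos(theta) + y * sin(theta)
            votes[(round(r / r_step), t)] += 1
    return votes

# Synthetic image points: ten on the horizontal line y = 5, plus two strays.
points = [(10 * i, 5) for i in range(10)] + [(3, 40), (70, 20)]
(r_bin, t_bin), count = hough_lines(points).most_common(1)[0]
```

For this input the winning cell should correspond to r = 5 and θ = 90 degrees, collecting one vote from each of the ten collinear points. Coarser cells gather more votes per cell but describe the line less precisely.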

Hough Transform Issues
Hough Transform works with any parameterized model: circle, rectangle, etc.
But in a high-dimensional Hough Space, each cell gets few votes.
To maximize votes, use large cells. But they give low resolution model descriptions.

RANSAC
Random Sample Consensus
A method for robust model-fitting.
Separating inliers from outliers.

RANSAC to Find Line Models
Repeat k times:
  Select 2 points from data, to define a model M.
  Collect all points from data within tolerance t of the model M. These are the inliers.
  If #inliers < d, give up on model M.
  Find the model M that best fits the inliers. In this case, by linear regression.
  Record the error of the inliers from model M.
Return the model M with the lowest error.

RANSAC Pros and Cons
Very robust search for models.
The model classifies data as inliers and outliers.
Can estimate probability of failure as a function of k. But no upper bound.
Can find multiple models by deleting data explained by current best model. But this can fail if current best model is bad.

Back to Learning a Skill!
Remember:
  x describes the egocentric ball position
  y is a parameter of the kick action
  z describes where the ball goes
Learn a forward model: $z = a + bx + cy$
Practice to collect the data $(x_i, y_i, z_i)$.
Do regression to find a, b, and c.
To decide how to shoot, invert the model to get $y = (z - a - bx)/c$.

Learning to Shoot a Goal
Ball position x. Goal position z. Kick param y.
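The RANSAC loop for line models described above can be sketched as follows. This is a sketch under assumptions rather than a reference implementation: lines are written z = a + bx, the tolerance t is measured vertically (in z), the recorded error is the mean squared residual of the inliers, and the names `fit_line` and `ransac_line`, the default parameters, and the synthetic data are hypothetical.

```python
import random

def fit_line(pts):
    """Least-squares line z = a + b*x through (x, z) points; None if degenerate."""
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    mz = sum(z for _, z in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    if sxx == 0:
        return None                      # all x equal: slope undefined
    b = sum((x - mx) * (z - mz) for x, z in pts) / sxx
    return mz - b * mx, b

def ransac_line(data, k=100, t=0.5, d=5, seed=0):
    """Return (error, (a, b), inliers) for the lowest-error model, or None."""
    rng = random.Random(seed)            # seeded so runs are repeatable
    best = None
    for _ in range(k):
        model = fit_line(rng.sample(data, 2))   # 2 points define a candidate M
        if model is None:
            continue
        a, b = model
        inliers = [(x, z) for x, z in data if abs(z - a - b * x) <= t]
        if len(inliers) < d:
            continue                     # too little support: give up on M
        refit = fit_line(inliers)        # best fit to the inliers by regression
        if refit is None:
            continue
        a, b = refit
        err = sum((z - a - b * x) ** 2 for x, z in inliers) / len(inliers)
        if best is None or err < best[0]:
            best = (err, (a, b), inliers)
    return best

# Synthetic data: 20 points on z = 2 + 0.5x, plus 5 scattered outliers.
data = [(x, 2 + 0.5 * x) for x in range(20)]
data += [(3, 20), (7, -5), (12, 30), (15, -10), (18, 25)]
err, (a, b), inliers = ransac_line(data)
```

Any candidate that includes an outlier among its inliers ends up with a nonzero error after refitting, so the lowest-error model here is one supported by the 20 clean points alone.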

Egocentric Ball Position: x
I assume that you can position your robot so that the ball position can be described by a single parameter x.
You can use $(x_1, x_2)$, but more variables requires more data.
When you're close enough to kick the ball, you'll be too close to be sure where it is!
Pick a position farther away, for accurate x.

Kick Action parameter: y
The built-in kicks have no parameters.
Your kick action includes a step or two to approach the ball, then a built-in kick.
Embed the parameter y in the approach.
Try various parameterizations:
  Sideways component of the walk
  Turning while walking forward
  etc.
Avoid gaps in the search space of y values.

Where the ball goes: z
Track the ball after the kick. Keep your eye on the ball!
After a suitable period of time (a second or so?), record the direction the ball went.
Body-centered egocentric frame of reference.

Planning the shot
You have a forward model: $z = a + bx + cy$
Invert the model: $y = (z - a - bx)/c$
  z is where you want the ball to go
  x is where you see the ball right now
  a, b, c have been learned
Compute y: how to control the kick action.
Q: Would it help to have a Bayesian model of the distribution $p(z \mid x, y)$?

Next
Kalman filters: tracking dynamic systems.
Extended Kalman filters: handling nonlinearity by local linearization.
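Shot planning as described above might look like the sketch below. Two things here are assumptions added for illustration and are not part of the lecture's method: the kick parameter is clamped to the range of y values actually practiced, and a band of plus or minus 3σ_r around the predicted z is reported as a rough indication of uncertainty. The function name `plan_kick` and all numeric values are hypothetical.

```python
def plan_kick(x, z_target, a, b, c, y_range, sigma_r):
    """Return (y, z_low, z_high): kick parameter and a rough band on where the ball goes."""
    if abs(c) < 1e-9:
        raise ValueError("c is ~0: the kick parameter does not influence z")
    y = (z_target - a - b * x) / c          # inverse model
    y_min, y_max = y_range
    y = min(max(y, y_min), y_max)           # assumption: stay inside practiced y values
    z_pred = a + b * x + c * y              # forward model for the action actually used
    # Assumption: residual spread sigma_r from the regression gives a +/-3 sigma band.
    return y, z_pred - 3 * sigma_r, z_pred + 3 * sigma_r

# Made-up learned coefficients and situation.
a, b, c = 0.1, 0.8, -1.5
y1, lo1, hi1 = plan_kick(0.3, 0.0, a, b, c, y_range=(-0.5, 0.5), sigma_r=0.05)
y2, lo2, hi2 = plan_kick(0.3, 5.0, a, b, c, y_range=(-0.5, 0.5), sigma_r=0.05)
```

In the second call the requested target is out of reach for the practiced kicks, so y is clamped and the predicted band no longer contains the target. That is one simple way to notice that the linear model is being asked to extrapolate.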