Model learning for robot control: a survey

Model learning for robot control: a survey
Duy Nguyen-Tuong, Jan Peters, 2011
Presented by Evan Beachly

Motivation
- Robots that can learn how their motors move their body
- Complexity
- Unanticipated environments
- Degradation
Images: Boston Dynamics Atlas, Boston Dynamics Big Dog, Ascending Technologies Hummingbird

Outline
- Background: Controllers
- Modelling
  - Model types
  - Learning architectures
- Challenges of applying machine learning to robotics
- Machine learning methods
- 3 examples of ways model learning has been applied to robotics
  - Simulation-based optimization
  - Approximation-based inverse dynamics control
  - Learning operational space control
- Future directions

Background: Controllers

Open Loop Control
- The controller applies a predetermined action without measuring the outcome.
- Example: microwave oven

Closed Loop Control
- The controller measures the system's output and feeds it back to adjust the action.
- Examples: thermostat, cruise control

Implicit Models
- A model describes how the system is expected to react.
- It might be implicitly coded into the rules of the controller.

Explicit Models
- We will discuss more advanced control algorithms that explicitly model the system's behavior.

Modelling

Preview: Examples of Model Learning Controllers

Model Type | Direct (learning architecture)                                                     | Indirect                 | Distal Teacher
Forward    | Self-tuning regulator; Model Reference Adaptive Control; Model Predictive Control | N/A                      | N/A
Inverse    | Arm joint torque controller                                                        | Feedback error learning  | N/A
Mixed      | Multi-module switching controller                                                  | Gated network of experts | Distal teacher
Multi-Step | ARX, ARMAX variants                                                                | N/A                      | N/A

Model Types
- Forward: given the current state and an action, predicts the next state
- Inverse: given the current state and the desired state, predicts the action
- Mixed: pairs a forward and an inverse model together
- Multi-step prediction: predicts multiple future states/actions

Forward Model
- Given the current state and an action, predict the next state.
- Corresponds to the state transition function of the environment.
- Model Reference Adaptive Control (MRAC): uses the forward model to choose the action that brings the system closest to the desired state.
- Model Predictive Control (MPC): uses the forward model to choose the sequence of N actions that brings the system closest to the desired state (sketch below).
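
To make the MPC idea concrete, here is a minimal Python sketch of picking an action with a learned forward model. Everything named here is an assumption: forward_model stands for any trained model, and random shooting is just one simple way to search over action sequences; the slides do not prescribe a particular search.

    import numpy as np

    def mpc_action(forward_model, state, goal, horizon=5,
                   n_candidates=100, action_dim=2, rng=None):
        # Try random action sequences, roll each through the learned
        # forward model, and keep the sequence whose predicted final
        # state lands closest to the goal. Execute only its first action.
        rng = np.random.default_rng(0) if rng is None else rng
        best_cost, best_first_action = np.inf, None
        for _ in range(n_candidates):
            actions = rng.uniform(-1.0, 1.0, size=(horizon, action_dim))
            s = state
            for a in actions:                 # N-step rollout of the model
                s = forward_model(s, a)
            cost = np.linalg.norm(s - goal)   # distance of predicted end state
            if cost < best_cost:
                best_cost, best_first_action = cost, actions[0]
        return best_first_action              # re-plan at the next time step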

Problems with Forward Models
- There may be many possible next states; a forward model returns the mean of the distribution, even if that mean is itself unlikely.
- Inverted pendulum example: [figure: probability of the next state Θ_t+1 of the unstable inverted pendulum starting from Θ_t = 90; the likely outcomes are falls toward 0 or 180, so the mean prediction near 90 is itself unlikely]

Inverse Model
- Given the current state and the desired next state, what is the action?
- Easy and fast policy: take the action given by the inverse model (sketch below).
- A PD controller can help stabilize the robot; it is not necessary if the inverse model is good.
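
A minimal sketch of this policy, assuming a trained inverse_model(state, desired) -> action and a state vector of [position, velocity]; all names and the gain values are hypothetical.

    def control(inverse_model, state, desired, kp=10.0, kd=1.0):
        # Feedforward action from the learned inverse model
        u_ff = inverse_model(state, desired)
        # Optional PD correction; it only matters when the inverse
        # model's action leaves residual tracking error
        e = desired[0] - state[0]        # position error
        e_dot = desired[1] - state[1]    # velocity error
        return u_ff + kp * e + kd * e_dot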

Problems with Inverse Models
- An action that will get you from the current state to the desired state may not exist.
- Ill-posedness: multiple actions lead from the current state to the desired state.
- Robot arm example: [figure: one way to grip the desired position (Θ1 = 60, Θ2 = 270), another way (Θ1 = 120, Θ2 = 90), and their average (Θ1 = 90, Θ2 = 180), which misses the target]

Mixed Model
- Use a forward model to resolve the ill-posedness of the inverse model.
- In the figure: use the forward model to estimate a "latent" state variable z_k from the previous state, previous action, and current state; use z_k to disambiguate the multiple possible actions a_k in the inverse model for s_k and s_k+1.
- Distal teacher: use the forward model to resolve ill-posedness when training the inverse model.

Multi-step Prediction Model
- Uses a different model for each time step.
- Avoids accumulating prediction error.
- [figure: the true path for the planned actions, the path predicted by model predictive control with a forward model biased to the left (the bias compounds each step), and the path predicted by the last model in a multi-step model]

Learning Architectures
- Direct modelling
- Indirect modelling
- Distal teacher learning

Direct Modelling
- Use observed input/output pairs to train the model.
- Can be done off-line or on-line; if on-line, use a feedback controller to guide the robot during learning.
- Difficult to train inverse models on ill-posed problems unless the data is generated in an intelligent way, or local models are used (see the robot arm example above).

Indirect Modelling
- Minimizes an error signal on-line, so the inverse model converges to a single action even on ill-posed problems.
- In the figure: feedback error model learning. Train an open-loop inverse-model controller on the output of a feedback controller; as the model improves, the feedback controller's output tends toward zero (sketch below).
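
A minimal sketch of the feedback-error-learning update for a linear inverse model, assuming features phi computed from the current and desired state; the names and the linear form are illustrative assumptions, not the paper's prescription.

    import numpy as np

    def fel_update(theta, phi, u_fb, lr=0.01):
        # The feedback command u_fb acts as the error signal for the
        # learned feedforward model u_ff = theta @ phi: nudging theta so
        # that u_ff absorbs u_fb drives the feedback output toward zero.
        return theta + lr * np.outer(u_fb, phi)

    # Control loop (sketch):
    #   u_ff = theta @ phi(state, desired)
    #   u_fb = pd_controller(state, desired)
    #   apply u_ff + u_fb to the robot
    #   theta = fel_update(theta, phi(state, desired), u_fb)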

Distal Teacher Learning
- The forward model is trained directly to predict the next state.
- Compose the inverse and forward models, and train the composed model while holding the forward model fixed (sketch below).
- The advantage of adding the forward model is a globally consistent inverse model, instead of local on-policy optimization.
- [diagram: training the inverse model; the inverse model maps the desired state s_d[n] and previous state s[n-1] to an action, the fixed forward model maps that action to a predicted state s_p[n], and the error between the desired state and the outcome drives the update]
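
A minimal numerical sketch with linear models: the forward model F stays fixed while the distal error, passed back through F, trains the inverse model I. The 2x2 system, learning rate, and iteration count are arbitrary illustrative choices.

    import numpy as np

    F = np.array([[1.0, 0.5],
                  [0.0, 1.0]])      # fixed, pre-trained forward model: s' = F @ a
    I = np.zeros((2, 2))            # inverse model to train: a = I @ s_d
    rng = np.random.default_rng(0)

    for _ in range(3000):
        s_d = rng.normal(size=2)    # sample a desired next state
        a = I @ s_d                 # inverse model proposes an action
        s_p = F @ a                 # forward model predicts the outcome
        err = s_d - s_p             # distal error, measured in state space
        # gradient step on I only: the error is backpropagated through fixed F
        I += 0.05 * np.outer(F.T @ err, s_d)

    print(np.round(F @ I, 3))       # ~identity: I has converged to F^(-1)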

Day 2
Model learning for robot control: a survey
Duy Nguyen-Tuong, Jan Peters, 2011
Presented by Evan Beachly

Outline: Day 1
- Background: Controllers
- Modelling
  - Model types: Forward, Inverse, Mixed, Multi-step
  - Learning architectures: Direct, Indirect, Distal Teacher

Outline: Day 2
- Challenges of applying machine learning to robotics
- Machine learning methods
  - Global methods: artificial neural networks, linear regression
  - Local methods: locally weighted linear regression
- 3 examples of ways model learning has been applied to robotics
- Future directions

Challenges of Applying Machine Learning to Robotics

Data Challenges
- Need to sample a large region of the state space
  - Make random movements
- Robots may not have smooth dynamics (e.g., the jump from static friction to kinetic friction)
  - Use kernel methods that allow for unsmooth functions
  - Switch between local models (discontinuous at boundaries)
- High dimensionality
  - Dimensionality reduction
- Redundant data
  - Filter
- Noise and outliers
  - Estimate the noise level in the data, and fit a model of appropriate complexity

Algorithmic Constraints
- Needs to run in real time
  - Don't use Ω(n³) algorithms
  - Reduce the data set size
  - Parallel computation
- Incorporate prior knowledge: training data may be sparse or difficult to get
  - Specify a priori probabilities in probabilistic frameworks
- Online learning: may not know the important region of the state space beforehand
- Active learning: labelling data may be time-consuming
  - Figure out which regions of the state space need to be explored, and only ask the human to label those

Real-World Challenges
- Safety
  - Fail-safe design
- Robustness and reliability
  - Feature selection to avoid overfitting
- Measurement errors and missing data
  - Probabilistic learning methods
- Time-varying systems
  - Online learning

Machine Learning Methods

Global Regression Algorithms
Estimate the form of the entire function.
- Parametric algorithms (model complexity is fixed):
  - Artificial neural network
  - Linear regression
  - Fit a polynomial to the data
- Nonparametric algorithms (model complexity depends on the data):
  - Self-reconfiguring artificial neural networks
  - Gaussian process regression
  - Sparse Gaussian process regression
  - Support vector regression
  - Incremental support vector machine

Artificial Neural Network
[figure: a feedforward network with inputs x_1, x_2, x_3 (plus bias units fixed at 1), a hidden layer, and an output y; labelled Input Layer, Hidden Layer, Output Layer]

Perceptron
- Computes a weighted sum of its inputs (weights w_1 ... w_4 in the figure, with a bias input fixed at 1), then applies a sigmoid function: σ(z) = 1 / (1 + e^(-z))
- The zero set of the weighted sum is a hyperplane that divides the input space; on one half, the sigmoid returns a value near 1, on the other, near 0.
Sigmoid function image: https://qph.ec.quoracdn.net/main-qimg-05edc1873d0103e36064862a45566dba

Backpropagation
An iterative method to set the weights in the network (numpy sketch below):
1. Initialize all of the weights to random values.
2. Select a training sample.
3. Feed the training input through the network, and compute the error between the output and the training sample's target value.
4. Compute the gradient of the error with respect to the weights of the network.
5. Change each weight slightly along the negative gradient.
6. If the network's performance is not good enough, go to 2.
Gradient descent can run into local minima; try many different initial weights.
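
A from-scratch numpy sketch of steps 1-6 for a one-hidden-layer network with a squared-error loss, fit to a toy 1-D function. The layer sizes and learning rate are arbitrary illustrative choices.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    X = rng.uniform(-2, 2, size=(200, 1))
    Y = np.sin(X)                                   # toy target function

    W1, b1 = rng.normal(size=(8, 1)), np.zeros(8)   # step 1: random weights
    W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)   # linear output layer
    lr = 0.05

    for step in range(5000):
        i = rng.integers(len(X))                    # step 2: pick a sample
        x, y = X[i], Y[i]
        h = sigmoid(W1 @ x + b1)                    # step 3: forward pass...
        y_hat = W2 @ h + b2
        err = y_hat - y                             # ...and output error
        dW2, db2 = np.outer(err, h), err            # step 4: chain rule
        dh = (W2.T @ err) * h * (1 - h)             #   (backpropagation)
        dW1, db1 = np.outer(dh, x), dh
        W1 -= lr * dW1; b1 -= lr * db1              # step 5: small step down
        W2 -= lr * dW2; b2 -= lr * db2              #   the gradient
                                                    # step 6: loop repeats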

ANN as a Function Approximator
- The sigmoid function constrains the output to between 0 and 1; use a linear transfer function at the output layer instead.
- The network learns to use the nonlinearities of the sigmoid function to fit the training data.
- [image: output from a network with 5 neurons in the input layer and no hidden layers]

Linear Regression Basics
- y is the variable you are trying to predict; x_1 ... x_P are the variables you are using to predict y. You have N samples of y and x_1 ... x_P.
- Linear regression fits a linear model: y_predicted = β_1 x_1 + β_2 x_2 + ... + β_P x_P
- It finds the coefficients β_1 ... β_P that minimize the sum of squared errors: Σ_{i=1}^{N} (y_i - y_predicted,i)²

Linear Regression Algorithm y 1 y 2 y 3... y N x 1,1 x 1,2... x 1,P x 2,1 x 2,2... x 2,P 1 x 3,1 x 3,2... x 3,P 2 =............ + x N,1 x N,2... P x N,P 1 2 3... N Y = X + Magic Formula: = (X T X) -1 X T Y (P 2 N) 36

Linear Regression with More Complex Models
- x_1 ... x_P don't have to be the variables you measured; they can be functions of them.
- Suppose m is the measured variable. To fit a parabola: x_1 = 1, x_2 = m, x_3 = m² (example below).
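
For example, fitting a parabola with exactly the same normal equation; only the design matrix changes. The data and coefficients are synthetic, for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    m = rng.uniform(-3, 3, size=100)                 # the measured variable
    y = 1.5 - 2.0 * m + 0.75 * m**2 + 0.1 * rng.normal(size=100)

    X = np.column_stack([np.ones_like(m), m, m**2])  # x1 = 1, x2 = m, x3 = m^2
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    print(beta)                                      # ~[1.5, -2.0, 0.75]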

Local Learning
Estimate the form of the function near the point we're interested in (x_q); model complexity is not fixed.
- Local algorithms:
  - Locally weighted linear regression
  - Locally weighted projection regression
  - Local Gaussian process regression
- Semi-local (combine both global and local methods):
  - Gaussian mixture model
  - Bayesian committee machine

Weighted Linear Regression
- Minimize the weighted sum of squared errors: Σ_{i=1}^{N} w_i (y_i - y_predicted,i)²
- Collect the weights into the diagonal matrix W = diag(w_1, w_2, ..., w_N).
- Magic Formula: β = (X^T W X)^(-1) X^T W Y, again at cost O(P²N).
- Useful for accommodating heteroskedasticity (varying measurement variance): set w_i = 1/σ_i².

Locally Weighted Linear Regression
- Weight the samples according to how close they are to the point of interest.
- A Gaussian distribution of weights works well (sketch below).
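
A minimal sketch combining the two previous slides: for each query point x_q, solve the weighted normal equation with Gaussian weights centered on x_q. The bandwidth h = 0.3 is an arbitrary choice; see the next slide on choosing it.

    import numpy as np

    def lwlr_predict(x_q, x, y, h=0.3):
        # Gaussian weights: samples near the query dominate the fit
        w = np.exp(-0.5 * ((x - x_q) / h) ** 2)
        X = np.column_stack([np.ones_like(x), x])    # local linear model
        XtW = X.T * w                                # same as X.T @ diag(w)
        beta = np.linalg.solve(XtW @ X, XtW @ y)     # (X^T W X)^(-1) X^T W y
        return beta[0] + beta[1] * x_q               # evaluate at the query

    rng = np.random.default_rng(0)
    x = rng.uniform(-3, 3, size=200)
    y = np.sin(x) + 0.1 * rng.normal(size=200)
    print(lwlr_predict(0.5, x, y))                   # ~sin(0.5) = 0.479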

Problems with Local Learning Methods
- Need to determine a distance function d(x_k, x_q) = (x_k - x_q) / h
  - Methods for choosing h: minimize cross-validation error; adaptive bandwidth selection
- Speed optimization: partition the space, and only consider points in the query point's partition
- Problems:
  - Notions of locality break down in high-dimensional space
    - Project the data to a lower-dimensional space
    - Combine with a probabilistic regression to exploit its strengths in high-dimensional space
  - Sensitive to noise

3 Examples of Ways Model Learning has been Applied to Robotics

Simulation-Based Optimization
- Learn a forward model and use it for planning.

Approximation-Based Inverse Dynamics Control
- Model learning accounts for nonlinearities that physics models don't capture.

Learning Operational Space Control
- Learn how to follow trajectories in operational space (not joint space).
- Normally this is ill-posed, but it has been tackled in several ways:
  - Local linearizations aren't ill-posed, so use local models
  - Learn the forward kinematics, invert it, and combine it with an inverse dynamics model
  - Use a neural network to jointly learn the forward and inverse kinematics

Future Directions
- Dealing with uncertainty in the environment
- Transfer learning between tasks
- Semi-supervised learning
- Approximation-based control needs more stability analysis
- Learning non-unique mappings

Summary
- There are many benefits to robots that can learn how to control their body.
- Controllers use a model of the system to determine what action to take.
- Forward models are well-defined; inverse models give a simple policy.
- Models can be trained directly from observations, or indirectly from error signals.
- Learning a model is a regression problem.
- Model learning in robotics has several challenges that general machine learning doesn't.