Learning Motor Behaviors: Past & Present Work
1 Stefan Schaal Computer Science & Neuroscience University of Southern California, Los Angeles & ATR Computational Neuroscience Laboratory Kyoto, Japan Learning Motor Behaviors: Past & Present Work
2 Joint Work With: Auke Ijspeert, Aaron D'Souza, Jun Nakanishi, Jan Peters, Michael Mistry, Dimitris Pongas
3 How are Motor Skills Generated? A Question Shared by Biological and Robotics Research Movies from collaborations with C. Atkeson, S. Kotosaka, S. Vijayakumar
4 How are Motor Skills Generated? A Question Shared by Biological and Robotics Research Unfortunately, each of these skills required manual generation of representations, control policies, and learning mechanisms Movies from collaborations with C. Atkeson, S. Kotosaka, S. Vijayakumar
5 What Motor Behaviors Exist? Tracking Tasks e.g., tracing a figure-8 on a piece of paper Regulator Tasks e.g., balance control (pole balancing, biped balancing, helicopter hover) Discrete Tasks e.g., reach for a cup, tennis forehand, basketball shot Periodic Tasks e.g., legged locomotion, swimming, dancing Complex sequences and superposition of the above e.g., assembly tasks, emptying the dishwasher, playing tennis, almost every daily-life behavior Level of Difficulty
6 Learning Motor Behaviors: Control Policies The General Goal of Motor Learning: Control Policies u(t) = p(x(t), t, α)
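To make the notation concrete: a control policy is simply a parameterized function mapping state and time to a motor command. A minimal sketch, with a hypothetical PD-style parameterization (kp, kd, goal) standing in for the generic parameter vector α:

```python
import numpy as np

def policy(x, t, alpha):
    # A control policy u = p(x, t, alpha): maps state x and time t to a motor
    # command; its behavior is determined by the parameter vector alpha.
    # Here alpha = (kp, kd, goal) is a hypothetical PD-style parameterization.
    kp, kd, goal = alpha
    pos, vel = x
    return kp * (goal - pos) - kd * vel

u = policy(x=np.array([0.1, 0.0]), t=0.0, alpha=(10.0, 2.0, 1.0))
```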
7 How Are Control Policies Used in Robotics? Direct Control (Model Free) Indirect Control (Model-Based)
8 Approaches to Learning Motor Behaviors in Robotics Supervised Learning: direct inverse model learning, forward model learning, distal teacher, feedback error learning Reinforcement Learning: value function-based approaches, policy gradients Motor Primitives: schemas, basis behaviors, units of actions, macros, options, parameterized policies Imitation Learning: learning a policy from observation, learning the task goal from observation (inverse RL), learning an initial strategy for self-improvement Past to Present
9 Supervised Learning of Motor Behaviors Given: a parameterized policy, a task goal, and a measure of (signed) error; usually applied to discrete tasks. Goal: learn a task-level controller that produces the right motor command for the given goal from all initial conditions.
10 Supervised Learning of Motor Behaviors Approaches: Learn Task Models, Direct Inverse Learning, Forward Model Learning & Search, Distal Teacher (Jordan & Rumelhart), Feedback Error Learning (Kawato) [Block diagram: an inverse model maps x_desired to a feedforward command u_ff; a feedback controller produces u_fb; the summed command drives the robot, which outputs y]
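As a minimal sketch of the feedback error learning idea (Kawato) under simplifying assumptions: a linear-in-features inverse model supplies the feedforward command, and the feedback controller's output serves as the training error for that model. The feature vector and gains below are illustrative assumptions, not values from the talk.

```python
import numpy as np

theta = np.zeros(4)           # inverse-model parameters (assumed feature dimension 4)
eta = 0.01                    # learning rate

def features(x_des):
    # hypothetical feature vector of the desired state (position, velocity, acceleration)
    return np.array([x_des[0], x_des[1], x_des[2], 1.0])

def control_step(x_des, x, kp=20.0, kd=2.0):
    global theta
    phi = features(x_des)
    u_ff = theta @ phi                                        # feedforward command from inverse model
    u_fb = kp * (x_des[0] - x[0]) + kd * (x_des[1] - x[1])    # feedback controller
    theta = theta + eta * u_fb * phi                          # u_fb acts as the training error signal
    return u_ff + u_fb                                        # total motor command sent to the robot
```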
11 Supervised Learning of Motor Behaviors Example: Learning Devilsticking
12 Supervised Learning of Motor Behaviors Example: Learning Pole Balancing
13 Approaches to Learning Motor Behaviors in Robotics Supervised Learning: direct inverse model learning, forward model learning, distal teacher, feedback error learning Reinforcement Learning: value function-based approaches, policy gradients Motor Primitives: schemas, basis behaviors, units of actions, macros, options, parameterized policies Imitation Learning: learning a policy from observation, learning the task goal from observation (inverse RL), learning an initial strategy for self-improvement
14 Reinforcement Learning: Value Function Based Q-Learning or SARSA requires function approximation for the action value function; usually only discrete actions are considered; only low-dimensional robotic systems, e.g., acrobot. Q^π(x,u) = E{ r_1 + γ r_2 + γ² r_3 + … | x_0 = x, u_0 = u } Watkins; Sutton
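A minimal tabular Q-learning sketch, purely for illustration: discrete states and actions only, which is exactly why this approach scales poorly to high-dimensional motor systems. State/action counts and learning constants below are assumptions.

```python
import numpy as np

n_states, n_actions = 100, 5
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1

def select_action(s):
    if np.random.rand() < eps:                 # epsilon-greedy exploration
        return np.random.randint(n_actions)
    return int(np.argmax(Q[s]))

def q_update(s, a, r, s_next):
    td_target = r + gamma * np.max(Q[s_next])  # bootstrap from the best next action
    Q[s, a] += alpha * (td_target - Q[s, a])   # temporal-difference update
```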
15 Reinforcement Learning: Value Function Based RL in Continuous Time and Space: continuous version of actor-critic systems; closed-form solution for the optimal action for motor systems of the form ẋ = f(x) + g(x)u, i.e., u* ∝ g(x)^T ∂V/∂x; particularly useful for model-based RL. V^π(x) = E{ r_1 + γ r_2 + γ² r_3 + … | x_0 = x } Doya, Morimoto, Kimura
16 Reinforcement Learning: Value Function Based RL in Continuous Time and Space: Example
17 Reinforcement Learning: Policy Gradients Motivation for Policy Gradients: value function approximation is too hard in complex motor systems, thus avoid the value function; smooth policy improvement instead of greedy jumps; even useful for hidden-state systems; useful for parsimoniously parameterized policies. e.g., ∂J(θ)/∂θ = ∫_X d^π(x) ∫_U ∂π(u|x)/∂θ ( Q^π(x,u) − b(x) ) du dx, with update θ ← θ + α ∂J(θ)/∂θ. Note that policy gradients can only achieve local optimization.
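Since the gradient above involves expectations over the state distribution and the policy, a common sample-based version is the likelihood-ratio (REINFORCE-style) estimator with a baseline. A minimal sketch for a Gaussian policy, assuming a hypothetical rollouts structure holding (features, action, reward) tuples per trial:

```python
import numpy as np

sigma = 0.1   # assumed fixed exploration noise of the Gaussian policy u ~ N(theta . phi(x), sigma^2)

def grad_log_pi(theta, phi, u):
    # gradient of log N(u | theta . phi, sigma^2) with respect to theta
    return (u - theta @ phi) * phi / sigma**2

def policy_gradient_step(theta, rollouts, baseline=0.0, alpha=1e-3):
    # rollouts: list of trials, each a list of (phi, u, r) tuples
    grad = np.zeros_like(theta)
    for trial in rollouts:
        R = sum(r for (_, _, r) in trial)               # return of this trial
        for phi, u, _ in trial:
            grad += grad_log_pi(theta, phi, u) * (R - baseline)
    grad /= len(rollouts)
    return theta + alpha * grad                         # gradient ascent on expected return
```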
18 Reinforcement Learning: Policy Gradients Examples: robot peg-in-hole insertion, tuning biped locomotion (Gullapalli; Benbrahim & Franklin; Tedrake); more results available, e.g., see Andrew Ng, Drew Bagnell, etc.
19 Approaches to Learning Motor Behaviors in Robotics Supervised Learning: direct inverse model learning, forward model learning, distal teacher, feedback error learning Reinforcement Learning: value function-based approaches, policy gradients Motor Primitives: schemas, basis behaviors, units of actions, macros, options, parameterized policies Imitation Learning: learning a policy from observation, learning the task goal from observation (inverse RL), learning an initial strategy for self-improvement
20 Motor Primitives Motivation 1: Divide & Conquer Motivation 2: Suitable Parameterization u(t) = p(x(t), t, α)
21 What is a Good Motor Primitive? From the view of biological research Previous Suggestions Included: Organizational Principles 2/3 Power Law, Piecewise Planarity, Speed-Accuracy Tradeoff, Optimization of Energy, Jerk, Torque Change, Motor Command Change, Task Variance, Stochastic Feedback Control, Effort, etc. Equilibrium Point/Trajectory Hypotheses VITE Model of trajectory planning Force Fields Pattern Generators and Dynamical Systems Theory Focusing mostly on coupling phenomena (e.g., inter-limb, perception-action, intra-limb) and the necessary interaction of control and musculoskeletal dynamics Contraction Theory A version of control theory for modular control and many more
22 What is a Good Motor Primitive? From the view of machine learning/robotics Previous Suggestions Included: handcrafted basis behaviors that are of some level of generality e.g., flocking, dispersing, door finding, object pick-up, closed-loop policies, etc. automatic regular coarse partitioning of the world e.g., a very coarse grid, potentially with hidden state automatic detection of basis behaviors from examining the statistics of the world e.g., states with drastic changes of value gradients, states that are common on successful trials, etc.
23 Movement Primitives as Attractor Systems Note the similarity between a generic control policy u(t) = p(x(t), t, α) and nonlinear differential equations u(t) = ẋ_desired(t) = p(x_desired(t), goal, α) This view creates a natural distinction between two major movement classes: Rhythmic Movement Discrete Movement
24 Rhythmic & Discrete Movement Representation in the Brain [fMRI figure: discrete−rhythmic and rhythmic−discrete contrasts over PMdr, M1/S1, BA40, BA7, BA44, BA47] Joint work with Dagmar Sternad, Rieko Osu, and Mitsuo Kawato Nature Neuroscience 7, 2004
25 Movement Primitives as Attractor Systems: Goals ẋ = f(x, goal) A Class of Dynamic Systems that Can Code: Point-to-point and periodic behavior as their attractor Multi-dimensional systems that require phase locking Attractors that have rather complex shape (e.g., complex phase relationships, movement reversals) Learning and optimization Coupling phenomena Timing (without requiring explicit time) Generalization (structural equivalence for parameter changes) Robustness to disturbances and interactions with the environment Stability guarantees
26 A Dynamic Systems Model for Discrete Movement A learnable nonlinear point attractor with guaranteed stability properties Behavioral Phase: v̇ = α_v ( β_v (g − x) − v ), ẋ = α_x v Nonlinear Function f(x,v) Trajectory plan dynamics: ż = α_z ( β_z (g − y) − z ), ẏ = α_y ( f(x,v) + z )
27 A Dynamic Systems Model for Discrete Movement Use Gaussian Basis Functions to build a nonlinear learning system Trajectory Plan Dynamics: ż = α_z ( β_z (g − y) − z ), ẏ = α_y ( f(x,v) + z ) Canonical Dynamics: v̇ = α_v ( β_v (g − x) − v ), ẋ = α_x v Local Linear Model Approx.: f(x,v) = ( Σ_{i=1}^{k} w_i b_i v ) / ( Σ_{i=1}^{k} w_i ), where w_i = exp( −(1/2) d_i (x̃ − c_i)² ) and x̃ = (x − x_0) / (g − x_0) Linear in the learning parameters b_i
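To make these equations concrete, here is a minimal Euler-integration sketch of the discrete system. Gains, basis placement, and widths are illustrative assumptions, not the values used in the talk; with b = 0 the system is a plain point attractor, and learned weights b_i shape the trajectory toward the goal g.

```python
import numpy as np

alpha_v, beta_v, alpha_x = 8.0, 2.0, 1.0      # canonical dynamics gains
alpha_z, beta_z, alpha_y = 8.0, 2.0, 1.0      # trajectory plan dynamics gains
k = 10                                         # number of Gaussian basis functions
c = np.linspace(0.0, 1.0, k)                  # basis centers in normalized position x~
d = np.full(k, 100.0)                          # basis widths
b = np.zeros(k)                                # learnable weights b_i

def f(x_norm, v):
    w = np.exp(-0.5 * d * (x_norm - c) ** 2)   # Gaussian weights w_i
    return np.sum(w * b * v) / (np.sum(w) + 1e-10)

def run(y0, g, dt=0.001, T=1.0):
    x, v = y0, 0.0                             # canonical system state
    y, z = y0, 0.0                             # trajectory plan (output) state
    traj = []
    for _ in range(int(T / dt)):
        x_norm = (x - y0) / (g - y0 + 1e-10)   # x~ = (x - x0)/(g - x0)
        v += dt * alpha_v * (beta_v * (g - x) - v)
        x += dt * alpha_x * v
        z += dt * alpha_z * (beta_z * (g - y) - z)
        y += dt * alpha_y * (f(x_norm, v) + z)
        traj.append(y)
    return np.array(traj)

trajectory = run(y0=0.0, g=1.0)                # converges to the goal g
```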
28 An Example [Figure: desired position, desired velocity, basis functions in time, phase, and phase velocity]
29 Extension to Periodic Systems A learnable nonlinear limit cycle attractor with guaranteed stability properties Behavioral Phase: ṙ = α_r ( A − r ), φ̇ = ω (a phase oscillator with amplitude A) Nonlinear Function f(r,φ) Trajectory plan dynamics: ż = α_z ( β_z (g − y) − z ), ẏ = α_y ( f(r,φ) + z )
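A companion sketch of the rhythmic (limit cycle) version: a phase oscillator with amplitude dynamics drives periodic basis functions. Constants, the basis layout, and the zero weights are assumptions for illustration only.

```python
import numpy as np

alpha_r, omega, A = 4.0, 2 * np.pi, 1.0               # amplitude gain, frequency, target amplitude
alpha_z, beta_z, alpha_y, g = 8.0, 2.0, 1.0, 0.0
k = 10
c = np.linspace(0.0, 2 * np.pi, k, endpoint=False)    # basis centers over the phase
h = np.full(k, 2.5)                                   # basis widths
b = np.zeros(k)                                       # learnable weights

def f(r, phi):
    w = np.exp(h * (np.cos(phi - c) - 1.0))    # periodic (von Mises-like) basis functions
    return np.sum(w * b * r) / (np.sum(w) + 1e-10)

def step(state, dt=0.001):
    r, phi, y, z = state
    r += dt * alpha_r * (A - r)                # amplitude converges to A
    phi += dt * omega                          # phase advances at frequency omega
    z += dt * alpha_z * (beta_z * (g - y) - z)
    y += dt * alpha_y * (f(r, phi) + z)
    return (r, phi, y, z)
```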
30 Example: Policy Gradients with Movement Primitives Goal: Hit ball precisely Note: about 150 trials are needed.
31 Approaches to Learning Motor Behaviors in Robotics Supervised Learning: direct inverse model learning, forward model learning, distal teacher, feedback error learning Reinforcement Learning: value function-based approaches, policy gradients Motor Primitives: schemas, basis behaviors, units of actions, macros, options, parameterized policies Imitation Learning: learning a policy from observation, learning the task goal from observation (inverse RL), learning an initial strategy for self-improvement
32 Imitation Learning What can be learned from imitation? control policies (assume actions are observable) internal models reward criteria, e.g., inverse reinforcement learning (Ng et al.) use of the demonstration as a soft constraint value functions
33 Imitation Learning: Example Learning an internal model from demonstration
34 Imitation Learning: Example Using the demonstrated behavior as a soft constraint
35 Imitation Learning with Motor Primitives Given: a desired trajectory y_demo, ẏ_demo, ÿ_demo. Algorithm: extract the movement duration and movement goal; adjust the time constants of the canonical dynamics to the movement duration; use Locally Weighted Learning to solve the nonlinear function approximation problem with target y_target = ẏ_demo / α_y − z = f(x,v), where z can be calculated by integrating the differential equation with the desired trajectory information. The trajectory plan dynamics, canonical dynamics, and local linear model approximation are as before: ż = α_z ( β_z (g − y) − z ), ẏ = α_y ( f(x,v) + z ); v̇ = α_v ( β_v (g − x) − v ), ẋ = α_x v; f(x,v) = ( Σ_{i=1}^{k} w_i b_i v ) / ( Σ_{i=1}^{k} w_i ), w_i = exp( −(1/2) d_i (x̃ − c_i)² ), x̃ = (x − x_0) / (g − x_0). Note: This is a one-shot learning problem, i.e., no iterations!
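A minimal sketch of this one-shot fit, under assumed gains and basis placement: integrate the canonical and z dynamics along the demonstration, form the regression target ẏ_demo/α_y − z, and solve for every basis weight b_i in closed form by locally weighted regression (f is linear in the b_i). Only demonstrated positions and velocities are used in this variant.

```python
import numpy as np

alpha_v, beta_v, alpha_x = 8.0, 2.0, 1.0
alpha_z, beta_z, alpha_y = 8.0, 2.0, 1.0
k = 20

def fit_primitive(y_demo, yd_demo, dt):
    y0, g = y_demo[0], y_demo[-1]                 # extract start and movement goal
    c = np.linspace(0.0, 1.0, k)                  # basis centers in normalized x~
    d = np.full(k, 100.0)                         # basis widths
    x, v, z = y0, 0.0, 0.0
    W, V, F_target = [], [], []
    for y, yd in zip(y_demo, yd_demo):
        x_norm = (x - y0) / (g - y0 + 1e-10)
        W.append(np.exp(-0.5 * d * (x_norm - c) ** 2))
        V.append(v)
        F_target.append(yd / alpha_y - z)             # what f(x,v) must have produced
        v += dt * alpha_v * (beta_v * (g - x) - v)    # canonical dynamics
        x += dt * alpha_x * v
        z += dt * alpha_z * (beta_z * (g - y) - z)    # integrate z along the demonstration
    W, V, F = np.array(W), np.array(V), np.array(F_target)
    # one-shot locally weighted regression per basis function (no iterations)
    b = (W * (V * F)[:, None]).sum(0) / ((W * (V ** 2)[:, None]).sum(0) + 1e-10)
    return b, g
```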
36 Example: A Tennis Forehand as a Movement Primitive
37 Example: A Tennis Forehand as a Dynamic Primitive
38 Example: Various Rhythmic Movement Primitives
39 Example: Imitation Learning with Self-Improvement Goal: Hit ball precisely Note: about 150 trials are needed.
40 Movement Primitives for Planar Walking
41 Coupling of Mechanics and Control
42 Movement Primitives in Interaction with Sound
43 Discussion There is surprisingly little learning research in manipulator robotics! Reinforcement Learning in this domain is very hard! Finding good reward functions is hard! Policy gradients are of some use, at the cost of giving up global optimality and the discovery of new strategies Imitation Learning is great for initializing policies Well-designed motor primitives can facilitate learning tremendously. But no autonomous learning framework yet...