On Scalability Issues in Reinforcement Learning for Modular Robots. Paulina Varshavskaya, Leslie Pack Kaelbling and Daniela Rus

Size: px
Start display at page:

Download "On Scalability Issues in Reinforcement Learning for Modular Robots. Paulina Varshavskaya, Leslie Pack Kaelbling and Daniela Rus"

Transcription

1 On Scalability Issues in Reinforcement Learning for Modular Robots Paulina Varshavskaya, Leslie Pack Kaelbling and Daniela Rus

2 motivate reinforcement learning (RL) larger modular system challenges a specific solution a general solution experiments in simulation related current work 2

3 RL for selfreconfigurable robots. automated controller design Butler et al 2000 MTRANII controller: genetic algorithms (Kamimura et al 2004) Molecube controller: genetic algorithms (Mytilinaios et al 2004) Telecube primitives and controller: genetic programming (Kubica and Rieffel 2002) 3

4 RL for selfreconfigurable robots. automated controller design 2. online adaptation from the point of view of each robot or module limited knowledge and resources moving from. to 2. 4

5 Task: locomotion in modular robots x4 latticebased robots Molecule Kotay & Rus 2005 simulated generalized latticebased robot 5

6 From each agent s point of view state: global configuration of robot unknown local observation: Moore neighborhood actions: 8 directions + NOP reward: Eastward displacement along x axis x acting module neighbor present empty lattice cell assumptions: primitive physics, no disconnections, failed actions don t execute, synchronous execution 6

7 Large modular sytem challenges partial observability cannot use Markovassuming nice algorithms large observationaction spaces 2 8 observations without obstacles x actions 3 8 observations with obstacles x actions 7

8 Partial observability in a multiagent partially observable Markov Decision Process (POMDP) direct policy search using gradient ascent in policy space (GAPS) Peshkin 200 value of " (! ) Value V(θ) of policy π(θ) V! (! ) = E [ R]! ( o, a) R =! T t = " t r t 8

9 GAPS each agent executes a parameterized policy = t e $ (" ) ( at ot," )! e # " ( o, a = P # t" ( o a' t t t ), a') β t is temperature o t a t R t θ at time t: observe o t select a t according to policy execute a t receive reward r t keep an execution trace at end of episode: update parameters θ

10 Large modular sytem challenges partial observability large observationaction spaces 2 8 observations without obstacles 2,304 parameters 3 8 observations with obstacles 5,04 parameters x acting module neighbor present empty lattice cell 0

11 Specific solution: incremental learning possible observations x possible actions = total number of parameters 2 modules x 2 x 2 = 36 3 modules 4 modules x 2 x 4 x 8 x 4 x 4 x 8 x 8 =62 = 44 acting module neighbor present empty lattice cell x 4 + modules = 2,304

12 Incremental GAPS (IGAPS) performance Mean convergence times comparison Mean average reward comparison Varshavskaya et al

13 A specific solution additive nature of modular robots unclear applicability to other tasks and systems faster RL but not fast enough for online adaptation 3

14 Desirable general solution possible observations total number of parameters 2 modules x 2 x 2 small n 3 modules 4 modules x 2 x 4 x 8 x 4 x 4 x 8 x 8 small n small n acting module neighbor present empty lattice cell x 4 + modules small n 4

15 Features acting module neighbor present empty space don t care if upper left corner full & a=ne φ 7 = φ 88 = 0 otherwise if upper right corner empty & a=se 0 otherwise if empty row in front & a=se φ 45 = 0 φ 75 = otherwise if upper left corner empty & a=w 0 otherwise 5

16 6 Approximation by feature spaces define a number of feature functions over the observationaction space policy computed from the dot product of the feature vector and parameters θ a few features approximate the desired space learning with a modified loglinear GAPS (LLGAPS)! " # " # = ' ) ', ( ), ( ), ( a o a o a t t t t t t t e e o a P $ % $ % $ ), ( ), ( ), ( o a o a o a! n! L = "

17 7 Full feature set 44 features acting module neighbor present empty space don t care observation observation observation observation action action action action

18 Performance comparison Performance as learning progresses: Smoothed average reward over 0 runs reward Mean convergence times comparison 8

19 Learned locomotion

20 Summary so far reinforcement learning for modular robots with more realistic observability assumptions learning algorithm with feature representation results in locomotion by selfreconfiguration 20

21 Advantages of a feature representation faster learning number of parameters independent of robot size or observation space size domain knowledge don t try to move into an occupied cell features can be designed for many tasks and robots 2

22 Current work: asynchronous execution reward 2 4 observations, actions only 220 features 22

23 Current work: large system locomotion 20 modules 70 modules 23

24 Current work: complex reward r world o t θ θ 2 arbitration a t θ n r n distribute subtasks within each module 24

25 Current work: locomotion in truss robots MultiShadySim Detweiler et al 2006 x4 Shady Vona et al

26 Questions? project sponsored by Boeing Corporation 26

Learning Locomotion Gait of Lattice-Based Modular Robots by Demonstration

Learning Locomotion Gait of Lattice-Based Modular Robots by Demonstration Learning Locomotion Gait of Lattice-Based Modular Robots by Demonstration Seung-kook Yun Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology, Cambridge, Massachusetts,

More information

Gradient Reinforcement Learning of POMDP Policy Graphs

Gradient Reinforcement Learning of POMDP Policy Graphs 1 Gradient Reinforcement Learning of POMDP Policy Graphs Douglas Aberdeen Research School of Information Science and Engineering Australian National University Jonathan Baxter WhizBang! Labs July 23, 2001

More information

From Crystals to Lattice Robots

From Crystals to Lattice Robots 2008 IEEE International Conference on Robotics and Automation Pasadena, CA, USA, May 19-23, 2008 From Crystals to Lattice Robots Nicolas Brener, Faiz Ben Amar, Philippe Bidaud Université Pierre et Marie

More information

Generic Decentralized Control for a Class of Self-Reconfigurable Robots

Generic Decentralized Control for a Class of Self-Reconfigurable Robots eneric Decentralized ontrol for a lass of Self-Reconfigurable Robots Zack Butler, Keith Kotay, Daniela Rus and Kohji Tomita Department of omputer Science National nstitute of Advanced Dartmouth ollege

More information

Distributed Planning and Control for Modular Robots with Unit-Compressible Modules

Distributed Planning and Control for Modular Robots with Unit-Compressible Modules Zack Butler Daniela Rus Department of Computer Science Dartmouth College Hanover, NH 03755-3510, USA zackb@cs.dartmouth.edu Distributed Planning and Control for Modular Robots with Unit-Compressible Modules

More information

Hierarchical Reinforcement Learning for Robot Navigation

Hierarchical Reinforcement Learning for Robot Navigation Hierarchical Reinforcement Learning for Robot Navigation B. Bischoff 1, D. Nguyen-Tuong 1,I-H.Lee 1, F. Streichert 1 and A. Knoll 2 1- Robert Bosch GmbH - Corporate Research Robert-Bosch-Str. 2, 71701

More information

Generic Decentralized Control for Lattice-Based Self-Reconfigurable Robots

Generic Decentralized Control for Lattice-Based Self-Reconfigurable Robots Generic Decentralized ontrol for Lattice-Based Self-Reconfigurable Robots Zack Butler, Keith Kotay, Daniela Rus and Kohji Tomita Abstract Previous work on self-reconfiguring modular robots has concentrated

More information

Generalized Inverse Reinforcement Learning

Generalized Inverse Reinforcement Learning Generalized Inverse Reinforcement Learning James MacGlashan Cogitai, Inc. james@cogitai.com Michael L. Littman mlittman@cs.brown.edu Nakul Gopalan ngopalan@cs.brown.edu Amy Greenwald amy@cs.brown.edu Abstract

More information

Self Assembly of Modular Manipulators with Active and Passive Modules

Self Assembly of Modular Manipulators with Active and Passive Modules Self Assembly of Modular Manipulators with Active and Passive Modules Seung-kook Yun and Daniela Rus Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology, Cambridge,

More information

Partially Observable Markov Decision Processes. Silvia Cruciani João Carvalho

Partially Observable Markov Decision Processes. Silvia Cruciani João Carvalho Partially Observable Markov Decision Processes Silvia Cruciani João Carvalho MDP A reminder: is a set of states is a set of actions is the state transition function. is the probability of ending in state

More information

Topics in AI (CPSC 532L): Multimodal Learning with Vision, Language and Sound. Lecture 12: Deep Reinforcement Learning

Topics in AI (CPSC 532L): Multimodal Learning with Vision, Language and Sound. Lecture 12: Deep Reinforcement Learning Topics in AI (CPSC 532L): Multimodal Learning with Vision, Language and Sound Lecture 12: Deep Reinforcement Learning Types of Learning Supervised training Learning from the teacher Training data includes

More information

Reconfiguration Planning for Heterogeneous Self-Reconfiguring Robots

Reconfiguration Planning for Heterogeneous Self-Reconfiguring Robots Reconfiguration Planning for Heterogeneous Self-Reconfiguring Robots Robert Fitch, Zack Butler and Daniela Rus Department of Computer Science, Dartmouth College {rfitch, zackb, rus}@cs.dartmouth.edu Abstract

More information

A System to Support the Design of Dynamic Structure Configurations

A System to Support the Design of Dynamic Structure Configurations ESPRESSOCAD A System to Support the Design of Dynamic Structure Configurations MICHAEL PHILETUS WELLER 1, ELLEN YI-LUEN DO 1 AND MARK D GROSS 1 1 Design Machine Group, University of Washington 1. Introduction

More information

An Analysis of the Million Module March algorithm applied to the ATRON robotic platform

An Analysis of the Million Module March algorithm applied to the ATRON robotic platform Rochester Institute of Technology RIT Scholar Works Theses Thesis/Dissertation Collections 2011 An Analysis of the Million Module March algorithm applied to the ATRON robotic platform James Phipps Follow

More information

Reinforcement Learning (INF11010) Lecture 6: Dynamic Programming for Reinforcement Learning (extended)

Reinforcement Learning (INF11010) Lecture 6: Dynamic Programming for Reinforcement Learning (extended) Reinforcement Learning (INF11010) Lecture 6: Dynamic Programming for Reinforcement Learning (extended) Pavlos Andreadis, February 2 nd 2018 1 Markov Decision Processes A finite Markov Decision Process

More information

When Network Embedding meets Reinforcement Learning?

When Network Embedding meets Reinforcement Learning? When Network Embedding meets Reinforcement Learning? ---Learning Combinatorial Optimization Problems over Graphs Changjun Fan 1 1. An Introduction to (Deep) Reinforcement Learning 2. How to combine NE

More information

10703 Deep Reinforcement Learning and Control

10703 Deep Reinforcement Learning and Control 10703 Deep Reinforcement Learning and Control Russ Salakhutdinov Machine Learning Department rsalakhu@cs.cmu.edu Policy Gradient I Used Materials Disclaimer: Much of the material and slides for this lecture

More information

Lecture 13: Learning from Demonstration

Lecture 13: Learning from Demonstration CS 294-5 Algorithmic Human-Robot Interaction Fall 206 Lecture 3: Learning from Demonstration Scribes: Samee Ibraheem and Malayandi Palaniappan - Adapted from Notes by Avi Singh and Sammy Staszak 3. Introduction

More information

Mobilized ad-hoc networks: A reinforcement learning approach

Mobilized ad-hoc networks: A reinforcement learning approach Mobilized ad-hoc networks: A reinforcement learning approach Yu-Han Chang A.I. Lab, MIT Cambridge, MA 02139 ychang@ai.mit.edu Tracey Ho LIDS, MIT Cambridge, MA 02139 trace@mit.edu Leslie Pack Kaelbling

More information

Planning and Control: Markov Decision Processes

Planning and Control: Markov Decision Processes CSE-571 AI-based Mobile Robotics Planning and Control: Markov Decision Processes Planning Static vs. Dynamic Predictable vs. Unpredictable Fully vs. Partially Observable Perfect vs. Noisy Environment What

More information

Self-Repair Through Scale Independent Self-Reconfiguration

Self-Repair Through Scale Independent Self-Reconfiguration Self-Repair Through Scale Independent Self-Reconfiguration K. Stoy The Maersk Mc-Kinney Moller Institute for Production Technology University of Southern Denmark Odense, Denmark Email: kaspers@mip.sdu.dk

More information

A Simple and Strong Algorithm for Reconfiguration of Hexagonal Metamorphic Robots

A Simple and Strong Algorithm for Reconfiguration of Hexagonal Metamorphic Robots 50 A Simple and Strong Algorithm for Reconfiguration of Hexagonal Metamorphic Robots KwangEui Lee Department of Multimedia Engineering, Dongeui University, Busan, Korea Summary In this paper, we propose

More information

Distributed planning and control for modular robots with unit-compressible modules

Distributed planning and control for modular robots with unit-compressible modules 1 Distributed planning and control for modular robots with unit-compressible modules Zack Butler and Daniela Rus Dept. of Computer Science Dartmouth College Abstract Self-reconfigurable robots are versatile

More information

A Brief Introduction to Reinforcement Learning

A Brief Introduction to Reinforcement Learning A Brief Introduction to Reinforcement Learning Minlie Huang ( ) Dept. of Computer Science, Tsinghua University aihuang@tsinghua.edu.cn 1 http://coai.cs.tsinghua.edu.cn/hml Reinforcement Learning Agent

More information

Machine Learning on Physical Robots

Machine Learning on Physical Robots Machine Learning on Physical Robots Alfred P. Sloan Research Fellow Department or Computer Sciences The University of Texas at Austin Research Question To what degree can autonomous intelligent agents

More information

Movement Primitives for an Orthogonal Prismatic Closed-Lattice-Constrained Self-Reconfiguring Module

Movement Primitives for an Orthogonal Prismatic Closed-Lattice-Constrained Self-Reconfiguring Module Movement Primitives for an Orthogonal Prismatic Closed-Lattice-Constrained Self-Reconfiguring Module Michael Philetus Weller, Mustafa Emre Karagozler, Brian Kirby, Jason Campbell and Seth Copen Goldstein

More information

Modeling Robot Path Planning with CD++

Modeling Robot Path Planning with CD++ Modeling Robot Path Planning with CD++ Gabriel Wainer Department of Systems and Computer Engineering. Carleton University. 1125 Colonel By Dr. Ottawa, Ontario, Canada. gwainer@sce.carleton.ca Abstract.

More information

Reconfigurable Robot

Reconfigurable Robot Reconfigurable Robot What is a Reconfigurable Robot? Self-reconfiguring modular robots are autonomous kinematic machines with variable morphology They are able to deliberately change their own shape by

More information

Realistic Reconfiguration of Crystalline (and Telecube) Robots

Realistic Reconfiguration of Crystalline (and Telecube) Robots Realistic Reconfiguration of Crystalline (and Telecube) Robots Greg Aloupis, Sébastien Collette, Mirela Damian, Erik D. Demaine 3, Dania El-Khechen 4, Robin Flatland 5, Stefan Langerman, Joseph O Rourke

More information

Introduction to Reinforcement Learning. J. Zico Kolter Carnegie Mellon University

Introduction to Reinforcement Learning. J. Zico Kolter Carnegie Mellon University Introduction to Reinforcement Learning J. Zico Kolter Carnegie Mellon University 1 Agent interaction with environment Agent State s Reward r Action a Environment 2 Of course, an oversimplification 3 Review:

More information

Realistic Reconfiguration of Crystalline (and Telecube) Robots

Realistic Reconfiguration of Crystalline (and Telecube) Robots Realistic Reconfiguration of Crystalline (and Telecube) Robots Greg Aloupis, Sébastien Collette, Mirela Damian, Erik D. Demaine, Dania El-Khechen, Robin Flatland, Stefan Langerman, Joseph O Rourke, Val

More information

Design of Prismatic Cube Modules for Convex Corner Traversal in 3D

Design of Prismatic Cube Modules for Convex Corner Traversal in 3D Design of Prismatic Cube Modules for Convex Corner Traversal in 3D Michael Philetus Weller, Brian T Kirby, H Benjamin Brown, Mark D Gross and Seth Copen Goldstein Abstract The prismatic cube style of modular

More information

Inverse KKT Motion Optimization: A Newton Method to Efficiently Extract Task Spaces and Cost Parameters from Demonstrations

Inverse KKT Motion Optimization: A Newton Method to Efficiently Extract Task Spaces and Cost Parameters from Demonstrations Inverse KKT Motion Optimization: A Newton Method to Efficiently Extract Task Spaces and Cost Parameters from Demonstrations Peter Englert Machine Learning and Robotics Lab Universität Stuttgart Germany

More information

CS Path Planning

CS Path Planning Why Path Planning? CS 603 - Path Planning Roderic A. Grupen 4/13/15 Robotics 1 4/13/15 Robotics 2 Why Motion Planning? Origins of Motion Planning Virtual Prototyping! Character Animation! Structural Molecular

More information

Marco Wiering Intelligent Systems Group Utrecht University

Marco Wiering Intelligent Systems Group Utrecht University Reinforcement Learning for Robot Control Marco Wiering Intelligent Systems Group Utrecht University marco@cs.uu.nl 22-11-2004 Introduction Robots move in the physical environment to perform tasks The environment

More information

3D Rectilinear Motion Planning with Minimum Bend Paths

3D Rectilinear Motion Planning with Minimum Bend Paths 3D Rectilinear Motion Planning with Minimum Bend Paths Robert Fitch, Zack Butler and Daniela Rus Dept. of Computer Science Dartmouth College Abstract Computing rectilinear shortest paths in two dimensions

More information

Motion Simulation of a Modular Robotic System

Motion Simulation of a Modular Robotic System Motion Simulation of a Modular Robotic System Haruhisa KUROKAWA, Kohji TOMITA, Eiichi YOSHIDA, Satoshi MURATA and Shigeru KOKAJI Mechanical Engineering Laboratory, AIST, MITI Namiki 1-2, Tsukuba, Ibaraki

More information

Distributed Control of the Center of Mass of a Modular Robot

Distributed Control of the Center of Mass of a Modular Robot Proc. 6 IEEE/RSJ Intl. Conf on Intelligent Robots and Systems (IROS-6), pp. 47-475 Distributed Control of the Center of Mass of a Modular Robot Mark Moll, Peter Will, Maks Krivokon, and Wei-Min Shen Information

More information

A Self-Reconfigurable Modular Robot: Reconfiguration Planning and Experiments

A Self-Reconfigurable Modular Robot: Reconfiguration Planning and Experiments A Self-Reconfigurable Modular Robot: Reconfiguration Planning and Experiments Eiichi Yoshida, Satoshi Murata, Akiya Kamimura, Kohji Tomita, Haruhisa Kurokawa, and Shigeru Kokaji Distributed System Design

More information

ADAPTIVE TILE CODING METHODS FOR THE GENERALIZATION OF VALUE FUNCTIONS IN THE RL STATE SPACE A THESIS SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL

ADAPTIVE TILE CODING METHODS FOR THE GENERALIZATION OF VALUE FUNCTIONS IN THE RL STATE SPACE A THESIS SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL ADAPTIVE TILE CODING METHODS FOR THE GENERALIZATION OF VALUE FUNCTIONS IN THE RL STATE SPACE A THESIS SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY BHARAT SIGINAM IN

More information

Shape Complexes for Metamorhpic Robots

Shape Complexes for Metamorhpic Robots Shape Complexes for Metamorhpic Robots Robert Ghrist University of Illinois, Urbana, IL 61801, USA Abstract. A metamorphic robotic system is an aggregate of identical robot units which can individually

More information

Robots & Cellular Automata

Robots & Cellular Automata Integrated Seminar: Intelligent Robotics Robots & Cellular Automata Julius Mayer Table of Contents Cellular Automata Introduction Update Rule 3 4 Neighborhood 5 Examples. 6 Robots Cellular Neural Network

More information

Learning Motor Behaviors: Past & Present Work

Learning Motor Behaviors: Past & Present Work Stefan Schaal Computer Science & Neuroscience University of Southern California, Los Angeles & ATR Computational Neuroscience Laboratory Kyoto, Japan sschaal@usc.edu http://www-clmc.usc.edu Learning Motor

More information

Problem characteristics. Dynamic Optimization. Examples. Policies vs. Trajectories. Planning using dynamic optimization. Dynamic Optimization Issues

Problem characteristics. Dynamic Optimization. Examples. Policies vs. Trajectories. Planning using dynamic optimization. Dynamic Optimization Issues Problem characteristics Planning using dynamic optimization Chris Atkeson 2004 Want optimal plan, not just feasible plan We will minimize a cost function C(execution). Some examples: C() = c T (x T ) +

More information

Reconfiguration of Cube-Style Modular Robots Using O(log n) Parallel Moves

Reconfiguration of Cube-Style Modular Robots Using O(log n) Parallel Moves Reconfiguration of Cube-Style Modular Robots Using O(log n) Parallel Moves Greg Aloupis 1, Sébastien Collette 1, Erik D. Demaine 2, Stefan Langerman 1, Vera Sacristán 3, and Stefanie Wuhrer 4 1 Université

More information

Natural Actor-Critic. Authors: Jan Peters and Stefan Schaal Neurocomputing, Cognitive robotics 2008/2009 Wouter Klijn

Natural Actor-Critic. Authors: Jan Peters and Stefan Schaal Neurocomputing, Cognitive robotics 2008/2009 Wouter Klijn Natural Actor-Critic Authors: Jan Peters and Stefan Schaal Neurocomputing, 2008 Cognitive robotics 2008/2009 Wouter Klijn Content Content / Introduction Actor-Critic Natural gradient Applications Conclusion

More information

Increasing the Efficiency of Distributed Goal-Filling Algorithms for Self-Reconfigurable Hexagonal Metamorphic Robots

Increasing the Efficiency of Distributed Goal-Filling Algorithms for Self-Reconfigurable Hexagonal Metamorphic Robots Increasing the Efficiency of Distributed Goal-Filling lgorithms for Self-Reconfigurable Hexagonal Metamorphic Robots J. ateau,. Clark, K. McEachern, E. Schutze, and J. Walter Computer Science Department,

More information

CELLULAR automata (CA) are mathematical models for

CELLULAR automata (CA) are mathematical models for 1 Cellular Learning Automata with Multiple Learning Automata in Each Cell and its Applications Hamid Beigy and M R Meybodi Abstract The cellular learning automata, which is a combination of cellular automata

More information

Asymptotically Optimal Kinodynamic Motion Planning for Self-reconfigurable Robots

Asymptotically Optimal Kinodynamic Motion Planning for Self-reconfigurable Robots Asymptotically Optimal Kinodynamic Motion Planning for Self-reconfigurable Robots John H. Reif 1 and Sam Slee 1 Department of Computer Science, Duke University, Durham, NC, USA {reif,sgs}@cs.duke.edu Abstract.

More information

Markov Decision Processes. (Slides from Mausam)

Markov Decision Processes. (Slides from Mausam) Markov Decision Processes (Slides from Mausam) Machine Learning Operations Research Graph Theory Control Theory Markov Decision Process Economics Robotics Artificial Intelligence Neuroscience /Psychology

More information

An active connection mechanism for modular self-reconfigurable robotic systems based on physical latching

An active connection mechanism for modular self-reconfigurable robotic systems based on physical latching An active connection mechanism for modular self-reconfigurable robotic systems based on physical latching A. Sproewitz, M. Asadpour, Y. Bourquin and A. J. Ijspeert Abstract This article presents a robust

More information

Learning Grasp Strategies Composed of Contact Relative Motions

Learning Grasp Strategies Composed of Contact Relative Motions Learning Grasp Strategies Composed of Contact Relative Motions Robert Platt Dextrous Robotics Laboratory Johnson Space Center, NASA robert.platt-1@nasa.gov Abstract Of central importance to grasp synthesis

More information

Value Iteration. Reinforcement Learning: Introduction to Machine Learning. Matt Gormley Lecture 23 Apr. 10, 2019

Value Iteration. Reinforcement Learning: Introduction to Machine Learning. Matt Gormley Lecture 23 Apr. 10, 2019 10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Reinforcement Learning: Value Iteration Matt Gormley Lecture 23 Apr. 10, 2019 1

More information

Technical Report TR Texas A&M University Parasol Lab. Algorithms for Filling Obstacle Pockets with Hexagonal Metamorphic Robots

Technical Report TR Texas A&M University Parasol Lab. Algorithms for Filling Obstacle Pockets with Hexagonal Metamorphic Robots Technical Report TR03-002 Texas A&M University Parasol Lab Algorithms for Filling Obstacle Pockets with Hexagonal Metamorphic Robots Author: Mary E. Brooks, University of Oklahoma Advisors: Dr. Jennifer

More information

Hormone- Controlled Metamorphic Robots

Hormone- Controlled Metamorphic Robots Proceedings of the 2001 IEEE International Conference on Robotics & Automation Seoul, Korea. May 21-26, 2001 Hormone- Controlled Metamorphic Robots Behnam Salemi, Wei-Min Shen, and Peter Will USC Information

More information

Hierarchical Motion Planning for Self-reconfigurable Modular Robots

Hierarchical Motion Planning for Self-reconfigurable Modular Robots Hierarchical Motion Planning for Self-reconfigurable Modular Robots Preethi Bhat James Kuffner Seth Goldstein Siddhartha Srinivasa 2 The Robotics Institute 2 Intel Research Pittsburgh Carnegie Mellon University

More information

Continuous Valued Q-learning for Vision-Guided Behavior Acquisition

Continuous Valued Q-learning for Vision-Guided Behavior Acquisition Continuous Valued Q-learning for Vision-Guided Behavior Acquisition Yasutake Takahashi, Masanori Takeda, and Minoru Asada Dept. of Adaptive Machine Systems Graduate School of Engineering Osaka University

More information

Lecture 15: Segmentation (Edge Based, Hough Transform)

Lecture 15: Segmentation (Edge Based, Hough Transform) Lecture 15: Segmentation (Edge Based, Hough Transform) c Bryan S. Morse, Brigham Young University, 1998 000 Last modified on February 3, 000 at :00 PM Contents 15.1 Introduction..............................................

More information

Asymptotically Optimal Kinodynamic Motion Planning for Self-Reconfigurable Robots

Asymptotically Optimal Kinodynamic Motion Planning for Self-Reconfigurable Robots Asymptotically Optimal Kinodynamic Motion Planning for Self-Reconfigurable Robots John H. Reif 1 and Sam Slee 1 Department of Computer Science, Duke University, Durham, NC, USA. {reif,sgs}@cs.duke.edu

More information

Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning

Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning Jan Peters 1, Stefan Schaal 1 University of Southern California, Los Angeles CA 90089, USA Abstract. In this paper, we

More information

Apprenticeship Learning for Reinforcement Learning. with application to RC helicopter flight Ritwik Anand, Nick Haliday, Audrey Huang

Apprenticeship Learning for Reinforcement Learning. with application to RC helicopter flight Ritwik Anand, Nick Haliday, Audrey Huang Apprenticeship Learning for Reinforcement Learning with application to RC helicopter flight Ritwik Anand, Nick Haliday, Audrey Huang Table of Contents Introduction Theory Autonomous helicopter control

More information

Sensor Models / Occupancy Mapping / Density Mapping

Sensor Models / Occupancy Mapping / Density Mapping Statistical Techniques in Robotics (16-831, F09) Lecture #3 (09/01/2009) Sensor Models / Occupancy Mapping / Density Mapping Lecturer: Drew Bagnell Scribe: Mehmet R. Dogar 1 Sensor Models 1.1 Beam Model

More information

CSE151 Assignment 2 Markov Decision Processes in the Grid World

CSE151 Assignment 2 Markov Decision Processes in the Grid World CSE5 Assignment Markov Decision Processes in the Grid World Grace Lin A484 gclin@ucsd.edu Tom Maddock A55645 tmaddock@ucsd.edu Abstract Markov decision processes exemplify sequential problems, which are

More information

A New Meta-Module for Controlling Large Sheets of ATRON Modules

A New Meta-Module for Controlling Large Sheets of ATRON Modules A New Meta-Module for Controlling Large Sheets of ATRON Modules David Brandt and David Johan Christensen The Adaptronics Group, The Maersk Institute, University of Southern Denmark {david.brandt, david}@mmmi.sdu.dk

More information

Optimal Kinodynamic Motion Planning for 2D Reconfiguration of Self-Reconfigurable Robots

Optimal Kinodynamic Motion Planning for 2D Reconfiguration of Self-Reconfigurable Robots Optimal Kinodynamic Motion Planning for 2D Reconfiguration of Self-Reconfigurable Robots John Reif Department of Computer Science, Duke University Abstract A self-reconfigurable (SR) robot is one composed

More information

EM-Cube: Cube-Shaped, Self-reconfigurable Robots Sliding on Structure Surfaces

EM-Cube: Cube-Shaped, Self-reconfigurable Robots Sliding on Structure Surfaces 2008 IEEE International Conference on Robotics and Automation Pasadena, CA, USA, May 19-23, 2008 EM-Cube: Cube-Shaped, Self-reconfigurable Robots Sliding on Structure Surfaces Byoung Kwon An Abstract Many

More information

Evolutionary Synthesis of Dynamic Motion and Reconfiguration Process for a Modular Robot M-TRAN

Evolutionary Synthesis of Dynamic Motion and Reconfiguration Process for a Modular Robot M-TRAN Evolutionary Synthesis of Dynamic Motion and Reconfiguration Process for a Modular Robot M-TRAN Eiichi Yoshida *, Satoshi Murata **, Akiya Kamimura *, Kohji Tomita *, Haruhisa Kurokawa * and Shigeru Kokaji

More information

Self-Reconfigurable Modular Robot (M-TRAN) and its Motion Design

Self-Reconfigurable Modular Robot (M-TRAN) and its Motion Design Self-Reconfigurable Modular Robot (M-TRAN) and its Motion Design Haruhisa KUROKAWA *, Akiya KAMIMURA *, Eiichi YOSHIDA *, Kohji TOMITA *, Satoshi MURATA ** and Shigeru KOKAJI * * National Institute of

More information

Distributed Goal Recognition Algorithms for Modular Robots

Distributed Goal Recognition Algorithms for Modular Robots Distributed Goal Recognition Algorithms for Modular Robots Zack Butler, Robert Fitch, Daniela Rus and Yuhang Wang Department of Computer Science Dartmouth College {zackb,rfitch,rus,wyh}@cs.dartmouth.edu

More information

Strategies for simulating pedestrian navigation with multiple reinforcement learning agents

Strategies for simulating pedestrian navigation with multiple reinforcement learning agents Strategies for simulating pedestrian navigation with multiple reinforcement learning agents Francisco Martinez-Gil, Miguel Lozano, Fernando Ferna ndez Presented by: Daniel Geschwender 9/29/2016 1 Overview

More information

Reinforcement Learning (2)

Reinforcement Learning (2) Reinforcement Learning (2) Bruno Bouzy 1 october 2013 This document is the second part of the «Reinforcement Learning» chapter of the «Agent oriented learning» teaching unit of the Master MI computer course.

More information

A motion planning method for mobile robot considering rotational motion in area coverage task

A motion planning method for mobile robot considering rotational motion in area coverage task Asia Pacific Conference on Robot IoT System Development and Platform 018 (APRIS018) A motion planning method for mobile robot considering rotational motion in area coverage task Yano Taiki 1,a) Takase

More information

TURN AROUND BEHAVIOR GENERATION AND EXECUTION FOR UNMANNED GROUND VEHICLES OPERATING IN ROUGH TERRAIN

TURN AROUND BEHAVIOR GENERATION AND EXECUTION FOR UNMANNED GROUND VEHICLES OPERATING IN ROUGH TERRAIN 1 TURN AROUND BEHAVIOR GENERATION AND EXECUTION FOR UNMANNED GROUND VEHICLES OPERATING IN ROUGH TERRAIN M. M. DABBEERU AND P. SVEC Department of Mechanical Engineering, University of Maryland, College

More information

Linear Reconfiguration of Cube-Style Modular Robots

Linear Reconfiguration of Cube-Style Modular Robots Linear Reconfiguration of ube-style Modular Robots Greg loupis Sébastien ollette Mirela Damian Erik D. Demaine Robin Flatland Stefan Langerman Joseph O Rourke Suneeta Ramaswami Vera Sacristán Stefanie

More information

Forward Search Value Iteration For POMDPs

Forward Search Value Iteration For POMDPs Forward Search Value Iteration For POMDPs Guy Shani and Ronen I. Brafman and Solomon E. Shimony Department of Computer Science, Ben-Gurion University, Beer-Sheva, Israel Abstract Recent scaling up of POMDP

More information

Layering Algorithm for Collision-Free Traversal Using Hexagonal Self-Reconfigurable Metamorphic Robots

Layering Algorithm for Collision-Free Traversal Using Hexagonal Self-Reconfigurable Metamorphic Robots Layering Algorithm for Collision-Free Traversal Using Hexagonal Self-Reconfigurable Metamorphic Robots Plamen Ivanov and Jennifer Walter Computer Science Department Vassar College {plivanov,jewalter}@vassar.edu

More information

Distributed Control for Unit-compressible Robots: Goal-recognition, Locomotion and Splitting

Distributed Control for Unit-compressible Robots: Goal-recognition, Locomotion and Splitting 1 Distributed Control for Unit-compressible Robots: Goal-recognition, Locomotion and Splitting Zack Butler, Robert Fitch and Daniela Rus Dept. of Computer Science Dartmouth College Abstract We present

More information

Markov Decision Processes and Reinforcement Learning

Markov Decision Processes and Reinforcement Learning Lecture 14 and Marco Chiarandini Department of Mathematics & Computer Science University of Southern Denmark Slides by Stuart Russell and Peter Norvig Course Overview Introduction Artificial Intelligence

More information

Q-learning with linear function approximation

Q-learning with linear function approximation Q-learning with linear function approximation Francisco S. Melo and M. Isabel Ribeiro Institute for Systems and Robotics [fmelo,mir]@isr.ist.utl.pt Conference on Learning Theory, COLT 2007 June 14th, 2007

More information

Self Assembly of Modular Manipulators with Active and Passive modules

Self Assembly of Modular Manipulators with Active and Passive modules Self Assembly of Modular Manipulators with Active and Passive modules Seung-kook Yun, Yeoreum Yoon, Daniela Rus Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology,

More information

Learning Complex Motions by Sequencing Simpler Motion Templates

Learning Complex Motions by Sequencing Simpler Motion Templates Learning Complex Motions by Sequencing Simpler Motion Templates Gerhard Neumann gerhard@igi.tugraz.at Wolfgang Maass maass@igi.tugraz.at Institute for Theoretical Computer Science, Graz University of Technology,

More information

Homework #1. Displays, Image Processing, Affine Transformations, Hierarchical modeling, Projections

Homework #1. Displays, Image Processing, Affine Transformations, Hierarchical modeling, Projections Computer Graphics Instructor: rian Curless CSEP 557 Winter 213 Homework #1 Displays, Image Processing, Affine Transformations, Hierarchical modeling, Projections Assigned: Tuesday, January 22 nd Due: Tuesday,

More information

Reinforcement Learning with Parameterized Actions

Reinforcement Learning with Parameterized Actions Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) Reinforcement Learning with Parameterized Actions Warwick Masson and Pravesh Ranchod School of Computer Science and Applied

More information

Slides credited from Dr. David Silver & Hung-Yi Lee

Slides credited from Dr. David Silver & Hung-Yi Lee Slides credited from Dr. David Silver & Hung-Yi Lee Review Reinforcement Learning 2 Reinforcement Learning RL is a general purpose framework for decision making RL is for an agent with the capacity to

More information

A (Revised) Survey of Approximate Methods for Solving Partially Observable Markov Decision Processes

A (Revised) Survey of Approximate Methods for Solving Partially Observable Markov Decision Processes A (Revised) Survey of Approximate Methods for Solving Partially Observable Markov Decision Processes Douglas Aberdeen National ICT Australia, Canberra, Australia. December 8, 2003 Abstract Partially observable

More information

6.141: Robotics systems and science Lecture 9: Configuration Space and Motion Planning

6.141: Robotics systems and science Lecture 9: Configuration Space and Motion Planning 6.141: Robotics systems and science Lecture 9: Configuration Space and Motion Planning Lecture Notes Prepared by Daniela Rus EECS/MIT Spring 2012 Figures by Nancy Amato, Rodney Brooks, Vijay Kumar Reading:

More information

Edges and Lines Readings: Chapter 10: better edge detectors line finding circle finding

Edges and Lines Readings: Chapter 10: better edge detectors line finding circle finding Edges and Lines Readings: Chapter 10: 10.2.3-10.3 better edge detectors line finding circle finding 1 Lines and Arcs Segmentation In some image sets, lines, curves, and circular arcs are more useful than

More information

From Theory to Practice: Distributed Coverage Control Experiments with Groups of Robots

From Theory to Practice: Distributed Coverage Control Experiments with Groups of Robots From Theory to Practice: Distributed Coverage Control Experiments with Groups of Robots Mac Schwager 1, James McLurkin 1, Jean-Jacques E. Slotine 2, and Daniela Rus 1 1 Computer Science and Artificial

More information

Robotic Search & Rescue via Online Multi-task Reinforcement Learning

Robotic Search & Rescue via Online Multi-task Reinforcement Learning Lisa Lee Department of Mathematics, Princeton University, Princeton, NJ 08544, USA Advisor: Dr. Eric Eaton Mentors: Dr. Haitham Bou Ammar, Christopher Clingerman GRASP Laboratory, University of Pennsylvania,

More information

Optic Flow and Basics Towards Horn-Schunck 1

Optic Flow and Basics Towards Horn-Schunck 1 Optic Flow and Basics Towards Horn-Schunck 1 Lecture 7 See Section 4.1 and Beginning of 4.2 in Reinhard Klette: Concise Computer Vision Springer-Verlag, London, 2014 1 See last slide for copyright information.

More information

Assignment 4: CS Machine Learning

Assignment 4: CS Machine Learning Assignment 4: CS7641 - Machine Learning Saad Khan November 29, 2015 1 Introduction The purpose of this assignment is to apply some of the techniques learned from reinforcement learning to make decisions

More information

Markov Decision Processes (MDPs) (cont.)

Markov Decision Processes (MDPs) (cont.) Markov Decision Processes (MDPs) (cont.) Machine Learning 070/578 Carlos Guestrin Carnegie Mellon University November 29 th, 2007 Markov Decision Process (MDP) Representation State space: Joint state x

More information

AUTONOMOUS SYSTEMS. LOCALIZATION, MAPPING & SIMULTANEOUS LOCALIZATION AND MAPPING Part V Mapping & Occupancy Grid Mapping

AUTONOMOUS SYSTEMS. LOCALIZATION, MAPPING & SIMULTANEOUS LOCALIZATION AND MAPPING Part V Mapping & Occupancy Grid Mapping AUTONOMOUS SYSTEMS LOCALIZATION, MAPPING & SIMULTANEOUS LOCALIZATION AND MAPPING Part V Mapping & Occupancy Grid Mapping Maria Isabel Ribeiro Pedro Lima with revisions introduced by Rodrigo Ventura Instituto

More information

Programmable Reinforcement Learning Agents

Programmable Reinforcement Learning Agents Programmable Reinforcement Learning Agents David Andre and Stuart J. Russell Computer Science Division, UC Berkeley, CA 9702 dandre,russell @cs.berkeley.edu Abstract We present an expressive agent design

More information

Automated Refinement Checking of Asynchronous Processes. Rajeev Alur. University of Pennsylvania

Automated Refinement Checking of Asynchronous Processes. Rajeev Alur. University of Pennsylvania Automated Refinement Checking of Asynchronous Processes Rajeev Alur University of Pennsylvania www.cis.upenn.edu/~alur/ Intel Formal Verification Seminar, July 2001 Problem Refinement Checking Given two

More information

6.141: Robotics systems and science Lecture 9: Configuration Space and Motion Planning

6.141: Robotics systems and science Lecture 9: Configuration Space and Motion Planning 6.141: Robotics systems and science Lecture 9: Configuration Space and Motion Planning Lecture Notes Prepared by Daniela Rus EECS/MIT Spring 2011 Figures by Nancy Amato, Rodney Brooks, Vijay Kumar Reading:

More information

Adaptive Robotics - Final Report Extending Q-Learning to Infinite Spaces

Adaptive Robotics - Final Report Extending Q-Learning to Infinite Spaces Adaptive Robotics - Final Report Extending Q-Learning to Infinite Spaces Eric Christiansen Michael Gorbach May 13, 2008 Abstract One of the drawbacks of standard reinforcement learning techniques is that

More information

Mobile Robots: An Introduction.

Mobile Robots: An Introduction. Mobile Robots: An Introduction Amirkabir University of Technology Computer Engineering & Information Technology Department http://ce.aut.ac.ir/~shiry/lecture/robotics-2004/robotics04.html Introduction

More information

Gauss-Sigmoid Neural Network

Gauss-Sigmoid Neural Network Gauss-Sigmoid Neural Network Katsunari SHIBATA and Koji ITO Tokyo Institute of Technology, Yokohama, JAPAN shibata@ito.dis.titech.ac.jp Abstract- Recently RBF(Radial Basis Function)-based networks have

More information

Predictive Autonomous Robot Navigation

Predictive Autonomous Robot Navigation Predictive Autonomous Robot Navigation Amalia F. Foka and Panos E. Trahanias Institute of Computer Science, Foundation for Research and Technology-Hellas (FORTH), Heraklion, Greece and Department of Computer

More information