On Scalability Issues in Reinforcement Learning for Modular Robots. Paulina Varshavskaya, Leslie Pack Kaelbling and Daniela Rus
|
|
- Maryann Hortense Morgan
- 5 years ago
- Views:
Transcription
1 On Scalability Issues in Reinforcement Learning for Modular Robots Paulina Varshavskaya, Leslie Pack Kaelbling and Daniela Rus
2 motivate reinforcement learning (RL) larger modular system challenges a specific solution a general solution experiments in simulation related current work 2
3 RL for selfreconfigurable robots. automated controller design Butler et al 2000 MTRANII controller: genetic algorithms (Kamimura et al 2004) Molecube controller: genetic algorithms (Mytilinaios et al 2004) Telecube primitives and controller: genetic programming (Kubica and Rieffel 2002) 3
4 RL for selfreconfigurable robots. automated controller design 2. online adaptation from the point of view of each robot or module limited knowledge and resources moving from. to 2. 4
5 Task: locomotion in modular robots x4 latticebased robots Molecule Kotay & Rus 2005 simulated generalized latticebased robot 5
6 From each agent s point of view state: global configuration of robot unknown local observation: Moore neighborhood actions: 8 directions + NOP reward: Eastward displacement along x axis x acting module neighbor present empty lattice cell assumptions: primitive physics, no disconnections, failed actions don t execute, synchronous execution 6
7 Large modular sytem challenges partial observability cannot use Markovassuming nice algorithms large observationaction spaces 2 8 observations without obstacles x actions 3 8 observations with obstacles x actions 7
8 Partial observability in a multiagent partially observable Markov Decision Process (POMDP) direct policy search using gradient ascent in policy space (GAPS) Peshkin 200 value of " (! ) Value V(θ) of policy π(θ) V! (! ) = E [ R]! ( o, a) R =! T t = " t r t 8
9 GAPS each agent executes a parameterized policy = t e $ (" ) ( at ot," )! e # " ( o, a = P # t" ( o a' t t t ), a') β t is temperature o t a t R t θ at time t: observe o t select a t according to policy execute a t receive reward r t keep an execution trace at end of episode: update parameters θ
10 Large modular sytem challenges partial observability large observationaction spaces 2 8 observations without obstacles 2,304 parameters 3 8 observations with obstacles 5,04 parameters x acting module neighbor present empty lattice cell 0
11 Specific solution: incremental learning possible observations x possible actions = total number of parameters 2 modules x 2 x 2 = 36 3 modules 4 modules x 2 x 4 x 8 x 4 x 4 x 8 x 8 =62 = 44 acting module neighbor present empty lattice cell x 4 + modules = 2,304
12 Incremental GAPS (IGAPS) performance Mean convergence times comparison Mean average reward comparison Varshavskaya et al
13 A specific solution additive nature of modular robots unclear applicability to other tasks and systems faster RL but not fast enough for online adaptation 3
14 Desirable general solution possible observations total number of parameters 2 modules x 2 x 2 small n 3 modules 4 modules x 2 x 4 x 8 x 4 x 4 x 8 x 8 small n small n acting module neighbor present empty lattice cell x 4 + modules small n 4
15 Features acting module neighbor present empty space don t care if upper left corner full & a=ne φ 7 = φ 88 = 0 otherwise if upper right corner empty & a=se 0 otherwise if empty row in front & a=se φ 45 = 0 φ 75 = otherwise if upper left corner empty & a=w 0 otherwise 5
16 6 Approximation by feature spaces define a number of feature functions over the observationaction space policy computed from the dot product of the feature vector and parameters θ a few features approximate the desired space learning with a modified loglinear GAPS (LLGAPS)! " # " # = ' ) ', ( ), ( ), ( a o a o a t t t t t t t e e o a P $ % $ % $ ), ( ), ( ), ( o a o a o a! n! L = "
17 7 Full feature set 44 features acting module neighbor present empty space don t care observation observation observation observation action action action action
18 Performance comparison Performance as learning progresses: Smoothed average reward over 0 runs reward Mean convergence times comparison 8
19 Learned locomotion
20 Summary so far reinforcement learning for modular robots with more realistic observability assumptions learning algorithm with feature representation results in locomotion by selfreconfiguration 20
21 Advantages of a feature representation faster learning number of parameters independent of robot size or observation space size domain knowledge don t try to move into an occupied cell features can be designed for many tasks and robots 2
22 Current work: asynchronous execution reward 2 4 observations, actions only 220 features 22
23 Current work: large system locomotion 20 modules 70 modules 23
24 Current work: complex reward r world o t θ θ 2 arbitration a t θ n r n distribute subtasks within each module 24
25 Current work: locomotion in truss robots MultiShadySim Detweiler et al 2006 x4 Shady Vona et al
26 Questions? project sponsored by Boeing Corporation 26
Learning Locomotion Gait of Lattice-Based Modular Robots by Demonstration
Learning Locomotion Gait of Lattice-Based Modular Robots by Demonstration Seung-kook Yun Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology, Cambridge, Massachusetts,
More informationGradient Reinforcement Learning of POMDP Policy Graphs
1 Gradient Reinforcement Learning of POMDP Policy Graphs Douglas Aberdeen Research School of Information Science and Engineering Australian National University Jonathan Baxter WhizBang! Labs July 23, 2001
More informationFrom Crystals to Lattice Robots
2008 IEEE International Conference on Robotics and Automation Pasadena, CA, USA, May 19-23, 2008 From Crystals to Lattice Robots Nicolas Brener, Faiz Ben Amar, Philippe Bidaud Université Pierre et Marie
More informationGeneric Decentralized Control for a Class of Self-Reconfigurable Robots
eneric Decentralized ontrol for a lass of Self-Reconfigurable Robots Zack Butler, Keith Kotay, Daniela Rus and Kohji Tomita Department of omputer Science National nstitute of Advanced Dartmouth ollege
More informationDistributed Planning and Control for Modular Robots with Unit-Compressible Modules
Zack Butler Daniela Rus Department of Computer Science Dartmouth College Hanover, NH 03755-3510, USA zackb@cs.dartmouth.edu Distributed Planning and Control for Modular Robots with Unit-Compressible Modules
More informationHierarchical Reinforcement Learning for Robot Navigation
Hierarchical Reinforcement Learning for Robot Navigation B. Bischoff 1, D. Nguyen-Tuong 1,I-H.Lee 1, F. Streichert 1 and A. Knoll 2 1- Robert Bosch GmbH - Corporate Research Robert-Bosch-Str. 2, 71701
More informationGeneric Decentralized Control for Lattice-Based Self-Reconfigurable Robots
Generic Decentralized ontrol for Lattice-Based Self-Reconfigurable Robots Zack Butler, Keith Kotay, Daniela Rus and Kohji Tomita Abstract Previous work on self-reconfiguring modular robots has concentrated
More informationGeneralized Inverse Reinforcement Learning
Generalized Inverse Reinforcement Learning James MacGlashan Cogitai, Inc. james@cogitai.com Michael L. Littman mlittman@cs.brown.edu Nakul Gopalan ngopalan@cs.brown.edu Amy Greenwald amy@cs.brown.edu Abstract
More informationSelf Assembly of Modular Manipulators with Active and Passive Modules
Self Assembly of Modular Manipulators with Active and Passive Modules Seung-kook Yun and Daniela Rus Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology, Cambridge,
More informationPartially Observable Markov Decision Processes. Silvia Cruciani João Carvalho
Partially Observable Markov Decision Processes Silvia Cruciani João Carvalho MDP A reminder: is a set of states is a set of actions is the state transition function. is the probability of ending in state
More informationTopics in AI (CPSC 532L): Multimodal Learning with Vision, Language and Sound. Lecture 12: Deep Reinforcement Learning
Topics in AI (CPSC 532L): Multimodal Learning with Vision, Language and Sound Lecture 12: Deep Reinforcement Learning Types of Learning Supervised training Learning from the teacher Training data includes
More informationReconfiguration Planning for Heterogeneous Self-Reconfiguring Robots
Reconfiguration Planning for Heterogeneous Self-Reconfiguring Robots Robert Fitch, Zack Butler and Daniela Rus Department of Computer Science, Dartmouth College {rfitch, zackb, rus}@cs.dartmouth.edu Abstract
More informationA System to Support the Design of Dynamic Structure Configurations
ESPRESSOCAD A System to Support the Design of Dynamic Structure Configurations MICHAEL PHILETUS WELLER 1, ELLEN YI-LUEN DO 1 AND MARK D GROSS 1 1 Design Machine Group, University of Washington 1. Introduction
More informationAn Analysis of the Million Module March algorithm applied to the ATRON robotic platform
Rochester Institute of Technology RIT Scholar Works Theses Thesis/Dissertation Collections 2011 An Analysis of the Million Module March algorithm applied to the ATRON robotic platform James Phipps Follow
More informationReinforcement Learning (INF11010) Lecture 6: Dynamic Programming for Reinforcement Learning (extended)
Reinforcement Learning (INF11010) Lecture 6: Dynamic Programming for Reinforcement Learning (extended) Pavlos Andreadis, February 2 nd 2018 1 Markov Decision Processes A finite Markov Decision Process
More informationWhen Network Embedding meets Reinforcement Learning?
When Network Embedding meets Reinforcement Learning? ---Learning Combinatorial Optimization Problems over Graphs Changjun Fan 1 1. An Introduction to (Deep) Reinforcement Learning 2. How to combine NE
More information10703 Deep Reinforcement Learning and Control
10703 Deep Reinforcement Learning and Control Russ Salakhutdinov Machine Learning Department rsalakhu@cs.cmu.edu Policy Gradient I Used Materials Disclaimer: Much of the material and slides for this lecture
More informationLecture 13: Learning from Demonstration
CS 294-5 Algorithmic Human-Robot Interaction Fall 206 Lecture 3: Learning from Demonstration Scribes: Samee Ibraheem and Malayandi Palaniappan - Adapted from Notes by Avi Singh and Sammy Staszak 3. Introduction
More informationMobilized ad-hoc networks: A reinforcement learning approach
Mobilized ad-hoc networks: A reinforcement learning approach Yu-Han Chang A.I. Lab, MIT Cambridge, MA 02139 ychang@ai.mit.edu Tracey Ho LIDS, MIT Cambridge, MA 02139 trace@mit.edu Leslie Pack Kaelbling
More informationPlanning and Control: Markov Decision Processes
CSE-571 AI-based Mobile Robotics Planning and Control: Markov Decision Processes Planning Static vs. Dynamic Predictable vs. Unpredictable Fully vs. Partially Observable Perfect vs. Noisy Environment What
More informationSelf-Repair Through Scale Independent Self-Reconfiguration
Self-Repair Through Scale Independent Self-Reconfiguration K. Stoy The Maersk Mc-Kinney Moller Institute for Production Technology University of Southern Denmark Odense, Denmark Email: kaspers@mip.sdu.dk
More informationA Simple and Strong Algorithm for Reconfiguration of Hexagonal Metamorphic Robots
50 A Simple and Strong Algorithm for Reconfiguration of Hexagonal Metamorphic Robots KwangEui Lee Department of Multimedia Engineering, Dongeui University, Busan, Korea Summary In this paper, we propose
More informationDistributed planning and control for modular robots with unit-compressible modules
1 Distributed planning and control for modular robots with unit-compressible modules Zack Butler and Daniela Rus Dept. of Computer Science Dartmouth College Abstract Self-reconfigurable robots are versatile
More informationA Brief Introduction to Reinforcement Learning
A Brief Introduction to Reinforcement Learning Minlie Huang ( ) Dept. of Computer Science, Tsinghua University aihuang@tsinghua.edu.cn 1 http://coai.cs.tsinghua.edu.cn/hml Reinforcement Learning Agent
More informationMachine Learning on Physical Robots
Machine Learning on Physical Robots Alfred P. Sloan Research Fellow Department or Computer Sciences The University of Texas at Austin Research Question To what degree can autonomous intelligent agents
More informationMovement Primitives for an Orthogonal Prismatic Closed-Lattice-Constrained Self-Reconfiguring Module
Movement Primitives for an Orthogonal Prismatic Closed-Lattice-Constrained Self-Reconfiguring Module Michael Philetus Weller, Mustafa Emre Karagozler, Brian Kirby, Jason Campbell and Seth Copen Goldstein
More informationModeling Robot Path Planning with CD++
Modeling Robot Path Planning with CD++ Gabriel Wainer Department of Systems and Computer Engineering. Carleton University. 1125 Colonel By Dr. Ottawa, Ontario, Canada. gwainer@sce.carleton.ca Abstract.
More informationReconfigurable Robot
Reconfigurable Robot What is a Reconfigurable Robot? Self-reconfiguring modular robots are autonomous kinematic machines with variable morphology They are able to deliberately change their own shape by
More informationRealistic Reconfiguration of Crystalline (and Telecube) Robots
Realistic Reconfiguration of Crystalline (and Telecube) Robots Greg Aloupis, Sébastien Collette, Mirela Damian, Erik D. Demaine 3, Dania El-Khechen 4, Robin Flatland 5, Stefan Langerman, Joseph O Rourke
More informationIntroduction to Reinforcement Learning. J. Zico Kolter Carnegie Mellon University
Introduction to Reinforcement Learning J. Zico Kolter Carnegie Mellon University 1 Agent interaction with environment Agent State s Reward r Action a Environment 2 Of course, an oversimplification 3 Review:
More informationRealistic Reconfiguration of Crystalline (and Telecube) Robots
Realistic Reconfiguration of Crystalline (and Telecube) Robots Greg Aloupis, Sébastien Collette, Mirela Damian, Erik D. Demaine, Dania El-Khechen, Robin Flatland, Stefan Langerman, Joseph O Rourke, Val
More informationDesign of Prismatic Cube Modules for Convex Corner Traversal in 3D
Design of Prismatic Cube Modules for Convex Corner Traversal in 3D Michael Philetus Weller, Brian T Kirby, H Benjamin Brown, Mark D Gross and Seth Copen Goldstein Abstract The prismatic cube style of modular
More informationInverse KKT Motion Optimization: A Newton Method to Efficiently Extract Task Spaces and Cost Parameters from Demonstrations
Inverse KKT Motion Optimization: A Newton Method to Efficiently Extract Task Spaces and Cost Parameters from Demonstrations Peter Englert Machine Learning and Robotics Lab Universität Stuttgart Germany
More informationCS Path Planning
Why Path Planning? CS 603 - Path Planning Roderic A. Grupen 4/13/15 Robotics 1 4/13/15 Robotics 2 Why Motion Planning? Origins of Motion Planning Virtual Prototyping! Character Animation! Structural Molecular
More informationMarco Wiering Intelligent Systems Group Utrecht University
Reinforcement Learning for Robot Control Marco Wiering Intelligent Systems Group Utrecht University marco@cs.uu.nl 22-11-2004 Introduction Robots move in the physical environment to perform tasks The environment
More information3D Rectilinear Motion Planning with Minimum Bend Paths
3D Rectilinear Motion Planning with Minimum Bend Paths Robert Fitch, Zack Butler and Daniela Rus Dept. of Computer Science Dartmouth College Abstract Computing rectilinear shortest paths in two dimensions
More informationMotion Simulation of a Modular Robotic System
Motion Simulation of a Modular Robotic System Haruhisa KUROKAWA, Kohji TOMITA, Eiichi YOSHIDA, Satoshi MURATA and Shigeru KOKAJI Mechanical Engineering Laboratory, AIST, MITI Namiki 1-2, Tsukuba, Ibaraki
More informationDistributed Control of the Center of Mass of a Modular Robot
Proc. 6 IEEE/RSJ Intl. Conf on Intelligent Robots and Systems (IROS-6), pp. 47-475 Distributed Control of the Center of Mass of a Modular Robot Mark Moll, Peter Will, Maks Krivokon, and Wei-Min Shen Information
More informationA Self-Reconfigurable Modular Robot: Reconfiguration Planning and Experiments
A Self-Reconfigurable Modular Robot: Reconfiguration Planning and Experiments Eiichi Yoshida, Satoshi Murata, Akiya Kamimura, Kohji Tomita, Haruhisa Kurokawa, and Shigeru Kokaji Distributed System Design
More informationADAPTIVE TILE CODING METHODS FOR THE GENERALIZATION OF VALUE FUNCTIONS IN THE RL STATE SPACE A THESIS SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL
ADAPTIVE TILE CODING METHODS FOR THE GENERALIZATION OF VALUE FUNCTIONS IN THE RL STATE SPACE A THESIS SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY BHARAT SIGINAM IN
More informationShape Complexes for Metamorhpic Robots
Shape Complexes for Metamorhpic Robots Robert Ghrist University of Illinois, Urbana, IL 61801, USA Abstract. A metamorphic robotic system is an aggregate of identical robot units which can individually
More informationRobots & Cellular Automata
Integrated Seminar: Intelligent Robotics Robots & Cellular Automata Julius Mayer Table of Contents Cellular Automata Introduction Update Rule 3 4 Neighborhood 5 Examples. 6 Robots Cellular Neural Network
More informationLearning Motor Behaviors: Past & Present Work
Stefan Schaal Computer Science & Neuroscience University of Southern California, Los Angeles & ATR Computational Neuroscience Laboratory Kyoto, Japan sschaal@usc.edu http://www-clmc.usc.edu Learning Motor
More informationProblem characteristics. Dynamic Optimization. Examples. Policies vs. Trajectories. Planning using dynamic optimization. Dynamic Optimization Issues
Problem characteristics Planning using dynamic optimization Chris Atkeson 2004 Want optimal plan, not just feasible plan We will minimize a cost function C(execution). Some examples: C() = c T (x T ) +
More informationReconfiguration of Cube-Style Modular Robots Using O(log n) Parallel Moves
Reconfiguration of Cube-Style Modular Robots Using O(log n) Parallel Moves Greg Aloupis 1, Sébastien Collette 1, Erik D. Demaine 2, Stefan Langerman 1, Vera Sacristán 3, and Stefanie Wuhrer 4 1 Université
More informationNatural Actor-Critic. Authors: Jan Peters and Stefan Schaal Neurocomputing, Cognitive robotics 2008/2009 Wouter Klijn
Natural Actor-Critic Authors: Jan Peters and Stefan Schaal Neurocomputing, 2008 Cognitive robotics 2008/2009 Wouter Klijn Content Content / Introduction Actor-Critic Natural gradient Applications Conclusion
More informationIncreasing the Efficiency of Distributed Goal-Filling Algorithms for Self-Reconfigurable Hexagonal Metamorphic Robots
Increasing the Efficiency of Distributed Goal-Filling lgorithms for Self-Reconfigurable Hexagonal Metamorphic Robots J. ateau,. Clark, K. McEachern, E. Schutze, and J. Walter Computer Science Department,
More informationCELLULAR automata (CA) are mathematical models for
1 Cellular Learning Automata with Multiple Learning Automata in Each Cell and its Applications Hamid Beigy and M R Meybodi Abstract The cellular learning automata, which is a combination of cellular automata
More informationAsymptotically Optimal Kinodynamic Motion Planning for Self-reconfigurable Robots
Asymptotically Optimal Kinodynamic Motion Planning for Self-reconfigurable Robots John H. Reif 1 and Sam Slee 1 Department of Computer Science, Duke University, Durham, NC, USA {reif,sgs}@cs.duke.edu Abstract.
More informationMarkov Decision Processes. (Slides from Mausam)
Markov Decision Processes (Slides from Mausam) Machine Learning Operations Research Graph Theory Control Theory Markov Decision Process Economics Robotics Artificial Intelligence Neuroscience /Psychology
More informationAn active connection mechanism for modular self-reconfigurable robotic systems based on physical latching
An active connection mechanism for modular self-reconfigurable robotic systems based on physical latching A. Sproewitz, M. Asadpour, Y. Bourquin and A. J. Ijspeert Abstract This article presents a robust
More informationLearning Grasp Strategies Composed of Contact Relative Motions
Learning Grasp Strategies Composed of Contact Relative Motions Robert Platt Dextrous Robotics Laboratory Johnson Space Center, NASA robert.platt-1@nasa.gov Abstract Of central importance to grasp synthesis
More informationValue Iteration. Reinforcement Learning: Introduction to Machine Learning. Matt Gormley Lecture 23 Apr. 10, 2019
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Reinforcement Learning: Value Iteration Matt Gormley Lecture 23 Apr. 10, 2019 1
More informationTechnical Report TR Texas A&M University Parasol Lab. Algorithms for Filling Obstacle Pockets with Hexagonal Metamorphic Robots
Technical Report TR03-002 Texas A&M University Parasol Lab Algorithms for Filling Obstacle Pockets with Hexagonal Metamorphic Robots Author: Mary E. Brooks, University of Oklahoma Advisors: Dr. Jennifer
More informationHormone- Controlled Metamorphic Robots
Proceedings of the 2001 IEEE International Conference on Robotics & Automation Seoul, Korea. May 21-26, 2001 Hormone- Controlled Metamorphic Robots Behnam Salemi, Wei-Min Shen, and Peter Will USC Information
More informationHierarchical Motion Planning for Self-reconfigurable Modular Robots
Hierarchical Motion Planning for Self-reconfigurable Modular Robots Preethi Bhat James Kuffner Seth Goldstein Siddhartha Srinivasa 2 The Robotics Institute 2 Intel Research Pittsburgh Carnegie Mellon University
More informationContinuous Valued Q-learning for Vision-Guided Behavior Acquisition
Continuous Valued Q-learning for Vision-Guided Behavior Acquisition Yasutake Takahashi, Masanori Takeda, and Minoru Asada Dept. of Adaptive Machine Systems Graduate School of Engineering Osaka University
More informationLecture 15: Segmentation (Edge Based, Hough Transform)
Lecture 15: Segmentation (Edge Based, Hough Transform) c Bryan S. Morse, Brigham Young University, 1998 000 Last modified on February 3, 000 at :00 PM Contents 15.1 Introduction..............................................
More informationAsymptotically Optimal Kinodynamic Motion Planning for Self-Reconfigurable Robots
Asymptotically Optimal Kinodynamic Motion Planning for Self-Reconfigurable Robots John H. Reif 1 and Sam Slee 1 Department of Computer Science, Duke University, Durham, NC, USA. {reif,sgs}@cs.duke.edu
More informationApplying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning
Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning Jan Peters 1, Stefan Schaal 1 University of Southern California, Los Angeles CA 90089, USA Abstract. In this paper, we
More informationApprenticeship Learning for Reinforcement Learning. with application to RC helicopter flight Ritwik Anand, Nick Haliday, Audrey Huang
Apprenticeship Learning for Reinforcement Learning with application to RC helicopter flight Ritwik Anand, Nick Haliday, Audrey Huang Table of Contents Introduction Theory Autonomous helicopter control
More informationSensor Models / Occupancy Mapping / Density Mapping
Statistical Techniques in Robotics (16-831, F09) Lecture #3 (09/01/2009) Sensor Models / Occupancy Mapping / Density Mapping Lecturer: Drew Bagnell Scribe: Mehmet R. Dogar 1 Sensor Models 1.1 Beam Model
More informationCSE151 Assignment 2 Markov Decision Processes in the Grid World
CSE5 Assignment Markov Decision Processes in the Grid World Grace Lin A484 gclin@ucsd.edu Tom Maddock A55645 tmaddock@ucsd.edu Abstract Markov decision processes exemplify sequential problems, which are
More informationA New Meta-Module for Controlling Large Sheets of ATRON Modules
A New Meta-Module for Controlling Large Sheets of ATRON Modules David Brandt and David Johan Christensen The Adaptronics Group, The Maersk Institute, University of Southern Denmark {david.brandt, david}@mmmi.sdu.dk
More informationOptimal Kinodynamic Motion Planning for 2D Reconfiguration of Self-Reconfigurable Robots
Optimal Kinodynamic Motion Planning for 2D Reconfiguration of Self-Reconfigurable Robots John Reif Department of Computer Science, Duke University Abstract A self-reconfigurable (SR) robot is one composed
More informationEM-Cube: Cube-Shaped, Self-reconfigurable Robots Sliding on Structure Surfaces
2008 IEEE International Conference on Robotics and Automation Pasadena, CA, USA, May 19-23, 2008 EM-Cube: Cube-Shaped, Self-reconfigurable Robots Sliding on Structure Surfaces Byoung Kwon An Abstract Many
More informationEvolutionary Synthesis of Dynamic Motion and Reconfiguration Process for a Modular Robot M-TRAN
Evolutionary Synthesis of Dynamic Motion and Reconfiguration Process for a Modular Robot M-TRAN Eiichi Yoshida *, Satoshi Murata **, Akiya Kamimura *, Kohji Tomita *, Haruhisa Kurokawa * and Shigeru Kokaji
More informationSelf-Reconfigurable Modular Robot (M-TRAN) and its Motion Design
Self-Reconfigurable Modular Robot (M-TRAN) and its Motion Design Haruhisa KUROKAWA *, Akiya KAMIMURA *, Eiichi YOSHIDA *, Kohji TOMITA *, Satoshi MURATA ** and Shigeru KOKAJI * * National Institute of
More informationDistributed Goal Recognition Algorithms for Modular Robots
Distributed Goal Recognition Algorithms for Modular Robots Zack Butler, Robert Fitch, Daniela Rus and Yuhang Wang Department of Computer Science Dartmouth College {zackb,rfitch,rus,wyh}@cs.dartmouth.edu
More informationStrategies for simulating pedestrian navigation with multiple reinforcement learning agents
Strategies for simulating pedestrian navigation with multiple reinforcement learning agents Francisco Martinez-Gil, Miguel Lozano, Fernando Ferna ndez Presented by: Daniel Geschwender 9/29/2016 1 Overview
More informationReinforcement Learning (2)
Reinforcement Learning (2) Bruno Bouzy 1 october 2013 This document is the second part of the «Reinforcement Learning» chapter of the «Agent oriented learning» teaching unit of the Master MI computer course.
More informationA motion planning method for mobile robot considering rotational motion in area coverage task
Asia Pacific Conference on Robot IoT System Development and Platform 018 (APRIS018) A motion planning method for mobile robot considering rotational motion in area coverage task Yano Taiki 1,a) Takase
More informationTURN AROUND BEHAVIOR GENERATION AND EXECUTION FOR UNMANNED GROUND VEHICLES OPERATING IN ROUGH TERRAIN
1 TURN AROUND BEHAVIOR GENERATION AND EXECUTION FOR UNMANNED GROUND VEHICLES OPERATING IN ROUGH TERRAIN M. M. DABBEERU AND P. SVEC Department of Mechanical Engineering, University of Maryland, College
More informationLinear Reconfiguration of Cube-Style Modular Robots
Linear Reconfiguration of ube-style Modular Robots Greg loupis Sébastien ollette Mirela Damian Erik D. Demaine Robin Flatland Stefan Langerman Joseph O Rourke Suneeta Ramaswami Vera Sacristán Stefanie
More informationForward Search Value Iteration For POMDPs
Forward Search Value Iteration For POMDPs Guy Shani and Ronen I. Brafman and Solomon E. Shimony Department of Computer Science, Ben-Gurion University, Beer-Sheva, Israel Abstract Recent scaling up of POMDP
More informationLayering Algorithm for Collision-Free Traversal Using Hexagonal Self-Reconfigurable Metamorphic Robots
Layering Algorithm for Collision-Free Traversal Using Hexagonal Self-Reconfigurable Metamorphic Robots Plamen Ivanov and Jennifer Walter Computer Science Department Vassar College {plivanov,jewalter}@vassar.edu
More informationDistributed Control for Unit-compressible Robots: Goal-recognition, Locomotion and Splitting
1 Distributed Control for Unit-compressible Robots: Goal-recognition, Locomotion and Splitting Zack Butler, Robert Fitch and Daniela Rus Dept. of Computer Science Dartmouth College Abstract We present
More informationMarkov Decision Processes and Reinforcement Learning
Lecture 14 and Marco Chiarandini Department of Mathematics & Computer Science University of Southern Denmark Slides by Stuart Russell and Peter Norvig Course Overview Introduction Artificial Intelligence
More informationQ-learning with linear function approximation
Q-learning with linear function approximation Francisco S. Melo and M. Isabel Ribeiro Institute for Systems and Robotics [fmelo,mir]@isr.ist.utl.pt Conference on Learning Theory, COLT 2007 June 14th, 2007
More informationSelf Assembly of Modular Manipulators with Active and Passive modules
Self Assembly of Modular Manipulators with Active and Passive modules Seung-kook Yun, Yeoreum Yoon, Daniela Rus Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology,
More informationLearning Complex Motions by Sequencing Simpler Motion Templates
Learning Complex Motions by Sequencing Simpler Motion Templates Gerhard Neumann gerhard@igi.tugraz.at Wolfgang Maass maass@igi.tugraz.at Institute for Theoretical Computer Science, Graz University of Technology,
More informationHomework #1. Displays, Image Processing, Affine Transformations, Hierarchical modeling, Projections
Computer Graphics Instructor: rian Curless CSEP 557 Winter 213 Homework #1 Displays, Image Processing, Affine Transformations, Hierarchical modeling, Projections Assigned: Tuesday, January 22 nd Due: Tuesday,
More informationReinforcement Learning with Parameterized Actions
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) Reinforcement Learning with Parameterized Actions Warwick Masson and Pravesh Ranchod School of Computer Science and Applied
More informationSlides credited from Dr. David Silver & Hung-Yi Lee
Slides credited from Dr. David Silver & Hung-Yi Lee Review Reinforcement Learning 2 Reinforcement Learning RL is a general purpose framework for decision making RL is for an agent with the capacity to
More informationA (Revised) Survey of Approximate Methods for Solving Partially Observable Markov Decision Processes
A (Revised) Survey of Approximate Methods for Solving Partially Observable Markov Decision Processes Douglas Aberdeen National ICT Australia, Canberra, Australia. December 8, 2003 Abstract Partially observable
More information6.141: Robotics systems and science Lecture 9: Configuration Space and Motion Planning
6.141: Robotics systems and science Lecture 9: Configuration Space and Motion Planning Lecture Notes Prepared by Daniela Rus EECS/MIT Spring 2012 Figures by Nancy Amato, Rodney Brooks, Vijay Kumar Reading:
More informationEdges and Lines Readings: Chapter 10: better edge detectors line finding circle finding
Edges and Lines Readings: Chapter 10: 10.2.3-10.3 better edge detectors line finding circle finding 1 Lines and Arcs Segmentation In some image sets, lines, curves, and circular arcs are more useful than
More informationFrom Theory to Practice: Distributed Coverage Control Experiments with Groups of Robots
From Theory to Practice: Distributed Coverage Control Experiments with Groups of Robots Mac Schwager 1, James McLurkin 1, Jean-Jacques E. Slotine 2, and Daniela Rus 1 1 Computer Science and Artificial
More informationRobotic Search & Rescue via Online Multi-task Reinforcement Learning
Lisa Lee Department of Mathematics, Princeton University, Princeton, NJ 08544, USA Advisor: Dr. Eric Eaton Mentors: Dr. Haitham Bou Ammar, Christopher Clingerman GRASP Laboratory, University of Pennsylvania,
More informationOptic Flow and Basics Towards Horn-Schunck 1
Optic Flow and Basics Towards Horn-Schunck 1 Lecture 7 See Section 4.1 and Beginning of 4.2 in Reinhard Klette: Concise Computer Vision Springer-Verlag, London, 2014 1 See last slide for copyright information.
More informationAssignment 4: CS Machine Learning
Assignment 4: CS7641 - Machine Learning Saad Khan November 29, 2015 1 Introduction The purpose of this assignment is to apply some of the techniques learned from reinforcement learning to make decisions
More informationMarkov Decision Processes (MDPs) (cont.)
Markov Decision Processes (MDPs) (cont.) Machine Learning 070/578 Carlos Guestrin Carnegie Mellon University November 29 th, 2007 Markov Decision Process (MDP) Representation State space: Joint state x
More informationAUTONOMOUS SYSTEMS. LOCALIZATION, MAPPING & SIMULTANEOUS LOCALIZATION AND MAPPING Part V Mapping & Occupancy Grid Mapping
AUTONOMOUS SYSTEMS LOCALIZATION, MAPPING & SIMULTANEOUS LOCALIZATION AND MAPPING Part V Mapping & Occupancy Grid Mapping Maria Isabel Ribeiro Pedro Lima with revisions introduced by Rodrigo Ventura Instituto
More informationProgrammable Reinforcement Learning Agents
Programmable Reinforcement Learning Agents David Andre and Stuart J. Russell Computer Science Division, UC Berkeley, CA 9702 dandre,russell @cs.berkeley.edu Abstract We present an expressive agent design
More informationAutomated Refinement Checking of Asynchronous Processes. Rajeev Alur. University of Pennsylvania
Automated Refinement Checking of Asynchronous Processes Rajeev Alur University of Pennsylvania www.cis.upenn.edu/~alur/ Intel Formal Verification Seminar, July 2001 Problem Refinement Checking Given two
More information6.141: Robotics systems and science Lecture 9: Configuration Space and Motion Planning
6.141: Robotics systems and science Lecture 9: Configuration Space and Motion Planning Lecture Notes Prepared by Daniela Rus EECS/MIT Spring 2011 Figures by Nancy Amato, Rodney Brooks, Vijay Kumar Reading:
More informationAdaptive Robotics - Final Report Extending Q-Learning to Infinite Spaces
Adaptive Robotics - Final Report Extending Q-Learning to Infinite Spaces Eric Christiansen Michael Gorbach May 13, 2008 Abstract One of the drawbacks of standard reinforcement learning techniques is that
More informationMobile Robots: An Introduction.
Mobile Robots: An Introduction Amirkabir University of Technology Computer Engineering & Information Technology Department http://ce.aut.ac.ir/~shiry/lecture/robotics-2004/robotics04.html Introduction
More informationGauss-Sigmoid Neural Network
Gauss-Sigmoid Neural Network Katsunari SHIBATA and Koji ITO Tokyo Institute of Technology, Yokohama, JAPAN shibata@ito.dis.titech.ac.jp Abstract- Recently RBF(Radial Basis Function)-based networks have
More informationPredictive Autonomous Robot Navigation
Predictive Autonomous Robot Navigation Amalia F. Foka and Panos E. Trahanias Institute of Computer Science, Foundation for Research and Technology-Hellas (FORTH), Heraklion, Greece and Department of Computer
More information