arxiv: v1 [cs.ro] 31 Dec 2018

Similar documents
Learning from Successes and Failures to Grasp Objects with a Vacuum Gripper

Towards Grasp Transfer using Shape Deformation

Learning Hand-Eye Coordination for Robotic Grasping with Large-Scale Data Collection

Learning a visuomotor controller for real world robotic grasping using simulated depth images

arxiv: v2 [cs.ro] 26 Oct 2018

The Crucial Components to Solve the Picking Problem

arxiv: v2 [cs.ro] 18 Sep 2018

Prof. Fanny Ficuciello Robotics for Bioengineering Visual Servoing

Learning to Singulate Objects using a Push Proposal Network

Toward data-driven contact mechanics

Learning Deep Policies for Robot Bin Picking by Simulating Robust Grasping Sequences

arxiv: v3 [cs.lg] 2 Apr 2016

Deep Learning a Grasp Function for Grasping under Gripper Pose Uncertainty

arxiv: v2 [cs.cv] 12 Jul 2017

Combining RGB and Points to Predict Grasping Region for Robotic Bin-Picking

Final Project Report: Mobile Pick and Place

arxiv: v3 [cs.ro] 9 Nov 2017

Applying Synthetic Images to Learning Grasping Orientation from Single Monocular Images

Keywords: clustering, construction, machine vision

Open World Assistive Grasping Using Laser Selection

Robotic Grasping Based on Efficient Tracking and Visual Servoing using Local Feature Descriptors

LUMS Mine Detector Project

Learning 6-DoF Grasping and Pick-Place Using Attention Focus

Pixels to Plans: Learning Non-Prehensile Manipulation by Imitating a Planner

Closing the Loop for Robotic Grasping: A Real-time, Generative Grasp Synthesis Approach

arxiv: v2 [cs.lg] 25 Sep 2017

Assignment 3. Position of the center +/- 0.1 inches Orientation +/- 1 degree. Decal, marker Stereo, matching algorithms Pose estimation

arxiv: v1 [cs.ro] 11 Jul 2016

A Modular Software Framework for Eye-Hand Coordination in Humanoid Robots

Accurate 3D Face and Body Modeling from a Single Fixed Kinect

Pose estimation using a variety of techniques

Inverse KKT Motion Optimization: A Newton Method to Efficiently Extract Task Spaces and Cost Parameters from Demonstrations

Spatial R-C-C-R Mechanism for a Single DOF Gripper

Efficient Grasping from RGBD Images: Learning Using a New Rectangle Representation. Yun Jiang, Stephen Moseson, Ashutosh Saxena Cornell University

DeepIM: Deep Iterative Matching for 6D Pose Estimation - Supplementary Material

Generating and Learning from 3D Models of Objects through Interactions. by Kiana Alcala, Kathryn Baldauf, Aylish Wrench

Intelligent Robotics

Planning Multi-Fingered Grasps as Probabilistic Inference in a Learned Deep Network

arxiv: v1 [cs.cv] 18 Sep 2018

Data-driven Approaches to Simulation (Motion Capture)

BIN PICKING APPLICATIONS AND TECHNOLOGIES

Tactile-Visual Integration for Task-Aware Grasping. Mabel M. Zhang, Andreas ten Pas, Renaud Detry, Kostas Daniilidis

Fingertips Tracking based on Gradient Vector

Improved Object Pose Estimation via Deep Pre-touch Sensing

Shape Completion Enabled Robotic Grasping

Design and building motion capture system using transducer Microsoft kinect to control robot humanoid

output vector controller y s... y 2 y 1

Manipulating a Large Variety of Objects and Tool Use in Domestic Service, Industrial Automation, Search and Rescue, and Space Exploration

Convolutional Residual Network for Grasp Localization

Perceiving the 3D World from Images and Videos. Yu Xiang Postdoctoral Researcher University of Washington

Team Description Paper Team AutonOHM

Lighting- and Occlusion-robust View-based Teaching/Playback for Model-free Robot Programming

Benchmarking for the Metamorphic Hand based on a. Dimensionality Reduction Model

Mechanical simulation design of the shaft type hybrid mechanical arm based on Solidworks

Learning End-Effector Orientations for Novel Object Grasping Tasks

arxiv: v2 [cs.cv] 14 May 2018

Robot Vision without Calibration

This week. CENG 732 Computer Animation. Warping an Object. Warping an Object. 2D Grid Deformation. Warping an Object.

Modelling of Human Welder for Intelligent Welding and Welder Training*

Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture David Eigen, Rob Fergus

Abstract Purpose: Results: Methods: Conclusions:

Intern Presentation:

Introduction to Digitization Techniques for Surgical Guidance

3D object recognition used by team robotto

Using Web Camera Technology to Monitor Steel Construction

Intelligent Cutting of the Bird Shoulder Joint

Small Object Manipulation in 3D Perception Robotic Systems Using Visual Servoing

Self-Supervised Visual Planning with Temporal Skip Connections. {febert,cbfinn,alexlee

Ceilbot vision and mapping system

Visual Collision Detection for Corrective Movements during Grasping on a Humanoid Robot

A novel approach to motion tracking with wearable sensors based on Probabilistic Graphical Models

Learning from 3D Data

Research Subject. Dynamics Computation and Behavior Capture of Human Figures (Nakamura Group)

High precision grasp pose detection in dense clutter*

Physically-based Grasp Quality Evaluation under Pose Uncertainty

Design & Kinematic Analysis of an Articulated Robotic Manipulator

arxiv: v1 [cs.cv] 28 Sep 2018

Using Vision for Pre- and Post-Grasping Object Localization for Soft Hands

Feature Transfer and Matching in Disparate Stereo Views through the use of Plane Homographies

10/25/2018. Robotics and automation. Dr. Ibrahim Al-Naimi. Chapter two. Introduction To Robot Manipulators

Product information. Hi-Tech Electronics Pte Ltd

Robust Vision-Based Detection and Grasping Object for Manipulator using SIFT Keypoint Detector

Reinforcement Learning for Appearance Based Visual Servoing in Robotic Manipulation

3D model classification using convolutional neural network

ACE Project Report. December 10, Reid Simmons, Sanjiv Singh Robotics Institute Carnegie Mellon University

arxiv: v1 [cs.cv] 16 Sep 2018

Push-Net: Deep Planar Pushing for Objects with Unknown Physical Properties

DEVELOPMENT OF TELE-ROBOTIC INTERFACE SYSTEM FOR THE HOT-LINE MAINTENANCE. Chang-Hyun Kim, Min-Soeng Kim, Ju-Jang Lee,1

arxiv: v3 [cs.cv] 21 Oct 2016

Human Augmentation in Teleoperation of Arm Manipulators in an Environment with Obstacles

CS4758: Rovio Augmented Vision Mapping Project

Virtual Interaction System Based on Optical Capture

Geometry of Multiple views

MOTION TRAJECTORY PLANNING AND SIMULATION OF 6- DOF MANIPULATOR ARM ROBOT

z θ 1 l 2 θ 2 l 1 l 3 θ 3 θ 4 θ 5 θ 6

Learning Robotic Manipulation of Granular Media

Dept. of Mechanical and Materials Engineering University of Western Ontario London, Ontario, Canada

Hand-Eye Calibration from Image Derivatives

Supplementary: Cross-modal Deep Variational Hand Pose Estimation

arxiv: v1 [cs.ro] 23 May 2018

Transcription:

A dataset of 40K naturalistic 6-degree-of-freedom robotic grasp demonstrations. Rajan Iyengar, Victor Reyes Osorio, Presish Bhattachan, Adrian Ragobar, Bryan Tripp * University of Waterloo arxiv:1812.11683v1 [cs.ro] 31 Dec 2018 * bptripp@uwaterloo.ca Abstract Modern approaches to grasp planning often involve deep learning. However, there are only a few large datasets of labelled grasping examples on physical robots, and available datasets involve relatively simple planar grasps with two-fingered grippers. Here we present: 1) a new human grasp demonstration method that facilitates rapid collection of naturalistic grasp examples, with full six-degree-of-freedom gripper positioning; and 2) a dataset of roughly forty thousand successful grasps on 109 different rigid objects with the RightHand Robotics three-fingered ReFlex gripper. Introduction Grasp planning has traditionally used analytic methods to estimate quality metrics for potential grasps [6, 14]. Two particular limitations of analytic grasp metrics are the need for accurate knowledge of the object geometry, and assumptions involved, such as simplified contact models. Much recent work in grasp planning has focused on data-driven approaches [2] to address both of these limitations. A common approach is to use deep learning to map depth or RGB images to quantities that can be used directly for grasp planning, such as grasp success predictions [8, 10, 13, 15, 18 20]. For example, uncertainty about the shapes of novel objects has been addressed by training deep networks to predict grasp metrics from depth images, using large numbers of known synthetic examples [13]. However, the relationship between these metrics and physical grasp success is complex [17]. Ideally, a grasp planner would be trained directly on physical grasping examples rather than grasp quality metrics. However, a central challenge for this approach is the expense of obtaining sufficient labelled data to support sophisticated decisions without overfitting. To reduce this expense, some groups have resorted to large-scale physics simulations [8, 9]. However, these simulations have (so far) employed simplified contact models, reintroducing one of the key limitations that motivated a departure from grasp quality metrics. Other groups have trained models initially in simulated environments, then trained further with physical robots [3, 22], or performed large-scale trial-and-error data collection [15] or reinforcement learning [7, 11, 16] on physical robots. Datasets of successful grasps have also been generated by humans. This requires either transfer from human-hand grasps to robotic-gripper grasps [5, 12] or human control of a robot. The latter has been achieved by physically guiding the robotic arm [1], and recently, teleoperation using virtual reality hardware [21]. These methods have, so far, not produced sufficiently large datasets for deep learning of skilled grasping (although this seems feasible with teleoperation). Overall, while robotic grasp planning with unknown objects has been extensively studied, there is still much room for improvement in success rates. New labelled datasets of physical robotic grasps may allow further progress. 1/7

We developed a new grasp demonstration approach that is intended to make grasp demonstration relatively rapid and naturalistic. Here we describe this approach, and a corresponding dataset of 40K successful grasps, demonstrated on 109 objects. We believe this to be the largest available dataset of human grasp demonstrations with a robotic gripper. The grasps use a three-fingered gripper (RightHand Robotics ReFlex gripper), and full 6-degree-of-freedom trajectories. The dataset can be used, shared, and modified freely for any non-commercial purpose. It is available from https://dataverse.scholarsportal.info/dataverse/uw-brain-lab. Methods Grasp Demonstration Method Our method is a hybrid of previous approaches that have used motion tracking with human-hand grasps [5], and manual control of a robot [1]. To approach the speed and naturalistic control of human-hand grasping while avoiding the need to generalize from the hand to a gripper, we mounted a gripper on a 3d-printed handle with motion-tracking markers (Figure 1). This allowed the operator to position the gripper with natural arm movements. We used a Polaris optical motion tracker from Northern Digital Inc. (NDI). This system can track the 6DOF configuration of unique multi-marker tools, but it requires a line of sight to all of the markers on a tool. To reduce occlusions, we mounted two of these tools at different positions and angles on the gripper handle (Figure 1, bottom left). We used a RightHand Robotics ReFlex gripper. The operator controlled the gripper fingers with a joystick. This gripper has four degrees of freedom in the finger positions, corresponding to flexion of each finger, and the angle of spread between two of the fingers. We used one degree of freedom of the joystick to control the spread, and the other to control all finger flexion angles together. Objects and Data Collection We collected grasp demonstrations with 109 different rigid objects. 48 of these belonged to the YCB dataset [4]. We avoided YCBs object that were either non-rigid, or too small or too large to easily handle with the ReFlex gripper. Two experimenters participated in each data collection session. In each trial, one experimenter placed an object on a table, in a random orientation, and the other used the joystick and handle to grasp and lift the object (see Figure 1). To encourage more variability in the grasps, one or two additional objects ( obstacles ) were placed between the operator and the target object in some trials. Failure to grasp and lift an object was rare, particularly after the operator had encountered a given object a few times. Failed grasps were not included in the dataset. Prior to each grasp, we captured images of the target object with two cameras. One was an infrared structured-light sensor that captured RGB and depth images (RealSense SR300). Because the motion tracker also used infrared light, this camera and the motion tracker were enabled at alternating times. The second camera was a stereo camera (Stereolabs ZED). From this camera we saved stereo RGB images, as well as the depth map estimated from these images by the ZED software. Both cameras were fixed to a rigid frame that also held a small table on which the objects were placed. For each grasp, we stored the 6DOF trajectory of the gripper as it approached the object, along with the gripper finger positions. The gripper and finger positions were recorded on different computers, with different sampling rates. To create an integrated 10-dimensional gripper-configuration signal, we synchronized the clocks and interpolated the finger positions at the times of the motion-tracker samples. The coordinate systems of the gripper and the table were right-handed. The positive z axis of the gripper pointed out of the palm, and the positive y axis pointed between the two fingers on one side of the gripper. Finger positions were recorded in units of 1/4096 rotations of the corresponding servos. A home finger position was also saved for each trial. In the home position, the two fingers on one side of the gripper were oriented parallel to each other, and the fingers were extended, so that 2/7

Figure 1. Top: An experimenter using the handle-mounted gripper and joystick to grasp an object. Left: 3D printed handle for naturalistic human positioning of the ReFlex gripper. This part replaces the ReFlex gripper s case. Two distinct NDI tools are mounted at known positions and orientations relative to the gripper axes, allowing us to reliably determine the gripper s configuration during grasping. Right: To measure the tool positions relative to the gripper reference frame, we temporarily mounted a tool at the point we defined as the gripper origin, and aligned with its axes. 3/7

Figure 2. Setup for grasp replay on the UR5 robot. The frame is the same one used for data collection. For replay the frame is bolted to the UR5 s base, whereas for data collection it is clamped to a table in front of the NDI Polaris. the fingers on each side were oriented approximately 180 degrees from each other about the x axis. The fingers were visually aligned to this position periodically, as well as immediately after occasional mechanical problems with the gripper that affected the relationship between servo and finger positions (such as after replacing stripped gears). Re-creating and Replaying Grasps After data collection, we re-created the target-object placement for some of the grasps. To do this, we overlaid the object image captured during the trial with a live image from the camera, and manually aligned the images by moving the object on the table. In some cases, we attached a motion-tracker tool to the object and manually measured its position relative to the object centre. This allowed us to analyze gripper positions in object coordinates rather than table coordinates for these grasps. In other cases, we replayed recorded grasps, with the gripper mounted on a robotic arm (Universal Robots UR5). This allowed us to confirm that re-creation of the object positions was fairly accurate, and to study robustness of the demonstrated grasps by replaying them with small perturbations. Results Table 1 shows an example of a full processed record for a single grasping trial. Most of the record consists of lines with 11 numbers each. Each of these lines corresponds to a time point in the trial. The first is the time, in seconds. The next six describe the position and orientation of the gripper base, and the last four describe the finger positions (see details in Methods). The units are 1/4096-rotation increments of the servos. The first two values are flexion positions of the two fingers on one side of the gripper, the third is the flexion position of the opposed finger, and the fourth is the spread between fingers 1 and 2. Figure 3 illustrates a number of final gripper positions (for different trials) around a single object. Conclusion We extended previous grasp demonstration methods to allow support rapid collection of a large dataset of naturalistic grasp examples. Our dataset includes roughly forty thousand examples of successful human-controlled grasps with a three-fingered gripper and a wide variety of objects. We 4/7

58.766572,221.789299,-609.763151,195.410166,0.449433,-3.041296,-0.346504,15360,14073,17234,15977 58.852577,226.062290,-609.847780,194.122006,0.463341,-3.029768,-0.341773,15360,14073,17234,15977 58.932581,235.837668,-610.406901,190.827514,0.478962,-3.017942,-0.332029,15360,14073,17234,15977 59.016586,243.962991,-607.974445,190.105599,0.472996,-2.990546,-0.332048,15360,14073,17234,15977 59.101591,245.909832,-603.582400,192.819763,0.460992,-2.966361,-0.351199,15360,14073,17234,15977 59.182596,243.874769,-585.271146,208.594179,0.413329,-2.880503,-0.436490,15360,14073,17234,15977 59.718626,138.905502,-251.135107,335.640953,-0.294284,-1.607775,-0.849032,15360,14073,17234,15977 59.800631,128.693773,-216.504463,318.502908,-0.347747,-1.580540,-0.812127,15360,14073,17234,15977 59.882636,107.239968,-158.570112,275.121700,-0.417064,-1.567641,-0.740564,15360,14073,17234,15977 59.966641,87.909652,-117.351661,233.379995,-0.464845,-1.583969,-0.675839,15360,14073,17234,15977 60.051645,80.553602,-102.145953,218.497081,-0.481372,-1.568064,-0.655075,15360,14073,17234,15977 60.133650,70.049938,-79.689731,199.347674,-0.484544,-1.535028,-0.640559,15360,14073,17234,15977 60.216655,61.995151,-63.102111,187.352547,-0.511094,-1.511234,-0.618503,15360,14073,17234,15977 60.301660,59.427056,-56.254151,181.030151,-0.526766,-1.511101,-0.607948,15360,14073,17234,15977 60.383664,56.054238,-46.715327,170.989789,-0.539463,-1.502067,-0.604229,15360,14073,17234,15977 60.466669,53.731411,-40.008332,164.390846,-0.559969,-1.495705,-0.594769,15360,14073,17234,15977 60.550674,52.684224,-38.058058,161.073913,-0.562399,-1.508526,-0.592344,15360,14073,17234,15977 60.632679,50.528583,-36.798384,155.019788,-0.565769,-1.531361,-0.584006,15360,14073,17234,15977 60.716683,48.763014,-36.471269,153.736346,-0.580545,-1.516428,-0.576254,15360,14073,17234,15977 60.800688,48.144880,-35.951091,153.747681,-0.588292,-1.507040,-0.571679,15360,14073,17234,15977 60.882693,46.467974,-34.466787,153.331707,-0.592097,-1.498633,-0.565814,15360,14073,17234,15977 60.966698,44.244102,-32.376847,152.113245,-0.592320,-1.504597,-0.558035,15360,14073,17234,15977 61.050703,42.687376,-31.328490,151.058910,-0.589851,-1.511646,-0.552103,15360,14073,17234,15977 61.132707,39.011888,-29.567549,148.645548,-0.573705,-1.527906,-0.542928,15360,14073,17234,15977 61.216712,35.034849,-26.950631,147.070403,-0.570720,-1.528795,-0.531688,15360,14073,17234,15978 61.301717,33.156350,-25.321857,146.550399,-0.574164,-1.523476,-0.524586,15360,14073,17234,15978 61.383721,29.951727,-22.735748,144.591925,-0.581185,-1.518510,-0.511573,15360,14073,17234,15978 61.466726,28.670496,-21.602918,143.546636,-0.589383,-1.508460,-0.503629,15497,13935,17371,16020 61.551731,28.989980,-20.890682,143.836360,-0.595603,-1.501013,-0.504591,15672,13760,17547,16072 61.632736,29.602992,-20.291707,144.032218,-0.599700,-1.499684,-0.507331,15839,13593,17714,16121 61.716741,29.722951,-20.295574,143.725106,-0.604394,-1.498913,-0.504594,16012,13420,17887,16173 61.800746,29.739973,-19.953561,143.821686,-0.610408,-1.494604,-0.503381,16185,13247,18060,16224 61.882750,30.383383,-18.717478,144.782633,-0.619521,-1.482919,-0.507993,16354,13078,18229,16275 61.966755,30.977656,-17.384658,146.225541,-0.630547,-1.473200,-0.505784,16528,12904,18402,16326 62.051760,31.556563,-17.371589,150.831706,-0.629501,-1.462754,-0.513835,16613,12820,18488,16352 62.132764,33.859368,-18.337561,168.907829,-0.607408,-1.460303,-0.552683,16613,12820,18488,16352 62.216769,37.116328,-21.036063,185.732048,-0.572061,-1.471607,-0.592257,16613,12820,18488,16352 62.300774,38.880719,-23.174612,194.285943,-0.559710,-1.459379,-0.606706,16613,12820,18488,16352 62.383779,40.771197,-27.263256,208.944944,-0.540954,-1.463413,-0.634165,16613,12820,18488,16352 T:3-medium wrap plastic spiral 2 obs Fingers at calibration = [14523, 14910, 16398, 16365] Table 1. Complete contents of an example grasp record file (147104). The final three lines are: manually entered notes on the object and grasp type; the number of obstacles; and finger servo positions at the home position (other finger positions should be interpreted relative to these values). The preceding lines each have 11 entries each. The first is a time stamp, in seconds. Entries 2 to 4 are x, y, z offsets from the centre of the table surface, respectively, in mm. Entries 5-7 are an axis-angle representation of the gripper orientation. Entries 8 to 11 are positions of the gripper fingers (see text for details). 5/7

Figure 3. A render of a mustard bottle, with plots of final gripper positions and orientations (in the object s reference frame) for a number of grasps. The red, green, and blue axes are x, y, and z, respectively. The lines at the bottom are projections of the z axes onto the support surface. hope this dataset will be useful for training deep networks for grasp planning, and for understanding human grasping strategies. Acknowledgments We thank Ricky Verma and Ibrahim Okeil for their assistance with data collection, Xueyang Yao for assistance with data processing, and ReFlex Robotics for technical assistance. This work was supported by Applied Brain Research and NSERC. References 1. B. Akgun, M. Cakmak, K. Jiang, and A. L. Thomaz. Keyframe-based learning from demonstration. International Journal of Social Robotics, 4(4):343 355, 2012. 2. J. Bohg, A. Morales, T. Asfour, D. Kragic, and S. Member. Data-Driven Grasp Synthesis A Survey. IEEE Transactions on Robotics, 30(2):289 309, 2014. 3. K. Bousmalis, A. Irpan, P. Wohlhart, Y. Bai, M. Kelcey, M. Kalakrishnan, L. Downs, J. Ibarz, P. Pastor, K. Konolige, S. Levine, and V. Vanhoucke. Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping. In ICRA, pages 4243 4250. IEEE, 2018. 4. B. Calli, A. Walsman, S. Member, A. Singh, S. Member, S. Srinivasa, S. Member, P. Abbeel, S. Member, M. Dollar, and S. Member. Benchmarking in manipulation research: Using the Yale-CMU-Berkeley object and model set. IEEE Robotics & Automation Magazine, 22(3):36 52, 2015. 5. S. Ekvall and D. Kragic. Learning and evaluation of the approach vector for automatic grasp generation and planning. In Robotics and Automation, 2007 IEEE International Conference on, pages 4715 4720. IEEE, 2007. 6/7

6. C. Ferrari and J. Canny. Planning optimal grasps. In ICRA, pages 2290 2295, 1992. 7. S. Gu, E. Holly, T. Lillicrap, and S. Levine. Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates. In ICRA, page 9, 2017. 8. D. Kappler, J. Bohg, and S. Schaal. Leveraging Big Data for Grasp Planning. In ICRA, 2015. 9. A. Kleinhans, B. S. Rosman, M. Michalik, B. Tripp, and R. Detry. G3db: A database of successful and failed grasps with rgb-d images, point clouds, mesh models and gripper parameters. 2015. 10. I. Lenz, H. Lee, and A. Saxena. Deep learning for detecting robotic grasps. The International Journal of Robotics Research, 34(4-5):705 724, 2015. 11. S. Levine, P. Pastor, A. Krizhevsky, J. Ibarz, and D. Quillen. Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection. The International Journal of Robotics Research, 37(4-5):421 436, 2018. 12. Y. Lin and Y. Sun. Robot grasp planning based on demonstrated grasp strategies. The International Journal of Robotics Research, 34(1):26 42, 2015. 13. J. Mahler, J. Liang, S. Niyaz, M. Laskey, R. Doan, X. Liu, J. A. Ojea, and K. Goldberg. Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics. arxiv preprint, 2017. 14. A. Miller and P. Allen. Examples of 3D grasp quality computations. ICRA, 2(May):1240 1246, 1999. 15. L. Pinto and A. Gupta. Supersizing Self-supervision: Learning to Grasp from 50K Tries and 700 Robot Hours. ICRA, 2016. 16. D. Quillen, E. Jang, O. Nachum, C. Finn, J. Ibarz, and S. Levine. Deep Reinforcement Learning for Vision-Based Robotic Grasping : A Simulated Comparative Evaluation of Off-Policy Methods. arxiv preprint, page arxiv:1802, 2018. 17. C. Rubert, D. Kappler, A. Morales, S. Schaal, and J. Bohg. On the relevance of grasp metrics for predicting grasp success. In IROS, pages 265 272, 2017. 18. A. Saxena, J. Driemeyer, and A. Y. Ng. Robotic Grasping of Novel Objects using Vision. The International Journal of Robotics Research, 27(2):157 173, 2008. 19. P. Schmidt, N. Vahrenkamp, M. Wachter, and T. Asfour. Grasping of Unknown Objects using Deep Convolutional Neural Networks based on Depth Images. In ICRA, pages 6831 6838, 2018. 20. Z. Wang, Z. Li, B. Wang, and H. Liu. Robot grasp detection using multimodal deep convolutional neural networks. Advances in Mechanical Engineering, 8(9):1 12, 2016. 21. T. Zhang, Z. McCarthy, O. Jowl, D. Lee, X. Chen, K. Goldberg, and P. Abbeel. Deep imitation learning for complex manipulation tasks from virtual reality teleoperation. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages 1 8. IEEE, 2018. 22. Y. Zhu, Z. Wang, J. Merel, A. Rusu, T. Erez, S. Cabi, S. Tunyasuvunakool, J. Kramár, R. Hadsell, N. de Freitas, et al. Reinforcement and imitation learning for diverse visuomotor skills. arxiv preprint arxiv:1802.09564, 2018. 7/7