Visual Perception for Robots. Sven Behnke, Computer Science Institute VI: Autonomous Intelligent Systems
Our Cognitive Robots: complete systems for example scenarios, equipped with rich sensors: flying robot, soccer robot, communication robot, service robot, exploration robot.
Our Humanoid Soccer Robots: Dynaped, Copedo, NimbRo-OP. Size: 95-114 cm, weight: 6.6-8 kg, 13-20 articulated joints. PC, wide-angle camera(s), IMU.
Visual Perception: YUV color segmentation; recognition of field, ball, goals, obstacles, field lines, and corners; egocentric modeling; probabilistic localization. [Schulz & Behnke, Advanced Robotics 2012]
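The YUV color segmentation step can be sketched as a per-pixel look-up. This is a minimal illustration, assuming purely hypothetical box thresholds; the actual robots use calibrated color tables, not these values.

```python
def classify_yuv(y, u, v):
    """Assign a pixel to a color class by box thresholds in YUV space.
    Thresholds here are illustrative, not calibrated values."""
    if y > 200 and abs(u - 128) < 30 and abs(v - 128) < 30:
        return "white"      # field lines
    if v > 160 and y > 60:
        return "orange"     # ball
    if u < 110 and v < 130 and y > 40:
        return "green"      # field carpet
    return "unknown"

# Precomputing a coarse look-up table makes classification fast enough
# to run on every pixel of a full camera frame:
LUT = {(y, u, v): classify_yuv(y * 16, u * 16, v * 16)
       for y in range(16) for u in range(16) for v in range(16)}

def classify_fast(y, u, v):
    """Classify via the precomputed 16x16x16 look-up table."""
    return LUT[(y // 16, u // 16, v // 16)]
```

The look-up-table design trades a little color resolution for constant-time classification, which is the usual choice on resource-constrained soccer robots.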
Features for Localization: goals, field lines, corners of lines, side poles. Egocentric view; localization.
Observation Likelihood: lines, side poles, line corners, all features combined.
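In a particle filter, the per-feature likelihoods (lines, poles, corners) are typically combined into one particle weight as a product of independent terms. A minimal sketch, assuming a Gaussian bearing-error model with an illustrative sigma:

```python
import math

def feature_likelihood(expected_bearing, observed_bearing, sigma=0.2):
    """Gaussian likelihood of one observed landmark bearing (radians).
    The angle difference is wrapped into (-pi, pi] via atan2."""
    err = math.atan2(math.sin(observed_bearing - expected_bearing),
                     math.cos(observed_bearing - expected_bearing))
    return math.exp(-0.5 * (err / sigma) ** 2)

def observation_likelihood(expected, observed, sigma=0.2):
    """Particle weight: product of per-feature likelihoods,
    assuming conditionally independent feature observations."""
    w = 1.0
    for e, o in zip(expected, observed):
        w *= feature_likelihood(e, o, sigma)
    return w
```

Multiplying likelihoods assumes the features are conditionally independent given the pose, which is a standard simplification in field-robot localization.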
RoboCup 2013 Final: NimbRo 4:0 CIT Brains => won for the fifth time in a row.
Intuitive Multimodal Communication: not keyboard, mouse, and screen, but eye contact, facing with head and trunk, facial expressions, gestures, speech, and body language. Transfer established human communication techniques to the human-machine interface. Application: museum guide.
Perception of Communication Partners: detection and tracking of faces, head pose estimation [Bennewitz & Behnke: Humanoids 2005] [Vatahska, Bennewitz, Behnke: Humanoids 2007], gesture recognition [Axenbeck, Bennewitz, Behnke, Burgard: Humanoids 2008], speech recognition (Loquendo).
Robotinho in Deutsches Museum Bonn. [Nieuwenhuisen & Behnke, Journal of Social Robotics (SORO), 2013]
Our Service Robots: Dynamaid, Cosero. Size: 100-180 cm, weight: 30-35 kg, 36 articulated joints. PC, laser scanner, Kinect, microphone.
2D Mapping of the Environment
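A standard way to realize such a 2D map is a log-odds occupancy grid updated from laser rays. A minimal sketch; the increment constants are illustrative, not the values used on the robots:

```python
import math

L_OCC, L_FREE, L0 = 0.85, -0.4, 0.0   # illustrative log-odds increments

class OccupancyGrid2D:
    """Minimal log-odds occupancy grid: each cell accumulates evidence
    for being occupied (laser endpoint) or free (ray passed through)."""
    def __init__(self, w, h):
        self.logodds = [[L0] * w for _ in range(h)]

    def update(self, x, y, hit):
        self.logodds[y][x] += L_OCC if hit else L_FREE

    def probability(self, x, y):
        """Convert log-odds back to occupancy probability."""
        return 1.0 - 1.0 / (1.0 + math.exp(self.logodds[y][x]))

g = OccupancyGrid2D(10, 10)
for _ in range(3):
    g.update(5, 5, hit=True)   # endpoint cell: occupied evidence
    g.update(2, 5, hit=False)  # traversed cell: free evidence
```

Accumulating in log-odds keeps the update a cheap addition and makes repeated observations saturate gracefully.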
3D Mapping with Surfels
3D Mapping and Localization: registration of 3D laser scans; representation of point distributions in voxels; drivability assessment through region growing; robust localization using 2D laser scans. [Kläß, Stückler, Behnke: Robotik 2012]
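A surfel in this sense summarizes the point distribution inside one voxel by its sample mean and covariance. A minimal online sketch using a Welford-style update (the class and variable names are our own, not from the cited papers):

```python
class Surfel:
    """Running mean and covariance of the 3D points falling in one voxel."""
    def __init__(self):
        self.n = 0
        self.mean = [0.0, 0.0, 0.0]
        # Sum of outer products of deviations (co-moment matrix).
        self.cov_sum = [[0.0] * 3 for _ in range(3)]

    def add(self, p):
        """Incorporate one point without storing it (online update)."""
        self.n += 1
        delta = [p[i] - self.mean[i] for i in range(3)]       # vs. old mean
        self.mean = [self.mean[i] + delta[i] / self.n for i in range(3)]
        delta2 = [p[i] - self.mean[i] for i in range(3)]      # vs. new mean
        for i in range(3):
            for j in range(3):
                self.cov_sum[i][j] += delta[i] * delta2[j]

    def covariance(self):
        """Sample covariance; undefined for fewer than two points."""
        if self.n < 2:
            return None
        return [[c / (self.n - 1) for c in row] for row in self.cov_sum]

s = Surfel()
for p in [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]:
    s.add(p)
```

Storing only mean and covariance per voxel is what keeps surfel maps compact enough for efficient scan registration.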
3D Mapping by RGB-D SLAM: modeling of shape and color distributions in voxels with local multiresolution (2.5 cm / 5 cm); efficient registration of views on CPU; global optimization. [Stückler, Behnke: Journal of Visual Communication and Image Representation 2013] Multi-camera SLAM [Stoucken, diploma thesis 2013]
Learning and Tracking Object Models: modeling of objects by RGB-D SLAM; real-time registration with the current RGB-D image.
Transfer of Object Knowledge: non-rigid registration of known models with the actual object; transfer of grasps and end-effector motions. [Stückler, Behnke: submitted to ICRA]
Analysis of Table-Top Scenes and Grasp Planning: detection of clusters above the horizontal plane; two grasp types (top, side); flexible grasping of many unknown objects. [Stückler, Steffens, Holz, Behnke: Robotics and Autonomous Systems 2012]
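Detecting clusters above the support plane can be sketched as naive single-linkage clustering in the x-y plane. This is illustrative only; the `min_height` and `max_gap` thresholds are assumptions, and real pipelines (e.g. the cited one) work on dense point clouds with more efficient neighborhood search:

```python
def clusters_above_plane(points, z_plane, min_height=0.01, max_gap=0.05):
    """Group 3D points above a horizontal support plane into object
    candidates via single-linkage clustering on x-y distance."""
    above = [p for p in points if p[2] > z_plane + min_height]
    clusters = []
    for p in above:
        merged = None
        for c in clusters:
            near = any((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
                       <= max_gap ** 2 for q in c)
            if near:
                if merged is None:
                    c.append(p)
                    merged = c
                else:                  # p links two clusters: merge them
                    merged.extend(c)
                    c.clear()
        clusters = [c for c in clusters if c]
        if merged is None:
            clusters.append([p])
    return clusters

# Two toy objects on a table at z = 0 (coordinates in meters):
pts = [(0.00, 0.0, 0.10), (0.01, 0.0, 0.12),
       (1.00, 0.0, 0.10), (1.01, 0.0, 0.15),
       (0.50, 0.5, 0.00)]             # table-surface point, filtered out
objs = clusters_above_plane(pts, z_plane=0.0)
```

Each resulting cluster is then a candidate for the top or side grasp mentioned on the slide.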
Tool Use: Bottle Opener. Perception of the tool tip; extension of the arm kinematics; perception of the crown cap.
Tool Use: Pair of Tongs. Perception of the tool tip; extension of the arm kinematics; estimation of the sausage pose. Our team NimbRo has won the last three international RoboCup@Home competitions.
Perception of Persons: detection in laser scans and tracking; visual verification and identification (VeriLook); systematic exploration; speech recognition and synthesis (Loquendo) [Stückler & Behnke, RoboCup 2010]; gesture recognition; natural gaze control [Droeschel et al., ICRA 2011].
Visual Object Recognition: object detection with laser or Kinect; recognition based on color and texture features (SURF); object tracking.
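The color part of such recognition can be illustrated with coarse histograms compared by histogram intersection. This is a simplification of the slide's pipeline, which additionally uses SURF texture features; function names and bin counts are our own:

```python
def color_histogram(pixels, bins=4):
    """Coarse RGB histogram over (r, g, b) tuples, normalized to sum 1."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        hist[(r // step) * bins * bins + (g // step) * bins + (b // step)] += 1
    n = len(pixels)
    return [h / n for h in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical color distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Toy example: a red object matches itself, not a green one.
red = [(200, 10, 10)] * 10
green = [(10, 200, 10)] * 10
sim_same = histogram_intersection(color_histogram(red), color_histogram(red))
sim_diff = histogram_intersection(color_histogram(red), color_histogram(green))
```

Histogram intersection is robust to small viewpoint changes, which is why coarse color statistics pair well with local texture features for tracking.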
Semantic Mapping: pixel-wise classification of RGB-D images by random forests; inner nodes compare color/depth of regions; size normalization; training and recall on GPU; 3D fusion through RGB-D SLAM; evaluation on our own data set and on NYU Depth v2. [Stückler, Biresev, Behnke: IROS 2012]

Accuracy in %            Ø classes   Ø pixels
Silberman et al. 2012    59.6        58.6
Couprie et al. 2013      63.5        64.5
Random forest            65.9        68.6
3D fusion                67.0        70.9

(Figure: ground truth and segmentation results.)
[Stückler et al., accepted with minor revision for Journal of Real-Time Image Processing]
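The "size normalization" in such depth-aware random forests usually means scaling pixel offsets by inverse depth, so that a feature probe covers the same metric extent regardless of distance. A hedged sketch of one inner split node, with our own names and a row-major list-of-lists depth image:

```python
def region_feature(depth_image, u, v, offset, d_center):
    """Probe a pixel at an offset scaled by inverse depth: nearby
    surfaces appear larger, so offsets shrink with distance."""
    du, dv = offset
    scale = 1.0 / d_center
    uu = min(max(int(u + du * scale), 0), len(depth_image[0]) - 1)
    vv = min(max(int(v + dv * scale), 0), len(depth_image) - 1)
    return depth_image[vv][uu]

def split_node(depth_image, u, v, off1, off2, threshold):
    """Inner tree node: compare two depth probes around (u, v)
    against a learned threshold; returns the branch decision."""
    d = depth_image[v][u]
    return (region_feature(depth_image, u, v, off1, d)
            - region_feature(depth_image, u, v, off2, d)) < threshold

# Toy 8x8 depth image: one probe region lies farther away.
img = [[1.0] * 8 for _ in range(8)]
img[4][6] = 2.0
```

Each leaf of a trained tree then stores a class distribution, and the forest averages the leaf distributions per pixel.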
Learning Depth-Sensitive CRFs: SLIC+depth superpixels; unary features: random forest output and a height feature; pairwise features: color contrast, directed angle, depth difference, and normal differences (similarity between superpixel normals). (Figure: random forest result, CRF prediction, ground truth.) [Müller and Behnke, submitted to ICRA]
Object Class Detection in RGB-D: Hough forests not only make an object-class decision but also vote for the object center; RGB-D Objects data set; color and depth features; training with rendered scenes; detection of object position and orientation. Depth helps a lot. (Figure: scene, class probabilities, object centers, orientation, detected objects.) [Badami, Stückler, Behnke: SPME 2013]
Bin Picking: known objects in a transport box; matching of graphs of 2D and 3D shape primitives; grasp and motion planning (offline and online stages). [Nieuwenhuisen et al.: ICRA 2013]
Articulated Objects: Doors. Door motion is important: detection of changes between maps; instantiation of door models; estimation of the opening angle from laser scans. Localization becomes more reliable and more precise, and navigation planning can use the door opening state. [Nieuwenhuisen, Stückler, Behnke: ICRA 2010]
Adaptive Person Model: geometric primitives connected by joints; registration through articulated ICP; adaptation of primitive parameters to body proportions. [Droeschel, Behnke: ICIRA 2011]
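The rigid core of (articulated) ICP is a closed-form alignment of corresponding points. A 2D sketch with correspondences assumed known for brevity; the articulated variant solves this per body segment under joint constraints, and real implementations work in 3D via SVD:

```python
import math

def best_rigid_2d(src, dst):
    """Closed-form 2D rigid transform (theta, tx, ty) minimizing the
    squared error between paired points: one ICP alignment step."""
    n = len(src)
    # Centroids of both point sets.
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n
    cdy = sum(p[1] for p in dst) / n
    # Optimal rotation from cross- and dot-products of centered pairs.
    s_cross = s_dot = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - csx, sy - csy
        bx, by = dx - cdx, dy - cdy
        s_cross += ax * by - ay * bx
        s_dot += ax * bx + ay * by
    theta = math.atan2(s_cross, s_dot)
    # Translation maps the rotated source centroid onto the target one.
    c, s = math.cos(theta), math.sin(theta)
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, tx, ty
```

Full ICP alternates this step with re-estimating nearest-neighbor correspondences until the error converges.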
Hierarchical Object Discovery through Motion Segmentation: motion is a strong segmentation cue (both camera and object motion); segment-wise registration of a sequence; inference of a segment hierarchy. [Stückler, Behnke: IJCAI 2013]
Autonomous Flight near Obstacles: octocopter with many sensors and a strong onboard computer; multimodal obstacle detection (3D laser scanner, stereo cameras, ultrasound); local obstacle avoidance. [Nieuwenhuisen et al., ECMR 2013]
Exploration in Rough Terrain: wheeled robot with a 4th-generation Intel quad-core Core i7; omnidirectional RGB-D sensor; 3D laser scanner.
3D Mapping and 6D Localization: efficient registration of multiresolution surfel maps; global optimization; 6D localization with a 2D laser scan using a particle filter. [Schadler, Stückler, Behnke: accepted for SSRR 2013]
Conclusion: robot operation in complex environments is challenging; simple skills have been realized, but autonomous control is still limited, and often perception is the problem. 3D sensors are helpful. Need for further research: possibilities with robots, multimodal sensor fusion, active perception, interactive perception.
Thanks for your attention! Questions?