Markerless human motion capture through visual hull and articulated ICP

Lars Mündermann (lmuender@stanford.edu), Stefano Corazza (stefanoc@stanford.edu)
Stanford University, Stanford, CA 94305

Thomas P. Andriacchi (tandriac@stanford.edu)
Bone and Joint Center, Palo Alto VA, Palo Alto, CA 94304; Department of Orthopedic Surgery, Stanford Medical Center

Abstract

A critical next advancement in human motion capture is the development of a non-invasive and marker-free system. In this study, a marker-free approach based on an articulated iterative closest point (ICP) algorithm with soft joint constraints was evaluated using the HumanEva dataset. Our tracked results accurately overlay the visual data. Because the ground-truth marker placement and the methodology for calculating joint centers are not sufficiently documented, an exact quantitative comparison of error is not possible. Nevertheless, the average difference and standard deviation between corresponding joints are reported to give an upper bound on the error of the proposed algorithm. The standard deviation for most joints is approximately 1 to 2 cm.

1 Introduction

Human motion capture is a well-established paradigm for diagnosing the pathomechanics related to musculoskeletal diseases and for developing and evaluating rehabilitative treatments and preventive interventions for these diseases. At present, the most common methods for accurate capture of three-dimensional human movement require a laboratory environment and the attachment of markers, fixtures or sensors to the skin surface of the body segments being analyzed. These laboratory conditions can cause experimental artifacts. Comparisons of bone orientation from true bone-embedded markers versus clusters of three skin-based markers indicate a worst-case root mean square artifact of 7° [1, 2].

A critical next advancement in human motion capture is the development of a non-invasive and marker-free system [3]. A technique for estimating human body kinematics that does not require markers or fixtures placed on the body would greatly expand the applicability of human motion capture. To date, markerless methods are not widely available because the accurate capture of human movement without markers is technically challenging; however, recent developments in computer vision provide the potential for markerless human motion capture for biomechanical and clinical applications [3, 4].

Our current approach employs an articulated iterative closest point (ICP) algorithm with soft joint constraints [5] for tracking human body segments in visual hull sequences (a standard 3D representation of dynamic sequences from multiple images) [3]. The soft joint constraints approach extends previous approaches [6, 7] for tracking articulated models that enforced hard constraints on the joints of the articulated body. Two adjacent body segments each predict the location of the common joint center, and the deviation between the two predictions is penalized in least-squares terms. The purpose of this study was to evaluate the performance of our markerless motion capture approach using the HumanEva dataset [8].
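For concreteness, the structure of this soft-constraint cost can be sketched compactly. The numpy fragment below is a minimal illustration under assumed data structures (per-segment rigid transforms, precomputed closest-point correspondences, and joint centers expressed in each adjacent segment's local frame); the names and the single weight w_joint are hypothetical and do not reproduce the implementation of [5].

```python
import numpy as np

def articulated_icp_cost(transforms, correspondences, joints, w_joint=1.0):
    """Soft-joint-constraint ICP cost: per-segment point alignment plus a
    penalty on the disagreement between the two predictions of each shared
    joint centre made by adjacent segments.

    transforms:      dict segment -> (R, t), a 3x3 rotation and 3-vector translation
    correspondences: dict segment -> (model_pts, data_pts), matched Nx3 arrays of
                     model points and visual-hull points
    joints:          list of (seg_a, seg_b, c_a, c_b), the common joint centre
                     expressed in each segment's local frame
    """
    cost = 0.0
    # Data term: transformed model points should coincide with their matches.
    for seg, (model_pts, data_pts) in correspondences.items():
        R, t = transforms[seg]
        cost += np.sum((model_pts @ R.T + t - data_pts) ** 2)
    # Soft joint term: penalise deviation between the two joint-centre predictions.
    for seg_a, seg_b, c_a, c_b in joints:
        Ra, ta = transforms[seg_a]
        Rb, tb = transforms[seg_b]
        cost += w_joint * np.sum((Ra @ c_a + ta - (Rb @ c_b + tb)) ** 2)
    return cost
```

In an ICP loop this cost would be minimized alternately with re-estimating the closest-point correspondences. A large w_joint approaches a hard joint constraint, while a finite weight tolerates the joint inconsistencies discussed in the Methods.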

2 Methods

The subject was separated from the background in the image sequences of all cameras using an intensity threshold for the grayscale cameras, and an intensity and color threshold for the color cameras [9]. The 3D representation was obtained through visual hull construction from multiple 2D camera views [10-12]. A 3D articulated body was tracked in the 3D representations using an articulated body drawn from a repository of subject-specific articulated bodies [13], selected to match the subject most closely based on a volume and height evaluation (Figure 1). The lack of detailed knowledge of the morphology and kinematic chain of the tracked subjects was compensated for by allowing larger inconsistencies at the joints. The articulated body consisted of 15 body segments (head, trunk, pelvis, and left and right arm, forearm, hand, thigh, shank and foot) and 14 joints connecting these segments.

Figure 1: a) Selected bodies from the repository of subject-specific articulated bodies. b) Articulated body consisting of 15 body segments and 14 joints for subject S1.

The subject's pose was roughly matched to the first valid frame of the motion sequence based on a motion trajectory, and subsequently tracked automatically over the gait cycle. The motion trajectory was calculated as the trajectory of the centers of volume obtained from the 3D representations for each frame throughout the captured motion.
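As an illustration of the silhouette extraction and visual hull construction described above, the following numpy sketch carves a regular voxel grid using calibrated 3x4 projection matrices. The thresholds, grid resolution and function names are assumptions made for illustration, not the settings used with the HumanEva cameras, and the color handling is a simplified stand-in for the intensity and color thresholds of [9].

```python
import numpy as np

def foreground_mask(frame, background, thresh=30.0):
    """Silhouette via background differencing with a fixed threshold
    (illustrative value). Grayscale frames have one channel; color frames
    combine per-channel differences."""
    diff = np.abs(frame.astype(float) - background.astype(float))
    if diff.ndim == 3:
        diff = diff.max(axis=2)
    return diff > thresh

def visual_hull(masks, projections, bounds, resolution=0.02):
    """Voxel carving: keep a voxel only if its centre projects inside the
    silhouette of every camera. 'projections' are calibrated 3x4 matrices,
    'bounds' is ((xmin, xmax), (ymin, ymax), (zmin, zmax)) in metres."""
    xs, ys, zs = (np.arange(lo, hi, resolution) for lo, hi in bounds)
    X, Y, Z = np.meshgrid(xs, ys, zs, indexing="ij")
    pts = np.stack([X, Y, Z, np.ones_like(X)], axis=-1).reshape(-1, 4)
    occupied = np.ones(len(pts), dtype=bool)
    for mask, P in zip(masks, projections):
        uvw = pts @ P.T                                  # project voxel centres
        u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
        v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
        inside = (u >= 0) & (u < mask.shape[1]) & (v >= 0) & (v < mask.shape[0])
        occupied &= inside
        occupied[inside] &= mask[v[inside], u[inside]]
    return pts[occupied, :3]                             # occupied voxel centres
```

The carved voxel centres then serve as the visual hull point cloud to which the articulated body is fitted.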

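The motion trajectory used for the rough initial alignment is simply the per-frame centroid of the reconstruction; a trivial sketch, reusing the hypothetical visual_hull output above:

```python
import numpy as np

def center_of_volume_trajectory(hulls):
    """Per-frame centroid of the occupied voxel centres (list of Nx3 arrays),
    used to roughly position the articulated model in the first valid frame."""
    return np.stack([pts.mean(axis=0) for pts in hulls])
```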
3 Results

The quality of the visual hulls depends on numerous factors, including camera calibration, the number of cameras, the camera configuration, imager resolution, and the accuracy of the foreground/background segmentation in the image sequences [14, 15]. Accurate background segmentation in the grayscale cameras proved challenging (Figure 2). Difficulties in separating foreground from background in the grayscale cameras BW1 and BW4 prevented the use of all cameras. As a result, visual hulls were created using only five cameras (BW2, BW3, C1, C2 and C3), resulting in a rather crude approximation (Figure 2).

Figure 2: a) Selected foreground (red) in grayscale cameras (top row) and color cameras (bottom row). b) Visual hulls for selected frames (54, 104, 154, 204 and 254) of image sequence S1\Walking_1.

The articulated body was tracked successfully through the individual video sequences. Our tracked results (red, Figures 3 and 4) accurately overlay the visual data.

Figure 3: Selected frames (72, 224, 860) of camera BW3 of trial S1\Walking_1 with overlaid marker-based data (blue) and our markerless tracking results (red).

Figure 4: Selected frames (254, 370, 458) of camera BW3 of trial S1\Boxing_1 with overlaid marker-based data (blue) and our markerless tracking results (red).

Comparison between the joint positions provided by the marker-based system and those calculated by our approach yielded a visually consistent overlay, but differences in how joint centers are defined in the selected articulated model and in the marker-based kinematic chain cause an offset between corresponding joints. Because the ground-truth marker placement and the methodology for calculating joint centers are not sufficiently documented, an exact quantitative comparison of error is not possible. Nevertheless, the average difference and standard deviation calculated from the Euclidean distances between corresponding joints are reported to give an upper bound on the error of the proposed algorithm (Figure 5, Table 1). These results therefore do not reflect the full potential of the proposed algorithm, owing to the differences in joint center definitions.
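The reported statistics are the mean and standard deviation of the per-frame Euclidean distances between corresponding joint centers; a short sketch with hypothetical array shapes:

```python
import numpy as np

def joint_error_stats(markerless, marker_based):
    """Per-joint mean and standard deviation of the Euclidean distance between
    corresponding joint centres.

    markerless, marker_based: arrays of shape (n_frames, n_joints, 3) in metres.
    Returns (avg, stdev), each of shape (n_joints,).
    """
    dist = np.linalg.norm(markerless - marker_based, axis=-1)  # (n_frames, n_joints)
    return dist.mean(axis=0), dist.std(axis=0)
```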

Figure 5: Euclidean distance between corresponding joints from the marker-based and markerless information. a) S1\Walking_1. b) S1\Boxing_1.

Table 1: Average difference (avg) and standard deviation (stdev) calculated from the Euclidean distances between corresponding joints of the marker-based and markerless information.

                S1\Walking_1            S1\Boxing_1
          avg [m]   stdev [m]      avg [m]   stdev [m]
RShldr     0.069      0.020         0.051      0.009
RElbow     0.053      0.023         0.062      0.025
RWrist     0.047      0.025         0.054      0.028
RKnee      0.034      0.016         0.027      0.011
RAnkle     0.040      0.017         0.015      0.009
LShldr     0.058      0.014         0.058      0.017
LElbow     0.087      0.020         0.082      0.022
LWrist     0.064      0.037         0.045      0.022
LKnee      0.046      0.017         0.035      0.016
LAnkle     0.033      0.020         0.025      0.009

4 Discussion

The results presented here demonstrate the feasibility of measuring 3D human body kinematics with a markerless motion capture system based on visual hulls. The employed algorithm shows great potential for accurately tracking human body segments. The algorithm does not enforce hard constraints for tracking articulated models: the cost function consists of two terms, which ensure that corresponding points align and that joints are approximately preserved. The emphasis on either term can be chosen globally and/or individually, which yields more anatomically correct models. Moreover, the algorithm can be employed by fitting either the articulated model to the visual hull or the visual hull to the articulated model. In the ideal case both scenarios provide identical results; however, fitting the data to the model is likely to be more robust in an experimental environment where the visual hulls provide only partial information due to calibration and/or segmentation errors.

The presented approach has previously been applied successfully to simple walking and running sequences as well as more complex motion sequences such as a cricket bowl, a handball throw and a cartwheel, with accuracy comparable to that of marker-based systems [3]. Using more cameras to enhance the quality of the visual hulls, together with better knowledge of the morphology and kinematic chain of the tracked subjects (allowing stronger consistency constraints at the joints), would further improve the accuracy.

Acknowledgments

Funding provided by NSF #03225715 and VA #ADR0001129.

References

1. Leardini, A., et al. Human movement analysis using stereophotogrammetry. Part 3: Soft tissue artifact assessment and compensation. Gait and Posture, 2005. 21: p. 221-225.
2. Cappozzo, A., et al. Position and orientation in space of bones during movement: experimental artifacts. Clinical Biomechanics, 1996. 11: p. 90-100.
3. Mündermann, L., S. Corazza, and T.P. Andriacchi. The evolution of methods for the capture of human movement leading to markerless motion capture for biomechanical applications. Journal of NeuroEngineering and Rehabilitation, 2006. 3(6).
4. Moeslund, T.B. and E. Granum. A survey of computer vision-based human motion capture. Computer Vision and Image Understanding, 2001. 81(3): p. 231-268.
5. Anguelov, D., L. Mündermann, and S. Corazza. An iterative closest point algorithm for tracking articulated models in 3D range scans. In Summer Bioengineering Conference, 2005. Vail, CO.
6. Bregler, C. and J. Malik. Tracking people with twists and exponential maps. In IEEE Conference on Computer Vision and Pattern Recognition, 1997.
7. Cheung, G., S. Baker, and T. Kanade. Shape-from-silhouette of articulated objects and its use for human body kinematics estimation and motion capture. In IEEE Conference on Computer Vision and Pattern Recognition, 2003. Madison, WI: IEEE.
8. Sigal, L. and M.J. Black. HumanEva: Synchronized video and motion capture dataset for evaluation of articulated human motion. Technical Report CS-06-08, 2006.
9. Haritaoglu, I. and L. Davis. W4: Real-time surveillance of people and their activities. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000. 22(8): p. 809-830.
10. Martin, W. and J. Aggarwal. Volumetric description of objects from multiple views. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1983. 5(2): p. 150-158.
11. Laurentini, A. The visual hull concept for silhouette-based image understanding. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1994. 16: p. 150-162.
12. Cheung, K., S. Baker, and T. Kanade. Shape-from-silhouette across time. Part I: Theory and algorithms. International Journal of Computer Vision, 2005. 62(3): p. 221-247.
13. Anguelov, D., et al. SCAPE: Shape completion and animation of people. ACM Transactions on Graphics, 2005. 24(3).
14. Mündermann, L., et al. Most favorable camera configuration for a shape-from-silhouette markerless motion capture system for biomechanical analysis. SPIE-IS&T Electronic Imaging, 2005. 5665: p. 278-287.
15. Mündermann, L., et al. Conditions that influence the accuracy of anthropometric parameter estimation for human body segments using shape-from-silhouette. SPIE-IS&T Electronic Imaging, 2005. 5665: p. 268-277.