Robust Articulated-ICP for Real-Time Hand Tracking Andrea Tagliasacchi* Sofien Bouaziz Matthias Schröder* Mario Botsch Anastasia Tkach Mark Pauly * equal contribution 1/36
Real-Time Tracking Setup Data from (single) RGBD Sensors PrimeSense (Carmine) SoftKinetic Intel RealSense low SNR along in depth (z-axis) completely discards small portions of geometry 2
Motivation? Augmented Reality Intel Perceptual SDK Oculus Research (VR) MagicLeap (AR) Microsoft HoloLens - PR Video (hololens.com) HoloLens (AR) 3
Previous Work Appearance-based (guess solely based on current frame) Model-based (registers model of previous frame) 4
Previous Work Appearance vision Model geometry [Keskin ICCV 12] [Qian CVPR 14] [Oikono. CVPR 14] [Tompson SIG 14] [Melax I3D 13] [Tang CVPR 14] [Sridhar 3DV 14] [Schroder ICRA 14] 5
Contributions combined 2D/3D registration (within ICP) occlusion-aware correspondences (ICP) regularization with statistical pose-space prior extensible and unified real-time solver (>60fps) revamping ICP for articulated tracking 6
ICP: Iterative Closest Point Step 1: optimizing correspondences Step 2: optimizing transformations update correspondences update transformation update correspondences update transformation for more details please refer to Sparse-ICP [Bouaziz, Tagliasacchi, Pauly SGP 13] 7
Robust Articulated ICP [Wei et al. SIGA 12] our tracking process successfully tracks the entire motion sequence while ICP fails to track most of frames. This is because ICP is often sensitive to initial poses and prone to local minimum, particularly involving tracking high-dimensional human body poses from noisy depth data. [Zhang et al. SIGA 14] The accompanying video clearly shows that our tracking process is much more robust than the ICP algorithm [ ] our tracking process successfully tracks the entire motion sequence while ICP fails to track most of frames. [Qian et al. CVPR 14] It uses alternate and gradient based optimization, converges fast, and is suitable for realtime applications. However, it can be easily trapped in poor local optima and cannot handle non-rigid objects well. Yet, it is still insufficient for high-dimensional articulated hands, especially under free viewpoints. 8
System Overview color image silhouette posed model lost tracking? no yes converged? no yes reinitialize 2D correspondences linear 3D correspondences data fitting solve temporal depth image distance trans. point cloud bounds pose space collision min E points + E silh. +E wrist {z } Fitting terms + E pose + E kin. + E temporal {z } Prior terms 9
Preprocessing S s -sensorsilhouette color to identify the Region-of-Interest demo: assumption on picking +y for PCA but all this could be learned!! [Tompson et al. TOG 14] 10
Data Fitting Energies color image silhouette posed model lost tracking? no yes converged? no yes reinitialize 2D correspondences linear 3D correspondences data fitting solve temporal depth image distance trans. point cloud bounds pose space collision min E points + E silh. +E wrist {z } Fitting terms + E pose + E kin. + E temporal {z } Prior terms 11
3D Registration (w/ occlusions) X s -sensorpointcloud E points =! 1 X x2x s kx M (x, )k 1 2 x M ICP (case #1) ICP (case #2) our method (case #2) M -cylinderhandmodel 3D Registration 12
Lesson #1: insufficient to render X s -sensorpointcloud Correspondences of [Wei et al. SIGA 12] (renders the hand model into a point cloud) c 1 c 2 c 1 c 2 Correspondences of [Our Method] (computes correspondences in close form) Ground Truth Motion (finger 2 comes out of occlusion) c 1 c 2 M -cylinderhandmodel 3D Registration 13
2D/3D Registration X s -sensorpointcloud M -cylinderhandmodel 3D Registration 14
2D/3D Registration X s -sensorpointcloud S s -sensorsilhouette M -cylinderhandmodel 3D Registration S r -renderedsilhouette 2D Registration 15
2D Registration S s -sensorsilhouette E silhouette =! 2 X p2s r kp Ss (p, )k 2 2 x Ss S r -renderedsilhouette 2D Registration 16
Model Prior Energies color image silhouette posed model lost tracking? no yes converged? no yes reinitialize 2D correspondences linear 3D correspondences data fitting solve temporal depth image distance trans. point cloud bounds pose space collision min E points + E silh. +E wrist {z } Fitting terms + E pose + E kin. + E temporal {z } Prior terms 17
Joint Bounds Energy 18
Collision Energy 19
Temporal Coherence Energy 20
Statistical Pose Prior encodes correlation across joints 21
Statistical Pose Prior Recorded by VICON tracking system [Schroder ICRA 14] (they are accurate for the person that have been recorded for) 22
Statistical Pose Prior (Soft) E pose =! 4 k (µ + P )k 2 2 +! 5 k k 2 2 we optimize the current pose So that it is similar to a reconstructed pose from the low dimensional subspace but when DOF are unconstrained we would like to restore the neutral (i.e. mean) pose. µ 23
Statistical Pose Prior (Hard) E pose =! 4 k (µ + P )k 2 2 +! 5 k k 2 2! 4 = 1 only optimize in the subspace (i.e. variable replacement) even with a high number of bases the animation remains stiff!!! 24
CPU/GPU Optimization min X i Ē silh. =! 2 X kf i + J i k 2 2 = X i J T i J i 1 X i J T i f i p2s r (n T (J persp (x)j skel (x) +(p Ss (p, ))) 2 S r 20k!!!! J silh 20k 26 J T silhj silh = 26 26 25
Results and Limitations 26
Motion Transfer to Rig 27
Tracking with Fast Motion 28
Tracking of Interacting Hands 29
Limitation: Calibration 30
Limitations: Fist Rotation 31
State of the Art - Evaluations Dexter-1 Dataset (MPI) mm [Shroder ICRA 14] Subspace ICP [Tang et al. 2014] [Sridhar et al. 2013] [Sridhar et al. 2014] [ours] [ours] + re-init. [Sridhar ICCV 13] EM Tracker [Tang CVPR 14] Forest Classifiers 20 10 adbadd flexex1 pinch count tigergrasp wave random [Melax 14] Intel Perceptual SDK (re-initialization helps because Dexter-1 is a low frame-rate dataset only 30Hz) [Tompson TOG 14] ConvNets [Qian CVPR 14] ICP/PSO Hybrid Qualitative Quantitative 32
Qualitative Comparisons Convex Dynamics (Intel SDK) Subspace ICP 33
Conclusions combined 2D/3D registration (within ICP) occlusion-aware correspondences (ICP) regularization with statistical pose-space prior extensible and unified real-time solver (>60fps) fully open source! https://github.com/opengp/htrack 34
Shameless Advertisement Who? Prof. Andrea Tagliasacchi and Brian Wyvill What? MSc (PhD) Where? Who pays? Language? English https://www.csc.uvic.ca/program_information/graduate_studies/msc_program.htm 35
Thank you!!! Live demo at SGP 15!!! Don t be shy!! Sofien Matthias Mark Anastasia https://github.com/opengp/htrack 36
Extra Slides 37
Statistical Prior: Side Effects E pose =! 4 k (µ + P )k 2 2 +! 5 k k 2 2 0 3 6 10 0 1 2 3 as this prior correlates joint angles the convergence speed increases by about 3x 38
Reinitialization from Failure Determine tracking failure [Melax 13] Detection by ~[Qian et al. CVPR 14] 1 = X x2x s kx M (x, )k 2, 2 = S r \ S s S r [ S s xy-fingers Logistic Regressor (optimize alpha s.t.): 2 1+e T [ 1, 2 ]! tracking ok! z-fingers Exploration of corresp. energy 39
Particle Swarm 40