Multibody reconstruction of the dynamic scene surrounding a vehicle using a wide baseline and multifocal stereo system
Laurent Mennillo (1,2), Éric Royer (1), Frédéric Mondot (2), Johann Mousain (2), Michel Dhome (1)
(1) Pascal Institute, Clermont Auvergne University - Aubière, France
(2) Technocentre RENAULT - Guyancourt, France
September 24, 2017
L. Mennillo et al. Multibody SLAM using an heterogeneous stereo system
Context and scientific objectives
Context
  Short-baseline stereo with an identical camera pair is well studied
  This is not the case for wide-baseline, heterogeneous stereo
  Multi-camera system inspired by actual sensor placement on current vehicles (frontal camera and AVM systems)
  Industrial approach with RENAULT
Scientific objectives
  Develop a sparse, purely geometrical solution for multibody reconstruction from heterogeneous stereo system acquisitions in a real environment
Overview
Framework
  1. Offline intrinsic and extrinsic calibration using [1]
  2. Feature extraction and matching
  3.
  4.
  5. Local optimization
Features
Feature sets
  Each frame has a corresponding set of SIFT features $f_{i,t}$
  $i \in \{0, \dots, m\}$ is the camera of observation
  $t \in \{0, \dots, n\}$ is the time of observation
Two feature matching schemes between the sets $f_{i,t}$ and $f_{i',t'}$
  Temporal matching: $i = i'$ and $t \neq t'$
  Stereo matching: $i \neq i'$ and $t = t'$
Matches between a feature $x \in f_{i,t}$ and another feature $x' \in f_{i',t'}$
  Potential feature match $p(x, x')$
  Final feature match $m(x, x')$
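The two matching schemes differ only in which index changes between the two feature sets. A minimal sketch of that distinction follows; the names `FrameId` and `matching_scheme` are hypothetical and introduced purely for illustration.

```python
from typing import NamedTuple, Optional


class FrameId(NamedTuple):
    """Identifies one feature set by camera index i and time index t."""
    camera: int  # index i of the observing camera
    time: int    # index t of the observation time


def matching_scheme(a: FrameId, b: FrameId) -> Optional[str]:
    """Return which matching scheme applies between two feature sets.

    Temporal matching: same camera, different times.
    Stereo matching: different cameras, same time.
    Any other combination is not covered by the two schemes.
    """
    if a.camera == b.camera and a.time != b.time:
        return "temporal"
    if a.camera != b.camera and a.time == b.time:
        return "stereo"
    return None
```

For example, `matching_scheme(FrameId(0, 3), FrameId(1, 3))` falls under stereo matching, while two sets that differ in both camera and time are matched by neither scheme.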
Feature extraction
1. Extracting the set of new features $S_1$
  Frame downsampling to account for the different focal lengths
  Frame division into blocks to ensure a good spatial distribution
  SIFT feature detection and description for each block
2. Extracting the set of tracked features $S_2$
  Temporal tracking of previously triangulated features in $f_{i,t-1}$ using the Lucas-Kanade method [2] to compensate for block division
  SIFT description for each tracked feature
3. Merging the two sets $S_1$ and $S_2$ to obtain $f_{i,t}$
  Elimination of duplicates based on pixelwise Euclidean distance
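The merging step can be sketched as follows. This is only an illustration: the function name `merge_feature_sets`, the `min_dist` value and the choice to give tracked features priority over new ones are assumptions, not details stated on the slide.

```python
import math


def merge_feature_sets(tracked, new, min_dist=2.0):
    """Merge tracked and newly detected features, dropping near-duplicates.

    tracked, new: lists of (x, y) pixel positions.
    A new feature is discarded when it lies closer than min_dist pixels
    (Euclidean distance) to any feature already kept. Because the test is
    made against the growing merged list, near-duplicates inside the new
    set are removed as well.
    Assumption: tracked features are kept in priority.
    """
    merged = list(tracked)
    for p in new:
        if all(math.dist(p, q) >= min_dist for q in merged):
            merged.append(p)
    return merged
```

A quadratic scan is enough for an illustration; a grid or k-d tree lookup would be the natural replacement for large feature sets.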
Feature matching
Locality constraint Lc for temporal matching between $f_{i,t}$ and $f_{i,t+1}$
  Potential matches: features at near distance (search window)
Epipolar constraint Ec for stereo matching between $f_{i,t}$ and $f_{i',t}$
  Potential matches: features near epipolar lines
If more than one potential match exists for a feature
  Retain the match with the minimal $L_2$ distance between descriptors
  Potential matches $p(x, x')$ ⇒ final match $m(x, x')$
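The two constraints and the descriptor-based selection can be sketched as below. Several simplifying assumptions are made here: the epipolar test uses a pinhole fundamental matrix `F` (wide-angle cameras would need their own projection model), the search window is square, and all function names and thresholds are hypothetical.

```python
import math


def epipolar_distance(F, x, x_prime):
    """Pixel distance from x_prime to the epipolar line F @ [x, y, 1].

    F is a 3x3 fundamental matrix given as nested lists (pinhole assumption).
    The line must not be degenerate (a and b both zero).
    """
    a, b, c = (row[0] * x[0] + row[1] * x[1] + row[2] for row in F)
    return abs(a * x_prime[0] + b * x_prime[1] + c) / math.hypot(a, b)


def best_match(query_desc, candidates):
    """Among potential matches, keep the minimal L2 descriptor distance.

    candidates: list of (index, descriptor). Returns an index or None.
    """
    if not candidates:
        return None
    return min(candidates, key=lambda c: math.dist(query_desc, c[1]))[0]


def temporal_match(feat, others, window=20.0):
    """Locality constraint: candidates lie inside a square search window."""
    pos, desc = feat
    cands = [
        (j, d) for j, (p, d) in enumerate(others)
        if max(abs(p[0] - pos[0]), abs(p[1] - pos[1])) <= window
    ]
    return best_match(desc, cands)


def stereo_match(F, feat, others, max_epi_dist=2.0):
    """Epipolar constraint: candidates lie near the epipolar line."""
    pos, desc = feat
    cands = [
        (j, d) for j, (p, d) in enumerate(others)
        if epipolar_distance(F, pos, p) <= max_epi_dist
    ]
    return best_match(desc, cands)
```

Each feature is a `(position, descriptor)` pair; both matchers return the index of the retained feature in `others`, or `None` when no potential match passes the constraint.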
Estimate the ego-motion parameters of the multi-camera system
  Bundle adjustment approach as in [3]
  Local optimization of selected keyframes and associated 3D points
Set of observations $o_X$ associated with the 3D point $X$
  At least a pair of associated observations $(o^X_{i,t}, o^X_{i',t'})$
  Corresponding to either a temporal or stereo match $m(x, x')$
  Several possible observations, in multiple frames at multiple times
Determine the class $C_X$ of the 3D point $X$ from $o_X$
  Static ⇒ $C_X = S$
  Mobile ⇒ $C_X = M$
  Outlier ⇒ $C_X = O$
3D point consistency constraint Cc
  The reprojection error of every observation $o^X_{i,t} \in o_X$ is below a threshold $t_{Cc}$
  Static 3D points are consistent for all their observations
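A minimal sketch of the consistency test follows, assuming an ideal pinhole projection (the actual cameras would require their own calibrated model). The camera dictionary layout, the function names and the default threshold are hypothetical.

```python
import math


def project(cam, X):
    """Project a 3D point X with a pinhole camera (assumed model).

    cam holds a rotation "R" (3x3 nested lists), a translation "t",
    a focal length "f" and a principal point ("cx", "cy").
    Returns None when the point is not in front of the camera.
    """
    xc, yc, zc = (
        sum(cam["R"][r][k] * X[k] for k in range(3)) + cam["t"][r]
        for r in range(3)
    )
    if zc <= 0:
        return None
    return (cam["f"] * xc / zc + cam["cx"], cam["f"] * yc / zc + cam["cy"])


def reprojection_error(cam, X, uv):
    """Pixel distance between the observation uv and the projection of X."""
    p = project(cam, X)
    return math.inf if p is None else math.dist(p, uv)


def satisfies_cc(X, observations, cameras, threshold=2.0):
    """Consistency constraint: every observation reprojects below threshold.

    observations: dict mapping (camera i, time t) to an observed pixel.
    cameras: dict mapping (camera i, time t) to the camera at that time.
    """
    return all(
        reprojection_error(cameras[key], X, uv) < threshold
        for key, uv in observations.items()
    )
```

A point for which `satisfies_cc` holds over all of its observations is a candidate static point; a single observation above the threshold makes the test fail.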
Mobile 3D point detection
  Step 1 - Stereo match and reconstruction at time $t_1$
  Step 2 - Temporal matches from $t_1$ to $t_2$ ⇒ tracking
  Step 3 - Stereo match at time $t_2$
  The consistency constraint is not satisfied for all observations of $X_2$
3D point mobility constraints Mc1, Mc2 and Mc3
  Mc1: consistency for each individual temporality $t$
  Mc2: at least one stereo match per temporality
  Mc3: at least two temporalities per 3D point
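One way to read the three mobility constraints together with the consistency constraint is the decision rule sketched below. The ordering of the tests, the `Temporality` record and the function name are assumptions made for illustration; the per-temporality consistency flags are taken as already computed.

```python
from typing import Dict, FrozenSet, NamedTuple


class Temporality(NamedTuple):
    """Observations of one 3D point at a single time t (hypothetical record)."""
    cameras: FrozenSet[int]  # cameras that observe the point at time t
    consistent: bool         # reprojection check passed at this time alone


def classify_point(all_consistent: bool, temporalities: Dict[int, Temporality]) -> str:
    """Return "S" (static), "M" (mobile) or "O" (outlier).

    all_consistent: consistency constraint over all observations at once.
    Mc1: each temporality is consistent on its own.
    Mc2: each temporality has a stereo match, i.e. at least two cameras.
    Mc3: the point has at least two temporalities.
    """
    if all_consistent:
        return "S"
    mc1 = all(g.consistent for g in temporalities.values())
    mc2 = all(len(g.cameras) >= 2 for g in temporalities.values())
    mc3 = len(temporalities) >= 2
    return "M" if (mc1 and mc2 and mc3) else "O"
```

Under this reading, a point that fails the global check but is reconstructed consistently by a stereo pair at two or more times is labeled mobile, and everything else is rejected as an outlier.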
Trajectory consistency
  Filters erratic movements generated by false matches
  Applies to mobile points that have been tracked at least 3 times
  Checks the distance and elevation between each pair of consecutive points
  Checks the angle formed by each triplet of consecutive points
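The trajectory filter could be sketched as follows. All thresholds, the choice of the vertical axis and the use of the turning angle (deviation from a straight line) as the triplet angle are assumptions; the slide does not give these details.

```python
import math


def turning_angle_deg(p0, p1, p2):
    """Angle in degrees between displacements p0->p1 and p1->p2.

    0 means the three points are aligned and moving forward;
    180 means a complete reversal. Zero-length steps return 0.
    """
    u = [b - a for a, b in zip(p0, p1)]
    v = [b - a for a, b in zip(p1, p2)]
    nu, nv = math.hypot(*u), math.hypot(*v)
    if nu == 0 or nv == 0:
        return 0.0
    cos = sum(a * b for a, b in zip(u, v)) / (nu * nv)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def is_trajectory_consistent(points, max_step=2.0, max_elev=0.5,
                             max_turn_deg=45.0, up_axis=2):
    """Check the successive 3D positions of one mobile point.

    points: list of (x, y, z) positions ordered by time.
    Trajectories with fewer than 3 positions are not filtered.
    """
    if len(points) < 3:
        return True
    for p, q in zip(points, points[1:]):
        if math.dist(p, q) > max_step:
            return False  # implausible jump between consecutive positions
        if abs(q[up_axis] - p[up_axis]) > max_elev:
            return False  # implausible change of elevation
    for p0, p1, p2 in zip(points, points[1:], points[2:]):
        if turning_angle_deg(p0, p1, p2) > max_turn_deg:
            return False  # erratic change of direction
    return True
```

In practice the thresholds would depend on the frame rate and on the expected speed of the observed objects.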
Local optimization of camera poses and 3D points
  Unified optimization of all 3D points
  Static points and mobile points per temporality
  Minimization of the reprojection error with bundle adjustment
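One plausible way to write the unified cost is given below. The notation is an assumption introduced for illustration: $P_{i,t}$ denotes the pose of camera $i$ at time $t$, $\pi$ the projection function of the calibrated camera, and $X_t$ the position of a mobile point at temporality $t$. Static points contribute one position for all their observations, while mobile points contribute one position per temporality.

```latex
\min_{\{P_{i,t}\},\,\{X\},\,\{X_t\}}
\;\sum_{X \,:\, C_X = S} \;\sum_{o^X_{i,t} \in o_X}
  \left\| o^X_{i,t} - \pi\!\left(P_{i,t}, X\right) \right\|^2
\;+\;
\sum_{X \,:\, C_X = M} \;\sum_{t} \;\sum_{o^X_{i,t} \in o_X}
  \left\| o^X_{i,t} - \pi\!\left(P_{i,t}, X_t\right) \right\|^2
```

Both sums share the same camera poses, so under this formulation static and mobile points constrain the ego-motion estimate and the structure together.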
Experimental vehicle and sequences
Motivations
  Specific camera configuration needed to reflect industrial trends
  No multifocal and wide-baseline stereo datasets publicly available
Experimental vehicle
  Multifocal and wide-baseline multi-camera system (3 × 185°, 1 × 80°)
  Hardware synchronization of all cameras
Environment and sequences
  Realistic but controlled environment
  8 sequences ⇒ different road traffic scenarios at low speed
Qualitative evaluation
  Green - Static points
  Red - Mobile points
  Several mobile points tracked and reconstructed
Limitations
  False positives can occur due to false matches
  Static points on a moving object ⇒ not tracked for 3 consecutive frames
  One inconsistent observation ⇒ dismisses the point entirely
Conclusion and future works
  The method works as intended on our dataset
Future works
  Denser matching near reconstructed mobile points
  Scoring method to prevent outliers arising from a single false match
  Working on more mobile points could help their reconstruction in the non-overlapping fields of view of the multi-camera system
Bibliography
[1] P. Lébraly, E. Royer, O. Ait-Aider, C. Deymier, and M. Dhome. Fast calibration of embedded non-overlapping cameras. In International Conference on Robotics and Automation, pages 221-227. IEEE, 2011.
[2] B. D. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proceedings of the 7th International Joint Conference on Artificial Intelligence - Volume 2, IJCAI'81, pages 674-679, San Francisco, CA, USA, 1981. Morgan Kaufmann Publishers Inc.
[3] E. Mouragnon, M. Lhuillier, M. Dhome, F. Dekeyser, and P. Sayd. Real time localization and 3D reconstruction. In Computer Vision and Pattern Recognition, volume 1, pages 363-370. IEEE, 2006.
Questions?