New approaches to pattern recognition and automated learning
Technology Forum 2015
Johannes Zuegner, STEMMER IMAGING GmbH, Puchheim, Germany
OUTLINE
- Introduction
  - Description of the task
  - What does pose estimation exactly mean?
- Presentation of two approaches
  - Image Features & Bag of Words Classification
  - Presentation of the search classifier
  - Direct comparison of the two approaches
- Summary and outlook
  - Evaluation of the current state
  - Future developments
17 November 2015
DESCRIPTION OF THE TASK
- The task of pattern recognition: searching for and finding pre-learned objects
- Example applications: counting of parts, pick & place, etc.
  [Figure: pattern image and scene image. Source: Blendswap]
- The mission often exceeds the task of merely finding an object
  - 2D: the exact orientation and scale of the objects are of interest
    [Figure: objects at orientation 0°, 45° and 135°; part labels GE 10 SE-128 M12, GE 10 AS-96 M12, GE 10 CX-32 M12]
  - 3D: rigid transform [R,t] with 6 degrees of freedom
    [Figure: coordinate axes X, Y, Z of objects in the 3D scene, related by [R,t] to the 2D image plane of the camera]
Summary
- The pose of an object refers to its concrete position, orientation and scale
- In the following, two approaches to pattern recognition and pose estimation are presented
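To make the 6 degrees of freedom of a rigid transform [R,t] concrete, the following minimal sketch builds R from three angles and applies the transform to a 3D point. The Z-Y-X angle convention and the function names `rotation_zyx` and `apply_rigid` are assumptions chosen for illustration; the slides do not state which convention is used.

```python
import math

def rotation_zyx(yaw, pitch, roll):
    """Return the 3x3 matrix R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians."""
    cz, sz = math.cos(yaw), math.sin(yaw)
    cy, sy = math.cos(pitch), math.sin(pitch)
    cx, sx = math.cos(roll), math.sin(roll)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]

def apply_rigid(R, t, p):
    """Map a 3D point p to R * p + t."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

# A pose with 6 degrees of freedom: three rotation angles and three translations.
R = rotation_zyx(math.radians(90), 0.0, 0.0)   # rotate 90 degrees about Z
t = [10.0, 0.0, 5.0]                           # then shift
q = apply_rigid(R, t, [1.0, 0.0, 0.0])         # approximately [10, 1, 5]
```

In the 2D case the same idea reduces to one rotation angle, a scale factor and a two-component shift.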
IMAGE FEATURES & BAG OF WORDS
- Extraction of image features with SIFT, KAZE and binary feature descriptors
  - Extraction of feature points in corner-like image structures
  - Calculation of descriptors (the "footprint" of a feature point)
- Bag of Visual Words
  - Accumulation of the descriptors, like a dictionary
  - Clustering algorithms, for example FLANN, k-means, etc.
[Figure: descriptors grouped into Cluster 1, Cluster 2 and Cluster 3 form the Bag of Words. Sources: Blendswap/Wikipedia/Wikia]
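The clustering step that turns descriptors into a visual vocabulary can be sketched with plain k-means. This is only an illustration under stated assumptions: the descriptors are toy 2D vectors (real SIFT descriptors have 128 dimensions), the farthest-point initialisation is a choice made here to keep the example deterministic, and the names `kmeans` and `vocabulary` are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centers, d):
    """Index of the cluster center closest to descriptor d."""
    return min(range(len(centers)), key=lambda i: squared_distance(centers[i], d))

def kmeans(descriptors, k, iterations=20):
    # Farthest-point initialisation: start with the first descriptor, then
    # repeatedly add the descriptor that is farthest from all chosen centers.
    centers = [list(descriptors[0])]
    while len(centers) < k:
        far = max(descriptors,
                  key=lambda d: min(squared_distance(c, d) for c in centers))
        centers.append(list(far))
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(centers, d)].append(d)
        for i, group in enumerate(groups):
            if group:  # keep the old center if a cluster runs empty
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers

# Toy descriptors forming three well-separated groups.
descriptors = [
    [0, 0], [1, 0], [0, 1],
    [10, 10], [11, 10], [10, 11],
    [0, 10], [1, 10], [0, 11],
]
vocabulary = kmeans(descriptors, k=3)  # one "visual word" per cluster
```

Each resulting center plays the role of one entry in the dictionary; new descriptors are later assigned to the nearest center.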
IMAGE FEATURES & BAG OF WORDS
Applying the Bag of Words to recognition tasks
- Matching of the features extracted from the scene image
- Evaluation of the histogram
- Use of the point correspondences for the pose
[Figure: features of the scene image are matched to Cluster 1, Cluster 2 and Cluster 3 of the Bag of Words; histogram with axis p over the clusters C1, C2, C3]
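The two recognition steps above can be sketched as follows: a histogram of visual-word assignments for a scene, and a least-squares 2D pose (scale, rotation, shift) from matched point pairs. Both functions are illustrative assumptions, not the method used in the talk: the histogram is taken to hold relative frequencies, the pose model is a 2D similarity transform, and the names `word_histogram` and `similarity_from_matches` are hypothetical.

```python
import cmath
import math
from collections import Counter

def word_histogram(descriptors, vocabulary):
    """Relative frequency of each visual word among the given descriptors."""
    def nearest(d):
        return min(range(len(vocabulary)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(vocabulary[i], d)))
    counts = Counter(nearest(d) for d in descriptors)
    return [counts[i] / len(descriptors) for i in range(len(vocabulary))]

def similarity_from_matches(model_pts, scene_pts):
    """Least-squares fit of scene = a * model + b with points as complex numbers.
    Returns (scale, rotation in degrees, (shift_x, shift_y))."""
    m = [complex(x, y) for x, y in model_pts]
    s = [complex(x, y) for x, y in scene_pts]
    m_mean = sum(m) / len(m)
    s_mean = sum(s) / len(s)
    num = sum((sj - s_mean) * (mj - m_mean).conjugate() for mj, sj in zip(m, s))
    den = sum(abs(mj - m_mean) ** 2 for mj in m)
    a = num / den              # complex factor: scale and rotation
    b = s_mean - a * m_mean    # complex shift
    return abs(a), math.degrees(cmath.phase(a)), (b.real, b.imag)

# Histogram over a toy vocabulary of three visual words.
hist = word_histogram([[1, 0], [0, 1], [9, 10], [1, 11]],
                      [[0, 0], [10, 10], [0, 10]])

# Matched points: the scene shows the model scaled by 2, rotated by 90 degrees
# and shifted by (5, 3).
scale, angle, shift = similarity_from_matches(
    [(0, 0), (1, 0), (0, 1)],
    [(5, 3), (5, 5), (3, 3)],
)
```

With noisy or partly wrong correspondences, a robust estimator such as RANSAC would normally be wrapped around this fit.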
IMAGE FEATURES & BAG OF WORDS
Advantages of this technique
- Robust search results for a variety of poses
- Robustness towards variations in lighting
Disadvantages of this technique
- Needs objects with distinctive image structures [Stemmer/Wiki]
- Processing time is high
  [Figure: processing time in ms for point extraction, descriptors and clustering; values 1435, 890 and 35. Source: Stemmer/Rewe]
- License costs of SIFT
- The alternatives are less precise or only insignificantly faster
Motivation to search for an alternative
PRESENTATION OF THE SEARCH CLASSIFIER
Classic approach in pattern recognition
- Finding a pre-learned pattern using window search
- A metric decides whether the pattern was found or not (e.g. correlation, regression, etc.)
- Tens of thousands of comparisons!
- Known problems: low performance, changes in the scene image (geometry, shadowing, etc.)
[Figure: pattern image and scene image; the pattern is compared against a changed scene image ("?!")]
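A minimal sketch of the classic window search, assuming normalized cross-correlation as the metric and tiny gray-value images stored as nested lists. It also counts the comparisons, which shows why exhaustive search becomes expensive. The function names are hypothetical.

```python
import math

def ncc(window, pattern):
    """Normalized cross-correlation of two equally long gray-value lists."""
    n = len(pattern)
    mw, mp = sum(window) / n, sum(pattern) / n
    num = sum((w - mw) * (p - mp) for w, p in zip(window, pattern))
    den = math.sqrt(sum((w - mw) ** 2 for w in window) *
                    sum((p - mp) ** 2 for p in pattern))
    return num / den if den > 0 else 0.0  # flat windows carry no information

def window_search(scene, pattern):
    """Compare the pattern with every window position in the scene.
    Returns the best position (x, y), its score and the number of comparisons."""
    ph, pw = len(pattern), len(pattern[0])
    flat_pattern = [v for row in pattern for v in row]
    best_score, best_pos, comparisons = -1.0, None, 0
    for y in range(len(scene) - ph + 1):
        for x in range(len(scene[0]) - pw + 1):
            window = [scene[y + dy][x + dx] for dy in range(ph) for dx in range(pw)]
            score = ncc(window, flat_pattern)
            comparisons += 1
            if score > best_score:
                best_score, best_pos = score, (x, y)
    return best_pos, best_score, comparisons

# Toy scene of 6 rows x 8 columns with the 2 x 2 pattern placed at x=3, y=2.
pattern = [[1, 2], [3, 4]]
scene = [[0] * 8 for _ in range(6)]
for dy in range(2):
    for dx in range(2):
        scene[2 + dy][3 + dx] = pattern[dy][dx]

position, score, comparisons = window_search(scene, pattern)
```

The search only covers translation; every additional rotation or scale to be tested multiplies the number of comparisons.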
PRESENTATION OF THE SEARCH CLASSIFIER
A new approach in pattern recognition: learn how to find the object
- Automated learning in CVB Polimago: generation of random learning examples
- For every learning example:
  - Extraction of features using an MRF (Multi Resolution Filter) and regression (Tikhonov)
  - Saving of the underlying transformation
- Thousands of learning examples!
[Figure: learning image and pattern database. Example entries: zero position; rotated by 90°; rotated by -30°; rotated by -30° and shifted by 40 px; rotated by +45° and shifted by 10 px]
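The learning idea can be illustrated with a deliberately simplified 1D sketch: random shifts of a toy pattern are generated, the known shift is stored with each example, and a Tikhonov-regularized (ridge) regression learns to predict the shift from the gray values. This is not the CVB Polimago implementation; the Gaussian toy pattern, the sample positions, the regularization weight `lam` and all function names are assumptions made for illustration, and the Multi Resolution Filter is not modelled.

```python
import math
import random

def features(shift, sample_positions=(-4, -2, 0, 2, 4), sigma=3.0):
    """Gray values of a Gaussian-shaped toy pattern displaced by `shift`,
    read at fixed sample positions, plus a constant bias feature."""
    return [math.exp(-((x - shift) ** 2) / (2 * sigma ** 2))
            for x in sample_positions] + [1.0]

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (M[i][n] - sum(M[i][j] * w[j] for j in range(i + 1, n))) / M[i][i]
    return w

def train(num_examples=500, max_shift=2.0, lam=1e-3, seed=0):
    """Generate random learning examples and fit the regression weights."""
    rng = random.Random(seed)
    shifts = [rng.uniform(-max_shift, max_shift) for _ in range(num_examples)]
    X = [features(s) for s in shifts]  # each example keeps its known shift
    n = len(X[0])
    # Tikhonov-regularized normal equations: (X^T X + lam * I) w = X^T y
    A = [[sum(x[i] * x[j] for x in X) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(x[i] * s for x, s in zip(X, shifts)) for i in range(n)]
    return solve(A, b)

def predict(w, feature_vector):
    """Predicted shift for one feature vector."""
    return sum(wi * fi for wi, fi in zip(w, feature_vector))

weights = train()
estimated_shift = predict(weights, features(1.0))  # should be close to 1.0
```

In the 2D case the regression target would be the full stored transformation (rotation, scale and shift) instead of a single number.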
PRESENTATION OF THE SEARCH CLASSIFIER
A new approach in pattern recognition: learn how to find the object
- Thanks to the automated learning stage of the classifier, a huge number of poses can be learned
- Additional advantage: the processing time can be decreased
[Animation: search in the scene image and pattern recognition. Frames: a series of evaluations labelled "true negative"; then "true positive" with "shift: 10 px to the right, 80 px to the bottom"; finally "true positive, zero position found"]
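The search behaviour shown in the animation can be sketched in 1D: the scene is sampled on a coarse grid, negatives are skipped, and after a positive response the predicted shift is followed until the zero position is reached. The function `toy_classifier` is a hypothetical stand-in for a trained search classifier; its capture radius, its imperfect shift estimate and the grid step are assumptions for illustration only.

```python
def toy_classifier(position, true_position=37, capture_radius=8):
    """Hypothetical stand-in for a trained search classifier.
    Returns None if no object is near `position` (a negative),
    otherwise an imperfect estimate of the remaining shift."""
    offset = true_position - position
    if abs(offset) > capture_radius:
        return None
    return round(0.75 * offset)

def search(scene_length, step=8):
    """Coarse grid search that follows predicted shifts after a positive."""
    evaluations = 0
    for start in range(0, scene_length, step):
        position = start
        while True:
            evaluations += 1
            shift = toy_classifier(position)
            if shift is None:
                break                          # negative: try the next grid position
            if shift == 0:
                return position, evaluations   # zero position found
            position += shift                  # positive: follow the predicted shift
    return None, evaluations

found_at, evaluations = search(100)
```

In this toy setting the object is located after 7 classifier evaluations, whereas testing every one of the 100 positions would need 100 comparisons.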
PRESENTATION OF THE SEARCH CLASSIFIER
A new approach in pattern recognition: learn how to find the object
Example: an image window of 268 x 252 pixels and an object size of 64 x 64 pixels
- Search with correlation
  - without image pyramid: 33768 comparisons
  - two-level image pyramid: 2110 comparisons
  - three-level image pyramid: 527 comparisons
- Search with CVB Polimago
  - only 109 comparisons in total
  - the search result holds the complete object pose
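The reduction factors implied by these counts follow from simple division. The snippet below only restates the numbers from the slide; the dictionary keys are labels chosen here.

```python
# Comparison counts taken from the slide.
comparisons = {
    "correlation, no pyramid": 33768,
    "correlation, two-level pyramid": 2110,
    "correlation, three-level pyramid": 527,
    "CVB Polimago": 109,
}

baseline = comparisons["correlation, no pyramid"]
# Factor by which each variant reduces the work relative to the plain search.
factors = {name: baseline / count for name, count in comparisons.items()}

for name, factor in factors.items():
    print(f"{name}: {factor:.1f}x fewer comparisons than the plain search")
```

Relative to the plain correlation search this is a factor of about 16 for the two-level pyramid, about 64 for the three-level pyramid and about 310 for the search classifier.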
COMPARISON OF THE TWO APPROACHES
Qualitative comparison

Criterion                                     | Bag of Words with SIFT features            | Search classifier
Invariance against geometric transformations | fully affine (ASIFT), perspective (PSIFT)  | fully affine (training), perspective conceivable
Invariance against variations in lighting    | yes (normalization)                        | yes (training)
Extraction of features                        | depends on corner-like structures          | arbitrary structures
Pattern recognition: multiple objects         | no (yes with extensions)                   | yes
Pattern recognition: negative samples         | no                                         | yes
Processing time                               | high                                       | low
COMPARISON OF THE TWO APPROACHES
Quantitative comparison: what is the expected precision?
- Comparison of Polimago with a geometric pattern matcher (CVB ShapeFinder)
  - 1/10 pixel precision in positioning
  - 0.1° precision in orientation
  [Figure: two plots over 100 tests (x-axis: number of tests), each comparing the search classifier with ShapeFinder. Panel titles: "Error: rotation" and "Error: Euclidean distance"]
- Comparison with an established measurement system
  - PTB-certified ground truth available [GOM & Engineeringcapacity]
  - Calculation of the three Euler angles
  - 5° measurement accuracy up to 60° of tilt [Stemmer]
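How three Euler angles can be read off a rotation matrix is sketched below. The Z-Y-X (yaw, pitch, roll) convention is an assumption, since the slides do not name the convention used in the measurement, and the extraction is only unique while the pitch stays below 90° in magnitude. The function names are hypothetical.

```python
import math

def rotation_zyx(yaw, pitch, roll):
    """R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians."""
    cz, sz = math.cos(yaw), math.sin(yaw)
    cy, sy = math.cos(pitch), math.sin(pitch)
    cx, sx = math.cos(roll), math.sin(roll)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]

def euler_zyx(R):
    """Recover (yaw, pitch, roll) in radians from a rotation matrix.
    Valid away from the singularity at pitch = +/- 90 degrees."""
    pitch = -math.asin(max(-1.0, min(1.0, R[2][0])))  # clamp against rounding
    yaw = math.atan2(R[1][0], R[0][0])
    roll = math.atan2(R[2][1], R[2][2])
    return yaw, pitch, roll

# Round trip: build R from known angles and recover them again.
R = rotation_zyx(math.radians(30), math.radians(20), math.radians(-10))
yaw, pitch, roll = (math.degrees(a) for a in euler_zyx(R))
```

Comparing the recovered angles with reference angles from a ground-truth system gives the kind of per-angle error that the measurement above reports.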
SUMMARY
Evaluation of the current state
- Robust recognition of pre-learned objects
- Pose estimation of one or several objects in parallel
- Low processing times, suitable for real-time tracking applications
- Integrated in Common Vision Blox 2016
Future developments
- Speed-up of the classifier's learning stage (GPU, SSE)
- Preparation for the Linux / ARM platforms
[Figure source: Stemmer]
DO YOU HAVE ANY QUESTIONS?
Come and join our LinkedIn group EUROPEAN VISION TECHNOLOGY FORUM and meet with our experts.
Thank you for your attention!
Your contact person: Johannes Zügner
STEMMER IMAGING GmbH
Gutenbergstraße 9-13
82178 Puchheim, Germany
Phone: +49 89 80902-744
Fax: +49 89 80902-116
j.zuegner@stemmer-imaging.de
www.stemmer-imaging.de