ED STIC - PhD Thesis Topic Proposal for the 2014 thesis allocation campaign

Sophi@Stic axis: none
Thesis title: Manifold recognition for large-scale image database
Thesis specialty: ATSI
Thesis supervisor (HDR) registered with ED STIC: Fillatre Lionel
Possible thesis co-supervisor: Last name: First name: Email: Telephone:
Contact email for this topic: lionel.fillatre@i3s.unice.fr
Host laboratory: I3S

Description of the topic:

Background: Big data requires novel technologies to process massive datasets efficiently within tolerable elapsed times. In image and video processing, many international challenges evaluate algorithms for object recognition and image classification at large scale [1]. Pattern recognition consists in determining whether one or more instances of an object are present in an image. Conventional approaches are generally resource-demanding, since it is often necessary to compute features (HOG and SIFT in image processing, for example) and then to train a classifier on these features [2]. Furthermore, there is generally no guarantee on the recognition performance, which is typically measured by the proportion of correct recognitions over the whole image database.
When the data live in a subspace of the observation space (a phenomenon which occurs frequently), they look very flat when viewed from the full space. Principal Component Analysis (PCA) is a conventional method for computing a linear approximation of a flat subspace [3]. For pattern recognition tasks, PCA should be used very carefully because the discriminant dimensions may be discarded. To prevent this, Linear Discriminant Analysis (LDA) looks for the planes that best separate flat subspaces. Many kinds of data (such as images under varying geometric transformations) are thought to constitute highly nonlinear (i.e., non-flat) manifolds in the high-dimensional observation space [4]. Visualization and exploration of high-dimensional vector data are therefore the focus of much current machine learning research. However, most recognition systems using conventional methods (such as PCA and LDA, or their kernel-based extensions, kernel PCA and kernel discriminant analysis [3]) are bound to ignore subtleties of manifolds (such as curvature). This is a bottleneck for achieving highly accurate recognition, and it has to be removed before a high-performance recognition system can be designed. This PhD proposal aims to design a manifold recognition algorithm which classifies manifolds by exploiting their structural properties.

Objectives: Recognizing patterns amounts to classifying manifolds. Indeed, each single object naturally generates a nonlinear manifold under various transformations. In image and video processing, each manifold can be generated by a single geometric object (cat, bike, etc.) which is rotated, scaled or translated [5]. The manifold is not easily modeled, but it can be approximated by using prior samples from a learning database.
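
As a rough illustration of why a transformed object does not fit a flat subspace, the sketch below builds a toy one-parameter manifold and applies PCA to it. Everything in it is an assumption made for illustration and is not taken from the proposal: the "object" is a 1-D Gaussian pulse, the only transformation is a circular translation, the signal has 64 pixels, and the 95% variance threshold is arbitrary.

```python
import numpy as np

def translated_pulses(n_pixels=64, width=2.0):
    """Return one row per circular translation of a Gaussian pulse.

    Each row is a toy 1-D 'image'; the whole set samples a manifold
    whose intrinsic dimension is 1 (the shift parameter).
    """
    grid = np.arange(n_pixels)
    shifts = np.arange(n_pixels)
    # Circular distance between every pixel and every pulse centre.
    d = np.abs(grid[None, :] - shifts[:, None])
    d = np.minimum(d, n_pixels - d)
    return np.exp(-0.5 * (d / width) ** 2)

def pca_explained_variance(samples):
    """Fraction of total variance carried by each principal component."""
    centered = samples - samples.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

samples = translated_pulses()
ratios = pca_explained_variance(samples)
# Smallest number of components whose cumulative variance reaches 95%.
n_components_95 = int(np.searchsorted(np.cumsum(ratios), 0.95) + 1)
print("intrinsic dimension: 1")
print("PCA components needed for 95% of the variance:", n_components_95)
```

On this toy family, PCA needs noticeably more than one component even though a single parameter generates all the samples, which is the kind of curvature effect that linear methods tend to miss.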
This PhD proposal focuses mainly on image and video applications, but the results will concern big data processing in general. The main objective of this proposal is to define an almost optimal classifier that identifies the true manifold (corresponding to the analyzed data) among a finite set of possible manifolds [6]. These manifolds are estimated from a large-scale learning database. The main expected novelty is an algorithm which combines conventional machine learning methods and optimal statistical classifiers. Generally, these two approaches are not used together. Pattern recognition in large-scale databases provides an ideal framework to study the benefits of this combination. The machine learning tools are relevant for deriving the approximate manifold model from the learning database, and the statistical classification framework is necessary to design the almost optimal classifier.

A second objective of this PhD thesis is to identify the characteristics of image manifolds which determine the accuracy of the classifier. These characteristics (curvature, for example) generally depend both on the pattern (typically a geometric object) and on the transformations (typically translations, rotations and scalings) which have generated the manifold. This study would be very useful for forecasting the quality of the learning database and the potential efficiency of pattern recognition based on this database.
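
To make the idea of identifying the true manifold among a finite set more concrete, here is a minimal sketch of a nearest-manifold rule. It is not the almost optimal classifier targeted by the proposal, nor the test of [6]; it is a plain heuristic chosen for illustration. Each manifold is approximated locally by an affine subspace fitted on the nearest learning samples, and the observation is assigned to the manifold with the smallest residual. The toy data (translated Gaussian pulses of two widths), the function names, and the parameters `k`, `dim` and `noise` are all assumptions.

```python
import numpy as np

def pulse_manifold(width, shifts, n_pixels=64):
    """Sample a toy manifold: one Gaussian pulse per circular shift."""
    grid = np.arange(n_pixels)
    centres = np.asarray(shifts, dtype=float)
    d = np.abs(grid[None, :] - centres[:, None])
    d = np.minimum(d, n_pixels - d)
    return np.exp(-0.5 * (d / width) ** 2)

def local_affine_distance(x, samples, k=3, dim=1):
    """Distance from x to a local affine approximation of the manifold.

    The approximation passes through the mean of the k nearest samples
    and spans their `dim` leading principal directions.
    """
    dists = np.linalg.norm(samples - x, axis=1)
    neighbors = samples[np.argsort(dists)[:k]]
    centre = neighbors.mean(axis=0)
    _, _, vt = np.linalg.svd(neighbors - centre, full_matrices=False)
    basis = vt[:dim]                      # orthonormal rows
    offset = x - centre
    residual = offset - basis.T @ (basis @ offset)
    return float(np.linalg.norm(residual))

def classify(x, manifolds, k=3, dim=1):
    """Index of the manifold whose local approximation is closest to x."""
    scores = [local_affine_distance(x, m, k, dim) for m in manifolds]
    return int(np.argmin(scores))

rng = np.random.default_rng(0)
widths = (2.0, 4.0)                       # two hypothetical "objects"
train_shifts = np.arange(0.0, 64.0, 1.0)  # learning database: integer shifts
manifolds = [pulse_manifold(w, train_shifts) for w in widths]

noise = 0.02
test_shifts = rng.uniform(0.0, 64.0, size=50)  # unseen, non-integer shifts
correct = 0
for label, w in enumerate(widths):
    for clean in pulse_manifold(w, test_shifts):
        x = clean + noise * rng.standard_normal(clean.shape)
        correct += classify(x, manifolds) == label
accuracy = correct / (len(widths) * len(test_shifts))
print("accuracy on the toy problem:", accuracy)
```

The machine learning part here is the local manifold estimate; a statistically grounded decision rule with performance guarantees would replace the simple `argmin` over residuals.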
Finally, a major challenge of this research is to propose a simple and very fast pattern recognition algorithm which will be able to process large-scale databases. A typical large-scale image database contains more than one million learning images. As an indication, a processing time of 1 second per image (which is rather optimistic for pattern recognition) requires approximately 12 days to process such a database. Hence it is crucial to design low-complexity algorithms.

The study will be carried out in collaboration with Michel Barlaud (professor emeritus at the I3S laboratory).

Expected skills: Mathematics, probability, image processing and programming (MATLAB or C++).

References:
[1] O. Russakovsky, J. Deng, Z. Huang, A. C. Berg, and L. Fei-Fei, Detecting avocados to zucchinis: what have we done, and where are we going? in International Conference on Computer Vision (ICCV), 2013.
[2] R. Nock, W. Bel Haj Ali, R. D'Ambrosio, F. Nielsen, and M. Barlaud, Gentle nearest neighbors boosting over proper scoring rules, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014.
[3] C. Bishop, Pattern recognition and machine learning. Springer New York, 2006.
[4] A. Brun, Manifolds in image science and visualization, Ph.D. dissertation, Linköping University, Medical Informatics, The Institute of Technology, 2007.
[5] S. T. Roweis and L. K. Saul, Nonlinear dimensionality reduction by locally linear embedding, Science, vol. 290, no. 5500, pp. 2323-2326, 2000.
[6] L. Fillatre, Constrained epsilon-minimax test for simultaneous detection and classification, IEEE Transactions on Information Theory, vol. 57, no. 12, pp. 8055-8071, 2011.

URL: http://www.i3s.unice.fr/~fillatre/
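
The processing-time figure quoted in the objectives (one million images at 1 second per image) can be checked with a short calculation. The function names and the one-day budget in the second call are illustrative assumptions, not values from the proposal.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def processing_days(n_images, seconds_per_image):
    """Sequential processing time, in days, for a whole database."""
    return n_images * seconds_per_image / SECONDS_PER_DAY

def max_seconds_per_image(n_images, budget_days):
    """Largest per-image time that still fits in the given budget."""
    return budget_days * SECONDS_PER_DAY / n_images

days = processing_days(1_000_000, 1.0)
per_image = max_seconds_per_image(1_000_000, 1.0)
print(f"{days:.1f} days")                # 11.6 days, i.e. roughly 12 days
print(f"{per_image:.4f} s per image")    # 0.0864 s per image for a 1-day budget
```

Under these assumptions, bringing the sequential run down to a single day would require less than a tenth of a second per image, which is why low-complexity algorithms matter here.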