Scene Text Recognition using Co-occurrence of Histogram of Oriented Gradients


2013 12th International Conference on Document Analysis and Recognition

Scene Text Recognition using Co-occurrence of Histogram of Oriented Gradients

Shangxuan Tian, Shijian Lu, Bolan Su and Chew Lim Tan
Department of Computer Science, School of Computing, National University of Singapore, Computing 1, 13 Computing Drive, Singapore
{tians, subolan, tancl}@comp.nus.edu.sg
Visual Computing Department, Institute for Infocomm Research, 1 Fusionopolis Way, #21-01 Connexis, Singapore
slu@i2r.a-star.edu.sg

Abstract

Scene text recognition is a fundamental step in end-to-end applications where traditional optical character recognition (OCR) systems often fail to produce satisfactory results. This paper proposes a technique that uses the co-occurrence histogram of oriented gradients (Co-HOG) to recognize text in scenes. Compared with the histogram of oriented gradients (HOG), Co-HOG is a more powerful tool that captures the spatial distribution of neighboring orientation pairs instead of just a single gradient orientation. At the same time, it is more efficient than HOG and therefore more suitable for real-time applications. The proposed scene text recognition technique is evaluated on the ICDAR 2003 character dataset and the Street View Text (SVT) dataset. Experiments show that the Co-HOG based technique clearly outperforms state-of-the-art techniques that use HOG, the Scale Invariant Feature Transform (SIFT), and Maximally Stable Extremal Regions (MSER).

I. INTRODUCTION

Recognition of text in natural scenes has attracted increasing research attention in recent years due to its crucial importance in scene understanding. It has become a very promising tool in different applications such as unmanned vehicle/robot navigation, living aids for visually impaired persons, content based image retrieval, etc. Though optical character recognition (OCR) of scanned document images has achieved great success, recognition of scene text with existing OCR systems still leaves a large space for improvement due to a number of factors. First, unlike scanned document texts that usually lie over a blank document background with similar color, texture, and controlled lighting, scene texts often have a much more varied background with arbitrary color, texture, and lighting conditions, as illustrated in Fig. 1. Second, unlike scanned document texts that are usually printed in widely used fonts and sizes, scene text can be captured at arbitrary size and printed in fancy but infrequently used fonts, as illustrated in Fig. 1. Even worse, the font of scene text may change within a single word for the purpose of special visual effects or attracting human attention. Third, unlike scanned document texts that usually have a fronto-parallel view, scene texts captured from arbitrary viewpoints often suffer from perspective distortion, as illustrated in Fig. 1. All these variations make OCR of scene texts a very challenging task, and a robust OCR technique that is tolerant to the variations of scene texts as well as their background is urgently needed.

Fig. 1: Example characters taken from the ICDAR 2003 (first and second rows) and SVT (third and fourth rows) datasets. First row: E, S, S, N, N, G, G, R, A. Second row: A, A, f, M, H, R, T, T. Third row: E, S, L, M, b, J, o, R, R. Fourth row: P, M, K, E, h, M, T, A, n.

A number of scene text recognition techniques have been reported, which can be classified into two categories.
The traditional approach first performs certain preprocessing such as binarization, slant correction, and perspective rectification before passing scene texts to existing OCR engines. Chen et al. [1] performed a variant of Niblack's adaptive binarization algorithm [2] on the detected text region before feeding it to OCR for recognition. An iterative binarization method for single characters is proposed in [3], where k-means clustering produces a set of potential binarized characters, Support Vector Machines (SVM) measure the degree of character-likeness, and the candidate with the maximum character-likeness is selected as the optimal result. Recently, the Markov Random Field (MRF) model was adopted for binarization in [4], where an auto-seeding technique first determines certain foreground and background pixel seeds and MRF is then used to segment text and non-text regions. Another way is to extract feature descriptors and then use classifiers for scene text recognition.

The recent approach first extracts certain features from gray/color images and then trains classifiers for scene text recognition. This approach is studied extensively in [5], where the scene text recognition performance is evaluated using different feature descriptors, including Shape Contexts, the Scale Invariant Feature Transform (SIFT), Geometric Blur, Maximum Response of filters, patch descriptors, etc., in combination with a bag-of-words model. However, the results are not satisfactory enough to serve as the basis for word recognition. In [6], the authors address the problem by employing Gabor filters and then building a similarity model to measure the distance between characters in their text recognition framework. Maximally Stable Extremal Regions (MSER) are used in [7] to obtain an MSER mask and extract orientation features along the MSER boundary. In addition, an unsupervised feature learning system is proposed in [8] that uses a variant of k-means clustering to first build a dictionary and then map all character images to a new representation using the dictionary.

Recently, the classical HOG feature [9] has also been widely used for scene text recognition. As studied in [10], [11], [12], HOG outperforms almost all other features due to its robustness to illumination variation and its invariance to local geometric and photometric transformations. However, HOG is just a statistic of gradient orientations in each block, which does not sufficiently capture the spatial relationship of neighboring pixels. For example, two image patches having similar HOG features may look very different when their pixel locations are rearranged. Therefore, we propose to recognize scene text by using an extension of HOG, namely co-occurrence HOG (Co-HOG) [13], that captures the gradient orientations of neighboring pixel pairs instead of single image pixels. Co-HOG divides the image into blocks with no overlap, which is more efficient than HOG with overlapping blocks [13]; this is essential in a real-time text recognition system. More importantly, the relative location and orientation are considered for each neighboring pixel, which describes the character shape more precisely. In addition, Co-HOG keeps the advantages of HOG, i.e., the robustness to varying illumination and local geometric transformations. Extensive tests show that Co-HOG outperforms other feature descriptors significantly for scene text recognition.

II. CO-OCCURRENCE OF HISTOGRAM OF ORIENTED GRADIENTS

Co-HOG is an extension of the Histogram of Oriented Gradients; it reduces to HOG when the offset is (0, 0), as illustrated later in this section. In this section, we first explain the general idea of HOG and then show how to extend it to Co-HOG for the scene text recognition task.

A. Histogram of Oriented Gradients

The HOG feature [9] was first proposed for the human detection task and later became a very popular feature in the object detection area. When extracting HOG features, the orientations of gradients are usually quantized into histogram bins and each bin covers an orientation range. The image is divided into overlapping blocks and, in each block, a histogram of oriented gradients falling into each bin is computed and then normalized to overcome illumination variation. The features from all blocks are then concatenated together to form a feature descriptor of the whole image. Fig. 2 illustrates the extraction process of the classical HOG feature.

Fig. 2: Illustration of HOG feature extraction: (a) a sample character image divided into 4 blocks (the blocks overlap with neighboring blocks in the implementation); (b) the corresponding gradient orientation of each block; (c) the histograms of gradient orientation, which are concatenated one after another to form the HOG feature vector.
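
The following is a minimal Python sketch of the block-wise HOG computation described above, assuming a grayscale character image stored as a 2-D numpy array. It is illustrative only: it uses a non-overlapping 4 x 4 block grid for brevity (the classical HOG uses overlapping blocks), and the bin count and the simple per-block L2 normalization are stand-ins rather than the exact settings of the paper.

```python
import numpy as np

def hog_descriptor(img, n_bins=9, grid=(4, 4)):
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0          # unsigned orientation in [0, 180)
    bins = np.minimum((ang / (180.0 / n_bins)).astype(int), n_bins - 1)

    h, w = img.shape
    bh, bw = h // grid[0], w // grid[1]
    feats = []
    for by in range(grid[0]):
        for bx in range(grid[1]):
            sl = np.s_[by * bh:(by + 1) * bh, bx * bw:(bx + 1) * bw]
            hist = np.bincount(bins[sl].ravel(), weights=mag[sl].ravel(), minlength=n_bins)
            feats.append(hist / (np.linalg.norm(hist) + 1e-6))   # per-block L2 normalization
    return np.concatenate(feats)
```
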
Due to its robustness to illumination variation and invariance to local geometric and photometric transformations, many scene text recognition works employ HOG for the recognition of texts in scenes. On the other hand, HOG captures the orientation of only isolated pixels, whereas the spatial information of neighboring pixels is ignored. Co-HOG instead captures more spatial information and is more powerful for scene text recognition, as discussed in the ensuing subsection.

B. Co-occurrence of Histogram of Oriented Gradients

Co-HOG captures spatial information by counting the frequency of co-occurrence of oriented gradients between pixel pairs, so relative locations are preserved. The relative locations are reflected by the offset between two pixels as shown in Fig. 3(a). The yellow pixel in the center is the pixel under study and the neighboring blue ones are pixels at different offsets. Each neighboring pixel in blue forms an orientation pair with the center yellow pixel and accordingly votes into the co-occurrence matrix as illustrated in Fig. 3(b). Therefore, HOG is just a special case of Co-HOG where the offset is set to (0, 0), i.e., only the pixel under study is counted. The frequency of co-occurrence of oriented gradients is captured at each offset via a co-occurrence matrix as shown in Fig. 3(b).

Fig. 3: Illustration of Co-HOG feature extraction: (a) illustrates the offsets used in Co-HOG; (b) shows the co-occurrence matrix of one block in Fig. 2(a); (c) shows the vectorization of the co-occurrence matrices, which are concatenated one after another to form the Co-HOG feature vector.

Fig. 4: Bi-linear interpolation of the weighted magnitude.

The co-occurrence matrix at a specific offset (x, y) is given by:

H_{x,y}(i,j) = \sum_{(p,q) \in B} \begin{cases} 1 & \text{if } O(p,q) = i \text{ and } O(p+x, q+y) = j \\ 0 & \text{otherwise} \end{cases}    (1)

where H_{x,y} is the co-occurrence matrix at offset (x, y), which is a square matrix whose dimension is decided by the number of orientation bins. Therefore, we will have 24 co-occurrence matrices with offsets as illustrated in Fig. 3(a). O is the gradient orientation of the input image I and B is a block in the image. Equation 1 thus computes the co-occurrence matrix of one block, and Fig. 3(b) shows an example. The Co-HOG feature descriptor of an image can then be constructed by vectorizing and concatenating the co-occurrence matrices of all blocks of the image under study. The Co-HOG feature extraction process can be summarized in the following three steps.

1) Gradient Magnitude and Orientation Computation: The gradient magnitude is computed as the L2 norm of the horizontal and vertical gradients computed by the Sobel filter. For color images, the gradient is computed separately for each color channel and the one with the maximum magnitude is used. The gradient orientation ranges between 0° and 180° (unsigned gradient) and is quantized into 9 orientation bins.

2) Weighted Voting: The original Co-HOG is computed without weighting, as specified in Equation 1 [13], which by itself cannot reflect the difference between strong-gradient and weak-gradient pixels. We propose to add a weighting mechanism based on the gradient magnitude, where bi-linear interpolation is employed to vote between the two neighboring orientation bins. Equation 2 shows how the weighting of gradient magnitude and orientation bins is combined, and Fig. 4 gives a simple illustration.

H(\theta_1,\theta_3) \leftarrow H(\theta_1,\theta_3) + M_1\left(1 - \frac{\alpha-\theta_1}{\theta_2-\theta_1}\right) + M_2\left(1 - \frac{\beta-\theta_3}{\theta_4-\theta_3}\right)
H(\theta_1,\theta_4) \leftarrow H(\theta_1,\theta_4) + M_1\left(1 - \frac{\alpha-\theta_1}{\theta_2-\theta_1}\right) + M_2\,\frac{\beta-\theta_3}{\theta_4-\theta_3}
H(\theta_2,\theta_3) \leftarrow H(\theta_2,\theta_3) + M_1\,\frac{\alpha-\theta_1}{\theta_2-\theta_1} + M_2\left(1 - \frac{\beta-\theta_3}{\theta_4-\theta_3}\right)
H(\theta_2,\theta_4) \leftarrow H(\theta_2,\theta_4) + M_1\,\frac{\alpha-\theta_1}{\theta_2-\theta_1} + M_2\,\frac{\beta-\theta_3}{\theta_4-\theta_3}    (2)

where H is the co-occurrence matrix at a specific offset as defined in Equation 1, M_1 is the gradient magnitude at location (p, q) with corresponding gradient orientation α, and M_2 is the gradient magnitude at location (p + x, q + y) with corresponding gradient orientation β. θ_1 and θ_2 denote the neighboring orientation bin centers of α, and similarly θ_3 and θ_4 for β. In this weighting scheme, a pixel with a very small gradient value could still receive a fairly large weight if its paired pixel has a large gradient value. To avoid such situations, we do not count pixel pairs when at least one of the two pixels has a very small gradient value.

3) Feature Vector Construction: The obtained block features are first normalized with the L2 normalization method. The Co-HOG feature descriptor of the whole image under study can then be constructed by concatenating all the normalized block features.
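
The following Python sketch puts the three steps together for one grayscale character image. It is a minimal illustration under stated assumptions rather than the authors' implementation: numpy's finite-difference gradient stands in for the Sobel filter, color images are not handled, the offset set is chosen only so that it contains 24 offsets as stated above (the exact offset pattern is not recoverable from the text), and the small-magnitude threshold is an arbitrary placeholder.

```python
import numpy as np

N_BINS = 9
BIN_W = 180.0 / N_BINS
# 24 neighbour offsets (dy, dx) in a half-plane, excluding (0, 0); assumed layout
OFFSETS = [(dy, dx) for dy in range(4) for dx in range(-3, 4) if (dy, dx) > (0, 0)]

def split_bins(angle):
    """Return the two neighbouring bin indices of `angle` and the weight of the upper bin."""
    pos = angle / BIN_W - 0.5
    lo = int(np.floor(pos)) % N_BINS
    return lo, (lo + 1) % N_BINS, pos - np.floor(pos)

def cohog_descriptor(img, grid=(4, 4), min_mag=1e-3):
    gy, gx = np.gradient(img.astype(float))           # stand-in for Sobel gradients
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0      # unsigned gradient orientation
    h, w = img.shape
    bh, bw = h // grid[0], w // grid[1]
    feats = []
    for by in range(grid[0]):
        for bx in range(grid[1]):
            block_vec = []
            for dy, dx in OFFSETS:
                H = np.zeros((N_BINS, N_BINS))        # co-occurrence matrix at this offset
                for q in range(by * bh, (by + 1) * bh):
                    for p in range(bx * bw, (bx + 1) * bw):
                        q2, p2 = q + dy, p + dx
                        if not (0 <= q2 < h and 0 <= p2 < w):
                            continue
                        m1, m2 = mag[q, p], mag[q2, p2]
                        if m1 < min_mag or m2 < min_mag:
                            continue                  # skip pairs with a very weak gradient
                        i_lo, i_hi, wi = split_bins(ang[q, p])
                        j_lo, j_hi, wj = split_bins(ang[q2, p2])
                        # weighted bi-linear voting, as in Equation 2
                        H[i_lo, j_lo] += m1 * (1 - wi) + m2 * (1 - wj)
                        H[i_lo, j_hi] += m1 * (1 - wi) + m2 * wj
                        H[i_hi, j_lo] += m1 * wi + m2 * (1 - wj)
                        H[i_hi, j_hi] += m1 * wi + m2 * wj
                block_vec.append(H.ravel())
            v = np.concatenate(block_vec)
            feats.append(v / (np.linalg.norm(v) + 1e-6))   # per-block L2 normalization
    return np.concatenate(feats)
```

With these illustrative parameters (16 blocks, 24 offsets, 9 x 9 entries per co-occurrence matrix), one character image yields a 31,104-dimensional vector, which is large but still manageable for a linear classifier.
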
III. SCENE TEXT RECOGNITION

Characters in scenes can thus be recognized by training a classifier on the Co-HOG descriptors described in the last Section. In our implemented system, a linear SVM classifier is trained using LIBLINEAR [14], which is much faster but has similar performance compared with LIBSVM [15] and SVMLight [16]. We train the SVM classifier using up to 18,500 character images, as discussed in the next Section.

IV. EXPERIMENTAL RESULTS

A. Datasets

We evaluate our method on the ICDAR 2003 [17] and SVT [11] datasets. The ICDAR 2003 character dataset has about 6,100 characters for training and 5,400 characters for testing. The characters are collected from a wide variety of scenes, such as book covers, road signs, brand logos and other texts randomly selected from various objects. The text font, size, illumination, color and texture accordingly vary greatly; for example, the character width ranges from 1 to 589 pixels and the character height ranges from 10 to 898 pixels. For the SVT dataset, only the testing part is annotated in [12] for character recognition, which contains about 3,796 samples. Compared with ICDAR 2003, this dataset is more challenging: most of the characters are cropped from business boards and brand names taken from Google Street View, and they usually have fancy fonts, low resolution, and often suffer from bad illumination, as illustrated in Fig. 1. In addition, we add the Chars74K dataset [5] for training. Thus the training dataset consists of the ICDAR 2003 training dataset and the Chars74K dataset, which together have roughly 18,500 characters. In the experiment, we resize each character image to a fixed size and then divide it into 4 x 4 blocks before feature extraction. After we get the Co-HOG features, a linear SVM classifier is trained with LIBLINEAR [14] and evaluated on the testing datasets.
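
As a rough end-to-end sketch of this setup, the snippet below resizes character crops, extracts a per-image descriptor over a 4 x 4 block grid, and trains and evaluates a linear SVM. scikit-learn's LinearSVC, which is backed by LIBLINEAR, stands in for the paper's LIBLINEAR setup; the 64 x 64 input size, the nearest-neighbour resizing, and C = 1.0 are illustrative assumptions, and the feature function is meant to be something like the cohog_descriptor sketch shown after Section II.

```python
import numpy as np
from sklearn.svm import LinearSVC

def resize_nearest(img, size=64):
    # crude nearest-neighbour resize to a square size x size patch
    ys = np.linspace(0, img.shape[0] - 1, size).astype(int)
    xs = np.linspace(0, img.shape[1] - 1, size).astype(int)
    return img[np.ix_(ys, xs)]

def extract_features(images, feature_fn):
    # feature_fn maps one grayscale patch to a 1-D descriptor, e.g. cohog_descriptor
    return np.stack([feature_fn(resize_nearest(im)) for im in images])

def train_and_evaluate(train_imgs, train_labels, test_imgs, test_labels, feature_fn):
    clf = LinearSVC(C=1.0)                             # LIBLINEAR-backed linear SVM
    clf.fit(extract_features(train_imgs, feature_fn), train_labels)
    pred = clf.predict(extract_features(test_imgs, feature_fn))
    return float(np.mean(pred == np.asarray(test_labels)))
```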

TABLE I: Character recognition accuracy on the ICDAR and SVT datasets

    Method                                ICDAR    SVT
    ABBYY FineReader 10 [18]              26.6%    15.4%
    GB+NN [5]                             41.0%    -
    HOG+NN [10]                           51.5%    -
    NATIVE+FERNS [11]                     64.0%    -
    MSER [7]                              67.0%    -
    HOG+SVM [12]                          -        61.9%
    Proposed Co-HOG                       79.4%    75.4%
    Proposed Co-HOG (Case Insensitive)    83.6%    80.6%

B. Scene Text Recognition Accuracy

The ICDAR 2003 test dataset altogether has 5,430 characters. We exclude those that do not belong to the 62 classes (52 upper and lower case English letters plus 10 digits) and thus have 5,379 characters left. The trained linear SVM is first tested on this dataset. Experimental results in Table I show that our proposed method outperforms all previous feature descriptors with an accuracy of 79.4%, while FineReader 10 [18] gets the worst accuracy at 26.6%, largely due to the fact that FineReader was designed for document text recognition. The result of GB+NN (41.0%), trained on the Chars74K dataset, is the one that performs the best as reported in [5]. The character recognition accuracy on the ICDAR dataset is not given for the HOG+SVM method reported in [12]. The method in [8] reports an accuracy of 81.7%. On the other hand, that method uses a huge amount of training data that is not available to the public. More importantly, the testing dataset in [8] consists of only 5,198 instead of 5,379 characters, because it re-crops square character patches from the original dataset and ignores those characters along the boundary that cannot fit in a square bounding box.

Fig. 5: Some samples of the character recognition results. (a) Successfully recognized characters. (b) Wrongly recognized characters. First row: 5 (r), V (N), P (R), E (f), P (e), D (A). Second row: T (f), 4 (A), r (I), n (R), E (L). Third row: R (0), g (q), l (I), e (l), A (G), U (i), G (E). The characters before the parentheses are our predictions while those in parentheses are the ground truths.

Text case identification is a very challenging task for scene text recognition because scene texts usually lack specific document layout information. For scene texts, some upper case and lower case letters like C and c, S and s, O and o are extremely difficult to distinguish, even for human beings. In fact, letter cases are often annotated incorrectly in the ground truth dataset.
Therefore, we also report the case-insensitive result, which further increases the scene text recognition accuracy of the proposed technique to 83.6%. In addition, the accuracy of our proposed method on the SVT dataset is 75.4%, which is only 4% lower than that on ICDAR. This gap may be further reduced by retraining the SVM with 52 classes, because there are no digits in the SVT dataset. The above comparison to some degree shows the superiority of our proposed method, which works comparatively well even on a very different dataset.
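
The case-insensitive scores quoted here and below simply ignore letter case when a prediction is compared with its ground-truth label, so confusions such as C/c or S/s are not counted as errors. A small sketch of such a scoring function is given below; the function and variable names are illustrative rather than taken from the paper.

```python
import numpy as np

def char_accuracy(predictions, ground_truth, ignore_case=False):
    # optionally fold both sides to lower case before comparing
    if ignore_case:
        predictions = [p.lower() for p in predictions]
        ground_truth = [g.lower() for g in ground_truth]
    return float(np.mean(np.array(predictions) == np.array(ground_truth)))
```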

Currently, little character recognition accuracy has been reported on this dataset. The most recent work in [12] achieves an accuracy of 61.9%, which is much lower compared with our proposed method. If we ignore letter cases, the accuracy of our proposed method goes up to 80.6%. Fig. 5 shows some challenging examples that are correctly recognized by our proposed method as well as some failure cases. As Fig. 5a shows, the proposed technique is capable of recognizing many challenging characters in scenes. At the same time, many of the failure cases illustrated in Fig. 5b are difficult to read even for humans.

C. Discussion

We compute the confusion matrices on the two datasets and add them together as shown in Fig. 6. As Fig. 6 shows, the mistakes concentrate on the confusing letter cases discussed earlier. Besides, the two most obvious mistakes are the mis-classification between I and l, and between 0 and O, which even human beings often fail to differentiate correctly. Certain recognition failures can be explained by several other factors. For example, some characters are mistakenly annotated in both the ICDAR and SVT datasets. Besides, there even exists a character of size 1 x 35 pixels in the ICDAR 2003 dataset because some characters are not cropped carefully. The scene text recognition could be greatly improved without these interfering factors.

Fig. 6: Confusion matrix of character recognition on the SVT and ICDAR datasets. There are 62 classes indicated by the numbers on the coordinates, which represent 0-9, a-z and A-Z respectively.

V. CONCLUSION AND FUTURE WORK

Character recognition plays a crucial role in text recognition in scene images. We propose to use the co-occurrence histogram of oriented gradients (Co-HOG) with a weighted voting scheme for scene character recognition. Compared with the histogram of oriented gradients (HOG), Co-HOG captures more local spatial information while keeping the advantages of HOG, i.e., the robustness to illumination variation and invariance to local geometric transformations. The results on both the ICDAR 2003 and SVT datasets greatly outperform all previous feature descriptor based methods. The small accuracy gap between these two very different datasets shows the power of Co-HOG in capturing the shape information of characters under different scenes. In the future, we will investigate some global features and combine them with Co-HOG to formulate a more accurate scene character recognition technique.

REFERENCES
[1] X. Chen and A. Yuille, "Detecting and reading text in natural scenes," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, 2004, pp. II-366.
[2] W. Niblack, An Introduction to Digital Image Processing. Strandberg Publishing Company, 1985.
[3] K. Kita and T. Wakahara, "Binarization of color characters in scene images using k-means clustering and support vector machines," in Proceedings of the International Conference on Pattern Recognition (ICPR '10). Washington, DC, USA: IEEE Computer Society, 2010.
[4] A. Mishra, K. Alahari, and C. Jawahar, "An MRF model for binarization of natural scene text," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2011.
[5] T. E. de Campos, B. R. Babu, and M. Varma, "Character recognition in natural images," in VISAPP (2), 2009.
[6] J. Weinman, E. Learned-Miller, and A. Hanson, "Scene text recognition using similarity and a lexicon with sparse belief propagation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 10, 2009.
[7] L. Neumann and J. Matas, "A method for text localization and recognition in real-world images," in Computer Vision - ACCV 2010, 2011.
[8] A. Coates, B. Carpenter, C. Case, S. Satheesh, B. Suresh, T. Wang, D. Wu, and A. Ng, "Text detection and character recognition in scene images with unsupervised feature learning," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2011.
[9] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, 2005, pp. 886-893.
[10] K. Wang and S. Belongie, "Word spotting in the wild," in Computer Vision - ECCV 2010, 2010.
[11] K. Wang, B. Babenko, and S. Belongie, "End-to-end scene text recognition," in Proceedings of the IEEE International Conference on Computer Vision (ICCV). IEEE, 2011.
[12] A. Mishra, K. Alahari, and C. Jawahar, "Top-down and bottom-up cues for scene text recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2012.
[13] T. Watanabe, S. Ito, and K. Yokoi, "Co-occurrence histograms of oriented gradients for human detection," Information and Media Technologies, vol. 5, no. 2, 2010.
[14] C.-J. Hsieh, K.-W. Chang, C.-J. Lin, S. S. Keerthi, and S. Sundararajan, "A dual coordinate descent method for large-scale linear SVM," in Proceedings of the 25th International Conference on Machine Learning (ICML), 2008.
[15] M. A. Hearst, S. Dumais, E. Osman, J. Platt, and B. Scholkopf, "Support vector machines," IEEE Intelligent Systems and their Applications, vol. 13, no. 4, pp. 18-28, 1998.
[16] T. Joachims, "Training linear SVMs in linear time," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2006.
[17] S. M. Lucas, A. Panaretos, L. Sosa, A. Tang, S. Wong, and R. Young, "ICDAR 2003 robust reading competitions," in Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR), vol. 2, 2003.
[18] ABBYY FineReader 10.
