Human Action Recognition in Videos Using Hybrid Motion Features


Si Liu 1,2, Jing Liu 1, Tianzhu Zhang 1, and Hanqing Lu 1

1 National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
2 China-Singapore Institute of Digital Media, Singapore

Abstract. In this paper, we present hybrid motion features to promote action recognition in videos. The features are composed of two complementary components drawn from different views of motion information. On one hand, the period feature is extracted to capture global motion in the time domain. On the other hand, the enhanced histograms of motion words (EHOM) are proposed to describe local motion information. Each word is represented by the optical flow of a frame, the correlations between words are encoded into the transition matrix of a Markov process, and its stationary distribution is extracted as the final EHOM. Compared to the traditional Bags of Words representation, EHOM preserves not only the relationships between words but also, to some extent, the temporal information in videos. We show that by integrating local and global features, we obtain improved recognition rates on a variety of standard datasets.

Keywords: Action recognition, Period, EHOM, Optical flow, Bag of words, Markov process.

1 Introduction

With the wide spread of digital cameras for public visual surveillance, digital multimedia processing has received increasing attention during the past decade. Human action recognition is becoming one of the most important topics in computer vision, with applications in areas such as surveillance, video retrieval, and human-computer interaction.

Successful extraction of good features from videos is crucial to action recognition. Ke et al. [1] extend 2D box features to 3D spatio-temporal volumetric features. More recently, Sun et al. [2] propose to model spatio-temporal context information in a hierarchical way. Among all the proposed features, there is a large family that directly describes motion. For example, Bobick and Davis [3] develop the temporal template, which captures both motion and shape. Laptev [4] extracts motion-based space-time features; this representation views human actions as motion patterns. Zhang et al. [5] propose Motion Context (MC), which captures the distribution of motion words and thus summarizes the local motion information in a rich 3D MC descriptor. These motion-based approaches have been shown to be successful for action recognition.

Acknowledging the discriminative power of motion features, we propose to combine the period feature and enhanced histograms of motion words (EHOM) to describe motion in a video. Given the large variation in realistic videos, our features are easier to extract than 3D volumes, trajectories, or spatio-temporal interest points.

Period Features: Periodic motion occurs often in human actions; for example, running and walking can be seen as periodic actions in the leg region. A variety of methods therefore use period features for action recognition. Cutler and Davis [6] compute an object's self-similarity as it evolves in time; for periodic motion the self-similarity measure is also periodic, and they apply time-frequency analysis to detect and characterize the periodic motion. Liu et al. [7] also classify periodic motions.

Optical Flow Features: Efros et al. [8] recognize the actions of small-scale figures using features derived from optical flow measurements in a spatio-temporal volume for each stabilized human figure. Fathi and Mori [9] construct mid-level motion features from low-level optical flow information. Ali and Shah [10] propose a set of kinematic features derived from the optical flow. All of these achieve good results.

Hybrid Features: We believe that period and optical flow are complementary for action recognition, for two main reasons. First, optical flow only captures the motion between two adjacent frames and is thus inherently local, while the period feature captures global motion in the time domain. Suppose, for example, we want to differentiate walking from jogging: the two produce quite similar optical flow, so they are difficult to distinguish from optical flow features alone, yet the period feature separates them easily because a jogger's legs move faster. Second, period information is not evident in several actions, such as bending; however, the optical flow of bending, with its forward and rising components, is quite discriminative. To exploit this synergy, we use hybrid features consisting of both period features (capturing global motion) and optical flow features (capturing local motion) to build an effective recognition framework.

2 Overview of Our Recognition System

The main components of the system are illustrated in Fig. 1. We first produce a figure-centric spatio-temporal volume (see Fig. 1(a)) for each person; it can be obtained by running any detection/tracking algorithm over the input sequence and constructing a fixed-size window around the person. Afterwards, we divide every frame of the spatio-temporal volume into m × n blocks to make the proposed algorithm robust to noise and efficient to compute. By doing this, we also implicitly maintain spatial information in the frame when constructing features.
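As a concrete illustration of this decomposition, the following minimal sketch (in Python/NumPy; the array layout and the grid sizes m, n are our assumptions, since the paper does not fix them) splits a figure-centric volume into the m × n grid of cuboids used below:

```python
import numpy as np

def volume_to_cuboids(volume, m, n):
    """Split a figure-centric volume of shape (T, H, W) into an m x n grid of
    spatio-temporal cuboids, one per block location, stacked over all frames."""
    T, H, W = volume.shape
    bh, bw = H // m, W // n          # block height/width (any remainder is cropped)
    cuboids = np.empty((m, n, T, bh, bw), dtype=volume.dtype)
    for i in range(m):
        for j in range(n):
            cuboids[i, j] = volume[:, i*bh:(i+1)*bh, j*bw:(j+1)*bw]
    return cuboids                   # cuboids[i, j] is the (i, j) block over time
```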

Fig. 1. The framework of our approach: the input video is converted into a figure-centric spatio-temporal cuboid (a), from which the video-based period feature and the frame-based optical flow / video-based EHOM (b, c) are extracted and combined into hybrid motion features for an SVM classifier (d).

As a result, we get m × n smaller spatio-temporal cuboids, each consisting of all the blocks at the corresponding location in every frame (Fig. 1(b)). Sec. 3 addresses quasi-period extraction from each cuboid to describe the global motion in the time domain; the features of all the cuboids are concatenated to form the period feature of the video. Sec. 4 introduces the EHOM feature extraction: each frame's optical flow is first assigned a label by the k-means clustering algorithm, and based on these labels a Markov process is used to encode the dynamic information (Fig. 1(c)). The hybrid features are then constructed and fed into the subsequent multi-class SVM classifier (Fig. 1(d)). The experimental results are reported in Sec. 5, and conclusions are given in Sec. 6.

3 Period Feature Extraction

Based on the spatio-temporal cuboids obtained by dividing the original video, our frequency extraction approach is appearance-based, similar to [11]. Fig. 2 shows the block diagram of the module. First, we use probabilistic PCA (ppca) [12] to detect the maximum spatially coherent changes over time in the object's appearance. Spatially correlated input data are grouped together; unlike pixel-wise approaches, ppca treats these pixels as one physical entity, which makes the method robust to noise. The final output is a combination of two indicators: the estimated frequency f_est and the degree of periodicity per_est.

Fig. 2. Block diagram of the period extraction module: the figure-centric cuboid passes through ppca and frequency analysis to yield f_est and per_est.

Next, we describe the ppca phase and the frequency analysis phase in turn.

ppca for Robust Periodicity Detection: Let X_{D×N} = [x_1 x_2 ... x_N] represent the input video, with D the number of pixels in one frame and N the number of image frames; the rows of an aligned image frame are concatenated to form the column x_n. The optimal linear reconstruction \hat{X} of the data is given by

\hat{X} = W U + \bar{X},

where W_{D×Q} = [w_1 w_2 ... w_Q] is the set of orthonormal basis vectors, U_{Q×N} is the matrix of Q-dimensional vectors of unobserved variables (see Fig. 3(b)), and \bar{X} is the matrix of mean vectors \bar{x}. Each eigenvector's corresponding eigenvalue appears in Λ = diag(λ_1, λ_2, ..., λ_D) of the covariance matrix S of the input data X, computed by eigenvalue decomposition: S = V Λ V^T. The dimension Q is selected by setting the maximum percentage of variance to retain in the reconstructed matrix \hat{X}.

Frequency Analysis: The periodogram is a typical non-parametric frequency-analysis method that estimates the power spectrum from the Fourier transform of the autocovariance function. We choose the modified periodogram of the non-parametric class:

P_q(f) = \frac{1}{N} \left| \sum_{n=0}^{N-1} w(n)\, x(n)\, e^{-j 2\pi f n} \right|^2,

where N is the frame length, w(n) is the window used, and x(n) is the principal component vector u_q^T from the ppca (see Fig. 3(b)). Weighting each spectrum P_q(f) by the relative percentage \bar{λ}_q of retained variance and summing gives the combined spectrum

P(f) = \sum_{q=1}^{Q} \bar{λ}_q P_q(f), \quad \bar{λ}_q = \frac{λ_q}{\sum_{d=1}^{D} λ_d}.

To detect the dominant frequency component in the spectrum P(f) (see Fig. 3(c)), we first detect the peaks and the local minima that define the peaks' supports. Peaks with a frequency lower than f_s/N are discarded, with f_s the sampling rate of the video and N the frame length. Afterwards, starting from the lowest found frequency to the highest, each peak is checked against the others for harmonicity. We require that a fundamental frequency have a higher peak than its harmonics, and a tolerance of f_s/N is used in the matching process.
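To make this step concrete, here is a minimal sketch of the variance-weighted periodogram (a sketch under assumptions: PCA is done via SVD, the window is the Hanning window used later in the experiments, and all names are illustrative):

```python
import numpy as np

def weighted_spectrum(X, Q, fs):
    """X: (D, N) data matrix, one column per aligned frame.
    Returns frequencies and the variance-weighted modified periodogram P(f)."""
    Xc = X - X.mean(axis=1, keepdims=True)
    # PCA via SVD; rows of Vt are the principal-component time series u_q.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    lam = s ** 2                       # proportional to covariance eigenvalues
    w_bar = lam[:Q] / lam.sum()        # relative retained variance per component
    N = X.shape[1]
    win = np.hanning(N)                # Hanning window, as in the experiments
    P = np.zeros(N // 2 + 1)
    for q in range(Q):
        spec = np.abs(np.fft.rfft(win * Vt[q])) ** 2 / N   # modified periodogram P_q(f)
        P += w_bar[q] * spec
    freqs = np.fft.rfftfreq(N, d=1.0 / fs)
    return freqs, P
```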

Fig. 3. (a) The spatio-temporal cuboid of running, denoted in red. (b) The first two principal components of the cuboid. (c) The weighted spectrum of running; the peaks are denoted in red and their supports in green.

We select the group k with the highest total energy to represent the dominant frequency component in the data, where the total energy is the sum of the areas between the left and right supports E(·) of the fundamental frequency peak f_k^0 and its harmonics f_k^i:

f_{est} = \arg\max_{f_k^0} \left\{ E(f_k^0) + \sum_i E(f_k^i) \right\}.   (1)

The estimated frequency f_est in Fig. 3(c) is 120 mHz, which means that the motion repeats itself every 8.33 frames.

Note that whether or not the data are periodic, as long as there are some minor peaks in the spectrum P(f) the above method will still produce a frequency estimate. We therefore compare the energy of all peaks found in P(f) with the total energy to separate the two cases:

per_{est} = \frac{\sum_{k=1}^{K} E_Δ(f_k)}{\sum_f P(f)},   (2)

where K is the number of peaks detected and E_Δ(f_k) is the area of the triangle formed by the peak and its left and right supports. Note that the peak supports have zero energy for the spectrum of a periodic signal; by using only the triangle areas in the numerator of eq. (2), we assign a lower per_est value to quasi-periodic signals. The obtained per_est and f_est are then concatenated to generate the period component of the hybrid feature.
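A sketch of how eqs. (1) and (2) might be realized (peak and support detection is simplified here via scipy.signal.find_peaks and immediate neighbors; the paper's exact support and harmonic-grouping rules are richer):

```python
import numpy as np
from scipy.signal import find_peaks

def period_feature(freqs, P, fs, N, per_thresh=0.4):
    peaks, _ = find_peaks(P)
    peaks = peaks[freqs[peaks] >= fs / N]        # discard peaks below f_s / N
    if len(peaks) == 0:
        return 0.0, 0.0
    def tri_area(p):                             # crude E_delta: neighbors as supports
        left = max(p - 1, 0)
        right = min(p + 1, len(P) - 1)
        return 0.5 * (freqs[right] - freqs[left]) * P[p]
    energies = np.array([tri_area(p) for p in peaks])
    tol = fs / N                                 # harmonic-matching tolerance
    best_f, best_E = 0.0, -np.inf
    for p in peaks:                              # try each peak as the fundamental
        f0 = freqs[p]
        E = sum(e for q, e in zip(peaks, energies)
                if abs(freqs[q] / f0 - round(freqs[q] / f0)) * f0 < tol)
        if E > best_E:
            best_f, best_E = f0, E
    per_est = energies.sum() / P.sum()           # eq. (2): peak energy vs. total energy
    f_est = best_f if per_est >= per_thresh else 0.0
    return f_est, per_est
```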

Fig. 4. Block diagram of the EHOM extraction module: optical flow extraction, visual word generation, and a Markov process yield the EHOM.

4 Enhanced Histograms of Motion Words Extraction

As motion frequency is a global and thus coarse description of motion, we adopt a local and finer motion descriptor, optical flow, as a complement. Fig. 4 shows the block diagram of the module. First, we extract the optical flow of every frame. Then we generate a codebook by clustering all optical flow features in the training dataset. One could then directly compute the histogram of word occurrences over the entire video sequence, but doing so discards the time-domain information. For action recognition, however, the dynamic properties of object components are essential, e.g. for the action of standing up or an airplane taking off. That is why we go one step further and combine an optical-flow-based Bags of Words representation with a Markov process [13] to get EHOM. The result is independent of the length of the video and simultaneously maintains both the dynamic information and the correlations between words. To the best of our knowledge, we are the first to consider the relationships between motion words in action recognition.

The Lucas-Kanade algorithm [14] is employed to compute the optical flow for each frame. The optical flow vector field F is then split into the horizontal and vertical components of the flow, F_x and F_y. These two non-negative channels are then blurred with a Gaussian and normalized; they serve as our optical flow motion features for each frame. Blurring the optical flow reduces the influence of noise and of small spatial shifts in the figure-centric volume. For each frame, the optical flow features of all blocks are concatenated to form a longer vector.

Next, we represent a video sequence as Bags of Words, with a single word per frame: a word corresponds to a frame, and a document corresponds to a video sequence. Specifically, given the optical flow vector of every frame in the video, we construct a visual vocabulary with the k-means algorithm and then assign each frame to the closest vocabulary word (using Euclidean distance). In Fig. 5(a), different colors mean the corresponding frames are assigned to different visual words.

As mentioned, we go one step further than Bags of Words by modeling the relationships between motion words with a Markov process. Before going into details, we present some basic definitions. A Markov chain [15] is a sequence of random observed variables with the Markov property; it is a powerful tool for modeling the dynamic properties of a system. The Markov stationary distribution, associated with an ergodic Markov chain, offers a compact and effective representation of a dynamic system.
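A condensed sketch of the per-frame motion-word pipeline (several substitutions are assumptions: OpenCV's dense Farneback flow stands in for Lucas-Kanade, the channels are made non-negative by half-wave rectification, and the blur width is illustrative):

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def frame_flow_feature(prev_gray, gray, m, n, sigma=1.5):
    """Blurred non-negative flow channels, averaged over an m x n block grid."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    fx, fy = flow[..., 0], flow[..., 1]
    # Half-wave rectify into non-negative channels, then Gaussian-blur each.
    chans = [np.maximum(fx, 0), np.maximum(-fx, 0),
             np.maximum(fy, 0), np.maximum(-fy, 0)]
    chans = [cv2.GaussianBlur(c, (0, 0), sigma) for c in chans]
    H, W = gray.shape
    bh, bw = H // m, W // n
    feat = np.array([c[i*bh:(i+1)*bh, j*bw:(j+1)*bw].mean()
                     for c in chans for i in range(m) for j in range(n)])
    return feat / (np.linalg.norm(feat) + 1e-8)   # normalize, as in the text

# Codebook over all training frames, then one motion word per frame:
# kmeans = KMeans(n_clusters=100).fit(train_frame_feats)  # vocabulary size 100
# word_labels = kmeans.predict(video_frame_feats)
```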

Fig. 5. Construction of EHOM: (a) frames assigned to different visual words; (b) visual word transition diagram; (c) visual word occurrence matrix; (d) Markov stationary features.

Theorem 4.1. Any ergodic finite-state Markov chain is associated with a unique stationary distribution (row) vector π such that πP = π.

Theorem 4.2. 1) The limit A = lim_{n→∞} A_n exists for all ergodic Markov chains, where

A_n = \frac{1}{n+1} (I + P + \cdots + P^n).   (3)

2) Each row of A is the unique stationary distribution vector π.

Hence, when the ergodicity condition is satisfied, we can approximate A by A_n. To further reduce the approximation error for a finite n, π is calculated as the column average of A_n.

For consecutive frames in a fixed-length time window with their codebook labels, we translate the sequential relations between these labels into a directed graph, similar to the state diagram of a Markov chain (Fig. 5(b)). Here we get K vertices corresponding to the K codewords, and weighted edges corresponding to the occurrences of each transition between words. We then establish an equivalent matrix representation of the graph (Fig. 5(c)) and perform row normalization on the matrix to arrive at a valid transition matrix P of a Markov chain. Once we obtain the transition matrix P and verify that it is associated with an ergodic Markov chain, we use eq. (3) to compute π (Fig. 5(d)).
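The EHOM itself reduces to a few lines; the following sketch follows eq. (3) directly (the additive smoothing that keeps every row of P normalizable is our assumption, standing in for the paper's unstated ergodicity fix):

```python
import numpy as np

def ehom(labels, K, n=50):
    """Markov stationary distribution of the word-transition chain.
    labels: sequence of per-frame codebook indices in [0, K)."""
    C = np.zeros((K, K))
    for a, b in zip(labels[:-1], labels[1:]):    # count word transitions (Fig. 5(c))
        C[a, b] += 1
    C += 1e-6                                    # smoothing so every row normalizes (assumption)
    P = C / C.sum(axis=1, keepdims=True)         # row-normalize -> transition matrix
    # A_n = (I + P + ... + P^n) / (n + 1), eq. (3)
    A, Pk = np.eye(K), np.eye(K)
    for _ in range(n):
        Pk = Pk @ P
        A += Pk
    A /= (n + 1)
    return A.mean(axis=0)                        # pi = column average of A_n
```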

5 Experiments

Here we briefly introduce the parameters used in our experiments. In the period feature extraction phase, the ppca retained variance is 90% and a Hanning window is used for periodogram smoothing. If per_est is less than 0.4, i.e. the signal is not periodic, we set the corresponding f_est to zero. In the EHOM extraction phase, the vocabulary size is set to 100 and we use n = 50 to estimate A by A_n; the length of the time window is 20 frames. For classification, we use a support vector machine (SVM) with an RBF kernel, and we apply PCA to reduce the dimension of the period feature to match that of the EHOM (a sketch of this classification stage follows at the end of Sec. 5.1).

To prove the effectiveness of our hybrid feature, we test the algorithm on two human action datasets: the KTH human motion dataset [16] and the Weizmann human action dataset [17]. For each dataset, we perform leave-one-out cross-validation: in each run, we leave the videos of one person out as test data and use the rest of the videos for training.

5.1 Evaluating Different Components in the Hybrid Feature

We first show that both components of our proposed hybrid feature are quite discriminative. The period features of the 6 activities in the KTH database are illustrated in Fig. 6. The bottom three actions have different frequencies in the leg regions (denoted by red ellipses); specifically, f_running > f_jogging > f_walking, where f stands for the frequency of the leg region, which conforms to intuition. Fig. 7 compares our proposed EHOM with the traditional BoW representation and shows that better results are achieved by considering the correlations between motion words.

Fig. 6. The frequencies of different actions in the KTH database.

The next experiment demonstrates the benefit of combining the period and EHOM features. Fig. 8 shows the classification results for the period feature, the EHOM feature, and the hybrid of the two; the average accuracies are 80.49%, 89.38% and 93.47%, respectively. The EHOM component thus achieves a better result than the period component, and the hybrid feature is more discriminative than either component alone.
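A sketch of the classification stage under the settings above (the scikit-learn pipeline and all variable names are assumptions; the SVM hyperparameters are left at their defaults):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

def classify(period_feats, ehom_feats, labels, train_idx, test_idx):
    """Hybrid feature = PCA-reduced period feature concatenated with EHOM,
    classified by an RBF-kernel SVM (one train/test split of the
    leave-one-person-out protocol)."""
    d = ehom_feats.shape[1]            # reduce period feature to the EHOM dimension
    pca = PCA(n_components=d).fit(period_feats[train_idx])
    X = np.hstack([pca.transform(period_feats), ehom_feats])
    clf = SVC(kernel='rbf').fit(X[train_idx], labels[train_idx])
    return clf.predict(X[test_idx])
```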

Fig. 7. Comparison between BoW and EHOM on the KTH dataset:

    Method    Mean Accuracy
    BoW       87.25%
    EHOM      89.38%

Fig. 8. Comparison of the mean accuracy of different features on the KTH dataset:

    Method            Mean Accuracy
    period feature    80.49%
    EHOM feature      89.38%
    hybrid feature    93.47%

5.2 Comparison with the State of the Art

Experiments on the Weizmann Dataset: The Weizmann human action dataset contains 93 low-resolution video sequences showing 9 different people, each performing 10 different actions. We track and stabilize the figures using the background subtraction masks that come with the dataset. Fig. 9(a) shows some sample frames, and the confusion matrix of our results is shown in Fig. 9(b). Our method achieves 100% accuracy.

Fig. 9. Results on the Weizmann dataset: (a) sample frames; (b) confusion matrix over the ten actions (bend, jack, jump, pjump, run, side, skip, walk, wave1, wave2) using 100 codewords (overall accuracy = 100%).

Experiments on the KTH Dataset: The KTH human motion dataset contains six types of human actions (walking, jogging, running, boxing, hand waving and hand clapping). Each action is performed several times by 25 subjects in four different conditions: outdoors, outdoors with scale variation, outdoors with different clothes, and indoors.

Fig. 10. Comparison of the mean accuracy of different methods on the KTH dataset:

    Method              Mean Accuracy
    Ali and Shah [10]   87.70%
    Fathi and Mori [9]  90.50%
    Laptev et al. [18]  91.80%
    Our method          93.47%

Fig. 11. Results on the KTH dataset: (a) sample frames; (b) confusion matrix over the six actions (boxing, handclapping, handwaving, jogging, running, walking) using 100 codewords (overall accuracy = 93.47%).

Representative frames of this dataset are shown in Fig. 11(a). Note that a person may move in different directions within a video of the KTH database [16], so we divide each video into several segments according to the person's moving direction. Since most previously published results assign a single label to each video, we also report per-video classification on the KTH dataset; the per-video label is acquired by majority voting. The confusion matrix on the KTH dataset is shown in Fig. 11(b). Most of the confusion is among the last three actions: running, jogging and walking. We compare our results with the current state of the art in Fig. 10; our results outperform the other methods. One reason for the improvement is the complementarity of the period and EHOM components of our feature; another is that combining the Bags of Words representation with a Markov process preserves the correlations between words and, to some extent, the temporal information.

6 Conclusion

In this paper, we propose an efficient feature for human action recognition: a hybrid feature composed of two complementary ingredients. The period component captures global motion in the time domain, while the EHOM component, as an additional source of evidence, describes local motion information. When generating EHOM, we integrate the Bags of Words representation with a Markov process to relax the requirement on the duration of videos and to maintain the dynamic information. Experiments confirm the complementary roles of the two components. The proposed algorithm is simple to implement, and experiments have demonstrated its improved performance compared with state-of-the-art algorithms on the

task of action recognition. Since we have already achieved good results on benchmark databases under controlled settings, we plan to test our algorithm in more complicated settings, such as movies, in the future.

References

1. Ke, Y., Sukthankar, R., Hebert, M.: Efficient visual event detection using volumetric features. In: ICCV (2005)
2. Sun, J., Wu, X., Yan, S., Chua, T., Cheong, L., Li, J.: Hierarchical spatio-temporal context modeling for action recognition. In: CVPR (2009)
3. Bobick, A.F., Davis, J.W.: The recognition of human movement using temporal templates. IEEE Transactions on Pattern Analysis and Machine Intelligence 23 (2001)
4. Laptev, I., Lindeberg, T.: Space-time interest points. In: ICCV (2003)
5. Zhang, Z., Hu, Y., Chan, S., Chia, L.-T.: Motion context: A new representation for human action recognition. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part IV. LNCS, vol. 5305. Springer, Heidelberg (2008)
6. Cutler, R., Davis, L.S.: Robust real-time periodic motion detection, analysis, and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (2000)
7. Liu, Y., Collins, R., Tsin, Y.: Gait sequence analysis using frieze patterns. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds.) ECCV 2002. LNCS, vol. 2351. Springer, Heidelberg (2002)
8. Efros, A., Berg, A., Mori, G., Malik, J.: Recognizing action at a distance. In: ICCV (2003)
9. Fathi, A., Mori, G.: Action recognition by learning mid-level motion features. In: CVPR (2008)
10. Ali, S., Shah, M.: Human action recognition in videos using kinematic features and multiple instance learning. IEEE Transactions on Pattern Analysis and Machine Intelligence (2008)
11. Pogalin, E., Smeulders, A.W.M., Thean, A.H.C.: Visual quasi-periodicity. In: CVPR (2008)
12. Tipping, M.E., Bishop, C.M.: Probabilistic principal component analysis. Journal of the Royal Statistical Society, Series B 61 (1999)
13. Li, J., Wu, W., Wang, T., Zhang, Y.: One step beyond histograms: Image representation using Markov stationary features. In: CVPR (2008)
14. Lucas, B.D., Kanade, T.: An iterative image registration technique with an application to stereo vision. In: DARPA Image Understanding Workshop (1981)
15. Breiman, L.: Probability. SIAM (1992)
16. Schuldt, C., Laptev, I., Caputo, B.: Recognizing human actions: A local SVM approach. In: ICPR (2004)
17. Blank, M., Gorelick, L., Shechtman, E., Irani, M., Basri, R.: Actions as space-time shapes. In: ICCV (2005)
18. Laptev, I., Marszalek, M., Schmid, C., Rozenfeld, B.: Learning realistic human actions from movies. In: CVPR (2008)
