MULTIMODAL BASED HIGHLIGHT DETECTION IN BROADCAST SOCCER VIDEO


YIFAN ZHANG, QINGSHAN LIU, JIAN CHENG, HANQING LU

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China

{yfzhang, qsliu, jcheng,

Abstract: In this paper, we propose an effective fusion scheme of audio and visual modalities for highlight detection in broadcast soccer videos. Adaboost learning is adopted to select discriminating audio features for special-sound classification. Logo-based replay shot detection is used for mid-level visual semantic analysis. A finite state machine is utilized to integrate the audio and visual analysis results for further highlight detection. Experiments conducted on several real-world soccer game videos show that the proposed method has encouraging performance.

Keywords: Highlight detection; broadcast video; multimodal analysis

1. Introduction

The quantity and availability of sports video content is soaring due to the popularization of television and the internet. Video services via new media channels such as network TV and mobile devices have shown tremendous commercial potential and have brought huge demand for personalized sports video services tailored to consumers' preferences. The traditional one-to-many broadcast mode cannot meet different audiences' demands. People are often interested only in the highlights of lengthy and voluminous sports video programs and want to skip the less interesting parts. In this paper, an effective multimodal approach using audio and visual features for highlight detection in broadcast sports videos is presented. We initially focus the application domain on soccer, because it is the most popular sport and appeals to large audiences. In a whole soccer game, usually only a small portion is exciting and highlight-worthy. Manually generating the highlights of soccer videos is labor-intensive, as editors need to browse the whole game.
The soccer game's dynamic and flexible structure also makes video parsing and analysis challenging. The audio track is an important information source: it correlates well with semantics and is inexpensive to compute, so it is valuable for highlight detection. In soccer games, the audio track consists of whistles, audience applause, commentator speech, and various environmental sounds. Based on our observation, highlight-worthy events always occur with excited commentator speech, excited audience applause, and sometimes the referee's whistle. However, whistles also occur in common events such as fouls, kickoffs, and the start and end of the game, and some whistles may come from the audience rather than the referee, which hurts detection accuracy. Audience applause is always mixed with various environmental noises. Thus, we use excited commentator speech as the audio cue to facilitate highlight inference and detection. However, audio-based analysis is not always reliable in soccer videos because the environmental sounds are very noisy, so we also appeal to visual analysis. To limit computational complexity and enhance robustness, we only use replay shot detection in the visual analysis. A replay shot is a special effect inserted by the TV director to explain the game's progress and show players' details. It is a significant mid-level feature with a strong relationship to highlights. However, replay shots are sometimes inserted to show technical details such as fouls and offsides, which are less interesting to most audiences. Therefore, combining audio cue extraction with replay shot detection can effectively reduce the false positives of each single-modality analysis.
The scheme of our proposed solution contains three parts: audio cue extraction, replay shot detection, and audio-visual integration. For audio cue extraction, Adaboost is utilized to automatically select discriminating low-level features for audio classification.

Replay shot detection is based on detecting the flying logo before and after replay shots, which is a production rule in broadcast sports videos. Finally, a finite state machine is designed to integrate the audio and visual analysis results for highlight detection. Since the features we use are generic to broadcast sports videos rather than game-specific, it is easy to extend our approach to other application domains.

2. Related Work

Highlight detection and game analysis for sports video has attracted much research attention in recent years [1]. Most existing methods are based on visual analysis [2, 3], which attempts to extract mid-level semantic concepts from low-level visual information. For soccer games, [4] tried to use the position information of the players and the ball during the game, and therefore needed a quite complex and accurate tracking system. Ekin et al. [5] proposed a framework using object-based features for analysis and summarization of soccer videos; it included novel video processing approaches such as dominant color region detection, referee detection and penalty-box detection. TV broadcasting rules were also used together with visual information to detect goal events. However, visual features are not only expensive to compute but also not very robust. Hence, some researchers began to focus on audio analysis [6, 7]. Rui et al. [6] detected speech and ball-hit sounds to extract highlights from baseball videos; several learning algorithms were compared for speech classification, and a directional template matching approach was used for ball-hit sound detection. Since game-specific sounds and domain knowledge are used, it seems difficult to generalize to other sports. In [7], SVMs were employed to train sound recognizers (applause, speech and whistles), under the assumption that those sounds are closely related to events under specific sports game rules.
The low-level audio features used for the recognizers were selected manually, which was labor-intensive in training and testing and difficult to fit to different classification tasks. Some researchers attempted to combine audio and visual features to improve detection precision. Han et al. [8] used a maximum entropy method to integrate audio, image and speech for highlight detection in baseball videos. Nepal et al. [9] employed a heuristic approach combining cheering, score display and camera motion transitions to detect goal events in basketball games. These methods are mostly based on specific domain rules and game-specific features.

3. Audio Cue Extraction

An audio cue is significant audio information that has a strong relationship with the semantics of the game and can facilitate highlight detection. In soccer, we use excited commentator speech as the audio cue because it correlates well with exciting events and is relatively easier to classify than other audio information.

3.1. Feature Extraction

Since the audio track consists of sounds mainly from the commentator, the audience, whistles and other environmental noise, we extract features which can well characterize those sounds from both the time domain and the frequency domain of the audio signals.

3.1.1. Mel-Frequency Cepstral Coefficients (MFCCs)

The mel-frequency cepstrum has proved effective in speech recognition and in modeling the subjective pitch and frequency content of audio signals. The frequency bands are positioned logarithmically (on the mel scale), which approximates the human auditory system's response more closely than the linearly spaced frequency bands of the FFT or DCT. The MFCCs are computed from the FFT power coefficients filtered by a triangular band-pass filter bank as follows:

C_n = \sum_{k=1}^{K} (\log S_k) \cos[n (k - 0.5) \pi / K], \quad n = 1, 2, \ldots, N    (1)

where S_k is the output of the k-th filter bank, K is the number of filters, and N is the number of MFCC dimensions.
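As a rough illustration of equation (1), the following NumPy sketch computes cepstral coefficients from filter-bank outputs. The filter bank here is a crude placeholder (plain uniform band sums, not a true mel-scale triangular bank), and all names and parameter values are illustrative:

```python
import numpy as np

def mfcc_from_filterbank(S, n_ceps=13):
    """Compute cepstral coefficients from filter-bank outputs S via eq. (1):
    C_n = sum_k (log S_k) * cos[n (k - 0.5) pi / K]."""
    K = len(S)
    log_S = np.log(S)
    k = np.arange(1, K + 1)
    return np.array([np.sum(log_S * np.cos(n * (k - 0.5) * np.pi / K))
                     for n in range(1, n_ceps + 1)])

# toy example: one 50 ms frame of a synthetic 440 Hz tone at 44.1 kHz
sr = 44100
t = np.arange(int(0.05 * sr)) / sr
frame = np.sin(2 * np.pi * 440 * t)
power = np.abs(np.fft.rfft(frame)) ** 2   # FFT power coefficients

# placeholder band-summing "filter bank" over the power spectrum
K = 20
edges = np.linspace(0, len(power) - 1, K + 2).astype(int)
S = np.array([power[edges[i]:edges[i + 2] + 1].sum() + 1e-10 for i in range(K)])

print(mfcc_from_filterbank(S).shape)
```

In practice the band edges would be spaced on the mel scale with triangular weights; the DCT-like projection of the log filter-bank energies is the part equation (1) specifies.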
The deltas and accelerations of the MFCCs are also used in our experiments.

3.1.2. Linear Prediction Coefficients (LPCs)

The LPCs are the coefficients of linear prediction coding, which are frequently used for transmitting spectral envelope information. By minimizing the sum of squared differences between the actual audio samples and the predicted ones, a set of predictor coefficients can be determined. The Levinson recursion is used for the iterative calculation.

3.1.3. LPC Cepstral Coefficients (LPCCs)

The LPCCs are the cepstral coefficients converted from the linear prediction coefficients. The LPCs are denoted [a_0, a_1, a_2, ..., a_p] and the LPCCs are denoted [B_0, B_1, B_2, ..., B_p, ..., B_{n-1}]. The recursion is defined by the

following equations:

B_0 = \ln E    (2)

B_m = a_m + \frac{1}{m} \sum_{k=1}^{m-1} (m - k)\, a_k B_{m-k}, \quad 1 \le m \le p    (3)

B_m = \frac{1}{m} \sum_{k=1}^{m-1} (m - k)\, a_k B_{m-k}, \quad p < m < n    (4)

where E is the prediction error and n is the number of cepstral coefficient dimensions.

3.1.4. Zero Crossing Rate (ZCR)

The zero crossing rate is the rate of sign changes along a signal, and is a simple measure of its frequency content. It is calculated as

R = \frac{1}{T - 1} \sum_{t=1}^{T-1} \mathbb{1}\{\mathrm{sign}(s_t s_{t-1}) < 0\}    (5)

where s is a signal of length T and sign(a) is the algebraic sign of its argument.

3.1.5. Short Time Energy (STE)

The short time energy is the mean square of the samples in each frame, weighted with a Hamming window h(n). It is calculated as

STE = \frac{1}{T} \sum_{n=0}^{T-1} [s(n)\, h(T - n)]^2    (6)

where s is a signal of length T.

Adaboost is a popular learning algorithm which can select and weight discriminating features to build efficient classifiers [10]. In this paper, we use Adaboost to select the most discriminating features for excited commentator speech classification, with a simple Gaussian weak classifier for each feature dimension. The whole process is shown in Table 1.

Table 1. Feature selection by Adaboost

1. Initialize weights w_{1,i} = 1/n for positive samples and 1/m for negative samples, where n and m are the numbers of positives and negatives respectively.
2. For t = 1, ..., T:
   a. Normalize the weights w_{t,i}.
   b. For each feature f_j, train the Gaussian classifier G_j. The error is evaluated with respect to w_{t,i}: e_j = \sum_i w_{t,i} |G_j(x_i) - y_i|, where y_i is the label of sample x_i.
   c. Choose the feature f_t with the minimum error e_t.
   d. Update the weights: w_{t+1,i} = w_{t,i} \beta_t^{1 - \delta_i}, where \delta_i = 0 if sample x_i is classified correctly, \delta_i = 1 otherwise, and \beta_t = e_t / (1 - e_t).
3. The final selected feature vector is {\alpha_1 f_1, \alpha_2 f_2, ..., \alpha_T f_T}, where \alpha_t = \log(1/\beta_t).

3.2. Feature Selection

We segment the original audio signal into 50 ms frames as the basic unit.
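The procedure of Table 1 might be sketched as follows. The per-dimension Gaussian weak classifier is simplified here to a weighted nearest-class-mean rule, and the toy data and all names are illustrative, not the paper's implementation:

```python
import numpy as np

class GaussianStump:
    """Weak classifier on one feature dimension: label a sample by the
    nearer of the two weighted class means (a simplification of a
    per-dimension Gaussian classifier)."""
    def __init__(self, dim):
        self.dim = dim
    def fit(self, X, y, w):
        x = X[:, self.dim]
        self.mu_pos = np.average(x[y == 1], weights=w[y == 1])
        self.mu_neg = np.average(x[y == 0], weights=w[y == 0])
        return self
    def predict(self, X):
        x = X[:, self.dim]
        return (np.abs(x - self.mu_pos) < np.abs(x - self.mu_neg)).astype(int)

def adaboost_select(X, y, n_rounds=5):
    """AdaBoost-style feature selection following Table 1."""
    w = np.where(y == 1, 1.0 / max((y == 1).sum(), 1),
                         1.0 / max((y == 0).sum(), 1))
    selected, alphas = [], []
    for _ in range(n_rounds):
        w = w / w.sum()                                  # step a: normalize
        stumps = [GaussianStump(j).fit(X, y, w) for j in range(X.shape[1])]
        errs = [np.sum(w * np.abs(g.predict(X) - y)) for g in stumps]  # step b
        t = int(np.argmin(errs))                         # step c: best feature
        e = min(max(errs[t], 1e-10), 1 - 1e-10)
        beta = e / (1 - e)
        correct = (stumps[t].predict(X) == y)
        w = w * np.power(beta, correct.astype(float))    # step d: beta^(1 - delta)
        selected.append(t)
        alphas.append(np.log(1 / beta))
    return selected, alphas

# toy data: feature 0 is discriminative, feature 1 is pure noise
rng = np.random.default_rng(0)
X = np.c_[np.r_[rng.normal(0, 1, 100), rng.normal(3, 1, 100)],
          rng.normal(0, 1, 200)]
y = np.r_[np.zeros(100, int), np.ones(100, int)]
feats, alphas = adaboost_select(X, y, n_rounds=3)
print(feats)  # the discriminative feature should be picked first
```

On such data the first selected feature is the discriminative one, since the noise dimension has error near 0.5.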
Each frame is described by the low-level features of section 3.1, normalized and combined into one vector. Audio cue extraction can then be formulated as a two-class classification problem (excited commentator speech vs. others): frames belonging to excited commentator speech segments are positive samples, and the other frames are negative. In the work of Rui et al. [6] and Xu et al. [7], SVM proved to be an effective classifier. However, they did not consider the properties of the different low-level features, which in fact influence audio classification differently. For example, the energy and MFCC features perform well in speech detection, while whistles are easily distinguished by the ZCR feature [6, 7]. Moreover, simply combining all features together can sometimes degrade classification performance. Thus, it is necessary to perform feature selection automatically according to the practical demands.

4. Replay Detection

In most broadcast soccer games, there is a special transition at both the start and the end of a replay: a logo flies in and disappears gradually. Based on our observation, over 90% of broadcast soccer videos use a flying logo to launch replays, which can be seen as a production rule. Figure 1 shows examples of the flying logo in several soccer game videos.

FIG. 1. Flying logo in (a) World Cup 2006, (b) European Championship 2004, (c) European Champions League and (d) England Premier League

Based on our previous work [11], we use an effective solution for replay shot detection using the flying logo. The solution consists of logo-transition detection, logo detection and replay recognition. We first detect the logo-transitions

and then extract logo samples from them. Next, we employ a template matching approach to detect the remaining logos. After all the logos are obtained, the video can be partitioned into replay and non-replay segments. In logo-transition detection, the difference between neighboring frames is measured by the intensity mean square difference (MSD). We count the number of consecutive inter-frame differences exceeding a certain threshold; if the count is large enough, a wipe transition is declared. The logo template is obtained from the average image of the samples in the transition process, and color and shape features are used in template matching. Ideally, a pair of detected logos determines a replay shot. However, because of false and missed detections, we add other features (such as shot length, shot type and motion vectors) to help with replay shot recognition. For further technical details, refer to [11].

5. Audio-Visual Fusion

An important part of our scheme is integrating the audio and visual analysis results for highlight detection. In the audio modality, since observation of real-world sports games reveals that excited commentator speech usually lasts much longer than one second, we divide the audio stream into one-second segments, each labeled by majority voting over the classification results of its frame sequence. In the visual modality, the video stream is also divided into one-second segments, each labeled 1 for replay and 0 for non-replay. Highlights are always followed by a replay shot, and the excited commentator speech occurs before or during the replay shot. Therefore, a forward-search rule is used to search for excited commentator speech based on the replay shot detection results. The search rule between the audio and video streams is shown in Figure 2.
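As a rough sketch of this fusion step, assuming the two FSM rules stated in this section (replay shots no longer than 60 s; at most 30 s between the audio cue and the replay), the search can be implemented as a scan from the replay moment back toward earlier one-second audio segments. Function names and data layouts are illustrative:

```python
def detect_highlights(replay_segments, audio_cue_seconds,
                      max_replay_len=60, max_gap=30):
    """replay_segments: (start_sec, end_sec) pairs from replay detection.
    audio_cue_seconds: set of seconds labeled excited commentator speech.
    Returns (start, end) highlight segments."""
    highlights = []
    for start, end in replay_segments:
        if end - start > max_replay_len:          # rule 1: reject false replays
            continue
        found = None
        t = start
        while t >= 0 and start - t <= max_gap:    # rule 2: bounded interval
            if t in audio_cue_seconds:
                found = t                          # keep extending to earlier cues
            t -= 1
        if found is not None:
            highlights.append((found, end))        # cue start through replay end
    return highlights

replays = [(100, 130), (300, 420)]   # the second candidate is too long (rule 1)
cues = {80, 85, 110}
print(detect_highlights(replays, cues))
```

This simplification only searches before the replay; a fuller version would also accept cues inside the replay shot, as the text allows.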
FIG. 2. Search rule between audio and video streams

Based on the forward-search rule, a finite state machine (FSM) is designed to detect highlights. Based on observation, we define two rules in the FSM for soccer; they can be modified to adapt to other kinds of sports video.

Rule 1: a replay shot should not be longer than 60 seconds.
Rule 2: the interval between excited commentator speech and the replay shot should not be longer than 30 seconds.

Transition conditions: A: rule 1 not satisfied; B: audio cue found; C: audio cue not found and rule 2 satisfied; D: audio cue not found and rule 2 not satisfied.

FIG. 3. Finite state machine for highlight detection

The FSM's states and transition conditions are shown in Figure 3. It first searches for replay shots as in section 4. If a detected segment is not longer than 60 seconds, it is regarded as a replay shot; otherwise it is regarded as a false replay detection. Then the forward search is carried out in the audio stream from the replay moment. If an audio cue (an excited commentator speech segment) is found and obeys rule 2, a highlight is determined. The forward search continues to find other audio cues to include in the highlight segment until rule 2 is no longer satisfied.

6. Experimental Results

We conduct experiments on 5 real-world soccer games (3 FIFA World Cup 2006 games and 2 UEFA European Championship 2004 games). The audio was collected at a 44.1 kHz sample rate, 705 kbps bit rate and 16 bits per sample. For audio cue extraction, 10 minutes of audio data from 3 games are used for feature selection and classifier training; the rest is used for testing. The excited commentator speech frames in the original audio track are labeled manually as the ground truth. To further evaluate the proposed Adaboost classifier, we also investigate SVM classifiers using every single feature and several combinations of them.
Figure 4 shows the error rates of excited commentator speech detection. In this figure, the last bar corresponds to the Adaboost classifier, while the others are SVM classifiers using the corresponding features. It is clear that not all the features

are effective for classification. The SVM classifier using all the features yields a high error rate. The Adaboost classifier and the SVMs using the MFCC and STE features all perform well. It is encouraging that our approach is comparable to the SVM using the best features, which were evaluated and selected manually. To strengthen this conclusion, we change the classification task to whistle detection, i.e. detecting whistles against all other sounds (the results are shown in Figure 5). The Adaboost classifier still obtains the second lowest error rate, slightly higher than the SVM using the STE and ZCR features.

FIG. 4. Excited commentator speech detection

FIG. 5. Whistle detection

Experiments are also conducted on the audio modality and the visual modality separately: segments containing excited commentator speech (audio) or replay shots (visual) are regarded as highlights. Compared with the ground truth, the results are listed in Table 4 and Table 5 respectively. Although the recall is good, the detection precision of a single modality is unfortunately low. This is because some technical but not exciting events (e.g. fouls, offsides) also have replay shots, and audio cue detection is not always reliable due to the environmental noise in soccer games; to guarantee recall, we have to sacrifice precision. Hence, the integration of audio and visual analysis can effectively reduce false positives and achieve satisfactory results.

For replay shot detection, three games are used to test the performance of the flying-logo-based approach. The results are listed in Table 2. Some missed detections are due to the logos themselves being absent in the original videos.

Table 2. Replay shot detection

Game             | Precision % | Recall %
Portugal_Mexico  |             |
France_Spain     |             |
Czech_Greece     |             |

In the audio-visual fusion stage, the results of audio cue extraction and replay shot detection are integrated to finally detect the highlights. A human subject (not involved in our project) was asked to watch the 5 real-world soccer games and select the highlights as the ground truth. Table 3 lists the results of the multimodal highlight detection approach. 99 of the 114 highlights were successfully detected from the 5 games; 15 were missed. Of these 15 segments, 6 are caused by missed replay shot detection, and in the other 9 the commentator speech is not very excited. The relatively low result for the 5th game is due to the low quality of the audio track in the original video.

Table 3. Highlight detection by multimodal fusion

No. | Game               | True | False | Miss | Precision | Recall
1   | France_Spain       |      |       |      |           | 95.7%
2   | Germany_Costa Rica |      |       |      |           | 84.0%
3   | Portugal_Mexico    |      |       |      |           | 90.0%
4   | Portugal_England   |      |       |      |           | 88.5%
5   | Czech_Greece       |      |       |      |           | 75.0%

Table 4. Highlight detection by audio modality

No. | Game               | True | False | Miss | Precision | Recall
1   | France_Spain       |      |       |      |           | 95.7%
2   | Germany_Costa Rica |      |       |      |           | 84.0%
3   | Portugal_Mexico    |      |       |      |           | 90.0%
4   | Portugal_England   |      |       |      |           | 92.3%
5   | Czech_Greece       |      |       |      |           | 75.0%

Table 5. Highlight detection by visual modality

No. | Game               | True | False | Miss | Precision | Recall
1   | France_Spain       |      |       |      |           | 95.7%
2   | Germany_Costa Rica |      |       |      |           | 84.0%
3   | Portugal_Mexico    |      |       |      |           | 95.0%
4   | Portugal_England   |      |       |      |           | 88.5%
5   | Czech_Greece       |      |       |      |           | 80.0%
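As a small worked example of the metrics in Tables 2-5 (the function and the false-positive count are illustrative; only the 99-of-114 figure comes from the text), precision and recall follow directly from the True/False/Miss counts:

```python
def precision_recall(true_pos, false_pos, missed):
    """Precision = True / (True + False); Recall = True / (True + Miss)."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + missed)
    return precision, recall

# overall recall reported in the text: 99 of 114 highlights detected
p, r = precision_recall(99, 1, 15)   # the false-positive count 1 is a placeholder
print(round(r, 3))  # 0.868
```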

7. Conclusion

In this paper, a multimodal highlight detection scheme is proposed for broadcast soccer games. Adaboost learning is presented to select discriminating audio features for excited commentator speech classification. To limit computational complexity, only replay shot detection is used in the visual analysis. A finite state machine is adopted to fuse the audio and visual analysis for highlight detection. The experimental results show that the integration of audio and visual analysis is effective for highlight detection. Our next step is to add other effective visual cues, such as object-based features, to enhance the detection.

8. Acknowledgements

The research is supported by the 863 Program of China (Grant No. 2006AA01Z315, 2006AA01Z117), NNSF of China (Grant No. ) and NSF of Beijing (Grant No. ).

References

[1] Y. H. Gong, L. T. Sin, C. H. Chuan, H. J. Zhang, and M. Sakauchi, Automatic parsing of TV soccer programs, in Proc. of International Conference on Multimedia Computing and Systems, (1995)
[2] Y. P. Tan, D. D. Saur, S. R. Kulkarni, and P. J. Ramadge, Rapid estimation of camera motion from compressed video with application to video annotation, IEEE Trans. on Circuits and Systems for Video Technology, vol. 10, (2000)
[3] P. Xu, L. Xie, S. F. Chang, A. Divakaran, A. Vetro, and H. Sun, Algorithms and systems for segmentation and structure analysis in soccer video, in Proc. of International Conference on Multimedia and Expo, Tokyo, Japan, (2001)
[4] V. Tovinkere and R. J. Qian, Detecting semantic events in soccer games: Toward a complete solution, in Proc. of International Conference on Multimedia and Expo, Tokyo, Japan, (2001)
[5] A. Ekin and M. Tekalp, Automatic soccer video analysis and summarization, in Proc. of IS&T/SPIE, Santa Clara, CA, (2003)
[6] Y. Rui, A. Gupta, and A. Acero, Automatically extracting highlights for TV baseball programs, in Proc. of ACM Multimedia, Los Angeles, CA, (2000)
[7] M. Xu, N. C. Maddage, C. S. Xu, M. Kankanhalli, and Q. Tian, Creating audio key-words for event detection in soccer video, in Proc. of International Conference on Multimedia and Expo, (2003) 6-9
[8] M. Han, W. Hua, W. Xu, and Y. H. Gong, An integrated baseball digest system using maximum entropy method, in Proc. of ACM Multimedia, (2002)
[9] S. Nepal, U. Srinivasan, and G. Reynolds, Automatic detection of goal segments in basketball videos, in Proc. of ACM Multimedia, Ottawa, Canada, (2001)
[10] Y. Freund and R. E. Schapire, A decision-theoretic generalization of on-line learning and an application to boosting, in Computational Learning Theory (EuroCOLT '95), Springer-Verlag, (1995) 23-37
[11] X. F. Tong, H. Q. Lu, Q. S. Liu, and H. L. Jin, Replay detection in broadcasting sports video, in Proc. of ICIG, (2004)


More information

Interactive Video Retrieval System Integrating Visual Search with Textual Search

Interactive Video Retrieval System Integrating Visual Search with Textual Search From: AAAI Technical Report SS-03-08. Compilation copyright 2003, AAAI (www.aaai.org). All rights reserved. Interactive Video Retrieval System Integrating Visual Search with Textual Search Shuichi Shiitani,

More information

SVM-based Soccer Video Summarization System

SVM-based Soccer Video Summarization System SVM-based Soccer Video Summarization System Hossam M. Zawbaa Cairo University, Faculty of Computers and Information Email: hossam.zawba3a@gmail.com Nashwa El-Bendary Arab Academy for Science, Technology,

More information

A Linear Approximation Based Method for Noise-Robust and Illumination-Invariant Image Change Detection

A Linear Approximation Based Method for Noise-Robust and Illumination-Invariant Image Change Detection A Linear Approximation Based Method for Noise-Robust and Illumination-Invariant Image Change Detection Bin Gao 2, Tie-Yan Liu 1, Qian-Sheng Cheng 2, and Wei-Ying Ma 1 1 Microsoft Research Asia, No.49 Zhichun

More information

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, 2013 ISSN:

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, 2013 ISSN: Semi Automatic Annotation Exploitation Similarity of Pics in i Personal Photo Albums P. Subashree Kasi Thangam 1 and R. Rosy Angel 2 1 Assistant Professor, Department of Computer Science Engineering College,

More information

Speaker Diarization System Based on GMM and BIC

Speaker Diarization System Based on GMM and BIC Speaer Diarization System Based on GMM and BIC Tantan Liu 1, Xiaoxing Liu 1, Yonghong Yan 1 1 ThinIT Speech Lab, Institute of Acoustics, Chinese Academy of Sciences Beijing 100080 {tliu, xliu,yyan}@hccl.ioa.ac.cn

More information

Automatic Shadow Removal by Illuminance in HSV Color Space

Automatic Shadow Removal by Illuminance in HSV Color Space Computer Science and Information Technology 3(3): 70-75, 2015 DOI: 10.13189/csit.2015.030303 http://www.hrpub.org Automatic Shadow Removal by Illuminance in HSV Color Space Wenbo Huang 1, KyoungYeon Kim

More information

Recall precision graph

Recall precision graph VIDEO SHOT BOUNDARY DETECTION USING SINGULAR VALUE DECOMPOSITION Λ Z.»CERNEKOVÁ, C. KOTROPOULOS AND I. PITAS Aristotle University of Thessaloniki Box 451, Thessaloniki 541 24, GREECE E-mail: (zuzana, costas,

More information

Learning the Three Factors of a Non-overlapping Multi-camera Network Topology

Learning the Three Factors of a Non-overlapping Multi-camera Network Topology Learning the Three Factors of a Non-overlapping Multi-camera Network Topology Xiaotang Chen, Kaiqi Huang, and Tieniu Tan National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy

More information

Image retrieval based on bag of images

Image retrieval based on bag of images University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2009 Image retrieval based on bag of images Jun Zhang University of Wollongong

More information

A Bagging Method using Decision Trees in the Role of Base Classifiers

A Bagging Method using Decision Trees in the Role of Base Classifiers A Bagging Method using Decision Trees in the Role of Base Classifiers Kristína Machová 1, František Barčák 2, Peter Bednár 3 1 Department of Cybernetics and Artificial Intelligence, Technical University,

More information

Graph Matching Iris Image Blocks with Local Binary Pattern

Graph Matching Iris Image Blocks with Local Binary Pattern Graph Matching Iris Image Blocs with Local Binary Pattern Zhenan Sun, Tieniu Tan, and Xianchao Qiu Center for Biometrics and Security Research, National Laboratory of Pattern Recognition, Institute of

More information

Multi-level analysis of sports video sequences

Multi-level analysis of sports video sequences Multi-level analysis of sports video sequences Jungong Han a, Dirk Farin a and Peter H. N. de With a,b a University of Technology Eindhoven, 5600MB Eindhoven, The Netherlands b LogicaCMG, RTSE, PO Box

More information

Approach to Metadata Production and Application Technology Research

Approach to Metadata Production and Application Technology Research Approach to Metadata Production and Application Technology Research In the areas of broadcasting based on home servers and content retrieval, the importance of segment metadata, which is attached in segment

More information

Robust color segmentation algorithms in illumination variation conditions

Robust color segmentation algorithms in illumination variation conditions 286 CHINESE OPTICS LETTERS / Vol. 8, No. / March 10, 2010 Robust color segmentation algorithms in illumination variation conditions Jinhui Lan ( ) and Kai Shen ( Department of Measurement and Control Technologies,

More information

Active learning for visual object recognition

Active learning for visual object recognition Active learning for visual object recognition Written by Yotam Abramson and Yoav Freund Presented by Ben Laxton Outline Motivation and procedure How this works: adaboost and feature details Why this works:

More information

Semantic Event Detection and Classification in Cricket Video Sequence

Semantic Event Detection and Classification in Cricket Video Sequence Sixth Indian Conference on Computer Vision, Graphics & Image Processing Semantic Event Detection and Classification in Cricket Video Sequence M. H. Kolekar, K. Palaniappan Department of Computer Science,

More information

An Adaptive Threshold LBP Algorithm for Face Recognition

An Adaptive Threshold LBP Algorithm for Face Recognition An Adaptive Threshold LBP Algorithm for Face Recognition Xiaoping Jiang 1, Chuyu Guo 1,*, Hua Zhang 1, and Chenghua Li 1 1 College of Electronics and Information Engineering, Hubei Key Laboratory of Intelligent

More information

Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers

Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers A. Salhi, B. Minaoui, M. Fakir, H. Chakib, H. Grimech Faculty of science and Technology Sultan Moulay Slimane

More information

A Statistical-driven Approach for Automatic Classification of Events in AFL Video Highlights

A Statistical-driven Approach for Automatic Classification of Events in AFL Video Highlights A Statistical-driven Approach for Automatic Classification of Events in AFL Video Highlights Dian Tjondronegoro 1 2 3 Yi-Ping Phoebe Chen 1 Binh Pham 3 School of Information Technology, Deakin University

More information

Face Recognition Using Ordinal Features

Face Recognition Using Ordinal Features Face Recognition Using Ordinal Features ShengCai Liao, Zhen Lei, XiangXin Zhu, ZheNan Sun, Stan Z. Li, and Tieniu Tan Center for Biometrics and Security Research & National Laboratory of Pattern Recognition,

More information

Feature-level Fusion for Effective Palmprint Authentication

Feature-level Fusion for Effective Palmprint Authentication Feature-level Fusion for Effective Palmprint Authentication Adams Wai-Kin Kong 1, 2 and David Zhang 1 1 Biometric Research Center, Department of Computing The Hong Kong Polytechnic University, Kowloon,

More information

Naïve Bayes for text classification

Naïve Bayes for text classification Road Map Basic concepts Decision tree induction Evaluation of classifiers Rule induction Classification using association rules Naïve Bayesian classification Naïve Bayes for text classification Support

More information

QUANTIZER DESIGN FOR EXPLOITING COMMON INFORMATION IN LAYERED CODING. Mehdi Salehifar, Tejaswi Nanjundaswamy, and Kenneth Rose

QUANTIZER DESIGN FOR EXPLOITING COMMON INFORMATION IN LAYERED CODING. Mehdi Salehifar, Tejaswi Nanjundaswamy, and Kenneth Rose QUANTIZER DESIGN FOR EXPLOITING COMMON INFORMATION IN LAYERED CODING Mehdi Salehifar, Tejaswi Nanjundaswamy, and Kenneth Rose Department of Electrical and Computer Engineering University of California,

More information

Video annotation based on adaptive annular spatial partition scheme

Video annotation based on adaptive annular spatial partition scheme Video annotation based on adaptive annular spatial partition scheme Guiguang Ding a), Lu Zhang, and Xiaoxu Li Key Laboratory for Information System Security, Ministry of Education, Tsinghua National Laboratory

More information

Based on Multi-Modal Violent Movies Detection in Video Sharing Sites

Based on Multi-Modal Violent Movies Detection in Video Sharing Sites Based on Multi-Modal Violent Movies Detection in Video Sharing Sites Xingyu Zou 1, Ou Wu 2, Qishen Wang 2, Weiming Hu 2, Jinfeng Yang 1 1 College of aviation automation, Civil Aviation University of China,

More information

AIIA shot boundary detection at TRECVID 2006

AIIA shot boundary detection at TRECVID 2006 AIIA shot boundary detection at TRECVID 6 Z. Černeková, N. Nikolaidis and I. Pitas Artificial Intelligence and Information Analysis Laboratory Department of Informatics Aristotle University of Thessaloniki

More information

COSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor

COSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor COSC160: Detection and Classification Jeremy Bolton, PhD Assistant Teaching Professor Outline I. Problem I. Strategies II. Features for training III. Using spatial information? IV. Reducing dimensionality

More information

Audio-Based Action Scene Classification Using HMM-SVM Algorithm

Audio-Based Action Scene Classification Using HMM-SVM Algorithm Audio-Based Action Scene Classification Using HMM-SVM Algorithm Khin Myo Chit, K Zin Lin Abstract Nowadays, there are many kind of video such as educational movies, multimedia movies, action movies and

More information

Performance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM

Performance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM Performance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM Lu Chen and Yuan Hang PERFORMANCE DEGRADATION ASSESSMENT AND FAULT DIAGNOSIS OF BEARING BASED ON EMD AND PCA-SOM.

More information

Real-Time Detection of Sport in MPEG-2 Sequences using High-Level AV-Descriptors and SVM

Real-Time Detection of Sport in MPEG-2 Sequences using High-Level AV-Descriptors and SVM Real-Time Detection of Sport in MPEG-2 Sequences using High-Level AV-Descriptors and SVM Ronald Glasberg 1, Sebastian Schmiedee 2, Hüseyin Oguz 3, Pascal Kelm 4 and Thomas Siora 5 Communication Systems

More information

Image Classification Using Wavelet Coefficients in Low-pass Bands

Image Classification Using Wavelet Coefficients in Low-pass Bands Proceedings of International Joint Conference on Neural Networks, Orlando, Florida, USA, August -7, 007 Image Classification Using Wavelet Coefficients in Low-pass Bands Weibao Zou, Member, IEEE, and Yan

More information

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009 Learning and Inferring Depth from Monocular Images Jiyan Pan April 1, 2009 Traditional ways of inferring depth Binocular disparity Structure from motion Defocus Given a single monocular image, how to infer

More information

PEOPLE IN SEATS COUNTING VIA SEAT DETECTION FOR MEETING SURVEILLANCE

PEOPLE IN SEATS COUNTING VIA SEAT DETECTION FOR MEETING SURVEILLANCE PEOPLE IN SEATS COUNTING VIA SEAT DETECTION FOR MEETING SURVEILLANCE Hongyu Liang, Jinchen Wu, and Kaiqi Huang National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Science

More information

CV of Qixiang Ye. University of Chinese Academy of Sciences

CV of Qixiang Ye. University of Chinese Academy of Sciences 2012-12-12 University of Chinese Academy of Sciences Qixiang Ye received B.S. and M.S. degrees in mechanical & electronic engineering from Harbin Institute of Technology (HIT) in 1999 and 2001 respectively,

More information

Iris Recognition for Eyelash Detection Using Gabor Filter

Iris Recognition for Eyelash Detection Using Gabor Filter Iris Recognition for Eyelash Detection Using Gabor Filter Rupesh Mude 1, Meenakshi R Patel 2 Computer Science and Engineering Rungta College of Engineering and Technology, Bhilai Abstract :- Iris recognition

More information

TEVI: Text Extraction for Video Indexing

TEVI: Text Extraction for Video Indexing TEVI: Text Extraction for Video Indexing Hichem KARRAY, Mohamed SALAH, Adel M. ALIMI REGIM: Research Group on Intelligent Machines, EIS, University of Sfax, Tunisia hichem.karray@ieee.org mohamed_salah@laposte.net

More information

Equation to LaTeX. Abhinav Rastogi, Sevy Harris. I. Introduction. Segmentation.

Equation to LaTeX. Abhinav Rastogi, Sevy Harris. I. Introduction. Segmentation. Equation to LaTeX Abhinav Rastogi, Sevy Harris {arastogi,sharris5}@stanford.edu I. Introduction Copying equations from a pdf file to a LaTeX document can be time consuming because there is no easy way

More information

2-2-2, Hikaridai, Seika-cho, Soraku-gun, Kyoto , Japan 2 Graduate School of Information Science, Nara Institute of Science and Technology

2-2-2, Hikaridai, Seika-cho, Soraku-gun, Kyoto , Japan 2 Graduate School of Information Science, Nara Institute of Science and Technology ISCA Archive STREAM WEIGHT OPTIMIZATION OF SPEECH AND LIP IMAGE SEQUENCE FOR AUDIO-VISUAL SPEECH RECOGNITION Satoshi Nakamura 1 Hidetoshi Ito 2 Kiyohiro Shikano 2 1 ATR Spoken Language Translation Research

More information

Text-Independent Speaker Identification

Text-Independent Speaker Identification December 8, 1999 Text-Independent Speaker Identification Til T. Phan and Thomas Soong 1.0 Introduction 1.1 Motivation The problem of speaker identification is an area with many different applications.

More information

A Semi-Automatic 2D-to-3D Video Conversion with Adaptive Key-Frame Selection

A Semi-Automatic 2D-to-3D Video Conversion with Adaptive Key-Frame Selection A Semi-Automatic 2D-to-3D Video Conversion with Adaptive Key-Frame Selection Kuanyu Ju and Hongkai Xiong Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, China ABSTRACT To

More information

Audio-visual interaction in sparse representation features for noise robust audio-visual speech recognition

Audio-visual interaction in sparse representation features for noise robust audio-visual speech recognition ISCA Archive http://www.isca-speech.org/archive Auditory-Visual Speech Processing (AVSP) 2013 Annecy, France August 29 - September 1, 2013 Audio-visual interaction in sparse representation features for

More information

Deterministic Approach to Content Structure Analysis of Tennis Video

Deterministic Approach to Content Structure Analysis of Tennis Video Deterministic Approach to Content Structure Analysis of Tennis Video Viachaslau Parshyn, Liming Chen A Research Report, Lab. LIRIS, Ecole Centrale de Lyon LYON 2006 Abstract. An approach to automatic tennis

More information

Data Hiding in Video

Data Hiding in Video Data Hiding in Video J. J. Chae and B. S. Manjunath Department of Electrical and Computer Engineering University of California, Santa Barbara, CA 9316-956 Email: chaejj, manj@iplab.ece.ucsb.edu Abstract

More information

A robust method for automatic player detection in sport videos

A robust method for automatic player detection in sport videos A robust method for automatic player detection in sport videos A. Lehuger 1 S. Duffner 1 C. Garcia 1 1 Orange Labs 4, rue du clos courtel, 35512 Cesson-Sévigné {antoine.lehuger, stefan.duffner, christophe.garcia}@orange-ftgroup.com

More information

CONTENT ADAPTIVE SCREEN IMAGE SCALING

CONTENT ADAPTIVE SCREEN IMAGE SCALING CONTENT ADAPTIVE SCREEN IMAGE SCALING Yao Zhai (*), Qifei Wang, Yan Lu, Shipeng Li University of Science and Technology of China, Hefei, Anhui, 37, China Microsoft Research, Beijing, 8, China ABSTRACT

More information

Video shot segmentation using late fusion technique

Video shot segmentation using late fusion technique Video shot segmentation using late fusion technique by C. Krishna Mohan, N. Dhananjaya, B.Yegnanarayana in Proc. Seventh International Conference on Machine Learning and Applications, 2008, San Diego,

More information

Further Studies of a FFT-Based Auditory Spectrum with Application in Audio Classification

Further Studies of a FFT-Based Auditory Spectrum with Application in Audio Classification ICSP Proceedings Further Studies of a FFT-Based Auditory with Application in Audio Classification Wei Chu and Benoît Champagne Department of Electrical and Computer Engineering McGill University, Montréal,

More information

Video De-interlacing with Scene Change Detection Based on 3D Wavelet Transform

Video De-interlacing with Scene Change Detection Based on 3D Wavelet Transform Video De-interlacing with Scene Change Detection Based on 3D Wavelet Transform M. Nancy Regina 1, S. Caroline 2 PG Scholar, ECE, St. Xavier s Catholic College of Engineering, Nagercoil, India 1 Assistant

More information

Affective Music Video Content Retrieval Features Based on Songs

Affective Music Video Content Retrieval Features Based on Songs Affective Music Video Content Retrieval Features Based on Songs R.Hemalatha Department of Computer Science and Engineering, Mahendra Institute of Technology, Mahendhirapuri, Mallasamudram West, Tiruchengode,

More information

Subject-Oriented Image Classification based on Face Detection and Recognition

Subject-Oriented Image Classification based on Face Detection and Recognition 000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050

More information

Searching Video Collections:Part I

Searching Video Collections:Part I Searching Video Collections:Part I Introduction to Multimedia Information Retrieval Multimedia Representation Visual Features (Still Images and Image Sequences) Color Texture Shape Edges Objects, Motion

More information

Real-time Monitoring System for TV Commercials Using Video Features

Real-time Monitoring System for TV Commercials Using Video Features Real-time Monitoring System for TV Commercials Using Video Features Sung Hwan Lee, Won Young Yoo, and Young-Suk Yoon Electronics and Telecommunications Research Institute (ETRI), 11 Gajeong-dong, Yuseong-gu,

More information

An Automated Refereeing and Analysis Tool for the Four-Legged League

An Automated Refereeing and Analysis Tool for the Four-Legged League An Automated Refereeing and Analysis Tool for the Four-Legged League Javier Ruiz-del-Solar, Patricio Loncomilla, and Paul Vallejos Department of Electrical Engineering, Universidad de Chile Abstract. The

More information

Hybrid Biometric Person Authentication Using Face and Voice Features

Hybrid Biometric Person Authentication Using Face and Voice Features Paper presented in the Third International Conference, Audio- and Video-Based Biometric Person Authentication AVBPA 2001, Halmstad, Sweden, proceedings pages 348-353, June 2001. Hybrid Biometric Person

More information

Saliency Detection for Videos Using 3D FFT Local Spectra

Saliency Detection for Videos Using 3D FFT Local Spectra Saliency Detection for Videos Using 3D FFT Local Spectra Zhiling Long and Ghassan AlRegib School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA ABSTRACT

More information

Segmentation of Images

Segmentation of Images Segmentation of Images SEGMENTATION If an image has been preprocessed appropriately to remove noise and artifacts, segmentation is often the key step in interpreting the image. Image segmentation is a

More information

Latent Variable Models for Structured Prediction and Content-Based Retrieval

Latent Variable Models for Structured Prediction and Content-Based Retrieval Latent Variable Models for Structured Prediction and Content-Based Retrieval Ariadna Quattoni Universitat Politècnica de Catalunya Joint work with Borja Balle, Xavier Carreras, Adrià Recasens, Antonio

More information

Multi-Camera Calibration, Object Tracking and Query Generation

Multi-Camera Calibration, Object Tracking and Query Generation MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Multi-Camera Calibration, Object Tracking and Query Generation Porikli, F.; Divakaran, A. TR2003-100 August 2003 Abstract An automatic object

More information

8.5 Application Examples

8.5 Application Examples 8.5 Application Examples 8.5.1 Genre Recognition Goal Assign a genre to a given video, e.g., movie, newscast, commercial, music clip, etc.) Technology Combine many parameters of the physical level to compute

More information

Cost-sensitive Boosting for Concept Drift

Cost-sensitive Boosting for Concept Drift Cost-sensitive Boosting for Concept Drift Ashok Venkatesan, Narayanan C. Krishnan, Sethuraman Panchanathan Center for Cognitive Ubiquitous Computing, School of Computing, Informatics and Decision Systems

More information

Adaptive Doppler centroid estimation algorithm of airborne SAR

Adaptive Doppler centroid estimation algorithm of airborne SAR Adaptive Doppler centroid estimation algorithm of airborne SAR Jian Yang 1,2a), Chang Liu 1, and Yanfei Wang 1 1 Institute of Electronics, Chinese Academy of Sciences 19 North Sihuan Road, Haidian, Beijing

More information

Principles of Audio Coding

Principles of Audio Coding Principles of Audio Coding Topics today Introduction VOCODERS Psychoacoustics Equal-Loudness Curve Frequency Masking Temporal Masking (CSIT 410) 2 Introduction Speech compression algorithm focuses on exploiting

More information

Audio-Visual Content Indexing, Filtering, and Adaptation

Audio-Visual Content Indexing, Filtering, and Adaptation Audio-Visual Content Indexing, Filtering, and Adaptation Shih-Fu Chang Digital Video and Multimedia Group ADVENT University-Industry Consortium Columbia University 10/12/2001 http://www.ee.columbia.edu/dvmm

More information

Real-Time Position Estimation and Tracking of a Basketball

Real-Time Position Estimation and Tracking of a Basketball Real-Time Position Estimation and Tracking of a Basketball Bodhisattwa Chakraborty Digital Image and Speech Processing Lab National Institute of Technology Rourkela Odisha, India 769008 Email: bodhisattwa.chakraborty@gmail.com

More information

An Approach to Detect Text and Caption in Video

An Approach to Detect Text and Caption in Video An Approach to Detect Text and Caption in Video Miss Megha Khokhra 1 M.E Student Electronics and Communication Department, Kalol Institute of Technology, Gujarat, India ABSTRACT The video image spitted

More information

Audio-Visual Content Indexing, Filtering, and Adaptation

Audio-Visual Content Indexing, Filtering, and Adaptation Audio-Visual Content Indexing, Filtering, and Adaptation Shih-Fu Chang Digital Video and Multimedia Group ADVENT University-Industry Consortium Columbia University 10/12/2001 http://www.ee.columbia.edu/dvmm

More information

CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM

CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM 1 PHYO THET KHIN, 2 LAI LAI WIN KYI 1,2 Department of Information Technology, Mandalay Technological University The Republic of the Union of Myanmar

More information

Semantic Video Indexing

Semantic Video Indexing Semantic Video Indexing T-61.6030 Multimedia Retrieval Stevan Keraudy stevan.keraudy@tkk.fi Helsinki University of Technology March 14, 2008 What is it? Query by keyword or tag is common Semantic Video

More information

Video Editing Based on Situation Awareness from Voice Information and Face Emotion

Video Editing Based on Situation Awareness from Voice Information and Face Emotion 18 Video Editing Based on Situation Awareness from Voice Information and Face Emotion Tetsuya Takiguchi, Jun Adachi and Yasuo Ariki Kobe University Japan 1. Introduction Video camera systems are becoming

More information