Title: Pyramidwise Structuring for Soccer Highlight Extraction. Authors: Ming Luo, Yu-Fei Ma, Hong-Jiang Zhang


Mailing address: Microsoft Research Asia, 5F, Beijing Sigma Center, 49 Zhichun Road, Beijing, China
Electronic address:
Phone: (Ming Luo) (Yu-Fei Ma & Hong-Jiang Zhang)
Fax number:
Contact author: Ming Luo
Topic area: Multimedia processing and coding
Subject area: Content analysis and adaptation

Pyramidwise Structuring for Soccer Highlight Extraction

Ming Luo (1), Yu-Fei Ma (2), Hong-Jiang Zhang (2)
(1) Department of Computer Science, University of Maryland, College Park, MD 20770, USA, ming@cs.umd.edu
(2) Microsoft Research Asia, 5F, Beijing Sigma Center, 49 Zhichun Road, Beijing, China, {yfma, hjzhang}@microsoft.com

Abstract

Fast browsing of video content is not only an important research issue but also has a variety of potential applications, especially for sports video. In this paper, we propose a practical solution for extracting highlights from soccer video, based on structure analysis of broadcast soccer video. First, the broadcast soccer video is structured into a soccer pyramid composed of a series of layers from fine to coarse. With such a pyramid structure, soccer highlights can be extracted flexibly according to the needs of different high-level applications. To obtain the soccer pyramid, we also propose a condensation approach to soccer video and the corresponding structure extraction methods. Experiments demonstrate the effectiveness, efficiency and robustness of the proposed approach to structuring and highlight extraction.

1. Introduction

Highlight extraction plays an important role in fast browsing of sports video. Viewers usually prefer to look through the critical segments of a game instead of the whole game, so flexible highlight browsing is a good way for them to grasp a game quickly. In this paper, we seek an effective and flexible solution to soccer highlight extraction.

A number of related works exist in the literature. Soccer video [1] and tennis video [2] are automatically parsed into meaningful structures based on a prior model comprising a sketch of the field, the ball and the players.
In [3], a similar approach tries to detect the complete set of semantic events in a soccer game, using the positions of the players and the ball as input. In practice, however, such approaches rely on too many high-level semantics, which makes them impractical across the great variety of broadcast styles and qualities. On the other hand, A. Ekin et al. extracted goal highlights by relying on slow-motion detection [4]. This method has several limitations: 1) in some broadcast programs the critical events are not always replayed in slow motion, which results in missed highlights; 2) detecting slow-motion replays in sports video is itself an unsolved problem, especially for slow motion created by high-speed cameras [5]. Other methods are based on the temporal evolution of low-level features, for example goal detection algorithms based on a finite-state machine [6] and on a controlled Markov chain model [7]. Although the loudness of the audio signal is used in [7] to improve the accuracy of the visual analysis, the number of false detections is still high. The HMM (Hidden Markov Model) based method in [8] is reasonable for sports video analysis, but fusing multiple low-level features remains challenging; consequently, only alternating play/break segments can be identified in that work.

In summary, approaches based on high-level features such as the field sketch and the tracking of ball and players are not practical because such features are very difficult to extract robustly, while approaches that directly fuse low-level features (e.g. color, texture and motion) using stochastic methods are limited in semantic understanding because of the large gap between low-level features and high-level semantics. In this paper, we aim at highlight extraction based on a structuring scheme that does not require full semantic understanding.
We define a pyramidal structure for the soccer game, called the soccer pyramid, which is composed of a series of layers: SEG, FAR-SIDE, GOAL, ATTACK, and GOA (group of attacks). This structure contains both physical layers and intermediate semantic layers. With high-level heuristic rules or domain knowledge, highlights can be flexibly extracted from the soccer pyramid. In our implementation, a condensed binary image sequence is generated using a field mask; this form is readable by computers and carries no information that is redundant for semantic understanding. In addition, a set of global statistical features is extracted from the condensed binary sequence for soccer video structuring; these features are more robust than local features such as field, player and ball positions. The effectiveness of the proposed approach has been demonstrated by extensive experiments.

The rest of the paper is organized as follows. Section 2 introduces the soccer pyramid. Section 3 discusses the soccer video condensation method. Section 4 explains the structure analysis methods and the flexible highlight schemes. Section 5 gives the evaluation results, and Section 6 concludes the paper.

2. Soccer Pyramid

Figure 1: Soccer Pyramid. (a) Soccer pyramid; (b) SEG boundary definition.

The soccer pyramid structure is illustrated in Figure 1(a). SEG is the top layer of the soccer pyramid. A SEG is a consecutive segment with uniform content, and it has the shortest duration among the layers of the pyramid. A SEG is finer than a shot, the basic unit of traditional video segmentation. As shown in Figure 1(b), every shot boundary is also a SEG boundary; that is, a shot may be a single SEG or may be composed of several SEGs. We define 4 types of SEGs, CLOSE-UP, FAR-CENTER, FAR-SIDE and MIDDLE, as shown in Figure 2. CLOSE-UP SEGs are close-up views of players or the referee, or views of a coach or the audience. FAR-CENTER SEGs are taken from a far view and show the center area of the field, i.e. the non-penalty area. FAR-SIDE SEGs are taken from a far view and show the side area of the field, i.e. the penalty area. MIDDLE SEGs are taken from a middle-distance view, normally focusing on one or several players.

Figure 2: Four Types of SEGs. (a) CLOSE-UP; (b) FAR-CENTER; (c) FAR-SIDE; (d) MIDDLE.

We list FAR-SIDE SEG as the second layer of the soccer pyramid because it is highly meaningful: FAR-SIDE SEGs are very likely to contain critical events such as shots or goals. FAR-CENTER SEGs, the least interesting part, cover a major portion of soccer video (generally over 50%) and are not considered in the soccer pyramid. MIDDLE and CLOSE-UP SEGs, which often contain various semantics, are considered selectively, because the ones around a FAR-SIDE SEG are most likely to be the setup or the closing scene of a critical event. As the most interesting event in a soccer game, GOAL is listed as the third layer of the soccer pyramid. The fourth layer is ATTACK. ATTACKs are the segments involving critical moments, such as shots and goals. ATTACKs may be further classified into team A ATTACKs and team B ATTACKs. At the bottom of the pyramid, ATTACKs that are temporally or semantically related are grouped into GOA, the group of attacks. Such a soccer pyramid structure builds a bridge from the low-level physical information of videos to high-level meaningful semantics.

As shown in Figure 1(a), the top part of the pyramid consists of the physical layers, SEG and FAR-SIDE, which carry no semantics. Below the physical layers are 3 semantic layers, GOAL, ATTACK and GOA, which are used in highlight extraction. Based on the semantic layers, multi-scale soccer highlights can be flexibly extracted. To build the soccer pyramid, a soccer video condensation method is employed first, which compresses the raw video data into a concise form that computers can interpret.

3. Soccer Video Condensation

As the richest medium, a video sequence contains a lot of redundant information, which is attractive to humans but overly complex for contemporary computers. To make soccer video tractable for computers, we propose a soccer video condensation method, which generates a condensed binary image sequence from the original video sequence. As shown in Figure 3, we mask the images with the field color to obtain an abstract description of the soccer video that looks like the binary image in Figure 3(b). In our implementation, we use the base line in place of the goal line. The base line is parallel to the goal line of the soccer field; we use it not only because it is easier to extract from our mask than the goal line, but also because it has semantics equivalent to those of the goal line.

Figure 3: Soccer Video Condensation. (a) Original image; (b) Condensed image; (c) Base line detection.

We use the field color as the mask to condense soccer video. However, the field color model must be tuned for different soccer videos because of differing grass conditions, building shadows and stadium illumination.
For example, the 6 fields shown in Figure 4 differ greatly in field color.

Figure 4: Different Field Colors in Soccer Videos.

Generally, the field color is not only perceptually green but also the dominant color in most scenes of a soccer video. In this work, we initialize the green model as a convex set in HSI space, i.e. hue: [0.18, 0.4], saturation: [0.1, 1], intensity: [0.2, 1]. Then a number of frames are scanned to obtain the statistics of the HSI values of their pixels, from which the tuned field color model is built. Specifically, we build an evenly divided H-S-I histogram over the HSI subspace ([0.18, 0.4], [0.1, 1], [0.2, 1]). The weight w(i, j, k) of bin (i, j, k) is the ratio of the number of pixels in that bin to the number of all pixels. If criteria (1) and (2) are both satisfied, the color falling in bin (i, j, k) is regarded as field color in the video under consideration.

$$w(i, j, k) \ge 0.01 \qquad (1)$$

$$\sum_{j=1}^{10} \sum_{k=1}^{10} w(i, j, k) \ge [\ldots] \qquad (2)$$

This tuning process is run every 5 minutes to update the field color model automatically. Experiments show acceptable results for field color tuning. The influences of grass condition, illumination and shadows are largely eliminated, and the green colors of some uniforms are successfully excluded from the field color.

4. Highlight Extraction

4.1 SEG classification

To segment the video into SEGs, we first classify each frame into one of the 4 SEG types with a Bayesian network. The SEGs are then generated from the class label sequence by a merging routine.

4.1.1 Feature extraction

For SEG classification, we extract 3 global features: the field color ratio in the image, the probability that an inclined base line exists, and the summed object size ratio in the field area, denoted F_g, F_l and F_o respectively. F_g is defined as

$$F_g = \frac{N_g}{W \times H} \qquad (3)$$

where N_g is the number of field color pixels in the image, and W and H are the width and height of the image. F_l is an important feature for distinguishing a FAR-SIDE SEG. As shown in Figure 3(c), a FAR-SIDE frame contains an inclined base line, which can be detected by a 2-dimensional Hough transform [9]. F_l is obtained by

$$F_l = \frac{\max\{\, v_{s,d} \mid 10 \le s \le 170,\ -d_{\max} \le d \le d_{\max} \,\}}{d_{\max}} \qquad (4)$$

$$d_{\max} = \sqrt{W^2 + H^2} \qquad (5)$$

where v_{s,d} is the vote value for the straight line with slope s and distance d to the top-left corner of the image. We define d > 0 if the line is over the top-left corner and d < 0 otherwise. F_o is sensitive to MIDDLE SEGs. In the condensed image, as in Figure 3(b), the field region, the non-field region and the objects in the field are extracted. F_o is then computed by

$$F_o = \frac{\sum_i S_i}{S_F} \qquad (6)$$

where S_i is the size in pixels of the i-th object in the field region, and S_F is the field size in pixels. F_g, F_l and F_o are all continuous features in [0, 1].

4.1.2 Classification by Bayesian network

To classify each frame into the four types, we use a continuous Bayesian network, because a Bayesian network is a non-linear method that is well suited to multimedia analysis. Using the 3 features introduced in Section 4.1.1, the Bayesian network is shown in Figure 5. It comprises 3 continuous observation nodes (F_g, F_l, F_o) and 1 discrete hidden node S with 4 possible values (the 4 class labels).

Figure 5: Bayesian Network for SEG Classification.

Given observations of F_g, F_l and F_o, the a posteriori probability is calculated as

$$P(S \mid F_g, F_l, F_o) = \frac{P(F_g, F_l, F_o \mid S)\, P(S)}{P(F_g, F_l, F_o)} \qquad (7)$$

Because the denominator does not depend on S, comparing the a posteriori probabilities of different values of S is equivalent to comparing P(F_g, F_l, F_o | S) P(S). We compute P(S) by the maximum likelihood method and assume that P(F_g, F_l, F_o | S) is a 3-dimensional Gaussian distribution that can be trained from samples.

As there are some errors in the classification results, we employ a merging algorithm to generate SEGs from the class label sequence produced by Bayesian classification. First, adjacent frames with the same class label are grouped together as one SEG. Then the overly short SEGs are filtered out. In this manner, the video sequence is parsed into SEGs with 4 types of labels.

4.2 GOAL detection

According to the definitions of the layers of the soccer pyramid, goals must occur in FAR-SIDE SEGs with certain image characteristics, and they have particular temporal patterns formed by the lower layers. By exploiting these characteristics and patterns, we propose an algorithm to detect GOALs as well as the replays that follow them.
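Before the GOAL characteristics are listed, the merging routine that closes Section 4.1 can be illustrated with a short Python sketch. This is a minimal illustration rather than the authors' implementation: the function name `merge_labels`, the `min_len` threshold, and the policy of absorbing an over-short run into a neighboring SEG are all assumptions, since the text only states that over-short SEGs are filtered.

```python
from itertools import groupby


def merge_labels(labels, min_len=3):
    """Turn a per-frame class label sequence into (label, start, end) SEGs.

    Frame intervals are half-open: [start, end).
    Assumption: a run shorter than min_len frames is absorbed into the
    preceding SEG (or into the following one if it opens the sequence).
    """
    # Step 1: group adjacent frames that share the same class label.
    runs = [[label, len(list(group))] for label, group in groupby(labels)]

    # Step 2: absorb over-short runs, and re-join runs that end up
    # adjacent to a SEG with the same label.
    merged = []
    for label, length in runs:
        if merged and (length < min_len or merged[-1][0] == label):
            merged[-1][1] += length
        else:
            merged.append([label, length])

    # A short run at the very start has no predecessor; give it to the next SEG.
    if len(merged) > 1 and merged[0][1] < min_len:
        merged[1][1] += merged[0][1]
        merged.pop(0)

    # Step 3: convert run lengths into frame intervals.
    segs, start = [], 0
    for label, length in merged:
        segs.append((label, start, start + length))
        start += length
    return segs
```

With `min_len=3`, a single FAR-SIDE frame inside a run of FAR-CENTER frames is treated as a classification error and absorbed, so the two FAR-CENTER runs around it become one SEG.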
The distinctive characteristics include: 1) there is at least one FAR-SIDE SEG within a GOAL; 2) during this FAR-SIDE SEG, the inclined base line in the image moves down until it reaches a very low position; 3) this FAR-SIDE SEG is usually followed by a series of MIDDLE or CLOSE-UP SEGs. Moreover, these MIDDLE/CLOSE-UP SEGs often last a considerable time, presenting cheering scenes or replays of the GOAL. Based on these characteristics, an important feature, the base line intercept, denoted R, is defined on the condensed image sequence, as shown in Figure 6.

Figure 6: Definition of R. (a) rightward; (b) leftward.

Suppose a FAR-SIDE SEG has an R sequence {R_1, ..., R_n}. A GOAL is detected if the following 3 rules are all satisfied: 1) there exist a number of consecutive R_i with large values; 2) there is a rapid increase in the R sequence; 3) the MIDDLE/CLOSE-UP SEGs following the GOAL last long enough.

Usually, several replays from different viewpoints follow a goal. To achieve a more complete structural understanding, we extract these replays by locating the short FAR-CENTER SEGs and MIDDLE SEGs within a considerable extent after the GOAL.

4.3 ATTACK and GOA generation

FAR-SIDE SEGs usually show dangerous situations for the defending team, so they can be viewed as the anchors of critical moments. However, FAR-SIDE SEGs are very short, with an average length of about 3 seconds. To deliver more reasonable video clips to users, we define higher-level structures, ATTACK and GOA, which last from the setup of a critical event until its end. ATTACKs are therefore adaptively extended forward and backward from the corresponding FAR-SIDE SEGs. For example, Figure 7 illustrates a corner kick ATTACK. Figure 7(b) is the FAR-SIDE SEG, the SEG before it is a MIDDLE SEG (Figure 7(a)), and the one after it is a CLOSE-UP SEG (Figure 7(c)). Together the 3 SEGs present a complete corner kick.

Figure 7: A Corner Kick ATTACK.

The attack direction can also be determined from the condensed image according to the inclination direction of the base line, as shown in Figure 6. A GOA is generated as a series of related ATTACKs: if several ATTACKs are in the same direction and close enough, they are grouped into a GOA. GOAs give viewers a more complete understanding of the progress of the game.

4.4 Highlight extraction

With the soccer pyramid defined in Section 2, we can extract soccer highlights in a flexible manner.
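As a concrete picture of the ATTACK and GOA layers that the highlight schemes draw on, the grouping rule of Section 4.3 can be sketched as follows. The sketch is illustrative only: the `Attack` record, the `group_attacks` function and the 30-second `max_gap` default are hypothetical, and closeness is assumed here to mean the time gap between consecutive ATTACKs, which the text does not quantify.

```python
from dataclasses import dataclass


@dataclass
class Attack:
    start: float      # start time in seconds
    end: float        # end time in seconds
    direction: str    # "left" or "right", from the base line inclination


def group_attacks(attacks, max_gap=30.0):
    """Group ATTACKs into GOAs.

    Assumption: two consecutive ATTACKs belong to the same GOA when they
    share a direction and the gap between them is at most max_gap seconds.
    """
    goas = []
    for attack in sorted(attacks, key=lambda a: a.start):
        last = goas[-1][-1] if goas else None
        if (
            last is not None
            and attack.direction == last.direction
            and attack.start - last.end <= max_gap
        ):
            goas[-1].append(attack)   # continue the current GOA
        else:
            goas.append([attack])     # open a new GOA
    return goas
```

Each returned GOA is a list of ATTACKs in temporal order, so a viewer-facing layer could play a whole GOA or any single ATTACK inside it.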
As illustrated in Figure 8, multi-scale highlights can be extracted from the different semantic layers of the soccer pyramid: 1) all GOALs without replays; 2) all GOALs with replays; 3) all ATTACKs; 4) all GOAs; 5) all ATTACKs of team A or of team B. Viewers may also access any GOAL, ATTACK or GOA in a non-linear manner.

Figure 8: Multi-scale Highlight Extraction (GOAL without replay, GOAL with replay, ATTACK, GOA).

In addition, if a criticality measure is well defined for each layer, criticality-based highlights can be generated. For example, we define the maximum R value within an ATTACK (see Figure 6) as a criticality indicator. With this indicator, the ATTACKs can be displayed as a ranked list with the most interesting ATTACKs at the top. This kind of highlight helps users quickly review the most important segments of the game.

5. Evaluations

Six soccer matches totaling 7.4 hours are used to evaluate the system performance: 4 matches from World Cup 2002, 1 match from FIFA Cup 2001, and 1 match from the MPEG-7 test video. Only a 6-minute clip from the third World Cup match is used as the training set for the Bayesian network; the other videos are test data. Ground truth is labeled manually. The test videos differ in image quality: the World Cup videos are of good quality, while the FIFA Cup video is too dark and the MPEG-7 test video is too bright. Nevertheless, the experimental results are encouraging.

We evaluate SEG accuracy, GOAL detection and ATTACK accuracy separately, and calculate the highlight time coverage. In the evaluation tables, M1 to M6 stand for Match 1 to Match 6. Table 1 gives the SEG accuracy. The classification precision and recall of FAR-SIDE are also shown in Table 1, because FAR-SIDE is the most important unit for highlight extraction. The average SEG accuracy (acc.) reaches 89.6%, while the average precision (pre.) and recall (rec.)
of FAR-SIDE are 94.0% and 87.1%, respectively.

Table 1: SEG Accuracy

        SEG acc.   FAR-SIDE pre.   FAR-SIDE rec.
M1      91.9%      93.9%           93.7%
M2      87.5%      93.7%           89.9%
M3      94.7%      95.9%           90.9%
M4      90.2%      94.0%           78.3%
M5      86.0%      91.5%           87.5%
M6      84.1%      96.6%           73.0%
Avg.    89.6%      94.0%           87.1%

GOAL detection results are shown in Table 2. On average, at 100% recall, the precision reaches 68.2%. Compared with results in the literature such as [4] and [7], these results are more reasonable. Moreover, for all 15 goals in our experiments, the replay precision and recall are both 100%.

Table 2: GOAL Detection Accuracy

        correct   false   miss   precision   recall
M1                                           100%
M2                                           100%
M3                                           100%
M4                                           100%
M5                                           100%
M6                                           100%
Total                                        100%

We compute the ATTACK precision, recall and direction accuracy in Table 3. The average precision, recall and direction accuracy are 93.5%, 86.4% and 96.8%, respectively.

Table 3: ATTACK Accuracy (columns: Precision (%), Recall (%), Direction (%); rows: M1 to M6, Total)

The time coverage of the highlights extracted from GOAL (G), GOAL with replays (GR) and ATTACK is shown in Tables 4 and 5.

Table 4: GOAL Time Coverage

        G          G%    GR    GR%    Whole
M1      41 sec
M2      71 sec
M3      25 sec
M4      20 sec
M5      42 sec
M6      28 sec
Total   227 sec

Table 5: ATTACK Time Coverage

        Number   Time   Time ratio   Whole
M1                      16.9%        78 min
M2                      17.2%        87 min
M3                      12.8%        94 min
M4                      9.0%         89 min
M5                      17.0%        47 min
M6                      9.1%         55 min
Total                   13.6%        450 min

As the most interesting portions of soccer video, GOALs cover only 0.84% of the whole games, while GOALs with replays cover 2.30%, as shown in Table 4. From Table 5, we can see that ATTACKs cover 13.6% of the whole game on average. This ratio indicates that the ATTACK-based highlight is a reasonable soccer synopsis, as a whole game usually lasts 2 hours. In our experiments, the false alarms of ATTACK are mainly caused by goal kicks or passes by defenders. Such ATTACKs can be easily filtered by criticality ranking. The misses are usually caused by low-quality attacks. However, these errors have little negative effect on the highlight presentation, because viewers usually pay little attention to them. On the other hand, we may further improve the performance of our system by selecting more effective features and employing statistical models for structure extraction on every layer.

6. Conclusions

In this paper, we proposed a pyramidal structure for soccer video analysis, consisting of a series of layers, SEG, FAR-SIDE, GOAL, ATTACK and GOA, from top to bottom. As the soccer pyramid contains rich intermediate semantics, soccer highlights can be extracted flexibly according to the requirements of different viewers. By generating a condensed binary image sequence, effective global features are extracted, which make it possible to obtain accurate structures in the soccer pyramid. The encouraging experimental results support the practicality of the proposed approach.

References

[1] Y. Gong, L. T. Sin, C. H. Chuan, H. Zhang, M. Sakauchi, "Automatic parsing of TV soccer programs," Proc. ICMCS95, Washington DC, USA.
[2] D. Zhong, S. F. Chang, "Structure Analysis of Sports Video Using Domain Models," Proc. ICME2001, Tokyo, Japan, Aug.
[3] V. Tovinkere, R. J. Qian, "Detecting Semantic Events in Soccer Games: Toward a Complete Solution," Proc. ICME2001, Tokyo, Japan, Aug.
[4] A. Ekin, M. Tekalp, "Automatic Soccer Video Analysis and Summarization," Proc. SST SPIE03, CA, USA.
[5] H. Pan, P. van Beek, M. I. Sezan, "Detection of slow-motion replay segments in sports video for highlights generation," ICASSP 2001, Salt Lake City, UT, May.
[6] A. Bonzanini, R. Leonardi, P. Migliorati, "Event Recognition in Sport Programs Using Low-Level Motion Indices," Proc. ICME2001, Japan, Aug.
[7] R. Leonardi, P. Migliorati, M. Prandini, "Semantic indexing of sport program sequences by audio-visual analysis," Proc. ICIP 2003, Barcelona, Spain, Sep.
[8] L. Xie, S-F Chang, A. Divakaran, H. Sun, "Structure analysis of soccer video with Hidden Markov Models," Proc. ICASSP.
[9] J. Illingworth, J. Kittler, "A Survey of the Hough Transform," CVGIP, vol. 44.


More information

Definition, Detection, and Evaluation of Meeting Events in Airport Surveillance Videos

Definition, Detection, and Evaluation of Meeting Events in Airport Surveillance Videos Definition, Detection, and Evaluation of Meeting Events in Airport Surveillance Videos Sung Chun Lee, Chang Huang, and Ram Nevatia University of Southern California, Los Angeles, CA 90089, USA sungchun@usc.edu,

More information

Video Key-Frame Extraction using Entropy value as Global and Local Feature

Video Key-Frame Extraction using Entropy value as Global and Local Feature Video Key-Frame Extraction using Entropy value as Global and Local Feature Siddu. P Algur #1, Vivek. R *2 # Department of Information Science Engineering, B.V. Bhoomraddi College of Engineering and Technology

More information

Chapter 9.2 A Unified Framework for Video Summarization, Browsing and Retrieval

Chapter 9.2 A Unified Framework for Video Summarization, Browsing and Retrieval Chapter 9.2 A Unified Framework for Video Summarization, Browsing and Retrieval Ziyou Xiong, Yong Rui, Regunathan Radhakrishnan, Ajay Divakaran, Thomas S. Huang Beckman Institute for Advanced Science and

More information

TEVI: Text Extraction for Video Indexing

TEVI: Text Extraction for Video Indexing TEVI: Text Extraction for Video Indexing Hichem KARRAY, Mohamed SALAH, Adel M. ALIMI REGIM: Research Group on Intelligent Machines, EIS, University of Sfax, Tunisia hichem.karray@ieee.org mohamed_salah@laposte.net

More information

Automatic Video Caption Detection and Extraction in the DCT Compressed Domain

Automatic Video Caption Detection and Extraction in the DCT Compressed Domain Automatic Video Caption Detection and Extraction in the DCT Compressed Domain Chin-Fu Tsao 1, Yu-Hao Chen 1, Jin-Hau Kuo 1, Chia-wei Lin 1, and Ja-Ling Wu 1,2 1 Communication and Multimedia Laboratory,

More information

Detection of goal event in soccer videos

Detection of goal event in soccer videos Detection of goal event in soccer videos Hyoung-Gook Kim, Steffen Roeber, Amjad Samour, Thomas Sikora Department of Communication Systems, Technical University of Berlin, Einsteinufer 17, D-10587 Berlin,

More information

VIDEO OBJECT SEGMENTATION BY EXTENDED RECURSIVE-SHORTEST-SPANNING-TREE METHOD. Ertem Tuncel and Levent Onural

VIDEO OBJECT SEGMENTATION BY EXTENDED RECURSIVE-SHORTEST-SPANNING-TREE METHOD. Ertem Tuncel and Levent Onural VIDEO OBJECT SEGMENTATION BY EXTENDED RECURSIVE-SHORTEST-SPANNING-TREE METHOD Ertem Tuncel and Levent Onural Electrical and Electronics Engineering Department, Bilkent University, TR-06533, Ankara, Turkey

More information

PixSO: A System for Video Shot Detection

PixSO: A System for Video Shot Detection PixSO: A System for Video Shot Detection Chengcui Zhang 1, Shu-Ching Chen 1, Mei-Ling Shyu 2 1 School of Computer Science, Florida International University, Miami, FL 33199, USA 2 Department of Electrical

More information

Summarization of Egocentric Moving Videos for Generating Walking Route Guidance

Summarization of Egocentric Moving Videos for Generating Walking Route Guidance Summarization of Egocentric Moving Videos for Generating Walking Route Guidance Masaya Okamoto and Keiji Yanai Department of Informatics, The University of Electro-Communications 1-5-1 Chofugaoka, Chofu-shi,

More information

Fast Highlight Detection and Scoring for Broadcast Soccer Video Summarization using On-Demand Feature Extraction and Fuzzy Inference

Fast Highlight Detection and Scoring for Broadcast Soccer Video Summarization using On-Demand Feature Extraction and Fuzzy Inference , pp.13-36 http://dx.doi.org/10.14257/ijcg.2015.6.1.02 Fast Highlight Detection and Scoring for Broadcast Soccer Video Summarization using On-Demand Feature Extraction and Fuzzy Inference Mohamad-Hoseyn

More information

Video Summarization Using MPEG-7 Motion Activity and Audio Descriptors

Video Summarization Using MPEG-7 Motion Activity and Audio Descriptors Video Summarization Using MPEG-7 Motion Activity and Audio Descriptors Ajay Divakaran, Kadir A. Peker, Regunathan Radhakrishnan, Ziyou Xiong and Romain Cabasson Presented by Giulia Fanti 1 Overview Motivation

More information

Structure Analysis of Soccer Video with Domain Knowledge and Hidden Markov Models

Structure Analysis of Soccer Video with Domain Knowledge and Hidden Markov Models MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Structure Analysis of Soccer Video with Domain Knowledge and Hidden Markov Models Lexing Xie, Peng Xu, Shih-Fu Chang, Ajay Divakaran, Huifang

More information

Audio-Visual Content Indexing, Filtering, and Adaptation

Audio-Visual Content Indexing, Filtering, and Adaptation Audio-Visual Content Indexing, Filtering, and Adaptation Shih-Fu Chang Digital Video and Multimedia Group ADVENT University-Industry Consortium Columbia University 10/12/2001 http://www.ee.columbia.edu/dvmm

More information

Audio-Visual Content Indexing, Filtering, and Adaptation

Audio-Visual Content Indexing, Filtering, and Adaptation Audio-Visual Content Indexing, Filtering, and Adaptation Shih-Fu Chang Digital Video and Multimedia Group ADVENT University-Industry Consortium Columbia University 10/12/2001 http://www.ee.columbia.edu/dvmm

More information

CONTENT analysis of video is to find meaningful structures

CONTENT analysis of video is to find meaningful structures 1576 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 18, NO. 11, NOVEMBER 2008 An ICA Mixture Hidden Markov Model for Video Content Analysis Jian Zhou, Member, IEEE, and Xiao-Ping

More information

Browsing News and TAlk Video on a Consumer Electronics Platform Using face Detection

Browsing News and TAlk Video on a Consumer Electronics Platform Using face Detection MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Browsing News and TAlk Video on a Consumer Electronics Platform Using face Detection Kadir A. Peker, Ajay Divakaran, Tom Lanning TR2005-155

More information

CAP 6412 Advanced Computer Vision

CAP 6412 Advanced Computer Vision CAP 6412 Advanced Computer Vision http://www.cs.ucf.edu/~bgong/cap6412.html Boqing Gong April 21st, 2016 Today Administrivia Free parameters in an approach, model, or algorithm? Egocentric videos by Aisha

More information

Textural Features for Image Database Retrieval

Textural Features for Image Database Retrieval Textural Features for Image Database Retrieval Selim Aksoy and Robert M. Haralick Intelligent Systems Laboratory Department of Electrical Engineering University of Washington Seattle, WA 98195-2500 {aksoy,haralick}@@isl.ee.washington.edu

More information

A Semi-Automatic 2D-to-3D Video Conversion with Adaptive Key-Frame Selection

A Semi-Automatic 2D-to-3D Video Conversion with Adaptive Key-Frame Selection A Semi-Automatic 2D-to-3D Video Conversion with Adaptive Key-Frame Selection Kuanyu Ju and Hongkai Xiong Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, China ABSTRACT To

More information

A Laplacian Based Novel Approach to Efficient Text Localization in Grayscale Images

A Laplacian Based Novel Approach to Efficient Text Localization in Grayscale Images A Laplacian Based Novel Approach to Efficient Text Localization in Grayscale Images Karthik Ram K.V & Mahantesh K Department of Electronics and Communication Engineering, SJB Institute of Technology, Bangalore,

More information

KAZOVISION. Kazo Vision is a solution provider who focuses on LED/LCD display and visual effects serving advertisement, exhibition and sports field.

KAZOVISION. Kazo Vision is a solution provider who focuses on LED/LCD display and visual effects serving advertisement, exhibition and sports field. KAZOVISION Kazo Vision is a solution provider who focuses on LED/LCD display and visual effects serving advertisement, exhibition and sports field. Sports Live Video System Video Slow Motion System Title

More information

Understanding Tracking and StroMotion of Soccer Ball

Understanding Tracking and StroMotion of Soccer Ball Understanding Tracking and StroMotion of Soccer Ball Nhat H. Nguyen Master Student 205 Witherspoon Hall Charlotte, NC 28223 704 656 2021 rich.uncc@gmail.com ABSTRACT Soccer requires rapid ball movements.

More information

MULTIVIEW REPRESENTATION OF 3D OBJECTS OF A SCENE USING VIDEO SEQUENCES

MULTIVIEW REPRESENTATION OF 3D OBJECTS OF A SCENE USING VIDEO SEQUENCES MULTIVIEW REPRESENTATION OF 3D OBJECTS OF A SCENE USING VIDEO SEQUENCES Mehran Yazdi and André Zaccarin CVSL, Dept. of Electrical and Computer Engineering, Laval University Ste-Foy, Québec GK 7P4, Canada

More information

Integrating Low-Level and Semantic Visual Cues for Improved Image-to-Video Experiences

Integrating Low-Level and Semantic Visual Cues for Improved Image-to-Video Experiences Integrating Low-Level and Semantic Visual Cues for Improved Image-to-Video Experiences Pedro Pinho, Joel Baltazar, Fernando Pereira Instituto Superior Técnico - Instituto de Telecomunicações IST, Av. Rovisco

More information

Multimedia Database Systems. Retrieval by Content

Multimedia Database Systems. Retrieval by Content Multimedia Database Systems Retrieval by Content MIR Motivation Large volumes of data world-wide are not only based on text: Satellite images (oil spill), deep space images (NASA) Medical images (X-rays,

More information

Recall precision graph

Recall precision graph VIDEO SHOT BOUNDARY DETECTION USING SINGULAR VALUE DECOMPOSITION Λ Z.»CERNEKOVÁ, C. KOTROPOULOS AND I. PITAS Aristotle University of Thessaloniki Box 451, Thessaloniki 541 24, GREECE E-mail: (zuzana, costas,

More information

Data Hiding in Video

Data Hiding in Video Data Hiding in Video J. J. Chae and B. S. Manjunath Department of Electrical and Computer Engineering University of California, Santa Barbara, CA 9316-956 Email: chaejj, manj@iplab.ece.ucsb.edu Abstract

More information

Color Image Segmentation

Color Image Segmentation Color Image Segmentation Yining Deng, B. S. Manjunath and Hyundoo Shin* Department of Electrical and Computer Engineering University of California, Santa Barbara, CA 93106-9560 *Samsung Electronics Inc.

More information

A Statistical-driven Approach for Automatic Classification of Events in AFL Video Highlights

A Statistical-driven Approach for Automatic Classification of Events in AFL Video Highlights A Statistical-driven Approach for Automatic Classification of Events in AFL Video Highlights Dian Tjondronegoro 1 2 3 Yi-Ping Phoebe Chen 1 Binh Pham 3 School of Information Technology, Deakin University

More information

A Linear Approximation Based Method for Noise-Robust and Illumination-Invariant Image Change Detection

A Linear Approximation Based Method for Noise-Robust and Illumination-Invariant Image Change Detection A Linear Approximation Based Method for Noise-Robust and Illumination-Invariant Image Change Detection Bin Gao 2, Tie-Yan Liu 1, Qian-Sheng Cheng 2, and Wei-Ying Ma 1 1 Microsoft Research Asia, No.49 Zhichun

More information

Offering Access to Personalized Interactive Video

Offering Access to Personalized Interactive Video Offering Access to Personalized Interactive Video 1 Offering Access to Personalized Interactive Video Giorgos Andreou, Phivos Mylonas, Manolis Wallace and Stefanos Kollias Image, Video and Multimedia Systems

More information

A robust method for automatic player detection in sport videos

A robust method for automatic player detection in sport videos A robust method for automatic player detection in sport videos A. Lehuger 1 S. Duffner 1 C. Garcia 1 1 Orange Labs 4, rue du clos courtel, 35512 Cesson-Sévigné {antoine.lehuger, stefan.duffner, christophe.garcia}@orange-ftgroup.com

More information

Available online at ScienceDirect. Procedia Computer Science 96 (2016 )

Available online at   ScienceDirect. Procedia Computer Science 96 (2016 ) Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 96 (2016 ) 1409 1417 20th International Conference on Knowledge Based and Intelligent Information and Engineering Systems,

More information

The ToCAI Description Scheme for Indexing and Retrieval of Multimedia Documents 1

The ToCAI Description Scheme for Indexing and Retrieval of Multimedia Documents 1 The ToCAI Description Scheme for Indexing and Retrieval of Multimedia Documents 1 N. Adami, A. Bugatti, A. Corghi, R. Leonardi, P. Migliorati, Lorenzo A. Rossi, C. Saraceno 2 Department of Electronics

More information

Automatic visual recognition for metro surveillance

Automatic visual recognition for metro surveillance Automatic visual recognition for metro surveillance F. Cupillard, M. Thonnat, F. Brémond Orion Research Group, INRIA, Sophia Antipolis, France Abstract We propose in this paper an approach for recognizing

More information

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009 Learning and Inferring Depth from Monocular Images Jiyan Pan April 1, 2009 Traditional ways of inferring depth Binocular disparity Structure from motion Defocus Given a single monocular image, how to infer

More information

Clustering Methods for Video Browsing and Annotation

Clustering Methods for Video Browsing and Annotation Clustering Methods for Video Browsing and Annotation Di Zhong, HongJiang Zhang 2 and Shih-Fu Chang* Institute of System Science, National University of Singapore Kent Ridge, Singapore 05 *Center for Telecommunication

More information

Online structured learning for Obstacle avoidance

Online structured learning for Obstacle avoidance Adarsh Kowdle Cornell University Zhaoyin Jia Cornell University apk64@cornell.edu zj32@cornell.edu Abstract Obstacle avoidance based on a monocular system has become a very interesting area in robotics

More information

A NOVEL FEATURE EXTRACTION METHOD BASED ON SEGMENTATION OVER EDGE FIELD FOR MULTIMEDIA INDEXING AND RETRIEVAL

A NOVEL FEATURE EXTRACTION METHOD BASED ON SEGMENTATION OVER EDGE FIELD FOR MULTIMEDIA INDEXING AND RETRIEVAL A NOVEL FEATURE EXTRACTION METHOD BASED ON SEGMENTATION OVER EDGE FIELD FOR MULTIMEDIA INDEXING AND RETRIEVAL Serkan Kiranyaz, Miguel Ferreira and Moncef Gabbouj Institute of Signal Processing, Tampere

More information

Region Feature Based Similarity Searching of Semantic Video Objects

Region Feature Based Similarity Searching of Semantic Video Objects Region Feature Based Similarity Searching of Semantic Video Objects Di Zhong and Shih-Fu hang Image and dvanced TV Lab, Department of Electrical Engineering olumbia University, New York, NY 10027, US {dzhong,

More information

Affective Music Video Content Retrieval Features Based on Songs

Affective Music Video Content Retrieval Features Based on Songs Affective Music Video Content Retrieval Features Based on Songs R.Hemalatha Department of Computer Science and Engineering, Mahendra Institute of Technology, Mahendhirapuri, Mallasamudram West, Tiruchengode,

More information

Tamil Video Retrieval Based on Categorization in Cloud

Tamil Video Retrieval Based on Categorization in Cloud Tamil Video Retrieval Based on Categorization in Cloud V.Akila, Dr.T.Mala Department of Information Science and Technology, College of Engineering, Guindy, Anna University, Chennai veeakila@gmail.com,

More information

Deterministic Approach to Content Structure Analysis of Tennis Video

Deterministic Approach to Content Structure Analysis of Tennis Video Deterministic Approach to Content Structure Analysis of Tennis Video Viachaslau Parshyn, Liming Chen A Research Report, Lab. LIRIS, Ecole Centrale de Lyon LYON 2006 Abstract. An approach to automatic tennis

More information

Introduction to Medical Imaging (5XSA0) Module 5

Introduction to Medical Imaging (5XSA0) Module 5 Introduction to Medical Imaging (5XSA0) Module 5 Segmentation Jungong Han, Dirk Farin, Sveta Zinger ( s.zinger@tue.nl ) 1 Outline Introduction Color Segmentation region-growing region-merging watershed

More information

Mobile Human Detection Systems based on Sliding Windows Approach-A Review

Mobile Human Detection Systems based on Sliding Windows Approach-A Review Mobile Human Detection Systems based on Sliding Windows Approach-A Review Seminar: Mobile Human detection systems Njieutcheu Tassi cedrique Rovile Department of Computer Engineering University of Heidelberg

More information

Bus Detection and recognition for visually impaired people

Bus Detection and recognition for visually impaired people Bus Detection and recognition for visually impaired people Hangrong Pan, Chucai Yi, and Yingli Tian The City College of New York The Graduate Center The City University of New York MAP4VIP Outline Motivation

More information

Assistive Sports Video Annotation: Modelling and Detecting Complex Events in Sports Video

Assistive Sports Video Annotation: Modelling and Detecting Complex Events in Sports Video : Modelling and Detecting Complex Events in Sports Video Aled Owen 1, David Marshall 1, Kirill Sidorov 1, Yulia Hicks 1, and Rhodri Brown 2 1 Cardiff University, Cardiff, UK 2 Welsh Rugby Union Abstract

More information

Supervised texture detection in images

Supervised texture detection in images Supervised texture detection in images Branislav Mičušík and Allan Hanbury Pattern Recognition and Image Processing Group, Institute of Computer Aided Automation, Vienna University of Technology Favoritenstraße

More information

Robust color segmentation algorithms in illumination variation conditions

Robust color segmentation algorithms in illumination variation conditions 286 CHINESE OPTICS LETTERS / Vol. 8, No. / March 10, 2010 Robust color segmentation algorithms in illumination variation conditions Jinhui Lan ( ) and Kai Shen ( Department of Measurement and Control Technologies,

More information

Short Run length Descriptor for Image Retrieval

Short Run length Descriptor for Image Retrieval CHAPTER -6 Short Run length Descriptor for Image Retrieval 6.1 Introduction In the recent years, growth of multimedia information from various sources has increased many folds. This has created the demand

More information

Chris Poppe, Steven Verstockt, Sarah De Bruyne, Rik Van de Walle

Chris Poppe, Steven Verstockt, Sarah De Bruyne, Rik Van de Walle biblio.ugent.be The UGent Institutional Repository is the electronic archiving and dissemination platform for all UGent research publications. Ghent University has implemented a mandate stipulating that

More information

Blur Detection For Video Streams In The Compressed Domain

Blur Detection For Video Streams In The Compressed Domain log of pixel number Blur Detection For Video Streams In The Compressed Domain Zhenyu Wu 1, Daiying Zhou 1, and Hong Hu 2 1 University of Electronic Science and Technology of China, Chengdu, Sichuan, China

More information

Iterative Image Based Video Summarization by Node Segmentation

Iterative Image Based Video Summarization by Node Segmentation Iterative Image Based Video Summarization by Node Segmentation Nalini Vasudevan Arjun Jain Himanshu Agrawal Abstract In this paper, we propose a simple video summarization system based on removal of similar

More information

DATA EMBEDDING IN TEXT FOR A COPIER SYSTEM

DATA EMBEDDING IN TEXT FOR A COPIER SYSTEM DATA EMBEDDING IN TEXT FOR A COPIER SYSTEM Anoop K. Bhattacharjya and Hakan Ancin Epson Palo Alto Laboratory 3145 Porter Drive, Suite 104 Palo Alto, CA 94304 e-mail: {anoop, ancin}@erd.epson.com Abstract

More information

Last week. Multi-Frame Structure from Motion: Multi-View Stereo. Unknown camera viewpoints

Last week. Multi-Frame Structure from Motion: Multi-View Stereo. Unknown camera viewpoints Last week Multi-Frame Structure from Motion: Multi-View Stereo Unknown camera viewpoints Last week PCA Today Recognition Today Recognition Recognition problems What is it? Object detection Who is it? Recognizing

More information

Video De-interlacing with Scene Change Detection Based on 3D Wavelet Transform

Video De-interlacing with Scene Change Detection Based on 3D Wavelet Transform Video De-interlacing with Scene Change Detection Based on 3D Wavelet Transform M. Nancy Regina 1, S. Caroline 2 PG Scholar, ECE, St. Xavier s Catholic College of Engineering, Nagercoil, India 1 Assistant

More information

A Novel Statistical Distortion Model Based on Mixed Laplacian and Uniform Distribution of Mpeg-4 FGS

A Novel Statistical Distortion Model Based on Mixed Laplacian and Uniform Distribution of Mpeg-4 FGS A Novel Statistical Distortion Model Based on Mixed Laplacian and Uniform Distribution of Mpeg-4 FGS Xie Li and Wenjun Zhang Institute of Image Communication and Information Processing, Shanghai Jiaotong

More information

Automatic Shadow Removal by Illuminance in HSV Color Space

Automatic Shadow Removal by Illuminance in HSV Color Space Computer Science and Information Technology 3(3): 70-75, 2015 DOI: 10.13189/csit.2015.030303 http://www.hrpub.org Automatic Shadow Removal by Illuminance in HSV Color Space Wenbo Huang 1, KyoungYeon Kim

More information

Human Upper Body Pose Estimation in Static Images

Human Upper Body Pose Estimation in Static Images 1. Research Team Human Upper Body Pose Estimation in Static Images Project Leader: Graduate Students: Prof. Isaac Cohen, Computer Science Mun Wai Lee 2. Statement of Project Goals This goal of this project

More information

Video Syntax Analysis

Video Syntax Analysis 1 Video Syntax Analysis Wei-Ta Chu 2008/10/9 Outline 2 Scene boundary detection Key frame selection 3 Announcement of HW #1 Shot Change Detection Goal: automatic shot change detection Requirements 1. Write

More information

A Background Subtraction Based Video Object Detecting and Tracking Method

A Background Subtraction Based Video Object Detecting and Tracking Method A Background Subtraction Based Video Object Detecting and Tracking Method horng@kmit.edu.tw Abstract A new method for detecting and tracking mo tion objects in video image sequences based on the background

More information

Large-scale Video Classification with Convolutional Neural Networks

Large-scale Video Classification with Convolutional Neural Networks Large-scale Video Classification with Convolutional Neural Networks Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, Li Fei-Fei Note: Slide content mostly from : Bay Area

More information

Visual Attention Control by Sensor Space Segmentation for a Small Quadruped Robot based on Information Criterion

Visual Attention Control by Sensor Space Segmentation for a Small Quadruped Robot based on Information Criterion Visual Attention Control by Sensor Space Segmentation for a Small Quadruped Robot based on Information Criterion Noriaki Mitsunaga and Minoru Asada Dept. of Adaptive Machine Systems, Osaka University,

More information

CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM

CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM 1 PHYO THET KHIN, 2 LAI LAI WIN KYI 1,2 Department of Information Technology, Mandalay Technological University The Republic of the Union of Myanmar

More information

AN EFFICIENT BATIK IMAGE RETRIEVAL SYSTEM BASED ON COLOR AND TEXTURE FEATURES

AN EFFICIENT BATIK IMAGE RETRIEVAL SYSTEM BASED ON COLOR AND TEXTURE FEATURES AN EFFICIENT BATIK IMAGE RETRIEVAL SYSTEM BASED ON COLOR AND TEXTURE FEATURES 1 RIMA TRI WAHYUNINGRUM, 2 INDAH AGUSTIEN SIRADJUDDIN 1, 2 Department of Informatics Engineering, University of Trunojoyo Madura,

More information

Digital Image Processing. Prof. P.K. Biswas. Department of Electronics & Electrical Communication Engineering

Digital Image Processing. Prof. P.K. Biswas. Department of Electronics & Electrical Communication Engineering Digital Image Processing Prof. P.K. Biswas Department of Electronics & Electrical Communication Engineering Indian Institute of Technology, Kharagpur Image Segmentation - III Lecture - 31 Hello, welcome

More information

What is an edge? Paint. Depth discontinuity. Material change. Texture boundary

What is an edge? Paint. Depth discontinuity. Material change. Texture boundary EDGES AND TEXTURES The slides are from several sources through James Hays (Brown); Srinivasa Narasimhan (CMU); Silvio Savarese (U. of Michigan); Bill Freeman and Antonio Torralba (MIT), including their

More information