Video shot boundary detection using motion activity descriptor

Similar documents
Joint Adaptive Block Matching Search (JABMS) Algorithm

Fast Block-Matching Motion Estimation Using Modified Diamond Search Algorithm

Enhanced Hexagon with Early Termination Algorithm for Motion estimation

Semi-Hierarchical Based Motion Estimation Algorithm for the Dirac Video Encoder

Key Frame Extraction and Indexing for Multimedia Databases

Prediction-based Directional Search for Fast Block-Matching Motion Estimation

A New Fast Motion Estimation Algorithm. - Literature Survey. Instructor: Brian L. Evans. Authors: Yue Chen, Yu Wang, Ying Lu.

Scene Change Detection Based on Twice Difference of Luminance Histograms

International Journal of Emerging Technology and Advanced Engineering Website: (ISSN , Volume 2, Issue 4, April 2012)

Redundancy and Correlation: Temporal

Video De-interlacing with Scene Change Detection Based on 3D Wavelet Transform

Implementation of H.264 Video Codec for Block Matching Algorithms

A Comparative Approach for Block Matching Algorithms used for Motion Estimation

PixSO: A System for Video Shot Detection

Binju Bentex *1, Shandry K. K 2. PG Student, Department of Computer Science, College Of Engineering, Kidangoor, Kottayam, Kerala, India

Scalable Hierarchical Summarization of News Using Fidelity in MPEG-7 Description Scheme

Express Letters. A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation. Jianhua Lu and Ming L. Liou

A Robust Wipe Detection Algorithm

Research Article Block-Matching Translational and Rotational Motion Compensated Prediction Using Interpolated Reference Frame

Comparative Analysis on Medical Images using SPIHT, STW and EZW

Analysis of Image and Video Using Color, Texture and Shape Features for Object Identification

International Journal of Scientific & Engineering Research, Volume 5, Issue 7, July ISSN

Camera Motion Identification in the Rough Indexing Paradigm

Scalable Coding of Image Collections with Embedded Descriptors

Motion in 2D image sequences

Efficient Block Matching Algorithm for Motion Estimation

AN EFFICIENT VIDEO WATERMARKING USING COLOR HISTOGRAM ANALYSIS AND BITPLANE IMAGE ARRAYS

AUTOMATIC VIDEO INDEXING

Motion Vector Estimation Search using Hexagon-Diamond Pattern for Video Sequences, Grid Point and Block-Based

Searching Video Collections:Part I

VIDEO OBJECT SEGMENTATION BY EXTENDED RECURSIVE-SHORTEST-SPANNING-TREE METHOD. Ertem Tuncel and Levent Onural

Motion Estimation for Video Coding Standards

CHAPTER 3 SHOT DETECTION AND KEY FRAME EXTRACTION

Detection and Treatment of Torn Films with Comparison of Different Motionestimation Techniques

Star Diamond-Diamond Search Block Matching Motion Estimation Algorithm for H.264/AVC Video Codec

Hybrid Video Compression Using Selective Keyframe Identification and Patch-Based Super-Resolution

A Novel Hexagonal Search Algorithm for Fast Block Matching Motion Estimation

Content Based Image Retrieval Using Color Quantizes, EDBTC and LBP Features

A Geometrical Key-frame Selection Method exploiting Dominant Motion Estimation in Video

Module 7 VIDEO CODING AND MOTION ESTIMATION

The ToCAI Description Scheme for Indexing and Retrieval of Multimedia Documents 1

CONTENT ADAPTIVE COMPLEXITY REDUCTION SCHEME FOR QUALITY/FIDELITY SCALABLE HEVC

DATA and signal modeling for images and video sequences. Region-Based Representations of Image and Video: Segmentation Tools for Multimedia Services

Fast Implementation of VC-1 with Modified Motion Estimation and Adaptive Block Transform

Real-time Monitoring System for TV Commercials Using Video Features

AN ADJUSTABLE BLOCK MOTION ESTIMATION ALGORITHM BY MULTIPATH SEARCH

A High Quality/Low Computational Cost Technique for Block Matching Motion Estimation

Image Segmentation for Image Object Extraction

Automatic Shadow Removal by Illuminance in HSV Color Space

Video shot segmentation using late fusion technique

MULTIVIEW REPRESENTATION OF 3D OBJECTS OF A SCENE USING VIDEO SEQUENCES

Reducing/eliminating visual artifacts in HEVC by the deblocking filter.

Fast Motion Estimation for Shape Coding in MPEG-4

Advanced Video Coding: The new H.264 video compression standard

A Low Bit-Rate Video Codec Based on Two-Dimensional Mesh Motion Compensation with Adaptive Interpolation

Automatic Video Caption Detection and Extraction in the DCT Compressed Domain

COMPARATIVE ANALYSIS OF BLOCK MATCHING ALGORITHMS FOR VIDEO COMPRESSION

Recall precision graph

Medical Image Sequence Compression Using Motion Compensation and Set Partitioning In Hierarchical Trees

Image Compression and Resizing Using Improved Seam Carving for Retinal Images

Saliency Detection for Videos Using 3D FFT Local Spectra

Chapter 11.3 MPEG-2. MPEG-2: For higher quality video at a bit-rate of more than 4 Mbps Defined seven profiles aimed at different applications:

Tunnelling-based Search Algorithm for Block-Matching Motion Estimation

Fobe Algorithm for Video Processing Applications

Comparative Study of Partial Closed-loop Versus Open-loop Motion Estimation for Coding of HDTV

Object Extraction Using Image Segmentation and Adaptive Constraint Propagation

Shot Detection using Pixel wise Difference with Adaptive Threshold and Color Histogram Method in Compressed and Uncompressed Video

A Laplacian Based Novel Approach to Efficient Text Localization in Grayscale Images

MISB ST STANDARD. 27 February Motion Imagery Interpretability and Quality Metadata. 1 Scope. 2 References. 2.1 Normative References

VIDEO COMPRESSION STANDARDS

Video Key-Frame Extraction using Entropy value as Global and Local Feature

Data Hiding in Video

Compression of Light Field Images using Projective 2-D Warping method and Block matching

Including the Size of Regions in Image Segmentation by Region Based Graph

Moving Object Segmentation Method Based on Motion Information Classification by X-means and Spatial Region Segmentation

Reconstruction PSNR [db]

Outline Introduction MPEG-2 MPEG-4. Video Compression. Introduction to MPEG. Prof. Pratikgiri Goswami

Robotics Programming Laboratory

Texture Image Segmentation using FCM

Unsupervised Camera Motion Estimation and Moving Object Detection in Videos

Optimized Progressive Coding of Stereo Images Using Discrete Wavelet Transform

Video Compression System for Online Usage Using DCT 1 S.B. Midhun Kumar, 2 Mr.A.Jayakumar M.E 1 UG Student, 2 Associate Professor

Introduction to Medical Imaging (5XSA0) Module 5

Edge and corner detection

Definition, Detection, and Evaluation of Meeting Events in Airport Surveillance Videos

MATRIX BASED SEQUENTIAL INDEXING TECHNIQUE FOR VIDEO DATA MINING

NeTra-V: Towards an Object-based Video Representation

A feature-based algorithm for detecting and classifying production effects

Efficient Acquisition of Human Existence Priors from Motion Trajectories

Video Alignment. Literature Survey. Spring 2005 Prof. Brian Evans Multidimensional Digital Signal Processing Project The University of Texas at Austin

Clustering Methods for Video Browsing and Annotation

SURVEILLANCE VIDEO FOR MOBILE DEVICES

Motion Detection Algorithm

Motion Estimation and Optical Flow Tracking

Dominant colour extraction in DCT domain

AUDIOVISUAL COMMUNICATION

2.1 Optimized Importance Map

View Synthesis for Multiview Video Compression

STUDY AND IMPLEMENTATION OF VIDEO COMPRESSION STANDARDS (H.264/AVC, DIRAC)

Tamil Video Retrieval Based on Categorization in Cloud

Transcription:

JOURNAL OF TELECOMMUNICATIONS, VOLUME, ISSUE 1, APRIL 010 Video shot boundary detection using motion activity descriptor Abdelati Malek Amel, Ben Abdelali Abdessalem and Mtibaa Abdellatif 54 Abstract This paper focus on the study of the motion activity descriptor for shot boundary detection in video sequences. We interest in the validation of this descriptor in the aim of its real time implementation with reasonable high performances in shot boundary detection. The motion activity information is extracted in uncompressed domain based on adaptive rood pattern search (ARPS) algorithm. In this context, the motion activity descriptor was applied for different video sequence. Index Terms Motion activity, MPEG-7, video segmentation, ARPS, Block-matching. 1. INTRODUCTION n recent years, the development of software and Ihardware technology has enabled the creation of a large amount of digital video content. Video segmentation based on motion [1] is a new research area. Motion is a salient feature in video, in addition to other typical image features such as color, shape and texture. Video shot boundary detection is a fundamental step in video indexing and retrieval, and in general video data management. The general objectives are to segment a given video sequence into its constituent shots, and to identify and classify the different shot transitions in the sequence [10]. Different algorithms have been proposed, for instance, based on simple color histograms [11, 1], pixel color differences [13], color ratio histograms [14], edges [15], and motion [16 18]. In this work, we study the problem of video partitioning using a motion-based approach. Grey level In this study we interest in the validation of the motion activity descriptor in the aim of its possible real time sequence implementation. We have applied this descriptor for different video sequence. We have considered its application for shot boundary detection. The rest of this paper is organized as follows: in section, we present the motion activity descriptor specification. In section 3 we focus on the study of the motion activity descriptor for shot boundary detection in video sequences. The different experimentations and the evolution results of the motion activity are presented in Very low Activity section 4. MOTION ACTIVITY DESCRIPTOR The motion activity is one of the motion features included in the visual part of the MPEG-7 standard. It also used to describe the level or intensity of activity, action, or motion Abdelati Malek Amel is with the National school of engineers of monastir, Tunisia. Ben Abdelali Abdessalem is with the High Institute of Informatics and Mathematics of Monastir, Tunisia. Mtibaa Abdellatif is with the National School ef engineers, Monastir, Tunisia. 010 JOT http://sites.google.com/site/journaloftelecommunications/ in that video sequence. In this paper, we propose that since low or high action is a measure of how much a video segment is changing. The motion activity is used in different applications such as video surveillance, fast browsing, dynamic video summarization, content-based querying etc. This information can for example be used for shot boundary detection, for shot classification or scene segmentation [19]. The magnitude of the motion vectors represents a measure of intensity of motion activity that includes transformation of video Low Activity Medium Activity High Activity Very High Activity Intensity motion detection Fig.1. Different steps for motion intensity extraction Division of each image in 16x16 macroblocks Search for each macro block in the next image Motion Vector extraction Motion activity matrix

55 several additional attributes that contribute towards the efficient use of these motion descriptors in a number of applications. Intensity of activity [4] is expressed by an integer in the range (1-5) and higher the value of intensity, higher the motion activity. The motion features are extracted using the motion vectors. Block motion techniques are employed to extract the motion vectors. Suppose x (i, j) and y (i, j) denote the motion vectors in x and y directions for a given frame, where (i, j) indicates the block indices. The spatial activity matrix was defined in [5] to be C mv = { R( i, j)} (1) Where ( ) ( i, j) x( i, j) y( i, j) R xy = + () The average of activity matrix for each frame is given by M 1N 1 avg 1 Cmv = Cmv ( i, j) (3) MN i= 0 j= 0 Here M and N are the width and height of the macroblocks in the frame. This method ignores the low activity blocks and leaves the high activity blocks unaltered. The intensity of motion for each frame is determined as follows: fr M 1N 1 1 i= 0 j= 0 ( ( ) ) avg C i, j C σ = mv mv (4) MN Intensity values of motion are classified into five categories (Table 1) namely very low, low, medium, high and very high activity. The intensity of motion the level of motion activity and TABLE 1 MOTION ACTIVITY CATEGORIES Activity value Dynamique de l écart-type des vecteurs de mouvement σ 1/Very low Intensity 0 σ 3,9 /Low Intensity 3,9 σ 10,7 3/Medium Intensity 10,7 σ 17,1 4/High Intensity 17,1 σ 3 5/Very High Intensity 3 σ it s their feature used for motion activity description. In the next section we describe how to extract motion intensity in the uncompressed domain. We use the block matching algorithm to extract the motion intensity feature..1. Block_matching for motion vector extraction To extract the motion intensity, the motion of the moving objects has to be estimated first. This is called motion estimation. The commonly used motion estimation technique in all the standard video codecs is the block matching algorithm (BMA). Block matching is used to retrieve an initial estimate of the image displacement. To obtain a dense displacement field, matching with adaptive block sizes was implemented. In this typical algorithm, a frame is divided into blocks of M N pixels or, more usually, square blocks of N pixels [3]. Then, we assume that each block undergoes translation only with no scaling or rotation. The blocks in the first frame are compared to the blocks in the second frame. Motion Vectors can then be calculated for each block to see where each block from the first frame ends up in the second frame. A vast number of BMAs have been proposed: Exhaustive search (ES) Diamond Search(DS) Three-step search(tss) New three step search(ntss) Simple and effective search(ses) Four Step Search (4SS) Adaptive Rood Pattern Search(ARPS) In this work, we use an adaptive rood pattern search (ARPS) algorithm. A. Barjatya[7] illustrates and simulates 7 of the most popular block matching algorithms, with their comparative study. Three more, very recent, block matching algorithms are studied in the end as part of literature review. Of the various algorithms studied or simulated in [7], ARPS turns out to be the best block matching algorithm. ARPS [8] algorithm makes use of the fact that the general motion in a frame is usually coherent, i.e. if the macro blocks around the current macro block moved in a particular direction then there is a high probability that the current macro block will also have a similar motion vector. This algorithm uses the motion vector of the macro block to its immediate left to predict its own motion vector. An example is shown in Fig. 1. The predicted motion vector points to (3, -). In addition to checking the location pointed by the predicted motion vector, it also checks at a rood pattern distributed points, as shown in Fig 1, where they are at a step size of S = Max ( X, Y ). X and Y are the x-coordinate and y-coordinate of the predicted motion vector. This rood pattern search is always the first step. Fig.. Adaptive Root Pattern: The predicted motion vector is (3,-) and the step size S = Max( 3, - ) = 3. It directly puts the search in an area where there is a high 010 JOT http://sites.google.com/site/journaloftelecommunications/

56 probability of finding a good matching block. I.1. Fig. 3. Search points per macro block while computing the PSNR performance of Fast Block Matching Algorithms [7]. Fig.6. Example of video sequence results with the motion activity descriptor Fig. 4. Peak-Signal-to-Noise-Ratio (PSNR) performance of Fast Block Matching Algorithms. Caltrain Sequence was used with a frame distance of [7]. Peak-Signal-to-Noise-Ratio (PSNR) characterizes the motion compensated image that is created by using motion vectors and macro clocks from the reference frame. An example of the Matlab implementation results of the motion vector description technique is given by fig 5. (a) (b) (c) Fig.5. (a) and (b) two successive caltrain frames and (c) motion vector result of (a) and (b) frames... Motion activity implementation for an example of video sequence An example of spatial repartition of the motion activity descriptor is given in fig 6. The obtained results demonstrate the performance of the motion descriptor for News-4 sequence. 3 SHOT BOUNDARY DETECTION USING MOTION ACTIVITY The basis of any video segmentation method consists in detecting visual discontinuities along the time domain. During this process, it is required to extract visual features that measure the degree of similarity between frames in a given shot. This measure is related to the difference or discontinuity between frame n and n+j where j>= 1. The main idea underlying the methods of segmentation schemes is that images in the vicinity of a transition are highly dissimilar. It then seeks to identify discontinuities in the video stream. The general principle is to extract a comment on each image, and then define a distance [9] (or similarity measure) between observations. The application of the distance between two successive images, the entire video stream, roduces a onedimensional signal, in which we seek then the peaks (resp. hollow if similarity measure), which correspond to moments of high dissimilarity. In this work, we used the extraction of key frames method based on detecting a significant change in the activity of motion. To jump images which do not distort the calculations but we can minimize the execution time. First we extract the motion vectors between image i and image i+ then calculates the intensity of motion, we repeat this process until reaching the last frame of the video and comparing the difference between the intensities of successive motion to a specified threshold. The idea can be visualized in fig 7.

57 Frame number 1 3 4 5 6 7 8 n-3 n- n-1 n Foot 9 1 3 4 5 MV MV MV Motion intensity Fig.7.The idea of video segmentation using motion intensity 6 7 Film 1 Film Film 3 Film 4 4 EXPERIMENTATION The goal of short boundary detection is to process video sequences that contain high redundancy and make them exciting, interesting, valuable and useful for users. Many researchers do not include any form of quantitative evaluations. The criteria to judge shot boundary detection quality will be different for different application domains. For event-based content, summaries might be scored using the percentage of events from the original program that they contain. Two evaluations are given in [8] and [9]. Ekin and Tekalp [9] give precision and recall values for goal, referee, and penalty box detection, which are important events in soccer games. Pr ecision = Transitions Correctly Reported/ Transitions Reported (5) Recall= Transitions Correctly Reported/ Transitions in Reference FP = False Transitions / Transition s Reported For every video sequence we determine the number of shots, the number of shots correctly reported, the number of false detections and the number of non reported shots. For each sequence we also draw the curve of the distances between the successive frames. These curves are mainly used to determine the threshold values, but they also give an idea about the capacity of the used technique in detecting transitions. (6) (7) Film 5 Pub 1 Pub Pub 3 Pub 4 Pub 5 Fig.8. Example frames from the video sequence employed in the experimental evaluation. An overview of our shot boundary detection results is given in Table. The segmentation method does not allow to explicitly selecting the total number of homogeneous segments. This number depends on the threshold value adopted in the merging decision. We have studied the sensibility of our motion-based segmentation method to the threshold setting. The video segmentation using motion activity is given in Table. With motion activity descriptor we obtain good overall results for shot boundary detection, as it is shown in Table3. TABLE Type de vidéos MOTION VIDEO SEQUENCE PROPERTIES Durée (min) Nombre d images Transitions totales CUTs Fondus News 05:56 7136 83 6 1 04:56 855 110 89 1 Football 04:14 110 134 99 35 Films 03:45 7733 60 57 3 News 1 News News 3 News 4 Foot1 Foot Foot 3 Foot 4 Foot 5 Foot 6 Foot 7 Foot 8 Publicity 0:38 596 90 60 30 TOTAL 1:9 40306 477 367 110 The obtained results for the different evaluation tests are given in Table 3. This table gives the performances (precision, recall and false positives) of the motion activity descriptor for different video sequence types.

58 TABLE 3 PERFORMANCES OF MOTION ACTIVITY DESCRIPTOR FOR DIFFERENT VIDEO SEQUENCE Video Precision Recall FP News 7,7% 75,33% 8,8% Football 69,56% 84,1% 30,43% Films 75% 66,66% 5% Publicité 76,19% 84,1 % 3,8% s 63,63% 75,71% 36,36% Image N 1373 Image N 1444 Image N 1796 Image N 181 Image N 188 Image N 191 An example of graphical results (for the video News 4 ) is given in Fig 9. The curve represented in this figure demonstrates the difference between two successive motion activity for frames to obtain a shot boundary detection. Image N 051 Image N 073 Image N 181 Image N 300 Fig.10. An example of shot boundary detection (the "News 4" video sequence) Fig.9. The difference between successive motion intensities for shot boundary detection of the news-4. The first frame of each shot is considered as the key image of the detected shot. The Fig 10 represents an example of shot boundary detection corresponding to the sequence "News 4". Image N 1 Image N 534 Image N 559 Image N 893 Image N 106 Image N 109 5 CONCLUSIONS This paper brought an experimental study of the motion activity descriptor particularly for shot boundary detection. After introducing the content based video segmentation problem, we have presented the specification of the motion activity in section. Section 3 and section 4 were dedicated to present the experiments of the motion activity descriptor for shot boundary detection in video sequences. The motion activity was applied for different video sequences. Experimental results demonstrate that the use of motion activity can assure a satisfactory of shot boundary detection rate. This can promise a better computing performance and can be useful for real time implementation. References [1] Amel Malek Abdelati, Jlassi bahaeddine, Abdessalem Ben Abdellali and Abdellatif Mtibaa, Suivi d objet en movement par la method Level Set variationnelle, international workshop on system engineering design & application, October 4-6, 008, Monastir, Tunisia. [] ISO/IEC 15938-3 Information Technology - Multimedia Content, Description Interface, Part 3 Visual, 001. [3] ISHIGURO, T., and IINUMA, K. Television bandwidth compression transmission by motion-compensated interframe coding, IEEE Commun. Mag., 198, 10, pp.4 30 [4] S. Jeannin and A. Divakaran, MPEG-7 visual motion descriptors, IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, pp. 70-74, 001. [5] D. A. X. Sun and B. S. Manjunath, "A Motion Activity Descriptor and Its Extraction in Compressed Domain," IEEE

Pacific-Rim Conference on Multimedia (PCM), vol. LNCS 195, pp. 450-453, October 001. [6] Yao Nie, and Kai-Kuang Ma, Adaptive Rood Pattern Search for Fast Block-Matching Motion Estimation, IEEE Trans. Image Processing, vol 11, no. 1, pp. 144-1448,December 00. [7] Aroh Barjatya, Block Matching Algorithms For Motion Estimation, Student Member, IEEE, DIP 660 Spring 004 Final Project Paper [8] A.M. Ferman and A.M. Tekalp, Two-stage hierarchical video summary extraction to match low-level user browsing preferences, IEEE transactions on multimedia, 5(5):44-56, 003. [9] Ahmet Ekin and A.Murat Tekalp, Automatic Soccer video analysis and summarization, In Proceedings of SPIE Conference on storage and Retrieval for Media Databases 003, volume 501, pages 339-350, Santa Clara, CA, 0-4 January 003. [10] Don Adjeroh, M. C. Lee, N. Banda, and Uma Kandaswamy, Adaptive Edge-Oriented Shot Boundary Detection, Hindawi Publishing Corporation EURASIP Journal on Image and Video Processing Volume 009, Article ID 859371, 13 pages doi:10.1155/009/859371, 18 May 009. [11] M. J. Swain and D. H. Ballard, Color indexing, International Journal of Computer Vision, vol. 7, no. 1, pp. 11 3, 1991. [1] A. Nagasaka and Y. Tanaka, Automatic video indexing and full-video search for object appearances, in Visual Database Systems II, Elsevier, 199. [13] H. Zhang, A. Kankanhalli, and S. W. Smoliar, Automatic partitioning of full-motion video, Multimedia Systems, vol. 1,no. 1, pp. 10 8, 1993. [14] D. A. Adjeroh and M. C. Lee, Robust and efficient transform domain video sequence analysis: an approach from the generalized color ratio model, Journal of Visual Communication and Image Representation, vol. 8, no., pp. 18 07, 1997. [15] R. Zabih, J. Miller, and K. Mai, A feature-based algorithm for detecting and classifying production effects, Multimedia Systems, vol. 7, no., pp. 119 18, 1999. [16] J. D. Courtney, Automatic video indexing via object motion analysis, Pattern Recognition, vol. 30, no. 4, pp. 607 65, 1997. [17] P. Bouthemy, M. Gelgon, and F. Ganansia, A unified approach to shot change detection and camera motion characterization, IEEE Transactions on Circuits and Systems for Video Technology, vol. 9, no. 7, pp. 1030 1044, 1999. [18] S. Dagtas, W. Al-Khatib, A. Ghafoor, and R. L. Kashyap, Models for motion-based video indexing and retrieval, IEEE Transactions on Image Processing, vol. 9, no. 1, pp. 88 101, 000. [19] Report, State of the Art of Content Analysis Tools for Video, Audio and Speech, Werner Bailer (JRS), Franz Höller (JRS), Alberto Messina (RAI), Daniele Airola (RAI), Peter Schallauer (JRS), Michael Hausenblas (JRS), 10/3/005 59