Video Key-Frame Extraction using Entropy Value as Global and Local Feature


Siddu P. Algur #1, Vivek R. *2
#, * Department of Information Science Engineering, B.V. Bhoomraddi College of Engineering and Technology, Hubli, Karnataka, INDIA
1 algursp@bvb.edu
2 vivekr321@gmail.com

Abstract: Key frames play an important role in video annotation. Key-frame extraction is one of the most widely used methods for video abstraction, since it allows a large set of video data to be processed quickly while retaining sufficient content representation. In this paper a novel approach for key-frame extraction using entropy value is proposed. The proposed approach classifies frames based on entropy values as a global feature and selects a frame from each class as a representative key-frame. It also eliminates redundant frames from the selected key-frames using entropy value as a local feature. An evaluation of the approach on several videos is presented. Results show that the algorithm successfully helps annotators identify key-frames automatically.

Keywords: Key-Frame Extraction, Entropy value.

I. INTRODUCTION

In recent years, developments in software tools for content management have made it possible to classify video content efficiently and have made search and retrieval even faster. The current research community concentrates on automating content management, overcoming the drawbacks of systems involving human interaction, namely time consumption and human subjectivity. Video segmentation and key-frame extraction are the basis of video analysis and content-based retrieval. Key-frame extraction is an essential part of video analysis and management, providing a suitable summarization for video indexing, browsing and retrieval. The use of key frames reduces the amount of data required in video indexing and provides a framework for dealing with video content.

Video can be defined as a visual representation of data.
Raw video can be structured as a sequence of scenes, scenes as sequences of shots, and shots as sequences of frames. Most of the published work exploits this structure of video for segmentation and key-frame extraction. A key frame is the frame that can represent the salient content and information of a shot. The key frames extracted must summarize the characteristics of the video, and the image characteristics of a video can be tracked through all the key frames in time sequence. Many methods have been developed for the selection of key frames. In a video retrieval application, a video sequence is subdivided in time into a set of shorter segments, each of which contains similar content. These segments are represented by representative key-frames that greatly reduce the amount of data that must be searched. However, key frames do not describe the motions and actions of objects within the segment. Selecting key frames of scenes allows us to capture most of the content variations while excluding frames that may be redundant. Choosing the first frame seems the natural choice, since all the remaining frames in the scene can be considered logical and continuous extensions of the first frame, but it may not be the best match for all the frames in the scene.

II. RELATED WORK

A basic rule of key-frame extraction is that it would rather be wrong than not enough, so it is necessary to discard frames with repetitive or redundant information during extraction [1]. Current video segmentation and key-frame extraction algorithms can be classified as temporal-based segmentation (also known as shot-based segmentation) and object-based segmentation.

A. Shot-Based Video Segmentation

Shot-based video segmentation can be considered a process of video data abstraction, in which two major steps are usually involved: temporal segmentation and key-frame extraction.
Temporal segmentation classifies a video sequence into a set of shots using one or several frame-wise features, such as color layout [2] or entropy [3], [4]. Key-frame extraction then implements data abstraction by selecting a set of key-frames. It is usually modeled as a typical clustering process that divides a shot into several clusters and selects the cluster centroids as key-frames. In [2], key-frames of each shot are extracted using the K-means method. In [5], a Gaussian mixture model (GMM) is used to model the temporal variation of color histograms in the RGB color space. Frames present in the shot are classified into several clusters based on this feature representation, and for each cluster the frame nearest to the centroid is selected as a key-frame. The number of clusters can be determined by the Bayesian Information Criterion. The main drawback of this method is that it is not able to determine the number of clusters automatically and hence fails to adapt the clustering to the video content automatically.
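As a rough sketch of this clustering style of key-frame extraction (not the authors' implementation, and not the GMM variant of [5]), frames described by feature vectors such as color histograms can be grouped with k-means and the frame nearest each centroid kept as a key-frame. The fixed k, the Euclidean distance, and the histogram features are all assumptions for illustration.

```python
import numpy as np

def kmeans_keyframes(histograms, k, iters=20, seed=0):
    """Cluster frame feature vectors with k-means and return, for each
    cluster, the index of the frame nearest to the cluster centroid."""
    rng = np.random.default_rng(seed)
    X = np.asarray(histograms, dtype=float)
    # initialise centroids from k distinct frames (fancy indexing copies)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each frame to its nearest centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # recompute centroids; keep the old one if a cluster empties
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    # key-frame of each cluster: the frame closest to its centroid
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    keyframes = [int(np.where(labels == j)[0][d[labels == j, j].argmin()])
                 for j in range(k) if np.any(labels == j)]
    return sorted(keyframes)
```

With two well-separated groups of histograms, one representative index per group is returned.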

Fig. 1: System Model for Proposed Work

B. Object-Based Segmentation

Object-based segmentation decomposes a shot into objects and background, which are usually application-dependent. Unlike shot-based segmentation, which has the frame as its basic unit, object-based segmentation can provide objects that represent a raw video at a higher semantic level. It classifies a video sequence into several objects, each of which can be considered a pattern represented by temporal and/or spatial features in the video. Current object-based segmentation methods can be classified into three types: segmentation with spatial priority, segmentation with temporal priority, and joint spatial and temporal segmentation. More recent interest is in joint spatial and temporal segmentation, owing to the nature of human vision, which recognizes salient structures jointly in the spatial and temporal domains [6]. Hence, both spatial and temporal pixel-wise features are extracted to construct a multi-dimensional feature space for object segmentation. Compared with key-frame extraction methods using frame-wise features, e.g., the color histogram, these approaches are usually more computationally expensive.

III. PROPOSED WORK

The proposed approach classifies frames based on entropy values as a global feature and selects a frame from each class as a representative key-frame. It also eliminates redundant frames from the selected key-frames using entropy value as a local feature. The system model for the proposed work is shown in Fig. 1. It has three main components: A. segmentation of video into shots, B. entropy-value-based key-frame extraction, and C. elimination of similar key-frames present in the extracted set.

A. Segmentation of Video

Video is segmented into shots based on the detection of shot boundaries using sharp-transition (cut) detection. A cut is defined as a sharp transition between one shot and the one following it. Cuts generally correspond to an abrupt change in the color and brightness pattern between two consecutive images.
The principle behind this approach is that two consecutive frames within a shot do not change significantly in their background and object content, so their overall color and brightness distributions differ little. On the other hand, if there is a dramatic change in the color and illumination of the background of a scene, it affects the color levels of the image, which indicates a change in the objects and background of the shot. Segmentation of video into shots by cut detection can be done by many methods proposed in the literature, such as histogram differences, template matching and edge change ratio. The proposed work uses a template matching algorithm to segment the video. In this method two consecutive frames of the video are compared pixel by pixel and a correlation factor between the two frames is calculated. If the correlation factor is less than a threshold of 0.9, a cut is present and the video is segmented at that point. It is observed that segmentation by template matching results in short, meaningless shots, caused by the insertion of photographic effects (e.g., fades and dissolves). These fade and dissolve shots can be merged with other shots by fixing a minimum time-frame for a shot.

B. Entropy-Value-Based Key-Frame Extraction

A novel algorithm for key-frame extraction based on the entropy value of frames is proposed in this section. The algorithm classifies the frames in a shot into different bins. Each bin represents a class of frames containing similar objects and background. In this algorithm the entropy value is used as a global feature representing the content of the frame. The centre frame of each bin is chosen as one of the key-frames for the shot. Bins with few frames, i.e., fewer than twenty, are neglected to avoid redundant frames.

Entropy value: Consider a typical frame from a video sequence, where the number of grey levels is quantized to 256. Let h_f(k) be the histogram of frame f and k the grey level, such that 0 <= k <= 2^b - 1, where b is the number of bits in which the image quantization levels can be represented. If the frame has M rows and N columns, then the probability of appearance of grey level k in the frame is

    p_f(k) = h_f(k) / (M x N)                                (1)

The entropy of the frame is defined as the sum over all grey levels of the product of the probability of appearance p_f(k) with the log of its inverse:

    En_f = sum_{k=0}^{k_max} p_f(k) * log(1 / p_f(k))        (2)

To increase the distance between the entropy bins, so that frames with varying entropy values can be classified with ease, the discrete value of the squared entropy is used (the modified entropy value):

    En_mf = round(En_f^2)                                    (3)

where En_mf is the modified entropy value of frame f with entropy value En_f. Algorithm I (key-frame extraction) computes the modified entropy value for each frame and classifies the frames accordingly. New bins are initialized as new entropy values are computed.

C. Elimination of Similar Key-Frames

It is observed that the same object and background often repeat in different shots of a video, e.g.
a news reader narrating a news story. This leads to one or more redundant key-frames. To eliminate these redundant key-frames, a filtering step is carried out, wherein each key-frame is compared with every other key-frame to find duplicate or near-duplicate frames.

ALGORITHM I: KEY-FRAME EXTRACTION
Input:  shot S, an array of frames S[fr_0 ... fr_n]
Output: representative frames of the shot S

EnBins[] := 0          // array holding modified entropy values
FrameCount[] := 0      // array holding the number of frames in each bin
EnFrameInBin[][] := 0  // array holding the frame numbers for each bin
BinCount := 0          // number of bins

for all i such that 0 <= i <= n do
    fr := S[i]
    En := CalculateEntropy(fr)
    En_M := round(En^2)              // modified entropy value
    found := false
    for j := 0 to BinCount - 1 do
        if En_M = EnBins[j] then
            EnFrameInBin[j][FrameCount[j]] := FrameNo(fr)
            FrameCount[j] := FrameCount[j] + 1
            found := true
    if found = false then
        EnBins[BinCount] := En_M
        EnFrameInBin[BinCount][0] := FrameNo(fr)
        FrameCount[BinCount] := 1
        BinCount := BinCount + 1

for i := 0 to BinCount - 1 do
    if FrameCount[i] > 20 then
        SaveFrame(EnFrameInBin[i][FrameCount[i] / 2])

To determine the similarity between two frames, a segmented entropy technique is applied: each frame is divided into 64 segments and the entropy of each segment is calculated individually. In this technique, entropy is used as a local feature. Variation between two frames can then be monitored at segment level, which yields more accurate identification than the entropy of the whole frame. To measure the dissimilarity of two frames, the standard deviation of the difference of their segmented entropies is computed. If the standard deviation is close to zero, the two frames are considered similar and the second frame is eliminated as a duplicate. Consider two frames M and N; both frames are divided into 64 equal segments and the entropies of all segments are computed.
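As an illustrative sketch (not the authors' code), the entropy computations, the bin-based key-frame selection of Algorithm I, and the segmented-entropy duplicate test described above might look as follows. The function names, the 8x8 layout of the 64 segments, and the base-2 logarithm are assumptions, since the paper does not fix them.

```python
import numpy as np

def frame_entropy(gray):
    """Entropy of a greyscale frame: p_f(k) = h_f(k) / (M*N) and
    En_f = sum_k p_f(k) * log(1 / p_f(k)). Log base 2 is assumed."""
    hist = np.bincount(gray.ravel(), minlength=256)  # h_f(k), 256 grey levels
    p = hist / gray.size                             # probability of each level
    p = p[p > 0]                                     # zero-probability levels contribute 0
    return float(np.sum(p * np.log2(1.0 / p)))

def modified_entropy(gray):
    """Modified entropy value En_mf = round(En_f ** 2), spreading bins apart."""
    return int(round(frame_entropy(gray) ** 2))

def extract_keyframes(frames, min_frames=20):
    """Algorithm I: bin frames by modified entropy; the centre frame of
    every bin holding more than min_frames frames is kept as a key-frame."""
    bins = {}                                  # modified entropy -> frame indices
    for idx, fr in enumerate(frames):
        bins.setdefault(modified_entropy(fr), []).append(idx)
    return [members[len(members) // 2]         # centre frame of the bin
            for members in bins.values() if len(members) > min_frames]

def segment_entropies(gray, grid=8):
    """Entropy as a local feature: entropy of each of the 64 segments
    (an 8x8 grid is assumed)."""
    M, N = gray.shape
    return np.array([frame_entropy(gray[r * M // grid:(r + 1) * M // grid,
                                        c * N // grid:(c + 1) * N // grid])
                     for r in range(grid) for c in range(grid)])

def dissimilarity(frame_m, frame_n):
    """Standard deviation of the segment-wise entropy differences; a value
    near zero marks frame_n as a duplicate of frame_m."""
    diff = segment_entropies(frame_n) - segment_entropies(frame_m)
    return float(np.sqrt(np.mean((diff - diff.mean()) ** 2)))
```

A uniform frame has entropy zero, and the dissimilarity of a frame with itself is exactly zero, so any positive tolerance on the standard deviation identifies exact duplicates.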
Here En_M(s_1, s_2, ..., s_64) denotes the set of segment entropies of frame M and En_N(s_1, s_2, ..., s_64) the set of segment entropies of frame N, each obtained using (1) and (2). The difference Diff(s_1, s_2, ..., s_64) between the entropies of corresponding segments of frames M and N is computed as

    Diff(s_i) = En_N(s_i) - En_M(s_i)                        (4)

The standard deviation of the entropy differences, which signifies the dissimilarity between the two frames, is then

    SD = sqrt(Variance)                                      (5)

    Variance = (1/64) * sum_{i=1}^{64} (Diff(s_i) - mean)^2  (6)

IV. EXPERIMENTAL RESULTS

The algorithm was implemented in the OpenCV workspace and compared against the Entropy Difference algorithm [3]. The entropy difference algorithm has itself been compared with five other key-frame extraction techniques, Pair-wise Pixel (P P), Chi-square Test (X T), Likelihood Ratio (L R), Histogram Comparison (H C) and Consecutive Frame Difference (C f d); the details of those experiments can be found in [3]. In all the experiments reported in this section, the video streams are in AVI format, with frame rates varying from 23 frames/sec to 30 frames/sec. To validate the effectiveness of the proposed algorithm, representative videos from news and movies were tested. The selected videos contain action (Lord of the Rings, Star Trek), conversation (news videos) and inserted graphics (news videos). The video lengths vary from 1 minute to 4 minutes. All videos were first watched in their entirety and key-frames were picked manually by a human judge. These key-frames serve as the standard against which the two algorithms were compared, and percentage accuracy was calculated according to how many correct key-frames each algorithm managed to identify.

Table I shows a comparison of the key-frames extracted from the video sequences. It also allows a comparison of how efficiently the key-frames are extracted in terms of redundant frames and missing frames.

TABLE I: COMPARISON OF THE KEY-FRAMES EXTRACTED FROM FIVE SAMPLE VIDEO SEQUENCES USING ENTROPY VALUE

                                                         Entropy difference              Proposed Algorithm
Video                     Total    Manually identified   Key frames  Redundant  Missing  Key frames  Redundant  Missing
                          frames   key frames            identified  frames     frames   identified  frames     frames
English News              3775     22                    14          0          8        23          3          2
Star Trek                 2001     29                    21          3          11       28          0          1
Lord of the Rings         4563     119                   199         101        21       109         7          17
Lord of the Rings movie   4002     67                    45          0          22       85          24         6
Hindi news with graphics  2184     36                    17          3          22       28          9          17

The experimental results are shown in Fig. 2, and Table II shows the deviation of each algorithm from the standard. The number of redundant frames identified by the proposed approach is comparatively lower than that of the entropy difference algorithm. The proposed algorithm is able to detect the presence of transient changes.
On the other hand, when there are inserted graphics in the video, the performance of the algorithm is lower: the number of redundant frames among the identified key-frames is comparatively higher than for the other video sequences.

V. CONCLUSION

In this paper, a novel approach for automatic key-frame extraction has been presented. The proposed algorithm performs very well when the image background is distinguishable from the objects and when there are abrupt (cut) changes between shots. The algorithm also works, with some performance loss, when the video sequence contains transient changes and inserted graphics. The main advantages of the proposed algorithm are that data loss during key-frame extraction is minimal (the number of missing frames is low) and that high compactness is achieved (measured as the number of key-frames identified divided by the total frames present in the video), which is an essential property of video abstraction methods.

Fig. 2: Deviation from manual identification for the entropy difference algorithm and the proposed algorithm on the five test videos.

TABLE II: DEVIATION FROM STANDARD (MANUALLY SELECTED) KEY-FRAMES

Video                     Entropy difference  Proposed Algorithm
English News              0.37                0.09
Star Trek movie           0.38                0.03
Lord of the Rings         0.17                0.14
Lord of the Rings movie   0.33                0.08
Hindi news with graphics  0.61                0.47
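For concreteness, the compactness figure defined in the conclusion (number of key-frames identified divided by the total frames in the video) can be computed from the proposed-algorithm counts in Table I; this small calculation is illustrative only.

```python
def compactness(keyframes_identified, total_frames):
    """Compactness as defined in the conclusion: number of key-frames
    identified divided by the total frames present in the video."""
    return keyframes_identified / total_frames

# Proposed-algorithm counts from Table I: (key-frames identified, total frames)
table1 = {
    "English News": (23, 3775),
    "Star Trek": (28, 2001),
    "Hindi news with graphics": (28, 2184),
}
for video, (kf, total) in table1.items():
    # a small ratio means the summary is highly compact
    print(f"{video}: {compactness(kf, total):.4f}")
```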

REFERENCES

[1] Guozhu Liu and Junming Zhao, "Key Frame Extraction from MPEG Video Stream," Third International Symposium on Information Processing (ISIP), pp. 423-427, October 2010.
[2] Damian Borth, Adrian Ulges, Christian Schulze, and Thomas M. Breuel, "Keyframe Extraction for Video Tagging and Summarization," Proceedings of Informatiktage, 2008.
[3] Markos Mentzelopoulos and Alexandra Psarrou, "Key-Frame Extraction Algorithm using Entropy Difference," Proceedings of the 6th ACM SIGMM International Workshop on Multimedia Information Retrieval, October 2004.
[4] T. Liu, H.-J. Zhang, and F. Qi, "A novel video key-frame-extraction algorithm based on perceived motion energy model," IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 10, pp. 1006-1013, Oct. 2003.
[5] R. Hammoud and R. Mohr, "A probabilistic framework of selecting effective key frames for video browsing and indexing," International Workshop on Real-Time Image Sequence Analysis, 2000.
[6] Xiaomu Song and Guoliang Fan, "Joint Key-Frame Extraction and Object-Based Video Segmentation," IEEE Workshop on Motion and Video Computing (WACV/MOTION'05), vol. 2, pp. 126-131, 2005.
[7] John S. Boreczky and Lynn D. Wilcox, "A Hidden Markov Model Framework for Video Segmentation Using Audio and Image Features," Proceedings of the IEEE International Conference on Multimedia and Expo, pp. 532-535, July 2007.