A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

Borko Furht and Pornvit Saksobhavivat
NSF Multimedia Laboratory, Florida Atlantic University, Boca Raton, Florida 33431

ABSTRACT

In this paper, we present a novel technique that can be used for fast similarity-based indexing and retrieval of both image and video databases in distributed environments. We assume that image or video databases are stored in compressed form using standard techniques such as JPEG for images, and M-JPEG or MPEG for videos. The existing techniques proposed in the literature use computationally intensive features and cost functions for content-based image and video retrieval and indexing. The proposed algorithm uses an innovative approach based on histograms of DC coefficients only, and is therefore computationally less expensive than the other approaches.

In the case of a JPEG-compressed image database, the query process is the following. The user submits a request for search-by-similarity by presenting the desired image. The algorithm calculates the DC coefficients of this image and creates the histogram of DC coefficients. Then, the algorithm compares the DC histogram of the submitted image with the DC histograms of the images stored in the database using a histogram similarity metric. The image database can be local or at a remote server. In our experiments, we compared several histogram similarity metrics: weighted Euclidean distance, square difference, and absolute difference. The algorithm then selects and presents to the user the images with the smallest values of the metric, that is, the images that best match the submitted image.

In the case of a compressed video database, similarity-based indexing and retrieval is more complex. The manipulation of a video database consists of three main operations: (1) partitioning of the video into clips, (2) key frame extraction, and (3) indexing and retrieval of key frames. The proposed algorithm has been applied in all three steps. First, the DC histograms are used to partition each video into clips or camera shots. Then, in the next phase, the same DC histograms are used to extract key frames and create a database of key frames only. Finally, in the last step, the user submits one or more video frames that he/she is searching for.

We implemented the described algorithm for similarity-based retrieval on both image and video databases. The experimental results presented in the paper show that the proposed algorithm can be very efficient for similarity-based search of images and videos in distributed environments, such as the Internet, intranets, or local-area networks.

Keywords: content-based retrieval and indexing, multimedia databases, DC coefficients, histogram of DC coefficients.

1. INTRODUCTION

There are two main approaches to indexing and retrieval of images and videos in multimedia databases: (a) keyword-based indexing and (b) content-based indexing. Keyword-based indexing uses keywords or descriptive text, which is stored together with the images and videos in the databases. Retrieval is performed by matching the query, given in the form of keywords, with the stored keywords. This approach is not satisfactory, because the text-based description tends to be incomplete, imprecise, and inconsistent in specifying visual information.

To overcome this problem, recent research has focused on content-based indexing and retrieval techniques [1,2,3,4,5]. This approach allows users to index and retrieve images and videos from databases using visual content (such as prominent regions, color, shape, size, and texture), motion-related information (movement of objects, enlarging or shrinking, and global camera operation), and similarity. The current techniques proposed in the literature mostly deal with uncompressed multimedia objects (images and videos). Several techniques have been proposed for shot detection and segmentation of compressed video [3,4,5]. These techniques use block comparison metrics, which measure the differences between DCT coefficients of blocks in two frames.

In this paper, we present a technique for content-based image and video indexing and retrieval, which uses histograms of DC coefficients. We assume that images and videos in multimedia databases are stored in compressed form (JPEG for images, or MPEG and M-JPEG for videos). We propose a fast retrieval and indexing algorithm that can be used very efficiently for content-based search on the Internet. The fundamental idea of the new algorithm consists of using only the histograms of DC coefficients of the stored JPEG images, or of the I-frames in the case of compressed MPEG or M-JPEG video. The experiments show that the histogram of DC coefficients is a very distinguishing characteristic of an image and can be effectively used for image or video retrieval and indexing. In addition, the calculation of the histogram of DC coefficients and the related cost functions turns out to be very fast and does not require computationally intensive algorithms.

2. AN ALGORITHM FOR SIMILARITY-BASED RETRIEVAL OF IMAGES

The JPEG encoding standard for full-color images is based on the DCT. An image is divided into 8x8 blocks, and the pixels of each block are transformed from the spatial to the frequency domain. The transformed 64-point discrete signal is a function of two spatial dimensions, and its components are called spatial frequencies or DCT coefficients.
The F(0,0) coefficient is called the DC coefficient, and the remaining 63 coefficients are called AC coefficients. For color images represented in YUV or YCbCr format, the DCT is applied to all three components. The proposed algorithm is based on DC coefficients that are calculated only for the Y (luminance) component. There are two reasons for this decision: (1) the human visual system is more sensitive to Y than to the two chrominance components, and (2) the JPEG and MPEG standards typically use a higher sampling density for Y than for the other two components.

Histogram of DC Coefficients

The pixels of the original Y component in the spatial domain are coded with 8 bits. After the DCT, however, the DC coefficients of the Y component require 11 bits; the DC coefficients are in the range [-1,024 to +1,023]. The histogram of DC coefficients can now be created. For illustration purposes, the DC histogram is created for the image elephant, which consists of 600x800 pixels. The image contains 75x100 blocks of 8x8 pixels, which gives 7,500 DC coefficients. The histogram of DC coefficients is shown in Figure 1. The number of histogram bins in this example is 2,048, which corresponds to all values of DC coefficients in the range [-1,024 to +1,023]. However, the histogram of DC coefficients can be reduced to a smaller number of bins: 1,024, 512, or 256. A histogram with fewer bins requires less computation when the histogram similarity metric is calculated.

Histogram Similarity Metrics

Histogram similarity metrics are used to compare the DC histogram of a given image with the histograms of compressed images from the database. We analyzed three histogram-comparison metrics: (1) Weighted Euclidean Distance, (2) Square Difference, and (3) Absolute Difference. These three metrics are defined next. Let us denote the j-th histogram bin value of a query image as $H_Q(j)$, and the j-th histogram bin value of an image in the database as $H_D(j)$. Then, the Weighted Euclidean Distance (WED) metric is defined as

$$\mathrm{WED} = \sum_{j=1}^{N} w_j \, [H_Q(j) - H_D(j)]^2$$

where N is the total number of histogram bins, and $w_j$ is the weight of bin j, defined as

$$w_j = \begin{cases} H_Q(j) & \text{if } H_Q(j) > 0 \\ 1 & \text{otherwise} \end{cases}$$

Figure 1. Histogram of DC coefficients for the image elephant.

The Square Difference (SD) metric is defined as

$$\mathrm{SD} = \sum_{j=1}^{N} [H_Q(j) - H_D(j)]^2$$

and the Absolute Difference (AD) metric as

$$\mathrm{AD} = \sum_{j=1}^{N} |H_Q(j) - H_D(j)|$$

The complexity of all three metrics depends on the number of histogram bins. Our experiments have shown that the metrics based on 512 bins perform quite well, and not much worse than with 2,048 bins.

Example of Similarity-Based Retrieval of an Image Database

In the following example, we compared the efficiency of the three metrics in retrieving compressed images from an image database. We created an image database, which consists of [...]00 images. We performed the experiments for different numbers of histogram bins: 2,048, 1,024, 512, and 256. The user submits a request for search by similarity by presenting the desired image to the algorithm. The algorithm calculates the DC coefficients of this image. Then, one of the histogram similarity metrics is calculated to compare the DC histogram of the submitted image with the DC histograms of the images stored in the database. Finally, the algorithm presents to the user the set of images with the smallest values of the histogram similarity metric. The whole query process takes only a few seconds. For illustration, the results of a query by similarity are presented in Table 1 and Figure 2. Figure 2 shows the best matches among the compressed images based on the absolute difference metric.

Table 1. Results of Retrieving Image elephant.jpg from the Image Database

IMAGE NAME          WED     IMAGE NAME          SD      IMAGE NAME      AD
Elephant.jpg        0.30    Elephant.jpg        0       Elephant.jpg    0
Elephant3.jpg       650     Elephant3.jpg       9.35    Elephant3.jpg   0.5
Oregeon-sunset.jpg  45      Icefield.jpg        8.8     Elephant.jpg    0.83
Icefield.jpg        508     Oregeon-sunset.jpg  8.99    Flower3.jpg     .04
Icefield.jpg        53      Namess.jpg          9.87    Goat.jpg        .07
Chamber.jpg         546     Icefield.jpg        9.96    Flower7.jpg     .09
Namess.jpg          548     Namess3.jpg         30.     Surf.jpg        .
Porcelain.jpg       568     Namess4.jpg         3.09    Flower6.jpg     .
Woman.jpg           573     Namess6.jpg         3.44    Flower4.jpg     .6
Namess3.jpg         583     Lake-goat.jpg       3.3     Sd5.jpg         .0

The following conclusions can be drawn from these experiments:

All three metrics gave good results in similarity-based retrieval, but the absolute difference metric seems to be the most reliable.

Reducing the number of histogram bins from 2,048 to 1,024 was beneficial. First, it reduced the number of operations needed for the calculation of the similarity metrics. Second, the smaller number of bins reduced the sensitivity of indexing to quantization noise. However, when the number of bins was further reduced to 512 and 256, the indexing results deteriorated.
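The DC histogram described in this section can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: it assumes the luminance plane is already available as a decoded 8-bit NumPy array and that pixels are level-shifted by 128 before the transform, as in baseline JPEG, so that F(0,0) equals one eighth of the block sum. In a real system the DC coefficients would be read directly from the compressed stream. The function names `dc_coefficients`, `dc_histogram`, and `reduce_bins` are hypothetical.

```python
import numpy as np

def dc_coefficients(y, block=8):
    """Return one DC value per 8x8 block of a 2-D uint8 luminance plane.

    Assumption: pixels are level-shifted by 128 (baseline JPEG), so the
    DC term of the 8x8 DCT is (1/8) * sum of the shifted block.
    """
    h, w = y.shape
    h, w = h - h % block, w - w % block           # ignore partial edge blocks
    shifted = y[:h, :w].astype(np.int32) - 128
    blocks = shifted.reshape(h // block, block, w // block, block)
    return blocks.sum(axis=(1, 3)) / 8.0

def dc_histogram(y, bins=2048):
    """Histogram of DC coefficients over the value range [-1024, 1024)."""
    hist, _ = np.histogram(dc_coefficients(y), bins=bins, range=(-1024, 1024))
    return hist

def reduce_bins(hist, factor):
    """Merge every `factor` adjacent bins, e.g. 2048 -> 512 with factor=4."""
    return hist.reshape(-1, factor).sum(axis=1)
```

For a 600x800 luminance plane, `dc_coefficients` returns a 75x100 array, matching the 7,500 DC coefficients quoted for the elephant image.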

Figure 2. Example of similarity-based retrieval using the DC histogram and the absolute difference metric.
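The three metrics and the query step can be sketched as follows. This is an illustrative sketch only: the weight rule for WED (the query bin value for non-empty query bins, 1 otherwise) is read from the definition given earlier in this section, and the database is assumed to be a plain dictionary mapping image names to precomputed histograms. The names `wed`, `sd`, `ad`, and `rank_by_similarity` are hypothetical.

```python
import numpy as np

def wed(hq, hd):
    """Weighted Euclidean Distance; weight = query bin value, or 1 if empty."""
    hq, hd = np.asarray(hq, dtype=float), np.asarray(hd, dtype=float)
    w = np.where(hq > 0, hq, 1.0)
    return float(np.sum(w * (hq - hd) ** 2))

def sd(hq, hd):
    """Square Difference: sum of squared bin differences."""
    hq, hd = np.asarray(hq, dtype=float), np.asarray(hd, dtype=float)
    return float(np.sum((hq - hd) ** 2))

def ad(hq, hd):
    """Absolute Difference: sum of absolute bin differences."""
    hq, hd = np.asarray(hq, dtype=float), np.asarray(hd, dtype=float)
    return float(np.sum(np.abs(hq - hd)))

def rank_by_similarity(query_hist, database, metric=ad, top=10):
    """Return the `top` (score, name) pairs with the smallest metric values."""
    scores = [(metric(query_hist, hist), name) for name, hist in database.items()]
    return sorted(scores)[:top]
```

All three functions cost O(N) in the number of bins, which is why reducing the bin count directly reduces query time.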

3. AN ALGORITHM FOR SIMILARITY-BASED RETRIEVAL OF COMPRESSED VIDEOS

In the case of compressed video databases, the procedure is more complex. The manipulation of a video database consists of three main operations: (1) partitioning of the video into clips, (2) key frame extraction, and (3) indexing and retrieval of key frames. The first two steps are typically performed off-line during the feature extraction phase, while the last step is performed in real time. The proposed algorithm, based on DC histograms, can be applied in all three steps. First, the DC histogram is used to partition each video into clips or camera shots. Then, in the next phase, the same DC histogram is used to extract key frames and create a database of key frames only. Finally, in the last step, the user submits one or more video frames that he/she is searching for. The algorithm is capable of searching through the video database (key frames only) and retrieving the most similar frames or clips.

Video Partitioning

The histogram of DC coefficients can successfully be used to partition video by detecting camera breaks. First, let us consider M-JPEG compressed video, where all frames are I-frames. In this case, we use DC histograms to compare subsequent frames and detect camera breaks. To minimize the computational complexity, the range of DC coefficients is reduced to [-256, +255] by using the following formula:

$$F(0,0) = \frac{1}{32} \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y)$$

where F(0,0) is a DC coefficient, and f(x,y) is a pixel value of the Y component in an 8x8 block.

To test the similarity of histograms of subsequent frames from the same clip, we performed several experiments with the standard video clips Football, Miss America, and Susie. The results, presented in Figure 3a-c, show two DC histograms for each clip, the histogram of frame 0 and frame 8. In all three cases the histograms of these two frames are almost identical. Then, we compared the DC histograms of different clips. Figure 4 compares the DC histograms of frame 0 for these three clips. It shows that the histograms of these three frames are significantly different.
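As a quick check of the reduced range, assume that the luminance samples are level-shifted to [-128, 127] before the transform, as in baseline JPEG (the text does not state this explicitly), and read the scaling factor as 1/32. A uniform block at either extreme then gives:

$$F_{\min}(0,0) = \frac{64 \cdot (-128)}{32} = -256, \qquad F_{\max}(0,0) = \frac{64 \cdot 127}{32} = 254$$

Both values fall inside the stated range [-256, +255], which corresponds to 512 possible histogram levels instead of the 2,048 levels of the full-precision DC coefficient.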
In order to detect camera breaks, we define the normalized square difference metric (NSD):

$$\mathrm{NSD}_i = \sum_{j=1}^{N} \frac{[H_i(j) - H_{i-1}(j)]^2}{H_i(j)}$$

where $\mathrm{NSD}_i$ is the normalized square difference metric for frame i, $H_i(j)$ are the DC histogram values for the i-th frame, and j is one of the N possible histogram levels. If the overall difference is greater than a given threshold T, a camera break is declared.
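The thresholding step can be sketched as below. This is a hedged illustration, not the authors' code: the exact normalization of NSD is an assumption of this sketch (squared bin difference divided by the current frame's bin value, with empty bins skipped to avoid division by zero), and the function names `nsd` and `detect_camera_breaks` are hypothetical.

```python
import numpy as np

def nsd(h_cur, h_ref):
    """Normalized square difference between two DC histograms.

    Assumption: each squared bin difference is divided by the current
    frame's bin value; bins that are empty in the current frame are skipped.
    """
    h_cur = np.asarray(h_cur, dtype=float)
    h_ref = np.asarray(h_ref, dtype=float)
    nonzero = h_cur > 0
    diff2 = (h_cur[nonzero] - h_ref[nonzero]) ** 2
    return float(np.sum(diff2 / h_cur[nonzero]))

def detect_camera_breaks(histograms, threshold):
    """Return every frame index i whose NSD against frame i-1 exceeds the threshold."""
    return [i for i in range(1, len(histograms))
            if nsd(histograms[i], histograms[i - 1]) > threshold]
```

Each returned index marks the first frame of a new shot, so a sequence of three clips should yield two indices.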

Figure 3. Histogram of DC coefficients of frames 0 and 7 for video clips: (a) Football, (b) Miss America, and (c) Susie.

Figure 4. DC histogram comparison of frame 0 for three video clips (Miss America, Football, and Susie).

To test the proposed technique, we applied it to a composite video consisting of three clips, each containing 8 frames. The results of the video partitioning experiment are presented in Figure 5. The algorithm was able to correctly detect both camera breaks. The threshold used in the experiment was T = 0.[...].

Figure 5. DC histogram comparison technique applied to video partitioning (NSD x 100 [%] versus frame number, with the threshold and the detected camera breaks marked).

For a video database compressed using the MPEG technique, the video partitioning uses a two-pass approach [3]. In the first pass, the proposed technique, based on DC histograms, is applied to I-frames only. For example, for an MPEG sequence {IBBPBBPBB}{IBBPBBPBPP}, etc., the algorithm will detect the camera breaks that occurred between I-frames. In the second pass, a technique based on motion vectors [7] is applied to locate the camera break within those sequences that were flagged in the first pass.
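The first pass of the two-pass approach can be sketched as follows. This is a simplified illustration under stated assumptions: the frame types are given as a string such as "IBBPBBPBB...", one DC histogram is available per I-frame, and the comparison metric is passed in as a callable. The names `i_frame_indices` and `first_pass_candidates` are hypothetical, and the second pass based on motion vectors is not shown.

```python
def i_frame_indices(frame_types):
    """Positions of the I-frames in a frame-type string such as 'IBBPBBPBB'."""
    return [k for k, t in enumerate(frame_types) if t == "I"]

def first_pass_candidates(frame_types, i_histograms, threshold, metric):
    """Return (start, end) frame ranges between consecutive I-frames whose
    DC histograms differ by more than `threshold`.

    `i_histograms[k]` is the histogram of the k-th I-frame; `metric` is any
    function taking (current, previous) histograms and returning a number.
    """
    idx = i_frame_indices(frame_types)
    candidates = []
    for k in range(1, len(idx)):
        if metric(i_histograms[k], i_histograms[k - 1]) > threshold:
            candidates.append((idx[k - 1], idx[k]))   # break lies in this range
    return candidates
```

Only the returned ranges need to be examined in the second pass, which keeps the more expensive motion-vector analysis away from most of the video.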

Key Frame Extraction

In the next step, the key frames are extracted from the video segments identified in the first step. The DC histogram comparison technique is used in this step as well. However, the similarity metric is now defined as the accumulated difference between the current frame and the previous key frame:

$$\mathrm{NSD}_i = \sum_{j=1}^{N} \frac{[H_i(j) - H_{KF}(j)]^2}{H_i(j)}$$

where $H_{KF}(j)$ is the j-th histogram bin value of the DC histogram of the previous key frame. The first frame in a video clip is always declared as the first key frame. Then, the other frames are compared to this frame. When the difference becomes greater than the threshold T, the current frame is declared as the next key frame. The following frames are then compared to this key frame. Figure 6 illustrates the procedure for extracting key frames. Note that the threshold used for key frame extraction, T = 0.[...], is smaller than the threshold used for video partitioning. The described process is applied to I-frames only.

Figure 6. DC histogram comparison technique for extraction of key frames (accumulated NSD x 100 [%] versus frame number, with the threshold and the extracted key frames marked).

In the example in Figure 6, the video clip comprised three sequences: Football, Miss America, and Susie, each consisting of 8 frames. The algorithm extracted four key frames.
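The key frame procedure above can be sketched as a single loop. As with the other sketches, this is an illustration rather than the authors' code: the NSD normalization (division by the current frame's bin value, skipping empty bins) is an assumption, and the names `nsd` and `extract_key_frames` are hypothetical.

```python
import numpy as np

def nsd(h_cur, h_ref):
    """Normalized square difference; empty current-frame bins are skipped
    (an assumption made here to avoid division by zero)."""
    h_cur = np.asarray(h_cur, dtype=float)
    h_ref = np.asarray(h_ref, dtype=float)
    nonzero = h_cur > 0
    return float(np.sum((h_cur[nonzero] - h_ref[nonzero]) ** 2 / h_cur[nonzero]))

def extract_key_frames(histograms, threshold):
    """Return the indices of the key frames of one clip.

    The first frame is always a key frame; each later frame is compared
    with the most recent key frame and becomes the next key frame when the
    difference exceeds `threshold`.
    """
    if len(histograms) == 0:
        return []
    key_frames = [0]
    for i in range(1, len(histograms)):
        if nsd(histograms[i], histograms[key_frames[-1]]) > threshold:
            key_frames.append(i)
    return key_frames
```

Comparing against the last key frame rather than the previous frame lets slow, gradual changes accumulate until they cross the threshold, which is why a lower threshold than in partitioning is appropriate.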

Indexing and Retrieval of Key Frames

Finally, in the last step, the DC histogram technique is applied to the similarity-based search of the extracted key frames. The set of key frames extracted in the previous step comprises a key-frame database, and the search is now performed on key frames only. This step is equivalent to the similarity-based retrieval of image databases described in Section 2. In our experiment, we created a database of key frames and applied the proposed algorithm to retrieve frames that are similar to a given frame. The results are shown in Figure 7.

Figure 7. Example of similarity-based retrieval of key frames using DC histograms.

4. CONCLUSION

We presented an algorithm for similarity-based indexing and retrieval of image and video databases. The proposed algorithm is based on DC histograms of compressed images and video frames. We analyzed several histogram similarity metrics in order to select the most efficient one. The algorithm has been tested on a small compressed image database as well as on several video sequences. In summary, the proposed algorithm can be very efficient for similarity-based retrieval of images and videos in distributed environments, such as the Internet, intranets, or local-area networks.

REFERENCES

1. H.J. Zhang, S.Y. Tan, S.W. Smoliar, and Y. Gong, Automatic Parsing and Indexing of News Video, Multimedia Systems, Vol. 2, No. 6, pp. 55-64, 1995.
2. E. Ardizzone and M. La Cascia, Automatic Video Database Indexing and Retrieval, Journal of Multimedia Tools and Applications, Vol. 4, pp. 29-56, 1997.
3. B. Furht, S.W. Smoliar, and H.J. Zhang, Video and Image Processing in Multimedia Systems, Kluwer Academic Publishers, Norwell, MA, 1995.
4. F. Arman et al., Content-Based Browsing of Video Sequences, Proc. of ACM Multimedia '94, San Francisco, CA, October 1994.
5. H.J. Zhang et al., Video Parsing Using Compressed Data, Proc. SPIE '94 Symposium on Image and Video Processing, San Jose, CA, pp. 142-149, February 1994.
6. B.-L. Yeo and B. Liu, A Unified Approach to Temporal Segmentation of Motion JPEG and MPEG Compressed Video, Proc. IEEE International Conference on Multimedia Computing and Networking, Washington D.C., pp. 81-88, May 1995.
7. A. Akutsu et al., Video Indexing Using Motion Vectors, Proc. of SPIE '92 Symposium on Communications and Image Processing, Boston, MA, pp. 1522-1530, November 1992.