Video Summarization and Browsing Using Growing Cell Structures


Irena Koprinska and James Clark
School of Information Technologies, The University of Sydney
Madsen Building F09, Sydney NSW 2006, Australia
{irena,

Abstract

We present a new approach for video summarization and browsing of MPEG-2 compressed video based on the Growing Cell Structures (GCS) neural algorithms. It first applies GCS to select keyframes for each shot and then clusters them using TreeGCS to form a hierarchical view of the video for efficient browsing. The keyframe selection is based on histogram features of the dc-images of I frames. It captures the video content well and outperforms two other approaches. The main advantage of the TreeGCS module is its ability to dynamically form a flexible hierarchy that depends on the video content.

I. INTRODUCTION

Recent developments in computing performance, multimedia compression and communication technologies have made possible the creation of digital video archives. Applications such as video-on-demand, digital TV and digital libraries generate and use large collections of video data. It is also expected that the storage of digital video at home will soon overtake the current analogue systems. However, unlike document databases, which use keywords to access data quickly, video databases still lack techniques for efficient organization, searching and retrieval. Text-based video organization based on manual annotation is highly inefficient, subjective and time consuming.

Recently, some content-based prototype systems [1,3,11,12] have been developed to organize video automatically in order to provide fast and meaningful nonlinear access to the relevant material. The generally accepted approach first breaks up the video stream into temporally homogeneous segments called shots [9]. Each shot is then represented by one or more keyframes. The shots are typically indexed using spatial features extracted from the keyframes (e.g. color, texture, shape) and temporal features extracted from the shot (e.g. motion, camera operations), and are organised into a sequential or hierarchical structure of keyframes. This representation allows the user to browse the content of the video and search quickly for sub-sequences of interest without having to watch the entire video. The user can also query the video database. The retrieval is based on the similarity between the feature vector of the query and the feature vectors representing the shots.

Clustering has been successfully applied to both keyframe selection and video organization. Ferman et al. [4] cluster the frames within each shot using an iterative 3-means algorithm and select as a keyframe the frame closest to the centroid of the larger cluster. As the content of the shot may change significantly due to camera operations and object motion, in their subsequent work [3] the clustering algorithm is modified to extract more than one keyframe. Two keyframes are extracted for clusters with high intercluster variance (the closest to and the farthest from the centroid). Frames with large deviations from the average luminance of the shot are selected as keyframes as well. Girgensohn and Boreczky [7] applied agglomerative hierarchical clustering to extract a predefined number of keyframes and used them to represent videotaped meetings and presentations. Drew and Au [2] proposed a new feature based on color histograms and then applied an agglomerative hierarchical clustering that merges clusters based on cluster variance and temporal distance. Yeung et al. [12] select one keyframe for each shot and then cluster the keyframes based on visual content and temporal distance to create a scene-transition graph. In ViBE [1] each shot is represented by a tree induced by hierarchical clustering of the frames. The video is then organised into a three-level similarity pyramid where each level contains groups of similar shots organized into a two-dimensional grid.
The pyramid is created by clustering the shots based on temporal, motion, pseudo-semantic and shot-tree distance.

In this paper we present a new approach for keyframe selection and video browsing based on the Growing Cell Structures (GCS) neural algorithm. We use GCS to select keyframes representing the content of each shot. GCS finds the number of keyframes in an unsupervised fashion and also maps similar frames to neighboring nodes, offering a better indication of the shot's structure. TreeGCS is then used to cluster the shots and provide a high-level hierarchical video representation. The main advantage over existing systems is that it creates a flexible hierarchical representation of the video content, i.e. the number of layers in the hierarchy and the number of clusters in each layer depend on the video content and do not have to be specified in advance. In addition, similar clusters (of frames or shots) are mapped onto neighboring nodes, which makes browsing more convenient. Our system operates directly on MPEG-2 compressed video, which allows faster operation and smaller storage requirements.

II. GROWING CELL STRUCTURE ALGORITHMS

A. GCS

GCS [5] is an incremental self-organizing neural algorithm, an extension of Kohonen's self-organising maps (SOM) [10]. It generates a mapping from high-dimensional input data to a lower-dimensional (typically two-dimensional) space. The main advantage of such a mapping is that it gives insight into the structure of the data due to two important properties: topology preservation (similar inputs are mapped onto neighboring neurons) and density preservation (regions of high input density are mapped onto neural structures with more neurons). An important advantage over SOM and most of the classical clustering algorithms (e.g. k-means) is that GCS is able to find a suitable network size and structure automatically, i.e. it does not require the number of clusters to be specified in advance. This is achieved through a process of controlled growing and removal of nodes. Unlike SOM, in GCS the number of neighboring neurons connected to a given neuron is not fixed. Finally, GCS is able to form discrete clusters, while in SOM the clusters remain connected and their boundaries are not always easy to find.

The GCS algorithm we implemented starts with a randomly initialised triangle of neurons. At each iteration the best matching unit and its topological neighbours are adapted toward the input vector. There is no "cooling schedule" as in SOM, where neighbourhood size and learning rate decrease with time.
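The adaptation step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `adapt_step`, the neighbour dictionary, the squared-error accumulator (in the style of Fritzke's original GCS) and the neighbour learning rate `e_i=0.002` are all assumptions made for the example; only `e_bmu=0.06` is taken from the parameter list reported in the experiments.

```python
import numpy as np

def adapt_step(weights, neighbours, errors, x, e_bmu=0.06, e_i=0.002):
    """One GCS-style adaptation step (illustrative sketch).

    weights    : (n_nodes, dim) float array of node weight vectors, updated in place
    neighbours : dict mapping node index -> set of directly connected node indices
    errors     : (n_nodes,) float array of accumulated errors, updated in place
    x          : (dim,) input vector
    e_i is a placeholder value, not a parameter reported in the paper.
    """
    # Squared Euclidean distance from the input to every node.
    dists = np.sum((weights - x) ** 2, axis=1)
    bmu = int(np.argmin(dists))

    # Accumulate the winner's error; high-error regions are candidates for growth.
    errors[bmu] += dists[bmu]

    # Constant learning rates: no cooling schedule as in SOM.
    weights[bmu] += e_bmu * (x - weights[bmu])
    for n in neighbours[bmu]:
        weights[n] += e_i * (x - weights[n])
    return bmu

# Start from a randomly initialised triangle of three mutually connected nodes.
rng = np.random.default_rng(0)
weights = rng.random((3, 16))            # 16-dimensional histogram features
neighbours = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
errors = np.zeros(3)
for x in rng.random((100, 16)):          # stand-in for frame feature vectors
    adapt_step(weights, neighbours, errors, x)
```

Node insertion and deletion, which operate on the accumulated errors and the triangular topology, are deliberately left out of this sketch.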
New neurons are inserted at positions with high errors, where the current structure under-represents the input data distribution. Superfluous neurons are deleted from regions with low probability density. It is important that the deletion step maintains the consistency of the triangular structure. To ensure this we have implemented a simpler heuristic than Fritzke's tetrahedron-based one. The algorithm iterates until the stopping criterion is satisfied (the maximum number of epochs or the maximum network size is reached). Fritzke has also demonstrated superior performance of GCS over SOM in terms of topology preservation and distribution-modelling error [6].

The algorithm requires 7 user-specified parameters: maximum number of neurons or training epochs, insertion period, deletion period, learning rates for the winner (e_bmu) and its neighbours (e_i), and two error decay factors.

Fig. 1. GCS simulation results on four square-shaped data

B. TreeGCS

TreeGCS [8] is a hierarchical clustering algorithm based on GCS. It maps high-dimensional input vectors onto a multi-depth two-dimensional hierarchy that preserves the topological ordering of the input space. The tree is generated dynamically and adapts to the underlying GCS structure. Initially the root of the tree points to one cluster that contains the initial GCS network. A split in the cluster results in adding a new node to the tree (Fig. 2). When clusters are deleted, the associated tree nodes are deleted and the resulting redundancies (if any) are removed. Our implementation strictly follows the original algorithm apart from the introduction of a hierarchy generation threshold (in [8] the tree is generated at the end of each GCS epoch). This threshold is the only user-specified parameter.

Fig. 2. Creating new nodes in TreeGCS when a cluster subdivides

III. VIDEOGCS

A. Data Pre-processing and Feature Extraction

Since MPEG was established as an international standard for compression of digital video, video is increasingly stored and moved in compressed format. This motivates the development of methods that process compressed video directly, due to the computational and storage savings (no need to decode and re-encode the video) and faster operations (lower data rate of compressed video). Our system operates directly on MPEG-2 encoded video.

MPEG-2 uses macroblock-based motion compensation to reduce temporal redundancy and the block-based Discrete Cosine Transform (DCT) to reduce spatial redundancy. The only information available in the compressed stream is the DCT coefficients of intra-coded blocks or residual errors, and the motion vectors. Our system uses the DC terms (i.e. the zero-frequency terms of the DCT coefficients) of intra-coded (I) frames. As each DC term is a scaled version of the block's average value, spatially reduced versions of the original images, called dc-images [11], can be constructed. The (i,j) pixel of the dc-image is the average value of the (i,j) block of the image (Fig. 3). For each dc-image we compute the 16-bin grayscale histogram. Histograms have been successfully used as an image representation as they are less sensitive to object movement, image rotation, or variations in viewing angle and scale.

Fig. 3. A full image (352x288 pixels) and its dc image (44x36 pixels)

B. Keyframe Selection and Video Representation

After shot boundary detection, we use GCS to cluster the frames in each shot based on their 16-dimensional feature vectors. Depending on the content of the shot, GCS forms a different number of discrete clusters. For each of them, the frame closest to the centroid is selected as a keyframe. Because GCS preserves topology, similar frames are mapped to neighboring neurons.

The selected keyframes are further clustered using TreeGCS to create a hierarchical view of the video sequence, allowing the user to browse at different levels of content. The depth of the hierarchy and the number of nodes in each level depend on the video content. Each node corresponds to a cluster of similar shots and can be represented by a single keyframe chosen as described above. The bottom-level nodes are associated with clusters of similar shots that are mapped onto a 2-dimensional GCS grid, allowing efficient visualization and browsing.

IV. EXPERIMENTS

A. Video Sequences

We used 9 video sequences available from [14] and previously used for keyframe selection evaluation (Table I). As the videos were originally MPEG-1 compressed, we decompressed them. After that we concatenated them using cuts (in the order shown in Table I) to form one long sequence. This sequence was then MPEG-2 compressed and the DC terms were extracted. A total of 30 shot boundaries were detected, 11 of them gradual and the others abrupt. Thus the total number of shots was 31. The size of the original video frames was 352x240 pixels, hence the size of their dc-images was 44x30 pixels.

TABLE I
VIDEO STATISTICS

sequence            # frames   # shots
Canada day          768        4: s0-s3
Capilano            778        2: s4-s5
Dragon boat         705        6: s6-s11
Jazz                577        3: s12-s14
Professor           130        1: s15
Steam clock         648        1: s16
Walk with dragon    795        3: s17-s19
Aqua                           7: s20-s26
Beach               740        4: s27-s30

B. GCS and TreeGCS Parameters

The following GCS parameters were used: number of iterations = 20000, insertion period = 200, deletion period = 2000, error decay factors = 1 and 0.0004, learning rates: e_bmu = 0.06, e_i = . The hierarchy period of TreeGCS was set to 500. Our preliminary experiments showed that the ratio between the insertion and deletion periods is important. Before a deletion is performed, the GCS network has to grow sufficiently. This ensures that the clusters are not formed prematurely.

C. Keyframe Selection Results

The keyframe selection results of GCS are summarized in Table II. The column Correct indicates correct human-produced results.
It should be noted that our correct keyframes differ slightly from those reported in [2] for the following sequences: Capilano (1 fewer keyframe in the first shot), Dragon boat (2 fewer in the last shot) and Aquarium (3 more: 1 more in shot 1, 1 for shot 2, which was a missed shot in [2], and 1 more in shot 6). A comparison of GCS with two other approaches, HistInt [3] and Signatures [2], is presented in Table IV, based on the results reported in [2]. Both approaches use color histograms and work on uncompressed video. Some examples are shown in Figs. 4-6 (for GCS the grayscale dc-images are shown).
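For reference, the feature and the keyframe rule behind these results (Section III) can be sketched as below. This is only an approximation under stated assumptions: the actual system reads the DC terms directly from the MPEG-2 stream, whereas this sketch emulates a dc-image by averaging 8x8 blocks of an already decoded grayscale frame. The function names, the 0-255 intensity range and the histogram normalisation are illustrative choices, not details given in the paper.

```python
import numpy as np

def dc_image(gray, block=8):
    """Approximate a dc-image: each pixel is the mean of one block x block tile."""
    h, w = gray.shape
    h, w = h - h % block, w - w % block        # drop incomplete border tiles
    tiles = gray[:h, :w].reshape(h // block, block, w // block, block)
    return tiles.mean(axis=(1, 3))

def histogram16(dc):
    """16-bin grayscale histogram of a dc-image (normalisation is an assumption)."""
    hist, _ = np.histogram(dc, bins=16, range=(0, 256))
    return hist / hist.sum()

def nearest_to_centroid(features):
    """Index of the feature vector closest to the mean of its cluster."""
    centroid = features.mean(axis=0)
    return int(np.argmin(np.linalg.norm(features - centroid, axis=1)))

# A 352x240 frame yields a 44x30 dc-image, matching the sizes quoted in the text.
rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(5, 240, 352)).astype(float)
features = np.array([histogram16(dc_image(f)) for f in frames])
keyframe_index = nearest_to_centroid(features)   # treats the 5 frames as one cluster
```

In the described system the cluster membership would come from the trained GCS network; here all frames are placed in a single cluster purely to keep the example short.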

TABLE II
NUMBER OF KEYFRAMES GENERATED BY GCS

sequence            correct   generated   redundant   missed
Canada day
Capilano
Dragon boat
Jazz
Professor
Steam clock
Walk dragon
Aqua
Beach
Total

Fig. 4. Keyframe selection for the sequence Aqua: a) Correct (9 keyframes), b) GCS (10 keyframes), c) HistInt (36 keyframes), d) Signatures (4 keyframes)

Fig. 5. Keyframe selection for the sequence Steam: a) Correct, b) GCS, c) Signatures (1 keyframe), d) HistInt (6 keyframes)

Fig. 6. Keyframe selection for the sequence Walk with the dragon: a) Correct (4 keyframes), b) GCS (6 keyframes), c) HistInt (18 keyframes), d) Signatures (5 keyframes)

Overall GCS performs well and typically selects 1 keyframe for the low-activity shots and several keyframes for the high-activity shots. It misses just 3 keyframes but generates 13 redundant ones. In half of the cases these redundancies occur in sequences involving panning and zooming. For example, GCS typically generates two keyframes instead of one for shot 3 of Canada day (there is a small zoom and object tracking) and shot 1 (pan), and also for shot 3 of Beach (zoom and object movement). As the image (and its corresponding histogram) changes, GCS generates a new keyframe. But as the semantics do not change, the human does not select a new keyframe. In the other cases the redundant keyframes are selected for small and low-activity shots. For example, two similar keyframes are generated for shot 4 of Aqua (Fig. 4), and three for shot 1 of Walk with the dragon. This happens because GCS always splits the cluster after a pre-specified number of iterations regardless of its quality. This drawback can be eliminated by modifying the GCS deletion step.

TABLE III
DEFINITION OF RECALL, PRECISION AND F1 MEASURE

keyframes        # assigned as correct   # not assigned as correct
# correct        tp                      fn (missed)
# not correct    fp (redundant)          tn

$P = \frac{tp}{tp + fp}, \quad R = \frac{tp}{tp + fn}, \quad F1 = \frac{2PR}{P + R}$
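The definitions in Table III translate directly into code. The sketch below is illustrative only: the function name `precision_recall_f1` and the counts in the usage line are hypothetical and are not results from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from the counts defined in Table III.

    tp: correct keyframes that were generated
    fp: redundant keyframes (generated but not correct)
    fn: missed keyframes (correct but not generated)
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts: 8 correct keyframes found, 2 redundant, 2 missed.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
```

Because F1 is the harmonic mean of P and R, a method that generates many redundant keyframes is penalised through P even when its recall is high.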

Nevertheless, GCS compares well with the other two approaches. HistInt tends to generate too many keyframes, while Signatures is able to generate a compact representation but with many misses and redundancies. We have also calculated Recall (R), Precision (P) and the F1 measure, which are standard performance measures in information retrieval (Table III). As can be seen from Table IV, overall GCS is the best approach.

TABLE IV
KEYFRAME SELECTION COMPARISON

              correct   generated   redundant   missed   R [%]   P [%]   F1 [%]
GCS
HistInt
Signatures

D. Hierarchical Video Representation Results

The hierarchical representation generated by clustering the keyframes using TreeGCS is shown in Fig. 7. It has organized the keyframes (and the shots they represent) into a 3-level structure. Each node corresponds to a cluster of similar shots and can be represented by a single keyframe. As can be seen, the keyframes are grouped into two main clusters based on their gray-level histogram: lighter and darker. These two clusters are further split into 3 and 2 sub-clusters of similar shots, respectively. Similar sub-clusters appear close to each other in the tree. The biggest sub-cluster (sub-cluster 5) is less homogeneous than the others; if TreeGCS had been trained longer, it would have split it into further sub-clusters. The number of neurons in the five GCS grids was 8, 8, 11, 18 and 43, respectively. Within each of these bottom-level clusters, similar keyframes were mapped to neighboring neurons in the GCS grid. The keyframe closest to the cluster centroid was selected as the keyframe representing the cluster of similar shots (the framed pictures: s17, s6, s19, s16 and s27). Similarly, keyframes can also be chosen for the two nodes at level 1. Thus, the resulting structure allows the user to browse the video at different levels of detail.

The quality of the video summarization crucially depends on the quality of the features extracted to represent each shot. As the example shows, while the keyframe histogram is a useful feature, it may not be enough to capture the semantics of the video well and allow efficient retrieval. High-level semantic features would provide a more useful description but their automated extraction is an open research problem. One of the advantages of clustering-based keyframe selection and video organization is that new features can be easily incorporated. We plan to investigate the use of motion features characterizing each shot (based on the motion vectors that are directly available in the MPEG-2 stream), temporal features that prevent keyframes that are too distant from being grouped together, and also more semantically rich components such as text captions and teletext. The open framework also allows different distance metrics to be used, e.g. the histograms can be compared with the widely used chi-squared test.

Fig. 7. Hierarchical video representation

The main advantage of the hierarchical representation used in VideoGCS is the ability to dynamically form a hierarchy where the number of layers and the number of clusters in them depend on the video content. In the existing systems for video summarization the structure is fixed. For example, in [1] a three-level hierarchy is used with a fixed number of clusters in each level (e.g. 4, 16, 54). In [13] the number of levels and clusters in them was also pre-determined. The agglomerative hierarchical clustering approaches used in [7,12] generate dendrograms that cannot be visualized for large data sets and require the selection of a pre-defined number of nodes. TreeGCS also provides good visualization due to the underlying GCS algorithm, which maps high-dimensional inputs to a two-dimensional grid that is topology and density preserving. In contrast to SOM, it is able to find the cluster boundaries automatically.

V. CONCLUSION

In this paper we have presented a new approach for video summarization and browsing based on the GCS neural algorithms. The system, VideoGCS, processes MPEG-2 compressed video directly. It applies GCS to select keyframes for each shot and then clusters them using TreeGCS to form a hierarchical view of the video content for efficient browsing. The results show that the keyframe selection module captures the salient video content well and outperforms two other approaches. The generated hierarchy based on the grayscale histogram of keyframes is useful but it does not capture the video semantics. However, an advantage of the TreeGCS module over the existing systems is its ability to dynamically form a flexible hierarchy that depends on the video content.

Future work will include modification of the GCS algorithm to reduce the number of redundant keyframes for small and low-activity shots, and also integration of complementary low-level and semantic features to improve summarization. Another interesting direction for future research is to apply VideoGCS to create video summaries on-line, as both GCS and TreeGCS can be used in an on-line mode.

ACKNOWLEDGMENT

This work was supported by the SESQUI grant "Video Segmentation and Summarization" from the University of Sydney. We are very grateful to Damien McMonigal for the extraction of the dc-images.

REFERENCES

[1] J.-Y. Chen, C. Taskiran, A. Albiol, E.J. Delp and C. Bouman, "ViBE: A Compressed Video Database Structured for Active Browsing and Search," IEEE Trans. Multimedia.
[2] M.S. Drew and J. Au, "Video Keyframe Production by Efficient Clustering of Compressed Chromaticity Signatures," ACM Multimedia.
[3] A.M. Ferman and A.M. Tekalp, "Efficient Filtering and Clustering Methods for Temporal Video Representation and Visual Summarization," J. Visual Commun. & Image Rep., vol. 9.
[4] A.M. Ferman and A.M. Tekalp, "Multiscale Context Extraction and Representation for Video Indexing," SPIE 3229, pp. 23-31.
[5] B. Fritzke, "Growing Cell Structures - a Self-Organizing Network for Unsupervised and Supervised Learning," Neural Networks, vol. 7(9).
[6] B. Fritzke, "Kohonen Feature Maps and Growing Cell Structures - A Performance Comparison," Adv. Neural Info. Processing.
[7] A. Girgensohn and J. Boreczky, "Time-constrained Keyframe Selection Technique," Multim. Tools & Appl., vol. 11.
[8] V.J. Hodge and J. Austin, "Hierarchical Growing Cell Structures: TreeGCS," IEEE Trans. Knowl. & Data Eng., vol. 13(2).
[9] I. Koprinska and S. Carrato, "Temporal Video Segmentation: A Survey," Signal Processing: Image Commun., vol. 16.
[10] T. Kohonen, Self-Organizing Maps, 2nd ed., Springer-Verlag.
[11] B.-L. Yeo and B. Liu, "Rapid Scene Analysis on Compressed Video," IEEE Trans. Circuits Sys. Video Tech., vol. 5(6).
[12] M. Yeung and B.-L. Yeo, "Segmentation of Video by Clustering and Graph Analysis," Comp. Vis. & Image Und., vol. 71(1).
[13] D. Zhong, H. Zhang and S.-F. Chang, "Clustering Methods for Video Browsing and Annotation," SPIE 2670.
[14]


Hybrid Video Compression Using Selective Keyframe Identification and Patch-Based Super-Resolution 2011 IEEE International Symposium on Multimedia Hybrid Video Compression Using Selective Keyframe Identification and Patch-Based Super-Resolution Jeffrey Glaister, Calvin Chan, Michael Frankovich, Adrian

More information

Recall precision graph

Recall precision graph VIDEO SHOT BOUNDARY DETECTION USING SINGULAR VALUE DECOMPOSITION Λ Z.»CERNEKOVÁ, C. KOTROPOULOS AND I. PITAS Aristotle University of Thessaloniki Box 451, Thessaloniki 541 24, GREECE E-mail: (zuzana, costas,

More information

Data Mining Chapter 9: Descriptive Modeling Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

Data Mining Chapter 9: Descriptive Modeling Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Data Mining Chapter 9: Descriptive Modeling Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Descriptive model A descriptive model presents the main features of the data

More information

Algorithms and System for High-Level Structure Analysis and Event Detection in Soccer Video

Algorithms and System for High-Level Structure Analysis and Event Detection in Soccer Video Algorithms and Sstem for High-Level Structure Analsis and Event Detection in Soccer Video Peng Xu, Shih-Fu Chang, Columbia Universit Aja Divakaran, Anthon Vetro, Huifang Sun, Mitsubishi Electric Advanced

More information

TreeGNG - Hierarchical Topological Clustering

TreeGNG - Hierarchical Topological Clustering TreeGNG - Hierarchical Topological lustering K..J.Doherty,.G.dams, N.Davey Department of omputer Science, University of Hertfordshire, Hatfield, Hertfordshire, L10 9, United Kingdom {K..J.Doherty,.G.dams,

More information

Hierarchical Clustering 4/5/17

Hierarchical Clustering 4/5/17 Hierarchical Clustering 4/5/17 Hypothesis Space Continuous inputs Output is a binary tree with data points as leaves. Useful for explaining the training data. Not useful for making new predictions. Direction

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2009 CS 551, Spring 2009 c 2009, Selim Aksoy (Bilkent University)

More information

ViBE: A New Paradigm for Video Database Browsing and Search

ViBE: A New Paradigm for Video Database Browsing and Search ViBE: A New Paradigm for Video Database Browsing and Search Jau-Yuen Chen, Cüneyt Taşkiran, Edward J. Delp and Charles A. Bouman Electronic Imaging Systems Laboratory (EISL) Video and Image Processing

More information

Segmentation of Images

Segmentation of Images Segmentation of Images SEGMENTATION If an image has been preprocessed appropriately to remove noise and artifacts, segmentation is often the key step in interpreting the image. Image segmentation is a

More information

Cluster Analysis. Ying Shen, SSE, Tongji University

Cluster Analysis. Ying Shen, SSE, Tongji University Cluster Analysis Ying Shen, SSE, Tongji University Cluster analysis Cluster analysis groups data objects based only on the attributes in the data. The main objective is that The objects within a group

More information

Representation of 2D objects with a topology preserving network

Representation of 2D objects with a topology preserving network Representation of 2D objects with a topology preserving network Francisco Flórez, Juan Manuel García, José García, Antonio Hernández, Departamento de Tecnología Informática y Computación. Universidad de

More information

Texture Classification by Combining Local Binary Pattern Features and a Self-Organizing Map

Texture Classification by Combining Local Binary Pattern Features and a Self-Organizing Map Texture Classification by Combining Local Binary Pattern Features and a Self-Organizing Map Markus Turtinen, Topi Mäenpää, and Matti Pietikäinen Machine Vision Group, P.O.Box 4500, FIN-90014 University

More information
