NeTra-V: Towards an Object-based Video Representation


Proc. of SPIE, Storage and Retrieval for Image and Video Databases VI, vol. 3312, 1998

Yining Deng, Debargha Mukherjee and B. S. Manjunath
Department of Electrical and Computer Engineering
University of California, Santa Barbara, CA

Abstract

There is a growing need for new representations of video that allow not only compact storage of data but also content-based functionalities such as search and manipulation of objects. We present here a prototype system, called NeTra-V, that is currently being developed to address some of these content-related issues. The system has a two-stage video processing structure: a global feature extraction and clustering stage, and a local feature extraction and object-based representation stage. Key aspects of the system include a new spatio-temporal segmentation and object-tracking scheme, and a hierarchical object-based video representation model. The spatio-temporal segmentation scheme combines color/texture image segmentation and affine motion estimation techniques. Experimental results show that the proposed approach can handle large motion. The output of the segmentation, the alpha plane as it is referred to in MPEG-4 terminology, can be used to compute local image properties. This local information forms the low-level content description module in our video representation. Experimental results illustrating spatio-temporal segmentation and tracking are provided.

Keywords: content-based retrieval, spatio-temporal segmentation, object-based video representation.

1. Introduction

With the rapid developments in multimedia and internet applications, there is a growing need for new representations of video that allow not only compact storage of data but also content-based functionalities such as search and manipulation of objects, semantic description of the scene, detection of unusual events, and possible recognition of the objects. Current compression standards, such as MPEG-2 and H.263, are designed to achieve good data compression but do not provide any content-related functionalities. There has been much work on the emerging MPEG-4 standard [17], which is targeted towards access and manipulation of objects as well as more efficient data compression. However, the functionalities that can be provided at present are limited to cut and paste of a few objects in simple scenes; no visual information is extracted from the object itself that could be used for similarity search and high-level understanding. On the other hand, current research in content-based video retrieval [1, 5, 8, 9, 11, 21] has provided simple content descriptions by temporally partitioning the video clip into smaller shots, each of which contains a continuous scene, and by extracting visual features such as color, texture, and motion from these shots. These low-level visual features are quite effective in searching for similar video scenes given a query video shot. However, with a few exceptions [20], much of the prior work in this area is restricted to global image features.

This paper describes a video analysis and retrieval system, called NeTra-V (netra means eye in Sanskrit, an ancient Indian language; NeTra is also the name of the image retrieval system described in [14]), which is being developed with the objective of providing content-related functionalities. The system is fully automatic and has a two-stage video processing structure. The first stage is global feature extraction and clustering; the second stage is local feature extraction and object-based representation.
Key aspects of the system include a new spatio-temporal segmentation and object-tracking scheme, and a hierarchical object-based video representation model. The spatio-temporal segmentation scheme combines spatial color/texture image segmentation and affine motion estimation techniques.

Experimental results show that the proposed approach can handle large motion and complex scenes containing several independently moving objects. The output of the segmentation, the alpha plane, can be used both for MPEG-4 coding, which produces compactly stored data, and for local feature extraction, which provides object information. Both global and local features are used to form the low-level content description of the video representation model. The current implementation of NeTra-V allows the user to track regions in a video sequence and search for regions with similar color, texture, shape, motion pattern, location, or size in the database. Identifying more meaningful objects from these low-level region features is a future goal. Some examples from a football game database are shown in this paper. Demonstrations of the NeTra-V system are available on the web.

The rest of the paper is organized as follows. Section 2 gives an overview of the system. Section 3 details the spatio-temporal segmentation. Section 4 illustrates the low-level content description of the video data. Section 5 concludes with discussions.

2. System Overview

Figure 1 shows a schematic diagram of the NeTra-V system. Our research so far includes all the shaded blocks in the figure. Video data, either raw or compressed using current standards, is segmented in the temporal domain into small video shots of consistent visual information. Often each video shot represents a single natural scene delimited by camera breaks or editing cuts. The temporal partitioning algorithm [5] works directly on the MPEG-2 video sequence. It can detect both abrupt scene cuts and gradual transitions by using color and pixel intensities in the I-frames and motion prediction information in the P-frames and B-frames. The partitioned video shots are processed in two stages:

1. Global features are first extracted [5]. These features help in preliminary scene classification, are quite robust to local perturbations, and are easy to compute. Feature clustering is then performed based on the global feature distances to better organize the data and facilitate indexing and search. Video shots are clustered into different categories according to their content, and object search can be restricted to certain categories so that the search space is greatly reduced. A traditional agglomerative method is chosen for feature clustering instead of the more popular k-means method because many video shots do not belong to any well-defined category and should not affect the clustering process (a minimal sketch of this step is given after this list). Figure 2 shows some example frames of the categories generated from a football game database. It can be seen that the feature clustering, to a certain degree, captures some semantic-level information in each category, such as zoom-out shots of the football field and zoom-in shots of individual players.

2. The local processing step consists of three blocks. Spatio-temporal segmentation generates a labeled region map, or the alpha plane in MPEG-4 terminology, for each video frame. The alpha plane is essential to MPEG-4 object-based video coding, which generates compactly stored video data while allowing object access and manipulation. With the region map, local features can also be extracted. These features include color, texture, motion, shape, size, and spatial relations among the regions. Note that dimensionality reduction [2] is needed to provide compression of the high-dimensional feature vectors and an efficient indexing structure.
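As a concrete illustration of the clustering step referenced in item 1 above, the following minimal sketch groups shots by agglomerative clustering using off-the-shelf tools. The feature vectors, linkage choice, and distance threshold are hypothetical placeholders, not the actual NeTra-V features or parameters; cutting the dendrogram at a distance threshold lets shots that resemble nothing else fall into their own small categories instead of distorting the others, which is the property that motivates the agglomerative choice over k-means.

```python
# Minimal sketch: agglomerative clustering of global shot features.
# Hypothetical feature vectors; NOT the actual NeTra-V features or code.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
num_shots, feat_dim = 40, 16
shot_features = rng.random((num_shots, feat_dim))   # e.g. color/motion histograms

# Average-linkage agglomerative clustering on Euclidean feature distances.
Z = linkage(shot_features, method="average", metric="euclidean")

# Cut the dendrogram at a distance threshold instead of forcing k clusters,
# so shots that are far from everything form their own singleton categories.
labels = fcluster(Z, t=1.2, criterion="distance")

for c in np.unique(labels):
    members = np.where(labels == c)[0]
    print(f"category {c}: shots {members.tolist()}")
```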
A hierarchical object-based video representation model, which provides both compact storage of the video data and content information of the scene, is also illustrated in Figure 1. This model, shown within the dashed box, is composed of four data representation levels. The bottom level stores object-encoded video data and allows object access and manipulation. The next level provides the low-level content description of the video scenes by storing all global and local visual features. These features can be used for content-based search and retrieval, and for semantic abstraction at the next level. High-level content description requires a certain degree of human assistance, and the system should have self-learning ability as well. The top level contains textual annotations of the video data; these can include non-image information such as recording date, recording place, source, category, description of the content, and so on. The following sections give more details on two key aspects of NeTra-V: spatio-temporal segmentation and low-level video content description.
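To make the four representation levels concrete, here is a minimal sketch of one way the hierarchy could be organized as data structures. All class and field names are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of the four-level representation model; names are illustrative only.
from dataclasses import dataclass, field

@dataclass
class ObjectEncodedVideo:          # bottom level: compactly stored, object-accessible data
    bitstream: bytes = b""
    alpha_planes: list = field(default_factory=list)     # per-frame region maps

@dataclass
class LowLevelDescription:         # global and local visual features
    global_features: dict = field(default_factory=dict)  # e.g. color/motion histograms
    local_features: dict = field(default_factory=dict)   # per-region color, texture, shape, ...

@dataclass
class HighLevelDescription:        # semantic abstraction, partly human-assisted
    semantic_labels: list = field(default_factory=list)

@dataclass
class VideoShotRecord:
    textual_annotation: dict = field(default_factory=dict)  # top level: date, place, source, ...
    high_level: HighLevelDescription = field(default_factory=HighLevelDescription)
    low_level: LowLevelDescription = field(default_factory=LowLevelDescription)
    encoded: ObjectEncodedVideo = field(default_factory=ObjectEncodedVideo)
```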

Figure 1. Schematic diagram of the NeTra-V system. Raw or compressed video data (movies, news, sports, surveillance, ...) flows through the two-step video processing structure (temporal partitioning; global processing with global feature extraction and feature clustering; local processing with spatio-temporal segmentation, local feature extraction, dimensionality reduction, and object-based video coding) into the hierarchical object-based video representation model (compactly stored video data; low-level content description: color, texture, motion, shape, spatial relations, ...; high-level content description: semantic abstraction with supervised and unsupervised learning; textual annotation), with a system interface for search and retrieval, new database indexing, and a user interface over the network.

Figure 2. Example frames of different categories generated from the football game database.

3. Spatio-temporal Segmentation

Spatio-temporal segmentation continues to be a challenging problem in computer vision research [6, 7, 10, 18, 19]. Many motion segmentation schemes use optical flow methods to estimate motion vectors at the pixel level, and then cluster pixels into regions of coherent motion. There are several drawbacks to this approach. First, optical flow methods do not cope well with large motions. Second, regions of coherent motion may contain multiple objects, for example, the entire background. While such regions are good for coding purposes, they are not useful for local feature extraction and object identification. In general, techniques designed with coding objectives cannot yield good segmentation results; we will elaborate on this point in Section 5. Another approach to spatio-temporal segmentation combines the results of both spatial and motion segmentation. Intuitively, this approach exploits as much information as possible from the data and should yield better results. The general strategy is to spatially segment the first frame and estimate local affine motion parameters for each region to predict subsequent frames. Numerical methods [3, 4, 16] have been proposed to estimate affine motion parameters. The success of this approach depends largely on a good initial spatial segmentation; results using simple region-growing methods are usually not very satisfactory. Recently, a general framework for color and texture image segmentation has been presented [13], which appears to give good segmentation results on a diverse collection of images. This algorithm is used in our spatio-temporal segmentation scheme.

3.1 General Scheme

We borrow the idea of intra- and inter-frame coding from MPEG. Video data is processed in consecutive groups of frames. These groups are non-overlapping and independent of each other. The number of frames in each group is set to 7 in the following experiments. The middle frame of each group is called the I-frame. Spatial segmentation is performed only on the I-frame of each group. The remaining frames in the group are called P-frames. P-frames are segmented by local affine motion prediction from their previous frames; the prediction can be either forward or backward in time. The insertion of I-frames in the video sequence recovers from failures of affine motion estimation in the case of large object movements in 3D space and ensures robustness of the algorithm. Some heuristics, using the motion prediction information, are applied to handle overlapped and uncovered regions. Figure 3 illustrates the general segmentation scheme. The use of simultaneous forward and backward prediction, in the way B-frames in MPEG-2 are generated, would help in affine motion estimation. However, unlike block-based prediction in MPEG-2, this creates a region-correspondence problem between frames, since the spatial segmentations of two consecutive I-frames can be significantly different. For this reason B-frames are not used in our scheme. The current method restricts the maximum number of regions to be the same as that generated by the spatial segmentation. Regions can disappear because of occlusion or by moving out of the image boundary, but no new regions are labeled during the motion prediction phase; new regions entering the scene are handled by the next I-frame.

Figure 3. General segmentation scheme. One group of frames is shown. I is the spatially segmented frame; P1, P2, and P3 are the first, second, and third predicted frames, respectively.
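The bookkeeping of this grouping scheme can be sketched in a few lines. The example below only illustrates which frame is segmented spatially and which frames are predicted from which neighbor (interpreting "previous" as the neighboring frame closer to the I-frame, as suggested by Figure 3); it is not the segmentation code itself, and the group length of 7 simply follows the experiments above.

```python
# Sketch of the group-of-frames scheduling: the middle frame is the I-frame,
# remaining P-frames are predicted from their temporal neighbor toward the I-frame.
def plan_group(frame_indices):
    assert len(frame_indices) % 2 == 1          # e.g. 7 frames per group
    mid = len(frame_indices) // 2
    i_frame = frame_indices[mid]
    plan = [("spatial_segment", i_frame, None)]
    # backward prediction: frames before the I-frame, predicted from the frame after them
    for k in range(mid - 1, -1, -1):
        plan.append(("affine_predict", frame_indices[k], frame_indices[k + 1]))
    # forward prediction: frames after the I-frame, predicted from the frame before them
    for k in range(mid + 1, len(frame_indices)):
        plan.append(("affine_predict", frame_indices[k], frame_indices[k - 1]))
    return plan

# Non-overlapping groups of 7 frames, processed independently of each other.
frames = list(range(14))
groups = [frames[i:i + 7] for i in range(0, len(frames), 7)]
for g in groups:
    for step in plan_group(g):
        print(step)
```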
3.2 Spatial Segmentation

A brief description of the spatial segmentation algorithm [13] is given here. The algorithm integrates color and texture features to compute the segmentation. First, the direction of change in color and texture is identified and integrated at each pixel location. Then a vector is constructed at each pixel, pointing in the direction in which a region boundary is most likely to occur. The vector field is propagated to neighboring points with similar directions and stops where two neighboring points have opposite flow directions, which indicates the presence of a boundary between the two pixels. After boundary detection, disjoint boundaries are connected to form closed contours. This is followed by region merging based on color and texture similarities as well as the boundary length between the two regions. The algorithm is designed for general images and requires very little parameter tuning from the user; the only parameter to be specified is the scale factor for localizing the boundaries.
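The full edge flow algorithm of [13] combines color and texture energies at a user-chosen scale and propagates the flow field before detecting boundaries. As a rough, hypothetical illustration of only the final test, that a boundary lies between two neighbors whose flow vectors point toward each other, the following sketch uses the gradient of a smoothed edge-energy map as a stand-in for the flow on a grayscale image.

```python
# Highly simplified illustration of the opposing-flow boundary test;
# NOT the full edge flow algorithm of [13] (no texture, no flow propagation,
# no contour closing or region merging).
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def boundary_map(gray, scale=2.0):
    smooth = gaussian_filter(gray.astype(float), sigma=scale)        # scale parameter
    # edge energy: gradient magnitude of the smoothed image
    energy = gaussian_filter(np.hypot(sobel(smooth, axis=1), sobel(smooth, axis=0)), sigma=scale)
    # "flow" vector: points uphill in edge energy, i.e. toward the nearest likely boundary
    fx = sobel(energy, axis=1)
    fy = sobel(energy, axis=0)
    boundary = np.zeros(gray.shape, dtype=bool)
    # a boundary lies between two neighbors whose flows point toward each other
    boundary[:, :-1] |= (fx[:, :-1] > 0) & (fx[:, 1:] < 0)
    boundary[:-1, :] |= (fy[:-1, :] > 0) & (fy[1:, :] < 0)
    return boundary

# Toy example: a bright square on a dark background.
img = np.zeros((64, 64))
img[20:44, 20:44] = 1.0
print(boundary_map(img).sum(), "boundary pixels found")
```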

3.3 Motion Segmentation

The results of the spatial segmentation can be used for affine motion estimation. A 6-parameter 2D affine transformation is assumed for each region in the frame and is estimated by finding the best match in the next frame; the segmentation result for the next frame is thereby obtained. A Gaussian smoothing is performed on each image before affine estimation, and affine motion estimation is performed on the luminance component of the video data only. Mathematically, the following functional f is minimized for each region R:

f(\mathbf{a}) = \sum_{(x,y) \in R} g\big( I_1(x', y') - I_2(x, y) \big)    (1)

where \mathbf{a} is the six-parameter affine motion vector, which can be separated into x and y components, \mathbf{a} = [\mathbf{a}_x^T \; \mathbf{a}_y^T]^T with \mathbf{a}_x = [a_{x1}, a_{x2}, a_{x3}]^T and \mathbf{a}_y = [a_{y1}, a_{y2}, a_{y3}]^T, and g is a robust error norm used to reject outliers [10], defined as

g(e) = \frac{e^2}{\sigma + e^2}    (2)

where \sigma is a scale parameter. I_1 and I_2 are the current frame and the next frame, respectively; x and y are pixel locations; x' = x + dx and y' = y + dy, where the displacements are dx = \mathbf{b}^T \mathbf{a}_x and dy = \mathbf{b}^T \mathbf{a}_y with \mathbf{b} = [1, x, y]^T. Ignoring higher-order terms, a Taylor expansion of (1) gives

f(\mathbf{a}) = f(\mathbf{a}_0) + \nabla f(\mathbf{a}_0)^T (\mathbf{a} - \mathbf{a}_0) + \tfrac{1}{2} (\mathbf{a} - \mathbf{a}_0)^T \nabla^2 f(\mathbf{a}_0) (\mathbf{a} - \mathbf{a}_0)    (3)

Using a modified Newton's method [12] that ensures both descent and convergence, \mathbf{a} can be solved iteratively by applying the following update at the k-th iteration:

\mathbf{a}[k+1] = \mathbf{a}[k] - c[k] \{\nabla^2 f(\mathbf{a}[k])\}^{-1} \nabla f(\mathbf{a}[k])    (4)

where c[k] is a search parameter selected to minimize f.
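For concreteness, the sketch below evaluates the matching functional of Eqs. (1) and (2) for a single region under given affine parameters. It is an illustration only, not the modified-Newton solver described above; the toy frames, translation parameters, and scale value sigma are hypothetical.

```python
# Minimal sketch of evaluating the robust matching functional of Eqs. (1)-(2)
# for one region under a given 6-parameter affine motion; illustration only,
# not the paper's modified-Newton solver.
import numpy as np
from scipy.ndimage import map_coordinates

def robust_norm(e, sigma):
    # g(e) = e^2 / (sigma + e^2), Eq. (2)
    return e * e / (sigma + e * e)

def region_cost(I1, I2, region_mask, a_x, a_y, sigma=100.0):
    ys, xs = np.nonzero(region_mask)             # pixel locations (x, y) in R
    b = np.stack([np.ones_like(xs), xs, ys])     # b = [1, x, y]^T for every pixel
    dx = a_x @ b                                 # dx = b^T a_x
    dy = a_y @ b                                 # dy = b^T a_y
    # sample I1 at the displaced positions (x', y') with bilinear interpolation
    I1_warped = map_coordinates(I1.astype(float), [ys + dy, xs + dx], order=1)
    e = I1_warped - I2[ys, xs]
    return robust_norm(e, sigma).sum()           # f(a), Eq. (1)

# Toy usage with a hypothetical pure-translation motion a_x = [2, 0, 0], a_y = [1, 0, 0].
rng = np.random.default_rng(1)
I1 = rng.random((64, 64))
I2 = np.roll(np.roll(I1, -1, axis=0), -2, axis=1)   # next frame: I1 shifted by (2, 1)
mask = np.zeros((64, 64), dtype=bool)
mask[10:30, 10:30] = True
print(region_cost(I1, I2, mask, np.array([2.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])))
```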

For (1), the gradient \nabla f and the Hessian \nabla^2 f are calculated as

\nabla f = \sum_{(x,y) \in R} \frac{\partial g}{\partial e} \left[ \frac{\partial I_1}{\partial x'} \mathbf{b}^T \;\; \frac{\partial I_1}{\partial y'} \mathbf{b}^T \right]^T    (5)

\nabla^2 f = \sum_{(x,y) \in R} \begin{bmatrix} \frac{\partial^2 g}{\partial e^2} \left( \frac{\partial I_1}{\partial x'} \right)^2 + \frac{\partial g}{\partial e} \frac{\partial^2 I_1}{\partial x'^2} & \frac{\partial^2 g}{\partial e^2} \frac{\partial I_1}{\partial x'} \frac{\partial I_1}{\partial y'} \\ \frac{\partial^2 g}{\partial e^2} \frac{\partial I_1}{\partial x'} \frac{\partial I_1}{\partial y'} & \frac{\partial^2 g}{\partial e^2} \left( \frac{\partial I_1}{\partial y'} \right)^2 + \frac{\partial g}{\partial e} \frac{\partial^2 I_1}{\partial y'^2} \end{bmatrix} \otimes \begin{bmatrix} 1 & x & y \\ x & x^2 & xy \\ y & xy & y^2 \end{bmatrix}    (6)

Note that the gradient components \partial I_1 / \partial x' and \partial I_1 / \partial y' can be precomputed before the iterations start. The method derived here requires the cost function to be convex in affine space, which is not true in general. However, in the vicinity of the actual affine parameter values we can assume it to be true; thus a good initialization is needed before the iterations. When the affine parameters of the previous frame are known, we make the first-order assumption that the region keeps the same motion and use the previous frame's affine parameters as the initial values for the current frame. When the affine parameters of the previous frame are unknown (for I-frames, for example), a hierarchical search is performed to obtain the best initial affine values; this is needed only once per group of frames. The search is done using all three color components to ensure the best results. To reduce complexity, a 4-parameter affine model that accounts for x and y translation, scale, and rotation is used. The image is downsampled first, and the results of the search at the lower resolution are projected back to the original-sized image for fine tuning.

3.4 Results

Figure 4 shows two spatio-temporal segmentation examples, one from the MPEG-2 standard test sequence flower garden and the other from a football game sequence. It can be seen that the results are quite good: the regions of sky, clouds, tree, and flowers in (a) and the helmet, face, and jersey in (b) are all segmented out. (Original color images can be found on the web.)

4. Video Representation

4.1 Low-level Video Content Description

Low-level video content description is an important module in NeTra-V. The representation scheme used for this purpose is shown in Figure 5. It is organized from bottom to top as follows:

1. Region features extracted from the I-frame are used to represent the entire group of frames, since features in the P-frames of the same group should be similar. Features in the I-frame are also more reliable than those in the P-frames because they contain no propagation errors from motion segmentation. We refer to the regions in the I-frame as I-regions.

2. Temporal correspondence between regions in consecutive I-frames is established by pairing up I-regions with the most similar features. Motion compensation is used to predict the approximate location of each region in the next I-frame to limit the search area. The correspondences are one-to-one in the forward temporal direction; that is, each I-region can be connected to only one I-region in the next I-frame. Some I-regions are left without any correspondence, indicating the disappearance of objects. Starting from the first frame, corresponding I-regions are tracked through the entire video shot, as illustrated in Figure 6 and in the sketch after this list. A subobject is defined as a group of corresponding I-regions obtained through tracking. We call them subobjects because object definitions are often quite subjective and a segmented region is usually only part of an object. In our experiments, the duration of a subobject is required to be at least 3 I-frames.

3. Each video shot is composed of a set of subobjects. A video shot can now be characterized by its subobject information and by the spatial and temporal relations between these subobjects.
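The correspondence step in item 2 above (the sketch referenced after the list) might look as follows. This is a hypothetical illustration: it links I-regions across consecutive I-frames greedily by a color-histogram distance with a motion-compensated location gate, and chains the links into subobjects of a minimum duration. The field names, distances, and thresholds are assumptions, not the paper's implementation.

```python
# Hypothetical sketch of linking I-regions across consecutive I-frames into subobjects.
# Color distance ranks candidates; a predicted-location gate rejects false matches.
import numpy as np

def link_regions(prev_regions, next_regions, max_dist=0.35, gate=40.0):
    """Greedy one-to-one matching in the forward temporal direction."""
    links, used = {}, set()
    for rid, r in prev_regions.items():
        best, best_d = None, max_dist
        for nid, n in next_regions.items():
            if nid in used:
                continue
            # motion-compensated location gate (predicted centroid vs. actual centroid)
            if np.hypot(*np.subtract(r["pred_centroid"], n["centroid"])) > gate:
                continue
            d = 0.5 * np.abs(r["color_hist"] - n["color_hist"]).sum()   # L1 histogram distance
            if d < best_d:
                best, best_d = nid, d
        if best is not None:
            links[rid] = best
            used.add(best)          # one-to-one: a next I-region is claimed only once
    return links

def build_subobjects(per_frame_links, min_length=3):
    """Chain per-frame links into subobjects lasting at least min_length I-frames."""
    chains = []
    starts = set(per_frame_links[0]) if per_frame_links else set()
    for rid in starts:
        chain, cur = [rid], rid
        for links in per_frame_links:
            if cur not in links:
                break
            cur = links[cur]
            chain.append(cur)
        if len(chain) >= min_length:
            chains.append(chain)
    return chains
```

In NeTra-V, color ranks the candidate matches while the other features (size, texture, shape, motion) act only as constraints to reject false matches; the single histogram distance and location gate above stand in for that combination.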

Figure 4. (a) Example of spatio-temporal segmentation on one group of frames in the flower garden sequence. Arrows indicate the actual flow of video frames in time.

Figure 4. (b) Example of spatio-temporal segmentation on one group of frames in a football game video database. Arrows indicate the actual flow of video frames in time.

Figure 5. Structure of the low-level content description: a video shot (global features) is composed of subobjects (subobject features), which are in turn composed of I-regions (I-region features). Subobjects are the fundamental elements of this representation level.

Figure 6. Tracking of I-regions to form a subobject. Regions labeled A in each I-frame are matched and tracked; a subobject is formed by grouping these regions. The subobject is identified despite the occlusion effects.

Figure 7 shows two examples of identified subobjects. A set of six consecutive I-frames is shown in each example. Figure 7(a) is a half zoom-out view of a football field; a small subobject is identified, which is the upper body of a football player. Figure 7(b) shows the tracking of a person's face. Notice that in the first frame the face is partially occluded, while in the last frame there is a segmentation failure that merges the face with the helmet. In both cases, the tracking algorithm is robust enough to pick up the face. Table 1 shows the detailed information extracted to characterize the I-region, the subobject, and the video shot. Integrating information from the different region features is an important issue. Since color is the most dominant feature, it is used to rank the distance measure, while the other features are used only as constraints to eliminate false matches. Subobjects are the fundamental elements of the low-level content description, and similarity search and retrieval are mainly performed using the subobjects. I-region information can also be used if necessary; for example, in order to answer a query such as "find the subobjects that move from left to right", the motion information of each I-region of the subobject is needed. The subobjects could serve as building blocks for user-subjective object identification in the high-level content description.

5. Discussions

5.1 Video Coding vs. Analysis

It is natural to consider developing a scheme that can be simultaneously optimized for both video coding and analysis. However, this is difficult because the goal of coding is to compress the data as much as possible, while the goal of analysis is to extract information from the data. These are two separate objectives, and current techniques can do a good job on either of them but not on both simultaneously. Commonly used image features are not suitable for data reconstruction, and encoded bit streams do not contain much useful visual information either.

Figure 7. Two examples of identified subobjects, each showing a set of six consecutive I-regions.

Table 1: Detailed Content Descriptions

Feature  | I-region | Subobject | Video shot
index    | region label | subobject label; region indices | start and end frames; subobject indices; temporal relations among subobjects
color    | region color histogram | average of I-region features | global color histogram
texture  | region Gabor texture feature | average of I-region features | global Gabor texture feature
motion   | affine motion parameters | average and variance of I-region features | global motion histogram
shape    | Fourier-based descriptor using curvature, centroid distance, and complex coordinate functions | average of I-region features |
size     | number of pixels | average of I-region features |
location | centroid and bounding box | average of I-region features | spatial relations among subobjects

Further, because these two goals are different, the approaches are different as well. For example, in order to achieve a good spatio-temporal segmentation, information from the next frame is needed, which is not possible for any predictive coding scheme. Coding schemes that seek to minimize mean squared error, such as the block-based motion prediction used in MPEG-2 and H.263, do not really care whether the segmentation makes sense. Note that the segmentation method presented here is general enough for both purposes; with some small modifications it can easily be adapted for predictive coding [15]. The model proposed in this paper separates coding and analysis into two modules within one general framework, sharing the common results of the spatio-temporal segmentation preprocessing. Given the alpha planes, how to code the data more efficiently becomes an independent issue, as long as the decoder can provide object access and manipulation functionalities. Motion predictions can be recalculated if necessary to achieve results optimized for compression.

5.2 Conclusions and Future Research

In this paper, we have described an implementation of the NeTra-V system, whose main objective is to provide content-related functionalities for video data. Key aspects of the system include a new spatio-temporal segmentation and object-tracking scheme, and a hierarchical object-based video representation model. One of the main focuses of this work has been the low-level content description of the video representation model. Future research will address the high-level semantic abstraction of the video data based on this low-level content description.

Acknowledgments

This work is supported by a grant from NSF under award #IRI. We would like to thank Dr. Wei-Ying Ma for providing the software for the spatial segmentation, Gabor texture, and shape feature extraction.

References

[1] E. Ardizzone, M.L. Cascia, Multifeature image and video content-based storage and retrieval, Proc. of SPIE, vol. 2916.
[2] M. Beatty and B.S. Manjunath, Dimensionality Reduction Using Multi-Dimensional Scaling for Content-Based Retrieval, Proc. of IEEE Intl. Conf. on Image Processing, vol. 2.
[3] J. Bergen, P. Burt, R. Hingorani, and S. Peleg, A three-frame algorithm for estimating two-component image motion, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 14, no. 9.
[4] M. Bober and J. Kittler, Robust Motion Analysis, Proc. of IEEE Conf. on Computer Vision and Pattern Recognition.
[5] Y. Deng and B.S. Manjunath, Content-based Search of Video Using Color, Texture and Motion, Proc. of IEEE Intl. Conf. on Image Processing, vol. 2.
[6] B. Duc, P. Schroeter, and J. Bigun, Spatio-temporal robust motion estimation and segmentation, Proc. of 6th Intl. Conf. on Computer Analysis of Images and Patterns.
[7] F. Dufaux, F. Moscheni, and A. Lippman, Spatio-temporal segmentation based on motion and static segmentation, Proc. of IEEE Intl. Conf. on Image Processing, vol. 1.
[8] A. Hampapur, et al., Virage video engine, Proc. of SPIE, vol. 3022.
[9] G. Iyengar and A.B. Lippman, Videobook: an experiment in characterization of video, Proc. of IEEE Intl. Conf. on Image Processing, vol. 3.
[10] S. Ju, M. Black, and A. Jepson, Skin and bones: Multi-layer, locally affine, optical flow and regularization with transparency, Proc. of IEEE Conf. on Computer Vision and Pattern Recognition.
[11] V. Kobla, D. Doermann, and K. Lin, Archiving, indexing, and retrieval of video in the compressed domain, Proc. of SPIE, vol. 2916, pp. 78-89.
[12] D. Luenberger, Linear and Nonlinear Programming, 2nd ed., Addison-Wesley.
[13] W.Y. Ma and B.S. Manjunath, Edge flow: a framework of boundary detection and image segmentation, Proc. of IEEE Conf. on Computer Vision and Pattern Recognition.
[14] W.Y. Ma and B.S. Manjunath, NeTra: A toolbox for navigating large image databases, Proc. of IEEE Intl. Conf. on Image Processing, vol. 1, 1997; also in ACM Multimedia Systems Journal.
[15] D. Mukherjee, Y. Deng and S.K. Mitra, A region-based video coder using edge flow segmentation and hierarchical affine region matching, Proc. of SPIE, vol. 3309.
[16] H. Sanson, Toward a robust parametric identification of motion on regions of arbitrary shape by non-linear optimization, Proc. of IEEE Intl. Conf. on Image Processing, vol. 1.
[17] Special Issue on MPEG-4, IEEE Trans. on Circuits and Systems for Video Technology, vol. 7, no. 1.
[18] J. Wang and E. Adelson, Spatio-temporal segmentation of video data, Proc. of SPIE, vol. 2182.
[19] L. Wu, J. Benois-Pineau, and D. Barba, Spatio-temporal segmentation of image sequences for object-oriented low bit-rate image coding, Proc. of IEEE Intl. Conf. on Image Processing, vol. 2.
[20] D. Zhong and S.F. Chang, Video Object Model and Segmentation for Content-based Video Indexing, Proc. of IEEE Intl. Symposium on Circuits and Systems.
[21] H.J. Zhang, J. Wu, D. Zhong and S.W. Smolliar, An integrated system for content-based video retrieval and browsing, Pattern Recognition, vol. 30, no. 4, 1997.


More information

Figure 1: Representation of moving images using layers Once a set of ane models has been found, similar models are grouped based in a mean-square dist

Figure 1: Representation of moving images using layers Once a set of ane models has been found, similar models are grouped based in a mean-square dist ON THE USE OF LAYERS FOR VIDEO CODING AND OBJECT MANIPULATION Luis Torres, David Garca and Anna Mates Dept. of Signal Theory and Communications Universitat Politecnica de Catalunya Gran Capita s/n, D5

More information

Data Clustering Hierarchical Clustering, Density based clustering Grid based clustering

Data Clustering Hierarchical Clustering, Density based clustering Grid based clustering Data Clustering Hierarchical Clustering, Density based clustering Grid based clustering Team 2 Prof. Anita Wasilewska CSE 634 Data Mining All Sources Used for the Presentation Olson CF. Parallel algorithms

More information

NOVEL APPROACH TO CONTENT-BASED VIDEO INDEXING AND RETRIEVAL BY USING A MEASURE OF STRUCTURAL SIMILARITY OF FRAMES. David Asatryan, Manuk Zakaryan

NOVEL APPROACH TO CONTENT-BASED VIDEO INDEXING AND RETRIEVAL BY USING A MEASURE OF STRUCTURAL SIMILARITY OF FRAMES. David Asatryan, Manuk Zakaryan International Journal "Information Content and Processing", Volume 2, Number 1, 2015 71 NOVEL APPROACH TO CONTENT-BASED VIDEO INDEXING AND RETRIEVAL BY USING A MEASURE OF STRUCTURAL SIMILARITY OF FRAMES

More information

Structured Light II. Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov

Structured Light II. Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov Structured Light II Johannes Köhler Johannes.koehler@dfki.de Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov Introduction Previous lecture: Structured Light I Active Scanning Camera/emitter

More information

9.913 Pattern Recognition for Vision. Class 8-2 An Application of Clustering. Bernd Heisele

9.913 Pattern Recognition for Vision. Class 8-2 An Application of Clustering. Bernd Heisele 9.913 Class 8-2 An Application of Clustering Bernd Heisele Fall 2003 Overview Problem Background Clustering for Tracking Examples Literature Homework Problem Detect objects on the road: Cars, trucks, motorbikes,

More information

A Rapid Scheme for Slow-Motion Replay Segment Detection

A Rapid Scheme for Slow-Motion Replay Segment Detection A Rapid Scheme for Slow-Motion Replay Segment Detection Wei-Hong Chuang, Dun-Yu Hsiao, Soo-Chang Pei, and Homer Chen Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan 10617,

More information

CS 664 Segmentation. Daniel Huttenlocher

CS 664 Segmentation. Daniel Huttenlocher CS 664 Segmentation Daniel Huttenlocher Grouping Perceptual Organization Structural relationships between tokens Parallelism, symmetry, alignment Similarity of token properties Often strong psychophysical

More information

Video Syntax Analysis

Video Syntax Analysis 1 Video Syntax Analysis Wei-Ta Chu 2008/10/9 Outline 2 Scene boundary detection Key frame selection 3 Announcement of HW #1 Shot Change Detection Goal: automatic shot change detection Requirements 1. Write

More information

Clustering CS 550: Machine Learning

Clustering CS 550: Machine Learning Clustering CS 550: Machine Learning This slide set mainly uses the slides given in the following links: http://www-users.cs.umn.edu/~kumar/dmbook/ch8.pdf http://www-users.cs.umn.edu/~kumar/dmbook/dmslides/chap8_basic_cluster_analysis.pdf

More information

PixSO: A System for Video Shot Detection

PixSO: A System for Video Shot Detection PixSO: A System for Video Shot Detection Chengcui Zhang 1, Shu-Ching Chen 1, Mei-Ling Shyu 2 1 School of Computer Science, Florida International University, Miami, FL 33199, USA 2 Department of Electrical

More information