
1 MPEG-7

2 Motivation MPEG-7, formally named Multimedia Content Description Interface, is a standard for describing multimedia content data that supports some degree of interpretation of the information's meaning, which can be passed onto, or accessed by, a device or computer code. The goal of the MPEG-7 standard is to allow interoperable searching, indexing, filtering and access of audio-visual (AV) content by enabling interoperability among devices and applications that deal with AV content description. MPEG-7 is a set of standardized tools for describing multimedia content at different abstraction levels. MPEG-7 lacks explicit semantics: the flexibility in structuring the descriptions introduces ambiguities and leaves room for different interpretations.

3 MPEG-7 Framework. MPEG-7 complements the surrounding multimedia chain: compression and coding (MPEG-1, -2, -4); transmission, retrieval and streaming; management and filtering; acquisition, authoring and editing; searching, indexing, browsing and navigation. It provides rich multimedia content description: video segments, moving regions, shots, frames; audio-visual features: color, texture, shape; semantics: people, events, objects, scenes. (Framework diagram: reference regions and motion descriptions attached to video segments.)

4 Information flow

5 Scope of the Standard. (Diagram: description production/extraction, standard description, description consumption; only the standard description is the normative part of MPEG-7.) MPEG-7 describes specific features of AV content as well as information related to AV content management for a diversity of applications: multimedia, music/audio, graphics, video. Specifically it provides four types of normative elements: Descriptors, Description Schemes (DSs), the Description Definition Language (DDL), and coding schemes. MPEG-7 does not specify: how to extract descriptions, how to use descriptions, or how to measure similarity between contents.

6 Parts of the MPEG-7 standard. MPEG-7 is composed of: MPEG-7 Visual, the Description Tools dealing with visual descriptions; MPEG-7 Audio, the Description Tools dealing with audio descriptions; MPEG-7 Multimedia Description Schemes, the Description Tools dealing with generic features and multimedia descriptions; MPEG-7 Description Definition Language, defining the syntax of the MPEG-7 Description Tools and allowing new Description Schemes to be defined. MPEG-7 descriptions take two possible forms: a textual XML form suitable for editing, searching, and filtering, and a binary form suitable for storage, transmission, and streaming delivery.

7 Performance evaluation. NG(q): number of ground-truth images for a query q. NR(q): number of found items in the first K(q) retrievals, where K(q) = min(4*NG(q), 2*GTM) and GTM = max{NG(q)} over all queries q of the data set. MR(q) = NG(q) - NR(q): number of missed items. The measure is computed from the ranks Rank(k) of the found items, counting the rank of the first retrieved item as 1. A rank of 1.25*K(q) is assigned to each ground-truth image that is not in the first K(q) retrievals. From these ranks the normalized modified retrieval rank NMRR(q) is computed (always in the range [0.0, 1.0]).

8 Average Retrieval Rate (AVR) and ANMRR. Compute AVR(q) for query q as AVR(q) = (1/NG(q)) * sum_{k=1..NG(q)} Rank(k). Compute the modified retrieval rank as MRR(q) = AVR(q) - 0.5*(1 + NG(q)). The normalized MRR is NMRR(q) = MRR(q) / Norm(q), where Norm(q) = 1.25*K(q) - 0.5*(1 + NG(q)). Finally, ANMRR = (1/Q) * sum_{q=1..Q} NMRR(q), where Q is the number of queries.
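
A minimal Python sketch of this evaluation metric (not the MPEG-7 reference software; function and variable names are illustrative):

    def nmrr(ground_truth_ranks, NG, K):
        # ground_truth_ranks: rank (1-based) of each ground-truth item if it appears
        # within the first K retrievals, otherwise None (missed item)
        penalised = [r if r is not None and r <= K else 1.25 * K
                     for r in ground_truth_ranks]
        avr = sum(penalised) / NG
        mrr = avr - 0.5 * (1 + NG)
        norm = 1.25 * K - 0.5 * (1 + NG)
        return mrr / norm

    def anmrr(per_query_ranks, per_query_NG):
        # per_query_ranks: list (one entry per query) of ground-truth rank lists
        # per_query_NG:    list of NG(q) values
        GTM = max(per_query_NG)
        values = []
        for ranks, NG in zip(per_query_ranks, per_query_NG):
            K = min(4 * NG, 2 * GTM)
            values.append(nmrr(ranks, NG, K))
        return sum(values) / len(values)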

9 Application Areas of MPEG-7. Broadcast media selection (e.g., radio channel, TV channel). Cultural services (history museums, art galleries, etc.). Digital libraries (e.g., image catalogue, musical dictionary, film, video and radio archives). E-Commerce (e.g., personalised advertising, on-line catalogues). Education (e.g., repositories of multimedia courses, multimedia search for material). Multimedia directory services (e.g., yellow pages, tourist information, geographical information systems). Remote sensing (e.g., cartography, natural resources management). Surveillance and investigation services (e.g., human recognition, forensics, traffic control, surface transportation). MPEG-7 will also make the web as searchable for multimedia content as it is searchable for text today. This applies especially to large content archives, which are being made accessible to the public, as well as to multimedia catalogues enabling people to identify content for purchase.

10 MPEG-7 Visual Descriptors: Color Descriptors, Texture Descriptors, Shape Descriptors, Motion Descriptors for Video.

11 Color Descriptors: Dominant Color; Scalable Color (HSV space); Color Structure (HMMD space); Color Layout (YCbCr space); Group of Frames / Group of Pictures histogram. Supported color spaces: R, G, B; Y, Cr, Cb; H, S, V; monochrome; linear transformations of R, G, B; HMMD (hue-min-max-diff). Constrained color spaces: the Scalable Color Descriptor uses HSV, the Color Structure Descriptor uses HMMD.

12 Scalable Color Descriptor (SCD). The Scalable Color Descriptor is a color histogram in the HSV color space encoded using a Haar transform. The binary representation is scalable in terms of the number of bins used (from 16 to 256) and the number of bits per bin. In the case of 256 bins, SCD uniformly quantizes the H component of each pixel into 16 bins and the S and V components into 4 bins each. After all the pixels are processed, the histogram is calculated with the probability for each bin, truncated into an 11-bit value. These values are then non-uniformly quantized into 4-bit values according to the table provided in the ISO specification for more efficient encoding, giving higher significance to small values. The Haar transform is then applied across the histogram bins to the 4-bit values.

13 Scalable Color Descriptor Extraction. As SCD is encoded by a Haar transform, its binary representation is scalable in terms of the number of bins: representations can be stored at different resolutions, ranging from 256 down to 16 coefficients per histogram, and with bit-representation accuracy scalable over a broad range of data rates. (Table: number of coefficients and corresponding numbers of H, S and V bins, with bin values quantized from 11 bits/bin down to 4 bits/bin, and fewer bits/bin when fewer than 256 bins are used.)

14 Performance evaluation. Matching between SCD realizations can be performed using the Haar coefficients or the histogram bin values with an L1 norm. More accurate results are expected by reconstructing the histogram values. (Plot: ANMRR versus number of bits.) Results with different numbers of Haar coefficients (16-256) quantized at different numbers of bits; H-Rec signifies retrieval results after reconstruction of the histogram from the Haar coefficients at full bit resolution.

15 RECALL Haar transform

16 Discrete Wavelet Transform. In numerical analysis and functional analysis, the Discrete Wavelet Transform (DWT) refers to wavelet transforms for which the wavelets are discretely sampled. The first DWT was invented by the Hungarian mathematician Alfréd Haar. For an input represented by a list of 2^n numbers, the Haar wavelet transform may be considered to simply pair up input values, storing the difference and passing the sum. This process is repeated recursively, pairing up the sums to provide the next scale, finally resulting in 2^n - 1 differences and one final sum. The discrete wavelet transform has nice properties: it can be performed in O(n) operations; it captures not only some notion of the frequency content of the input, by examining it at different scales, but also the temporal content, i.e. the times at which these frequencies occur. Combined, these two properties make it an alternative to the conventional Fast Fourier Transform.

17 The Haar wavelet can be described as a step function: f(x) = 1 for 0 <= x < 1/2, f(x) = -1 for 1/2 <= x < 1, f(x) = 0 otherwise. One stage of the transform is given by the 2x2 matrix H = [[1, 1], [1, -1]] (up to normalization). Given a sequence (a_0, a_1, a_2, a_3, ..., a_{2n+1}) of even length, this can be rearranged into a sequence of two-component vectors ((a_0, a_1), ..., (a_{2n}, a_{2n+1})). Multiplying each vector by the matrix H gives the result ((s_0, d_0), ..., (s_n, d_n)) of one stage of the Haar wavelet transform (sum, difference). The two sequences s and d are separated and the process is repeated on the sequence (s_0, s_1, ..., s_n).
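
A short numpy sketch of this pairing procedure (illustrative, using the unnormalized sum/difference form described above):

    import numpy as np

    def haar_stage(a):
        # One stage: pair up values, return (sums, differences)
        a = np.asarray(a, dtype=float)
        return a[0::2] + a[1::2], a[0::2] - a[1::2]

    def haar_transform(a):
        # Full transform of a length-2**n sequence: recursively transform the sums
        a = np.asarray(a, dtype=float)
        coeffs = []
        while len(a) > 1:
            a, d = haar_stage(a)
            coeffs.append(d)          # differences at this scale
        coeffs.append(a)              # final sum
        return coeffs

    # Example: haar_transform([9, 7, 3, 5]) -> [array([ 2., -2.]), array([8.]), array([24.])]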

18 In the one-dimensional case, this is the same as breaking the signal into subbands by passing it through a low-pass filter and a high-pass filter and downsampling both subbands by 2. The DWT of a signal x is calculated by passing it through a series of filters. The samples are passed through a low-pass filter with impulse response g, resulting in a convolution of the two: y[n] = (x * g)[n] = sum_k x[k] g[n - k]. The signal is simultaneously passed through a high-pass filter h, and the decomposition is then repeated on the low-frequency subband for as many levels as desired. If the filters used satisfy certain properties, the original signal can be reconstructed by reversing the procedure. It is important that the two filters are related to each other; such a pair is known as a quadrature mirror filter.

19 The outputs give the detail coefficients (from the high-pass filter) and the approximation coefficients (from the low-pass filter). At each level, since half the frequencies of the signal are removed, half of the samples can be discarded according to Nyquist's rule, so the filter outputs are downsampled by 2: y_low[n] = sum_k x[k] g[2n - k], y_high[n] = sum_k x[k] h[2n - k]. The time resolution is therefore halved (only half of each filter output characterises the signal), while, since each output covers half the frequency band of the input, the frequency resolution is doubled. With the downsampling operator ↓2 the above summations can be written more concisely as y_low = (x * g) ↓ 2 and y_high = (x * h) ↓ 2. Due to the decomposition process, the length of the input signal must be a multiple of 2^n, where n is the number of levels.

20 1D Discrete Wavelet Transform. (Filter-bank diagram: the input x(n) is repeatedly split by a low-pass filter H0 and a high-pass filter H1, each followed by downsampling by 2, producing outputs y0, y1, y2, y3.) H0: low-pass digital filter; H1: high-pass digital filter; z^-1: delay; ↓2: downsample by 2. Recursive application of the wavelet transform in the spatial domain corresponds to a dyadic partition of the data in the frequency domain.
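
A small numpy sketch of one analysis level and its recursive cascade, assuming FIR filters g (low-pass) and h (high-pass) are given (illustrative only; border handling is ignored):

    import numpy as np

    def analysis_level(x, g, h):
        # Filter then downsample by 2: y_low = (x*g)↓2, y_high = (x*h)↓2
        return np.convolve(x, g)[::2], np.convolve(x, h)[::2]

    def dwt(x, g, h, levels):
        # Cascade: recursively decompose the low-pass (approximation) branch
        details = []
        approx = np.asarray(x, dtype=float)
        for _ in range(levels):
            approx, d = analysis_level(approx, g, h)
            details.append(d)
        return approx, details

    # Example with (unnormalized) Haar filters; the input length should be a multiple of 2**levels
    g = np.array([0.5, 0.5])    # low-pass (average)
    h = np.array([0.5, -0.5])   # high-pass (difference)
    approx, details = dwt(np.arange(16, dtype=float), g, h, levels=3)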

21 End RECALL

22 GoF/GoP Color Descriptor. Extends the Scalable Color Descriptor to a video segment or a group of pictures (the joint color histogram is then processed as in SCD, with Haar transform encoding). Histogram aggregation methods: Average - sensitive to outliers (lighting changes, occlusion, text overlays); Median - increased computational complexity due to sorting; Intersection - a "least common color traits" viewpoint. Applications: browsing a large collection of images to find similar images; using histogram intersection as a color similarity measure for clustering a collection of images and representing each cluster by its GoP descriptor.

23 Dominant Color Descriptor (DCD). DCD assumes that a given image is described in terms of a set of region labels and the associated color descriptors: each pixel has a unique region label; each region is characterized by a variable-bin color histogram; colors in a given region are clustered into a small number of representative colors. The descriptor consists of the representative colors, their percentages in the region, the spatial coherency of the color, and the color variance: F = { (c_i, p_i, v_i), s }, i = 1, 2, ..., N, where c_i is the i-th representative color, p_i its percentage in the region, v_i its color variance, and s the spatial coherency.

24 Similarity Distance Measure. Typically, when using DCD, similarity is evaluated by simply comparing the corresponding dominant color percentages and dominant color distances: D^2(F_1, F_2) = sum_{i=1..N1} p_{1i}^2 + sum_{j=1..N2} p_{2j}^2 - sum_{i=1..N1} sum_{j=1..N2} 2 a_{1i,2j} p_{1i} p_{2j}, where a_{k,l} is the similarity coefficient between two colors c_k and c_l: a_{k,l} = 1 - d_{k,l}/d_max if d_{k,l} <= T_d, and a_{k,l} = 0 if d_{k,l} > T_d, with d_{k,l} = ||c_k - c_l|| the Euclidean distance between the two colors, T_d the maximum distance for two colors to be considered similar, and d_max proportional to T_d (d_max = alpha * T_d). The measure is equivalent to the quadratic color histogram distance measure D^2(F_1, F_2) = (F_1 - F_2) A (F_1 - F_2)^T.
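
A hedged Python sketch of this distance (illustrative; the values of T_d and alpha below are assumptions, not the normative parameters):

    import numpy as np

    def dcd_distance(colors1, p1, colors2, p2, Td=20.0, alpha=1.2):
        # colors*: (N, 3) arrays of representative colors; p*: their percentages (summing to 1)
        d_max = alpha * Td
        dist2 = np.sum(np.asarray(p1) ** 2) + np.sum(np.asarray(p2) ** 2)
        for c1, w1 in zip(colors1, p1):
            for c2, w2 in zip(colors2, p2):
                d = np.linalg.norm(np.asarray(c1, float) - np.asarray(c2, float))
                a = 1.0 - d / d_max if d <= Td else 0.0   # similarity coefficient a_{k,l}
                dist2 -= 2.0 * a * w1 * w2
        return max(dist2, 0.0)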

25 Dominant Color Descriptor enhancements. The DCD variance is computed as the variance of each of the dominant colors. The DCD spatial coherency is computed as a single value by the weighted sum of per-dominant-color spatial coherencies, where the weight is proportional to the number of pixels corresponding to each dominant color. The spatial coherency per dominant color captures how coherent the pixels corresponding to that dominant color are and whether they appear as a solid color in the given image region; it gives an idea of the spatial homogeneity of the dominant colors of a region. The spatial coherency per dominant color is computed from the normalized average connectivity (8-connectedness) of the corresponding dominant-color pixels.

26 DCD is suitable for local (object or region) features, when a small number of colors is enough to characterize the color information. Before feature extraction, images must be segmented into regions. A maximum of 8 dominant colors can be used to represent a region (3 bits for the number of colors). Percentage values are quantized to 5 bits each, the variance to 3 bits per dominant color, and the spatial coherency to 5 bits. The color quantization depends on the color-space specification defined for the entire database and need not be specified with each descriptor; experiments used 6 bits per color index.

27 The dominant color representation is sufficiently accurate and compact compared to the traditional color histogram: color bins are quantized from each image region instead of being fixed over the whole color space, giving only a few representative colors per region (about 3 on average) instead of 256 or more histogram bins. It supports efficient database indexing and search: no high-dimensional indexing is needed; the complexity of the search depends only on the desired degree of similarity of the matching, not directly on the database size; insertion and deletion of database entries do not require rebuilding the index structure; retrieval results are accurate and fast compared to the traditional color histogram. It is not effective for smooth regions.

28 Color Structure Descriptor (CSD). Similar to a histogram, the Color Structure Descriptor represents an image by both the color distribution and its local spatial structure. Two images with the same global color distribution but different spatial layouts may not be distinguished by the Scalable Color Descriptor, but the CSD can distinguish them. CSD is obtained by scanning the image with an 8x8 structuring element in a sliding-window approach: with each shift of the structuring element, the number of times a particular color is contained in the structuring element is counted, and a color histogram is constructed. The HMMD color space is used.

29 CSD is characterized by a color structure histogram h(m) for M quantized colors c_m, where M is one of {256, 128, 64, 32}. The bin value h(m) is the number of structuring-element positions containing one or more pixels with color c_m. Let I be the set of quantized color indices of the image and S be the set of quantized color indices present inside the sub-image region covered by the structuring element. As the structuring element scans the image, the color histogram bins are accumulated: the final value of h(m) is determined by the number of positions at which the structuring element contains color c_m. (Figure: an 8x8 structuring element over the image; the bins of the colors present in the window, e.g. c_1 and c_3, are each incremented by 1.) The HMMD color space should be used with the CSD descriptor.
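
A simplified numpy sketch of this accumulation (illustrative; it assumes the image has already been quantized to M color indices and ignores the normative re-scaling of the structuring element for large images):

    import numpy as np

    def color_structure_histogram(indexed_image, M=64, window=8):
        # indexed_image: 2D array of quantized color indices in [0, M)
        H, W = indexed_image.shape
        hist = np.zeros(M, dtype=np.int64)
        for y in range(H - window + 1):
            for x in range(W - window + 1):
                patch = indexed_image[y:y + window, x:x + window]
                # each color present in the 8x8 window counts once for this position
                hist[np.unique(patch)] += 1
        return hist

    def csd_l1_distance(h_a, h_b):
        # L1 matching between two (normalized) color structure histograms
        return np.abs(np.asarray(h_a, float) - np.asarray(h_b, float)).sum()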

30 RECALL HSV and HMMD color spaces

31 HSV Color Space. The HMMD color space regards the colors adjacent to a given color in the color space as its neighboring colors; it is closely related to HSV. (Figure: the HSV cone, with hue running Red (0°), Yellow (60°), Green (120°), Cyan (180°), Blue (240°), Magenta (300°), value from black to white, and saturation increasing outward from the axis.)

32 RGB to HSV. HSV values can be derived from the RGB values:
Max = max(R, G, B); Min = min(R, G, B);
Value = Max;
if (Max = 0) then Saturation = 0; else Saturation = (Max - Min) / Max;
if (Max = Min) then Hue is undefined (achromatic color); otherwise:
if (Max = R && G >= B) Hue = 60 * (G - B) / (Max - Min)
else if (Max = R && G < B) Hue = 360 + 60 * (G - B) / (Max - Min)
else if (G = Max) Hue = 60 * (2.0 + (B - R) / (Max - Min))
else Hue = 60 * (4.0 + (R - G) / (Max - Min))

33 HMMD Color space. In HMMD the Hue is the same as in the HSV space (0-360 degrees); Max and Min are the maximum and minimum of the R, G and B values (indicating how much black and how much white are present, respectively); the Diff component is the difference between Max and Min (how close the color is to a pure color); Sum = (Max + Min) / 2 can also be defined and refers to brightness. Max in [0,1) is obtained with the same RGB transform as V in HSV but over a different subspace; Diff in [0,1) corresponds to S in HSV but over a different subspace. Only three of the four components are sufficient to describe the HMMD space: (H, Max, Min) or (H, Diff, Sum). The HMMD color space can be depicted using a double-cone structure. In the MPEG-7 core experiments for image retrieval, the HMMD color space was very effective and compared favorably with the HSV color space. Note that the HMMD color space is a slight twist on the HSI color space, where the diff component is scaled by the intensity value.
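
A small Python sketch of the RGB-to-HMMD conversion following the definitions above (illustrative; hue is computed with the standard HSV hue rules from the previous slide, with R, G, B assumed in [0, 1]):

    def rgb_to_hmmd(r, g, b):
        mx, mn = max(r, g, b), min(r, g, b)
        diff = mx - mn                 # closeness to a pure color
        s = (mx + mn) / 2.0            # "sum": brightness
        if diff == 0:
            hue = 0.0                  # achromatic: hue undefined, set to 0 by convention
        elif mx == r:
            hue = (60.0 * (g - b) / diff) % 360.0
        elif mx == g:
            hue = 60.0 * (2.0 + (b - r) / diff)
        else:
            hue = 60.0 * (4.0 + (r - g) / diff)
        return hue, mx, mn, diff, s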

34 End RECALL

35 HMMD subspace quantization. Four non-uniform quantizations are defined that partition the space into 256, 128, 64 or 32 cells. Each 3D quantization is defined via five subspaces (subspace 0 to subspace 4): the Diff axis is divided into the 5 subintervals [0,6), [6,20), [20,60), [60,110), [110,255). Within each subspace, Sum and Hue are allowed to take all values in their ranges and are partitioned into uniform intervals according to a table. (Figure: the 128-bin quantization of the HMMD color space, shown on the Hue-Sum double cone from black to white.) HMMD can accomplish a color quantization close to the change of color sensed by the human eye, and is thereby capable of enhancing the performance of a content-based image search system.

36 Matching is performed by computing the L1 distance between CSDs: dist(A, B) = sum_i |h_A(i) - h_B(i)|. CSD provides more accurate similarity retrieval because of the inclusion of spatial color information; this representation is more closely related to human perception and is thus more useful for indexing and retrieval. The color structure histogram describes the color feature very well and can give very high retrieval accuracy. Although the color structure histogram contributes to the higher retrieval accuracy of CSD with respect to DCD, the fixed color-space requirement of the histogram results in redundancy in the representation: for example, a DCD with 8 colors needs more bytes in its binary representation than a DCD with only 4 colors, whereas even the most compact CSD (32 bins) always uses 32 bytes per descriptor, noticeably larger than a DCD.

37 CSD: Experimental results. (Plot: ANMRR versus descriptor bit-length.)

38 Color Layout Descriptor (CLD). CLD is a very compact descriptor (63 bits per image) based on: a grid of dominant colors in the YCbCr color space (the dominant color may also be the average color of each grid cell); a DCT transformation of the 2D array of dominant colors; a final quantization step down to 63 bits. F = {CoefPattern, Y-DCcoef, Cb-DCcoef, Cr-DCcoef, Y-ACcoef, Cb-ACcoef, Cr-ACcoef}. RGB to YCbCr conversion: Y = 0.299*R + 0.587*G + 0.114*B; Cb = -0.169*R - 0.331*G + 0.500*B; Cr = 0.500*R - 0.419*G - 0.081*B.

39 Color Layout Descriptor extraction. The image is partitioned into 64 (8x8) blocks. A single representative color is selected from each block (the average of the pixel colors in the block is suggested as the representative color); the selection results in a tiny image of size 8x8. The derived average colors are transformed into a series of coefficients by performing an 8x8 DCT. A few low-frequency coefficients are selected using zigzag scanning and quantized to form the CLD (a large quantization step is used for the AC coefficients and a small quantization step for the DC coefficients). If the spatial data is smooth (little variation), the energy concentrates in the low-frequency coefficients and the high-frequency coefficients remain small.
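
A rough numpy sketch of this extraction for one channel (illustrative only: it uses an orthonormal DCT-II and a hypothetical zigzag helper, and omits the normative non-linear quantization; per the next slide, n_coeffs would be 6 for Y and 3 for Cb and Cr):

    import numpy as np

    def dct_2d(block):
        # Orthonormal 8x8 DCT-II applied to rows and columns
        N = block.shape[0]
        n = np.arange(N)
        C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
        C[0, :] = np.sqrt(1.0 / N)
        return C @ block @ C.T

    def zigzag_indices(N=8):
        # (row, col) positions ordered along anti-diagonals, as in JPEG zigzag scanning
        return sorted(((r, c) for r in range(N) for c in range(N)),
                      key=lambda rc: (rc[0] + rc[1],
                                      rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

    def cld_channel(channel_image, n_coeffs=6):
        # 1) average color of each of the 64 (8x8) blocks -> tiny 8x8 image
        H, W = channel_image.shape
        tiny = channel_image[:H - H % 8, :W - W % 8] \
                   .reshape(8, H // 8, 8, W // 8).mean(axis=(1, 3))
        # 2) 8x8 DCT of the tiny image, 3) keep the first few coefficients in zigzag order
        coeffs = dct_2d(tiny)
        return [coeffs[r, c] for r, c in zigzag_indices()[:n_coeffs]]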

40 Color Layout Descriptor Matching. CLD is efficient for sketch-based image retrieval and for content filtering using image indexing. The distance between two CLDs, CL = {Y0, ..., Y5, Cr0, Cr1, Cr2, Cb0, Cb1, Cb2} and CL', each with 12 coefficients (6 for Y, 3 each for Cb and Cr), is defined as a sum of weighted Euclidean distances computed per channel: D(CL, CL') = sqrt(sum_i w_yi (Y_i - Y'_i)^2) + sqrt(sum_i w_bi (Cb_i - Cb'_i)^2) + sqrt(sum_i w_ri (Cr_i - Cr'_i)^2), where the weights w emphasize the lower-frequency coefficients.
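
A direct Python transcription of that distance (the weights here are placeholders; the standard defines specific weight tables):

    import numpy as np

    def cld_distance(cld1, cld2, wy=None, wb=None, wr=None):
        # cld = (Y coefficients, Cb coefficients, Cr coefficients), e.g. 6 + 3 + 3 values
        y1, cb1, cr1 = (np.asarray(v, float) for v in cld1)
        y2, cb2, cr2 = (np.asarray(v, float) for v in cld2)
        wy = np.ones_like(y1) if wy is None else np.asarray(wy, float)
        wb = np.ones_like(cb1) if wb is None else np.asarray(wb, float)
        wr = np.ones_like(cr1) if wr is None else np.asarray(wr, float)
        return (np.sqrt(np.sum(wy * (y1 - y2) ** 2))
                + np.sqrt(np.sum(wb * (cb1 - cb2) ** 2))
                + np.sqrt(np.sum(wr * (cr1 - cr2) ** 2)))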

41 RECALL Discrete Cosine Transformation

42 DCT (Discrete Cosine Transformation). The DCT is applied to 8x8 image blocks. For each block, the DCT shifts the data from the spatial domain to the frequency domain: F(u,v) = (1/4) C(u) C(v) sum_{i=0..7} sum_{j=0..7} f(i,j) cos((2i+1)uπ/16) cos((2j+1)vπ/16), with C(0) = 1/sqrt(2) and C(k) = 1 otherwise, where f(i,j) is the value at position (i,j) of the 8x8 block of the original image and F(u,v) is the DCT coefficient at position (u,v) of the 8x8 matrix that encodes the transformed coefficients.

43 The 64 (8 x 8) DCT basis functions. (Figure: the 8x8 grid of cosine basis patterns; F[0,0], the DC basis function, is in the top-left corner.)

44 End RECALL

45 What applications. The Dominant Color descriptor is most suitable for representing local (object or image region) features where a small number of colors is enough to characterize the color information. A spatial coherency on the entire descriptor is also defined and used in similarity retrieval. The Scalable Color descriptor is useful for image-to-image matching and retrieval based on color features; retrieval accuracy increases with the number of bits used in the representation. The Color Layout descriptor allows image-to-image matching at very small computational cost and ultra-high-speed sequence-to-sequence matching, also at different resolutions; it is feasible to apply in mobile terminal applications where the available resources are strictly limited, and users can easily introduce the perceptual sensitivity of the human vision system into the similarity calculation. The Color Structure descriptor is suited to image-to-image matching, and its intended use is for still natural-image retrieval, where an image may consist of either a single rectangular frame or arbitrarily shaped, possibly disconnected, regions.

46 Texture Descriptors: Homogenous Texture Descriptor; Non-Homogenous Texture Descriptor (Edge Histogram).

47 Homogenous Texture Descriptor (HTD). Procedure for HTD extraction: partition the frequency domain into 30 channels (modeled by 2D Gabor functions); compute the energy and energy deviation for each channel; compute the mean and standard deviation of the image (f_DC, f_SD). The descriptor is F = {f_DC, f_SD, e_1, ..., e_30, d_1, ..., d_30}. With HTD one can perform rotation-invariant matching, intensity-invariant matching (f_DC removed from the feature vector), and scale-invariant matching.

48 RECALL GABOR function, energy function

49 1D Gabor Function. The Gabor filter is used to determine the sinusoidal frequency and phase content of local sections of a signal as it changes over time. The function to be transformed is first multiplied by a Gaussian function, which can be regarded as a window, and the resulting function is then transformed with a Fourier transform to derive the time-frequency analysis. The window function means that the signal near the time being analyzed is given higher weight. The Gabor transform of a signal x(t) is defined by G_x(τ, ω) = ∫ x(t) e^{-π(t-τ)^2} e^{-jωt} dt. Through time-frequency analysis with the Gabor transform, the occupied frequency bands of a signal can be identified, so that the remaining bandwidth can be used for other applications and bandwidth is saved.

50 2D Gabor Function. It is a Gaussian-weighted sinusoid. It is used to model the individual channels: each channel filters a specific type of texture.

51 Energy Function. The energy in the i-th channel is defined as e_i = log10(1 + p_i), where p_i = sum over (ω, θ) of [G_{s,r}(ω, θ) · P(ω, θ)]^2, P(ω, θ) being the Fourier transform of the image represented in the polar frequency domain and G_{s,r} the Gaussian channel function G_{s,r}(ω, θ) = exp(-(ω - ω_s)^2 / (2 σ_s^2)) · exp(-(θ - θ_r)^2 / (2 σ_r^2)).
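
A rough numpy sketch of computing these channel energies (an illustrative approximation, not the normative HTD extraction: the channel center frequencies, bandwidths and the frequency-plane sampling below are simplified assumptions; only the 5-scale by 6-orientation layout follows the 30-channel partition above):

    import numpy as np

    def channel_energies(image, n_scales=5, n_orient=6):
        # Fourier transform of the image, centered, expressed in polar frequency coordinates
        F = np.fft.fftshift(np.fft.fft2(image))
        h, w = image.shape
        fy, fx = np.meshgrid(np.fft.fftshift(np.fft.fftfreq(h)),
                             np.fft.fftshift(np.fft.fftfreq(w)), indexing='ij')
        omega = np.sqrt(fx ** 2 + fy ** 2)                 # radial frequency
        theta = np.degrees(np.arctan2(fy, fx)) % 180.0     # orientation in [0, 180)

        energies = []
        for s in range(n_scales):                  # octave-spaced radial centers (assumed values)
            w_s, sigma_s = 0.25 / 2 ** s, 0.1 / 2 ** s
            for r in range(n_orient):              # orientations spaced 30 degrees apart
                t_r, sigma_r = 30.0 * r, 15.0
                G = (np.exp(-(omega - w_s) ** 2 / (2 * sigma_s ** 2))
                     * np.exp(-(theta - t_r) ** 2 / (2 * sigma_r ** 2)))
                p = np.sum((G * np.abs(F)) ** 2)   # p_i: sum of squared weighted spectrum
                energies.append(np.log10(1.0 + p)) # e_i = log10(1 + p_i)
        return energies                            # 30 values: 5 scales x 6 orientations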

52 End RECALL

53 An efficient HTD implementation: a Radon transform followed by a 1D Fourier transform. (Diagram: 2D image f(x,y) -> Radon transform -> 1D projections p(R, θ) -> 1D Fourier transform F(p(R, θ)) -> resulting sampling grid in polar coordinates.)

54 RECALL RADON transform

55 Radon Transform. Transforms an image containing lines into a domain of possible line parameters: each line is transformed into a peak at the corresponding parameters in the resulting image.

56 Definition: let f(x) = f(x,y) be a continuous function vanishing outside some large disc in the Euclidean plane R^2. The Radon transform Rf is a function defined on the space of straight lines L in R^2 by the line integral along each such line: Rf(L) = ∫_L f(x) |dx|. In practice, any straight line L can be parametrized by (x(t), y(t)) = (t sin α + s cos α, -t cos α + s sin α), where s is the distance of L from the origin and α is the angle the normal vector to L makes with the x axis. The quantities (α, s) can be considered as coordinates on the space of all lines in R^2, and the Radon transform can be expressed in these coordinates by Rf(α, s) = ∫ f(t sin α + s cos α, -t cos α + s sin α) dt. The Hough transform, when written in a continuous form, is very similar, if not equivalent, to the Radon transform.
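
A compact discrete approximation in Python, assuming scipy is available (rotate the image and sum along columns; an illustrative sketch rather than an exact evaluation of the integral):

    import numpy as np
    from scipy import ndimage

    def radon_transform(image, angles_deg):
        # For each angle, rotate the image and integrate along one axis:
        # summing the columns of the rotated image approximates the line integrals
        sinogram = np.zeros((image.shape[1], len(angles_deg)))
        for k, angle in enumerate(angles_deg):
            rotated = ndimage.rotate(image, angle, reshape=False, order=1)
            sinogram[:, k] = rotated.sum(axis=0)
        return sinogram

    # Example usage: projections every 5 degrees
    # sinogram = radon_transform(img, np.arange(0, 180, 5))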

57 End RECALL

58 Texture Browsing Descriptor. Uses the same spatial filtering procedure as the HTD (scale- and orientation-selective band-pass filters) and characterizes a texture by regularity (periodic to random), coarseness (fine grain to coarse) and directionality (in steps of 30 degrees); e.g., one can look for textures that are very regular and oriented at 30 degrees. The texture browsing descriptor can be used to find a set of candidates with similar perceptual properties, after which the HTD can be used to get a precise similarity match list among the candidate images.

59 Non-Homogenous Texture Descriptor: Edge Histogram Descriptor (EHD). Represents the spatial distribution of five types of edges: vertical, horizontal, 45-degree, 135-degree, and non-directional. The image is divided into 16 (4x4) sub-images and a 5-bin histogram is generated for each, giving F = {BinCounts[k]}, k = 1, ..., 80. It is scale invariant, but cannot be used for object-based image retrieval. Strong edges are retained by thresholding (e.g., with a Canny edge operator); if the edge threshold Th_edge is set to 0, EHD also applies to binary edge images (sketch-based retrieval). The extended EHD achieves better results but does not exhibit the rotation-invariance property.

60 EHD extraction: basic (80 bins) and extended (150 bins). The extended descriptor adds global and semi-global histograms computed from the basic one (13 clusters of sub-images for the semi-global part). (Figure: edge-map image obtained with the Canny edge operator.)
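
A simplified Python sketch of the 80-bin extraction (illustrative assumptions: 2x2 edge filters with commonly cited coefficients, a hypothetical edge-strength threshold, each 2x2 pixel block standing in for an image-block, and no normalization or quantization of the bin counts):

    import numpy as np

    # 2x2 filter coefficients for the five edge types (vertical, horizontal,
    # 45-degree, 135-degree, non-directional) -- values as commonly reported for EHD
    EDGE_FILTERS = np.array([
        [1, -1, 1, -1],                        # vertical
        [1, 1, -1, -1],                        # horizontal
        [np.sqrt(2), 0, 0, -np.sqrt(2)],       # 45 degrees
        [0, np.sqrt(2), -np.sqrt(2), 0],       # 135 degrees
        [2, -2, -2, 2],                        # non-directional
    ])

    def edge_histogram(gray, blocks=4, cell=2, threshold=11.0):
        H, W = gray.shape
        bh, bw = H // blocks, W // blocks
        hist = np.zeros((blocks * blocks, 5))
        for by in range(blocks):
            for bx in range(blocks):
                sub = gray[by * bh:(by + 1) * bh, bx * bw:(bx + 1) * bw]
                for y in range(0, bh - cell + 1, cell):
                    for x in range(0, bw - cell + 1, cell):
                        v = sub[y:y + cell, x:x + cell].flatten().astype(float)
                        strengths = np.abs(EDGE_FILTERS @ v)
                        k = int(np.argmax(strengths))
                        if strengths[k] > threshold:    # keep only strong edges
                            hist[by * blocks + bx, k] += 1
        return hist.flatten()   # 16 sub-images x 5 edge types = 80 bins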

61 What applications. The Homogenous Texture descriptor is for searching and browsing through large collections of similar-looking patterns. An image can be considered as a mosaic of homogeneous textures, so the texture features associated with the regions can be used to index the image data. The Texture Browsing descriptor is useful for representing homogeneous texture in browsing-type applications and requires only 12 bits (maximum); it provides a perceptual characterization of texture, similar to a human characterization, in terms of regularity, coarseness and directionality. Since edges play an important role in image perception, the Edge Histogram descriptor can retrieve images with similar semantic meaning; it targets image-to-image matching (by example or by sketch), especially for natural images with non-uniform edge distribution. The image retrieval performance can be significantly improved if the edge histogram descriptor is combined with other descriptors such as the color histogram descriptor.

62 Shape Descriptors: Region-based Descriptor; Contour-based Shape Descriptor; 2D/3D Shape Descriptor; 3D Shape Descriptor.

63 A shape is the outline or characteristic surface configuration of a thing: a contour, a form. A shape cannot be described through text. Shape representation and matching is one of the major and oldest research topics of pattern recognition and computer vision. The invariance property of the representation - shape representations being left unaltered under a set of transformations - plays a very important role in recognizing the same object even in a translated, rotated, scaled or shrunk view.

64 Region-based Descriptor (RBD). Expresses the pixel distribution within a 2-D object region. Employs a complex 2D Angular Radial Transformation (ART): F_nm = <V_nm(ρ,θ), f(ρ,θ)> = ∫_0^{2π} ∫_0^1 V*_nm(ρ,θ) f(ρ,θ) ρ dρ dθ, with angular basis A_m(θ) = (1/2π) exp(jmθ) and radial basis R_n(ρ) = 1 for n = 0, R_n(ρ) = 2 cos(πnρ) for n ≠ 0. The descriptor is F = {MagnitudeOfART[k]}, k = 1, ..., n x m, using m = 12 angular (m = 0, ..., 11) and n = 3 radial (n = 0, 1, 2) basis functions.

65 ART is a 2-D complex transform defined on a unit disk in polar coordinates: F_nm = <V_nm(ρ,θ), f(ρ,θ)> = ∫_0^{2π} ∫_0^1 V*_nm(ρ,θ) f(ρ,θ) ρ dρ dθ, where f(ρ,θ) is the image function in polar coordinates and V_nm(ρ,θ) is the ART basis function. The ART basis functions are separable along the angular and radial directions, i.e., V_nm(ρ,θ) = A_m(θ) R_n(ρ). The angular and radial basis functions are defined as A_m(θ) = (1/2π) exp(jmθ) and R_n(ρ) = 1 for n = 0, R_n(ρ) = 2 cos(πnρ) for n ≠ 0.
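
An illustrative numpy sketch of computing ART coefficient magnitudes from a binary shape image (a naive direct evaluation over pixels inside the unit disk; the normalization and quantization details of the standard are omitted):

    import numpy as np

    def art_coefficients(shape_image, n_radial=3, n_angular=12):
        # shape_image: 2D array (e.g. a binary mask) centered on the object
        H, W = shape_image.shape
        y, x = np.mgrid[0:H, 0:W]
        cx, cy = (W - 1) / 2.0, (H - 1) / 2.0
        # map pixel coordinates onto the unit disk
        rho = np.sqrt(((x - cx) / cx) ** 2 + ((y - cy) / cy) ** 2)
        theta = np.arctan2(y - cy, x - cx)
        inside = rho <= 1.0

        f = shape_image.astype(float)[inside]
        rho, theta = rho[inside], theta[inside]

        coeffs = np.zeros((n_radial, n_angular))
        for n in range(n_radial):
            R = np.ones_like(rho) if n == 0 else 2.0 * np.cos(np.pi * n * rho)
            for m in range(n_angular):
                A = np.exp(1j * m * theta) / (2.0 * np.pi)
                V = A * R                                        # V_nm = A_m(theta) R_n(rho)
                coeffs[n, m] = np.abs(np.sum(np.conj(V) * f))    # |F_nm|, up to a constant
        return coeffs / coeffs[0, 0]   # magnitudes normalized by |F_00|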

66 RBD Applicability. (Refers to example shapes in a figure: applicable to shapes (a)-(e); distinguishes (i) from (g) and (h); (j), (k) and (l) are judged similar.) Advantages: describes complex shapes with disconnected regions; robust to segmentation noise; small size; fast extraction and matching.

67 Contour-Based Descriptor (CBD). The Contour-Based Descriptor is based on the Curvature Scale-Space representation: it finds the curvature zero-crossing points of the shape's contour (key points); it reduces the number of key points step by step by applying Gaussian smoothing; the positions of the key points are expressed relative to the length of the contour curve.

68 CBD Applicability. (Refers to example shapes in a figure: applicable to (a); distinguishes the differences in (b); finds the similarities in (c)-(e).) Advantages: captures the shape very well; robust to noise, scale and orientation changes; fast and compact.

69 How the Contour is calculated? N equidistant points are selected on the contour, starting from an arbitrary point on the contour and following the contour clockwise. The x-coordinates of the selected N points are grouped together and the y-coordinates are also grouped together into two series X, Y. The contour is then gradually smoothed by repetitive application of a low-pass filter with the kernel (0.25,0.5,0.25) to X and Y coordinates of the selected N contour points

70 GlobalCurvatureVector. This element specifies global parameters of the contour, namely the eccentricity and the circularity. Circularity = perimeter^2 / area; for a circle, circularity = (2πr)^2 / (πr^2) = 4π. Eccentricity is computed from the central moments of the contour points, i_02 = sum_k (y_k - y_c)^2, i_20 = sum_k (x_k - x_c)^2, i_11 = sum_k (x_k - x_c)(y_k - y_c), as eccentricity = sqrt( (i_20 + i_02 + sqrt((i_20 - i_02)^2 + 4 i_11^2)) / (i_20 + i_02 - sqrt((i_20 - i_02)^2 + 4 i_11^2)) ).
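
A direct Python transcription of these two global parameters for a sampled contour (illustrative; perimeter and area are computed with simple polygon formulas):

    import numpy as np

    def global_curvature(xs, ys):
        # xs, ys: coordinates of N contour points, ordered along the contour
        xs, ys = np.asarray(xs, float), np.asarray(ys, float)
        # perimeter: sum of segment lengths (closed contour); area: shoelace formula
        perimeter = np.sum(np.hypot(np.diff(np.r_[xs, xs[0]]), np.diff(np.r_[ys, ys[0]])))
        area = 0.5 * abs(np.sum(xs * np.roll(ys, -1) - ys * np.roll(xs, -1)))
        circularity = perimeter ** 2 / area

        xc, yc = xs.mean(), ys.mean()
        i20 = np.sum((xs - xc) ** 2)
        i02 = np.sum((ys - yc) ** 2)
        i11 = np.sum((xs - xc) * (ys - yc))
        root = np.sqrt((i20 - i02) ** 2 + 4 * i11 ** 2)
        eccentricity = np.sqrt((i20 + i02 + root) / (i20 + i02 - root))
        return circularity, eccentricity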

71 Comparison of the Region-Based and Contour-Based descriptors. (Figure: blue boxes mark shapes judged similar by the Region-Based descriptor, yellow boxes mark shapes judged similar by the Contour-Based descriptor.)

72 What applications. The Region Shape descriptor makes use of all pixels constituting the shape within a frame and can describe any shape. It is also characterized by a small size and fast extraction and matching: the data size for this representation is fixed at 17.5 bytes, and the feature extraction and matching processes have a low order of computational complexity, making them suitable for tracking shapes in video data processing. The Contour Shape descriptor captures perceptually meaningful features of the shape, enabling similarity-based retrieval. It is robust to non-rigid motion, to partial occlusion of the shape, and to perspective transformations, which result from changes of the camera parameters and are common in images and video.


EE795: Computer Vision and Intelligent Systems EE795: Computer Vision and Intelligent Systems Spring 2012 TTh 17:30-18:45 WRI C225 Lecture 04 130131 http://www.ee.unlv.edu/~b1morris/ecg795/ 2 Outline Review Histogram Equalization Image Filtering Linear

More information

(Refer Slide Time 00:17) Welcome to the course on Digital Image Processing. (Refer Slide Time 00:22)

(Refer Slide Time 00:17) Welcome to the course on Digital Image Processing. (Refer Slide Time 00:22) Digital Image Processing Prof. P. K. Biswas Department of Electronics and Electrical Communications Engineering Indian Institute of Technology, Kharagpur Module Number 01 Lecture Number 02 Application

More information

Schedule for Rest of Semester

Schedule for Rest of Semester Schedule for Rest of Semester Date Lecture Topic 11/20 24 Texture 11/27 25 Review of Statistics & Linear Algebra, Eigenvectors 11/29 26 Eigenvector expansions, Pattern Recognition 12/4 27 Cameras & calibration

More information

Content-Based Image Retrieval of Web Surface Defects with PicSOM

Content-Based Image Retrieval of Web Surface Defects with PicSOM Content-Based Image Retrieval of Web Surface Defects with PicSOM Rami Rautkorpi and Jukka Iivarinen Helsinki University of Technology Laboratory of Computer and Information Science P.O. Box 54, FIN-25

More information

Sampling and Reconstruction

Sampling and Reconstruction Sampling and Reconstruction Sampling and Reconstruction Sampling and Spatial Resolution Spatial Aliasing Problem: Spatial aliasing is insufficient sampling of data along the space axis, which occurs because

More information

Part 3: Image Processing

Part 3: Image Processing Part 3: Image Processing Image Filtering and Segmentation Georgy Gimel farb COMPSCI 373 Computer Graphics and Image Processing 1 / 60 1 Image filtering 2 Median filtering 3 Mean filtering 4 Image segmentation

More information

Computer vision: models, learning and inference. Chapter 13 Image preprocessing and feature extraction

Computer vision: models, learning and inference. Chapter 13 Image preprocessing and feature extraction Computer vision: models, learning and inference Chapter 13 Image preprocessing and feature extraction Preprocessing The goal of pre-processing is to try to reduce unwanted variation in image due to lighting,

More information

VIDEO AND IMAGE PROCESSING USING DSP AND PFGA. Chapter 3: Video Processing

VIDEO AND IMAGE PROCESSING USING DSP AND PFGA. Chapter 3: Video Processing ĐẠI HỌC QUỐC GIA TP.HỒ CHÍ MINH TRƯỜNG ĐẠI HỌC BÁCH KHOA KHOA ĐIỆN-ĐIỆN TỬ BỘ MÔN KỸ THUẬT ĐIỆN TỬ VIDEO AND IMAGE PROCESSING USING DSP AND PFGA Chapter 3: Video Processing 3.1 Video Formats 3.2 Video

More information

A Keypoint Descriptor Inspired by Retinal Computation

A Keypoint Descriptor Inspired by Retinal Computation A Keypoint Descriptor Inspired by Retinal Computation Bongsoo Suh, Sungjoon Choi, Han Lee Stanford University {bssuh,sungjoonchoi,hanlee}@stanford.edu Abstract. The main goal of our project is to implement

More information

CS 556: Computer Vision. Lecture 18

CS 556: Computer Vision. Lecture 18 CS 556: Computer Vision Lecture 18 Prof. Sinisa Todorovic sinisa@eecs.oregonstate.edu 1 Color 2 Perception of Color The sensation of color is caused by the brain Strongly affected by: Other nearby colors

More information

Automatic Video Caption Detection and Extraction in the DCT Compressed Domain

Automatic Video Caption Detection and Extraction in the DCT Compressed Domain Automatic Video Caption Detection and Extraction in the DCT Compressed Domain Chin-Fu Tsao 1, Yu-Hao Chen 1, Jin-Hau Kuo 1, Chia-wei Lin 1, and Ja-Ling Wu 1,2 1 Communication and Multimedia Laboratory,

More information

Computer Vision. Recap: Smoothing with a Gaussian. Recap: Effect of σ on derivatives. Computer Science Tripos Part II. Dr Christopher Town

Computer Vision. Recap: Smoothing with a Gaussian. Recap: Effect of σ on derivatives. Computer Science Tripos Part II. Dr Christopher Town Recap: Smoothing with a Gaussian Computer Vision Computer Science Tripos Part II Dr Christopher Town Recall: parameter σ is the scale / width / spread of the Gaussian kernel, and controls the amount of

More information

Introduction. Introduction. Related Research. SIFT method. SIFT method. Distinctive Image Features from Scale-Invariant. Scale.

Introduction. Introduction. Related Research. SIFT method. SIFT method. Distinctive Image Features from Scale-Invariant. Scale. Distinctive Image Features from Scale-Invariant Keypoints David G. Lowe presented by, Sudheendra Invariance Intensity Scale Rotation Affine View point Introduction Introduction SIFT (Scale Invariant Feature

More information

SUMMARY: DISTINCTIVE IMAGE FEATURES FROM SCALE- INVARIANT KEYPOINTS

SUMMARY: DISTINCTIVE IMAGE FEATURES FROM SCALE- INVARIANT KEYPOINTS SUMMARY: DISTINCTIVE IMAGE FEATURES FROM SCALE- INVARIANT KEYPOINTS Cognitive Robotics Original: David G. Lowe, 004 Summary: Coen van Leeuwen, s1460919 Abstract: This article presents a method to extract

More information