Multimedia Information Retrieval. Norbert Fuhr. SIGIR 98.


1 Multimedia Information Retrieval
Norbert Fuhr
SIGIR 98

Chapter 1 Introduction

media types: text, images, audio, video

2 terminology:
  monomedia object/document: object containing data of a single medium
  multimedia object/document: object containing data of multiple media
  hypertext document: nonlinear text document (i.e. with links)
  hypermedia document: nonlinear multimedia document

document structures and attributes
  [Figure: an example document ("IR in networks" by J. Doe) described by its content structure (IR, networks, heterogeneity, effectiveness, user friendliness), its logical structure (head, title, author, chapters, sections), its layout structure, and its external attributes (author = J. Doe, crdate, ladate)]

3 characteristics of multimedia information systems:
  managing data of multiple media
  large storage and bandwidth requirements
  continuous delivery (to avoid jitter)
  synchronization between several channels (e.g. between audio and video)
  content- and similarity-based access
  user-friendly interfaces

tasks:
  representation and transformation
  storage and compression
  communication and synchronization
  authoring and cooperation
  presentation
  content analysis
  browsing
  retrieval and filtering
  user interfaces
  versioning

4 Course structure:
  1. introduction
  2. media
  3. retrieval
  4. indexing

Chapter 2 Media
  media types and dimensions
  views on media objects
  FERMI multimedia data model

5 2.1 Media types
  [Figure: media types placed along the spatial dimensions x, y and the time dimension t: text, image, audio, video]
  text is a linear medium...

2.2 Views on media objects
here: images
  physical view: pixel matrix
  logical view:
    perceptive view: colour, brightness, texture
    symbolic view
    spatial view: spatial relations (depending on modelling space)
    structural view: set of image objects, structural relations between image objects (aggregation)

7 2.3 The FERMI Multimedia Document Model

Document structure and IR
impact of structure:
  on semantic content: logical structure ≙ discourse structure
  on corpus of multimedia information: heterogeneity of multimedia data
classic IR: document = atomic unit
MMIR: retrieval of document components

8 2.3.2 Elements of the multimedia data model
  logical structure
    hierarchy of structural objects
    leaves = single-media data
    implements explicit organization of discourse
    other data model elements refer to logical structure
  attributes
    classical attributes (author, dates, ...)
    index expressions
  navigational structure
    links

The logical structure
logical structure ≙ hierarchical aggregation of structural objects:
  $LS = (OS, str, seq, TYPEST, tst, typest, TYPEM, typem)$
  OS: finite set of document structural objects, elements: $os_i$
  str: aggregative relation between structural objects, defines hierarchical composition
  seq: defines a linear sequence on OS (corresponds to the standard, linear order to access components)
  TYPEST: set of types of structural objects; types correspond to abstraction levels
  tst: relation on structural object types defining the hierarchy of abstraction levels
  typest: total function assigning each structural object its structural type in TYPEST
  TYPEM: set of media types, $TYPEM = \{text, image, graphic, multimedia\}$
  typem: total function assigning to each structural object its media type in TYPEM
e.g. for books: $TYPEST = \{Document, Chapter, Section, Sub Section, Paragraph, Figure\}$

9 Attributes
  $A = (OS, NAMEA, VALUEA, namea, domaina, valuea, SM)$
where:
  OS: the set of structural objects in the document, elements: $os_i$
  NAMEA: set of attribute names
  VALUEA: set of all possible attribute values (union of all the domain languages of all attributes)
  namea: partial function associating to structural objects a nonempty set of attribute names
  domaina: total function defining the domain of any attribute name (i.e. all the expressions of its associated language)
  valuea: partial function assigning to structural objects the value for a related attribute name (definition allows multi-valued attributes)

Content Attributes
single-media models involve up to five types of views:
  the physical view
  the structural view
  the symbolic view
  the spatial view (only in image and graphic models)
  the perceptive view (only in image and graphic models)
standard attribute names (called Content Attributes) for views: physical, structural, symbolic, spatial, perceptive

10 Attribute Classes
attributes ≙ properties assigned to elements of the logical structure
problem in structured documents: propagation of properties among related structural objects
  example: author names propagated (and collected) bottom-up
type of propagation may depend on
  specific attribute (e.g. author vs. publication date)
  type of document (e.g. conference proceedings, encyclopediae etc.)
classes of attributes:
  dynamic attributes
    descending (e.g. publication date)
    ascending (e.g. author)
  static (e.g. title)

Indexing model
indexing: assign index expressions to document structural objects
retrieval of multimedia documents: retrieve smallest units that fulfill the query
index expressions assigned to a parent object have to imply the index expressions of its component objects
index objects: structural objects that are indexed (assigned a value of attribute symbolic)
index model of a document base:
  $I = (OI, TYPEI, ind)$
  OI: set of index objects $oi_i$, $OI \subseteq OS$
  TYPEI: set of index object types, $TYPEI \subseteq TYPEST$
  ind: relation representing structural dependency between index objects, $ind \subseteq OI \times OI$

11 parts of the structural and semantic views of a document
  [Figure: structural view with structural objects Os1 to Os19 arranged on the levels Document, Chapter, Section, Subsection, Paragraph (units U1 to U7), next to the semantic view with the corresponding index objects Osem2, Osem3, Osem8 to Osem11]

Example of an indexing structure
example of structure and index hierarchy types (index objects of type Chapter or Subsection only)
  [Figure: structure types Document, Chapter, Section, Subsection, Paragraph mapped to symbolic types]

12 2.3.3 Document Model, Document Base and Hyperbase

Document Model
combines logical structure and attributes: $D = (LS, A)$

The Document Base
document base B: set of documents
document base as structure: $B = (B, D)$

The Hyperbase
a document database does not allow browsing based on navigation links
support for browsing and querying: index objects + navigation links

13 Navigation structure
  $N = (ON, RNAV, TYPEL, cross, typel)$
  ON: set of node objects (nodes) $n_i$ of the hyperbase, $ON \subseteq OS_B$
  RNAV: relation defining navigation links on nodes (intra-document links / inter-document links), $RNAV \subseteq ON \times ON$
  TYPEL: set of link types, e.g. Same Author, Similar topic
  cross: standard access function related to navigation links, $cross: RNAV \to ON$
    link $(n_i, n_j)$: $n_i$ source, $n_j$ target
    $\forall (n_i, n_j) \in RNAV: cross(n_i, n_j) = n_j$
  typel: total function assigning a type to each link, $typel: RNAV \to TYPEL$

Hyperbase
  $H = (B, N, I)$
  B: document base
  N: navigation structure
  I: index model

14 Chapter 3 Multimedia Retrieval
  the logical view on IR
  models based on predicate logic
  retrieval of structured documents

3.1 The logical view on IR
  IR as inference
  IR as uncertain inference
  Propositional vs. predicate logic

15 3.1.1 IR as inference
q: query, d: document
retrieval: search for documents which imply the query: $d \to q$
example:
  $d = \{t_1, t_2, t_3\}$
  $q = \{t_1, t_3\}$
logical view:
  $d = t_1 \wedge t_2 \wedge t_3$
  $q = t_1 \wedge t_3$
  $\Rightarrow d \to q$
advantage of inference-based approach: step from term-based to knowledge-based retrieval, e.g. easy incorporation of additional knowledge
example:
  d: squares
  q: rectangles
  thesaurus: squares $\to$ rectangles
  $\Rightarrow d \to q$
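The inference view above can be tried out with a small sketch. It treats a document as the conjunction of its terms and a thesaurus as a set of implications between terms. The function name `implies` and the dictionary encoding of the thesaurus are illustrative assumptions, not part of the tutorial.

```python
def implies(document_terms, query_terms, thesaurus=None):
    """Return True if the conjunction of document terms implies the query.

    thesaurus maps a term to the terms it implies (hypothetical encoding),
    e.g. {"squares": {"rectangles"}}.
    """
    thesaurus = thesaurus or {}
    known = set(document_terms)
    frontier = list(known)
    # Forward chaining: add everything the thesaurus lets us derive.
    while frontier:
        term = frontier.pop()
        for broader in thesaurus.get(term, ()):
            if broader not in known:
                known.add(broader)
                frontier.append(broader)
    # d -> q holds if every query term is among the derived terms.
    return set(query_terms) <= known


print(implies({"t1", "t2", "t3"}, {"t1", "t3"}))                        # True
print(implies({"squares"}, {"rectangles"}))                            # False
print(implies({"squares"}, {"rectangles"}, {"squares": {"rectangles"}}))  # True
```

Without the thesaurus the squares document is missed by purely term-based matching; with it, the implication goes through.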

16 3.1.2 IR as uncertain inference
  d: quadrangles
  q: rectangles
  uncertain knowledge required $\Rightarrow$ quadrangles $\xrightarrow{0.3}$ rectangles
[Rijsbergen 86]: IR as uncertain inference
  Retrieval ≙ estimate probability $P(d \to q) = P(q \mid d)$
  [Figure: terms t1 to t6 with the overlapping term sets of d and q]

Propositional vs. predicate logic
limitations of propositional logic:
  document attributes
    query: documents published after 1990?
    ?- pubyear(D,Y) & Y > 1990
  conventional indexing (based on propositional logic): d = {tree, house}
    query: Is there a picture with a tree on the left of the house?
    query cannot be expressed in propositional logic
    $\Rightarrow$ predicate logic:
    d: tree(t1). house(h1). left(h1,t1).
    ?- tree(X) & house(Y) & left(X,Y).
  multimedia retrieval

17 3.2 Models based on predicate logic
  Terminological logic
  Datalog
  Probabilistic Datalog

Terminological logic
Thesaurus
  [Figure: thesaurus hierarchy with polygon at the top; below it regular polygon, triangle, quadrangle, ...; below those rectangle, regular triangle, square]
thesaurus knowledge can be expressed in propositional logic:
  square = quadrangle $\wedge$ regular-polygon
terminological logic
  based on semantic networks
  more expressive than thesauri:
    instances of concepts
    roles between (instances of) concepts

18 Elements of terminological logic
  concepts: monadic predicates
    person, document
  roles: dyadic predicates
    author, refers-to
  terminological axioms: describe relationships between concepts and roles
    connotations: necessary conditions
      man < person
    definitions: necessary and sufficient conditions
      square = rectangle and regular-polygon
  assertions: define instances of concepts and roles
    document[d123]. person[smith]. author[d123,smith].

MIRTL: Multimedia Information Retrieval Terminological Logic
  terminological module: concepts, roles, definitions, connotations
  assertional module: assertions

19 regular-triangle = (and triangle regular-polygon)
german-paper = (and paper (c-some author german))
student-paper = (and paper (all author student))
non-german = (and person (a-not german))
unido = (sing univ-dortmund)
multilingual = (and person (atleast 2 speaks-lang))
chinese-parent = (and chinese (atmost 1 child))

MIRTL Syntax
  <concept> ::= <monadic predicate symbol>
              | (top)
              | (bottom)
              | (a-not <monadic predicate symbol>)
              | (sing <individual constant>)
              | (and <concept>+)
              | (all <role> <concept>)
              | (c-some <role> <concept>)
              | (atleast <natural number> <role>)
              | (atmost <natural number> <role>)
  <role>    ::= <dyadic predicate symbol>
              | (inv <role>)
abbreviations:
  (exactly n R) ≙ (and (atleast n R) (atmost n R))
  (func R C)    ≙ (and (all R C) (exactly 1 R))
  (no R)        ≙ (atmost 0 R)

20 student = (and person (atleast 1 enrolled) (atmost 1 enrolled) (all enrolled university))
        = (and person (exactly 1 enrolled) (all enrolled university))
        = (and person (func enrolled university))
bachelor = (and man (no spouse))

Retrieval with terminological logic
modelling of documents:
  external attributes
  logical structure
  layout structure
  content structure
  terminological knowledge (thesaurus)
queries: MIRTL as query language

21 (and paper
     (func appears-in (sing SIGIR93))
     (all author (func affiliation (sing IEI-CNR)))
     (c-some author (sing Carlo-Meghini))
     (c-some author (sing Fabrizio-Sebastiani))
     (c-some author (sing Umberto-Straccia))
     (c-some author (sing Costantino-Thanos))
     (exactly 4 author)) [paper666]
(and (func typeset-with (sing LaTeX))
     (func format (sing double-column))
     (no figure)
     (no running-header)
     (no running-footer)) [paper666]
(and (exactly 1 abstract)
     (exactly 5 section)
     (exactly 1 bibliography)) [paper666]
bibliography [paper666,bib666]
(and (func typeset-with (sing BibTeX))
     (func style (sing plain))
     (exactly 22 reference)) [bib666]
(and (c-some dw (sing Mirtl))
     (c-some dw (sing syn666))
     (c-some dw (sing sem666))
     (c-some dw (sing alg666))
     (c-some dw (sing terminological-logic (c-some modeling-tool (sing IR))))) [paper666]
terminological-logic [Mirtl]
syntax [Mirtl,syn666]
semantics [Mirtl,sem666]
inferential-algorithm [Mirtl,alg666]

Papers by Thanos about terminological logic?
  (and paper
       (c-some author (sing Costantino-Thanos))
       (c-some dw (sing terminological-logic)))
Papers by Thanos on semantics of terminological logic?
  (and paper
       (c-some author (sing Costantino-Thanos))
       (c-some dw (c-some (inv semantics) terminological-logic)))

22 italian[Carlo-Meghini]
italian < european
Papers by European author?
  (and paper (c-some author european))

Modelling IR in Datalog
Introduction
Datalog:
  Horn predicate logic (most IR models are based on propositional logic)
  no functions
  restricted forms of negation allowed
  sound and complete evaluation algorithms

23 ground facts:
  docterm(d1,ir). docterm(d2,ir).
  docterm(d1,db). docterm(d2,oop).
rules:
  irdoc(D) :- docterm(D,ir).
  iranddb(D) :- docterm(D,ir) & docterm(D,db).
  irnotdb(D) :- docterm(D,ir) & not(docterm(D,db)).
recursive rules:
  link(d1,d2). link(d2,d3). link(d3,d1).
  linked(X,Y) :- link(X,Y).
  linked(X,Y) :- linked(X,Z) & link(Z,Y).

Hypertext structure
  docterm(d1,ir). docterm(d1,db).
  link(d1,d2). link(d2,d3). link(d3,d1).
  about(D,T) :- link(D,D1) & about(D1,T).
  [Figure: documents d1, d2, d3 connected in a cycle by link edges; docterm edges from d1 to the terms ir and db]
  ?- about(D,ir) gives d1, d3, d2
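To see how the recursive rules behave on the cyclic link structure above, here is a minimal bottom-up (fixpoint) evaluation in Python. It is a sketch, not a Datalog engine. The base rule `about(D,T) :- docterm(D,T)` is assumed here because the listed answers require it (the tutorial states it explicitly in the later section on probabilistic rules). The function names are illustrative.

```python
def linked(links):
    """Least fixpoint of: linked(X,Y) :- link(X,Y).
                          linked(X,Y) :- linked(X,Z) & link(Z,Y)."""
    closure = set(links)
    changed = True
    while changed:
        changed = False
        for (x, z) in list(closure):
            for (z2, y) in links:
                if z == z2 and (x, y) not in closure:
                    closure.add((x, y))
                    changed = True
    return closure


def about(docterm, links):
    """Least fixpoint of: about(D,T) :- docterm(D,T).   (assumed base rule)
                          about(D,T) :- link(D,D1) & about(D1,T)."""
    facts = set(docterm)
    changed = True
    while changed:
        changed = False
        for (d, d1) in links:
            for (d2, t) in list(facts):
                if d1 == d2 and (d, t) not in facts:
                    facts.add((d, t))
                    changed = True
    return facts


links = {("d1", "d2"), ("d2", "d3"), ("d3", "d1")}
docterm = {("d1", "ir"), ("d1", "db")}

# ?- about(D,ir)
print(sorted(d for (d, t) in about(docterm, links) if t == "ir"))  # ['d1', 'd2', 'd3']
```

Because the iteration only stops when no new fact can be derived, the cycle d1, d2, d3 does not cause an infinite loop; it simply makes every document reachable from every other.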

24 Image retrieval
output of IRIS image indexing: probabilistic facts
  imgobj(O,I,N,L,R,B,T)
  O: object id
  I: image id
  N: concept (water, sand, forest, stone, ...)
  L,R,B,T: coordinates of the MBR
images with stones in front of a forest:
  ?- imgobj(OA,I,stone,L1,R1,B1,T1) & imgobj(OB,I,forest,L2,R2,B2,T2) & B1 <= B2

25 3.2.3 Probabilistic Datalog

Syntax
ground facts with probabilistic weights:
  0.9 docterm(d1,ir). 0.5 docterm(d1,db).
  0.8 docterm(d2,ir). 0.3 docterm(d2,oop).
?- docterm(D,ir). gives
  d1 0.9
  d2 0.8
?- docterm(D,ir) & docterm(D,db). gives
  d1 0.45

Semantics
  0.6 docterm(d1,ir). 0.5 docterm(d1,db).
possible worlds (independence assumptions):
  P(W1) = 0.3: {docterm(d1,ir)}
  P(W2) = 0.3: {docterm(d1,ir), docterm(d1,db)}
  P(W3) = 0.2: {docterm(d1,db)}
  P(W4) = 0.2: {}
?- docterm(d1,ir) & docterm(d1,db) gives 0.3
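The possible-worlds table above can be reproduced by brute-force enumeration. The sketch below assumes independent ground facts, as stated in the slide. The function names and the tuple encoding of facts are illustrative choices.

```python
from itertools import product


def possible_worlds(weighted_facts):
    """Enumerate all subsets of the facts with their probabilities,
    assuming the facts are stochastically independent."""
    facts = list(weighted_facts)
    worlds = []
    for included in product([True, False], repeat=len(facts)):
        probability = 1.0
        world = set()
        for fact, is_true in zip(facts, included):
            p = weighted_facts[fact]
            if is_true:
                world.add(fact)
                probability *= p
            else:
                probability *= 1.0 - p
        worlds.append((frozenset(world), probability))
    return worlds


def query_probability(weighted_facts, required_facts):
    """Sum the probabilities of all worlds in which every required fact holds."""
    required = set(required_facts)
    return sum(p for world, p in possible_worlds(weighted_facts) if required <= world)


facts = {("docterm", "d1", "ir"): 0.6, ("docterm", "d1", "db"): 0.5}
for world, p in possible_worlds(facts):
    print(round(p, 2), sorted(world))
print(round(query_probability(facts, facts.keys()), 2))  # 0.3
```

Enumeration is exponential in the number of facts, so this only serves to illustrate the semantics, not as an evaluation strategy.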

26 Disjoint events
example: imprecise attribute values
  # py(dk,av).
  0.2 py(d3,89). 0.7 py(d3,90). 0.1 py(d3,91).
interpretation:
  P(W1) = 0.2: {py(d3,89)}
  P(W2) = 0.7: {py(d3,90)}
  P(W3) = 0.1: {py(d3,91)}
  b89(X) :- py(X,Y) & Y > 89.
  ?- b89(X) gives d3 [py(d3,90) | py(d3,91)]: 0.7 + 0.1 = 0.8

Vague predicates
phrase search:
  ?- doc(D), phrase(D, "information retrieval")
documents:
  ...information retrieval systems...
  ...information storage and retrieval...
  ...retrieval of information...
  ...information is retrieved...
phrase as vague predicate, yields probabilistic weight (similar to Boolean builtin predicates)
applications of vague predicates:
  variants of text search: compound words, proper nouns
  vague fact conditions (e.g. price ≲ 1000)
  multimedia IR (e.g. audio retrieval, image retrieval)
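Returning to the disjoint-events example (the `py` facts): because the publication years of d3 are mutually exclusive, the probabilities of the matching years are simply added. The sketch below contrasts this with what an independence assumption would give for the same weights. Both function names are hypothetical.

```python
def disjoint_or(probabilities):
    """Probability of a disjunction of mutually exclusive events."""
    total = sum(probabilities)
    if total > 1.0 + 1e-9:
        raise ValueError("disjoint events cannot sum to more than 1")
    return total


def independent_or(probabilities):
    """Probability of a disjunction of independent events (for comparison)."""
    none_holds = 1.0
    for p in probabilities:
        none_holds *= 1.0 - p
    return 1.0 - none_holds


py_d3 = {89: 0.2, 90: 0.7, 91: 0.1}                 # 0.2 py(d3,89). 0.7 py(d3,90). ...
matching = [p for year, p in py_d3.items() if year > 89]   # b89(X) :- py(X,Y) & Y > 89.

print(round(disjoint_or(matching), 2))     # 0.8
print(round(independent_or(matching), 2))  # 0.73
```

The difference between 0.8 and 0.73 shows why the disjointness declaration matters for imprecise attribute values.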

27 Probabilistic rules
generating probabilistic events from deterministic facts:
  0.5 related(D,D1) :- link(D,D1).
  about(D,T) :- docterm(D,T).
  about(D,T) :- related(D,D1), about(D1,T).
semantics:
  # sex(dk,av).
  0.7 l-s(X) :- sex(X,male).
  0.4 l-s(X) :- sex(X,female).
  0.5 sex(X,male) :- human(X).
  0.5 sex(X,female) :- human(X).
  human(peter).
  ?- l-s(X) gives peter 0.55
interpretation:
  P(W1) = 0.35: {sex(peter,male), l-s(peter)}
  P(W2) = 0.15: {sex(peter,male)}
  P(W3) = 0.20: {sex(peter,female), l-s(peter)}
  P(W4) = 0.30: {sex(peter,female)}

3.3 Retrieval of structured documents: POOL
goals:
  retrieval of structured documents
    hierarchical logical structure
    abstraction from node types: contexts as untyped nodes
  multimedia retrieval: expressiveness of restricted predicate logic

28 3.3.1 Structure of POOL programs
object: identifier + content
context: object with nonempty content (a1, s11, s12)
program: set of clauses
clause: context / proposition / rule
proposition:
  term (image, presentation)
  classification (article(a1), section(s11))
  attribute (s11.author(smith), a1.pubyear(1997))
example:
  a1[ s11[ image 0.6 retrieval presentation ]
      s12[ ss121[ audio indexing ]
           ss122[ video not presentation ] ] ]
  s11.author(smith) ss121.author(miller) ss122.author(jones)
  a1.pubyear(1997)
  article(a1) section(s11) section(s12)
  subsection(ss121) subsection(ss122)
rule: head :- body
  head: proposition / context containing a proposition
  body: conjunction of subgoals (propositions or contexts)
  docnode(D) :- article(D)
  docnode(D) :- section(D)
  docnode(D) :- subsection(D)
  mm-ir-doc(D) :- docnode(D) & D[audio & retrieval]
  german-paper(D) :- D.author.country(germany)
query: ?- body
  ?- D[audio & indexing]

29 3.3.2 Augmentation
Contexts and augmentation
clauses only hold for the context where stated
augmentation: propagation of propositions to surrounding contexts
  a1[ s11[ image 0.6 retrieval presentation ]
      s12[ ss121[ audio indexing ]
           ss122[ video not presentation ] ] ]
  ?- D[audio & video] gives s12; a1
augmentation with uncertainty:
  ?- D[audio] gives 1.00 ss121; 0.60 s12; 0.36 a1
  ?- D[audio & video] gives 0.36 s12; 0.22 a1
augmentation with uncertainty prefers the most specific context

Augmentation and inconsistencies
  d1[ s1[ audio indexing ]
      s2[ s21[ image retrieval ]
          s22[ video not retrieval ] ] ]
  ?- D[audio & indexing] gives s1; d1
  ?- D[video & image] gives s2; d1
  ?- D[video & retrieval] (retrieval is inconsistent in s2)
four-valued logic
truth values: unknown, true, false, inconsistent
  s22: video ↦ true, image ↦ unknown, retrieval ↦ false
  s2: image ↦ true, video ↦ true, retrieval ↦ inconsistent
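The four-valued assignments listed for s22 and s2 can be reproduced with a small sketch of augmentation without uncertainty. The combination rule used here (unknown is neutral, conflicting true/false yields inconsistent) is an assumption that is consistent with the example; the tuple encoding of contexts and the function names are illustrative, not POOL syntax.

```python
UNKNOWN, TRUE, FALSE, INCONSISTENT = "unknown", "true", "false", "inconsistent"


def combine(a, b):
    """Merge two truth values for the same proposition (assumed rule)."""
    if a == UNKNOWN:
        return b
    if b == UNKNOWN:
        return a
    return a if a == b else INCONSISTENT


def augment(context):
    """context = (name, {proposition: True/False}, [subcontexts]).

    Returns {context name: {proposition: truth value}}, where each context
    also receives the propositions propagated up from its subcontexts."""
    name, propositions, children = context
    values = {p: (TRUE if holds else FALSE) for p, holds in propositions.items()}
    result = {}
    for child in children:
        child_result = augment(child)
        result.update(child_result)
        for p, value in child_result[child[0]].items():
            values[p] = combine(values.get(p, UNKNOWN), value)
    result[name] = values
    return result


d1 = ("d1", {}, [
    ("s1", {"audio": True, "indexing": True}, []),
    ("s2", {}, [
        ("s21", {"image": True, "retrieval": True}, []),
        ("s22", {"video": True, "retrieval": False}, []),   # "not retrieval"
    ]),
])

values = augment(d1)
print(values["s22"].get("image", UNKNOWN))  # unknown
print(values["s22"]["retrieval"])           # false
print(values["s2"]["retrieval"])            # inconsistent
```

A proposition that is never mentioned in a context or its subcontexts is simply absent from the dictionary and read as unknown.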

30 Chapter 4 Multimedia Indexing
  audio
  images
  video

Audio
Sound retrieval
E. Wold et al.: Content-based classification, search and retrieval of audio. IEEE Multimedia 3(3).

Levels of audio retrieval
  1. exact match of sound samples
  2. inexact match of sounds, irrespective of sample rate, quantization, compression, ...
  3. inexact match of acoustic features / perceptual properties of sound
  4. content-based match (for speech, musical content)
here: inexact match of sounds

31 Acoustic features
aspects of sound considered:
  loudness: root-mean-square of audio signal (in decibels)
  pitch: greatest common divisor of peaks in Fourier spectra
  brightness: centroid of short-time Fourier magnitude spectra (higher frequency content of signal)
  bandwidth: magnitude-weighted average of differences between spectral components and the centroid (variation of frequencies, e.g. sine wave vs. white noise)
  harmonicity: deviation of the sound's spectrum from a harmonic spectrum (i.e. harmonic spectra vs. inharmonic spectra vs. noise)
variation of aspects over time:
  1. compute aspect values at certain time intervals
  2. derive features from sequences (feature values weighted by amplitude):
     average value
     variance
     autocorrelation

32 [Table: Mean, Variance and Autocorrelation of the properties Loudness, Pitch, Brightness and Bandwidth for the sound example]

Indexing and retrieval
Indexing of a sound: compute and store feature vector a (mean, variance and autocorrelation for loudness, pitch, brightness, bandwidth and harmonicity)
Retrieval:
  1. conditions w.r.t. feature values
  2. similarity of sounds: weighted Euclidean distance
     mean: $\mu = \frac{1}{M} \sum_{j=1}^{M} a_j$
     covariance: $R = \frac{1}{M} \sum_{j=1}^{M} (a_j - \mu)(a_j - \mu)^T$
     distance: $D = \sqrt{(a - b)^T R^{-1} (a - b)}$
     M: number of sounds considered
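The three formulas above (mean, covariance, covariance-weighted distance) can be checked with a short pure-Python sketch. The helper `solve`, which applies Gaussian elimination instead of forming the inverse of R explicitly, is an implementation choice made here for brevity; all function names are illustrative.

```python
import math


def mean_vector(vectors):
    """mu = 1/M * sum of the feature vectors a_j."""
    m = len(vectors)
    return [sum(v[i] for v in vectors) / m for i in range(len(vectors[0]))]


def covariance_matrix(vectors):
    """R = 1/M * sum of (a_j - mu)(a_j - mu)^T."""
    m = len(vectors)
    mu = mean_vector(vectors)
    n = len(mu)
    return [[sum((v[i] - mu[i]) * (v[j] - mu[j]) for v in vectors) / m
             for j in range(n)] for i in range(n)]


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def distance(a, b, covariance):
    """D = sqrt((a - b)^T R^-1 (a - b))."""
    diff = [x - y for x, y in zip(a, b)]
    weighted = solve(covariance, diff)      # R^-1 (a - b)
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))


sounds = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]   # toy feature vectors
R = covariance_matrix(sounds)
print(R)                                   # [[1.0, 0.0], [0.0, 1.0]]
print(distance([3.0, 4.0], [0.0, 0.0], R)) # 5.0
```

With an identity covariance the measure reduces to the ordinary Euclidean distance; features with larger variance contribute less.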

33 Property-based training and classification
training:
  based on set of training sounds for a property (e.g. scratchiness)
  compute property-specific mean and covariance
  importance of feature: mean divided by standard deviation
classification:
  compute distances to means of all classes, select class with minimum distance
  likelihood: $L = \exp(-D^2 / 2)$

Example: classification of laughter sounds

34 Example: class model for laughter
  [Table: Mean, Variance and Importance of the features Duration; Loudness (mean, variance, autocorrelation); Brightness (mean, variance, autocorrelation); Bandwidth (mean, variance, autocorrelation); Pitch (mean, variance, autocorrelation)]
  importance = $|mean| / \sqrt{variance}$

Speech retrieval
  1. speech recognition: uncertain term identification
  2. application of text retrieval methods on recognized terms
  see the TREC speech retrieval track

Music retrieval
McNab et al.: The New Zealand Digital Library MELody index. D-Lib Magazine, May
  1. melody transcription
  2. approximate string matching

35 4.2 Images
Introduction
Semantic vs. syntactic indexing and retrieval
  syntactic image features:
    color
    texture
    contour
  semantic image features:
    objects (humans, animals, buildings, art works)
    topics (pollution, demonstration, political visit)
  most image indexing methods support syntactic features only
Aboutness vs. ofness
  ofness: objects shown in the image
  aboutness: topic which is illustrated by the image
  aboutness is very much user-dependent, e.g. image showing water pollution

36 4.2.2 QBIC
tool for querying image and video databases by
  example images
  user-constructed sketches and drawings
  selected color and texture patterns
  camera and object motion

System overview
main components:
  database population:
    1. processing of images and videos to extract syntactical features: colors, textures, shape, camera motion, object motion
    2. storing features in database
  database querying:
    1. user composes query graphically
    2. generate features from graphical query
    3. search for database objects with similar features

37 Data model
basic elements:
  still images/scenes: contain objects
  video shots: sets of contiguous frames, contain motion objects
still images:
  scene: image or video frame
  object: part of a scene
videos:
  1. break into clips (shots)
  2. generate representative frame for each shot, treated as still image
  3. generate motion objects from shots

38 querying:
  on objects: images with a red, round object
  on scenes: images with 30 % red and 20 % blue
  on shots: shots panning from left to right
  on combinations: images with 30 % red containing a blue object

Feature Calculation
color
  color models: RGB, HSV, YUV, MTM
  average coordinates in color space
  k-element histogram (typically k = 64 or 256)

39 texture
  coarseness: scale of texture
  contrast: vividness of a pattern (function of variance of grey-level histogram)
  directionality: peakedness of distribution of gradient directions in image (favoured direction (e.g. grass) vs. isotropic (e.g. sand))
shape
  area: # pixels set in binary image
  circularity: perimeter$^2$ / area
  major axis orientation:
    1. compute 2nd order covariance matrix from boundary pixels
    2. major axis orientation = direction of largest eigenvector
  eccentricity = (largest eigenvalue) / (smallest eigenvalue)
  algebraic moment invariants:
    compute first m central moments
    invariants as eigenvalues of predefined matrices
    consider 18 features invariant to affine transformations
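The simpler shape features in the list above (area, circularity, major axis orientation, eccentricity) can be sketched for a binary object given as a set of pixel coordinates. How the perimeter is measured is not specified in the slides; counting pixel edges that face the background (4-neighbourhood) is an assumption made here, as are all function names.

```python
import math

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def area(pixels):
    """Number of pixels set in the binary image."""
    return len(pixels)


def perimeter(pixels):
    """Assumed definition: number of pixel edges bordering the background."""
    return sum(1 for (x, y) in pixels for (dx, dy) in NEIGHBOURS
               if (x + dx, y + dy) not in pixels)


def circularity(pixels):
    return perimeter(pixels) ** 2 / area(pixels)


def boundary_pixels(pixels):
    """Object pixels with at least one background 4-neighbour."""
    return {(x, y) for (x, y) in pixels
            if any((x + dx, y + dy) not in pixels for (dx, dy) in NEIGHBOURS)}


def axis_features(points):
    """Return (major axis orientation in radians, eccentricity) from the
    2x2 covariance matrix of the given points. Raises ZeroDivisionError
    for degenerate shapes whose smallest eigenvalue is zero."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed-form eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2
    root = math.hypot((sxx - syy) / 2, sxy)
    largest, smallest = half_trace + root, half_trace - root
    orientation = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return orientation, largest / smallest


rectangle = {(x, y) for x in range(4) for y in range(2)}   # 4 x 2 pixel block
print(area(rectangle), perimeter(rectangle), circularity(rectangle))  # 8 12 18.0
print(axis_features(boundary_pixels(rectangle)))                      # (0.0, 5.0)
```

For the 4 x 2 block the major axis lies along x (orientation 0) and the spread along x is five times the spread along y.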

40 sketch
based on reduced resolution edge map:
  1. convert color image to single band luminance
  2. compute binary edge image
  3. reduce edge image to thin reduced image

Sample queries
average color queries
  search for images/objects with similar color
  computed as weighted Euclidean distance in color space
histogram color queries
  search for images with specified color distribution
  based on 256-element histogram:
    Q: query histogram
    D: image histogram
    Z: element difference histogram, $Z = Q - D$
    A: symmetric color similarity matrix, $a(i,j) = 1 - d(c_i, c_j) / d_{max}$
    $c_k$: kth color in histogram
    $d(c_i, c_j)$: MTM color distance
    $d_{max}$: maximum distance between any two colors
  similarity: $\|Z\| = Z^T A Z$
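The histogram comparison above can be sketched as follows. The MTM color distance is not defined in the slides, so plain Euclidean distance between color coordinates is swapped in here as a stand-in; the function names and the three-color toy palette are likewise assumptions. The palette must contain at least two distinct colors.

```python
import math


def similarity_matrix(colors):
    """a(i,j) = 1 - d(c_i, c_j) / d_max, with Euclidean distance standing in
    for the MTM color distance (assumption)."""
    k = len(colors)
    dist = [[math.dist(colors[i], colors[j]) for j in range(k)] for i in range(k)]
    d_max = max(max(row) for row in dist)
    return [[1.0 - dist[i][j] / d_max for j in range(k)] for i in range(k)]


def histogram_distance(query_hist, image_hist, a):
    """Z^T A Z for the element difference histogram Z = Q - D."""
    z = [q - d for q, d in zip(query_hist, image_hist)]
    k = len(z)
    return sum(z[i] * a[i][j] * z[j] for i in range(k) for j in range(k))


# Toy palette: black, a nearly black color, and a far-away color.
palette = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (1.0, 0.0, 0.0)]
A = similarity_matrix(palette)

query = [1.0, 0.0, 0.0]          # all pixels in bin 0
near = [0.0, 1.0, 0.0]           # all pixels in the similar bin 1
far = [0.0, 0.0, 1.0]            # all pixels in the dissimilar bin 2

print(histogram_distance(query, query, A))                       # 0.0
print(histogram_distance(query, near, A) < histogram_distance(query, far, A))  # True
```

The cross terms weighted by A are what distinguish this measure from a bin-by-bin comparison: a shift of mass into a perceptually similar bin is penalized far less than a shift into a distant one.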

42 texture queries
  user selects texture from a sampler
  compute weighted Euclidean distance in 3D texture space (coarseness, contrast, directionality)
object shape
  user draws shape
  shape features: area, circularity, eccentricity, major-axis direction, object moments, tangent angles around object perimeter
  compute weighted Euclidean distance, weights are inverse variances of features
query by sketch
  user draws dominant lines and edges
  1. reduce user sketch to ...
  2. for each db image, correlate sketch with user sketch, based on edge/no edge comparison
  3. compute correlation scores

43 4.2.3 IRIS
semantic indexing of images
  1. image analysis
     color
     contour
     texture
  2. object recognition
     (a) basic objects: clouds, snow, water, sky, forest, grass, sand, stone
     (b) high-level objects: forestscene, skyscene, mountainscene, landscapescene, ...

44 Image Analysis
Color
IRIS subdivides color space into about 20 different colors
  1. subdivide image into nonoverlapping tiles
  2. compute color histogram for each tile
  3. most frequent color =: color of tile
  4. join tiles with similar colors and compute circumscribing rectangle
  5. compute attributes of color rectangles:
     position
     size
     color
     color density (# tiles with color / # tiles in rectangle)
     color evidence
  [Figure: original image]
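Steps 1 to 3 and the color density attribute can be sketched on an image whose pixels have already been quantized to color names. The joining of similar tiles (step 4) is omitted; the function names, the tile indexing and the inclusive rectangle convention are assumptions for illustration only.

```python
from collections import Counter


def tile_colors(image, tile_size):
    """Split the image into non-overlapping tiles and return
    {(tile_row, tile_col): most frequent color in that tile}."""
    rows, cols = len(image), len(image[0])
    tiles = {}
    for top in range(0, rows, tile_size):
        for left in range(0, cols, tile_size):
            histogram = Counter(
                image[y][x]
                for y in range(top, min(top + tile_size, rows))
                for x in range(left, min(left + tile_size, cols))
            )
            tiles[(top // tile_size, left // tile_size)] = histogram.most_common(1)[0][0]
    return tiles


def color_density(tiles, color, rectangle):
    """# tiles with color / # tiles in rectangle.
    rectangle = (row0, col0, row1, col1) in tile coordinates, inclusive."""
    row0, col0, row1, col1 = rectangle
    inside = [c for (r, k), c in tiles.items() if row0 <= r <= row1 and col0 <= k <= col1]
    return sum(1 for c in inside if c == color) / len(inside)


image = [
    ["blue", "blue",  "green", "green"],
    ["blue", "white", "green", "green"],
    ["blue", "blue",  "blue",  "blue"],
    ["blue", "blue",  "white", "blue"],
]
tiles = tile_colors(image, tile_size=2)
print(tiles[(0, 0)], tiles[(0, 1)], tiles[(1, 0)], tiles[(1, 1)])  # blue green blue blue
print(color_density(tiles, "blue", (0, 0, 1, 1)))                  # 0.75
```

A density below 1 signals that the circumscribing rectangle also covers tiles of other colors, which is why it is stored alongside position and size.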

45 color-based segmentation:
  ...
  colour2 HOR=mid,VER=up,SIZ=XL,SHP=Rect,COL=BLUE, UL=0 1,LR=44 11,DEN=
  colour3 HOR=mid,VER=mid,SIZ=M,SHP=Rect,COL=BLUE, UL=15 10,LR=44 17,DEN=
  colour4 HOR=left,VER=mid,SIZ=XS,SHP=Quad,COL=BLUE, UL=1 11,LR=1 11,DEN=1 1
  colour5 HOR=left,VER=mid,SIZ=XS,SHP=Rect,COL=BLUE, UL=3 11,LR=14 12,DEN=

Texture
consider local distribution and variation of grey values
  1. compute normalized co-occurrence matrix for 4 directions: 0°, 45°, 90°, 135°
  2. for each of the four directions, compute the following features from the matrix:
     angular second moment
     contrast (local variations)
     correlation (linear relationship between pixel values)
     variance (deviation from the average)
     entropy
  3. for each of the five parameters, compute the average from the values for the 4 directions (invariance against rotation)

46 4. feed average values into neural network
     [Figure: neural network with an input layer (contrast, asm, variance, correlation, entropy), two hidden layers and an output layer (forest, grass, sand, water, stone, sky, clouds, ice)]
  5. NN yields texture for each tile
  6. join tiles with identical textures and compute circumscribing rectangles
  7. compute attributes of texture rectangles:
     position
     size
     texture
     texture density (# tiles with texture / # tiles in rectangle)
  ...
  texture3 HOR=mid,VER=mid,SIZ=L,SHP=Rect,TEX=ice, UL=2 2,LR=10 3,DEN=11 18
  texture4 HOR=left,VER=mid,SIZ=S,SHP=Path,TEX=clouds, UL=0 3,LR=3 3,DEN=4 4
  texture5 HOR=left,VER=mid,SIZ=S,SHP=Quad,TEX=stone, UL=4 3,LR=5 4,DEN=3 4
  texture6 HOR=mid,VER=mid,SIZ=S,SHP=Rect,TEX=clouds, UL=5 3,LR=8 4,DEN=

47 Contour
based on grey-level image
  1. gradient-based edge detection
  2. determination of object contours
  3. shape analysis: compute
     position of centroid
     size of region
     bound coordinates of region

48 Object Recognition
  1. step from syntactical to semantical features: identification of primitive objects
  2. derivation of higher-level semantical features

identification of primitive objects
basis: color, texture and contour features
  for each feature, consider corresponding region
  form graph describing topological relationships between feature regions:
    node = feature
    edge = topological relationship: overlaps, meets, contains
    [Figure: example graph of color (CL), contour (CT) and texture (T) regions connected by overlaps, contains and meets edges]
  formulate graph grammar rules for detecting primitive objects
    [Figure: derivation tree for Mountainlake with Clouds, Sky, Mountain, Forest and Lake built from Texture Segment, Color Segment and Contour Segment nodes]

Conditions of "Clouds"
  predicate((valcompeq(*self(2,"colorseg","col"),"blue") ||
             valcompeq(*self(2,"colorseg","col"),"white")) &&
            valcompeq(*self(2,"colorseg","ver"),"up"));
  predicate(nrkind(*self(1,"contourseg"),"contains",*self(1,"colorseg")) &&
            nrkind(*self(1,"contourseg"),"contains",*self(1,"textureseg")));

49 4.2.4 Photobook
developed at MIT Media Lab
goal: semantic retrieval of images based on semantics-preserving image compression
types of descriptions:
  appearance (faces)
  shape
  texture

Appearance
based on eigenimage representations
Training: Building Eigenrepresentations
  1. preprocessing of input images: normalize w.r.t. position, scale, orientation
  2. computation of eigenvectors of normalized image covariance for
     training images (faces)
     subregions of training images (eyes, nose, mouth)

50 [Figure: mean and first few eigenvectors]

Retrieval
Γ: new image (region)
  1. transform Γ into face space
  2. retrieval based on similarity measure

52 Shape
representation based on modelling of physical deformations:
  finite element method → stiffness matrix → eigenvectors
Retrieval
  compute amount of energy needed to align object

53 Texture
representation based on Wold decomposition for regular stochastic processes in 2D
  ≙ sum of three orthogonal components:
  1. harmonic field
  2. generalized-evanescent field
  3. purely-indeterministic field
retrieval
  1. derive parameters of Wold decompositions
  2. compute similarity of parameter vectors

55 4.3 Video
QBIC
Representation of video data
  1. shot detection
  2. creation of representative frame
  3. identify moving structures/objects

Shot detection
shot: set of frames grouped together because they
  depict same scene
  signify single camera operation
  contain distinct event/action
shots are chosen as single indexable unit
Automatic shot detection
  1. cuts: high pulses in the histogram, detected by single threshold
  2. gradual transitions over a sequence of frames
     (a) low threshold for detecting possible transitions
     (b) compute accumulated differences of successive frames
     (c) shot boundary, if sum exceeds second threshold
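The two-threshold scheme for automatic shot detection can be sketched over a list of frame-to-frame histogram differences. How the differences are computed and how the thresholds are chosen is not given in the slides; the function name, the parameter names and the toy values are assumptions.

```python
def detect_shot_boundaries(differences, cut_threshold, low_threshold, sum_threshold):
    """differences[i] is the histogram difference between frame i and frame i+1.

    Returns a list of ("cut", i, i) and ("gradual", first, last) entries."""
    boundaries = []
    start, accumulated = None, 0.0

    def close_candidate(end):
        # Step (c): accept the candidate transition if the accumulated
        # difference exceeds the second threshold.
        nonlocal start, accumulated
        if start is not None and accumulated > sum_threshold:
            boundaries.append(("gradual", start, end))
        start, accumulated = None, 0.0

    for i, diff in enumerate(differences):
        if diff > cut_threshold:            # step 1: sharp cut
            close_candidate(i - 1)
            boundaries.append(("cut", i, i))
        elif diff > low_threshold:          # step 2(a): possible gradual transition
            if start is None:
                start = i
            accumulated += diff             # step 2(b): accumulate differences
        else:
            close_candidate(i - 1)
    close_candidate(len(differences) - 1)
    return boundaries


diffs = [1, 1, 10, 1, 3, 3, 3, 1, 3, 1]
print(detect_shot_boundaries(diffs, cut_threshold=8, low_threshold=2, sum_threshold=8))
# [('cut', 2, 2), ('gradual', 4, 6)]
```

The isolated difference of 3 at index 8 exceeds the low threshold but never accumulates enough to count as a boundary, which is the point of the second threshold.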

56 representative frame generation
representative frames
  treated as still images in database population
  in retrieval, returned as answer representing the shot
representative frame generation methods:
  random frame from a shot
  synthesized r-frames
    mosaicking all frames in a panning shot
    remove moving objects
layered representation
  different layers used for identifying significant objects in the scene
  algorithm divides a shot into a number of layers, each with its own
    2D affine motion parameters
    region of support in each frame

57 4.3.2 ISS-NUS
Video Browsing
  1. micons: icons for video content
  2. hierarchical video magnifier
  3. clipmaps

Micons: icons for video content
  representative frame for each clip
  micon = 3D display of frame sequence
  horizontal / vertical slices of micons

Hierarchical video magnifier ≙ scroll bar
  for browsing entire videos
  division into segments of equal length
  segment represented by frame at its midpoint
  selection of frame yields further division of corresponding segment
  segment can be viewed via micon
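The hierarchical video magnifier described above amounts to repeated equal-length subdivision with midpoint frames as representatives. A minimal sketch, assuming frames are addressed by integer index and segments are half-open ranges (both assumptions, as is the function name):

```python
def magnify(first_frame, end_frame, parts):
    """Divide [first_frame, end_frame) into `parts` segments of (nearly)
    equal length and return (segment, representative frame) pairs, where
    the representative is the frame at the segment's midpoint."""
    length = end_frame - first_frame
    bounds = [first_frame + length * i // parts for i in range(parts + 1)]
    return [((bounds[i], bounds[i + 1]), (bounds[i] + bounds[i + 1]) // 2)
            for i in range(parts)]


# Top level: a 1000-frame video shown as 5 segments.
level1 = magnify(0, 1000, 5)
print(level1)
# [((0, 200), 100), ((200, 400), 300), ((400, 600), 500), ((600, 800), 700), ((800, 1000), 900)]

# Selecting the third representative frame subdivides its segment again.
(segment_start, segment_end), _ = level1[2]
print(magnify(segment_start, segment_end, 5))
# [((400, 440), 420), ((440, 480), 460), ((480, 520), 500), ((520, 560), 540), ((560, 600), 580)]
```

Each selection narrows the time span by a constant factor, so any frame of the video is reachable in a logarithmic number of steps.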

58 Case study: news videos
  spatial structure
  temporal structure

4.4 Summary: media indexing and matching
  1. exact match
  2. inexact media match (irrespective of digitization parameters)
  3. inexact media feature match
  4. content-based match

59 Chapter 5 Conclusions
Issues in MMIR
  syntactic (signal-based) vs. semantic (symbolic) retrieval of MM objects
  dealing with document structure
  IR models based on predicate logic

Multimedia Information Retrieval

Multimedia Information Retrieval Multimedia Information Retrieval Norbert Fuhr Tutorial @ HS-IR 98 Chapter 1 Introduction document structures and attributes media types terminology 1 Document structures and attributes IR networks heterogeneity

More information

Models based on predicate logic

Models based on predicate logic Models based on predicate logic 1/49 Models based on predicate logic Norbert Fuhr July 3, 2003 Description logic Datalog Probabilistic Datalog POOL: a probabilistic object-oriented logic Models based on

More information

IR Models based on predicate. logic. Norbert Fuhr, Henrik Nottelmann University of Duisburg-Essen. POOL: a probabilistic object-oriented logic

IR Models based on predicate. logic. Norbert Fuhr, Henrik Nottelmann University of Duisburg-Essen. POOL: a probabilistic object-oriented logic IR Models based on predicate logic Norbert Fuhr, Henrik Nottelmann University of Duisburg-Essen Description logic Datalog Probabilistic Datalog POOL: a probabilistic object-oriented logic Mapping OWL onto

More information
