Arnold W.M. Smeulders, Theo Gevers. University of Amsterdam

Transcription:

Arnold W.M. Smeulders, Theo Gevers. University of Amsterdam. email: smeulders@wins.uva.nl http://carol.wins.uva.nl/~{gevers smeulders}

0 Problem statement. Query matching.

0 Problem statement. Query classes: 1. Query by key-words. 2. Query by feature attributes, including sketch. 3. Query by 1 or N examples. 4. Feedback: iterative definition of the query.

1 Demands. World model. Applications: WWW, trademark assessment, trademark in public, stolen goods retrieval, video databases, product database, product retrieval, stamp collections. Domain types: general pictures / unknown conditions; 2D of 2D pictures / known camera; 2D of 3D pictures / unknown conditions; general pictures / well-behaved camera; general pictures / unknown camera; limited domain / known camera; limited domain / well-behaved camera; 2D of 2D pictures / narrow domains.

1 Demands. Computational model: geometric models (detect cue, group line, detect line, controller, data, detect edge).

1 Demands. Computational model: symbolic models. object_one_end(x) = house, U1 = 0.8; object_other_end(x) = gasline, U2 = 0.7; x = house to gasline, Ui = 0.7; object(z) = house to gasline, Ui = 0.7; z = doubly arrowed line, Uii = 0.63.

1 Demands. World and computational models: conclusions. 1. Visual worlds have very specific characteristics: objects, scenes, sensors. 2. The world holds more models than can be captured in detailed computational models. 3.a Narrow world domains require detailed computational models. 3.b Broad domains require general models based on physical laws (light, sensing and materials), physiological facts (perception), and cultural observations (semiotics).

1 Demands. Image segmentation: reconsidering the definition. Input: picture data. Output: find what pixels form an object in the real world. Segmentation requires experience with meaning!

1 Demands. Image segmentation: advanced. Segmentation by an elastic snake model: a contour v(s) which minimises an elastic (internal) energy of the form ∫ |v_s(s)|² ds plus an image energy ∫ f(v(s)) ds. Advanced segmentation is suited for a narrow domain.

1 Demands. Image segmentation: conclusions. 1. Whatever brilliant techniques for segmentation: not good enough to segment general pictures. 2. Segmentation has no answer to occlusion and clutter. 3. For retrieval of general images: weak segmentation: the result lies definitely in one object, but there is no guarantee that all of the object is found. Simplest form: point segmentation.

1 Demands. Features: one million views. So many images of one object, due to minor differences in: camera location, rotation, scale, light source, illumination, light interaction, surface cover, background, clutter, occlusion, camera type, viewpoint. A million different data arrays for one object!

1 Demands. Features: invariance. Image retrieval in general domains requires invariance! A feature F invariant to condition A will produce for object x the value F(x) regardless of the effect of condition A. Features for general image retrieval:
condition A | effect of A | required feature
everywhere | any position | location invariance
upside down | any orientation | rotation invariance
closer | any scale | scale invariance
shadow, lamp | any intensity | illumination invariance
occlusion | any subset present | invariance to partial occlusion
no frontal view | any view angle | viewpoint invariance

1 Demands. Features: invariance. Image retrieval requires invariance! A feature F invariant to condition A will produce for object x the value F(x) regardless of the effect of condition A. Image retrieval requires selected invariance! Art = frontal view, daylight, no occlusion. Crowd = occluded, arbitrary viewpoint.

1 Demands. Features: the invariance of the number 42. Image retrieval requires invariance! A feature F invariant to condition A will produce for object x the value F(x) regardless of the effect of condition A. The most invariant feature is the value 42, or any other number: it never changes, regardless of whatever conditions A. There is a trade-off between invariance and the power to discriminate. Conclusion: use the smallest set of relevant invariant features.

1 Demands. Features to recognize an object. One or two details may be enough to recognise an object, if and only if recognition is invariant to colour changes and shape changes.

1 Demands. Features: conclusions. 1. Invariance to sensing conditions: illumination and object shape require photometric features; a 2D image of the 3D world requires affine invariant features. 2. Invariance to embedding in the scene: occlusion and clutter require local features. 3. Use the smallest set of relevant invariant features. 4. Candidate features are shape, colour & texture, iff invariant for the domain, the recording circumstances and the query at hand.

2 Colour shape. What makes an image? Spectrum of source, surface reflectance, geometry of object, embedding in scene, spectrum of sensor, and nothing else.

2 Colour shape. What makes an image? RGB-space. How will objects reflect in RGB-space?

2 Colour shape. What makes an image? Shafer's model: body and surface reflection.

2 Colour shape. What makes an image? Shafer's model: C = m_b(n, s) ∫ f_C(λ) e(λ) c_b(λ) dλ + m_s(n, s, v) ∫ f_C(λ) e(λ) c_s(λ) dλ, for C ∈ {R, G, B}, with c_b(λ) the surface albedo, e(λ) the illumination, n the object surface normal, s the illumination direction, v the viewer's direction, and f_C(λ) the sensor sensitivity. c_b(λ) and f_C(λ) are scene and viewpoint invariant; m_b(n, s) is scene dependent; m_s(n, s, v) is scene dependent and viewpoint variant; e(λ) is scene dependent.
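
A minimal numerical sketch may make the two terms concrete; all spectra, sensitivities and geometric values below are made-up placeholders (not data from the talk), chosen only to show how the body and surface integrals combine into one sensor response.

import numpy as np

# Sketch of Shafer's dichromatic model:
#   C = m_b(n,s) * Int f_C(l) e(l) c_b(l) dl  +  m_s(n,s,v) * Int f_C(l) e(l) c_s(l) dl
# All spectra below are illustrative placeholders.
lam = np.linspace(400.0, 700.0, 61)                    # wavelengths (nm)
dlam = lam[1] - lam[0]
e = np.ones_like(lam)                                  # white illumination e(lambda)
c_b = np.exp(-((lam - 600.0) ** 2) / (2 * 40.0 ** 2))  # body (albedo) spectrum: a reddish surface
c_s = np.ones_like(lam)                                # neutral surface (specular) reflectance

def gauss(mu, sigma):                                  # toy sensor sensitivities f_C(lambda)
    return np.exp(-((lam - mu) ** 2) / (2 * sigma ** 2))

f = {"R": gauss(610.0, 30.0), "G": gauss(540.0, 30.0), "B": gauss(465.0, 30.0)}

def sensor_response(m_b, m_s):
    # body term plus surface term, integrated over wavelength for each channel
    return {C: m_b * np.sum(f[C] * e * c_b) * dlam +
               m_s * np.sum(f[C] * e * c_s) * dlam for C in f}

print(sensor_response(m_b=0.8, m_s=0.0))   # matte point: body reflection only
print(sensor_response(m_b=0.8, m_s=0.3))   # glossy point: body reflection plus highlight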

2 Colour shape. What makes an image? Body reflectance in RGB-space. Consider the body reflection term: C_b = m_b(n, s) ∫ f_C(λ) e(λ) c_b(λ) dλ, where for example the geometric term is Lambertian, i.e. m_b(n, s) = cos(n, s) = n · s.

2 Colour shape. What makes an image? Body reflectance in RGB-space. What does the RGB-histogram look like?

2 Colour shape. What makes an image? Body reflectance in RGB-space.

2 Colour shape. What makes an image? Surface reflectance in RGB-space. Consider the surface reflection term: C_s = m_s(n, s, v) ∫ f_C(λ) e(λ) c_s(λ) dλ, where the geometric term is, for example, the Phong model m_s(n, s, v) = cos^n(α), where α only depends on n, s and v.

2 Colour shape. What makes an image? Reflectance under white light reconsidered. Dichromatic reflection under white light: C_W = e m_b(n, s) k_C + e m_s(n, s, v) c_s f, with k_C = ∫ f_C(λ) c_b(λ) dλ and f = ∫ f_C(λ) dλ.

2 Colour shape. Colour invariance: RGB → HSI space. Hue: H = arctan( √3 (G − B) / ((R − G) + (R − B)) ).

2 Colour shape. Colour invariance: photometric invariance. Consider the body reflection term: C = e m_b(n, s) k_C, giving the red, green and blue sensor response under white light, with k_C = ∫ f_C(λ) c_b(λ) dλ for C ∈ {R, G, B}. Further, the normalized colours: r = e m_b(n, s) k_R / ( e m_b(n, s) (k_R + k_G + k_B) ) = k_R / (k_R + k_G + k_B), g = e m_b(n, s) k_G / ( e m_b(n, s) (k_R + k_G + k_B) ) = k_G / (k_R + k_G + k_B), b = k_B / (k_R + k_G + k_B): the geometry and illumination intensity cancel.
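
A minimal sketch of the cancellation above (the pixel values are arbitrary illustrative numbers): scaling a pixel by any shading or intensity factor leaves the normalized rgb unchanged.

import numpy as np

# Normalized rgb cancels the multiplicative geometry/intensity factor e*m_b(n,s).
def normalized_rgb(rgb):
    rgb = np.asarray(rgb, dtype=float)
    s = rgb.sum()
    return rgb / s if s > 0 else rgb

pixel = np.array([120.0, 80.0, 40.0])    # a surface patch under full illumination
shaded = 0.3 * pixel                     # the same patch under darker shading

print(normalized_rgb(pixel))             # [0.5, 0.333..., 0.166...]
print(normalized_rgb(shaded))            # identical values: the shading factor cancels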

2 Colour shape. Colour invariance: photometric invariance. More generally, any ratio of products of sensor responses in which the common factor e m_b(n, s) cancels is photometrically invariant, e.g. expressions of the form (R^p G^q B^r) / (R^s G^t B^u) with p + q + r = s + t + u, since each response equals e m_b(n, s) k_C (e.g. R/G, R²/(G B), etc.).

2 Colour shape. Colour invariance: photometric invariance — c1 c2 c3 space. c1 = arctan( R / max{G, B} ), c2 = arctan( G / max{R, B} ), c3 = arctan( B / max{R, G} ). Substituting the body reflection term, e.g. c1 = arctan( e m_b(n, s) k_R / max{ e m_b(n, s) k_G, e m_b(n, s) k_B } ) = arctan( k_R / max{k_G, k_B} ): independent of geometry and illumination intensity.

2 Colour shape. Colour invariance: shiny objects — l1 l2 l3 space. l1 = (R − G)² / ((R − G)² + (R − B)² + (G − B)²), l2 = (R − B)² / ((R − G)² + (R − B)² + (G − B)²), l3 = (G − B)² / ((R − G)² + (R − B)² + (G − B)²): the specular contribution and the common photometric factor cancel in these ratios of squared differences.

2 Colour shape. Colour invariance: m1 m2 m3 space — coloured light. For two neighbouring locations x1 and x2: m1 = G(x1) B(x2) / ( B(x1) G(x2) ), m2 = B(x1) R(x2) / ( R(x1) B(x2) ), m3 = R(x1) G(x2) / ( G(x1) R(x2) ): colour ratios in which both the geometry and the (coloured) illumination cancel.
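
A small sketch of these ratios, assuming the colour-ratio form reconstructed above; the pixel and illuminant values are illustrative only. Scaling both locations by a coloured, locally constant illuminant leaves m1, m2, m3 unchanged.

import numpy as np

# m1 m2 m3 colour ratios between two neighbouring locations x1 and x2.
def m_ratios(p1, p2):
    R1, G1, B1 = p1
    R2, G2, B2 = p2
    m1 = (G1 * B2) / (B1 * G2)
    m2 = (B1 * R2) / (R1 * B2)
    m3 = (R1 * G2) / (G1 * R2)
    return m1, m2, m3

p1 = np.array([100.0, 60.0, 30.0])   # neighbouring pixels on one surface
p2 = np.array([90.0, 70.0, 40.0])

ill = np.array([1.4, 1.0, 0.6])      # a coloured, locally constant illuminant scaling
print(m_ratios(p1, p2))              # ratios for the original pixels
print(m_ratios(ill * p1, ill * p2))  # identical: illuminant colour and intensity cancel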

2 Colour shape. Colour invariance: conclusion (− = no invariance, + = invariance).
space | shadows | shading | highlights | ill. intensity | ill. colour
I | − | − | − | − | −
RGB | − | − | − | − | −
rgb | + | + | − | + | −
c1c2c3 | + | + | − | + | −
hue | + | + | + | + | −
l1l2l3 | + | + | + | + | −
m1m2m3 | + | + | − | + | +
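
To make the table concrete, here is a sketch that computes the c1c2c3, hue and l1l2l3 spaces per pixel and checks their invariance to a uniform intensity change; the random stand-in image and the epsilon guard are assumptions for illustration.

import numpy as np

# Invariant colour spaces computed per pixel for a float RGB image of shape (H, W, 3).
def colour_invariants(img, eps=1e-6):
    R, G, B = img[..., 0], img[..., 1], img[..., 2]
    c1 = np.arctan(R / (np.maximum(G, B) + eps))
    c2 = np.arctan(G / (np.maximum(R, B) + eps))
    c3 = np.arctan(B / (np.maximum(R, G) + eps))
    hue = np.arctan2(np.sqrt(3.0) * (G - B), (R - G) + (R - B))
    d = (R - G) ** 2 + (R - B) ** 2 + (G - B) ** 2 + eps
    l1, l2, l3 = (R - G) ** 2 / d, (R - B) ** 2 / d, (G - B) ** 2 / d
    return c1, c2, c3, hue, l1, l2, l3

img = 0.2 + 0.8 * np.random.rand(4, 4, 3)   # stand-in image with mid-range values
shaded = 0.5 * img                          # uniform intensity change
same = [np.allclose(a, b, atol=1e-4) for a, b in zip(colour_invariants(img), colour_invariants(shaded))]
print(all(same))                            # expected True: these spaces ignore the intensity change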

2 Colour shape. Shape invariant colour edge detection: c1c2c3, l1l2l3.

2 Colour shape. Shape: colour edge classification. Colour edge maxima: shadow and geometry edges, highlight edges, colour edges.

2 Colour shape. Shape: colour edge classification — material, highlight, shadow or geometry.

3 Searching and finding Searching individual images

3 Searching and finding. Searching individual images: binary VSM results. Plot: accumulated ranking percentile (percent) against rank for the xor, and, mrs and Hausdorff measures.

3 Searching and finding. Searching individual images: binary VSM — robust against occlusion? Plot: accumulated average rank against occlusion percentage. Yes! Even at 80% occluded.

3 Searching and finding. Searching individual images: binary VSM — robust against viewpoints? Plot: accumulated average rank against rotation in viewpoint. Yes! Even at 75° out of sight.

3 Searching and finding. Searching individual images: conclusions on VSM. 1. VSM works fine for multi-coloured objects; shape + colour works best; the single best feature is colour. 2. VSM is robust: against viewpoint through affine invariant features, and against occlusion & clutter through the similarity measure.

3 Searching and finding. Searching groups of images: VSM with k-NN learning. K-nearest neighbour represents each image by a vector in n-space; nearness is defined by the Euclidean distance. Plot in (feature1, feature2): training samples of class A, training samples of class B, and the query image.
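
A minimal k-NN sketch over feature vectors with Euclidean distance, in the spirit of the description above; the feature values and class names are illustrative, not the talk's data.

import numpy as np

# k-nearest-neighbour classification by majority vote over Euclidean distances.
def knn_classify(query, train_feats, train_labels, k=3):
    d = np.linalg.norm(train_feats - query, axis=1)   # Euclidean distances to all training samples
    nearest = np.argsort(d)[:k]                       # indices of the k nearest samples
    labels, counts = np.unique(train_labels[nearest], return_counts=True)
    return labels[np.argmax(counts)]                  # majority vote

train_feats = np.array([[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.8]])
train_labels = np.array(["A", "A", "B", "B"])
print(knn_classify(np.array([0.15, 0.15]), train_feats, train_labels))   # -> "A"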

3 Searching and finding. Searching groups of images: VSM with k-NN — photographs versus synthetic images.

3 Searching and finding. Searching groups of images: VSM with k-NN — photographs versus synthetic. Colour variation: synthetic images tend to have fewer colours than photographs. Colour saturation: colours in synthetic images are commonly less saturated. Colour edge strength: synthetic images tend to have more abrupt colour transitions than photographs.
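
A rough sketch of how these three cues could be computed for an RGB image with values in [0, 1]; the exact feature definitions below are assumptions for illustration, not the implementation used in the talk.

import numpy as np

def photo_vs_synthetic_features(img):
    # colour variation: number of distinct quantised colours
    quant = (img * 15).astype(int).reshape(-1, 3)
    n_colours = len(np.unique(quant, axis=0))
    # colour saturation: mean of (max - min) over the channels per pixel
    saturation = float((img.max(axis=2) - img.min(axis=2)).mean())
    # colour edge strength: mean absolute horizontal colour difference
    edge_strength = float(np.abs(np.diff(img, axis=1)).mean())
    return n_colours, saturation, edge_strength

img = np.random.rand(32, 32, 3)   # stand-in for a photograph or a rendering
print(photo_vs_synthetic_features(img))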

3 Searching and finding. Searching groups of images: k-NN in VSM — classification results. Features: edge strength and saturation. Classifier: k-NN. Data: 200 training and 100 test images for each class. Results: photo queries are classified as 95% photo, 5% synthetic; synthetic queries as 9% photo, 91% synthetic.

3 Searching and finding. Searching groups of images: k-NN in VSM for skin detection. Skin pixels versus not-skin pixels, in RGB-space versus c1 c2 c3-space. Conclusion: a range in normalized c1 c2 c3-space is specific for skin.
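
A sketch of skin detection as a fixed range in c1 c2 c3 space, as the conclusion suggests; the numeric bounds below are made-up placeholders, not the ranges learned from the data.

import numpy as np

def skin_mask(img, eps=1e-6):
    # c1 c2 c3 per pixel, then a fixed (illustrative) range test
    R, G, B = img[..., 0], img[..., 1], img[..., 2]
    c1 = np.arctan(R / (np.maximum(G, B) + eps))
    c2 = np.arctan(G / (np.maximum(R, B) + eps))
    c3 = np.arctan(B / (np.maximum(R, G) + eps))
    return (c1 > 0.85) & (c2 > 0.55) & (c2 < 0.80) & (c3 < 0.60)

img = np.random.rand(8, 8, 3)     # stand-in image
print(int(skin_mask(img).sum()), "candidate skin pixels")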

3 Searching and finding. Searching groups of images: skin detection results.

3 Searching and finding. Searching groups of images: k-NN in VSM for portrait detection. Features are colour ratios; classification by k-NN. Data: 200 training and 100 test images in each class. Results: portrait queries are classified as 96% portrait, 4% non-portrait; non-portrait queries as 25% portrait, 75% non-portrait.

4 Object localisation. Split and merge: split regions until each patch is homogeneous...

4 Object localisation. Split and merge: ... and merge patches which are alike. Works because of spatial coherence.
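
A compact sketch of split-and-merge on a grey-value image: quadtree split on a variance test, then a naive merge of blocks with similar means. The thresholds and the merging rule are simplifying assumptions; a full implementation would merge only spatially adjacent regions and use colour statistics.

import numpy as np

def split(img, y, x, h, w, var_thr, regions):
    block = img[y:y + h, x:x + w]
    if block.var() <= var_thr or min(h, w) <= 2:       # homogeneous enough, or too small to split
        regions.append((y, x, h, w, float(block.mean())))
    else:
        h2, w2 = h // 2, w // 2
        for dy, dx, hh, ww in [(0, 0, h2, w2), (0, w2, h2, w - w2),
                               (h2, 0, h - h2, w2), (h2, w2, h - h2, w - w2)]:
            split(img, y + dy, x + dx, hh, ww, var_thr, regions)

def split_and_merge(img, var_thr=0.01, mean_thr=0.05):
    regions = []
    split(img, 0, 0, img.shape[0], img.shape[1], var_thr, regions)
    labels = list(range(len(regions)))                 # one label per leaf block
    for i, (_, _, _, _, mi) in enumerate(regions):     # naive merge: alike means share a label
        for j, (_, _, _, _, mj) in enumerate(regions[:i]):
            if abs(mi - mj) < mean_thr:
                labels[i] = labels[j]
                break
    return regions, labels

img = np.random.rand(16, 16)                           # stand-in grey-value image
regions, labels = split_and_merge(img)
print(len(regions), "blocks,", len(set(labels)), "merged regions")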

4 Object localisation. Homogeneity. 1. Normalized cross correlation: D_C(I, Q) = Σ_t h_I(t) h_Q(t) / √( Σ_t h_I(t)² · Σ_t h_Q(t)² ). 2. Histogram intersection: D_∩(I, Q) = Σ_t min{ h_I(t), h_Q(t) } / Σ_t h_Q(t), computed on hue histograms. D_C(I, Q) is dependent on occlusion and object clutter; D_∩(I, Q) is independent of object clutter and robust to occlusion.
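
A small sketch of the two measures applied to hue histograms of an image I and a query Q; the bin count and the random stand-in data are assumptions for illustration.

import numpy as np

def hue_histogram(hue, bins=16):
    h, _ = np.histogram(hue, bins=bins, range=(-np.pi, np.pi))
    return h.astype(float)

def normalized_cross_correlation(h_i, h_q):
    return float(np.dot(h_i, h_q) / (np.linalg.norm(h_i) * np.linalg.norm(h_q)))

def histogram_intersection(h_i, h_q):
    # fraction of the query histogram that is also present in the image histogram
    return float(np.minimum(h_i, h_q).sum() / h_q.sum())

hue_i = np.random.uniform(-np.pi, np.pi, 5000)   # stand-in hue values of image I
hue_q = np.random.uniform(-np.pi, np.pi, 1000)   # stand-in hue values of query Q
h_i, h_q = hue_histogram(hue_i), hue_histogram(hue_q)
print(normalized_cross_correlation(h_i, h_q), histogram_intersection(h_i, h_q))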

4 Object localisation. Data set example.

4 Object localisation. Results.

4 Object localisation. Results.

5 Image search engine: PicToSeek. Content-based image retrieval. Fast indexing. Query: pictorial example, attributes. Invariance. University of Amsterdam {gevers smeulders}@science.uva.nl http://carol.wins.uva.nl/~{gevers smeulders}