Beyond Mere Pixels: How Can Computers Interpret and Compare Digital Images? Nicholas R. Howe Cornell University

Why Image Retrieval?
World Wide Web
  Millions of hosts
  Billions of images
Growth of video libraries
Photography: going digital

Image Retrieval Framework
Collection of diverse images
User supplies a query image.
System returns most similar images from collection.

Image Similarity & Retrieval
Measuring the similarity between two images is a difficult test of image understanding.
[Figure: image pairs with candidate similarity scores (60%? 90%?)]
Q. Given a bunch of images, which are most similar to one I'm interested in?

Q. What's This?
94 129 124 207 157 142 161 136 38 22 26 55 191 122 177 181 130 133 147 196 157 94 27 11 170 177 191 183 128 102 160 171 155 179 162 72 187 184 171 170 188 192 200 168 173 152 167 180 131 147 179 188 200 194 181 171 149 167 176 167 132 134 172 192 195 192 205 160 159 169 165 158 209 198 172 183 197 175 172 151 166 157 162 180 154 191 176 192 200 162 152 149 142 164 169 156

Difficulties With Digital Images
Digital photographs are:
  Bit-mapped
  Low-resolution
  Restricted in color

The Amazing Brain
The brain sees more than just pixels.
Aggregation into objects and larger regions is automatic & unconscious.
Visual memory plays a role.

The Amazing Brain (2)
The brain synthesizes diverse sources of information:
  Shape
  Texture/Shading
  Multiple cues
Seemingly effortless (?)

Brain-Like Computers?
What we need are computers that can work more like brains.
In order of increasing difficulty:
  Analysis of shapes, regions, and texture
  Semantic labeling of image content
  Association with known models of objects, materials, and scenes

Dumb Stuff That Works
Some simple (reliable) statistics work better than cognitively plausible (unreliable) ones.
Classic example: color histograms for similarity (Swain & Ballard 1991)
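
As a rough illustration of comparing images by color histogram, the Python sketch below bins pixel colors coarsely and scores two images by normalized histogram intersection, the measure commonly associated with Swain & Ballard. The bin count, the function names, and the list-of-RGB-tuples pixel format are assumptions made for this example and do not come from the slides.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Count pixels per coarse RGB bin; pixels are (r, g, b) tuples in 0..255."""
    step = 256 // bins_per_channel
    hist = [0] * bins_per_channel ** 3
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[idx] += 1
    return hist


def histogram_intersection(h1, h2):
    """Normalized intersection in [0, 1]; 1 means identical color distributions."""
    total1, total2 = sum(h1), sum(h2)
    if total1 == 0 or total2 == 0:
        return 0.0
    return sum(min(a / total1, b / total2) for a, b in zip(h1, h2))


# Hypothetical toy "images": flat lists of pixels.
red = [(255, 0, 0)] * 4
blue = [(0, 0, 255)] * 4
mixed = [(255, 0, 0)] * 2 + [(0, 0, 255)] * 2

score_same = histogram_intersection(color_histogram(red), color_histogram(red))
score_half = histogram_intersection(color_histogram(mixed), color_histogram(red))
score_none = histogram_intersection(color_histogram(red), color_histogram(blue))
```

Because the histogram ignores where pixels sit in the image, any rearrangement of the same pixels scores as a perfect match, which is the weakness that spatially aware statistics try to address.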

Color Histograms Work
Comparisons on more than 20,000 images:
[Figure: a query image ("Given this") and the retrieved images ("These images are the most similar")]
Color histograms form the core of most working systems today.

But Not Always
Some related images have very different histograms.
Some unrelated images have nearly the same histogram.

Variation: Color Correlograms
Other statistical measures of image properties improve on color histograms.
Correlograms (Huang et al., 1997) have been particularly successful.
[Figure: table of probabilities]

Correlogram Details
Correlograms consist of a table of probabilities.
$C(x, y) = P\big(\mathrm{color}(b) = x \mid \mathrm{color}(a) = x \wedge |a - b| = y\big)$
Given a pixel of color x, the probability that a pixel chosen distance y away is also color x.

Distance   Red    Orange  Yellow  etc.
1 pixel    0.32   0.0     0.06    0.14
3 pixels   0.16   0.0     0.04    0.0
5 pixels   0.08   0.0     0.03    0.0

Correlograms can be compared like vectors.
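
The table C(x, y) can be estimated by direct counting. The Python sketch below is a simplified illustration, not the published algorithm: it assumes the image is already quantized to integer color indices and, to stay short, samples only the four axis-aligned pixels at distance y rather than every pixel at that distance. The function name and input format are hypothetical.

```python
def autocorrelogram(image, num_colors, distances):
    """Estimate C(x, y): P(pixel at distance y has color x | this pixel has color x).

    image is a 2D list of color indices. Returns table[k][x] for distances[k].
    Simplification: only the four axis-aligned neighbors at distance y are sampled.
    """
    height, width = len(image), len(image[0])
    table = []
    for d in distances:
        same = [0] * num_colors   # neighbor pairs sharing color x
        total = [0] * num_colors  # all neighbor pairs starting at color x
        for row in range(height):
            for col in range(width):
                color = image[row][col]
                for d_row, d_col in ((d, 0), (-d, 0), (0, d), (0, -d)):
                    n_row, n_col = row + d_row, col + d_col
                    if 0 <= n_row < height and 0 <= n_col < width:
                        total[color] += 1
                        if image[n_row][n_col] == color:
                            same[color] += 1
        table.append([s / t if t else 0.0 for s, t in zip(same, total)])
    return table


# A 4x4 checkerboard: direct neighbors always differ, pixels two steps away always match.
checkerboard = [[(row + col) % 2 for col in range(4)] for row in range(4)]
checker_table = autocorrelogram(checkerboard, num_colors=2, distances=[1, 2])
```

Flattening the table row by row gives a vector, so two correlograms can be compared with any vector distance.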

We Can Do Better
How can we do something smarter?
  Must incorporate spatial information & objects
  Must employ multiple cues
  Must adapt to lighting, etc.

Approach
1. Segmentation: identifies spatial patterns and objects.
2. Vector Representation: includes color, texture, and location cues.
3. Vector Comparison: allows adaptation for varying conditions.
4. Focus on Objects?

Segmentation
Segmenting an image means dividing it into regions that belong together.
Q. What's a sensible way to segment any given picture?

Characterizing Regions
When humans segment an image, they can explain why each region hangs together.
  Red flowers
  Bluish cloudy sky
  Green lawn
Models motivate the grouping into regions.

Mathematical Models of Regions
Model regions as smooth functions + noise.
[Figure: region = smooth function + noise; 2D example showing the smooth function and the noise profile]

Models of Regions (2)
Each model tries to predict the image.
Successful models are rare.
[Figure: a bad model and a good model, marked where each is correct and incorrect]

Outline of Segmentation Process
1. Start with small local regions. (Felzenszwalb & Huttenlocher 1998)
2. Create a pool of potential models.
3. Measure fit between all models & local regions.
4. Select a small number of models that fit many local regions well. (Details on the next slide)
[Figure: Goal]

Segmentation Details
Best segmentation found via energy minimization:
$E(R) = \sum_{r \in R} \mathrm{Fit}(r, M_r) + \sum_{r_1 \in R} \sum_{r_2 \in R} \Delta(r_1, r_2)$
The energy of a segmentation into regions R is equal to the fit of each region with its model plus a penalty to discourage excess regions.
Minimum energy is difficult to compute in general.
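
To make the two terms of the energy concrete, here is a minimal Python sketch that evaluates the energy of one candidate labeling. The slides do not spell out the penalty Δ, so this example assumes a constant penalty whenever two adjacent local regions are assigned different models, and zero otherwise. That choice, the `fit` cost table, and all names are hypothetical.

```python
def segmentation_energy(labels, fit, adjacent_pairs, penalty):
    """Energy of a labeling: total fit cost plus a penalty per split boundary.

    labels[r]      model index assigned to local region r
    fit[r][m]      cost of explaining region r with model m (lower is better)
    adjacent_pairs (r1, r2) index pairs of neighboring local regions
    penalty        ASSUMED constant cost when neighbors use different models
    """
    fit_term = sum(fit[r][labels[r]] for r in range(len(labels)))
    penalty_term = sum(penalty for r1, r2 in adjacent_pairs if labels[r1] != labels[r2])
    return fit_term + penalty_term


# Three local regions in a row, two candidate models.
fit = [[1, 5], [2, 5], [9, 1]]
adjacent_pairs = [(0, 1), (1, 2)]

energy_split = segmentation_energy([0, 0, 1], fit, adjacent_pairs, penalty=3)
energy_merged = segmentation_energy([0, 0, 0], fit, adjacent_pairs, penalty=3)
```

Here the split labeling wins because the third region fits the second model far better than the first. Raising the penalty eventually makes the single-model labeling cheaper, which is how the penalty discourages excess regions.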

Graph Formulation
Minimum energy = minimum graph cut (compare with Boykov et al., 1998)
[Figure: graph connecting model nodes to region nodes]

Graph Formulation (2)
Minimum graph cut = best segmentation
Running time bound: quadratic in # of nodes
Quality bound: energy found is at most 2 × optimal.

Examples

Related Work
Stereo Vision & Energy Minimization (Boykov, Veksler & Zabih, 1998)
Normalized Cuts (Shi & Malik, 1997)
JSEG (Deng, Manjunath, & Shin, 1999)

A Region-Based Representation
Begin with segmentation. (Provides locality.)
Describe each patch using multiple features.
  Color
  Texture
  Location
Combine such that each piece is preserved.

Describing an Image by its Parts
Discretize the range of each feature. (Color, texture, and location)
Count area in image described by each combination of features.
  Blue-Smooth-TopLeft: 5, Blue-Smooth-TopMiddle: 1, Green-Smooth-TopLeft: 0, etc.
[Figure: axes labeled Location, Texture, Color]

Discretization
Color: 28 bins
Texture: 3 bins (smooth, textured, rough)
Location: 25 bins
Total: 28 × 3 × 25 = 2100 combinations

Vector Representation
Final representation of image is a vector with 2100 dimensions.
$v = (v_{c_1 t_1 l_1}, v_{c_1 t_1 l_2}, \ldots, v_{c_1 t_1 l_{25}}, v_{c_1 t_2 l_1}, \ldots, v_{c_{28} t_3 l_{25}})$
Each dimension records how much of a particular type of material is present.
  e.g., how much smooth blue in the top left corner?
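
One way to lay out the 2100 dimensions is sketched below in Python. It follows the index order suggested by the vector definition (location varies fastest, then texture, then color) and uses zero-based bin numbers. The function names and the (color, texture, location, area) tuple format for segmented regions are assumptions for illustration.

```python
NUM_COLORS, NUM_TEXTURES, NUM_LOCATIONS = 28, 3, 25


def bin_index(color, texture, location):
    """Map zero-based (color, texture, location) bins to one vector position."""
    return (color * NUM_TEXTURES + texture) * NUM_LOCATIONS + location


def image_vector(regions):
    """Accumulate area per feature combination.

    regions is a list of (color_bin, texture_bin, location_bin, area) tuples,
    a hypothetical output format for the segmentation stage.
    """
    vector = [0.0] * (NUM_COLORS * NUM_TEXTURES * NUM_LOCATIONS)
    for color, texture, location, area in regions:
        vector[bin_index(color, texture, location)] += area
    return vector


# Suppose color 0 is blue, texture 0 is smooth, locations 0 and 1 are top-left and top-middle.
example_vector = image_vector([(0, 0, 0, 5), (0, 0, 1, 1)])
```

Most entries stay at zero for any single image, so in practice a sparse mapping from index to area would serve equally well.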

Comparison
Vectors are points in space.
Images with similar composition will have similar (normalized) vectors.
Angle between similar vectors will be small.
[Figure: vectors a, b, c, with similar and dissimilar pairs marked]

Comparison (2)
Compare two images using a cosine metric:
$D(v_1, v_2) = \cos^{-1}\left(\frac{v_1^T S v_2}{\sqrt{(v_1^T S v_1)(v_2^T S v_2)}}\right)$
Note generalization using S matrix:
  S = I is standard cosine metric.
  Other values of S allow adjustments to metric.
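
The generalized cosine metric translates almost directly into code. The Python sketch below uses dense nested lists for readability. The function names are hypothetical, and the clamp before `acos` is an added guard against floating-point rounding rather than part of the formula.

```python
import math


def bilinear(u, S, v):
    """Compute u^T S v for list-based vectors and a nested-list matrix."""
    return sum(u[i] * S[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))


def cosine_distance(v1, v2, S=None):
    """Angle (radians) between v1 and v2 under the similarity matrix S."""
    n = len(v1)
    if S is None:  # S = I gives the standard cosine metric
        S = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    denominator = math.sqrt(bilinear(v1, S, v1) * bilinear(v2, S, v2))
    if denominator == 0:
        raise ValueError("zero-length vector under S")
    ratio = bilinear(v1, S, v2) / denominator
    return math.acos(max(-1.0, min(1.0, ratio)))  # clamp against rounding error


# Two images with no bins in common.
a, b = [1.0, 0.0], [0.0, 1.0]
plain = cosine_distance(a, b)                              # orthogonal under S = I
partial = cosine_distance(a, b, [[1.0, 0.5], [0.5, 1.0]])  # bins count as half-matching
```

With the identity matrix the two images are maximally dissimilar. An off-diagonal entry of 0.5 treats the two bins as partially matching and shrinks the angle.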

Comparison: Match Coefficients
Discretization of features loses some similarity information.
  e.g., Blue is closer to Green than to Orange.
Such partial matches may be encoded in off-diagonal terms of S.
[Figure: matrix of match coefficients, first value vs. second value; lighter = more similar]

Evaluating the Vector Method
Two sets of test images: 12 and 16 categories of ~100 images each
Classification task
Most similar known image is used to classify unknown images.
[Figure: an unknown image (?) and its closest image; classify unknown as tiger]
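
The classification rule described here is nearest-neighbor labeling. A minimal Python sketch follows, assuming images are already reduced to feature vectors and using the plain angle between vectors as the default distance. The function names and the toy three-dimensional vectors are made up for the example.

```python
import math


def angle_between(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def classify(unknown, labeled_images, distance=angle_between):
    """Return the label of the known image closest to the unknown vector.

    labeled_images is a list of (vector, label) pairs.
    """
    _, label = min(labeled_images, key=lambda item: distance(unknown, item[0]))
    return label


# Hypothetical toy collection with three-dimensional vectors.
known = [([1.0, 0.0, 0.0], "tiger"), ([0.0, 1.0, 0.0], "cave"), ([0.0, 0.0, 1.0], "skier")]
predicted = classify([0.9, 0.1, 0.0], known)
```

Any other image distance can be passed through the `distance` parameter, so the same rule can evaluate histograms, correlograms, or region-based vectors.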

Sample Categories
Airshows, Caves, Elephants, Skiers, Polar Bears, Stained Glass

Classification Results
Comparison with histograms and correlograms:
[Figure: bar chart of percent correct (0 to 100) on Set One and Set Two for Histograms, Correlograms, and Vector Method (overall results)]
Outperforms baseline (histogram, green).
Competitive with advanced image metric (correlogram, blue).

Object Queries
Something we've wanted to do all along: search for objects, not whole images.
[Figure: two retrieved images, labeled Rank 60 (of 19,000) and Rank 1 (of 19,000)]

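How Object Queries Work
$S(i,j) = \begin{cases} S_{\mathrm{object}}(i,j) & \text{if } i \text{ and } j \text{ appear in the target object} \\ S_{\mathrm{background}}(i,j) & \text{if neither } i \text{ nor } j \text{ appear in the target} \\ 0 & \text{otherwise} \end{cases}$
[Figure: block layout of S with S_object (strict match) and S_background (loose match) blocks and zeros elsewhere]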
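
The piecewise definition of S can be assembled mechanically once the set of feature bins covered by the target object is known. The Python sketch below assumes that set is given as `object_bins`. That name, the function name, and the small 3 × 3 matrices are hypothetical.

```python
def object_query_matrix(S_object, S_background, object_bins):
    """Build S(i, j) for an object query.

    Uses S_object where both bins belong to the target object, S_background
    where neither does, and 0 for mixed object/background pairs.
    """
    n = len(S_object)
    in_object = [i in object_bins for i in range(n)]
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if in_object[i] and in_object[j]:
                S[i][j] = S_object[i][j]
            elif not in_object[i] and not in_object[j]:
                S[i][j] = S_background[i][j]
    return S


# Toy example: bins 0 and 1 describe the object, bin 2 is background.
strict = [[1.0, 0.2, 0.3], [0.2, 1.0, 0.4], [0.3, 0.4, 1.0]]
loose = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
query_S = object_query_matrix(strict, loose, object_bins={0, 1})
```

The zero entries stop object bins from matching background bins, so whatever surrounds the object in a candidate image cannot masquerade as the object itself.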

Testing Object Queries
200 images of cars
Visual context is irrelevant.
Classes are colors of car.

Sample Result (Red Cars)

Summary
Vector representation preserves critical image features.
Retrieval with vector representation is competitive with other techniques on full images.
Flexible use of regions allows search for objects & arbitrary figures of interest.

Related Work
Vector Representation: Howe & Huttenlocher, 2000; Howe, 2000; Howe, 1998
Earth Mover's Distance: Cohen, 1999
Blobworld (UC Berkeley): Carson et al., 1999; Belongie et al., 1997
Netra (UCSB): Deng & Manjunath, 1999; Ma & Manjunath, 1997

The Future
Moving away from absolutism
  OK, we can find red cars. Can we find cars?
Relational encodings:
  White fur next to red velvet
  A piece of all the same color
Interplay between segmentation, similarity, and compression/coding
  e.g., Color & texture from segment model

Challenges
Assumption: parts that belong together should look alike. This is not always true!
More sophisticated region models may help.

Generating the S Matrix
S is assembled from matrices S_j for each feature.
Smaller matrices are determined by the similarity of the feature values, e.g., Blue-Green vs. Blue-Orange.
[Figure: Kronecker product of the (Color) and (Location) matrices gives the (Color and Location) matrix]
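
A Kronecker product of per-feature similarity matrices can be written in a few lines of plain Python. The sketch below is illustrative only: the 2 × 2 matrices and their values are invented, and the function name is hypothetical.

```python
def kron(A, B):
    """Kronecker product of two nested-list matrices.

    Entry [i * len(B) + k][j * len(B[0]) + l] equals A[i][j] * B[k][l].
    """
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


# Invented example: two colors that half-match, two locations that never match.
S_color = [[1.0, 0.5], [0.5, 1.0]]
S_location = [[1.0, 0.0], [0.0, 1.0]]
S_combined = kron(S_color, S_location)
```

The order of the factors has to follow the order in which the vector is indexed. If location varies fastest, then texture, then color, the full matrix would be kron(kron(S_color, S_texture), S_location), which for 28, 3, and 25 bins is 2100 × 2100.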

Alternate View of S Matrix
Cholesky factorization of S:
$S = T^T T$
Cosine metric of modified vectors:
$D(v_1, v_2) = \cos^{-1}\left(\frac{(Tv_1)^T (Tv_2)}{\sqrt{\left((Tv_1)^T (Tv_1)\right)\left((Tv_2)^T (Tv_2)\right)}}\right)$
[Figure: v_1 and v_2 (no overlap) compared with Tv_1 and Tv_2 (overlap)]
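
The equivalence between the S-weighted product and an ordinary dot product of transformed vectors can be checked numerically. The Python sketch below computes an upper-triangular T with a textbook Cholesky routine, assuming S is symmetric positive definite. The function names and the 2 × 2 example are hypothetical.

```python
import math


def cholesky_upper(S):
    """Return upper-triangular T with S = T^T T (S must be symmetric positive definite)."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]  # lower-triangular factor, S = L L^T
    for i in range(n):
        for j in range(i + 1):
            partial = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(S[i][i] - partial)
            else:
                L[i][j] = (S[i][j] - partial) / L[j][j]
    return [[L[j][i] for j in range(n)] for i in range(n)]  # T is the transpose of L


def transform(T, v):
    """Matrix-vector product T v."""
    return [sum(t * x for t, x in zip(row, v)) for row in T]


S = [[1.0, 0.5], [0.5, 1.0]]
T = cholesky_upper(S)
v1, v2 = [1.0, 0.0], [0.0, 1.0]
Tv1, Tv2 = transform(T, v1), transform(T, v2)
dot_transformed = sum(a * b for a, b in zip(Tv1, Tv2))  # should equal v1^T S v2 = S[0][1]
```

The original vectors share no bins, but after transformation they overlap by exactly the off-diagonal similarity, so the ordinary cosine metric on Tv_1 and Tv_2 reproduces the generalized one.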

Optimizations
Similarity computation is linear in sparse vector v.
$D(v_1, v_2) = \cos^{-1}\left(\frac{v_1^T S v_2}{\sqrt{(v_1^T S v_1)(v_2^T S v_2)}}\right)$
[Figure: parts of the formula annotated "Computed once per query" and "Precompute & cache"]

Search Pruning
Nearest neighbor search can be pruned by projection onto lower-dimensional spaces.
[Figure: α in the full space and β in the projected space]
β is a lower bound on α.
Images with β greater than some cutoff need not be considered.
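
The pruning idea can be sketched with Euclidean distance, where the bound is easy to see: dropping coordinates can only shrink the distance, so the distance β between projections never exceeds the full distance α. The Python example below makes two assumptions that are not in the slides. The projection simply keeps the first k coordinates, and the cutoff is the best distance found so far rather than a fixed value. All names are hypothetical.

```python
import math


def pruned_nearest(query, database, k):
    """Nearest-neighbor search that skips candidates using a projected lower bound.

    beta (distance on the first k coordinates) never exceeds alpha (the full
    distance), so a candidate with beta >= best-so-far cannot be the winner.
    Returns (best_index, best_distance, number_of_skipped_candidates).
    """
    best_index, best_distance, skipped = None, math.inf, 0
    for index, vector in enumerate(database):
        beta = math.dist(query[:k], vector[:k])  # cheap lower bound
        if beta >= best_distance:
            skipped += 1
            continue
        alpha = math.dist(query, vector)  # full distance, computed only when needed
        if alpha < best_distance:
            best_index, best_distance = index, alpha
    return best_index, best_distance, skipped


database = [[1.0, 0.0, 0.0], [5.0, 0.0, 0.0], [0.5, 0.0, 0.1]]
best_index, best_distance, skipped = pruned_nearest([0.0, 0.0, 0.0], database, k=1)
```

The saving grows with the dimension of the full vectors, since each skipped candidate avoids a full-length distance computation.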

Dividing the Color Space
Color seeds are dispersed evenly in HSV color cone.
Divided into Voronoi regions.
Ensures perceptual uniformity.
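
Assigning a color to a Voronoi region amounts to finding its nearest seed. The Python sketch below places HSV colors in a cone whose radius is saturation × value and whose height is value, then picks the closest seed by Euclidean distance. The three seeds are invented for the example (the slides use 28 color bins but do not list seed positions), and the function names are hypothetical.

```python
import colorsys
import math


def hsv_cone_point(h, s, v):
    """Embed an HSV color (each component in 0..1) as a 3D point in the HSV cone."""
    angle = 2 * math.pi * h
    return (s * v * math.cos(angle), s * v * math.sin(angle), v)


def nearest_seed(rgb, seeds):
    """Index of the seed whose Voronoi region contains the given 0..255 RGB color."""
    r, g, b = (channel / 255 for channel in rgb)
    point = hsv_cone_point(*colorsys.rgb_to_hsv(r, g, b))
    return min(range(len(seeds)), key=lambda i: math.dist(point, seeds[i]))


# Invented seeds: pure red, black, and white.
seeds = [hsv_cone_point(0.0, 1.0, 1.0), hsv_cone_point(0.0, 0.0, 0.0), hsv_cone_point(0.0, 0.0, 1.0)]
reddish = nearest_seed((250, 10, 10), seeds)
dark = nearest_seed((5, 5, 5), seeds)
bright = nearest_seed((250, 250, 250), seeds)
```

Working in cone coordinates rather than raw (h, s, v) triples keeps dark colors close together regardless of their nominal hue, which a distance on raw HSV values would not do.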