Image Analysis & Retrieval. CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W Lec 18.

Similar documents
Image Analysis & Retrieval. CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W Lec 16

Lec 08 Feature Aggregation II: Fisher Vector, Super Vector and AKULA

Adaptive Binary Quantization for Fast Nearest Neighbor Search

Image Analysis & Retrieval. CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W Lec 13

Hashing with Graphs. Sanjiv Kumar (Google), and Shih Fu Chang (Columbia) June, 2011

ECE 484 Digital Image Processing Lec 17 - Part II Review & Final Projects Topics

Locality- Sensitive Hashing Random Projections for NN Search

Lecture 24: Image Retrieval: Part II. Visual Computing Systems CMU , Fall 2013

COSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor

Large-scale visual recognition Efficient matching

Algorithms for Nearest Neighbors

Machine Learning. Nonparametric methods for Classification. Eric Xing , Fall Lecture 2, September 12, 2016

Visual Representations for Machine Learning

Multiple-View Object Recognition in Band-Limited Distributed Camera Networks

Geometric data structures:

CLSH: Cluster-based Locality-Sensitive Hashing

Dimension Reduction CS534

Image Analysis & Retrieval Lec 10 - Classification II

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS 231A CA Session: Problem Set 4 Review. Kevin Chen May 13, 2016

Robust Face Recognition via Sparse Representation Authors: John Wright, Allen Y. Yang, Arvind Ganesh, S. Shankar Sastry, and Yi Ma

Homework 4: Clustering, Recommenders, Dim. Reduction, ML and Graph Mining (due November 19 th, 2014, 2:30pm, in class hard-copy please)

Approximate Nearest Neighbor Search. Deng Cai Zhejiang University

CS 340 Lec. 4: K-Nearest Neighbors

Fast Indexing Method. Dongliang Xu 22th.Feb.2008

Nearest Neighbor with KD Trees

Large-Scale Face Manifold Learning

Object Classification Problem

Large scale object/scene recognition

Mining Social Network Graphs

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

10-701/15-781, Fall 2006, Final

Geometric Registration for Deformable Shapes 3.3 Advanced Global Matching

On Order-Constrained Transitive Distance

Nearest Neighbors Classifiers

The Curse of Dimensionality

CS 664 Slides #11 Image Segmentation. Prof. Dan Huttenlocher Fall 2003

Machine Learning for Signal Processing Clustering. Bhiksha Raj Class Oct 2016

over Multi Label Images

Predictive Indexing for Fast Search

Similarity Searching Techniques in Content-based Audio Retrieval via Hashing

Fast Indexing and Search. Lida Huang, Ph.D. Senior Member of Consulting Staff Magma Design Automation

Clustering Billions of Images with Large Scale Nearest Neighbor Search

Segmentation: Clustering, Graph Cut and EM

MSA220 - Statistical Learning for Big Data

Nearest Neighbor with KD Trees

Big Data Analytics. Special Topics for Computer Science CSE CSE Feb 11

PRINCIPAL COMPONENT ANALYSIS. Eigenvectors of the covariance matrix are the principal components

Introduction to Machine Learning

Manifold Constrained Deep Neural Networks for ASR

Class 6 Large-Scale Image Classification

CS 664 Segmentation. Daniel Huttenlocher

Discriminate Analysis

Near Neighbor Search in High Dimensional Data (1) Dr. Anwar Alhenshiri

Computational Photography Denoising

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Evaluation and comparison of interest points/regions

doc. RNDr. Tomáš Skopal, Ph.D. Department of Software Engineering, Faculty of Information Technology, Czech Technical University in Prague

Web-Scale Multimedia: Optimizing LSH. Malcolm Slaney, Yury Lifshits, Junfeng He, Y! Research

Clustering. So far in the course. Clustering. Clustering. Subhransu Maji. CMPSCI 689: Machine Learning. dist(x, y) = ||x - y||_2^2

kd-trees Idea: Each level of the tree compares against 1 dimension. Let s us have only two children at each node (instead of 2 d )

on learned visual embedding patrick pérez Allegro Workshop Inria Rhônes-Alpes 22 July 2015

Segmentation Computer Vision Spring 2018, Lecture 27

Feature Descriptors. CS 510 Lecture #21 April 29 th, 2013

Problem 1: Complexity of Update Rules for Logistic Regression

Rongrong Ji (Columbia), Yu Gang Jiang (Fudan), June, 2012

Exercise 12: Image Segmentation. Image segmentation. Why do we need it? Image segmentation

Machine Learning for Data Science (CS4786) Lecture 11

Multidimensional Indexes [14]

Machine learning - HT Clustering

Recognition of Animal Skin Texture Attributes in the Wild. Amey Dharwadker (aap2174) Kai Zhang (kz2213)

Clustering. Informal goal. General types of clustering. Applications: Clustering in information search and analysis. Example applications in search

Behavioral Data Mining. Lecture 18 Clustering

VK Multimedia Information Systems

Searching in one billion vectors: re-rank with source coding

Algorithms for Nearest Neighbors

Clustering. Subhransu Maji. CMPSCI 689: Machine Learning. 2 April April 2015

Learning Low-rank Transformations: Algorithms and Applications. Qiang Qiu Guillermo Sapiro

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011

Repeating Segment Detection in Songs using Audio Fingerprint Matching

Lecture 4 Face Detection and Classification. Lin ZHANG, PhD School of Software Engineering Tongji University Spring 2018

Region-based Segmentation

Clustering Lecture 5: Mixture Model

Practical Data-Dependent Metric Compression with Provable Guarantees

Cluster Analysis. Mu-Chun Su. Department of Computer Science and Information Engineering National Central University 2003/3/11 1

CS 534: Computer Vision Segmentation and Perceptual Grouping

CS 229 Midterm Review

CS 534: Computer Vision Segmentation II Graph Cuts and Image Segmentation

Part-based and local feature models for generic object recognition

Classification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University

Large Scale Nearest Neighbor Search Theories, Algorithms, and Applications. Junfeng He

Learning Affine Robust Binary Codes Based on Locality Preserving Hash

Subspace Indexing on Grassmann Manifold for Large Scale Visual Recognition

Data Mining in Bioinformatics Day 1: Classification

10701 Machine Learning. Clustering

Thorsten Joachims Then: Universität Dortmund, Germany Now: Cornell University, USA

Introduction to spectral clustering

Locality-Sensitive Hashing

Introduction to Data Mining

Task Description: Finding Similar Documents. Document Retrieval. Case Study 2: Document Retrieval

Transcription:

Image Analysis & Retrieval, CS/EE 5590 Special Topics (Class Ids: 44873, 44874), Fall 2016, M/W 4-5:15pm @ Bloch 0012. Lec 18: Image Hashing. Zhu Li, Dept of CSEE, UMKC. Office: FH560E, Email: lizhu@umkc.edu, Ph: x2346. http://l.web.umkc.edu/lizhu

Outline
- Recap of Lec 17: Sparse Signal Recovery
  - L1 norm and the L1-Magic solution
  - Application to occluded face recognition
  - Application to super-resolution
- Media Data Hashing
  - LSH
  - Spectral Hashing
  - Grassmann Hashing
- Summary

Sparse Signal Recovery. If a signal is sparse in some (unknown) domain, then from random measurements we can reliably recover it via L1 minimization (L1-Magic): $\min_x \|x\|_1 \ \text{s.t.}\ y = Ax$.

Sparse Signal Recovery with L1-Magic (MATLAB; l1eq_pd comes from the l1magic package, which must be on the path):

% observations
y = A*x;
% initial guess = minimum-energy solution
x0 = A'*y;
% solve the equality-constrained L1 problem with the primal-dual method
xp = l1eq_pd(x0, A, [], y, 1e-3);
subplot(3,1,3); plot(xp); title('x(t) recovered by L1 magic');
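For context, here is a minimal setup sketch for the snippet above, assuming a synthetic K-sparse spike signal and a random Gaussian measurement matrix (all of the sizes are made up for illustration):

% Hypothetical setup: synthetic sparse signal and random measurements
N = 512;  K = 20;  M = 120;          % signal length, sparsity, number of measurements
x = zeros(N, 1);
idx = randperm(N);
x(idx(1:K)) = sign(randn(K, 1));     % K-sparse +/-1 spike signal
A = randn(M, N) / sqrt(M);           % random Gaussian measurement matrix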

Sparsity in Face Models. Assume y belongs to class i; then y can be written as a combination of the training faces of class i, $y = A_i x_i$, or, stacking all classes into one dictionary, $y = Ax$, where only a small number of coefficients in x have non-zero entries, i.e., x is sparse.

Illustration of Recovery from Sparsity. Assume y belongs to class 1; then most coefficients associated with the other classes are zero, and the few non-zero coefficients are concentrated in the class-1 block of the coefficient vector, $\alpha_1$.

Coupled Dictionary Learning. Pre-train a common set of coupled low- and high-resolution dictionaries. Super-resolve by solving an L1 minimization on the low-resolution patch, then use the same coefficients to reconstruct the high-resolution patch.

Dictionary Training. Training data: low- and high-resolution image patch pairs $Y_l = \{y_k\}$, $X_h = \{x_k\}$. Enforce common sparse coefficients across the two dictionaries.
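As a sketch of what "enforcing common sparse coefficients" typically means, the coupled dictionary formulation of Yang et al. trains both dictionaries on a shared code matrix (the exact weighting used in the lecture may differ):

$$\min_{D_h,\,D_l,\,Z}\ \|X_h - D_h Z\|_F^2 + \|Y_l - D_l Z\|_F^2 + \lambda \|Z\|_1$$

so a single set of sparse codes $Z$ must reconstruct the high-resolution patches through $D_h$ and the low-resolution patches through $D_l$.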

Results: 3x super-resolution comparison (figure panels: low-resolution input, bicubic, neighbor embedding [Chang CVPR 04], coupled dictionary, original).

Outline
- Recap of Lec 17: Sparse Signal Recovery
  - L1 norm and the L1-Magic solution
  - Application to occluded face recognition
  - Application to super-resolution
- Media Data Hashing
  - LSH & Spectral Hashing
  - Grassmann Hashing
  - Complementary Hashing
- Summary

Media Data Hashing Use Case: internet-scale image retrieval. The internet contains billions of images, and we want to search them. Challenges:
- Scale: a very large repository requires a compact representation
- Speed: a hash supports fast binary operations
- Accuracy: the hash needs to preserve the desired similarity in Hamming distance

Media Data Hashing. Recall the MPEG CDVS Scalable Fisher Vector pipeline: N SIFT descriptors are aggregated against a GMM and then binarized to produce the hash. Hash objective: find an image feature and a feature aggregation/projection, then binarize the representation to generate a hash such that the pairwise relationship between images is preserved by the Hamming distance between their hashes.

Tree-Based Hashing
- Kd-tree hash (data partition): iteratively split the data along its dimensions so that each leaf node holds an equal number of points; assign hash bits 1/0 while traversing down the kd-tree (see the sketch below).
- Octree/quadtree hash (space partition): iteratively split the space into 2^d equal-size cells; each node is addressed by a byte code, yielding a prefix hash.
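A toy MATLAB sketch of the kd-tree style bit assignment above, splitting at the median of one dimension per level (the depth and the dimension-cycling rule are my own illustrative choices, not necessarily the lecture's):

function B = kdtree_hash(X, depth)
% Assign 'depth' bits per point by recursive median splits (toy sketch).
% usage: B = kdtree_hash(randn(1000, 4), 8);
    B = false(size(X, 1), depth);
    B = splitnode(X, 1:size(X, 1), 1, depth, B);
end

function B = splitnode(X, idx, level, depth, B)
    if level > depth || numel(idx) < 2
        return;
    end
    dim = mod(level - 1, size(X, 2)) + 1;    % cycle through dimensions by level
    v = X(idx, dim);
    right = v > median(v);                   % bit = 1 for points above the median
    B(idx(right), level) = true;
    B = splitnode(X, idx(right),  level + 1, depth, B);
    B = splitnode(X, idx(~right), level + 1, depth, B);
end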

Curse of Dimensionality. When the data dimension is large (say > 20), tree-based solutions break down and degenerate to linear search with O(N) complexity.

Nearest Neighbor Search Definitions. Nearest neighbor (NN): given a query q, return the database point p closest to q. r-near neighbor (r-NN): return a point p with d(p, q) <= r, if one exists. Credit: P. Indyk, Approx NN search in High Dimensional Space, http://www.mit.edu/~andoni/lsh/

Approx. NN Search Definition. c-approximate r-near neighbor: if some database point lies within distance r of the query q, return a point within distance cr of q (c > 1).

Motivation for LSH: if p and q are close, then their projections Ap and Aq must also be close; the converse does not hold.

Locality Sensitivity. Definition: a hash family H is (p_1, p_2, r, cr)-sensitive if, for any points p and q, d(p, q) <= r implies Pr[h(p) = h(q)] >= p_1, and d(p, q) >= cr implies Pr[h(p) = h(q)] <= p_2 (with p_1 > p_2).

LSH (Locality Sensitive Hashing). Basic idea:
- Reduce images to feature vectors {x_k} in R^d, where d is usually large (e.g., SCFV: d = 32x128 = 4096)
- Select random projections y = A_j x, where each A_j is a 1xd random vector, and assign a bit 1 or 0 from each projection
- Concatenate the bits from all projections (e.g., y = A_1 x, y = A_2 x, y = A_3 x in the slide figure) as the hash of the image
No learning is involved (see the sketch below).
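A minimal MATLAB sketch of this random-projection hashing, with made-up sizes (the lecture's feature dimension, bit count, and thresholding convention may differ):

% Random-projection LSH: signs of random projections as hash bits
d = 512;              % feature dimension (illustrative)
k = 32;               % number of hash bits
n = 1000;             % number of database images
X = randn(n, d);      % database features, one row per image
A = randn(k, d);      % one random projection direction per bit

H  = (X * A') > 0;                        % n x k matrix of hash bits
q  = randn(1, d);                         % a query feature
hq = (q * A') > 0;                        % query hash
hd = sum(xor(H, repmat(hq, n, 1)), 2);    % Hamming distance to every item
[~, ranked] = sort(hd);                   % candidates ranked by Hamming distance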

LSH Analysis. Intuition: if two points p and q are close, they hash to the same bucket with probability p_1; if they are far apart, they hash to the same bucket with probability p_2 < p_1. Here $\Pr[h(p) = h(q)] = (1 - d(p, q)/D)^k$, where D is the number of dimensions of the binary representation and k is the size of the subset of hash bits sampled. We can vary the probability by changing k: adding more hash bits gathers more evidence. (Slide figure: collision probability vs. distance for k = 1 and k = 2.)
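For example (with numbers chosen only to illustrate the effect of k): if $d(p,q)/D = 0.1$ for a near pair and $0.4$ for a far pair, the collision probabilities at $k = 1$ are $0.9$ and $0.6$; at $k = 4$ they become $0.9^4 \approx 0.66$ and $0.6^4 \approx 0.13$, so sampling more bits separates near pairs from far pairs much more sharply.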

Indyk's LSH Results. Color histogram dataset from Corel Draw: 20,000 images, 64 dimensions. Used 1k, 2k, 5k, 10k, and 19k points for training; 1k points were used as queries. Computed the miss ratio: the fraction of queries with no hits.

Grassmann Hashing. Main motivation: allow multiple low-dimensional projections, generating multiple bits per projection, and penalize subspaces similar to those already selected, avoiding near-duplicate bits that waste the hash bit budget. GRASH:
- introduces the Grassmann metric to measure the similarity between different hashing subspaces, so the hashing functions better capture the data diversity;
- incorporates discriminant information into the hashing functions;
- extends the original LSH's 1-d hashing subspaces to m-d subspaces;
- applies non-uniform bucket sizes to generate the hashing codes, so the quantization distortion is minimized.

GRASH: Discriminative Projection via Learning (can be LDA/LPP).
- Run FLDA to get the first d Fisher faces, $W = [w_1, w_2, \dots, w_d] = \arg\max_W \frac{|W^T S_B W|}{|W^T S_W W|}$.
- Hash projection candidates: find Hashing Subspace Candidates (HSC) by traversing the combinations of m Fisher faces out of the d, where m is the number of hashing dimensions.
- Record the discriminant energy of each derived HSC $W_i$ (spanned by m of the Fisher faces), measured by the Fisher ratio $E_i = |W_i^T S_B W_i| / |W_i^T S_W W_i|$.

GRASH: Penalizing Similar Subspaces Already Chosen. Select the optimal k hashing functions by jointly favoring low error rate (high discriminant energy) and large Grassmann distance to the subspaces already selected, according to a criterion of the form
$\arg\max_i \; (1 - \lambda)\, E_i + \lambda \sum_{U_j \in \mathcal{U}} d_{\mathrm{Arc}}^2(U_i, U_j)$,
where $\mathcal{U}$ is the set of already-selected hashing subspaces, $d_{\mathrm{Arc}}$ is the Grassmann arc-length distance, and $\lambda$ trades off the two terms.
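For reference, a small MATLAB sketch of the Grassmann arc-length distance between two m-dimensional hashing subspaces, computed from their principal angles (the sizes are made up; this is standard linear algebra rather than code from the lecture):

% Grassmann arc-length distance between subspaces span(U1) and span(U2)
d = 20;  m = 2;                        % ambient and subspace dimensions
[U1, ~] = qr(randn(d, m), 0);          % orthonormal basis for subspace 1
[U2, ~] = qr(randn(d, m), 0);          % orthonormal basis for subspace 2
s = svd(U1' * U2);                     % cosines of the principal angles
theta = acos(min(max(s, -1), 1));      % clamp into [-1, 1] for numerical safety
dArc = norm(theta);                    % arc length: sqrt(sum(theta.^2))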

LSH vs GRASH

GRASH Bucket Design. Use a non-uniform bucket design for the hashing codes: apply the Lloyd-Max algorithm to the projected values to minimize the quantization distortion $D = E\|x - \hat{x}\|^2$.
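A minimal 1-D Lloyd-Max (k-means style) quantizer sketch for bucketing projected values; the bucket count and the data are invented for illustration:

% 1-D Lloyd-Max quantizer: non-uniform buckets minimizing E||x - x_hat||^2
y = randn(1e4, 1);                          % projected feature values
L = 4;                                      % buckets (2 bits per projection)
c = linspace(min(y), max(y), L)';           % initial reconstruction levels
for it = 1:50
    [~, idx] = min(abs(bsxfun(@minus, y, c')), [], 2);   % nearest level
    for b = 1:L
        if any(idx == b)
            c(b) = mean(y(idx == b));       % centroid (Lloyd) update
        end
    end
end
D = mean((y - c(idx)).^2);                  % resulting distortion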

Experiments. Datasets:
- A large human face dataset combining YALE, ESSEX, ORL, etc.: 6,680 faces of 417 individuals
- MSRA-MM dataset: around 10,000 images from 10 classes, each image with an 899-D feature (e.g., RGB histogram, wavelet texture)
Performance evaluation: the intersection rate, defined as
$I = \frac{1}{|Q|} \sum_{q \in Q} \frac{|U_{q,\mathrm{GRASH}} \cap U_{q,*}|}{|U_{q,*}|}$,
the average overlap between the neighbor set returned by the hash and the exact nearest-neighbor set.
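A tiny MATLAB sketch of how such an intersection rate can be computed, with made-up ID lists standing in for the retrieved and exact neighbor sets:

% Intersection rate between hash-retrieved and exact k-NN sets (toy data)
Ustar = {[1 2 3 4], [5 6 7 8]};        % exact k-NN ids per query
Uhash = {[1 2 9 4], [5 6 7 10]};       % ids retrieved via the hash
rates = zeros(numel(Ustar), 1);
for q = 1:numel(Ustar)
    rates(q) = numel(intersect(Uhash{q}, Ustar{q})) / numel(Ustar{q});
end
I = mean(rates);                        % here (3/4 + 3/4)/2 = 0.75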

Experiments: Face Hash. Face dataset: intersection rate vs. μ (number of hashing functions = 20, 8-NN).

Experiments: MSRA Data Set

Method       4-NNS   8-NNS   16-NNS  32-NNS
LSH-1bit     23.9%   28.8%   33.6%   35.1%
LSH-2bit     31.5%   34.6%   39.3%   39.8%
LSH-4bit     40.6%   45.7%   51.2%   55.1%
GRASH-1bit   39.3%   42.4%   49.7%   53.2%
GRASH-2bit   52.8%   55.8%   68.3%   72.3%
GRASH-4bit   63.9%   69.7%   73.6%   80.3%

Experiments: MSRA-MM dataset (results figure).

Spectral Hashing. To simplify the problem, first assume the items have already been embedded in a Euclidean space; we then try to embed the data into a Hamming space, i.e., a binary space (e.g., 010101001). (Fergus et al.)

Some definitions. Let $\{y_i\}_{i=1}^n$ be the list of code words (binary vectors of length k) for the n data points. Affinity: $W_{ij} = \exp(-\|x_i - x_j\|^2 / h^2)$ is the affinity matrix characterizing similarities between data points.

Objective function: make the average Hamming distance between similar points minimal, i.e., minimize $\sum_{ij} W_{ij}\,\|y_i - y_j\|^2$ over the binary codes (subject to the balance and uncorrelatedness constraints on the bits). What does this objective mean? The generated hash $\{y_i\}$ has balanced 1/0 bits, and the weights $W_{ij}$ enforce that similarity between data points is preserved by the Hamming distances between the $y_i$.

Objective of Spectral Hashing, explained: minimize the average Hamming distance between similar neighbors in the Euclidean space, while requiring that the code is binary, that each bit has a 50% chance of being 0 or 1, and that the bits are uncorrelated (these conditions bound the objective and rule out trivial codes).

Spectral Relaxation. Relaxing the binary constraint, we obtain an easy problem whose solutions are simply the k eigenvectors of D − W with the smallest (non-trivial) eigenvalues. Observation: this is similar to spectral graph partitioning, and can be solved as a generalized eigenvalue problem on the graph Laplacian. A small sketch follows.
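A toy MATLAB sketch of this relaxed training-set step (the data, the bandwidth h, and the direct dense eigendecomposition are my own simplifications; actual spectral hashing replaces this with the analytical eigenfunctions described next):

% Relaxed spectral hashing on a training set via the graph Laplacian
n = 500;  k = 3;  h = 0.3;
X = rand(n, 2);                               % toy 2-D training data
sq = sum(X.^2, 2);
D2 = bsxfun(@plus, sq, sq') - 2*(X*X');       % pairwise squared distances
W  = exp(-D2 / h^2);                          % affinity matrix
L  = diag(sum(W, 2)) - W;                     % unnormalized Laplacian D - W
[V, E] = eig(L);
[~, order] = sort(diag(E), 'ascend');
Y = V(:, order(2:k+1)) > 0;                   % drop the trivial eigenvector, threshold at 0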

New Sample After Embedding. Problem: the above only tells us how to compute the code for items in the training set. What about the test set, or a new query image? Computing the code for points outside the training set is called the out-of-sample extension (slide figure: what would the hash be for a new point v_9?).

New Sample Hash Assignment. We need a function mapping new points into the code space. Take the limit of the eigenvalue problem as n → ∞; this requires carefully normalizing the graph Laplacian. Analytical forms of the eigenfunctions exist for certain distributions (uniform, Gaussian), giving constant-time computation/evaluation for a new point. For a uniform distribution, the eigenfunctions are sinusoids and the 1/0 assignment is obtained by thresholding them.

The Algorithm. Input: data $\{x_i\}$ of dimensionality d; desired number of bits, k.

1. Fit a multidimensional rectangle: run PCA to align the axes, then bound a uniform distribution along each axis.

2. Calculate the analytical eigenfunctions along each PCA direction.

3. Pick the k eigenfunctions with the smallest eigenvalues (e.g., k = 3).

4. Threshold the chosen eigenfunctions at zero to obtain the binary code. A rough end-to-end sketch follows.
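Putting the four steps together, here is a rough MATLAB sketch of training and encoding (my reading of the Weiss et al. recipe, with toy data; the exact eigenfunction constants may differ from the reference code, so treat it as illustrative only):

% Spectral hashing sketch: PCA -> uniform box -> analytical 1-D
% eigenfunctions -> keep the k lowest-frequency ones -> threshold at zero
n = 2000;  d = 8;  k = 16;
X = randn(n, d) * diag(1:d);                  % toy data with unequal variances

% 1. PCA to align the axes, then bound a box per direction
Xc = bsxfun(@minus, X, mean(X, 1));
[V, ~] = eig(cov(Xc));                        % principal directions
Xp = Xc * V;
mn = min(Xp, [], 1);   mx = max(Xp, [], 1);

% 2-3. Mode j on direction i has frequency w = j*pi/(b_i - a_i);
% lower frequency means smaller eigenvalue, so rank (i, j) pairs by w.
maxMode = ceil(k / d) + 2;
[dimIdx, modeIdx] = ndgrid(1:d, 1:maxMode);
w = modeIdx .* (pi ./ repmat((mx - mn)', 1, maxMode));
[~, order] = sort(w(:), 'ascend');
sel = order(1:k);                             % 4. keep the k smallest

% Encode: threshold sin(w*(x - a) + pi/2) for each selected (i, j) pair
B = false(n, k);
for b = 1:k
    i = dimIdx(sel(b));   j = modeIdx(sel(b));
    wi = j * pi / (mx(i) - mn(i));
    B(:, b) = sin(wi * (Xp(:, i) - mn(i)) + pi/2) > 0;
end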

Back to the 2-D Toy Example: hashing the new data points. (Figure: codes of 3, 7, and 15 bits; points colored red, green, or blue according to a Hamming distance of 0, 1, or 2 bits from the query.)

2-D Uniform Toy Example: comparison (Fergus et al.).

Some results on the LabelMe dataset. Observation: spectral hashing achieves the best performance.

Summary
- Image hashing is a very useful technique for large-scale image retrieval.
- Locality Sensitive Hashing: random projections generate the hash bits; with a sufficient number of projections, the Hamming distance preserves the original distance, since d(p, q) nearness is always preserved under projection. It is not very efficient, though (see Complementary Hashing).
- Grassmann Hashing: allows flexible multi-dimensional projections and bucket design, and penalizes similar projections using the Grassmann metric.
- Spectral Hashing: uses eigenfunctions of the local graph Laplacian to generate the hash bits, which amounts to a segmentation-style assignment of the data.