Video Google faces. Josef Sivic, Mark Everingham, Andrew Zisserman. Visual Geometry Group University of Oxford


1 Video Google faces Josef Sivic, Mark Everingham, Andrew Zisserman Visual Geometry Group University of Oxford

2 The objective
- Retrieve all shots in a video (e.g. a feature-length film) containing a particular person
- Visually defined search on faces
- Pretty Woman [Marshall, 1990]
- Applications: intelligent fast-forward on characters; pull out all videos of x from 1000s of digital camera MPEGs

3 Uncontrolled viewing conditions
Image variations due to: pose/scale, lighting, partial occlusion, expression (cf. standard face databases)

4 The ideal situation face space Despite all these image variations, want different identities to map to distinct unique points

5 and reality face space

6 Approach
- Minimize variations due to pose and lighting by choice of feature vector
- Use multiple face exemplars to represent expressions
- Each identity is represented by a distribution over exemplar feature vectors

7 The benefits of video Automatically associate expression exemplars

8 Outline
1. Obtaining sets of faces using tracking within shots (identity free)
2. Matching face sets within shots (requires identity matching)
3. Indexing for efficient retrieval
Live demo: Pretty Woman, Groundhog Day, Casablanca

9 1. Obtaining sets of faces by tracking within a shot

10 Face detection
Need to associate detections with the same identity across frames

11 Face detector
- Mikolajczyk et al., ECCV 2004
- In the tradition of Rowley et al. '96, Schneiderman & Kanade '00, Viola & Jones '01; also inspired by the SIFT descriptor of Lowe '99
- Local features: quantized gradient orientations, Laplacian
- Weak classifiers: from feature occurrence and co-occurrence
- Strong classifier: using AdaBoost
- Operate at the high-precision (90%) point: few false positives

12 Face detection performance on CMU-MIT test data
- 125 images with 481 frontal faces
- (Figure: ROC curve)
- Operate at the high-precision (90%) point: few false positives

13 Track local affine covariant regions on faces
- Detect regions independently in each frame
- A region's size and shape are not fixed, but automatically adapt to the image intensity to cover the same physical surface, i.e. the pre-image is the same surface region
- Tracking: connect the detected regions temporally
- Track through pose changes, partial occlusions, face deformations

15 Viewpoint covariant segmentation
- Characteristic scales (size of region): Lindeberg and Garding, ECCV 1994; Lowe, ICCV 1999; Mikolajczyk and Schmid, ICCV 2001
- Affine covariance (shape of region): Baumberg, CVPR 2000; Matas et al., BMVC 2002; Mikolajczyk and Schmid, ECCV 2002; Schaffalitzky and Zisserman, ECCV 2002; Tuytelaars and Van Gool, BMVC 2000
- Maximally stable regions; shape adapted regions

16 Tracking covariant regions: two stages
Goal: develop very long and good quality tracks
- Stage I: match regions detected in neighbouring frames (problems: e.g. missing detections)
- Stage II: repair tracks by region propagation

17 Example I original sequence

18 Example I tracked regions

19 Example II tracked regions

20 Region tubes

21 Connecting face detections temporally
- Goal: associate face detections of each character within a shot
- Approach: agglomeratively merge face detections based on the region tubes connecting them across frames
- Require a minimum number of region tubes to overlap face detections
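The merging rule on this slide can be illustrated with a small union-find sketch. Everything about the data layout here is an assumption made for illustration and does not come from the slides: each detection is a pair of a frame index and the set of region-tube ids passing through its bounding box, and `min_shared` is a hypothetical threshold for the "minimum number of region tubes". Linking by threshold and taking connected components is used here as a simple stand-in for the agglomerative procedure.

```python
def merge_detections_by_tubes(detections, min_shared=3):
    """Group face detections that are linked by enough common region tubes.

    detections: list of (frame_index, set_of_tube_ids) pairs (assumed format).
    min_shared: hypothetical minimum number of shared tubes needed for a link.
    Returns a sorted list of groups, each a list of detection indices.
    """
    parent = list(range(len(detections)))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(detections)):
        frame_i, tubes_i = detections[i]
        for j in range(i + 1, len(detections)):
            frame_j, tubes_j = detections[j]
            if frame_i == frame_j:
                continue  # never link two detections from the same frame directly
            if len(tubes_i & tubes_j) >= min_shared:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(detections)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


detections = [
    (0, {1, 2, 3, 4}),
    (1, {2, 3, 4, 5}),
    (1, {9, 10, 11}),
    (2, {9, 10, 11, 12}),
]
face_tubes = merge_detections_by_tubes(detections, min_shared=3)
```

Note that this sketch only blocks direct same-frame links; two detections in one frame could still end up in the same group through a chain of links, which a fuller implementation would need to guard against.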

22 Example: Buffy the Vampire Slayer Breakfast Scene

23 raw face detections

24 face tubes

25 2. Matching face sets within shots

26 Face feature vector
- A possible approach is to determine 3D pose/illumination in the manner of the Blanz, Romdhani & Vetter 3D morphable model
- Instead, concentrate on near-frontal pose, and compensate for pose/illumination variation using descriptors designed with built-in invariance: multiple overlapping SIFTs

27 Face feature vector - summary
- Multiple, overlapping, affinely transformed local SIFT descriptors
- Pipeline: face detector, then eyes/nose/mouth detection, then multiple overlapping SIFTs
- Inspired by the von der Malsburg et al. Elastic Bunch Graph Matching representation, and the Heisele et al. Component Approach

28 Detect face features for rectification Video with detected features close-up rectified face

29 Eyes/nose/mouth detectors Training data: ~5,000 images with hand-marked facial features Scale determined by face detector Fixed-size patches extracted around feature points

30 Constellation-like appearance/shape model
- Model shape X (2-D points) and appearance A (patches at points in X)
- Appearance and shape are assumed independent
- Appearance of a feature is modelled as a mixture of Gaussians (GMM); the EM (mixture of probabilistic PCA) algorithm is used to estimate parameters
- Joint position of all features is modelled as a (mixture of) Gaussians with full covariance (positions of all features interact)
- (Figure labels: position x_i, GMM clusters, appearance a_j)

31 SIFT descriptor [Lowe 1999]
- Computed on the rectified face
- Create an array of orientation histograms
- 8 orientations x 3x3 spatial bins = 72 dimensions

32 Face feature vector - summary
Benefits of local SIFT descriptors (multiple overlapping SIFTs):
- SIFT is unaffected by small localization errors in the eyes/nose/mouth detector
- Centre weighting de-emphasizes background (no foreground segmentation)
- Illumination normalization per SIFT allows lighting to vary across the face
- One SIFT for each facial feature, i.e. a 5 x 72 = 360-vector for the entire face
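To make the 72 and 360 dimension counts concrete, here is a simplified, SIFT-like descriptor in plain Python. It is only a sketch under stated assumptions and not the descriptor used in the talk: bins are assigned hard (Lowe's descriptor interpolates between bins), the affine transformation of the patch is omitted, and the Gaussian width `sigma` for the centre weighting is a hypothetical choice. The function names are illustrative.

```python
import math


def sift_like_descriptor(patch, n_orient=8, n_spatial=3):
    """Orientation histograms over an n_spatial x n_spatial grid of a square patch.

    patch: list of rows of grey values (square). Returns n_orient * n_spatial**2
    floats, L2-normalized so that overall brightness scaling cancels out.
    """
    size = len(patch)
    hist = [0.0] * (n_orient * n_spatial * n_spatial)
    centre = (size - 1) / 2.0
    sigma = size / 2.0  # assumed width of the centre weighting
    for y in range(1, size - 1):
        for x in range(1, size - 1):
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(dx, dy)
            if mag == 0.0:
                continue
            angle = math.atan2(dy, dx) % (2 * math.pi)
            o = int(angle / (2 * math.pi) * n_orient) % n_orient
            by = min(y * n_spatial // size, n_spatial - 1)
            bx = min(x * n_spatial // size, n_spatial - 1)
            # Centre weighting: pixels near the patch border count less.
            w = math.exp(-((x - centre) ** 2 + (y - centre) ** 2) / (2 * sigma ** 2))
            hist[(by * n_spatial + bx) * n_orient + o] += w * mag
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


def face_vector(feature_patches):
    """Concatenate one descriptor per facial feature (5 patches give 5 x 72 = 360)."""
    vec = []
    for patch in feature_patches:
        vec.extend(sift_like_descriptor(patch))
    return vec


# Toy input: a horizontal intensity ramp, so every gradient points along +x.
ramp = [[float(x) for x in range(12)] for _ in range(12)]
descriptor = sift_like_descriptor(ramp)
face = face_vector([ramp] * 5)
```

Normalizing each of the five descriptors separately is what lets lighting differ from one part of the face to another without affecting the vector.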

33 Parameters/representation
- Support region size, number and overlap
- Representation of distribution
- Distance measures between distributions over face exemplars

34 Representation of face set - I
- Represent a face tube by a set of 360-vectors
- No representation of ordering or dynamics

35 Matching face sets within a shot
min-min distance:
$$ d(A, B) = \min_{a \in A,\, b \in B} d(a, b) $$
where A, B are sets of face descriptors (360-vectors)
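The min-min distance is short enough to write out directly. In this sketch the per-descriptor distance d(a, b) is assumed to be Euclidean, which the slide does not specify; any other distance can be passed in through the hypothetical `dist` parameter.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptors (assumed choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_min_distance(A, B, dist=euclidean):
    """Distance between the closest pair of descriptors, one from each set."""
    return min(dist(a, b) for a in A for b in B)


# Toy 2-D "descriptors" standing in for 360-vectors.
A = [[0.0, 0.0], [5.0, 5.0]]
B = [[1.0, 0.0], [9.0, 9.0]]
d_AB = min_min_distance(A, B)
```

Because only the single closest pair counts, two face sets match as soon as they share one similar expression or pose, at the cost of sensitivity to a single outlying descriptor.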

36 Matching face sets within a shot
- Goal: match face tubes of a particular person within a shot (to overcome occlusions, self-occlusions)
- Approach: agglomeratively merge face sets using the min-min distance with exclusion constraints
- Exclusion principle: the same character cannot appear twice in the same frame
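A minimal sketch of agglomerative merging under the exclusion principle follows. The data layout and the stopping rule are assumptions for illustration only: each face set carries its descriptors and the set of frame indices it occupies, two sets that share a frame are never merged, and merging stops when the closest admissible pair is farther apart than a hypothetical `threshold`. The descriptor distance is again assumed Euclidean.

```python
import math


def min_min_distance(A, B):
    """Smallest Euclidean distance between any descriptor in A and any in B."""
    return min(
        math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))) for a in A for b in B
    )


def merge_face_sets(face_sets, frames, threshold):
    """Greedy agglomerative merging with the exclusion constraint.

    face_sets: list of descriptor lists, one per face tube.
    frames: list of sets of frame indices covered by each tube.
    threshold: hypothetical maximum min-min distance allowed for a merge.
    Returns a list of groups of original tube indices.
    """
    sets = [list(s) for s in face_sets]
    spans = [set(f) for f in frames]
    members = [[i] for i in range(len(sets))]
    while True:
        best = None
        for i in range(len(sets)):
            for j in range(i + 1, len(sets)):
                if spans[i] & spans[j]:
                    continue  # exclusion: both appear in some frame, so different people
                d = min_min_distance(sets[i], sets[j])
                if d <= threshold and (best is None or d < best[0]):
                    best = (d, i, j)
        if best is None:
            return members
        _, i, j = best
        sets[i] += sets[j]
        spans[i] |= spans[j]
        members[i] += members[j]
        del sets[j], spans[j], members[j]


tubes = [[[0.0, 0.0]], [[0.1, 0.0]], [[0.0, 0.2]]]
tube_frames = [{0, 1, 2}, {1, 2, 3}, {5, 6}]
merged = merge_face_sets(tubes, tube_frames, threshold=0.5)
```

In the toy data, tubes 0 and 1 are the closest pair but overlap in frames 1 and 2, so they stay apart; tube 2 joins tube 0, and the merged set then also excludes tube 1.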

37 face tubes (tracking only)

38 intra-shot matching

39 3. Indexing for efficient retrieval

40 Preliminaries
Film statistics for Pretty Woman:
- 170,000 frames
- 1151 shots
Pre-processing:
- Track local regions through every shot
- Detect faces in every frame using a 'frontal' face detector (38,0457 face detections)
- Obtain face tubes by tracking (659 face tubes)
- After intra-shot matching (plumbing) (611 face tubes)

41 Face tube representation as a single vector
Representation of face set - II: compact representation of the entire face set
- One descriptor for each face
- Obtain a compact representation for the entire face tube
- Treat face descriptors as samples from an underlying unknown pdf
- Represent the face tube as a histogram over face exemplars (non-parametric model of the pdf)
- cf. Gaussian approximation [Shakhnarovich et al., ECCV 2002]

42 Represent face tube as histogram over face exemplars
- Given a set of precomputed face exemplars, assign each face in the tube to the nearest exemplar and accumulate counts
- Keep a separate histogram for each facial feature
- Concatenate the histograms for each facial feature into an n-vector
- Facial feature exemplars are obtained by k-means clustering on a subset of the movie
- (Figure: face tube, exemplars, histogram of counts over exemplars)
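The assignment and counting steps can be sketched as follows. The data layout is an assumption made for illustration: each face is a dictionary from a facial feature name to its descriptor, and `vocabularies` maps each feature name to its list of exemplars. Normalizing the concatenated vector to sum to 1 is also an assumption; the slides only say that histograms are normalized.

```python
def nearest_exemplar(descriptor, exemplars):
    """Index of the exemplar with the smallest squared Euclidean distance."""
    return min(
        range(len(exemplars)),
        key=lambda k: sum((x - y) ** 2 for x, y in zip(descriptor, exemplars[k])),
    )


def tube_histogram(tube, vocabularies, feature_names):
    """Concatenated per-feature histograms over exemplars for one face tube.

    tube: list of faces, each a dict feature_name -> descriptor (assumed format).
    vocabularies: dict feature_name -> list of exemplar descriptors.
    feature_names: fixed ordering of the features in the output vector.
    """
    hist = []
    for name in feature_names:
        exemplars = vocabularies[name]
        counts = [0] * len(exemplars)
        for face in tube:
            counts[nearest_exemplar(face[name], exemplars)] += 1
        hist.extend(counts)
    total = sum(hist)
    return [c / total for c in hist] if total else hist


# Toy vocabulary with 1-D descriptors: 2 eye exemplars and 3 mouth exemplars.
vocab = {"eye": [[0.0], [1.0]], "mouth": [[0.0], [1.0], [2.0]]}
tube = [
    {"eye": [0.1], "mouth": [1.9]},
    {"eye": [0.9], "mouth": [2.2]},
    {"eye": [0.2], "mouth": [0.1]},
]
h = tube_histogram(tube, vocab, ("eye", "mouth"))
```

The output length is the sum of the vocabulary sizes, so a tube of any length is reduced to one fixed-size vector.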

43 Examples of face feature vocabulary
Facial vocabulary: K-means initialized by progressive constructive clustering (determines K)

Facial feature   K
left eye         537
middle eye       523
right eye        675
mouth            834
nose             675
Total            3,244

44 Examples of face feature visual words

45 Represent marginals of each facial feature, not joint

46 Matching face tubes
Use chi-squared as a distance measure between face tube histograms:
$$ \chi^2(p, q) = \sum_{k=1}^{S} \frac{(p_k - q_k)^2}{p_k + q_k} $$
(Figure: two histograms of counts p_k and q_k over exemplars)
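A direct implementation of the chi-squared distance, as a minimal sketch. The one detail added here that is not on the slide is the handling of empty bins: bins where both histograms are zero are skipped to avoid dividing by zero.

```python
def chi_squared(p, q):
    """Chi-squared distance: sum over bins of (p_k - q_k)^2 / (p_k + q_k)."""
    total = 0.0
    for pk, qk in zip(p, q):
        s = pk + qk
        if s > 0:  # bins empty in both histograms contribute nothing
            total += (pk - qk) ** 2 / s
    return total


same = chi_squared([0.5, 0.5], [0.5, 0.5])
disjoint = chi_squared([1.0, 0.0], [0.0, 1.0])
```

For histograms that each sum to 1 the distance ranges from 0 (identical) to 2 (no bin in common).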

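47 Matching face tubes
Use chi-squared as a distance measure between face tube histograms:
$$ \chi^2(p, q) = \sum_{k=1}^{S} \frac{(p_k - q_k)^2}{p_k + q_k} $$
An alternative would be to measure KL divergence between the sets:
$$ \mathrm{KL}(p \,\|\, q) = \sum_{k} p_k \log \frac{p_k}{q_k} $$
though these are related as
$$ \tfrac{1}{2}\,\chi^2(p, q) \;\le\; \mathrm{KL}\!\left(p \,\Big\|\, \tfrac{1}{2}(p + q)\right) + \mathrm{KL}\!\left(q \,\Big\|\, \tfrac{1}{2}(p + q)\right) \;\le\; \ln 2 \cdot \chi^2(p, q) $$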
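The relation between the two measures can be checked numerically. This is a sketch on made-up histograms, not an evaluation from the talk; the function names are illustrative, and the natural logarithm is assumed in the KL divergence so that it matches the ln 2 factor in the bound.

```python
import math


def chi_squared(p, q):
    """Sum over bins of (p_k - q_k)^2 / (p_k + q_k), skipping empty bins."""
    return sum((pk - qk) ** 2 / (pk + qk) for pk, qk in zip(p, q) if pk + qk > 0)


def kl(p, q):
    """KL(p || q) with natural log; terms with p_k = 0 contribute zero."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)


def kl_to_mean(p, q):
    """KL(p || m) + KL(q || m), where m is the average of p and q."""
    m = [(pk + qk) / 2.0 for pk, qk in zip(p, q)]
    return kl(p, m) + kl(q, m)


p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]
lower = 0.5 * chi_squared(p, q)
middle = kl_to_mean(p, q)
upper = math.log(2.0) * chi_squared(p, q)
```

Measuring KL against the average of the two histograms keeps the quantity finite even when a bin is empty in one of them, which plain KL(p || q) is not.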

48 Making the search efficient (Google-like retrieval)
- Represent the video by a histogram over facial feature exemplars for each face tube
- (Figure: matrix of counts p_k, 42 facial feature exemplars by 5 face tubes; each column is a normalized histogram)
- cf. words vs documents (e.g. web pages) in text retrieval
- Employ text-retrieval techniques, e.g. inverted file indexing and ranking (here on chi-squared)
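One way an inverted file can support chi-squared ranking is sketched below. It relies on the identity (p - q)^2 / (p + q) = p + q - 4pq / (p + q), so only bins that are non-zero in both the query and a stored tube need to be visited. The index layout and function names are assumptions for illustration; the slides do not describe how the index is organized.

```python
from collections import defaultdict


def build_inverted_file(tube_histograms):
    """Map each exemplar index to the (tube_id, value) pairs with a non-zero bin."""
    index = defaultdict(list)
    for tube_id, hist in enumerate(tube_histograms):
        for k, value in enumerate(hist):
            if value > 0:
                index[k].append((tube_id, value))
    return index


def rank_tubes(query, index, tube_sums):
    """Return (chi_squared, tube_id) pairs sorted from best to worst match.

    Uses chi2(p, q) = sum(p) + sum(q) - 4 * sum_k p_k q_k / (p_k + q_k),
    where the last sum only runs over bins that are non-zero in both.
    """
    overlap = defaultdict(float)
    for k, qk in enumerate(query):
        if qk > 0:
            for tube_id, pk in index.get(k, ()):
                overlap[tube_id] += pk * qk / (pk + qk)
    q_sum = sum(query)
    scores = [
        (tube_sums[t] + q_sum - 4.0 * overlap.get(t, 0.0), t)
        for t in range(len(tube_sums))
    ]
    return sorted(scores)


tubes = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.25, 0.75, 0.0, 0.0],
]
index = build_inverted_file(tubes)
ranking = rank_tubes([0.5, 0.5, 0.0, 0.0], index, [sum(h) for h in tubes])
```

Tubes that share no exemplar with the query are never touched in the posting lists and simply receive the maximum distance, which is what makes sparse histograms cheap to search.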

49 Video Google Faces Demo

50 Inter shot retrieval ground truth evaluation Ground truth for 7 characters: 373 face tracks (minimum number of 10 detections)

51 Inter shot retrieval example I Query sequence Retrieved sequences (shown by first detection) Example sequence

52 Inter shot retrieval example II Query sequence Examples of recognized faces Retrieved sequences (shown by first detection)

53 Inter shot retrieval (other characters)

54 Example: Matching across movies
Bill Murray: Lost in Translation [Coppola 2003], Groundhog Day [Ramis 1993]

55 Lost in Translation - query
Query shot; example face detection; 192 associated face detections

56 Find Bill Murray in Groundhog Day
- Face detections from the first 36 retrieved face tracks
- First false positive ranked 42nd
- 15 false positives in the first 100 retrieved face tracks (out of a total of 596 face tracks)

57 Summary
Face shot retrieval using a specialized vocabulary and strong spatial model
Extensions:
- Include hair/clothes in visual query for more specific search (integrate vocabularies)
- Add profile face detector to harvest further face tubes
- Use of exclusion principle to provide negative exemplar sets in inter-shot matching
- Apply to other object classes
Previous work: Object retrieval in entire movies, Sivic and Zisserman, ICCV 2003
Demo:


More information

Component-based Face Recognition with 3D Morphable Models

Component-based Face Recognition with 3D Morphable Models Component-based Face Recognition with 3D Morphable Models Jennifer Huang 1, Bernd Heisele 1,2, and Volker Blanz 3 1 Center for Biological and Computational Learning, M.I.T., Cambridge, MA, USA 2 Honda

More information

A performance evaluation of local descriptors

A performance evaluation of local descriptors MIKOLAJCZYK AND SCHMID: A PERFORMANCE EVALUATION OF LOCAL DESCRIPTORS A performance evaluation of local descriptors Krystian Mikolajczyk and Cordelia Schmid Dept. of Engineering Science INRIA Rhône-Alpes

More information

Visual Object Recognition

Visual Object Recognition Visual Object Recognition -67777 Instructor: Daphna Weinshall, daphna@cs.huji.ac.il Office: Ross 211 Office hours: Sunday 12:00-13:00 1 Sources Recognizing and Learning Object Categories ICCV 2005 short

More information

Detecting Printed and Handwritten Partial Copies of Line Drawings Embedded in Complex Backgrounds

Detecting Printed and Handwritten Partial Copies of Line Drawings Embedded in Complex Backgrounds 9 1th International Conference on Document Analysis and Recognition Detecting Printed and Handwritten Partial Copies of Line Drawings Embedded in Complex Backgrounds Weihan Sun, Koichi Kise Graduate School

More information

Available from Deakin Research Online:

Available from Deakin Research Online: This is the published version: Arandjelovic, Ognjen and Zisserman, A 2005, Automatic face recognition for film character retrieval in feature length films, in CVPR 2005 : Proceedings of the Computer Vision

More information

Image Retrieval (Matching at Large Scale)

Image Retrieval (Matching at Large Scale) Image Retrieval (Matching at Large Scale) Image Retrieval (matching at large scale) At a large scale the problem of matching between similar images translates into the problem of retrieving similar images

More information

Texture Features in Facial Image Analysis

Texture Features in Facial Image Analysis Texture Features in Facial Image Analysis Matti Pietikäinen and Abdenour Hadid Machine Vision Group Infotech Oulu and Department of Electrical and Information Engineering P.O. Box 4500, FI-90014 University

More information

An Evaluation of Volumetric Interest Points

An Evaluation of Volumetric Interest Points An Evaluation of Volumetric Interest Points Tsz-Ho YU Oliver WOODFORD Roberto CIPOLLA Machine Intelligence Lab Department of Engineering, University of Cambridge About this project We conducted the first

More information

Binary SIFT: Towards Efficient Feature Matching Verification for Image Search

Binary SIFT: Towards Efficient Feature Matching Verification for Image Search Binary SIFT: Towards Efficient Feature Matching Verification for Image Search Wengang Zhou 1, Houqiang Li 2, Meng Wang 3, Yijuan Lu 4, Qi Tian 1 Dept. of Computer Science, University of Texas at San Antonio

More information

Lecture 12 Recognition. Davide Scaramuzza

Lecture 12 Recognition. Davide Scaramuzza Lecture 12 Recognition Davide Scaramuzza Oral exam dates UZH January 19-20 ETH 30.01 to 9.02 2017 (schedule handled by ETH) Exam location Davide Scaramuzza s office: Andreasstrasse 15, 2.10, 8050 Zurich

More information

Prof. Feng Liu. Spring /26/2017

Prof. Feng Liu. Spring /26/2017 Prof. Feng Liu Spring 2017 http://www.cs.pdx.edu/~fliu/courses/cs510/ 04/26/2017 Last Time Re-lighting HDR 2 Today Panorama Overview Feature detection Mid-term project presentation Not real mid-term 6

More information

Structure Guided Salient Region Detector

Structure Guided Salient Region Detector Structure Guided Salient Region Detector Shufei Fan, Frank Ferrie Center for Intelligent Machines McGill University Montréal H3A2A7, Canada Abstract This paper presents a novel method for detection of

More information

A Novel Extreme Point Selection Algorithm in SIFT

A Novel Extreme Point Selection Algorithm in SIFT A Novel Extreme Point Selection Algorithm in SIFT Ding Zuchun School of Electronic and Communication, South China University of Technolog Guangzhou, China zucding@gmail.com Abstract. This paper proposes

More information

Oriented Filters for Object Recognition: an empirical study

Oriented Filters for Object Recognition: an empirical study Oriented Filters for Object Recognition: an empirical study Jerry Jun Yokono Tomaso Poggio Center for Biological and Computational Learning, M.I.T. E5-0, 45 Carleton St., Cambridge, MA 04, USA Sony Corporation,

More information

Indexing local features and instance recognition May 14 th, 2015

Indexing local features and instance recognition May 14 th, 2015 Indexing local features and instance recognition May 14 th, 2015 Yong Jae Lee UC Davis Announcements PS2 due Saturday 11:59 am 2 We can approximate the Laplacian with a difference of Gaussians; more efficient

More information

Adaptive Learning of an Accurate Skin-Color Model

Adaptive Learning of an Accurate Skin-Color Model Adaptive Learning of an Accurate Skin-Color Model Q. Zhu K.T. Cheng C. T. Wu Y. L. Wu Electrical & Computer Engineering University of California, Santa Barbara Presented by: H.T Wang Outline Generic Skin

More information

3D model search and pose estimation from single images using VIP features

3D model search and pose estimation from single images using VIP features 3D model search and pose estimation from single images using VIP features Changchang Wu 2, Friedrich Fraundorfer 1, 1 Department of Computer Science ETH Zurich, Switzerland {fraundorfer, marc.pollefeys}@inf.ethz.ch

More information

Feature descriptors and matching

Feature descriptors and matching Feature descriptors and matching Detections at multiple scales Invariance of MOPS Intensity Scale Rotation Color and Lighting Out-of-plane rotation Out-of-plane rotation Better representation than color:

More information

Mixtures of Gaussians and Advanced Feature Encoding

Mixtures of Gaussians and Advanced Feature Encoding Mixtures of Gaussians and Advanced Feature Encoding Computer Vision Ali Borji UWM Many slides from James Hayes, Derek Hoiem, Florent Perronnin, and Hervé Why do good recognition systems go bad? E.g. Why

More information

Performance Evaluation of Scale-Interpolated Hessian-Laplace and Haar Descriptors for Feature Matching

Performance Evaluation of Scale-Interpolated Hessian-Laplace and Haar Descriptors for Feature Matching Performance Evaluation of Scale-Interpolated Hessian-Laplace and Haar Descriptors for Feature Matching Akshay Bhatia, Robert Laganière School of Information Technology and Engineering University of Ottawa

More information

SHOT-BASED OBJECT RETRIEVAL FROM VIDEO WITH COMPRESSED FISHER VECTORS. Luca Bertinetto, Attilio Fiandrotti, Enrico Magli

SHOT-BASED OBJECT RETRIEVAL FROM VIDEO WITH COMPRESSED FISHER VECTORS. Luca Bertinetto, Attilio Fiandrotti, Enrico Magli SHOT-BASED OBJECT RETRIEVAL FROM VIDEO WITH COMPRESSED FISHER VECTORS Luca Bertinetto, Attilio Fiandrotti, Enrico Magli Dipartimento di Elettronica e Telecomunicazioni, Politecnico di Torino (Italy) ABSTRACT

More information

THE aim of this work is to retrieve those key frames and

THE aim of this work is to retrieve those key frames and IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 31, NO. 4, APRIL 2009 591 Efficient Visual Search of Videos Cast as Text Retrieval Josef Sivic and Andrew Zisserman Abstract We describe

More information

Selection of Scale-Invariant Parts for Object Class Recognition

Selection of Scale-Invariant Parts for Object Class Recognition Selection of Scale-Invariant Parts for Object Class Recognition Gyuri Dorkó, Cordelia Schmid To cite this version: Gyuri Dorkó, Cordelia Schmid. Selection of Scale-Invariant Parts for Object Class Recognition.

More information

Visuelle Perzeption für Mensch- Maschine Schnittstellen

Visuelle Perzeption für Mensch- Maschine Schnittstellen Visuelle Perzeption für Mensch- Maschine Schnittstellen Vorlesung, WS 2009 Prof. Dr. Rainer Stiefelhagen Dr. Edgar Seemann Institut für Anthropomatik Universität Karlsruhe (TH) http://cvhci.ira.uka.de

More information

Robust Human Detection Under Occlusion by Integrating Face and Person Detectors

Robust Human Detection Under Occlusion by Integrating Face and Person Detectors Robust Human Detection Under Occlusion by Integrating Face and Person Detectors William Robson Schwartz, Raghuraman Gopalan 2, Rama Chellappa 2, and Larry S. Davis University of Maryland, Department of

More information

Augmented Reality VU. Computer Vision 3D Registration (2) Prof. Vincent Lepetit

Augmented Reality VU. Computer Vision 3D Registration (2) Prof. Vincent Lepetit Augmented Reality VU Computer Vision 3D Registration (2) Prof. Vincent Lepetit Feature Point-Based 3D Tracking Feature Points for 3D Tracking Much less ambiguous than edges; Point-to-point reprojection

More information

Last week. Multi-Frame Structure from Motion: Multi-View Stereo. Unknown camera viewpoints

Last week. Multi-Frame Structure from Motion: Multi-View Stereo. Unknown camera viewpoints Last week Multi-Frame Structure from Motion: Multi-View Stereo Unknown camera viewpoints Last week PCA Today Recognition Today Recognition Recognition problems What is it? Object detection Who is it? Recognizing

More information

Using the Forest to See the Trees: Context-based Object Recognition

Using the Forest to See the Trees: Context-based Object Recognition Using the Forest to See the Trees: Context-based Object Recognition Bill Freeman Joint work with Antonio Torralba and Kevin Murphy Computer Science and Artificial Intelligence Laboratory MIT A computer

More information

Basic Problem Addressed. The Approach I: Training. Main Idea. The Approach II: Testing. Why a set of vocabularies?

Basic Problem Addressed. The Approach I: Training. Main Idea. The Approach II: Testing. Why a set of vocabularies? Visual Categorization With Bags of Keypoints. ECCV,. G. Csurka, C. Bray, C. Dance, and L. Fan. Shilpa Gulati //7 Basic Problem Addressed Find a method for Generic Visual Categorization Visual Categorization:

More information

Object Category Detection: Sliding Windows

Object Category Detection: Sliding Windows 03/18/10 Object Category Detection: Sliding Windows Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem Goal: Detect all instances of objects Influential Works in Detection Sung-Poggio

More information