Bag of Words or Bag of Features. Lecture-16 (Slides Credit: Cordelia Schmid LEAR INRIA Grenoble)
|
|
- Colin Charles
- 5 years ago
- Views:
Transcription
1 Bag of Words or Bag of Features Lecture-16 (Slides Credit: Cordelia Schmid LEAR INRIA Grenoble)
2 Contents Interest Point Detector Interest Point Descriptor K-means clustering Support Vector Machine (SVM) classifier Evaluation Metrics: Precision & Recall
3 Image classification Image classification: assigning a class label to the image Car: present Cow: present Bike: not present Horse: not present
4 Image Tasks classification Image classification: assigning a class label to the image Car: present Cow: present Bike: not present Horse: not present Object localization: define the location and the category Car Cow Location Category
5 Difficulties: within object variations Variability: Camera position, Illumination,Internal parameters Within-object variations
6 Difficulties: within class variations
7 Given Image classification Positive training images containing an object class Negative training images that don t Classify A test image as to whether it contains the object class or not?
8 Bag-of-features Origin: texture recognition Texture is characterized by the repetition of basic elements or textons Julesz, 1981; Cula & Dana, 2001; Leung & Malik 2001; Mori, Belongie & Malik, 2001; Schmid 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003
9 Bag-of-features Origin: texture recognition histogram Universal texton dictionary Julesz, 1981; Cula & Dana, 2001; Leung & Malik 2001; Mori, Belongie & Malik, 2001; Schmid 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003
10 West Bank water row The last Uncovering duel the trenches Palestinians have accused After quarrelling I know quite over a a lot bank about these Israel of Morning diverting sickness water away loan, two people, men took through part documentary in the from their Just towns a few in of order the opening to keep bars last fatal evidence duel staged relating on to them and Jewish settlements are enough to in transport the many Scottish contemporary soil. BBC News's accounts of their occupied unsuspecting territories fully souls back to the James Landale times. I can retraces identify the the ship my supplied. school Israel assembly denies the halls of their Documents Documents steps of Doc-1 ancestor his ancestor, Doc-2 served Doc-3 who on Doc-4 in the charge childhood. saying Palestinian Words Doc-1 Doc-2 Doc-3 Doc-4 made that Seven final Years' challenge. War, his father's farmers are to blame for using Bank house on Holy Island and illegal connections to irrigate 0 0 probably the Bank Bank : 1 : their fields. Loan Bank Bank : 1 0 : Words Bank Loan Water Doc-1 Doc-1 Loan Loan : 1 : Water Water : 0 : Farmer Farmer : 0 : Doc-2 Water Doc-2 Loan Loan : 0 0 : 0 Water Water : 2 2 : 2 Farmer Farmer : 0 0 : Farmer Farmer
11 Feature Detection 11 Image Dataset Local Patches
12 Local Patches Descriptors Clustering Words Generate the visual vocabulary 12
13 13 Image Dataset Local Patches Visual Words Represent an image as a histogram or bag of words
14 Bag-of-features Origin: bag-of-words (text) Orderless document representation: frequencies of words from a dictionary Classification to determine document categories Bag-of-words Common People Sculpture
15 Bag-of-features for image classification SVM Extract regions or Compute descriptors Find clusters and frequencies Compute distance matrix Classification Interest Points
16 Bag-of-features for image classification SVM Extract regions Compute descriptors Find clusters and frequencies Compute distance matrix Step 1 Step 2 Step 3 Classification
17 Step 1: feature extraction Detect Interest Points SIFT Harris Dense (take every nth pixel as interest point) Compute Descriptor around each interest point SIFT HOG
18 Dense features
19 Bag-of-features for image classification SVM Extract regions Compute descriptors Find clusters and frequencies Compute distance matrix Step 1 Step 2 Step 3 Classification
20 Step 2: Quantization Visual vocabulary Clustering
21 Step 2: Quantization Cluster descriptors K-means Assign each visual word to a cluster Build frequency histogram
22 K-Means
23 K-means Clustering: Step 1 Algorithm: k-means, Distance Metric: Euclidean Distance k 1 2 k k 3 From unknown source on internet
24 K-means Clustering: Step 2 Algorithm: k-means, Distance Metric: Euclidean Distance k 1 2 k k 3 From unknown source on internet
25 K-means Clustering: Step 3 Algorithm: k-means, Distance Metric: Euclidean Distance 5 4 k k 2 k From unknown source on internet
26 K-means Clustering: Step 4 Algorithm: k-means, Distance Metric: Euclidean Distance 5 4 k k 2 k From unknown source on internet
27 expression in condition 2 K-means Clustering: Step 5 Algorithm: k-means, Distance Metric: Euclidean Distance k 2 k expression in condition 1 k 1 From unknown source on internet
28 Example: 3-means Clustering Convergence in 3 steps from Duda et al.
29 Airplanes Motorbikes Faces Wild Cats Leaves People Bikes Examples for visual words
30 Image representation frequency.. codewords each image is represented by a vector, typically dimension,
31 Bag-of-features for image classification SVM Extract regions Compute descriptors Find clusters and frequencies Compute distance matrix Step 1 Step 2 Step 3 Classification
32 Step 3: Classification Learn a decision rule (classifier) assigning bag-offeatures representations of images to different classes Decision boundary Zebra Non-zebra
33 Training data Vectors are histograms, one from each training image positive negative Train classifier,e.g.svm
34 Support Vector Machines (SVM)
35 Application Pattern recognition Object classification/detection
36 Usage The classifier must be trained using a set of negative and positive examples. The classifier learns the regularities in the data If training was successful classifier is capable of classifying an unknown example with a high degree of accuracy.
37 Linear Classifier Binary classifier Task of separating classes in feature space w T x + b > 0 w T x + b = 0 w T x + b < 0 f(x) = sign(w T x + b)
38 Linear Classifier cont d Which of the linear separators is optimal?
39 Margin Distance from example to the separator is (Point to Plane Distance Equation) Examples closest to the hyperplane are support vectors. r w T x b w Margin 2γ of the separator is the width of separation between classes. r 2γ
40 Maximum Margin Classification Maximizing the margin is good according to intuition. Implies that only support vectors are important; other training examples are ignorable.
41 LibSVM SVM implementation
42 Kernels for bags of features Histogram intersection kernel: Generalized Gaussian kernel: D can be Euclidean distance RBF kernel D can be χ 2 distance Earth mover s distance N i i h i h h h I )) ( ), ( min( ), ( ), ( 1 exp ), ( h h D A h h K N i i h i h i h i h h h D ) ( ) ( ) ( ) ( ), (
43 Combining features SVM with multi-channel chi-square kernel Channel c is a combination of detector, descriptor D H, H ) is the chi-square distance between histograms A c c( i j D c 1 m 2 ( H1, H 2) [( h i i h i h i h ) ( 1 2i )] 2 is the mean value of the distances between all training sample Extension: learning of the weights, for example with Multiple Kernel Learning (MKL) J. Zhang, M. Marszalek, S. Lazebnik and C. Schmid. Local features and kernels for classification of texture and object categories: a comprehensive study, IJCV 2007.
44 Multi-class SVMs Various direct formulations exist, but they are not widely used in practice. It is more common to obtain multi-class SVMs by combining two-class SVMs in various ways. One versus all: Training: learn an SVM for each class versus the others Testing: apply each SVM to test example and assign to it the class of the SVM that returns the highest decision value One versus one: Training: learn an SVM for each pair of classes Testing: each learned SVM votes for a class to assign to the test example
45 Why does SVM learning work? Learns foreground and background visual words foreground words high weight background words low weight
46 Illustration Localization according to visual word probability Correct Image: 35 Correct Image: Correct Image: 38 Correct Image: foreground word more probable background word more probable
47 Illustration A linear SVM trained from positive and negative window descriptors A few of the highest weighted descriptor vector dimensions (= 'PAS + tile') + lie on object boundary (= local shape structures common to many training exemplars)
48 Bag-of-features for image classification Excellent results in the presence of background clutter bikes books building cars people phones trees
49 Examples for misclassified images Books- misclassified into faces, faces, buildings Buildings- misclassified into faces, trees, trees Cars- misclassified into buildings, phones, phones
50 Bag of visual words summary Advantages: largely unaffected by position and orientation of object in image fixed length vector irrespective of number of detections very successful in classifying images according to the objects they contain Disadvantages: no explicit use of configuration of visual word positions poor at localizing objects within an image
51 Evaluation of image classification PASCAL VOC [05-10] datasets PASCAL VOC 2007 Training and test dataset available Used to report state-of-the-art results Collected January 2007 from Flickr images downloaded and random subset selected 20 classes Class labels per image + bounding boxes 5011 training images, 4952 test images Evaluation measure: average precision
52 PASCAL 2007 dataset
53 PASCAL 2007 dataset
54 Evaluation Metrics GT RM precision RM GT RM recall GT TP RM TP GT GT RM TP precision RM FP TP GT RM TP recall GT TP FN Ground Truth (GT) Results of Method (RM) True Positives (TP) True Negatives (TN) False Negatives (FN) False Positives (FP)
55 Evaluation
56 Results for PASCAL 2007 Winner of PASCAL 2007 [Marszalek et al.] : map 59.4 Combination of several different channels (dense + interest points, SIFT + color descriptors, spatial grids) Non-linear SVM with Gaussian kernel Multiple kernel learning [Yang et al. 2009] : map 62.2 Combination of several features Group-based MKL approach Combining object localization and classification [Harzallah et al. 09] : map Use detection results to improve classification
57 Spatial pyramid matching Add spatial information to the bag-of-features Perform matching in 2D image space [Lazebnik, Schmid & Ponce, CVPR 2006]
58 Related work Similar approaches: Subblock description [Szummer & Picard, 1997] SIFT [Lowe, 1999] GIST [Torralba et al., 2003] SIFT Gist Szummer & Picard (1997) Lowe (1999, 2004) Torralba et al. (2003)
59 Spatial pyramid representation Locally orderless representation at several levels of spatial resolution level 0
60 Spatial pyramid representation Locally orderless representation at several levels of spatial resolution level 0 level 1
61 Spatial pyramid representation Locally orderless representation at several levels of spatial resolution level 0 level 1 level 2
62 Spatial pyramid matching Combination of spatial levels with pyramid match kernel [Grauman & Darell 05] Intersect histograms, more weight to finer grids
63 Scene dataset [Labzenik et al. 06] Coast Forest Mountain Open country Highway Inside city Tall building Street Suburb Bedroom Kitchen Living room Office Store Industrial 4385 images 15 categories
64 Scene classification L Single-level Pyramid 0(1x1) 72.2±0.6 1(2x2) 77.9± ±0.5 2(4x4) 79.4± ±0.3 3(8x8) 77.2± ±0.3
65 Retrieval examples
66 Category classification CalTech101 L Single-level Pyramid 0(1x1) 41.2±1.2 1(2x2) 55.9± ±0.8 2(4x4) 63.6± ±0.8 3(8x8) 60.3± ±0.7
67 Discussion Summary Spatial pyramid representation: appearance of local image patches + coarse global position information Substantial improvement over bag of features Depends on the similarity of image layout Extensions Flexible, object-centered grid
68 Human Action Recognition
69 Weizmann Action Dataset 10 actions 9 actors per action Bend Jack Wave 2 hands Wave 1 hand Jump in place Sidestep Hop Walk Skip Run
70 KTH Data Set Six Categories, 25 actors, 4 instances, 600. clips Boxi ng Hand Clapping Hand Waving Joggi Runn Walk
71 UCF Sports Action Dataset 9 actions, 142 videos. Bench Swing Div e Swing Run Kick Lift Ride Golf Swing Skate
72 IXMAS Multi view Data Set 13 action categories, 4 camera views, 10 actors, 3 instances. View 1 View 2 View 3 View 4 Check Watch Get Up
73 UCF YouTube Action Dataset (UCF 11) Cycling Diving Golf Swinging Riding Juggling Basketball Shooting Swinging Tennis Swinging Volleyball Spiking Trampoline Jumping Walking Dog
74 UCF50 Baseball Pitch Basketball Shooting Bench Press Biking Billiards Breaststroke Clean and Jerk Diving Drumming Fencing Golf Swing High Jump Horse Race Horse Riding Hula Hoop Javelin Throw Juggling Balls Jumping Jack Jump Rope Kayaking Lunges Military Parade Nun chucks Mixing Batter Pizza Tossing
75 UCF50 Playing Guitar Playing Piano Playing Tabla Playing Violin Pole Vault Pommel Horse Pull Ups Punch Push Ups Rock Climbing Indoor Rope Climbing Rowing Salsa Spins Skate Boarding Skiing Ski jet Soccer Juggling Swing TaiChi Tennis Swing ThrowDiscus Trampoline Jumping Walking with a dog Volleyball Spiking Yo Yo
76 Feature Detector 3D cuboids extraction Feature Clustering Visual Word B Visual Word A Video-words histogram Visual vocabulary
77 Bag of Visual Words model (II) e.g.3d Harris, 3.D. SIFT e.g.hof, 3DHOG Feature Detector 3D cuboids extraction Feature Clustering Visual Word B Visual Word A Video-words histogram Visual vocabulary
78 Histogram of Optical Flow (HOF)
79 Optical flow Optical flow provides very useful motion information for Action Recognition
80 Histogram of Optical flow (HOF) Very similar to HOG Optical flow Speed Optical flow Direction
81 Histogram of Optical flow (HOF) Quantize the orientation into 9 bins (0-180) without Interpolation The vote is magnitude
82 HOF Steps Compute optical flow over the whole video Detect interest point locations or use densely sampled locations Space Time Interest points Densely sampled points
83 HOF Steps Extract space time volumes in the neighborhood of interest points. Further divide neighborhood into small cuboids. Compute HOF in each small cuboids Concatenate histograms of all cuboids to make HOF descriptor of an interest point.
84 HOF Steps Video Interest Point Neighborhood Cuboids Make 1D HOF vector..
Bag-of-features. Cordelia Schmid
Bag-of-features for category classification Cordelia Schmid Visual search Particular objects and scenes, large databases Category recognition Image classification: assigning a class label to the image
More information6.819 / 6.869: Advances in Computer Vision
6.819 / 6.869: Advances in Computer Vision Image Retrieval: Retrieval: Information, images, objects, large-scale Website: http://6.869.csail.mit.edu/fa15/ Instructor: Yusuf Aytar Lecture TR 9:30AM 11:00AM
More informationHUMAN ACTION RECOGNITION
HUMAN ACTION RECOGNITION Human Action Recognition 1. Hand crafted feature + Shallow classifier 2. Human localization + (Hand crafted features) + 3D CNN Input is a small chunk of video 3. 3D CNN Input is
More informationLocal Features and Bag of Words Models
10/14/11 Local Features and Bag of Words Models Computer Vision CS 143, Brown James Hays Slides from Svetlana Lazebnik, Derek Hoiem, Antonio Torralba, David Lowe, Fei Fei Li and others Computer Engineering
More informationBeyond Bags of Features
: for Recognizing Natural Scene Categories Matching and Modeling Seminar Instructed by Prof. Haim J. Wolfson School of Computer Science Tel Aviv University December 9 th, 2015
More informationAction recognition in videos
Action recognition in videos Cordelia Schmid INRIA Grenoble Joint work with V. Ferrari, A. Gaidon, Z. Harchaoui, A. Klaeser, A. Prest, H. Wang Action recognition - goal Short actions, i.e. drinking, sit
More informationCS6670: Computer Vision
CS6670: Computer Vision Noah Snavely Lecture 16: Bag-of-words models Object Bag of words Announcements Project 3: Eigenfaces due Wednesday, November 11 at 11:59pm solo project Final project presentations:
More informationPatch Descriptors. CSE 455 Linda Shapiro
Patch Descriptors CSE 455 Linda Shapiro How can we find corresponding points? How can we find correspondences? How do we describe an image patch? How do we describe an image patch? Patches with similar
More informationLocal Features and Kernels for Classifcation of Texture and Object Categories: A Comprehensive Study
Local Features and Kernels for Classifcation of Texture and Object Categories: A Comprehensive Study J. Zhang 1 M. Marszałek 1 S. Lazebnik 2 C. Schmid 1 1 INRIA Rhône-Alpes, LEAR - GRAVIR Montbonnot, France
More informationSupervised learning. y = f(x) function
Supervised learning y = f(x) output prediction function Image feature Training: given a training set of labeled examples {(x 1,y 1 ),, (x N,y N )}, estimate the prediction function f by minimizing the
More informationIntroduction to object recognition. Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and others
Introduction to object recognition Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and others Overview Basic recognition tasks A statistical learning approach Traditional or shallow recognition
More informationLecture 18: Human Motion Recognition
Lecture 18: Human Motion Recognition Professor Fei Fei Li Stanford Vision Lab 1 What we will learn today? Introduction Motion classification using template matching Motion classification i using spatio
More informationLearning Representations for Visual Object Class Recognition
Learning Representations for Visual Object Class Recognition Marcin Marszałek Cordelia Schmid Hedi Harzallah Joost van de Weijer LEAR, INRIA Grenoble, Rhône-Alpes, France October 15th, 2007 Bag-of-Features
More informationBag of Words Models. CS4670 / 5670: Computer Vision Noah Snavely. Bag-of-words models 11/26/2013
CS4670 / 5670: Computer Vision Noah Snavely Bag-of-words models Object Bag of words Bag of Words Models Adapted from slides by Rob Fergus and Svetlana Lazebnik 1 Object Bag of words Origin 1: Texture Recognition
More informationVisual words. Map high-dimensional descriptors to tokens/words by quantizing the feature space.
Visual words Map high-dimensional descriptors to tokens/words by quantizing the feature space. Quantize via clustering; cluster centers are the visual words Word #2 Descriptor feature space Assign word
More informationCLASSIFICATION Experiments
CLASSIFICATION Experiments January 27,2015 CS3710: Visual Recognition Bhavin Modi Bag of features Object Bag of words 1. Extract features 2. Learn visual vocabulary Bag of features: outline 3. Quantize
More informationMotion Interchange Patterns for Action Recognition in Unconstrained Videos
Motion Interchange Patterns for Action Recognition in Unconstrained Videos Orit Kliper-Gross, Yaron Gurovich, Tal Hassner, Lior Wolf Weizmann Institute of Science The Open University of Israel Tel Aviv
More informationPatch Descriptors. EE/CSE 576 Linda Shapiro
Patch Descriptors EE/CSE 576 Linda Shapiro 1 How can we find corresponding points? How can we find correspondences? How do we describe an image patch? How do we describe an image patch? Patches with similar
More informationCategory-level localization
Category-level localization Cordelia Schmid Recognition Classification Object present/absent in an image Often presence of a significant amount of background clutter Localization / Detection Localize object
More informationAnnouncements. Recognition. Recognition. Recognition. Recognition. Homework 3 is due May 18, 11:59 PM Reading: Computer Vision I CSE 152 Lecture 14
Announcements Computer Vision I CSE 152 Lecture 14 Homework 3 is due May 18, 11:59 PM Reading: Chapter 15: Learning to Classify Chapter 16: Classifying Images Chapter 17: Detecting Objects in Images Given
More informationBy Suren Manvelyan,
By Suren Manvelyan, http://www.surenmanvelyan.com/gallery/7116 By Suren Manvelyan, http://www.surenmanvelyan.com/gallery/7116 By Suren Manvelyan, http://www.surenmanvelyan.com/gallery/7116 By Suren Manvelyan,
More informationClassifying Images with Visual/Textual Cues. By Steven Kappes and Yan Cao
Classifying Images with Visual/Textual Cues By Steven Kappes and Yan Cao Motivation Image search Building large sets of classified images Robotics Background Object recognition is unsolved Deformable shaped
More informationObject Classification Problem
HIERARCHICAL OBJECT CATEGORIZATION" Gregory Griffin and Pietro Perona. Learning and Using Taxonomies For Fast Visual Categorization. CVPR 2008 Marcin Marszalek and Cordelia Schmid. Constructing Category
More informationLearning Object Representations for Visual Object Class Recognition
Learning Object Representations for Visual Object Class Recognition Marcin Marszalek, Cordelia Schmid, Hedi Harzallah, Joost Van de Weijer To cite this version: Marcin Marszalek, Cordelia Schmid, Hedi
More informationPreliminary Local Feature Selection by Support Vector Machine for Bag of Features
Preliminary Local Feature Selection by Support Vector Machine for Bag of Features Tetsu Matsukawa Koji Suzuki Takio Kurita :University of Tsukuba :National Institute of Advanced Industrial Science and
More informationPerson Action Recognition/Detection
Person Action Recognition/Detection Fabrício Ceschin Visão Computacional Prof. David Menotti Departamento de Informática - Universidade Federal do Paraná 1 In object recognition: is there a chair in the
More informationSupervised learning. y = f(x) function
Supervised learning y = f(x) output prediction function Image feature Training: given a training set of labeled examples {(x 1,y 1 ),, (x N,y N )}, estimate the prediction function f by minimizing the
More informationPart-based and local feature models for generic object recognition
Part-based and local feature models for generic object recognition May 28 th, 2015 Yong Jae Lee UC Davis Announcements PS2 grades up on SmartSite PS2 stats: Mean: 80.15 Standard Dev: 22.77 Vote on piazza
More informationEvaluation of Local Space-time Descriptors based on Cuboid Detector in Human Action Recognition
International Journal of Innovation and Applied Studies ISSN 2028-9324 Vol. 9 No. 4 Dec. 2014, pp. 1708-1717 2014 Innovative Space of Scientific Research Journals http://www.ijias.issr-journals.org/ Evaluation
More informationCV as making bank. Intel buys Mobileye! $15 billion. Mobileye:
CV as making bank Intel buys Mobileye! $15 billion Mobileye: Spin-off from Hebrew University, Israel 450 engineers 15 million cars installed 313 car models June 2016 - Tesla left Mobileye Fatal crash car
More informationDiscriminative classifiers for image recognition
Discriminative classifiers for image recognition May 26 th, 2015 Yong Jae Lee UC Davis Outline Last time: window-based generic object detection basic pipeline face detection with boosting as case study
More informationPreviously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011
Previously Part-based and local feature models for generic object recognition Wed, April 20 UT-Austin Discriminative classifiers Boosting Nearest neighbors Support vector machines Useful for object recognition
More informationDistribution Distance Functions
COMP 875 November 10, 2009 Matthew O Meara Question How similar are these? Outline Motivation Protein Score Function Object Retrieval Kernel Machines 1 Motivation Protein Score Function Object Retrieval
More informationRecognizing human actions in still images: a study of bag-of-features and part-based representations
DELAITRE, LAPTEV, SIVIC: RECOGNIZING HUMAN ACTIONS IN STILL IMAGES. 1 Recognizing human actions in still images: a study of bag-of-features and part-based representations Vincent Delaitre 1 vincent.delaitre@ens-lyon.org
More informationBeyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba
Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba Adding spatial information Forming vocabularies from pairs of nearby features doublets
More informationEVENT DETECTION AND HUMAN BEHAVIOR RECOGNITION. Ing. Lorenzo Seidenari
EVENT DETECTION AND HUMAN BEHAVIOR RECOGNITION Ing. Lorenzo Seidenari e-mail: seidenari@dsi.unifi.it What is an Event? Dictionary.com definition: something that occurs in a certain place during a particular
More informationDeformable Part Models
CS 1674: Intro to Computer Vision Deformable Part Models Prof. Adriana Kovashka University of Pittsburgh November 9, 2016 Today: Object category detection Window-based approaches: Last time: Viola-Jones
More informationROBUST SCENE CLASSIFICATION BY GIST WITH ANGULAR RADIAL PARTITIONING. Wei Liu, Serkan Kiranyaz and Moncef Gabbouj
Proceedings of the 5th International Symposium on Communications, Control and Signal Processing, ISCCSP 2012, Rome, Italy, 2-4 May 2012 ROBUST SCENE CLASSIFICATION BY GIST WITH ANGULAR RADIAL PARTITIONING
More informationAction Recognition Using Global Spatio-Temporal Features Derived from Sparse Representations
Action Recognition Using Global Spatio-Temporal Features Derived from Sparse Representations Guruprasad Somasundaram, Anoop Cherian, Vassilios Morellas, and Nikolaos Papanikolopoulos Department of Computer
More informationObject Recognition. Computer Vision. Slides from Lana Lazebnik, Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce
Object Recognition Computer Vision Slides from Lana Lazebnik, Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce How many visual object categories are there? Biederman 1987 ANIMALS PLANTS OBJECTS
More informationLoose Shape Model for Discriminative Learning of Object Categories
Loose Shape Model for Discriminative Learning of Object Categories Margarita Osadchy and Elran Morash Computer Science Department University of Haifa Mount Carmel, Haifa 31905, Israel rita@cs.haifa.ac.il
More informationarxiv: v3 [cs.cv] 3 Oct 2012
Combined Descriptors in Spatial Pyramid Domain for Image Classification Junlin Hu and Ping Guo arxiv:1210.0386v3 [cs.cv] 3 Oct 2012 Image Processing and Pattern Recognition Laboratory Beijing Normal University,
More informationAction Recognition by Dense Trajectories
Action Recognition by Dense Trajectories Heng Wang, Alexander Kläser, Cordelia Schmid, Liu Cheng-Lin To cite this version: Heng Wang, Alexander Kläser, Cordelia Schmid, Liu Cheng-Lin. Action Recognition
More informationLearning Realistic Human Actions from Movies
Learning Realistic Human Actions from Movies Ivan Laptev*, Marcin Marszałek**, Cordelia Schmid**, Benjamin Rozenfeld*** INRIA Rennes, France ** INRIA Grenoble, France *** Bar-Ilan University, Israel Presented
More informationSparse coding for image classification
Sparse coding for image classification Columbia University Electrical Engineering: Kun Rong(kr2496@columbia.edu) Yongzhou Xiang(yx2211@columbia.edu) Yin Cui(yc2776@columbia.edu) Outline Background Introduction
More informationOBJECT CATEGORIZATION
OBJECT CATEGORIZATION Ing. Lorenzo Seidenari e-mail: seidenari@dsi.unifi.it Slides: Ing. Lamberto Ballan November 18th, 2009 What is an Object? Merriam-Webster Definition: Something material that may be
More informationPart based models for recognition. Kristen Grauman
Part based models for recognition Kristen Grauman UT Austin Limitations of window-based models Not all objects are box-shaped Assuming specific 2d view of object Local components themselves do not necessarily
More informationObject recognition (part 2)
Object recognition (part 2) CSE P 576 Larry Zitnick (larryz@microsoft.com) 1 2 3 Support Vector Machines Modified from the slides by Dr. Andrew W. Moore http://www.cs.cmu.edu/~awm/tutorials Linear Classifiers
More informationLearning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009
Learning and Inferring Depth from Monocular Images Jiyan Pan April 1, 2009 Traditional ways of inferring depth Binocular disparity Structure from motion Defocus Given a single monocular image, how to infer
More informationImage classification Computer Vision Spring 2018, Lecture 18
Image classification http://www.cs.cmu.edu/~16385/ 16-385 Computer Vision Spring 2018, Lecture 18 Course announcements Homework 5 has been posted and is due on April 6 th. - Dropbox link because course
More informationMultiple-Choice Questionnaire Group C
Family name: Vision and Machine-Learning Given name: 1/28/2011 Multiple-Choice naire Group C No documents authorized. There can be several right answers to a question. Marking-scheme: 2 points if all right
More informationBias-Variance Trade-off + Other Models and Problems
CS 1699: Intro to Computer Vision Bias-Variance Trade-off + Other Models and Problems Prof. Adriana Kovashka University of Pittsburgh November 3, 2015 Outline Support Vector Machines (review + other uses)
More informationTA Section: Problem Set 4
TA Section: Problem Set 4 Outline Discriminative vs. Generative Classifiers Image representation and recognition models Bag of Words Model Part-based Model Constellation Model Pictorial Structures Model
More informationImproved Spatial Pyramid Matching for Image Classification
Improved Spatial Pyramid Matching for Image Classification Mohammad Shahiduzzaman, Dengsheng Zhang, and Guojun Lu Gippsland School of IT, Monash University, Australia {Shahid.Zaman,Dengsheng.Zhang,Guojun.Lu}@monash.edu
More informationVisual Object Recognition
Perceptual and Sensory Augmented Computing Visual Object Recognition Tutorial Visual Object Recognition Bastian Leibe Computer Vision Laboratory ETH Zurich Chicago, 14.07.2008 & Kristen Grauman Department
More informationHuman detection using histogram of oriented gradients. Srikumar Ramalingam School of Computing University of Utah
Human detection using histogram of oriented gradients Srikumar Ramalingam School of Computing University of Utah Reference Navneet Dalal and Bill Triggs, Histograms of Oriented Gradients for Human Detection,
More informationRevisiting LBP-based Texture Models for Human Action Recognition
Revisiting LBP-based Texture Models for Human Action Recognition Thanh Phuong Nguyen 1, Antoine Manzanera 1, Ngoc-Son Vu 2, and Matthieu Garrigues 1 1 ENSTA-ParisTech, 828, Boulevard des Maréchaux, 91762
More informationDense trajectories and motion boundary descriptors for action recognition
Dense trajectories and motion boundary descriptors for action recognition Heng Wang, Alexander Kläser, Cordelia Schmid, Cheng-Lin Liu To cite this version: Heng Wang, Alexander Kläser, Cordelia Schmid,
More informationREJECTION-BASED CLASSIFICATION FOR ACTION RECOGNITION USING A SPATIO-TEMPORAL DICTIONARY. Stefen Chan Wai Tim, Michele Rombaut, Denis Pellerin
REJECTION-BASED CLASSIFICATION FOR ACTION RECOGNITION USING A SPATIO-TEMPORAL DICTIONARY Stefen Chan Wai Tim, Michele Rombaut, Denis Pellerin Univ. Grenoble Alpes, GIPSA-Lab, F-38000 Grenoble, France ABSTRACT
More informationDetection III: Analyzing and Debugging Detection Methods
CS 1699: Intro to Computer Vision Detection III: Analyzing and Debugging Detection Methods Prof. Adriana Kovashka University of Pittsburgh November 17, 2015 Today Review: Deformable part models How can
More informationEECS 442 Computer vision. Object Recognition
EECS 442 Computer vision Object Recognition Intro Recognition of 3D objects Recognition of object categories: Bag of world models Part based models 3D object categorization Computer Vision: Algorithms
More informationIMA Preprint Series # 2378
SPARSE MODELING OF HUMAN ACTIONS FROM MOTION IMAGERY By Alexey Castrodad and Guillermo Sapiro IMA Preprint Series # 2378 ( September 2011 ) INSTITUTE FOR MATHEMATICS AND ITS APPLICATIONS UNIVERSITY OF
More informationIMPROVING SPATIO-TEMPORAL FEATURE EXTRACTION TECHNIQUES AND THEIR APPLICATIONS IN ACTION CLASSIFICATION. Maral Mesmakhosroshahi, Joohee Kim
IMPROVING SPATIO-TEMPORAL FEATURE EXTRACTION TECHNIQUES AND THEIR APPLICATIONS IN ACTION CLASSIFICATION Maral Mesmakhosroshahi, Joohee Kim Department of Electrical and Computer Engineering Illinois Institute
More informationLinear combinations of simple classifiers for the PASCAL challenge
Linear combinations of simple classifiers for the PASCAL challenge Nik A. Melchior and David Lee 16 721 Advanced Perception The Robotics Institute Carnegie Mellon University Email: melchior@cmu.edu, dlee1@andrew.cmu.edu
More informationLecture 12 Visual recognition
Lecture 12 Visual recognition Bag of words models for object recognition and classification Discriminative methods Generative methods Silvio Savarese Lecture 11 17Feb14 Challenges Variability due to: View
More informationCEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt.
CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. Session 19 Object Recognition II Mani Golparvar-Fard Department of Civil and Environmental Engineering 3129D, Newmark Civil Engineering Lab
More informationAnalysis: TextonBoost and Semantic Texton Forests. Daniel Munoz Februrary 9, 2009
Analysis: TextonBoost and Semantic Texton Forests Daniel Munoz 16-721 Februrary 9, 2009 Papers [shotton-eccv-06] J. Shotton, J. Winn, C. Rother, A. Criminisi, TextonBoost: Joint Appearance, Shape and Context
More informationExtracting Spatio-temporal Local Features Considering Consecutiveness of Motions
Extracting Spatio-temporal Local Features Considering Consecutiveness of Motions Akitsugu Noguchi and Keiji Yanai Department of Computer Science, The University of Electro-Communications, 1-5-1 Chofugaoka,
More informationMining Discriminative Adjectives and Prepositions for Natural Scene Recognition
Mining Discriminative Adjectives and Prepositions for Natural Scene Recognition Bangpeng Yao 1, Juan Carlos Niebles 2,3, Li Fei-Fei 1 1 Department of Computer Science, Princeton University, NJ 08540, USA
More informationRecognition Tools: Support Vector Machines
CS 2770: Computer Vision Recognition Tools: Support Vector Machines Prof. Adriana Kovashka University of Pittsburgh January 12, 2017 Announcement TA office hours: Tuesday 4pm-6pm Wednesday 10am-12pm Matlab
More informationLocal Image Features
Local Image Features Ali Borji UWM Many slides from James Hayes, Derek Hoiem and Grauman&Leibe 2008 AAAI Tutorial Overview of Keypoint Matching 1. Find a set of distinctive key- points A 1 A 2 A 3 B 3
More informationTagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Annotation
TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Annotation Matthieu Guillaumin, Thomas Mensink, Jakob Verbeek, Cordelia Schmid LEAR team, INRIA Rhône-Alpes, Grenoble, France
More informationContent-based image and video analysis. Event Recognition
Content-based image and video analysis Event Recognition 21.06.2010 What is an event? a thing that happens or takes place, Oxford Dictionary Examples: Human gestures Human actions (running, drinking, etc.)
More informationBeyond Bags of features Spatial information & Shape models
Beyond Bags of features Spatial information & Shape models Jana Kosecka Many slides adapted from S. Lazebnik, FeiFei Li, Rob Fergus, and Antonio Torralba Detection, recognition (so far )! Bags of features
More informationMultiple Kernel Learning for Emotion Recognition in the Wild
Multiple Kernel Learning for Emotion Recognition in the Wild Karan Sikka, Karmen Dykstra, Suchitra Sathyanarayana, Gwen Littlewort and Marian S. Bartlett Machine Perception Laboratory UCSD EmotiW Challenge,
More informationRobust Action Recognition Using Local Motion and Group Sparsity
Robust Action Recognition Using Local Motion and Group Sparsity Jungchan Cho a, Minsik Lee a, Hyung Jin Chang b, Songhwai Oh a, a Department of Electrical and Computer Engineering and ASRI, Seoul National
More informationFuzzy based Multiple Dictionary Bag of Words for Image Classification
Available online at www.sciencedirect.com Procedia Engineering 38 (2012 ) 2196 2206 International Conference on Modeling Optimisation and Computing Fuzzy based Multiple Dictionary Bag of Words for Image
More informationCS229: Action Recognition in Tennis
CS229: Action Recognition in Tennis Aman Sikka Stanford University Stanford, CA 94305 Rajbir Kataria Stanford University Stanford, CA 94305 asikka@stanford.edu rkataria@stanford.edu 1. Motivation As active
More informationCategory-level Localization
Category-level Localization Andrew Zisserman Visual Geometry Group University of Oxford http://www.robots.ox.ac.uk/~vgg Includes slides from: Ondra Chum, Alyosha Efros, Mark Everingham, Pedro Felzenszwalb,
More informationLearning realistic human actions from movies
Learning realistic human actions from movies Ivan Laptev, Marcin Marszalek, Cordelia Schmid, Benjamin Rozenfeld CVPR 2008 Presented by Piotr Mirowski Courant Institute, NYU Advanced Vision class, November
More informationBeyond bags of Features
Beyond bags of Features Spatial Pyramid Matching for Recognizing Natural Scene Categories Camille Schreck, Romain Vavassori Ensimag December 14, 2012 Schreck, Vavassori (Ensimag) Beyond bags of Features
More informationEvaluation of local descriptors for action recognition in videos
Evaluation of local descriptors for action recognition in videos Piotr Bilinski and Francois Bremond INRIA Sophia Antipolis - PULSAR group 2004 route des Lucioles - BP 93 06902 Sophia Antipolis Cedex,
More informationFirst-Person Animal Activity Recognition from Egocentric Videos
First-Person Animal Activity Recognition from Egocentric Videos Yumi Iwashita Asamichi Takamine Ryo Kurazume School of Information Science and Electrical Engineering Kyushu University, Fukuoka, Japan yumi@ieee.org
More informationAction Recognition & Categories via Spatial-Temporal Features
Action Recognition & Categories via Spatial-Temporal Features 华俊豪, 11331007 huajh7@gmail.com 2014/4/9 Talk at Image & Video Analysis taught by Huimin Yu. Outline Introduction Frameworks Feature extraction
More informationString distance for automatic image classification
String distance for automatic image classification Nguyen Hong Thinh*, Le Vu Ha*, Barat Cecile** and Ducottet Christophe** *University of Engineering and Technology, Vietnam National University of HaNoi,
More informationRobotics Programming Laboratory
Chair of Software Engineering Robotics Programming Laboratory Bertrand Meyer Jiwon Shin Lecture 8: Robot Perception Perception http://pascallin.ecs.soton.ac.uk/challenges/voc/databases.html#caltech car
More informationScene Recognition using Bag-of-Words
Scene Recognition using Bag-of-Words Sarthak Ahuja B.Tech Computer Science Indraprastha Institute of Information Technology Okhla, Delhi 110020 Email: sarthak12088@iiitd.ac.in Anchita Goel B.Tech Computer
More informationHigh-level Event Recognition in Internet Videos
High-level Event Recognition in Internet Videos Yu-Gang Jiang School of Computer Science Fudan University, Shanghai, China ygj@fudan.edu.cn Joint work with Guangnan Ye 1, Subh Bhattacharya 2, Dan Ellis
More informationMachine Learning Crash Course
Machine Learning Crash Course Photo: CMU Machine Learning Department protests G20 Computer Vision James Hays Slides: Isabelle Guyon, Erik Sudderth, Mark Johnson, Derek Hoiem The machine learning framework
More informationRecognition of Animal Skin Texture Attributes in the Wild. Amey Dharwadker (aap2174) Kai Zhang (kz2213)
Recognition of Animal Skin Texture Attributes in the Wild Amey Dharwadker (aap2174) Kai Zhang (kz2213) Motivation Patterns and textures are have an important role in object description and understanding
More informationImageCLEF 2011
SZTAKI @ ImageCLEF 2011 Bálint Daróczy joint work with András Benczúr, Róbert Pethes Data Mining and Web Search Group Computer and Automation Research Institute Hungarian Academy of Sciences Training/test
More informationCAP 5415 Computer Vision. Fall 2011
CAP 5415 Computer Vision Fall 2011 General Instructor: Dr. Mubarak Shah Email: shah@eecs.ucf.edu Office: 247-F HEC Course Class Time Tuesdays, Thursdays 12 Noon to 1:15PM 383 ENGR Office hours Tuesdays
More informationObject Detection Based on Deep Learning
Object Detection Based on Deep Learning Yurii Pashchenko AI Ukraine 2016, Kharkiv, 2016 Image classification (mostly what you ve seen) http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf
More informationModeling Image Context using Object Centered Grid
Modeling Image Context using Object Centered Grid Sobhan Naderi Parizi, Ivan Laptev, Alireza Tavakoli Targhi Computer Vision and Active Perception Laboratory Royal Institute of Technology (KTH) SE-100
More informationObject recognition. Methods for classification and image representation
Object recognition Methods for classification and image representation Credits Slides by Pete Barnum Slides by FeiFei Li Paul Viola, Michael Jones, Robust Realtime Object Detection, IJCV 04 Navneet Dalal
More informationAction Recognition Using Hybrid Feature Descriptor and VLAD Video Encoding
Action Recognition Using Hybrid Feature Descriptor and VLAD Video Encoding Dong Xing, Xianzhong Wang, Hongtao Lu Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive
More informationReal-Time Detection of Landscape Scenes
Real-Time Detection of Landscape Scenes Sami Huttunen 1,EsaRahtu 1, Iivari Kunttu 2, Juuso Gren 2, and Janne Heikkilä 1 1 Machine Vision Group, University of Oulu, Finland firstname.lastname@ee.oulu.fi
More informationPrevious Lecture - Coded aperture photography
Previous Lecture - Coded aperture photography Depth from a single image based on the amount of blur Estimate the amount of blur using and recover a sharp image by deconvolution with a sparse gradient prior.
More informationObject detection using Region Proposals (RCNN) Ernest Cheung COMP Presentation
Object detection using Region Proposals (RCNN) Ernest Cheung COMP790-125 Presentation 1 2 Problem to solve Object detection Input: Image Output: Bounding box of the object 3 Object detection using CNN
More informationIntegrated Feature Selection and Higher-order Spatial Feature Extraction for Object Categorization
Integrated Feature Selection and Higher-order Spatial Feature Extraction for Object Categorization David Liu, Gang Hua 2, Paul Viola 2, Tsuhan Chen Dept. of ECE, Carnegie Mellon University and Microsoft
More information