Lessons Learned from Large Scale Crowdsourced Data Collection for ILSVRC. Jonathan Krause

1 Lessons Learned from Large Scale Crowdsourced Data Collection for ILSVRC Jonathan Krause

2 Overview Classification Localization Detection Pelican

4 Overview Classification Localization Detection Bird Frog

5 Classification Overview: 1.4M images, 1,000 classes. By hand: 5 sec/image, 50% of images correct, 12 hours worked/day = 324 days! Example image: Pelican.
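The 324-day figure can be reproduced from the numbers on the slide. The sketch below assumes that "50% images correct" means only half of the candidate images survive, so twice as many candidates must be inspected; that reading is an assumption, though it does recover the slide's result.

```python
# Back-of-the-envelope estimate using only the figures quoted on the slide.
target_images = 1_400_000   # images wanted in the final dataset
fraction_correct = 0.5      # assumed: share of candidate images that are correct
seconds_per_image = 5       # time to check one candidate by hand
hours_per_day = 12          # hours worked per day

candidates = target_images / fraction_correct          # candidates to inspect
total_hours = candidates * seconds_per_image / 3600    # seconds -> hours
days = total_hours / hours_per_day
print(f"{days:.0f} days")
```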

6 Crowdsourcing Let the crowd do the work for you!

7 Classification Pipeline
1. Collect candidate images for each category
2. Put candidate images on Amazon Mechanical Turk (AMT)
3. AMT workers click on images containing each class
4. Aggregate worker responses into labels

8 Collecting Images Category: Whippet Google Image Search:

9 Problem: Limited Images. Web searches are limited. Solution: Query Expansion. WordNet: Whippet: "a small slender dog of greyhound type developed in England". Expanded queries: "whippet dog", "whippet greyhound". Also translate into other languages.
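A minimal sketch of query expansion as described on the slide. The function name `expand_queries`, its parameters, and the placeholder translation are hypothetical; the slide does not say how the extra words or translations were actually produced.

```python
def expand_queries(name, parent_words, translations):
    """Build search queries: the name alone, name + each extra word, then translations."""
    queries = [name]
    queries += [f"{name} {word}" for word in parent_words]
    queries += list(translations)
    # Drop duplicates while preserving order.
    return list(dict.fromkeys(queries))

queries = expand_queries(
    "whippet",
    parent_words=["dog", "greyhound"],           # words appearing in the WordNet gloss
    translations=["<whippet in language X>"],    # placeholder, not a real translation
)
print(queries)
```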

10 Deploying on AMT Annotate many images at once!

11 Make sure workers understand the classes!

12 Understanding Classes Wikipedia and Google links

13 Understanding Classes: Give them a definition. delta: a low triangular area of alluvial deposits where a river divides before entering a larger body of water: "the Mississippi River delta"; "the Nile delta"

14 Understanding Classes Test them on the definition

16 Understanding Classes: Give example images (if you have them). Hard: "a small slender dog of greyhound type developed in England". Easy: "a small slender dog of greyhound type developed in England" + example images.

17 Quality Control: Workers on AMT are fast, inexpensive, and plentiful. But they are not highly trained. Solution: multiple responses, merge results.

18 Quality Control. Given: set of (worker, image, response). Want: P(image has label) for each image; (optionally) worker quality estimates.

19 A Simple Method: Majority vote. Q: Is this a whippet? Responses: Yes, No, Yes, Yes, No, No, Yes, Yes

20 Majority Vote Problems: Doesn't give confidence. Hard to measure worker quality. Responses: Yes, No, Yes, Yes, No, No, Yes. How sure are we it's positive? How good are these workers?
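A small sketch of majority vote on the eight responses from slide 19. The returned vote share is only a raw fraction, not a calibrated confidence, which is the limitation the slide points out. The function name is illustrative.

```python
from collections import Counter

def majority_vote(responses):
    """Return the most common response and the fraction of workers who gave it."""
    counts = Counter(responses)
    label, votes = counts.most_common(1)[0]   # ties go to the response seen first
    return label, votes / len(responses)

responses = ["Yes", "No", "Yes", "Yes", "No", "No", "Yes", "Yes"]
label, share = majority_vote(responses)
print(label, share)
```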

21 One Approach (Deng et al. 2009): Annotate a subset of images with many annotations. Majority vote to determine ground truth. Determine confidence given fewer annotations.
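One possible reading of these three bullets, sketched below: use the densely annotated subset to tabulate how often "k yes-votes out of n" coincides with a positive majority-vote ground truth, then use that table as the confidence for images with only n votes. This is an illustration under that assumption, not necessarily the exact procedure of Deng et al.; the function name and toy data are made up.

```python
from collections import defaultdict

def confidence_table(dense_votes, n):
    """Estimate P(positive | k yes-votes among the first n) from densely annotated images.

    dense_votes: list of vote lists (True = yes), each holding many votes for one image.
    """
    positives = defaultdict(int)
    totals = defaultdict(int)
    for votes in dense_votes:
        truth = sum(votes) * 2 > len(votes)   # majority over all votes = ground truth
        k = sum(votes[:n])                    # yes-votes a cheaper n-vote run would see
        totals[k] += 1
        positives[k] += truth
    return {k: positives[k] / totals[k] for k in totals}

# Toy data, made up for illustration only.
dense = [
    [True] * 9 + [False],
    [True, True, False] + [True] * 7,
    [True, False, False] + [False] * 7,
    [False] * 10,
    [True, True, False] + [False] * 7,
]
table = confidence_table(dense, n=3)
print(table)
```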

22 Pro & Con. Pro: simple; gives image confidence. Con: treats all workers the same; relies on initial majority vote.

23 Another Approach (Dawid, Skene. 1979). Model: prior of label correct; worker confusion matrix. Max-likelihood with EM.
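A compact sketch of the Dawid-Skene idea for binary labels: alternate between estimating a class prior plus per-worker confusion matrices and re-estimating each image's label probability. The add-one smoothing, the initialisation from vote fractions, and the toy workers are choices made for this illustration, not details taken from the slide.

```python
def dawid_skene(responses, n_iter=50):
    """Binary Dawid-Skene via EM.

    responses: list of (worker, image, answer) with answer in {0, 1}.
    Returns (posterior, confusion): posterior[image] = P(label = 1),
    confusion[worker][t][a] = P(worker answers a | true label t).
    """
    images = sorted({i for _, i, _ in responses})
    workers = sorted({w for w, _, _ in responses})

    # Initialise posteriors with the per-image fraction of positive answers.
    yes = {i: 0 for i in images}
    total = {i: 0 for i in images}
    for _, i, a in responses:
        yes[i] += a
        total[i] += 1
    posterior = {i: yes[i] / total[i] for i in images}

    for _ in range(n_iter):
        # M-step: class prior and worker confusion matrices (add-one smoothing).
        prior = sum(posterior.values()) / len(images)
        counts = {w: [[1.0, 1.0], [1.0, 1.0]] for w in workers}
        for w, i, a in responses:
            counts[w][1][a] += posterior[i]
            counts[w][0][a] += 1.0 - posterior[i]
        confusion = {
            w: [[c / sum(row) for c in row] for row in counts[w]] for w in workers
        }

        # E-step: recompute P(label = 1) for each image from all of its answers.
        like1 = {i: prior for i in images}
        like0 = {i: 1.0 - prior for i in images}
        for w, i, a in responses:
            like1[i] *= confusion[w][1][a]
            like0[i] *= confusion[w][0][a]
        posterior = {i: like1[i] / (like1[i] + like0[i]) for i in images}

    return posterior, confusion

# Toy data: workers "a" and "b" answer correctly, worker "c" always inverts the label.
truth = [1, 1, 1, 0, 0, 0]
responses = []
for image, t in enumerate(truth):
    responses += [("a", image, t), ("b", image, t), ("c", image, 1 - t)]

posterior, confusion = dawid_skene(responses)
print(posterior[0], posterior[3], confusion["c"][1][1])
```

In this toy run the model learns that worker "c" is systematically inverted, which plain majority vote cannot express.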

24 Another Approach: Worker Quality (Ipeirotis, Provost, Wang. 2012). Compute soft label: distribution over labels given worker response. Calculate expected cost of soft label q.
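A hedged sketch of the two steps named on the slide. The soft label is computed with Bayes' rule from a prior and a worker confusion matrix. The expected cost is taken here to be the sum over label pairs of q[i] * q[j] * cost[i][j], which is my understanding of the cited work rather than something stated on the slide. The workers and the unit cost matrix are hypothetical.

```python
def soft_label(prior, confusion, answer):
    """Posterior over true labels given one worker answer (Bayes' rule)."""
    joint = [prior[t] * confusion[t][answer] for t in range(len(prior))]
    z = sum(joint)
    return [x / z for x in joint]

def expected_cost(q, cost):
    """Sum over i, j of q[i] * q[j] * cost[i][j]; zero when q is certain."""
    n = len(q)
    return sum(q[i] * q[j] * cost[i][j] for i in range(n) for j in range(n))

prior = [0.5, 0.5]
cost = [[0, 1], [1, 0]]            # assumed: unit cost for any misclassification
good = [[0.9, 0.1], [0.1, 0.9]]    # hypothetical accurate worker
spam = [[0.0, 1.0], [0.0, 1.0]]    # hypothetical worker who always answers 1

cost_good = expected_cost(soft_label(prior, good, 1), cost)
cost_spam = expected_cost(soft_label(prior, spam, 1), cost)
print(cost_good, cost_spam)
```

An answer from the always-yes worker leaves the soft label at the prior, so its expected cost is as high as knowing nothing, while the accurate worker's answer lowers it.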

25 Pro & Con. Pro: gives image confidence; gives worker quality. Con: more complex; need to run optimization.

26 Overview Classification Localization Detection Pelican

27 Localization Overview: classification images, 1,000 classes, 600k training bounding boxes. Main Challenge: collecting and verifying bounding boxes. Example image: Pelican.

28 Bounding Boxes (Su, Deng, Fei-Fei. 2012). Requirements: tight around object; around all object instances; not around other objects. Example: bounding boxes for bottle.

29 Tasks
1. Draw a bounding box around a single instance
2. Quality verification of bounding box
3. Coverage verification

30 Drawing: Intuitively simple... but the devil is in the details.

31 Drawing: Things vision researchers take for granted. Include all visible parts. Include only visible parts. Make the bounding box tight. Only include a single instance. Don't draw over any instances that already have bounding boxes. What if there are no unannotated objects? Provide instructions and use a qualification task!

32 Drawing: Include all visible parts. (Examples: Good, Bad)

33 Drawing: Include only visible parts. Don't try to complete the object. (Examples: Good, Bad)

34 Drawing: Make the bounding box tight, even though loose is much faster. (Examples: Good, Bad)

35 Drawing: Only include a single instance. (Examples: Good, Bad)

36 Drawing: Don't draw over instances that already have bounding boxes. Can enforce this in the UI. (Examples: Good, Bad)

37 Drawing: What if there are no unannotated objects? Give option to annotate no bounding boxes. Good: "No more objects". Bad: anything else.

38 Quality Verification: Simpler than bounding box drawing. Still has some details. Is this bounding box good? YES

39 Quality Verification: Details: still need to know about good bounding boxes; quality control. Is this bounding box good? YES

40 Quality Verification: Quality control. Embed gold standard images. Positives: majority vote. Negatives: perturb the positives. Reject annotations if bad answers to these. Can be used for almost any type of task! (Optionally) require agreement of more than one annotator.
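A sketch of the gold-standard idea for box verification: a verified box serves as a positive, a perturbed copy as a negative, and a worker's batch is rejected if any embedded gold item is answered wrongly. The intersection-over-union threshold of 0.5, the perturbation ranges, and all function names are assumptions for illustration.

```python
import random

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def perturb(box, rng, max_iou=0.5):
    """Shift and rescale a good box until it overlaps the original by at most max_iou."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    while True:
        dx, dy = rng.uniform(-w, w), rng.uniform(-h, h)
        scale = rng.uniform(0.5, 1.5)
        bad = (x1 + dx, y1 + dy, x1 + dx + w * scale, y1 + dy + h * scale)
        if iou(box, bad) <= max_iou:
            return bad

def passes_gold(answers, gold):
    """answers/gold map item id -> bool ("is this box good?"); all gold must match."""
    return all(answers.get(k) == v for k, v in gold.items())

rng = random.Random(0)
good_box = (10.0, 10.0, 110.0, 60.0)
bad_box = perturb(good_box, rng)
gold = {"g1": True, "g2": False}   # g1 = verified box, g2 = its perturbed copy
worker_answers = {"g1": True, "g2": True, "x": True}   # says yes to everything
print(passes_gold(worker_answers, gold))
```

A worker who approves every box fails the check because of the perturbed negative, which is what makes the negatives useful.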

41 Coverage Verification: Similar in style to quality verification. Just a different question. Still need instructions, quality control. Any unannotated raccoons? Nope!

42 Bounding Boxes: Misc. Provide definitions and example images! Especially if uncommon objects, but also helps with common objects (annotators from different cultures). Make sure objects being annotated are actually in your images: do the classification task first.

43 Bounding Boxes: Misc. Make qualification tasks. Verification tasks are much faster than drawing. Corner cases: each task needs a plan for when the previous task goes wrong.

45 Detection Overview: 456k training images, 61k fully-annotated val+test, 200 classes. Main Challenge: annotating all 200 classes in every image. Example labels: Bird, Frog.

46 Detection Pipeline 1. Collect images 2. Class presence annotation 3. Bounding box annotation Bird Frog

47 Detection Pipeline 1. Collect images 2. Class presence annotation 3. Bounding box annotation Same as previous Bird Frog

48 Detection Pipeline 1. Collect images 2. Class presence annotation 3. Bounding box annotation Bird Frog

49 Collecting Images: Need images that aren't single object-centric. Additional queries: compound object queries ("tiger lion", "skunk and cat"); complex scene queries ("kitchenette", "dining table", "orchestra").

50 Detection Pipeline 1. Collect images 2. Class presence annotation 3. Bounding box annotation Bird Frog

51 Naive approach: ask for each object. Labels (Table, Chair, Horse, Dog, Cat, Bird): ? ? ? ? ? ?. Question (machine): Is there a table? Answer (crowd): Yes. Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei. CHI 2014

52 Naive approach: ask for each object. Labels: + ? ? ? ? ?. Question: Is there a table? Answer: Yes.

53 Naive approach: ask for each object. Labels: + + ? ? ? ?. Question: Is there a chair? Answer: Yes.

54 Naive approach: ask for each object. Labels: + + - ? ? ?. Question: Is there a horse? Answer: No.

55 Naive approach: ask for each object. Labels: + + - - ? ?. Question: Is there a dog? Answer: No.

56 Naive approach: ask for each object. Labels: + + - - - ?. Question: Is there a cat? Answer: No.

57 Naive approach: ask for each object. Labels: + + - - - -. Question: Is there a bird? Answer: No.

58 Naive approach: ask for each object. Cost: O(NK) for N images and K objects.

59 Hierarchy: Furniture (Table, Chair); Animal (Mammal (Horse, Dog, Cat), Bird).

60 Hierarchy: Furniture (Table, Chair); Animal (Mammal (Horse, Dog, Cat), Bird). Correlation. Sparsity.

61 Better approach: exploit label structure. Labels (Table, Chair, Horse, Dog, Cat, Bird): ? ? ? ? ? ?.

62 Better approach: exploit label structure. Labels: ? ? ? ? ? ?. Question (machine): Is there an animal? Answer (crowd): No.

63 Better approach: exploit label structure. Labels: ? ? - - - -. Question: Is there an animal? Answer: No.

64 Better approach: exploit label structure. Labels: ? ? - - - -. Question: Is there furniture? Answer: Yes.

65 Better approach: exploit label structure. Labels: ? ? - - - -. Question: Is there a table? Answer: Yes.

66 Better approach: exploit label structure. Labels: + ? - - - -. Question: Is there a chair? Answer: Yes.

67 Better approach: exploit label structure. Labels: + + - - - -. Question: Is there a chair? Answer: Yes.
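The walkthrough above can be sketched as a top-down traversal: a "no" on an internal node labels every leaf beneath it negative, so those leaves never need their own question. The fixed traversal order and the simulated crowd (a set of objects truly present) are simplifications for illustration; the cited paper selects questions by utility and cost rather than in a fixed order.

```python
# Toy hierarchy from the slides: internal nodes map to their children.
CHILDREN = {
    "furniture": ["table", "chair"],
    "animal": ["mammal", "bird"],
    "mammal": ["horse", "dog", "cat"],
}
ROOTS = ["animal", "furniture"]

def leaves(node):
    """All leaf labels under a node (a leaf is its own only descendant)."""
    if node not in CHILDREN:
        return [node]
    return [leaf for child in CHILDREN[node] for leaf in leaves(child)]

def annotate(present, ask_log):
    """Label every leaf, descending into a subtree only after a "yes" on its root.

    present: set of leaf labels truly in the image (stands in for the crowd).
    ask_log: list that receives every question asked, in order.
    """
    labels = {}
    stack = list(reversed(ROOTS))
    while stack:
        node = stack.pop()
        ask_log.append(node)
        answer = any(leaf in present for leaf in leaves(node))
        if node not in CHILDREN:
            labels[node] = answer
        elif answer:
            stack.extend(reversed(CHILDREN[node]))
        else:
            # A "no" on an internal node labels every leaf below it negative.
            labels.update({leaf: False for leaf in leaves(node)})
    return labels

asked = []
labels = annotate({"table", "chair"}, asked)
print(asked)
print(labels)
```

For this image the traversal asks four questions (animal, furniture, table, chair) instead of the six that the naive approach needs.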

68 Selecting the Right Question. Goal: get as much utility (new labels) as possible, for as little cost (worker time) as possible, given a desired level of accuracy.

69 Accuracy constraint: user-specified accuracy threshold, e.g., 95%. Might require only one worker, might require several, based on the task.

70 Cost: worker time (time = money). Expected human time to get an answer with 95% accuracy:
Question (is there ...)                                            Cost (seconds)
a thing used to open cans/bottles                                  14.4
an item that runs on electricity (plugged in or using batteries)   12.6
a stringed instrument                                              3.4
a canine                                                           2.0

71 Utility: expected # of new labels. Labels (Table, Chair, Horse, Dog, Cat, Bird): ? ? ? ? ? ?. Question: Is there a table? Yes gives + ? ? ? ? ?; No gives - ? ? ? ? ?. Either way, utility = 1.

72 Utility: expected # of new labels. Is there a table? utility = 1. Is there an animal? With Pr(Y) = 0.5 the labels stay ? ? ? ? ? ? (0 new labels); with Pr(N) = 0.5 they become ? ? - - - - (4 new labels). Utility = 0.5 * 0 + 0.5 * 4 = 2.
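The utility on these slides is an expectation over the two possible answers. A minimal sketch, using the slide's probabilities for the animal question; the 0.5 passed for the table question is arbitrary, since that question yields one label either way.

```python
def expected_utility(p_yes, new_labels_if_yes, new_labels_if_no):
    """Expected number of leaf labels resolved by asking one yes/no question."""
    return p_yes * new_labels_if_yes + (1.0 - p_yes) * new_labels_if_no

# "Is there a table?": either answer fixes exactly one leaf label.
u_table = expected_utility(0.5, 1, 1)
# "Is there an animal?": "yes" fixes no leaf; "no" fixes horse, dog, cat and bird.
u_animal = expected_utility(0.5, 0, 4)
print(u_table, u_animal)
```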

73 Selecting the Right Question: Pick the question with the most labels per second. Table columns: Query (Is there a ...); Utility (num labels); Cost (worker time in secs); Utility-Cost Ratio (labels per sec). Queries: mammal with claws or fingers; living organism; mammal; creature without legs; land or avian creature.
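A sketch of the selection rule: choose the candidate question with the highest utility-to-cost ratio. The costs below are the ones quoted on the cost slide (slide 70); the utilities are made-up placeholders, so the winner shown here says nothing about the actual experiment.

```python
def best_question(candidates):
    """Pick the (name, utility, cost) tuple with the most labels per second."""
    return max(candidates, key=lambda c: c[1] / c[2])

# (question, utility in labels [hypothetical], cost in seconds [from slide 70])
candidates = [
    ("a thing used to open cans/bottles", 1.0, 14.4),
    ("an item that runs on electricity", 3.0, 12.6),
    ("a stringed instrument", 2.0, 3.4),
    ("a canine", 1.5, 2.0),
]
name, utility, cost = best_question(candidates)
print(name, round(utility / cost, 2))
```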

74 Results. Dataset: 20K images from ImageNet Challenge. Labels: 200 basic categories (dog, cat, table, ...); 64 internal nodes in hierarchy.

75 Results: accuracy. Annotating 10K images with 200 objects. Table columns: Accuracy threshold per question (parameter); Accuracy (F1 score), naive approach; Accuracy (F1 score), our approach. F1 values: (75.67) (76.97) (60.17) (60.69)

76 Results: cost. Annotating 10K images with 200 objects. Table columns: Accuracy threshold per question (parameter); Cost saving (our approach compared to naive approach). 6 times more labels per second.

77 Overview Classification Localization Detection Bird Frog

78 Final Thoughts: Provide good instructions. Do quality control. Visualize results. Listen to your workers.

79 Questions?


More information

Video Semantic Indexing using Object Detection-Derived Features

Video Semantic Indexing using Object Detection-Derived Features Video Semantic Indexing using Object Detection-Derived Features Kotaro Kikuchi, Kazuya Ueki, Tetsuji Ogawa, and Tetsunori Kobayashi Dept. of Computer Science, Waseda University, Japan Abstract A new feature

More information

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011 Previously Part-based and local feature models for generic object recognition Wed, April 20 UT-Austin Discriminative classifiers Boosting Nearest neighbors Support vector machines Useful for object recognition

More information

Future directions in computer vision. Larry Davis Computer Vision Laboratory University of Maryland College Park MD USA

Future directions in computer vision. Larry Davis Computer Vision Laboratory University of Maryland College Park MD USA Future directions in computer vision Larry Davis Computer Vision Laboratory University of Maryland College Park MD USA Presentation overview Future Directions Workshop on Computer Vision Object detection

More information

ACM MM Dong Liu, Shuicheng Yan, Yong Rui and Hong-Jiang Zhang

ACM MM Dong Liu, Shuicheng Yan, Yong Rui and Hong-Jiang Zhang ACM MM 2010 Dong Liu, Shuicheng Yan, Yong Rui and Hong-Jiang Zhang Harbin Institute of Technology National University of Singapore Microsoft Corporation Proliferation of images and videos on the Internet

More information

arxiv: v1 [cs.cv] 5 Oct 2015

arxiv: v1 [cs.cv] 5 Oct 2015 Efficient Object Detection for High Resolution Images Yongxi Lu 1 and Tara Javidi 1 arxiv:1510.01257v1 [cs.cv] 5 Oct 2015 Abstract Efficient generation of high-quality object proposals is an essential

More information

Part based models for recognition. Kristen Grauman

Part based models for recognition. Kristen Grauman Part based models for recognition Kristen Grauman UT Austin Limitations of window-based models Not all objects are box-shaped Assuming specific 2d view of object Local components themselves do not necessarily

More information

Describable Visual Attributes for Face Verification and Image Search

Describable Visual Attributes for Face Verification and Image Search Advanced Topics in Multimedia Analysis and Indexing, Spring 2011, NTU. 1 Describable Visual Attributes for Face Verification and Image Search Kumar, Berg, Belhumeur, Nayar. PAMI, 2011. Ryan Lei 2011/05/05

More information

Object Recognition II

Object Recognition II Object Recognition II Linda Shapiro EE/CSE 576 with CNN slides from Ross Girshick 1 Outline Object detection the task, evaluation, datasets Convolutional Neural Networks (CNNs) overview and history Region-based

More information

<is web> Information Systems & Semantic Web University of Koblenz Landau, Germany

<is web> Information Systems & Semantic Web University of Koblenz Landau, Germany Information Systems & University of Koblenz Landau, Germany Semantic Search examples: Swoogle and Watson Steffen Staad credit: Tim Finin (swoogle), Mathieu d Aquin (watson) and their groups 2009-07-17

More information

CS395T paper review. Indoor Segmentation and Support Inference from RGBD Images. Chao Jia Sep

CS395T paper review. Indoor Segmentation and Support Inference from RGBD Images. Chao Jia Sep CS395T paper review Indoor Segmentation and Support Inference from RGBD Images Chao Jia Sep 28 2012 Introduction What do we want -- Indoor scene parsing Segmentation and labeling Support relationships

More information

WISE: Large Scale Content Based Web Image Search. Michael Isard Joint with: Qifa Ke, Jian Sun, Zhong Wu Microsoft Research Silicon Valley

WISE: Large Scale Content Based Web Image Search. Michael Isard Joint with: Qifa Ke, Jian Sun, Zhong Wu Microsoft Research Silicon Valley WISE: Large Scale Content Based Web Image Search Michael Isard Joint with: Qifa Ke, Jian Sun, Zhong Wu Microsoft Research Silicon Valley 1 A picture is worth a thousand words. Query by Images What leaf?

More information

Bridging the Gap Between Local and Global Approaches for 3D Object Recognition. Isma Hadji G. N. DeSouza

Bridging the Gap Between Local and Global Approaches for 3D Object Recognition. Isma Hadji G. N. DeSouza Bridging the Gap Between Local and Global Approaches for 3D Object Recognition Isma Hadji G. N. DeSouza Outline Introduction Motivation Proposed Methods: 1. LEFT keypoint Detector 2. LGS Feature Descriptor

More information

Layered Scene Decomposition via the Occlusion-CRF Supplementary material

Layered Scene Decomposition via the Occlusion-CRF Supplementary material Layered Scene Decomposition via the Occlusion-CRF Supplementary material Chen Liu 1 Pushmeet Kohli 2 Yasutaka Furukawa 1 1 Washington University in St. Louis 2 Microsoft Research Redmond 1. Additional

More information

This research aims to present a new way of visualizing multi-dimensional data using generalized scatterplots by sensitivity coefficients to highlight

This research aims to present a new way of visualizing multi-dimensional data using generalized scatterplots by sensitivity coefficients to highlight This research aims to present a new way of visualizing multi-dimensional data using generalized scatterplots by sensitivity coefficients to highlight local variation of one variable with respect to another.

More information

A Convex Formulation for Learning from Crowds

A Convex Formulation for Learning from Crowds Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence A Convex Formulation for Learning from Crowds Hiroshi Kajino Department of Mathematical Informatics The University of Tokyo Yuta

More information

Image Classification pipeline. Lecture 2-1

Image Classification pipeline. Lecture 2-1 Lecture 2: Image Classification pipeline Lecture 2-1 Administrative: Piazza For questions about midterm, poster session, projects, etc, use Piazza! SCPD students: Use your @stanford.edu address to register

More information

Big Data Analytics! Special Topics for Computer Science CSE CSE Feb 9

Big Data Analytics! Special Topics for Computer Science CSE CSE Feb 9 Big Data Analytics! Special Topics for Computer Science CSE 4095-001 CSE 5095-005! Feb 9 Fei Wang Associate Professor Department of Computer Science and Engineering fei_wang@uconn.edu Clustering I What

More information

Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval O. Chum, et al.

Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval O. Chum, et al. Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval O. Chum, et al. Presented by Brandon Smith Computer Vision Fall 2007 Objective Given a query image of an object,

More information

Optimizing Object Detection:

Optimizing Object Detection: Lecture 10: Optimizing Object Detection: A Case Study of R-CNN, Fast R-CNN, and Faster R-CNN Visual Computing Systems Today s task: object detection Image classification: what is the object in this image?

More information

Neural Networks with Input Specified Thresholds

Neural Networks with Input Specified Thresholds Neural Networks with Input Specified Thresholds Fei Liu Stanford University liufei@stanford.edu Junyang Qian Stanford University junyangq@stanford.edu Abstract In this project report, we propose a method

More information

Supplementary Material

Supplementary Material Supplementary Material 1. Human annotation user interfaces Supplementary Figure 1 shows a screenshot of the frame-level classification tool. The segment-level tool was very similar. Supplementary Figure

More information

Video Google: A Text Retrieval Approach to Object Matching in Videos

Video Google: A Text Retrieval Approach to Object Matching in Videos Video Google: A Text Retrieval Approach to Object Matching in Videos Josef Sivic, Frederik Schaffalitzky, Andrew Zisserman Visual Geometry Group University of Oxford The vision Enable video, e.g. a feature

More information

crowdsourcing, benchmarking & other cool things

crowdsourcing, benchmarking & other cool things crowdsourcing, benchmarking & other cool things Fei-Fei Li (publish under L. Fei-Fei) Computer Science Dept. Psychology Dept. Stanford University is team work! WordNet friends co-pi Research collaborator;

More information

Learning Semantic Video Captioning using Data Generated with Grand Theft Auto

Learning Semantic Video Captioning using Data Generated with Grand Theft Auto A dark car is turning left on an exit Learning Semantic Video Captioning using Data Generated with Grand Theft Auto Alex Polis Polichroniadis Data Scientist, MSc Kolia Sadeghi Applied Mathematician, PhD

More information

Automated Video Analysis of Crowd Behavior

Automated Video Analysis of Crowd Behavior Automated Video Analysis of Crowd Behavior Robert Collins CSE Department Mar 30, 2009 Computational Science Seminar Series, Spring 2009. We Are... Lab for Perception, Action and Cognition Research Interest:

More information

COSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor

COSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor COSC160: Detection and Classification Jeremy Bolton, PhD Assistant Teaching Professor Outline I. Problem I. Strategies II. Features for training III. Using spatial information? IV. Reducing dimensionality

More information

A Systems View of Large- Scale 3D Reconstruction

A Systems View of Large- Scale 3D Reconstruction Lecture 23: A Systems View of Large- Scale 3D Reconstruction Visual Computing Systems Goals and motivation Construct a detailed 3D model of the world from unstructured photographs (e.g., Flickr, Facebook)

More information

Image Classification pipeline. Lecture 2-1

Image Classification pipeline. Lecture 2-1 Lecture 2: Image Classification pipeline Lecture 2-1 Administrative: Piazza For questions about midterm, poster session, projects, etc, use Piazza! SCPD students: Use your @stanford.edu address to register

More information

Fast Learning and Prediction for Object Detection using Whitened CNN Features

Fast Learning and Prediction for Object Detection using Whitened CNN Features Fast Learning and Prediction for Object Detection using Whitened CNN Features Björn Barz Erik Rodner Christoph Käding Joachim Denzler Computer Vision Group Friedrich Schiller University Jena Ernst-Abbe-Platz

More information

RANSAC: RANdom Sampling And Consensus

RANSAC: RANdom Sampling And Consensus CS231-M RANSAC: RANdom Sampling And Consensus Roland Angst rangst@stanford.edu www.stanford.edu/~rangst CS231-M 2014-04-30 1 The Need for RANSAC Why do I need RANSAC? I know robust statistics! Robust Statistics

More information

Deformable Part Models

Deformable Part Models CS 1674: Intro to Computer Vision Deformable Part Models Prof. Adriana Kovashka University of Pittsburgh November 9, 2016 Today: Object category detection Window-based approaches: Last time: Viola-Jones

More information

PARTIAL STYLE TRANSFER USING WEAKLY SUPERVISED SEMANTIC SEGMENTATION. Shin Matsuo Wataru Shimoda Keiji Yanai

PARTIAL STYLE TRANSFER USING WEAKLY SUPERVISED SEMANTIC SEGMENTATION. Shin Matsuo Wataru Shimoda Keiji Yanai PARTIAL STYLE TRANSFER USING WEAKLY SUPERVISED SEMANTIC SEGMENTATION Shin Matsuo Wataru Shimoda Keiji Yanai Department of Informatics, The University of Electro-Communications, Tokyo 1-5-1 Chofugaoka,

More information

Weakly Supervised Object Recognition with Convolutional Neural Networks

Weakly Supervised Object Recognition with Convolutional Neural Networks GDR-ISIS, Paris March 20, 2015 Weakly Supervised Object Recognition with Convolutional Neural Networks Ivan Laptev ivan.laptev@inria.fr WILLOW, INRIA/ENS/CNRS, Paris Joint work with: Maxime Oquab Leon

More information

VISION & LANGUAGE From Captions to Visual Concepts and Back

VISION & LANGUAGE From Captions to Visual Concepts and Back VISION & LANGUAGE From Captions to Visual Concepts and Back Brady Fowler & Kerry Jones Tuesday, February 28th 2017 CS 6501-004 VICENTE Agenda Problem Domain Object Detection Language Generation Sentence

More information

Object Oriented Programming Part II of II. Steve Ryder Session 8352 JSR Systems (JSR)

Object Oriented Programming Part II of II. Steve Ryder Session 8352 JSR Systems (JSR) Object Oriented Programming Part II of II Steve Ryder Session 8352 JSR Systems (JSR) sryder@jsrsys.com New Terms in this Section API Access Modifier Package Constructor 2 Polymorphism Three steps of object

More information

ShadowDraw Real-Time User Guidance for Freehand Drawing. Harshal Priyadarshi

ShadowDraw Real-Time User Guidance for Freehand Drawing. Harshal Priyadarshi ShadowDraw Real-Time User Guidance for Freehand Drawing Harshal Priyadarshi Demo Components of Shadow-Draw Inverted File Structure for indexing Database of images Corresponding Edge maps Query method Dynamically

More information