Instances on a Budget

Size: px

Start display at page:

Download "Instances on a Budget"

Alexis Booker
5 years ago
Views:

1 Retrieving Similar or Informative Instances on a Budget Kristen Grauman Dept. of Computer Science University of Texas at Austin Work with Sudheendra Vijayanarasimham Work with Sudheendra Vijayanarasimham, Brian Kulis, and Prateek Jain

2 The visual search problem Massive search pools ~20 hours of video/minute added to YouTube ~5,000 new tagged photos/minute added to Flickr, Complex representations and distances sky fence house grass? driveway

3 Retrieval on a budget Goal: Specify resources available algorithm focuses search accordingly. Cost of search Accuracy of retrieved results Cost of annotation Accuracy of learned models

4 Retrieval on a budget Retrieving similar instances with a search time budget Novel hash functions for learned metrics and arbitrary kernel functions Retrieving informative instances with a search time or annotation cost budget Novel hash functions for hyperplane queries Budgeted batch-mode active selection

5 Locality Sensitive Hashing (LSH) h r 1 1 r k N X i Guarantees approximate near neighbors in sub-linear time, given appropriate hash functions. Query time: hr 1 r k Q << N Q [Indyk and Motwani 1998, Charikar 2002]

6 LSH functions for dot products The probability that a random hyperplane separates two unit vectors depends on the angle between them: Corresponding hash function: High dot product: unlikely to split Lower dot product: likely to split [Goemans and Williamson 1995, Charikar 2004]

7 Learning how to compare images dissimilar similar Exploit (dis)similarity constraints to construct more useful distance function Number of existing techniques for metric learning [Weinberger et al. 2004, Hertz et al. 2004, Frome et al. 2007, Varma & Ray 2007, Kumar et al. 2007]

8 Learning how to compare images dissimilar similar Exploit (dis)similarity constraints to construct more useful distance function Number of existing techniques for metric learning [Weinberger et al. 2004, Hertz et al. 2004, Frome et al. 2007, Varma & Ray 2007, Kumar et al. 2007]

9 Our idea: Semi-supervised hash functions h( ( ) = h( ( ) h( ) h( ) Less likely to split pairs like those with similarity constraint More likely to split pairs like those with dissimilarity constraint [Jain, Kulis, & Grauman, CVPR 2008]

10 Semi-supervised hash functions Given learned Mahalanobis metric, We generate parameterized hash h functions for : Satisfies the locality-sensitivity condition: [Jain, Kulis, & Grauman, CVPR 2008]

11 Semi-supervised hash functions Query time: ate Error ra slower search faster search [Kulis, Jain, & Grauman, PAMI 2009] Flickr dataset Categorize scene based on nearest exemplars Base metric: Ling & Soatto s Proximity Distribution Kernel (PDK)

12 Semi-supervised hash functions Query time: ate Error ra Error drops with learned metric slower search faster search [Kulis, Jain, & Grauman, PAMI 2009] Flickr dataset Categorize scene based on nearest exemplars Base metric: Ling & Soatto s Proximity Distribution Kernel (PDK)

13 Searching with kernel functions Kernels encompass many useful similarity measures, many for structured input data. sky fence grass house? driveway sets graphs trees How can we search efficiently with an arbitrary kernel function?

14 Hash functions for kernels??

15 Our idea: Kernelized LSH (KLSH) Main idea: Draw on Central Limit Theorem to (implicitly) generate random Gaussian hyperplanes in the kernel-induced feature space. Show that products with those hyperplanes require only kernel and sparse set of data objects. [Kulis & Grauman, ICCV 2009]

16 Result: Kernelized LSH (KLSH) 80 Million Tiny Images dataset Gist descriptor + Gaussian RBF kernel KLSH searches less than 1% of the database to find a query s approximate near neighbors.

17 Retrieval on a budget Retrieving similar instances with a search time budget Novel hash functions for learned metrics and arbitrary kernel functions Retrieving informative instances with a search time or annotation cost budget Novel hash functions for hyperplane queries Budgeted batch-mode active selection

18 Active selection: retrieving informative instances Current Model Annotator Unlabeled data Labeled data Active Selection We have demands on both search time and annotation resources.

19 SVM margin criterion for active selection? Select point nearest to hyperplane decision boundary for labeling. w [Tong & Koller, 2000; Schohn & Cohn, 2000; Campbell et al. 2000] Problem: With massive unlabeled pool, cannot afford exhaustive linear scan to make selection.

20 Sub-linear time active selection Goal: Map hyperplane query directly to its nearest points. h( w) { x1, xk} x ( t) 1 x ( t) 2 w (t) x ( t 1) 1 x ( t 1) 2 x ( t 1) 3 w ( t 1) ) [Jain, Vijayanarasimhan & Grauman, NIPS 2010]

21 Hashing a hyperplane query To retrieve those points for which is small, want probable collision for perpendicular vectors: x 1 w Most likely to collide Assuming normalized data. Unlikely l to collide [Jain, Vijayanarasimhan & Grauman, NIPS 2010]

22 Hashing a hyperplane query To achieve this, we define asymmetric two-bit hash: Let:, Kristen Grauman, UT Austin

23 Hashing a hyperplane query To achieve this, we define asymmetric two-bit hash: Let: Then define: Likely to collide Unlikely to collide Kristen Grauman, UT Austin

24 Hashing a hyperplane query To achieve this, we define asymmetric two-bit hash: Let: Then define: We prove necessary LSH bounds, e.g.: See [Jain, Vijayanarasimhan & Grauman, NIPS 2010].

25 Data flow: Hashing a hyperplane query Hash all unlabeled data into table. Active selection loop: Hash current hyperplane as query. Retrieve unlabeled data points with which it collides. Request labels for them. Update hyperplane. Kristen Grauman, UT Austin

26 Results: Hashing a hyperplane query Tiny-1M 1Milli Million images from 1000s of categories CIFAR-10 60,000 images in 10 categories Newsgroups 20,000 documents in 20 categories

27 Results: Hashing a hyperplane query Accuracy improvements as more data labeled Newsgroups CIFAR: Frogs Time spent searching for selection (log scale)

28 Results: Hashing a hyperplane query Learning airplane Learning automobile Selected for labeling in first 10 iterationsti Efficient active selection with pool of 1 Million unlabeled examples!

29 Active selection: retrieving informative instances Current Model Annotator Unlabeled data $ $ $ $ $ $ Active Selection Labeled data How can we actively leverage many annotators at once?

30 Our idea: Budgeted batch active selection $T Current Current Budget Model Model Unlabeled $ $ data $ $ $ $ $ $ Budgeted Batch Active Selection $ Active Selection Labeled data Select a batch of examples such that together they most improve classifier objective and meet the annotation budget. [Vijayanarasimhan et al. CVPR 2010]

31 Our idea: Budgeted batch active selection Selected examples and optimistic labels Misclassification risk $ $ $ Budgeted Batch Active Selection $ Margin Improved risk and margin (predicted) Classifier Current after training addition of set selected and classifier examples [Vijayanarasimhan et al. CVPR 2010]

32 Results: Budgeted batch active selection Annotation cost = video length, segmentation time. Hollywood Activities: All 8 classes SIVAL Object: All 25 classes

33 Example selection at a single batch iteration: Positive action class = Stand up Random Myopic active batch Ours Answer phone Answer phone Sit down Sit down Stand up Sit down Kristen Grauman, UT Austin Stand up

34 Results: Budgeted batch active selection Optimizing a budgeted choice is crucial when candidate annotations vary in cost. Comparison to state-of-the-art batch-mode active p learning approach for choosing fixed-size batches [Hoi et al. 2009].

35 Summary: Retrieval on a budget To perform well with limited resources, we need search and learning algorithms that Offer guarantees on error search speed tradeoffs Target human supervision to use it most wisely Algorithms presented provide New families of locality-sensitive hash functions Large-scale active learning strategies to select points in sub-linear time and/or in batches.

Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds

Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds Sudheendra Vijayanarasimhan Kristen Grauman Department of Computer Science University of Texas at Austin Austin,