Multiple-Choice Questionnaire Group C


Family name:                     Given name:
Vision and Machine Learning, 1/28/2011
Multiple-Choice Questionnaire, Group C

No documents authorized. There can be several right answers to a question. Marking scheme: 2 points if all right answers are selected, 1 point for a right but incomplete answer, 0 points if a wrong answer is selected.

N.1: Motion and human actions. Optical flow estimation is problematic
a. in homogeneous image areas.
b. in textured image areas.
c. at image edges.
d. at the boundaries of moving objects.

N.2: Image features. Given scale-invariant interest points and SIFT descriptors normalized in the direction of the dominant gradient orientation, select the properties which are correct.
a. These descriptors allow matching images taken at different distances.
b. These descriptors are invariant to image rotation and translation.
c. These descriptors are invariant to affine transformations.
d. The detected regions indicate the local characteristic scale of the image.
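Option a of N.1 can be made concrete with the Lucas-Kanade normal equations: the flow of a patch is obtained from the structure tensor A^T A, where A stacks the spatial gradients, and in a homogeneous patch this matrix is rank-deficient, so the flow is undetermined. A NumPy sketch (the toy patches and the finite-difference gradients are illustrative assumptions, not part of the exam):

```python
import numpy as np

# Rank of the Lucas-Kanade structure tensor A^T A for a patch, where A
# stacks the spatial gradients Ix, Iy of every pixel in the patch.
def structure_tensor_rank(patch):
    Iy, Ix = np.gradient(patch.astype(float))   # axis 0 = rows, axis 1 = cols
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    return np.linalg.matrix_rank(A.T @ A)

flat = np.ones((7, 7))                         # homogeneous area
edge = np.tile(np.arange(7.0), (7, 1))         # pure horizontal ramp (edge)
textured = np.random.default_rng(0).normal(size=(7, 7))  # textured area

print(structure_tensor_rank(flat))       # 0: no constraint at all
print(structure_tensor_rank(edge))       # 1: aperture problem along the edge
print(structure_tensor_rank(textured))   # 2: flow fully constrained
```

Rank 2 means both flow components are recoverable; rank 1 is the aperture problem at an edge; rank 0 (homogeneous area) gives no constraint, which is why option a is problematic.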

N.3: Bag-of-features models for category-level classification. Image classification is one task in category recognition. Select the statements which are true in the context of image classification.
a. The PASCAL dataset is a standard for comparing the performance of different image classification algorithms.
b. Image classification makes it possible to localize objects in the image.
c. When training an image classifier, we use positive and negative training images.
d. The number of visual words used in the context of image classification is in general very high (between 100k and 1M visual words).

N.4: Bag-of-features models for category-level classification. The spatial pyramid kernel can be used for image classification. Select the statements which are correct.
a. The spatial pyramid kernel coarsely captures the global spatial layout of the image.
b. The spatial pyramid kernel works well for classifying scenes.
c. The spatial pyramid kernel is well adapted to classifying images with objects in arbitrary positions.
d. The spatial pyramid kernel is invariant to image rotation.

N.5: Camera geometry and image alignment. What is the minimal number of point-to-point correspondences needed to compute (i) a homography and (ii) a 2D affine geometric transformation?
a. 1 correspondence for the affine transformation, 2 correspondences for the homography.
b. 2 correspondences for the affine transformation, 3 correspondences for the homography.
c. 3 correspondences for the affine transformation, 4 correspondences for the homography.
d. 4 correspondences for the affine transformation, 5 correspondences for the homography.
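The counts in N.5 follow from degrees of freedom: a 2D affine transform has 6 unknowns and each correspondence contributes 2 linear equations, so 3 correspondences suffice; a homography has 8 degrees of freedom and needs 4. A minimal NumPy sketch for the affine case (the specific transform and points are made up for illustration):

```python
import numpy as np

# Recover a known 2D affine transform from the minimal 3 correspondences:
# 6 unknowns (a11, a12, tx, a21, a22, ty), 2 equations per point.
A_true = np.array([[1.2, 0.1, 3.0],
                   [-0.2, 0.9, 5.0]])                  # ground-truth affine
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # 3 source points
dst = (A_true @ np.c_[src, np.ones(3)].T).T            # their images

# Stack the 6x6 linear system M p = b.
M, b = [], []
for (x, y), (u, v) in zip(src, dst):
    M.append([x, y, 1, 0, 0, 0]); b.append(u)
    M.append([0, 0, 0, x, y, 1]); b.append(v)
p = np.linalg.solve(np.array(M), np.array(b))
A_est = p.reshape(2, 3)
print(np.allclose(A_est, A_true))   # True
```

The homography case is analogous but has a scale ambiguity, which is why 4 correspondences (8 equations for 8 degrees of freedom) are the minimum there.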

N.6: Large scale visual search. N SIFT descriptors are indexed using a randomized KD-tree as discussed in the lecture. What is the complexity (in terms of N) of finding an approximate nearest neighbor to a query SIFT descriptor?
a. N^2
b. N
c. log N
d. log log N

N.7: Camera geometry and image alignment. Two images, I and I', are captured by two cameras with internal calibration matrices K and K'. The two cameras have the same camera center and are related by a pure rotation given by matrix R (the mosaicking scenario). What is the form of the homography H relating the two images in terms of K, K' and R? Hint: Start from the perspective projection equation, x = (1/z) K [R t] X, where x (a 3-vector) and X (a 4-vector) are the image and scene points in homogeneous coordinates, respectively. Assume the two cameras have the same camera center at the origin, t1 = t2 = 0, and use the fact that the scene point can be written as X = (X~, 1), where X~ (a 3-vector) is in non-homogeneous coordinates.
a. H = K' R K
b. H = R K R^-1
c. H = R K' R^-1
d. H = K' R K^-1

N.8: Supervised learning. The support vector machine uses the loss function
a. l(y, f) = max(0, 1 - yf)
b. l(y, f) = log(1 + exp(-yf))
c. l(y, f) = (y - f)^2
d. l(y, f) = |y - f|
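The claim in N.7 can be checked numerically: with a shared camera center, projecting a scene point through both cameras and mapping the first pixel with H = K' R K^-1 must give the second pixel. A sketch (all matrix values below are illustrative, not from the exam):

```python
import numpy as np

# Two cameras sharing a center at the origin, related by a rotation R.
# With t = 0, the projection K [R|t] X of a scene point reduces to K R X~.
def project(K, R, X):
    x = K @ R @ X
    return x / x[2]          # normalize homogeneous pixel coordinates

K  = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], float)
K2 = np.array([[700, 0, 300], [0, 700, 250], [0, 0, 1]], float)
c, s = np.cos(0.1), np.sin(0.1)
R  = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])   # rotation between views

X  = np.array([0.3, -0.2, 2.0])        # scene point (non-homogeneous)
x1 = project(K, np.eye(3), X)          # first view: identity rotation
x2 = project(K2, R, X)                 # second view: rotated by R

H = K2 @ R @ np.linalg.inv(K)          # H = K' R K^-1, answer d
x2_from_H = H @ x1
print(np.allclose(x2_from_H / x2_from_H[2], x2))   # True
```

The derivation matches the hint: x' ~ K' R X~ = K' R (K^-1 K) X~ = (K' R K^-1) x, with the depth absorbed by the homogeneous scale.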

N.9: Large scale visual search. The number of visual words (VW) and their structure are parameters if we want to perform large-scale search for particular objects/buildings/scenes taken from different viewpoints. Select the statements which are correct.
a. A small number of VW (between 1000 and 4000) gives excellent performance.
b. A very large number of VW (between 200k and 1M) gives excellent performance.
c. An average number of VW (around 20k), combined with a refinement based on a short binary signature, gives excellent performance.
d. A hierarchically structured visual vocabulary improves the performance in terms of search accuracy.

N.10: Unsupervised learning. Let k(., .) be a positive definite kernel defining a similarity measure. The spectral clustering algorithm relies upon the singular value decomposition (SVD) of:
a. K = [k(x_i, x_j)] for 1 <= i, j <= n
b. K~ = Pi K Pi, with Pi = I_n - (1/n) 1_n 1_n^T and K defined above
c. L = D - K, with D = diag(deg(x_1), ..., deg(x_n)) and deg(x_i) = sum_{j=1}^n k(x_i, x_j) for i = 1, ..., n
d. L~ = K K~, with K and K~ defined above

N.11: Supervised learning. For competitive performance (and competitive generalization error), the C parameter of the support vector machine should be:
a. kept fixed at C = 1 regardless of the training data at hand
b. optimized on the test set, which is eventually used for evaluating the true performance of the learning algorithm
c. optimized on the training set
d. optimized through a cross-validation loop on the training set
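The matrices named in options a-c of N.10 are easy to construct explicitly. A sketch with toy data and an RBF kernel (the kernel choice and data are illustrative; any positive definite k(., .) works):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))                        # toy dataset, n = 5
n = len(X)

sq = ((X[:, None] - X[None]) ** 2).sum(-1)
K = np.exp(-sq)                                    # a. Gram matrix [k(xi, xj)]

Pi = np.eye(n) - np.ones((n, n)) / n               # centering projector
K_c = Pi @ K @ Pi                                  # b. centered Gram matrix

deg = K.sum(axis=1)                                # deg(xi) = sum_j k(xi, xj)
L = np.diag(deg) - K                               # c. unnormalized Laplacian

print(np.allclose(L.sum(axis=1), 0))               # True: rows of L sum to 0
```

The check at the end reflects a defining property of the graph Laplacian: the constant vector is in its null space, which is what spectral clustering exploits when it reads cluster structure from the smallest eigenvectors.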

N.12: Category-level localization. A linear SVM classifier used in combination with the sliding-window object detector
a. is fast because of the cascade structure.
b. is fast because it can be expressed in the form of a dot product, f(x) = w^T x + b.
c. is slower compared to a nonlinear SVM.
d. usually has lower accuracy compared to a nonlinear SVM.

N.13: Category-level localization. Pictorial structure models are often used to model objects in terms of parts and relations between parts. The graph of a pictorial structure model
a. has nodes corresponding to object parts and edges corresponding to part relations.
b. has nodes corresponding to part relations and edges corresponding to object parts.
c. has an associated energy function which can always be optimized in polynomial time.
d. typically has a tree or star structure for efficiency reasons.

N.14: Motion and human actions. Movie scripts can be used as a source of readily available supervision. They can provide
a. spatial supervision for objects in the video.
b. noisy temporal supervision.
c. reliable temporal supervision.
d. a complete description of a video.
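Option b of N.12 in code: a linear SVM turns dense sliding-window detection into one dot product per window, i.e. a plain correlation of the feature map with w. A sketch (the feature map and weights below are toy values; a real detector would use HOG-like features):

```python
import numpy as np

rng = np.random.default_rng(1)
feat = rng.normal(size=(20, 20))       # toy single-channel feature image
w = rng.normal(size=(5, 5))            # "learned" weights for a 5x5 window
b = -0.1

wh, ww = w.shape
scores = np.empty((feat.shape[0] - wh + 1, feat.shape[1] - ww + 1))
for i in range(scores.shape[0]):
    for j in range(scores.shape[1]):
        window = feat[i:i + wh, j:j + ww].ravel()
        scores[i, j] = w.ravel() @ window + b   # f(x) = w^T x + b per window
print(scores.shape)                             # (16, 16)
```

Because every window costs only one dot product (and the whole loop is a single convolution in practice), the linear case is much faster than a kernelized SVM, whose score needs a kernel evaluation against every support vector.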

N.15: Unsupervised learning. The k-means algorithm is a:
a. supervised learning algorithm.
b. unsupervised learning algorithm.
c. semi-supervised learning algorithm.
d. weakly supervised learning algorithm.

N.16: Image features. The Harris detector extracts interest points from a given image. Select the properties which are correct.
a. The detector is based on the auto-correlation matrix.
b. The detector selects the characteristic scale.
c. The detector finds discriminant points.
d. The detector is invariant to rotation.
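A minimal sketch of the Harris response from N.16, built from the auto-correlation (second-moment) matrix of option a. The box window, the constant k = 0.04, and the toy image are standard or illustrative choices, not from the exam:

```python
import numpy as np

def harris_response(img, k=0.04):
    # Auto-correlation matrix entries from finite-difference gradients.
    Iy, Ix = np.gradient(img.astype(float))
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy
    # Sum the products over a 3x3 window (box filter standing in for the
    # usual Gaussian weighting).
    box = lambda a: sum(np.roll(np.roll(a, i, 0), j, 1)
                        for i in (-1, 0, 1) for j in (-1, 0, 1))
    Sxx, Syy, Sxy = box(Ixx), box(Iyy), box(Ixy)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2   # large positive response at corners

img = np.zeros((9, 9))
img[4:, 4:] = 1.0                 # a single corner at pixel (4, 4)
R = harris_response(img)
print(np.unravel_index(R.argmax(), R.shape))   # (4, 4)
```

The response peaks where both eigenvalues of the auto-correlation matrix are large, i.e. exactly at the corner; since eigenvalues are unchanged by rotating the image, this also illustrates option d.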

Multiple-Choice Questionnaire, Group C (answer grid)

     n.1  n.2  n.3  n.4  n.5  n.6  n.7  n.8  n.9  n.10  n.11  n.12  n.13  n.14  n.15  n.16
a
b
c
d