CS 231A Computer Vision Midterm

Tuesday October 30, 2012

Set 1 Multiple Choice (22 points)

Each question is worth 2 points. To discourage random guessing, 1 point will be deducted for a wrong answer on multiple choice questions! For questions with multiple answers, 2 points will only be awarded if all correct choices are selected; otherwise, the answer is wrong and will incur a 1 point penalty. Please draw a circle around the option(s) to indicate your answer. No credit will be awarded for unclear/ambiguous answers.

1. (Pick one) If all of our data points are in $\mathbb{R}^2$, which of the following clustering algorithms can handle clusters of arbitrary shape?
(a) k-means.
(b) k-means++.
(c) EM with a Gaussian mixture model.
(d) mean-shift.
Answer: only (d).

2. (Pick one) Suppose we are using a Hough transform to do line fitting, but we notice that our system is detecting two lines where there is actually one in some example image. Which of the following is most likely to alleviate this problem?
(a) Increase the size of the bins in the Hough transform.
(b) Decrease the size of the bins in the Hough transform.
(c) Sharpen the image.
(d) Make the image larger.
Answer: (a).

3. (Pick one) Which of the following processes would help avoid aliasing while downsampling an image?

(a) Image sharpening.
(b) Image blurring.
(c) Median filtering, where you replace every pixel by the median of pixels in a window around the pixel.
(d) Histogram equalization.
Answer: (b).

4. (Circle all that apply) A Sobel filter can be written as

$$\begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 1 \end{bmatrix} \qquad (1)$$

Which of the following statements are true?
(a) Separating the filter in the above manner reduces the number of computations.
(b) It is similar to applying a Gaussian filter followed by a derivative.
(c) Separation leads to spurious edge artifacts.
(d) Separation approximates the first derivative of Gaussian.
Answer: (a) and (b).

5. (Pick one) Which of the following is true for Eigenfaces (PCA)?
(a) Can be used to effectively detect deformable objects.
(b) Invariant to affine transforms.
(c) Can be used for lossy image compression.
(d) Is invariant to shadows.
Answer: (c).

6. (Pick one) Downsampling can lead to aliasing because
(a) Sampling leads to additions of low frequency noise.
(b) Sampled high frequency components result in apparent low frequency components.
(c) Sampling increases the frequency components in an image.
(d) Sampling leads to spurious high frequency noise.
Answer: (b).

7. (Pick one) If we replace one lens on a calibrated stereo rig with a bigger one, what can we say about the essential matrix, E, and the fundamental matrix, F?

(a) E can change due to a possible change in the physical length of the lens. F is unchanged.
(b) F can change due to a possible change in the lens characteristics. E is unchanged.
(c) E can change due to a possible change in the lens characteristics. F is unchanged.
(d) Both are unchanged.
Answer: (b).

8. (Pick one) Which of the following statements describes an affine camera but not a general perspective camera?
(a) Relative sizes of visible objects in a scene can be determined without prior knowledge.
(b) Can be used to determine the distance from an object of a known height.
(c) Approximates the human visual system.
(d) An infinitely long plane can be viewed as a line from the right angle.
Answer: (a).

9. (Circle all that apply) Which of the following could affect the intrinsic parameters of a camera?
(a) A crooked lens system.
(b) Diamond/rhombus shaped pixels with non-right angles.
(c) The aperture configuration and construction.
(d) Any offset of the image sensor from the lens's optical center.
Answer: (a), (b), and (d).

10. (Circle all that apply) For camera calibration, we learned that since there are 11 unknown parameters, we need at least 6 correspondences to calibrate. Assuming that you couldn't find a calibration target with the minimum of 6 corners to use as correspondences, you decide to take K pictures from different viewpoints of a stationary pattern with N corners, where N < 6. Which of the following statements is true?
(a) The number of images K must satisfy 2NK > 11 for the 11 unknowns; the value of N isn't important, so long as N > 0.
(b) The problem is unsolvable, since you do not have enough correspondences in a single image.
(c) The number of unknown parameters scales with the number of unique images taken.
(d) The number of unknown parameters is fixed, but the N corners must not be collinear.
Answer: (c).

11. (Circle all that apply) Which of the following statements about correlation are true in general?

(a) For a symmetric 1D filter, computing the convolution of the filter with a signal is the same as computing the correlation of the filter with the signal.
(b) Correlation computation can be made fast through the use of the Discrete Fourier Transform.
(c) Correlation computation is not shift invariant.
(d) The correlation method would be effective in solving the correspondence problem between two images of a checkerboard.
Answer: (a) and (b).
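Two of the claims above (the separability of the Sobel filter in question 4 and the symmetric-filter statement in question 11(a)) can be checked numerically. The following is a minimal sketch in plain Python; the helper names `outer`, `correlate_valid` and `convolve_valid`, the sign convention of the Sobel kernel, and the sample signal are assumptions made for illustration and do not come from the exam.

```python
def outer(col, row):
    """Outer product of a column vector and a row vector, as a list of rows."""
    return [[c * r for r in row] for c in col]


def correlate_valid(signal, kernel):
    """1D cross-correlation over the fully overlapping region (kernel is not flipped)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def convolve_valid(signal, kernel):
    """1D convolution over the same region: correlation with the kernel flipped."""
    return correlate_valid(signal, kernel[::-1])


# Question 4: a 3x3 Sobel kernel is the outer product of a derivative
# column and a smoothing row (sign convention assumed here).
sobel = outer([1, 0, -1], [1, 2, 1])

# Question 11(a): flipping a symmetric kernel changes nothing, so
# convolution and correlation agree; for a non-symmetric kernel they differ.
signal = [3, 1, 4, 1, 5, 9, 2, 6]   # arbitrary illustrative samples
smooth = [1, 2, 1]                   # symmetric kernel
deriv = [1, 0, -1]                   # antisymmetric kernel
same = convolve_valid(signal, smooth) == correlate_valid(signal, smooth)
differ = convolve_valid(signal, deriv) != correlate_valid(signal, deriv)
```

Applying the column and the row as two 1D passes costs about 6 multiplications per pixel instead of 9 for the full 3x3 kernel, which is the saving that option 4(a) refers to.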

2 True or False (10 points)

True or false. Correct answers are 1 point, -1 point for each incorrect answer.

(a) (True/False) Fisherfaces works better at discrimination than Eigenfaces because Eigenfaces assumes that the faces are aligned.
Answer: False. Both assume the faces are aligned.

(b) (True/False) If you don't normalize your data to have zero mean, then the first principal component found via PCA is likely to be uninformative.
Answer: True.

(c) (True/False) Given sufficiently many weak classifiers, boosting is guaranteed to get perfect accuracy on the training set no matter what the training data looks like.
Answer: False. One could have two data points which are identical except for their label.

(d) (True/False) Boosting always makes your algorithm generalize better.
Answer: False. You can overfit.

(e) (True/False) It is possible to blur an image using a linear filter.
Answer: True.

(f) (True/False) When extracting the eigenvectors of the similarity matrix for an image to do clustering, the first eigenvector to use should be the one corresponding to the second largest eigenvalue, not the largest.
Answer: False. It is the largest one.

(g) (True/False) The Canny edge detector is a linear filter because it uses the Gaussian filter to blur the image and then uses a linear filter to compute the gradient.
Answer: False. It has non-linear operations: thresholding, hysteresis, and non-maximal suppression.

(h) (True/False) A zero-skew intrinsic matrix is not full rank because it has one less DOF.
Answer: False. It is still full rank, even if it has one less DOF.

(i) (True/False) Compared to the normalized cut algorithm, the partitions of minimum cut are always strictly smaller.
Answer: False. (This question was a bit ambiguous, so we gave everyone credit for any answer.) Min cut prefers very small and very large partitions; normalized cut prefers partitions of roughly equal size.

(j) (True/False) Assuming the camera coordinate system is the same as the world coordinate system, the intrinsic and extrinsic parameters of a camera can map any point in homogeneous world coordinates to a unique point in the image plane.
Answer: False. In this situation, the vector $(0, 0, 0, 1)^T$ spans the null space of the camera matrix and represents the camera origin, and the projective line in this case is ambiguous.
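The null-space argument in (j) can be illustrated with a small numeric sketch. The intrinsic values below (focal length 800 pixels, principal point (320, 240)) and the helper names `matmul` and `project` are hypothetical, chosen only to make the example concrete.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def project(P, X):
    """Apply a 3x4 camera matrix to a homogeneous world point (4-vector)."""
    return [sum(P[i][j] * X[j] for j in range(4)) for i in range(3)]


# Hypothetical intrinsics, assumed for illustration only.
K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]

# Extrinsics [I | 0]: the camera frame coincides with the world frame.
Rt = [[1.0, 0.0, 0.0, 0.0],
      [0.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 0.0]]

P = matmul(K, Rt)  # 3x4 camera matrix

# The camera centre (0, 0, 0, 1) maps to the zero vector, which is not
# a valid homogeneous image point.
centre_image = project(P, [0.0, 0.0, 0.0, 1.0])

# Any other point in front of the camera maps to a well-defined pixel.
u, v, w = project(P, [1.0, 2.0, 10.0, 1.0])
pixel = (u / w, v / w)
```

Every finite point other than the camera centre yields a nonzero 3-vector, so the centre is the single finite point for which the mapping breaks down.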

3 Long Answer (32 points)

11. (10 points) Detecting Patterns with Filters

A Gabor filter is a linear filter that is used in image processing to detect patterns of various orientations and frequencies. A Gabor filter is composed of a Gaussian kernel function that has been modulated by a sinusoidal plane wave. The real-valued version of the filter is shown below.

$$g(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right) \cos\left(2\pi \frac{x'}{\lambda} + \psi\right)$$

where

$$x' = x \cos\theta + y \sin\theta, \qquad y' = -x \sin\theta + y \cos\theta.$$

Figure 1 shows an example of a 2D Gabor filter.

[Figure 1: 2D Gabor Filter]

(a) (6 points) What is the physical meaning of each of the five parameters of the Gabor filter, $\lambda, \theta, \psi, \sigma, \gamma$, and how do they affect the impulse response?
Hint: The impulse response of a Gaussian filter is shown in Equation 2. It is normally radially symmetric; how would you make this filter elliptical? How would you make this filter steerable? What does the 2D cosine modulation do to this filter?

$$\mathrm{gaussian}(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right) \qquad (2)$$

Answer:
$\lambda$: the wavelength of the sinusoidal factor.
$\theta$: the orientation of the normal to the parallel stripes of the Gabor function.
$\psi$: the phase offset.
$\sigma$: the sigma (standard deviation) of the Gaussian envelope.
$\gamma$: the spatial aspect ratio, which specifies the ellipticity of the support of the Gabor function.

(b) (4 points) Given a Gabor filter that has been tuned to respond maximally to the striped pattern shown in Figure 2, how would the parameters $\lambda_0, \theta_0, \psi_0, \sigma_0, \gamma_0$ have to be modified to recognize the following variations? Provide the values of the new parameters in terms of the original values.

[Figure 2: Reference Pattern]

i. $\theta = \theta_0 + \frac{\pi}{4}$
ii. $\psi = \psi_0 + \pi$

iii. $\theta = \theta_0 + \frac{\pi}{4}$, $\sigma = 2\sigma_0$, $\lambda = 2\lambda_0$
iv. $\gamma = \frac{1}{2}\gamma_0$
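To make the role of each parameter concrete, here is a minimal sketch that samples the real-valued Gabor function on an integer grid. The base parameter values, the grid size, and the function names `gabor` and `gabor_kernel` are assumptions for illustration; the minus sign in the rotated coordinate follows the usual rotation convention.

```python
import math


def gabor(x, y, lam, theta, psi, sigma, gamma):
    """Real-valued Gabor response at (x, y)."""
    # Rotate coordinates so the stripes are oriented by theta.
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    # Gaussian envelope; gamma squashes or stretches it along yr.
    envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    # Cosine carrier with wavelength lam and phase offset psi.
    carrier = math.cos(2 * math.pi * xr / lam + psi)
    return envelope * carrier


def gabor_kernel(half_size, **params):
    """Sample the filter on a (2*half_size + 1)^2 grid centred at the origin."""
    coords = range(-half_size, half_size + 1)
    return [[gabor(x, y, **params) for x in coords] for y in coords]


# Hypothetical base parameters, chosen only for illustration.
base = dict(lam=8.0, theta=0.0, psi=0.0, sigma=4.0, gamma=0.5)
kernel = gabor_kernel(10, **base)

# Shifting the phase by pi negates the carrier, i.e. swaps light and dark stripes.
inverted = gabor_kernel(10, **{**base, "psi": base["psi"] + math.pi})
```

Changing `theta` rotates the stripes, scaling `lam` and `sigma` together rescales the whole pattern, and `gamma` controls how elongated the envelope is along the stripe direction.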

12. (10 points) Stereo Reconstruction

[Figure 3: Rectified Stereo Rig]

(a) (2 points) The figure above shows a rectified stereo rig with camera centers $O$ and $O'$, focal length $f$ and baseline $B$. $x$ and $x'$ are the projected point locations on the virtual image planes by the point $P$; note that since $x'$ is to the left of $O'$, it is negative. Give an expression for the depth of the point $P$, shown in the diagram as $Z$. Also give an expression for the $X$ coordinate of the point $P$ in world coordinates, assuming an origin of $O$. You can assume that the two are pinhole cameras for the rest of this question.

Answer:

$$Z = \frac{fB}{x - x'}, \qquad X = \frac{xZ}{f}$$
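The two expressions above translate directly into code. This is a small sketch assuming image coordinates measured from each camera's principal point; the function name `triangulate` and the numeric values are hypothetical.

```python
def triangulate(x_left, x_right, f, B):
    """Recover (X, Z) of a point from a rectified stereo pair.

    x_left, x_right: image coordinates relative to each principal point.
    f: focal length (same units as the image coordinates).
    B: baseline between the two camera centres.
    """
    d = x_left - x_right          # disparity
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    Z = f * B / d                 # depth from similar triangles
    X = x_left * Z / f            # back-project the left image coordinate
    return X, Z


# Hypothetical numbers: x is positive and x' is negative, as in the question.
X, Z = triangulate(x_left=20.0, x_right=-30.0, f=400.0, B=0.25)
```

Depth is inversely proportional to disparity, so distant points have small disparities and their depth estimates are the most sensitive to measurement error.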

(b) (4 points)

[Figure 4: Rectified Stereo Rig with image plane error]

Now assume that the camera system can't perfectly capture the projected points' locations on the image planes, so there is now some uncertainty about the point's location, since a real digital camera's image plane is discretized. Assume that the original $x$ and $x'$ positions now have an uncertainty of $\pm e$, which is related to the discretization of the image plane.
i. Give an expression for the $X, Z$ locations of the 4 intersection points resulting from the virtual image plane uncertainty.
ii. Give an expression for the maximum uncertainty in the $X$ and $Z$ directions of the point $P$'s location in world coordinates.
All expressions should be in terms of image coordinates only. You can assume that $x$ is always positive and $x'$ is always negative.

Answer:

$$Z_{min} = \frac{fB}{(x + e) - (x' - e)}, \qquad Z_{max} = \frac{fB}{(x - e) - (x' + e)}$$

With $d = x - x'$:

$$Z_{diff} = Z_{max} - Z_{min} = fB \, \frac{4e}{(d - 2e)(d + 2e)}$$

$$Z_{mid} = \frac{fB}{(x - e) - (x' - e)} = \frac{fB}{x - x'}$$

$$X_{min} = \frac{(x - e) Z_{mid}}{f}, \qquad X_{max} = \frac{(x + e) Z_{mid}}{f}, \qquad X_{max} - X_{min} = \frac{Z_{mid}(2e)}{f}$$
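As a sanity check on the algebra, the following sketch computes the near and far depth bounds directly and compares their difference with the closed-form expression for the depth uncertainty. The function names and numeric values are assumptions for illustration.

```python
def depth_bounds(x_left, x_right, f, B, e):
    """Nearest and farthest depth consistent with a +/- e error in each image."""
    d = x_left - x_right
    if d <= 2 * e:
        raise ValueError("disparity must exceed 2e, otherwise the far bound is at infinity")
    z_min = f * B / (d + 2 * e)   # errors push the disparity up by 2e
    z_max = f * B / (d - 2 * e)   # errors pull the disparity down by 2e
    return z_min, z_max


def depth_uncertainty(d, f, B, e):
    """Closed form: Z_max - Z_min = 4 f B e / ((d - 2e)(d + 2e))."""
    return 4 * f * B * e / ((d - 2 * e) * (d + 2 * e))


# Hypothetical rig and measurement values.
f, B, e = 400.0, 0.25, 0.5
x_left, x_right = 20.0, -30.0
z_min, z_max = depth_bounds(x_left, x_right, f, B, e)
z_diff = depth_uncertainty(x_left - x_right, f, B, e)
```

The guard on `d <= 2 * e` marks the regime where the worst-case disparity reaches zero and the far bound is no longer finite.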

(c) (4 points) Assume the $X$ coordinate of the point $P$ is fixed.
i. Give an expression for the uncertainty in the reconstruction of $Z$, in terms of the actual value of $Z$ and the other parameters of the stereo rig.
ii. What is the depth uncertainty when $Z$ is equal to zero?
iii. Find the depth when the uncertainty is at its maximum and give a physical interpretation and a drawing to explain.

Answer:

$$d = \frac{fB}{Z}, \qquad Z_{diff}(Z) = \frac{4fBe}{\left(\frac{fB}{Z}\right)^2 - 4e^2}$$

$Z_{diff} = 0$ when $Z = 0$. This is when $Z$ is in between the two camera origins, so the ray cast by $P$ intersects the image plane at infinity, which means an infinite discrepancy.

$Z_{diff} = \infty$ when $Z = \frac{fB}{2e}$. When $Z$ takes on this value, the disparity between the two cameras equals $2e$, which means that with the additional $e$ error from both cameras, the point $P$ will be viewed with an effective disparity of zero, so one of the reconstructed distances will be infinite, while the other possibility will be finite, but the difference will be infinite. The point is so far away that the small error term causes the stereo rig to reconstruct it at infinity.

13. (12 points) AdaBoost algorithm for Face Detection

Let $f_M(x)$ be the classifying function learnt after the $M$-th iteration:

$$f_M(x) = \sum_{m=1}^{M} \beta_m C_m(x) \qquad (3)$$

where $C_m(x)$, $\forall m \in \{1, \ldots, M\}$, is a bunch of weak classifiers learned in $M$ iterations, $C_m : \mathbb{R} \to \{-1, 1\}$. We will now look at a derivation for the optimal $\beta, C$ at the $m$-th iteration given $\beta_k, C_k$ $\forall k \in \{1, \ldots, m-1\}$. You are also given $N$ training samples $\{(x_i, y_i)\}_{i=1,\ldots,N}$, where $x_i$ is a data point and $y_i \in \{-1, 1\}$ is the corresponding output.

(a) (2 points)

$$(\beta_m, C_m) = \arg\min_{\beta, C} \left( \sum_{i=1}^{N} L[y_i, f_{m-1}(x_i) + \beta C(x_i)] \right),$$

where $L[y, g] = \exp(-yg)$ is the loss function. Show that $(\beta_m, C_m)$ can be written in the form

$$(\beta_m, C_m) = \arg\min_{\beta, C} \sum_{i=1}^{N} w_i^{(m-1)} \exp\{-\beta y_i C(x_i)\} \qquad (4)$$

Give an expression for $w_i^{(m-1)}$. Note that $w_i^{(m-1)}$ is the weight associated with the $i$-th data point after $m-1$ iterations.

Answer:

$$\beta_m, C_m = \arg\min_{\beta, C} \sum_{i=1}^{N} \exp\{-y_i f_{m-1}(x_i) - \beta y_i C(x_i)\} = \arg\min_{\beta, C} \sum_{i=1}^{N} w_i^{(m-1)} \exp\{-\beta y_i C(x_i)\}$$

$$w_i^{(m-1)} = \exp\{-y_i f_{m-1}(x_i)\}$$

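(b) (3 points) Express the optimal $C_m$ in the form $\arg\min_C Err(C)$, where $Err(C)$ is an error function and is independent of $\beta$. $Err(C)$ should be defined in terms of the indicator function $I[C(x_i) \neq y_i]$ given by

$$I[C(x_i) \neq y_i] = \begin{cases} 1, & \text{if } C(x_i) \neq y_i \\ 0, & \text{if } C(x_i) = y_i \end{cases}$$

Answer:

$$C_m = \arg\min_C \; e^{-\beta} \sum_{y_i = C(x_i)} w_i^{(m-1)} + e^{\beta} \sum_{y_i \neq C(x_i)} w_i^{(m-1)}$$

$$C_m = \arg\min_C \; (e^{\beta} - e^{-\beta}) \sum_{i=1}^{N} w_i^{(m-1)} I[y_i \neq C(x_i)] + e^{-\beta} \sum_{i=1}^{N} w_i^{(m-1)}$$

The second term does not depend on $C$, and the factor $e^{\beta} - e^{-\beta}$ is positive for $\beta > 0$, so

$$C_m = \arg\min_C \sum_{i=1}^{N} w_i^{(m-1)} I[y_i \neq C(x_i)]$$

(c) (3 points) Using the $C_m$ from part (a), the optimal expression for $\beta_m$ can be obtained as

$$\beta_m = \frac{1}{2} \log\left(\frac{1 - err_m}{err_m}\right), \qquad \text{where } err_m = \frac{\sum_{i=1}^{N} w_i^{(m)} I[y_i \neq C_m(x_i)]}{\sum_{i=1}^{N} w_i^{(m)}}$$

Now, find the update equation for $w_i^{(m)}$ and show that

$$w_i^{(m)} \propto w_i^{(m-1)} \exp\left(2\beta_m I[y_i \neq C_m(x_i)]\right)$$

Answer: From part (a), we have

$$w_i^{(m)} = w_i^{(m-1)} e^{-\beta_m y_i C_m(x_i)}$$

Since $y_i C_m(x_i) = 1 - 2 I[y_i \neq C_m(x_i)]$,

$$w_i^{(m)} = w_i^{(m-1)} e^{2\beta_m I[y_i \neq C_m(x_i)]} e^{-\beta_m},$$

and the factor $e^{-\beta_m}$ is the same for every $i$, which gives the proportionality.

(d) (4 points) We will use this algorithm to classify some simple faces. The set of training images is given in Fig. 5. $x_i$ is the face and $y_i$ is the corresponding face label, $\forall i \in \{1, 2, 3, 4, 5\}$. We are also given 3 classifier patches $p_1, p_2, p_3$ in Fig. 6. A patch detector $I^{(+)}(x, p_j)$ is defined as follows:

$$I^{(+)}(x, p_j) = \begin{cases} 1, & \text{if image } x \text{ contains patch } p_j \\ -1, & \text{otherwise} \end{cases} \qquad I^{(-)}(x, p_j) = -I^{(+)}(x, p_j)$$

All classifiers $C_m$ are restricted to belong to one of the 6 patch detectors, i.e. $C_m(x) \in \{I^{(\pm)}(x, p_1), I^{(\pm)}(x, p_2), I^{(\pm)}(x, p_3)\}$. If $C_1(x) = I^{(+)}(x, p_2)$, $w_i^{(0)} = 1$ $\forall i \in \{1, 2, 3, 4, 5\}$ and $\beta_1 = 1$,
i. What is the optimal $C_2(x)$?
ii. What are the updated weights $w_i^{(1)}$?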

[Figure 5: Training Set Faces. Faces $x_1, x_2, x_3$ have labels $y_1 = y_2 = y_3 = +1$; faces $x_4, x_5$ have labels $y_4 = y_5 = -1$.]

[Figure 6: Classifier patches $p_1, p_2, p_3$]

iii. What is the final classifier $f_2(x)$ combining $C_1, C_2$?
iv. Does $I[f_2(x) > 0]$ correctly classify all training faces?

Answer:

$C_2(x) = I^{(+)}(x, p_3)$

$w_i^{(1)} \propto \exp(2\beta_2)$ for $i = 1, 5$; $w_i^{(1)} \propto 1$ for $i = 2, 3, 4$, where $\beta_2 = 0.5 \log(1.5)$

$f_2(x) = C_1(x) + 0.5 \log(1.5) \, C_2(x)$

Yes, it correctly classifies all the training images.
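One boosting round, as derived in parts (a) to (c), can be sketched as follows. The face images and patches from Figures 5 and 6 are not reproduced here, so the weak classifier's predictions below are hypothetical; only the labels follow the figure caption. The function names are likewise assumptions.

```python
import math


def weighted_error(weights, labels, preds):
    """Fraction of the total weight that sits on misclassified samples."""
    missed = sum(w for w, y, p in zip(weights, labels, preds) if y != p)
    return missed / sum(weights)


def beta_from_error(err):
    """Optimal coefficient for the exponential loss: 0.5 * log((1 - err) / err)."""
    return 0.5 * math.log((1.0 - err) / err)


def update_weights(weights, labels, preds, beta):
    """Multiply each weight by exp(-beta * y_i * C(x_i))."""
    return [w * math.exp(-beta * y * p) for w, y, p in zip(weights, labels, preds)]


labels = [1, 1, 1, -1, -1]     # labels as listed for Figure 5
preds = [-1, 1, 1, -1, 1]      # hypothetical weak classifier, wrong on samples 1 and 5
weights = [1.0] * 5            # uniform initial weights

err = weighted_error(weights, labels, preds)
beta = beta_from_error(err)
new_weights = update_weights(weights, labels, preds, beta)

# Misclassified samples end up heavier than correct ones by a factor exp(2 * beta).
ratio = new_weights[0] / new_weights[1]
```

With two of five equally weighted samples misclassified, the error is 0.4 and the weight ratio between a misclassified and a correctly classified sample is (1 - 0.4) / 0.4 = 1.5.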

4 Short Answer (36 points)

14. (6 points) Parallel Lines under Perspective Transforms

[Figure 7: Boxes rendered using different projections]

(a) (2 points) The two boxes in Figure 7 represent the same 3D shape rendered using two projective techniques. Explain their different appearance and the types of projections used to map the objects to the image plane.
Answer: The figure on the left is an orthographic projection, in which parallel lines remain parallel. The figure on the right is a perspective projection, in which parallel lines at an angle to the camera plane have a vanishing point.

(b) (2 points) For each projection, if the edges of the cubes were to be extended to infinity, how many intersection points would there be?
Answer: Left, none; right, 3 vanishing points.

(c) (1 point) What is the maximum number of vanishing points that are possible for an arbitrary image?
Answer: There is no limit. Any set of parallel lines at an angle to the camera will converge at a vanishing point.

(d) (1 point) How would you arrange parallel lines so that they do not appear to have a vanishing point?
Answer: Place the lines parallel to the camera plane; they will converge at the point at infinity.

15. (6 points) Using RANSAC to find circles

Suppose we would like to use RANSAC to find circles in $\mathbb{R}^2$. Let $D = \{(x_i, y_i)\}_{i=1}^{n}$ be our data, and let $I$ be the random seed group of points used in RANSAC.

(a) (2 points) The next step of RANSAC is to fit a circle to the points in $I$. Formulate this as an optimization problem. That is, represent fitting a circle to the points as a problem of the form

minimize $L(x_i, y_i, c_x, c_y, r)$

where $L$ is a function for you to determine which gives the distance from $(x_i, y_i)$ to the circle with center $(c_x, c_y)$ and radius $r$.
Answer:

$$L(x, y, c_x, c_y, r) = \left| \sqrt{(x - c_x)^2 + (y - c_y)^2} - r \right|$$

(b) (2 points) What might go wrong in solving the problem you came up with in (a) when $I$ is too small?
Answer: The problem is underdetermined. With e.g. 2 points there are infinitely many circles one can fit perfectly.

(c) (2 points) The next step in our RANSAC procedure is to determine what the inliers are, given the circle $(c_x, c_y, r)$. Using these inliers we refit the circle and determine new inliers in an iterative fashion. Define mathematically what an inlier is for this problem. Mention any free variables.
Answer: An inlier is a point $(x, y)$ such that

$$\left| \sqrt{(x - c_x)^2 + (y - c_y)^2} - r \right| \le T$$

for some threshold $T$.

16. (6 points) Fast forward camera

[Figure 8: Camera movement]

Suppose you capture two images $P$ and $P'$ in pure translation in the $Z$ direction, as shown in Figure 8. The image planes are parallel to the $XY$ plane.

(a) (3 points) Suppose the center of an image is (0, 0). For a point $(a, b)$ on image $P$, what is the corresponding epipolar line on image $P'$?
Answer: The line goes through (0, 0) and $(a, b)$ on image $P'$.
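The epipolar geometry of the forward-moving camera can be checked with a short sketch. It assumes a calibrated camera with no rotation, so the essential matrix reduces to the skew-symmetric matrix of the translation; the unit translation along Z, the sample point, and the helper names are assumptions for illustration.

```python
def skew(t):
    """Skew-symmetric matrix [t]_x such that [t]_x v equals the cross product t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def matvec(M, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Pure translation along Z with identity rotation: E = [t]_x (defined up to scale).
E = skew((0.0, 0.0, 1.0))

# Hypothetical image point (a, b) in the first image, in homogeneous form.
a, b = 3.0, 2.0
line = matvec(E, [a, b, 1.0])     # epipolar line in the second image

# A homogeneous point lies on the line when the dot product is zero.
through_centre = dot(line, [0.0, 0.0, 1.0])
through_point = dot(line, [a, b, 1.0])
```

Both dot products vanish, so the epipolar line passes through the image centre (the epipole for forward motion) and through (a, b), which is why the epipolar lines of a forward-moving camera radiate from the centre.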

(b) (3 points) What is the essential matrix in this case, assuming the camera is calibrated?
Answer:

$$E = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

17. (6 points) K-Means

[Figure 9: A wild jackalope]

(a) (3 points) What is likely to happen if we run k-means to cluster pixels when we only represent pixels by their location? With k = 4, draw the boundary around each cluster and mark each cluster center with a point for some clusters that might result when running k-means to convergence. Draw on Figure 9, and set the initial cluster centers to be the four corners of the image.
Answer: For the image, we expect each quadrant to be its own cluster.

(b) (1 point) What does this tell us about using pixel locations as features?
Answer: It is not sufficient; we need richer features.

(c) (2 points) We replace the sum of squared distances of all points to the nearest cluster center criterion in k-means with the sum of absolute distances of all points to the nearest cluster center, i.e. our distance is now given by $d(x_1, x_2) = \|x_1 - x_2\|_1$. How would the update step change for finding the cluster center?
Answer: At every iteration
1. Assign points to the closest cluster center

2. Update the cluster centers with the median of points belonging to a cluster

18. (6 points) Canny Edge Detector

(a) (4 points) There is an edge detected using the Canny method. This detected edge is then rotated by $\theta$ as shown in Figure 10, where the relation between a point on the original edge $(x, y)$ and a point on the rotated edge $(x', y')$ is given by

$$x' = x \cos\theta \qquad (5)$$
$$y' = x \sin\theta \qquad (6)$$

Will the rotated edge be detected using the Canny method? Provide either a mathematical proof or a counter example.

[Figure 10: Edge Rotated by $\theta$]

Answer: Our rotation is given by $x' = x \cos\theta$, $y' = x \sin\theta$. Our Canny edge depends on the magnitude of the derivative, which is the only part of the algorithm which could have really changed. This is given by

$$D_{x'x'}^2 + D_{y'y'}^2 = \cos^2\theta \, D_{xx}^2 + \sin^2\theta \, D_{xx}^2 = D_{xx}^2,$$

which is the same rule as for the original edge. Thus we have shown that the Canny method is rotationally invariant.

(b) (2 points) After running the Canny edge detector on an image, you notice that long edges are broken into short segments separated by gaps. In addition, some spurious edges appear. For each of the two thresholds (low and high) used in hysteresis thresholding, state how you would adjust the threshold (up or down) to address both

problems. Assume that a setting exists for the two thresholds that produces the desired result. Explain your answer very briefly.
Answer: The gaps in the long edges require a lower low threshold: parts of the long edge are detected, so the high threshold is low enough for these edges, but the edges are disconnected because the low threshold is too high. Lowering the low threshold will include more pixels of the long edges. Eliminating the spurious edges requires a higher high threshold. The high threshold should be increased only slightly, so as not to make the long edges disappear. The assumption in the problem statement ensures that this is possible.

19. (6 points) Cascaded Hough transform for detecting vanishing points

[Figure 11: Hough Transform. Axes: X AXIS (0 to 5), Y AXIS (0 to 3).]

For this problem we are going to use the slope $m$, intercept $c$ representation of a line, $y = mx + c$. The attached Figure 11 shows the vertices of a rectangular patch under perspective transformation. We wish to find the vanishing point in the image through the Hough transform.

(a) (2 points) Plot the Hough transform representation of the image. Assume no binning and make plots in a continuous $(m, c)$ space. Just show the points with two or more votes.
Answer: See Figure 12.

(b) (2 points) Now, using the $y = mx + c$ representation, run the Hough transform on the results from part (a) (after using a threshold of 2 votes) to get a representation in the $(x, y)$ space again.

[Figure 12: Make your plot here. Axes: slope (m), intercept (c).]

Answer: The same plot as the plot in the problem statement, with an intersection at (2.5, 1.75) and diagonals of the trapezoid (do not deduct points if diagonals are not shown).

(c) (2 points) Find the vanishing point from the representation in part (b).
Answer: The vanishing point is (2.5, 2.5).
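The cascade can be sketched in a few lines. Figure 11 is not reproduced in this transcription, so the four vertex coordinates below are an assumption: they were chosen because they are consistent with the two intersection points quoted in the answers. The function names are hypothetical as well.

```python
from itertools import combinations


def line_through(p, q):
    """Slope-intercept (m, c) of the line through two points, or None if vertical."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    m = (y2 - y1) / (x2 - x1)
    return m, y1 - m * x1


def intersect(l1, l2):
    """Intersection of two (m, c) lines, or None if they are parallel."""
    (m1, c1), (m2, c2) = l1, l2
    if m1 == m2:
        return None
    x = (c2 - c1) / (m1 - m2)
    return x, m1 * x + c1


# Assumed trapezoid vertices (not taken from the figure itself).
bottom_left, bottom_right = (1.0, 1.0), (4.0, 1.0)
top_right, top_left = (3.0, 2.0), (2.0, 2.0)
vertices = [bottom_left, bottom_right, top_right, top_left]

# First pass: every pair of vertices defines a line, i.e. one point in (m, c)
# space that receives two votes.
hough_points = [line_through(p, q) for p, q in combinations(vertices, 2)]

# Second pass: each (m, c) point maps back to the line y = m x + c, and
# lines that meet in (x, y) space reveal the points of interest.
left_side = line_through(bottom_left, top_left)
right_side = line_through(bottom_right, top_right)
vanishing_point = intersect(left_side, right_side)

diagonal_1 = line_through(bottom_left, top_right)
diagonal_2 = line_through(bottom_right, top_left)
diagonal_crossing = intersect(diagonal_1, diagonal_2)
```

The top and bottom edges are parallel to the x-axis under this assumption, so `intersect` returns `None` for them and they contribute no finite vanishing point.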