STAT432 Mini-Midterm Exam I (green) University of Illinois Urbana-Champaign February 25 (Monday), 2019 10:00-10:50a SOLUTIONS




Question: 1 (6.5 points) | 2 (3.5 points) | 3 (10 points) | extra-credit (1 point) | Total (21 points)
Points:

Question 1: Multiple Choices

Choose ALL correct statements for each problem.

(1) [1 pt] Which of the following is the instructor of this course?
(a) (b) (c)
Answer: (b).

(2) [1 pt] What did the instructor do when he once forgot to bring a marker to class?
(a) He borrowed one from the class.
(b) He used the document camera with a pen and paper.
(c) He never used markers on the board.
(d) He always brings markers to the class.

Answer: (b).

(3) [1.5 pts] Which of the following is (are) true regarding the bias-variance trade-off?
(a) A linear model does not involve the bias-variance trade-off if the true underlying model is indeed linear.
(b) In linear regression, a smaller number of covariates leads to larger variance.
(c) For kNN, a larger k means smaller variance.
(d) For Lasso, a larger λ means smaller variance.
Answer: (c), (d).

(4) [1.5 pts] Which of the following is an unsupervised model?
(a) kNN for classification
(b) k-means clustering
(c) Lasso
(d) AIC
(e) Ridge regression
Answer: (b).

(5) [1.5 pts] Use lines to connect the figures with the correct descriptions.
(a) lasso
(b) ridge regression
(c) elastic net
Answer: 1 (b); 2 (c); 3 (a).

Question 2: Proof

[3.5 pts] Prove that the eigenvalues of $X^T X$ are the squares of the singular values of $X$. (For the SVD $X = U D V^T$, the singular values are the diagonal entries of $D$.)
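The claim in Question 2 can be sanity-checked numerically. The sketch below is not part of the exam: it uses a hypothetical random matrix `X` and the base R functions `svd` and `eigen`, and it only illustrates the statement for one example rather than proving it.

```r
# Hypothetical example data: a random 6 x 3 matrix (not taken from the exam).
set.seed(432)
X <- matrix(rnorm(18), nrow = 6, ncol = 3)

# Singular values of X, returned in decreasing order.
d <- svd(X)$d

# Eigenvalues of the symmetric matrix X^T X, also in decreasing order.
ev <- eigen(crossprod(X), symmetric = TRUE)$values

# Print the two vectors side by side; they should agree up to rounding error.
print(cbind(singular_values_squared = d^2, eigenvalues = ev))
agree <- isTRUE(all.equal(d^2, ev))
print(agree)
```

Here `crossprod(X)` computes `t(X) %*% X`, and `all.equal` compares the two vectors up to floating-point tolerance instead of requiring exact equality.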

First, we have the SVD $X = U D V^T$, where $U$ and $V$ are orthogonal matrices. We can also write the eigendecomposition $X^T X = V \tilde{D} V^T$, where $\tilde{D}$ is a diagonal matrix whose entries are the eigenvalues of $X^T X$. Substituting the SVD into $X^T X$ gives

$$X^T X = V D U^T U D V^T = V D^2 V^T,$$

because $U^T U = I$ by the orthogonality of $U$. Comparing with the eigendecomposition, we have

$$\tilde{D} = D^2.$$

Therefore, we have completed the proof.

Question 3: Calculation

Suppose there are 5 observations with 2 covariates. They are currently assigned to cluster C, shown below. For this question, you don't have to show all the detailed calculations, as long as your answer is correct.

obs  x1  x2  C

(1) [2.5 pts] Plot the points.
(2) [2.5 pts] Based on the k-means clustering algorithm, what is the cluster assignment C after the next iteration? What are the corresponding cluster means?
(3) [2.5 pts] Will C and the cluster means be updated again? If so, give the new values; if not, give a brief explanation.
(4) [2.5 pts] Is this the best possible clustering result? Briefly explain your answer.

(Extra-credit 1 pt) Do you have any comments (0.5 pts) and suggestions (0.5 pts) about this course?

Stat 432 Mini-Midterm I, 10:00-10:50a, Feb 25, 2019

Question 3 (understand k-means)

(1). [2.5 points] Plot the points.

[Figure: scatter plot of the five observations; axes x1 and x2.]

(2). [2.5 points] We include the two updates in a function kmeans_1step defined below. Then we run one iteration of those two steps and output the results.

    # pairwise distance function
    # cited from
    pdist <- function(a, b) {
      an = apply(a, 1, function(rvec) crossprod(rvec, rvec))
      bn = apply(b, 1, function(rvec) crossprod(rvec, rvec))
      m = nrow(a)
      n = nrow(b)
      tmp = matrix(rep(an, n), nrow = m)
      tmp = tmp + matrix(rep(bn, m), nrow = m, byrow = TRUE)
      sqrt(tmp - 2 * tcrossprod(a, b))
    }

    kmeans_1step = function(C, x, sync = FALSE, plot = FALSE) {
      # given the cluster assignment, update the cluster means
      cntrs = sort(unique(C))
      K = length(cntrs)
      m = matrix(0, K, dim(x)[2])
      # drop = FALSE keeps a one-point cluster as a matrix so colMeans still works
      for (k in 1:K) m[k, ] = colMeans(x[C == cntrs[k], , drop = FALSE])
      # given the cluster means, update the cluster assignment
      pdist_xm = pdist(x, m)
      C = apply(pdist_xm, 1, which.min)
      if (sync) {
        # synchronize the results
        for (k in 1:K) m[k, ] = colMeans(x[C == cntrs[k], , drop = FALSE])
      }
      if (plot) {
        plot(x[, 1], x[, 2], col = C, pch = 19)
        points(m[, 1], m[, 2], col = cntrs, pch = 4)
        text(x[, 1], x[, 2], 1:length(C), pos = 3)
      }
      return(list(cluster = C, centers = m, dist2cntr = pdist_xm))
    }

    # initialization
    x <- cbind(x1, x2)
    C <- C0
    # now we run one iteration and output the result
    upd <- kmeans_1step(C, x, TRUE, TRUE)

[Figure: the observations colored by updated cluster and labeled by index, with cluster centers marked by crosses; axes x[, 1] and x[, 2].]

    # the centers of the clusters
    (m <- upd$centers)
    ##      [,1] [,2]
    ## [1,]
    ## [2,]
    # the cluster assignment
    (C <- upd$cluster)
    ## [1]

Focus on the ideas and steps.

(3). [2.5 points] Now we iterate the two-step updates in (2) until the cluster assignment does not change. Then we output the final result.

    # iterate until the cluster assignment does not change
    num_iter = 0
    while (1) {
      C0 <- C
      upd <- kmeans_1step(C, x)
      C <- upd$cluster
      num_iter <- num_iter + 1
      if (identical(C0, C)) break
    }
    # output the final result
    C
    ## [1]
    plot(x1, x2, col = C, pch = 19)

[Figure: the observations colored by final cluster assignment; axes x1 and x2.]

In this instance, the algorithm converges in 1 iteration. No, C and the cluster means will not be updated after (2) in this case, because they have already stabilized: another iteration leaves both unchanged.

(4). [2.5 points] Now we generate another random initialization of C and repeat the steps.

    # record the results in (3)
    C1 <- C
    set.seed(1)
    n <- nrow(x)  # number of observations
    # generate another initial value of the cluster assignments
    (C = sample(1:2, n, replace = TRUE))
    ## [1]
    # repeat the kmeans updates in (3)
    num_iter = 0
    while (1) {
      C0 <- C
      upd <- kmeans_1step(C, x)
      (C <- upd$cluster)
      num_iter <- num_iter + 1
      if (identical(C0, C)) break
    }
    # new cluster assignment
    C
    ## [1]
    # plot results on the same graph
    plot(x1, x2, col = C1, pch = 19)
    points(x1, x2, col = C, pch = 1, cex = 2)
    legend('bottomright', legend = c("init-asgn 1", "init-asgn 2"), pch = c(19, 1))

[Figure: cluster assignments from the two initializations on the same plot; legend: init-asgn 1 (filled circles), init-asgn 2 (open circles); axes x1 and x2.]

In this instance, the algorithm converges in 2 iterations.

Note that the k-means algorithm depends on the initial cluster assignment, so different initializations may give different results, as illustrated here. However, if you change the random seed to 10, you get the same results as in (2). Again, there is no unique answer. Unless we enumerate all possible assignments, we never know which is the best. In other words, the cluster assignment obtained in (3) may not be the best.
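For a problem this small, enumerating every possible assignment is feasible: with 5 observations and 2 clusters there are only 2^5 = 32 label vectors. The sketch below illustrates the idea under stated assumptions. The data are hypothetical (the exam's values are not reproduced here), and the helper `wcss` and the names `all_C`, `scores`, `best_C`, and `best_wcss` are introduced only for illustration. It scores each assignment by the total within-cluster sum of squares and compares the best one with base R's `kmeans` run from several random starts.

```r
# Hypothetical data: 5 observations with 2 covariates (NOT the exam's data).
x <- cbind(x1 = c(1, 2, 2, 6, 7), x2 = c(1, 1, 3, 5, 6))
n <- nrow(x)

# Total within-cluster sum of squares for an assignment vector C.
wcss <- function(C, x) {
  total <- 0
  for (k in unique(C)) {
    xk <- x[C == k, , drop = FALSE]      # points in cluster k
    ctr <- colMeans(xk)                  # cluster mean
    total <- total + sum(sweep(xk, 2, ctr)^2)
  }
  total
}

# All 2^n label vectors; keep only those that use both clusters.
all_C <- as.matrix(expand.grid(rep(list(1:2), n)))
uses_both <- apply(all_C, 1, function(r) length(unique(r)) == 2)
all_C <- all_C[uses_both, , drop = FALSE]

# Score every assignment and pick the best one.
scores <- apply(all_C, 1, wcss, x = x)
best_C <- all_C[which.min(scores), ]
best_wcss <- min(scores)

# Compare with base R's kmeans using 20 random starts.
set.seed(1)
fit <- kmeans(x, centers = 2, nstart = 20)

print(best_C)
print(c(exhaustive = best_wcss, kmeans = fit$tot.withinss))
```

Exhaustive search scales as K^n, so it is practical only for toy examples; in practice, running `kmeans` with a larger `nstart` is the usual way to reduce the dependence on the initialization, although it does not guarantee the global optimum.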
