Lab #2 - ACS I

Part I - DATA COMPRESSION in IMAGE PROCESSING using SVD


Goals

The goal of the first part of this lab is to demonstrate how the SVD can be used to remove redundancies in data; in this example we will be compressing image data. We will see that for a matrix of rank r, the SVD can give a rank p < r approximation to the matrix, and that this approximation is the one that minimizes the Frobenius norm of the error.

Introduction

Any image from a digital camera, scanner or in a computer is a digital image. A real-world color image is digitized by converting the image to numerical data. A pixel is the smallest element of the digital image. For example, a 3 megapixel camera has a grid of 2048 × 1536 = 3,145,728 pixels. Since the size of a digitized image is dimensioned in pixels, say m rows and n columns, it is easy for us to think of the image as an m × n matrix. However, each pixel of a color image has an RGB value (red, green, blue) which is represented by three numbers. The composite of the three RGB values creates the final color for the single pixel. So we can think of each entry in the m × n matrix as having three numerical values stored in that location, i.e., an m × n × 3 array.

Now suppose we have a digital image taken with a 3 megapixel camera and each color pixel is determined by a 24-bit number (8 bits each for the intensity of red, green and blue). Then the information we have is roughly 3,145,728 × 24 ≈ 75 million bits. However, when we print the picture suppose we only use 8-bit color, giving 2^8 = 256 colors. We are still using 3 million pixels, but the information used to describe the image has been reduced to 3,145,728 × 8 ≈ 25 million bits, i.e., a reduction to one-third of the original. This is an example of image compression.

In the figure below we give a grayscale image of the moon's surface (the figure on the left) along with two different compressed images. Clearly the center image is not acceptable, but the compressed image on the right retains most of the critical information. We want to investigate using the SVD for doing data compression in image processing.

Figure 1: The image on the left is the original image while the other two images represent the results of data compression.

Understanding the SVD

Recall from the notes that the SVD is related to the familiar result that any n × n real symmetric matrix can be made orthogonally similar to a diagonal matrix, which gives us the decomposition A = QΛQ^T, where Q is orthogonal and Λ is a diagonal matrix containing the eigenvalues of A.
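As a quick illustration of this connection, the following MATLAB sketch (with an arbitrary symmetric test matrix of our own choosing) checks the standard fact that, for a symmetric matrix, the singular values are the absolute values of the eigenvalues.

    % Illustrative sketch: eigendecomposition vs. SVD for a symmetric matrix.
    % The test matrix here is an arbitrary choice, not one from the lab.
    A = [2 1 0; 1 3 1; 0 1 2];                 % symmetric test matrix
    [Q, Lambda] = eig(A);                      % A = Q*Lambda*Q'
    s = svd(A);                                % singular values, nonincreasing
    disp(sort(abs(diag(Lambda)), 'descend'))   % matches s below
    disp(s)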

The SVD provides an analogous result for a general nonsymmetric, rectangular m × n matrix A. For the general result two different orthogonal matrices are needed for the decomposition. We have that A can be written as A = UΣV^T, where Σ is an m × n diagonal matrix and U, V are orthogonal.

Recall from the notes that the SVD of an m × n matrix A can also be viewed as writing A as a sum of rank one matrices. Recall that for any nonzero vectors x, y the outer product x y^T is a rank one matrix, since each row of the resulting matrix is a multiple of every other row. The product of two matrices can be written in terms of outer products. In particular, the product B C^T of two matrices can be written as the sum of the outer products ∑_i b_i c_i^T, where b_i, c_i denote the ith columns of B and C, respectively. Hence if we let u_i denote the ith column of U and v_i the ith column of V, then writing the decomposition A = UΣV^T as a sum of outer products we get

    A = σ_1 u_1 v_1^T + σ_2 u_2 v_2^T + ... + σ_r u_r v_r^T = ∑_{i=1}^{r} σ_i u_i v_i^T,    (1)

where r denotes the rank of A and σ_i is the ith singular value of A. Thus our expression for A is a sum of rank one matrices. If we truncate this series after p terms, then we have an approximation to A which has rank p. What is remarkable is that it can be shown that this rank p matrix is the best rank p approximation to A measured in the Frobenius norm. Recall that the Frobenius norm is just the matrix analogue of the standard Euclidean length, i.e.,

    ||A||_F = ( ∑_{i,j} A_{ij}^2 )^{1/2},

where A_{ij} denotes the (i, j) entry of the matrix A. However, it is NOT an induced matrix norm.

We can get a result for the error that is made by approximating a rank r matrix by a rank p approximation. Clearly the error E_p is given by

    E_p = A − ∑_{i=1}^{p} σ_i u_i v_i^T = ∑_{i=p+1}^{r} σ_i u_i v_i^T.    (2)

Due to the orthogonality of U and V we can write

    ||E_p||_F^2 = ∑_{i=p+1}^{r} σ_i^2,

and so a relative error measure can be computed from

    [ ∑_{i=p+1}^{r} σ_i^2 / ∑_{i=1}^{r} σ_i^2 ]^{1/2}.
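The following MATLAB sketch (using a random test matrix of our own choosing) verifies (1) and (2) numerically: the rank p truncation is formed from the first p singular triplets, and its Frobenius error equals the square root of the sum of the squares of the discarded singular values.

    % Verify the truncated-SVD error formula on a random test matrix.
    m = 8; n = 6; p = 3;
    A = rand(m, n);
    [U, S, V] = svd(A);
    s = diag(S);                              % singular values, nonincreasing
    Ap = U(:,1:p) * S(1:p,1:p) * V(:,1:p)';   % rank-p approximation
    err_direct  = norm(A - Ap, 'fro');        % error computed directly
    err_formula = sqrt(sum(s(p+1:end).^2));   % ||E_p||_F from (2)
    rel_err = err_formula / sqrt(sum(s.^2));  % relative error measure
    fprintf('%g  %g  %g\n', err_direct, err_formula, rel_err)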

Data compression using the SVD

How can the SVD help us with our data compression problem? Remember that we can view our digitized image as an array of mn values, and we want to find an approximation that captures the most significant features of the data. Recall that the rank of an m × n matrix A tells us the number of linearly independent columns of A; this is essentially a measure of its non-redundancy. If the rank of A is small compared with n, then there is a lot of redundancy in the information. We would expect that an image with large-scale features would possess redundancy in the columns or rows of pixels, and so we would expect to be able to represent it with less information.

For example, suppose we have the simple case where the rank of A is one; i.e., every column of A is a multiple of a single basis vector, say u. Then if the ith column of A is denoted a_i and a_i = k_i u, then A = u k^T, i.e., the outer product of the two vectors u and k, where k = (k_1, k_2, ..., k_n)^T. If we can represent A by a rank one matrix, then all we need to specify are the vectors u and k; that is, m + n entries as opposed to mn entries for the full matrix A. Of all the rank one matrices we want to choose the one which best approximates A in some sense; if we choose the Frobenius norm to measure the error, then the SVD of A will give us the desired result. Of course, in most applications the original matrix A is of higher rank than one, so a rank one approximation would be very crude. In general we seek a rank p approximation to A such that

    || A − ∑_{i=1}^{p} σ_i u_i v_i^T ||_F

is minimized. It is important to remember that the singular values in Σ are ordered so that they are nonincreasing. Consequently, if the singular values decrease rapidly, then we would expect that fewer terms in the expansion of A in terms of rank one matrices would be needed.

Computational Algorithms

In this lab we will treat the software for the SVD as a black box and assume that the results are accurate. You can either use the LAPACK SVD routine dgesvd or the MATLAB commands svd and svds. The LAPACK routine can be downloaded from netlib (www.netlib.org). The interested student is referred to Golub and Van Loan's book for a description of the algorithm used to obtain the SVD.

Test image libraries for use in image compression are maintained by several institutions. Here we use the ones from the University of Southern California. In addition to the SVD algorithm, we will need routines to generate the image chart (i.e., our matrix) from an image and to generate an image from our approximation. There are various ways to do this. One of the simplest approaches is to use the MATLAB commands

    imread  - reads an image from a graphics file
    imwrite - writes an image to a graphics file
    imshow  - displays an image

Specifics of the image processing commands can be found in Matlab's technical documentation on importing and exporting images, or from the online help command. When you use the imread command the output is a matrix in unsigned integer format (uint8). You should convert this to double precision (in Matlab, double) before writing to a file or using commands such as svds. However, the imshow command wants the uint8 format. You should learn the difference between the Matlab commands svd and svds.
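As a concrete illustration of this workflow, the sketch below reads a grayscale image, converts it to double precision, computes a rank p approximation with svds, and displays the result next to the original. The file name and the value of p are placeholders of our own choosing; any grayscale image file will do.

    % Sketch: rank-p SVD compression of a grayscale image.
    % 'moon.tiff' and p = 32 are illustrative placeholders.
    p  = 32;
    A8 = imread('moon.tiff');       % uint8 matrix with entries 0..255
    A  = double(A8);                % convert before calling svds
    [U, S, V] = svds(A, p);         % p largest singular triplets
    Ap = U * S * V';                % rank-p approximation
    figure, imshow(A8),        title('original')
    figure, imshow(uint8(Ap)), title('rank-p approximation')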

Exercises

1. The purpose of this problem is to make sure you are using the SVD algorithm (either from Matlab or netlib) correctly. First, make sure that you reproduce the SVD of the given test matrix A in the form A = UΣV^T. Note that Matlab gives you V, although the decomposition uses the transpose of V. Next compute rank 1 and rank 2 approximations to A and determine the error in the Frobenius norm by (i) calculating the difference of A and its approximation and then computing its Frobenius norm, and (ii) using the singular values.

2. In this problem we want to download an image from USC to use. Choose the image "stream and bridge". Create an integer matrix chart representing this image. Your image chart should be a matrix with entries between 0 and 255. View your image (for example with imshow) to make sure you have read it in correctly.

a. Write code to create the matrix chart for an image and to do a partial SVD decomposition; you should read in the rank of the approximation you are using as well as the image file name. Use your code to determine and plot the first 150 singular values for the SVD of this image. What do the singular values imply about the number of terms we need to approximate the image?

b. Modify your code to determine approximations to the image using rank 8, 16, 32, 64 and 128 approximations. Display your results as images along with the original image and discuss the quality of the various images.

c. Now suppose we want to determine a reduced rank approximation of our image so that the relative error (measured in the Frobenius norm) is no more than 0.5%. Determine the rank of such an approximation and display your approximation. Compare the storage required for this data compression with the full image storage.

3. In this problem we will use the color image "mandrill" from the same USC site. Now each pixel in the image is represented by three RGB values, and so the output of imread is a three-dimensional array.

a. Plot the first 150 singular values and discuss the implications.

b. Obtain rank 8, 16, 32, 64 and 128 approximations to your image. Display and compare your results.

c. What is the lowest rank approximation to your image that you feel is an adequate representation in the "eyeball norm"? How does this compare with your interpretation of your results in (a)?

d. Repeat (a)-(c) with your favorite image from the USC website or one of your own.
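For part (c) of exercise 2, one possible approach is to apply the relative error formula (2) to cumulative sums of the squared singular values; a sketch, assuming the full set of singular values has already been computed and stored in a vector s (e.g., s = svd(A)):

    % Given singular values s (nonincreasing), find the smallest rank p
    % whose relative Frobenius error is at most 0.5%.
    tol    = 0.005;
    total  = sum(s.^2);
    tail   = total - cumsum(s.^2);         % sum_{i>p} s_i^2 for p = 1, 2, ...
    relerr = sqrt(max(tail, 0) / total);   % guard against roundoff
    p = find(relerr <= tol, 1);            % smallest admissible rank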

Part II - DATA COMPRESSION in IMAGE PROCESSING USING CLUSTERING

Goals

To investigate a clustering algorithm and apply it to image compression. This will expose you to another approach to data compression than the SVD approach in the first part of this lab.

Introduction

Although we haven't studied clustering methods, you have probably heard talks where Centroidal Voronoi Tessellations (CVTs) or K-means are used. K-means is a well-known algorithm for clustering objects. When we have a discrete set of data we can also view CVTs as a clustering algorithm, which is equivalent to K-means in this case.

A Voronoi tessellation {V_i}_{i=1}^{K} of a region associated with a set of points (or generators) {z_i}_{i=1}^{K} is the decomposition of the region into subregions with the property that all points in V_i are closer to z_i than to any other generator. A CVT is a Voronoi tessellation where each generator z_i is also the center of mass of V_i with respect to a given density function. Lloyd's Method is an iterative method for constructing CVTs; however, as described below, it is computationally costly. The method is outlined in the following steps:

Lloyd's Method. Given a set of initial points {z_i}_{i=1}^{K}, a density function ρ, and a metric or distance function:

1. Construct the Voronoi tessellation {V_i}_{i=1}^{K} associated with the points {z_i}_{i=1}^{K};

2. Determine the centers of mass of the Voronoi regions {V_i}_{i=1}^{K}; set these points to be the new generators;

3. If convergence has not been achieved, return to (1).

Determining the centers of mass is easy; however, the construction of the Voronoi tessellation is quite costly. An alternative to Lloyd's Method is to take a probabilistic viewpoint. Instead of actually constructing the Voronoi tessellation, we sample the region with a random point w and determine which generator z_i is closest to the given point. After sampling with many points, instead of a Voronoi region we have a set of points in each Voronoi region. We then take as the new generators the average of the points in each cluster, or alternately a weighted average of the old generators and the corresponding cluster averages. If we sample with enough points, then this should be a reasonable approximation to the Voronoi regions. Note that this method can easily be parallelized. Specifically, the probabilistic Lloyd's method is given by the following steps (a MATLAB sketch specialized to image quantization appears after the exercises below).

Probabilistic Lloyd's Method. Given a set of initial generators {z_i}_{i=1}^{K}, a density function ρ, a metric or distance function, and a number of sampling points N:

1. For i = 1, ..., N sample with a random point w_i in the domain; determine k such that z_k is closest to w_i; adjust the kth cluster to include the point w_i and increment the counter for the number of points in that cluster;

2. Determine the average of each discrete cluster; these points are the new generators;

3. If convergence has not been achieved, return to (1).

The most time consuming part of this algorithm is determining the generator which is closest to the random point w_i. You can implement a brute force approach to do this, or a more sophisticated one. For the stopping criterion to determine whether convergence has been reached, there are various choices. Since at convergence the generators stop moving, we can simply check

    (1/K) ∑_{i=1}^{K} || z_i^{n+1} − z_i^n || ≤ tolerance,    (1)

where z_i^n denotes the nth iterate of the algorithm for the generator z_i and || · || represents the metric we are using to determine the generator nearest a point.

To display a Voronoi tessellation in two dimensions, various software is available. For this lab, it is probably easiest to use the MATLAB command voronoi.

Using Clustering for Image Compression

If we have a color image, we know that each pixel is represented by three RGB values, creating a myriad of colors. Our strategy now is to choose just a few colors to represent the picture. An obvious application of this data compression is when you print an image using a color printer with many fewer colors than are available on your computer. After we choose these colors, the image chart for the picture must be modified so that each color is replaced by the new color that is closest to it in color space. We can use K-means, or equivalently a discrete CVT, to accomplish this image compression.

For example, suppose we have a grayscale image and decide that we want to represent it with 32 shades of gray. Our job is to find which 32 gray levels best represent the image. We initiate our probabilistic Lloyd's algorithm with 32 generators which are numbers between 0 and 255; we can simply choose the generators randomly. In Lloyd's algorithm we need to sample the space, which in our application means sampling the image, i.e., choosing a random pixel. If the image is not too large, then we can simply sample every pixel in the image. We then proceed with the algorithm until convergence is attained. After convergence is achieved we know the best 32 gray levels to represent our image, so our final step is to replace each color in our original matrix representation of the image with the converged centroid of the cluster it is in. For this application we will just use a constant density function and the standard Euclidean distance for our metric.

Exercises

1. In this problem you will generate a CVT and plot it so you can make sure your algorithm is working correctly before we proceed to the image compression. Write a code to implement the probabilistic Lloyd's Method for a region which is an n-dimensional box and calculate the cluster variance. Test your code by generating a CVT diagram in the region (0, 2) × (0, 2) using 100 generators. Use (i) 10 sampling points per generator, (ii) 100 sampling points per generator, and (iii) 1000 sampling points per generator; for each case display your tessellation using, for example, the MATLAB command voronoi. Use a maximum number of iterations (300) and the stopping criterion described above. Tabulate the number of iterations required for each case. Plot the cluster variance for each iteration. What conclusions can you draw from your results?

2. Use the color image (mandrill.tiff) from Part I of this lab and modify your algorithm from #1 to obtain approximations to the image using 4, 8, 16, 32, and 64 colors. Display your results along with the original image.
As generators you will choose, e.g., 8 random points in the RGB color space, and because there are only 512 × 512 pixels you can sample the image by visiting each pixel and determining which of the 8 colors it is closest to; you can use the standard Euclidean length, treating each pixel as a three-dimensional vector. Use 10^{-2} as the tolerance in your stopping criterion.
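As a starting point, here is a minimal sketch of the probabilistic Lloyd's method specialized to the grayscale setting described above (K gray levels, every pixel sampled). The file name, K, and the fixed iteration count are our own choices; extending it to RGB vectors and to a convergence test like (1) is left to the exercises.

    % Sketch: probabilistic Lloyd's method (discrete K-means) reducing a
    % grayscale image to K gray levels. 'moon.tiff', K, and the iteration
    % count are illustrative placeholders. Uses implicit expansion (R2016b+).
    K = 32;
    A = double(imread('moon.tiff'));           % image chart, entries 0..255
    x = A(:);                                  % sample every pixel
    z = 255 * rand(K, 1);                      % random initial generators
    for iter = 1:50
        % assign each pixel to its nearest generator (brute force)
        [~, idx] = min(abs(x - z'), [], 2);    % |x_j - z_k| for all j, k
        % replace each generator by the average of its cluster
        for k = 1:K
            members = x(idx == k);
            if ~isempty(members)
                z(k) = mean(members);
            end
        end
    end
    B = reshape(z(idx), size(A));              % quantized image
    figure, imshow(uint8(B))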
