Voronoi Region. K-means Method for Signal Compression: Vector Quantization (11/20/2013)


Voronoi Region
K-means Method for Signal Compression: Vector Quantization

Blocks of signals: a sequence of audio samples, or a block of image pixels. Formally, a vector; for example, (0.2, 0.3, 0.5, 0.1).

A vector quantizer maps k-dimensional vectors in the vector space $R^k$ into a finite set of vectors $Y = \{y_i : i = 1, 2, \ldots, N\}$. Each vector $y_i$ is called a code vector or a codeword, and the set of all the codewords is called a codebook. Associated with each codeword $y_i$ is a nearest-neighbor region called its Voronoi region, defined (in the standard way) by

$V_i = \{ x \in R^k : \|x - y_i\| \le \|x - y_j\| \text{ for all } j \ne i \}.$

The set of Voronoi regions partitions the entire space $R^k$.

Two-Dimensional Voronoi Diagram
The Schematic of a Vector Quantizer (signal compression)
Codewords in 2-dimensional space: input vectors are marked with an x, codewords are marked with red circles, and the Voronoi regions are separated by boundary lines.

Compression Formula
Amount of compression: the codebook size is K and the input vectors have dimension L. To inform the decoder which code vector was selected, we need $\log_2 K$ bits; e.g., 8 bits are needed to index 256 code vectors. Rate: since each code vector carries the reconstruction values of L source output samples, the number of bits per vector component is $\log_2 K / L$. K is called the level of the vector quantizer.

Vector Quantizer Algorithm (a code sketch follows the steps below)
1. Determine the number of codewords, N, i.e., the size of the codebook.
2. Select N codewords at random and let that be the initial codebook. The initial codewords can be randomly chosen from the set of input vectors.
3. Using the Euclidean distance measure, cluster the vectors around each codeword: take each input vector, find its Euclidean distance to every codeword, and assign the input vector to the cluster of the codeword that yields the minimum distance.
4. Compute the new set of codewords by obtaining the average of each cluster: add up the i-th component of every vector in the cluster and divide by the number of vectors, so the i-th component of the new codeword is $\frac{1}{m}\sum_{j=1}^{m} x_{j,i}$, where i indexes the components of each vector (the x, y, z, ... directions) and m is the number of vectors in the cluster.
5. Repeat steps 3 and 4 until either the codewords do not change or the change in the codewords is small.
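A minimal NumPy sketch of the codebook training loop above. The function and variable names, the convergence tolerance, and the choice to leave an empty cluster unchanged are my own, not from the slides:

```python
import numpy as np

def train_vq_codebook(data, num_codewords, max_iter=100, tol=1e-6, seed=0):
    """K-means-style codebook training for vector quantization.

    data: array of shape (num_vectors, k) holding the training vectors.
    Returns (codebook, assignments).
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)

    # Step 2: pick the initial codewords at random from the input vectors.
    codebook = data[rng.choice(len(data), size=num_codewords, replace=False)].copy()

    for _ in range(max_iter):
        # Step 3: assign each input vector to its nearest codeword (Euclidean distance).
        dists = np.linalg.norm(data[:, None, :] - codebook[None, :, :], axis=2)
        assignments = dists.argmin(axis=1)

        # Step 4: recompute each codeword as the mean of the vectors in its cluster.
        new_codebook = codebook.copy()
        for i in range(num_codewords):
            members = data[assignments == i]
            if len(members) > 0:  # empty cell: leave the codeword unchanged (see "empty cell problem" below)
                new_codebook[i] = members.mean(axis=0)

        # Step 5: stop when the codewords (almost) stop changing.
        if np.linalg.norm(new_codebook - codebook) < tol:
            codebook = new_codebook
            break
        codebook = new_codebook

    return codebook, assignments
```

Encoding a new vector then amounts to transmitting the index of its nearest codeword, which costs $\log_2 K$ bits per vector, as in the compression formula above.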

Other Algorithms
Problem: k-means is a greedy algorithm and may fall into a local minimum. Four methods for selecting the initial vectors: random selection; splitting (with a perturbation vector); training with different subsets; and PNN (pairwise nearest neighbor).
Empty cell problem: no input corresponds to an output vector. Solution: give the cell to another cluster, e.g., the most populated cluster.

VQ for Image Compression
Take blocks of the image as vectors of dimension L = N*M pixels. With K vectors in the codebook, we need $\log_2 K$ bits per block, so the rate is $\log_2 K / L$ bits per pixel. The higher the value of K, the better the quality, but the lower the compression ratio. There is also overhead to transmit the codebook itself:

    Codebook size K    Overhead (bits/pixel)
    16                 0.03125
    64                 0.125
    256                0.5
    1024               2

Train with a set of images.
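As a quick worked check of the rate formula above; the 4x4 block size and the 256-entry codebook are illustrative choices of mine, not values fixed by the slides:

```python
import math

def vq_rate_bits_per_pixel(K, L):
    """Bits per vector component (pixel) when an L-pixel block is quantized
    with a codebook of size K: rate = log2(K) / L."""
    return math.log2(K) / L

# 4x4 pixel blocks (L = 16) with a 256-entry codebook need
# log2(256) = 8 bits per block, i.e. 0.5 bits per pixel.
print(vq_rate_bits_per_pixel(256, 16))  # 0.5
```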

K-Nearest Neighbor Learning
22c:145, University of Iowa

Different Learning Methods
Parametric learning: the target function is described by a set of parameters (the examples are forgotten), e.g., the structure and weights of a neural network.
Instance-based learning: learning = storing all training instances; classification = assigning a target value to a new instance. Referred to as lazy learning. It's very similar to a desktop!

General Idea of Instance-based Learning
Learning: store all the data instances.
Performance: when a new query instance is encountered, retrieve a similar set of related instances from memory and use them to classify the new query.

Pros and Cons of Instance-based Learning
Pros: can construct a different approximation to the target function for each distinct query instance to be classified; can use more complex, symbolic representations.
Cons: the cost of classification can be high; uses all attributes (does not learn which are most important).

Instance-based learning methods: the k-nearest neighbor algorithm, weighted regression, and case-based reasoning.

k-Nearest Neighbor (kNN) Learning
The most basic type of instance-based learning. It assumes all instances are points in n-dimensional space, so a distance measure is needed to determine the closeness of instances. An instance is classified by finding its nearest neighbors and picking the most popular class among them.

1-Nearest Neighbor; 3-Nearest Neighbor (illustrative figures)

Important Decisions
Distance measure; value of k (usually odd); voting mechanism; memory indexing.

Euclidean Distance
Typically used for real-valued attributes. An instance x (often called a feature vector) is described by its attribute values $\langle a_1(x), a_2(x), \ldots, a_n(x) \rangle$. The distance between two instances $x_i$ and $x_j$ is
$d(x_i, x_j) = \sqrt{\sum_{r=1}^{n} \left( a_r(x_i) - a_r(x_j) \right)^2}$.

Discrete-Valued Target Function
Training algorithm: for each training example $\langle x, f(x) \rangle$, add the example to the list training_examples.
Classification algorithm: given a query instance $x_q$ to be classified, let $x_1, \ldots, x_k$ be the k training examples nearest to $x_q$, and return
$\hat{f}(x_q) = \arg\max_{v \in V} \sum_{i=1}^{k} \delta(v, f(x_i))$,
where $\delta(a, b) = 1$ if $a = b$ and $\delta(a, b) = 0$ otherwise.

Continuous-Valued Target Function
The algorithm computes the mean value of the k nearest training examples rather than the most common value: replace the final line of the previous algorithm with
$\hat{f}(x_q) = \frac{1}{k} \sum_{i=1}^{k} f(x_i)$.

Training Dataset (k-NN, K = 3)
Distance: the score for an attribute is 1 for a match and 0 otherwise, and the distance is the sum of the scores over all attributes. Voting scheme: proportionate voting in case of ties.
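A small Python sketch of the two k-NN variants above, using the Euclidean distance; the function names are my own, and ties are broken arbitrarily rather than by the proportionate voting the slides mention:

```python
import numpy as np
from collections import Counter

def euclidean(x_i, x_j):
    """d(x_i, x_j) = sqrt(sum_r (a_r(x_i) - a_r(x_j))^2)."""
    return float(np.sqrt(np.sum((np.asarray(x_i, float) - np.asarray(x_j, float)) ** 2)))

def knn_classify(train_X, train_y, x_q, k=3):
    """Discrete-valued target: return the most common label among the
    k training examples nearest to the query x_q."""
    nearest = np.argsort([euclidean(x, x_q) for x in train_X])[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def knn_regress(train_X, train_y, x_q, k=3):
    """Continuous-valued target: return the mean of f(x_i) over the
    k nearest training examples."""
    nearest = np.argsort([euclidean(x, x_q) for x in train_X])[:k]
    return sum(train_y[i] for i in nearest) / k
```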

Worked queries against the training dataset in the slides:
Query: Zeb (High, Medium). Married?
Query: Yong (Low, High). Married?
Query: Vasco (High, Low). Married?

Voronoi Diagram
The decision surface formed by the training examples, shown for two attributes (a further figure shows examples of one attribute).

Distance-Weighted Nearest Neighbor Algorithm
Assign weights to the neighbors based on their distance from the query point; the weight may be the inverse square of the distance. With distance weighting, all training points may influence a particular query instance (Shepard's method).
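A sketch of distance-weighted voting with inverse-square weights in the spirit of Shepard's method; the names are my own, and passing k=None lets every training point contribute, as noted above:

```python
import numpy as np

def distance_weighted_knn_classify(train_X, train_y, x_q, k=None, eps=1e-12):
    """Distance-weighted nearest-neighbor voting: each neighbor votes with
    weight 1/d^2; with k=None, all training points influence the query."""
    train_X = np.asarray(train_X, dtype=float)
    dists = np.linalg.norm(train_X - np.asarray(x_q, dtype=float), axis=1)
    order = np.argsort(dists)
    if k is not None:
        order = order[:k]

    votes = {}
    for i in order:
        if dists[i] < eps:               # query coincides with a training point
            return train_y[i]
        w = 1.0 / dists[i] ** 2          # inverse-square distance weight
        votes[train_y[i]] = votes.get(train_y[i], 0.0) + w
    return max(votes, key=votes.get)
```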

Kernel Function for Distance-Weighted Nearest Neighbor
(Illustrated in the slides with examples of one attribute.)

Remarks
+ A highly effective inductive inference method for noisy training data and complex target functions.
+ The target function for the whole space may be described as a combination of less complex local approximations.
+ Learning is very simple.
- Classification is time consuming.

Curse of Dimensionality
When the dimensionality increases, the volume of the space increases so fast that the available data become sparse. This sparsity is problematic for any method that requires statistical significance.

Suppose there are N data points of dimension n in the space $[-1/2, 1/2]^n$. The k-neighborhood of a point is defined to be the smallest hypercube containing its k nearest neighbors. Let d be the average side length of a k-neighborhood, so the volume of an average such hypercube is $d^n$. Equating the fraction of the unit volume it occupies to the fraction of points it contains gives $d^n / 1^n = k/N$, i.e., $d = (k/N)^{1/n}$:

    N          k    n     d
    1,000,000  10   2     0.003
    1,000,000  10   3     0.02
    1,000,000  10   17    0.5
    1,000,000  10   200   0.94

When n is big, all the points are outliers: even the 10 nearest neighbors of a point span a cube nearly as wide as the whole space.
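The table can be reproduced directly from $d = (k/N)^{1/n}$:

```python
# Average side length d of a k-neighborhood in the unit cube: d = (k/N)^(1/n).
N, k = 1_000_000, 10
for n in (2, 3, 17, 200):
    d = (k / N) ** (1 / n)
    print(f"n = {n:3d}  ->  d = {d:.3f}")
# n =   2  ->  d = 0.003
# n =   3  ->  d = 0.022
# n =  17  ->  d = 0.508
# n = 200  ->  d = 0.944
```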
