Grouping Objects by Linear Pattern
1 Grouping Objects by Linear Pattern. Ruben Zamar, Department of Statistics, University of British Columbia, Vancouver, Canada
2 This talk is based on work with several collaborators:
Van Aelst, Wang, Zamar and Zhu (2006), Linear Grouping Using Orthogonal Regression, CSDA
Garcia-Escudero, Gordaliza, San Martin, Van Aelst and Zamar (2008), Robust linear clustering, JRSS, Series B
Yan, Welch and Zamar (2010), A likelihood approach to linear clustering, submitted
Special thanks to Justin Harrington for the R package lga, which implements some of the methods presented here.
3 Outline
Grouping by linear patterns
Linear grouping algorithm (LGA)
The number of groups (GAP)
Applications
Dealing with outliers (RLGA)
Model based approach
4 Clustering Algorithms. Traditional clustering algorithms are effective when clusters are distinct, homogeneous groups. What about other interesting patterns?
5 Young and Old Trees
6 SNP Genotyping Data
7 Allometry Data
8 Our Goal. To find groups clustered around hyperplanes of different dimensions $l_i$, where $0 \le l_i \le d - 1$ for $i = 1, 2, \dots, g$.
9 Example: $d = 3$ and $g = 3$
$l_1 = 1$: cluster around a line.
$l_2 = 0$: cluster around a point.
$l_3 = 2$: cluster around a plane.
10 Linear Grouping Algorithm (LGA). Goal: to find $k$ groups around hyperplanes of dimension $d - 1$.
To find clusters around $k$ lines in $\mathbb{R}^2$.
To find clusters around $k$ planes in $\mathbb{R}^3$.
...
11 Some References. Murtagh and Raftery (1984); Gawrysiak et al. (2000); Spath (1982, 1985); Desarbo, Oliver and Rangaswamy (1989); Wedel and Kistemaker (1989); Kamgar-Parsi, Kamgar-Parsi and Wechsler (1990); Gawrysiak, Okoniewski and Rybinski (2000). Specified response variable.
12 Unsupervised Learning. Clustering is most often used for unsupervised learning. Unsupervised learning is characterized by the absence of response variables. Different linear groups may involve different subsets of variables.
13 Simple Example
14 Response Variable = Y
15 Response Variable = Z
16 Orthogonal Residuals
17 Hyperplane. $H(\alpha, \beta) = \{ z : \alpha^\top z = \beta,\ \|\alpha\| = 1 \}$
18 Data. Let $z_1, z_2, \dots, z_n$ be $n$ points in $\mathbb{R}^d$.
$\bar{z} = \frac{1}{n} \sum_{i=1}^{n} z_i$ (Sample Mean)
$S = \frac{1}{n} \sum_{i=1}^{n} (z_i - \bar{z})(z_i - \bar{z})^\top$ (Sample Covariance)
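19 Orthogonal Regression. $J(\alpha, \beta) = \sum_{i=1}^{n} (\alpha^\top z_i - \beta)^2$
$(\hat{\alpha}, \hat{\beta})$ minimizes $J(\alpha, \beta)$ subject to $\|\alpha\| = 1$
$\hat{\alpha}$ = normalized eigenvector of $S$ associated with its smallest eigenvalue (the first eigenvector when the eigenvalues are sorted in increasing order)
$\hat{\beta} = \hat{\alpha}^\top \bar{z}$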
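The orthogonal regression fit can be illustrated with a few lines of NumPy. This is a minimal sketch, not the lga package's implementation: the function name `orthogonal_regression` and the toy points are assumptions made for illustration. It takes the eigenvector of $S$ with the smallest eigenvalue, which is the unit vector minimizing $J$.

```python
import numpy as np

def orthogonal_regression(z):
    """Fit {x : alpha @ x = beta}, ||alpha|| = 1, minimizing squared orthogonal distances.

    Hypothetical helper for illustration; not part of the lga package.
    """
    z = np.asarray(z, dtype=float)
    z_bar = z.mean(axis=0)                 # sample mean
    centered = z - z_bar
    S = centered.T @ centered / len(z)     # sample covariance (1/n version)
    eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues returned in ascending order
    alpha = eigvecs[:, 0]                  # unit eigenvector of the smallest eigenvalue
    beta = alpha @ z_bar
    return alpha, beta

# Toy data (made up): four points lying exactly on the line x + y = 2
pts = np.array([[0.0, 2.0], [1.0, 1.0], [2.0, 0.0], [3.0, -1.0]])
alpha, beta = orthogonal_regression(pts)
residuals = pts @ alpha - beta             # signed orthogonal distances to the fitted line
```

For points that lie exactly on a line, the orthogonal residuals are zero up to rounding, and `alpha` is proportional to (1, 1), the normal of that line.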
20 LGA Algorithm
INPUT: $d$-dimensional data points $z_1, z_2, \dots, z_n$ and the number $k$ of groups.
OUTPUT: The best partition of the dataset into $k$ groups centered around hyperplanes of dimension $d - 1$.
21 LGA Step-by-Step
1) Initialization: Initial hyperplanes are defined by the exact fitting of $k$ sub-samples of size $d$.
2) Forming $k$ groups: Each data point is assigned to its closest hyperplane using Euclidean distances.
3) Computing $k$ hyperplanes: New hyperplanes are computed by applying orthogonal regression to each group.
Steps 2) and 3) are repeated several times.
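The three steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the lga package's code: the names `fit_hyperplane` and `lga_sketch`, the fixed iteration count, the use of several random starts, and the rule of keeping the start with the smallest sum of squared orthogonal residuals are all assumptions added here, since the slides only say that steps 2) and 3) are repeated several times.

```python
import numpy as np

def fit_hyperplane(z):
    """Orthogonal regression: unit normal alpha and offset beta (hypothetical helper)."""
    z_bar = z.mean(axis=0)
    c = z - z_bar
    _, vecs = np.linalg.eigh(c.T @ c / len(z))   # ascending eigenvalues
    alpha = vecs[:, 0]
    return alpha, alpha @ z_bar

def lga_sketch(z, k, n_iter=10, n_starts=100, seed=0):
    """Illustrative linear grouping; n_iter, n_starts and seed are assumed tuning knobs."""
    rng = np.random.default_rng(seed)
    z = np.asarray(z, dtype=float)
    n, d = z.shape

    def distances(planes):
        # Orthogonal distance from every point to every hyperplane, shape (n, k)
        return np.abs(np.column_stack([z @ a - b for a, b in planes]))

    best_labels, best_ssr = None, np.inf
    for _ in range(n_starts):
        # 1) Initialization: exact fit through k random sub-samples of size d
        planes = [fit_hyperplane(z[rng.choice(n, size=d, replace=False)])
                  for _ in range(k)]
        for _ in range(n_iter):
            # 2) Forming k groups: assign each point to its closest hyperplane
            labels = distances(planes).argmin(axis=1)
            # 3) Computing k hyperplanes: refit each group
            #    (assumption: keep the old hyperplane if a group has fewer than d points)
            planes = [fit_hyperplane(z[labels == j]) if np.sum(labels == j) >= d
                      else planes[j] for j in range(k)]
        dist = distances(planes)
        labels = dist.argmin(axis=1)
        ssr = np.sum(dist.min(axis=1) ** 2)
        if ssr < best_ssr:                       # keep the best start (assumed criterion)
            best_labels, best_ssr = labels, ssr
    return best_labels, best_ssr

# Toy data (made up): 20 points on y = x and 20 points on y = 20 - x
x = np.linspace(0.0, 9.0, 20)
z = np.vstack([np.column_stack([x, x]), np.column_stack([x, 20.0 - x])])
labels, ssr = lga_sketch(z, k=2)
```

On this noise-free toy data the two groups coincide with the two lines, so the best sum of squared residuals is essentially zero.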
22 The Number of Groups. The number of groups $k$ is an input for lga. $k$ may be suggested by subject matter knowledge. Finding $k$ may be an important research goal.
23 Graphical Approach
24 The GAP Statistic. Tibshirani, Walther and Hastie (2001) proposed the GAP statistic to determine the number of clusters in a data set. GAP compares the pooled within-cluster sum of squares around the cluster centers with its expectation under a null reference distribution. The null distribution is obtained by generating uniformly distributed points on the hyper-rectangle aligned with the principal components of the data. The (modified) GAP statistic for linear grouping is obtained by replacing distance to the center by distance to the hyperplane.
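One way to draw a reference data set of the kind described above is sketched below. It is an illustration rather than the lga package's code: the function name `reference_sample` is hypothetical, the principal axes are taken from an SVD of the centered data, and the sketch assumes at least as many points as dimensions.

```python
import numpy as np

def reference_sample(z, rng):
    """Uniform sample on the box aligned with the principal components of z.

    Hypothetical helper for illustration; assumes n >= d.
    """
    z = np.asarray(z, dtype=float)
    mean = z.mean(axis=0)
    centered = z - mean
    # Rows of vt are the principal directions of the centered data
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt.T                          # coordinates in the principal axes
    lo, hi = scores.min(axis=0), scores.max(axis=0)   # bounding box in those axes
    ref_scores = rng.uniform(lo, hi, size=scores.shape)
    return ref_scores @ vt + mean                     # rotate back to the original axes

# Toy data (made up): 50 Gaussian points in three dimensions
rng = np.random.default_rng(0)
z = rng.normal(size=(50, 3))
ref = reference_sample(z, rng)
```

Repeating the draw B times and running the grouping procedure on each reference set gives the values $\mathrm{SSR}_k(b)$ that enter the GAP statistic.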
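25 The GAP Statistic (continued)
$\mathrm{GAP}(k) = \left[ \frac{1}{B} \sum_{b=1}^{B} \log(\mathrm{SSR}_k(b)) \right] - \log(\mathrm{SSR}_k)$
$\hat{k}$ = smallest $k$ such that $\mathrm{GAP}(k) \ge \mathrm{GAP}(k+1) - s_{k+1}$
$s_{k+1} = S_{k+1} \sqrt{1 + (1/B)}$
$S_{k+1}$ = Standard Deviation of $\log(\mathrm{SSR}_{k+1}(b))$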
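The selection rule can be sketched as a small function that takes the observed and reference sums of squared residuals. The name `gap_select` and the numbers in the example are invented for illustration; the standard-error factor $\sqrt{1 + 1/B}$ follows Tibshirani, Walther and Hastie (2001).

```python
import numpy as np

def gap_select(ssr_obs, ssr_ref):
    """Pick k-hat from SSR values (hypothetical helper).

    ssr_obs: length-K array, SSR_k for k = 1..K on the observed data.
    ssr_ref: (B, K) array, SSR_k(b) on each of B reference data sets.
    """
    log_obs = np.log(np.asarray(ssr_obs, dtype=float))
    log_ref = np.log(np.asarray(ssr_ref, dtype=float))
    B = log_ref.shape[0]
    gap = log_ref.mean(axis=0) - log_obs              # GAP(k)
    s = log_ref.std(axis=0) * np.sqrt(1.0 + 1.0 / B)  # s_k
    for i in range(len(gap) - 1):
        # index i corresponds to k = i + 1
        if gap[i] >= gap[i + 1] - s[i + 1]:
            return i + 1, gap, s
    return None, gap, s                               # no k satisfied the rule

# Made-up numbers for K = 3 candidate values of k and B = 4 reference sets
ssr_obs = [100.0, 10.0, 8.0]
ssr_ref = [[120.0, 60.0, 40.0],
           [110.0, 55.0, 38.0],
           [130.0, 65.0, 42.0],
           [125.0, 62.0, 41.0]]
k_hat, gap, s = gap_select(ssr_obs, ssr_ref)
```

With these invented values the observed SSR drops sharply from one group to two and only slightly from two to three, so the rule stops at two groups.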
26 Simple Example
27 Simple Example
28 Simple Example
29 Simple Example
30 Application to Allometry. Figure 6: Olfactory Bulb vs. Brain Weight (log-scale) for some mammal species: Insectivores (i), Carnivores (c), Prosimians (p), Apes (a), Monkeys (m), Human (h) and Horse (o).
31 Application to Allometry. Biologists study the relation between sizes of organs for different species. The (log-transformed) sizes are linearly related. Linear associations differ across species because of different living habits, environment, food sources, etc. Grouping by different linear patterns is necessary. Biologists make manual assignments based on experience (Jerison 1973).
32 Application to Allometry
33 Application to Allometry
34 Application to Allometry
Jerison (1973), groups I, II, III: insectivores, carnivores, horses, prosimians (primitive primates) anthropoids (monkeys, apes, human)
LGA with k=3, groups I, II, III: insectivores, carnivores, horses, prosimians and apes monkeys and human
LGA & GAP, groups I, II: insectivores, carnivores, horses, prosimians; monkeys, apes and human
35 Application to Sport Data. Performance of 871 players in the 94/95 Hockey League.
Variable: Description
PTS: # of Goals Scored + # of Assists
P/M: Plus/Minus Rating (+ team scored, - opponent team scored)
PIM: Total penalty time (minutes)
PP: Total number of power-play goals scored
36 Application to Sport Data. We applied LGA with $k = 3$. The results:
37 Sharp Shooters - Team Players
38 Simple Example
39 Dealing with Outliers. Use trimming to allow a fraction of the points not to follow any linear structure. The resulting procedure is called Robust LGA (RLGA). RLGA is computed by the function rlga() in the lga package.
40 RLGA Step-by-Step
1) Initialization: $k$ hyperplanes are defined by exactly fitting $k$ random sub-samples of size $d$.
2) Trimming and forming the groups: For $1 \le i \le n$, let $r_i(1), r_i(2), \dots, r_i(k)$ be the orthogonal distances from point $i$ to each of the $k$ hyperplanes. Let $r_i = \min\{r_i(1), r_i(2), \dots, r_i(k)\}$. The $n(1 - \alpha)$ points with smallest $r_i$ are assigned to their closest hyperplanes (here $\alpha$ is the trimming fraction).
3) Computing $k$ hyperplanes: New hyperplanes are computed by applying orthogonal regression to each group.
Steps 2) and 3) are repeated several times.
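The trimming step 2) can be sketched on its own, given the matrix of orthogonal distances. This is an illustration, not the rlga() implementation: the function name `trimmed_assignment`, the label -1 for trimmed points, and rounding $n(1 - \alpha)$ down to an integer are assumptions made here.

```python
import numpy as np

def trimmed_assignment(dist, alpha):
    """Assign the n(1 - alpha) closest points to hyperplanes; trim the rest.

    dist: (n, k) array of orthogonal distances r_i(j).
    alpha: trimming fraction in [0, 1).
    Returns integer labels in 0..k-1, with -1 for trimmed points (assumed convention).
    """
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    r = dist.min(axis=1)                            # r_i = min_j r_i(j)
    n_keep = int(np.floor(n * (1.0 - alpha)))       # assumption: round down
    keep = np.argsort(r, kind="stable")[:n_keep]    # points with the smallest r_i
    labels = np.full(n, -1)
    labels[keep] = dist[keep].argmin(axis=1)        # closest hyperplane for kept points
    return labels

# Made-up distances for n = 4 points and k = 2 hyperplanes
dist = np.array([[0.1, 2.0],
                 [1.5, 0.2],
                 [3.0, 4.0],
                 [0.3, 0.9]])
labels = trimmed_assignment(dist, alpha=0.25)
```

In this toy example the third point is far from both hyperplanes, so it is the one left out when a quarter of the points is trimmed; the hyperplanes in step 3) would then be refitted using only the points with non-negative labels.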
41 Simple Example - Continued
42 Simple Example - Continued
43 Computer Vision Data