Text Analytics. Text Clustering. Ulf Leser
2 Text Classification
Given a set D of docs and a set of classes C, a classifier is a function f: D → C.
Problem: finding a good classifier. A good classifier assigns as many docs as possible to their correct class. How do we know?
Supervised learning:
Obtain a set of docs with their classes.
Find the characteristics of the docs in each class (= build a model). What do they have in common? How do they differ from docs in other classes?
Encode the model in a classifier function f. The more docs f assigns to their correct class, the better f is.
Ulf Leser: Text Analytics, lecture, summer semester
3 Categorical Attributes
ID  Age  Type of car  Risk
1   23   Family       High
2   17   Sports       High
3   43   Sports       High
4   68   Family       Low
5   25   Truck        Low
Assume this classification was brought up by some insurance manager. What was in his head? Probably a set of rules, such as:
if age > 50 then risk = low
elseif age < 25 then risk = high
elseif car = sports then risk = high
else risk = low
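The rule set on this slide can be written down directly as a function and checked against the five table rows. This is only a sketch; the names `risk` and `TRAINING` are made up for illustration and car types are written in lower case.

```python
def risk(age: int, car: str) -> str:
    """Rule set from the slide; rules are tried in order."""
    if age > 50:
        return "low"
    elif age < 25:
        return "high"
    elif car == "sports":
        return "high"
    else:
        return "low"


# The five rows of the table: (age, type of car, risk).
TRAINING = [
    (23, "family", "high"),
    (17, "sports", "high"),
    (43, "sports", "high"),
    (68, "family", "low"),
    (25, "truck", "low"),
]

# Count how many table rows the rules reproduce.
correct = sum(risk(age, car) == label for age, car, label in TRAINING)
print(f"{correct} of {len(TRAINING)} rows reproduced")
```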
4 A Third Approach
Why not:
ID  Age  Type of car  Risk
1   23   Family       High
2   17   Sports       High
3   43   Sports       High
4   68   Family       Low
5   25   Truck        Low
if age = 17 and car = sports then risk = high
elseif age = 23 and car = family then risk = high
elseif age = 25 and car = truck then risk = low
elseif age = 43 and car = sports then risk = high
else risk = low
5 Overfitting
This was an instance of our "perfect classifier".
We always learn a model from a small sample of the real world.
Overfitting: If the model is too close to the training data, it performs perfectly on the training data but has also learned any bias present in the training data. Thus, the rules do not generalize well.
Solution: Use an appropriate learning algorithm. Evaluate your method using cross-validation.
6 Nearest Neighbor Classifiers
Very simple and effective method.
Definition: Let D be a set of classified documents, m a distance function between any two documents, and d an unclassified doc.
A nearest-neighbor (NN) classifier assigns to d the class of the nearest document to d in D wrt. m.
A k-nearest-neighbor (kNN) classifier assigns to d the most frequent class among the k nearest documents to d in D wrt. m.
Remark: Obviously, a proper distance function is very important. In kNN, we may weight the k nearest docs according to their distance to d. We need to take care of multiple docs with the same distance.
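A minimal sketch of the kNN definition above, assuming docs are plain numeric tuples and using Euclidean distance as m. The function names are illustrative only. Distance weighting is left out, and ties in the vote fall to the class of the nearer doc.

```python
import math
from collections import Counter


def euclidean(x, y):
    """Distance function m between two docs given as numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def knn_classify(d, labelled, k=3, m=euclidean):
    """Assign to d the most frequent class among its k nearest docs.

    labelled: list of (vector, class) pairs, i.e. the classified set D.
    """
    # Sort all classified docs by their distance to d and keep the k nearest.
    neighbours = sorted(labelled, key=lambda pair: m(d, pair[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


D = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((6, 5), "b"), ((5, 6), "b")]
print(knn_classify((1, 0), D, k=3))
print(knn_classify((5, 4), D, k=1))
```

Note that every call computes the distance from d to all docs in D, which is exactly the performance problem discussed on the next slide.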
7 Properties
Basic idea: Imagine a copy of d is in D. Of course, we then want to assign the class of this copy to d.
kNN is extremely simple and astonishingly good. kNN in general is more robust than NN.
[MS99]: 1NN reaches ~95% accuracy, MaxEnt ~96% (Reuters collection, class "earnings", 20 words with highest χ²-value).
kNN is a lazy learner. Actually, there is no learning and no model.
Major problem: performance (speed). We need to compute the distance between d and every doc in D.
Various suggestions to structure D to save computations:
Clustering: choose one representative per class and find the nearest representative (not good).
Multidimensional index structures and metric embeddings.
8 Bayes Classification
Simple method based on probability.
Given: a set D of docs and classes c_1, c_2, ..., c_m. Docs are described as a set F of binary features, usually the presence/absence of terms in d (= VSM representation).
We seek p(c_i | d), the probability of a doc d ∈ D being a member of class c_i.
d is eventually assigned to the class c with
$c = \arg\max_{c_i} p(c_i \mid d)$
Replace d with its feature representation:
$p(c \mid d) = p(c \mid F[d]) = p(c \mid f_1[d], \dots, f_n[d]) = p(c \mid t_1, \dots, t_n)$
9 Naïve Bayes
We have
$p(c \mid d) \propto p(t_1, \dots, t_n \mid c) \cdot p(c)$
The first term cannot be learned from any training set of reasonable size: there are $2^n$ combinations of feature values.
Solution: Be naïve. Assume statistical independence of all terms. Then
$p(t_1, \dots, t_n \mid c) = p(t_1 \mid c) \cdot \ldots \cdot p(t_n \mid c)$
And finally
$p(c \mid d) \propto p(c) \cdot \prod_{i=1}^{n} p(t_i \mid c)$
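The final formula can be sketched in a few lines, assuming binary term features as on the previous slide. Two things here are assumptions that the slides do not state: add-one smoothing of p(t | c), so that unseen terms do not zero out the product, and scoring absent terms with 1 - p(t | c). Scores are summed in log space to avoid numeric underflow. All names and the toy docs are illustrative.

```python
import math
from collections import defaultdict


def train_nb(docs):
    """docs: list of (set of terms, class). Returns (vocabulary, p(c), p(t|c))."""
    vocab = set().union(*(terms for terms, _ in docs))
    n_docs = defaultdict(int)                            # docs per class
    n_with_term = defaultdict(lambda: defaultdict(int))  # docs per class containing t
    for terms, c in docs:
        n_docs[c] += 1
        for t in terms:
            n_with_term[c][t] += 1
    prior = {c: n / len(docs) for c, n in n_docs.items()}
    # Assumption: add-one smoothing over the two outcomes present / absent.
    cond = {c: {t: (n_with_term[c][t] + 1) / (n_docs[c] + 2) for t in vocab}
            for c in n_docs}
    return vocab, prior, cond


def classify_nb(terms, model):
    """Return the class maximizing log p(c) + sum_i log p(t_i | c)."""
    vocab, prior, cond = model

    def log_score(c):
        score = math.log(prior[c])
        for t in vocab:
            p = cond[c][t]
            score += math.log(p if t in terms else 1 - p)
        return score

    return max(prior, key=log_score)


model = train_nb([
    ({"profit", "shares"}, "earnings"),
    ({"profit", "quarter"}, "earnings"),
    ({"match", "goal"}, "sports"),
    ({"goal", "team"}, "sports"),
])
print(classify_nb({"profit"}, model))
```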
10 Classification with ME
The ME approach models the joint probability p(c,d) as
$p(c, d) = \frac{1}{Z} \prod_{i=1}^{K} \alpha_i^{f_i(d, c)}$
Z is a normalization constant. The feature weights α_i are learned from the data. K is the number of features.
We have p(c,d) = p(c|d) * p(d). Again, p(d) can be dropped for ranking.
Compute p(c,d) for all classes and return the class with the maximal value.
11 Where is the Problem?
We want the α_i to assert certain conditions on our joint probability distribution p(c,d).
Counting distributions of single features over the training set does not in itself create a joint distribution.
Counting joint distributions: data sparsity problem (again).
In NB, we additionally assumed statistical independence to come up with a joint distribution (using Bayes' theorem).
ME goes another way and computes the probability distribution which maximizes the entropy of the joint distribution. Thus, it makes as few assumptions as possible given the data.
This distribution is encoded in the feature weights α_i.
12 Properties of Maximum Entropy Classifiers
In general, ME should outperform NB. But not always. There is theory behind the discrepancies (not covered here).
It does not assume independence of features. Two redundant features will simply each get half of the weight.
Very popular in statistical NLP. Some of the best POS taggers are ME-based. Some of the best NER systems are ME-based.
Several extensions: Maximum Entropy Markov Models, Conditional Random Fields.
13 Class of Linear Classifiers
Many common classifiers are (log-)linear classifiers: Naïve Bayes, Perceptron / Winnow, Rocchio, linear and logistic regression, maximum entropy, support vector machines (with linear kernel).
All compute a hyperplane which (hopefully) separates the two classes.
Despite the similarity, noticeable performance differences exist. Which of the infinite number of possible separating hyperplanes is chosen? How are non-separable data sets handled?
Experience: Classifiers more powerful than linear ones often don't perform better (on text).
14 Linear Classifiers
All learn a hyperplane which is used to separate classes in high-dimensional space.
For illustration, we stay in 2-dimensional space and look at binary classification problems only.
But which hyperplane?
Source: Xiaojin Zhu, SVM-cs540
15 Support Vector Machines (sketch)
Compute the hyperplane which maximizes the margin, i.e., which is as far away from any data point as possible.
Can be cast as a convex (quadratic) optimization problem and solved efficiently.
The solution depends only on the support vectors (the points closest to the hyperplane).
Complication: usually the classes are not linearly separable. Then, additionally minimize the error (misclassification).
16 Problems not Linearly Separable
Map high-dimensional data into an even higher-dimensional space.
Non-linearly separable sets may become linearly separable.
Doing this efficiently requires a good deal of work ("kernel trick").
17 Content of this Lecture
(Text) clustering
Clustering algorithms
Application
18 Clustering
Clustering groups objects into (usually disjoint) sets.
Intuitively, each set should contain objects that are similar to each other and dissimilar to objects in any other set. We need a similarity or distance function. Two optimization goals.
Also called unsupervised learning: We don't know how many sets/classes we expect. We don't know what those sets should look like. We have no examples for set members.
Supervised learning = classification / categorization.
19 Example 1
20 Clustering 1
Intuition here: Similarity corresponds to Euclidean distance.
Optimization for a good ratio of intra-cluster coherence and inter-cluster distance.
21 Clustering 2
Better or worse?
22 Quality of a Clustering
Let us measure cluster quality only by the average distance of objects within a cluster.
Definition: Let f be a clustering of a set of objects O into a set of classes C with |C| = k. Let m_c be the centre of all objects of class c (to be defined later), and let d(o,o') be the distance between two objects o and o'. Then, the k-score of f is
$q_k(f) = \sum_{c \in C} \sum_{o:\, f(o) = c} d(o, m_c)$
Remark: Similarly, we could define the k-score as the average distance across all object pairs within a cluster. This would relieve us from finding the centre of a set of objects.
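A sketch of the k-score, under the assumption that objects are numeric tuples, that the centre m_c is the component-wise mean (the slide leaves the centre "to be defined later"), and that d is Euclidean distance. The names `centre` and `k_score` are illustrative.

```python
import math


def centre(points):
    """Assumed definition of m_c: the component-wise mean of the points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def k_score(clusters):
    """Sum of distances of all objects to the centre of their cluster.

    clusters: dict mapping a class label to the list of its objects.
    """
    total = 0.0
    for members in clusters.values():
        m_c = centre(members)
        total += sum(math.dist(o, m_c) for o in members)
    return total


two = {"c1": [(0, 0), (2, 0)], "c2": [(5, 5)]}
singletons = {i: [p] for i, p in enumerate([(0, 0), (2, 0), (5, 5)])}
print(k_score(two), k_score(singletons))
```

The second call puts every object into its own cluster, which drives the score to zero regardless of the data.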
23 6-Score
Find the centre of each cluster, compute distances, aggregate.
Probably better than the 2-score of the clustering on the previous slide.
But ...
24 Disadvantage
The optimal clustering is trivially reached for k = |O|.
We need to fix our definition.
25 Quality of a Clustering 2
Definition: Let f: O → C with |C| arbitrary. Let dist(o, c_i) be the average distance of o to all points of cluster c_i. We define
Inner score: a(o) = dist(o, f(o))
Outer score: b(o) = min(dist(o, c_i)) with c_i ≠ f(o)
Let the silhouette s(o) be
$s(o) = \frac{b(o) - a(o)}{\max(a(o), b(o))}$
Then, the silhouette s(f) of f is Σ s(o).
Note:
b(o): How much does the score decrease if f(o) did not exist and o was assigned to its next-closest cluster?
s(o) ≈ 0: point right between two clusters
s(o) ≈ 1: point very close to only one (its own) cluster
s(o) ≈ -1: point far away from its own cluster
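The silhouette definition can be sketched as follows, assuming numeric tuples and Euclidean distance. One detail is an assumption not spelled out on the slide: when computing a(o), the point o itself is left out of the average over its own cluster, as in the usual silhouette definition. Function names are illustrative.

```python
import math


def avg_dist(o, members):
    """dist(o, c_i): average distance of o to all points in members."""
    return sum(math.dist(o, p) for p in members) / len(members)


def silhouette_point(o, own, clusters):
    """s(o) = (b(o) - a(o)) / max(a(o), b(o)) for o in cluster `own`."""
    rest = list(clusters[own])
    rest.remove(o)  # assumption: o does not count towards its own average
    a = avg_dist(o, rest) if rest else 0.0
    b = min(avg_dist(o, members) for c, members in clusters.items() if c != own)
    denom = max(a, b)
    return (b - a) / denom if denom > 0 else 0.0


def silhouette(clusters):
    """s(f): sum of s(o) over all objects, as on the slide."""
    return sum(silhouette_point(o, c, clusters)
               for c, members in clusters.items() for o in members)


good = {"x": [(0.0, 0.0), (0.0, 1.0)], "y": [(10.0, 0.0), (10.0, 1.0)]}
bad = {"x": [(0.0, 0.0), (0.0, 1.0), (10.0, 0.5)], "y": [(10.0, 0.0), (10.0, 1.0)]}
print(silhouette_point((0.0, 0.0), "x", good))
print(silhouette_point((10.0, 0.5), "x", bad))
```

The first point sits well inside its own cluster, the second one is assigned to the wrong cluster; their scores land near the two ends of the range.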
26 Quality of Clustering 3
The silhouette is a very technical definition.
Usually, we want to find intuitively appealing clusters. Those might not at all conform to our definitions.
Source: [FPPS96]
27 Text Clustering Applications
Explorative data analysis: Learn about the structure within your document collection.
Corpus preprocessing: Clustering provides a semantic index to the corpus. Group docs into clusters to ease navigation. Retrieval speed: index only one representative per cluster.
Processing of search results: Cluster all hits into groups of similar hits (in particular: duplicates).
Improving search recall: Return a doc and all members of its cluster. Has similarity to automatic relevance feedback using top-k docs.
Word sense disambiguation: The different senses of a word should appear as clusters.
28 Processing Search Results
"The research breakthrough was labeling the clusters, i.e., grouping search results into folder topics" [Clusty.com blog]
29 Similarity between Documents
All clustering methods require some form of distance or similarity function.
Must be a metric: d(x,x) = 0, d(x,y) = d(y,x), d(x,y) ≤ d(x,z) + d(z,y).
In contrast to search, we now compare two docs with each other, and not a document and a query.
Nevertheless, the same methods are usually used:
Compute TF*IDF values for all terms in the corpus.
Represent documents as |K|-dimensional vectors.
Use the cosine of the two vectors as similarity function:
$sim(d_1, d_2) = \cos(d_1, d_2) = \frac{d_1 \circ d_2}{|d_1| \cdot |d_2|} = \frac{\sum_i d_1[i] \cdot d_2[i]}{\sqrt{\sum_i d_1[i]^2} \cdot \sqrt{\sum_i d_2[i]^2}}$
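The cosine formula translates almost literally into code. This sketch assumes that both docs are already given as equally long lists of term weights (e.g. TF*IDF values); returning 0 for an all-zero vector is a convention chosen here, not something the slides define.

```python
import math


def cosine(d1, d2):
    """Cosine of the angle between two term-weight vectors of equal length."""
    dot = sum(a * b for a, b in zip(d1, d2))
    norm1 = math.sqrt(sum(a * a for a in d1))
    norm2 = math.sqrt(sum(b * b for b in d2))
    if norm1 == 0 or norm2 == 0:
        return 0.0  # convention for empty docs (assumption)
    return dot / (norm1 * norm2)


print(cosine([1, 0], [0, 1]))   # no shared terms
print(cosine([1, 2], [2, 4]))   # same direction, different length
```

Since the cosine is a similarity (larger means closer), an algorithm that expects a distance can use 1 - cosine(d1, d2) instead.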
30 Further Issues
To increase speed, feature selection is necessary. We never counted the time it takes to compare two high-dimensional vectors.
Do not cluster on all terms. Instead, use the most descriptive terms for your intended clustering.
Cluster label: Use the representative, e.g., show the 5-10 terms with the highest TF*IDF values in the cluster centre.
31 Content of this Lecture
Text clustering
Clustering algorithms: Hierarchical clustering, K-means, Soft clustering: EM algorithm
Application
32 Classes of Cluster Algorithms
Hierarchical clustering: Iteratively creates a hierarchy of clusters.
Bottom-up: Start from |D| clusters and merge clusters until only one remains.
Top-down: Start from one cluster (including all docs) and split clusters until every doc is its own cluster. Or until some stop criterion is met.
Partitioning: Heuristically partition all objects into k clusters. Guess a first partitioning and improve it iteratively. k is a parameter of the method, not a result.
Other: Algorithmic approaches such as Max-Cut (partitioning). Density-based clustering. Minimum description length.
33 Hierarchical Clustering
Also called UPGMA: Unweighted Pair-Group Method with Arithmetic mean.
We only discuss the bottom-up approach. Computes a binary tree (dendrogram).
Simple algorithm:
Compute distance matrix M (distances between any pair of docs).
Choose the pair d_1, d_2 with the smallest distance.
Compute x = m(d_1, d_2) (the centre point).
Remove d_1, d_2 from M. Insert x.
Distance between x and any d in M: average of the distances between d_1 and d and between d_2 and d.
Loop until M is empty.
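The steps above can be sketched as follows. The sketch assumes the distance matrix is given as a dict keyed by unordered pairs of labels; a merged node x is represented by the tuple of its two children, and its distance to every remaining d is the plain average of the two old distances, exactly as stated on the slide. The function name and the toy distances are illustrative.

```python
from itertools import combinations


def agglomerate(dist):
    """Bottom-up clustering on a distance matrix.

    dist: dict mapping frozenset({a, b}) -> distance for all pairs of labels.
    Returns the list of merges as (left, right, distance) tuples.
    """
    dist = dict(dist)                    # work on a copy
    active = set().union(*dist.keys())   # labels still in the matrix
    merges = []
    while len(active) > 1:
        # Choose the pair with the smallest distance (sorted for determinism).
        candidates = (frozenset(p) for p in combinations(sorted(active, key=str), 2))
        pair = min(candidates, key=dist.__getitem__)
        d1, d2 = sorted(pair, key=str)
        x = (d1, d2)                     # the new inner node of the tree
        merges.append((d1, d2, dist[pair]))
        active -= {d1, d2}
        # Distance of x to every remaining d: average of the two old distances.
        for d in active:
            dist[frozenset((x, d))] = (dist[frozenset((d1, d))] + dist[frozenset((d2, d))]) / 2
        active.add(x)
    return merges


M = {frozenset("AB"): 1, frozenset("AC"): 4, frozenset("BC"): 6}
print(agglomerate(M))
```

Each round scans all remaining pairs, so this naive version is cubic in the number of docs; it is meant to mirror the slide, not to be efficient.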
34 Example: Distance Matrix
(Table: pairwise distance matrix over the docs A, B, C, D, E, F, ...; apart from a single cell value, 95, the entries are not preserved.)
35 Iteration
Matrix over A B C D E F G: merge (B,D) → a
Matrix over A C E F G a: merge (E,F) → b
Matrix over A C G a b: merge (A,b) → c
Matrix over C G a c: merge (C,G) → d
Matrix over a c d: merge (d,c) → e
Matrix over a e: merge (a,e) → f
(Figure: the distance matrix and the growing dendrogram over A to G for each step.)
36 Hierarchical Clustering
37 Properties
Advantages: Simple and intuitive algorithm. Number of clusters is not an input of the method. Usually good quality clusters.
Disadvantages: Very expensive. Requires O(n^2) space and time (at least) for the distance matrix. Total runtime is O(n^2 * log(n)). Why?
Not applicable as such to large doc sets.
Does not really generate clusters.
38 Intuition
Hierarchical clustering organizes a doc collection.
Ideally, hierarchical clustering directly creates a directory of the corpus. This is more of a wish.
There are many, many ways to group objects; clustering will choose just one.
There are no names for the groups.
39 Branch Length
Use branch length to symbolize distance.
Outlier detection.
(Figure: dendrogram with one branch labelled "Outlier".)
40 Variations
Hierarchical clustering uses the distance between the centres of clusters to decide about the distance between clusters.
Other alternatives:
Single Link: Distance of the two closest docs in both clusters.
Complete Link: Distance of the two furthest docs.
Average Link: Average distance between pairs of docs from both clusters.
Centroid: Distance between centre points.
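The four cluster-distance variants differ only in how the pairwise doc distances are aggregated. A small sketch, assuming clusters are lists of numeric tuples and Euclidean distance between docs; the function names are illustrative.

```python
import math
from itertools import product


def centroid(cluster):
    """Component-wise mean of the docs in a cluster."""
    n = len(cluster)
    return tuple(sum(p[i] for p in cluster) / n for i in range(len(cluster[0])))


def single_link(c1, c2):
    """Distance of the two closest docs."""
    return min(math.dist(a, b) for a, b in product(c1, c2))


def complete_link(c1, c2):
    """Distance of the two furthest docs."""
    return max(math.dist(a, b) for a, b in product(c1, c2))


def average_link(c1, c2):
    """Average distance over all pairs of docs from both clusters."""
    return sum(math.dist(a, b) for a, b in product(c1, c2)) / (len(c1) * len(c2))


def centroid_link(c1, c2):
    """Distance between the two centre points."""
    return math.dist(centroid(c1), centroid(c2))


c1, c2 = [(0, 0), (1, 0)], [(3, 0), (5, 0)]
for link in (single_link, complete_link, average_link, centroid_link):
    print(link.__name__, link(c1, c2))
```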
43 Single-link versus Complete-link
44 More Properties
Single-link: Optimizes a local criterion (only looks at the closest pair). Similar to computing a minimum spanning tree, with cuts at the most expensive branches when going down the hierarchy. Creates elongated clusters (chaining effect).
Complete-link: Optimizes a global criterion (looks at the worst pair). Creates more compact, more convex, spherical clusters.
45 Content of this Lecture
Text clustering
Clustering algorithms: Hierarchical clustering, K-means, Soft clustering: EM algorithm
Application
46 K-Means
Partitioning method. K-Means is probably the best-known clustering algorithm. Requires the number k of clusters to be predefined.
Algorithm:
Guess k cluster centres at random (can use k docs, or k random points in doc-space).
Loop forever:
Assign all docs to their closest cluster centre.
If no doc has changed its assignment, stop (or if sufficiently few docs have changed their assignment).
Otherwise, compute each new cluster centre as the centre of all points in the cluster.
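A compact sketch of the loop above, assuming docs are numeric tuples and Euclidean distance. The start centres are k randomly chosen docs. Two additions are assumptions beyond the slide: an iteration cap `max_iter` as a safety net for "loop forever", and keeping the old centre when a cluster runs empty. All names are illustrative.

```python
import math
import random


def mean(points):
    """Centre of a set of points: the component-wise mean."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def k_means(docs, k, seed=0, max_iter=100):
    """Return (assignment, centres); assignment[i] is the cluster index of docs[i]."""
    rng = random.Random(seed)
    centres = rng.sample(docs, k)        # guess: k docs as start centres
    assignment = None
    for _ in range(max_iter):            # assumption: cap instead of "forever"
        # Assign every doc to its closest centre.
        new_assignment = [min(range(k), key=lambda j: math.dist(d, centres[j]))
                          for d in docs]
        if new_assignment == assignment:  # no doc changed its cluster: stop
            break
        assignment = new_assignment
        # Recompute each centre as the centre of its current members.
        for j in range(k):
            members = [d for d, a in zip(docs, assignment) if a == j]
            if members:                   # assumption: keep old centre if empty
                centres[j] = mean(members)
    return assignment, centres


docs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
assignment, centres = k_means(docs, k=2)
print(assignment)
```

On this toy data the two groups are far apart, so the same partition is found from any pair of start docs; on real data, different seeds can end in different local optima.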
47 Example 1
k = 3. Choose random start points.
Source: Stanford, CS 262 Computational Genomics
48 Example 2
Assign docs to the closest cluster centre.
49 Example 3
Compute new cluster centres.
50 Example 4
51 Example 5
52 Example 6
Converged.
53 Properties
Usually, k-means converges quite fast. Let l be the number of iterations.
Complexity: O(l*k*n). Assignment: n*k distance computations. New centres: summing up n vectors k times.
Choosing the right start points is important. K-Means essentially is a greedy heuristic and only finds local optima.
Option 1: Start several times with different start points.
Option 2: Compute a hierarchical clustering on a small random sample and choose the cluster centres as start points (Buckshot algorithm).
How to choose k? Try different k and use a quality score to find the best value.
54 k-means and Outlier
Try for k = 3.
55 Help: K-Medoid
Choose the doc in the middle of a cluster as its representative.
PAM: Partitioning Around Medoids.
Advantage: Less sensitive to outliers. Also works for non-metric spaces, as no new centre point needs to be computed.
Disadvantage: More expensive. We need to compute all pair-wise distances in each cluster in each round. Overall complexity is O(n^3). Can save re-computations at the expense of more space.
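A sketch of the idea, not of PAM proper: it keeps the assign-and-update loop of k-means and only replaces the centre computation by "pick the member with the smallest total distance to all other members". PAM itself searches over swaps of medoids and non-medoids. The distance function is passed in, so no vector arithmetic on docs is needed. Names and the iteration cap are assumptions for illustration.

```python
import random


def medoid(members, dist):
    """The member with the smallest sum of distances to all members."""
    return min(members, key=lambda c: sum(dist(c, o) for o in members))


def k_medoids(docs, k, dist, seed=0, max_iter=100):
    """Simplified alternating k-medoid scheme; returns (assignment, medoids)."""
    rng = random.Random(seed)
    medoids = rng.sample(docs, k)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [min(range(k), key=lambda j: dist(d, medoids[j]))
                          for d in docs]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for j in range(k):
            members = [d for d, a in zip(docs, assignment) if a == j]
            if members:
                # All pair-wise distances within the cluster are needed here.
                medoids[j] = medoid(members, dist)
    return assignment, medoids


abs_diff = lambda a, b: abs(a - b)
print(medoid([1, 2, 3, 100], abs_diff))   # the outlier 100 does not drag it away
assignment, medoids = k_medoids([1, 2, 3, 50, 51, 52], 2, abs_diff)
print(sorted(medoids))
```

For comparison, the mean of 1, 2, 3 and 100 is 26.5, far away from three of the four points.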
56 k-medoid and Outlier
57 Content of this Lecture
Text clustering
Clustering algorithms: Hierarchical clustering, K-means, Soft clustering: EM algorithm
Application
58 Soft Clustering
So far, we always assumed that docs are assigned to exactly one cluster.
Probabilistic interpretation: All docs pertain to all clusters with a certain probability.
Generative model: Assume we have k "doc-producing" devices, such as authors, topics, ... Each device produces docs that are normally distributed in vector space with a device-specific mean and variance. Assume that the k devices have produced |D| documents.
Clustering can be interpreted as recovering the mean and variance of each distribution / device.
Solution: Expectation Maximization algorithm (EM).
59 Expectation Maximization (rough sketch, no math)
EM optimizes the set of parameters Θ of a multivariate normal distribution (mean and variance of k clusters) given sample data.
Iterative process with two phases:
Expectation: Assuming an instantiation of Θ, we can assign each doc to its most likely generator / cluster.
Maximization: Assuming an assignment of docs to generators, we can compute the optimal Θ using maximum likelihood estimation. Or using a Bayes approach including a-priori probabilities of generators.
Algorithm: Guess an initial Θ. Iterate through both steps until convergence.
Finds a local optimum; convergence is guaranteed.
K-Means: special case of EM clustering assuming k normal distributions with different means yet equal and minimal variance.
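To make the two phases concrete, here is a sketch for the simplest possible setting: one-dimensional data and k normal distributions. It departs from the rough description above in one labelled way: the expectation step computes soft membership probabilities (responsibilities) for every point instead of a hard assignment, which matches the soft-clustering view of the previous slide. The fixed number of iterations, the variance floor and all names are assumptions for illustration.

```python
import math


def normal_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_1d(xs, mus, variances, weights, n_iter=50):
    """EM for a 1-D mixture of normals; returns (means, variances, weights)."""
    mus, variances, weights = list(mus), list(variances), list(weights)
    k = len(mus)
    for _ in range(n_iter):
        # Expectation: probability of each generator j for each point x.
        resp = []
        for x in xs:
            p = [weights[j] * normal_pdf(x, mus[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # Maximization: re-estimate the parameters from the weighted points.
        for j in range(k):
            n_j = sum(r[j] for r in resp)
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / n_j
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / n_j
            variances[j] = max(var, 1e-6)   # floor keeps the density finite
            weights[j] = n_j / len(xs)
    return mus, variances, weights


xs = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
mus, variances, weights = em_1d(xs, mus=[0.0, 6.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
print(mus, weights)
```

Starting from the rough guesses 0 and 6, the estimated means move to the centres of the two groups of points.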
60 Content of this Lecture
Text clustering
Clustering algorithms
Application: Clustering Phenotypes
61 Mining Phenotypes for Function Prediction
62 Phenotypes
Observable characteristics of an organism produced by the organism's genotype interacting with the environment.
(Figure: for Individual 1 with allele A and Individual 2 with allele T, the chain gene, transcripts, proteins, phenotypes.)
63 Phenotypes
(Figure as on the previous slide, with the two resulting phenotypes labelled "Disease" and "Healthy".)
64 Genotypes
ATCGATCGATGA
ATCGACCGATGA
Measuring genotypes: Sequencing, microarrays, etc.
Describing genotypes: Gene Ontology. > terms, 10 years history, widely accepted.
65 Mining Phenotypes: General Idea
(Figure: Gene A and Gene B, each with a genotype and a phenotype; the link from genotype similarity to phenotype similarity is marked "Established", the reverse link is marked "?".)
Established: Genes with similar genotypes likely have similar phenotypes.
Question: If genes generate very similar phenotypes, do they have the same genotype?
66 Phenotypes
What is a phenotype? Visible characteristic of an organism. Description of a disease. Response to a drug. Characterization of mutants. Results of RNAi / gene knock-out. Expression levels of genes.
Methods for the systematic measurement of phenotypes have been established for only a few years.
Describing phenotypes. Today: text, keywords, abstracts, home-grown vocabulary. Tomorrow: Mammalian Phenotype Ontology?
67 Approach
(Figure: Gene A and Gene B, each with a GO annotation and a phenotype description. Similarity between the phenotype descriptions is used for inference between the genes and for prediction of GO annotation.)
68 Phenodocs
PhenomicDB: 411,102 phenotypes. Short: <250 words.
Remove all phenotypes associated with more than one gene (~500).
Pipeline: PhenomicDB, remove small phenotypes, remove multi-gene phenotypes, remove stop words, stemming, phenodocs.
Result: 39,610 phenodocs for 15,426 genes.
69 K-Means Clustering
Hierarchical clustering would require ~ * = comparisons.
K-Means: Simple, iterative algorithm. Number of clusters must be predefined.
We experimented with clusters.
70 Properties: Phenodoc Similarity of Genes
(Figure, two panels: "Genes with 5 PTs: genes in phenoclusters" and "Genes with 5 PTs: control (random selection)"; x-axis: gene no., y-axis: pairwise similarity from 0 to 1.)
Pair-wise similarity scores of phenodocs of genes in the same cluster, sorted by score.
Result: Phenodocs of genes in phenoclusters are highly similar to each other.
71 PPI: Inter-Connectedness
Interacting proteins often share function.
PPI from the BioGRID database. Not at all a complete dataset.
In >200 clusters, >30% of genes interact with each other. Control (random groups): 3 clusters.
Result: Genes in phenoclusters interact with each other much more often than expected by chance.
(Figure: proteins and interactions from BioGRID. Red proteins have no phenotypes in PhenomicDB.)
72 Coherence of Functional Annotation
Comparison of the GO annotation of genes in phenoclusters. Data from Entrez Gene.
Similarity of two GO terms: normalized number of shared ancestors.
Similarity of two genes: average of the top-k GO pairs.
>200 clusters with score >0.4. Control: 2 clusters.
Result: Genes in phenoclusters have a much higher coherence in functional annotation than expected by chance.
(Figure: excerpt of the Gene Ontology with the terms Molecular Function, Biological Process, Physiological Process, Catalytic Activity, Cellular Process, Binding, Metabolism, Transferase Activity, Nucleotide Binding, Protein Metabolism, Kinase Activity, Cell Communication, Signal Transduction, Protein Modification.)
73 Function Prediction
Can the increased functional coherence of clusters be exploited for function prediction?
General approach:
Compute phenoclusters.
For each cluster, compute the set of associated genes (gene cluster).
In each gene cluster, predict the common GO terms for all genes. Common: annotated to >50% of the genes in the cluster.
Filtering clusters. Idea: Find clusters a priori which give hope for very good results.
Filter 1: Only clusters with >2 members and at least one common GO term.
Filter 2: Only clusters with GO coherence >0.4.
Filter 3: Only clusters with PPI-connectedness >33%.
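The prediction step for a single gene cluster can be sketched as below. It assumes a cluster is given as a dict from gene to its set of known GO terms; the gene names and GO identifiers in the example are placeholders, and the function name is made up. For each gene, the function returns only the common terms that the gene does not carry yet.

```python
from collections import Counter


def predict_common_terms(cluster_annotations, threshold=0.5):
    """Predict for every gene the GO terms common in its cluster.

    cluster_annotations: dict gene -> set of GO terms already annotated.
    A term is common if it is annotated to more than `threshold` of the genes.
    """
    n_genes = len(cluster_annotations)
    counts = Counter(t for terms in cluster_annotations.values() for t in terms)
    common = {t for t, c in counts.items() if c / n_genes > threshold}
    # New predictions are the common terms a gene is still missing.
    return {gene: common - terms for gene, terms in cluster_annotations.items()}


cluster = {                      # placeholder genes and term identifiers
    "gene1": {"GO:a", "GO:b"},
    "gene2": {"GO:a"},
    "gene3": {"GO:a", "GO:c"},
    "gene4": set(),
}
print(predict_common_terms(cluster))
```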
74 Evaluation
How can we know how good we are? Cross-validation:
Separate genes into training (90%) and test (10%).
Remove the annotation from the genes in the test set.
Build clusters and predict functions on the training set.
Compare predicted with removed annotations: precision and recall.
Repeat and average the results (macro-average).
Note: This punishes new and potentially valid annotations.
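The comparison of predicted and removed annotations boils down to two set operations per run, followed by a plain average over the runs. A sketch with illustrative names; treating an empty prediction or an empty removed set as a score of 0 is a convention assumed here.

```python
def precision_recall(predicted, removed):
    """Compare predicted GO terms with the annotations that were removed."""
    true_positives = len(predicted & removed)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(removed) if removed else 0.0
    return precision, recall


def macro_average(scores):
    """Average precision and recall over runs, each run counting equally."""
    precisions, recalls = zip(*scores)
    return sum(precisions) / len(precisions), sum(recalls) / len(recalls)


run1 = precision_recall({"a", "b"}, {"a", "c", "d"})   # one of two predictions is right
run2 = precision_recall({"x"}, {"x"})
print(macro_average([run1, run2]))
```

A predicted term that is valid but was never annotated counts as a false positive here, which is the punishment of new annotations mentioned in the note.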
75 Results for Different Filters
             Filter 1   Filter 1 & Filter 2   Filter 1 & Filter 3
Precision    67.91%     62.52%                60.52%
Recall       22.98%     26.16%                19.78%
(The rows "# of groups", "# of terms" and "# of genes" of this table are not preserved.)
What if we consider predicted terms to be correct that are a little more general than the removed terms (Filter 1)?
One step more general: 75.6% precision, 28.7% recall.
Two steps: 76.3% precision, 30.7% recall.
The less stringent the GO equality, the better the results. This is a common trick in studies using GO.
76 Results for Different Cluster Sizes
(Table with one column per value of K, K increasing from left to right. The K values are only partly preserved ("K ,000" for the first four columns, ",750 3," for the last two). The rows "# Genes" and "# Terms" are not preserved; lost cells are marked "?".)
                       1            2             3             4             5           6
Cluster w/ GO-Sim 1    14 (5.6%)    26 (5.2%)     44 (5.9%)     71 (7.1%)     ? (9.9%)    309 (10.3%)
Cluster w/ PPi 75%     12 (4.8%)    34 (6.8%)     65 (8.7%)     88 (8.8%)     ? (11.4%)   353 (11.8%)
Cluster w/ PPi 33%     49 (19.6%)   119 (23.8%)   193 (25.7%)   252 (25.2%)   ? (24.1%)   717 (23.9%)
Cluster for GO-Pred.   73 (29.2%)   153 (30.6%)   230 (30.7%)   295 (29.5%)   ? (27.2%)   816 (27.2%)
Precision              81.53%       77.16%        74.26%        71.73%        ?           62.89%
Recall                 16.90%       20.22%        24.45%        26.36%        34.64%      34.61%
Avg. Genes/Cluster     ?            ?             ?             ?             4           4
With increasing k:
Clusters are smaller. Number of predicted terms increases. Clusters are more homogeneous.
Number of genes which receive annotations stays roughly the same.
Precision decreases slowly, recall increases. Effect of the rapid increase in the number of predictions.
More informationCSE 5243 INTRO. TO DATA MINING
CSE 5243 INTRO. TO DATA MINING Cluster Analysis: Basic Concepts and Methods Huan Sun, CSE@The Ohio State University 09/25/2017 Slides adapted from UIUC CS412, Fall 2017, by Prof. Jiawei Han 2 Chapter 10.
More informationMultiple Sequence Alignment Sum-of-Pairs and ClustalW. Ulf Leser
Multiple Sequence Alignment Sum-of-Pairs and ClustalW Ulf Leser This Lecture Multiple Sequence Alignment The problem Theoretical approach: Sum-of-Pairs scores Practical approach: ClustalW Ulf Leser: Bioinformatics,
More informationSupport Vector Machines + Classification for IR
Support Vector Machines + Classification for IR Pierre Lison University of Oslo, Dep. of Informatics INF3800: Søketeknologi April 30, 2014 Outline of the lecture Recap of last week Support Vector Machines
More informationINF4820 Algorithms for AI and NLP. Evaluating Classifiers Clustering
INF4820 Algorithms for AI and NLP Evaluating Classifiers Clustering Murhaf Fares & Stephan Oepen Language Technology Group (LTG) September 27, 2017 Today 2 Recap Evaluation of classifiers Unsupervised
More informationIntroduction to Information Retrieval
Introduction to Information Retrieval http://informationretrieval.org IIR 6: Flat Clustering Wiltrud Kessler & Hinrich Schütze Institute for Natural Language Processing, University of Stuttgart 0-- / 83
More informationCHAPTER 4: CLUSTER ANALYSIS
CHAPTER 4: CLUSTER ANALYSIS WHAT IS CLUSTER ANALYSIS? A cluster is a collection of data-objects similar to one another within the same group & dissimilar to the objects in other groups. Cluster analysis
More informationCSE 158. Web Mining and Recommender Systems. Midterm recap
CSE 158 Web Mining and Recommender Systems Midterm recap Midterm on Wednesday! 5:10 pm 6:10 pm Closed book but I ll provide a similar level of basic info as in the last page of previous midterms CSE 158
More informationINF4820 Algorithms for AI and NLP. Evaluating Classifiers Clustering
INF4820 Algorithms for AI and NLP Evaluating Classifiers Clustering Erik Velldal & Stephan Oepen Language Technology Group (LTG) September 23, 2015 Agenda Last week Supervised vs unsupervised learning.
More informationUnsupervised Data Mining: Clustering. Izabela Moise, Evangelos Pournaras, Dirk Helbing
Unsupervised Data Mining: Clustering Izabela Moise, Evangelos Pournaras, Dirk Helbing Izabela Moise, Evangelos Pournaras, Dirk Helbing 1 1. Supervised Data Mining Classification Regression Outlier detection
More informationCSE 573: Artificial Intelligence Autumn 2010
CSE 573: Artificial Intelligence Autumn 2010 Lecture 16: Machine Learning Topics 12/7/2010 Luke Zettlemoyer Most slides over the course adapted from Dan Klein. 1 Announcements Syllabus revised Machine
More informationIntroduction to Information Retrieval
Introduction to Information Retrieval http://informationretrieval.org IIR 16: Flat Clustering Hinrich Schütze Institute for Natural Language Processing, Universität Stuttgart 2009.06.16 1/ 64 Overview
More informationINF4820, Algorithms for AI and NLP: Hierarchical Clustering
INF4820, Algorithms for AI and NLP: Hierarchical Clustering Erik Velldal University of Oslo Sept. 25, 2012 Agenda Topics we covered last week Evaluating classifiers Accuracy, precision, recall and F-score
More informationCOMP 551 Applied Machine Learning Lecture 13: Unsupervised learning
COMP 551 Applied Machine Learning Lecture 13: Unsupervised learning Associate Instructor: Herke van Hoof (herke.vanhoof@mail.mcgill.ca) Slides mostly by: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp551
More information10/14/2017. Dejan Sarka. Anomaly Detection. Sponsors
Dejan Sarka Anomaly Detection Sponsors About me SQL Server MVP (17 years) and MCT (20 years) 25 years working with SQL Server Authoring 16 th book Authoring many courses, articles Agenda Introduction Simple
More information10701 Machine Learning. Clustering
171 Machine Learning Clustering What is Clustering? Organizing data into clusters such that there is high intra-cluster similarity low inter-cluster similarity Informally, finding natural groupings among
More informationSOCIAL MEDIA MINING. Data Mining Essentials
SOCIAL MEDIA MINING Data Mining Essentials Dear instructors/users of these slides: Please feel free to include these slides in your own material, or modify them as you see fit. If you decide to incorporate
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu [Kumar et al. 99] 2/13/2013 Jure Leskovec, Stanford CS246: Mining Massive Datasets, http://cs246.stanford.edu
More information10601 Machine Learning. Hierarchical clustering. Reading: Bishop: 9-9.2
161 Machine Learning Hierarchical clustering Reading: Bishop: 9-9.2 Second half: Overview Clustering - Hierarchical, semi-supervised learning Graphical models - Bayesian networks, HMMs, Reasoning under
More informationINF 4300 Classification III Anne Solberg The agenda today:
INF 4300 Classification III Anne Solberg 28.10.15 The agenda today: More on estimating classifier accuracy Curse of dimensionality and simple feature selection knn-classification K-means clustering 28.10.15
More informationFoundations of Machine Learning CentraleSupélec Fall Clustering Chloé-Agathe Azencot
Foundations of Machine Learning CentraleSupélec Fall 2017 12. Clustering Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr Learning objectives
More informationBioinformatics - Lecture 07
Bioinformatics - Lecture 07 Bioinformatics Clusters and networks Martin Saturka http://www.bioplexity.org/lectures/ EBI version 0.4 Creative Commons Attribution-Share Alike 2.5 License Learning on profiles
More informationEECS730: Introduction to Bioinformatics
EECS730: Introduction to Bioinformatics Lecture 15: Microarray clustering http://compbio.pbworks.com/f/wood2.gif Some slides were adapted from Dr. Shaojie Zhang (University of Central Florida) Microarray
More informationData Mining in Bioinformatics Day 1: Classification
Data Mining in Bioinformatics Day 1: Classification Karsten Borgwardt February 18 to March 1, 2013 Machine Learning & Computational Biology Research Group Max Planck Institute Tübingen and Eberhard Karls
More informationInformation Retrieval and Organisation
Information Retrieval and Organisation Chapter 16 Flat Clustering Dell Zhang Birkbeck, University of London What Is Text Clustering? Text Clustering = Grouping a set of documents into classes of similar
More informationThe exam is closed book, closed notes except your one-page (two-sided) cheat sheet.
CS 189 Spring 2015 Introduction to Machine Learning Final You have 2 hours 50 minutes for the exam. The exam is closed book, closed notes except your one-page (two-sided) cheat sheet. No calculators or
More informationMining di Dati Web. Lezione 3 - Clustering and Classification
Mining di Dati Web Lezione 3 - Clustering and Classification Introduction Clustering and classification are both learning techniques They learn functions describing data Clustering is also known as Unsupervised
More informationBBS654 Data Mining. Pinar Duygulu. Slides are adapted from Nazli Ikizler
BBS654 Data Mining Pinar Duygulu Slides are adapted from Nazli Ikizler 1 Classification Classification systems: Supervised learning Make a rational prediction given evidence There are several methods for
More informationLecture 8 May 7, Prabhakar Raghavan
Lecture 8 May 7, 2001 Prabhakar Raghavan Clustering documents Given a corpus, partition it into groups of related docs Recursively, can induce a tree of topics Given the set of docs from the results of
More informationMachine Learning and Data Mining. Clustering (1): Basics. Kalev Kask
Machine Learning and Data Mining Clustering (1): Basics Kalev Kask Unsupervised learning Supervised learning Predict target value ( y ) given features ( x ) Unsupervised learning Understand patterns of
More informationSupervised vs unsupervised clustering
Classification Supervised vs unsupervised clustering Cluster analysis: Classes are not known a- priori. Classification: Classes are defined a-priori Sometimes called supervised clustering Extract useful
More informationClustering Techniques
Clustering Techniques Bioinformatics: Issues and Algorithms CSE 308-408 Fall 2007 Lecture 16 Lopresti Fall 2007 Lecture 16-1 - Administrative notes Your final project / paper proposal is due on Friday,
More informationNatural Language Processing
Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction 2 Machine Learning Field of study that gives computers the ability to learn without
More information6.034 Quiz 2, Spring 2005
6.034 Quiz 2, Spring 2005 Open Book, Open Notes Name: Problem 1 (13 pts) 2 (8 pts) 3 (7 pts) 4 (9 pts) 5 (8 pts) 6 (16 pts) 7 (15 pts) 8 (12 pts) 9 (12 pts) Total (100 pts) Score 1 1 Decision Trees (13
More informationCPSC 340: Machine Learning and Data Mining. Probabilistic Classification Fall 2017
CPSC 340: Machine Learning and Data Mining Probabilistic Classification Fall 2017 Admin Assignment 0 is due tonight: you should be almost done. 1 late day to hand it in Monday, 2 late days for Wednesday.
More informationClustering Algorithms for general similarity measures
Types of general clustering methods Clustering Algorithms for general similarity measures general similarity measure: specified by object X object similarity matrix 1 constructive algorithms agglomerative
More informationIntroduction to Computer Science
DM534 Introduction to Computer Science Clustering and Feature Spaces Richard Roettger: About Me Computer Science (Technical University of Munich and thesis at the ICSI at the University of California at
More informationProblem 1: Complexity of Update Rules for Logistic Regression
Case Study 1: Estimating Click Probabilities Tackling an Unknown Number of Features with Sketching Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox January 16 th, 2014 1
More informationMachine Learning / Jan 27, 2010
Revisiting Logistic Regression & Naïve Bayes Aarti Singh Machine Learning 10-701/15-781 Jan 27, 2010 Generative and Discriminative Classifiers Training classifiers involves learning a mapping f: X -> Y,
More informationINF4820. Clustering. Erik Velldal. Nov. 17, University of Oslo. Erik Velldal INF / 22
INF4820 Clustering Erik Velldal University of Oslo Nov. 17, 2009 Erik Velldal INF4820 1 / 22 Topics for Today More on unsupervised machine learning for data-driven categorization: clustering. The task
More informationClustering. Informal goal. General types of clustering. Applications: Clustering in information search and analysis. Example applications in search
Informal goal Clustering Given set of objects and measure of similarity between them, group similar objects together What mean by similar? What is good grouping? Computation time / quality tradeoff 1 2
More informationSlides for Data Mining by I. H. Witten and E. Frank
Slides for Data Mining by I. H. Witten and E. Frank 7 Engineering the input and output Attribute selection Scheme-independent, scheme-specific Attribute discretization Unsupervised, supervised, error-
More informationPV211: Introduction to Information Retrieval https://www.fi.muni.cz/~sojka/pv211
PV: Introduction to Information Retrieval https://www.fi.muni.cz/~sojka/pv IIR 6: Flat Clustering Handout version Petr Sojka, Hinrich Schütze et al. Faculty of Informatics, Masaryk University, Brno Center
More informationIntroduction to Machine Learning. Xiaojin Zhu
Introduction to Machine Learning Xiaojin Zhu jerryzhu@cs.wisc.edu Read Chapter 1 of this book: Xiaojin Zhu and Andrew B. Goldberg. Introduction to Semi- Supervised Learning. http://www.morganclaypool.com/doi/abs/10.2200/s00196ed1v01y200906aim006
More informationCluster Analysis. Angela Montanari and Laura Anderlucci
Cluster Analysis Angela Montanari and Laura Anderlucci 1 Introduction Clustering a set of n objects into k groups is usually moved by the aim of identifying internally homogenous groups according to a
More informationIntroduction to Mobile Robotics
Introduction to Mobile Robotics Clustering Wolfram Burgard Cyrill Stachniss Giorgio Grisetti Maren Bennewitz Christian Plagemann Clustering (1) Common technique for statistical data analysis (machine learning,
More informationUnderstanding Clustering Supervising the unsupervised
Understanding Clustering Supervising the unsupervised Janu Verma IBM T.J. Watson Research Center, New York http://jverma.github.io/ jverma@us.ibm.com @januverma Clustering Grouping together similar data
More informationStructured Learning. Jun Zhu
Structured Learning Jun Zhu Supervised learning Given a set of I.I.D. training samples Learn a prediction function b r a c e Supervised learning (cont d) Many different choices Logistic Regression Maximum
More informationOverview Citation. ML Introduction. Overview Schedule. ML Intro Dataset. Introduction to Semi-Supervised Learning Review 10/4/2010
INFORMATICS SEMINAR SEPT. 27 & OCT. 4, 2010 Introduction to Semi-Supervised Learning Review 2 Overview Citation X. Zhu and A.B. Goldberg, Introduction to Semi- Supervised Learning, Morgan & Claypool Publishers,
More informationEvaluation of different biological data and computational classification methods for use in protein interaction prediction.
Evaluation of different biological data and computational classification methods for use in protein interaction prediction. Yanjun Qi, Ziv Bar-Joseph, Judith Klein-Seetharaman Protein 2006 Motivation Correctly
More informationSequence clustering. Introduction. Clustering basics. Hierarchical clustering
Sequence clustering Introduction Data clustering is one of the key tools used in various incarnations of data-mining - trying to make sense of large datasets. It is, thus, natural to ask whether clustering
More informationWhat to come. There will be a few more topics we will cover on supervised learning
Summary so far Supervised learning learn to predict Continuous target regression; Categorical target classification Linear Regression Classification Discriminative models Perceptron (linear) Logistic regression
More informationMachine Learning Techniques for Data Mining
Machine Learning Techniques for Data Mining Eibe Frank University of Waikato New Zealand 10/25/2000 1 PART VII Moving on: Engineering the input and output 10/25/2000 2 Applying a learner is not all Already
More informationCSE 5243 INTRO. TO DATA MINING
CSE 5243 INTRO. TO DATA MINING Cluster Analysis: Basic Concepts and Methods Huan Sun, CSE@The Ohio State University Slides adapted from UIUC CS412, Fall 2017, by Prof. Jiawei Han 2 Chapter 10. Cluster
More informationClustering CE-324: Modern Information Retrieval Sharif University of Technology
Clustering CE-324: Modern Information Retrieval Sharif University of Technology M. Soleymani Fall 2014 Most slides have been adapted from: Profs. Manning, Nayak & Raghavan (CS-276, Stanford) Ch. 16 What
More informationIntroduction to Machine Learning CMU-10701
Introduction to Machine Learning CMU-10701 Clustering and EM Barnabás Póczos & Aarti Singh Contents Clustering K-means Mixture of Gaussians Expectation Maximization Variational Methods 2 Clustering 3 K-
More informationCS229 Final Project: Predicting Expected Response Times
CS229 Final Project: Predicting Expected Email Response Times Laura Cruz-Albrecht (lcruzalb), Kevin Khieu (kkhieu) December 15, 2017 1 Introduction Each day, countless emails are sent out, yet the time
More informationSupervised and Unsupervised Learning (II)
Supervised and Unsupervised Learning (II) Yong Zheng Center for Web Intelligence DePaul University, Chicago IPD 346 - Data Science for Business Program DePaul University, Chicago, USA Intro: Supervised
More informationClassification: Feature Vectors
Classification: Feature Vectors Hello, Do you want free printr cartriges? Why pay more when you can get them ABSOLUTELY FREE! Just # free YOUR_NAME MISSPELLED FROM_FRIEND... : : : : 2 0 2 0 PIXEL 7,12
More informationDATA MINING LECTURE 7. Hierarchical Clustering, DBSCAN The EM Algorithm
DATA MINING LECTURE 7 Hierarchical Clustering, DBSCAN The EM Algorithm CLUSTERING What is a Clustering? In general a grouping of objects such that the objects in a group (cluster) are similar (or related)
More informationFeature Extractors. CS 188: Artificial Intelligence Fall Some (Vague) Biology. The Binary Perceptron. Binary Decision Rule.
CS 188: Artificial Intelligence Fall 2008 Lecture 24: Perceptrons II 11/24/2008 Dan Klein UC Berkeley Feature Extractors A feature extractor maps inputs to feature vectors Dear Sir. First, I must solicit
More informationPV211: Introduction to Information Retrieval
PV211: Introduction to Information Retrieval http://www.fi.muni.cz/~sojka/pv211 IIR 15-1: Support Vector Machines Handout version Petr Sojka, Hinrich Schütze et al. Faculty of Informatics, Masaryk University,
More informationPreface to the Second Edition. Preface to the First Edition. 1 Introduction 1
Preface to the Second Edition Preface to the First Edition vii xi 1 Introduction 1 2 Overview of Supervised Learning 9 2.1 Introduction... 9 2.2 Variable Types and Terminology... 9 2.3 Two Simple Approaches
More informationk-means demo Administrative Machine learning: Unsupervised learning" Assignment 5 out
Machine learning: Unsupervised learning" David Kauchak cs Spring 0 adapted from: http://www.stanford.edu/class/cs76/handouts/lecture7-clustering.ppt http://www.youtube.com/watch?v=or_-y-eilqo Administrative
More informationCS 8520: Artificial Intelligence. Machine Learning 2. Paula Matuszek Fall, CSC 8520 Fall Paula Matuszek
CS 8520: Artificial Intelligence Machine Learning 2 Paula Matuszek Fall, 2015!1 Regression Classifiers We said earlier that the task of a supervised learning system can be viewed as learning a function
More informationCLUSTERING IN BIOINFORMATICS
CLUSTERING IN BIOINFORMATICS CSE/BIMM/BENG 8 MAY 4, 0 OVERVIEW Define the clustering problem Motivation: gene expression and microarrays Types of clustering Clustering algorithms Other applications of
More informationData Informatics. Seon Ho Kim, Ph.D.
Data Informatics Seon Ho Kim, Ph.D. seonkim@usc.edu Clustering Overview Supervised vs. Unsupervised Learning Supervised learning (classification) Supervision: The training data (observations, measurements,
More informationAdministrative. Machine learning code. Supervised learning (e.g. classification) Machine learning: Unsupervised learning" BANANAS APPLES
Administrative Machine learning: Unsupervised learning" Assignment 5 out soon David Kauchak cs311 Spring 2013 adapted from: http://www.stanford.edu/class/cs276/handouts/lecture17-clustering.ppt Machine
More informationMachine Learning. Unsupervised Learning. Manfred Huber
Machine Learning Unsupervised Learning Manfred Huber 2015 1 Unsupervised Learning In supervised learning the training data provides desired target output for learning In unsupervised learning the training
More informationCS6375: Machine Learning Gautam Kunapuli. Mid-Term Review
Gautam Kunapuli Machine Learning Data is identically and independently distributed Goal is to learn a function that maps to Data is generated using an unknown function Learn a hypothesis that minimizes
More informationUnsupervised Learning. Presenter: Anil Sharma, PhD Scholar, IIIT-Delhi
Unsupervised Learning Presenter: Anil Sharma, PhD Scholar, IIIT-Delhi Content Motivation Introduction Applications Types of clustering Clustering criterion functions Distance functions Normalization Which
More informationMachine Learning: Think Big and Parallel
Day 1 Inderjit S. Dhillon Dept of Computer Science UT Austin CS395T: Topics in Multicore Programming Oct 1, 2013 Outline Scikit-learn: Machine Learning in Python Supervised Learning day1 Regression: Least
More informationTypes of general clustering methods. Clustering Algorithms for general similarity measures. Similarity between clusters
Types of general clustering methods Clustering Algorithms for general similarity measures agglomerative versus divisive algorithms agglomerative = bottom-up build up clusters from single objects divisive
More informationCPSC 340: Machine Learning and Data Mining. Non-Parametric Models Fall 2016
CPSC 340: Machine Learning and Data Mining Non-Parametric Models Fall 2016 Admin Course add/drop deadline tomorrow. Assignment 1 is due Friday. Setup your CS undergrad account ASAP to use Handin: https://www.cs.ubc.ca/getacct
More informationData Mining Algorithms
for the original version: -JörgSander and Martin Ester - Jiawei Han and Micheline Kamber Data Management and Exploration Prof. Dr. Thomas Seidl Data Mining Algorithms Lecture Course with Tutorials Wintersemester
More informationData Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining
Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 4/18/004 1
More informationHard clustering. Each object is assigned to one and only one cluster. Hierarchical clustering is usually hard. Soft (fuzzy) clustering
An unsupervised machine learning problem Grouping a set of objects in such a way that objects in the same group (a cluster) are more similar (in some sense or another) to each other than to those in other
More information