Semi-supervised learning SSL (on graphs)
- Elisabeth Gilmore
- 6 years ago
Transcription
1 Semi-supervised learning (SSL) on graphs
2 Announcement: No office hour for William after class today!
3 Semi-supervised learning. Given: a pool of labeled examples L and a (usually larger) pool of unlabeled examples U. Option 1 for using L and U: ignore U and use supervised learning on L. Option 2: ignore the labels in L+U and use k-means, etc., to find clusters; then label each cluster using L. Question: can you use both L and U to do better?
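Option 2 on slide 3 (cluster first, then label each cluster using L) can be sketched as follows. This is a minimal illustration, not the lecture's method: it assumes the clustering has already been computed by some other routine, and it assumes a majority vote among the labeled members of each cluster. The names `label_clusters`, `cluster_of`, and `labeled` are hypothetical.

```python
from collections import Counter


def label_clusters(cluster_of, labeled):
    """Give every example the majority label of its cluster.

    cluster_of: dict example -> cluster id (from k-means or similar, assumed given)
    labeled:    dict example -> label, for the small labeled pool L
    Examples in clusters with no labeled member get None.
    """
    votes = {}
    for example, label in labeled.items():
        votes.setdefault(cluster_of[example], Counter())[label] += 1
    # Majority vote per cluster (an assumption; the slide only says "using L").
    cluster_label = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}
    return {x: cluster_label.get(c) for x, c in cluster_of.items()}


cluster_of = {"x1": 0, "x2": 0, "x3": 1, "x4": 1, "x5": 2}
labeled = {"x1": "city", "x3": "trait"}
predicted = label_clusters(cluster_of, labeled)
```

Cluster 2 contains no labeled example, so `x5` stays unlabeled, which is one way the clustering can fail to give you what you want.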
4 SSL is Somewhere Between Clustering and Supervised Learning
5 SSL is Between Clustering and SL
6 What is a natural grouping among these objects? (slides: Bhavana Dalvi)
7 SSL is Between Clustering and SL: clustering is unconstrained and may not give you what you want; maybe this clustering is as good as the other.
8 SSL is Between Clustering and SL
9 SSL is Between Clustering and SL
10 SSL is Between Clustering and SL: supervised learning with few labels is also unconstrained and may not give you what you want.
11 SSL is Between Clustering and SL
12 SSL is Between Clustering and SL: this clustering isn't consistent with the labels.
13 SSL is Between Clustering and SL
14 SSL in Action: The NELL System
15 Types of SSL. Margin-based: transductive SVM; logistic regression with entropic regularization. Generative: seeded k-means. Nearest-neighbor-like: graph-based SSL.
16 Harmonic Fields, aka CoEM, aka wvRN
17 Harmonic fields (Ghahramani, Lafferty and Zhu). Idea: construct a graph connecting the most similar examples (a k-NN graph). Intuition: nearby points should have similar labels, so labels should propagate through the graph. Formalization: try to minimize an energy defined over the label vector y (in this example, y is a length-10 vector). Figure legend: observed label.
18 Harmonic fields (Ghahramani, Lafferty and Zhu). Result 1: at the minimal-energy state, each node's value is a weighted average of its neighbors' values. Figure legend: observed label.
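The energy formula itself is not reproduced in this transcript. A common way to write the harmonic-field energy, assuming symmetric nonnegative edge weights $w_{ij}$ and one real-valued score $y_i$ per node (this notation is an assumption, not taken from the slides), is shown below together with the stationarity condition that gives Result 1:

```latex
E(y) = \frac{1}{2} \sum_{i,j} w_{ij} \, (y_i - y_j)^2

% For an unlabeled node i, set the partial derivative to zero:
\frac{\partial E}{\partial y_i} = 2 \sum_{j} w_{ij} \, (y_i - y_j) = 0
\quad\Longrightarrow\quad
y_i = \frac{\sum_j w_{ij} \, y_j}{\sum_j w_{ij}}
```

Labeled nodes are held fixed at their observed values, so the condition only applies to the unlabeled ones.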
19 Harmonic field LP algorithm. Result 2: you can reach the minimal-energy state with a simple iterative algorithm. Step 1: for each seed example $(x_i, y_i)$, let $V^0(i,c) = [y_i = c]$. Step 2: for $t = 1, \dots, T$ (T is about 5), let $V^{t+1}(i,c)$ be the weighted average of $V^t(j,c)$ over all j that are linked to i, renormalized: $V^{t+1}(i,c) = \frac{1}{Z} \sum_j w_{i,j} V^t(j,c)$. For seeds, reset $V^{t+1}(i,c) = [y_i = c]$.
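The two steps on slide 19 can be sketched in plain Python. This is an illustration under stated assumptions rather than the lecture's code: the graph is a dict of dicts of edge weights, Z is taken to be the sum of a node's scores over classes, and the names `label_propagation`, `weights`, and `seeds` are hypothetical.

```python
def label_propagation(weights, seeds, num_classes, num_iters=5):
    """weights: dict node -> {neighbor: weight}; seeds: dict node -> class index."""
    nodes = list(weights)
    # Step 1: V_0(i, c) = [y_i = c] for seeds, zero for every other node.
    scores = {i: [0.0] * num_classes for i in nodes}
    for i, y in seeds.items():
        scores[i][y] = 1.0
    # Step 2: repeat the weighted-average update T times.
    for _ in range(num_iters):
        new_scores = {}
        for i in nodes:
            total = [0.0] * num_classes
            for j, w in weights[i].items():
                for c in range(num_classes):
                    total[c] += w * scores[j][c]
            # Renormalize so the class scores of node i sum to 1 (assumed Z).
            z = sum(total)
            new_scores[i] = [t / z for t in total] if z > 0 else total
        # Seeds are reset to their observed labels after every pass.
        for i, y in seeds.items():
            new_scores[i] = [1.0 if c == y else 0.0 for c in range(num_classes)]
        scores = new_scores
    return scores


# A four-node chain a - b - c - d with unit weights; a and d are seeds.
chain = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
result = label_propagation(chain, {"a": 0, "d": 1}, num_classes=2, num_iters=50)
```

On this chain the scores for `b` settle near (2/3, 1/3), which is exactly the weighted average of its two neighbors, as Result 1 predicts.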
20 Harmonic fields (Ghahramani, Lafferty and Zhu). This family of techniques is called label propagation.
21 Harmonic fields (Ghahramani, Lafferty and Zhu). This experiment points out some of the issues with LP: 1. What distance metric do you use? 2. What energy function do you minimize? 3. What is the right value for K in your K-NN graph? Is a K-NN graph right? 4. If you have lots of data, how expensive is it to build the graph? This family of techniques is called label propagation.
22 NELL: Uses Co-EM ~= HF. Task: extract cities. Examples: Paris, Pittsburgh, Seattle, Cupertino, San Francisco, Austin, Berlin, denial, anxiety, selfishness. Features: "mayor of arg1", "live in arg1", "arg1 is home of", "traits such as arg1".
23 Semi-Supervised Bootstrapped Learning via Label Propagation. Graph nodes: features "mayor of arg1", "arg1 is home of", "live in arg1", "traits such as arg1"; examples Paris, Pittsburgh, San Francisco, Austin, Seattle, anxiety, denial, selfishness.
24 Semi-Supervised Bootstrapped Learning via Label Propagation. Information from other categories tells you how far to go (when to stop propagating). Graph nodes: features "mayor of arg1", "arg1 is home of", "live in arg1", "traits such as arg1"; examples Paris, Pittsburgh, San Francisco, Austin, Seattle, anxiety, denial, arrogance, selfishness. Figure annotations: nodes near seeds; nodes far from seeds.
25 Difference: graph construction is not instance-to-instance but instance-to-feature. Important reformulation: the k-NN graph is expensive to build; the instance-feature graph may not be. Example nodes: Paris, Pittsburgh, San Francisco, Austin, Seattle, anxiety, denial, selfishness.
26 Some other general issues with SSL. How much unlabeled data do you want? Suppose you're optimizing J = J_L(L) + J_U(U). If |U| >> |L|, does J_U dominate J? If so, you're basically just clustering. Often we need to balance J_L and J_U. Besides L, what other information about the task is useful (or necessary)? Common choice: relative frequency of classes. There are various ways of incorporating this into the optimization problem.
27 ASONAM-2010 (Advances in Social Networks Analysis and Mining)
28 Network Datasets with Known Classes: UBMCBlog, AGBlog, MSPBlog, Cora, Citeseer
29 RWR (random walk with restart), aka Personalized PageRank: defined as the fixpoint of an iterative update. Seed selection: 1. order by PageRank, degree, or randomly; 2. go down the list until you have at least k examples per class.
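The fixpoint equation on slide 29 is missing from the transcript. A common formulation of random walk with restart, assumed here rather than taken from the slide, is r = (1 - d) u + d W r, where u is a uniform distribution over the seeds, W spreads each node's score evenly over its neighbors, and d is a damping factor. The sketch below iterates that update; `random_walk_with_restart`, `adj`, and the default `damping=0.85` are hypothetical choices.

```python
def random_walk_with_restart(adj, seeds, damping=0.85, num_iters=100):
    """adj: dict node -> list of neighbors; seeds: set of seed nodes."""
    nodes = list(adj)
    # Restart distribution u: uniform over the seed nodes.
    u = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    r = dict(u)
    for _ in range(num_iters):
        # Restart term (1 - d) * u.
        nxt = {n: (1.0 - damping) * u[n] for n in nodes}
        # Walk term d * W r: each node passes its score evenly to its neighbors.
        for j in nodes:
            neighbors = adj[j]
            if not neighbors:
                continue  # a node with no neighbors passes nothing on
            share = damping * r[j] / len(neighbors)
            for i in neighbors:
                nxt[i] += share
        r = nxt
    return r


# A five-node path a - b - c - d - e, with the single seed at one end.
path = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
ranks = random_walk_with_restart(path, {"a"})
```

Scores fall off with distance from the seed, which is the property a classifier built on one walk per class would rely on.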
30 Results: blog data (HF method). Seed selection: Random, Degree, PageRank.
31 Results: more blog data. Seed selection: Random, Degree, PageRank.
32 Results: citation data. Seed selection: Random, Degree, PageRank.
33 Seeding: MultiRankWalk
34 Seeding: HF/wvRN
35 MultiRankWalk vs HF/wvRN/CoEM. Seeds are marked S. Panels: HF, MRW.
36 Back to Experiments: Network Datasets with Known Classes: UBMCBlog, AGBlog, MSPBlog, Cora, Citeseer
37 MultiRankWalk vs wvRN/HF/CoEM
38 Harmonic Fields, aka CoEM, aka wvRN
39 CoEM/HF/wvRN. One definition [Macskassy & Provost, JMLR 2007]: a simple relational classifier is the same as the harmonic field: the score of each node in the graph is the harmonic (linearly weighted) average of its neighbors' scores.
40 CoEM/wvRN/HF. Another justification of the same algorithm goes back to 2003: start with co-training with a naïve Bayes learner.
41 CoEM/wvRN/HF. One algorithm with several justifications. One is to start with co-training with a naïve Bayes learner and compare it to an EM version of naïve Bayes. E: soft-classify unlabeled examples with the NB classifier. M: re-train the classifier with the soft-labeled examples.
42 CoEM/wvRN/HF. A second experiment. Each + example: concatenate features from two documents, one of class A+, one of class B+. Each - example: concatenate features from two documents, one of class A-, one of class B-. Features are prefixed with A, B, so the two feature sets are disjoint.
43 CoEM/wvRN/HF. A second experiment. Each + example: concatenate features from two documents, one of class A+, one of class B+. Each - example: concatenate features from two documents, one of class A-, one of class B-. Features are prefixed with A, B, so the two feature sets are disjoint. Now co-training outperforms EM.
44 CoEM/wvRN/HF. Co-training with a naïve Bayes learner (incremental hard assignments) vs an EM version of naïve Bayes (iterative soft assignments). E: soft-classify unlabeled examples with the NB classifier. M: re-train the classifier with the soft-labeled examples.
45 Co-EM for a Rote Learner: equivalent to HF on a bipartite graph of NPs (e.g., Pittsburgh) and contexts (e.g., "lives in _").
46 SSL AS OPTIMIZATION
47 SSL as optimization and Modified Adsorption (slides from Partha Talukdar)
48
49 yet another name for HF/wvRN/CoEM
50 match seeds; smoothness; prior
51 Adsorption SSL algorithm
52
53
54 How to do this minimization? First, differentiate to find where the minimum is. Then use the Jacobi method, an iterative scheme for solving Ax = b for x.
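The Jacobi iteration named on slide 54 lost its equations in transcription. In its standard form, each pass recomputes every coordinate from the previous pass: x_i <- (b_i - sum over j != i of a_ij x_j) / a_ii. The sketch below is a generic Jacobi solver on a small dense system, not the MAD update itself; `jacobi` and the example matrix are hypothetical, and convergence is only guaranteed for suitable matrices such as strictly diagonally dominant ones.

```python
def jacobi(a, b, num_iters=50):
    """Approximately solve a x = b for a square matrix given as a list of rows."""
    n = len(b)
    x = [0.0] * n
    for _ in range(num_iters):
        # Every new coordinate is computed from the previous iterate only;
        # the list on the right is fully built before x is rebound.
        x = [
            (b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
            for i in range(n)
        ]
    return x


# Strictly diagonally dominant example: 4x + y = 1, 2x + 3y = 2.
solution = jacobi([[4.0, 1.0], [2.0, 3.0]], [1.0, 2.0])
```

Because each coordinate depends only on the previous iterate, all updates within a pass can run in parallel, which is what makes this style of solver a natural fit for graph propagation.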
55
56
57 /HF/ precision-recall break-even point
58 /HF/
59 /HF/
60 From HTML tables on the web that are used for data, not formatting; from mining patterns like "musicians such as Bob Dylan".
61
62
63 MAD SKETCHES
64 Followup work (AIStats 2014). Propagating labels usually requires a small number of optimization passes, basically like label propagation passes. Each is linear in the number of edges and the number of labels being propagated. Can you do better? Basic idea: store labels in a count-min sketch, which is basically a compact approximation of an object→double mapping.
65 Count-min sketches: split a real vector into k ranges, one for each hash function (h1, h2, h3). cm.inc("fred flintstone", 3) and cm.inc("barney rubble", 5): add the value to each hash location.
66 Count-min sketches: split a real vector into k ranges, one for each hash function (h1, h2, h3). cm.get("fred flintstone") gives 3 and cm.get("barney rubble") gives 5 in the slide's example: take the min when retrieving a value.
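A count-min sketch along the lines of slides 65 and 66 can be written in a few lines. The method names `inc` and `get` mirror the calls shown on the slides; everything else is an assumption made for illustration, including the class name `CountMinSketch`, the default depth and width, and the use of SHA-256 to derive one hash function per row.

```python
import hashlib


class CountMinSketch:
    """Approximate key -> nonnegative value map using depth rows of width cells."""

    def __init__(self, depth=3, width=64):
        self.depth = depth
        self.width = width
        self.table = [[0.0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One hash function per row, derived by salting the key with the row id.
        digest = hashlib.sha256(f"{row}:{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def inc(self, key, value):
        # Add the value to the key's cell in every row.
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += value

    def get(self, key):
        # Collisions can only inflate a cell, so the minimum is the best estimate.
        return min(
            self.table[row][self._index(row, key)] for row in range(self.depth)
        )


cm = CountMinSketch()
cm.inc("fred flintstone", 3)
cm.inc("barney rubble", 5)
fred = cm.get("fred flintstone")
barney = cm.get("barney rubble")
```

With nonnegative increments the estimate never undershoots the true value, and it is exact whenever a key has at least one row free of collisions.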
67 Followup work (AIStats 2014). Propagating labels usually requires a small number of optimization passes, basically like label propagation passes. Each is linear in the number of edges and the sketch size (rather than the number of labels being propagated). Sketches can be combined linearly without unpacking them: sketch(av + bw) = a*sketch(v) + b*sketch(w). Sketches are good at storing skewed distributions.
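The linearity property on slide 67, sketch(av + bw) = a*sketch(v) + b*sketch(w), can be checked directly on a toy sketch. This is a self-contained illustration with assumed details: label vectors are plain dicts, the table is 3 rows by 32 cells, hashing is SHA-256 based, and `cell`, `sketch`, and `combine` are hypothetical helper names.

```python
import hashlib

WIDTH, DEPTH = 32, 3


def cell(row, key):
    """Column used by `key` in the given row (one hash function per row)."""
    digest = hashlib.sha256(f"{row}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % WIDTH


def sketch(vector):
    """Build a count-min table from a dict label -> weight."""
    table = [[0.0] * WIDTH for _ in range(DEPTH)]
    for key, value in vector.items():
        for row in range(DEPTH):
            table[row][cell(row, key)] += value
    return table


def combine(a, sv, b, sw):
    """Cell-wise a * sv + b * sw, computed without unpacking either sketch."""
    return [[a * x + b * y for x, y in zip(rv, rw)] for rv, rw in zip(sv, sw)]


v = {"city": 0.9, "person": 0.1}
w = {"city": 0.2, "trait": 0.8}
a, b = 0.5, 2.0
mixed = {k: a * v.get(k, 0.0) + b * w.get(k, 0.0) for k in set(v) | set(w)}
direct = sketch(mixed)
via_sketches = combine(a, sketch(v), b, sketch(w))
```

Because the combination is cell-wise, a propagation step that averages neighbors' label vectors can operate on the fixed-size tables, which is where the dependence on sketch size instead of label count comes from.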
68 Followup work (AIStats 2014). Label distributions are often very skewed: sparse initial labels; community structure (labels from other subcommunities have small weight).
69 Followup work (AIStats 2014). "Self-injection": similarity computation. Datasets: Freebase, Flick-10k.
70 Followup work (AIStats 2014): Freebase
71 Followup work (AIStats 2014): 100 Gb available
72 Even more recent work AIStats
73 Differences: objective function. Terms: seeds; smoothness; close to uniform label distribution; normalized predictions.
74 Differences: scaling up. Updates are done in parallel with Pregel. Replace the count-min sketch with a streaming approach: updates from neighbors are a stream; break the stream into sections; maintain a list of (y, Prob(y), Δ); filter out labels at the end of a section if Prob(y)+Δ is small.
75 Results with EXPANDER
COMP 551 Applied Machine Learning Lecture 13: Unsupervised learning Associate Instructor: Herke van Hoof (herke.vanhoof@mail.mcgill.ca) Slides mostly by: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp551
More informationCS 229 Midterm Review
CS 229 Midterm Review Course Staff Fall 2018 11/2/2018 Outline Today: SVMs Kernels Tree Ensembles EM Algorithm / Mixture Models [ Focus on building intuition, less so on solving specific problems. Ask
More informationProblem 1: Complexity of Update Rules for Logistic Regression
Case Study 1: Estimating Click Probabilities Tackling an Unknown Number of Features with Sketching Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox January 16 th, 2014 1
More informationINF4820 Algorithms for AI and NLP. Evaluating Classifiers Clustering
INF4820 Algorithms for AI and NLP Evaluating Classifiers Clustering Murhaf Fares & Stephan Oepen Language Technology Group (LTG) September 27, 2017 Today 2 Recap Evaluation of classifiers Unsupervised
More informationCS570: Introduction to Data Mining
CS570: Introduction to Data Mining Classification Advanced Reading: Chapter 8 & 9 Han, Chapters 4 & 5 Tan Anca Doloc-Mihu, Ph.D. Slides courtesy of Li Xiong, Ph.D., 2011 Han, Kamber & Pei. Data Mining.
More informationBipartite Edge Prediction via Transductive Learning over Product Graphs
Bipartite Edge Prediction via Transductive Learning over Product Graphs Hanxiao Liu, Yiming Yang School of Computer Science, Carnegie Mellon University July 8, 2015 ICML 2015 Bipartite Edge Prediction
More informationMapReduce ML & Clustering Algorithms
MapReduce ML & Clustering Algorithms Reminder MapReduce: A trade-off between ease of use & possible parallelism Graph Algorithms Approaches: Reduce input size (filtering) Graph specific optimizations (Pregel
More informationCS249: ADVANCED DATA MINING
CS249: ADVANCED DATA MINING Classification Evaluation and Practical Issues Instructor: Yizhou Sun yzsun@cs.ucla.edu April 24, 2017 Homework 2 out Announcements Due May 3 rd (11:59pm) Course project proposal
More informationMidterm Examination CS540-2: Introduction to Artificial Intelligence
Midterm Examination CS540-2: Introduction to Artificial Intelligence March 15, 2018 LAST NAME: FIRST NAME: Problem Score Max Score 1 12 2 13 3 9 4 11 5 8 6 13 7 9 8 16 9 9 Total 100 Question 1. [12] Search
More informationGenerative and discriminative classification techniques
Generative and discriminative classification techniques Machine Learning and Category Representation 013-014 Jakob Verbeek, December 13+0, 013 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.13.14
More informationChapter 4: Non-Parametric Techniques
Chapter 4: Non-Parametric Techniques Introduction Density Estimation Parzen Windows Kn-Nearest Neighbor Density Estimation K-Nearest Neighbor (KNN) Decision Rule Supervised Learning How to fit a density
More informationCS512 (Spring 2012) Advanced Data Mining : Midterm Exam I
CS512 (Spring 2012) Advanced Data Mining : Midterm Exam I (Thursday, March 1, 2012, 90 minutes, 100 marks brief answers directly written on the exam paper) Note: Closed book and notes but one reference
More information6.034 Quiz 2, Spring 2005
6.034 Quiz 2, Spring 2005 Open Book, Open Notes Name: Problem 1 (13 pts) 2 (8 pts) 3 (7 pts) 4 (9 pts) 5 (8 pts) 6 (16 pts) 7 (15 pts) 8 (12 pts) 9 (12 pts) Total (100 pts) Score 1 1 Decision Trees (13
More information