Guiding Semi-Supervision with Constraint-Driven Learning
1 Guiding Semi-Supervision with Constraint-Driven Learning. Ming-Wei Chang, Lev Ratinov, Dan Roth. Department of Computer Science, University of Illinois at Urbana-Champaign. Paper presentation by: Drew Stone
2 Outline: 1 Background (semi-supervised learning; constraint-driven learning). 2 Constraint-driven learning with semi-supervision (introduction; model; scoring; constraint pipeline; the Constraint-Driven Learning (CODL) algorithm). 3 Experiments.
4 Semi-supervised learning Introduction. Question: when labelled data is scarce, how can we take advantage of unlabelled data for training? Intuition: if there is structure in the underlying distribution of samples, points that are close together or clustered may share labels.
5 Semi-supervised learning Framework. Given a model M trained on m labelled samples from a distribution D, and a set U of unlabelled examples drawn from D: learn labels for each example in U (Learn(U, M)), then reuse the now-labelled examples to tune M into M'. Benefit: access to more training data. Drawback: the learned model may drift away from the correct classifier if the assumptions on the distribution do not hold.
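The loop above can be sketched in a few lines. The NearestCentroid model below is an illustrative stand-in (any classifier with fit/predict would do), and the one-dimensional data is a toy example, not from the paper:

```python
class NearestCentroid:
    """Toy 1-D classifier: predict the label of the nearest class centroid."""
    def fit(self, pairs):
        sums, counts = {}, {}
        for x, y in pairs:
            sums[y] = sums.get(y, 0.0) + x
            counts[y] = counts.get(y, 0) + 1
        self.centroids = {y: sums[y] / counts[y] for y in sums}
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda y: abs(x - self.centroids[y]))

def self_train(model, labelled, unlabelled, rounds=3):
    """Train on labelled data, then repeatedly label U and re-tune the model."""
    model.fit(labelled)
    for _ in range(rounds):
        pseudo = [(x, model.predict(x)) for x in unlabelled]  # Learn(U, M)
        model.fit(labelled + pseudo)                          # tune M -> M'
    return model

model = self_train(NearestCentroid(), [(0.0, "a"), (10.0, "b")],
                   [1.0, 2.0, 8.0, 9.0])
print(model.predict(2.5))  # nearby clustered points share label "a"
```

Note how the drawback shows up here too: if the pseudo-labels are wrong, refitting pulls the centroids in the wrong direction.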
6 Constraint-driven learning Introduction. Motivation: keep the learned model simple, using constraints to counterbalance its over-simplicity. Benefit: simpler models (fewer features) are more computationally efficient. Intuition: fix a set of task-specific constraints so that a simple machine learning model can be used; the constraints make learning both easier and more accurate.
7 Constraint-driven learning Framework. Given an objective function $\operatorname{argmax}_y \lambda \cdot F(x, y)$, define a set of linear (or non-linear) constraints $\{C_i\}_{i \le K}$, where $C_i : X \times Y \to \{0, 1\}$, and solve the optimization problem subject to those constraints. Note: any Boolean rule can be encoded as a set of linear inequalities.
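As a minimal illustration of that last note, the Boolean rule $y_1 \Rightarrow y_2$ over 0/1 variables is equivalent to the linear inequality $y_2 - y_1 \ge 0$ (the rule is an illustrative choice, not one from the paper), which we can verify by enumeration:

```python
from itertools import product

def rule(y1, y2):
    """Boolean rule: y1 implies y2."""
    return bool((not y1) or y2)

def linear(y1, y2):
    """Linear encoding of the same rule: y2 - y1 >= 0."""
    return y2 - y1 >= 0

# The two agree on every 0/1 assignment.
assert all(rule(a, b) == linear(a, b) for a, b in product([0, 1], repeat=2))
print("y1 -> y2 is exactly the inequality y2 - y1 >= 0")
```

More complex rules compose the same way, e.g. conjunctions become sums of inequalities.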
8 Constraint-driven learning Example task: sequence tagging with an HMM/CRF plus global constraints.
9 Constraint-driven learning Sequence tagging constraints: each word receives a unique label; edges must form a path; a verb must exist.
10 Constraint-driven learning Example task: semantic role labelling with independent classifiers plus global constraints.
11 Constraint-driven learning Example task. Each verb predicate carries with it a new inference problem.
12 Constraint-driven learning Example task (figure).
13 Constraint-driven learning SRL constraints: no duplicate argument classes; unique labels; ordering constraints.
14 Constraint-driven learning w/ semi-supervision Introduction. Intuition: use constraints to obtain better training samples from the unlabelled set. Benefit: deviations from the simple model are guided towards more expressive answers, since the constraints steer the model.
15 Constraint-driven learning w/ semi-supervision Examples. Consider a constraint over $\{0, 1\}^n$: a sequence that violates it may admit several equally good corrections, so which one is best to learn from? Similarly, consider the constraint that state transitions must occur on punctuation marks: many output sequences can satisfy it.
16 Constraint-driven learning w/ semi-supervision Model. Consider data and output domains X, Y with a distribution D defined over $X \times Y$. Input/output pairs $(x, y) \in X \times Y$ are given as sequence pairs: $x = (x_1, \ldots, x_N)$, $y = (y_1, \ldots, y_M)$. We wish to find a structured output classifier $h : X \to Y$ that uses a structured scoring function $f : X \times Y \to \mathbb{R}$ to choose the most likely output sequence, where the correct output sequence receives the highest score. We are also given (or define) a set of constraints $\{C_i\}_{i \le K}$, $C_i : X \times Y \to \{0, 1\}$.
17 Constraint-driven learning w/ semi-supervision Scoring. Scoring rule: $f(x, y) = \sum_{i=1}^{M} \lambda_i f_i(x, y) = \lambda \cdot F(x, y)$ (equivalently $w^T \phi(x, y)$). This form is compatible with any linear discriminative or generative model, such as HMMs and CRFs. The local feature functions $\{f_i\}_{i \le M}$ allow tractable inference by capturing contextual structure: the space of structured pairs can be huge, but locally there are far fewer distinctions to account for.
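A minimal sketch of this linear scoring rule, with toy emission and transition feature functions (the features, words, tags, and weights are illustrative, not the paper's):

```python
def F(x, y):
    """Global feature vector F(x, y): sums of local features over the pair."""
    emission = sum(1.0 for xi, yi in zip(x, y)
                   if (xi, yi) in {("the", "DET"), ("dog", "NOUN")})
    transition = sum(1.0 for a, b in zip(y, y[1:]) if (a, b) == ("DET", "NOUN"))
    return [emission, transition]

def score(lam, x, y):
    """f(x, y) = lambda . F(x, y), a dot product of weights and features."""
    return sum(l * f for l, f in zip(lam, F(x, y)))

print(score([1.0, 0.5], ["the", "dog"], ["DET", "NOUN"]))  # 2 + 0.5*1 = 2.5
```

An HMM's log-probability decomposes into exactly such local emission and transition terms, which is why it fits this template.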
18 Constraint-driven learning w/ semi-supervision Sample constraints (figure).
19 Constraint-driven learning w/ semi-supervision Constraint pipeline (hard). Define $1_{C(x)} \subseteq Y$ as the set of output sequences for which the constraints evaluate to 1 on a given x. The objective function then becomes $\operatorname{argmax}_{y \in 1_{C(x)}} \lambda \cdot F(x, y)$. Intuition: find the best output sequence y that maximizes the score while satisfying the hard constraints.
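A brute-force sketch of this hard-constrained inference: enumerate label sequences, keep those satisfying every constraint, and take the argmax. The toy score and no-adjacent-B constraint are illustrative; a real system would use beam search or ILP rather than enumeration:

```python
from itertools import product

def argmax_constrained(labels, length, score, constraints):
    """Best label sequence among those satisfying all hard constraints."""
    feasible = (y for y in product(labels, repeat=length)
                if all(c(y) for c in constraints))
    return max(feasible, key=score)

score = lambda y: y.count("B")  # toy model: the more B's the better
no_double_B = lambda y: all((a, b) != ("B", "B") for a, b in zip(y, y[1:]))

print(argmax_constrained(["A", "B"], 3, score, [no_double_B]))  # ('B', 'A', 'B')
```

Without the constraint the model would pick BBB; restricting the argmax to the feasible set changes the answer without changing the model.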
20 Constraint-driven learning w/ semi-supervision Constraint pipeline (soft). Given a suitable distance metric between an output sequence and the space of outputs respecting a single constraint, and a set of soft constraints $\{C_i\}_{i \le K}$ with penalties $\{\rho_i\}_{i \le K}$, we get a new objective function: $\operatorname{argmax}_y \lambda \cdot F(x, y) - \sum_{i=1}^{K} \rho_i\, d(y, 1_{C_i(x)})$. Goal: find the best output sequence under our model that violates the fewest constraints. Intuition: given a learned model $\lambda \cdot F(x, y)$, we bias its decisions by the amount by which the output sequence violates each soft constraint. Option: use the minimal Hamming distance to a constraint-satisfying sequence, $d(y, 1_{C_i(x)}) = \min_{y' \in C_i(x)} H(y, y')$.
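A sketch of this soft objective with the minimal-Hamming-distance option, computed by brute force over a toy label set (the has-a-verb constraint and the trivial model score are illustrative):

```python
from itertools import product

def hamming(y, yp):
    return sum(a != b for a, b in zip(y, yp))

def d_min(y, constraint, labels):
    """Minimal Hamming distance from y to any sequence satisfying the constraint."""
    return min(hamming(y, yp) for yp in product(labels, repeat=len(y))
               if constraint(yp))

def soft_score(model_score, y, constraints, rhos, labels):
    """lambda.F(x, y) minus the weighted constraint-violation distances."""
    penalty = sum(r * d_min(y, c, labels) for c, r in zip(constraints, rhos))
    return model_score(y) - penalty

has_verb = lambda y: "V" in y   # toy constraint: a verb must exist
flat = lambda y: 0.0            # trivial model score, isolating the penalty

print(soft_score(flat, ("N", "N", "N"), [has_verb], [2.0], ["N", "V"]))  # -2.0
```

One flipped label would satisfy the constraint, so the distance is 1 and the penalty is the full weight 2.0.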
21 Constraint-driven learning w/ semi-supervision Distance example (taken from the CCM NAACL-12 tutorial): approximate the distance by counting violations, $d(y, 1_{C_i(x)}) = \sum_{j=1}^{M} \phi_{C_i}(y_j)$.
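The violation-counting approximation can be sketched as a per-position indicator $\phi_{C_i}$ summed over the sequence; the punctuation-transition constraint and citation-style data below are illustrative choices:

```python
def count_violations(words, tags):
    """Sum of phi_Ci over positions: a state transition at position j counts
    as one violation unless the previous token is a punctuation mark."""
    return sum(1 for j in range(1, len(tags))
               if tags[j] != tags[j - 1] and words[j - 1] not in {".", ","})

words = ["A", "Smith", ",", "Title", "here"]
tags  = ["AUTH", "AUTH", "AUTH", "TITLE", "TITLE"]
print(count_violations(words, tags))  # the AUTH->TITLE switch follows ",": 0

print(count_violations(["no", "comma"], ["AUTH", "TITLE"]))  # 1 violation
```

Unlike the minimal Hamming distance, this count needs no search over Y, which is what makes it practical inside inference.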
22 Constraint-driven learning Recall: sequence tagging (figures).
23 Constraint-driven learning Recall: sequence tagging. (a) correct citation parsing; (b) HMM-predicted citation parsing. Adding the punctuation state-transition constraint yields the correct parsing under the same HMM.
24 Constraint-driven learning Recall: SRL (figures).
25 Constraint-driven learning w/ semi-supervision CODL algorithm. Constraints are used in inference, not in learning. $\lambda$ is estimated by maximum likelihood (an EM-style procedure). Semi-supervision occurs on lines 7 and 9 of the algorithm: for each unlabelled sample, we generate K training examples. All unlabelled samples are re-labelled in every training cycle, unlike in the self-training perspective.
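One CODL cycle can be sketched as follows. This is a toy: brute-force constrained inference and a count-based estimator stand in for the paper's beam search and maximum-likelihood estimation, and all data and names are illustrative:

```python
from itertools import product

def model_from(pairs):
    """Toy count-based estimator for emission weights (stands in for MLE)."""
    m = {}
    for x, y in pairs:
        for xi, yi in zip(x, y):
            m[(xi, yi)] = m.get((xi, yi), 0.0) + 1.0
    return m

def top_k_outputs(x, model, labels, penalty, k=2):
    """Constrained inference: rank all label sequences by model score minus
    the soft-constraint penalty; enumeration stands in for beam search."""
    score = lambda y: sum(model.get((xi, yi), 0.0) for xi, yi in zip(x, y))
    cands = sorted(product(labels, repeat=len(x)),
                   key=lambda y: score(y) - penalty(x, y), reverse=True)
    return cands[:k]

def codl_round(labelled, unlabelled, labels, penalty, k=2):
    """One cycle: re-label every unlabelled x with its top-K constrained
    outputs, then re-estimate the model on gold + K pseudo-labelled copies."""
    model = model_from(labelled)
    pseudo = [(x, y) for x in unlabelled
                     for y in top_k_outputs(x, model, labels, penalty, k)]
    return model_from(labelled + pseudo)

m = codl_round([(("a", "b"), ("X", "Y"))], [("a", "b")], ["X", "Y"],
               penalty=lambda x, y: 0.0)
print(m[("a", "X")])  # gold pair + two pseudo-labels starting with X -> 3.0
```

Because every unlabelled x is re-labelled from scratch each round, earlier pseudo-labels never accumulate, matching the "unlike self-training" remark above.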
26 Constraint-driven learning w/ semi-supervision Expectation Maximization: top-K hard EM. Standard EM procedures assign a distribution over all outputs for a given unlabelled $x \in X$. Instead, we choose the top K outputs $y \in Y$ maximizing the soft objective function and assign uniform probability to each; the constraints reshape this distribution at every step.
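The top-K hard E-step can be sketched directly: rather than a full posterior over outputs, put uniform mass 1/K on the K best-scoring candidates (the candidates and score below are toy examples):

```python
def hard_e_step(candidates, score, k=2):
    """Top-K hard E-step: uniform probability 1/K on the K best outputs."""
    top = sorted(candidates, key=score, reverse=True)[:k]
    return {y: 1.0 / len(top) for y in top}

print(hard_e_step(["aa", "ab", "bb"], score=lambda y: y.count("a")))
# {'aa': 0.5, 'ab': 0.5}
```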
27 Constraint-driven learning w/ semi-supervision Motivation for K > 1: there may be multiple ways to correct constraint-violating samples, and we gain access to more training data.
28 Experiments: citations (figures).
29 Experiments: HMM (figures).
30 Pictures taken from the CCM NAACL-12 tutorial. Chang, M.-W., Ratinov, L., & Roth, D. (2007). Guiding semi-supervision with constraint-driven learning. In Proceedings of ACL.
An Axiomatic Approach to IR UIUC TREC 2005 Robust Track Experiments Hui Fang ChengXiang Zhai Department of Computer Science University of Illinois at Urbana-Champaign Abstract In this paper, we report
More informationAll lecture slides will be available at CSC2515_Winter15.html
CSC2515 Fall 2015 Introduc3on to Machine Learning Lecture 9: Support Vector Machines All lecture slides will be available at http://www.cs.toronto.edu/~urtasun/courses/csc2515/ CSC2515_Winter15.html Many
More informationStatistical parsing. Fei Xia Feb 27, 2009 CSE 590A
Statistical parsing Fei Xia Feb 27, 2009 CSE 590A Statistical parsing History-based models (1995-2000) Recent development (2000-present): Supervised learning: reranking and label splitting Semi-supervised
More informationFeature Extractors. CS 188: Artificial Intelligence Fall Some (Vague) Biology. The Binary Perceptron. Binary Decision Rule.
CS 188: Artificial Intelligence Fall 2008 Lecture 24: Perceptrons II 11/24/2008 Dan Klein UC Berkeley Feature Extractors A feature extractor maps inputs to feature vectors Dear Sir. First, I must solicit
More informationDiscriminative Training for Phrase-Based Machine Translation
Discriminative Training for Phrase-Based Machine Translation Abhishek Arun 19 April 2007 Overview 1 Evolution from generative to discriminative models Discriminative training Model Learning schemes Featured
More informationLecture 9: Support Vector Machines
Lecture 9: Support Vector Machines William Webber (william@williamwebber.com) COMP90042, 2014, Semester 1, Lecture 8 What we ll learn in this lecture Support Vector Machines (SVMs) a highly robust and
More informationKernels for Structured Data
T-122.102 Special Course in Information Science VI: Co-occurence methods in analysis of discrete data Kernels for Structured Data Based on article: A Survey of Kernels for Structured Data by Thomas Gärtner
More informationExpectation Maximization. Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University
Expectation Maximization Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University April 10 th, 2006 1 Announcements Reminder: Project milestone due Wednesday beginning of class 2 Coordinate
More informationCLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS
CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CHAPTER 4 CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS 4.1 Introduction Optical character recognition is one of
More informationConditional Random Fields : Theory and Application
Conditional Random Fields : Theory and Application Matt Seigel (mss46@cam.ac.uk) 3 June 2010 Cambridge University Engineering Department Outline The Sequence Classification Problem Linear Chain CRFs CRF
More informationProbabilistic Graphical Models
10-708 Probabilistic Graphical Models Homework 4 Due Apr 27, 12:00 noon Submission: Homework is due on the due date at 12:00 noon. Please see course website for policy on late submission. You must submit
More informationUsing Machine Learning to Optimize Storage Systems
Using Machine Learning to Optimize Storage Systems Dr. Kiran Gunnam 1 Outline 1. Overview 2. Building Flash Models using Logistic Regression. 3. Storage Object classification 4. Storage Allocation recommendation
More informationPredict the Likelihood of Responding to Direct Mail Campaign in Consumer Lending Industry
Predict the Likelihood of Responding to Direct Mail Campaign in Consumer Lending Industry Jincheng Cao, SCPD Jincheng@stanford.edu 1. INTRODUCTION When running a direct mail campaign, it s common practice
More informationAutomatic Summarization
Automatic Summarization CS 769 Guest Lecture Andrew B. Goldberg goldberg@cs.wisc.edu Department of Computer Sciences University of Wisconsin, Madison February 22, 2008 Andrew B. Goldberg (CS Dept) Summarization
More informationGradient of the lower bound
Weakly Supervised with Latent PhD advisor: Dr. Ambedkar Dukkipati Department of Computer Science and Automation gaurav.pandey@csa.iisc.ernet.in Objective Given a training set that comprises image and image-level
More informationAn Adaptive Framework for Multistream Classification
An Adaptive Framework for Multistream Classification Swarup Chandra, Ahsanul Haque, Latifur Khan and Charu Aggarwal* University of Texas at Dallas *IBM Research This material is based upon work supported
More informationECG782: Multidimensional Digital Signal Processing
ECG782: Multidimensional Digital Signal Processing Object Recognition http://www.ee.unlv.edu/~b1morris/ecg782/ 2 Outline Knowledge Representation Statistical Pattern Recognition Neural Networks Boosting
More informationLarge-scale visual recognition The bag-of-words representation
Large-scale visual recognition The bag-of-words representation Florent Perronnin, XRCE Hervé Jégou, INRIA CVPR tutorial June 16, 2012 Outline Bag-of-words Large or small vocabularies? Extensions for instance-level
More informationRobust Regression. Robust Data Mining Techniques By Boonyakorn Jantaranuson
Robust Regression Robust Data Mining Techniques By Boonyakorn Jantaranuson Outline Introduction OLS and important terminology Least Median of Squares (LMedS) M-estimator Penalized least squares What is
More informationAn Exact Algorithm for the Statistical Shortest Path Problem
An Exact Algorithm for the Statistical Shortest Path Problem Liang Deng and Martin D. F. Wong Dept. of Electrical and Computer Engineering University of Illinois at Urbana-Champaign Outline Motivation
More informationMachine Learning: Algorithms and Applications Mockup Examination
Machine Learning: Algorithms and Applications Mockup Examination 14 May 2012 FIRST NAME STUDENT NUMBER LAST NAME SIGNATURE Instructions for students Write First Name, Last Name, Student Number and Signature
More informationAT&T: The Tag&Parse Approach to Semantic Parsing of Robot Spatial Commands
AT&T: The Tag&Parse Approach to Semantic Parsing of Robot Spatial Commands Svetlana Stoyanchev, Hyuckchul Jung, John Chen, Srinivas Bangalore AT&T Labs Research 1 AT&T Way Bedminster NJ 07921 {sveta,hjung,jchen,srini}@research.att.com
More informationMachine Learning. Supervised Learning. Manfred Huber
Machine Learning Supervised Learning Manfred Huber 2015 1 Supervised Learning Supervised learning is learning where the training data contains the target output of the learning system. Training data D
More informationInformation Fusion Dr. B. K. Panigrahi
Information Fusion By Dr. B. K. Panigrahi Asst. Professor Department of Electrical Engineering IIT Delhi, New Delhi-110016 01/12/2007 1 Introduction Classification OUTLINE K-fold cross Validation Feature
More information12 Classification using Support Vector Machines
160 Bioinformatics I, WS 14/15, D. Huson, January 28, 2015 12 Classification using Support Vector Machines This lecture is based on the following sources, which are all recommended reading: F. Markowetz.
More informationMachine Learning (CSE 446): Concepts & the i.i.d. Supervised Learning Paradigm
Machine Learning (CSE 446): Concepts & the i.i.d. Supervised Learning Paradigm Sham M Kakade c 2018 University of Washington cse446-staff@cs.washington.edu 1 / 17 Review 1 / 17 Decision Tree: Making a
More informationExpectation Maximization: Inferring model parameters and class labels
Expectation Maximization: Inferring model parameters and class labels Emily Fox University of Washington February 27, 2017 Mixture of Gaussian recap 1 2/27/2017 Jumble of unlabeled images HISTOGRAM blue
More informationQuickView: NLP-based Tweet Search
QuickView: NLP-based Tweet Search Xiaohua Liu, Furu Wei, Ming Zhou, Microsoft QuickView Team School of Computer Science and Technology Harbin Institute of Technology, Harbin, 150001, China Microsoft Research
More informationA Taxonomy of Semi-Supervised Learning Algorithms
A Taxonomy of Semi-Supervised Learning Algorithms Olivier Chapelle Max Planck Institute for Biological Cybernetics December 2005 Outline 1 Introduction 2 Generative models 3 Low density separation 4 Graph
More information