LSA-like models

Bill Freeman

June 21, 2004


1 Introduction

LSA is simple and stupid, but works pretty well for analyzing the meanings of words and text, given a large, unlabelled training set. Why? LSA factorizes a histogram observation matrix of frequency-of-occurrence counts, with words along columns and documents in rows, then reduces the dimensionality of the coordinates describing each word. The coordinates of words with similar meanings tend to be similar. Let's build up a principled model, inspired by the factorization approach of LSA, but better motivated. The result will be an iterative scheme to learn object parts and objects by observing feature responses over a training set of many images.

2 Histogram Matrix Factorization

We examine a simple model, then embellish it. For this section, we'll call a document the contents of a subsection of an image (say a circle of diameter D). We assume that we have first created a finite vocabulary of image feature indices. These could be, for example, a set of 1000 or so vector-quantized SIFT feature responses such as Bryan has used in his work. An observation consists of counting how many of each feature occur in a given document. (Again, we use a bag-of-words representation, although later we'll work our way away from that.) Thus, the measurement from a document is a column vector giving the number of times each feature was found in the document (inside that circle of the image). We create an observation matrix from a corpus by stacking side by side the column vectors of observations from all the documents. The resulting observation matrix has number-of-features rows by number-of-documents columns.

We want to use that training corpus to infer meaning from observed variables. We will rewrite the corpus matrix in terms of hidden variables we'll call objects, which we will learn. The nice thing about the bag-of-words model for documents is that things combine additively. We can then write the observations, a histogram matrix Y, as a product of two other histogram matrices: a matrix, F, whose columns tell how many of each feature are present in each object, times a matrix, G, whose columns tell what objects are contained in each training image (or document, or circular region within an image). Figure 1 shows the dimensions of Y = F G.

This factorization is reminiscent of SVD, except our product matrices are not the real, orthonormal matrices of an SVD. Nor is it the non-negative matrix factorization of Seung et al., where the multiplicand matrices are constrained to be positive. Here we have a related, but different, set of constraints on the multiplicand matrices: they must be histograms, that is, matrices composed of non-negative integers. We can call this factorization process histogram matrix factorization, or HMF. The hope is that this rather severe set of constraints provides enough structure to solve the equation Y = F G in a way that reveals meaningful objects in the real world, given a large corpus of training images.
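To make the bookkeeping above concrete, here is a minimal Matlab sketch (not from the original note) of forming the observation matrix Y from per-document lists of quantized feature indices. The vocabulary size and the variable names (nfeat, docFeatures) are illustrative assumptions, not anything prescribed by the model.

% Sketch: build Y (features along rows, documents along columns), assuming
% each document has already been reduced to a list of vector-quantized
% feature indices in 1..nfeat. Names and sizes are illustrative.
nfeat = 1000;                          % size of the quantized feature vocabulary
docFeatures = {[3 17 17 240], ...      % feature indices observed in document 1
               [17 512 999]};          % feature indices observed in document 2
ndoc = numel(docFeatures);
Y = zeros(nfeat, ndoc);
for d = 1:ndoc
    % histogram the feature indices of document d into column d of Y
    Y(:, d) = accumarray(docFeatures{d}(:), 1, [nfeat 1]);
end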

Figure 1: Histogram matrix factorization. A histogram matrix, composed of non-negative integers, is decomposed into the product of two other histogram matrices.

3 Learning and inference with histogram matrices

We want to look at the vector-quantized SIFT features of many, many images, and from the feature counts per image (or counts per image circular region), infer useful things about the objects comprising the images. In particular, given the observation of a new image, by examining the features present, we want to group the features into sets of previously learned objects.

In practice, because of occlusions and noise, the product relationship Y = F G won't hold exactly, so we want to take a probabilistic, rather than numerical, approach. There is a penalty for each undetected feature, and a second penalty incurred for each additional feature not predicted by F G. A natural way to impose these penalties is with a factor graph, allowing histogram matrix factorization by running loopy belief propagation, reminiscent of decoding turbo codes or low-density parity check codes because it's a network with large loops and low state dimension at each node. Figure 2 shows the factor graph relating an object to the observed features.

Figure 2: Bayesian graphical model relating an object variable to observed feature values.

Mathematically, here is a plausible probability that an observed set of feature vector counts, y, was created by a given set of object counts, the column vector g:

P(g \mid y) = \prod_{\text{features } j} \psi\Big( y_j - \sum_{\text{objects } l} F_{jl}\, g_l\, z_{jl} \Big) \prod_{j,l} P_o(z_{jl}),   (1)

where z_{jl} is zero if feature j from object l is occluded, and 1 otherwise. P_o(0) is the prior probability that any given feature will be occluded. (Later, when we introduce a hierarchy of objects, by finding objects within smaller-sized circles, we can incur an occlusion cost for a single small object or set of features, which would save over incurring the occlusion cost for each one of the features.) g_l = 1 if object l is present in the image. F_{jl} = 1 if object l has feature j. \psi(\cdot) is a function that tells how probable a given deviation from the observed histogram count is.

3.1 Performing the histogram matrix factorization

The learning problem: we have a histogram matrix Y and we want to break it up into the product of two other histogram matrices. Note that the product of the two found histogram matrices need not give the original matrix exactly, because we allow for occlusion and other forms of observation noise.

The inference problem: given one of the multiplicand matrices (the features for each object) and a single column of the observation matrix (the features observed for a given image), find the vector of histogram counts for objects which best explains the observed features (taking occlusion and observation noise into account). This problem seems very well matched to Bayesian belief propagation.
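Since the note leaves \psi unspecified, here is a minimal sketch of the inference problem under illustrative assumptions: \psi is taken to be a Gaussian penalty on the count deviation, occlusion is ignored (all z_{jl} = 1), and g is binary with few enough objects to enumerate exhaustively instead of running belief propagation. Every name here is hypothetical.

% Sketch of the inference problem, not the note's BP algorithm. Scores
% candidate object vectors g by a log version of Eq. (1), assuming an
% illustrative Gaussian psi and no occlusion. Save as inferObjects.m.
function gBest = inferObjects(F, y, sigma)
    dim = size(F, 2);                         % number of possible objects
    gBest = zeros(dim, 1);
    bestScore = -inf;
    for code = 0:2^dim - 1                    % enumerate all binary g
        g = double(bitget(code, 1:dim))';     % candidate object vector
        r = y - F * g;                        % deviation from predicted counts
        score = -sum(r.^2) / (2 * sigma^2);   % log psi, summed over features
        if score > bestScore
            bestScore = score;
            gBest = g;
        end
    end
end

With F known, calling inferObjects(F, Y(:,d), sigma) on each document column d solves the inference problem by brute force; the note's proposal is to replace this exhaustive search with loopy belief propagation on the factor graph of Figure 2.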

A number of possible approaches come to mind for the learning problem (histogram matrix factorization):

1. Perform an initial factorization using SVD. Enforce the constraints (positive, integer values) on one of the two matrices, and solve for the other matrix. Update that matrix (average between new and old values), then switch the roles of the two matrices, and repeat. This algorithm works at least on toy examples; see Section 6.

2. Put it all in a big Bayesian network, and run loopy belief propagation to solve for the two matrices, F and G. That's probably expecting too much from BP.

3. Do some on-line algorithm, starting from a small number of possible objects and modifying the objects-vs-features matrix, building it up as you see more and more columns of the observation matrix.

4 Reducing the number of features or feature-groups

Clearly, similar real-world objects or parts of objects will not create exactly the same features in the image. So we need some way of learning that certain features (or, in the hierarchical version, groups of features) are synonyms of each other. Following the perhaps questionable logic of LSA, this can be done by taking the SVD [U, S, V] = svd(Y^T), and regularizing by reducing the dimensionality of the matrix V^T. The columns of V^T provide vector coordinates for each feature. Features with similar coordinates could be grouped together.

There should also be other, perhaps more principled, ways of examining the matrix Y and learning which features should be merged into the same feature. This might include looking at the joint co-occurrence matrix of all the features. You could look for the pairs of features for which combining them would cause the least loss of information in the probability distribution described by that histogram.

5 A hierarchical framework for unsupervised object learning

It might make sense to gradually build up objects from features. First, SIFT features playing similar roles in images could be grouped. Then we could consider documents consisting of small regions of images (see Fig. 3). Objects (recurring patterns of features) could be found within those documents and added to the feature set. Then we'd examine documents consisting of larger regions of images, and work our way up.

The hierarchical approach has several benefits:
- It introduces some spatial localization and structure into the otherwise flat, spaceless bag-of-words model.
- It improves the treatment of occlusion: an entire object part can be occluded for a penalty less than would be incurred by occluding each feature independently. We pay the penalty once for occluding a meta-feature, and then get the occlusions of all its component features for free.

Unsupervised Object Learning

1. Form the observation matrix, Y, a histogram of the number of occurrences of each feature (rows) in each document (columns). A document can be an image, or a smaller region of an image.

2. Group similar features together, based on their co-occurrences in Y, to form a new observation matrix, Y', which histograms the new features. (Using LSA, as above, or else using something better; a sketch of one way to do this follows the list.)

3. Perform histogram matrix factorization on Y' to identify repeating objects or object-components.

4. Treat the identified object-components as features, and add them to the feature set. Form a new observation matrix histogram, Y''.

5. Group the related object-components together using LSA (or something better).

6. Perform histogram matrix factorization on Y'' to identify repeating objects or object-components.

7. Etc.
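Here is a minimal sketch of the LSA-style feature grouping of Section 4 (step 2 above). The reduced dimensionality k, the number of merged features nGroups, and the use of kmeans (from the Statistics Toolbox) to group features with similar coordinates are illustrative assumptions of this sketch, not prescriptions from the note.

% Sketch: SVD-based feature grouping. Y is nfeat x ndoc, so the rows of V
% (columns of V^T) give coordinates per feature; keep k of them and merge
% features whose coordinates fall in the same cluster.
k = 10;                                 % reduced dimensionality (assumed)
nGroups = 200;                          % number of merged features (assumed)
[U, S, V] = svd(Y', 'econ');            % Y' is ndoc x nfeat
coords = V(:, 1:k) * S(1:k, 1:k);       % one k-dim coordinate per feature
groups = kmeans(coords, nGroups);       % features in the same group are merged
% Rebuild the observation matrix over the merged features.
Ynew = zeros(nGroups, size(Y, 2));
for gidx = 1:nGroups
    Ynew(gidx, :) = sum(Y(groups == gidx, :), 1);
end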

Figure 3: Showing 3 different areas of circular regions over which to compute feature histograms, leading to a hierarchical representation of objects.

6 Some numerical experiments with histogram matrix factorization

Figure 4 shows the result of a histogram matrix factorization method applied to a small example observation matrix. For this example, the factorized matrices, ff and gg, give a product which is actually closer to the true matrix than to the observation matrix they were factored from. The sum of absolute values of errors was 26 for observations minus true, but 23 for fitted model minus true:

>> sum(sum(abs(y-ytrue)))
ans = 26
>> sum(sum(abs(y-ff*gg)))
ans = 45
>> sum(sum(abs(ytrue-ff*gg)))
ans = 23

The Matlab code for this toy example is given below. It is very doubtful that this algorithm will scale up by a factor of 100 or 1000.

The algorithm: perform an initial factorization using SVD. Enforce the constraints (positive, integer values) on one of the two matrices, and solve for the other matrix. Update that matrix (average between new and old values), then switch the roles of the two matrices, and repeat.

Figure 4: Top left: maximum absolute error for any observation-matrix histogram cell in the iterative factorization of the observation matrix. Top right: image showing histogram counts for the observation matrix before adding one-count errors to one percent of the counts (ytrue). Bottom left: the resulting observation matrix (y). Bottom right: the product of the factorized histogram matrices (ff*gg), which should explain the observation matrix.

% play with histogram matrix factorization.
% June 21, 2004 Billf.

% The number of possible objects in the toy world.
dim = 10;

% make synthetic f and g matrices.
% f tells the features (rows) for each object (column)
% g tells the objects (rows) for each document (column)
ndoc = 50;
nfeat = 80;
% Typically, these matrices will be mostly zeros with some ones.
g = rand(dim, ndoc);  g = real(g > 0.9);
f = rand(nfeat, dim); f = real(f > 0.9);
ytrue = f * g;

% optionally add some noise to the observation counts: perturb about one
% percent of the counts by one, with a random sign.
y = abs((-1)^(rand(1) > 0.5) * real(rand(size(ytrue)) > 0.99) + ytrue);

[u, s, v] = svd(y);
alpha = 0.5;            % damping: average between new and old estimates

% initialize estimates for f and g (called ff and gg).
gg = s * v';
ff = u(:, 1:dim);

niter = 300;
err = [];
for i = 1:niter
    % hold ff fixed; solve for gg by least squares, then enforce the
    % non-negative integer constraint; damp with the old estimate.
    gg = alpha * round(abs(ff \ y)) + (1 - alpha) * gg(1:dim, :);
    % switch roles: hold gg fixed and solve for ff the same way.
    ff = alpha * round(abs(y * pinv(gg))) + (1 - alpha) * ff;
    err = [err, max(max(abs(y - ff*gg)))];
end

figure;
subplot(2,2,1); plot(err); title(['max err for alpha = ' num2str(alpha)]);
subplot(2,2,2); showim(ytrue); title('ytrue');  % showim: local image-display helper
subplot(2,2,3); showim(y); title('y');
subplot(2,2,4); showim(ff*gg); title('ff*gg');
