How to Price a House
Transcription
1 How to Price a House: An Interpretable Bayesian Approach
Dustin Lennon (dustin@inferentialist.com), Inferentialist Consulting, Seattle, WA. April 9, 2014
2 Introduction
- Project to tie up loose ends; it came out of interview prep for Climate Corp.
- Disclaimer: two-week sprint, not a dissertation.
- An easier version of a more involved spatio-temporal model for zipcode aggregation.
3 Outline
1. Size of Housing Market; Modeling/Technology Gap
2. Model Specification; General Model Formulation; Model Fitting
3. Data; Scalability and Sampling; Model Output; Model Validation
4. Scalability & Sparsity; Optimization
5.
4 Housing Market
A few Wikipedia facts:
- Outstanding U.S. residential mortgages: $10.6 trillion as of midyear 2008.
- By August 2008, 9.2% of all U.S. mortgages outstanding were either delinquent or in foreclosure.
5 Housing Market
6 A Valuation Problem?
Subprime loans, yes, but was there also a systemic failure in estimating home values?
7 Temporal Instability: Trulia
Seasonality, perhaps. But a sliding median approach breaks down as the window size goes to zero.
(Page accessed on 6/4/2014.)
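The breakdown of a sliding median can be illustrated with a minimal sketch on synthetic data. Everything here is an assumption for illustration (the sale dates, the lognormal prices, the window widths); it is not Trulia's data or method. With a flat underlying market, a wide window gives a stable estimate, while a very narrow window leaves only a handful of sales per window, or none at all.

```python
import numpy as np

def sliding_median(times, prices, center, half_width):
    """Median of prices whose sale time falls within center +/- half_width."""
    mask = np.abs(times - center) <= half_width
    if not mask.any():
        return np.nan  # no sales in the window: the estimate is undefined
    return float(np.median(prices[mask]))

rng = np.random.default_rng(0)
times = rng.uniform(0.0, 365.0, size=400)                 # hypothetical sale dates (day of year)
prices = 400_000 * np.exp(rng.normal(0.0, 0.4, size=400))  # hypothetical prices, flat market

centers = np.arange(15.0, 351.0, 7.0)

def roughness(half_width):
    """Spread of the sliding-median series and the number of empty windows."""
    est = np.array([sliding_median(times, prices, c, half_width) for c in centers])
    return float(np.nanstd(est)), int(np.isnan(est).sum())

wide = roughness(45.0)    # roughly a quarter of the year per window
narrow = roughness(1.0)   # a two-day window
print("wide window   (std, empty windows):", wide)
print("narrow window (std, empty windows):", narrow)
```

The narrow-window series is far noisier than the wide-window one even though the simulated market never moves, which is the limiting-case failure the slide points to.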
8 Overfitting: Zestimates
The time series appears to chase the listing data, stays elevated for a time, then abruptly returns to baseline.
(Page accessed on 6/4/2014.)
9 Spatial Instability: Zestimates
The time series appears to adjust to the recently added zipcode-level information, perhaps indicating some spatial instability when adjusting to new data.
(Page accessed on 6/4/2014.)
10 Ad-hoc Analysis
- Limiting case failures
- Lack of regularization / prior information
- Uninterpretable models
12 Model Specification I
Decompose home value into constituent parts:
$$Z_i = x_i^t \beta + a_i Y(s_i) + \delta_i$$
- $Z_i$: price paid for the $i$th home
- $x_i$: covariates associated with $\beta$ (e.g., square footage)
- $\beta$: coefficients fixed across space (e.g., build cost per square foot)
- $a_i$: lot size
- $Y(s)$: unit cost of land
- $s_i$: location
- $\delta_i$: difference between the true value and the price paid
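13 Model Specification II
Data model:
$$[Z \mid \beta, Y] \sim N\left( \begin{bmatrix} X & A \end{bmatrix} \begin{bmatrix} \beta \\ Y \end{bmatrix},\ \Delta \right), \qquad \Delta = \operatorname{diag}\left(\sigma^2 z_1^2, \ldots, \sigma^2 z_n^2\right)$$
Here $\Delta$ stands for the data covariance, whose symbol is missing from the transcription.
Process model:
$$[\beta, Y] = [\beta][Y]$$
$$[\beta] \sim N(\nu, \Phi), \qquad \Phi = \operatorname{diag}(\phi_1, \ldots, \phi_k)$$
$$[Y] \sim N(\tau \mathbf{1}, \Sigma), \qquad \Sigma = \Sigma(\theta)$$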
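14 Model Specification III
- $\sigma^2$: interpretable as coefficient of variation
- $\Sigma(\theta)$: defines the covariance structure of the land value term
In particular, $\Sigma(\theta)$ is specified through an isotropic Matern covariance function:
$$\Sigma_{ij}(\theta) \equiv C\left(d_{ij};\ \theta_1, \theta_2, \sigma_0^2, \sigma_1^2\right) = \sigma_0^2\, I_0(d_{ij}) + \sigma_1^2\, \frac{1}{2^{\theta_2 - 1}\, \Gamma(\theta_2)} \left(\frac{d_{ij}}{\theta_1}\right)^{\theta_2} K_{\theta_2}\!\left(\frac{d_{ij}}{\theta_1}\right)$$
where $I_0(d)$ is the indicator of $d = 0$, $K_{\theta_2}$ is the modified Bessel function of the second kind, and $d_{ij}$ is the Euclidean distance between $s_i$ and $s_j$.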
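A minimal sketch of the Matern covariance on this slide, assuming NumPy and SciPy are available. The function name `matern_cov` and its argument names are illustrative, not the author's code. The nugget term is applied wherever the distance is exactly zero, and the Matern correlation is set to its limiting value of 1 there.

```python
import numpy as np
from scipy.special import gamma, kv

def matern_cov(coords, theta1, theta2, sigma0_sq, sigma1_sq):
    """Isotropic Matern covariance matrix with a nugget.

    theta1: spread (range) parameter, theta2: shape parameter,
    sigma0_sq: nugget, sigma1_sq: Matern variance.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))      # pairwise Euclidean distances
    u = d / theta1
    corr = np.ones_like(u)                      # Matern correlation tends to 1 as d -> 0
    pos = u > 0
    corr[pos] = (u[pos] ** theta2) * kv(theta2, u[pos]) / (
        2.0 ** (theta2 - 1.0) * gamma(theta2)
    )
    return sigma1_sq * corr + sigma0_sq * (d == 0)

# With shape 1/2 the Matern correlation reduces to exp(-d / theta1).
C = matern_cov([[0.0, 0.0], [3.0, 4.0]], theta1=5.0, theta2=0.5,
               sigma0_sq=0.1, sigma1_sq=2.0)
print(C)
```

The shape value 1/2 is chosen for the usage example only because the result can be checked by hand against the exponential covariance.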
15 Model Specification III
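16 General Model Formulation
Hierarchical formulation:
$$[Z \mid G] \sim N(MG, \Delta), \qquad [G] \sim N(\mu, \Omega)$$
Joint distribution:
$$[Z, G] \sim N\left( \begin{pmatrix} M\mu \\ \mu \end{pmatrix},\ \begin{bmatrix} \Delta + M \Omega M^t & M \Omega \\ \Omega M^t & \Omega \end{bmatrix} \right)$$
Posterior distribution:
$$[G \mid Z] \sim N(\tilde\mu, \tilde\Omega)$$
$$\tilde\mu = \mu + \Omega M^t \left(\Delta + M \Omega M^t\right)^{-1} (Z - M\mu)$$
$$\tilde\Omega = \Omega - \Omega M^t \left(\Delta + M \Omega M^t\right)^{-1} M \Omega$$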
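The posterior update on this slide is a standard Gaussian conditioning step. Below is a minimal dense-matrix sketch in NumPy; the function name `gaussian_posterior` and the name `Delta` for the covariance of [Z | G] (whose symbol is missing from the transcription) are assumptions for illustration, and no attempt is made at the sparse implementation discussed later in the talk.

```python
import numpy as np

def gaussian_posterior(Z, M, Delta, mu, Omega):
    """Posterior mean and covariance of G given Z for
    [Z | G] ~ N(M G, Delta) and [G] ~ N(mu, Omega)."""
    S = Delta + M @ Omega @ M.T              # marginal covariance of Z
    K = np.linalg.solve(S, M @ Omega).T      # Omega M^t S^{-1}, using symmetry of S and Omega
    mu_post = mu + K @ (Z - M @ mu)
    Omega_post = Omega - K @ M @ Omega
    return mu_post, Omega_post

# Scalar check: prior N(0, 1), unit noise, observation 2.
mu_post, Omega_post = gaussian_posterior(
    np.array([2.0]), np.array([[1.0]]), np.array([[1.0]]),
    np.array([0.0]), np.array([[1.0]]),
)
print(mu_post, Omega_post)
```

In the scalar example the posterior mean lands halfway between the prior mean and the observation, and the variance is halved, as expected when prior and noise variances are equal.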
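17 Fitting the Model
- Inference is on the posterior distribution $[G \mid Z; \Theta]$
- Specialize the general case to the hedonic model
- EM algorithm to obtain $\hat\Theta$. Iterate until convergence:
  - update $\tilde\mu$, $\tilde\Omega$
  - minimize $-2E[\log [Z, G] \mid Z; \Theta]$
$$\begin{aligned} -2E[\log [Z, G] \mid Z; \Theta] ={} & \log\det \Delta + \log\det \Omega + Z^t \Delta^{-1} Z + \mu^t \Omega^{-1} \mu \\ & - 2\left[Z^t \Delta^{-1} M + \mu^t \Omega^{-1}\right] \tilde\mu \\ & + \tilde\mu^t \left[M^t \Delta^{-1} M + \Omega^{-1}\right] \tilde\mu \\ & + \operatorname{tr}\left[\left(M^t \Delta^{-1} M + \Omega^{-1}\right) \tilde\Omega\right] \end{aligned}$$
Here $\Delta$ is the covariance of $[Z \mid G]$, and $\tilde\mu$, $\tilde\Omega$ are the posterior mean and covariance of $[G \mid Z]$.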
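As a sketch of how the EM objective on this slide could be evaluated with dense matrices, assuming NumPy: `Delta` names the covariance of [Z | G], and `mu_post`, `Omega_post` name the posterior mean and covariance of [G | Z]. These names, and the function `em_objective`, are illustrative choices and not the author's implementation. Additive constants are dropped, as on the slide.

```python
import numpy as np

def em_objective(Z, M, Delta, mu, Omega, mu_post, Omega_post):
    """-2 E[log [Z, G] | Z] up to an additive constant."""
    Delta_inv_Z = np.linalg.solve(Delta, Z)
    Delta_inv_M = np.linalg.solve(Delta, M)
    Omega_inv = np.linalg.inv(Omega)
    P = M.T @ Delta_inv_M + Omega_inv             # M^t Delta^{-1} M + Omega^{-1}
    linear = M.T @ Delta_inv_Z + Omega_inv @ mu   # transpose of Z^t Delta^{-1} M + mu^t Omega^{-1}
    return (
        np.linalg.slogdet(Delta)[1]
        + np.linalg.slogdet(Omega)[1]
        + Z @ Delta_inv_Z
        + mu @ Omega_inv @ mu
        - 2.0 * linear @ mu_post
        + mu_post @ P @ mu_post
        + np.trace(P @ Omega_post)
    )

# Small hypothetical example.
Z = np.array([1.0, 2.0])
M = np.array([[1.0, 0.0], [1.0, 1.0]])
Delta = np.diag([0.5, 2.0])
mu = np.array([0.5, -1.0])
Omega = np.array([[2.0, 0.3], [0.3, 1.0]])
g = np.array([0.2, 0.7])
value = em_objective(Z, M, Delta, mu, Omega, g, np.zeros((2, 2)))
print(value)
```

A useful sanity check: when the posterior covariance is zero and the posterior mean is a fixed vector g, the expression must collapse to the two log-determinants plus the two quadratic forms in Z - Mg and g - mu.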
19 Data: Maps (TIGER/Line Shapefile)
20 Data: Home Sales (King County Department of Assessments)
Table joins:
- Real Property Sales (non-flagged 2012 records). Flags:
  - Exempt From Excise Tax
  - Related Party, Friend, or Neighbor
  - Quit Claim Deed
  - Multi-Parcel Sale
- Residential Buildings
- Parcel Information
Outlier filtering:
- Sale price: $100k to $5m
- Lot size: up to 1.03 acres
- No properties with multiple sale records in ,812 homes
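The outlier filter above can be sketched as a simple predicate. This is an illustration only: the function and argument names are hypothetical, the lot-size cutoff is read as an upper bound (the comparison symbol is missing from the transcription), and the multiple-sales rule is expressed as exactly one sale record in the period.

```python
ACRE_SQFT = 43_560  # square feet per acre

def keep_sale(price, lot_sqft, n_sales_in_period, max_lot_acres=1.03):
    """Return True if a sale passes the outlier filters described on the slide."""
    return (
        100_000 <= price <= 5_000_000                 # sale price: $100k to $5m
        and lot_sqft <= max_lot_acres * ACRE_SQFT     # assumed upper bound on lot size
        and n_sales_in_period == 1                    # no multiple sale records
    )

print(keep_sale(450_000, 6_000, 1))    # typical city lot
print(keep_sale(450_000, 60_000, 1))   # lot larger than 1.03 acres
```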
21 Data: Geocoding (Yahoo)
- 2012: KC records have UID, street address, no lat/long
- 2014: Sporadic lat/long (Seattle, not Tacoma)
- Yahoo geocoder: bash script, 500k lookups over two weeks
curl -s "
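22 Scalability and Sampling I
Recall the objective function to be optimized on each iteration of the EM algorithm:
$$\begin{aligned} -2E[\log [Z, G] \mid Z; \Theta] ={} & \log\det \Delta + \log\det \Omega + Z^t \Delta^{-1} Z + \mu^t \Omega^{-1} \mu \\ & - 2\left[Z^t \Delta^{-1} M + \mu^t \Omega^{-1}\right] \tilde\mu \\ & + \tilde\mu^t \left[M^t \Delta^{-1} M + \Omega^{-1}\right] \tilde\mu \\ & + \operatorname{tr}\left[\left(M^t \Delta^{-1} M + \Omega^{-1}\right) \tilde\Omega\right] \end{aligned}$$
where $\Delta$ is the covariance of $[Z \mid G]$ and $\tilde\mu$, $\tilde\Omega$ are the posterior mean and covariance of $[G \mid Z]$.
Naive approach with dense matrices:
- extremely memory intensive
- $O(n^3)$ cost to compute the inverse
Solution: sample, weighted by inverse local density.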
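The slide does not say how local density was measured. As one possible reading, the sketch below uses the number of homes within a fixed radius as a density proxy and samples without replacement with probability proportional to its inverse. The function name, the `radius` parameter, and the density proxy are all assumptions; the dense pairwise-distance matrix is only suitable for small illustrations.

```python
import numpy as np

def inverse_density_sample(coords, n_sample, radius, rng):
    """Sample row indices with probability proportional to 1 / (local count)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    counts = (d <= radius).sum(axis=1)      # includes the home itself, so counts >= 1
    weights = 1.0 / counts
    p = weights / weights.sum()
    return rng.choice(len(coords), size=n_sample, replace=False, p=p)

# Hypothetical layout: 90 homes in one dense cluster, 10 isolated homes.
rng = np.random.default_rng(1)
cluster = rng.normal(0.0, 0.01, size=(90, 2))
isolated = np.array([[100.0 * i, 100.0 * i] for i in range(1, 11)])
coords = np.vstack([cluster, isolated])
idx = inverse_density_sample(coords, n_sample=10, radius=1.0, rng=rng)
print(sorted(idx.tolist()))
```

Indices 90 and above are the isolated homes; they dominate the sample even though they are only a tenth of the data, which evens out spatial coverage.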
23 Scalability and Sampling II
24 Model Output: Coefficients
- $\sigma^2$: coefficient of variation [active constraint]
- $(\nu_1, \phi_1)$ = (139.51, ): build cost per square foot (living)
- $(\nu_2, \phi_2)$ = (0.00, ): build cost per square foot (basement)
- $(\nu_3, \phi_3)$ = (0.00, ): build cost per square foot (garage)
- $\tau$ = 7.19: lot size cost per square foot
- $\theta_1$: Matern spread parameter [active constraint]
- $\theta_2$: Matern shape parameter [active constraint]
- $\sigma_0^2$: Matern nugget effect [active constraint]
- $\sigma_1^2$: Matern variance
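To make the reported coefficients concrete, here is a small worked example. It plugs the two surviving values (139.51 per square foot of living space and 7.19 per square foot of lot) into the decomposition of home value, ignoring the spatial variation in land cost and the basement and garage terms. The home dimensions and the function name are hypothetical.

```python
def baseline_price(living_sqft, lot_sqft, build_cost=139.51, land_cost=7.19):
    """Structure value plus land value at the average unit land cost."""
    return living_sqft * build_cost + lot_sqft * land_cost

# Hypothetical home: 2,000 sq ft of living space on a 5,000 sq ft lot.
price = baseline_price(2_000, 5_000)
print(round(price, 2))  # 279,020 for the structure plus 35,950 for the land
```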
25 Model Output: Heatmaps
Need the predictive distribution $[y_0 \mid Z]$:
$$E[y_0 \mid Z] = E[E(y_0 \mid Y, Z) \mid Z] = E[E(y_0 \mid Y) \mid Z]$$
$$\operatorname{Var}[y_0 \mid Z] = \operatorname{Var}[E(y_0 \mid Y, Z) \mid Z] + E[\operatorname{Var}(y_0 \mid Y, Z) \mid Z] = \operatorname{Var}[E(y_0 \mid Y) \mid Z] + E[\operatorname{Var}(y_0 \mid Y) \mid Z]$$
$[y_0 \mid Y]$ is immediate: extend $\Sigma(\theta)$.
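One way to carry out this calculation, sketched under the assumption that the land value at a new location and the land values at the observed locations are jointly Gaussian with common mean tau and the extended Matern covariance. The conditional [y_0 | Y] is then linear in Y, and the two laws above give a closed form. The names `c` (covariance between the new location and the observed ones), `s00` (variance at the new location), `Y_mean` and `Y_cov` (posterior moments of Y given Z) are illustrative.

```python
import numpy as np

def predict_land_value(c, s00, Sigma, tau, Y_mean, Y_cov):
    """Mean and variance of [y_0 | Z] from the posterior moments of [Y | Z]."""
    w = np.linalg.solve(Sigma, c)        # Sigma^{-1} c, the conditioning weights
    mean = tau + w @ (Y_mean - tau)      # E[ E(y_0 | Y) | Z ]
    var = w @ Y_cov @ w + (s00 - c @ w)  # Var[E(y_0 | Y) | Z] + E[Var(y_0 | Y) | Z]
    return mean, var

# Check: predicting at an observed location with Y known exactly returns that value.
Sigma = np.array([[2.0, 0.5], [0.5, 1.5]])
mean, var = predict_land_value(
    c=Sigma[:, 0], s00=Sigma[0, 0], Sigma=Sigma, tau=7.0,
    Y_mean=np.array([9.0, 6.0]), Y_cov=np.zeros((2, 2)),
)
print(mean, var)
```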
27 Model Comparison
28 Model Validation
- Not a predictive model; it attempts to characterize variation.
- Out-of-sample coverage of 95% confidence intervals:
  - Process: 86.7%
  - Process + Proxy: 92.0%
  - Process + Data: 97.2%
- Conclusion: the typical variability in a home's sale price is inherently large.
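Out-of-sample coverage can be computed as the fraction of held-out observations that fall inside their predicted intervals. The sketch below assumes Gaussian intervals of the form mean plus or minus 1.96 standard deviations; the function name and the toy numbers are hypothetical and are not the talk's data.

```python
import numpy as np

def interval_coverage(observed, mean, sd, z=1.96):
    """Fraction of observations inside mean +/- z * sd."""
    observed, mean, sd = (np.asarray(a, dtype=float) for a in (observed, mean, sd))
    lower, upper = mean - z * sd, mean + z * sd
    return float(np.mean((observed >= lower) & (observed <= upper)))

coverage = interval_coverage(observed=[0.0, 1.0, 5.0], mean=[0.0, 0.0, 0.0], sd=[1.0, 1.0, 1.0])
print(coverage)  # two of the three toy observations are covered
```

A coverage well below 95% suggests the variance component being used is too small; a value above 95% suggests it is conservative.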
30 Scalability I
Goal: the linear algebra operations needed to evaluate the objective function and its gradient should involve only:
- sparse matrices
- low-rank perturbations to sparse matrices
- matrices arbitrarily close to sparse matrices under reasonable parameter choices
Larger sample sizes require a sparse representation.
Specializing the general model:
- $M$ is sparse; $\Omega$ decomposes into a diagonal and the Matern matrix, $\Sigma(\theta)$.
- For $\theta_1$ small and $\theta_2$ bounded, $\Sigma(\theta)$ is arbitrarily close to a sparse matrix.
- For $\theta_1$ and $\theta_2$ bounded, $\Sigma(\theta)$ is well conditioned, relative to the underlying Euclidean distances.
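The near-sparsity claim can be checked numerically. The sketch below fixes the Matern shape at 1/2, where the correlation is exp(-d / theta1), and measures the fraction of matrix entries that exceed a small tolerance. The point layout, the tolerance, and the function name are assumptions for illustration.

```python
import numpy as np

def fraction_above_tol(coords, theta1, tol=1e-6):
    """Share of entries of the exponential correlation matrix larger than tol."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    corr = np.exp(-d / theta1)      # Matern correlation with shape 1/2
    return float(np.mean(corr > tol))

# Hypothetical homes spaced 100 units apart along a line.
coords = np.column_stack([100.0 * np.arange(50), np.zeros(50)])
small = fraction_above_tol(coords, theta1=5.0)
large = fraction_above_tol(coords, theta1=500.0)
print(small, large)
```

With the small spread only the diagonal survives the tolerance, so the matrix is effectively diagonal; with the large spread every entry survives and the matrix is dense.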
31 Scalability I
For $\theta_1 = 500$:
32 Scalability II: More on $\theta_1$
- $\hat\theta_1$ is an active constraint, at the upper bound.
- This reflects a desire to increase the spatial scale of correlation, i.e. a smoother surface.
Conclusion: the upper bound enforced on $\theta_1$ should be interpreted as a model complexity parameter.
- Keeping $\theta_1$ small increases the sparsity of $\Sigma(\theta)$ and decreases the scale of the spatial correlation effect.
- Choose the upper bound via cross validation.
33 Inner Optimization
- The EM algorithm requires an inner optimization.
- Dynamically adjust the convergence tolerance (optim/factr) in early iterations for speed.
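The slide refers to the `factr` control of R's `optim`. As a rough Python analogue, SciPy's L-BFGS-B exposes an `ftol` option that corresponds to `factr` times machine epsilon. The sketch below loosens the tolerance in early outer iterations and tightens it later; the schedule values, the function names, and the toy quadratic objective are assumptions, not the author's settings.

```python
import numpy as np
from scipy.optimize import minimize

def factr_schedule(iteration, start=1e12, end=1e7, decay=10.0):
    """Loose tolerance early, tightened by a factor of `decay` per outer iteration."""
    return max(end, start / decay ** iteration)

def inner_optimize(objective, x0, iteration):
    """One inner optimization with an iteration-dependent convergence tolerance."""
    ftol = factr_schedule(iteration) * np.finfo(float).eps
    return minimize(objective, x0, method="L-BFGS-B", options={"ftol": ftol})

def toy_objective(v):
    # Stand-in for the EM objective; minimized at v = (3, 3).
    return float(np.sum((v - 3.0) ** 2))

x = np.zeros(2)
for it in range(8):                 # the outer loop stands in for EM iterations
    x = inner_optimize(toy_objective, x, it).x   # warm start from the previous solution
print(x)
```

Early inner solves only need to move the parameters in roughly the right direction, since the posterior moments will be updated again before the next solve; precision matters mainly near convergence.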
35
- The Hedonic Bayesian model needs very few parameters to describe a complex spatial field.
- The model does a good job describing the variability inherent in the data.
Future work:
- Experimentation with smaller $\sigma^2$; cross validation of the $\theta_1$ upper bound
- Increase scalability through a more thorough approach to sparsity
More informationClassification. 1 o Semestre 2007/2008
Classification Departamento de Engenharia Informática Instituto Superior Técnico 1 o Semestre 2007/2008 Slides baseados nos slides oficiais do livro Mining the Web c Soumen Chakrabarti. Outline 1 2 3 Single-Class
More informationSGN (4 cr) Chapter 11
SGN-41006 (4 cr) Chapter 11 Clustering Jussi Tohka & Jari Niemi Department of Signal Processing Tampere University of Technology February 25, 2014 J. Tohka & J. Niemi (TUT-SGN) SGN-41006 (4 cr) Chapter
More informationScalable Multidimensional Hierarchical Bayesian Modeling on Spark
Scalable Multidimensional Hierarchical Bayesian Modeling on Spark Robert Ormandi, Hongxia Yang and Quan Lu Yahoo! Sunnyvale, CA 2015 Click-Through-Rate (CTR) Prediction Estimating the probability of click
More information2014 Stat-Ease, Inc. All Rights Reserved.
What s New in Design-Expert version 9 Factorial split plots (Two-Level, Multilevel, Optimal) Definitive Screening and Single Factor designs Journal Feature Design layout Graph Columns Design Evaluation
More informationVariational Methods for Discrete-Data Latent Gaussian Models
Variational Methods for Discrete-Data Latent Gaussian Models University of British Columbia Vancouver, Canada March 6, 2012 The Big Picture Joint density models for data with mixed data types Bayesian
More informationGene Clustering & Classification
BINF, Introduction to Computational Biology Gene Clustering & Classification Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Introduction to Gene Clustering
More informationVC 17/18 TP14 Pattern Recognition
VC 17/18 TP14 Pattern Recognition Mestrado em Ciência de Computadores Mestrado Integrado em Engenharia de Redes e Sistemas Informáticos Miguel Tavares Coimbra Outline Introduction to Pattern Recognition
More informationPredictive Analytics: Demystifying Current and Emerging Methodologies. Tom Kolde, FCAS, MAAA Linda Brobeck, FCAS, MAAA
Predictive Analytics: Demystifying Current and Emerging Methodologies Tom Kolde, FCAS, MAAA Linda Brobeck, FCAS, MAAA May 18, 2017 About the Presenters Tom Kolde, FCAS, MAAA Consulting Actuary Chicago,
More informationDoubly Cyclic Smoothing Splines and Analysis of Seasonal Daily Pattern of CO2 Concentration in Antarctica
Boston-Keio Workshop 2016. Doubly Cyclic Smoothing Splines and Analysis of Seasonal Daily Pattern of CO2 Concentration in Antarctica... Mihoko Minami Keio University, Japan August 15, 2016 Joint work with
More informationOutlier Pursuit: Robust PCA and Collaborative Filtering
Outlier Pursuit: Robust PCA and Collaborative Filtering Huan Xu Dept. of Mechanical Engineering & Dept. of Mathematics National University of Singapore Joint w/ Constantine Caramanis, Yudong Chen, Sujay
More informationGenerative and discriminative classification techniques
Generative and discriminative classification techniques Machine Learning and Category Representation 013-014 Jakob Verbeek, December 13+0, 013 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.13.14
More informationarxiv: v1 [stat.me] 29 May 2015
MIMCA: Multiple imputation for categorical variables with multiple correspondence analysis Vincent Audigier 1, François Husson 2 and Julie Josse 2 arxiv:1505.08116v1 [stat.me] 29 May 2015 Applied Mathematics
More information