Bayesian Optimization Nando de Freitas

Size: px
Start display at page:

Download "Bayesian Optimization Nando de Freitas"

Transcription

1 Bayesian Optimization Nando de Freitas Eric Brochu, Ruben Martinez-Cantin, Matt Hoffman, Ziyu Wang,

2 More resources This talk follows the presentation in the following review paper available fro Oxford website.

3 Outline Some applications Parametric Bayesian optimization Beta-Bernoulli bandit models Linear bandit models Neural network and other feature-based models Non-Parametric Bayesian optimization Gaussian processes Random forests Acquisition functions A huge bag of problems Hyper-parameters and robustness Optimizing acquisition functions Conditional spaces Non-stationarity Parallelization Constraints and cost sensitivity High-dimensions Multi-task / context Freeze-thaw / early stopping Unknown optimization regions Empirical hardness models and variants

4 Black-box optimization / design

5

6 Automatic machine learning [Hoffman, Shahriari & df, 2013]

7 Information extraction / concept learning I love silence of the lambs. It s a scary movie. The confit de canard was delish! Expert tuning by Hector Garcia-Molina et al: Towards The Web Of Concepts: Extracting Concepts from Large Datasets [Ziyu Wang et al, 2014]

8 Automatic (Adaptive) Monte Carlo samplers [Wang, Mohamed & df, 2013]

9 Analytics, dynamic creative content and A/B testing [Steve Scott on Bayesian bandits at Google]

10 animation session target [Brochu, Ghosh & df, Brochu, Brochu, df, 2010] Winner of the SRC competition - SIGGRAPH

11 Tuning NP hard problem solvers lpsolve is a mixed integer programming solver, downloaded over 40,000 times last year. 47 discrete parameters (choices)

12 [Denil et al 2012]

13 GP Policy for tracking Digits Experiment: Face Experiment: 13

14 Hierarchical reinforcement learning High-level model-based learning for deciding when to navigate, park, pickup and dropoff passengers. Mid-level active path learning for navigating a topological map. Low-level active policy optimizer to learn control of continuous non-linear vehicle dynamics.

15 Active Path Finding in Middle Level Navigate policy generates sequence of waypoints on a topological map to navigate from a location to a destination.

16 Low-Level: Trajectory following TORCS: 3D game engine that implements complex vehicle dynamics complete with manual and automatic transmission, engine, clutch, tire, and suspension models.

17 Bayesian optimization was used to find the neural net low-level policy and the value function for the upper levels

18 Many other applications Robotics, control, reinforcement learning, SAT solvers, scheduling, planning Configuration of ad-centers, compilers, hardware, software Programming by optimization

19 Outline Some applications Parametric Bayesian optimization Beta-Bernoulli bandit models Linear bandit models Neural network and other feature-based models Non-Parametric Bayesian optimization Gaussian processes Random forests Acquisition functions A huge bag of problems Hyper-parameters and robustness Optimizing acquisition functions Conditional spaces Non-stationarity Parallelization Constraints and cost sensitivity High-dimensions Multi-task / context Freeze-thaw / early stopping Unknown optimization regions Empirical hardness models and variants

20 Bayesian parametric optimization

21 Beta-Bernoulli Bayesian optimization Consider the following K-armed bandit setting. money!

22 Beta-Bernoulli Bayesian optimization Specify a Beta prior on each arm:

23 Beta-Bernoulli Bayesian optimization

24 Thompson sampling At each iteration, draw a sample from the poster Pick the action of highest expected return

25 Thompson sampling

26 Linear bandits Introduce correlations among the arms Use standard conjugate Normal Inverse Gamma p

27 Linear bandits The posterior mean and covariance are analytical Once again, it is straightforward to do Thompson

28 Being linear in features It is trivial to extend the linear model using feature Features can be RBFs, sinusoids Or even popular deep networks

29 One reason for Bayes The predictive distribution at a test point x is: Z p(yjx; D; ¾ 2) = N (yjxµ; ¾ 2)N (µjµn ; V n )dµ = N (yjxµn ; ¾ 2 + xv n xt ) The variance in this prediction, ¾n2 (x), depends on two terms: the servation noise with variance ¾ 2, and the parameter uncertainty xv this latter term varies depending on how close x is to the training D. The error bars get larger as we move away from the training po representing increased uncertainty. By contrast, the plugin approximation has constant sized error bars, Z ^ ¾ 2) p(yjx; D; ¾ 2) ¼ N (yjxµ; ¾ 2)±µ^(µ)dµ = p(yjxµ;

30 Outline Some applications Parametric Bayesian optimization Beta-Bernoulli bandit models Linear bandit models Neural network and other feature-based models Non-Parametric Bayesian optimization Gaussian processes Random forests Acquisition functions A huge bag of problems Hyper-parameters and robustness Optimizing acquisition functions Conditional spaces Non-stationarity Parallelization Constraints and cost sensitivity High-dimensions Multi-task / context Freeze-thaw / early stopping Unknown optimization regions Empirical hardness models and variants

31 From linear models to Gaussian processes

32 Gaussian processes

33 Gaussian processes

34 GP log marginal likelihood And GP hyper-parameters

35 Trees for regression [Criminisi et al, 2011]

36 Regression trees [Criminisi et al, 2011]

37 Regression forests [Criminisi et al, 2011]

38 Outline Some applications Parametric Bayesian optimization Beta-Bernoulli bandit models Linear bandit models Neural network and other feature-based models Non-Parametric Bayesian optimization Gaussian processes Random forests Acquisition functions A huge bag of problems Hyper-parameters and robustness Optimizing acquisition functions Conditional spaces Non-stationarity Parallelization Constraints and cost sensitivity High-dimensions Multi-task / context Freeze-thaw / early stopping Unknown optimization regions Empirical hardness models and variants

39 Bayesian experimental design Uses the principle of maximum expected utility: Example 1: active learning Example 2: sequential decision making with discou

40 Some acquisition functions

41 Thompson sampling with GPs Posterior over location of the minimum: Thompson: Mechanism:

42 Randomized Thompson sampling Bochner s lemma tell us that: With sampling and this lemma, we can construct a feature based approximation that is amenable to computing. Then do linear Bayesian optimization with sinuso

43 Entropy search and variants Posterior over location of the minimum: Utility: minimize uncertainty of location of minimum

44 The choice of utility in practice [Hoffman, Shahriari & df, 2013]

45 The choice of utility in practice

46 Outline Some applications Parametric Bayesian optimization Beta-Bernoulli bandit models Linear bandit models Neural network and other feature-based models Non-Parametric Bayesian optimization Gaussian processes Random forests Acquisition functions A huge bag of problems Hyper-parameters and robustness Optimizing acquisition functions Conditional spaces Non-stationarity Parallelization Constraints and cost sensitivity High-dimensions Multi-task / context Freeze-thaw / early stopping Unknown optimization regions Empirical hardness models and variants

47 Outline Some applications Parametric Bayesian optimization Beta-Bernoulli bandit models Linear bandit models Neural network and other feature-based models Non-Parametric Bayesian optimization Gaussian processes Random forests Acquisition functions A huge bag of problems Hyper-parameters and robustness Optimizing acquisition functions Conditional spaces Non-stationarity Parallelization Constraints and cost sensitivity High-dimensions Multi-task / context Freeze-thaw / early stopping Unknown optimization regions Empirical hardness models and variants

48 Hyper-parameters and robustness Learning the hyper-parameters of the GP is important. The tuning method must be automatic! One of the best ways to manage the GP hyper-parameters is to integrate them out, e.g. with slice sampling as in Spearmint. But this is still dangerous!

49 Hyper-parameters and robustness

50 Spearmint Slice sampling Ziyu Wang & NdF 2014 Theory bounds

51 Spearmint Slice sampling Ziyu Wang & NdF 2014 Theory bounds

52 Outline Some applications Parametric Bayesian optimization Beta-Bernoulli bandit models Linear bandit models Neural network and other feature-based models Non-Parametric Bayesian optimization Gaussian processes Random forests Acquisition functions A huge bag of problems Hyper-parameters and robustness Optimizing acquisition functions Conditional spaces Non-stationarity Parallelization Constraints and cost sensitivity High-dimensions Multi-task / context Freeze-thaw / early stopping Unknown optimization regions Empirical hardness models and variants

53 Analysis of Bayes Opt Proposition 1 (Variance Bound) : Theorem 2 (Vanishing regret) : [df, Zoghi & Smola, 2012]

54

55 Multi-scale optimistic optimization [Remi Munos - SOO, UCT] [Ziyu Wang et al, 2014]

56 Outline Some applications Parametric Bayesian optimization Beta-Bernoulli bandit models Linear bandit models Neural network and other feature-based models Non-Parametric Bayesian optimization Gaussian processes Random forests Acquisition functions A huge bag of problems Hyper-parameters and robustness Optimizing acquisition functions Conditional spaces Non-stationarity Parallelization Constraints and cost sensitivity High-dimensions Multi-task / context Freeze-thaw / early stopping Unknown optimization regions Empirical hardness models and variants

57 Conditional parameters A big problem in deep learning (see Torch 7)

58 Outline Some applications Parametric Bayesian optimization Beta-Bernoulli bandit models Linear bandit models Neural network and other feature-based models Non-Parametric Bayesian optimization Gaussian processes Random forests Acquisition functions A huge bag of problems Hyper-parameters and robustness Optimizing acquisition functions Conditional spaces Non-stationarity Parallelization Constraints and cost sensitivity High-dimensions Multi-task / context Freeze-thaw / early stopping Unknown optimization regions Empirical hardness models and variants

59 Input warping [Jasper Snoek, Kevin Swersky, Rich Zemel, Ryan Adams,

60 Treed BayesOpt [Assael, Wang and NdF Rob Gramacy has may papers using trees and GPs ]

61

62 Treed BayesOpt

63 Treed BayesOpt

64 Treed BayesOpt Aggregate to deal with paucity of data in leaves to estimate the hyper-parameters

65 Warping + trees

66 Outline Some applications Parametric Bayesian optimization Beta-Bernoulli bandit models Linear bandit models Neural network and other feature-based models Non-Parametric Bayesian optimization Gaussian processes Random forests Acquisition functions A huge bag of problems Hyper-parameters and robustness Optimizing acquisition functions Conditional spaces Non-stationarity Parallelization Constraints and cost sensitivity High-dimensions Multi-task / context Freeze-thaw / early stopping Unknown optimization regions Empirical hardness models and variants

67 Parallelization Talk to David Ginsbourger. Essentially, augment the observations for finished runs with predicted observations for unfinished runs. E.g.,

68 Outline Some applications Parametric Bayesian optimization Beta-Bernoulli bandit models Linear bandit models Neural network and other feature-based models Non-Parametric Bayesian optimization Gaussian processes Random forests Acquisition functions A huge bag of problems Hyper-parameters and robustness Optimizing acquisition functions Conditional spaces Non-stationarity Parallelization Constraints and cost sensitivity High-dimensions Multi-task / context Freeze-thaw / early stopping Unknown optimization regions Empirical hardness models and variants

69 Constraints and cost sensitivity There are many approaches (Gramacy, Snoek, ). Could use a GP with binary observations h(.) that indicate the probability of the constraint being satisfied: Often trivial scaling is used to deal with time.

70 Outline Some applications Parametric Bayesian optimization Beta-Bernoulli bandit models Linear bandit models Neural network and other feature-based models Non-Parametric Bayesian optimization Gaussian processes Random forests Acquisition functions A huge bag of problems Hyper-parameters and robustness Optimizing acquisition functions Conditional spaces Non-stationarity Parallelization Constraints and cost sensitivity High-dimensions Multi-task / context Freeze-thaw / early stopping Unknown optimization regions Empirical hardness models and variants

71 Random Embedding Bayesian Optimization [Wang, Zoghi, Matheson, Hutter & df, IJCAI 2013 Distinguished paper award]

72

73 Scaling to over a billion dimensions

74 Outline Some applications Parametric Bayesian optimization Beta-Bernoulli bandit models Linear bandit models Neural network and other feature-based models Non-Parametric Bayesian optimization Gaussian processes Random forests Acquisition functions A huge bag of problems Hyper-parameters and robustness Optimizing acquisition functions Conditional spaces Non-stationarity Parallelization Constraints and cost sensitivity High-dimensions Multi-task / context Freeze-thaw / early stopping Unknown optimization regions Empirical hardness models and variants

75 Multi-task Define a kernel over tasks (ouputs) and inputs. E. Can also do BayesOpt with many parametric approaches to contextual / multi-task regression. E.g. context could be features of

76 Outline Some applications Parametric Bayesian optimization Beta-Bernoulli bandit models Linear bandit models Neural network and other feature-based models Non-Parametric Bayesian optimization Gaussian processes Random forests Acquisition functions A huge bag of problems Hyper-parameters and robustness Optimizing acquisition functions Conditional spaces Non-stationarity Parallelization Constraints and cost sensitivity High-dimensions Multi-task / context Freeze-thaw / early stopping Unknown optimization regions Empirical hardness models and variants

77 Freeze-thaw learning curve prediction [Kevin Swersky et al, See also work of David Ginsbourger and Frank Hutter]

78 Outline Some applications Parametric Bayesian optimization Beta-Bernoulli bandit models Linear bandit models Neural network and other feature-based models Non-Parametric Bayesian optimization Gaussian processes Random forests Acquisition functions A huge bag of problems Hyper-parameters and robustness Optimizing acquisition functions Conditional spaces Non-stationarity Parallelization Constraints and cost sensitivity High-dimensions Multi-task / context Freeze-thaw / early stopping Unknown optimization regions Empirical hardness models and variants

79 Unknown optimization boundary [Bobak Shahriari et al, 2015

80 Unknown optimization boundary [Bobak Shahriari et al, 2015

81 Outline Some applications Parametric Bayesian optimization Beta-Bernoulli bandit models Linear bandit models Neural network and other feature-based models Non-Parametric Bayesian optimization Gaussian processes Random forests Acquisition functions A huge bag of problems Hyper-parameters and robustness Optimizing acquisition functions Conditional spaces Non-stationarity Parallelization Constraints and cost sensitivity High-dimensions Multi-task / context Freeze-thaw / early stopping Unknown optimization regions Empirical hardness models and variants

82 Empirical Hardness Models Formally, given a set of particular problem instances, p P, and an algorithm, a A (with free-parameters), the objective of performance prediction is to build a model that predicts the performance of a when run on an arbitrary p P. See work of Kevin Leyton-Brown and Holger Hoos.

83 Thank you

arxiv: v1 [cs.lg] 29 May 2014

arxiv: v1 [cs.lg] 29 May 2014 BayesOpt: A Bayesian Optimization Library for Nonlinear Optimization, Experimental Design and Bandits arxiv:1405.7430v1 [cs.lg] 29 May 2014 Ruben Martinez-Cantin rmcantin@unizar.es May 30, 2014 Abstract

More information

Overview on Automatic Tuning of Hyperparameters

Overview on Automatic Tuning of Hyperparameters Overview on Automatic Tuning of Hyperparameters Alexander Fonarev http://newo.su 20.02.2016 Outline Introduction to the problem and examples Introduction to Bayesian optimization Overview of surrogate

More information

Advancing Bayesian Optimization: The Mixed-Global-Local (MGL) Kernel and Length-Scale Cool Down

Advancing Bayesian Optimization: The Mixed-Global-Local (MGL) Kernel and Length-Scale Cool Down Advancing Bayesian Optimization: The Mixed-Global-Local (MGL) Kernel and Length-Scale Cool Down Kim P. Wabersich Machine Learning and Robotics Lab University of Stuttgart, Germany wabersich@kimpeter.de

More information

Bayesian Optimization for Parameter Selection of Random Forests Based Text Classifier

Bayesian Optimization for Parameter Selection of Random Forests Based Text Classifier Bayesian Optimization for Parameter Selection of Random Forests Based Text Classifier 1 2 3 4 Anonymous Author(s) Affiliation Address email 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

More information

2. Blackbox hyperparameter optimization and AutoML

2. Blackbox hyperparameter optimization and AutoML AutoML 2017. Automatic Selection, Configuration & Composition of ML Algorithms. at ECML PKDD 2017, Skopje. 2. Blackbox hyperparameter optimization and AutoML Pavel Brazdil, Frank Hutter, Holger Hoos, Joaquin

More information

Efficient Hyper-parameter Optimization for NLP Applications

Efficient Hyper-parameter Optimization for NLP Applications Efficient Hyper-parameter Optimization for NLP Applications Lidan Wang 1, Minwei Feng 1, Bowen Zhou 1, Bing Xiang 1, Sridhar Mahadevan 2,1 1 IBM Watson, T. J. Watson Research Center, NY 2 College of Information

More information

Sequential Model-based Optimization for General Algorithm Configuration

Sequential Model-based Optimization for General Algorithm Configuration Sequential Model-based Optimization for General Algorithm Configuration Frank Hutter, Holger Hoos, Kevin Leyton-Brown University of British Columbia LION 5, Rome January 18, 2011 Motivation Most optimization

More information

Structure Optimization for Deep Multimodal Fusion Networks using Graph-Induced Kernels

Structure Optimization for Deep Multimodal Fusion Networks using Graph-Induced Kernels Structure Optimization for Deep Multimodal Fusion Networks using Graph-Induced Kernels Dhanesh Ramachandram 1, Michal Lisicki 1, Timothy J. Shields, Mohamed R. Amer and Graham W. Taylor1 1- Machine Learning

More information

GPflowOpt: A Bayesian Optimization Library using TensorFlow

GPflowOpt: A Bayesian Optimization Library using TensorFlow GPflowOpt: A Bayesian Optimization Library using TensorFlow Nicolas Knudde Joachim van der Herten Tom Dhaene Ivo Couckuyt Ghent University imec IDLab Department of Information Technology {nicolas.knudde,

More information

Towards efficient Bayesian Optimization for Big Data

Towards efficient Bayesian Optimization for Big Data Towards efficient Bayesian Optimization for Big Data Aaron Klein 1 Simon Bartels Stefan Falkner 1 Philipp Hennig Frank Hutter 1 1 Department of Computer Science University of Freiburg, Germany {kleinaa,sfalkner,fh}@cs.uni-freiburg.de

More information

Gaussian Processes for Robotics. McGill COMP 765 Oct 24 th, 2017

Gaussian Processes for Robotics. McGill COMP 765 Oct 24 th, 2017 Gaussian Processes for Robotics McGill COMP 765 Oct 24 th, 2017 A robot must learn Modeling the environment is sometimes an end goal: Space exploration Disaster recovery Environmental monitoring Other

More information

CS 229 Midterm Review

CS 229 Midterm Review CS 229 Midterm Review Course Staff Fall 2018 11/2/2018 Outline Today: SVMs Kernels Tree Ensembles EM Algorithm / Mixture Models [ Focus on building intuition, less so on solving specific problems. Ask

More information

Infinite dimensional optimistic optimisation with applications on physical systems

Infinite dimensional optimistic optimisation with applications on physical systems Infinite dimensional optimistic optimisation with applications on physical systems Muhammad F. Kasim & Peter A. Norreys University of Oxford Oxford, OX1 3RH {m.kasim1, peter.norreys}@physics.ox.ac.uk Abstract

More information

Applied Bayesian Nonparametrics 5. Spatial Models via Gaussian Processes, not MRFs Tutorial at CVPR 2012 Erik Sudderth Brown University

Applied Bayesian Nonparametrics 5. Spatial Models via Gaussian Processes, not MRFs Tutorial at CVPR 2012 Erik Sudderth Brown University Applied Bayesian Nonparametrics 5. Spatial Models via Gaussian Processes, not MRFs Tutorial at CVPR 2012 Erik Sudderth Brown University NIPS 2008: E. Sudderth & M. Jordan, Shared Segmentation of Natural

More information

Efficient Transfer Learning Method for Automatic Hyperparameter Tuning

Efficient Transfer Learning Method for Automatic Hyperparameter Tuning Efficient Transfer Learning Method for Automatic Hyperparameter Tuning Dani Yogatama Carnegie Mellon University Pittsburgh, PA 1513, USA dyogatama@cs.cmu.edu Gideon Mann Google Research New York, NY 111,

More information

Bayesian Multi-Scale Optimistic Optimization

Bayesian Multi-Scale Optimistic Optimization Ziyu Wang Babak Shakibi Lin Jin Nando de Freitas University of Oxford University of British Columbia Rocket Gaming Systems University of Oxford Abstract Bayesian optimization is a powerful global optimization

More information

Performance Prediction and Automated Tuning of Randomized and Parametric Algorithms

Performance Prediction and Automated Tuning of Randomized and Parametric Algorithms Performance Prediction and Automated Tuning of Randomized and Parametric Algorithms Frank Hutter 1, Youssef Hamadi 2, Holger Hoos 1, and Kevin Leyton-Brown 1 1 University of British Columbia, Vancouver,

More information

Think Globally, Act Locally: a Local Strategy for Bayesian Optimization

Think Globally, Act Locally: a Local Strategy for Bayesian Optimization Think Globally, Act Locally: a Local Strategy for Bayesian Optimization Vu Nguyen, Sunil Gupta, Santu Rana, Cheng Li, Svetha Venkatesh Centre of Pattern Recognition and Data Analytic (PraDa), Deakin University

More information

Slides credited from Dr. David Silver & Hung-Yi Lee

Slides credited from Dr. David Silver & Hung-Yi Lee Slides credited from Dr. David Silver & Hung-Yi Lee Review Reinforcement Learning 2 Reinforcement Learning RL is a general purpose framework for decision making RL is for an agent with the capacity to

More information

Scalable Meta-Learning for Bayesian Optimization using Ranking-Weighted Gaussian Process Ensembles

Scalable Meta-Learning for Bayesian Optimization using Ranking-Weighted Gaussian Process Ensembles ICML 2018 AutoML Workshop Scalable Meta-Learning for Bayesian Optimization using Ranking-Weighted Gaussian Process Ensembles Matthias Feurer University of Freiburg Benjamin Letham Facebook Eytan Bakshy

More information

08 An Introduction to Dense Continuous Robotic Mapping

08 An Introduction to Dense Continuous Robotic Mapping NAVARCH/EECS 568, ROB 530 - Winter 2018 08 An Introduction to Dense Continuous Robotic Mapping Maani Ghaffari March 14, 2018 Previously: Occupancy Grid Maps Pose SLAM graph and its associated dense occupancy

More information

Speed-Constrained Tuning for Statistical Machine Translation Using Bayesian Optimization

Speed-Constrained Tuning for Statistical Machine Translation Using Bayesian Optimization Speed-Constrained Tuning for Statistical Machine Translation Using Bayesian Optimization Daniel Beck Adrià de Gispert Gonzalo Iglesias Aurelien Waite Bill Byrne Department of Computer Science, University

More information

Bayesian model ensembling using meta-trained recurrent neural networks

Bayesian model ensembling using meta-trained recurrent neural networks Bayesian model ensembling using meta-trained recurrent neural networks Luca Ambrogioni l.ambrogioni@donders.ru.nl Umut Güçlü u.guclu@donders.ru.nl Yağmur Güçlütürk y.gucluturk@donders.ru.nl Julia Berezutskaya

More information

FMA901F: Machine Learning Lecture 3: Linear Models for Regression. Cristian Sminchisescu

FMA901F: Machine Learning Lecture 3: Linear Models for Regression. Cristian Sminchisescu FMA901F: Machine Learning Lecture 3: Linear Models for Regression Cristian Sminchisescu Machine Learning: Frequentist vs. Bayesian In the frequentist setting, we seek a fixed parameter (vector), with value(s)

More information

arxiv: v1 [cs.lg] 31 Mar 2016

arxiv: v1 [cs.lg] 31 Mar 2016 arxiv:1603.09441v1 [cs.lg] 31 Mar 2016 Ian Dewancker Michael McCourt Scott Clark Patrick Hayes Alexandra Johnson George Ke SigOpt, San Francisco, CA 94108 USA Abstract Empirical analysis serves as an important

More information

arxiv: v1 [stat.ml] 14 Aug 2015

arxiv: v1 [stat.ml] 14 Aug 2015 Unbounded Bayesian Optimization via Regularization arxiv:1508.03666v1 [stat.ml] 14 Aug 2015 Bobak Shahriari University of British Columbia bshahr@cs.ubc.ca Alexandre Bouchard-Côté University of British

More information

Learning to Choose Instance-Specific Macro Operators

Learning to Choose Instance-Specific Macro Operators Learning to Choose Instance-Specific Macro Operators Maher Alhossaini Department of Computer Science University of Toronto Abstract The acquisition and use of macro actions has been shown to be effective

More information

Learning to Transfer Initializations for Bayesian Hyperparameter Optimization

Learning to Transfer Initializations for Bayesian Hyperparameter Optimization Learning to Transfer Initializations for Bayesian Hyperparameter Optimization Jungtaek Kim, Saehoon Kim, and Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and

More information

Learning from Data: Adaptive Basis Functions

Learning from Data: Adaptive Basis Functions Learning from Data: Adaptive Basis Functions November 21, 2005 http://www.anc.ed.ac.uk/ amos/lfd/ Neural Networks Hidden to output layer - a linear parameter model But adapt the features of the model.

More information

Probabilistic Robotics

Probabilistic Robotics Probabilistic Robotics Sebastian Thrun Wolfram Burgard Dieter Fox The MIT Press Cambridge, Massachusetts London, England Preface xvii Acknowledgments xix I Basics 1 1 Introduction 3 1.1 Uncertainty in

More information

arxiv: v4 [stat.ml] 22 Apr 2018

arxiv: v4 [stat.ml] 22 Apr 2018 The Parallel Knowledge Gradient Method for Batch Bayesian Optimization arxiv:1606.04414v4 [stat.ml] 22 Apr 2018 Jian Wu, Peter I. Frazier Cornell University Ithaca, NY, 14853 {jw926, pf98}@cornell.edu

More information

Time Complexity and Parallel Speedup to Compute the Gamma Summarization Matrix

Time Complexity and Parallel Speedup to Compute the Gamma Summarization Matrix Time Complexity and Parallel Speedup to Compute the Gamma Summarization Matrix Carlos Ordonez, Yiqun Zhang Department of Computer Science, University of Houston, USA Abstract. We study the serial and parallel

More information

L10. PARTICLE FILTERING CONTINUED. NA568 Mobile Robotics: Methods & Algorithms

L10. PARTICLE FILTERING CONTINUED. NA568 Mobile Robotics: Methods & Algorithms L10. PARTICLE FILTERING CONTINUED NA568 Mobile Robotics: Methods & Algorithms Gaussian Filters The Kalman filter and its variants can only model (unimodal) Gaussian distributions Courtesy: K. Arras Motivation

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 12 Combining

More information

10-701/15-781, Fall 2006, Final

10-701/15-781, Fall 2006, Final -7/-78, Fall 6, Final Dec, :pm-8:pm There are 9 questions in this exam ( pages including this cover sheet). If you need more room to work out your answer to a question, use the back of the page and clearly

More information

Probabilistic Robotics

Probabilistic Robotics Probabilistic Robotics Bayes Filter Implementations Discrete filters, Particle filters Piecewise Constant Representation of belief 2 Discrete Bayes Filter Algorithm 1. Algorithm Discrete_Bayes_filter(

More information

PSU Student Research Symposium 2017 Bayesian Optimization for Refining Object Proposals, with an Application to Pedestrian Detection Anthony D.

PSU Student Research Symposium 2017 Bayesian Optimization for Refining Object Proposals, with an Application to Pedestrian Detection Anthony D. PSU Student Research Symposium 2017 Bayesian Optimization for Refining Object Proposals, with an Application to Pedestrian Detection Anthony D. Rhodes 5/10/17 What is Machine Learning? Machine learning

More information

Model learning for robot control: a survey

Model learning for robot control: a survey Model learning for robot control: a survey Duy Nguyen-Tuong, Jan Peters 2011 Presented by Evan Beachly 1 Motivation Robots that can learn how their motors move their body Complexity Unanticipated Environments

More information

ECS289: Scalable Machine Learning

ECS289: Scalable Machine Learning ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Sept 22, 2016 Course Information Website: http://www.stat.ucdavis.edu/~chohsieh/teaching/ ECS289G_Fall2016/main.html My office: Mathematical Sciences

More information

Boosting as a Metaphor for Algorithm Design

Boosting as a Metaphor for Algorithm Design Boosting as a Metaphor for Algorithm Design Kevin Leyton-Brown, Eugene Nudelman, Galen Andrew, Jim McFadden, and Yoav Shoham {kevinlb;eugnud;galand;jmcf;shoham}@cs.stanford.edu Stanford University, Stanford

More information

arxiv: v2 [stat.ml] 15 May 2018

arxiv: v2 [stat.ml] 15 May 2018 Mark McLeod 1 Michael A. Osborne 1 2 Stephen J. Roberts 1 2 arxiv:1703.04335v2 [stat.ml] 15 May 2018 Abstract We propose a novel Bayesian Optimization approach for black-box objectives with a variable

More information

Alternatives to Direct Supervision

Alternatives to Direct Supervision CreativeAI: Deep Learning for Graphics Alternatives to Direct Supervision Niloy Mitra Iasonas Kokkinos Paul Guerrero Nils Thuerey Tobias Ritschel UCL UCL UCL TUM UCL Timetable Theory and Basics State of

More information

Mondrian Forests: Efficient Online Random Forests

Mondrian Forests: Efficient Online Random Forests Mondrian Forests: Efficient Online Random Forests Balaji Lakshminarayanan (Gatsby Unit, UCL) Daniel M. Roy (Cambridge Toronto) Yee Whye Teh (Oxford) September 4, 2014 1 Outline Background and Motivation

More information

Tracking Algorithms. Lecture16: Visual Tracking I. Probabilistic Tracking. Joint Probability and Graphical Model. Deterministic methods

Tracking Algorithms. Lecture16: Visual Tracking I. Probabilistic Tracking. Joint Probability and Graphical Model. Deterministic methods Tracking Algorithms CSED441:Introduction to Computer Vision (2017F) Lecture16: Visual Tracking I Bohyung Han CSE, POSTECH bhhan@postech.ac.kr Deterministic methods Given input video and current state,

More information

arxiv: v1 [cs.lg] 6 Jun 2016

arxiv: v1 [cs.lg] 6 Jun 2016 Learning to Optimize arxiv:1606.01885v1 [cs.lg] 6 Jun 2016 Ke Li Jitendra Malik Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley, CA 94720 United States

More information

arxiv: v3 [math.oc] 18 Mar 2015

arxiv: v3 [math.oc] 18 Mar 2015 A warped kernel improving robustness in Bayesian optimization via random embeddings arxiv:1411.3685v3 [math.oc] 18 Mar 2015 Mickaël Binois 1,2 David Ginsbourger 3 Olivier Roustant 1 1 Mines Saint-Étienne,

More information

Adaptive Dropout Training for SVMs

Adaptive Dropout Training for SVMs Department of Computer Science and Technology Adaptive Dropout Training for SVMs Jun Zhu Joint with Ning Chen, Jingwei Zhuo, Jianfei Chen, Bo Zhang Tsinghua University ShanghaiTech Symposium on Data Science,

More information

Machine Learning. Supervised Learning. Manfred Huber

Machine Learning. Supervised Learning. Manfred Huber Machine Learning Supervised Learning Manfred Huber 2015 1 Supervised Learning Supervised learning is learning where the training data contains the target output of the learning system. Training data D

More information

An introduction to multi-armed bandits

An introduction to multi-armed bandits An introduction to multi-armed bandits Henry WJ Reeve (Manchester) (henry.reeve@manchester.ac.uk) A joint work with Joe Mellor (Edinburgh) & Professor Gavin Brown (Manchester) Plan 1. An introduction to

More information

Introduction to Mobile Robotics

Introduction to Mobile Robotics Introduction to Mobile Robotics Gaussian Processes Wolfram Burgard Cyrill Stachniss Giorgio Grisetti Maren Bennewitz Christian Plagemann SS08, University of Freiburg, Department for Computer Science Announcement

More information

Practical Course WS12/13 Introduction to Monte Carlo Localization

Practical Course WS12/13 Introduction to Monte Carlo Localization Practical Course WS12/13 Introduction to Monte Carlo Localization Cyrill Stachniss and Luciano Spinello 1 State Estimation Estimate the state of a system given observations and controls Goal: 2 Bayes Filter

More information

IROS 05 Tutorial. MCL: Global Localization (Sonar) Monte-Carlo Localization. Particle Filters. Rao-Blackwellized Particle Filters and Loop Closing

IROS 05 Tutorial. MCL: Global Localization (Sonar) Monte-Carlo Localization. Particle Filters. Rao-Blackwellized Particle Filters and Loop Closing IROS 05 Tutorial SLAM - Getting it Working in Real World Applications Rao-Blackwellized Particle Filters and Loop Closing Cyrill Stachniss and Wolfram Burgard University of Freiburg, Dept. of Computer

More information

Variational Methods for Discrete-Data Latent Gaussian Models

Variational Methods for Discrete-Data Latent Gaussian Models Variational Methods for Discrete-Data Latent Gaussian Models University of British Columbia Vancouver, Canada March 6, 2012 The Big Picture Joint density models for data with mixed data types Bayesian

More information

Using Reinforcement Learning to Optimize Storage Decisions Ravi Khadiwala Cleversafe

Using Reinforcement Learning to Optimize Storage Decisions Ravi Khadiwala Cleversafe Using Reinforcement Learning to Optimize Storage Decisions Ravi Khadiwala Cleversafe Topics What is Reinforcement Learning? Exploration vs. Exploitation The Multi-armed Bandit Optimizing read locations

More information

Statistics (STAT) Statistics (STAT) 1. Prerequisites: grade in C- or higher in STAT 1200 or STAT 1300 or STAT 1400

Statistics (STAT) Statistics (STAT) 1. Prerequisites: grade in C- or higher in STAT 1200 or STAT 1300 or STAT 1400 Statistics (STAT) 1 Statistics (STAT) STAT 1200: Introductory Statistical Reasoning Statistical concepts for critically evaluation quantitative information. Descriptive statistics, probability, estimation,

More information

Level-set MCMC Curve Sampling and Geometric Conditional Simulation

Level-set MCMC Curve Sampling and Geometric Conditional Simulation Level-set MCMC Curve Sampling and Geometric Conditional Simulation Ayres Fan John W. Fisher III Alan S. Willsky February 16, 2007 Outline 1. Overview 2. Curve evolution 3. Markov chain Monte Carlo 4. Curve

More information

Efficient Hyperparameter Optimization of Deep Learning Algorithms Using Deterministic RBF Surrogates

Efficient Hyperparameter Optimization of Deep Learning Algorithms Using Deterministic RBF Surrogates Efficient Hyperparameter Optimization of Deep Learning Algorithms Using Deterministic RBF Surrogates Ilija Ilievski Graduate School for Integrative Sciences and Engineering National University of Singapore

More information

Markov Random Fields and Gibbs Sampling for Image Denoising

Markov Random Fields and Gibbs Sampling for Image Denoising Markov Random Fields and Gibbs Sampling for Image Denoising Chang Yue Electrical Engineering Stanford University changyue@stanfoed.edu Abstract This project applies Gibbs Sampling based on different Markov

More information

Particle Filter in Brief. Robot Mapping. FastSLAM Feature-based SLAM with Particle Filters. Particle Representation. Particle Filter Algorithm

Particle Filter in Brief. Robot Mapping. FastSLAM Feature-based SLAM with Particle Filters. Particle Representation. Particle Filter Algorithm Robot Mapping FastSLAM Feature-based SLAM with Particle Filters Cyrill Stachniss Particle Filter in Brief! Non-parametric, recursive Bayes filter! Posterior is represented by a set of weighted samples!

More information

Probabilistic Graphical Models

Probabilistic Graphical Models Overview of Part Two Probabilistic Graphical Models Part Two: Inference and Learning Christopher M. Bishop Exact inference and the junction tree MCMC Variational methods and EM Example General variational

More information

Contents. Preface to the Second Edition

Contents. Preface to the Second Edition Preface to the Second Edition v 1 Introduction 1 1.1 What Is Data Mining?....................... 4 1.2 Motivating Challenges....................... 5 1.3 The Origins of Data Mining....................

More information

10703 Deep Reinforcement Learning and Control

10703 Deep Reinforcement Learning and Control 10703 Deep Reinforcement Learning and Control Russ Salakhutdinov Machine Learning Department rsalakhu@cs.cmu.edu Policy Gradient I Used Materials Disclaimer: Much of the material and slides for this lecture

More information

DATA MINING AND MACHINE LEARNING. Lecture 6: Data preprocessing and model selection Lecturer: Simone Scardapane

DATA MINING AND MACHINE LEARNING. Lecture 6: Data preprocessing and model selection Lecturer: Simone Scardapane DATA MINING AND MACHINE LEARNING Lecture 6: Data preprocessing and model selection Lecturer: Simone Scardapane Academic Year 2016/2017 Table of contents Data preprocessing Feature normalization Missing

More information

NERC Gazebo simulation implementation

NERC Gazebo simulation implementation NERC 2015 - Gazebo simulation implementation Hannan Ejaz Keen, Adil Mumtaz Department of Electrical Engineering SBA School of Science & Engineering, LUMS, Pakistan {14060016, 14060037}@lums.edu.pk ABSTRACT

More information

Information Filtering for arxiv.org

Information Filtering for arxiv.org Information Filtering for arxiv.org Bandits, Exploration vs. Exploitation, & the Cold Start Problem Peter Frazier School of Operations Research & Information Engineering Cornell University with Xiaoting

More information

Last week. Multi-Frame Structure from Motion: Multi-View Stereo. Unknown camera viewpoints

Last week. Multi-Frame Structure from Motion: Multi-View Stereo. Unknown camera viewpoints Last week Multi-Frame Structure from Motion: Multi-View Stereo Unknown camera viewpoints Last week PCA Today Recognition Today Recognition Recognition problems What is it? Object detection Who is it? Recognizing

More information

Traditional Acquisition Functions for Bayesian Optimization

Traditional Acquisition Functions for Bayesian Optimization Traditional Acquisition Functions for Bayesian Optimization Jungtaek Kim (jtkim@postech.edu) Machine Learning Group, Department of Computer Science and Engineering, POSTECH, 77-Cheongam-ro, Nam-gu, Pohang-si

More information

Bayesian Adaptive Direct Search: Hybrid Bayesian Optimization for Model Fitting

Bayesian Adaptive Direct Search: Hybrid Bayesian Optimization for Model Fitting Bayesian Adaptive Direct Search: Hybrid Bayesian Optimization for Model Fitting Luigi Acerbi Center for Neural Science New Yor University luigi.acerbi@nyu.edu Wei Ji Ma Center for Neural Science & Dept.

More information

Automated Configuration of MIP solvers

Automated Configuration of MIP solvers Automated Configuration of MIP solvers Frank Hutter, Holger Hoos, and Kevin Leyton-Brown Department of Computer Science University of British Columbia Vancouver, Canada {hutter,hoos,kevinlb}@cs.ubc.ca

More information

Mondrian Forests: Efficient Online Random Forests

Mondrian Forests: Efficient Online Random Forests Mondrian Forests: Efficient Online Random Forests Balaji Lakshminarayanan Joint work with Daniel M. Roy and Yee Whye Teh 1 Outline Background and Motivation Mondrian Forests Randomization mechanism Online

More information

arxiv: v2 [cs.lg] 8 Nov 2017

arxiv: v2 [cs.lg] 8 Nov 2017 ACCELERATING NEURAL ARCHITECTURE SEARCH US- ING PERFORMANCE PREDICTION Bowen Baker 1, Otkrist Gupta 1, Ramesh Raskar 1, Nikhil Naik 2 (* - indicates equal contribution) 1 MIT Media Laboratory, Cambridge,

More information

Where we are. Exploratory Graph Analysis (40 min) Focused Graph Mining (40 min) Refinement of Query Results (40 min)

Where we are. Exploratory Graph Analysis (40 min) Focused Graph Mining (40 min) Refinement of Query Results (40 min) Where we are Background (15 min) Graph models, subgraph isomorphism, subgraph mining, graph clustering Eploratory Graph Analysis (40 min) Focused Graph Mining (40 min) Refinement of Query Results (40 min)

More information

Unsupervised Learning

Unsupervised Learning Deep Learning for Graphics Unsupervised Learning Niloy Mitra Iasonas Kokkinos Paul Guerrero Vladimir Kim Kostas Rematas Tobias Ritschel UCL UCL/Facebook UCL Adobe Research U Washington UCL Timetable Niloy

More information

ECG782: Multidimensional Digital Signal Processing

ECG782: Multidimensional Digital Signal Processing ECG782: Multidimensional Digital Signal Processing Object Recognition http://www.ee.unlv.edu/~b1morris/ecg782/ 2 Outline Knowledge Representation Statistical Pattern Recognition Neural Networks Boosting

More information

Building Classifiers using Bayesian Networks

Building Classifiers using Bayesian Networks Building Classifiers using Bayesian Networks Nir Friedman and Moises Goldszmidt 1997 Presented by Brian Collins and Lukas Seitlinger Paper Summary The Naive Bayes classifier has reasonable performance

More information

ACCELERATING NEURAL ARCHITECTURE SEARCH US-

ACCELERATING NEURAL ARCHITECTURE SEARCH US- ACCELERATING NEURAL ARCHITECTURE SEARCH US- ING PERFORMANCE PREDICTION Bowen Baker, Otkrist Gupta, Ramesh Raskar Media Laboratory Massachusetts Institute of Technology Cambridge, MA 02142, USA {bowen,otkrist,raskar}@mit.edu

More information

Probabilistic Robotics

Probabilistic Robotics Probabilistic Robotics FastSLAM Sebastian Thrun (abridged and adapted by Rodrigo Ventura in Oct-2008) The SLAM Problem SLAM stands for simultaneous localization and mapping The task of building a map while

More information

Combining Hyperband and Bayesian Optimization

Combining Hyperband and Bayesian Optimization Combining Hyperband and Bayesian Optimization Stefan Falkner Aaron Klein Frank Hutter Department of Computer Science University of Freiburg {sfalkner, kleinaa, fh}@cs.uni-freiburg.de Abstract Proper hyperparameter

More information

Supplementary material for: BO-HB: Robust and Efficient Hyperparameter Optimization at Scale

Supplementary material for: BO-HB: Robust and Efficient Hyperparameter Optimization at Scale Supplementary material for: BO-: Robust and Efficient Hyperparameter Optimization at Scale Stefan Falkner 1 Aaron Klein 1 Frank Hutter 1 A. Available Software To promote reproducible science and enable

More information

CPSC 340: Machine Learning and Data Mining. Probabilistic Classification Fall 2017

CPSC 340: Machine Learning and Data Mining. Probabilistic Classification Fall 2017 CPSC 340: Machine Learning and Data Mining Probabilistic Classification Fall 2017 Admin Assignment 0 is due tonight: you should be almost done. 1 late day to hand it in Monday, 2 late days for Wednesday.

More information

CSE 586 Final Programming Project Spring 2011 Due date: Tuesday, May 3

CSE 586 Final Programming Project Spring 2011 Due date: Tuesday, May 3 CSE 586 Final Programming Project Spring 2011 Due date: Tuesday, May 3 What I have in mind for our last programming project is to do something with either graphical models or random sampling. A few ideas

More information

CSC411 Fall 2014 Machine Learning & Data Mining. Ensemble Methods. Slides by Rich Zemel

CSC411 Fall 2014 Machine Learning & Data Mining. Ensemble Methods. Slides by Rich Zemel CSC411 Fall 2014 Machine Learning & Data Mining Ensemble Methods Slides by Rich Zemel Ensemble methods Typical application: classi.ication Ensemble of classi.iers is a set of classi.iers whose individual

More information

3 Feature Selection & Feature Extraction

3 Feature Selection & Feature Extraction 3 Feature Selection & Feature Extraction Overview: 3.1 Introduction 3.2 Feature Extraction 3.3 Feature Selection 3.3.1 Max-Dependency, Max-Relevance, Min-Redundancy 3.3.2 Relevance Filter 3.3.3 Redundancy

More information

Neural Networks for Machine Learning. Lecture 15a From Principal Components Analysis to Autoencoders

Neural Networks for Machine Learning. Lecture 15a From Principal Components Analysis to Autoencoders Neural Networks for Machine Learning Lecture 15a From Principal Components Analysis to Autoencoders Geoffrey Hinton Nitish Srivastava, Kevin Swersky Tijmen Tieleman Abdel-rahman Mohamed Principal Components

More information

CSC 411 Lecture 4: Ensembles I

CSC 411 Lecture 4: Ensembles I CSC 411 Lecture 4: Ensembles I Roger Grosse, Amir-massoud Farahmand, and Juan Carrasquilla University of Toronto UofT CSC 411: 04-Ensembles I 1 / 22 Overview We ve seen two particular classification algorithms:

More information

Applying Supervised Learning

Applying Supervised Learning Applying Supervised Learning When to Consider Supervised Learning A supervised learning algorithm takes a known set of input data (the training set) and known responses to the data (output), and trains

More information

arxiv: v1 [cs.cv] 5 Jan 2018

arxiv: v1 [cs.cv] 5 Jan 2018 Combination of Hyperband and Bayesian Optimization for Hyperparameter Optimization in Deep Learning arxiv:1801.01596v1 [cs.cv] 5 Jan 2018 Jiazhuo Wang Blippar jiazhuo.wang@blippar.com Jason Xu Blippar

More information

Deep Reinforcement Learning

Deep Reinforcement Learning Deep Reinforcement Learning 1 Outline 1. Overview of Reinforcement Learning 2. Policy Search 3. Policy Gradient and Gradient Estimators 4. Q-prop: Sample Efficient Policy Gradient and an Off-policy Critic

More information

Network Traffic Measurements and Analysis

Network Traffic Measurements and Analysis DEIB - Politecnico di Milano Fall, 2017 Sources Hastie, Tibshirani, Friedman: The Elements of Statistical Learning James, Witten, Hastie, Tibshirani: An Introduction to Statistical Learning Andrew Ng:

More information

SAS High-Performance Analytics Products

SAS High-Performance Analytics Products Fact Sheet What do SAS High-Performance Analytics products do? With high-performance analytics products from SAS, you can develop and process models that use huge amounts of diverse data. These products

More information

arxiv: v1 [stat.ml] 17 Nov 2015

arxiv: v1 [stat.ml] 17 Nov 2015 Bayesian Optimization with Dimension Scheduling: Application to Biological Systems Doniyor Ulmasov a, Caroline Baroukh b, Benoit Chachuat c, Marc Peter Deisenroth a,* and Ruth Misener a,* arxiv:1511.05385v1

More information

CSE 490R P1 - Localization using Particle Filters Due date: Sun, Jan 28-11:59 PM

CSE 490R P1 - Localization using Particle Filters Due date: Sun, Jan 28-11:59 PM CSE 490R P1 - Localization using Particle Filters Due date: Sun, Jan 28-11:59 PM 1 Introduction In this assignment you will implement a particle filter to localize your car within a known map. This will

More information

Data Mining Practical Machine Learning Tools and Techniques. Slides for Chapter 6 of Data Mining by I. H. Witten and E. Frank

Data Mining Practical Machine Learning Tools and Techniques. Slides for Chapter 6 of Data Mining by I. H. Witten and E. Frank Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 6 of Data Mining by I. H. Witten and E. Frank Implementation: Real machine learning schemes Decision trees Classification

More information

Overview. EECS 124, UC Berkeley, Spring 2008 Lecture 23: Localization and Mapping. Statistical Models

Overview. EECS 124, UC Berkeley, Spring 2008 Lecture 23: Localization and Mapping. Statistical Models Introduction ti to Embedded dsystems EECS 124, UC Berkeley, Spring 2008 Lecture 23: Localization and Mapping Gabe Hoffmann Ph.D. Candidate, Aero/Astro Engineering Stanford University Statistical Models

More information

Machine Learning Techniques for Data Mining

Machine Learning Techniques for Data Mining Machine Learning Techniques for Data Mining Eibe Frank University of Waikato New Zealand 10/25/2000 1 PART VII Moving on: Engineering the input and output 10/25/2000 2 Applying a learner is not all Already

More information

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CHAPTER 4 CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS 4.1 Introduction Optical character recognition is one of

More information

Learning algorithms for physical systems: challenges and solutions

Learning algorithms for physical systems: challenges and solutions Learning algorithms for physical systems: challenges and solutions Ion Matei Palo Alto Research Center 2018 PARC 1 All Rights Reserved System analytics: how things are done Use of models (physics) to inform

More information

Applied Statistics for Neuroscientists Part IIa: Machine Learning

Applied Statistics for Neuroscientists Part IIa: Machine Learning Applied Statistics for Neuroscientists Part IIa: Machine Learning Dr. Seyed-Ahmad Ahmadi 04.04.2017 16.11.2017 Outline Machine Learning Difference between statistics and machine learning Modeling the problem

More information

Warped Mixture Models

Warped Mixture Models Warped Mixture Models Tomoharu Iwata, David Duvenaud, Zoubin Ghahramani Cambridge University Computational and Biological Learning Lab March 11, 2013 OUTLINE Motivation Gaussian Process Latent Variable

More information

BAYESIAN GLOBAL OPTIMIZATION

BAYESIAN GLOBAL OPTIMIZATION BAYESIAN GLOBAL OPTIMIZATION Using Optimal Learning to Tune Deep Learning Pipelines Scott Clark scott@sigopt.com OUTLINE 1. Why is Tuning AI Models Hard? 2. Comparison of Tuning Methods 3. Bayesian Global

More information