random fourier features for kernel ridge regression: approximation bounds and statistical guarantees


Haim Avron, Michael Kapralov, Cameron Musco, Christopher Musco, Ameya Velingker, and Amir Zandieh
Tel Aviv University, EPFL, and MIT. ICML 2017.

overview

Our Contributions:

- Analyze the random Fourier features method (Rahimi & Recht '07) for kernel approximation using leverage score-based techniques.
- Concrete: introduce a new sampling distribution that gives statistical guarantees for kernel ridge regression when used to approximate the Gaussian kernel.
- High level: the hope is that Fourier leverage scores will have further applications in kernel approximation, function approximation, and sparse Fourier transform methods.

kernel approximation

Kernel methods are expensive:

- Even writing down $K$ requires $\Omega(n^2)$ time.
- Other operations require even more: a single iteration of a linear system solver takes $\Omega(n^2)$ time.
- For $n = 100{,}000$, $K$ has 10 billion entries, which takes 80 GB of storage if each entry is a double.

solution: low-rank approximation

Employ the classic solution: a low-rank approximation $K \approx ZZ^T$ with $Z \in \mathbb{R}^{n \times s}$.

- Storing $Z$ uses $O(ns)$ space, and computing $ZZ^T x$ takes $O(ns)$ time.
- Orthogonalization, eigendecomposition, and pseudo-inversion of $ZZ^T$ all take just $O(ns^2)$ time.
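
As a concrete illustration of the $O(ns^2)$ claim, linear systems in $ZZ^T + \lambda I$ can be solved without ever forming the $n \times n$ matrix; a minimal sketch via the Woodbury identity (our own example, not from the talk):

```python
import numpy as np

def solve_low_rank(Z, lam, b):
    """Solve (Z Z^T + lam I) x = b in O(n s^2) time via the Woodbury identity:
    (Z Z^T + lam I)^{-1} = (I - Z (Z^T Z + lam I)^{-1} Z^T) / lam."""
    s = Z.shape[1]
    inner = Z.T @ Z + lam * np.eye(s)          # s x s, O(n s^2) to form
    return (b - Z @ np.linalg.solve(inner, Z.T @ b)) / lam

rng = np.random.default_rng(0)
Z, b = rng.standard_normal((1000, 20)), rng.standard_normal(1000)
x = solve_low_rank(Z, lam=0.1, b=b)
print(np.linalg.norm(Z @ (Z.T @ x) + 0.1 * x - b))   # residual is ~machine precision
```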

efficient low-rank approximation?

Low-rank approximation is itself an expensive task.

- Optimal low-rank approximation via direct eigendecomposition, or even approximation via Krylov subspace methods, is out of the question, since these at least require fully forming $K$.
- Many faster methods have been studied: incomplete Cholesky factorization (Fine & Scheinberg '02, Bach & Jordan '02), entrywise sampling (Achlioptas, McSherry & Schölkopf '01), Nyström approximation (Williams & Seeger '01), and random Fourier features (Rahimi & Recht '07).

random fourier features

Rahimi & Recht, NIPS '07:

- For any shift-invariant kernel $k(x_i, x_j) = k(x_i - x_j)$, let $p(\cdot)$ be the Fourier transform of $k(\cdot)$. By Bochner's theorem, $p(\eta) \ge 0$ for all $\eta$.
- Sample $\eta_1, \dots, \eta_s \in \mathbb{R}^d$ with probabilities proportional to $p(\eta)$.
- Set $z_i = \frac{1}{\sqrt{s}}\left[e^{2\pi i \eta_1^T x_i}, \dots, e^{2\pi i \eta_s^T x_i}\right]$.
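
A minimal sketch of the construction for the Gaussian kernel $k(z) = e^{-\|z\|_2^2/2}$, whose Fourier transform $p$ is itself Gaussian (the sizes and the error check are our own illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, s = 200, 2, 2000
X = rng.standard_normal((n, d))

# Exact Gaussian kernel K_ij = exp(-||x_i - x_j||^2 / 2).
sq = np.sum(X**2, axis=1)
K = np.exp(-(sq[:, None] + sq[None, :] - 2 * X @ X.T) / 2)

# With k(z) = \int p(eta) e^{2 pi i eta^T z} d eta, p for this kernel is
# N(0, I/(2 pi)^2), so sampling proportionally to p is a scaled Gaussian draw.
eta = rng.standard_normal((s, d)) / (2 * np.pi)
Z = np.exp(2j * np.pi * (X @ eta.T)) / np.sqrt(s)   # n x s feature matrix

# Entrywise error shrinks like 1/sqrt(s).
print(np.max(np.abs(Z @ Z.conj().T - K)))
```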

view as matrix sampling method

The Fourier transform $k(z) = \int_{\eta \in \mathbb{R}^d} p(\eta)\, e^{2\pi i \eta^T z}\, d\eta$ gives:

$\int_\eta \Phi_i(\eta)\, p(\eta)\, \overline{\Phi_j(\eta)}\, d\eta = \int_\eta e^{2\pi i \eta^T (x_i - x_j)}\, p(\eta)\, d\eta = k(x_i - x_j) = K_{i,j},$

with $\Phi_i(\eta) = e^{2\pi i \eta^T x_i}$. Set $\tilde{\Phi} = \Phi P^{1/2}$, so $K = \tilde{\Phi} \tilde{\Phi}^*$.
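
A quick numerical check of the inversion formula in one dimension (the grid and the convention $p(\eta) = \sqrt{2\pi}\, e^{-2\pi^2 \eta^2}$ for the Gaussian kernel are ours):

```python
import numpy as np

# k(z) = e^{-z^2/2} should equal \int p(eta) e^{2 pi i eta z} d eta
# with p(eta) = sqrt(2 pi) e^{-2 pi^2 eta^2}.
eta = np.linspace(-3, 3, 4001)
p = np.sqrt(2 * np.pi) * np.exp(-2 * np.pi**2 * eta**2)
for z in (0.0, 0.7, 2.0):
    integral = np.sum(p * np.exp(2j * np.pi * eta * z)) * (eta[1] - eta[0])
    print(z, np.real(integral), np.exp(-z**2 / 2))   # the two values agree
```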

view as matrix sampling method

Sample columns: $Z^{(j)} = \frac{1}{\sqrt{s\, p(\eta)}}\, \tilde{\Phi}(\eta)$ with probability $p(\eta)$, so $\mathbb{E}[ZZ^*] = K$.

Equivalently, $z_i = \frac{1}{\sqrt{s}}\left[e^{2\pi i \eta_1^T x_i}, \dots, e^{2\pi i \eta_s^T x_i}\right]$ for $\eta_1, \dots, \eta_s$ sampled according to $p(\eta)$.
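
The same unbiasedness holds for any proposal density $q$ once each column is reweighted by $\sqrt{p(\eta)/(s\, q(\eta))}$; this is the hook that later lets a leverage-based distribution be swapped in. A sketch (the helper names and the wider Gaussian proposal are our own):

```python
import numpy as np

def gauss_pdf(e, sig):
    """Isotropic N(0, sig^2 I) density, evaluated row-wise on an (s, d) array."""
    d = e.shape[1]
    return np.exp(-np.sum(e**2, axis=1) / (2 * sig**2)) / (2 * np.pi * sig**2) ** (d / 2)

def sampled_features(X, s, draw_q, q_pdf, p_pdf, rng):
    """Column-sampling estimator: draw frequencies from an arbitrary proposal q
    and reweight each column by sqrt(p(eta) / (s q(eta))), so that
    E[Z Z^*] = K for any q covering the support of p (q = p is classic RFF)."""
    eta = draw_q(s, rng)                            # (s, d) frequencies
    w = np.sqrt(p_pdf(eta) / (s * q_pdf(eta)))      # per-column weights
    return np.exp(2j * np.pi * (X @ eta.T)) * w     # (n, s) matrix Z

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 2))
Z = sampled_features(
    X, s=5000, rng=rng,
    draw_q=lambda s, r: r.standard_normal((s, 2)) / np.pi,   # wider proposal
    q_pdf=lambda e: gauss_pdf(e, 1 / np.pi),
    p_pdf=lambda e: gauss_pdf(e, 1 / (2 * np.pi)),           # p for e^{-||z||^2/2}
)
```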

view as matrix sampling method

- $Z$ is a sample of $\tilde{\Phi} = \Phi P^{1/2}$: columns are sampled with probability $p(\eta)$, i.e., proportionally to their squared column norms.
- It is well known from work on randomized methods in linear algebra that there are better sampling probabilities (in both theory and practice): the column leverage scores.
- Also noted by Bach '17, and implicit in Rudi et al.

comparison of approximation guarantees

Column norm sampling: $s = \tilde{O}(d/\epsilon^2)$ samples ensure that $(ZZ^*)_{i,j} = K_{i,j} \pm \epsilon$ for all $i, j$ with high probability [RR07].

Ridge leverage score sampling: $s = \tilde{O}(s_\lambda/\epsilon^2)$ samples give spectral approximation:

$(1 - \epsilon)(ZZ^* + \lambda I) \preceq K + \lambda I \preceq (1 + \epsilon)(ZZ^* + \lambda I),$

where $s_\lambda = \operatorname{tr}(K(K + \lambda I)^{-1})$ is the statistical dimension.

Spectral approximation gives statistical guarantees for kernel ridge regression (this work), and approximation bounds for kernel PCA and k-means clustering (Cohen, Musco & Musco '16, '17).
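
Both quantities in the guarantee are easy to evaluate on small instances; a sketch using SciPy's generalized eigensolver (the helper names are ours):

```python
import numpy as np
from scipy.linalg import eigh

def statistical_dimension(K, lam):
    """s_lambda = tr(K (K + lam I)^{-1}) = sum_i mu_i / (mu_i + lam)."""
    mu = np.linalg.eigvalsh(K)
    return float(np.sum(mu / (mu + lam)))

def spectral_error(K, Z, lam):
    """Smallest eps with
    (1 - eps)(Z Z^* + lam I) <= K + lam I <= (1 + eps)(Z Z^* + lam I),
    via generalized eigenvalues t of (K + lam I) v = t (Z Z^* + lam I) v."""
    n = K.shape[0]
    t = eigh(K + lam * np.eye(n), Z @ Z.conj().T + lam * np.eye(n),
             eigvals_only=True)
    return float(max(t.max() - 1.0, 1.0 - t.min()))
```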

how to leverage score sample

The ridge leverage score for frequency $\eta$ is given by:

$\tau_\lambda(\eta) = \tilde{\Phi}(\eta)^* (K + \lambda I)^{-1} \tilde{\Phi}(\eta).$

It is expensive to invert $K + \lambda I$. But even if you could do that efficiently, it is not at all clear that you could efficiently sample from the leverage score distribution.
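
For intuition, $\tau_\lambda$ can be evaluated brute-force on tiny 1-d examples, which is exactly the $O(n^3)$ computation the slide warns is infeasible at scale (the conventions for the Gaussian kernel are ours):

```python
import numpy as np

def gaussian_kernel_leverage(x, etas, lam):
    """Brute-force ridge leverage function for the 1-d Gaussian kernel:
    tau_lambda(eta) = p(eta) * phi(eta)^* (K + lam I)^{-1} phi(eta),
    where phi(eta)_j = e^{2 pi i eta x_j} and p is the kernel's Fourier
    transform. For intuition on small examples only."""
    K = np.exp(-np.subtract.outer(x, x) ** 2 / 2)
    Kinv = np.linalg.inv(K + lam * np.eye(len(x)))
    p = np.sqrt(2 * np.pi) * np.exp(-2 * np.pi**2 * etas**2)
    Phi = np.exp(2j * np.pi * np.outer(etas, x))           # m x n
    quad = np.einsum("mi,ij,mj->m", Phi.conj(), Kinv, Phi)
    return p * np.real(quad)

x = np.random.default_rng(0).uniform(-1, 1, 40)
tau = gaussian_kernel_leverage(x, np.linspace(-3, 3, 121), lam=1e-2)
```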

bounding the fourier leverage scores

Main goal: get a handle on the Fourier ridge leverage scores for common kernels by upper bounding them with simple distributions.

1. Improve random Fourier features.
2. Bound the statistical dimension by the sum of the leverage scores.
3. Connections with sparse Fourier transforms, Fourier interpolation, and other problems.

alternative leverage score characterization

The ridge leverage score $\tau_\lambda(\eta) = \tilde{\Phi}(\eta)^* (K + \lambda I)^{-1} \tilde{\Phi}(\eta)$ also equals:

$\tau_\lambda(\eta) = \min_y\; \lambda^{-1} \|\tilde{\Phi} y - \tilde{\Phi}(\eta)\|_2^2 + \|y\|_2^2.$

Intuition: $\tau_\lambda(\eta)$ is small iff there exists a function $y : \mathbb{R}^d \to \mathbb{C}$ with low energy ($\|y\|_2^2$ small) whose ($\sqrt{p(\eta)}$-weighted) Fourier transform is close to the frequency $e^{2\pi i x_j^T \eta}$ at each data point.

$y$ reconstructs frequency $\eta$ from other frequencies. The easier it is to reconstruct, the less important it is to sample.
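
This equivalence is the standard ridge-regression variational identity, easy to verify on any finite stand-in for $\tilde{\Phi}$; a sketch with a random matrix in place of the continuous feature operator (the closed-form minimizer is $y^\star = (\tilde{\Phi}^T\tilde{\Phi} + \lambda I)^{-1}\tilde{\Phi}^T \tilde{\Phi}(\eta)$):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, lam = 15, 60, 0.1
Phi = rng.standard_normal((n, m))     # finite stand-in for Phi~
a = Phi[:, 0]                         # the column Phi~(eta) being scored

# direct formula: tau = a^T (Phi Phi^T + lam I)^{-1} a
tau_direct = a @ np.linalg.solve(Phi @ Phi.T + lam * np.eye(n), a)

# variational formula: tau = min_y (1/lam) ||Phi y - a||^2 + ||y||^2,
# with closed-form minimizer y* = (Phi^T Phi + lam I)^{-1} Phi^T a
y = np.linalg.solve(Phi.T @ Phi + lam * np.eye(m), Phi.T @ a)
tau_min = np.sum((Phi @ y - a) ** 2) / lam + np.sum(y**2)

print(tau_direct, tau_min)            # agree to machine precision
```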

frequency reconstruction for bounded data

Assume the data points are 1-dimensional and bounded: $x_1, \dots, x_n \in [-\delta, \delta]$. One possibility is to choose $y$ with ($\sqrt{p(\eta)}$-weighted) Fourier transform equal to $e^{2\pi i x \eta}$ for all $x \in [-\delta, \delta]$.

This is achieved by the shifted sinc function $\operatorname{sinc}(2\delta[\xi - \eta])$, weighted by $1/\sqrt{p(\eta)}$.

[Figure: the target pure wave $e^{2\pi i x \eta}$ (real component) used to bound the leverage score $\tau_\lambda(\eta)$, alongside the candidate test function $\operatorname{sinc}(2\delta[\xi - \eta])$ that reconstructs it.]

improved test function

Unfortunately, the sinc function falls off too slowly. For the Gaussian kernel, the $1/\sqrt{p(\eta)} = e^{\eta^2/4}$ weighting grows faster than $\operatorname{sinc}(2\delta\eta) = \frac{\sin(2\delta\eta)}{2\delta\eta}$ falls off, so $\|y\|_2$ is unbounded.

Solution: dampen the sinc by multiplying it with a Gaussian, which keeps the Fourier transform nearly identical.

[Figure: the sinc and Gaussian components and the dampened test function $g_\eta(\xi) = y_\eta(\xi)\sqrt{p(\xi)}$, which has lower energy while its Fourier transform still approximates the target.]
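
A numerical sketch of the dampening step ($\delta$, $\eta$, and the Gaussian width below are illustrative choices, not the tuned ones from the paper):

```python
import numpy as np

# The dampened test function g(xi) = sinc(2 delta (xi - eta)) * gaussian keeps a
# nearly identical Fourier transform on [-delta, delta].
delta, eta = 1.0, 3.0
xi = np.linspace(eta - 40, eta + 40, 20001)
d_xi = xi[1] - xi[0]
g = np.sinc(2 * delta * (xi - eta)) * np.exp(-((xi - eta) ** 2) / 50)

# The undamped shifted sinc has Fourier transform e^{-2 pi i x eta} / (2 delta)
# on |x| <= delta; check the dampened version against that target.
for x in (0.0, 0.5 * delta, 0.9 * delta):
    ft = np.sum(g * np.exp(-2j * np.pi * x * xi)) * d_xi * 2 * delta
    print(x, abs(ft - np.exp(-2j * np.pi * x * eta)))   # stays small
```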

ultimate gaussian kernel leverage score bound

Upshot: it is easy to sample from an approximate leverage distribution for the Gaussian kernel with $x_1, \dots, x_n \in [-\delta, \delta]^d$:

$\tau_\lambda(\eta) \le \begin{cases} \tilde{O}(\delta^d) & \text{when } \|\eta\|_2 \le \sqrt{\log(n/\lambda)}, \\ p(\eta) = e^{-\|\eta\|_2^2/2} & \text{otherwise.} \end{cases}$

[Figure: relative sampling probability versus frequency for the ridge leverage function, classical random Fourier features, and the proposed modified random Fourier features.]
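
A hedged sketch of sampling from such a distribution in 1-d: a mixture proposal that is uniform on the low-frequency band and proportional to $p$ in the tail, reweighted so that $\mathbb{E}[ZZ^*] = K$ is preserved (the band radius and the 50/50 mixture weights are our illustrative choices):

```python
import numpy as np

def modified_rff(x, s, lam, rng):
    """Sketch: modified random Fourier features for the 1-d Gaussian kernel.
    Frequencies come from a mixture proposal q -- uniform on the band
    |eta| <= a where the leverage bound is flat, plus the classical density
    p(eta) for the tail -- and each column is reweighted by
    sqrt(p(eta) / (s q(eta))), preserving E[Z Z^*] = K."""
    n = len(x)
    sig = 1 / (2 * np.pi)                    # p(eta) is N(0, sig^2)
    a = sig * np.sqrt(np.log(n / lam))       # assumed crossover scale
    p = lambda e: np.exp(-e**2 / (2 * sig**2)) / np.sqrt(2 * np.pi * sig**2)
    q = lambda e: 0.5 * (np.abs(e) <= a) / (2 * a) + 0.5 * p(e)

    coin = rng.random(s) < 0.5               # 50/50 mixture (illustrative)
    eta = np.where(coin, rng.uniform(-a, a, s), sig * rng.standard_normal(s))
    w = np.sqrt(p(eta) / (s * q(eta)))
    return np.exp(2j * np.pi * np.outer(x, eta)) * w   # (n, s) features

rng = np.random.default_rng(4)
Z = modified_rff(rng.uniform(-1, 1, 200), s=1000, lam=1e-3, rng=rng)
```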

experimental results

Example of approximating a synthetic "wiggly" function:

[Figure: two panels showing the data and true function, with the exact KRR estimator on one and the MRF and CRF estimators on the other.]

CRF = classic random Fourier features (column norm sampling); MRF = our modified sampling distribution.
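
To reproduce this kind of experiment, the approximate KRR estimator is fit entirely in the $s$-dimensional feature space; a sketch with classic RFF features plugged in (the synthetic target and bandwidth are our own choices, and either feature map can be substituted for Z):

```python
import numpy as np

def fit_approx_krr(Z, y, lam):
    """Ridge regression in feature space: w = (Z^* Z + lam I)^{-1} Z^* y.
    Costs O(n s^2) rather than the O(n^3) of exact KRR; predictions Z w
    match KRR run on the approximate kernel Z Z^*."""
    s = Z.shape[1]
    return np.linalg.solve(Z.conj().T @ Z + lam * np.eye(s), Z.conj().T @ y)

rng = np.random.default_rng(3)
x = rng.uniform(-1, 1, (300, 1))
y = np.sin(8 * x[:, 0]) + 0.1 * rng.standard_normal(300)   # "wiggly" target

# classic RFF features for a Gaussian kernel of width ~1/5
eta = 5 * rng.standard_normal((500, 1)) / (2 * np.pi)
feats = lambda X: np.exp(2j * np.pi * (X @ eta.T)) / np.sqrt(500)

w = fit_approx_krr(feats(x), y, lam=1e-3)
pred = np.real(feats(x) @ w)
print(np.mean((pred - y) ** 2))   # small training error
```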

Questions?
