Data mining with sparse grids

1 Data mining with sparse grids Jochen Garcke and Michael Griebel Institut für Angewandte Mathematik Universität Bonn Data mining with sparse grids p.1/40

2 Overview
- What is Data mining?
- Regularization networks
- Sparse grids
- Numerical examples
- Conclusions

3 What is Data mining?
»Data mining is the process of exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns and rules.« [Berry and Linoff, Mastering Data Mining]
Example: Mail-order merchant (who gets a catalog?)
- The merchant aims to increase revenue per catalog mailed
- Based on available customer data a response model is built
- Available information includes, e.g.:
  - Number of quarters with at least one order placed
  - Number of catalogs purchased from
  - Number of days since last order
  - Amount of money spent per quarter going back some years

4 Data mining activities
Directed or supervised data mining
- Classification, e.g. classifying the risk of credit applicants
- Estimation, e.g. estimating the value of a piece of real estate
- Prediction, e.g. predicting which customers will leave
Undirected or unsupervised data mining
- Affinity grouping / association rules, e.g. shopping cart
- Clustering, e.g. a cluster of symptoms indicates a particular disease
- Description and visualization

5 Data mining in the knowledge discovery process
- Identifying the problem
- Data preparation
- Data mining
- Post-processing of the discovered knowledge
- Putting the results of knowledge discovery in use

6 The classification problem
- We want to compute a function, the classifier, which approximates the given training data set but also gives good results on unseen data
- For that, a compromise has to be found between the correctness of the approximation, i.e. the size of the data error, and the generalization qualities of the classifier for new, i.e. previously unseen, data
- The dimension of the data can be large; we will consider moderately high dimensions
- The data set can consist of up to millions or billions of data points

7 Approximation with data centered ansatz functions
- The error is zero at the data points, but the approximation is overfitting
- Assume smoothness properties of the classifier

8 Regularization networks
- To get a well-posed, uniquely solvable problem we have to assume further knowledge of the classifier
- Regularization theory imposes smoothness constraints
- The regularization network approach considers a variational problem with three ingredients:
  - the error of the classifier on the given data
  - the assumed smoothness properties
  - the regularization parameter
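One common way to write such a variational problem is sketched below. The notation is an assumption and not taken from the slide: M data points (x_i, y_i), a function space V, a cost function C, a smoothness functional \Phi, and a regularization parameter \lambda.

```latex
\min_{f \in V} R(f), \qquad
R(f) = \frac{1}{M} \sum_{i=1}^{M} C\bigl(f(x_i), y_i\bigr) + \lambda \, \Phi(f)
```

The first term measures the error of the classifier on the given data, \Phi encodes the assumed smoothness properties, and \lambda balances the two.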

9 Exact solution with kernels
- The classifier is expanded in a basis of the function space
- In the case of a regularization term of a particular type, defined through a decreasing positive sequence, the solution of the variational problem always has the same form

10 Reproducing Kernel Hilbert Space
- The kernel is a symmetric function and can be interpreted as the kernel of a Reproducing Kernel Hilbert Space (RKHS)
- In other words: if certain functions centered at the locations of the data points are used in an approximation scheme, then the approximate solution is a finite series and involves only as many terms as there are data points
- But in general a full linear system has to be solved
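In the regularization network literature this statement is usually written as below. It is a sketch with assumed notation: M data points x_i, a symmetric kernel K, and coefficients \alpha_i. The linear system in the second line holds for the squared error cost averaged over the M data points together with the RKHS norm as regularization term, which is one common choice and not necessarily the one on the slide.

```latex
f(x) = \sum_{i=1}^{M} \alpha_i \, K(x, x_i), \qquad K(x, y) = K(y, x)
\\[4pt]
\bigl(\mathbf{K} + \lambda M \, I\bigr)\, \alpha = y,
\qquad \mathbf{K}_{ij} = K(x_i, x_j)
```

The matrix \mathbf{K} is dense and of size M × M, so the cost of this route grows faster than linearly with the number of data points.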

11 Approximation schemes in regularization network context
- For radially symmetric kernels we end up with radial basis function approximation schemes
- Many other approximation schemes can be derived by a specific choice of the regularization operator, e.g. additive models, hyper-basis functions, ridge approximation models, and several types of neural networks
- The support vector machine (SVM) approach can also be expressed in the form of a regularization network
- All of these scale in general non-linearly in the number of data points

12 Discretization
- Different approach: we explicitly restrict the problem to a finite dimensional subspace of the function space
- The ansatz functions should span this subspace and preferably form a basis for it
- A cost function and a regularization operator are chosen
- The resulting functional is to be minimized in the finite dimensional subspace

13 Derivative of the functional
- Plug in the ansatz for the classifier and differentiate with respect to the coefficients
- Or equivalently, in the form of one equation for each coefficient
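A worked version of this step is sketched below under stated assumptions, none of which are preserved in the transcript: the ansatz f_N = \sum_j \alpha_j \varphi_j with N ansatz functions, the squared error as cost function, and a regularization term \| S f \|_{L_2}^2 with a linear operator S.

```latex
R(f_N) = \frac{1}{M} \sum_{i=1}^{M} \Bigl( \sum_{j=1}^{N} \alpha_j \varphi_j(x_i) - y_i \Bigr)^{2}
       + \lambda \, \Bigl\| \sum_{j=1}^{N} \alpha_j S\varphi_j \Bigr\|_{L_2}^{2}
\\[6pt]
0 = \frac{\partial R(f_N)}{\partial \alpha_k}
  = \frac{2}{M} \sum_{i=1}^{M} \Bigl( \sum_{j=1}^{N} \alpha_j \varphi_j(x_i) - y_i \Bigr) \varphi_k(x_i)
  + 2 \lambda \sum_{j=1}^{N} \alpha_j \, (S\varphi_j, S\varphi_k)_{L_2}
\\[6pt]
\sum_{j=1}^{N} \alpha_j \Bigl[ M \lambda \, (S\varphi_j, S\varphi_k)_{L_2}
  + \sum_{i=1}^{M} \varphi_j(x_i)\, \varphi_k(x_i) \Bigr]
  = \sum_{i=1}^{M} y_i \, \varphi_k(x_i), \qquad k = 1, \dots, N
```

The last line is the second one multiplied by M/2 and rearranged, so it is a linear system of N equations in the N unknown coefficients.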

14 Problem to solve
- We get a linear equation system for the coefficients of the ansatz functions
- The system matrix combines a regularization part and a data part
- The right-hand side is built from the vector of the data classes, whose length is the number of data points
- The vector of the unknowns has one entry per ansatz function
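The following is a minimal sketch of such a linear system in one dimension. Everything specific is an assumption made for illustration: the system is written as (lam * C + B B^T) alpha = B y, with B[j][i] the value of hat function j at data point i and C[j][k] equal to the number of data points times the L2 inner product of the derivatives of hat functions j and k. The helper names `hat`, `assemble`, `solve`, and `classify` are hypothetical.

```python
def hat(j, h, x):
    """Value at x of the piecewise linear hat function centered at node j*h."""
    return max(0.0, 1.0 - abs(x / h - j))


def assemble(xs, ys, level, lam):
    """Build the matrix and right-hand side of (lam*C + B B^T) alpha = B y."""
    n = 2**level + 1            # grid nodes on [0, 1], boundary included
    h = 1.0 / (n - 1)
    m = len(xs)
    # Data part: B[j][i] = phi_j(x_i)
    B = [[hat(j, h, x) for x in xs] for j in range(n)]
    # Regularization part: C[j][k] = m * integral of phi_j' * phi_k'
    C = [[0.0] * n for _ in range(n)]
    for j in range(n):
        C[j][j] = m * (1.0 if j in (0, n - 1) else 2.0) / h
        if j + 1 < n:
            C[j][j + 1] = C[j + 1][j] = -m / h
    A = [[lam * C[j][k] + sum(B[j][i] * B[k][i] for i in range(m))
          for k in range(n)] for j in range(n)]
    rhs = [sum(B[j][i] * ys[i] for i in range(m)) for j in range(n)]
    return A, rhs, h


def solve(A, b):
    """Dense Gaussian elimination with partial pivoting (small systems only)."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def classify(alpha, h, x):
    """Evaluate the classifier sum_j alpha_j * phi_j(x)."""
    return sum(a * hat(j, h, x) for j, a in enumerate(alpha))


# Toy data: class -1 left of 0.5, class +1 right of it
xs = [0.05 + 0.9 * k / 39 for k in range(40)]
ys = [1.0 if x > 0.5 else -1.0 for x in xs]
A, rhs, h = assemble(xs, ys, level=4, lam=1e-3)
alpha = solve(A, rhs)
```

The number of unknowns depends only on the grid, not on the number of data points; the data enter only through the sums over i. The dense direct solver is used here purely to keep the sketch short.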

15 Approximation with grid-based ansatz functions
In this picture only discrete values are used on the grid points; in general continuous values are used

16 Which function space to take?
- Again, widely used are methods with global data-centered basis functions, which scale with the number of data points
- We use a grid to discretize the data space and local basis functions on the grid points
- A naive full grid has a number of grid points that grows exponentially with the dimension, so even for a reasonable mesh size one encounters the curse of dimensionality
- To overcome this we use sparse grids, which have substantially fewer grid points
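To make the difference in grid sizes concrete, here is a small sketch that counts points. It assumes the common convention of mesh size 2^(-level), interior points only (no boundary points), and a sparse grid that keeps all hierarchical subspaces whose level indices sum to at most level + dim - 1. The function names are hypothetical.

```python
from functools import lru_cache


def full_grid_points(level, dim):
    """Interior points of a full grid with mesh size 2**(-level) per direction."""
    return (2**level - 1) ** dim


def sparse_grid_points(level, dim):
    """Interior points of a regular sparse grid of the given level.

    Sums 2**(l_1 - 1) * ... * 2**(l_dim - 1) over all level vectors with
    l_t >= 1 and l_1 + ... + l_dim <= level + dim - 1.
    """
    @lru_cache(maxsize=None)
    def count(d, budget):
        if d == 0:
            return 1
        total = 0
        # Leave at least one level for each of the remaining d - 1 directions
        for l in range(1, budget - (d - 1) + 1):
            total += 2 ** (l - 1) * count(d - 1, budget - l)
        return total

    return count(dim, level + dim - 1)


if __name__ == "__main__":
    for dim in (2, 3, 6):
        for level in (3, 5):
            print(dim, level,
                  full_grid_points(level, dim),
                  sparse_grid_points(level, dim))
```

In one dimension both counts agree; the gap opens as the dimension grows.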

17 Interpolation with the hierarchical basis
[Figure panels: Interpolation; Hierarchical basis]
- The 1D case is generalized by means of a tensor product approach
- The hierarchical values of the d-dimensional basis functions are bounded through the size of their supports
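A minimal 1D sketch of hierarchical values (often called surpluses) is given below. It assumes hat functions on [0, 1], a function that vanishes on the boundary, and the usual indexing where level l carries the odd indices i with nodes at i * 2^(-l). The names `hierarchical_surpluses` and `evaluate` are hypothetical.

```python
def hierarchical_surpluses(f, level):
    """Hierarchical coefficients of the piecewise linear interpolant of f.

    The surplus at a node is the function value minus the mean of the
    values at the two neighboring nodes of the coarser levels.
    """
    surpluses = {}
    for l in range(1, level + 1):
        h = 2.0 ** (-l)
        for i in range(1, 2**l, 2):      # odd indices belong to level l
            x = i * h
            surpluses[(l, i)] = f(x) - 0.5 * (f(x - h) + f(x + h))
    return surpluses


def evaluate(surpluses, x):
    """Evaluate the hierarchical interpolant at x."""
    total = 0.0
    for (l, i), value in surpluses.items():
        h = 2.0 ** (-l)
        total += value * max(0.0, 1.0 - abs(x / h - i))
    return total


def parabola(x):
    return 4.0 * x * (1.0 - x)


surpluses = hierarchical_surpluses(parabola, level=3)
```

For this parabola the surpluses are 1, 1/4, and 1/16 on levels 1, 2, and 3: each refinement halves the support of the basis functions and divides the surplus by four, which illustrates the bound stated above for a smooth function.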

18 Supports of the hierarchical basis functions

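19 Sparse grids
- Space of piecewise d-linear functions, given as the span of the basis functions on a grid
- Difference spaces between the spaces of neighboring levels
- Sparse grid space, built from a selection of the difference spaces
- A function in the sparse grid space can be split accordingly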

20 Properties of sparse grids
[Table comparing full grid and sparse grid: number of points, approximation properties, smoothness properties]
[Figure: sparse grids in 2D and 3D]
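For orientation, the standard textbook comparison for piecewise d-linear interpolation with mesh size h_n = 2^{-n} is sketched below. These are the usual asymptotic results from the sparse grid literature and are given as an assumption about what such a table contains, not as a reconstruction of the slide.

```latex
\begin{array}{lll}
 & \text{full grid} & \text{sparse grid} \\
\text{number of points}
  & O\bigl(h_n^{-d}\bigr)
  & O\bigl(h_n^{-1} (\log h_n^{-1})^{d-1}\bigr) \\
\text{approximation error}
  & O\bigl(h_n^{2}\bigr)
  & O\bigl(h_n^{2} (\log h_n^{-1})^{d-1}\bigr) \\
\text{smoothness required}
  & \text{bounded second derivatives}
  & \text{bounded mixed second derivatives}
\end{array}
```

The sparse grid gives up only a logarithmic factor in accuracy while removing the exponential dependence of the point count on d, at the price of a stronger smoothness assumption.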

21 Sparse grids
- Example in six dimensions: the sparse grid has only a small fraction of the points of the full grid of the same level
- Now use sparse grids to solve the minimization problem
- Linear equation system with one unknown per sparse grid point
- The matrix is more densely populated than the corresponding full grid matrices, which would add further terms to the complexity
- Explicit assembly of the matrix should be avoided
- It is difficult to implement only the action of the matrices
- The action of the data matrix would scale with the number of data points
- Therefore use the combination technique variant of sparse grids

22 Combination technique of level 4 in 2D
[Figure: the sparse grid of level 4 in 2D written as a combination of full grids]

23 Sparse grids with the combination technique
- Solve the problem on a sequence of full grids
- Combine the solutions on these grids into a sparse grid solution
- Example in two dimensions
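The sequence of grids and their weights can be sketched as follows. The code assumes the classical combination formula, in which the solutions on all anisotropic full grids with level vector l (each entry at least 1) and |l|_1 = level + (dim - 1) - q are summed with weight (-1)^q * binomial(dim - 1, q) for q = 0, ..., dim - 1. The function names are hypothetical.

```python
from math import comb


def level_vectors(dim, total):
    """Yield all tuples of `dim` integers >= 1 whose entries sum to `total`."""
    if dim == 1:
        if total >= 1:
            yield (total,)
        return
    # Leave at least 1 for each of the remaining dim - 1 entries
    for first in range(1, total - (dim - 1) + 1):
        for rest in level_vectors(dim - 1, total - first):
            yield (first,) + rest


def combination_grids(level, dim):
    """Return a list of (level_vector, coefficient) pairs."""
    grids = []
    for q in range(dim):
        coefficient = (-1) ** q * comb(dim - 1, q)
        for l in level_vectors(dim, level + (dim - 1) - q):
            grids.append((l, coefficient))
    return grids


if __name__ == "__main__":
    for l, c in combination_grids(level=4, dim=2):
        print(l, c)
```

For level 4 in two dimensions this yields the four grids (1,4), (2,3), (3,2), (4,1) with weight +1 and the three grids (1,3), (2,2), (3,1) with weight -1. For level 2 in ten dimensions it yields 11 grids. The weights always sum to 1, so constants are reproduced.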

24 Sequence of problems to solve
- Discretize and solve the minimization problem on each grid of the sequence
- The number of grids grows with the level and the dimension
- Each grid is small enough for the main memory of a workstation
- The resulting linear equation system is solved by a diagonally preconditioned conjugate gradient algorithm
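A diagonally (Jacobi) preconditioned conjugate gradient iteration can be sketched as follows. This is a generic textbook version for a small dense symmetric positive definite matrix stored as a list of rows, not the authors' implementation; in practice only the action of the matrix on a vector would be needed. The function name, tolerance, and iteration limit are assumptions.

```python
def jacobi_pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A (list of rows)."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]   # the preconditioner
    x = [0.0] * n
    r = list(b)                                    # residual for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]     # preconditioned residual
    p = list(z)                                    # search direction
    rz = dot(r, z)
    stop = tol * dot(b, b) ** 0.5
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= stop:
            break
        Ap = matvec(p)
        step = rz / dot(p, Ap)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = jacobi_pcg(A, b)
```

The preconditioner only needs the diagonal of the matrix, so it adds one division per unknown at setup and one multiplication per unknown per iteration.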

25 Complexities of the computation
To solve on each grid in the sequence of grids:
[Table: complexities of storage, assembly, and matrix-vector multiplication]
- The complexities depend on the number of grid points and on the number of data points
- The method scales linearly with the number of data points

26 Numerical Examples
We test our method with
- Benchmark data sets from the UCI Repository
- Synthetically generated massive data sets
Evaluation and comparison with other methods through either
- Correctness rates on test data sets, which were not used during the computation,
- 10-fold cross validation, or
- Leave-one-out cross validation
The best regularization parameter is found in an outer loop over several candidate values
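The evaluation protocol can be sketched as a k-fold cross validation wrapped in an outer loop over candidate regularization parameters. The sketch is generic: `fit` and `predict` are hypothetical callables standing in for any classifier, the folds are contiguous (so the data should be shuffled beforehand), and all function names are assumptions.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of nearly equal size."""
    folds, start = [], 0
    for f in range(k):
        size = n // k + (1 if f < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_correctness(data, labels, fit, predict, lam, k=10):
    """Fraction of points classified correctly when held out."""
    n = len(data)
    correct = 0
    for fold in k_fold_indices(n, k):
        held_out = set(fold)
        train_x = [data[i] for i in range(n) if i not in held_out]
        train_y = [labels[i] for i in range(n) if i not in held_out]
        model = fit(train_x, train_y, lam)
        correct += sum(predict(model, data[i]) == labels[i] for i in fold)
    return correct / n


def select_lambda(data, labels, fit, predict, candidates, k=10):
    """Outer loop: return the candidate with the best k-fold correctness."""
    scores = {lam: cross_validated_correctness(data, labels, fit, predict, lam, k)
              for lam in candidates}
    best = max(scores, key=scores.get)
    return best, scores
```

Setting k to the number of data points turns the same code into leave-one-out cross validation.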

27 Checkerboard data set / Ripley data set
- Checkerboard: k-fold correctness rate of 96.20 %
- Ripley data set with level 5: correctness rate of 90.9 %
- Ripley data set with level 8: correctness rate of 89.7 %
- Ripley data set with neural networks: 91.1 %
- The best possible rate for Ripley is 92.0 %, since 8 % error is introduced

28 Spiral data set
[Table: level | training correctness (%) | testing correctness (%)]
- Leave-one-out cross-validation results; levels 4 to 6 are shown
- 77.20 % with neural networks reported [Singh, 1998]

29 BUPA Liver Disorders data set (6D)
[Table: 10-fold training and testing correctness (%) for SVM, SSVM, SVM and for the sparse grid combination method at levels 1 to 4]
Results for the BUPA Liver Disorders data set (345 data points) from the UCI Repository in comparison to support vector machines [Lee and Mangasarian, 2001]

30 PIMA Indians Diabetes data set (8D)
[Table: 10-fold training and testing correctness (%) for SVM, SSVM, SVM and for the sparse grid combination method at levels 1 to 3]
Results for the PIMA Indians Diabetes data set (768 data points) from the UCI Repository in comparison to support vector machines [Lee and Mangasarian, 2001]

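31 Synthetic massive 6D data set
[Table: level | # of data | training correctness | testing correctness | total time (sec) | data matrix time (sec)]
Entries for two levels with three data set sizes each:
- First level: testing correctness 90.8 %, 90.8 %, 90.7 %; training correctness 90.7 % for the data set with millions of points
- Second level: testing correctness 91.5 %, 91.6 %, 91.5 %; training correctness 91.4 % for the data set with millions of points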

32 Using simplicial basis functions
- On the grids of the combination technique, linear basis functions based on a simplicial discretization are also possible
- So-called Kuhn's triangulation for each rectangular block
  [Figure: Kuhn's triangulation of the cube from (0,0,0) to (1,1,1)]
- Theoretical properties of this variant of the sparse grid technique still have to be investigated in more detail
- Since the overlap of supports is greatly reduced due to the use of a simplicial discretization, the complexities scale significantly better
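The reduced overlap can be illustrated with a short sketch of Kuhn's triangulation of the unit cube. It assumes the standard construction: sorting the coordinates of a point in decreasing order selects one of the d! simplices, whose vertices form a path from (0,...,0) to (1,...,1) that switches one coordinate at a time. The function name `kuhn_simplex` is hypothetical.

```python
def kuhn_simplex(point):
    """Locate `point` (coordinates in [0, 1]) in Kuhn's triangulation.

    Returns the d + 1 vertices of the containing simplex and the
    barycentric weights of the point, i.e. the values of the d + 1
    linear basis functions that are nonzero there.
    """
    d = len(point)
    # Axes ordered by decreasing coordinate value select the simplex
    order = sorted(range(d), key=lambda axis: -point[axis])
    vertex = [0] * d
    vertices = [tuple(vertex)]
    for axis in order:
        vertex[axis] = 1
        vertices.append(tuple(vertex))
    values = [point[axis] for axis in order]
    weights = [1.0 - values[0]]
    weights += [values[k] - values[k + 1] for k in range(d - 1)]
    weights.append(values[-1])
    return vertices, weights


vertices, weights = kuhn_simplex((0.2, 0.7, 0.5))
```

A data point therefore touches only d + 1 basis functions per grid cell, compared with 2^d for d-linear basis functions, which is where the better scaling in d comes from.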

33 Complexities for both discretization variants
[Table: storage, assembly, and matrix-vector multiplication complexities for d-linear basis functions and for linear basis functions on simplices]
- Reduced d-dependence in the complexities with linear basis functions on simplices
- N is the number of grid points
- Scales linearly with the number of data points

34 Ripley data set / Spiral data set with linear basis functions
- Ripley data set with level 4: correctness rate of 91.4 %
- Compare with 90.9 % with level 5, d-linear, and 91.1 % with neural networks
- Spiral data set with level 7 and with level 8: leave-one-out correctness, compared with level 6, d-linear

35 BUPA Liver Disorders data set (6D)
[Table: 10-fold training and testing correctness (%) for linear and d-linear basis functions at levels 1 to 4]

36 Synthetic massive 6D data set
[Table: level | # of data (millions) | training correctness | testing correctness | total time (sec) | data matrix time (sec); includes a row for linear basis functions with level 2 and 5 million data points]

37 Synthetic massive 10D data set
[Table: level | # of data (millions) | training correctness | testing correctness | total time (sec) | data matrix time (sec)]

38 Parallelization
Combination technique: parallel on a coarse grain level
- Classifiers in the sequence of grids can be computed independently of each other
- Just short setup and gather phases are necessary
- Simple but effective static load balancing strategy
Fine grain level parallelization with threads on SMP machines
- To compute the data dependent part, the array of the training set can be separated into as many parts as there are processors
- Some overhead is introduced to avoid memory conflicts
- In the iterative solver a vector can be split into parts, and each processor then computes the action of the matrix on its part of the vector

39 Synthetic massive 10D data set in parallel
Coarse grain level parallelization of the combination technique
- Speed-up of 10.1 with an efficiency of 0.92 on 11 nodes
- Since only 11 grids have to be calculated, no more than 11 nodes are needed
Threads for each partial problem in the sequence of grids
- We achieve acceptable speed-ups, from 1.6 for two processors up to 3.7 for eight processors
- As one would expect, the efficiency decreases with the number of processors
Both parallelization strategies are used simultaneously
- Each node is a shared memory dual-processor system
- On 11 nodes a speed-up of 17.9 with an efficiency of 0.81
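The efficiencies quoted here follow from the usual definition, efficiency = speed-up / number of processors. A short check, assuming that definition and that the combined run uses 11 dual-processor nodes, i.e. 22 processors (the function name is hypothetical):

```python
def parallel_efficiency(speed_up, processors):
    """Efficiency as speed-up divided by the number of processors used."""
    return speed_up / processors


coarse = parallel_efficiency(10.1, 11)        # one process per node
combined = parallel_efficiency(17.9, 11 * 2)  # 11 nodes, two processors each
threads_2 = parallel_efficiency(1.6, 2)
threads_8 = parallel_efficiency(3.7, 8)

print(round(coarse, 2), round(combined, 2))
```

The thread-level efficiency drops from 0.8 on two processors to below 0.5 on eight, in line with the remark that efficiency decreases with the number of processors.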

40 Conclusions and outlook
- Our method is well suited for huge data sets
- Moderately high number of dimensions
  - Enough for a lot of practical applications after the reduction to the essential dimensions
  - Dimension reduction (e.g. SVD) has to be applied
- Memory requirements still grow exponentially in the dimension
  - Lumping
  - Reduce number of points on the boundary
- Fast solvers for the partial problems in the sequence of grids
  - Multi-grid with partial semi-coarsening


GRAPH-BASED SEMI-SUPERVISED HYPERSPECTRAL IMAGE CLASSIFICATION USING SPATIAL INFORMATION GRAPH-BASED SEMI-SUPERVISED HYPERSPECTRAL IMAGE CLASSIFICATION USING SPATIAL INFORMATION Nasehe Jamshidpour a, Saeid Homayouni b, Abdolreza Safari a a Dept. of Geomatics Engineering, College of Engineering,

More information

MULTICORE LEARNING ALGORITHM

MULTICORE LEARNING ALGORITHM MULTICORE LEARNING ALGORITHM CHENG-TAO CHU, YI-AN LIN, YUANYUAN YU 1. Summary The focus of our term project is to apply the map-reduce principle to a variety of machine learning algorithms that are computationally

More information

Lecture #11: The Perceptron

Lecture #11: The Perceptron Lecture #11: The Perceptron Mat Kallada STAT2450 - Introduction to Data Mining Outline for Today Welcome back! Assignment 3 The Perceptron Learning Method Perceptron Learning Rule Assignment 3 Will be

More information

CS 340 Lec. 4: K-Nearest Neighbors

CS 340 Lec. 4: K-Nearest Neighbors CS 340 Lec. 4: K-Nearest Neighbors AD January 2011 AD () CS 340 Lec. 4: K-Nearest Neighbors January 2011 1 / 23 K-Nearest Neighbors Introduction Choice of Metric Overfitting and Underfitting Selection

More information

Neural Networks. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani

Neural Networks. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani Neural Networks CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Biological and artificial neural networks Feed-forward neural networks Single layer

More information

PhD Student. Associate Professor, Co-Director, Center for Computational Earth and Environmental Science. Abdulrahman Manea.

PhD Student. Associate Professor, Co-Director, Center for Computational Earth and Environmental Science. Abdulrahman Manea. Abdulrahman Manea PhD Student Hamdi Tchelepi Associate Professor, Co-Director, Center for Computational Earth and Environmental Science Energy Resources Engineering Department School of Earth Sciences

More information

Data Preprocessing. Slides by: Shree Jaswal

Data Preprocessing. Slides by: Shree Jaswal Data Preprocessing Slides by: Shree Jaswal Topics to be covered Why Preprocessing? Data Cleaning; Data Integration; Data Reduction: Attribute subset selection, Histograms, Clustering and Sampling; Data

More information

10/14/2017. Dejan Sarka. Anomaly Detection. Sponsors

10/14/2017. Dejan Sarka. Anomaly Detection. Sponsors Dejan Sarka Anomaly Detection Sponsors About me SQL Server MVP (17 years) and MCT (20 years) 25 years working with SQL Server Authoring 16 th book Authoring many courses, articles Agenda Introduction Simple

More information

Well Analysis: Program psvm_welllogs

Well Analysis: Program psvm_welllogs Proximal Support Vector Machine Classification on Well Logs Overview Support vector machine (SVM) is a recent supervised machine learning technique that is widely used in text detection, image recognition

More information

Learning from High Dimensional fmri Data using Random Projections

Learning from High Dimensional fmri Data using Random Projections Learning from High Dimensional fmri Data using Random Projections Author: Madhu Advani December 16, 011 Introduction The term the Curse of Dimensionality refers to the difficulty of organizing and applying

More information

Data Mining Practical Machine Learning Tools and Techniques. Slides for Chapter 6 of Data Mining by I. H. Witten and E. Frank

Data Mining Practical Machine Learning Tools and Techniques. Slides for Chapter 6 of Data Mining by I. H. Witten and E. Frank Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 6 of Data Mining by I. H. Witten and E. Frank Implementation: Real machine learning schemes Decision trees Classification

More information

International Journal of Scientific Research & Engineering Trends Volume 4, Issue 6, Nov-Dec-2018, ISSN (Online): X

International Journal of Scientific Research & Engineering Trends Volume 4, Issue 6, Nov-Dec-2018, ISSN (Online): X Analysis about Classification Techniques on Categorical Data in Data Mining Assistant Professor P. Meena Department of Computer Science Adhiyaman Arts and Science College for Women Uthangarai, Krishnagiri,

More information

Research Article International Journals of Advanced Research in Computer Science and Software Engineering ISSN: X (Volume-7, Issue-6)

Research Article International Journals of Advanced Research in Computer Science and Software Engineering ISSN: X (Volume-7, Issue-6) International Journals of Advanced Research in Computer Science and Software Engineering Research Article June 17 Artificial Neural Network in Classification A Comparison Dr. J. Jegathesh Amalraj * Assistant

More information

Support Vector Machines

Support Vector Machines Support Vector Machines SVM Discussion Overview. Importance of SVMs. Overview of Mathematical Techniques Employed 3. Margin Geometry 4. SVM Training Methodology 5. Overlapping Distributions 6. Dealing

More information

Pre-Requisites: CS2510. NU Core Designations: AD

Pre-Requisites: CS2510. NU Core Designations: AD DS4100: Data Collection, Integration and Analysis Teaches how to collect data from multiple sources and integrate them into consistent data sets. Explains how to use semi-automated and automated classification

More information

Algorithms, System and Data Centre Optimisation for Energy Efficient HPC

Algorithms, System and Data Centre Optimisation for Energy Efficient HPC 2015-09-14 Algorithms, System and Data Centre Optimisation for Energy Efficient HPC Vincent Heuveline URZ Computing Centre of Heidelberg University EMCL Engineering Mathematics and Computing Lab 1 Energy

More information

Robot Learning. There are generally three types of robot learning: Learning from data. Learning by demonstration. Reinforcement learning

Robot Learning. There are generally three types of robot learning: Learning from data. Learning by demonstration. Reinforcement learning Robot Learning 1 General Pipeline 1. Data acquisition (e.g., from 3D sensors) 2. Feature extraction and representation construction 3. Robot learning: e.g., classification (recognition) or clustering (knowledge

More information

.. Lecture 2. learning and regularization. from interpolation to approximation.

.. Lecture 2. learning and regularization. from interpolation to approximation. .. Lecture. learning and regularization. from interpolation to approximation. Stéphane Canu and Cheng Soon Ong stephane.canu@insarouen.fr asi.insarouen.fr~scanu. RSISE ANU NICTA, Canberra INSA, Rouen RSISE,

More information

5 Learning hypothesis classes (16 points)

5 Learning hypothesis classes (16 points) 5 Learning hypothesis classes (16 points) Consider a classification problem with two real valued inputs. For each of the following algorithms, specify all of the separators below that it could have generated

More information

CS 8520: Artificial Intelligence

CS 8520: Artificial Intelligence CS 8520: Artificial Intelligence Machine Learning 2 Paula Matuszek Spring, 2013 1 Regression Classifiers We said earlier that the task of a supervised learning system can be viewed as learning a function

More information

Basis Functions. Volker Tresp Summer 2017

Basis Functions. Volker Tresp Summer 2017 Basis Functions Volker Tresp Summer 2017 1 Nonlinear Mappings and Nonlinear Classifiers Regression: Linearity is often a good assumption when many inputs influence the output Some natural laws are (approximately)

More information

Univariate and Multivariate Decision Trees

Univariate and Multivariate Decision Trees Univariate and Multivariate Decision Trees Olcay Taner Yıldız and Ethem Alpaydın Department of Computer Engineering Boğaziçi University İstanbul 80815 Turkey Abstract. Univariate decision trees at each

More information

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CHAPTER 4 CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS 4.1 Introduction Optical character recognition is one of

More information

Two-Phase flows on massively parallel multi-gpu clusters

Two-Phase flows on massively parallel multi-gpu clusters Two-Phase flows on massively parallel multi-gpu clusters Peter Zaspel Michael Griebel Institute for Numerical Simulation Rheinische Friedrich-Wilhelms-Universität Bonn Workshop Programming of Heterogeneous

More information

Facial Expression Classification with Random Filters Feature Extraction

Facial Expression Classification with Random Filters Feature Extraction Facial Expression Classification with Random Filters Feature Extraction Mengye Ren Facial Monkey mren@cs.toronto.edu Zhi Hao Luo It s Me lzh@cs.toronto.edu I. ABSTRACT In our work, we attempted to tackle

More information

CS 229 Midterm Review

CS 229 Midterm Review CS 229 Midterm Review Course Staff Fall 2018 11/2/2018 Outline Today: SVMs Kernels Tree Ensembles EM Algorithm / Mixture Models [ Focus on building intuition, less so on solving specific problems. Ask

More information

Speedup Altair RADIOSS Solvers Using NVIDIA GPU

Speedup Altair RADIOSS Solvers Using NVIDIA GPU Innovation Intelligence Speedup Altair RADIOSS Solvers Using NVIDIA GPU Eric LEQUINIOU, HPC Director Hongwei Zhou, Senior Software Developer May 16, 2012 Innovation Intelligence ALTAIR OVERVIEW Altair

More information

A Passive-Aggressive Algorithm for Semi-supervised Learning

A Passive-Aggressive Algorithm for Semi-supervised Learning 2010 International Conference on Technologies and Applications of Artificial Intelligence A Passive-Aggressive Algorithm for Semi-supervised earning Chien-Chung Chang and Yuh-Jye ee and Hsing-Kuo Pao Department

More information

Unsupervised Learning

Unsupervised Learning Unsupervised Learning Chapter 14: The Elements of Statistical Learning Presented for 540 by Len Tanaka Objectives Introduction Techniques: Association Rules Cluster Analysis Self-Organizing Maps Projective

More information

Text Classification and Clustering Using Kernels for Structured Data

Text Classification and Clustering Using Kernels for Structured Data Text Mining SVM Conclusion Text Classification and Clustering Using, pgeibel@uos.de DGFS Institut für Kognitionswissenschaft Universität Osnabrück February 2005 Outline Text Mining SVM Conclusion 1 Text

More information

Machine Learning in Biology

Machine Learning in Biology Università degli studi di Padova Machine Learning in Biology Luca Silvestrin (Dottorando, XXIII ciclo) Supervised learning Contents Class-conditional probability density Linear and quadratic discriminant

More information

PATTERN CLASSIFICATION AND SCENE ANALYSIS

PATTERN CLASSIFICATION AND SCENE ANALYSIS PATTERN CLASSIFICATION AND SCENE ANALYSIS RICHARD O. DUDA PETER E. HART Stanford Research Institute, Menlo Park, California A WILEY-INTERSCIENCE PUBLICATION JOHN WILEY & SONS New York Chichester Brisbane

More information

Learning with Regularization Networks

Learning with Regularization Networks Learning with Regularization Networks Petra Kudová Department of Theoretical Computer Science Institute of Computer Science Academy of Sciences of the Czech Republic Outline Introduction supervised learning

More information

Efficient Tuning of SVM Hyperparameters Using Radius/Margin Bound and Iterative Algorithms

Efficient Tuning of SVM Hyperparameters Using Radius/Margin Bound and Iterative Algorithms IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 13, NO. 5, SEPTEMBER 2002 1225 Efficient Tuning of SVM Hyperparameters Using Radius/Margin Bound and Iterative Algorithms S. Sathiya Keerthi Abstract This paper

More information

Data Mining: Concepts and Techniques. Chapter 9 Classification: Support Vector Machines. Support Vector Machines (SVMs)

Data Mining: Concepts and Techniques. Chapter 9 Classification: Support Vector Machines. Support Vector Machines (SVMs) Data Mining: Concepts and Techniques Chapter 9 Classification: Support Vector Machines 1 Support Vector Machines (SVMs) SVMs are a set of related supervised learning methods used for classification Based

More information

Unlabeled Data Classification by Support Vector Machines

Unlabeled Data Classification by Support Vector Machines Unlabeled Data Classification by Support Vector Machines Glenn Fung & Olvi L. Mangasarian University of Wisconsin Madison www.cs.wisc.edu/ olvi www.cs.wisc.edu/ gfung The General Problem Given: Points

More information

Lab 9. Julia Janicki. Introduction

Lab 9. Julia Janicki. Introduction Lab 9 Julia Janicki Introduction My goal for this project is to map a general land cover in the area of Alexandria in Egypt using supervised classification, specifically the Maximum Likelihood and Support

More information

Dimension reduction : PCA and Clustering

Dimension reduction : PCA and Clustering Dimension reduction : PCA and Clustering By Hanne Jarmer Slides by Christopher Workman Center for Biological Sequence Analysis DTU The DNA Array Analysis Pipeline Array design Probe design Question Experimental

More information

Reddit Recommendation System Daniel Poon, Yu Wu, David (Qifan) Zhang CS229, Stanford University December 11 th, 2011

Reddit Recommendation System Daniel Poon, Yu Wu, David (Qifan) Zhang CS229, Stanford University December 11 th, 2011 Reddit Recommendation System Daniel Poon, Yu Wu, David (Qifan) Zhang CS229, Stanford University December 11 th, 2011 1. Introduction Reddit is one of the most popular online social news websites with millions

More information