
Introduction to Pattern Recognition and Machine Learning. Alexandros Iosifidis, Academy of Finland Postdoctoral Research Fellow (term 2016-2019). Tampere, August 2016.

Vector Spaces

What do we observe here? A set of N points in the 2-dimensional space. Each point can be represented by a vector x_i = {x_i1, x_i2}, i = 1,…,N. We call each point a sample, and we say that x_i is the representation of the i-th sample. We say that x_i belongs to the 2D vector space, i.e. x_i ∈ R^D with D = 2.

Vector Spaces

What is a sample representation x_i? A data representation follows from a description: for example, we can describe a person by their height and weight. In that case: Alekos → {187, 110}.

Another representation could follow from the description {height, weight, gender}. In that case: Alekos → {187, 110, 1}, with gender encoded as a real value.

Let us assume that all vectors represent males. Does it make sense to keep the last dimension? Sometimes data dimensions carry little information. Sometimes data dimensions can be combined in order to enhance properties of interest.

Subspace learning: define a vector space R^d, d < D, in which the data representations have some properties of interest.
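The gender dimension above is constant across all samples, so it can be dropped without losing information. A minimal numpy sketch of this idea; the height/weight/gender values are made up for illustration:

```python
import numpy as np

# Toy data: {height, weight, gender}; gender is 1 for every sample.
X = np.array([
    [187.0, 110.0, 1.0],
    [172.0,  80.0, 1.0],
    [165.0,  74.0, 1.0],
    [180.0,  95.0, 1.0],
])

# A zero-variance dimension carries no information, so drop it.
variances = X.var(axis=0)
X_reduced = X[:, variances > 0.0]

print(X_reduced.shape)  # (4, 2): the gender column has been removed
```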

Vector Spaces

Another example: map the input space R^D to a lower-dimensional space R^d, d < D, where the data dispersion is maximized.
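Maximizing the dispersion of the projected data is the criterion of Principal Component Analysis (PCA). A minimal numpy sketch, with randomly generated toy data standing in for the slide's example:

```python
import numpy as np

def pca_project(X, d):
    """Project X (N x D) onto the d directions of maximal dispersion."""
    X_centered = X - X.mean(axis=0)
    # Eigenvectors of the covariance matrix, sorted by decreasing eigenvalue.
    cov = np.cov(X_centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    W = eigvecs[:, order[:d]]          # D x d projection matrix
    return X_centered @ W              # N x d representations

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
print(pca_project(X, 2).shape)  # (100, 2)
```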

Vector Spaces

The same example, but now each sample is accompanied by a class label (red or green). Map the input space R^D to a lower-dimensional space R^d, d < D, where the dispersion within the classes is minimized and the dispersion between the classes is maximized.
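Minimizing within-class and maximizing between-class dispersion is the criterion of Linear Discriminant Analysis (LDA). A short sketch using scikit-learn; the two synthetic classes are illustrative assumptions:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# Two labelled classes ("red" = 0, "green" = 1) in a 5-dimensional space.
X = np.vstack([rng.normal(0.0, 1.0, (50, 5)),
               rng.normal(2.0, 1.0, (50, 5))])
y = np.array([0] * 50 + [1] * 50)

# LDA finds projections that minimize within-class and maximize
# between-class dispersion; with C classes, at most C-1 components.
lda = LinearDiscriminantAnalysis(n_components=1)
Z = lda.fit_transform(X, y)
print(Z.shape)  # (100, 1)
```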

Basic Problems in PR and ML

Pattern Recognition problems can be divided into:
Unsupervised: the only available information is the data representations x_i, i = 1,…,N.
Supervised: both the data representations x_i, i = 1,…,N, and the class labels l_i ∈ {1,…,C}, i = 1,…,N, are available.
Semi-supervised: only a (small) subset of the data is labeled, i.e. x_i, i = 1,…,N, are available, but labels l_i ∈ {1,…,C} only for i = 1,…,n with n < N.
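As a concrete illustration (not from the slides), the three settings differ only in how much of the label vector is available; here -1 marks unlabeled samples, following scikit-learn's convention:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))          # data representations x_i, i = 1..N

# Supervised: a label l_i in {0, ..., C-1} for every sample.
y_full = rng.integers(0, 3, size=100)

# Semi-supervised: labels only for the first n < N samples;
# -1 marks "unlabeled" (the convention used by scikit-learn).
y_semi = y_full.copy()
y_semi[10:] = -1

# Unsupervised: no labels at all, only X.
```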

Basic Problems in PR and ML

Data clustering: define a set of data groups (clusters) according to a (data closeness/similarity) criterion. This is an unsupervised problem. Is this the correct answer? Different criteria can produce different groupings, so there is usually no single correct clustering.
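A minimal clustering sketch using k-means from scikit-learn; the two Gaussian blobs and the choice k = 2 are illustrative assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two blobs of points in 2D, with no labels.
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)),
               rng.normal(3.0, 0.5, (50, 2))])

# k-means groups the samples purely by closeness in the vector space;
# the number of clusters k is an assumption we must supply.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:5], kmeans.cluster_centers_.shape)
```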

Basic Problems in PR and ML

Data classification: the samples belong to two patterns (classes). Can we define a rule for recognizing whether a new sample (the yellow one) belongs to the red or the green pattern? Several separating rules are possible. Which one?
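One possible rule is a linear separator. A sketch using scikit-learn's LinearSVC; the slides do not prescribe a specific classifier, and the toy data are made up:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
# "Red" class around (0, 0), "green" class around (3, 3).
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)),
               rng.normal(3.0, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Learn one separating rule (a hyperplane); which rule is "best"
# depends on the criterion, e.g. the SVM picks a large-margin one.
clf = LinearSVC(C=1.0).fit(X, y)
x_new = np.array([[1.5, 1.5]])   # the "yellow" sample
print(clf.predict(x_new))
```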

Basic Problems in PR and ML

Data classification (semi-supervised): only some of the samples are accompanied by labels. Can we define a rule for recognizing whether a new sample (the yellow one) belongs to the red or the green pattern? Many possible solutions!
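One family of semi-supervised solutions propagates the few known labels through the data. A sketch using scikit-learn's LabelSpreading; the data and the choice of kernel are illustrative:

```python
import numpy as np
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)),
               rng.normal(3.0, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Keep the labels of only a few samples; -1 marks "unlabeled".
y_semi = np.full(100, -1)
y_semi[[0, 1, 50, 51]] = y[[0, 1, 50, 51]]

# Label spreading propagates the few known labels to nearby samples.
model = LabelSpreading(kernel='knn', n_neighbors=5).fit(X, y_semi)
print(model.predict([[1.5, 1.5]]))  # the "yellow" sample
```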

Nonlinearity

Multi-class classification: split the problem into multiple binary (two-class) problems: One-Versus-One and One-Versus-Rest.

Nonlinearity

Multi-class classification: the One-Versus-Rest case. Sometimes linear models are not the best solution.
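A One-Versus-Rest sketch using scikit-learn, which trains one binary classifier per class (that class against all the others); the three-blob dataset is an illustrative assumption:

```python
from sklearn.datasets import make_blobs
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Three classes in 2D; One-Versus-Rest trains one binary
# classifier per class.
X, y = make_blobs(n_samples=150, centers=3, random_state=0)
ovr = OneVsRestClassifier(LinearSVC(C=1.0)).fit(X, y)
print(len(ovr.estimators_))  # 3 binary classifiers
print(ovr.predict(X[:5]))
```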

Nonlinearity

Multi-class classification: a better solution!

Nonlinearity

Nonlinear approaches:
Piecewise linear methods: approximate the nonlinear function with multiple linear ones.
Kernel-based methods: map the data (usually nonlinearly) from the input space R^D to a feature space where the problem is easier to solve.
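A kernel-based sketch: an RBF-kernel SVM on a ring-shaped toy problem that no linear model in the input space can separate (the data generation is made up for illustration):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Not linearly separable in the input space:
# class 0 inside a ring of class 1.
radius = np.concatenate([rng.uniform(0, 1, 100), rng.uniform(2, 3, 100)])
angle = rng.uniform(0, 2 * np.pi, 200)
X = np.column_stack([radius * np.cos(angle), radius * np.sin(angle)])
y = np.array([0] * 100 + [1] * 100)

# The RBF kernel implicitly maps the data to a feature space
# where a linear separator exists.
clf = SVC(kernel='rbf', gamma='scale').fit(X, y)
print(clf.score(X, y))  # close to 1.0 on this toy problem
```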

Nonlinearity

Nonlinear approaches: Neural Networks.
- A NN with one hidden layer: the samples x_i are (nonlinearly) mapped to a feature space.
- The use of multiple hidden layers is also possible (each transforming the data representations of the previous layer to a new feature space).
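A sketch of a one-hidden-layer network using scikit-learn's MLPClassifier; the dataset and layer size are illustrative choices, not from the slides:

```python
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

# A nonlinearly separable toy problem.
X, y = make_moons(n_samples=200, noise=0.1, random_state=0)

# One hidden layer of 16 units maps each x_i nonlinearly to a
# feature space; stacking more layers repeats the transformation.
mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                    random_state=0).fit(X, y)
print(mlp.score(X, y))
```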

Coming next! More on PR and ML during this training:
Thursday 25.8, 14:00-16:00: Evolutionary Learning (Prof. Serkan Kiranyaz)
Friday 26.8, 9:00-12:00: Probabilistic and Statistical Learning (myself)
Friday 26.8, 13:00-16:00: Neural Networks (Prof. Anastasios Tefas)
If you want to find out more: http://www.cs.tut.fi/~iosifidi/

Questions?