CSIS Pattern Recognition. Prof. Sung-Hyuk Cha, Fall 2002. School of Computer Science & Information Systems, Artificial Intelligence.

Pattern Recognition
Prof. Sung-Hyuk Cha, Fall 2002
School of Computer Science & Information Systems, Artificial Intelligence

Perception: the Lena image and computer vision.

Machine Vision: pattern recognition applications.

Iris authentication.

Face Recognition System: each person has a different face. Given a query image, the system matches it against a face database (Face DB).

Complex Pattern Recognition Applications (Sargur N. Srihari): postal address interpretation. A scanned address such as "520 Lee Entrance STE 202, Amherst NY 14228-2583" is parsed into fields: f1 city name, f2 state abbreviation, f3 5-digit ZIP code, f4 4-digit ZIP+4 add-on, f5 primary number, f6 street name, f7 secondary designator abbreviation, f8 secondary number; together these determine the delivery point 142282583.

Speech Recognition System.

Applications: many input devices produce raw patterns x that are reduced to feature vectors (f1, f2, ..., fd): LCD pen tablet, microphone, digital camera, biomouse, fingerprint scanner, vital-sign monitor.

Measurements: each sample is measured on two features, brightness and length. Example feature vectors: x1 = (12, 16), x2 = (11, 20) for one class and x1 = (7, 6), x2 = (3, 4) for the other; each feature vector is paired with its truth label.

Decision theory (cost).

Distributions and Errors: the two class-conditional distributions overlap, and the decision boundary splits the feature axis between them. Errors occur on both sides of the boundary: bass identified as salmon, and salmon identified as bass.

Parametric Univariate Dichotomizer: error rates for three single-feature classifiers.

Feature   (a) length   (b) lightness   (c) width
Type I     9 %          7 %             5 %
Type II   39 %         27 %            26 %

Multivariate Analysis.
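A univariate dichotomizer can be sketched as a single threshold on one feature, with the two error types counted against known labels. The data and threshold below are invented for illustration; they are not the lecture's actual measurements.

```python
# Minimal univariate dichotomizer sketch: classify as "salmon" when the
# feature value falls below a threshold, then count the two error types.
# Type I:  a bass wrongly identified as salmon.
# Type II: a salmon wrongly identified as bass.

def dichotomize(values, labels, threshold, positive="salmon"):
    """Classify each value by threshold; return (type_I, type_II) counts."""
    type1 = type2 = 0
    for v, truth in zip(values, labels):
        predicted = positive if v < threshold else "bass"
        if predicted == positive and truth != positive:
            type1 += 1
        elif predicted != positive and truth == positive:
            type2 += 1
    return type1, type2

lengths = [7, 6, 3, 12, 4, 16, 11, 20]      # hypothetical lengths
labels = ["salmon"] * 4 + ["bass"] * 4      # hypothetical truth labels
print(dichotomize(lengths, labels, threshold=10))  # -> (1, 1)
```

Because the classes overlap on a single feature, no threshold drives both error types to zero, which is what motivates the multivariate analysis above.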

Nearest Neighbor Classifier: given brightness and length, is the query a salmon? A reference set and a testing set are used:

Reference set R: r1 = (5, 7), r2 = (3, 8), r3 = (10, 16), r4 = (12, 14), ..., rn = (14, 15)
Query: q = (4, 6)
Testing set T: t1 = (4, 6), t2 = (2, 5), t3 = (11, 17), t4 = (14, 12), ..., tn = (14, 17)

Searching the entire reference set for every query can be too slow for users to wait for the output. Performance is evaluated by using the testing set.
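The search step can be sketched directly on the slide's reference set. The slide does not give class labels for the reference vectors, so this sketch only finds the closest reference vector; in a full classifier each r would carry a label that the query inherits.

```python
# Minimal nearest-neighbour search over the slide's reference set R
# for the query q = (4, 6), using Euclidean distance.
import math

R = {"r1": (5, 7), "r2": (3, 8), "r3": (10, 16), "r4": (12, 14), "rn": (14, 15)}
q = (4, 6)

def nearest(query, references):
    """Return the name of the reference vector closest to the query."""
    return min(references,
               key=lambda name: math.dist(query, references[name]))

print(nearest(q, R))  # -> 'r1', since (5, 7) is Euclidean-closest to (4, 6)
```

Note the linear scan over R for every query: this is exactly why the naive search is slow for large reference sets, and why fast nearest-neighbor search algorithms are studied later.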

Machine Learning (linear function): classify as salmon when brightness Y and length X satisfy Y > aX + b.

Artificial Neural Network: the biological neuron (synapse, dendrites, nucleus, axon) is modeled by the artificial neuron: inputs x1(t), ..., xn(t) are weighted by w1, ..., wn, summed (with bias w0) into an activation a(t), and passed through an activation function y = f(a) to produce the output O(t+1).

Machine Learning (linear function): the reference set now serves as a training set from which the parameters a and b of the rule Y > aX + b are learned; the testing set is unchanged.

Training set R: r1 = (5, 7), r2 = (3, 8), r3 = (10, 16), r4 = (12, 14), ..., rn = (14, 15)
Testing set T: t1 = (4, 6), t2 = (2, 5), t3 = (11, 17), t4 = (14, 12), ..., tn = (14, 17)

Classification is extremely fast: there is no need to load the training data during classification. Performance is evaluated by using the testing set, and is generally not as good as the NN classifier's.

Non-linear case: in the brightness-length plane, a single line Y > aX + b cannot separate the classes.
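Once a and b are learned from the training set, classification touches only those two parameters, which is why the linear rule is fast and needs no training data at run time. A minimal sketch, with hypothetical (unfitted) values of a and b:

```python
# Minimal sketch of the slide's linear decision rule Y > a*X + b.
# The parameter values a = 1.0, b = 1.0 are hypothetical placeholders
# for values that training would actually produce.

def classify(length, brightness, a=1.0, b=1.0):
    """Label a point 'salmon' if it lies above the line y = a*x + b."""
    return "salmon" if brightness > a * length + b else "bass"

print(classify(4, 6))    # brightness 6 > 1*4 + 1 = 5   -> salmon
print(classify(14, 12))  # brightness 12 <= 1*14 + 1 = 15 -> bass
```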

Non-linear case: when no line separates the classes in the brightness-length plane, the NN classifier is better. We will later study the artificial neural network, which learns a non-linear function.

Human Brain.

Artificial Neural Network: features f1, ..., f7 feed a fully connected, feed-forward, back-propagation multi-layer artificial neural network (11-6-1) that outputs the class.
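The forward pass of such a network is just repeated weighted sums followed by a non-linear activation. A minimal sketch, using the slide's 11-6-1 layer sizes and a sigmoid activation; the random weights stand in for values that back-propagation would learn:

```python
# Minimal forward pass through a fully connected feed-forward network
# (11 inputs -> 6 hidden units -> 1 output), sigmoid activations.
# Weights are random placeholders, not trained values.
import math
import random

random.seed(0)

def layer(inputs, n_out):
    """One fully connected layer with random weights and sigmoid outputs."""
    outputs = []
    for _ in range(n_out):
        w = [random.uniform(-1, 1) for _ in inputs]
        bias = random.uniform(-1, 1)
        a = sum(wi * xi for wi, xi in zip(w, inputs)) + bias  # weighted sum
        outputs.append(1.0 / (1.0 + math.exp(-a)))            # sigmoid f(a)
    return outputs

x = [random.random() for _ in range(11)]  # an 11-feature input vector
hidden = layer(x, 6)                      # 6 hidden units
output = layer(hidden, 1)                 # 1 output unit (the class score)
print(len(hidden), len(output))           # -> 6 1
```

Training would adjust the weights by back-propagating the output error, which this sketch omits.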

Purpose of Pattern Recognition: predict unseen future instances. Generalization; the inductive step.

Generalizability (statistical inference): a classifier trained on a training set (width vs. length) must also perform well on the universe and on a validating set drawn from it.

Inferential Statistics
1. Inferential statistics infers a conclusion about a population of interest from a sample. It needs a procedure for sampling the population and a measure of reliability for the inference.
2. If the error rate in a random sample set is the same as in the universe, then the procedure is a sound inferential statistical procedure.
3. If the error rate in one random sample set is the same as in another random sample set, then the procedure is sound.
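The slide's soundness check can be simulated: draw two random samples from a synthetic "universe", apply the same classification rule to both, and compare error rates. The Gaussian universe and threshold rule below are invented for the demonstration, not taken from the lecture.

```python
# Minimal simulation of the sampling-based soundness check: if the error
# rate on one random sample closely matches the error rate on another,
# the procedure generalizes. Universe: two synthetic Gaussian classes.
import random

random.seed(1)

def draw_sample(n):
    """n 'salmon' lengths around 5 and n 'bass' lengths around 15."""
    data = [(random.gauss(5, 2), "salmon") for _ in range(n)]
    data += [(random.gauss(15, 2), "bass") for _ in range(n)]
    return data

def error_rate(sample, threshold=10.0):
    wrong = sum(1 for x, label in sample
                if ("salmon" if x < threshold else "bass") != label)
    return wrong / len(sample)

s1, s2 = draw_sample(500), draw_sample(500)
e1, e2 = error_rate(s1), error_rate(s2)
print(round(e1, 3), round(e2, 3))  # two small, closely matching error rates
```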

Generalization: the decision regions over (δf1, δf2) learned from the universe. Sampling & learning: the same regions estimated from Sample 1.

Testing on another sample: Sample 2 checks whether the decision regions learned from Sample 1 generalize back to the universe.

Multiple classification: in the (f1, f2) plane the decision regions partition the space among Classes = {class 1, class 2, class 3}.

Template for PR Applications
1. Data acquisition:
   a. Recruit subjects.
   b. Modality interface (scanning, picturing, recording, etc.).
2. Feature extraction:
   a. Raw data to feature vectors.
   b. Involves image/voice/signal processing techniques.
3. Training a classifier:
   a. Design a classifier (e.g., ANN).
   b. Enter the training (& validating) feature vector set(s).
4. Classification system:
   a. Embed the ANN engine in your actual program (Java/C).
   b. User interface for the final product.

Further Pattern Recognition (http://www.csis.pace.edu/~scha/pr): fast nearest neighbor search algorithms, decision trees, statistical pattern recognition, artificial neural networks, clustering, etc.

Decision Tree: the weather dataset.

outlook    temperature  humidity  windy  play
sunny      hot          high      false  no
sunny      hot          high      true   no
overcast   hot          high      false  yes
rainy      mild         high      false  yes
rainy      cool         normal    false  yes
rainy      cool         normal    true   no
overcast   cool         normal    true   yes
sunny      mild         high      false  no
sunny      cool         normal    false  yes
rainy      mild         normal    false  yes
sunny      mild         normal    true   yes
overcast   mild         high      true   yes
overcast   hot          normal    false  yes
rainy      mild         high      true   no
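A decision-tree learner such as ID3 picks its root attribute by information gain, which can be computed directly on the slide's 14-row weather table. A minimal sketch:

```python
# Entropy and information gain on the slide's weather table: the quantity
# an ID3-style decision-tree learner uses to choose the splitting attribute.
import math
from collections import Counter

rows = [  # (outlook, temperature, humidity, windy, play)
    ("sunny", "hot", "high", False, "no"),
    ("sunny", "hot", "high", True, "no"),
    ("overcast", "hot", "high", False, "yes"),
    ("rainy", "mild", "high", False, "yes"),
    ("rainy", "cool", "normal", False, "yes"),
    ("rainy", "cool", "normal", True, "no"),
    ("overcast", "cool", "normal", True, "yes"),
    ("sunny", "mild", "high", False, "no"),
    ("sunny", "cool", "normal", False, "yes"),
    ("rainy", "mild", "normal", False, "yes"),
    ("sunny", "mild", "normal", True, "yes"),
    ("overcast", "mild", "high", True, "yes"),
    ("overcast", "hot", "normal", False, "yes"),
    ("rainy", "mild", "high", True, "no"),
]

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    counts = Counter(labels)
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def info_gain(rows, attr_index):
    """Entropy reduction from splitting the rows on one attribute."""
    base = entropy([r[-1] for r in rows])
    remainder = 0.0
    for v in {r[attr_index] for r in rows}:
        subset = [r[-1] for r in rows if r[attr_index] == v]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

print(round(info_gain(rows, 0), 3))  # gain of 'outlook' -> 0.247
```

'outlook' has the largest gain of the four attributes, which is why it becomes the root of the standard tree for this dataset.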

Clustering: (a) raw points a-k; (b) a hard partition of the points; (c) cluster labels 1-3; (d) soft membership probabilities for each point; and a dendrogram over the points.

Terminology
Classification: the process of assigning one of a limited set of alternative interpretations to (the generator of) a set of data. It often requires the computation of relative probabilities (or a quantity related to them) followed by the application of a decision rule. All classification processes can be evaluated in terms of "detection" and "misclassification" rates; see "receiver operator characteristics".

Terminology
Computer Vision: the subject area which deals with the automatic analysis of images for the purposes of quantification or system control (often mimicking tasks which humans find trivial). It is to be distinguished from "image processing", which deals only with the computational processes applied to images, including enhancement and compression, but does not deal with abstract representation for the purposes of reasoning and interpretation. Computer vision can be seen as the inverse of computer graphics, though generally the representations and methods of that area are not of use in computer vision, due to the incomplete and therefore ambiguous nature of images. This requires prior knowledge to be used in order to obtain robust scene interpretation.

Machine Vision: like "computer vision", but generally more closely associated with its use in robotics.

Pattern Recognition: the process of assigning a pattern classification to a particular set of measurements, normally represented as a high-dimensional vector. This is normally done within the context of probability theory, whereby a particular set of assumptions regarding the expected statistical distribution of measurements is used to compute classification probabilities, which can be used as the basis for a decision such as the "Bayes decision rule". There are several popular forms of classifier, including "k-nearest neighbour", "Parzen windows", "mixture methods", and more recently "artificial neural networks".

Terminology
Images: an image is a two-dimensional spatial representation of a group of "objects" (or a "scene") which exists in two or more dimensions. It is an intuitive way of presenting data for computer interfaces in the area of graphics, but in machine vision it may be defined as a continuous function of two variables defined within a bounded (generally rectangular) region.

Histograms: a histogram is an array of non-negative integer counts from a set of data, which represents the frequency of occurrence of values within a set of non-overlapping regions.

Features & Class: each sample is a 12-feature vector with a class label (S or B).

dark  blob  hole  slant  width  skew  ht   pixel  hslope  nslope  pslope  vslope  class
int   int   int   real   int    real  int  int    int     int     int     int
.95   .49   .70   .71    .50    .10   .51  .92    .13     .47     .32     .21     S
.94   .49   .75   .70    .50    .11   .53  .84    .26     .54     .35     .18     S
.94   .49   .67   .74    .50    .10   .45  .85    .23     .48     .32     .22     S
.93   .72   .33   .47    .50    .21   .28  .30    .66     .60     .42     .10     S
.93   .74   .33   .48    .50    .22   .26  .30    .60     .59     .45     .10     S
.93   .79   .36   .54    .50    .18   .27  .32    .60     .59     .52     .09     S
.92   .30   .61   .66    .60    .11   .35  .49    .70     .71     .57     .10     B
.94   .42   .72   .66    .60    .11   .32  .49    .67     .74     .53     .10     B
.94   .40   .75   .67    .60    .12   .34  .49    .75     .70     .54     .11     B
.96   .30   .60   .59    .50    .10   .21  .30    .66     .60     .36     .10     B
.95   .32   .60   .59    .50    .09   .22  .30    .60     .59     .39     .10     B
.95   .30   .66   .60    .50    .10   .21  .32    .60     .59     .34     .09     B
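A histogram in the sense just defined is easy to sketch: non-negative integer counts of values falling into non-overlapping bins. The values below are the twelve entries of the slide's 'dark' feature column; the bin edges are chosen for the demonstration.

```python
# Minimal histogram: count values in the half-open bins
# [edges[i], edges[i+1]). Counts are non-negative integers over
# non-overlapping regions, matching the definition above.

def histogram(values, edges):
    """Return per-bin counts for bins [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return counts

# The 'dark' column from the slide's feature table (6 S rows, 6 B rows).
dark = [0.95, 0.94, 0.94, 0.93, 0.93, 0.93, 0.92, 0.94, 0.94, 0.96, 0.95, 0.95]
print(histogram(dark, [0.90, 0.93, 0.96, 0.99]))  # -> [1, 10, 1]
```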

Representation: each sample is a point in a 3-D feature space with axes length, width, and lightness: a = (12, 6, -5), b = (16, 9, 10), c = (19, 7, -10).

Pattern Recognition: The End. See you all next week.