Naïve Bayes. Slides are adapted from Sebastian Thrun (Udacity), Ke Chen, Jonathan Huang, I. H. Witten's and E. Frank's Data Mining, and Jeremy Wyatt.
2 Background. There are three methods to establish a classifier:
a) Model a classification rule directly. Examples: k-NN, decision trees, perceptron, SVM.
b) Model the probability of class memberships given input data. Example: perceptron with the cross-entropy cost.
c) Make a probabilistic model of data within each class. Examples: naive Bayes, model-based classifiers.
a) and b) are examples of discriminative classification; c) is an example of generative classification; b) and c) are both examples of probabilistic classification.
3 Probability Basics. Prior, conditional and joint probability for random variables:
Prior probability: P(x)
Conditional probability: P(x1 | x2), P(x2 | x1)
Joint probability: x = (x1, x2), P(x) = P(x1, x2)
Relationship: P(x1, x2) = P(x2 | x1) P(x1) = P(x1 | x2) P(x2)
Independence: P(x2 | x1) = P(x2), P(x1 | x2) = P(x1), P(x1, x2) = P(x1) P(x2)
Bayes rule: P(c | x) = P(x | c) P(c) / P(x), i.e. Posterior = Likelihood × Prior / Evidence. The left-hand side P(c | x) is the discriminative quantity; the likelihood P(x | c) is the generative one.
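As a concrete illustration of the Bayes rule above, here is a minimal sketch in Python; the two-class spam/ham setup and all the numbers are made up for illustration, not taken from the slides:

```python
# Hypothetical two-class example: c in {spam, ham}, x = "message contains the word 'offer'".
prior = {"spam": 0.3, "ham": 0.7}        # P(c)
likelihood = {"spam": 0.8, "ham": 0.1}   # P(x | c)

# Evidence: P(x) = sum over c of P(x | c) P(c)
evidence = sum(likelihood[c] * prior[c] for c in prior)

# Posterior: P(c | x) = P(x | c) P(c) / P(x)
posterior = {c: likelihood[c] * prior[c] / evidence for c in prior}
print(posterior)  # {'spam': 0.774..., 'ham': 0.225...}
```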
4 Probabilistic Classification. Establishing a probabilistic model for classification.
Discriminative model: P(c | x), c = 1, ..., L, x = (x1, ..., xn).
[Diagram: a discriminative probabilistic classifier maps an input x = (x1, x2, ..., xn) to outputs P(c1 | x), P(c2 | x), ..., P(cL | x).]
To train a discriminative classifier, regardless of its probabilistic or non-probabilistic nature, all training examples of the different classes must be used jointly to build up a single discriminative classifier. A probabilistic classifier outputs L probabilities for the L class labels, while a non-probabilistic classifier outputs a single label.
5 Probabilistic Classification. Establishing a probabilistic model for classification (cont.)
Generative model (must be probabilistic): P(x | c), c = 1, ..., L, x = (x1, ..., xn).
[Diagram: one generative probabilistic model per class; the model for class ci maps x = (x1, x2, ..., xn) to P(x | ci), for classes 1 through L.]
The L probabilistic models have to be trained independently; each is trained on only the examples of the same label. Together they output L probabilities P(x | c1), ..., P(x | cL) for a given input. "Generative" means that such a model can produce data subject to the distribution via sampling.
6 Probabilistic Classification. Maximum A Posteriori (MAP) classification rule: for an input x, find the largest of the L probabilities P(c1 | x), ..., P(cL | x) output by a discriminative probabilistic classifier, and assign x to label c* if P(c* | x) is the largest.
Generative classification with the MAP rule: apply the Bayes rule to convert the generative outputs into posterior probabilities,
P(ci | x) = P(x | ci) P(ci) / P(x) ∝ P(x | ci) P(ci), for i = 1, 2, ..., L,
then apply the MAP rule to assign a label. P(x) can be dropped because it is a common factor for all L probabilities.
11 Naïve Bayes
12 Things We'd Like to Do. Spam Classification: given an email, predict whether it is spam or not. Medical Diagnosis: given a list of symptoms, predict whether a patient has disease X or not. Weather: based on temperature, humidity, etc., predict if it will rain tomorrow.
13 Bayesian Classification. Problem statement: given features X1, X2, ..., Xn, predict a label Y.
Digit Recognition example: a classifier maps an image of a digit to the label 5. X1, ..., Xn ∈ {0, 1} (black vs. white pixels); Y ∈ {5, 6} (predict whether a digit is a 5 or a 6).
14 The Bayes Classifier. We saw that a good strategy is to predict the most probable label given the features, argmax over y of P(Y = y | X1, ..., Xn) (for example: what is the probability that the image represents a 5 given its pixels?). So how do we compute that?
15 The Bayes Classifier. Use Bayes rule!
P(Y | X1, ..., Xn) = P(X1, ..., Xn | Y) P(Y) / P(X1, ..., Xn), i.e. Posterior = Likelihood × Prior / Normalization Constant.
Why did this help? Well, we think that we might be able to specify how features are generated by the class label.
16 The Bayes Classifier. Let's expand this for our digit recognition task: compute P(Y = 5 | X1, ..., Xn) and P(Y = 6 | X1, ..., Xn). To classify, we'll simply compute these two probabilities and predict based on which one is greater.
17 Model Parameters. For the Bayes classifier, we need to learn two functions, the likelihood and the prior. How many parameters are required to specify the prior for our digit recognition example? (Supposing that each image is 30x30 pixels.) The problem with explicitly modeling P(X1, ..., Xn | Y) is that there are usually way too many parameters: we'll run out of space, we'll run out of time, and we'll need tons of training data (which is usually not available).
18 The Naïve Bayes Model. The Naïve Bayes Assumption: assume that all features are independent given the class label Y. Equationally speaking: P(X1, ..., Xn | Y) = P(X1 | Y) P(X2 | Y) ... P(Xn | Y). (We will discuss the validity of this assumption later.)
19 Why is this useful? Number of parameters for modeling P(X1, ..., Xn | Y): 2(2^n − 1). Number of parameters for modeling P(X1 | Y), ..., P(Xn | Y): 2n.
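To see how dramatic the saving is, here is a quick sketch evaluating both counts for the 30×30 binary-pixel digit example from the earlier slide:

```python
n = 30 * 30  # number of binary features (pixels)

full_joint = 2 * (2**n - 1)  # parameters for P(X1,...,Xn | Y), two classes
naive_bayes = 2 * n          # parameters for P(X1|Y), ..., P(Xn|Y), two classes

print(f"full joint: ~10^{len(str(full_joint)) - 1} parameters")  # ~10^271
print(f"naive Bayes: {naive_bayes} parameters")                  # 1800
```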
20 Naïve Bayes Training. Now that we've decided to use a Naïve Bayes classifier, we need to train it with some data: the MNIST training data.
21 Naïve Bayes Training. Training in Naïve Bayes is easy: estimate P(Y = v) as the fraction of records with Y = v; estimate P(Xi = u | Y = v) as the fraction of records with Y = v for which Xi = u. (This corresponds to Maximum Likelihood estimation of the model parameters.)
22 Naïve Bayes Training. In practice, some of these counts can be zero. Fix this by adding virtual counts to every feature value before normalizing. (This is like putting a prior on the parameters and doing MAP estimation instead of MLE.) This is called smoothing.
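A minimal training sketch combining the counting estimates above with virtual counts; the function name, data layout, and the smoothing constant k are illustrative assumptions, not from the slides:

```python
from collections import Counter, defaultdict

def train_naive_bayes(records, labels, k=1.0):
    """MLE by counting, with k virtual counts added to every feature value."""
    n = len(labels)
    label_counts = Counter(labels)
    # feature_counts[(i, u, v)] = number of records with X_i = u and Y = v
    feature_counts = Counter()
    feature_values = defaultdict(set)
    for x, y in zip(records, labels):
        for i, u in enumerate(x):
            feature_counts[(i, u, y)] += 1
            feature_values[i].add(u)

    prior = {v: c / n for v, c in label_counts.items()}  # P(Y = v)

    def likelihood(i, u, v):
        # P(X_i = u | Y = v) with smoothing: (count + k) / (total + k * #values)
        return (feature_counts[(i, u, v)] + k) / (
            label_counts[v] + k * len(feature_values[i]))

    return prior, likelihood

# Tiny usage example with made-up records:
prior, likelihood = train_naive_bayes(
    records=[("sunny", "hot"), ("rain", "cool")], labels=["no", "yes"])
print(prior, likelihood(0, "sunny", "no"))  # {'no': 0.5, 'yes': 0.5} 0.666...
```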
23 Naïve Bayes Training. For binary digits, training amounts to averaging all of the training fives together and all of the training sixes together.
24 Naïve Bayes Classification
25 Naïve Bayes Assumption. Recall the Naïve Bayes assumption: that all features are independent given the class label Y. For an example where conditional independence fails, take Y = XOR(X1, X2):

X1  X2  P(Y=0 | X1,X2)  P(Y=1 | X1,X2)
0   0   1               0
0   1   0               1
1   0   0               1
1   1   1               0

Actually, the Naïve Bayes assumption is almost never true. Still, Naïve Bayes often performs surprisingly well even when its assumptions do not hold.
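A quick sketch of why the XOR example defeats Naïve Bayes: given either class, each feature alone is uniform, so the factored per-feature likelihoods carry no information about the label:

```python
from itertools import product

# Y = XOR(X1, X2): class-0 examples are (0,0),(1,1); class-1 are (0,1),(1,0).
data = [((x1, x2), x1 ^ x2) for x1, x2 in product([0, 1], repeat=2)]

for y in (0, 1):
    xs = [x for x, label in data if label == y]
    for i in (0, 1):
        p1 = sum(x[i] for x in xs) / len(xs)
        print(f"P(X{i+1}=1 | Y={y}) = {p1}")  # 0.5 everywhere

# Every per-feature likelihood is 0.5, so Naive Bayes scores both classes
# identically on every input and cannot represent XOR.
```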
26 Naïve Bayes. Bayes classification: P(c | x) ∝ P(x | c) P(c) = P(x1, ..., xn | c) P(c), for c = 1, ..., L. Difficulty: learning the joint probability P(x1, ..., xn | c) is infeasible!
Naïve Bayes classification: assume all input features are class-conditionally independent! Applying the independence assumption:
P(x1, x2, ..., xn | c) = P(x1 | x2, ..., xn, c) P(x2, ..., xn | c) = P(x1 | c) P(x2, ..., xn | c) = P(x1 | c) P(x2 | c) ... P(xn | c)
Apply the MAP classification rule: assign x' = (a1, a2, ..., an) to c* if
[P(a1 | c*) ... P(an | c*)] P(c*) > [P(a1 | c) ... P(an | c)] P(c), for all c ≠ c*, c = 1, ..., L,
where the bracketed products are the estimates of P(a1, ..., an | c*) and P(a1, ..., an | c).
27 Naïve Bayes. Learning phase: for each target value ci (i = 1, ..., L), estimate P̂(ci) with the examples in S; for every feature value xjk of each feature xj (j = 1, ..., F; k = 1, ..., Nj), estimate P̂(xjk | ci) with the examples in S.
Test phase: given x' = (a1, ..., an), assign x' the label c* if
[P̂(a1 | c*) ... P̂(an | c*)] P̂(c*) > [P̂(a1 | ci) ... P̂(an | ci)] P̂(ci), for all ci ≠ c*, i = 1, ..., L.
28 Example: Play Tennis. [The slide shows the Play Tennis training dataset as a table.]
29 Example. Learning Phase:

Outlook      Play=Yes  Play=No
Sunny        2/9       3/5
Overcast     4/9       0/5
Rain         3/9       2/5

Temperature  Play=Yes  Play=No
Hot          2/9       2/5
Mild         4/9       2/5
Cool         3/9       1/5

Humidity     Play=Yes  Play=No
High         3/9       4/5
Normal       6/9       1/5

Wind         Play=Yes  Play=No
Strong       3/9       3/5
Weak         6/9       2/5

P(Play=Yes) = 9/14, P(Play=No) = 5/14
30 Example. Test Phase: given a new instance x' = (Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong), look up the tables achieved in the learning phase:
P(Outlook=Sunny | Play=Yes) = 2/9        P(Outlook=Sunny | Play=No) = 3/5
P(Temperature=Cool | Play=Yes) = 3/9     P(Temperature=Cool | Play=No) = 1/5
P(Humidity=High | Play=Yes) = 3/9        P(Humidity=High | Play=No) = 4/5
P(Wind=Strong | Play=Yes) = 3/9          P(Wind=Strong | Play=No) = 3/5
P(Play=Yes) = 9/14                       P(Play=No) = 5/14
Decision making with the MAP rule:
P(Yes | x') ∝ [P(Sunny|Yes) P(Cool|Yes) P(High|Yes) P(Strong|Yes)] P(Play=Yes) = 0.0053
P(No | x') ∝ [P(Sunny|No) P(Cool|No) P(High|No) P(Strong|No)] P(Play=No) = 0.0206
Given the fact that P(Yes | x') < P(No | x'), we label x' to be No.
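The test-phase arithmetic above can be checked with a few lines; the numbers come straight from the learned tables:

```python
# Unnormalized posteriors for x' = (Sunny, Cool, High, Strong)
p_yes = (2/9) * (3/9) * (3/9) * (3/9) * (9/14)
p_no  = (3/5) * (1/5) * (4/5) * (3/5) * (5/14)

print(f"P(Yes | x') ∝ {p_yes:.4f}")  # 0.0053
print(f"P(No  | x') ∝ {p_no:.4f}")   # 0.0206
print("predict:", "Yes" if p_yes > p_no else "No")  # No
```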
31 Naïve Bayes Algorithm: Continuous-valued Features. A continuous-valued feature takes numberless (uncountably many) values, so its conditional probability is often modeled with the normal distribution:
P̂(xj | ci) = 1 / (sqrt(2π) σji) · exp( −(xj − μji)² / (2 σji²) )
where μji is the mean (average) of the feature-xj values of the examples for which c = ci, and σji is the standard deviation of the feature-xj values of the examples for which c = ci.
Learning phase: for X = (X1, ..., Xn) and C = c1, ..., cL, output n × L normal distributions and P(C = ci), i = 1, ..., L.
Test phase: given an unknown instance X' = (a1, ..., an), instead of looking up tables, calculate the conditional probabilities with the normal distributions achieved in the learning phase, then apply the MAP rule to assign a label (the same as done for the discrete case).
32 Naïve Bayes Example: Continuous-valued Features. Temperature is naturally of continuous value.
Yes: 25.2, 19.3, 18.5, 21.7, 20.1, 24.3, 22.8, 23.1, 19.8
No: 27.3, 30.1, 17.4, 29.5, 15.1
Estimate the mean and standard deviation for each class, μ = (1/N) Σ xn and σ² = (1/(N−1)) Σ (xn − μ)², giving μYes = 21.64, σYes = 2.35 and μNo = 23.88, σNo = 7.09.
Learning phase: output two Gaussian models for P(temp | C):
P̂(x | Yes) = 1 / (2.35 sqrt(2π)) · exp( −(x − 21.64)² / (2 · 2.35²) )
P̂(x | No) = 1 / (7.09 sqrt(2π)) · exp( −(x − 23.88)² / (2 · 7.09²) )
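A minimal sketch of the continuous-feature case, reproducing the estimates above (it assumes the sample standard deviation, i.e. dividing by N−1, which matches the slide's σ values):

```python
import math

yes = [25.2, 19.3, 18.5, 21.7, 20.1, 24.3, 22.8, 23.1, 19.8]
no  = [27.3, 30.1, 17.4, 29.5, 15.1]

def fit_gaussian(xs):
    """Estimate mean and sample standard deviation of a class's feature values."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / (len(xs) - 1)
    return mu, math.sqrt(var)

def gaussian_pdf(x, mu, sigma):
    """Class-conditional density P(x | c) under the fitted normal model."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

mu_y, sd_y = fit_gaussian(yes)  # (21.64, 2.35)
mu_n, sd_n = fit_gaussian(no)   # (23.88, 7.09)
print(gaussian_pdf(22.0, mu_y, sd_y), gaussian_pdf(22.0, mu_n, sd_n))
```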
33 Zero conditional probability. If no training example contains the feature value xj = ajk for a given class ci, then P̂(ajk | ci) = 0, and during test we face a zero conditional probability problem: the whole product P̂(x1 | ci) ... P̂(ajk | ci) ... P̂(xn | ci) = 0.
For a remedy, class conditional probabilities are re-estimated with the m-estimate:
P̂(ajk | ci) = (nc + m p) / (n + m)
where n is the number of training examples for which c = ci; nc is the number of training examples for which c = ci and xj = ajk; p is a prior estimate (usually p = 1/t for t possible values of xj); and m is the weight given to the prior (the number of "virtual" examples, m ≥ 1).
34 Zero conditional probability. Example: P(Outlook=Overcast | No) = 0 in the Play Tennis dataset. Fix it by adding m virtual examples (m: up to 1% of the number of training examples). In this dataset, the number of training examples for the No class is 5, so we can only add m = 1 virtual example in our m-estimate remedy. The Outlook feature can take only 3 values, so p = 1/3. Re-estimating P(Outlook=Overcast | No) with the m-estimate gives (0 + 1 × 1/3) / (5 + 1) = 1/18.
35 Numerical Stability. It is often the case that machine learning algorithms need to work with very small numbers. Imagine computing the probability of 2000 independent coin flips: MATLAB thinks that (.5)^2000 = 0.
36 Underflow Prevention. Multiplying lots of probabilities leads to floating-point underflow. Recall: log(xy) = log(x) + log(y), so it is better to sum logs of probabilities rather than multiplying the probabilities themselves.
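The underflow and its log-space fix are easy to demonstrate; Python floats behave like MATLAB's here:

```python
import math

p = 0.5 ** 2000
print(p)  # 0.0 -- the product has underflowed

log_p = 2000 * math.log(0.5)
print(log_p)  # -1386.29... -- the log of the same quantity, perfectly representable
```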
37 Underflow Prevention. The class with the highest final un-normalized log probability score is still the most probable:
cNB = argmax over cj ∈ C of [ log P(cj) + Σ over positions i of log P(xi | cj) ]
38 Numerical Stability. Instead of comparing P(Y=5 | X1, ..., Xn) with P(Y=6 | X1, ..., Xn), compare their logarithms.
39 Recovering the Probabilities. What if we want the probabilities, though? Suppose that for some unknown constant K, we have only log P(Y=5 | X1, ..., Xn) + K and log P(Y=6 | X1, ..., Xn) + K. How would we recover the original probabilities?
40 Recovering the Probabilities. Given the log scores up to a constant, note that for any constant C, adding C to every log score leaves the normalized probabilities unchanged. One suggestion: set C such that the greatest log score is shifted to zero, then exponentiate and normalize.
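A sketch of the suggested recovery step: shift the log scores so the largest is zero, exponentiate, and normalize. The shift cancels in the normalization, which is why any constant C works; the function name and the example scores are illustrative:

```python
import math

def recover_probabilities(log_scores):
    """Turn un-normalized log posteriors (known only up to a constant K)
    back into probabilities, shifting by the max for numerical safety."""
    c = max(log_scores)                                # shift greatest score to zero
    exp_shifted = [math.exp(s - c) for s in log_scores]
    total = sum(exp_shifted)
    return [e / total for e in exp_shifted]

print(recover_probabilities([-1386.3, -1389.1]))  # ≈ [0.943, 0.057]
```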
41 Recap. We defined a Bayes classifier but saw that it's intractable to compute P(X1, ..., Xn | Y). We then used the Naïve Bayes assumption that everything is independent given the class label Y. A natural question: is there some happy compromise where we only assume that some features are conditionally independent? Stay tuned.
42 Conclusions. Naïve Bayes is: really easy to implement and often works well; often a good first thing to try; commonly used as a punching bag for smarter algorithms.
43 Evaluating classification algorithms. You have designed a new classifier. You give it to me, and I try it on my image dataset.
44 Evaluating classification algorithms. I tell you that it achieved 95% accuracy on my data. Is your technique a success?
45 Types of errors. But suppose that the 95% is the correctly classified pixels, only 5% of the pixels are actually edges, and it misses all the edge pixels. How do we count the effect of different types of error?
46 Types of errors.

                  Ground Truth
Prediction        Edge              Not Edge
Edge              True Positive     False Positive
Not edge          False Negative    True Negative
47 Two parts to each label: whether you got it correct or not, and what you guessed. For example, for a particular pixel, our guess might be labelled True Positive. Did we get it correct? True, we did get it correct. What did we say? We said positive, i.e. edge. Or maybe it was labelled as one of the others, maybe False Negative. Did we get it correct? False, we did not get it correct. What did we say? We said negative, i.e. not edge.
48 Sensitivity and Specificity. Count up the total number of each label (TP, FP, TN, FN) over a large dataset. In ROC analysis, we use two statistics:
Sensitivity = TP / (TP + FN): can be thought of as the likelihood of spotting a positive case when presented with one, or the proportion of edges we find.
Specificity = TN / (TN + FP): can be thought of as the likelihood of spotting a negative case when presented with one, or the proportion of non-edges we find.
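Both statistics are one-liners once the four counts are tallied; a minimal sketch, with hypothetical counts for illustration (not from the slides):

```python
def sensitivity(tp, fn):
    return tp / (tp + fn)  # proportion of actual positives (edges) we find

def specificity(tn, fp):
    return tn / (tn + fp)  # proportion of actual negatives (non-edges) we find

print(sensitivity(tp=80, fn=10))  # 0.888...
print(specificity(tn=95, fp=5))   # 0.95
```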
49 Sensitivity = TP / (TP + FN) = ?  Specificity = TN / (TN + FP) = ?
[Worked example shown as a 2x2 table on the slide: 90 cases in the dataset were class 1 (edge), 100 cases were class 0 (non-edge), 190 examples (pixels) in the data overall.]
50 The ROC space. [Plot: sensitivity vs. 1 − specificity on a unit square, marking edge detector A and edge detector B as points.]
51 The ROC Curve. Draw a convex hull around many points. [Plot: sensitivity vs. 1 − specificity, with a convex hull drawn over the detector points; one point lies below the hull and is not on it.]
52 ROC Analysis. All the optimal detectors lie on the convex hull. [Plot: sensitivity vs. 1 − specificity.] Which of these is best depends on the ratio of edges to non-edges, and on the different costs of misclassification. Any detector on the other side of the chance diagonal can lead to a better detector by flipping its output.
Take-home point: you should always quote sensitivity and specificity for your algorithm, if possible plotting an ROC graph. Remember also, though, that any statistic you quote should be an average over a suitable range of tests for your algorithm.
53 Holdout estimation. What to do if the amount of data is limited? The holdout method reserves a certain amount for testing and uses the remainder for training. Usually: one third for testing, the rest for training.
54 Holdout estimation. Problem: the samples might not be representative. Example: a class might be missing in the test data. An advanced version uses stratification, which ensures that each class is represented with approximately equal proportions in both subsets.
55 Repeated holdout method. Repeating the process with different subsamples makes the estimate more reliable. In each iteration, a certain proportion is randomly selected for training (possibly with stratification). The error rates on the different iterations are averaged to yield an overall error rate.
56 Repeated holdout method. Still not optimum: the different test sets overlap. Can we prevent overlapping? Of course!
57 Cross-validation. Cross-validation avoids overlapping test sets. First step: split the data into k subsets of equal size. Second step: use each subset in turn for testing, and the remainder for training. This is called k-fold cross-validation.
58 Cross-validation. Often the subsets are stratified before the cross-validation is performed. The error estimates are averaged to yield an overall error estimate.
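A minimal k-fold split, assuming only that the data fits in a list; stratification is omitted for brevity, and the round-robin fold assignment is one simple choice among many:

```python
def k_fold_splits(data, k):
    """Yield (train, test) pairs; each item appears in exactly one test set."""
    folds = [data[i::k] for i in range(k)]  # round-robin assignment to k folds
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

data = list(range(10))
for train, test in k_fold_splits(data, k=5):
    print(test)  # non-overlapping test sets that together cover all of data
```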
59 More on cross-validation. The standard method for evaluation is stratified ten-fold cross-validation. Why ten? Empirical evidence supports this as a good choice to get an accurate estimate, and there is also some theoretical evidence for it. Stratification reduces the estimate's variance. Even better: repeated stratified cross-validation, e.g. ten-fold cross-validation repeated ten times with the results averaged (this reduces the variance).
60 Leave-One-Out cross-validation. Leave-One-Out is a particular form of cross-validation: set the number of folds to the number of training instances, i.e., for n training instances, build the classifier n times. It makes the best use of the data and involves no random subsampling, but it is very computationally expensive (exception: NN).
61 Leave-One-Out CV and stratification. Disadvantage of Leave-One-Out CV: stratification is not possible. It guarantees a non-stratified sample because there is only one instance in the test set!
62 Bayesian Networks. Slides are adapted from Jason Corso.
63 Bayes Nets
64 Example
65 Full Joint distribution