3 Supervised Learning


Preface

The rapid growth of the Web in the last decade has made it the largest publicly accessible data source in the world. Web mining aims to discover useful information or knowledge from Web hyperlinks, page contents, and usage logs. Based on the primary kinds of data used in the mining process, Web mining tasks can be categorized into three main types: Web structure mining, Web content mining and Web usage mining. Web structure mining discovers knowledge from hyperlinks, which represent the structure of the Web. Web content mining extracts useful information/knowledge from Web page contents. Web usage mining mines user access patterns from usage logs, which record the clicks made by every user.

The goal of this book is to present these tasks and their core mining algorithms. The book is intended to be a text with comprehensive coverage, and yet, for each topic, sufficient details are given so that readers can gain a reasonably complete knowledge of its algorithms or techniques without referring to any external materials. Four of the chapters, structured data extraction, information integration, opinion mining, and Web usage mining, make this book unique. These topics are not covered by existing books, yet they are essential to Web data mining. Traditional Web mining topics such as search, crawling and resource discovery, and link analysis are also covered in detail.

Although the book is entitled Web Data Mining, it also includes the main topics of data mining and information retrieval, since Web mining uses their algorithms and techniques extensively. The data mining part mainly consists of chapters on association rules and sequential patterns, supervised learning (or classification), and unsupervised learning (or clustering), which are the three most important data mining tasks. The advanced topic of partially (semi-) supervised learning is included as well. For information retrieval, its core topics that are crucial to Web mining are described. The book is thus naturally divided into two parts. The first part, which consists of Chaps. 2-5, covers data mining foundations. The second part, which contains Chaps. 6-12, covers Web specific mining.

Two main principles have guided the writing of this book. First, the basic content of the book should be accessible to undergraduate students, and yet there are sufficient in-depth materials for graduate students who plan to

pursue Ph.D. degrees in Web data mining or related areas. Few assumptions are made in the book regarding the prerequisite knowledge of readers. One with a basic understanding of algorithms and probability concepts should have no problem with this book. Second, the book should examine the Web mining technology from a practical point of view. This is important because most Web mining tasks have immediate real-world applications. In the past few years, I was fortunate to have worked directly or indirectly with many researchers and engineers in several search engine and e-commerce companies, and also traditional companies that are interested in exploiting the information on the Web in their businesses. During the process, I gained practical experience and first-hand knowledge of real-world problems. I try to pass those non-confidential pieces of information and knowledge along in the book. The book, thus, should have a good balance of theory and practice. I hope that it will not only be a learning text for students, but also a valuable source of information/knowledge and even ideas for Web mining researchers and practitioners.

Acknowledgements

Many researchers have assisted me technically in writing this book. Without their help, this book might never have become reality. My deepest thanks go to Filippo Menczer and Bamshad Mobasher, who were so kind to have helped write two essential chapters of the book. They are both experts in their respective fields. Filippo wrote the chapter on Web crawling and Bamshad wrote the chapter on Web usage mining. I am also very grateful to Wee Sun Lee, who helped a great deal in the writing of Chap. 5 on partially supervised learning. Jian Pei helped with the writing of the PrefixSpan algorithm in Chap. 2, and checked the MS-PS algorithm. Eduard Dragut assisted with the writing of the last section of Chap. 10 and also read the chapter many times. Yuanlin Zhang gave many great suggestions on Chap. 9. I am indebted to all of them.

Many other researchers also assisted in various ways. Yang Dai and Rudy Setiono helped with Support Vector Machines (SVM). Chris Ding helped with link analysis. Clement Yu and ChengXiang Zhai read Chap. 6, and Amy Langville read Chap. 7. Kevin C.-C. Chang, Ji-Rong Wen and Clement Yu helped with many aspects of Chap. 10. Justin Zobel helped clarify some issues related to index compression, and Ion Muslea helped clarify some issues on wrapper induction. Divy Agrawal, Yunbo Cao, Edward Fox, Hang Li, Xiaoli Li, Zhaohui Tan, Dell Zhang and Zijian Zheng helped check various chapters or sections. I am very grateful.

Discussions with many researchers helped shape the book as well: Amir Ashkenazi, Imran Aziz, Roberto Bayardo, Wendell Baker, Ling Bao, Jeffrey Benkler, AnHai Doan, Byron Dom, Michael Gamon, Robert Grossman, Jiawei Han, Wynne Hsu, Ronny Kohavi, David D. Lewis, Ian McAllister, Wei-Ying Ma, Marco Maggini, Llew Mason, Kamel Nigam, Julian Qian, Yan Qu, Thomas M. Tirpak, Andrew Tomkins, Alexander Tuzhilin, Weimin Xiao, Gu Xu, Philip S. Yu, and Mohammed Zaki.

My former and current students, Gao Cong, Minqing Hu, Nitin Jindal, Xin Li, Yiming Ma, Yanhong Zhai and Kaidi Zhao, checked many algorithms and made numerous corrections. Some chapters of the book have been used in my graduate classes at the University of Illinois at Chicago. I thank the students in these classes for implementing several algorithms. Their questions helped me improve and, in some cases, correct the algorithms. It is not possible to list all their names. Here, I would particularly like to thank John Castano, Xiaowen Ding, Murthy Ganapathibhotla, Cynthia Kersey, Hari Prasad Divyakotti, Ravikanth Turlapati, Srikanth Tadikonda, Mako Tamura, Haisheng Wang, and Chad Williams for pointing out errors in texts, examples or algorithms. Michael Bombyk from DePaul University also found several typing errors.

It was a pleasure working with the helpful staff at Springer. I thank my editor Ralf Gerstner, who asked me in early 2005 whether I was interested in writing a book on Web mining. It has been a wonderful experience working with him since. I also thank my copyeditor Mike Nugent for helping me improve the presentation, and my production editor Michael Reinfarth for guiding me through the final production process. Two anonymous reviewers also gave me many insightful comments. The Department of Computer Science at the University of Illinois at Chicago provided computing resources and a supportive environment for this project.

Finally, I thank my parents, brother and sister for their constant support and encouragement. My greatest gratitude goes to my own family: Yue, Shelley and Kate. They have helped me in so many ways. Despite their young ages, Shelley and Kate actually read many parts of the book and caught numerous typing errors. My wife has taken care of almost everything at home and put up with me and the long hours that I have spent on this book. I dedicate this book to them.

Bing Liu

3 Supervised Learning

Supervised learning has been a great success in real-world applications. It is used in almost every domain, including the text and Web domains. Supervised learning is also called classification or inductive learning in machine learning. This type of learning is analogous to human learning from past experiences to gain new knowledge in order to improve our ability to perform real-world tasks. However, since computers do not have experiences, machine learning learns from data, which are collected in the past and represent past experiences in some real-world applications.

There are several types of supervised learning tasks. In this chapter, we focus on one particular type, namely, learning a target function that can be used to predict the values of a discrete class attribute. This type of learning has been the focus of machine learning research and is perhaps also the most widely used learning paradigm in practice. This chapter introduces a number of such supervised learning techniques. They are used in almost every Web mining application, and we will see their uses in many of the later chapters.

3.1 Basic Concepts

A data set used in the learning task consists of a set of data records, which are described by a set of attributes A = {A1, A2, ..., A|A|}, where |A| denotes the number of attributes or the size of the set A. The data set also has a special target attribute C, which is called the class attribute. In our subsequent discussions, we consider C separately from the attributes in A due to its special status, i.e., we assume that C is not in A. The class attribute C has a set of discrete values, i.e., C = {c1, c2, ..., c|C|}, where |C| is the number of classes and |C| >= 2. A class value is also called a class label. A data set for learning is simply a relational table. Each data record describes a piece of past experience. In the machine learning and data mining literature, a data record is also called an example, an instance, a case or a vector. A data set basically consists of a set of examples or instances.

Given a data set D, the objective of learning is to produce a classification/prediction function to relate values of attributes in A and classes in C. The function can be used to predict the class values/labels of the future

data. The function is also called a classification model, a predictive model or simply a classifier. We will use these terms interchangeably in this book. It should be noted that the function/model can be in any form, e.g., a decision tree, a set of rules, a Bayesian model or a hyperplane.

Example 1: Table 3.1 shows a small loan application data set. It has four attributes. The first attribute is Age, which has three possible values, young, middle and old. The second attribute is Has_job, which indicates whether an applicant has a job; its possible values are true (has a job) and false (does not have a job). The third attribute is Own_house, which shows whether an applicant owns a house. The fourth attribute is Credit_rating, which has three possible values, fair, good and excellent. The last column is the Class attribute, which shows whether each loan application was approved (denoted by Yes) or not (denoted by No) in the past.

Table 3.1. A loan application data set

ID   Age     Has_job  Own_house  Credit_rating  Class
1    young   false    false      fair           No
2    young   false    false      good           No
3    young   true     false      good           Yes
4    young   true     true       fair           Yes
5    young   false    false      fair           No
6    middle  false    false      fair           No
7    middle  false    false      good           No
8    middle  true     true       good           Yes
9    middle  false    true       excellent      Yes
10   middle  false    true       excellent      Yes
11   old     false    true       excellent      Yes
12   old     false    true       good           Yes
13   old     true     false      good           Yes
14   old     true     false      excellent      Yes
15   old     false    false      fair           No

We want to learn a classification model from this data set that can be used to classify future loan applications. That is, when a new customer comes into the bank to apply for a loan, after inputting his/her age, whether he/she has a job, whether he/she owns a house, and his/her credit rating, the classification model should predict whether his/her loan application should be approved.

Our learning task is called supervised learning because the class labels (e.g., the Yes and No values of the class attribute in Table 3.1) are provided in

the data. It is as if some teacher tells us the classes. This is in contrast to unsupervised learning, where the classes are not known and the learning algorithm needs to generate classes automatically. Unsupervised learning is the topic of the next chapter.

The data set used for learning is called the training data (or the training set). After a model is learned or built from the training data by a learning algorithm, it is evaluated using a set of test data (or unseen data) to assess the model accuracy. It is important to note that the test data is not used in learning the classification model. The examples in the test data usually also have class labels; that is why the test data can be used to assess the accuracy of the learned model: we can check whether the class predicted by the model for each test case is the same as the actual class of the test case. In order to learn and also to test, the available data (which has classes) is usually split into two disjoint subsets, the training set (for learning) and the test set (for testing). We will discuss this further in Sect. 3.3. The accuracy of a classification model on a test set is defined as:

    Accuracy = Number of correct classifications / Total number of test cases,    (1)

where a correct classification means that the learned model predicts the same class as the original class of the test case. There are also other measures that can be used; we will discuss them in Sect. 3.3.2.

We pause here to raise two important questions:
1. What do we mean by learning by a computer system?
2. What is the relationship between the training and the test data?

We answer the first question first. Given a data set D representing past experiences, a task T and a performance measure M, a computer system is said to learn from the data to perform the task T if, after learning, the system's performance on T improves as measured by M. In other words, the learned model or knowledge helps the system to perform the task better as compared to no learning. Learning is the process of building the model or extracting the knowledge.

We use the data set in Example 1 to explain the idea. The task is to predict whether a loan application should be approved. The performance measure M is the accuracy in Equation (1). With the data set in Table 3.1, if there is no learning, all we can do is to guess randomly or to simply take the majority class (which is the Yes class). Suppose we use the majority class and announce that every future instance or case belongs to the class Yes. If the future data are drawn from the same distribution as the existing training data in Table 3.1, the estimated classification/prediction accuracy

on the future data is 9/15 = 0.6, as there are 9 Yes class examples out of the total of 15 examples in Table 3.1. The question is: can we do better with learning? If the learned model can indeed improve the accuracy, then the learning is said to be effective.

The second question in fact touches the fundamental assumption of machine learning, especially the theoretical study of machine learning. The assumption is that the distribution of training examples is identical to the distribution of test examples (including future unseen examples). In practical applications, this assumption is often violated to a certain degree. Strong violations will clearly result in poor classification accuracy, which is quite intuitive: if the test data behave very differently from the training data, then the learned model will not perform well on the test data. To achieve good accuracy on the test data, training examples must be sufficiently representative of the test data.

We now illustrate the steps of learning in Fig. 3.1 based on the preceding discussion. In step 1, a learning algorithm uses the training data to generate a classification model. This step is also called the training step or training phase. In step 2, the learned model is tested using the test set to obtain the classification accuracy. This step is called the testing step or testing phase. If the accuracy of the learned model on the test data is satisfactory, the model can be used in real-world tasks to predict classes of new cases (which do not have classes). If the accuracy is not satisfactory, we need to go back and choose a different learning algorithm and/or do some further processing of the data (this step is called data pre-processing, not shown in the figure). A practical learning task typically involves many iterations of these steps before a satisfactory model is built. It is also possible that we are unable to build a satisfactory model due to a high degree of randomness in the data or limitations of current learning algorithms.

Fig. 3.1. The basic learning process: training and testing (Step 1, training: a learning algorithm builds a model from the training data; Step 2, testing: the model is applied to the test data to obtain its accuracy)

From the next section onward, we study several supervised learning algorithms, except Sect. 3.3, which focuses on model/classifier evaluation. We note that throughout the chapter we assume that the training and test data are available for learning. However, in many text and Web page related learning tasks, this is not true. Usually, we need to collect raw data,

design attributes and compute attribute values from the raw data. The reason is that the raw data in text and Web applications are often not suitable for learning either because their formats are not right or because there are no obvious attributes in the raw text documents or Web pages.

3.2 Decision Tree Induction

Decision tree learning is one of the most widely used techniques for classification. Its classification accuracy is competitive with other learning methods, and it is very efficient. The learned classification model is represented as a tree, called a decision tree. The techniques presented in this section are based on the C4.5 system from Quinlan [453].

Example 2: Figure 3.2 shows a possible decision tree learnt from the data in Table 3.1. The tree has two types of nodes, decision nodes (which are internal nodes) and leaf nodes. A decision node specifies some test (i.e., asks a question) on a single attribute. A leaf node indicates a class.

Fig. 3.2. A decision tree for the data in Table 3.1
  Age? -- young  -> Has_job?       [true -> Yes (2/2);  false -> No (3/3)]
       -- middle -> Own_house?     [true -> Yes (3/3);  false -> No (2/2)]
       -- old    -> Credit_rating? [fair -> No (1/1);  good -> Yes (2/2);  excellent -> Yes (2/2)]

The root node of the decision tree in Fig. 3.2 is Age, which basically asks the question: what is the age of the applicant? It has three possible answers or outcomes, which are the three possible values of Age. These three values form three tree branches/edges. The other internal nodes have the same meaning. Each leaf node gives a class value (Yes or No). (x/y) below each class means that x out of the y training examples that reach this leaf node have the class of the leaf. For instance, the class of the left-most leaf node is Yes. Two training examples (examples 3 and 4 in Table 3.1) reach here and both of them are of class Yes.

To use the decision tree in testing, we traverse the tree top-down according to the attribute values of the given test instance until we reach a leaf node. The class of the leaf is the predicted class of the test instance.
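To make the traversal concrete, here is a small illustrative sketch in Python (not from the book). The nested-dictionary encoding and the attribute/value strings are assumptions made for this example; the tree hand-codes Fig. 3.2 and classifies the new applicant that Example 3 below will use.

```python
# Classify a test instance by walking a decision tree stored as nested dictionaries.
tree = {
    "attribute": "Age",
    "branches": {
        "young":  {"attribute": "Has_job",
                   "branches": {"true": "Yes", "false": "No"}},
        "middle": {"attribute": "Own_house",
                   "branches": {"true": "Yes", "false": "No"}},
        "old":    {"attribute": "Credit_rating",
                   "branches": {"fair": "No", "good": "Yes", "excellent": "Yes"}},
    },
}

def classify(node, instance):
    """Follow the branch matching the instance's value at each decision node."""
    while isinstance(node, dict):            # internal decision node
        value = instance[node["attribute"]]  # answer the node's question
        node = node["branches"][value]       # move down the chosen branch
    return node                              # a leaf is the predicted class

applicant = {"Age": "young", "Has_job": "false",
             "Own_house": "false", "Credit_rating": "good"}
print(classify(tree, applicant))             # -> "No" (the applicant of Example 3 below)
```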

Example 3: We use the tree to predict the class of the following new instance, which describes a new loan applicant.

Age     Has_job  Own_house  Credit_rating  Class
young   false    false      good           ?

Going through the decision tree, we find that the predicted class is No, as we reach the second leaf node from the left.

A decision tree is constructed by partitioning the training data so that the resulting subsets are as pure as possible. A pure subset is one that contains only training examples of a single class. If we apply all the training data in Table 3.1 to the tree in Fig. 3.2, we will see that the training examples reaching each leaf node form a subset of examples that have the same class as the class of the leaf. In fact, we can see that from the x and y values in (x/y). We will discuss the decision tree building algorithm in Sect. 3.2.1.

An interesting question is: Is the tree in Fig. 3.2 unique for the data in Table 3.1? The answer is no. In fact, there are many possible trees that can be learned from the data. For example, Fig. 3.3 gives another decision tree, which is much smaller and is also able to partition the training data perfectly according to their classes.

Fig. 3.3. A smaller tree for the data set in Table 3.1
  Own_house? -- true  -> Yes (6/6)
             -- false -> Has_job? [true -> Yes (3/3);  false -> No (6/6)]

In practice, one wants to have a small and accurate tree for many reasons. A smaller tree is more general and also tends to be more accurate (we will discuss this later). It is also easier for human users to understand. In many applications, the user's understanding of the classifier is important. For example, in some medical applications, doctors want to understand the model that classifies whether a person has a particular disease. It is not satisfactory to simply produce a classification, because without understanding why the decision is made the doctor may not trust the system and/or may not gain useful knowledge from it.

It is useful to note that in both Fig. 3.2 and Fig. 3.3, the training examples that reach each leaf node all have the same class (see the values of

(x/y) at each leaf node). However, for most real-life data sets, this is usually not the case. That is, the examples that reach a particular leaf node are not all of the same class, i.e., x < y. The value of x/y is, in fact, the confidence (conf) value used in association rule mining, and x is the support count. This suggests that a decision tree can be converted to a set of if-then rules. Yes, indeed. The conversion is done as follows: each path from the root to a leaf forms a rule, in which all the decision nodes along the path form the conditions of the rule and the leaf node (or the class) forms the consequent. For each rule, a support and a confidence can be attached. Note that in most classification systems these two values are not provided; we add them here to see the connection between association rules and decision trees.

Example 4: The tree in Fig. 3.3 generates three rules ("," means "and"):

Own_house = true -> Class = Yes  [sup=6/15, conf=6/6]
Own_house = false, Has_job = true -> Class = Yes  [sup=3/15, conf=3/3]
Own_house = false, Has_job = false -> Class = No  [sup=6/15, conf=6/6]

We can see that these rules are of the same format as association rules. However, the rules above are only a small subset of the rules that can be found in the data of Table 3.1. For instance, the decision tree in Fig. 3.3 does not find the following rule:

Age = young, Has_job = false -> Class = No  [sup=3/15, conf=3/3]

Thus, we say that a decision tree only finds a subset of the rules that exist in the data, which is sufficient for classification. The objective of association rule mining, in contrast, is to find all rules subject to some minimum support and minimum confidence constraints. The two methods thus have different objectives. We will discuss these issues again in Sect. 3.5, where we show that association rules can be used for classification as well.

An interesting and important property of a decision tree and its resulting set of rules is that the tree paths or the rules are mutually exclusive and exhaustive. This means that every data instance is covered by a single rule (a tree path) and a single rule only. By covering a data instance, we mean that the instance satisfies the conditions of the rule.

We also say that a decision tree generalizes the data, as a tree is a smaller (more compact) description of the data, i.e., it captures the key regularities in the data. The problem then becomes building the best tree, one that is small and accurate. It turns out that finding the best tree that models the data is an NP-complete problem [48]. All existing algorithms use heuristic methods for tree building. Below, we study one of the most successful techniques.

Algorithm decisionTree(D, A, T)
1   if D contains only training examples of the same class cj in C then
2       make T a leaf node labeled with class cj;
3   elseif A = {} then
4       make T a leaf node labeled with cj, which is the most frequent class in D
5   else  // D contains examples belonging to a mixture of classes. We select a single
6         // attribute to partition D into subsets so that each subset is purer
7       p0 = impurityEval-1(D);
8       for each attribute Ai in A (= {A1, A2, ..., Ak}) do
9           pi = impurityEval-2(Ai, D)
10      endfor
11      Select Ag in {A1, A2, ..., Ak} that gives the biggest impurity reduction, computed using p0 - pi;
12      if p0 - pg < threshold then  // Ag does not significantly reduce impurity p0
13          make T a leaf node labeled with cj, the most frequent class in D
14      else  // Ag is able to reduce impurity p0
15          Make T a decision node on Ag;
16          Let the possible values of Ag be v1, v2, ..., vm. Partition D into m disjoint subsets D1, D2, ..., Dm based on the m values of Ag;
17          for each Dj in {D1, D2, ..., Dm} do
18              if Dj is not empty then
19                  create a branch (edge) node Tj for vj as a child node of T;
20                  decisionTree(Dj, A - {Ag}, Tj)  // Ag is removed
21              endif
22          endfor
23      endif
24  endif

Fig. 3.4. A decision tree learning algorithm

3.2.1 Learning Algorithm

As indicated earlier, a decision tree T simply partitions the training data set D into disjoint subsets so that each subset is as pure as possible (of the same class). The learning of a tree is typically done using the divide-and-conquer strategy that recursively partitions the data to produce the tree. At the beginning, all the examples are at the root. As the tree grows, the examples are sub-divided recursively. A decision tree learning algorithm is given in Fig. 3.4. For now, we assume that every attribute in D takes discrete values; this assumption is not necessary, as we will see later.

The stopping criteria of the recursion are in lines 1-4 of Fig. 3.4. The algorithm stops when all the training examples in the current data are of the same class, or when every attribute has been used along the current tree

path. In tree learning, each successive recursion chooses the best attribute to partition the data at the current node according to the values of the attribute. The best attribute is selected based on a function that aims to minimize the impurity after the partitioning (lines 7-11); in other words, it maximizes the purity. The key in decision tree learning is thus the choice of the impurity function, which is used in lines 7, 9 and 11 of Fig. 3.4. The recursive call of the algorithm is in line 20, which takes the subset of training examples at the node for further partitioning to extend the tree. This is a greedy algorithm with no backtracking: once a node is created, it will not be revised or revisited no matter what happens subsequently.

3.2.2 Impurity Function

Before presenting the impurity function, we use an example to show intuitively what the impurity function aims to do.

Example 5: Figure 3.5 shows two possible root nodes for the data in Table 3.1.

Fig. 3.5. Two possible root nodes, or two possible attributes for the root node
  (A) Age?        young -> No: 3, Yes: 2    middle -> No: 2, Yes: 3    old -> No: 1, Yes: 4
  (B) Own_house?  true  -> No: 0, Yes: 6    false  -> No: 6, Yes: 3

Fig. 3.5(A) uses Age as the root node, and Fig. 3.5(B) uses Own_house as the root node. Their possible values (or outcomes) are the branches. At each branch, we list the number of training examples of each class (No or Yes) that land or reach there. Fig. 3.5(B) is obviously a better choice for the root. From a prediction or classification point of view, Fig. 3.5(B) makes fewer mistakes than Fig. 3.5(A). In Fig. 3.5(B), when Own_house = true every example has the class Yes. When Own_house = false, if we take the majority class (the most frequent class), which is No, we make three mistakes/errors. If we look at Fig. 3.5(A), the situation is worse: if we take the majority class for each branch, we make five mistakes. Thus, we say that the impurity of the tree in Fig. 3.5(A) is higher than that of the tree in Fig. 3.5(B). To learn a decision tree, we prefer Own_house to Age as the root node.

Instead of counting the number of mistakes or errors, C4.5 uses a more principled approach to perform this evaluation on every attribute in order to choose the best attribute to build the tree.
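Before defining the impurity functions themselves, the following sketch shows how the divide-and-conquer scheme of Fig. 3.4 might look in Python. It is a simplified illustration, not the C4.5 implementation: it assumes discrete attributes, rows stored as dictionaries with a "Class" key, and an impurity scoring function supplied by the caller (for example, the entropy-based measure introduced next, applied to the class labels of the rows).

```python
from collections import Counter

def majority_class(D):
    """Most frequent class among the rows (each row is a dict with a "Class" key)."""
    return Counter(row["Class"] for row in D).most_common(1)[0][0]

def build_tree(D, attributes, impurity, threshold=1e-6):
    """Recursive divide-and-conquer tree building in the spirit of Fig. 3.4."""
    classes = {row["Class"] for row in D}
    if len(classes) == 1:                        # lines 1-2: all examples in one class
        return classes.pop()
    if not attributes:                           # lines 3-4: no attribute left
        return majority_class(D)
    p0 = impurity(D)                             # line 7: impurity before splitting
    def split_score(a):                          # lines 8-10: impurity after splitting on a
        parts = {}
        for row in D:
            parts.setdefault(row[a], []).append(row)
        score = sum(len(Dj) / len(D) * impurity(Dj) for Dj in parts.values())
        return score, parts
    scored = {a: split_score(a) for a in attributes}
    best = min(scored, key=lambda a: scored[a][0])   # line 11: biggest impurity reduction
    if p0 - scored[best][0] < threshold:         # lines 12-13: no significant reduction
        return majority_class(D)
    node = {"attribute": best, "branches": {}}   # lines 15-16: decision node on best
    for value, Dj in scored[best][1].items():    # lines 17-22: recurse on each subset
        remaining = [a for a in attributes if a != best]
        node["branches"][value] = build_tree(Dj, remaining, impurity, threshold)
    return node
```

A caller would supply, say, `impurity = lambda rows: entropy([r["Class"] for r in rows])` once an entropy function is defined.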

The most popular impurity functions used for decision tree learning are information gain and information gain ratio, which are used in C4.5 as two options. Let us first discuss information gain, which can be extended slightly to produce information gain ratio.

The information gain measure is based on the entropy function from information theory [484]:

    entropy(D) = - SUM_{j=1..|C|} Pr(cj) x log2 Pr(cj),    (2)

where Pr(cj) is the probability of class cj in data set D, i.e., the number of examples of class cj in D divided by the total number of examples in D, and the probabilities of all classes sum to 1. In the entropy computation, we define 0 x log2 0 = 0. The unit of entropy is the bit. Let us use an example to get a feel for what this function does.

Example 6: Assume we have a data set D with only two classes, positive and negative. Let us see the entropy values for three different compositions of positive and negative examples:

1. The data set D has 50% positive examples (Pr(positive) = 0.5) and 50% negative examples (Pr(negative) = 0.5):
   entropy(D) = -0.5 x log2 0.5 - 0.5 x log2 0.5 = 1.
2. The data set D has 20% positive examples (Pr(positive) = 0.2) and 80% negative examples (Pr(negative) = 0.8):
   entropy(D) = -0.2 x log2 0.2 - 0.8 x log2 0.8 = 0.722.
3. The data set D has 100% positive examples (Pr(positive) = 1) and no negative examples (Pr(negative) = 0):
   entropy(D) = -1 x log2 1 - 0 x log2 0 = 0.

We can see a trend: as the data becomes purer and purer, the entropy value becomes smaller and smaller. In fact, it can be shown that for this binary case (two classes), the entropy reaches its maximum value of 1 bit when Pr(positive) = Pr(negative) = 0.5, and its minimum value of 0 bit when all the data in D belong to one class. It is clear that the entropy measures the amount of impurity or disorder in the data, which is exactly what we need in decision tree learning. We now describe the information gain measure, which uses the entropy function.
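As a quick check of Equation (2), the following minimal Python sketch (an illustration, not from the book) reproduces the three entropy values of Example 6 from lists of class labels.

```python
import math

def entropy(labels):
    """Equation (2): entropy of a list of class labels (0 x log2 0 is taken as 0)."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return 0.0 - sum(p * math.log2(p) for p in probs)

print(entropy(["+"] * 50 + ["-"] * 50))   # 1.0    (case 1: 50% / 50%)
print(entropy(["+"] * 20 + ["-"] * 80))   # ~0.722 (case 2: 20% / 80%)
print(entropy(["+"] * 100))               # 0.0    (case 3: one class only)
```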

Information Gain

The idea is the following:

1. Given a data set D, we first use the entropy function (Equation 2) to compute the impurity value of D, which is entropy(D). The impurityEval-1 function in line 7 of Fig. 3.4 performs this task.
2. Then, we want to know which attribute can reduce the impurity most if it is used to partition D. To find out, every attribute is evaluated (lines 8-10 in Fig. 3.4). Let the number of possible values of the attribute Ai be v. If we are going to use Ai to partition the data D, we will divide D into v disjoint subsets D1, D2, ..., Dv. The entropy after the partition is

       entropy_Ai(D) = SUM_{j=1..v} (|Dj| / |D|) x entropy(Dj).    (3)

   The impurityEval-2 function in line 9 of Fig. 3.4 performs this task.
3. The information gain of attribute Ai is computed with:

       gain(D, Ai) = entropy(D) - entropy_Ai(D).    (4)

Clearly, the gain criterion measures the reduction in impurity or disorder. The gain measure is used in line 11 of Fig. 3.4, which chooses the attribute Ag resulting in the largest reduction in impurity. If the gain of Ag is too small, the algorithm stops for the branch (line 12); normally a threshold is used here. If choosing Ag is able to reduce impurity significantly, Ag is employed to partition the data to extend the tree further, and so on (lines 15-22 in Fig. 3.4). The process goes on recursively by building sub-trees using D1, D2, ..., Dm (line 20). For subsequent tree extensions, we do not need Ag any more, as all training examples in each branch have the same Ag value.

Example 7: Let us compute the gain values for the attributes Age, Own_house and Credit_rating using the whole data set D in Table 3.1, i.e., we evaluate them for the root node of a decision tree.

First, we compute the entropy of D. Since D has 6 No class training examples and 9 Yes class training examples, we have

    entropy(D) = -(6/15) x log2(6/15) - (9/15) x log2(9/15) = 0.971.

We then try Age, which partitions the data into 3 subsets (as Age has three possible values): D1 (with Age=young), D2 (with Age=middle), and D3 (with Age=old). Each subset has five training examples. In Fig. 3.5, we also see the number of No class examples and the number of Yes class examples in each subset (or in each branch).

    entropy_Age(D) = (5/15) x entropy(D1) + (5/15) x entropy(D2) + (5/15) x entropy(D3)
                   = (5/15) x 0.971 + (5/15) x 0.971 + (5/15) x 0.722 = 0.888.

Likewise, we compute for Own_house, which partitions D into two subsets, D1 (with Own_house=true) and D2 (with Own_house=false):

    entropy_Own_house(D) = (6/15) x entropy(D1) + (9/15) x entropy(D2)
                         = (6/15) x 0 + (9/15) x 0.918 = 0.551.

Similarly, we obtain entropy_Has_job(D) = 0.647 and entropy_Credit_rating(D) = 0.608. The gains for the attributes are:

    gain(D, Age)           = 0.971 - 0.888 = 0.083
    gain(D, Own_house)     = 0.971 - 0.551 = 0.420
    gain(D, Has_job)       = 0.971 - 0.647 = 0.324
    gain(D, Credit_rating) = 0.971 - 0.608 = 0.363

Own_house is the best attribute for the root node. Figure 3.5(B) shows the root node using Own_house. Since the left branch has only one class (Yes) of data, it results in a leaf node (lines 1-2 in Fig. 3.4). For Own_house = false, further extension is needed. The process is the same as above, but we only use the subset of the data with Own_house = false, i.e., D2.
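The computation in Example 7 is easy to mechanize. The sketch below (illustrative Python; the tuple encoding of Table 3.1 is an assumption of the example) implements Equations (3) and (4) and reproduces the four gain values above.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return 0.0 - sum(c / n * math.log2(c / n) for c in Counter(labels).values())

rows = [  # (Age, Has_job, Own_house, Credit_rating, Class) for IDs 1-15 of Table 3.1
    ("young", "false", "false", "fair", "No"),       ("young", "false", "false", "good", "No"),
    ("young", "true", "false", "good", "Yes"),       ("young", "true", "true", "fair", "Yes"),
    ("young", "false", "false", "fair", "No"),       ("middle", "false", "false", "fair", "No"),
    ("middle", "false", "false", "good", "No"),      ("middle", "true", "true", "good", "Yes"),
    ("middle", "false", "true", "excellent", "Yes"), ("middle", "false", "true", "excellent", "Yes"),
    ("old", "false", "true", "excellent", "Yes"),    ("old", "false", "true", "good", "Yes"),
    ("old", "true", "false", "good", "Yes"),         ("old", "true", "false", "excellent", "Yes"),
    ("old", "false", "false", "fair", "No"),
]

def gain(rows, i):
    """Equation (4): entropy(D) minus the entropy after partitioning on attribute i."""
    parts = {}
    for r in rows:                               # partition D on the i-th attribute
        parts.setdefault(r[i], []).append(r[-1])
    split = sum(len(p) / len(rows) * entropy(p) for p in parts.values())   # Eq. (3)
    return entropy([r[-1] for r in rows]) - split

for i, name in enumerate(["Age", "Has_job", "Own_house", "Credit_rating"]):
    print(name, round(gain(rows, i), 3))
# Age 0.083, Has_job 0.324, Own_house 0.42, Credit_rating 0.363 -- as in Example 7
```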

Information Gain Ratio

The gain criterion tends to favor attributes with many possible values. An extreme situation is that the data contain an ID attribute that is an identification of each example. If we consider using this ID attribute to partition the data, each training example forms its own subset with only one class, which results in entropy_ID(D) = 0. The gain obtained by using this attribute is therefore maximal. From a prediction point of view, however, such a partition is useless.

The gain ratio (Equation 5) remedies this bias by normalizing the gain using the entropy of the data with respect to the values of the attribute (our previous entropy computations were done with respect to the class attribute):

    gainRatio(D, Ai) = gain(D, Ai) / ( - SUM_{j=1..s} (|Dj| / |D|) x log2(|Dj| / |D|) ),    (5)

where s is the number of possible values of Ai, and Dj is the subset of data that has the jth value of Ai. |Dj| / |D| corresponds to the probability in Equation (2). Using Equation (5), we simply choose the attribute with the highest gainRatio value to extend the tree.

This method works because if Ai has too many values the denominator will be large. For instance, in our above example of the ID attribute, the denominator is log2 |D|. The denominator is called the split info in C4.5. One note is that the split info can be 0 or very small; some heuristic solutions can be devised to deal with it (see [453]).

3.2.3 Handling of Continuous Attributes

It may seem that the decision tree algorithm can only handle discrete attributes. In fact, continuous attributes can be dealt with easily as well. In real-life data sets there are often both discrete and continuous attributes, and handling both types in one algorithm is an important advantage.

To apply the decision tree building method, we can divide the value range of attribute Ai into intervals at a particular tree node. Each interval can then be considered a discrete value, and gain or gainRatio is evaluated in the same way as in the discrete case. Clearly, we can divide Ai into any number of intervals at a tree node, but two intervals are usually sufficient. This binary split is used in C4.5. We need to find a threshold value for the division.

Clearly, we should choose the threshold that maximizes the gain (or gainRatio), so we need to examine all possible thresholds. This is not a problem because, although a continuous attribute Ai can take an infinite number of possible values, the number of actual values that appear in the data is always finite. Let the set of distinct values of attribute Ai that occur in the data be {v1, v2, ..., vr}, sorted in ascending order. Clearly, any threshold value lying between vi and vi+1 has the same effect of dividing the training examples into those whose value of attribute Ai lies in {v1, v2, ..., vi} and those whose value lies in {vi+1, vi+2, ..., vr}. There are thus only r-1 possible splits on Ai, which can all be evaluated. The threshold value can be the mid-point between vi and vi+1, or just on the right side of value vi, which results in two intervals, Ai <= vi and Ai > vi. The latter approach is used in C4.5; its advantage is that the values appearing in the tree actually occur in the data. The threshold value that maximizes the gain (gainRatio) value is selected. We can modify the algorithm in Fig. 3.4 (lines 8-11) easily to accommodate this computation so that both discrete and continuous attributes are considered.
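A small Python sketch of the binary-split search just described: each distinct value occurring in the data is tried as a threshold Ai <= v, and the one giving the largest gain is kept. The helper names are illustrative, not from C4.5.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return 0.0 - sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Try every distinct value v as a split Ai <= v and return the best (v, gain)."""
    pairs = sorted(zip(values, labels))              # one sort per continuous attribute
    base = entropy(labels)
    best_v, best_gain = None, -1.0
    for v in sorted(set(values))[:-1]:               # r-1 candidate thresholds
        left = [c for x, c in pairs if x <= v]
        right = [c for x, c in pairs if x > v]
        split = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if base - split > best_gain:
            best_v, best_gain = v, base - split
    return best_v, best_gain

# A pure split at 2.0 separates the two classes completely in this toy data:
print(best_threshold([1.0, 1.5, 2.0, 3.0, 4.0], ["O", "O", "O", "X", "X"]))  # (2.0, ~0.971)
```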

A change to line 20 of the algorithm in Fig. 3.4 is also needed. For a continuous attribute, we do not remove attribute Ag, because an interval can be further split recursively in subsequent tree extensions. Thus, the same continuous attribute may appear multiple times in a tree path (see Example 9), which does not happen for a discrete attribute.

From a geometric point of view, a decision tree built with only continuous attributes represents a partitioning of the data space. A series of splits from the root node to a leaf node represents a hyper-rectangle, and each side of the hyper-rectangle is an axis-parallel hyperplane.

Example 8: The hyper-rectangular regions in Fig. 3.6(A), which partition the space, are produced by the decision tree in Fig. 3.6(B). There are two classes in the data, represented by empty circles and filled rectangles.

Fig. 3.6. A partitioning of the data space and its corresponding decision tree: (A) a partition of the data space into axis-parallel rectangles; (B) the decision tree, whose internal nodes test conditions such as X <= 2 (vs. X > 2), Y <= 2.5, Y <= 2.6, Y <= 2, X <= 3 and X <= 4.

Handling of continuous (numeric) attributes has an impact on the efficiency of the decision tree algorithm. With only discrete attributes the algorithm grows linearly with the size of the data set D. However, sorting a continuous attribute takes |D| log |D| time, which can dominate the tree learning process. Sorting is important as it ensures that gain or gainRatio can be computed in one pass over the data.

3.2.4 Some Other Issues

We now discuss several other issues in decision tree learning.

Tree Pruning and Overfitting: A decision tree algorithm recursively partitions the data until there is no impurity or there is no attribute left. This process may result in trees that are very deep, with many tree leaves covering very few training examples. If we use such a tree to predict the training set, the accuracy will be very high. However, when it is used to classify the unseen test set, the accuracy may be very low. The learning is thus not effective, i.e., the decision tree does not generalize the data well. This

phenomenon is called overfitting. More specifically, we say that a classifier f overfits the data if there is another classifier f' such that f achieves a higher accuracy on the training data than f', but a lower accuracy on the unseen test data than f' [385].

Overfitting is usually caused by noise in the data, i.e., wrong class values/labels and/or wrong values of attributes, but it may also be due to the complexity and randomness of the application domain. These problems cause the decision tree algorithm to refine the tree by extending it very deeply using many attributes.

To reduce overfitting in the context of decision tree learning, we perform pruning of the tree, i.e., we delete some branches or sub-trees and replace them with leaves of the majority classes. There are two main methods to do this: stopping early in tree building (also called pre-pruning) and pruning the tree after it is built (called post-pruning). Post-pruning has been shown to be more effective. Early stopping can be dangerous because it is not clear what would happen if the tree were extended further without stopping. Post-pruning is more effective because, after we have extended the tree to the fullest, it becomes clearer which branches/sub-trees may not be useful (i.e., overfit the data). The general idea of post-pruning is to estimate the error of each tree node: if the estimated error for a node is less than the estimated error of its extended sub-tree, then the sub-tree is pruned. Most existing tree learning algorithms take this approach. See [453] for a technique called pessimistic error based pruning.

Example 9: In Fig. 3.6(B), the sub-tree representing the rectangular region X <= 2, Y > 2.5, Y <= 2.6 in Fig. 3.6(A) is very likely to be overfitting. The region is very small and contains only a single data point, which may be an error (or noise) in the data collection. If it is pruned, we obtain Fig. 3.7(A) and (B).

Fig. 3.7. The data space partition and the decision tree after pruning: (A) the partition of the data space without the small noisy region; (B) the pruned decision tree, in which the Y <= 2.5 and Y <= 2.6 tests no longer appear.
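The following Python sketch illustrates the post-pruning idea in its simplest form: a sub-tree is replaced by a majority-class leaf when an error estimate for the collapsed leaf is no worse than that of the sub-tree. Here the "estimate" just counts misclassifications among the examples passed in; C4.5's pessimistic error estimate is more involved, so this is only an illustration of the general idea, not C4.5's method. The nested-dict tree format follows the earlier sketches.

```python
from collections import Counter

def _predict(node, row, default):
    """Classify row with the (sub-)tree, falling back to a default for unseen branches."""
    while isinstance(node, dict):
        node = node["branches"].get(row[node["attribute"]], default)
    return node

def prune(node, D):
    """Bottom-up pruning of a nested-dict tree; D holds the examples reaching the node."""
    if not isinstance(node, dict) or not D:
        return node
    for value in list(node["branches"]):                        # prune the children first
        Dv = [row for row in D if row[node["attribute"]] == value]
        node["branches"][value] = prune(node["branches"][value], Dv)
    majority, count = Counter(row["Class"] for row in D).most_common(1)[0]
    leaf_errors = len(D) - count                                # errors if collapsed to a leaf
    subtree_errors = sum(1 for row in D
                         if _predict(node, row, majority) != row["Class"])
    return majority if leaf_errors <= subtree_errors else node  # keep the smaller description
```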

Another common approach to pruning is to use a separate set of data called the validation set, which is used neither in training nor in testing. After a tree is built, it is used to classify the validation set. Then, we can find the errors at each node on the validation set, which tells us what to prune based on the errors at each node.

Rule Pruning: We noted earlier that a decision tree can be converted to a set of rules. In fact, C4.5 also prunes the rules to simplify them and to reduce overfitting. First, the tree (C4.5 uses the unpruned tree) is converted to a set of rules in the way discussed in Example 4. Rule pruning is then performed by removing some conditions to make the rules shorter and fewer (after pruning, some rules may become redundant). In most cases, pruning results in a more accurate rule set, as shorter rules are less likely to overfit the training data. Pruning is also called generalization, as it makes rules more general (with fewer conditions): a rule with more conditions is more specific than a rule with fewer conditions.

Example 10: The sub-tree below X <= 2 in Fig. 3.6(B) produces the following rules, where the filled-rectangle class is written as [filled] and the empty-circle class as O:

Rule 1: X <= 2, Y > 2.5, Y > 2.6 -> [filled]
Rule 2: X <= 2, Y > 2.5, Y <= 2.6 -> O
Rule 3: X <= 2, Y <= 2.5 -> [filled]

Note that Y > 2.5 in Rule 1 is not useful because of Y > 2.6, and thus Rule 1 should be

Rule 1: X <= 2, Y > 2.6 -> [filled]

In pruning, we may be able to delete the condition Y > 2.6 from Rule 1 to produce:

X <= 2 -> [filled]

Then Rule 2 and Rule 3 become redundant and can be removed.

A useful point to note is that after pruning the resulting set of rules may no longer be mutually exclusive and exhaustive. There may be data points that satisfy the conditions of more than one rule and, if inaccurate rules are discarded, of no rules. An ordering of the rules is thus needed to ensure that when classifying a test case only one rule is applied to determine the class of the test case. To deal with the situation in which a test case does not satisfy the conditions of any rule, a default class is used, which is usually the majority class.

Handling Missing Attribute Values: In many practical data sets, some attribute values are missing or not available due to various reasons. There are many ways to deal with the problem. For example, we can fill each

missing value with the special value "unknown" or with the most frequent value of the attribute if the attribute is discrete. If the attribute is continuous, we can use the mean of the attribute for each missing value. The decision tree algorithm in C4.5 takes another approach: at a tree node, it distributes the training example with a missing value for the attribute to each branch of the tree proportionally, according to the distribution of the training examples that do have values for the attribute.

Handling Skewed Class Distribution: In many applications, the proportions of data for different classes can be very different. For instance, in a data set for intrusion detection in computer networks, the proportion of intrusion cases is extremely small (< 1%) compared with normal cases. Directly applying the decision tree algorithm for classification or prediction of intrusions is usually not effective: the resulting decision tree often consists of a single leaf node "normal", which is useless for intrusion detection. One way to deal with the problem is to over-sample the intrusion examples to increase their proportion. Another solution is to rank the new cases according to how likely they are to be intrusions; the human users can then investigate the top-ranked cases.

3.3 Classifier Evaluation

After a classifier is constructed, it needs to be evaluated for accuracy. Effective evaluation is crucial because without knowing the approximate accuracy of a classifier, it cannot be used in real-world tasks.

There are many ways to evaluate a classifier, and there are also many measures. The main measure is the classification accuracy (Equation 1), which is the number of correctly classified instances in the test set divided by the total number of instances in the test set. Some researchers also use the error rate, which is 1 - accuracy. Clearly, if we have several classifiers, the one with the highest accuracy is preferred. Statistical significance tests may be used to check whether one classifier's accuracy is significantly better than that of another given the same training and test data sets. Below, we first present several common methods for classifier evaluation, and then introduce some other evaluation measures.

3.3.1 Evaluation Methods

Holdout Set: The available data D is divided into two disjoint subsets, the training set Dtrain and the test set Dtest, with D = Dtrain union Dtest and Dtrain intersect Dtest = {}.

The test set is also called the holdout set. This method is mainly used when the data set D is large. Note that the examples in the original data set D are all labeled with classes.

As we discussed earlier, the training set is used for learning a classifier while the test set is used for evaluating the resulting classifier. The training set should not be used to evaluate the classifier, because the classifier is biased toward the training set. That is, the classifier may overfit the training set, which results in very high accuracy on the training set but low accuracy on the test set. Using the unseen test set gives an unbiased estimate of the classification accuracy. As for what percentage of the data should be used for training and what percentage for testing, it depends on the data set size; two thirds for training and one third for testing are commonly used.

To partition D into training and test sets, we can use a few approaches:
1. We randomly sample a set of training examples from D for learning and use the rest for testing.
2. If the data is collected over time, then we can use the earlier part of the data for training/learning and the later part of the data for testing. In many applications, this is a more suitable approach because, when the classifier is used in the real world, the data come from the future. This approach thus better reflects the dynamic aspects of applications.

Multiple Random Sampling: When the available data set is small, using the above methods can be unreliable because the test set would be too small to be representative. One approach to deal with the problem is to perform the above random sampling n times. Each time a different training set and a different test set are produced, which gives n accuracies. The final estimated accuracy on the data is the average of the n accuracies.

Cross-Validation: When the data set is small, the n-fold cross-validation method is very commonly used. In this method, the available data is partitioned into n equal-size disjoint subsets. Each subset is then used in turn as the test set, with the remaining n-1 subsets combined as the training set to learn a classifier. This procedure is run n times, which gives n accuracies. The final estimated accuracy of learning from this data set is the average of the n accuracies. 10-fold and 5-fold cross-validations are often used.

A special case of cross-validation is leave-one-out cross-validation. In this method, each fold of the cross-validation has only a single test example and all the rest of the data is used for training. That is, if the original data has m examples, then this is m-fold cross-validation. This method is normally used when the available data is very small. It is not efficient for a large data set, as m classifiers need to be built.
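A minimal sketch of n-fold cross-validation in Python; the train() and predict() callables are placeholders for any learning algorithm and are assumptions of this example, not names used in the book.

```python
def cross_validate(D, train, predict, n=10):
    """Average accuracy over n folds; D is a list of (attributes, class) pairs.

    train(training_rows) -> model and predict(model, attributes) -> class are
    placeholders for any supervised learning algorithm.  The data is normally
    shuffled before the folds are formed.
    """
    folds = [D[i::n] for i in range(n)]          # n roughly equal-size disjoint subsets
    accuracies = []
    for i in range(n):
        test = folds[i]                          # fold i is held out for testing
        training = [row for j, fold in enumerate(folds) if j != i for row in fold]
        model = train(training)                  # learn on the remaining n-1 folds
        correct = sum(1 for x, y in test if predict(model, x) == y)
        accuracies.append(correct / len(test))
    return sum(accuracies) / n                   # the final estimated accuracy
```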

In Sect. 3.2.4, we mentioned that a validation set can be used to prune a decision tree or a set of rules. If a validation set is employed for that purpose, it should not be used in testing; in that case, the available data is divided into three subsets: a training set, a validation set and a test set. Apart from helping tree or rule pruning, a validation set is also used frequently to estimate parameters in learning algorithms. In such cases, the values that give the best accuracy on the validation set are used as the final values of the parameters. Cross-validation can be used for parameter estimation as well; a separate validation set is then not needed, and the whole training set is used in the cross-validation.

3.3.2 Precision, Recall, F-score and Breakeven Point

In some applications, we are only interested in one class. This is particularly true for text and Web applications. For example, we may be interested only in the documents or Web pages of a particular topic. Also, in classification involving skewed or highly imbalanced data, e.g., network intrusion and financial fraud detection, we are typically interested only in the minority class. The class that the user is interested in is commonly called the positive class, and the rest are negative classes (the negative classes may be combined into one negative class). Accuracy is not a suitable measure in such cases because we may achieve a very high accuracy without identifying a single intrusion. For instance, if 99% of the cases in an intrusion detection data set are normal, a classifier can achieve 99% accuracy without doing anything, by simply classifying every test case as "not intrusion". This is, however, useless.

Precision and recall are more suitable in such applications because they measure how precise and how complete the classification is on the positive class. It is convenient to introduce these measures using a confusion matrix (Table 3.2). A confusion matrix contains information about the actual and predicted results given by a classifier.

Table 3.2. Confusion matrix of a classifier

                   Classified positive   Classified negative
Actual positive    TP                    FN
Actual negative    FP                    TN

where
TP: the number of correct classifications of positive examples (true positives),
FN: the number of incorrect classifications of positive examples (false negatives),
FP: the number of incorrect classifications of negative examples (false positives),
TN: the number of correct classifications of negative examples (true negatives).

Based on the confusion matrix, the precision (p) and recall (r) of the positive class are defined as follows:

    p = TP / (TP + FP),        r = TP / (TP + FN).    (6)

In words, precision p is the number of correctly classified positive examples divided by the total number of examples that are classified as positive. Recall r is the number of correctly classified positive examples divided by the total number of actual positive examples in the test set. The intuitive meanings of these two measures are quite obvious.

However, it is hard to compare classifiers based on two measures that are not functionally related. For a test set, the precision may be very high while the recall is very low, and vice versa.

Example 11: A test data set has 100 positive examples and 1000 negative examples. After classification using a classifier, we have the following confusion matrix (Table 3.3):

Table 3.3. Confusion matrix of a classifier

                   Classified positive   Classified negative
Actual positive    1                     99
Actual negative    0                     1000

This confusion matrix gives the precision p = 100% and the recall r = 1%, because we classified only one positive example correctly and classified no negative examples wrongly.

Although in theory precision and recall are not related, in practice high precision is achieved almost always at the expense of recall, and high recall at the expense of precision. Which measure is more important depends on the nature of the application. If we need a single measure to compare different classifiers, the F-score is often used:

    F = 2pr / (p + r).    (7)

The F-score (also called the F1-score) is the harmonic mean of precision and recall:

    F = 2 / (1/p + 1/r).    (8)

The harmonic mean of two numbers tends to be closer to the smaller of the two. Thus, for the F-score to be high, both p and r must be high.
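A small Python sketch of Equations (6)-(8), using the counts from the confusion matrix of Example 11.

```python
def precision_recall_f(tp, fp, fn):
    """Equations (6) and (7) computed from confusion-matrix counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0     # harmonic mean of p and r
    return p, r, f

# The classifier of Example 11 / Table 3.3: TP = 1, FP = 0, FN = 99
print(precision_recall_f(tp=1, fp=0, fn=99))      # (1.0, 0.01, ~0.0198): p = 100%, r = 1%
```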

There is also another measure, called the precision and recall breakeven point, which is used in the information retrieval community. The breakeven point is reached when the precision and the recall are equal. This measure assumes that the test cases can be ranked by the classifier based on their likelihoods of being positive. For instance, in decision tree classification, we can use the confidence of each leaf node as the value to rank test cases.

Example 12: Consider a ranking of 10 test documents, where rank 1 is the highest rank, rank 10 is the lowest rank, and + (-) marks an actual positive (negative) document. Assume that the test set has 10 positive examples.

At rank 1:  p = 1/1 = 100%    r = 1/10 = 10%
At rank 2:  p = 2/2 = 100%    r = 2/10 = 20%
...
At rank 9:  p = 6/9 = 66.7%   r = 6/10 = 60%
At rank 10: p = 7/10 = 70%    r = 7/10 = 70%

The breakeven point is p = r = 70%. Note that interpolation is needed if such a point cannot be found.

3.4 Rule Induction

In Sect. 3.2, we showed that a decision tree can be converted to a set of rules. Clearly, the set of rules can be used for classification in the same way as the tree. A natural question is whether it is possible to learn classification rules directly. The answer is yes. The process of learning such rules is called rule induction or rule learning. We study two approaches in this section.

3.4.1 Sequential Covering

Most rule induction systems use an algorithm called sequential covering. A classifier built with this algorithm consists of a list of rules, which is also called a decision list [463]. In the list, the ordering of the rules is significant. The basic idea of sequential covering is to learn a list of rules sequentially, one at a time, to cover the training data. After each rule is learned, the training examples covered by the rule are removed, and only the remaining data are used to learn subsequent rules.


Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS46: Mnng Massve Datasets Jure Leskovec, Stanford Unversty http://cs46.stanford.edu /19/013 Jure Leskovec, Stanford CS46: Mnng Massve Datasets, http://cs46.stanford.edu Perceptron: y = sgn( x Ho to fnd

More information

ON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE

ON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE Yordzhev K., Kostadnova H. Інформаційні технології в освіті ON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE Yordzhev K., Kostadnova H. Some aspects of programmng educaton

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

Feature Selection as an Improving Step for Decision Tree Construction

Feature Selection as an Improving Step for Decision Tree Construction 2009 Internatonal Conference on Machne Learnng and Computng IPCSIT vol.3 (2011) (2011) IACSIT Press, Sngapore Feature Selecton as an Improvng Step for Decson Tree Constructon Mahd Esmael 1, Fazekas Gabor

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

On Some Entertaining Applications of the Concept of Set in Computer Science Course

On Some Entertaining Applications of the Concept of Set in Computer Science Course On Some Entertanng Applcatons of the Concept of Set n Computer Scence Course Krasmr Yordzhev *, Hrstna Kostadnova ** * Assocate Professor Krasmr Yordzhev, Ph.D., Faculty of Mathematcs and Natural Scences,

More information

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Determining the Optimal Bandwidth Based on Multi-criterion Fusion Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn

More information

Determining Fuzzy Sets for Quantitative Attributes in Data Mining Problems

Determining Fuzzy Sets for Quantitative Attributes in Data Mining Problems Determnng Fuzzy Sets for Quanttatve Attrbutes n Data Mnng Problems ATTILA GYENESEI Turku Centre for Computer Scence (TUCS) Unversty of Turku, Department of Computer Scence Lemmnkäsenkatu 4A, FIN-5 Turku

More information

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc.

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc. [Type text] [Type text] [Type text] ISSN : 0974-74 Volume 0 Issue BoTechnology 04 An Indan Journal FULL PAPER BTAIJ 0() 04 [684-689] Revew on Chna s sports ndustry fnancng market based on market -orented

More information

A Deflected Grid-based Algorithm for Clustering Analysis

A Deflected Grid-based Algorithm for Clustering Analysis A Deflected Grd-based Algorthm for Clusterng Analyss NANCY P. LIN, CHUNG-I CHANG, HAO-EN CHUEH, HUNG-JEN CHEN, WEI-HUA HAO Department of Computer Scence and Informaton Engneerng Tamkang Unversty 5 Yng-chuan

More information

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET 1 BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET TZU-CHENG CHUANG School of Electrcal and Computer Engneerng, Purdue Unversty, West Lafayette, Indana 47907 SAUL B. GELFAND School

More information

CHAPTER 3 SEQUENTIAL MINIMAL OPTIMIZATION TRAINED SUPPORT VECTOR CLASSIFIER FOR CANCER PREDICTION

CHAPTER 3 SEQUENTIAL MINIMAL OPTIMIZATION TRAINED SUPPORT VECTOR CLASSIFIER FOR CANCER PREDICTION 48 CHAPTER 3 SEQUENTIAL MINIMAL OPTIMIZATION TRAINED SUPPORT VECTOR CLASSIFIER FOR CANCER PREDICTION 3.1 INTRODUCTION The raw mcroarray data s bascally an mage wth dfferent colors ndcatng hybrdzaton (Xue

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

Data Representation in Digital Design, a Single Conversion Equation and a Formal Languages Approach

Data Representation in Digital Design, a Single Conversion Equation and a Formal Languages Approach Data Representaton n Dgtal Desgn, a Sngle Converson Equaton and a Formal Languages Approach Hassan Farhat Unversty of Nebraska at Omaha Abstract- In the study of data representaton n dgtal desgn and computer

More information

SI485i : NLP. Set 5 Using Naïve Bayes

SI485i : NLP. Set 5 Using Naïve Bayes SI485 : NL Set 5 Usng Naïve Baes Motvaton We want to predct somethng. We have some text related to ths somethng. somethng = target label text = text features Gven, what s the most probable? Motvaton: Author

More information

Unsupervised Learning

Unsupervised Learning Pattern Recognton Lecture 8 Outlne Introducton Unsupervsed Learnng Parametrc VS Non-Parametrc Approach Mxture of Denstes Maxmum-Lkelhood Estmates Clusterng Prof. Danel Yeung School of Computer Scence and

More information

Implementation Naïve Bayes Algorithm for Student Classification Based on Graduation Status

Implementation Naïve Bayes Algorithm for Student Classification Based on Graduation Status Internatonal Journal of Appled Busness and Informaton Systems ISSN: 2597-8993 Vol 1, No 2, September 2017, pp. 6-12 6 Implementaton Naïve Bayes Algorthm for Student Classfcaton Based on Graduaton Status

More information

Programming in Fortran 90 : 2017/2018

Programming in Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

Learning to Classify Documents with Only a Small Positive Training Set

Learning to Classify Documents with Only a Small Positive Training Set Learnng to Classfy Documents wth Only a Small Postve Tranng Set Xao-L L 1, Bng Lu 2, and See-Kong Ng 1 1 Insttute for Infocomm Research, Heng Mu Keng Terrace, 119613, Sngapore 2 Department of Computer

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK L-qng Qu, Yong-quan Lang 2, Jng-Chen 3, 2 College of Informaton Scence and Technology, Shandong Unversty of Scence and Technology,

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

Machine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law)

Machine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law) Machne Learnng Support Vector Machnes (contans materal adapted from talks by Constantn F. Alfers & Ioanns Tsamardnos, and Martn Law) Bryan Pardo, Machne Learnng: EECS 349 Fall 2014 Support Vector Machnes

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

Investigating the Performance of Naïve- Bayes Classifiers and K- Nearest Neighbor Classifiers

Investigating the Performance of Naïve- Bayes Classifiers and K- Nearest Neighbor Classifiers Journal of Convergence Informaton Technology Volume 5, Number 2, Aprl 2010 Investgatng the Performance of Naïve- Bayes Classfers and K- Nearest Neghbor Classfers Mohammed J. Islam *, Q. M. Jonathan Wu,

More information

Learning-Based Top-N Selection Query Evaluation over Relational Databases

Learning-Based Top-N Selection Query Evaluation over Relational Databases Learnng-Based Top-N Selecton Query Evaluaton over Relatonal Databases Lang Zhu *, Wey Meng ** * School of Mathematcs and Computer Scence, Hebe Unversty, Baodng, Hebe 071002, Chna, zhu@mal.hbu.edu.cn **

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Problem Set 3 Solutions

Problem Set 3 Solutions Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

Sequential search. Building Java Programs Chapter 13. Sequential search. Sequential search

Sequential search. Building Java Programs Chapter 13. Sequential search. Sequential search Sequental search Buldng Java Programs Chapter 13 Searchng and Sortng sequental search: Locates a target value n an array/lst by examnng each element from start to fnsh. How many elements wll t need to

More information

Insertion Sort. Divide and Conquer Sorting. Divide and Conquer. Mergesort. Mergesort Example. Auxiliary Array

Insertion Sort. Divide and Conquer Sorting. Divide and Conquer. Mergesort. Mergesort Example. Auxiliary Array Inserton Sort Dvde and Conquer Sortng CSE 6 Data Structures Lecture 18 What f frst k elements of array are already sorted? 4, 7, 1, 5, 1, 16 We can shft the tal of the sorted elements lst down and then

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

MATHEMATICS FORM ONE SCHEME OF WORK 2004

MATHEMATICS FORM ONE SCHEME OF WORK 2004 MATHEMATICS FORM ONE SCHEME OF WORK 2004 WEEK TOPICS/SUBTOPICS LEARNING OBJECTIVES LEARNING OUTCOMES VALUES CREATIVE & CRITICAL THINKING 1 WHOLE NUMBER Students wll be able to: GENERICS 1 1.1 Concept of

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

Sorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions

Sorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions Sortng Revew Introducton to Algorthms Qucksort CSE 680 Prof. Roger Crawfs Inserton Sort T(n) = Θ(n 2 ) In-place Merge Sort T(n) = Θ(n lg(n)) Not n-place Selecton Sort (from homework) T(n) = Θ(n 2 ) In-place

More information

A Statistical Model Selection Strategy Applied to Neural Networks

A Statistical Model Selection Strategy Applied to Neural Networks A Statstcal Model Selecton Strategy Appled to Neural Networks Joaquín Pzarro Elsa Guerrero Pedro L. Galndo joaqun.pzarro@uca.es elsa.guerrero@uca.es pedro.galndo@uca.es Dpto Lenguajes y Sstemas Informátcos

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

Chapter 6 Programmng the fnte element method Inow turn to the man subject of ths book: The mplementaton of the fnte element algorthm n computer programs. In order to make my dscusson as straghtforward

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

Detection of hand grasping an object from complex background based on machine learning co-occurrence of local image feature

Detection of hand grasping an object from complex background based on machine learning co-occurrence of local image feature Detecton of hand graspng an object from complex background based on machne learnng co-occurrence of local mage feature Shnya Moroka, Yasuhro Hramoto, Nobutaka Shmada, Tadash Matsuo, Yoshak Shra Rtsumekan

More information

CAN COMPUTERS LEARN FASTER? Seyda Ertekin Computer Science & Engineering The Pennsylvania State University

CAN COMPUTERS LEARN FASTER? Seyda Ertekin Computer Science & Engineering The Pennsylvania State University CAN COMPUTERS LEARN FASTER? Seyda Ertekn Computer Scence & Engneerng The Pennsylvana State Unversty sertekn@cse.psu.edu ABSTRACT Ever snce computers were nvented, manknd wondered whether they mght be made

More information

5 The Primal-Dual Method

5 The Primal-Dual Method 5 The Prmal-Dual Method Orgnally desgned as a method for solvng lnear programs, where t reduces weghted optmzaton problems to smpler combnatoral ones, the prmal-dual method (PDM) has receved much attenton

More information

Pruning Training Corpus to Speedup Text Classification 1

Pruning Training Corpus to Speedup Text Classification 1 Prunng Tranng Corpus to Speedup Text Classfcaton Jhong Guan and Shugeng Zhou School of Computer Scence, Wuhan Unversty, Wuhan, 430079, Chna hguan@wtusm.edu.cn State Key Lab of Software Engneerng, Wuhan

More information

A Clustering Algorithm for Chinese Adjectives and Nouns 1

A Clustering Algorithm for Chinese Adjectives and Nouns 1 Clusterng lgorthm for Chnese dectves and ouns Yang Wen, Chunfa Yuan, Changnng Huang 2 State Key aboratory of Intellgent Technology and System Deptartment of Computer Scence & Technology, Tsnghua Unversty,

More information

Under-Sampling Approaches for Improving Prediction of the Minority Class in an Imbalanced Dataset

Under-Sampling Approaches for Improving Prediction of the Minority Class in an Imbalanced Dataset Under-Samplng Approaches for Improvng Predcton of the Mnorty Class n an Imbalanced Dataset Show-Jane Yen and Yue-Sh Lee Department of Computer Scence and Informaton Engneerng, Mng Chuan Unversty 5 The-Mng

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Query Clustering Using a Hybrid Query Similarity Measure

Query Clustering Using a Hybrid Query Similarity Measure Query clusterng usng a hybrd query smlarty measure Fu. L., Goh, D.H., & Foo, S. (2004). WSEAS Transacton on Computers, 3(3), 700-705. Query Clusterng Usng a Hybrd Query Smlarty Measure Ln Fu, Don Hoe-Lan

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned

More information

Three supervised learning methods on pen digits character recognition dataset

Three supervised learning methods on pen digits character recognition dataset Three supervsed learnng methods on pen dgts character recognton dataset Chrs Flezach Department of Computer Scence and Engneerng Unversty of Calforna, San Dego San Dego, CA 92093 cflezac@cs.ucsd.edu Satoru

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

Classification / Regression Support Vector Machines

Classification / Regression Support Vector Machines Classfcaton / Regresson Support Vector Machnes Jeff Howbert Introducton to Machne Learnng Wnter 04 Topcs SVM classfers for lnearly separable classes SVM classfers for non-lnearly separable classes SVM

More information

Petri Net Based Software Dependability Engineering

Petri Net Based Software Dependability Engineering Proc. RELECTRONIC 95, Budapest, pp. 181-186; October 1995 Petr Net Based Software Dependablty Engneerng Monka Hener Brandenburg Unversty of Technology Cottbus Computer Scence Insttute Postbox 101344 D-03013

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

Enhancement of Infrequent Purchased Product Recommendation Using Data Mining Techniques

Enhancement of Infrequent Purchased Product Recommendation Using Data Mining Techniques Enhancement of Infrequent Purchased Product Recommendaton Usng Data Mnng Technques Noraswalza Abdullah, Yue Xu, Shlomo Geva, and Mark Loo Dscplne of Computer Scence Faculty of Scence and Technology Queensland

More information

SVM-based Learning for Multiple Model Estimation

SVM-based Learning for Multiple Model Estimation SVM-based Learnng for Multple Model Estmaton Vladmr Cherkassky and Yunqan Ma Department of Electrcal and Computer Engneerng Unversty of Mnnesota Mnneapols, MN 55455 {cherkass,myq}@ece.umn.edu Abstract:

More information

Today s Outline. Sorting: The Big Picture. Why Sort? Selection Sort: Idea. Insertion Sort: Idea. Sorting Chapter 7 in Weiss.

Today s Outline. Sorting: The Big Picture. Why Sort? Selection Sort: Idea. Insertion Sort: Idea. Sorting Chapter 7 in Weiss. Today s Outlne Sortng Chapter 7 n Wess CSE 26 Data Structures Ruth Anderson Announcements Wrtten Homework #6 due Frday 2/26 at the begnnng of lecture Proect Code due Mon March 1 by 11pm Today s Topcs:

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervsed Learnng and Clusterng Why consder unlabeled samples?. Collectng and labelng large set of samples s costly Gettng recorded speech s free, labelng s tme consumng 2. Classfer could be desgned

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Hierarchical clustering for gene expression data analysis

Hierarchical clustering for gene expression data analysis Herarchcal clusterng for gene expresson data analyss Gorgo Valentn e-mal: valentn@ds.unm.t Clusterng of Mcroarray Data. Clusterng of gene expresson profles (rows) => dscovery of co-regulated and functonally

More information

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming CS 4/560 Desgn and Analyss of Algorthms Kent State Unversty Dept. of Math & Computer Scence LECT-6 Dynamc Programmng 2 Dynamc Programmng Dynamc Programmng, lke the dvde-and-conquer method, solves problems

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information