Comparison of Performance in Text Mining using Categorization of Unstructured Data

Size: px
Start display at page:

Download "Comparison of Performance in Text Mining using Categorization of Unstructured Data"

Transcription

1 Indan Journal of Scence and Technology, Vol 9(4), DOI: /jst/06/v94/9648, June 06 ISSN (Prnt) : ISSN (Onlne) : Comparson of Performance n Text Mnng usng Categorzaton of Unstructured Data Lee Junyeon *, Shn Seungsoo and Km Jungju 3 Department of Meda Engneerng, Tongmyung Unversty, Korea; jylee@tu.ac.kr Department of Informaton Securty, Tongmyung Unversty, Korea; shnss@tu.ac.kr 3 Choonhae College of Health, Korea; jj79005@hanmal.net Abstract Background/Objectves: The text mnng would help fndng nformaton to the users n the enormous documents. The text mnng has been actvely developed and utlzed n varous felds, manly Englsh-based document, but Study on the Korean text mnng has been relatvely lmted. The mportance of the Korean text mnng has emerged wth ncreasng bg data ncludng Korean text data, the needs for the ntensve study and applcaton of Bg Data are ncreasng. Methods/ Statstcal Analyss: In ths study, we compared the performance of these classfcatons by applyng the method of Bayesan methods, k-nn, decson trees, SVM, and as a neural network n classfcaton of unstructured newspaper artcle nto gven categores. Fndngs: In the experment result, the SVM model has a hgh F-measure value relatve to other models, and has shown stable results n the classfcaton nformaton and recall rate. Also, ths model showed a hgh F-measure value n the classfcaton of a more granular lst. Applcaton/Improvements: The methods of k-nn and decson tree show slghtly lower performance than SVM, they are turned out to be approprate models usng classfcaton problem cause of havng advantages to easy nterpretaton and short learnng tme. Keywords: Categorzaton, Decson Tree, k-nn, Nave Bayes, Text Mnng. Introducton The goal of nformaton access s to help users fnd documents that satsfy ther nformaton needs. The standard procedure s akn to lookng for needles n a needlestack - the problem sn t so much that the desred nformaton s not known, but rather that the desred nformaton coexsts wth many other vald peces of nformaton. The goal of data mnng s to dscover or derve new nformaton from data, fndng patterns across datasets, and/or separatng sgnal from nose. The fact that an nformaton retreval system can return a document that contans the nformaton a user requested does not mply that a new dscovery has been made: the nformaton had to have already been known to the author of the text; otherwse the author could not have wrtten t down. A large database treated n data mnng can be classfed nto an unstructured db and structured db accordng to the structure. Data mnng has been developed mostly focused based on the structured database technques up to now. Therefore, data mnng nvestgators were aware of the need to research on the unstructured databases, there have recently been ncreasng studes wth them. Text mnng can be dvded nto data processng and data analyss. Data processng s a step of machnng to facltate the unstructured data to the data analyss, and data analyss s extractng meanngful nformaton from the text usng data mnng, machne learnng, statstcs. <Table > shows the classfcaton of data mnng and text data mnng applcatons. The researches n text mnng of textual materal are n progress due to the ncreasng of amount of data. Extractng a pattern or relatonshp from the text data, such as web pages, electronc documents, electronc mal, a process of managng the new nformaton, the classfcaton technques to mnng technques to the text data based on a gven keyword assgnng artcle and wthout advance nformaton, there are smlar bndng document clusterng technques to group. Document *Author for correspondence

2 Comparson of Performance n Text Mnng usng Categorzaton of Unstructured Data Table. A classfcaton of data mnng and text data mnng applcatons Non-textual data Textual data Fndng Patterns standard data mnng computatonal lngustcs Fndng Nuggets Novel classfcaton scheme s classfed accordng to a gven keyword, and accordng to a gven set of keywords, the document s determned whether classfed nto the category or not. Document classfcaton allows convenent access to the user a number of documents by structurng categores automatcally. So, t has emerged as a very mportant factor for effcent nformaton management and search. Utlzng a document classfcaton scheme of text mnng n prevous studes usng a neural network and k-nn classfcaton nto four categores, t was compared to the performance 3. In ths paper, we classfy the bg data consstng of Hangul and compared the performance between the varous classfcaton technques. In chapter, we descrbe how to formalze unstructured text data. And chapter 3 says theoretcal background of varous classfcaton technques, and chapter 4 extracts the result by applyng real data. Fnally chapter 5 presents the conclusons.? real Textual Data Mnng Non-Novel database queres nformaton retreval Fgure s a fgure showng a process to conduct a morpheme analyss to extract a keyword correspondng to the noun. In the morphologcal analyss, the scheme excluded the postpostonal partcle, verb, etc., and extracted only the classfcaton word as a noun. And then, the extracted nouns are structured to apply a statstcal analyss technque. Because stop words do not gve any nformaton wth appearng n several or every documents, the extracted nouns except wth the stop words wll be used.. Extractng General Format Document should convert nto structured data to be sutable for analyss because the unstructured data. A document can be expressed as a matrx of columns ndcatng the lne that represents a keyword extractng a set of documents by usng the concept of a set of words and documents 4. Ths matrx s called word-document matrx, t can be expressed n the form of a matrx of m n words and documents, such as Table. In the Table, f j means the frequency of the -th word s ncluded n the j-th document..3 Weghted Functon To ndcate a better correlaton of words, word-document matrx t can be converted by usng the weght functon.. Text Mnng The purpose of text mnng s to fnd useful patterns from the large documents and t means to apply an algorthm to the method of the text data and the statstcal machne learnng. In the aspect of fndng a pattern and extractng the nformaton from a bg data, text mnng s smlar to data mnng, except usng unformulated data.. Extractng Keywords Words that reflect the content and features of a document called the content word. Frst, we determnes each part of speech by analyzng the morpheme of the sentence n order to extract the content word. Statements nclude varous word of part of speech, and the noun s used to ntroduce and descrbe the concept of a statement. The noun s the most appeared to cent word, so we can revew t the most weght. Fgure. Table..Example of a keyword extracton. The World-Document Matrx Doc Doc j Doc n Word f f j f n Word f f j f n Word m f m f mj f mn Vol 9 (4) June 06 Indan Journal of Scence and Technology

3 Lee Junyeon, Shn Seungsoo and Km Jungju Weght functon s composed of local weght functon and global weght functon 5. Elements of the transformaton matrx s equaton (.) usng a word wth the document j, as can be calculated as the product of a functon of both weght L(,j) represents a local weght functon and G() represents global weght functon. aj = L(, j) G ( ) (.) The local weght functon uses a smple frequency, or a smple log converson or bnomal converson form generally. Smple frequency functon : L (, j) =, Log converson functon : f j L (, j) = log ( + ) f j, Bnomal converson functon : fj, L (, j) = 0 fj, = 0 Regonal weght functon gves weghted value to a word ncluded n document j s represented as L(,j). The weghts of the words for the entre document referred to globally weght functon, and represented as G(). Ths s also known as the term weghts, and are used to supplement the dsadvantages of regonal weght functon does not reflect the characterstcs of the words for the entre collecton of documents 6. Global weghtng functon has a functon usng the functon, the functon representng the words the nformaton that appears n quantty, functon and standardzed concept consderng the frequency and the word s the rato of the ndcated number of documents n the word n the entre document usng the entropy concept as follows, then and t s expressed as follows. The functon usng entropy: G () = + f g j fj log g log ( N) The functon of the word by quanttatve: N G () = log + d The functon consderng the rato (frequency of words vs. number of documents): G ()= g d In the global weght functon g s the frequency wth whch the word appears throughout the document collecton and, d s the number of document, and N means total number of document. Globally weghted usng the concept of entropy has a large value when the emergence of a rare word n the document. If a certan word once all appearance n all documents, f the word s not a characterstc that dfferentates the respectve document locally weghted usng the entropy concept has a value of 0, only appear a certan word one document of the entre document, It has a value of. Ths global weghtng functon usng the standardzaton concept of regonal weghts have the same characterstcs as used n the collecton of documents represents the percentage of words that appear n the document collecton Classfcaton Method 3. Baysan Method The Bayesan method s one of the most wdely used algorthms n the feld of document classfcaton, calculate the posteror probablty that t wll be assgned to each category receves a sngle document usng the Bayesan theory, such as the equaton (3.) 8. Pd ( Cj) P( C j) PC ( j d) = (3.) Pd ( ) In ths Formula, d means random document, Cj means j-th category. P(d) has the same value to all categores, doesn t need to calculate probablty. Bayesan method assumes all the words are ndependent from each other, assgnment of the category that occurs n the document s that mutually exclusve. So, the P(C j d) can be calculated as follow equaton (3.) PC ( d) = P( C ) P( w C ) j j j = n (3.) In (3.), P(w C j ) means (the number of w s occurrence n C j )/(the number of all words occurrence n C j ), and P(C j ) means (the number of documents allocated n C j )/(the number of all documents). Vol 9 (4) June 06 Indan Journal of Scence and Technology 3

4 Comparson of Performance n Text Mnng usng Categorzaton of Unstructured Data The Baysan method calculates the probablty of beng classfed n each category by assgnng a document to a category havng the maxmum value, because a number of categores are categores to calculate the condtonal probablty for each of the categores havng the hghest probablty belong to the document. 3. k-nn The k-nn classfcaton method s to select one pattern from among the stored pattern wth a dstance of at least the learnng data wth respect to any partcular pattern and classfes the category number whch belongs to the category of the gven pattern 9. The k-nn algorthm s prmarly used to the degree of smlarty between the new document (d x ) and the learnng document (d j ) to fnd the degree of smlarty s hgh top k neghborng documents n document study groups. Representatve degree of smlarty s the cosne smlarty degree s calculated as n equaton (3.3). sm( d, d ) = x j k k t xk t jk (t ) ( t ) xk k jk (3.3) In the equaton (3.3), t xk, t jk means the weght value of word k appeared n d x, d j. The degree of smlarty between the new document and document neghbors s used as the document category, the weght of the neghborhood, f neghborng pages when the category weghts, share category s hgh. When extractng k learnng documents among the new document and the order of smlarty usng the cosne smlarty, the category assgned to each k learnng documents s a canddate lst of categores to be assgned to the new document. The order to fnd the best category to be assgned to the new document category from the canddate lst, calculate the complance wth total frequency of each category or the smlarty score per category such as expresson (3.4) n the learnng document classfcaton. rel( Ck dx) sm( dx, dj) {( Ck d j) (3.4) j In the k-nn algorthm selecton of k t wll have a large mpact on the classfcaton result. Lkely k s not too small, there s a possblty of overchargng as the sum of the nose data for tranng, as opposed to the data groups classfed as near to that k s too large classfcaton exst. 3.3 Decson Tree Decson tree s made a decson rule to classfy the functon. In supervsed learnng problem t s sometmes mportant to the predcton of the fnal model and analyss even more emphass on predctve analyss or accordng to crcumstances. By dvdng the regons of each varable repeatedly wth supervsed learnng technques, decson tree creates a rule wth ncorrect predcton and correct analyss relatvely to the other supervsed learnng technques for the whole area. Rules created by the decson tree has the advantage of beng easy to understand and easy to mplement, because they have f-then structure and smlar to SQL database language format. The analyss of the decson tree s composed of growth of the decson tree, prunng, feasblty study, analyss and forecastng as shown n Table 3. The most wdely used n decson tree algorthm s CHAID. Ths can be appled to all types of the target and classfcaton varables, such as nomnal, the order-type, contnuous type. 3.4 SVM SVM s recevng attenton not only n the classfcaton problem, and also applcable to a regresson problem, accurate and applcable to varous types of data of the predcted predcton easer problem. Unlke the probablty estmaton of the object to mnmze the emprcal rsk, such as the logstc regresson and dscrmnant analyss conventonal methods classfcaton SVM s got wth the purpose to mnmze structural rsk look place only the classfcaton effcency tself over exstng probablty estmaton method overall, hgher predctablty. Table 3. Dvson Growng decson tree prunng feasblty study analyss and forecastng The Analyss Process of Decson Tree Descrpton Grow the trees to fnd an approprate optmum separaton crtera, and stop f they meet the approprate stoppng rules. Remove a branch whch wll ncrease the rsk of an error sgnfcantly or has an mproper nference rules. Evaluate the decson tree usng gan chart, rsk chart or verfcaton data Interpret and apply buld a wooden model after settng a predcton model predctons. 4 Vol 9 (4) June 06 Indan Journal of Scence and Technology

5 Lee Junyeon, Shn Seungsoo and Km Jungju The reason of effectveness n SVM s that t performs lnear classfcaton quckly and easly when the lnearty see the non-lnear data set at a low level at a hgher level and expand ts dmensons. Most of the pattern and can not be lnearly separated, to separate the non-lnear patterns and converts the nput space of the non-lnear pattern n the feature space of the lnear pattern. At ths tme, the kernel functons are used to classfy non-lnear patterns. K( xx, ) = (( x x ) + ) (3.5) Polynomal kernel functon s dependent on the drecton between the two vectors, a vector havng the same drecton, they eventually result there s gven a hgh value s for a polynomal kernel functon, a polynomal kernel functon s the same as equaton (3.5). 4. Experments In ths study, we used a Korea Herald artcle for one month (n January 06). Total 340 artcles related to North Korea and nuclear are used n the analyss. It was repeated 0 tmes performed to ensure the valdty of the generalzed model buldng, and analyzed through the same procedure as n <Fgure >. 4. Structurng Data Snce the newspaper artcle s unstructured data, we perform the structurng process at frst to apply formal analyss technques. The frst step of structurng s to extract a keyword from the data, the newspaper artcle, comprses a varety of combnatons, such as letters, specal characters, numbers. Table 4 shows a part of newspaper artcle and the keywords extracted. Even though some nouns appeared n all documents wthout meanng, we exclude them and put to the stop words. The vector of the rough wth the keywords extracted noun s created for every artcle. Usng ths vector, word-document matrx such as Table 5 was created. Table 5 s a part of word-document matrx generated by confgurng each of the newspaper artcles. In the vector, column s words extracted from artcle, and row s each newspaper artcle number. In Table 5, the number means Doc 8, Doc 9 and Doc nclude the word coal n Fgure. Analyss Process. d Table 4. Keywords Extracton Artcle Chna could stop buyng coal from North Korea to punsh the ally for ts recent nuclear test, a South Korean expert suggested Thursday amd calls for Bejng to take a frmer stance aganst Pyongyang. Chna has come under growng pressure from South Korea and the Unted States to help draw a strong sanctons resoluton from the U.N. Securty Councl to punsh the North for ts fourth nuclear test last week. Cho Kyung-soo, presdent of the North Korea Resources Insttute n Seoul, noted the North s hgh relance on trade wth Chna. Coal exports account for nearly half of all North Korean exports to Chna, he sad n a phone ntervew wth Yonhap News Agency. North Korea earned $.84 bllon from exports to Chna n 04, nearly 90 percent of the $3.6 bllon earned n total, accordng to data from the nsttute and the Korea Trade-Investment Promoton Agency. Table 5. terms Word-Document Matrx documents Extracted nouns coal, North Korea, ally, nuclear, test, South Korea, Thursday, calls, Bejng, stance, Pyongyang, Chna, pressure, Unted States, sancton, resoluton, U.N. Securty Councl, week, Cho Kyungsoo, presdent, Resources Insttute, Seoul, relance, trade, half, he, phone, ntervew, Yonhap News Agency, bllon, percent, data, Korea Trade Investment Pormoton Agency Coach Coagulaton Coal Coalton Coast Coat Coater Coatng Coatrack Coatroom the artcle. Word-document matrx s the basc form of a newspaper artcle whch converts unstructured data nto structured data. Ths study s due to the newspaper s vew to applyng classfcaton technques to formalze the newspaper nto one object, word-document matrx was appled to Vol 9 (4) June 06 Indan Journal of Scence and Technology 5

6 Comparson of Performance n Text Mnng usng Categorzaton of Unstructured Data the pre-analyss. The matrx transposton s located on the lne newspapers, and a consderable number of words havng the meanng of the word matrx rather than a varable number of newspaper artcles that correspond to the nature of the data object to the locaton column. 4. Comparson of Models To evaluate the performance of the classfcaton model, t was used for Correct Classfed Rate (CCR), Recall Rate (RR), and F-measure to be used n document classfcaton crtera. The Correct Classfed Rate s the rate of correct predcton, and Recall Rate s the rato actually ht accurate predctons. And F-measure means the combnatonal mean of CCR and RR, and ths s convenent expresson method to compare models. The calculaton of CCR and RR and F-measure can be wrtten as equaton (4.) usng the actual stuaton and predcton for the classfcaton n Table 6. A CCR : p = A+ B A RR : r = A+ C Fpr (, ) = pr p + r (4.) In order to compare the performance of the artcle were classfed usng the F-M value usng a CCR and RR, exhbted the model F-Measure specfc value correspondng to 0 tmes the Table 7. Table 6. The Classfcaton Table of Correctness n Predcton Model Predcton Table 7. Category Model Congruty Real Incongruty Congruty A B Incongruty C D The Mean and Stand Devaton n each Model Mean F-measure Standard Devaton Baysan k-nn Decson Tree SVM SVM model has the best F-measure value (59%) model showed the value compared to the other models n ths category. 5. Conclusons In ths paper, we converted the unstructured data to structured usng text mnng technques. And we compared those data by applyng the materal to the conventonal technques of Bayesan statstcal classfcaton, k-nn, decson tree, SVM to form a predcton model. After morphologcal analyss, we extracted only for the classfcaton word as a noun. By applyng a stopwords dctonary wth respect to the extracted to remove words that can cause nose n the analyss and the word-document matrx was produced usng the remanng words. By weghtng the matrx to ndcate a good correlaton between the word and document, t was converted nto a smple frequency matrx. Model Evaluaton crtera nclude the performance was compared usng a common nformaton classfcaton and rate of recall and F-measure that evaluaton crtera n the classfcaton document. It was also run for 0 teratons performed to ensure the valdty and generalzaton of the model buldng. The F-measure value of SVM model shows better performance than other models. After usng the standardzed Unstructured newspaper artcle n ths paper, applyng to each analyss were compared to the results. Overall, the performance was shown n the model s relatvely low, the reason s estmated to be lmtatons of the automatc document classfcaton algorthm. 5. Acknowledgement Ths work was supported by the Natonal Research Foundaton of Korea Grant funded by the Korean Government (NRF-04SA5B ). 6. References. Hearst Mart A. Untanglng Text Data Mnng. Proceedngs of the 37th annual meetng of the Assocaton for Computatonal Lngustcs on Computatonal Lngustcs. 999; 3(0). DOI: 0.35/ Kho Kwangsoo, Chung Wonkyo, Shn Youngkeun, Park Sangsung, Jang Dongsk. A Study on Development of Patent Informaton Retreval Usng Textmnng. Journal of the Korea Academa-Industral cooperaton Socety. 0; (8): Vol 9 (4) June 06 Indan Journal of Scence and Technology

7 Lee Junyeon, Shn Seungsoo and Km Jungju 3. Cho Taeho. Comparson of Neural Network and k-nn Algorthm for News Artcle Classfcaton. 998 Conference on Korea Informaton Scence Socety. 998; 5(): Bartere MM and Deshmukh PR. Cluster Orented Image Retreval Systems. IJCA Proceedngs on Emergng Trends n Computer Scence and Informaton Technology. 0; ETCSIT(3): Mttermayer M and Knolmayer G. Text Mnng Systems for Market Response to News: A survey, Workng paper n Insttut fur Wrtschaftsnformatk der Unverstat Bern. 006; 84: Chn KK. The Graduate School of the Unversty of Darwn College: Support Vector Machnes appled to speech pattern classfcaton. Dssertaton of PhD Cart, NC, USA: SAS Insttute Inc.: SAS Publshng. SAS Text Mner 4. Reference Van Drel MA, Bruggeman J, Vrend G, Brunner HG and Leunssen JA. A Text-Mnng Analyss of the Human Phenome. European Journal of Human Genetcs. 006; 4(50): Sebastan F. Machne learnng n automated text categorzaton. ACM Computng Surveys. 00; 34():-47. Vol 9 (4) June 06 Indan Journal of Scence and Technology 7

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc.

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc. [Type text] [Type text] [Type text] ISSN : 0974-74 Volume 0 Issue BoTechnology 04 An Indan Journal FULL PAPER BTAIJ 0() 04 [684-689] Revew on Chna s sports ndustry fnancng market based on market -orented

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

Keyword-based Document Clustering

Keyword-based Document Clustering Keyword-based ocument lusterng Seung-Shk Kang School of omputer Scence Kookmn Unversty & AIrc hungnung-dong Songbuk-gu Seoul 36-72 Korea sskang@kookmn.ac.kr Abstract ocument clusterng s an aggregaton of

More information

Implementation Naïve Bayes Algorithm for Student Classification Based on Graduation Status

Implementation Naïve Bayes Algorithm for Student Classification Based on Graduation Status Internatonal Journal of Appled Busness and Informaton Systems ISSN: 2597-8993 Vol 1, No 2, September 2017, pp. 6-12 6 Implementaton Naïve Bayes Algorthm for Student Classfcaton Based on Graduaton Status

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

Announcements. Supervised Learning

Announcements. Supervised Learning Announcements See Chapter 5 of Duda, Hart, and Stork. Tutoral by Burge lnked to on web page. Supervsed Learnng Classfcaton wth labeled eamples. Images vectors n hgh-d space. Supervsed Learnng Labeled eamples

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

Query Clustering Using a Hybrid Query Similarity Measure

Query Clustering Using a Hybrid Query Similarity Measure Query clusterng usng a hybrd query smlarty measure Fu. L., Goh, D.H., & Foo, S. (2004). WSEAS Transacton on Computers, 3(3), 700-705. Query Clusterng Usng a Hybrd Query Smlarty Measure Ln Fu, Don Hoe-Lan

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE

SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE Dorna Purcaru Faculty of Automaton, Computers and Electroncs Unersty of Craoa 13 Al. I. Cuza Street, Craoa RO-1100 ROMANIA E-mal: dpurcaru@electroncs.uc.ro

More information

Outline. Self-Organizing Maps (SOM) US Hebbian Learning, Cntd. The learning rule is Hebbian like:

Outline. Self-Organizing Maps (SOM) US Hebbian Learning, Cntd. The learning rule is Hebbian like: Self-Organzng Maps (SOM) Turgay İBRİKÇİ, PhD. Outlne Introducton Structures of SOM SOM Archtecture Neghborhoods SOM Algorthm Examples Summary 1 2 Unsupervsed Hebban Learnng US Hebban Learnng, Cntd 3 A

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned

More information

Machine Learning 9. week

Machine Learning 9. week Machne Learnng 9. week Mappng Concept Radal Bass Functons (RBF) RBF Networks 1 Mappng It s probably the best scenaro for the classfcaton of two dataset s to separate them lnearly. As you see n the below

More information

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification Introducton to Artfcal Intellgence V22.0472-001 Fall 2009 Lecture 24: Nearest-Neghbors & Support Vector Machnes Rob Fergus Dept of Computer Scence, Courant Insttute, NYU Sldes from Danel Yeung, John DeNero

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

Deep Classification in Large-scale Text Hierarchies

Deep Classification in Large-scale Text Hierarchies Deep Classfcaton n Large-scale Text Herarches Gu-Rong Xue Dkan Xng Qang Yang 2 Yong Yu Dept. of Computer Scence and Engneerng Shangha Jao-Tong Unversty {grxue, dkxng, yyu}@apex.sjtu.edu.cn 2 Hong Kong

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS46: Mnng Massve Datasets Jure Leskovec, Stanford Unversty http://cs46.stanford.edu /19/013 Jure Leskovec, Stanford CS46: Mnng Massve Datasets, http://cs46.stanford.edu Perceptron: y = sgn( x Ho to fnd

More information

RECOGNIZING GENDER THROUGH FACIAL IMAGE USING SUPPORT VECTOR MACHINE

RECOGNIZING GENDER THROUGH FACIAL IMAGE USING SUPPORT VECTOR MACHINE Journal of Theoretcal and Appled Informaton Technology 30 th June 06. Vol.88. No.3 005-06 JATIT & LLS. All rghts reserved. ISSN: 99-8645 www.jatt.org E-ISSN: 87-395 RECOGNIZING GENDER THROUGH FACIAL IMAGE

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

A New Feature of Uniformity of Image Texture Directions Coinciding with the Human Eyes Perception 1

A New Feature of Uniformity of Image Texture Directions Coinciding with the Human Eyes Perception 1 A New Feature of Unformty of Image Texture Drectons Concdng wth the Human Eyes Percepton Xng-Jan He, De-Shuang Huang, Yue Zhang, Tat-Mng Lo 2, and Mchael R. Lyu 3 Intellgent Computng Lab, Insttute of Intellgent

More information

A Novel Term_Class Relevance Measure for Text Categorization

A Novel Term_Class Relevance Measure for Text Categorization A Novel Term_Class Relevance Measure for Text Categorzaton D S Guru, Mahamad Suhl Department of Studes n Computer Scence, Unversty of Mysore, Mysore, Inda Abstract: In ths paper, we ntroduce a new measure

More information

Keywords - Wep page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines

Keywords - Wep page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines (IJCSIS) Internatonal Journal of Computer Scence and Informaton Securty, Herarchcal Web Page Classfcaton Based on a Topc Model and Neghborng Pages Integraton Wongkot Srura Phayung Meesad Choochart Haruechayasak

More information

Performance Evaluation of Information Retrieval Systems

Performance Evaluation of Information Retrieval Systems Why System Evaluaton? Performance Evaluaton of Informaton Retreval Systems Many sldes n ths secton are adapted from Prof. Joydeep Ghosh (UT ECE) who n turn adapted them from Prof. Dk Lee (Unv. of Scence

More information

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

Classification / Regression Support Vector Machines

Classification / Regression Support Vector Machines Classfcaton / Regresson Support Vector Machnes Jeff Howbert Introducton to Machne Learnng Wnter 04 Topcs SVM classfers for lnearly separable classes SVM classfers for non-lnearly separable classes SVM

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

Programming in Fortran 90 : 2017/2018

Programming in Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values

More information

An efficient method to build panoramic image mosaics

An efficient method to build panoramic image mosaics An effcent method to buld panoramc mage mosacs Pattern Recognton Letters vol. 4 003 Dae-Hyun Km Yong-In Yoon Jong-Soo Cho School of Electrcal Engneerng and Computer Scence Kyungpook Natonal Unv. Abstract

More information

CSCI 5417 Information Retrieval Systems Jim Martin!

CSCI 5417 Information Retrieval Systems Jim Martin! CSCI 5417 Informaton Retreval Systems Jm Martn! Lecture 11 9/29/2011 Today 9/29 Classfcaton Naïve Bayes classfcaton Ungram LM 1 Where we are... Bascs of ad hoc retreval Indexng Term weghtng/scorng Cosne

More information

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1 4/14/011 Outlne Dscrmnatve classfers for mage recognton Wednesday, Aprl 13 Krsten Grauman UT-Austn Last tme: wndow-based generc obect detecton basc ppelne face detecton wth boostng as case study Today:

More information

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval Proceedngs of the Thrd NTCIR Workshop Descrpton of NTU Approach to NTCIR3 Multlngual Informaton Retreval Wen-Cheng Ln and Hsn-Hs Chen Department of Computer Scence and Informaton Engneerng Natonal Tawan

More information

Concurrent Apriori Data Mining Algorithms

Concurrent Apriori Data Mining Algorithms Concurrent Apror Data Mnng Algorthms Vassl Halatchev Department of Electrcal Engneerng and Computer Scence York Unversty, Toronto October 8, 2015 Outlne Why t s mportant Introducton to Assocaton Rule Mnng

More information

Virtual Machine Migration based on Trust Measurement of Computer Node

Virtual Machine Migration based on Trust Measurement of Computer Node Appled Mechancs and Materals Onlne: 2014-04-04 ISSN: 1662-7482, Vols. 536-537, pp 678-682 do:10.4028/www.scentfc.net/amm.536-537.678 2014 Trans Tech Publcatons, Swtzerland Vrtual Machne Mgraton based on

More information

Available online at Available online at Advanced in Control Engineering and Information Science

Available online at   Available online at   Advanced in Control Engineering and Information Science Avalable onlne at wwwscencedrectcom Avalable onlne at wwwscencedrectcom Proceda Proceda Engneerng Engneerng 00 (2011) 15000 000 (2011) 1642 1646 Proceda Engneerng wwwelsevercom/locate/proceda Advanced

More information

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Determining the Optimal Bandwidth Based on Multi-criterion Fusion Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn

More information

Parallel Implementation of Classification Algorithms Based on Cloud Computing Environment

Parallel Implementation of Classification Algorithms Based on Cloud Computing Environment TELKOMNIKA, Vol.10, No.5, September 2012, pp. 1087~1092 e-issn: 2087-278X accredted by DGHE (DIKTI), Decree No: 51/Dkt/Kep/2010 1087 Parallel Implementaton of Classfcaton Algorthms Based on Cloud Computng

More information

Recommended Items Rating Prediction based on RBF Neural Network Optimized by PSO Algorithm

Recommended Items Rating Prediction based on RBF Neural Network Optimized by PSO Algorithm Recommended Items Ratng Predcton based on RBF Neural Network Optmzed by PSO Algorthm Chengfang Tan, Cayn Wang, Yuln L and Xx Q Abstract In order to mtgate the data sparsty and cold-start problems of recommendaton

More information

UB at GeoCLEF Department of Geography Abstract

UB at GeoCLEF Department of Geography   Abstract UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department

More information

Pruning Training Corpus to Speedup Text Classification 1

Pruning Training Corpus to Speedup Text Classification 1 Prunng Tranng Corpus to Speedup Text Classfcaton Jhong Guan and Shugeng Zhou School of Computer Scence, Wuhan Unversty, Wuhan, 430079, Chna hguan@wtusm.edu.cn State Key Lab of Software Engneerng, Wuhan

More information

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET 1 BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET TZU-CHENG CHUANG School of Electrcal and Computer Engneerng, Purdue Unversty, West Lafayette, Indana 47907 SAUL B. GELFAND School

More information

Solving two-person zero-sum game by Matlab

Solving two-person zero-sum game by Matlab Appled Mechancs and Materals Onlne: 2011-02-02 ISSN: 1662-7482, Vols. 50-51, pp 262-265 do:10.4028/www.scentfc.net/amm.50-51.262 2011 Trans Tech Publcatons, Swtzerland Solvng two-person zero-sum game by

More information

Reliable Negative Extracting Based on knn for Learning from Positive and Unlabeled Examples

Reliable Negative Extracting Based on knn for Learning from Positive and Unlabeled Examples 94 JOURNAL OF COMPUTERS, VOL. 4, NO. 1, JANUARY 2009 Relable Negatve Extractng Based on knn for Learnng from Postve and Unlabeled Examples Bangzuo Zhang College of Computer Scence and Technology, Jln Unversty,

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

Incremental Learning with Support Vector Machines and Fuzzy Set Theory

Incremental Learning with Support Vector Machines and Fuzzy Set Theory The 25th Workshop on Combnatoral Mathematcs and Computaton Theory Incremental Learnng wth Support Vector Machnes and Fuzzy Set Theory Yu-Mng Chuang 1 and Cha-Hwa Ln 2* 1 Department of Computer Scence and

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

The Study of Remote Sensing Image Classification Based on Support Vector Machine

The Study of Remote Sensing Image Classification Based on Support Vector Machine Sensors & Transducers 03 by IFSA http://www.sensorsportal.com The Study of Remote Sensng Image Classfcaton Based on Support Vector Machne, ZHANG Jan-Hua Key Research Insttute of Yellow Rver Cvlzaton and

More information

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z. TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS Muradalyev AZ Azerbajan Scentfc-Research and Desgn-Prospectng Insttute of Energetc AZ1012, Ave HZardab-94 E-mal:aydn_murad@yahoocom Importance of

More information

Efficient Text Classification by Weighted Proximal SVM *

Efficient Text Classification by Weighted Proximal SVM * Effcent ext Classfcaton by Weghted Proxmal SVM * Dong Zhuang 1, Benyu Zhang, Qang Yang 3, Jun Yan 4, Zheng Chen, Yng Chen 1 1 Computer Scence and Engneerng, Bejng Insttute of echnology, Bejng 100081, Chna

More information

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT 3. - 5. 5., Brno, Czech Republc, EU APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT Abstract Josef TOŠENOVSKÝ ) Lenka MONSPORTOVÁ ) Flp TOŠENOVSKÝ

More information

A Semi-parametric Regression Model to Estimate Variability of NO 2

A Semi-parametric Regression Model to Estimate Variability of NO 2 Envronment and Polluton; Vol. 2, No. 1; 2013 ISSN 1927-0909 E-ISSN 1927-0917 Publshed by Canadan Center of Scence and Educaton A Sem-parametrc Regresson Model to Estmate Varablty of NO 2 Meczysław Szyszkowcz

More information

Related-Mode Attacks on CTR Encryption Mode

Related-Mode Attacks on CTR Encryption Mode Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

Efficiency Comparison of Data Mining Techniques for Missing-Value Imputation

Efficiency Comparison of Data Mining Techniques for Missing-Value Imputation Journal of Industral and Intellgent Informaton Vol. 3, No. 4, December 2015 Effcency Comparson of Mnng Technques for Mssng-Value Imputaton Jarumon Nookhong and Nutthapat Kaewrattanapat Suan Sunandha Rajabhat

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

An Anti-Noise Text Categorization Method based on Support Vector Machines *

An Anti-Noise Text Categorization Method based on Support Vector Machines * An Ant-Nose Text ategorzaton Method based on Support Vector Machnes * hen Ln, Huang Je and Gong Zheng-Hu School of omputer Scence, Natonal Unversty of Defense Technology, hangsha, 410073, hna chenln@nudt.edu.cn,

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

Network Intrusion Detection Based on PSO-SVM

Network Intrusion Detection Based on PSO-SVM TELKOMNIKA Indonesan Journal of Electrcal Engneerng Vol.1, No., February 014, pp. 150 ~ 1508 DOI: http://dx.do.org/10.11591/telkomnka.v1.386 150 Network Intruson Detecton Based on PSO-SVM Changsheng Xang*

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

Simulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010

Simulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010 Smulaton: Solvng Dynamc Models ABE 5646 Week Chapter 2, Sprng 200 Week Descrpton Readng Materal Mar 5- Mar 9 Evaluatng [Crop] Models Comparng a model wth data - Graphcal, errors - Measures of agreement

More information

Learning-Based Top-N Selection Query Evaluation over Relational Databases

Learning-Based Top-N Selection Query Evaluation over Relational Databases Learnng-Based Top-N Selecton Query Evaluaton over Relatonal Databases Lang Zhu *, Wey Meng ** * School of Mathematcs and Computer Scence, Hebe Unversty, Baodng, Hebe 071002, Chna, zhu@mal.hbu.edu.cn **

More information

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal

More information

CS 534: Computer Vision Model Fitting

CS 534: Computer Vision Model Fitting CS 534: Computer Vson Model Fttng Sprng 004 Ahmed Elgammal Dept of Computer Scence CS 534 Model Fttng - 1 Outlnes Model fttng s mportant Least-squares fttng Maxmum lkelhood estmaton MAP estmaton Robust

More information

A Clustering Algorithm for Key Frame Extraction Based on Density Peak

A Clustering Algorithm for Key Frame Extraction Based on Density Peak Journal of Computer and Communcatons, 2018, 6, 118-128 http://www.scrp.org/ournal/cc ISSN Onlne: 2327-5227 ISSN Prnt: 2327-5219 A Clusterng Algorthm for Key Frame Extracton Based on Densty Peak Hong Zhao

More information

Machine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law)

Machine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law) Machne Learnng Support Vector Machnes (contans materal adapted from talks by Constantn F. Alfers & Ioanns Tsamardnos, and Martn Law) Bryan Pardo, Machne Learnng: EECS 349 Fall 2014 Support Vector Machnes

More information

PERFORMANCE EVALUATION FOR SCENE MATCHING ALGORITHMS BY SVM

PERFORMANCE EVALUATION FOR SCENE MATCHING ALGORITHMS BY SVM PERFORMACE EVALUAIO FOR SCEE MACHIG ALGORIHMS BY SVM Zhaohu Yang a, b, *, Yngyng Chen a, Shaomng Zhang a a he Research Center of Remote Sensng and Geomatc, ongj Unversty, Shangha 200092, Chna - yzhac@63.com

More information

SURFACE PROFILE EVALUATION BY FRACTAL DIMENSION AND STATISTIC TOOLS USING MATLAB

SURFACE PROFILE EVALUATION BY FRACTAL DIMENSION AND STATISTIC TOOLS USING MATLAB SURFACE PROFILE EVALUATION BY FRACTAL DIMENSION AND STATISTIC TOOLS USING MATLAB V. Hotař, A. Hotař Techncal Unversty of Lberec, Department of Glass Producng Machnes and Robotcs, Department of Materal

More information

Arabic Text Classification Using N-Gram Frequency Statistics A Comparative Study

Arabic Text Classification Using N-Gram Frequency Statistics A Comparative Study Arabc Text Classfcaton Usng N-Gram Frequency Statstcs A Comparatve Study Lala Khresat Dept. of Computer Scence, Math and Physcs Farlegh Dcknson Unversty 285 Madson Ave, Madson NJ 07940 Khresat@fdu.edu

More information

Optimal Design of Nonlinear Fuzzy Model by Means of Independent Fuzzy Scatter Partition

Optimal Design of Nonlinear Fuzzy Model by Means of Independent Fuzzy Scatter Partition Optmal Desgn of onlnear Fuzzy Model by Means of Independent Fuzzy Scatter Partton Keon-Jun Park, Hyung-Kl Kang and Yong-Kab Km *, Department of Informaton and Communcaton Engneerng, Wonkwang Unversty,

More information

y and the total sum of

y and the total sum of Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton

More information

Object-Based Techniques for Image Retrieval

Object-Based Techniques for Image Retrieval 54 Zhang, Gao, & Luo Chapter VII Object-Based Technques for Image Retreval Y. J. Zhang, Tsnghua Unversty, Chna Y. Y. Gao, Tsnghua Unversty, Chna Y. Luo, Tsnghua Unversty, Chna ABSTRACT To overcome the

More information

Face Recognition Based on SVM and 2DPCA

Face Recognition Based on SVM and 2DPCA Vol. 4, o. 3, September, 2011 Face Recognton Based on SVM and 2DPCA Tha Hoang Le, Len Bu Faculty of Informaton Technology, HCMC Unversty of Scence Faculty of Informaton Scences and Engneerng, Unversty

More information

A Similarity-Based Prognostics Approach for Remaining Useful Life Estimation of Engineered Systems

A Similarity-Based Prognostics Approach for Remaining Useful Life Estimation of Engineered Systems 2008 INTERNATIONAL CONFERENCE ON PROGNOSTICS AND HEALTH MANAGEMENT A Smlarty-Based Prognostcs Approach for Remanng Useful Lfe Estmaton of Engneered Systems Tany Wang, Janbo Yu, Davd Segel, and Jay Lee

More information

Data Mining: Model Evaluation

Data Mining: Model Evaluation Data Mnng: Model Evaluaton Aprl 16, 2013 1 Issues: Evaluatng Classfcaton Methods Accurac classfer accurac: predctng class label predctor accurac: guessng value of predcted attrbutes Speed tme to construct

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervsed Learnng and Clusterng Why consder unlabeled samples?. Collectng and labelng large set of samples s costly Gettng recorded speech s free, labelng s tme consumng 2. Classfer could be desgned

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not

More information

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson

More information

Sequential search. Building Java Programs Chapter 13. Sequential search. Sequential search

Sequential search. Building Java Programs Chapter 13. Sequential search. Sequential search Sequental search Buldng Java Programs Chapter 13 Searchng and Sortng sequental search: Locates a target value n an array/lst by examnng each element from start to fnsh. How many elements wll t need to

More information