Learning from Multiple Related Data Streams with Asynchronous Flowing Speeds


Zhi Qiao, Peng Zhang, Jing He, Jinghua Yan, Li Guo
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China.
School of Engineering & Science, Victoria University, PO Box 14428, Melbourne, Australia
qiaozhi@software.ict.ac.cn, Jing.He@vu.edu.au, {zhangpeng,

Abstract
Related data streams are data streams that can be joined together by matching their join attributes. Existing research on learning from related data streams assumes that all streams arrive at a central processing unit synchronously, so that in an arbitrary sliding window all tuples of the streams can be perfectly joined. This assumption does not hold when related data streams are generated or transferred at different speeds and therefore arrive at the central processing unit asynchronously. In this paper, we argue that asynchronous data streams yield a small portion of perfectly joined examples (i.e., complete examples) and a large portion of partially joined examples (i.e., incomplete examples). Accordingly, we present a new Learning from Complete and Fixed Examples framework that fixes incomplete examples to boost the learning. Experiments on both synthetic and real-world data streams demonstrate that the proposed framework achieves a higher prediction accuracy for learning from related data streams than other simple solutions can offer.

I. INTRODUCTION
Existing work in data stream mining has made great efforts in knowledge discovery for a single data stream [1, 2, 8, 9], but finding patterns from multiple related data streams is still inadequately addressed. In many real-world data stream applications, stream data are collected from different channels with different modalities. In such environments, it is natural to combine multiple data streams to discover the trends and patterns underneath the stream data. Related data streams are data streams that can be joined together by some shared join attributes.
In this paper, we consider the problem of learning from multiple related data streams with asynchronous speeds. Learning from multiple related data streams has been discussed before, but mainly from the perspective of privacy-preserving data stream mining [3]. Those studies assume that related data streams flow synchronously, so that stream data can be perfectly joined in a sliding window through their join attributes. However, in many real-world data stream applications, related data streams may be generated or transferred at different speeds, and thus may arrive at the central processing unit asynchronously. As a result, the join attributes may not match each other perfectly in a sliding window. Based on this observation, our main goal in this paper is to learn from multiple related data streams while taking the asynchronous factor into consideration, which, to the best of our knowledge, has not been addressed before.
To make the concept clear, consider the two examples in Figures 1 and 2, each of which is a snapshot of a sliding window (Figure 1 is a widely used example in the privacy-preserving data stream mining area [3]). In the examples, the target is to discover patterns of profitable trading in stock markets by combining related data streams: the phone call stream between dealers and the managers/staff of public companies, the stock trading stream that records trading actions, and the news stream from a local TV station reporting its forecasts on the stock markets. In Figure 1, all the join attributes (denoted by the arrows connecting the streams) are perfectly matched in the snapshot. This, however, only happens in an ideal situation. A much more common case, shown in Figure 2, is that join attributes are not perfectly matched in a sliding window. For instance, at 9:02am Dennis calls Peter and tells him to sell CCC's stock, but the trading stream does not yet contain Peter's matching tuple (which may not be available until 9:04am due to communication delay).
As a result, the join operation generates few perfectly joined examples, as shown in Table 1 (referred to as complete examples), and a large portion of partially joined examples, as shown in Table 2 (referred to as incomplete examples in this paper). Learning from multiple related data streams should therefore take both types of examples into consideration.
In this paper, we present a Learning from Complete and Fixed Examples method for learning from multiple related data streams with asynchronous speeds. The aim of the model is to fix the large portion of incomplete examples using information from the complete examples to boost the learning. Experiments on both synthetic and real-world data streams demonstrate that the proposed method helps build models with a higher prediction accuracy than other simple solutions can offer.
The rest of this paper is structured as follows. In the next section, we formulate the problem of learning from multiple related data streams and discuss some simple solutions. We introduce the learning framework in Section 3, and conduct experiments on both synthetic and real-world data streams in Section 4. We survey related work in Section 5, and conclude the paper in Section 6.
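As a concrete illustration of how complete and incomplete examples arise, the following minimal Python sketch joins the three streams of the asynchronous snapshot (Figure 2). The join conditions (Callee = Dealer, and matching Stock) are inferred here from the column layout of Table 1; the function name `join_window` and the dictionary layout are hypothetical choices made for this illustration, not part of the original method.

```python
# Tuples from the asynchronous snapshot (Figure 2); timestamps are omitted.
phone_calls = [
    {"Org": "AAA", "Caller": "Adams", "Callee": "Jack"},
    {"Org": "BBB", "Caller": "Ray", "Callee": "Selina"},
    {"Org": "CCC", "Caller": "Dennis", "Callee": "Peter"},
]
trades = [
    {"Dealer": "Jack", "Type": "Sell", "Stock": "A", "Company": "AAA", "Class": "Yes"},
    {"Dealer": "Selina", "Type": "Buy", "Stock": "B", "Company": "BBB", "Class": "Yes"},
    {"Dealer": "John", "Type": "Buy", "Stock": "C", "Company": "CCC", "Class": "No"},
    {"Dealer": "Ross", "Type": "Sell", "Stock": "E", "Company": "EEE", "Class": "Yes"},
]
news = [
    {"Stock": "A", "Forecast": "Yes"},
    {"Stock": "B", "Forecast": "No"},
    {"Stock": "C", "Forecast": "No"},
    {"Stock": "D", "Forecast": "Yes"},
]


def join_window(phone_calls, trades, news):
    """Join one sliding window; a missing part of an example is None.

    Assumed join conditions: phone.Callee == trade.Dealer and
    trade.Stock == news.Stock.
    """
    calls_by_callee = {c["Callee"]: c for c in phone_calls}
    news_by_stock = {n["Stock"]: n for n in news}
    used_callees, used_stocks = set(), set()
    examples = []
    for trade in trades:
        call = calls_by_callee.get(trade["Dealer"])
        item = news_by_stock.get(trade["Stock"])
        if call is not None:
            used_callees.add(call["Callee"])
        if item is not None:
            used_stocks.add(item["Stock"])
        examples.append((call, trade, item))
    # Tuples that matched nothing in this window still form incomplete examples.
    examples += [(c, None, None) for c in phone_calls if c["Callee"] not in used_callees]
    examples += [(None, None, n) for n in news if n["Stock"] not in used_stocks]
    complete = [e for e in examples if all(part is not None for part in e)]
    incomplete = [e for e in examples if any(part is None for part in e)]
    return complete, incomplete


complete, incomplete = join_window(phone_calls, trades, news)
print(len(complete), "complete,", len(incomplete), "incomplete")
```

On this snapshot the sketch yields two complete examples (Jack and Selina) and four incomplete ones, which matches the rows of Tables 1 and 2.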

Table I. THE PERFECTLY JOINED EXAMPLES (REFERRED TO AS complete examples)

Org  Caller  Callee  Dealer  Type  Stock  Company  Class  Stock  Forecast
AAA  Adams   Jack    Jack    Sell  A      AAA      Yes    A      Yes
BBB  Ray     Selina  Selina  Buy   B      BBB      Yes    B      No

Table II. THE PARTIALLY JOINED EXAMPLES (REFERRED TO AS incomplete examples)

Org  Caller  Callee  Dealer  Type  Stock  Company  Class  Stock  Forecast
CCC  Dennis  Peter   ?       ?     ?      ?        ?      ?      ?
?    ?       ?       John    Buy   C      CCC      No     C      No
?    ?       ?       Ross    Sell  E      EEE      Yes    ?      ?
?    ?       ?       ?       ?     ?      ?        ?      D      Yes

Phone Call Stream
Time  Org  Caller  Callee
9:00  AAA  Adams   Jack
9:01  BBB  Ray     Selina
9:02  CCC  Dennis  Peter

Trading Stream
Time  Dealer  Type  Stock  Comp.  Class
9:01  Jack    Sell  A      AAA    Yes
9:02  Selina  Buy   B      BBB    Yes
9:03  Peter   Sell  A      AAA    No
9:03  Peter   Buy   C      CCC    Yes

News Stream
Time  Stock  Fore.
9:00  A      Yes
9:01  B      No
9:02  C      No

Figure 1. An illustration of multiple related data streams with synchronous speeds.

Phone Call Stream
Time  Org  Caller  Callee
9:00  AAA  Adams   Jack
9:01  BBB  Ray     Selina
9:02  CCC  Dennis  Peter

Trading Stream
Time  Dealer  Type  Stock  Comp.  Class
9:01  Jack    Sell  A      AAA    Yes
9:02  Selina  Buy   B      BBB    Yes
9:02  John    Buy   C      CCC    No
9:03  Ross    Sell  E      EEE    Yes

News Stream
Time  Stock  Fore.
9:00  A      Yes
9:01  B      No
9:02  C      No
9:03  D      Yes

Figure 2. An illustration of multiple related data streams with asynchronous speeds.

II. PROBLEM DEFINITION AND SIMPLE SOLUTIONS
Consider $m$ related data streams $\{S_1, S_2, \ldots, S_m\}$. All the streams share some join attributes and can be joined together according to certain conditions. The joined stream has $n$ classes $\{c_1, c_2, \ldots, c_n\}$. The sliding window size is set to $w$. A complete example is generated by joining $m$ tuples from $m$ streams, whereas an incomplete example is generated by joining $r$ ($1 \le r < m$) tuples from $r$ streams. We divide an incomplete example into three parts: a set of observed attributes $O$, a set of unobserved attributes $U$, and its class label $L$. The observed attributes $O$ can be further divided into $\{o_1, o_2, \ldots, o_r\}$, where $o_i$ is a tuple from stream $S_i$. In this paper we also assume that all the attributes are independent of each other.

A. Problem Definition
Learning from multiple related data streams is not a trivial task, considering that even learning from a single data stream is severely challenged by data volumes and concept drifting. In the multiple related data streams scenario, different streams may be generated at sites that are far away from each other, so related tuples may arrive at the central processing unit at different speeds. Therefore, after the join operation, the generated training examples are likely to contain a small portion of complete examples and a large portion of incomplete examples. These realities of the asynchronous data stream environment raise the following concerns. First, since the complete examples provide the global view of the related data streams, making proper use of them is a basic necessity. Second, since the incomplete examples reflect local information of the data distribution, using them properly plays an important role in building an accurate model. Third, tuples that cannot be perfectly joined in the current sliding window are not useless, since their related tuples in other streams may arrive in succeeding windows. Fourth, patterns and trends in multiple related data streams may experience concept drifting as time elapses. To sum up, learning from multiple related data streams needs to take the following four concerns into consideration:
(1) Be able to make proper use of the complete examples, which provide the global view of the multiple related data streams.
(2) Be able to make proper use of the incomplete examples to boost the performance. In asynchronous data streams, incomplete examples make up a large portion of the training examples, and any improper use of such data may worsen the performance.
(3) Be able to incrementally update the historical information over data streams.
(4) Be able to handle concept drifting. When a new concept emerges, adapting quickly to fit the new concept is a necessity.

B. Simple Solutions
Intuitively, the following two methods can be applied to learn from multiple related data streams.
Learn from Complete Examples: this method drops all the incomplete examples and learns only from the complete examples. For instance, for Figure 2, it uses only the complete examples in Table 1 to build the model and discards all the incomplete examples in Table 2. Its merit is a low execution overhead, because it builds the model from only the small portion of complete examples. Its limitation is equally apparent: since the complete examples are sparse, it usually cannot yield a good prediction model.
Learn from Complete and Incomplete Examples: unlike the complete-only method, which simply drops all the incomplete examples, this method keeps them. It marks the unobserved attributes in the incomplete examples with a uniform symbol "?", as shown in Table 2. By doing so, the original problem is converted into a problem of learning with unobserved values. Its merit is that it uses both complete and incomplete examples to build the model, so when the complete examples are sparse it can still achieve satisfactory results by considering the incomplete examples. Its limitation is that it fills all unobserved values with the same uniform symbol "?", which cannot always achieve a satisfactory result. A better alternative is to use the historical information from the passed stream data to fix the unobserved values, as done in the Learning from Complete and Fixed Examples method described next.

III. FORMULATION OF THE MODEL
In this section, we describe the formulation of the model in detail. As discussed above, multiple related data streams commonly arrive at the central processing unit asynchronously, so that only a small portion of the tuples in a sliding window can be completely joined, leaving a large portion incompletely joined. Thus, the generated training examples usually contain only a few complete examples and many incomplete examples. The goal of our model is to summarize all the historical complete records and use that summary to infer the missing values in the incomplete examples.

A. Learn from Complete Examples
Complete examples provide a global view of related data streams, which is valuable for understanding the patterns and trends behind the streams. In this section, we consider how to summarize the complete examples over data streams. In data streams, it is impractical to buffer all the historical data. To retain historical information without buffering all the data, an alternative is to design lightweight data structures that summarize the passed data. Such a data structure should update incrementally when new data arrive, and it should also handle concept drifting. A well-known example is the micro-cluster structure proposed for clustering data streams in [4]. In this paper, we use the sample average as the basic data structure to summarize the data streams. More precisely, for class label $c$, we define the sample average vector

$$\bar{X}_c = \frac{1}{|S_c|} \sum_{j=1}^{|S_c|} S_c^j$$

as the basic data structure, where $|S_c|$ denotes the total number of complete examples in the passed data having class label $c$, and $S_c^j$ denotes the $j$-th example in the passed data having class label $c$.
On one hand, $\bar{X}_c$ can incrementally maintain the historical information over data streams. For instance, when a new sliding window arrives, the sample average $\bar{X}_c$ can be updated as in Eq. (1),

$$\bar{X}_c \leftarrow \frac{|S_c|}{|S_c| + |S'_c|}\,\bar{X}_c + \frac{|S'_c|}{|S_c| + |S'_c|}\,\bar{X}'_c \qquad (1)$$

where $\bar{X}'_c$ is the sample average of all the complete examples with class label $c$ in the new sliding window, and $|S'_c|$ denotes the number of those examples. Eq. (1) needs only the maintained average and its count, so $\bar{X}_c$ scales to large amounts of data.
On the other hand, the sample average $\bar{X}_c$ can also be used to detect and handle concept drifting. Given a parameter $\lambda$ ($\lambda > 0$), concept drifting is defined as the current sample average $\bar{X}'_c$ deviating from the maintained sample average $\bar{X}_c$ by more than $\lambda$, i.e., $\|\bar{X}'_c - \bar{X}_c\| > \lambda$.
Since concept drifting makes the maintained $\bar{X}_c$ too obsolete to reflect the current data distribution, we reset $\bar{X}_c$ to the current sample average $\bar{X}'_c$ (i.e., let $\bar{X}_c = \bar{X}'_c$) when the concept drifts. Maintaining the sample average of the complete examples therefore addresses both incremental learning and concept drifting.

B. Learn from Incomplete Examples
Although incomplete examples provide less information than complete examples, they should not be neglected, for two reasons. First, incomplete examples usually make up a large portion of the training examples, and any improper use of such data may deteriorate the performance. Second, incomplete examples provide useful local information. Therefore, learning from such examples properly is also very important. Unlike the simple method that marks all unobserved values with "?", our method tries to fix the unobserved values using $\bar{X}_c$.
For an incomplete example, unobserved values come in two types: (1) only some attributes are unobserved (i.e., the 2nd and 3rd examples in Table 2), and (2) both some attributes and the class label are unobserved (i.e., the 1st and 4th examples in Table 2). For a Type (1) example, we first use its observed attributes to update the historical information $\bar{X}_c$, and then use the corresponding subvector of $\bar{X}_c$ to fill in its unobserved attributes. For a Type (2) example, since both some attributes and the class label are unobserved, we use a two-stage learning strategy to fix it, as described in Theorem 1.
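The class-wise summary described above (count-weighted merging as in Eq. (1), reset on drift, and filling a Type (1) example) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the class name `ClassAverage` and its methods are hypothetical, the deviation between averages is measured with the Euclidean distance (the text does not name the norm), unobserved attributes are encoded as `None`, and the step that first updates the average with a Type (1) example's observed attributes is omitted for brevity.

```python
import math


class ClassAverage:
    """Running sample average of the complete examples of one class label."""

    def __init__(self, drift_threshold):
        self.drift_threshold = drift_threshold  # lambda in the text
        self.mean = None  # maintained average, one value per attribute
        self.count = 0    # number of complete examples absorbed so far

    def update(self, window_examples):
        """Absorb one window of complete examples, or reset on drift."""
        if not window_examples:
            return
        n_new = len(window_examples)
        dims = len(window_examples[0])
        new_mean = [sum(x[d] for x in window_examples) / n_new for d in range(dims)]
        # Assumption: deviation is the Euclidean distance between the averages.
        drifted = (
            self.mean is not None
            and math.dist(new_mean, self.mean) > self.drift_threshold
        )
        if self.mean is None or drifted:
            # First window, or concept drift: keep only the current average.
            self.mean, self.count = new_mean, n_new
            return
        # Count-weighted merge of the maintained and the current average.
        total = self.count + n_new
        self.mean = [
            (self.count * old + n_new * new) / total
            for old, new in zip(self.mean, new_mean)
        ]
        self.count = total

    def fill(self, example):
        """Replace unobserved attributes (None) with the maintained average."""
        return [self.mean[d] if v is None else v for d, v in enumerate(example)]


summary = ClassAverage(drift_threshold=1.0)
summary.update([[0.0, 0.0], [2.0, 2.0]])  # first window: average (1, 1)
summary.update([[1.0, 1.0], [2.0, 2.0]])  # close to (1, 1): merged
fixed = summary.fill([None, 3.0])         # Type (1) example with one gap
```

Only the average and a count are stored per class, so the memory cost is independent of the number of examples seen, which is the property the text relies on.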

Theorem 1. Assume all the attributes are independent. To fix an incomplete example that has both unobserved attributes and an unobserved class label, we should first fix its class label according to the observed attributes, and then fix its unobserved attributes according to the fixed class label.
Proof. The goal is to fix both the unobserved label $L$ and the unobserved attributes $U$ using the observed attributes $O$. According to probability theory,

$$P(U, L \mid O) = \frac{P(U, L, O)}{P(O)} = \frac{P(L, O)\,P(U \mid L, O)}{P(O)} = \frac{P(O)\,P(L \mid O)\,P(U \mid L)}{P(O)} = P(L \mid O)\,P(U \mid L) \qquad (2)$$

where the third equality uses the independence assumption to replace $P(U \mid L, O)$ with $P(U \mid L)$. Thus, in order to fix the unobserved $L$ and $U$, we should first fix $L$ using $P(L \mid O)$, and then fix $U$ using $P(U \mid L)$.
Given this two-stage learning method, the next question is how to calculate the probabilities $P(L \mid O)$ and $P(U \mid L)$. For $P(L \mid O)$, according to the Bayesian decision rule, the class label $L$ should be the one with the maximal probability $P(c_i \mid O)$ ($1 \le i \le n$),

$$P(L \mid O) = \max_i P(c_i \mid O). \qquad (3)$$

Since all the attributes are independent, Eq. (3) can be further transformed into Eq. (4),

$$P(L \mid O) = \max_i P(c_i \mid O) = \max_i \prod_{j=1}^{r} P(c_i \mid o_j). \qquad (4)$$

Eq. (4) shows that each observed tuple $o_j$ ($j = 1, \ldots, r$) contributes a weight to the decision on the final class label $L$. Therefore, to fix the label of an incomplete example, it is essential to take all its observed attributes into consideration. By applying the logarithm function, we transform Eq. (4) into Eq. (5),

$$P(L \mid O) \propto \max_i \sum_{j=1}^{r} \log P(c_i \mid o_j) \propto \max_i \sum_{j=1}^{r} P(c_i \mid o_j). \qquad (5)$$

Eq. (5) shows that the final class label $L$ is the class $c_i$ with the largest value of $\sum_{j=1}^{r} \log P(c_i \mid o_j)$. Since the probabilities $P(c_i \mid o_j)$ ($i = 1, \ldots, n$; $j = 1, \ldots, r$) are hard to calculate, especially when there are continuous attribute values, we use an approximate method named the Label Consensus Score to estimate every $P(c_i \mid o_j)$.
Definition 1 (Label Consensus Score). The label consensus score between label $c_i$ and an observed tuple $o_j$ is defined as a value inversely proportional to the Euclidean distance between the sample-average subvector $\bar{X}_{c_i}^{j}$ and the tuple $o_j$.
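The two-stage repair suggested by Theorem 1 and Definition 1 can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the concrete score `1 / (distance + eps)` is only one possible choice of a value inversely proportional to the Euclidean distance, and the function names, the `eps` parameter, and the data layout (one subvector of the class average per stream) are assumptions made here.

```python
import math


def consensus_score(class_subvector, observed_tuple, eps=1e-9):
    """Score that grows as the tuple gets closer to the class average.

    Assumption: 1 / (Euclidean distance + eps) is used as the inversely
    proportional value; eps only avoids division by zero.
    """
    return 1.0 / (math.dist(class_subvector, observed_tuple) + eps)


def fix_example(observed, class_means):
    """Two-stage repair of an example whose label and some tuples are missing.

    observed:    dict mapping stream index j -> observed tuple o_j
    class_means: dict mapping class label -> list of per-stream subvectors
    """
    # Stage 1: pick the label with the largest summed consensus score.
    label = max(
        class_means,
        key=lambda c: sum(
            consensus_score(class_means[c][j], o_j) for j, o_j in observed.items()
        ),
    )
    # Stage 2: fill every unobserved tuple from the chosen class average.
    fixed = [
        list(observed[j]) if j in observed else list(subvector)
        for j, subvector in enumerate(class_means[label])
    ]
    return label, fixed


class_means = {
    "yes": [[0.0, 0.0], [1.0, 1.0]],
    "no": [[5.0, 5.0], [6.0, 6.0]],
}
label, fixed = fix_example({0: (4.5, 5.5)}, class_means)
```

In this toy input the observed tuple lies much closer to the "no" average, so the label is fixed to "no" first and the missing second tuple is then copied from that class's average.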
The final class label $L$ is the class $c_i$ that has the largest label consensus score over all the $r$ observed tuples, as shown in Eq. (6),

$$P(L \mid O) \propto \max_i \sum_{j=1}^{r} P(c_i \mid o_j) \approx \max_i \sum_{j=1}^{r} \mathrm{score}(c_i, o_j). \qquad (6)$$

After obtaining the class label $L$, $P(U \mid L)$ is handled by using the historical information $\bar{X}_L$ to fill in all the unobserved attributes $U$.

C. The Learning Framework
Algorithm 1 shows the learning framework, which consists of five major steps. In the first step, the framework joins all the tuples in the current sliding window to get a set of training examples $T$. Note that $T$ contains a small portion of complete examples and a large portion of incomplete examples. In the second step, the complete examples are used to update the historical information over data streams. More precisely, the framework first calculates the sample average $\bar{X}'_c$ for each class label $c$ in the current sliding window, and then compares $\bar{X}'_c$ with the historical one $\bar{X}_c$. If the difference between them is larger than a given threshold $\lambda$, it replaces $\bar{X}_c$ with the current one $\bar{X}'_c$; otherwise, it absorbs $\bar{X}'_c$ into $\bar{X}_c$. In the third step, the incomplete examples are fixed by filling in all the unobserved values. If the class label $c$ is observed, the framework uses all the observed attributes $o_j$ to update $\bar{X}_c^j$; otherwise, the unobserved class label is estimated using Eq. (6). All the unobserved attributes are then fixed using the historical information from $\bar{X}_c$. In the fourth step, the complete examples and the fixed examples are combined to build a classification model. In the last step, the model is tested on the next sliding window.

IV. EXPERIMENTS
In this section, we report experimental results and comparisons of the proposed framework from the following three aspects: performance with respect to (1) different sliding window sizes (i.e., $w$), (2) different success join rates (i.e., $p$), which are used to simulate the different arriving speeds among different data streams, and (3) different concept drifting thresholds (i.e., $\lambda$).
Benchmark methods: To assess the performance of the proposed framework, we use the two simple solutions described earlier (learning from complete examples only, and learning from complete and incomplete examples) as the benchmark methods for comparison.
As discussed above, the first benchmark learns only from the complete examples, while the second learns from both the complete and the incomplete examples. All these learning methods are implemented in Java with an integration of the WEKA data mining tool [5], and Libsvm (The java package can be downloaded from yasser/wlsvm/.) is used as the base classification model.
Synthetic Data Streams: We employ the following four steps to generate synthetic related data streams with asynchronous speeds. First, we generate a single large data stream $S$. Then we vertically split $S$ into several equal intervals. After that, each interval is assigned an additional join attribute tid and treated as a single data stream. Finally, we use $p$ to simulate the asynchronous streams scenario (i.e., for an arbitrary tuple $t$, we generate a random number $rd$ ($0 < rd < 1$); if $rd < p$, then $t$ is taken as an unobserved tuple). More specifically, we first design a sequence of pairs $S = \{(x_1, y_1), \ldots, (x_T, y_T)\}$, where $x_i \in R^{19}$ is the attribute vector and $y_i \in \{-1, +1\}$ is the class label. The classification boundary is defined as $\sum_{i=1}^{19} a_i x_i = a_0$, and concept drifting is simulated by giving every $a_i$ ($i = 0, \ldots, 19$) a 10% chance to evolve and a 5% chance to reverse its direction. After that, we split $S$ vertically and equally into five intervals, with each interval having five attributes. At last, we use different values of $p$ to simulate the asynchronous streams scenario.

Algorithm 1: The learning framework
Input: related data streams $S_1, \ldots, S_m$; sliding window $w$; concept drifting threshold $\lambda$
Output: average accuracy $P$ and variance $V$
Randomly generate an $\bar{X}_c$ for each class label $c$  /* Initialization */
for each sliding window $w$ do
    $T$ = join_streams($S_1, \ldots, S_m$)  /* Step 1 */
    for the complete examples do  /* Step 2 */
        for each class label $c$ do
            if $\|\bar{X}'_c - \bar{X}_c\| \le \lambda$ then absorb $\bar{X}'_c$ into $\bar{X}_c$ using Eq. (1)
            else $\bar{X}_c = \bar{X}'_c$
    for all incomplete examples do  /* Step 3 */
        if class label $c$ is observed then
            foreach observed tuple $o_j$ ($j = 1, \ldots, r$) do update $\bar{X}_c^j$ with $o_j$
        else
            calculate the class label using Eq. (6)
        fix all the unobserved attributes using $\bar{X}_c$
    build a classifier $f$ on the complete and fixed examples  /* Step 4 */
    test classifier $f$ on the next sliding window, and get $P$  /* Step 5 */
Output the average accuracy $P$ and the variance $V$.

Real World Data Streams: We use the URL Reputation data streams from the UCI Machine Learning Repository [6]. The goal is to detect malicious web sites by combining the host-based features and the lexical features of their URLs. Figure 3 illustrates the data collection architecture. The malicious URLs are obtained from a large Web mail provider, while the benign URLs are randomly drawn from Yahoo's directory listing. For every incoming URL, the feature collector collects the URL's host-based features by querying DNS, WHOIS, blacklist and geographic information servers, and collects the lexical features from the lexical related servers. Since these servers are located at different sites, we can treat these related data streams as asynchronous data streams. We also use the parameter $p$ here to simulate the different speeds among the servers.

Figure 3. Overview of real-time URL feed, feature collection, and classification infrastructure [7].
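The synthetic stream construction described earlier in this section (a hyperplane-labeled stream, a vertical split keyed by tid, and random dropping of tuples with rate p) can be sketched as follows. The function names and the sizes used here (20 attributes split into 5 streams of 4 attributes each) are illustrative assumptions rather than the paper's exact settings, and the concept-drift step on the coefficients is left out to keep the sketch short.

```python
import random


def make_labeled_stream(n_examples, n_attrs, rng):
    """Label random points by the side of a random hyperplane they fall on."""
    a = [rng.uniform(-1.0, 1.0) for _ in range(n_attrs + 1)]  # a_0 .. a_n
    examples = []
    for tid in range(n_examples):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n_attrs)]
        margin = sum(a_i * x_i for a_i, x_i in zip(a[1:], x)) - a[0]
        examples.append((tid, x, 1 if margin >= 0 else -1))
    return examples


def split_vertically(examples, n_streams):
    """Cut the attribute vector into equal slices, one stream per slice.

    Each stream maps the join attribute tid to its slice of attributes.
    """
    width = len(examples[0][1]) // n_streams
    return [
        {tid: x[k * width:(k + 1) * width] for tid, x, _ in examples}
        for k in range(n_streams)
    ]


def make_asynchronous(streams, p, rng):
    """Drop each tuple independently: a draw below p marks it unobserved."""
    return [
        {tid: t for tid, t in stream.items() if rng.random() >= p}
        for stream in streams
    ]


def count_complete(streams):
    """A tid yields a complete example only if every stream observed it."""
    return len(set.intersection(*(set(stream) for stream in streams)))


rng = random.Random(0)
examples = make_labeled_stream(100, 20, rng)
streams = split_vertically(examples, 5)
window = make_asynchronous(streams, 0.5, rng)
print(count_complete(window), "of", len(examples), "examples are complete")
```

Because a tid must survive in every stream to form a complete example, the expected share of complete examples shrinks quickly as p grows, which is why incomplete examples dominate in the asynchronous setting.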
We analyze the first week of data in our experiments. For each type of attributes, we extract the first 20 attributes for analysis.
Experimental Results: We list our experimental results in Figure 4 and Table 3. The parameters, if not specially mentioned, are set as follows: $\lambda = 0.3$, $w = 500$, and $p = 0.5$. Figure 4(a) shows comparison results with respect to different sliding window sizes. The proposed framework always performs the best, especially when the window size is small. This is because, in the asynchronous data streams scenario, the smaller the window size, the lower the chance of getting complete training examples. The proposed framework still obtains good prediction accuracy by fixing the incomplete examples among asynchronous data streams. To further explore the relationship between the asynchronous rate $p$ and the prediction accuracy, we conduct another experiment, shown in Figure 4(b). Not surprisingly, all three algorithms suffer a loss when the number of incomplete examples increases, but the proposed framework avoids a significant drop by filling in the unobserved attributes using the historical sample average information. In Figure 4(c), we test different values of the parameter $\lambda$, which defines the concept drifting threshold. As shown in the figure, the proposed framework performs the best when $\lambda = 0.3$, and thus in the following experiments we let $\lambda$ be 0.3.
We list the experimental results on the real-world URL Reputation data streams in Table 3. All three models are compared on the first week of data with respect to their prediction accuracies and computation overheads. The sliding window size is set to 500, so there are 40 sliding windows in total each day. From Table 3, we can observe that the proposed framework always has the best prediction accuracy, ahead of both benchmark methods. This supports our argument that, by fixing the incomplete examples using the historical information, the proposed framework achieves good results on asynchronous data streams. Figure 4(e) shows the detailed comparisons on the 40 sliding windows of a specific day (i.e., Day 4).
The proposed framework has the best prediction accuracy over all 40 windows. These results indicate that it can learn accurately and in a timely manner from related data streams with asynchronous arriving speeds.

Fig. 4. Comparison results on both synthetic and real world data streams. Panels: (a) parameter w (x-axis: window size); (b) parameter p; (c) parameter λ; (d) time cost in milliseconds versus window size; (e) chunk-by-chunk comparisons on Day 4 (x-axis: window ID).

Table III. COMPARISONS OF ACCURACY AND TIME COST (MILLISECOND) ON THE URL REPUTATION DATA.
Day1 298 ± ± ±0.093
Day2 010 ± ± ±0.066
Day3 320 ± ± ±0.085
Day4 100 ± ± ±0.074
Day5 410 ± ± ±0.093
Day6 303 ± ± ±0.092
Day ± ± ±0.050

V. CONCLUSIONS
In this paper, we consider a new problem of learning from multiple related asynchronous data streams. We first argued that, to learn from such data streams, four concerns should be taken into consideration: (1) use the complete examples to gain a global view across multiple data streams, (2) make proper use of the incomplete examples to boost the learning, (3) incrementally maintain the historical information from the historical data, and (4) detect and handle concept drifting in data streams. To meet these challenges, we presented a new Learning from Complete and Fixed Examples framework to learn from multiple related data streams with different flowing speeds. More specifically, the framework first combines all the related data streams to generate training examples in the current sliding window, which may contain a small portion of complete examples and a large portion of incomplete examples. After that, it employs a two-stage method to fix the incomplete examples by fixing the class label and then the unobserved attributes. At last, it builds a prediction model on the complete and fixed examples. During the whole learning procedure, the sample average is used to incrementally maintain the historical information. The concept drifting problem is also addressed by measuring the change of the sample average.
The contribution of the work reported in this paper is fourfold: (1) we are the first to consider the problem of learning from multiple related data streams with asynchronous speeds, and we formulate this problem as learning from complete and incomplete examples; (2) we propose a two-stage learning method to fix the unobserved class label and attributes of incomplete examples; (3) we propose an efficient label consensus score method to approximate the Bayesian decision rule on data streams; (4) we propose a learning framework to learn from multiple related data streams with asynchronous speeds.

ACKNOWLEDGMENT
This research was partially supported by the National Science Foundation of China (NSFC) under Grant No , and Basic Research Program of China (973 Program) under Grant No.2007CB

REFERENCES
[1] K. Crammer, M. Kearns, J. Wortman: Learning from Multiple Sources. Journal of Machine Learning Research 9 (2008).
[2] P. Zhang, X. Zhu, L. Guo: Mining Data Streams with Labeled and Unlabeled Training Examples. In: Proc. of IEEE ICDM 09 (2009).
[3] Y. Xu, K. Wang, A. Fu, R. She, J. Pei: Classification Spanning Correlated Data Streams. In: Proc. of CIKM 2006 (2006).
[4] C. Aggarwal et al.: A framework for clustering evolving data streams. In: Proceedings of VLDB (2003).
[5] I. Witten, E. Frank: Data mining: practical machine learning tools and techniques. Morgan Kaufmann (2005).
[6] D. Newman, S. Hettich, C. Blake, C. Merz: UCI Repository of machine learning (1998).
[7] J. Ma et al.: Identifying Suspicious URLs: An Application of Large-Scale Online Learning. In: Proc. of ICML 09 (2009).
[8] P. Zhang, X. Zhu, Y. Shi: Categorizing and Mining Concept Drifting Data Streams. In: Proc. of KDD 08 (2008).
[9] X. Zhu, P. Zhang, X. Lin, Y. Shi: Active Learning from Stream Data Using Optimal Weight Classifier Ensemble. IEEE Trans. on System, Man, Cybernetics, Part B, Vol. 40 (4) (2010).
