A Martingale Framework for Concept Change Detection in Time-Varying Data Streams
Shen-Shyang Ho (sho@gmu.edu)
Department of Computer Science, George Mason University, 4400 University Drive, Fairfax, VA USA

Appearing in Proceedings of the 22nd International Conference on Machine Learning, Bonn, Germany. Copyright 2005 by the author(s)/owner(s).

Abstract

In a data streaming setting, data points are observed one by one. The concepts to be learned from the data points may change infinitely often as the data is streaming. In this paper, we extend the idea of testing exchangeability online (Vovk et al., 2003) to a martingale framework to detect concept changes in time-varying data streams. Two martingale tests are developed to detect concept changes using: (i) martingale values, a direct consequence of the Doob's Maximal Inequality, and (ii) the martingale difference, justified using the Hoeffding-Azuma Inequality. Under some assumptions, the second test theoretically has a lower probability than the first test of rejecting the null hypothesis of no concept change in the data stream when it is in fact correct. Experiments show that both martingale tests are effective in detecting concept changes in time-varying data streams simulated using two synthetic data sets and three benchmark data sets.

1. Introduction

A challenge in mining data streams is the detection of changes in the data-generating process. Recent research includes profiling and visualizing changes in data streams using velocity density estimation (Aggarwal, 2003), but reaches no conclusion on whether a change takes place. Fan et al. (2004) proposed change detection (active mining) based on error estimation of a model of the new data stream without knowing the true class labels. Kifer et al. (2004) proposed a change-detection method with statistical guarantees of the reliability of detected changes; however, the method is impractical for high dimensional data streams.
Besides detecting changes, change-adaptive methods for the so-called concept drift problem have also been suggested, based on a sliding window (instance selection) (Klinkenberg & Joachims, 2000; Widmer & Kubat, 1996), instance weighting (Klinkenberg, 2004), and ensemble learning (Chu et al., 2004; Kolter & Maloof, 2003; Wang et al., 2003).

The problem of detecting changes in sequential data was first studied by statisticians and mathematicians. In the online setting, data are observed one by one from a source. The disruption of stochastic homogeneity of the data might signal a change in the data-generating process, which would require decision-making to avoid possible losses. This problem is generally known as change-point detection. Methods of change detection first appeared in the 1940s based on Wald's sequential analysis (Wald, 1947), and later, Page introduced the cumulative sum method (Page, 1957). These methods are parametric and work only for low-dimensional data streams.

An effective change-detecting algorithm requires that (i) the mean (or median) delay time between a true change point and its detection be minimal, (ii) the number of missed detections be minimal, and (iii) data streams be handled efficiently.

In this paper, we propose a martingale framework that effectively and efficiently detects concept changes in time-varying data streams. In this framework, when a new data point is observed, hypothesis testing using a martingale takes place to decide whether a change occurs. Two tests are shown to be effective using this framework: testing exchangeability using (i) a martingale value (Vovk et al., 2003) and (ii) the martingale difference. The first test is a direct consequence of the Doob's Maximal Inequality. We provide detailed justification for the second test using the Hoeffding-Azuma Inequality. Under some assumptions, this second test has a much lower probability than the first test of rejecting the null hypothesis of no concept change in the
data stream when it is in fact correct. The efficiency of the martingale tests depends on the speed of the classifier used for the construction of the martingale.

Our martingale approach is an efficient, one-pass incremental algorithm that (i) does not require a sliding window on the data stream, (ii) does not require monitoring the performance of the base classifier as data points are streaming, and (iii) works well for high dimensional, multi-class data streams.

In Section 2, we review the concept of martingale and exchangeability. In Section 3, we describe and justify two tests using martingales. In Section 4, we examine both tests in time-varying data streams simulated using two synthetic data sets and three benchmark data sets.

2. Martingale and Exchangeability

Let $\{Z_i : 1 \le i < \infty\}$ be a sequence of random variables. A finite sequence of random variables $Z_1, \ldots, Z_n$ is exchangeable if the joint distribution $p(Z_1, \ldots, Z_n)$ is invariant under any permutation of the indices of the random variables. A martingale is a sequence of random variables $\{M_i : 0 \le i < \infty\}$ such that $M_n$ is a measurable function of $Z_1, \ldots, Z_n$ for all $n = 0, 1, \ldots$ (in particular, $M_0$ is a constant value) and the conditional expectation of $M_{n+1}$ given $M_0, \ldots, M_n$ is equal to $M_n$, i.e.

\[ E(M_{n+1} \mid M_1, \ldots, M_n) = M_n \tag{1} \]

Vovk et al. (2003) introduced the idea of testing exchangeability online using the martingale. After observing a new data point, a learner outputs a positive martingale value reflecting the strength of evidence found against the null hypothesis of data exchangeability.

Consider a set of labeled examples $Z = \{z_1, \ldots, z_{n-1}\} = \{(x_1, y_1), \ldots, (x_{n-1}, y_{n-1})\}$ where $x_i$ is an object and $y_i \in \{-1, 1\}$ its corresponding label, for $i = 1, 2, \ldots, n-1$. Assuming that a new labeled example, $z_n$, is observed, testing exchangeability for the sequence of examples $z_1, z_2, \ldots, z_n$ consists of two main steps (Vovk et al., 2003):

A. Extract a p-value $p_n$ for the set $Z \cup \{z_n\}$ from the strangeness measure deduced from a classifier.

The randomized p-value of the set $Z \cup \{z_n\}$ is defined as

\[ V(Z \cup \{z_n\}, \theta_n) = \frac{\#\{i : \alpha_i > \alpha_n\} + \theta_n \, \#\{i : \alpha_i = \alpha_n\}}{n} \tag{2} \]

where $\alpha_i$ is the strangeness measure for $z_i$, $i = 1, 2, \ldots, n$, and $\theta_n$ is randomly chosen from $[0, 1]$. The strangeness measure is a way of scoring how different a data point is from the rest. Each data point $z_i$ is assigned a strangeness value $\alpha_i$ based on the classifier used (e.g. support vector machine (SVM), nearest neighbor rule, and decision tree). In our work, the SVM is used to compute the strangeness measure, which can be either the Lagrange multipliers or the distances from the hyperplane for the examples in $Z \cup \{z_n\}$.

The p-values $p_1, p_2, \ldots$ output by the randomized p-value function $V$ are distributed uniformly in $[0, 1]$, provided that the input examples $z_1, z_2, \ldots$ are generated by an exchangeable probability distribution in the input space (Vovk et al., 2003). This property of the output p-values no longer holds when the exchangeability condition is not satisfied (see Section 3).

B. Construct the randomized power martingale.

A family of martingales, indexed by $\epsilon \in [0, 1]$, and referred to as the randomized power martingale, is defined as

\[ M_n^{(\epsilon)} = \prod_{i=1}^{n} \left( \epsilon \, p_i^{\epsilon - 1} \right) \tag{3} \]

where the $p_i$ are the p-values output by the randomized p-value function $V$, with the initial martingale $M_0^{(\epsilon)} = 1$. We note that $M_n^{(\epsilon)} = \epsilon \, p_n^{\epsilon - 1} M_{n-1}^{(\epsilon)}$. Hence, it is not necessary to store the previous p-values. In our experiments, we use $\epsilon = 0.92$, which is within the desirable range where the martingale value is more sensitive to a violation of the exchangeability condition (Vovk et al., 2003).

When $\theta_n = 1$, the p-value function $V$ is deterministic, and the martingale constructed is also deterministic.
We use this deterministic martingale in our justification for the second test in Section 3.3.

3. Testing for Change Detection

Intuitively, we assume that a sequence of data points with a concept change consists of the concatenation of two data segments, $S_1$ and $S_2$, such that the concepts of $S_1$ and $S_2$ are $C_1$ and $C_2$ respectively and $C_1 \ne C_2$. Switching a data point $z_i$ from $S_2$ to a position in $S_1$ will make the data point stand out in $S_1$. The exchangeability condition is, therefore, violated. Exchangeability is a necessary condition for a conceptually stable data stream. The absence of exchangeability would suggest concept changes.
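The two steps of Section 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names are hypothetical, the strangeness values are taken as given (the paper derives them from an incremental SVM and would recompute them as data arrives), and $\theta_n$ is drawn from $(0, 1]$ rather than $[0, 1]$ so that a p-value of exactly zero cannot occur.

```python
import random


def randomized_p_value(alphas, theta):
    """Randomized p-value of Eq. (2); alphas[-1] is the strangeness of the newest example."""
    a_n = alphas[-1]
    greater = sum(1 for a in alphas if a > a_n)
    equal = sum(1 for a in alphas if a == a_n)  # always counts the newest example itself
    return (greater + theta * equal) / len(alphas)


def power_martingale_step(m_prev, p, eps=0.92):
    """Recursive form of Eq. (3): M_n = eps * p_n**(eps - 1) * M_{n-1}."""
    return eps * p ** (eps - 1.0) * m_prev


def martingale_path(strangeness, eps=0.92, seed=0):
    """Martingale values M_1..M_n for strangeness values given in arrival order.

    Assumption for illustration: each strangeness value is fixed once computed.
    """
    rng = random.Random(seed)
    m, path = 1.0, []  # M_0 = 1
    for n in range(1, len(strangeness) + 1):
        theta = 1.0 - rng.random()  # in (0, 1], avoids p = 0
        p = randomized_p_value(strangeness[:n], theta)
        m = power_martingale_step(m, p, eps)
        path.append(m)
    return path


if __name__ == "__main__":
    # Every new point is stranger than all earlier ones, so p-values stay small
    # and the martingale grows.
    print(martingale_path(list(range(1, 51)))[-1])
```

Only the current martingale value is carried from one step to the next, which mirrors the remark that the previous p-values need not be stored.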
When a concept change occurs, the p-values output from the randomized p-value function (2) become skewed and the p-value distribution is no longer uniform. By the Kolmogorov-Smirnov Test (KS-Test)¹, the p-values are shown not to be distributed uniformly after the concept changes. The null hypothesis "the p-values output by (2) are uniformly distributed" is rejected at significance level $\alpha = 0.05$, after a sufficient number of data points are observed (see the example in Figure 1). The skewed p-value distribution plays an important role in our martingale test for change detection, as small p-values inflate the martingale values. We note that an immediate detection of a true change is practically impossible. Hence, a short delay time between a change and its detection is highly desirable.

¹ Kifer et al. (2004) proposed using the Kolmogorov-Smirnov Test (KS-Test) for detecting changes using two sliding windows and a discrepancy measure, which was tested only on 1D data streams.

Figure 1. The 10-dimensional data points simulated using the normally distributed clusters data generator (see Section 4.1.2) are observed one by one from the 1st to the 2000th data point, with concept change starting at the 1001st data point. The reader should not confuse the p-values from the KS-Test and the p-values computed from (2).

3.1. Martingale Framework for Detecting Changes

In the martingale framework, when a new data point is observed, hypothesis testing takes place to decide whether a concept change occurs in the data stream. The decision is based on whether the exchangeability condition is violated, which, in turn, is based on the martingale value. Two hypothesis tests based on the martingale (3) are proposed, based on the Doob's Maximal Inequality and the Hoeffding-Azuma Inequality respectively.

Consider the simple null hypothesis $H_0$: "no concept change in the data stream" against the alternative $H_1$: "concept change occurs in the data stream". The test continues to operate as long as

Martingale Test 1 (MT1):

\[ 0 < M_n^{(\epsilon)} < \lambda \tag{4} \]

where $\lambda$ is a positive number. One rejects the null hypothesis when $M_n^{(\epsilon)} \ge \lambda$.

OR

Martingale Test 2 (MT2):

\[ 0 < \left| M_n^{(\epsilon)} - M_{n-1}^{(\epsilon)} \right| < t \tag{5} \]

where $t$ is a positive number. One rejects the null hypothesis when $\left| M_n^{(\epsilon)} - M_{n-1}^{(\epsilon)} \right| \ge t$.

3.2. Justification for Martingale Test 1 (MT1)

Assuming that $\{M_k : 0 \le k < \infty\}$ is a nonnegative martingale, the Doob's Maximal Inequality (Steele, 2001) states that for any $\lambda > 0$ and $0 \le n < \infty$,

\[ \lambda \, P\left( \max_{k \le n} M_k \ge \lambda \right) \le E(M_n) \tag{6} \]

Hence, if $E(M_n) = E(M_1) = 1$, then

\[ P\left( \max_{k \le n} M_k \ge \lambda \right) \le \frac{1}{\lambda} \tag{7} \]

This inequality means that it is unlikely for any $M_k$ to have a high value. One rejects the null hypothesis when the martingale value is greater than $\lambda$. But there is a risk of announcing a change detection when there is no change. The amount of risk one is willing to take will determine what $\lambda$ value to use.

3.3. Justification for Martingale Test 2 (MT2)

Theorem 1 (Hoeffding-Azuma Inequality). Let $c_1, \ldots, c_m$ be constants and let $Y_1, \ldots, Y_m$ be a martingale difference sequence with $|Y_k| \le c_k$ for each $k$. Then for any $t \ge 0$,

\[ P\left( \left| \sum_{k=1}^{m} Y_k \right| \ge t \right) \le 2 \exp\left( - \frac{t^2}{2 \sum_{k=1}^{m} c_k^2} \right) \tag{8} \]

To use this probability bound to justify our hypothesis test, we need the martingale difference to be bounded, i.e. $|Y_i| = |M_i - M_{i-1}| \le K$ such that $M_i$ and $M_{i-1}$ are two arbitrary consecutive martingale values and $K \in \mathbb{R}^+$. This bounded difference condition states that the process does not make big jumps. Moreover, it is unlikely that the process wanders far from its initial point.

Hence, before using (8) to construct the probability upper bound to justify MT2, we need to show that the difference between two consecutive power martingale values is bounded for some fixed $\epsilon$. As mentioned earlier in Section 2, we use the deterministic power martingale in our proof. We set $\theta_n = 1$, for $n \in \mathbb{Z}^+$, in the randomized p-value function (2). An output p-value $p_n$ is a multiple of $1/n$ between $1/n$ and $1$. The martingale difference is

\[ d_n = \prod_{i=1}^{n-1} \left( \epsilon \, p_i^{\epsilon - 1} \right) \left( \epsilon \, p_n^{\epsilon - 1} - 1 \right) \tag{9} \]

For $p_n = u/n$, $1 \le u \le n$, if

\[ p_n < \exp\left( \frac{\log \epsilon}{1 - \epsilon} \right) \tag{10} \]

we have $d_n > 0$; otherwise $d_n < 0$. The most negative $d_n$ occurs when $p_n = 1$ and the most positive $d_n$ occurs when $p_n = 1/n$. This most positive value is higher than the most negative value and, therefore, $p_n = 1/n$ will be used in the bounded difference condition.

When $m = 1$, the Hoeffding-Azuma Inequality (8) becomes

\[ P\left( |Y_1| \ge t \right) \le 2 \exp\left( - \frac{t^2}{2 c_1^2} \right) \tag{11} \]

and hence, for any $n$,

\[ P\left( \left| M_n^{(\epsilon)} - M_{n-1}^{(\epsilon)} \right| \ge t \right) \le 2 \exp\left( - \frac{t^2}{2 \left( \epsilon \left( \frac{1}{n} \right)^{\epsilon - 1} - 1 \right)^2 \left( M_{n-1}^{(\epsilon)} \right)^2} \right) \tag{12} \]

Assuming that every testing step is a new testing step based on a new martingale sequence, we set the previous martingale value $M_{n-1}^{(\epsilon)} = M_0^{(\epsilon)} = 1$ on the right-hand side of the inequality (12). Hence, we have

\[ P\left( \left| M_n^{(\epsilon)} - M_{n-1}^{(\epsilon)} \right| \ge t \right) \le 2 \exp\left( - \frac{t^2}{2 \left( \epsilon \left( \frac{1}{n} \right)^{\epsilon - 1} - 1 \right)^2} \right) \tag{13} \]

If we only consider $M_n^{(\epsilon)} > M_{n-1}^{(\epsilon)}$, the upper bound is less than the right-hand side of (13). Like MT1, one selects $t$ according to the risk one is willing to take.

However, the probability upper bound (13) for MT2 also depends on $n$, the number of data points used. As $n$ increases, the upper bound also increases, and so the probability of rejecting the null hypothesis when it is correct increases. To maintain a much better probability bound for larger $n$, $t$ can be increased (see Figure 2) at the expense of a higher delay time (see Section 4.2).

Figure 2. Comparison of the upper bound of the probability of the martingale difference for some $t$ values and $\epsilon = 0.92$, and the fixed probability upper bound for the martingale value when $\lambda = 20$, on a data stream consisting of $n$ data points. To have an upper bound for MT1 that matches the upper bound for a particular $t$ value (say, 3 to 4) for a small $n$ (< 5000), $\lambda$ has to be very large.

From Figure 2, one observes that if a sliding window is not used for MT2, the classifier used to extract the p-value should dynamically remove old data points from its memory when the upper bound exceeds a predefined value. In our experiments, we use a pseudo-adaptive approach for the window size. Our window starts from the previous detected point and increases in size until the next change point is detected, as long as the probability upper bound does not exceed a fixed value we specified for a particular chosen $t$. Otherwise, we remove the earliest data point from the memory. We note that in our experiments the interval between two true change points is small (< 2,000) and the performance of MT2 is not affected by the upper bound (13) as $n$ increases.

4. Experiments

Experiments are performed to show that the two tests are effective in detecting concept changes in time-varying data streams simulated using two synthetic data sets and three benchmark data sets. The five different simulated data streams are described in Section 4.1.
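The two stopping rules of Section 3 and their probability bounds can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the code behind the reported experiments: the function names are hypothetical, the p-values are assumed to be supplied by some strangeness-based procedure, and restarting the martingale at 1 after each detection is a simplification of the windowing scheme described above.

```python
import math


def doob_bound(lam):
    """Right-hand side of (7): P(max_k M_k >= lam) <= 1 / lam."""
    return 1.0 / lam


def hoeffding_azuma_bound(t, n, eps=0.92):
    """Right-hand side of (13), with the previous martingale value set to 1."""
    c = eps * (1.0 / n) ** (eps - 1.0) - 1.0  # jump size at p_n = 1/n
    return 2.0 * math.exp(-(t * t) / (2.0 * c * c))


def detect_changes(p_values, eps=0.92, lam=None, t=None):
    """Return the 1-based indices at which H0 is rejected.

    Pass lam for MT1 (reject when M_n >= lam) or t for MT2
    (reject when |M_n - M_{n-1}| >= t), but not both.
    """
    if (lam is None) == (t is None):
        raise ValueError("give exactly one of lam (MT1) or t (MT2)")
    detections = []
    m_prev = 1.0  # M_0 = 1
    for n, p in enumerate(p_values, start=1):
        m = eps * p ** (eps - 1.0) * m_prev
        reject = (m >= lam) if lam is not None else (abs(m - m_prev) >= t)
        if reject:
            detections.append(n)
            m = 1.0  # assumption: start a fresh martingale after a detection
        m_prev = m
    return detections


if __name__ == "__main__":
    print(doob_bound(20))                     # MT1 bound for lambda = 20
    print(hoeffding_azuma_bound(3.5, 1000))   # MT2 bound for t = 3.5, n = 1000
    print(detect_changes([0.5, 0.9, 0.001, 0.001, 0.002], lam=2.0))
```

Evaluating `hoeffding_azuma_bound` for growing `n` at a fixed `t` reproduces the qualitative behaviour discussed for Figure 2: for large `n` the bound loosens as `n` grows, whereas `doob_bound` does not depend on `n`.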
We examine the performance of both tests based on the retrieval performance indicators, recall and precision, and the delay time for change detections for various $\lambda$ and $t$ values on two time-varying data streams simulated using the two synthetic data sets. The retrieval performance indicators are defined in our context as:

\[ \text{Precision} = \frac{\text{Number of Correct Detections}}{\text{Number of Detections}} \qquad \text{Recall} = \frac{\text{Number of Correct Detections}}{\text{Number of True Changes}} \]

Precision is the probability that a detection is actually correct, i.e. that it detects a true change. Recall is the probability that a change detection system recognizes a true change. The delay time for a detected change is the number of time units from the true change point to the detected change point, if any.

We also show that both martingale tests are feasible on high dimensional (i) numerical, (ii) categorical, and (iii) multi-class data streams.

In the experiments, a fast adiabatic incremental SVM (Cauwenberghs & Poggio, 2000), using the Gaussian kernel and $C = 10$, is used to deduce the strangeness measure for the data points. A necessary condition for both tests to work well is that the classifier must have a reasonable classification accuracy. At a fixed $\epsilon$, the performance of the two tests depends on $\lambda$ or $t$. Experimental results are reported in Section 4.2.

4.1. Simulated Data Stream Descriptions

In this subsection, we describe how the five data streams with concept changes are simulated by (i) using a rotating hyperplane (Hulten et al., 2001) (Section 4.1.1), (ii) using the normally distributed clusters data generator (NDC) (Musicant, 1998) (Section 4.1.2), (iii) combining the ringnorm and twonorm data sets (Breiman, 1996) (Section 4.1.3), (iv) modifying the UCI nursery data set (Blake & Merz, 1998) (Section 4.1.4), and (v) modifying the USPS handwritten digits data set (LeCun et al., 1989) (Section 4.1.5).

4.1.1. Simulated Data Stream using Rotating Hyperplane

A data stream is simulated by using a rotating hyperplane to generate a sequence of 100,000 data points with changes occurring at points $(1{,}000 \times i) + 1$, for $i = 1, 2, \ldots, 99$.
First we randomly generate 1,000 data points with each component's values in the closed interval $[-1, 1]$. These data points are labeled positive and negative based on the following equation:

\[ \sum_{i=1}^{m} w_i x_i \; \begin{cases} < c & : \text{negative} \\ \ge c & : \text{positive} \end{cases} \tag{14} \]

where $c$ is an arbitrary fixed constant, $x_i$ is a component of a data point $x$, and the fixed components $w_i$ of a weight vector are randomly generated between $-1$ and $1$. Similarly, the next 1,000 random data points are labeled using (14) with a new randomly generated fixed weight vector. This process continues until we get a data stream consisting of 100 segments of 1,000 data points each. Noise is added by randomly switching the class labels of $p\%$ of the data points. In our experiment, $p = 5$ and $m$ = […].

4.1.2. Simulated Data Stream using the normally distributed clusters data generator (NDC)

Linearly non-separable binary-class data streams of 100,000 data points with changes occurring at points $(1{,}000 \times i) + 1$, for $i = 1, 2, \ldots, 99$, are simulated using the NDC in $\mathbb{R}^{10}$ with randomly generated cluster means and variances. The values for each dimension are scaled to the range $[-1, 1]$. The generating process for the data stream is similar to that used for the rotating hyperplane data stream described in Section 4.1.1.

4.1.3. Numerical High Dimensional Datasets: Ringnorm and Twonorm

We combined the ringnorm (RN) (two normal distributions, one within the other) and twonorm (TN) (two overlapping normal distributions) data sets to form a new binary-class data stream of 20 numerical attributes consisting of 14,800 data points. The 7,400 data points from RN are partitioned into 8 subsets, with the first 7 subsets ($RN_i$, $i = 1, \ldots, 7$) consisting of 1,000 data points each and $RN_8$ consisting of 400 data points. Similarly, the 7,400 data points from TN are partitioned into 8 subsets, with the first 7 subsets ($TN_i$, $i = 1, \ldots, 7$) consisting of 1,000 data points each and $TN_8$ consisting of 400 data points.
The new data stream is a sequence of data points arranged as follows: $\{RN_1; TN_1; \ldots; RN_7; TN_7; RN_8; TN_8\}$, with 15 changes at data points $1000i + 1$ for $i = 1, \ldots, 14$, and 14,[…].

4.1.4. Categorical High Dimensional Dataset: Nursery benchmark

We modified the nursery data set, which consists of 12,960 data points in 5 classes with 8 nominal attributes, to form a new binary-class data stream.
First, we combined three classes (not recommended, recommended, and highly recommended) into a single class consisting of 4,650 data points labeled as negative examples. The set $RN$ is formed by randomly selecting 4,000 out of the 4,650 data points. The priority class contains 4,266 data points that are labeled as positive examples. We randomly selected 4,000 out of the 4,266 data points to form the set $PP$. The special priority class, which contains 4,044 data points, is split into two subsets consisting of 2,000 data points each: a set ($SPP$) of positive examples and a set ($SPN$) of negative examples. The other 44 data points are removed. New subsets of data points are constructed as follows:

Set $A_i$: 500 negative examples from $RN$ and 500 positive examples from $PP$.
Set $B_i$: 500 negative examples from $SPN$ and 500 positive examples from $PP$.
Set $C_i$: 500 negative examples from $RN$ and 500 positive examples from $SPP$.

The data stream $S$ is constructed as follows: $\{A_1; B_1; C_1; A_2; B_2; C_2; A_3; B_3; C_3; A_4; B_4; C_4\}$, consisting of 12,000 examples with 11 change points.

4.1.5. Multi-class High Dimensional Data: three-digit data stream from USPS handwritten digits data set

The USPS handwritten digits data set, which consists of 10 classes of dimension 256 and includes 7,291 data points, is modified to form a data stream as follows. There are four different data segments. Each segment draws from a fixed set of three different digits in a random fashion. The three-digit sets change from one segment to the next. The composition of the data stream and the ground truth for the change points are summarized in Table 1. We note that the change points do not appear at fixed intervals.

Table 1. Three-Digit Data Stream: TR (D), where TR is the number of data points and D is the true digit class of the data points.

Segment   Digit 1    Digit 2    Digit 3    Total   Change Point
1         […] (0)    502 (1)    731 (2)    […]     […]
2         […] (0)    658 (3)    652 (4)    […]     […]
3         […] (1)    556 (5)    664 (6)    […]     […]
4         […] (7)    542 (8)    644 (9)    […]     […]

The one-against-the-rest multi-class SVM is used to extract p-values.
For the three-digit data stream, three one-against-the-rest SVMs are used. Hence, three martingale values are computed at each point to detect change (see Figure 7). When one of the martingale values is greater than $\lambda$ (or $t$), a change is detected.

4.2. Results

Figures 3 and 4 show the recall, precision, and delay time of the two martingale tests on the data streams simulated using the rotating hyperplane and NDC respectively. As can be seen from Figures 3 and 4 (first row), the recall is consistently greater than 0.95 on both simulated data streams for various $\lambda$ and $t$ values. Both tests recognize concept changes with high probability.

As $\lambda$ or $t$ increases, one observes that the precision increases. As $\lambda$ increases from 4 to 100, the upper bound (7) becomes tighter, decreasing from 0.25 to 0.01, for MT1. This corresponds to the precision increasing from 0.82 to 1 (see Figure 3), that is, to a decreasing false alarm rate. On the other hand, as $t$ increases from 1.5 to 5, precision increases from 0.88 to 1. The upper bound (13) for MT2 is consistently small as long as the data stream used for computing the martingale is short (e.g. at $n = 1{,}000$, when $t = 1.5$, the upper bound is […] and when $t = 5$, the upper bound is […]). This is a plausible explanation for MT2 having a higher precision than MT1. A similar trend also appears in the data streams simulated using the NDC (see Figure 4). To this end, it seems that for high recall and precision, a large $\lambda$ or $t$ should be used.

Figures 3 and 4 (second row) reveal, unsurprisingly, that a higher precision (using a higher $\lambda$ or $t$) comes at the expense of a higher mean (or median) delay time for both tests. The mean (or median) delay times for the two tests do not differ significantly. With a box-plot of the delay time, one can observe that the delay time distribution skews toward large values (i.e. small values are packed tightly together and large values stretch out and cover a wider range), independent of the $\lambda$ or $t$ value. The delay time is very likely to be less than the mean delay time.
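The indicators discussed above can be computed from a list of true change points and a list of detected points. The sketch below is only one possible reading: the function name is hypothetical, and the matching rule (a detection counts as correct if it is the first detection at or after a true change point and before the next true change point) is an assumption, since the text does not spell out how detections are matched to changes.

```python
from bisect import bisect_right


def evaluate(true_changes, detections):
    """Return (precision, recall, delays) for detected change points.

    Assumed matching rule: the first detection at or after a true change point,
    and before the next one, is correct; any other detection is a false alarm.
    """
    true_changes = sorted(true_changes)
    delays = {}  # true change point -> delay of its first detection
    for d in sorted(detections):
        k = bisect_right(true_changes, d) - 1  # latest true change at or before d
        if k >= 0 and true_changes[k] not in delays:
            delays[true_changes[k]] = d - true_changes[k]
    correct = len(delays)
    precision = correct / len(detections) if detections else 0.0
    recall = correct / len(true_changes) if true_changes else 0.0
    return precision, recall, [delays[c] for c in true_changes if c in delays]


if __name__ == "__main__":
    # Two of the four detections are matched to true changes.
    print(evaluate([1001, 2001, 3001], [500, 1030, 1500, 2100]))
```

Under this rule a second detection inside the same segment is a false alarm, so raising $\lambda$ or $t$ can improve precision only by suppressing such extra detections, at the cost of longer delays.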
In real applications, $\lambda$ or $t$ must be chosen to minimize losses (or cost) due to delay time, missed detections, and false alarms.

Figures 5, 6, and 7 show the feasibility of MT1 and MT2 on high dimensional (i) numerical (combining ringnorm and twonorm data sets), (ii) categorical (modified UCI nursery data set), and (iii) multi-class (modified USPS handwritten digit data set) data streams, respectively. From the figures, one observes that for MT2, when changes are detected, there are more variations in the martingale values. To this end, one sees that a change is detected by either of the tests when the martingale value deviates from its initial value, $M_0 = 1$.

Figure 3. Simulated data streams using the rotating hyperplane. Left Column: MT1 (with $\lambda^{-1}$ scaled by a factor of 100 for easier visualization of the probability); Right Column: MT2. First Row: Precision and Recall; Middle Row: Mean and Median Delay time for various $\lambda$ and $t$ values.

Figure 4. Simulated data streams using the NDC data generator. Left Column: MT1; Right Column: MT2. (Explanation: see caption for Figure 3.)

Figure 5. Simulated data streams using the Ringnorm and Twonorm data sets: the martingale values of the data stream. Markers indicate detected change points. Top Graph: MT1 ($\lambda = 20$), the mean (median) delay time is […] (26) with 2 false alarms. Bottom Graph: MT2 ($t = 3.5$), a missed detection at […]. The mean (median) delay time is […] (24.5).

Figure 6. Simulated data streams using the UCI nursery dataset: the martingale values of the data stream. Markers indicate detected change points. Top Graph: MT1 ($\lambda = 6$), the mean (median) delay time is […] (81). Bottom Graph: MT2 ($t = 3.5$), the mean (median) delay time is […] (92).

Figure 7. Simulated three-digit data stream using the USPS handwritten digit data set: the martingale values of the data stream. Markers indicate detected change points. Left Graph: MT1 ($\lambda = 10$), the delay times are 45, 99 and 62. There is one false alarm. Right Graph: MT2 ($t = 2.5$), the delay times are 88, 81 and 73. There is one false alarm.

5. Conclusion

In this paper, we describe a martingale framework for detecting concept changes in time-varying data streams based on the violation of the exchangeability condition. Two tests using martingales to detect changes are used to demonstrate this framework. One test, using the martingale value (MT1) for change detection, is easily justified using the Doob's Maximal Inequality. The other test, based on the martingale difference (MT2), is justified using the Hoeffding-Azuma Inequality. Under some assumptions, MT2 theoretically has a much lower probability than MT1 of rejecting the null hypothesis of no concept change in the data stream when it is in fact correct. Our experiments show that both martingale tests detect concept changes with high probability. Precision increases with the increase of $\lambda$ or $t$ values, but at the expense of a higher mean (or median) delay time. Experiments also show the effectiveness of the two tests for concept change detection on high-dimensional (i) numerical, (ii) categorical, and (iii) multi-class data streams.

Acknowledgments

The author thanks the reviewers for useful comments, Alex Gammerman and Vladimir Vovk for the manuscript of (Vovk et al., 2005) and useful discussions, and Harry Wechsler for guidance and discussions.

References

Aggarwal, C. C. (2003). A framework for change diagnosis of data streams. Proc. ACM SIGMOD Int. Conf. on Management of Data. ACM.

Blake, C., & Merz, C. (1998). UCI repository of machine learning databases.

Breiman, L. (1996). Bias, variance, and arcing classifiers (Technical Report 460). Statistics Department, University of California.

Cauwenberghs, G., & Poggio, T. (2000). Incremental support vector machine learning. Advances in Neural Information Processing Systems 13. MIT Press.

Chu, F., Wang, Y., & Zaniolo, C. (2004). An adaptive learning approach for noisy data streams. Proc. 4th IEEE Int. Conf. on Data Mining. IEEE Computer Society.

Fan, W., Huang, Y.-A., Wang, H., & Yu, P. S. (2004). Active mining of data streams. Proc. 4th SIAM Int. Conf. on Data Mining. SIAM.

Hulten, G., Spencer, L., & Domingos, P. (2001). Mining time-changing data streams. Proc. 7th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining. ACM.

Kifer, D., Ben-David, S., & Gehrke, J. (2004). Detecting change in data streams. Proc. 13th Int. Conf. on Very Large Data Bases. Morgan Kaufmann.

Klinkenberg, R. (2004). Learning drifting concepts: examples selection vs example weighting. Intelligent Data Analysis, Special Issue on Incremental Learning Systems capable of dealing with concept drift, 8.

Klinkenberg, R., & Joachims, T. (2000). Detecting concept drift with support vector machines. Proc. 17th Int. Conf. on Machine Learning. Morgan Kaufmann.

Kolter, J. Z., & Maloof, M. A. (2003). Dynamic weighted majority: A new ensemble method for tracking concept drift. ICDM. IEEE Computer Society.

LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., & Jackel, L. J. (1989). Backpropagation applied to handwritten zip code recognition. Neural Computation, 1.

Musicant, D. R. (1998). Normally distributed clustered datasets. Computer Sciences Department, University of Wisconsin, Madison.

Page, E. S. (1957). On problems in which a change in a parameter occurs at an unknown point. Biometrika, 44.

Steele, M. (2001). Stochastic calculus and financial applications. Springer Verlag.

Vovk, V., Gammerman, A., & Shafer, G. (2005). Algorithmic learning in a random world. Springer.

Vovk, V., Nouretdinov, I., & Gammerman, A. (2003). Testing exchangeability on-line. Proc. 20th Int. Conf. on Machine Learning. AAAI Press.

Wald, A. (1947). Sequential analysis. Wiley, N. Y.

Wang, H., Fan, W., Yu, P. S., & Han, J. (2003). Mining concept-drifting data streams using ensemble classifiers. Proc. 9th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining. ACM.

Widmer, G., & Kubat, M. (1996). Learning in the presence of concept drift and hidden contexts. Machine Learning, 23.
More informationData Structures and Algorithms. Analysis of Algorithms
Data Structures ad Algorithms Aalysis of Algorithms Outlie Ruig time Pseudo-code Big-oh otatio Big-theta otatio Big-omega otatio Asymptotic algorithm aalysis Aalysis of Algorithms Iput Algorithm Output
More informationEuclidean Distance Based Feature Selection for Fault Detection Prediction Model in Semiconductor Manufacturing Process
Vol.133 (Iformatio Techology ad Computer Sciece 016), pp.85-89 http://dx.doi.org/10.1457/astl.016. Euclidea Distace Based Feature Selectio for Fault Detectio Predictio Model i Semicoductor Maufacturig
More informationLecture 2: Spectra of Graphs
Spectral Graph Theory ad Applicatios WS 20/202 Lecture 2: Spectra of Graphs Lecturer: Thomas Sauerwald & He Su Our goal is to use the properties of the adjacecy/laplacia matrix of graphs to first uderstad
More informationA New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method
A ew Morphological 3D Shape Decompositio: Grayscale Iterframe Iterpolatio Method D.. Vizireau Politehica Uiversity Bucharest, Romaia ae@comm.pub.ro R. M. Udrea Politehica Uiversity Bucharest, Romaia mihea@comm.pub.ro
More informationAdministrative UNSUPERVISED LEARNING. Unsupervised learning. Supervised learning 11/25/13. Final project. No office hours today
Admiistrative Fial project No office hours today UNSUPERVISED LEARNING David Kauchak CS 451 Fall 2013 Supervised learig Usupervised learig label label 1 label 3 model/ predictor label 4 label 5 Supervised
More informationNew HSL Distance Based Colour Clustering Algorithm
The 4th Midwest Artificial Itelligece ad Cogitive Scieces Coferece (MAICS 03 pp 85-9 New Albay Idiaa USA April 3-4 03 New HSL Distace Based Colour Clusterig Algorithm Vasile Patrascu Departemet of Iformatics
More informationChapter 3 Classification of FFT Processor Algorithms
Chapter Classificatio of FFT Processor Algorithms The computatioal complexity of the Discrete Fourier trasform (DFT) is very high. It requires () 2 complex multiplicatios ad () complex additios [5]. As
More informationPseudocode ( 1.1) Analysis of Algorithms. Primitive Operations. Pseudocode Details. Running Time ( 1.1) Estimating performance
Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Pseudocode ( 1.1) High-level descriptio of a algorithm More structured
More informationLecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming
Lecture Notes 6 Itroductio to algorithm aalysis CSS 501 Data Structures ad Object-Orieted Programmig Readig for this lecture: Carrao, Chapter 10 To be covered i this lecture: Itroductio to algorithm aalysis
More informationNeuro Fuzzy Model for Human Face Expression Recognition
IOSR Joural of Computer Egieerig (IOSRJCE) ISSN : 2278-0661 Volume 1, Issue 2 (May-Jue 2012), PP 01-06 Neuro Fuzzy Model for Huma Face Expressio Recogitio Mr. Mayur S. Burage 1, Prof. S. V. Dhopte 2 1
More informationBig-O Analysis. Asymptotics
Big-O Aalysis 1 Defiitio: Suppose that f() ad g() are oegative fuctios of. The we say that f() is O(g()) provided that there are costats C > 0 ad N > 0 such that for all > N, f() Cg(). Big-O expresses
More informationA new algorithm to build feed forward neural networks.
A ew algorithm to build feed forward eural etworks. Amit Thombre Cetre of Excellece, Software Techologies ad Kowledge Maagemet, Tech Mahidra, Pue, Idia Abstract The paper presets a ew algorithm to build
More informationComputational Geometry
Computatioal Geometry Chapter 4 Liear programmig Duality Smallest eclosig disk O the Ageda Liear Programmig Slides courtesy of Craig Gotsma 4. 4. Liear Programmig - Example Defie: (amout amout cosumed
More informationLower Bounds for Sorting
Liear Sortig Topics Covered: Lower Bouds for Sortig Coutig Sort Radix Sort Bucket Sort Lower Bouds for Sortig Compariso vs. o-compariso sortig Decisio tree model Worst case lower boud Compariso Sortig
More informationBayesian approach to reliability modelling for a probability of failure on demand parameter
Bayesia approach to reliability modellig for a probability of failure o demad parameter BÖRCSÖK J., SCHAEFER S. Departmet of Computer Architecture ad System Programmig Uiversity Kassel, Wilhelmshöher Allee
More informationThe golden search method: Question 1
1. Golde Sectio Search for the Mode of a Fuctio The golde search method: Questio 1 Suppose the last pair of poits at which we have a fuctio evaluatio is x(), y(). The accordig to the method, If f(x())
More informationBOOLEAN MATHEMATICS: GENERAL THEORY
CHAPTER 3 BOOLEAN MATHEMATICS: GENERAL THEORY 3.1 ISOMORPHIC PROPERTIES The ame Boolea Arithmetic was chose because it was discovered that literal Boolea Algebra could have a isomorphic umerical aspect.
More informationAnalysis of Documents Clustering Using Sampled Agglomerative Technique
Aalysis of Documets Clusterig Usig Sampled Agglomerative Techique Omar H. Karam, Ahmed M. Hamad, ad Sheri M. Moussa Abstract I this paper a clusterig algorithm for documets is proposed that adapts a samplig-based
More informationImprovement of the Orthogonal Code Convolution Capabilities Using FPGA Implementation
Improvemet of the Orthogoal Code Covolutio Capabilities Usig FPGA Implemetatio Naima Kaabouch, Member, IEEE, Apara Dhirde, Member, IEEE, Saleh Faruque, Member, IEEE Departmet of Electrical Egieerig, Uiversity
More informationInvestigating methods for improving Bagged k-nn classifiers
Ivestigatig methods for improvig Bagged k-nn classifiers Fuad M. Alkoot Telecommuicatio & Navigatio Istitute, P.A.A.E.T. P.O.Box 4575, Alsalmia, 22046 Kuwait Abstract- We experimet with baggig knn classifiers
More informationLecture 1: Introduction and Strassen s Algorithm
5-750: Graduate Algorithms Jauary 7, 08 Lecture : Itroductio ad Strasse s Algorithm Lecturer: Gary Miller Scribe: Robert Parker Itroductio Machie models I this class, we will primarily use the Radom Access
More informationPruning and Summarizing the Discovered Time Series Association Rules from Mechanical Sensor Data Qing YANG1,a,*, Shao-Yu WANG1,b, Ting-Ting ZHANG2,c
Advaces i Egieerig Research (AER), volume 131 3rd Aual Iteratioal Coferece o Electroics, Electrical Egieerig ad Iformatio Sciece (EEEIS 2017) Pruig ad Summarizig the Discovered Time Series Associatio Rules
More informationFast Fourier Transform (FFT) Algorithms
Fast Fourier Trasform FFT Algorithms Relatio to the z-trasform elsewhere, ozero, z x z X x [ ] 2 ~ elsewhere,, ~ e j x X x x π j e z z X X π 2 ~ The DFS X represets evely spaced samples of the z- trasform
More informationNeural Networks A Model of Boolean Functions
Neural Networks A Model of Boolea Fuctios Berd Steibach, Roma Kohut Freiberg Uiversity of Miig ad Techology Istitute of Computer Sciece D-09596 Freiberg, Germay e-mails: steib@iformatik.tu-freiberg.de
More informationRandom Graphs and Complex Networks T
Radom Graphs ad Complex Networks T-79.7003 Charalampos E. Tsourakakis Aalto Uiversity Lecture 3 7 September 013 Aoucemet Homework 1 is out, due i two weeks from ow. Exercises: Probabilistic iequalities
More informationRunning Time ( 3.1) Analysis of Algorithms. Experimental Studies. Limitations of Experiments
Ruig Time ( 3.1) Aalysis of Algorithms Iput Algorithm Output A algorithm is a step- by- step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects.
More informationAnalysis of Algorithms
Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Ruig Time Most algorithms trasform iput objects ito output objects. The
More informationBASED ON ITERATIVE ERROR-CORRECTION
A COHPARISO OF CRYPTAALYTIC PRICIPLES BASED O ITERATIVE ERROR-CORRECTIO Miodrag J. MihaljeviC ad Jova Dj. GoliC Istitute of Applied Mathematics ad Electroics. Belgrade School of Electrical Egieerig. Uiversity
More informationBezier curves. Figure 2 shows cubic Bezier curves for various control points. In a Bezier curve, only
Edited: Yeh-Liag Hsu (998--; recommeded: Yeh-Liag Hsu (--9; last updated: Yeh-Liag Hsu (9--7. Note: This is the course material for ME55 Geometric modelig ad computer graphics, Yua Ze Uiversity. art of
More informationA SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON
A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON Roberto Lopez ad Eugeio Oñate Iteratioal Ceter for Numerical Methods i Egieerig (CIMNE) Edificio C1, Gra Capitá s/, 08034 Barceloa, Spai ABSTRACT I this work
More informationLecture 13: Validation
Lecture 3: Validatio Resampli methods Holdout Cross Validatio Radom Subsampli -Fold Cross-Validatio Leave-oe-out The Bootstrap Bias ad variace estimatio Three-way data partitioi Itroductio to Patter Recoitio
More informationPerformance Plus Software Parameter Definitions
Performace Plus+ Software Parameter Defiitios/ Performace Plus Software Parameter Defiitios Chapma Techical Note-TG-5 paramete.doc ev-0-03 Performace Plus+ Software Parameter Defiitios/2 Backgroud ad Defiitios
More informationA Novel Feature Extraction Algorithm for Haar Local Binary Pattern Texture Based on Human Vision System
A Novel Feature Extractio Algorithm for Haar Local Biary Patter Texture Based o Huma Visio System Liu Tao 1,* 1 Departmet of Electroic Egieerig Shaaxi Eergy Istitute Xiayag, Shaaxi, Chia Abstract The locality
More information1 Graph Sparsfication
CME 305: Discrete Mathematics ad Algorithms 1 Graph Sparsficatio I this sectio we discuss the approximatio of a graph G(V, E) by a sparse graph H(V, F ) o the same vertex set. I particular, we cosider
More informationRunning Time. Analysis of Algorithms. Experimental Studies. Limitations of Experiments
Ruig Time Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects. The
More informationState-space feedback 6 challenges of pole placement
State-space feedbac 6 challeges of pole placemet J Rossiter Itroductio The earlier videos itroduced the cocept of state feedbac ad demostrated that it moves the poles. x u x Kx Bu It was show that whe
More informationLecture 5. Counting Sort / Radix Sort
Lecture 5. Coutig Sort / Radix Sort T. H. Corme, C. E. Leiserso ad R. L. Rivest Itroductio to Algorithms, 3rd Editio, MIT Press, 2009 Sugkyukwa Uiversity Hyuseug Choo choo@skku.edu Copyright 2000-2018
More information6.854J / J Advanced Algorithms Fall 2008
MIT OpeCourseWare http://ocw.mit.edu 6.854J / 18.415J Advaced Algorithms Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 18.415/6.854 Advaced Algorithms
More informationData Analysis. Concepts and Techniques. Chapter 2. Chapter 2: Getting to Know Your Data. Data Objects and Attribute Types
Data Aalysis Cocepts ad Techiques Chapter 2 1 Chapter 2: Gettig to Kow Your Data Data Objects ad Attribute Types Basic Statistical Descriptios of Data Data Visualizatio Measurig Data Similarity ad Dissimilarity
More informationProbabilistic Fuzzy Time Series Method Based on Artificial Neural Network
America Joural of Itelliget Systems 206, 6(2): 42-47 DOI: 0.5923/j.ajis.2060602.02 Probabilistic Fuzzy Time Series Method Based o Artificial Neural Network Erol Egrioglu,*, Ere Bas, Cagdas Haka Aladag
More informationAlgorithm Selection using Reinforcement Learning
Algorithm Selectio usig Reiforcemet Learig Michail G. Lagoudakis Departmet of Computer Sciece, Duke Uiversity, Durham, NC 2778, USA Michael L. Littma Shao Laboratory, AT&T Labs Research, Florham Park,
More informationJournal of Chemical and Pharmaceutical Research, 2013, 5(12): Research Article
Available olie www.jocpr.com Joural of Chemical ad Pharmaceutical Research, 2013, 5(12):745-749 Research Article ISSN : 0975-7384 CODEN(USA) : JCPRC5 K-meas algorithm i the optimal iitial cetroids based
More informationElementary Educational Computer
Chapter 5 Elemetary Educatioal Computer. Geeral structure of the Elemetary Educatioal Computer (EEC) The EEC coforms to the 5 uits structure defied by vo Neuma's model (.) All uits are preseted i a simplified
More information( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb
Chapter 3 Descriptive Measures Measures of Ceter (Cetral Tedecy) These measures will tell us where is the ceter of our data or where most typical value of a data set lies Mode the value that occurs most
More informationNew Fuzzy Color Clustering Algorithm Based on hsl Similarity
IFSA-EUSFLAT 009 New Fuzzy Color Clusterig Algorithm Based o hsl Similarity Vasile Ptracu Departmet of Iformatics Techology Tarom Compay Bucharest Romaia Email: patrascu.v@gmail.com Abstract I this paper
More informationOur second algorithm. Comp 135 Machine Learning Computer Science Tufts University. Decision Trees. Decision Trees. Decision Trees.
Comp 135 Machie Learig Computer Sciece Tufts Uiversity Fall 2017 Roi Khardo Some of these slides were adapted from previous slides by Carla Brodley Our secod algorithm Let s look at a simple dataset for
More informationAnalysis of Different Similarity Measure Functions and their Impacts on Shared Nearest Neighbor Clustering Approach
Aalysis of Differet Similarity Measure Fuctios ad their Impacts o Shared Nearest Neighbor Clusterig Approach Ail Kumar Patidar School of IT, Rajiv Gadhi Techical Uiversity, Bhopal (M.P.), Idia Jitedra
More informationA Kernel Density Based Approach for Large Scale Image Retrieval
A Kerel Desity Based Approach for Large Scale Image Retrieval Wei Tog Departmet of Computer Sciece ad Egieerig Michiga State Uiversity East Lasig, MI, USA togwei@cse.msu.edu Rog Ji Departmet of Computer
More informationIntroduction. Nature-Inspired Computing. Terminology. Problem Types. Constraint Satisfaction Problems - CSP. Free Optimization Problem - FOP
Nature-Ispired Computig Hadlig Costraits Dr. Şima Uyar September 2006 Itroductio may practical problems are costraied ot all combiatios of variable values represet valid solutios feasible solutios ifeasible
More informationDesigning a learning system
CS 75 Machie Learig Lecture Desigig a learig system Milos Hauskrecht milos@cs.pitt.edu 539 Seott Square, x-5 people.cs.pitt.edu/~milos/courses/cs75/ Admiistrivia No homework assigmet this week Please try
More informationArithmetic Sequences
. Arithmetic Sequeces COMMON CORE Learig Stadards HSF-IF.A. HSF-BF.A.1a HSF-BF.A. HSF-LE.A. Essetial Questio How ca you use a arithmetic sequece to describe a patter? A arithmetic sequece is a ordered
More informationReliable Transmission. Spring 2018 CS 438 Staff - University of Illinois 1
Reliable Trasmissio Sprig 2018 CS 438 Staff - Uiversity of Illiois 1 Reliable Trasmissio Hello! My computer s ame is Alice. Alice Bob Hello! Alice. Sprig 2018 CS 438 Staff - Uiversity of Illiois 2 Reliable
More informationAn Efficient Algorithm for Graph Bisection of Triangularizations
A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045 Oe Brookigs Drive St. Louis, Missouri 63130-4899, USA jaegerg@cse.wustl.edu
More informationConvergence results for conditional expectations
Beroulli 11(4), 2005, 737 745 Covergece results for coditioal expectatios IRENE CRIMALDI 1 ad LUCA PRATELLI 2 1 Departmet of Mathematics, Uiversity of Bologa, Piazza di Porta Sa Doato 5, 40126 Bologa,
More informationThe Closest Line to a Data Set in the Plane. David Gurney Southeastern Louisiana University Hammond, Louisiana
The Closest Lie to a Data Set i the Plae David Gurey Southeaster Louisiaa Uiversity Hammod, Louisiaa ABSTRACT This paper looks at three differet measures of distace betwee a lie ad a data set i the plae:
More information. Written in factored form it is easy to see that the roots are 2, 2, i,
CMPS A Itroductio to Programmig Programmig Assigmet 4 I this assigmet you will write a java program that determies the real roots of a polyomial that lie withi a specified rage. Recall that the roots (or
More informationPerformance Comparisons of PSO based Clustering
Performace Comparisos of PSO based Clusterig Suresh Chadra Satapathy, 2 Guaidhi Pradha, 3 Sabyasachi Pattai, 4 JVR Murthy, 5 PVGD Prasad Reddy Ail Neeruoda Istitute of Techology ad Scieces, Sagivalas,Vishaapatam
More informationENGI 4421 Probability and Statistics Faculty of Engineering and Applied Science Problem Set 1 Descriptive Statistics
ENGI 44 Probability ad Statistics Faculty of Egieerig ad Applied Sciece Problem Set Descriptive Statistics. If, i the set of values {,, 3, 4, 5, 6, 7 } a error causes the value 5 to be replaced by 50,
More informationOn-line cursive letter recognition using sequences of local minima/maxima. Robert Powalka
O-lie cursive letter recogitio usig sequeces of local miima/maxima Summary Robert Powalka 19 th August 1993 This report presets the desig ad implemetatio of a o-lie cursive letter recogizer usig sequeces
More informationClassification of binary vectors by using DSC distance to minimize stochastic complexity
Patter Recogitio Letters 24 (2003) 65 73 www.elsevier.com/locate/patrec Classificatio of biary vectors by usig DSC distace to miimize stochastic complexity Pasi Fr ati *, Matao Xu, Ismo K arkk aie Departmet
More informationHow do we evaluate algorithms?
F2 Readig referece: chapter 2 + slides Algorithm complexity Big O ad big Ω To calculate ruig time Aalysis of recursive Algorithms Next time: Litterature: slides mostly The first Algorithm desig methods:
More informationChapter 1. Introduction to Computers and C++ Programming. Copyright 2015 Pearson Education, Ltd.. All rights reserved.
Chapter 1 Itroductio to Computers ad C++ Programmig Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 1.1 Computer Systems 1.2 Programmig ad Problem Solvig 1.3 Itroductio to C++ 1.4 Testig
More informationThe isoperimetric problem on the hypercube
The isoperimetric problem o the hypercube Prepared by: Steve Butler November 2, 2005 1 The isoperimetric problem We will cosider the -dimesioal hypercube Q Recall that the hypercube Q is a graph whose
More informationCubic Polynomial Curves with a Shape Parameter
roceedigs of the th WSEAS Iteratioal Coferece o Robotics Cotrol ad Maufacturig Techology Hagzhou Chia April -8 00 (pp5-70) Cubic olyomial Curves with a Shape arameter MO GUOLIANG ZHAO YANAN Iformatio ad
More informationAn Estimation of Distribution Algorithm for solving the Knapsack problem
Vol.4,No.5, 214 Published olie: May 25, 214 DOI: 1.7321/jscse.v4.5.1 A Estimatio of Distributio Algorithm for solvig the Kapsack problem 1 Ricardo Pérez, 2 S. Jös, 3 Arturo Herádez, 4 Carlos A. Ochoa *1,
More informationOur Learning Problem, Again
Noparametric Desity Estimatio Matthew Stoe CS 520, Sprig 2000 Lecture 6 Our Learig Problem, Agai Use traiig data to estimate ukow probabilities ad probability desity fuctios So far, we have depeded o describig
More informationEvaluating Top-k Selection Queries
Evaluatig Top-k Selectio Queries Surajit Chaudhuri Microsoft Research surajitc@microsoft.com Luis Gravao Columbia Uiversity gravao@cs.columbia.edu Abstract I may applicatios, users specify target values
More informationAn Efficient Algorithm for Graph Bisection of Triangularizations
Applied Mathematical Scieces, Vol. 1, 2007, o. 25, 1203-1215 A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045, Oe
More informationFire Recognition in Video. Walter Phillips III Mubarak Shah Niels da Vitoria Lobo.
Fire Recogitio i Video Walter Phillips III Mubarak Shah Niels da Vitoria Lobo {wrp65547,shah,iels}@cs.ucf.edu Computer Visio Laboratory Departmet of Computer Sciece Uiversity of Cetral Florida Orlado,
More informationn n B. How many subsets of C are there of cardinality n. We are selecting elements for such a
4. [10] Usig a combiatorial argumet, prove that for 1: = 0 = Let A ad B be disjoit sets of cardiality each ad C = A B. How may subsets of C are there of cardiality. We are selectig elemets for such a subset
More informationcondition w i B i S maximum u i
ecture 10 Dyamic Programmig 10.1 Kapsack Problem November 1, 2004 ecturer: Kamal Jai Notes: Tobias Holgers We are give a set of items U = {a 1, a 2,..., a }. Each item has a weight w i Z + ad a utility
More informationSOFTWARE usually does not work alone. It must have
Proceedigs of the 203 Federated Coferece o Computer Sciece ad Iformatio Systems pp. 343 348 A method for selectig eviromets for software compatibility testig Łukasz Pobereżik AGH Uiversity of Sciece ad
More informationHui Xiao School of Environmental Science, Nanjing Xiaozhuang University, Nanjing , China
doi:0.3/00.39.. Cotiuous knn Queries i Dyamic Road Networks Hui Xiao School of Evirometal Sciece, Najig Xiaozhuag Uiversity, Najig 7, Chia Abstract Cotiuous knn queries have bee widely studied i recet
More informationResearch on Identification Model of Financial Fraud of Listed Company Based on Data Mining Technology
208 2d Iteratioal Coferece o Systems, Computig, ad Applicatios (SYSTCA 208) Research o Idetificatio Model of Fiacial Fraud of Listed Compay Based o Data Miig Techology Jiaqi Hu, Xiao Che School of Busiess,
More informationBAYESIAN WITH FULL CONDITIONAL POSTERIOR DISTRIBUTION APPROACH FOR SOLUTION OF COMPLEX MODELS. Pudji Ismartini
Proceedig of Iteratioal Coferece O Research, Implemetatio Ad Educatio Of Mathematics Ad Scieces 014, Yogyakarta State Uiversity, 18-0 May 014 BAYESIAN WIH FULL CONDIIONAL POSERIOR DISRIBUION APPROACH FOR
More informationUnsupervised Discretization Using Kernel Density Estimation
Usupervised Discretizatio Usig Kerel Desity Estimatio Maregle Biba, Floriaa Esposito, Stefao Ferilli, Nicola Di Mauro, Teresa M.A Basile Departmet of Computer Sciece, Uiversity of Bari Via Oraboa 4, 7025
More informationA Study on the Performance of Cholesky-Factorization using MPI
A Study o the Performace of Cholesky-Factorizatio usig MPI Ha S. Kim Scott B. Bade Departmet of Computer Sciece ad Egieerig Uiversity of Califoria Sa Diego {hskim, bade}@cs.ucsd.edu Abstract Cholesky-factorizatio
More informationAlgorithms for Disk Covering Problems with the Most Points
Algorithms for Disk Coverig Problems with the Most Poits Bi Xiao Departmet of Computig Hog Kog Polytechic Uiversity Hug Hom, Kowloo, Hog Kog csbxiao@comp.polyu.edu.hk Qigfeg Zhuge, Yi He, Zili Shao, Edwi
More information