HISTOGRAMS are an important statistic reflecting the

Size: px
Start display at page:

Download "HISTOGRAMS are an important statistic reflecting the"

Transcription

1 JOURNAL OF L A T E X CLASS FILES, VOL. 14, NO. 8, AUGUST D 2 HistoSketch: Disciminative and Dynamic Similaity-Peseving Sketching of Steaming Histogams Dingqi Yang, Bin Li, Laua Rettig, and Philippe Cudé-Mauoux Abstact Histogam-based similaity has been widely adopted in many machine leaning tasks. Howeve, measuing histogam similaity is a challenging task fo steaming histogams, whee the elements of a histogam ae obseved one afte the othe in an online manne. The eve-gowing cadinality of histogam elements ove the data steams makes any similaity computation inefficient in that case. To tackle this poblem, we popose in this pape D 2 HistoSketch, a similaity-peseving sketching method fo steaming histogams to efficiently appoximate thei Disciminative and Dynamic similaity. D 2 HistoSketch can fast and memoy-efficiently maintain a set of compact and fixed-size sketches of steaming histogams to appoximate the similaity between histogams. To povide high-quality similaity appoximations, D 2 HistoSketch consides both disciminative and gadual fogetting weights fo similaity measuement, and seamlessly incopoates them in the sketches. Based on both synthetic and eal-wold datasets, ou empiical evaluation shows that ou method is able to efficiently and effectively appoximate the similaity between steaming histogams while outpefoming state-of-the-at sketching methods. Compaed to full steaming histogams with both disciminative and gadual fogetting weights in paticula, D 2 HistoSketch is able to damatically educe the classification time (with a 7500x speedup) at the expense of a small loss in accuacy only (about 3.25%). Index Tems Similaity-Peseving Sketching, Histogams, Steaming Data, Concept Dift, Disciminative Weighting 1 INTRODUCTION HISTOGRAMS ae an impotant statistic eflecting the empiical distibution of data. They have been widely used not only as a popula data analysis and visualization tool, but also as an impotant featue fo measuing similaities between data instances, such as colo histogams fo images o wod histogams fo documents. As a esult, histogam-based similaity measues have been extensively exploited in many classification and clusteing tasks and fo vaious application domains, including image pocessing [1], document analysis [2], social netwok analysis [3], and business intelligence [4]. Despite its impotance in machine leaning, computing histogam-based similaities is often difficult in pactice, paticulaly fo data steams. In this study, we conside steaming histogams, whee the elements of a histogam ae obseved ove a data steam as shown in Fig. 1(a) in Section 3. Steaming histogams can be used fo a wide ange of applications, such as solving ange queies and similaity seach in a steaming database, change detection and classification ove data steams. In pactice, steaming histogams ae often seen when online o offline businesses obseve thei customes activity data. Fo example, a Point of Inteest (POI), such as a supemaket o a estauant, may obseve a continuous data steam of visits fom its customes and conside to analyze the histogam of its customes visits. By Dingqi Yang, Laua Rettig and Philippe Cudé-Mauoux ae with the Depatment of Infomatics at the Univesity of Fiboug, Switzeland, {dingqi.yang, laua.ettig, philippe.cude-mauoux}@unif.ch. Bin Li is with School of Compute Science, Fudan Univesity, Shanghai, China, libin@fudan.edu.cn. Manuscipt eceived xxx; evised xxx. measuing the similaity between two POIs based on such histogams, one can build vaious high-quality applications. Fo example, semantic place labeling [4] infes a POI s type based on the assumption that two POIs shaing simila histogams of thei customes pobably belong to the same type. Howeve, it is challenging to measue the similaity between such steaming histogams in pactice, due to the eve-inceasing cadinality of the histogam elements ove time. In the above example, this coesponds to the case of an eve-gowing numbe of customes. The monotonically inceasing size of the steaming histogams makes any similaity computation inefficient, which futhe makes leaning algoithms impactical. To solve this poblem, similaity-peseving data sketching (hashing) techniques [5] have been intensively studied in steam data pocessing [6], [7]. Thei key idea is to maintain a set of compact and fixed-size sketches fo the oiginal data to appoximate thei similaity unde a cetain measue. In the cuent liteatue, most existing data sketching techniques [8], [9], [10], [11] conside the case of steaming data instances, whee complete data instances ae eceived one by one fom a data steam (e.g., a steam of images whose colo histogam can be easily deived). In contast, a steaming histogam assumes that the elements of a histogam descibing an individual data instance ae continuously eceived in abitay ode fom a data steam (e.g., the histogam of customes visits to a POI), which depats fom classical techniques that focus on sketching complete data instances. Theefoe, these methods cannot be efficiently applied fo sketching steaming histogams. In this pape, we tackle the similaity-peseving data

2 JOURNAL OF L A T E X CLASS FILES, VOL. 14, NO. 8, AUGUST sketching poblem fo steaming histogams. Specifically, an efficient similaity-peseving sketching method fo steaming histogams should allow fo fast and memoy-efficient maintenance of the sketches. Fast maintenance equies sketches of steaming histogams to be incementally updatable. In othe wods, the new sketch of a steaming histogam should be incementally computed fom the fome sketch and the newly aived element. Moeove, memoyefficient maintenance equies the sketching method to ceate a small and bounded memoy ovehead when computing the sketches, which diffes fom existing sketching methods that equie a lage set of andom vaiables as in-memoy paametes [11], [12], [13], [14], whee the size of these paametes is popotional to the cadinality of the histogam elements. In addition, to maintain high-quality similaitypeseving sketches, the following two issues should be consideed when measuing similaities. Fist, as histogam elements ae not all equally impotant when measuing histogam-based similaity, disciminative similaity should be consideed, which efes to the similaity that impoves the disciminative capability of some classification/clusteing methods [15]. Specifically, in the case of labeled histogams, a histogam element appeaing only in the histogams of a specific label has moe disciminative capability than one appeaing unifomly in all histogams. Taking the example of semantic place labeling whee we want to classify POIs accoding to thei customes visiting pattens, it means that visits fom uses having stonge pefeences on visiting a specific type of POIs ae moe disciminative. Fo static datasets, such a disciminative similaity can be easily computed using vaious featue weighting methods [16] to give a highe weight to moe disciminative histogam elements. Howeve, it is not staightfowad to incopoate such a disciminative similaity in sketching steaming histogams, whee disciminative weights have to be updated ove time, and moe impotantly, to be incopoated in the sketches. Second, as a common poblem in data steams, concept dift should also be taken into account fo steaming histogams, whee the undelying distibution of a steaming histogam changes ove time in unfoeseen ways. Taking the example of customes visits to POIs, the custome population of a estauant may change abuptly if the estauant changes its type (e.g., fom a Japanese estauant to a pizzeia), o gadually if it updates its menu. It is theefoe citical to conside this issue in sketching steaming histogams. The most common appoach to handle concept dift is fogetting the outdated data [17]. A typical solution is gadual fogetting, whee the steaming data ae associated with weights invesely popotional to thei age [18]. Taking exponential decay [19] as an example, the weight of a histogam element deceases by a weight decay facto evey time when a new element is eceived fom the data steam. Although such a weighting pocess is easy to implement when building a histogam fom its steaming elements, it is not staightfowad to incopoate such weights dynamically in sketching steaming histogams. This poblem becomes even moe challenging when futhe consideing the equiement of incementally updatable sketching. To addess the above challenges, we intoduce D 2 HistoSketch, a similaity-peseving sketching method fo steaming histogams to appoximate thei Disciminative and Dynamic similaity. D 2 HistoSketch is designed to efficiently maintain a set of compact and fixed-sized sketches ove steaming histogams to appoximate the similaity between the histogams. Specifically, to measue the similaity between histogams, ou method focuses on nomalized min-max similaity, which has been poven to be an effective similaity measue fo nonnegative data in vaious application domains [11]. To ceate a sketch fom a histogam, we boow the idea fom consistent weighted sampling [20] that was oiginally poposed fo appoximating minmax similaity fo complete data instances. In addition, we fomally deive a memoy-efficient sketching method with few in-memoy paametes. To efficiently maintain the sketch ove the steaming histogam elements, we fist adjust the oiginal sketch to seamlessly incopoate both disciminative weights and gadual fogetting weights, and then incementally compute the new sketch based on the adjusted sketch and the incoming histogam element. Ou main contibutions can be summaized as follows: To the best of ou knowledge, this is the fist wok consideing the disciminative and dynamic similaitypeseving sketching ove steaming histogams. We design an efficient similaity-peseving sketching method fo steaming histogams, D 2 HistoSketch, which allows fo fast and memoy-efficient maintenance of the sketches, whee the sketches can be incementally updated with a small and bounded memoy ovehead. To povide high-quality similaity-peseving sketches fo downsteam tasks, D 2 HistoSketch consides both disciminative similaity that impoves the disciminative capability of the sketches fo some classification/clusteing methods, and dynamic similaity that adapts to concept dift by gadually fogetting outdated histogam elements. We empiically evaluate ou method on multiple classification tasks using both synthetic and eal-wold datasets. Ou esults show that D 2 HistoSketch is able to efficiently appoximate disciminative and dynamic similaity fo steaming histogams. Compaed to full steaming histogams, D 2 HistoSketch is able to damatically educe the classification time (with a 7500x speedup) at the expense of a modest loss in accuacy (about 3.25%). Compaed to ou pevious wok HistoSketch [14], in this pape, we fist fomally deive a memoy-efficient sketching method with fewe in-memoy paametes, which also allows fo faste sketch maintenance. We then add disciminative similaity fo bette similaity measuement. In paticula, the new expeiments we pesent show that compaed to ou pevious wok ou newly poposed D 2 HistoSketch achieves both a highe classification accuacy (with a 2.78% impovement on aveage) as well as faste sketch maintenance (with a 5.9% impovement on pocessing speed). The est of the pape is oganized as follows. Section 2 pesents the elated wok. Section 3 fist pesents the peliminay of ou wok. Aftewads, we pesent the poposed D 2 HistoSketch method in Section 4. The expeimental evaluation is shown in Section 5. We conclude ou wok in Section 6.

3 JOURNAL OF L A T E X CLASS FILES, VOL. 14, NO. 8, AUGUST RELATED WORK As a key statistical tool in empiical data analysis, histogams have been widely used not only as a popula visualization of empiical data distibution [21], but also as a featue to measue data similaity that is futhe exploited in many machine leaning tasks [1], [2], [3], [22], [23]. Although the histogam of a static dataset can often be easily computed, it is pactically difficult to compute histogams fo data steams with typically unknown cadinality and which thus equie an unbounded amount of memoy to maintain the histogam. In this context, count sketch [24] and count-min sketch [25] and othe online histogam building methods [26] wee poposed to appoximate the fequency table of elements (i.e., histogams) fom a data steam with a fixed-size data stuctue. Howeve, the esulting sketches do not peseve the similaity between diffeent data steams. This pape diffes fom the objective of the above sketching methods by addessing the poblem of similaity-peseving sketching of data steams. Similaity-peseving sketching [5] has been extensively studied to efficiently appoximate the similaity of high dimensional data, such as gaphs [27], images [28], [29] and videos [30], [31]. Its basic idea is to maintain a set of compact sketches of the oiginal high dimensional data to efficiently appoximate thei similaities, such as Jaccad [8], [9], cosine [10], and min-max [4], [11], [12], [13], [20], [32] similaities. These sketches can then enable many applications, paticulaly fo infomation etieval systems like image o document seach engines [28]. Howeve, most of the existing methods ae designed to sketch complete data instances, which ae fundamentally diffeent fom steaming histogams, whee histogams ae incementally built fom the steams of its elements. Moe impotantly, little attention has been given on studying high-quality similaitypeseving sketches, on which we put a paticula focus in this pape by consideing both disciminative and dynamic similaities. Disciminative similaity efes to the similaity that impoves the disciminative capability of some classification/clusteing methods [15]. It has been widely used in many machine leaning applications. Fo example, disciminative similaity has been shown to be able to significantly impove the pefomance of vaious compute vision and patten ecognition tasks [33]. In pactice, disciminative similaity can be computed using featue weighting techniques, whee a similaity function is paameteized with disciminative weights [16] (e.g., entopy weights [34]). Although it is easy to implement disciminative similaity fo static datasets, it is not staightfowad to apply it to similaity-peseving sketching of steaming histogams, as such disciminative weights have to be dynamically ecomputed ove time, and moe impotantly, to be incopoated in the sketches. Dynamic adaptation to concept dift is a common poblem in steaming data pocessing. It efes to the case whee the undelying statistical popeties of the steaming data change ove time (often in unknown ways), which futhe degades the pefomance of leaning algoithms [35]. Accoding to a ecent suvey on concept-dift, the most popula appoach to handling data steams with unknown dynamics is fogetting outdated data [17]. Existing solutions can be classified into two categoies. Fist, the abupt fogetting appoach selects a set of data fo leaning. A sliding window is often used to select ecent data. Although this appoach is effective against abupt difts (in tems of the data statistical popeties), it is less applicable to gadual difts [36] as it gives the same significance to all selected pieces of data while completely discading all othe data. Second, the gadual fogetting appoach assigns weights that ae invesely popotional to the age of the data [18], such as exponential decay weights [19]. In this study, we advocate the gadual fogetting appoach to tackle the concept-dift poblem in steaming histogams. The histogam is built with weighted elements fom the steams, with weights deceasing ove time. Diffeent fom existing methods using gadual fogetting against concept dift, we conside incopoating the gadual fogetting appoach in similaitypeseving sketching of steaming histogams. Compaed to ou pevious wok HistoSketch [14], this pape makes the following impovements. Fist, we fomally deive a memoy-efficient sketching method D 2 HistoSketch with few in-memoy paametes, while ou pevious wok HistoSketch equies a lage set of andom vaiables as in-memoy paametes fo sketching, whee the size of these paametes is popotional to the cadinality of the histogam elements. Second, we conside both disciminative and dynamic similaity in D 2 HistoSketch, while HistoSketch only consides the dynamic adaptation to concept dift. Finally, we conduct new expeiments to futhe validate D 2 HistoSketch; in paticula, compaed to ou pevious wok HistoSketch, ou newly poposed D 2 HistoSketch achieves both a highe classification accuacy (with a 2.78% impovement on aveage) as well as faste sketch maintenance (with a 5.9% impovement on pocessing speed). 3 PRELIMINARY AND PROBLEM FORMULATION In this section, we fist fomally intoduce steaming histogams and then min-max similaity, followed by ou poblem definition of similaity-peseving sketching. 3.1 Steaming Histogam We conside a steaming histogam computed ove a data steam of its elements x t whee t N indicates the ode of the obseved element in the steam. Element x t E ae obseved one by one, whee E efes to the set of all histogam elements obseved in any steaming histogams so fa, which expands ove time with new (unseen) histogam elements. In othe wods, E can be egaded as the union of histogam elements fom all histogams obseved so fa. Due to the steaming natue, the cadinality E is unknown and continuously inceases ove time. A classical histogam can then be epesented as a vecto V N E, whee each value encodes the cumulative count of the coesponding histogam elements i E, i.e., = t 1 x t=i, whee 1 cond is an indicato function which is equal to 1 when cond is tue and 0 othewise. Fig. 1(a) illustates an example of a classical steaming histogam. To measue the disciminative and dynamic similaity between steaming histogams, we build the histogam V

4 JOURNAL OF L A T E X CLASS FILES, VOL. 14, NO. 8, AUGUST fom its weighted elements. Specifically, we assign a disciminative weight and a gadual fogetting weight to each steaming element to measue disciminative and dynamic similaity between steaming histogams, espectively Steaming Histogam with Disciminative Weights Each histogam element i is associated with a disciminative weight wi d. Specifically, we assume that each histogam is associated with a label l, whee l L and L efes to a set of labels fo histogams. Taking the histogam of customes visits to a POI as an example, a POI can be associated with a categoy (label l) fast food estauant, and L efes to all possible POI categoies hee. To compute the disciminative weight wi d fo each histogam element i, we leveage a widely used weighting function, i.e., entopy weighting [34]. (We note that ou appoach is not limited to any specific weighting function.) Specifically, we use entopy weighting to empiically measue the uncetainty of the labels when obseving individual histogam elements. Fo each histogam elements i in E, we compute its entopy weight wi d as follows: wi d l L P [l, i] log P [l, i] = 1 + (1) log L whee P [l, i] is the pobability that a histogam with element i is labeled as l, l L. Highe values of wi d imply highe degees of disciminability fo the coesponding histogam element i. To calculate P [l, i], we maintain a label fequency vecto F l of size E fo each label to ecod the cumulative fequency of each histogam element in E. With each incoming histogam element i, we update Fi l accodingly. Thus, we ae able to empiically compute F P [l, i] = l i at any time. Paticulaly, fo each incoming histogam element i, only the coesponding wi d needs l L F i l to be updated. In pactice, F l is maintained in a count-min sketch data stuctue fo memoy efficiency (we will discuss it in Section 4.4). Subsequently, we compute the histogam V R E >0 such that is associated with its weighted wi d, i.e., = t wd i 1 x t=i. Fig. 1(b) shows an example of a steaming histogam with disciminative weights Steaming Histogam with Gadual Fogetting Weights Each steaming element x t is associated with a gadual fogetting weight w g t, which is invesely popotional to its age. To compute w g t, we adopt the exponential decay weight [19], which is computed as follows: w g t = e λ(tn t) (2) whee t n is the ode of the latest histogam element eceived fom the steam and λ is the weight decay facto. Subsequently, we compute the histogam V R E >0 such that is the weighted cumulative count of the coesponding histogam elements i, i.e., = t wg t 1 xt=i. Fig. 1(c) shows an example of a steaming histogam with gadual fogetting weights. (a) Classical histogam (unweighted) = t 1 x t =i (b) Histogam with disciminative weights = t wd i 1 x t =i (c) Histogam with gadual fogetting weights = t wg t 1 x t =i (d) Histogam with both weights = t wd i wg t 1 x t =i Fig. 1. Illustation of the steaming histogams with diffeent weights. Left: The diffeent histogam elements ae assigned diffeent colos, and thei heights indicate the coesponding weights. Right: The same elements ae accumulated to build the coesponding histogam Steaming Histogam with Both Weights We combine both the disciminative and the gadual fogetting weights to compute the histogam, = t wd i w g t 1 xt=i. Fig. 1(d) shows an example of a steaming histogam with both weights. 3.2 Nomalized Min-Max Similaity To measue the similaity between steaming histogams, we esot to nomalized min-max similaity, which has been shown to be an effective similaity measue fo nonnegative data. Moe pecisely, Li [11] conducted an extensive study on min-max similaity (named as min-max kenel in [11]) by compaing fou kenels including linea kenel, min-max kenel, nomalized-min-max kenel and intesection kenel, on diffeent classification tasks ove a sizable collection of public datasets. The esults illustate the advantages of the (nomalized) min-max kenels. Fomally, given two steaming histogams V a and V b, the min-max similaity is defined as follows: i E min(v a Sim MM (V a, V b i ) =, b ) i E max( a, b) (3) As a histogam is often used to chaacteize the empiical data distibution, we apply the sum-to-one nomalization befoe computing the similaity: i E V a i = 1, i E V b i = 1. (4) In that way, Eq. 3 becomes the nomalized min-max similaity, denoted by Sim NMM.

5 JOURNAL OF L A T E X CLASS FILES, VOL. 14, NO. 8, AUGUST Poblem Fomulation Fo steaming histogams, the eve-inceasing cadinality E makes the computation of the nomalized min-max similaity become inefficient. Theefoe, we popose to maintain two sketches S a and S b of size K (K E ) fo V a and V b, espectively, with the popety that thei collision pobability (i.e., P [Sj a = Sj b ], whee j = 1, 2,..., K) is exactly the nomalized min-max similaity between V a and V b : P [Sj a = Sj b ] = Sim NMM (V a, V b ) (5) Then, the nomalized min-max similaity between V a and V b can be appoximated by the Hamming similaity between S a and S b. The computation ove S, which is compact and of fixed size, is much moe efficient than the one ove the full histogam V, which is a lage, eve-gowing vecto. The poblem tackled by this pape is how to ceate and maintain the above similaity-peseving sketch S fo the steaming histogam V with both disciminative and gadual fogetting weights in a fast and memoy-efficient manne. 4 D 2 HISTOSKETCH Ou D 2 HistoSketch is designed to efficiently maintain a set of compact and fixed-size sketches fo steaming histogams with both disciminative and gadual fogetting weights, in ode to efficiently appoximate thei nomalized minmax similaities. In this section, we fist pesent consistent weighted sampling, which inspied ou method. We then descibe ou method fo sketch ceation, followed by the poposed incemental sketch update pocess. 4.1 Consistent Weighted Sampling Consistent weighted sampling was oiginally poposed to appoximate min-max similaity fo complete and high dimensional data (e.g., a vecto of lage size) [11], [12], [13], [20]. The basic idea is to geneate data samples such that the pobability of dawing identical samples fo a pai of vectos is equal to thei min-max similaity. A set of such samples can then be egaded as a sketch of the input vecto. The fist consistent weighted sampling method [20] was designed to handle intege vectos. Specifically, taking a classical histogam V N E as an example, it fist uses a andom hash function h j to geneate independent and unifom distibuted andom hash values h j (i, f) fo each (i, f), whee i E and f {1, 2,..., }, and then etuns (i j, f j ) = agmin i E,f {1,2,...,} h j (i, f) as one sample (i.e., one sketch element S j ). Note that the andom hash function h j depends only on (i, f), and maps (i, f) uniquely to h j (i, f). By applying K independent andom hash functions (j = 1, 2,..., K), we geneate sketch S (of size K) fom V (of abitay size). Following this pocess, the collision pobability between two sketch elements (i a j, f a j ) and (i b j, f b j ), which ae geneated fom V a and V b, espectively, is poven to be exactly the min-max similaity of the two vectos [20], [32]: P [(i a j, fj a ) = (i b j, fj b )] = Sim MM (V a, V b ) (6) To impove the efficiency of the above method and allow eal vectos as input, Ioffe [12] late poposed an impoved method. Its key idea is that, athe than geneating diffeent andom hash values (whee has to be an intege), it diectly geneates one hash value a i,j (and its coesponding f N, f ) fo each i by taking as the input of the andom hash value geneation pocess. In such a case, can be any positive eal numbe. Based on this method, Li [11] futhe poposed to simplify the sketch by only keeping i j athe than (i j, f j ), and empiically poved the following popety: P [i a j = i b j ] P [(i a j, fj a ) = (i b j, fj b )] (7) A shot desciption of the method poposed in [11] is pesented in the following. To geneate one sketch element S j (sample i j ), the method fist daws thee andom vaiables offline as in-memoy paametes: i,j Gamma(2, 1), c i,j Gamma(2, 1) and β i,j Unifom(0, 1), and then computes ( y i,j = exp ( i,j log V )) i + β i,j β i,j i,j (8) c i,j a i,j = y i,j exp( i,j ) (9) The sketch element is then etuned as S j = agmin i E a i,j. Please efe to [11], [12] fo moe details and fo the poof of Eq. 6 and 7. In this pape, we design a fast and memoy-efficient sketching pocess to handle steaming histogams with both disciminative and gadual fogetting weights, whee V R E >0. Specifically, we popose a new appoach to diectly compute a i,j, which has the following thee highly desiable popeties: 1) The geneated a i,j follows the exact same distibution as the one geneated using Eq. 9, which ensues the coectness of ou sketching method (i.e., Eq. 6 and Eq. 7 still hold). 2) The a i,j sampling pocess does not equie the lage set of in-memoy paametes i,j, c i,j, β i,j whee i E and j = 1, 2,..., K; this makes ou method memoyefficient. 3) The ceated sketch S is invaiant unde unifom scaling of V, which seves as a basis fo the fast incemental sketch update. In the following, we fist pesent ou sketch ceation method, and then the incemental sketch update pocess. 4.2 Sketch Ceation Ou sketch ceation method boows the idea of consistent weighted sampling with eal numbe inputs [11]. Diffeent fom the oiginal method, we popose a new appoach to compute a i,j. Specifically, the objective of the oiginal method is to sample y i,j such that log y i,j is unifomly distibuted on [log i,j, log ] conditioned on i,j. Among many possible fomulations that can fulfill this distibution equiement, the oiginal fomulation (Eq. 8) is specifically designed to also sample the coesponding f j to obtain the sketch element (i j, f j log Vi ) (whee f = i,j + β i,j in [12]). Howeve, as poved in [11], fj can be ignoed fom the

6 JOURNAL OF L A T E X CLASS FILES, VOL. 14, NO. 8, AUGUST sketch (i j, f j ) (i.e., Eq. 7). In such a case, it is only necessay to sample y i,j satisfying its distibution equiement, and then compute the coesponding a i,j. In the following, we fist deive a simplified method to geneate y i,j, based on which we then deive a memoy-efficient method to diectly compute a i,j. Fist, we popose to compute y i,j as follows: y i,j = exp(log i,j β i,j ) (10) fo which the following poposition holds. Poposition 1. Eq. 10 geneates y i,j following the same distibution as geneated by Eq. 8, i.e., log y i,j follows a unifom distibution on [log i,j, log ] conditioned on i,j. Poof. Consideing the vaiable log y i,j, Eq. 8 can be deived as (we ignoe the subscipt (i, j) of and β in the following poof): log y i,j = ( log = log ) + β β (( log Vi + β ) log V ) (11) i + β log Vi log Vi whee ( + β) + β is the fac function of log + β, which etuns its factional pat [37]. Since both and ae known, this fac function can be consideed log Vi log Vi as fac(β + C), whee C = is a constant. Consideing β Unifom(0, 1), this function actually etuns the factional pat of a vaiable following Unifom(C, C + 1), which emains the same as Unifom(0, 1). In othe wods, it is a unifom mapping fom Unifom(0, 1) to itself. Subsequently, we have fac( + β) Unifom(0, 1), which can be eplaced by β. Theefoe, we obtain: log y i,j = log β (12) which is the same as in Eq. 10. Theefoe, y i,j geneated by Eq. 10 follows the same distibution as geneated by Eq. 8. We intoduce z = log y i,j and compute its Cumulative Distibution Function (CDF) as follows: P (Z < z) = P (log β < z) ( ) log Vi z = P < β Consideing β Unifom(0, 1), we obtain: P (Z < z) = 1 log z = z (log ) log (log ) (13) (14) which is the CDF of Unifom(log, log ). This completes the poof. Second, based on Poposition 1, we deive a memoyefficient method to diectly compute a i,j as follows: a i,j = log β (15) fo which the following poposition holds. Poposition 2. Eq. 15 geneates a i,j following the same distibution as geneated by Eq. 9. Poof. As suggested by Poposition 1, we now use Eq. 10 to sample y i,j. Combining Eq. 10 with Eq. 9 and knowing (1 β) and β both follow Unifom(0, 1), we obtain: a i,j = c exp( β) (16) Consideing a new vaiable m = β, we can compute its Pobability Distibution Function (PDF) as: f M (m) = β f B(β)f R ( m β )dβ (17) whee f B ( ) and f R ( ) ae the PDF of β Unifom(0, 1) and Gamma(2, 1), espectively. We futhe deive f M (m) as follows: f M (m) = β 1 m β exp( m β )dβ = exp( m) (18) We see that m = β follows the exponential distibution Exp(1), and can then be sampled as m = log β, β Unifom(0, 1). Subsequently, Eq. 16 can be simplified as: a i,j = cβ (19) It is woth noting that the numeato cβ follows the same distibution as β Exp(1), as they ae both the multiplication of a vaiable fom U nif om(0, 1) and a vaiable fom Gamma(2, 1). Theefoe, cβ can be futhe simplified as log β whee β Unifom(0, 1). We note that as β, β, β Unifom(0, 1), we keep using only β Unifom(0, 1) fo the sake of simplicity fo notations, and obtain: a i,j = log β (20) which is the same as Eq. 15. This completes the poof. Poposition 2 ensues the coectness of ou sketching method (i.e., Eqs. 6 and 7). Moe impotantly, ou method only equies the paametes β i,j Unifom(0, 1), which can be efficiently geneated using andom hash functions athe than being maintained as in-memoy paametes Memoy-Efficient Sketching Implementation To efficiently geneate β i,j in an online manne, we apply a andom hash function h j on i to obtain the coesponding hash value h j (i), which follows a unifom distibution ove (0, 1). We then use h j (i) to eplace β i,j. With K independent andom hash functions {h j j = 1, 2,..., K}, we can compute a i,j in a memoy-efficient manne as follows: a i,j = log h j(i) (21) In summay, ou memoy-efficient method maintains only K independent andom hash functions {h j j = 1, 2,..., K}, athe than the lage set of in-memoy paametes i,j, c i,j, β i,j whose size is popotional to the cadinality of the histogam elements E. Alg. 1 shows the sketch ceation pocess. Note that we keep both the sketch S and its hash values A (the latte will be used fo incemental sketch update). Fig. 2 shows the sketch ceation pocess fo one sketch element S j.

7 JOURNAL OF L A T E X CLASS FILES, VOL. 14, NO. 8, AUGUST Fig. 3. Illustation of deceasing gadual fogetting weights (exponentially) when a new histogam element x t+1 is eceived (fome weights ae epesented in gay) Fig. 2. Ceating one sketch element fom histogam V with cadinality E = 5 (i = 1, 2,..., 5). By computing the hash value a i,j fo each i, we select the histogam element whose hash value is minimal as the sketch element and also keep its coesponding hash value, i.e., (S j = 3, A j = 0.14). Algoithm 1 Sketch ceation Input: Histogam V, Sketch length K, Independent andom hash functions {h j j = 1, 2,..., K} Output: Sketch S and the coesponding hash values A 1: fo j=1,2,...,k do log hj(i) 2: Compute a i,j = 3: Set sketch element S j = agmin i E a i,j 4: Set the coesponding hash value A j = min i E a i,j 5: end fo 6: etun S and A β) Unifom(0, 1), the computation can be simplified to ĥ j (i) = 1 β. As we only cae about the odeing of those hash values, we can futhe simplify the computations via a monotonic tansfomation, and obtain ĥj(i) = log β. By eplacing β with a andom hash value h j (i) (h j is a andom hash function), we obtain: ĥ j (i) = log h j(i) (25) which is the same as Eq. 21. In othe wods, a i,j is equal to ĥ j (i) = min f=1,2,...,vi h j (i, f) fo intege V N E. Theefoe, ou method can be egaded as a genealization of the oiginal consistent weighted sampling method to measue the nomalized min-max similaity with eal V R E Intinsic Connection to Consistent Weighted Sampling with Intege Vectos We now discuss its intinsic connection to the oiginal consistent weighted sampling method with intege vecto inputs [20], and show that ou method is indeed a genealization of the oiginal method to eal vectos inputs. The oiginal consistent weighted sampling method [20] uses a andom hash function h j to geneate the hash values h j (i, f) fo each (i, f), whee i E and f {1, 2,..., }, and then etuns (i j, f j ) = agmin i E,f {1,2,...,} h j (i, f) as one sample (i.e., one sketch element S j ). As suggested by [11], the sketches can be simplified by keeping only i j and discading f j. By defining ĥj(i) = min f=1,2,...,vi h j (i, f), one sample is then etuned as i j = agmin i E ĥj(i). We now deive a new method to diectly compute ĥ j (i) without involving f. The key idea is to diectly compute ĥj(i) fom only one andom hash value and, athe than andom hash values. Specifically, since h j Unifom(0, 1), we compute the CDF of ĥj(i) as: P [ĥj(i) p] = 1 (1 p) Vi (22) fo p (0, 1). We assume a andom unifomly distibuted vaiable β U nif om(0, 1), and fomulate its CDF P [β q] = q, q (0, 1) as: P [β q] = P [β 1 (1 p) Vi ] = 1 (1 p) Vi (23) whee q = 1 (1 p) Vi is an invetible continuous inceasing function (p = 1 (1 q)) ove (0, 1). By applying the change-of-vaiable technique on Eq. 23, we obtain: P [1 (1 β) p] = 1 (1 p) Vi (24) Since Eqs. 24 and 22 show the same CDF, ĥ j (i) can be obtained by ĥj(i) = 1 (1 β). As (1 4.3 Incemental Sketch Update Incemental sketch update equies that a new sketch S(t + 1) can be fast computed based on the fome sketch S(t) (with its coesponding hash values A(t)) and the incoming histogam element x t+1. Specifically, when a new histogam element x t+1 = i is eceived fom the data steam, the gadual fogetting weights of all existing elements of the histogam evolve by a facto of e λ (as shown in Fig. 3), which esults in a unifom scaling of V. One of the key popeties of ou sketch is that it is invaiant unde unifom scaling of V, which allows us to pefom the scaling by quickly adjusting A only. Aftewads, as only the disciminative weight of histogam element i is updated, we only need to ecompute, and then its coesponding hash value a i,j. Finally, the new sketch S(t + 1) can be computed based on the adjusted sketch S(t) (with A(t)) and the incoming histogam element x t+1 = i (with a i,j). In the following, we fist pesent the unifom scaling invaiance popety of ou sketch, and then the incemental update pocess. Poposition 3. If sketch S (with its coesponding hash values A) is ceated fo V using Alg. 1, then fo any positive constant γ, S emains the sketch fo γv with the coesponding hash values 1 γ A. Poof. Alg. 1 computes a sketch element S j follows: fo γv as a i,j = log h j(i) = 1 γ γ a i,j (26) ( ) 1 S j = agmin i E γ a i,j = agmin a i,j = S j (27) i E A j = min i E This completes the poof. 1 γ a i,j = 1 γ min i E a i,j = 1 γ A j (28)

8 JOURNAL OF L A T E X CLASS FILES, VOL. 14, NO. 8, AUGUST Fig. 4. An example of incementally updating one sketch element. I). Accoding to the scaling of the histogam, we keep the sketch invaiant S j (t) = 3, and adjust its hash value fom A j (t) = 0.14 to A j (t) e λ = (λ = 0.05 in this example). II). By adding the incoming histogam element x t+1 = 2 (2 E) with weight wi d 1 to the scaled histogam, we ecompute only the hash value fo i = 2, i.e., a 2,j = III). By selecting the minimum hash value between A j (t) e λ = and a 2,j = 0.143, we update S j (t + 1) = 2 and A j (t + 1) = Poposition 3 seves as a basis fo ou incemental sketch update pocess, which woks as follows (fo one sketch element S j ): I. When a new histogam element x t+1 = i is eceived, we scale V (t) by a facto of e λ, and adjust the sketch accoding to Poposition 3, i.e., S j (t) and A j (t) e λ. II. We add the incoming histogam element i to the scaled histogam. Fist, we update the disciminative weight wi d accoding to the method descibed in Section Then, as the newest histogam element always has gadual fogetting weight 1 (w g t = e λ(tn tn) = 1), we add the incoming histogam element i with the oveall weight wi d 1 to the scaled histogam: (t+1) = (t) e λ +wi d 1 if i E. In case i / E, we add i to E and expand V to include (t+1) = wi d 1. Aftewads, we ecompute only the hash value fo i, i.e., a i,j. III. By compaing the new hash value a i,j with the adjusted hash value A j (t) e λ, we update the sketch S j (t + 1) = i and A j (t + 1) = a i,j if a i,j < A j (t) e λ, S j (t + 1) = S j (t) and A j (t + 1) = A j (t) e λ othewise. Fig. 4 illustates the incemental sketch update pocess following the pevious example shown in Fig Implementation Details Ou incemental sketch update pocess equies to access the fome histogam V (t) to compute V (t+1). To maintain such a steaming histogam V of an eve-inceasing size, we popose an extended count-min sketch model. The classical count-min sketch [25] is a fixed-sized pobabilistic data stuctue Q (d ows and g columns) seving as a fequency table of steaming elements. It uses d independent andom hash functions h l (l = 1, 2,..., d) to map steaming elements onto a ange of 1, 2,..., g (countes). Evey time a new element i is eceived, fo each ow l, its hash function h l is applied to i to detemine a coesponding column h l (i), and then the counte Q l,hl (i) is inceased by 1. To get the estimated fequency at time t, the coesponding hash function is applied to i to look up the coesponding counte fo each ow. The estimate is then etuned as the minimum of all the pobed countes acoss all ows, i.e., (t) = min l Q l,hl (i). The estimated fequency eo is guaanteed [38] to be at most 2 g with pobability 1 ( 1 2 )d. In ou case of steaming histogam with both disciminative and gadual fogetting weights, the cumulative fequency (weights) of all histoical histogam elements is scaled with a facto e λ (i.e., V (t) e λ ) evey time a new element is eceived fom the data steams. Theefoe, we extend the above count-min sketch method to conside such decay weights as follows. Fist, befoe adding a new element i fom the data steam, we unifomly scale all countes acoss all ows by that facto, i.e., Q l,hl (i)(t) e λ. In such a way, fo any histogam element i, its estimated weighted cumulative count (t) is also scaled to min l Q l,hl (i)(t) e λ = (t) e λ, which coesponds exactly to Step I in ou sketch update pocess. Aftewads, we add the new element with its coesponding weight wi d 1, i.e., Q l,hl (i)(t + 1) = (t) e λ + wi d 1, which coesponds to Step II in ou sketch update pocess. As the afoementioned estimated fequency eo of count-min sketch depends only on its stuctual paamete g and d, it is easy to see that the estimated eo does not change unde this staightfowad extension, as a unifom scaling does not affect the data stuctue itself. Fo the detailed poof of the estimated eo, please efe to [38]. In this study, we empiically set the default paametes d = 10, g = 50 to guaantee an eo of at most 4% with pobability (see Section fo moe details). In addition, we also use a classical count-min sketch to stoe the label fequency vecto F l, which is used to compute the disciminative weights as descibed in Section Fo the sake of simplicity, we keep the same paametes as the one we used fo V. 4.5 Analysis of D 2 HistoSketch Discussion on Eo Bound The oiginal consistent weighed sampling technique [32] have poven the coectness of Eq. 6, with its associated eo bound. Howeve, this method takes only intege vectos as input, and equies a sketch element (i j, f j ) which pohibits its application to steaming histogams. To handle steaming histogams, ou poposed method is based on the 0-bit consistent weighted sampling technique poposed by [11]. This technique is able to take eal vectos as input, and moe impotantly, the autho empiically poved Eq. 7, which seves as a fundamental element of ou method. Howeve, as mentioned by the autho in [11], a igoous poof of Eq. 7 tuns out to be a difficult pobability poblem, which is not given in the oiginal pape. In this study, we focus on designing a sketching method fo steaming histogams with both disciminative and gadual fogetting weights, and we have igoous poven the coectness of ou method by showing its equivalence to the 0-bit consistent weighed sampling technique (using Poposition 1 and 2). Then, ou method

9 JOURNAL OF L A T E X CLASS FILES, VOL. 14, NO. 8, AUGUST has the same eo bound as the 0-bit consistent weighed sampling technique [11]. Howeve, the futhe poof about the 0-bit consistent weighed sampling technique and its associated eo bound emains a difficult poblem, and is outside the scope of ou pape Time and Space Complexity Analysis Time. Since ou sketches ae incementally maintained, we discuss the time complexity fo each incoming histogam element fom the data steams. Specifically, to update a sketch of length K, we pefom ou incemental sketch update pocess fo all K sketch elements, which takes O(K) time. In addition, we also need to etieve/update the coesponding count-min sketches fo both V and F l (with d ows), which takes O(d) time. The total time complexity fo updating one histogam element is hence O(K + d). Compaed to ou pevious wok HistoSketch [14], ou poposed method with few in-memoy paametes can compute sketches in a moe efficient way, showing a highe pocessing speed with an impovement of 12.2% (see Section fo moe detail). Space. Each steaming histogam is epesented by a sketch of length K taking O(K) space. Fo the incemental sketch update pupose, we stoe the aw steaming histogam n a count-min sketch data stuctue (d ows and g columns) taking O(dg) space. Subsequently, the total space complexity is O(K +dg) fo one steaming histogam. Consideing a set of n histogams with a set of L labels (the label fequency vecto F l stoed in a count-min sketch data stuctue with d ows and g columns), the total space complexity is O(n (K + dg) + L (dg)). Note that ou space complexity is linea w..t. the numbe of histogams n, and moe impotantly, that it is independent of the eve-gowing cadinality of steaming histogam elements E. Compaed to ou pevious wok HistoSketch [14] that equies in-memoy paametes i,j, c i,j, β i,j and thus has a space complexity O(n (K + dg) + L (dg) + K E ), ou poposed memoy-efficient method can damatically educe the memoy consumption in pactice, as E is usually lage and gowing ove time. Fo example, in ou expeiments on the NYC dataset (n = 3173, K = 100, L = 9, d = 10, g = 50 and E is about 2 million), the memoy consumption of ou poposed method is only 1% of the memoy equied by ou pevious wok HistoSketch. 5 EXPERIMENTAL EVALUATION In this section, we evaluate D 2 HistoSketch on multiple classification tasks using both synthetic and eal-wold datasets. In the following, we fist pesent ou expeimental setup, followed by the esults on both types of datasets. 5.1 Expeimental Setup To evaluate the pefomance of ou similaity-peseving sketches, we pefom classification tasks based on these sketches in diffeence scenaios, which is a common evaluation scheme fo similaity peseving sketching methods [4], [11], [13], [14]. Specifically, based on labeled steaming histogams, we ty to classify those histogam instances without labels. We use a KNN classifie [39] which can always take the most up-to-date taining data (sketches) fo classification without maintaining a built classification model. Such Fig. 5. Pobability of steaming histogam elements geneated fom its initial distibution fo the synthetic dataset. a popety fits ou case of classifying steaming histogams with continuously incoming histogam elements, whee the sketches ae continuously updated accodingly. In addition, using a simple KNN classifie, we can put moe focus on the distance measues athe than classifies. We empiically set KNN to conside the five neaest neighbos. We conside the following evaluation scenaios: Synthetic Dataset. Synthetic data is widely used in studying concept dift adaptation [19], [40]. The advantage is that we can simulate diffeent cases of concept dift in steaming histogams with contollable paametes. Typical methods of simulating data steams with concept dift often use a moving hypeplane to geneate a steam of complete data instances [40]. Howeve, it cannot be diectly adopted fo steaming histogams, as the elements of a histogam ae obseved in a steaming manne. Theefoe, we design ou own data simulation method. Specifically, we conside two Gaussian distibutions N (100, 20) and N (110, 20) epesenting two classes of histogams, espectively. The steaming histogam elements ae then geneated as the neaest integes of the andom numbes sampled fom those distibutions. Fo each class, we simulate 500 histogams with 1000 elements each. We then split the 500 histogams to 50%-50% fo taining and the testing, espectively. The histogam elements ae geneated in a andom ode. To simulate concept-dift issues, we conside both abupt and gadual dift cases [19] in testing data. Fo abupt dift, stating fom 25% of steaming histogam elements, the testing data of one distibution abuptly stats to eceive the histogam elements geneated fom the othe distibution, and also changes thei labels immediately. Fo gadual dift, fom 25% to 35% of steaming histogam elements, the testing data of one distibution gadually stats to eceive the histogam elements geneated fom the othe distibution with an inceasing pobability (fom 0 to 1). The labels of the testing data also change gadually fom one class to the othe, i.e., fom 0% to 100%. Fig. 5 shows the pobability of steaming histogam elements geneated fom its initial distibutions. Note that it is complementay to the pobability of histogam elements geneated fom the othe distibution. POI Dataset. Ou sketching method can be applied to solve the poblem of semantic place labeling [4], whee we want to infe a place s categoy (e.g., supemaket o ba) based on its customes visiting pattens (i.e., the steaming histogam of its customes visits). The basic intuition is that POIs of the diffeent type usually have diffeent tempoal visiting pattens, e.g., bas ae mostly visited duing the night while museums ae often visited duing the daytime. Pevious studies have shown that consideing usetime pais as histogam elements (i.e., fine-gained visiting

10 JOURNAL OF L A T E X CLASS FILES, VOL. 14, NO. 8, AUGUST TABLE 1 POI dataset statistics Dataset New Yok City Tokyo Istanbul (NYC) (TKY) (IST) Numbe of check-ins 142, , ,771 Numbe of POIs 3,174 2,993 3,120 Numbe of uses 12,798 9,160 15,479 pattens) yields much highe accuacy than consideing only time (i.e., coase-gained visiting pattens) [4]. We thus conside fine-gained pattens in the following. Specifically, fo one use s visit to a POI, we fist map the visiting time onto one of the 168 hous in a week peiod (discetization of time), and then conside the use-time pai as a histogam element. With a lage and continuously inceasing numbe of uses ove time, the cadinality of the steaming histogam apidly inceases. Moe impotantly, the visiting patten of a POI may change both abuptly (e.g., caused by the change of POI type) and gadually (e.g., caused by the intoduction of new menu items in a estauant). To evaluate ou method using this task, we use a dataset fom Fousquae povided by [3], [23], which has been widely used to study the poblems of semantic place labeling [4], [14] and POI ecommendation [41], [42], [43]. The dataset contains use check-in data on POIs fo about two yeas (fom Apil 2012 to Mach 2014). Each check-in ecods one visit of a use to a POI (with the associated categoy) at a cetain time. We andomly select 20% of the POIs as unlabeled testing data and egad the est as taining data. The classification is pefomed at the end of each month on the second yea (in ode to avoid too few check-ins fo some POIs duing the fist yea). The categoies (labels) of POIs in the dataset ae classified by Fousquae into 9 oot categoies (i.e., Ats & Entetainment, College & Univesity, Food, Geat Outdoos, Nightlife Spot, Pofessional & Othe Places, Residence, Shop & Sevice, Tavel & Tanspot), which ae futhe classified into 291 sub-categoies. Without loss of geneality, we select thee big cities, New Yok City, Tokyo and Istanbul, fo ou expeiments. Table 1 summaizes the main chaacteistics of ou dataset. Movielens Dataset. Ou sketching method can also be applied to pedict movie gene based on uses tagging activity steams. Specifically, on a movie ecommendation platfom, a use can add tags to a movie. Subsequently, we chaacteize each movie using the histogam of use tagging activities, whee each use is a histogam element. Simila to the case of POI dataset, we ae then able to classify a movie in to diffeent genes based on its steaming histogam. We use the MovieLens 20M Dataset povided by [44]. This dataset contains movie tagging activity about 10 yeas. We select top 8 pimay genes (i.e., action, adventue, animation, childen, comedy, Cime, documentay, Dama ), and keep only the tagging data on the elated movies. Finally, we obtain 24,394 movies, 7,622 uses, 434,054 tagging activities. We andomly select 20% of the movies as unlabeled testing data and egad the est as taining data. The classification is pefomed at the end of 6th to 10th yeas. (a) Abupt dift Fig. 6. Pefomance on classification task. 5.2 Pefomance on Synthetic Dataset (b) Gadual dift In this section, we fist show the effectiveness of D 2 HistoSketch by compaing the classification pefomance using diffeent steaming histogams and thei coesponding sketches. Aftewads, we study the impact of the sketch length K, the weight decay facto λ, and the count-min sketch paametes in diffeent scenaios Pefomance on classification task To demonstate the advantages of consideing disciminative and gadual fogetting weights in steaming histogams and the similaity-peseving sketches, we compae the following fou methods that keep the full histogams in memoy, and also thei coesponding sketching methods. We set sketch length K = 100 fo all the sketching methods, use entopy weighting fo disciminative weights, and set λ = 0.02 fo gadual fogetting weights when applicable. Histogam-Classical: full histogams whee the histogam s elements ae unweighted. Histogam-Disciminative: full histogams whee the histogam s elements ae assigned with disciminative weights. Histogam-Fogetting: full histogams whee the histogam s elements ae assigned with gadual fogetting weights. D 2 Histogam: full histogams whee the histogam s elements ae assigned with both disciminative and gadual fogetting weights. Sketch-Classical: ou sketching method configued with unweighted histogam elements (appoximation of Histogam-Classical). Sketch-Disciminative: the sketching method with only disciminative weights (appoximation of Histogam- Disciminative), which is equivalent to POISketch [4]. Sketch-Fogetting: the sketching method with only gadual fogetting weights (appoximation of Histogam-Fogetting), which is equivalent to HistoSketch [14]. D 2 HistoSketch: ou poposed sketching method with both disciminative and gadual fogetting weights (appoximation of D 2 Histogam). Fig. 6 shows the classification accuacy ove time (numbe of steaming elements pe histogam) fo both abupt and gadual difts. Fist, compaing the accuacy of using full histogams, we obseve that ou D 2 Histogam achieves the best esults oveall. On one hand, consideing gadual fogetting weights can significantly educe the ecovey time fom concept dift. Fo example, in the case of abupt

11 JOURNAL OF L A T E X CLASS FILES, VOL. 14, NO. 8, AUGUST (a) Abupt dift Fig. 7. Impact of sketch length K. (b) Gadual dift (a) Abupt dift Fig. 8. Impact of weight decay facto λ. (b) Gadual dift dift (stating fom 250th histogam elements), accuacy ecoves afte eceiving 450 and 150 elements fo Histogam- Disciminative and D 2 Histogam, espectively. On the othe hand, disciminative weights can effectively impove accuacy outside of the peiod fo concept dift adaptation (i.e., befoe concept dift and afte ecovey fom concept dift). Fo example, fo both cases of abupt and gadual difts, D 2 Histogam shows a 2.6% impovement of accuacy fom Histogam-Fogetting. Second, among all the sketching methods, the poposed D 2 HistoSketch achieves the best esults, as it can efficiently appoximate the similaity between D 2 Histogam with both disciminative and gadual fogetting weights. Specifically, D 2 HistoSketch quickly adapts to concept dift (showing simila adaptation speed as that of D 2 Histogam), and shows a significant accuacy impovement of 7.9% compaed to Sketch-Fogetting Impact of sketch length K The sketch length K influences how well the sketch can appoximate the similaity with the oiginal data. In this expeiment, by fixing the gadual fogetting weight decay facto λ = 0.02, we vay the sketch length K within [20, 50, 100, 200, 500, 1000] to investigate the pefomance of ou method. We also plot the esults fom Histogam- Classical and D 2 Histogam as efeences. Fig. 7 shows the classification accuacy ove time in both cases of abupt and gadual dift. Fist, we obseve a positive impact of sketch length K on the classification accuacy, i.e., lage values of K imply a highe accuacy, as longe sketches can peseve moe infomation and thus bette appoximate the similaities of the D 2 Histogam. The accuacy flattens out afte K = 500 (D 2 HistoSketch with K 500 ae highly ovelapped with D 2 Histogam), indicating that a sketch of length 500 is sufficient fo accuate similaity appoximation. Second, we find that the sketch length K has no obvious impact on the adaptation speed, which is actually contolled by the gadual fogetting weight decay facto λ (we will discuss this in the next expeiment) Impact of weight decay facto λ The weight decay facto λ balances the tade-off between the concept dift adaptation speed and the similaity appoximation pefomance. In this expeiment, by fixing the sketch length K = 100, we vay the weight decay facto λ within [0, 0.005, 0.01, 0.02, 0.05, 0.1] to investigate the pefomance of ou method. We also compae ou method with the following two methods: Histogam-LatestK whee the histogam is built with the latest K histogam elements fom the steam, which is a typical method fo abupt fogetting (i.e., sliding window based concept-dift adaptation) [17]. It can also be egaded as a sketching method in the sense that the latest K histogam elements (unweighted) ae the sketches to epesent the histogam. Histogam-LatestK-Disciminative whee we futhe incopoate disciminative weights (i.e., entopy weights as fo ou method) into Histogam-LatestK. Fig. 8 shows the classification accuacy ove time in both cases of abupt and gadual dift. Fist, by compaing the esults of diffeent weight decay factos, we obseve clealy the tade-off between the concept dift adaptation speed and classification accuacy. On one hand, lage λ values imply faste adaptation to concept dift, as the algoithm quickly fogets outdated data (i.e., it puts lowe weights on the outdated data). On the othe hand, lage values of λ lead to lowe classification accuacy, as the algoithm uses less infomation fom fome histogam elements fo sketching, which leads to wose similaity appoximation. Second, compaed to the two additional baselines, we find that ou D 2 HistoSketch-λ-0.05 outpefoms Histogam-LatestK by achieving highe accuacy and faste adaptation speed at the same time. Howeve, we found that Histogam-LatestK- Disciminative shows compaable esults to ou method, i.e., its adaptation speed is faste than D 2 HistoSketch-λ and slowe than D 2 HistoSketch-λ-0.05 while its accuacy is lowe than D 2 HistoSketch-λ-0.02 and highe than D 2 HistoSketch-λ Despite such simila esults, thee ae two obvious advantages of ou methods: 1) D 2 HistoSketch is able to balance the adaptation speed and the accuacy unde fixed-size sketches while Histogam-LatestK needs to vay sketch length K to tune such tade-off; 2) ou method is much faste in similaity computation than Histogam- LatestK-Disciminative, as the latte equies set opeations while D 2 HistoSketch only elies on Hamming distance. Fo example, to classify one histogam using ou testing PC 1, ou method needs only 13ms while Histogam-LatestK takes 152ms, which shows a 12x speedup Impact of count-min sketch paametes In this expeiment, we study the impact of the count-min sketch paametes on the classification pefomance. Specifically, we study the classification accuacy of ou method by vaying the numbe of ows d and the numbe of column g within [5, 10, 20, 50, 100]. 1. Intel Coe i7-4770hq@2.20ghz, 16GB RAM, Mac OS X, implementation using MATLAB v2014b.

12 JOURNAL OF L A T E X CLASS FILES, VOL. 14, NO. 8, AUGUST (a) Abupt dift Fig. 9. Impact of count-min sketch paametes. (b) Gadual dift Fig. 9 shows the aveage classification accuacy outside of the peiod fo concept dift adaptation. We obseve that lage d and g can both lead to a highe accuacy, as they can bette appoximate the fequency table of histogam V and the label fequency vecto F l. Howeve, lage d and g will also lead to highe time and space complexity as discussed in Section Theefoe, as the accuacy flattens out afte d = 10, g = 50, we empiically set the count-min sketch paametes d = 10, g = 50 fo all the expeiments. 5.3 Pefomance on Real-Wold Datasets To evaluate ou method using eal-wold datasets, we fist compae it with state-of-the-at appoaches, and then show its classification accuacy ove time, followed by its untime pefomance Compaison with othe methods We compae D 2 HistoSketch with all afoementioned baselines. The sketch length K is empiically set to 100 fo all elated methods. Fig. 10 plots the aveage classification accuacy. Fist, we obseve that D 2 Histogam yields the highest accuacy, showing the effectiveness of consideing both disciminative and gadual fogetting weights on steaming histogam elements. Thid, ou D 2 HistoSketch also outpefoms othe sketching methods by efficiently appoximating similaity between D 2 Histogam. In paticula, D 2 HistoSketch can efficiently appoximate the similaity with only a small loss of classification accuacy (e.g., about 3.25% fo oot POI categoies on the NYC dataset) compaed to D 2 Histogam. Finally, D 2 HistoSketch also outpefoms Histogam-LatestK/Histogam-LatestK- Disciminative, showing the effectiveness of gadual (athe than abupt) fogetting of histoical data. In addition, compaing the esults between the POI dataset and the Movielens dataset, we find that the impovement by consideing gadual fogetting (eithe in histogams and sketches) on Movielens dataset is smalle than on POI dataset. It is due to the fact that concept-dift is aely obseved in the Movielens dataset. In othe wods, compaed to POIs, a movie aely changes its gene Classification accuacy ove time In this expeiment, we study the classification accuacy of ou sketching method ove time. We focus on the NYC dataset with the 9 oot levels of POI categoies (Expeiments on the othe datasets show simila esults). In addition to the andomly selection stategies of testing POIs, we futhe conside only the POIs with categoy changes as testing data, as the change of POI categoies will likely lead to concept dift (paticulaly of the abupt kind). We keep the same paamete setting as in the pevious expeiment. Fig. 10. Compaison with othe methods (a) Random testing POIs Fig. 11. Classification pefomance ove time (b) Categoy-changing POIs Fig. 11(a) shows the classification accuacy fo each of the 12 testing months. We obseve that the accuacy of all sketching methods slightly inceases ove time with the accumulated histogam elements, as obseving moe histogam elements leads to moe accuate similaity measuement of steaming histogams. In paticula, compaed to othe sketching methods, ou D 2 HistoSketch achieves consistently highe accuacy by consideing both disciminative and gadual fogetting weights. Fig. 11(b) shows the same esults on the testing POIs with categoy changes only. We obseve a lage impovement of consideing gadual fogetting weights than that in Fig. 11(a). Fo example, the accuacy gain of ou method ove Sketch-Disciminative is 3.03% when testing on categoy-changing POIs, while it is 2.13% when testing on andom POIs. This obsevation futhe shows the effectiveness of ou method at handling concept dift. Note that as opposed to the synthetic dataset, we do not obseve any sudden dop of accuacy, as sets of POIs aely change thei types simultaneously Runtime pefomance fo classification In this expeiment, we investigate the untime pefomance of sketch-based classification. Moe pecisely, we evaluate the classification time w..t. sketch length K. We also compae with Histogam-LatestK which can be egaded as a sketching method as it keeps only the latest K histogam elements. Fig. 12(a) plots the KNN classification time (log scale) on ou test PC 3. We obseve that compaed to the full histogams (D 2 Histogam), using D 2 HistoSketch damatically educes the classification time (with a 7500x speedup), since the Hamming distance between sketches of size K (e.g., K=100 in pevious expeiments) can be much moe efficiently computed than the nomalized min-max similaity between the full histogams of much lage size E (e.g., E is about 2 million fo the NYC dataset). We also note that all the sketching methods have simila classification time, as they all compute the Hamming distance between vectos

13 JOURNAL OF L A T E X CLASS FILES, VOL. 14, NO. 8, AUGUST (a) Classification time Fig. 12. Runtime pefomance (b) Pocessing speed of size K. Compaed to Histogam-LatestK, ou method also shows a 15x speedup, as Histogam-LatestK equies set opeations fo similaity computation Runtime pefomance fo sketch maintenance In this expeiment, we investigate the untime pefomance of sketch maintenance. We evaluate the steaming histogam pocessing speed w..t. sketch length K (as the time complexity of maintaining D 2 HistoSketch mainly depends on the sketch length K). Hee we compae ou method with the following two methods: 1) HistoSketch-Oiginal [14] (consideing gadual fogetting weights only) whose sampling pocess is implemented with Eq 10 and 9 and equies the lage set of in-memoy paametes i,j, c i,j, β i,j ; 2) HistoSketch-Fast which is implemented with ou efficient sampling method as show in Eq. 21 and consideing gadual fogetting weights only (fo a fai compaison with HistoSketch-Oiginal). Fig. 12(b) shows the pocessing speed of steaming histogam elements. Fist, compaed to HistoSketch-Oiginal, HistoSketch-Fast shows a highe pocessing speed (with a 12.2% impovement ove HistoSketch-Oiginal). Note that these two methods ae equivalent in peseving similaities of steaming histogams with gadual fogetting weight only, and thus show exactly the same classification pefomance. Theefoe, ou poposed sampling method is moe untime-efficient than the sampling method in HistoSketch-Oiginal. Second, by adding the disciminative weights, D 2 HistoSketch yields a slightly slowe pocessing speed than HistoSketch-Fast, but with highe classification pefomance. Finally, compaed to HistoSketch-Oiginal, D 2 HistoSketch can achieve a faste pocessing speed (with a 5.9% impovement on aveage) and a highe classification accuacy (with a 2.78% impovement on aveage) at the same time. We also find that the pocessing speed slightly deceases with inceasing sketch lengths fo all the methods, as longe sketches equied a little moe time to update. Moeove, we believe that the pocessing speed of ou method (about 2200 pe second) is able to handle most eal-wold use cases. Fo example, Fousquae check-in steams achieved a peak-day ecod of 7 million check-ins/day in 2015 (about 81 check-ins/sec on aveage). In addition, ou method can be easily paallelized w..t. the numbe of histogams (i.e., numbe of POIs), as sketches of steaming histogams ae independently maintained fom each othe. 6 CONCLUSION AND FUTURE WORK This pape intoduces D 2 HistoSketch, a similaitypeseving sketching method fo steaming histogams to efficiently appoximate thei Disciminative and Dynamic similaity. D 2 HistoSketch is designed to fast and memoy-efficiently maintain a set of compact and fixedsized sketches of steaming histogams to appoximate thei nomalized min-max similaity. To povide high-quality similaity appoximations, D 2 HistoSketch consides both disciminative and gadual fogetting weights fo similaity measuement, and seamlessly incopoates them in the sketches. Based on both synthetic and eal-wold datasets, ou empiical evaluation shows that ou method is able to efficiently and effectively appoximate the similaity between steaming histogams while outpefoming stateof-the-at sketching methods. Compaed to full steaming histogams with both disciminative and gadual fogetting weights in paticula, D 2 HistoSketch is able to damatically educe the classification time (with a 7500x speedup) at the expense of a small loss in accuacy only (about 3.25%). As futue wok, we will exploe the poblem of sketching two-dimensional (bivaiate) steaming histogams, and apply ou method to othe application domains. ACKNOWLEDGMENTS This poject has eceived funding fom the Euopean Reseach Council (ERC) unde the Euopean Union s Hoizon 2020 eseach and innovation pogamme (gant ageement /GaphInt), and Fudan Univesity Statup Reseach Gant (SXH ). REFERENCES [1] O. Chapelle, P. Haffne, and V. N. Vapnik, Suppot vecto machines fo histogam-based image classification, IEEE Tans. on Neual Netwoks, vol. 10, no. 5, pp , [2] L. D. Bake and A. K. McCallum, Distibutional clusteing of wods fo text classification, in Poc. of SIGIR, 1998, pp [3] D. Yang, D. Zhang, L. Chen, and B. Qu, Nationtelescope: Monitoing and visualizing lage-scale collective behavio in lbsns, Jounal of Netwok and Compute Applications, vol. 55, pp , [4] D. Yang, B. Li, and P. Cudé-Mauoux, Poisketch: Semantic place labeling ove use activity steams, in Poc. of IJCAI, 2016, pp [5] J. Wang, T. Zhang, J. Song, N. Sebe, and H. T. Shen, A suvey on leaning to hash, IEEE Tansactions on Patten Analysis and Machine Intelligence, vol. 40, no. 4, pp , [6] C. C. Aggawal and P. S. Yu, On classification of high-cadinality data steams, in Poc. of SDM, 2010, pp [7] Y. Bachach, E. Poat, and J. S. Rosenschein, Sketching techniques fo collaboative filteing, in Poc. of IJCAI, 2009, pp [8] A. Z. Bode, M. Chaika, A. M. Fieze, and M. Mitzenmache, Min-wise independent pemutations, in Poc. of STOC, 1998, pp [9] M. Mitzenmache, R. Pagh, and N. Pham, Efficient estimation fo high similaities using odd sketches, in Poc. of WWW, 2014, pp [10] K. Kutzkov, M. Ahmed, and S. Nikitaki, Weighted similaity estimation in data steams, in Poc. of CIKM, 2015, pp [11] P. Li, 0-bit consistent weighted sampling, in Poc. of KDD, 2015, pp [12] S. Ioffe, Impoved consistent sampling, weighted minhash and l1 sketching, in Poc. of ICDM, 2010, pp [13] W. Wu, B. Li, L. Chen, and C. Zhang, Consistent weighted sampling made moe pactical, in Poc. of WWW, 2017, pp [14] D. Yang, B. Li, L. Rettig, and P. Cudé-Mauoux, Histosketch: Fast similaity-peseving sketching of steaming histogams with concept dift, in Poc. of ICDM, New Oleans, USA, 2017.

14 JOURNAL OF L A T E X CLASS FILES, VOL. 14, NO. 8, AUGUST [15] Y. Yang, F. Liang, N. Jojic, S. Yan, J. Feng, and T. S. Huang, Disciminative similaity fo clusteing and semi-supevised leaning, axiv: [stat.ml], [Online]. Available: [16] D. Wettscheeck, D. W. Aha, and T. Mohi, A eview and empiical evaluation of featue weighting methods fo a class of lazy leaning algoithms, Atif. Intell. Rev., vol. 11, no. 1-5, pp , [17] J. Gama, I. Žliobaitė, A. Bifet, M. Pechenizkiy, and A. Bouchachia, A suvey on concept dift adaptation, ACM Computing Suveys, vol. 46, no. 4, p. 44, [18] I. Koychev, Gadual fogetting fo adaptation to concept dift. Poc. of ECAI Wokshop, 2000, pp [19] R. Klinkenbeg, Leaning difting concepts: Example selection vs. example weighting, Intelligent Data Analysis, vol. 8, no. 3, pp , [20] M. Manasse, F. McShey, and K. Talwa, Consistent weighted sampling, Technical Repot MSR-TR , [21] M. O. Wad, G. Ginstein, and D. Keim, Inteactive data visualization: foundations, techniques, and applications. CRC Pess, [22] D. Yang, D. Zhang, V. W. Zheng, and Z. Yu, Modeling use activity pefeence by leveaging use spatial tempoal chaacteistics in lbsns, IEEE Tansactions on Systems, Man, and Cybenetics: Systems, vol. 45, no. 1, pp , [23] D. Yang, D. Zhang, and B. Qu, Paticipatoy cultual mapping based on collective behavio data in location-based social netwoks, ACM Tans. on Intelligent Systems and Technology, vol. 7, no. 3, p. 30, [24] M. Chaika, K. Chen, and M. Faach-Colton, Finding fequent items in data steams, in Intenational Colloquium on Automata, Languages, and Pogamming. Spinge, 2002, pp [25] G. Comode and S. Muthukishnan, An impoved data steam summay: the count-min sketch and its applications, J. Algoithms, vol. 55, no. 1, pp , [26] Y. Ben-Haim and E. Tom-Tov, A steaming paallel decision tee algoithm, Jounal of Machine Leaning Reseach, vol. 11, no. Feb, pp , [27] B. Li, X. Zhu, L. Chi, and C. Zhang, Nested subtee hash kenels fo lage-scale gaph classification ove steams, in Poc. of ICDM, 2012, pp [28] A. Gionis, P. Indyk, R. Motwani et al., Similaity seach in high dimensions via hashing, in Poc. of VLDB, vol. 99, no. 6, 1999, pp [29] J. Song, T. He, L. Gao, X. Xu, A. Hanjalic, and H. Shen, Binay geneative advesaial netwoks fo image etieval, in Poceedings of the AAAI Confeence on Atificial Intelligence, 2018, pp [30] J. Song, H. Zhang, X. Li, L. Gao, M. Wang, and R. Hong, Selfsupevised video hashing with hieachical binay auto-encode, IEEE Tansactions on Image Pocessing, vol. 27, no. 7, pp , [31] J. Song, L. Gao, L. Liu, X. Zhu, and N. Sebe, Quantizationbased hashing: a geneal famewok fo scalable image and video etieval, Patten Recognition, vol. 75, pp , [32] B. Haeuple, M. Manasse, and K. Talwa, Consistent weighted sampling made fast, small, and easy, axiv: [cs.ds], [Online]. Available: [33] C. Yang, R. Duaiswami, and L. Davis, Efficient mean-shift tacking via a new similaity measue, in Poc. of CVPR, vol. 1. IEEE, 2005, pp [34] P. Nakov, A. Popova, and P. Mateev, Weight functions impact on lsa pefomance, EuoConfeence RANLP, pp , [35] A. Tsymbal, The poblem of concept dift: definitions and elated wok, Technical Repot TCD-CS Compute Science Depatment, [36] Y. Koen, Collaboative filteing with tempoal dynamics, Communications of the ACM, vol. 53, no. 4, pp , [37] R. L. Gaham, Concete mathematics: a foundation fo compute science. Peason Education India, [38] G. Comode and S. Muthukishnan, Appoximating data with the count-min data stuctue, IEEE Softwae, [39] T. M. Mitchell, Machine leaning. pp. I XVII, [40] H. Wang, W. Fan, P. S. Yu, and J. Han, Mining concept-difting data steams using ensemble classifies, in Poc. of KDD, 2003, pp [41] D. Yang, D. Zhang, Z. Yu, and Z. Wang, A sentiment-enhanced pesonalized location ecommendation system, in Poce. of HT, 2013, pp [42] D. Yang, D. Zhang, Z. Yu, and Z. Yu, Fine-gained pefeenceawae location seach leveaging cowdsouced digital footpints fom lbsns, in Poceedings of the 2013 ACM intenational joint confeence on Pevasive and ubiquitous computing. ACM, 2013, pp [43] Z. Yu, H. Xu, Z. Yang, and B. Guo, Pesonalized tavel package with multi-point-of-inteest ecommendation based on cowdsouced use footpints, IEEE Tansactions on Human-Machine Systems, vol. 46, no. 1, pp , [44] F. M. Hape and J. A. Konstan, The movielens datasets: Histoy and context, ACM Tansactions on Inteactive Intelligent Systems (TiiS), vol. 5, no. 4, p. 19, pediction. Dingqi Yang is cuently a senio eseache at the Univesity of Fiboug in Switzeland. He eceived the Ph.D. degee in compute science fom Piee and Maie Cuie Univesity and Institut Mines-TELECOM/TELECOM SudPais, whee he won both the CNRS SAMOVAR Doctoate Awad and the Institut Mines-TELECOM Pess Mention in His eseach inteests include big social media data analytics, ubiquitous computing, and smat city applications. Bin Li eceived the Ph.D. degee in compute science fom Fudan Univesity, Shanghai, China. He was a Senio Reseach Scientist with Data61 (fomely NICTA), CSIRO, Eveleigh, NSW, Austalia and a Lectue with the Univesity of Technology Sydney, Boadway, NSW, Austalia. He is cuently an Associate Pofesso with the School of Compute Science, Fudan Univesity, Shanghai, China. His cuent eseach inteests include machine leaning and data analytics, paticulaly in complex data epesentation, modeling, and Laua Rettig is a Ph.D. student at the Univesity of Fiboug, Switzeland unde the supevision of Philippe Cudé-Mauoux. He aeas of inteest ae big data infastuctues, semantic data, entity linking, data steams, and deep leaning. She eceived he Maste s degee fom the Univesity of Fiboug, with he thesis witten on big data steaming using eal-wold telecommunication data duing an intenship at Swisscom, fo whch she won two best thesis awads at the Univesity of Fiboug. Philippe Cude-Mauoux is a Full Pofesso and the diecto of the exascale Infolab at the Univesity of Fiboug in Switzeland. He eceived his Ph.D. fom the Swiss Fedeal Institute of Technology EPFL, whee he won both the Doctoate Awad and the EPFL Pess Mention. Befoe joining the Univesity of Fiboug he woked on infomation management infastuctues fo IBM Watson Reseach, Micosoft Reseach Asia, and MIT. His eseach inteests ae in next-geneation, Big Data management infastuctues fo non-elational data. Webpage:

Journal of World s Electrical Engineering and Technology J. World. Elect. Eng. Tech. 1(1): 12-16, 2012

Journal of World s Electrical Engineering and Technology J. World. Elect. Eng. Tech. 1(1): 12-16, 2012 2011, Scienceline Publication www.science-line.com Jounal of Wold s Electical Engineeing and Technology J. Wold. Elect. Eng. Tech. 1(1): 12-16, 2012 JWEET An Efficient Algoithm fo Lip Segmentation in Colo

More information

Detection and Recognition of Alert Traffic Signs

Detection and Recognition of Alert Traffic Signs Detection and Recognition of Alet Taffic Signs Chia-Hsiung Chen, Macus Chen, and Tianshi Gao 1 Stanfod Univesity Stanfod, CA 9305 {echchen, macuscc, tianshig}@stanfod.edu Abstact Taffic signs povide dives

More information

Controlled Information Maximization for SOM Knowledge Induced Learning

Controlled Information Maximization for SOM Knowledge Induced Learning 3 Int'l Conf. Atificial Intelligence ICAI'5 Contolled Infomation Maximization fo SOM Knowledge Induced Leaning Ryotao Kamimua IT Education Cente and Gaduate School of Science and Technology, Tokai Univeisity

More information

Lecture # 04. Image Enhancement in Spatial Domain

Lecture # 04. Image Enhancement in Spatial Domain Digital Image Pocessing CP-7008 Lectue # 04 Image Enhancement in Spatial Domain Fall 2011 2 domains Spatial Domain : (image plane) Techniques ae based on diect manipulation of pixels in an image Fequency

More information

RANDOM IRREGULAR BLOCK-HIERARCHICAL NETWORKS: ALGORITHMS FOR COMPUTATION OF MAIN PROPERTIES

RANDOM IRREGULAR BLOCK-HIERARCHICAL NETWORKS: ALGORITHMS FOR COMPUTATION OF MAIN PROPERTIES RANDOM IRREGULAR BLOCK-HIERARCHICAL NETWORKS: ALGORITHMS FOR COMPUTATION OF MAIN PROPERTIES Svetlana Avetisyan Mikayel Samvelyan* Matun Kaapetyan Yeevan State Univesity Abstact In this pape, the class

More information

A Two-stage and Parameter-free Binarization Method for Degraded Document Images

A Two-stage and Parameter-free Binarization Method for Degraded Document Images A Two-stage and Paamete-fee Binaization Method fo Degaded Document Images Yung-Hsiang Chiu 1, Kuo-Liang Chung 1, Yong-Huai Huang 2, Wei-Ning Yang 3, Chi-Huang Liao 4 1 Depatment of Compute Science and

More information

An Extension to the Local Binary Patterns for Image Retrieval

An Extension to the Local Binary Patterns for Image Retrieval , pp.81-85 http://x.oi.og/10.14257/astl.2014.45.16 An Extension to the Local Binay Pattens fo Image Retieval Zhize Wu, Yu Xia, Shouhong Wan School of Compute Science an Technology, Univesity of Science

More information

IP Network Design by Modified Branch Exchange Method

IP Network Design by Modified Branch Exchange Method Received: June 7, 207 98 IP Netwok Design by Modified Banch Method Kaiat Jaoenat Natchamol Sichumoenattana 2* Faculty of Engineeing at Kamphaeng Saen, Kasetsat Univesity, Thailand 2 Faculty of Management

More information

Towards Adaptive Information Merging Using Selected XML Fragments

Towards Adaptive Information Merging Using Selected XML Fragments Towads Adaptive Infomation Meging Using Selected XML Fagments Ho-Lam Lau and Wilfed Ng Depatment of Compute Science and Engineeing, The Hong Kong Univesity of Science and Technology, Hong Kong {lauhl,

More information

Image Enhancement in the Spatial Domain. Spatial Domain

Image Enhancement in the Spatial Domain. Spatial Domain 8-- Spatial Domain Image Enhancement in the Spatial Domain What is spatial domain The space whee all pixels fom an image In spatial domain we can epesent an image by f( whee x and y ae coodinates along

More information

A modal estimation based multitype sensor placement method

A modal estimation based multitype sensor placement method A modal estimation based multitype senso placement method *Xue-Yang Pei 1), Ting-Hua Yi 2) and Hong-Nan Li 3) 1),)2),3) School of Civil Engineeing, Dalian Univesity of Technology, Dalian 116023, China;

More information

Communication vs Distributed Computation: an alternative trade-off curve

Communication vs Distributed Computation: an alternative trade-off curve Communication vs Distibuted Computation: an altenative tade-off cuve Yahya H. Ezzeldin, Mohammed amoose, Chistina Fagouli Univesity of Califonia, Los Angeles, CA 90095, USA, Email: {yahya.ezzeldin, mkamoose,

More information

Gravitational Shift for Beginners

Gravitational Shift for Beginners Gavitational Shift fo Beginnes This pape, which I wote in 26, fomulates the equations fo gavitational shifts fom the elativistic famewok of special elativity. Fist I deive the fomulas fo the gavitational

More information

An Unsupervised Segmentation Framework For Texture Image Queries

An Unsupervised Segmentation Framework For Texture Image Queries An Unsupevised Segmentation Famewok Fo Textue Image Queies Shu-Ching Chen Distibuted Multimedia Infomation System Laboatoy School of Compute Science Floida Intenational Univesity Miami, FL 33199, USA chens@cs.fiu.edu

More information

A Novel Automatic White Balance Method For Digital Still Cameras

A Novel Automatic White Balance Method For Digital Still Cameras A Novel Automatic White Balance Method Fo Digital Still Cameas Ching-Chih Weng 1, Home Chen 1,2, and Chiou-Shann Fuh 3 Depatment of Electical Engineeing, 2 3 Gaduate Institute of Communication Engineeing

More information

A Memory Efficient Array Architecture for Real-Time Motion Estimation

A Memory Efficient Array Architecture for Real-Time Motion Estimation A Memoy Efficient Aay Achitectue fo Real-Time Motion Estimation Vasily G. Moshnyaga and Keikichi Tamau Depatment of Electonics & Communication, Kyoto Univesity Sakyo-ku, Yoshida-Honmachi, Kyoto 66-1, JAPAN

More information

Segmentation of Casting Defects in X-Ray Images Based on Fractal Dimension

Segmentation of Casting Defects in X-Ray Images Based on Fractal Dimension 17th Wold Confeence on Nondestuctive Testing, 25-28 Oct 2008, Shanghai, China Segmentation of Casting Defects in X-Ray Images Based on Factal Dimension Jue WANG 1, Xiaoqin HOU 2, Yufang CAI 3 ICT Reseach

More information

MapReduce Optimizations and Algorithms 2015 Professor Sasu Tarkoma

MapReduce Optimizations and Algorithms 2015 Professor Sasu Tarkoma apreduce Optimizations and Algoithms 2015 Pofesso Sasu Takoma www.cs.helsinki.fi Optimizations Reduce tasks cannot stat befoe the whole map phase is complete Thus single slow machine can slow down the

More information

Illumination methods for optical wear detection

Illumination methods for optical wear detection Illumination methods fo optical wea detection 1 J. Zhang, 2 P.P.L.Regtien 1 VIMEC Applied Vision Technology, Coy 43, 5653 LC Eindhoven, The Nethelands Email: jianbo.zhang@gmail.com 2 Faculty Electical

More information

The Dual Round Robin Matching Switch with Exhaustive Service

The Dual Round Robin Matching Switch with Exhaustive Service The Dual Round Robin Matching Switch with Exhaustive Sevice Yihan Li, Shivenda S. Panwa, H. Jonathan Chao Abstact Vitual Output Queuing is widely used by fixed-length highspeed switches to ovecome head-of-line

More information

Topic -3 Image Enhancement

Topic -3 Image Enhancement Topic -3 Image Enhancement (Pat 1) DIP: Details Digital Image Pocessing Digital Image Chaacteistics Spatial Spectal Gay-level Histogam DFT DCT Pe-Pocessing Enhancement Restoation Point Pocessing Masking

More information

Optical Flow for Large Motion Using Gradient Technique

Optical Flow for Large Motion Using Gradient Technique SERBIAN JOURNAL OF ELECTRICAL ENGINEERING Vol. 3, No. 1, June 2006, 103-113 Optical Flow fo Lage Motion Using Gadient Technique Md. Moshaof Hossain Sake 1, Kamal Bechkoum 2, K.K. Islam 1 Abstact: In this

More information

INFORMATION DISSEMINATION DELAY IN VEHICLE-TO-VEHICLE COMMUNICATION NETWORKS IN A TRAFFIC STREAM

INFORMATION DISSEMINATION DELAY IN VEHICLE-TO-VEHICLE COMMUNICATION NETWORKS IN A TRAFFIC STREAM INFORMATION DISSEMINATION DELAY IN VEHICLE-TO-VEHICLE COMMUNICATION NETWORKS IN A TRAFFIC STREAM LiLi Du Depatment of Civil, Achitectual, and Envionmental Engineeing Illinois Institute of Technology 3300

More information

Effective Data Co-Reduction for Multimedia Similarity Search

Effective Data Co-Reduction for Multimedia Similarity Search Effective Data Co-Reduction fo Multimedia Similaity Seach Zi Huang Heng Tao Shen Jiajun Liu Xiaofang Zhou School of Infomation Technology and Electical Engineeing The Univesity of Queensland, QLD 472,

More information

A Recommender System for Online Personalization in the WUM Applications

A Recommender System for Online Personalization in the WUM Applications A Recommende System fo Online Pesonalization in the WUM Applications Mehdad Jalali 1, Nowati Mustapha 2, Ali Mamat 2, Md. Nasi B Sulaiman 2 Abstact foeseeing of use futue movements and intentions based

More information

ANALYTIC PERFORMANCE MODELS FOR SINGLE CLASS AND MULTIPLE CLASS MULTITHREADED SOFTWARE SERVERS

ANALYTIC PERFORMANCE MODELS FOR SINGLE CLASS AND MULTIPLE CLASS MULTITHREADED SOFTWARE SERVERS ANALYTIC PERFORMANCE MODELS FOR SINGLE CLASS AND MULTIPLE CLASS MULTITHREADED SOFTWARE SERVERS Daniel A Menascé Mohamed N Bennani Dept of Compute Science Oacle, Inc Geoge Mason Univesity 1211 SW Fifth

More information

THE THETA BLOCKCHAIN

THE THETA BLOCKCHAIN THE THETA BLOCKCHAIN Theta is a decentalized video steaming netwok, poweed by a new blockchain and token. By Theta Labs, Inc. Last Updated: Nov 21, 2017 esion 1.0 1 OUTLINE Motivation Reputation Dependent

More information

On Error Estimation in Runge-Kutta Methods

On Error Estimation in Runge-Kutta Methods Leonado Jounal of Sciences ISSN 1583-0233 Issue 18, Januay-June 2011 p. 1-10 On Eo Estimation in Runge-Kutta Methods Ochoche ABRAHAM 1,*, Gbolahan BOLARIN 2 1 Depatment of Infomation Technology, 2 Depatment

More information

An Optimised Density Based Clustering Algorithm

An Optimised Density Based Clustering Algorithm Intenational Jounal of Compute Applications (0975 8887) Volume 6 No.9, Septembe 010 An Optimised Density Based Clusteing Algoithm J. Hencil Pete Depatment of Compute Science St. Xavie s College, Palayamkottai,

More information

ADDING REALISM TO SOURCE CHARACTERIZATION USING A GENETIC ALGORITHM

ADDING REALISM TO SOURCE CHARACTERIZATION USING A GENETIC ALGORITHM ADDING REALISM TO SOURCE CHARACTERIZATION USING A GENETIC ALGORITHM Luna M. Rodiguez*, Sue Ellen Haupt, and Geoge S. Young Depatment of Meteoology and Applied Reseach Laboatoy The Pennsylvania State Univesity,

More information

Topological Characteristic of Wireless Network

Topological Characteristic of Wireless Network Topological Chaacteistic of Wieless Netwok Its Application to Node Placement Algoithm Husnu Sane Naman 1 Outline Backgound Motivation Papes and Contibutions Fist Pape Second Pape Thid Pape Futue Woks Refeences

More information

Hierarchically Clustered P2P Streaming System

Hierarchically Clustered P2P Streaming System Hieachically Clusteed P2P Steaming System Chao Liang, Yang Guo, and Yong Liu Polytechnic Univesity Thomson Lab Booklyn, NY 11201 Pinceton, NJ 08540 Abstact Pee-to-pee video steaming has been gaining populaity.

More information

Adaptation of TDMA Parameters Based on Network Conditions

Adaptation of TDMA Parameters Based on Network Conditions Adaptation of TDMA Paametes Based on Netwok Conditions Boa Kaaoglu Dept. of Elect. and Compute Eng. Univesity of Rocheste Rocheste, NY 14627 Email: kaaoglu@ece.ocheste.edu Tolga Numanoglu Dept. of Elect.

More information

Frequency Domain Approach for Face Recognition Using Optical Vanderlugt Filters

Frequency Domain Approach for Face Recognition Using Optical Vanderlugt Filters Optics and Photonics Jounal, 016, 6, 94-100 Published Online August 016 in SciRes. http://www.scip.og/jounal/opj http://dx.doi.og/10.436/opj.016.68b016 Fequency Domain Appoach fo Face Recognition Using

More information

Conversion Functions for Symmetric Key Ciphers

Conversion Functions for Symmetric Key Ciphers Jounal of Infomation Assuance and Secuity 2 (2006) 41 50 Convesion Functions fo Symmetic Key Ciphes Deba L. Cook and Angelos D. Keomytis Depatment of Compute Science Columbia Univesity, mail code 0401

More information

Scaling Location-based Services with Dynamically Composed Location Index

Scaling Location-based Services with Dynamically Composed Location Index Scaling Location-based Sevices with Dynamically Composed Location Index Bhuvan Bamba, Sangeetha Seshadi and Ling Liu Distibuted Data Intensive Systems Laboatoy (DiSL) College of Computing, Geogia Institute

More information

Clustering Interval-valued Data Using an Overlapped Interval Divergence

Clustering Interval-valued Data Using an Overlapped Interval Divergence Poc. of the 8th Austalasian Data Mining Confeence (AusDM'9) Clusteing Inteval-valued Data Using an Ovelapped Inteval Divegence Yongli Ren Yu-Hsn Liu Jia Rong Robet Dew School of Infomation Engineeing,

More information

Point-Biserial Correlation Analysis of Fuzzy Attributes

Point-Biserial Correlation Analysis of Fuzzy Attributes Appl Math Inf Sci 6 No S pp 439S-444S (0 Applied Mathematics & Infomation Sciences An Intenational Jounal @ 0 NSP Natual Sciences Publishing o Point-iseial oelation Analysis of Fuzzy Attibutes Hao-En hueh

More information

Spiral Recognition Methodology and Its Application for Recognition of Chinese Bank Checks

Spiral Recognition Methodology and Its Application for Recognition of Chinese Bank Checks Spial Recognition Methodology and Its Application fo Recognition of Chinese Bank Checks Hanshen Tang 1, Emmanuel Augustin 2, Ching Y. Suen 1, Olivie Baet 2, Mohamed Cheiet 3 1 Cente fo Patten Recognition

More information

DUe to the recent developments of gigantic social networks

DUe to the recent developments of gigantic social networks Exploing Communities in Lage Pofiled Gaphs Yankai Chen, Yixiang Fang, Reynold Cheng Membe, IEEE, Yun Li, Xiaojun Chen, Jie Zhang 1 Abstact Given a gaph G and a vetex q G, the community seach (CS) poblem

More information

E.g., movie recommendation

E.g., movie recommendation Recommende Systems Road Map Intodction Content-based ecommendation Collaboative filteing based ecommendation K-neaest neighbo Association les Matix factoization 2 Intodction Recommende systems ae widely

More information

WIRELESS sensor networks (WSNs), which are capable

WIRELESS sensor networks (WSNs), which are capable IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. XX, NO. XX, XXX 214 1 Lifetime and Enegy Hole Evolution Analysis in Data-Gatheing Wieless Senso Netwoks Ju Ren, Student Membe, IEEE, Yaoxue Zhang, Kuan

More information

Optimal Adaptive Learning for Image Retrieval

Optimal Adaptive Learning for Image Retrieval Optimal Adaptive Leaning fo Image Retieval ao Wang Dept of Compute Sci and ech singhua Univesity Beijing 00084, P. R. China Wangtao7@63.net Yong Rui Micosoft Reseach One Micosoft Way Redmond, WA 9805,

More information

Comparisons of Transient Analytical Methods for Determining Hydraulic Conductivity Using Disc Permeameters

Comparisons of Transient Analytical Methods for Determining Hydraulic Conductivity Using Disc Permeameters Compaisons of Tansient Analytical Methods fo Detemining Hydaulic Conductivity Using Disc Pemeametes 1,,3 Cook, F.J. 1 CSRO Land and Wate, ndoooopilly, Queensland The Univesity of Queensland, St Lucia,

More information

Module 6 STILL IMAGE COMPRESSION STANDARDS

Module 6 STILL IMAGE COMPRESSION STANDARDS Module 6 STILL IMAE COMPRESSION STANDARDS Lesson 17 JPE-2000 Achitectue and Featues Instuctional Objectives At the end of this lesson, the students should be able to: 1. State the shotcomings of JPE standad.

More information

FACE VECTORS OF FLAG COMPLEXES

FACE VECTORS OF FLAG COMPLEXES FACE VECTORS OF FLAG COMPLEXES ANDY FROHMADER Abstact. A conjectue of Kalai and Eckhoff that the face vecto of an abitay flag complex is also the face vecto of some paticula balanced complex is veified.

More information

Prioritized Traffic Recovery over GMPLS Networks

Prioritized Traffic Recovery over GMPLS Networks Pioitized Taffic Recovey ove GMPLS Netwoks 2005 IEEE. Pesonal use of this mateial is pemitted. Pemission fom IEEE mu be obtained fo all othe uses in any cuent o futue media including epinting/epublishing

More information

Modeling spatially-correlated data of sensor networks with irregular topologies

Modeling spatially-correlated data of sensor networks with irregular topologies This full text pape was pee eviewed at the diection of IEEE Communications Society subject matte expets fo publication in the IEEE SECON 25 poceedings Modeling spatially-coelated data of senso netwoks

More information

Effective Missing Data Prediction for Collaborative Filtering

Effective Missing Data Prediction for Collaborative Filtering Effective Missing Data Pediction fo Collaboative Filteing Hao Ma, Iwin King and Michael R. Lyu Dept. of Compute Science and Engineeing The Chinese Univesity of Hong Kong Shatin, N.T., Hong Kong { hma,

More information

Quality Aware Privacy Protection for Location-based Services

Quality Aware Privacy Protection for Location-based Services In Poceedings of the th Intenational Confeence on Database Systems fo Advanced Applications (DASFAA 007), Bangkok, Thailand, Apil 9-, 007. Quality Awae Pivacy Potection fo Location-based Sevices Zhen Xiao,,

More information

Slotted Random Access Protocol with Dynamic Transmission Probability Control in CDMA System

Slotted Random Access Protocol with Dynamic Transmission Probability Control in CDMA System Slotted Random Access Potocol with Dynamic Tansmission Pobability Contol in CDMA System Intaek Lim 1 1 Depatment of Embedded Softwae, Busan Univesity of Foeign Studies, itlim@bufs.ac.k Abstact In packet

More information

Any modern computer system will incorporate (at least) two levels of storage:

Any modern computer system will incorporate (at least) two levels of storage: 1 Any moden compute system will incopoate (at least) two levels of stoage: pimay stoage: andom access memoy (RAM) typical capacity 32MB to 1GB cost pe MB $3. typical access time 5ns to 6ns bust tansfe

More information

arxiv: v2 [physics.soc-ph] 30 Nov 2016

arxiv: v2 [physics.soc-ph] 30 Nov 2016 Tanspotation dynamics on coupled netwoks with limited bandwidth Ming Li 1,*, Mao-Bin Hu 1, and Bing-Hong Wang 2, axiv:1607.05382v2 [physics.soc-ph] 30 Nov 2016 1 School of Engineeing Science, Univesity

More information

Separability and Topology Control of Quasi Unit Disk Graphs

Separability and Topology Control of Quasi Unit Disk Graphs Sepaability and Topology Contol of Quasi Unit Disk Gaphs Jiane Chen, Anxiao(Andew) Jiang, Iyad A. Kanj, Ge Xia, and Fenghui Zhang Dept. of Compute Science, Texas A&M Univ. College Station, TX 7784. {chen,

More information

On the Forwarding Area of Contention-Based Geographic Forwarding for Ad Hoc and Sensor Networks

On the Forwarding Area of Contention-Based Geographic Forwarding for Ad Hoc and Sensor Networks On the Fowading Aea of Contention-Based Geogaphic Fowading fo Ad Hoc and Senso Netwoks Dazhi Chen Depatment of EECS Syacuse Univesity Syacuse, NY dchen@sy.edu Jing Deng Depatment of CS Univesity of New

More information

Erasure-Coding Based Routing for Opportunistic Networks

Erasure-Coding Based Routing for Opportunistic Networks Easue-Coding Based Routing fo Oppotunistic Netwoks Yong Wang, Sushant Jain, Magaet Matonosi, Kevin Fall Pinceton Univesity, Univesity of Washington, Intel Reseach Bekeley ABSTRACT Routing in Delay Toleant

More information

Geophysical inversion with a neighbourhood algorithm I. Searching a parameter space

Geophysical inversion with a neighbourhood algorithm I. Searching a parameter space Geophys. J. Int. (1999) 138, 479 494 Geophysical invesion with a neighbouhood algoithm I. Seaching a paamete space Malcolm Sambidge Reseach School of Eath Sciences, Institute of Advanced studies, Austalian

More information

vaiation than the fome. Howeve, these methods also beak down as shadowing becomes vey signicant. As we will see, the pesented algoithm based on the il

vaiation than the fome. Howeve, these methods also beak down as shadowing becomes vey signicant. As we will see, the pesented algoithm based on the il IEEE Conf. on Compute Vision and Patten Recognition, 1998. To appea. Illumination Cones fo Recognition Unde Vaiable Lighting: Faces Athinodoos S. Geoghiades David J. Kiegman Pete N. Belhumeu Cente fo Computational

More information

Color Correction Using 3D Multiview Geometry

Color Correction Using 3D Multiview Geometry Colo Coection Using 3D Multiview Geomety Dong-Won Shin and Yo-Sung Ho Gwangju Institute of Science and Technology (GIST) 13 Cheomdan-gwagio, Buk-ku, Gwangju 500-71, Republic of Koea ABSTRACT Recently,

More information

A Shape-preserving Affine Takagi-Sugeno Model Based on a Piecewise Constant Nonuniform Fuzzification Transform

A Shape-preserving Affine Takagi-Sugeno Model Based on a Piecewise Constant Nonuniform Fuzzification Transform A Shape-peseving Affine Takagi-Sugeno Model Based on a Piecewise Constant Nonunifom Fuzzification Tansfom Felipe Fenández, Julio Gutiéez, Juan Calos Cespo and Gacián Tiviño Dep. Tecnología Fotónica, Facultad

More information

Modeling Spatially Correlated Data in Sensor Networks

Modeling Spatially Correlated Data in Sensor Networks Modeling Spatially Coelated Data in Senso Netwoks Apoova Jindal and Konstantinos Psounis Univesity of Southen Califonia The physical phenomena monitoed by senso netwoks, e.g. foest tempeatue, wate contamination,

More information

Generalized Grey Target Decision Method Based on Decision Makers Indifference Attribute Value Preferences

Generalized Grey Target Decision Method Based on Decision Makers Indifference Attribute Value Preferences Ameican Jounal of ata ining and Knowledge iscovey 27; 2(4): 2-8 http://www.sciencepublishinggoup.com//admkd doi:.648/.admkd.2724.2 Genealized Gey Taget ecision ethod Based on ecision akes Indiffeence Attibute

More information

AN ANALYSIS OF COORDINATED AND NON-COORDINATED MEDIUM ACCESS CONTROL PROTOCOLS UNDER CHANNEL NOISE

AN ANALYSIS OF COORDINATED AND NON-COORDINATED MEDIUM ACCESS CONTROL PROTOCOLS UNDER CHANNEL NOISE AN ANALYSIS OF COORDINATED AND NON-COORDINATED MEDIUM ACCESS CONTROL PROTOCOLS UNDER CHANNEL NOISE Tolga Numanoglu, Bulent Tavli, and Wendi Heinzelman Depatment of Electical and Compute Engineeing Univesity

More information

A New Finite Word-length Optimization Method Design for LDPC Decoder

A New Finite Word-length Optimization Method Design for LDPC Decoder A New Finite Wod-length Optimization Method Design fo LDPC Decode Jinlei Chen, Yan Zhang and Xu Wang Key Laboatoy of Netwok Oiented Intelligent Computation Shenzhen Gaduate School, Habin Institute of Technology

More information

Annales UMCS Informatica AI 2 (2004) UMCS

Annales UMCS Informatica AI 2 (2004) UMCS Pobane z czasopisma Annales AI- Infomatica http://ai.annales.umcs.pl Annales Infomatica AI 2 (2004) 33-340 Annales Infomatica Lublin-Polonia Sectio AI http://www.annales.umcs.lublin.pl/ Embedding as a

More information

Desired Attitude Angles Design Based on Optimization for Side Window Detection of Kinetic Interceptor *

Desired Attitude Angles Design Based on Optimization for Side Window Detection of Kinetic Interceptor * Poceedings of the 7 th Chinese Contol Confeence July 6-8, 008, Kunming,Yunnan, China Desied Attitude Angles Design Based on Optimization fo Side Window Detection of Kinetic Intecepto * Zhu Bo, Quan Quan,

More information

An Improved Resource Reservation Protocol

An Improved Resource Reservation Protocol Jounal of Compute Science 3 (8: 658-665, 2007 SSN 549-3636 2007 Science Publications An mpoved Resouce Resevation Potocol Desie Oulai, Steven Chambeland and Samuel Piee Depatment of Compute Engineeing

More information

AUTOMATED LOCATION OF ICE REGIONS IN RADARSAT SAR IMAGERY

AUTOMATED LOCATION OF ICE REGIONS IN RADARSAT SAR IMAGERY AUTOMATED LOCATION OF ICE REGIONS IN RADARSAT SAR IMAGERY Chistophe Waceman (1), William G. Pichel (2), Pablo Clement-Colón (2) (1) Geneal Dynamics Advanced Infomation Systems, P.O. Box 134008 Ann Abo

More information

Information Retrieval. CS630 Representing and Accessing Digital Information. IR Basics. User Task. Basic IR Processes

Information Retrieval. CS630 Representing and Accessing Digital Information. IR Basics. User Task. Basic IR Processes CS630 Repesenting and Accessing Digital Infomation Infomation Retieval: Basics Thosten Joachims Conell Univesity Infomation Retieval Basics Retieval Models Indexing and Pepocessing Data Stuctues ~ 4 lectues

More information

Image Registration among UAV Image Sequence and Google Satellite Image Under Quality Mismatch

Image Registration among UAV Image Sequence and Google Satellite Image Under Quality Mismatch 0 th Intenational Confeence on ITS Telecommunications Image Registation among UAV Image Sequence and Google Satellite Image Unde Quality Mismatch Shih-Ming Huang and Ching-Chun Huang Depatment of Electical

More information

= dv 3V (r + a 1) 3 r 3 f(r) = 1. = ( (r + r 2

= dv 3V (r + a 1) 3 r 3 f(r) = 1. = ( (r + r 2 Random Waypoint Model in n-dimensional Space Esa Hyytiä and Joma Vitamo Netwoking Laboatoy, Helsinki Univesity of Technology, Finland Abstact The andom waypoint model (RWP) is one of the most widely used

More information

High performance CUDA based CNN image processor

High performance CUDA based CNN image processor High pefomance UDA based NN image pocesso GEORGE VALENTIN STOIA, RADU DOGARU, ELENA RISTINA STOIA Depatment of Applied Electonics and Infomation Engineeing Univesity Politehnica of Buchaest -3, Iuliu Maniu

More information

Monte Carlo Simulation for the ECAT HRRT using GATE

Monte Carlo Simulation for the ECAT HRRT using GATE Monte Calo Simulation fo the ECAT HRRT using GATE F. Bataille, C. Comtat, Membe, IEEE, S. Jan, and R. Tébossen Abstact The ECAT HRRT (High Resolution Reseach Tomogaph, CPS Innovations, Knoxville, TN, U.S.A.)

More information

Tissue Classification Based on 3D Local Intensity Structures for Volume Rendering

Tissue Classification Based on 3D Local Intensity Structures for Volume Rendering 160 IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL. 6, NO., APRIL-JUNE 000 Tissue Classification Based on 3D Local Intensity Stuctues fo Volume Rendeing Yoshinobu Sato, Membe, IEEE, Cal-Fedik

More information

POMDP: Introduction to Partially Observable Markov Decision Processes Hossein Kamalzadeh, Michael Hahsler

POMDP: Introduction to Partially Observable Markov Decision Processes Hossein Kamalzadeh, Michael Hahsler POMDP: Intoduction to Patially Obsevable Makov Decision Pocesses Hossein Kamalzadeh, Michael Hahsle 2019-01-02 The R package pomdp povides an inteface to pomdp-solve, a solve (witten in C) fo Patially

More information

A VECTOR PERTURBATION APPROACH TO THE GENERALIZED AIRCRAFT SPARE PARTS GROUPING PROBLEM

A VECTOR PERTURBATION APPROACH TO THE GENERALIZED AIRCRAFT SPARE PARTS GROUPING PROBLEM Accepted fo publication Intenational Jounal of Flexible Automation and Integated Manufactuing. A VECTOR PERTURBATION APPROACH TO THE GENERALIZED AIRCRAFT SPARE PARTS GROUPING PROBLEM Nagiza F. Samatova,

More information

Reachable State Spaces of Distributed Deadlock Avoidance Protocols

Reachable State Spaces of Distributed Deadlock Avoidance Protocols Reachable State Spaces of Distibuted Deadlock Avoidance Potocols CÉSAR SÁNCHEZ and HENNY B. SIPMA Stanfod Univesity We pesent a family of efficient distibuted deadlock avoidance algoithms with applications

More information

Lecture 27: Voronoi Diagrams

Lecture 27: Voronoi Diagrams We say that two points u, v Y ae in the same connected component of Y if thee is a path in R N fom u to v such that all the points along the path ae in the set Y. (Thee ae two connected components in the

More information

On the Conversion between Binary Code and Binary-Reflected Gray Code on Boolean Cubes

On the Conversion between Binary Code and Binary-Reflected Gray Code on Boolean Cubes On the Convesion between Binay Code and BinayReflected Gay Code on Boolean Cubes The Havad community has made this aticle openly available. Please shae how this access benefits you. You stoy mattes Citation

More information

Worst-Case Delay Bounds for Uniform Load-Balanced Switch Fabrics

Worst-Case Delay Bounds for Uniform Load-Balanced Switch Fabrics Wost-Case Delay Bounds fo Unifom Load-Balanced Switch Fabics Spyidon Antonakopoulos, Steven Fotune, Rae McLellan, Lisa Zhang Bell Laboatoies, 600 Mountain Ave, Muay Hill, NJ 07974 fistname.lastname@alcatel-lucent.com

More information

On using circuit-switched networks for file transfers

On using circuit-switched networks for file transfers On using cicuit-switched netwoks fo file tansfes Xiuduan Fang, Malathi Veeaaghavan Univesity of Viginia Email: {xf4c, mv5g}@viginia.edu Abstact High-speed optical cicuit-switched netwoks ae being deployed

More information

A Mathematical Implementation of a Global Human Walking Model with Real-Time Kinematic Personification by Boulic, Thalmann and Thalmann.

A Mathematical Implementation of a Global Human Walking Model with Real-Time Kinematic Personification by Boulic, Thalmann and Thalmann. A Mathematical Implementation of a Global Human Walking Model with Real-Time Kinematic Pesonification by Boulic, Thalmann and Thalmann. Mashall Badley National Cente fo Physical Acoustics Univesity of

More information

3D Hand Trajectory Segmentation by Curvatures and Hand Orientation for Classification through a Probabilistic Approach

3D Hand Trajectory Segmentation by Curvatures and Hand Orientation for Classification through a Probabilistic Approach 3D Hand Tajectoy Segmentation by Cuvatues and Hand Oientation fo Classification though a Pobabilistic Appoach Diego R. Faia and Joge Dias Abstact In this wok we pesent the segmentation and classification

More information

A Minutiae-based Fingerprint Matching Algorithm Using Phase Correlation

A Minutiae-based Fingerprint Matching Algorithm Using Phase Correlation A Minutiae-based Fingepint Matching Algoithm Using Phase Coelation Autho Chen, Weiping, Gao, Yongsheng Published 2007 Confeence Title Digital Image Computing: Techniques and Applications DOI https://doi.og/10.1109/dicta.2007.4426801

More information

Multi-azimuth Prestack Time Migration for General Anisotropic, Weakly Heterogeneous Media - Field Data Examples

Multi-azimuth Prestack Time Migration for General Anisotropic, Weakly Heterogeneous Media - Field Data Examples Multi-azimuth Pestack Time Migation fo Geneal Anisotopic, Weakly Heteogeneous Media - Field Data Examples S. Beaumont* (EOST/PGS) & W. Söllne (PGS) SUMMARY Multi-azimuth data acquisition has shown benefits

More information

Embeddings into Crossed Cubes

Embeddings into Crossed Cubes Embeddings into Cossed Cubes Emad Abuelub *, Membe, IAENG Abstact- The hypecube paallel achitectue is one of the most popula inteconnection netwoks due to many of its attactive popeties and its suitability

More information

Input Layer f = 2 f = 0 f = f = 3 1,16 1,1 1,2 1,3 2, ,2 3,3 3,16. f = 1. f = Output Layer

Input Layer f = 2 f = 0 f = f = 3 1,16 1,1 1,2 1,3 2, ,2 3,3 3,16. f = 1. f = Output Layer Using the Gow-And-Pune Netwok to Solve Poblems of Lage Dimensionality B.J. Biedis and T.D. Gedeon School of Compute Science & Engineeing The Univesity of New South Wales Sydney NSW 2052 AUSTRALIA bbiedis@cse.unsw.edu.au

More information

Methods for history matching under geological constraints Jef Caers Stanford University, Petroleum Engineering, Stanford CA , USA

Methods for history matching under geological constraints Jef Caers Stanford University, Petroleum Engineering, Stanford CA , USA Methods fo histoy matching unde geological constaints Jef Caes Stanfod Univesity, Petoleum Engineeing, Stanfod CA 9435-222, USA Abstact Two geostatistical methods fo histoy matching ae pesented. Both ely

More information

DISTRIBUTION MIXTURES

DISTRIBUTION MIXTURES Application Example 7 DISTRIBUTION MIXTURES One fequently deals with andom vaiables the distibution of which depends on vaious factos. One example is the distibution of atmospheic paametes such as wind

More information

Title. Author(s)NOMURA, K.; MOROOKA, S. Issue Date Doc URL. Type. Note. File Information

Title. Author(s)NOMURA, K.; MOROOKA, S. Issue Date Doc URL. Type. Note. File Information Title CALCULATION FORMULA FOR A MAXIMUM BENDING MOMENT AND THE TRIANGULAR SLAB WITH CONSIDERING EFFECT OF SUPPO UNIFORM LOAD Autho(s)NOMURA, K.; MOROOKA, S. Issue Date 2013-09-11 Doc URL http://hdl.handle.net/2115/54220

More information

Survey of Various Image Enhancement Techniques in Spatial Domain Using MATLAB

Survey of Various Image Enhancement Techniques in Spatial Domain Using MATLAB Suvey of Vaious Image Enhancement Techniques in Spatial Domain Using MATLAB Shailenda Singh Negi M.Tech Schola G. B. Pant Engineeing College, Paui Gahwal Uttaahand, India- 46194 ABSTRACT Image Enhancement

More information

Dynamic Topology Control to Reduce Interference in MANETs

Dynamic Topology Control to Reduce Interference in MANETs Dynamic Topology Contol to Reduce Intefeence in MANETs Hwee Xian TAN 1,2 and Winston K. G. SEAH 2,1 {stuhxt, winston}@i2.a-sta.edu.sg 1 Depatment of Compute Science, School of Computing, National Univesity

More information

Accurate Diffraction Efficiency Control for Multiplexed Volume Holographic Gratings. Xuliang Han, Gicherl Kim, and Ray T. Chen

Accurate Diffraction Efficiency Control for Multiplexed Volume Holographic Gratings. Xuliang Han, Gicherl Kim, and Ray T. Chen Accuate Diffaction Efficiency Contol fo Multiplexed Volume Hologaphic Gatings Xuliang Han, Gichel Kim, and Ray T. Chen Micoelectonic Reseach Cente Depatment of Electical and Compute Engineeing Univesity

More information

Obstacle Avoidance of Autonomous Mobile Robot using Stereo Vision Sensor

Obstacle Avoidance of Autonomous Mobile Robot using Stereo Vision Sensor Obstacle Avoidance of Autonomous Mobile Robot using Steeo Vision Senso Masako Kumano Akihisa Ohya Shin ichi Yuta Intelligent Robot Laboatoy Univesity of Tsukuba, Ibaaki, 35-8573 Japan E-mail: {masako,

More information

User Specified non-bonded potentials in gromacs

User Specified non-bonded potentials in gromacs Use Specified non-bonded potentials in gomacs Apil 8, 2010 1 Intoduction On fist appeaances gomacs, unlike MD codes like LAMMPS o DL POLY, appeas to have vey little flexibility with egads to the fom of

More information

DEADLOCK AVOIDANCE IN BATCH PROCESSES. M. Tittus K. Åkesson

DEADLOCK AVOIDANCE IN BATCH PROCESSES. M. Tittus K. Åkesson DEADLOCK AVOIDANCE IN BATCH PROCESSES M. Tittus K. Åkesson Univesity College Boås, Sweden, e-mail: Michael.Tittus@hb.se Chalmes Univesity of Technology, Gothenbug, Sweden, e-mail: ka@s2.chalmes.se Abstact:

More information

ART GALLERIES WITH INTERIOR WALLS. March 1998

ART GALLERIES WITH INTERIOR WALLS. March 1998 ART GALLERIES WITH INTERIOR WALLS Andé Kündgen Mach 1998 Abstact. Conside an at galley fomed by a polygon on n vetices with m pais of vetices joined by inteio diagonals, the inteio walls. Each inteio wall

More information

Transmission Lines Modeling Based on Vector Fitting Algorithm and RLC Active/Passive Filter Design

Transmission Lines Modeling Based on Vector Fitting Algorithm and RLC Active/Passive Filter Design Tansmission Lines Modeling Based on Vecto Fitting Algoithm and RLC Active/Passive Filte Design Ahmed Qasim Tuki a,*, Nashien Fazilah Mailah b, Mohammad Lutfi Othman c, Ahmad H. Saby d Cente fo Advanced

More information

IP Multicast Simulation in OPNET

IP Multicast Simulation in OPNET IP Multicast Simulation in OPNET Xin Wang, Chien-Ming Yu, Henning Schulzinne Paul A. Stipe Columbia Univesity Reutes Depatment of Compute Science 88 Pakway Dive South New Yok, New Yok Hauppuage, New Yok

More information

Topic 7 Random Variables and Distribution Functions

Topic 7 Random Variables and Distribution Functions Definition of a Random Vaiable Distibution Functions Popeties of Distibution Functions Topic 7 Random Vaiables and Distibution Functions Distibution Functions 1 / 11 Definition of a Random Vaiable Distibution

More information