Data & Knowledge Engineering


Data & Knowledge Engineering 70 (2011)

An approximate duplicate elimination in RFID data streams

Chun-Hee Lee a, Chin-Wan Chung b,*
a Data Analytics Group, SAIT, Samsung Electronics, Yongin, Republic of Korea
b Department of Computer Science, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 305-701, Republic of Korea

Article history: Received 6 March 2010. Received in revised form 16 July 2011. Accepted 18 July 2011. Available online 31 July 2011.

Keywords: Duplicate elimination; RFID; Bloom filter; Real-time DBs; Smart cards

Abstract

RFID technology has been applied to a wide range of areas since it does not require contact to detect RFID tags. However, because a tag is often read multiple times and multiple readers are deployed, RFID data contains many duplicates. Since RFID data is generated in a streaming fashion, it is difficult to remove duplicates in one pass with limited memory. We propose one-pass approximate methods based on Bloom Filters that use a small amount of memory. We first devise Time Bloom Filters as a simple extension to Bloom Filters. We then propose Time Interval Bloom Filters to reduce errors. Time Interval Bloom Filters need more space than Time Bloom Filters, so we also propose a method to reduce the space for Time Interval Bloom Filters. Since Time Bloom Filters and Time Interval Bloom Filters are based on Bloom Filters, they do not produce false negative errors. Experimental results show that our approaches can effectively remove duplicates in RFID data streams in one pass with a small amount of memory.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

Recently, due to the advancement of information technology, various kinds of data such as XML, RDF, and RFID data have been generated [18,9,5,8].
In particular, a large amount of RFID data has been generated in many environments since RFID technology does not require contact to detect RFID tags and has therefore been used in many areas such as business, military, and medical applications. The RFID adoption in Walmart is a typical RFID example in the business area.

However, this advantage of RFID technology causes a new problem. Since an RFID tag is detected without contact, a tag that is within the proper range of an RFID reader will be detected whether we want it to be or not. Therefore, if RFID tags stay or move slowly in the detection region, much unnecessary data (i.e., duplicate RFID data) will be generated. On the other hand, if RFID tags move fast or many RFID tags move simultaneously in the detection region, one RFID reader may not be able to detect all of them. To prevent missing readings of RFID tags in such cases, several RFID readers are generally deployed in order to monitor a single location [1,2]. When several RFID readers detect one RFID tag at the same time, duplicate data is generated. Therefore, we cannot avoid the generation of duplicate RFID data in RFID applications.

An intelligent RFID reader with processing capability can eliminate the duplicate RFID data that it generates itself. However, duplicate RFID data generated from multiple readers cannot be removed with only the self-contained processing capability of the readers. Therefore, we need a technique to eliminate duplicate RFID data in the server (RFID middleware) that collects RFID data from various RFID readers.

Consider the example in Fig. 1. There are two RFID readers, and two tags pass through the detection region. In this situation, Reader1 and Reader2 detect the tags with the identifiers ID1 and ID2. Each reader generates detection information of the form <tag ID, location of the reader, time>. Reader1 detects the tag with ID1 and generates RFID data <ID1, Loc1, 1>, <ID1, Loc1, 2>, <ID1, Loc1, 4>.
However, all of this RFID data except <ID1, Loc1, 1> is duplicate data. Also, Reader1 generates RFID data <ID2, Loc1, 3>, <ID2, Loc1, 5> for the tag with ID2, and <ID2, Loc1, 5> is duplicate data. In the same way, Reader2 generates RFID data and has duplicates as shown in the lower part of Fig. 1. This is because a reader detects a tag continuously within the detection region. RFID data generated in each reader is sent to the server. Then, there are more duplicates in the server since two readers may detect the same tag. For example, <ID1, Loc1, 1> in Reader1 and <ID1, Loc2, 1> in Reader2 are duplicates. Therefore, the server removes duplicate RFID data as preprocessing, and the result of preprocessing is <ID1, Loc1, 1>, <ID2, Loc2, 2>.

[Fig. 1. An example for duplicate RFID data elimination. Reader1 (Loc1) generates <ID1, Loc1, 1>, <ID1, Loc1, 2>, <ID2, Loc1, 3>, <ID1, Loc1, 4>, <ID2, Loc1, 5>. Reader2 (Loc2) generates <ID1, Loc2, 1>, <ID2, Loc2, 2>, <ID1, Loc2, 3>, <ID2, Loc2, 5>. The server receives the merged stream and outputs <ID1, Loc1, 1>, <ID2, Loc2, 2>.]

* Corresponding author. E-mail addresses: chunhee1.lee@samsung.com (C.-H. Lee), chungcw@kaist.edu (C.-W. Chung). © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.datak

This problem seems simple for a small amount of RFID data. However, the volume of RFID data is generally very large. If a tag is attached to each item, the amount of data generated in a large retailer will exceed terabytes in a day [13]. In addition, RFID data is produced in a stream, which means that a duplicate elimination method should process data instantly with limited memory. Under these constraints it is difficult to design an exact duplicate elimination method. As an alternative, we can use approximate duplicate elimination techniques for RFID applications that do not require exact answers.

As such an application, consider a real-time analysis of the movement of customers in a large department store. Each store in the department store has RFID readers. Each customer has a unique RFID tag. The manager wants a real-time analysis of the movement of customers, such as the number of customers in each store and the store which has the maximum number of customers. For such an analysis, the central server in the department store should remove duplicates.
In this environment, a large amount of RFID data comes into the server simultaneously, and it contains duplicates. In particular, when a customer stays at the same location for a long time, a large amount of unnecessary duplicate data is generated. In order to eliminate duplicates exactly, we need to keep all RFID data, including duplicates, in memory for a long period. Therefore, it is difficult to eliminate duplicates exactly in real time using a small amount of memory relative to the amount of RFID data. In this application, the manager can accept statistics with allowable errors. Therefore, we propose approximate RFID duplicate elimination techniques that work in one pass with limited memory.

Bloom Filters have been widely used as a very compact data structure with an allowable error. To manage RFID data streams with a small amount of memory, we devise Bloom Filter based approaches. However, since Bloom Filters are targeted at static data, we should adapt Bloom Filters to RFID data stream environments. We thus propose Time Bloom Filters as a straightforward adaptation of Bloom Filters. Since Time Bloom Filters are based on Bloom Filters, they do not generate false negatives, which are duplicate data contained in the result after filtering. However, they may generate false positives, which are non-duplicate data not contained in the result after filtering. We provide the false positive probability for Time Bloom Filters. Also, to reduce false positives for Time Bloom Filters, we devise Time Interval Bloom Filters using the concept of the interval.

Our contributions are as follows:

The Effective Duplicate Elimination Methods. In data stream environments, it is not easy to design a one-pass duplicate elimination algorithm with limited memory. To design such an algorithm, we adapt Bloom Filters to RFID data stream environments. We propose effective approximate duplicate elimination methods, Time Bloom Filters and Time Interval Bloom Filters. They can eliminate duplicates in one pass with a small amount of memory.

Space Optimization for Time Interval Bloom Filters. While the Time Bloom Filter stores one time field, the Time Interval Bloom Filter stores two time fields as a time interval. Therefore, the Time Interval Bloom Filter needs more space than the Time Bloom Filter. We devise a method to reduce space for the Time Interval Bloom Filter.

The Formulation of a Duplicate Elimination Problem. Though many papers mention the duplicate elimination problem in RFID data, they do not formulate it rigorously. In this paper, we formulate the duplicate elimination problem in RFID data streams formally.

Parameter Setting. In Time Bloom Filters and Time Interval Bloom Filters, the number of hash functions k affects errors. We propose a formula to find k and validate it using an experimental evaluation.

Organization

The rest of this paper is organized as follows. In Section 2, we present related work. In Section 3, we explain Bloom Filters as a preliminary, and in Section 4, we formalize the duplicate elimination problem in RFID data streams. We describe Time Bloom Filters and Time Interval Bloom Filters in Sections 5 and 6, respectively. In Section 7, we discuss parameter setting in Time Bloom Filters and Time Interval Bloom Filters. We measure the effectiveness of Time Bloom Filters and Time Interval Bloom Filters experimentally in Section 8. Finally, we conclude our work in Section 9.

2. Related work

As RFID tags have been fabricated at a very low cost, an enormous volume of RFID tags will be used in various places (e.g., retail stores, hospitals) in the near future. Therefore, a vast amount of RFID data will be generated. Chawathe et al. [8] discuss challenges for managing such an enormous volume of RFID data. Also, Bornhövd et al. [5] give an overview of the Auto-ID infrastructure (Device Layer/Device Operation Layer/Business Process Bridging Layer/Enterprise Application Layer), which integrates data from smart items (e.g., RFID, sensor) with existing business processes.

RFID-based item-tracking applications are typical RFID applications. To support item-tracking applications, some work has been done [14,13,3]. In [14], the bitmap datatype and its operations are introduced. By sharing the prefix of EPC,¹ a collection of EPCs is compactly represented in the bitmap datatype. However, [14] does not focus on query processing in RFID-based item tracking. Gonzalez et al. [13] propose a new warehousing model, RFID-Cuboid, for RFID data compression and path-dependent aggregates. Also, to trace RFID tags efficiently, Ban et al. [3] propose the Time Parameterized Interval R-Tree. Cao et al. [6] present a distributed RFID stream processing system with inference for tracking and monitoring.

In addition, the works in [12,16,17] provide tools to analyze a large amount of RFID data in the warehouse environment. Gonzalez et al. [12] propose FlowCube to analyze commodity flows at multiple levels. Lee and Chung [16,17] devise an effective path encoding scheme to answer path-oriented queries.

As raw RFID data obtained from RFID readers misses readings and includes duplicate readings, there has been some work on RFID data cleansing [1,15,21]. Since RFID data from various sites is sent to the server, there exists heavy traffic in the network. In [1], to reduce network traffic, edge processing is proposed.
In edge processing, data aggregation and smoothing, filtering, and grouping are performed as early as possible. The temporal smoothing filter, which infers missing readings within a fixed time window, is a general method to correct missing readings. However, it is difficult for users to decide the best time window size, and a smoothing filter with a fixed time window size does not perform well due to the dynamics of RFID flows. Thus, to correct missing readings, Shawn et al. [15] use an adaptive window size according to the characteristics of RFID flows. In order to adapt the window size, they apply statistical sampling. Generally, data cleansing for RFID data streams is performed in a preprocessing step. However, Rao et al. [21] propose a deferred data cleansing method which is performed during query execution.

One of the most general and important cleaning tasks for RFID data is to remove duplicate RFID data. There has been traditional research on duplicate elimination, such as hash-based algorithms and sorting-based algorithms [11]. However, since those algorithms focus on duplicate elimination in stored data, they need to scan the data many times. Thus, it is difficult to apply traditional duplicate elimination techniques to RFID data streams.

To eliminate duplicates in streaming data, approximate algorithms are proposed [2,1]. In [2], Metwally et al. use Bloom Filters for detecting duplicates in click streams. However, if click streams are generated continuously, all entries of the Bloom Filter will become 1 in the long run and then the Bloom Filter will be useless. In [1], Deng et al. solve this problem using Stable Bloom Filters. In order to remove old data in a streaming environment, the Stable Bloom Filter sets the cells corresponding to an input data item to the maximum value and decreases the values of randomly selected cells whenever data arrives. However, Stable Bloom Filters have false positive errors as well as false negative errors.
In addition, the parameter setting in Stable Bloom Filters is complex and its solution is not always optimal. In [23], Wang et al. provide a method to remove duplicates over distributed data streams. The eager approach and the lazy approach are proposed to share Bloom Filters between the coordinator and the remote sites.

To eliminate duplicate RFID data, HashMerge is proposed in [2]. In HashMerge, a hash structure is used for efficient duplicate elimination. However, it does not deal with the deletion operation which removes unnecessary data from the hash table. Also, the problem of RFID duplicate elimination is mentioned in [1,21,22], but they do not propose algorithms to eliminate duplicates. The authors in [19] use count Bloom Filters to remove duplicate RFID data. However, they consider duplicate elimination only at the reader level. Therefore, RFID readers should preprocess RFID data and send the preprocessed data to the server. For RFID readers without self-contained computing power, the approach cannot be applied. Also, the critical disadvantage of the approach is that it generates both false positives and false negatives.

While the above approaches focus on eliminating duplicate RFID data, [7] focuses on preventing the generation of duplicate RFID data. Duplicate RFID data generated in the same reader can be removed easily with the self-contained processing capability. However, duplicate RFID data generated in different readers cannot be removed with only the self-contained processing capability of the readers. To prevent the generation of duplicate RFID data in different readers, [7] deals with redundant reader elimination. By removing the redundant readers which read the same tag simultaneously, [7] prevents the generation of duplicate RFID data. However, it assumes writable tags on which some information can be written. Therefore, the method in [7] cannot be applied to general cases.

3. Preliminary

Bloom Filters were originally proposed in [4].
The goal of Bloom Filters is to test whether any element is contained in a given set S using a small amount of memory. Although it is easy to test the membership of any element, if a set contains a large number of elements, we have to keep all elements of the set in memory and thus need a large amount of memory. Bloom Filters test the membership with allowable errors using a small amount of memory compared to the given set.

¹ Electronic Product Code (EPC) is a coding scheme of RFID tags to identify them uniquely.
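As a concrete reference point for this section, the following Python sketch shows a classic Bloom Filter with m one-bit cells and k hash functions, together with the standard false positive probability from [4] and the value of k that minimizes it. The salted SHA-256 hash family and the names `BloomFilter`, `false_positive_probability`, and `optimal_k` are assumptions of this sketch, not part of the paper.

```python
import hashlib
import math


class BloomFilter:
    """Classic Bloom Filter: m one-bit cells and k hash functions."""

    def __init__(self, m, k):
        self.m, self.k = m, k
        self.cells = [0] * m  # all cells are initially 0

    def _indexes(self, element):
        # Assumption of this sketch: k hash functions built from salted SHA-256.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{element}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, element):
        for i in self._indexes(element):
            self.cells[i] = 1

    def __contains__(self, element):
        # A single 0 cell proves absence; all 1s means "probably present".
        return all(self.cells[i] == 1 for i in self._indexes(element))


def false_positive_probability(m, k, n):
    """f = (1 - (1 - 1/m)^(kn))^k for m cells, k hashes, n stored elements."""
    return (1 - (1 - 1 / m) ** (k * n)) ** k


def optimal_k(m, n):
    """k = (ln 2) * m / n, rounded to a usable integer of at least 1."""
    return max(1, round(math.log(2) * m / n))


bf = BloomFilter(m=1000, k=optimal_k(1000, 100))
bf.add("tag-1")
```

Stored elements are always reported as present, so only false positives can occur. The rounding in `optimal_k` is a convenience of the sketch, since the number of hash functions must be an integer.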

The Bloom Filter is based on hash coding. It consists of an array of m bits and k hash functions (h_1, h_2, ..., h_k). We assume that the k hash functions map elements to the range {0, ..., m − 1} with a uniform distribution and are independent of each other. Initially, all cells of the Bloom Filter are set to 0. To represent a large number of elements in a set S, the Bloom Filter sets the cells that correspond to h_1(a), h_2(a), ..., h_k(a) to 1 for each element a in S. Since one cell may be set to 1 by different elements, the Bloom Filter tests the membership of any element with errors.

To test whether an element a is contained in a set S using the Bloom Filter, we check the k cells which correspond to h_1(a), h_2(a), ..., h_k(a). If the value of at least one cell among the k cells is 0, it is obvious that a is not in S. In this case, the Bloom Filter reports that a is not contained in S. If all values of the k cells are 1, a is probably in S, and the Bloom Filter reports that a is contained in S. In this case, there may exist false positive errors, which report that a is in S although it is not in S. However, in the Bloom Filter, there do not exist false negative errors, which report that a is not in S although a is in S.

When the number of hash functions is k, the number of cells is m, and the number of elements in a set is n, the false positive probability f is

$f = \left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^{k}$ [4].

If m is large enough, $\left(1 - \frac{1}{m}\right)^{kn} \approx e^{-kn/m}$. Thus, $f \approx \left(1 - e^{-kn/m}\right)^{k}$. When n and m are given, Bloom Filters can evaluate k to minimize f using derivatives. When $k = (\ln 2)\frac{m}{n}$, f is minimized. Therefore, Bloom Filters set k to $(\ln 2)\frac{m}{n}$ in order to minimize false positive errors.

4. Problem statement

[Fig. 2. The system architecture. The raw RFID data stream from the field enters the filtering module in the server; duplicate RFID data is dropped, and the non-duplicate RFID data stream is passed to the applications.]

The system architecture that we consider is depicted in Fig. 2. The RFID data stream generated in the field comes into the server.
There are the Filtering Module and Applications in the server. Before the raw RFID data stream is moved to applications, it passes through the filtering module. With the filtering module, duplicate RFID data is removed from the raw RFID data stream. Applications receive non-duplicate RFID data in a stream. That is, in the filtering module, duplicate RFID data is immediately dropped from the raw RFID data stream.

RFID data is generated at many RFID readers continuously. RFID readers send RFID data to the server. Hence, the server receives a sequence of RFID data in a stream. RFID data consists of a tag identifier (i.e., EPC), the location of the RFID reader, and the detected time of the tag. We define an RFID data stream formally as follows:

Definition 1. An RFID data stream S is a set {s_1, ..., s_n},² where s_i consists of a triple (TagID, Loc, Time) such that
"TagID" is the electronic product code (EPC) of the tag and is used for identifying the tag uniquely;
"Loc" is the location of the reader which detects the tag;
"Time" is the time of detecting the tag.

In an RFID data stream, RFID data x is considered a duplicate if there exists y (≠ x) in the RFID data stream such that x.TagID = y.TagID and 0 ≤ x.Time − y.Time ≤ τ, where τ is an application-specific positive value [2,21,22]. From the definition of the duplicate, we can find a non-duplicate RFID data stream by eliminating duplicates. However, when RFID data with the same TagID is generated repeatedly at intervals that are less than or equal to τ, it is unclear what the non-duplicate RFID data stream should be. For example, consider an RFID data stream S = {s_1, s_2, s_3}, where s_1 = (tag1, loc1, 5), s_2 = (tag1, loc1, 10), and s_3 = (tag1, loc1, 15) (τ = 8). From S, we know that s_2 is a duplicate because of s_1, and s_3 is a duplicate because of s_2. However, S arrives at the server in sequence. When s_2 arrives at the server, we can first remove s_2 from {s_1, s_2}. In that case, when s_3 arrives at the server, s_3 is not a duplicate since s_2 no longer exists.
² Strictly speaking, an RFID data stream S is a sequence <s_1, ..., s_n> but, in this section, we consider an RFID data stream as a set because elements in an RFID data stream contain the time information and a sequence is generated in time order.
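The ambiguity in this three-element example can be reproduced with a small exact filter. The Python sketch below is only an illustration under stated assumptions: the stream is taken to arrive in time order, and the function name `exact_filter` and the flag `remember_dropped` are hypothetical. The flag chooses between forgetting a dropped duplicate (so s_3 survives) and remembering it (so s_3 is dropped as well).

```python
def exact_filter(stream, tau, remember_dropped):
    """Exact duplicate removal over a time-ordered stream of (TagID, Loc, Time).

    last_seen holds one entry per distinct TagID, so memory grows with the
    number of tags; this is the cost an approximate filter tries to avoid.
    """
    last_seen = {}
    kept = []
    for tag_id, loc, time in stream:
        previous = last_seen.get(tag_id)
        is_duplicate = previous is not None and time - previous <= tau
        if not is_duplicate:
            kept.append((tag_id, loc, time))
        # Either every reading refreshes the reference time, or only the
        # readings that were kept do.
        if remember_dropped or not is_duplicate:
            last_seen[tag_id] = time
    return kept


S = [("tag1", "loc1", 5), ("tag1", "loc1", 10), ("tag1", "loc1", 15)]
forgetful = exact_filter(S, tau=8, remember_dropped=False)
chained = exact_filter(S, tau=8, remember_dropped=True)
```

With `remember_dropped=False` the result keeps s_1 and s_3; with `remember_dropped=True` only s_1 remains, because each reading is compared with the most recent reading of the same tag, whether or not that reading was kept.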

[Fig. 3. The structures of the Time Bloom Filter and the Time Interval Bloom Filter. (a) Time Bloom Filter. (b) Time Interval Bloom Filter.]

Depending on the property of applications, a non-duplicate data stream for S can be defined differently. In this paper, a non-duplicate RFID data stream for S is considered to be {s_1} instead of {s_1, s_3}, since data with the same TagID that is generated repeatedly at intervals that are less than or equal to τ is generally useless. We formulate the duplicate RFID data elimination problem formally with some definitions.

Definition 2. For RFID data x and y, x and y are connected in S if there exists z ∈ S such that x.TagID = y.TagID, z.TagID = x.TagID, |x.Time − z.Time| ≤ τ, and |z.Time − y.Time| ≤ τ. x and y are also connected in S if x and z are connected in S and z and y are connected in S.

The notion of being connected is defined in order to remove duplicate RFID data of the same TagID that is generated repeatedly at intervals that are less than or equal to τ. If an RFID data stream contains two elements which have the same TagID and Time, one is an obvious duplicate. In this case, we consider the latter element in sequence order as the duplicate. We focus on the elements with different Time to identify a duplicate. Using Definition 2, we define the duplicate formally in Definition 3.

Definition 3. RFID data x is a duplicate in S if there exists any y (≠ x) ∈ S such that (x.TagID = y.TagID and 0 ≤ x.Time − y.Time ≤ τ) or (x and y are connected in S and y.Time ≤ x.Time).

The non-duplicate RFID data stream for an RFID data stream S is the RFID data stream obtained after removing all duplicate RFID data in S. We can define the non-duplicate RFID data stream as the duplicate-free maximal set. The duplicate-free set and the duplicate-free maximal set are defined as follows:

Definition 4. A set S′ (⊆ S) is called a duplicate-free set in S if there does not exist x ∈ S′ such that x is a duplicate in S.

Definition 5.
The set S′ (⊆ S) is a duplicate-free maximal set in S if S′ is a duplicate-free set in S and, for all x ∈ S − S′, x is a duplicate in S.

We want to find the duplicate-free maximal set S′ (i.e., a non-duplicate RFID data stream) for an RFID data stream S. However, it is difficult to find the duplicate-free maximal set in data streams with a small amount of space. Since some applications allow errors, we consider a method to find an approximation of the duplicate-free maximal set. In particular, we limit the allowable errors to false positive errors in this paper. Therefore, our problem is defined as follows:

Given space m and a massive RFID data stream S, find a duplicate-free set Ŝ using space m such that the relative difference |Ŝ − S′| / |S′| is minimized, where S′ is the duplicate-free maximal set in S.

5. Time Bloom Filters

To eliminate duplicates in RFID data streams, we first devise a simple approach based on Bloom Filters, called the Time Bloom Filter. From Definition 3, we know that RFID data x may not be a duplicate, depending on x.Time, even if there exists RFID data with the same TagID in the RFID data stream. Therefore, we can use the time information in detecting RFID duplicates. While each entry in the Bloom Filter is set to zero or one, each entry in the Time Bloom Filter is set to the detected time of an RFID tag. That is, the Time Bloom Filter uses an integer array instead of a bit array. The Time Bloom Filter is depicted in Fig. 3(a).

The Time Bloom Filter uses k independent hash functions (h_1, h_2, ..., h_k) with range {0, ..., m − 1}, like the Bloom Filter. The value of the i-th cell in the Time Bloom Filter is denoted by M[i]. In order to store RFID data x in the Time Bloom Filter, we find the k cells that correspond to h_1(x.TagID), ..., h_k(x.TagID). The Time Bloom Filter then sets these k cells to the detected time of RFID data x (i.e., x.Time). If a cell is already set to the detected time of previous RFID data, the Time Bloom Filter overwrites it with the detected time of the current RFID data.
(We assume that RFID data streams arrive at the server in time order, for convenience.) To know whether some RFID data x is a duplicate, we check the k cells corresponding to h_1(x.TagID), ..., h_k(x.TagID). If there exists at least one cell such that x.Time − M[h_i(x.TagID)] > τ, we are sure that RFID data x is not a duplicate, since any earlier reading of the same tag within τ time would have set all k cells to a time within τ of x.Time. In the Time Bloom Filter, we can regard the condition x.Time − M[h_i(x.TagID)] ≤ τ as 1 in the Bloom Filter and the condition x.Time − M[h_i(x.TagID)] > τ as 0 in the Bloom Filter. Note that the evaluation of the condition of the Time Bloom Filter changes as time goes by.

[Fig. 4. Duplicate elimination algorithm of the Time Bloom Filter.]

Fig. 4 shows the algorithm to eliminate duplicates in RFID data streams using the Time Bloom Filter. The algorithm can find a duplicate-free set with a small amount of memory. All cells in the Time Bloom Filter are initially set to 0. When RFID data x arrives at the server, x passes through the Time Bloom Filter and is filtered by it. In other words, x is either dropped or sent to the application. First, when x passes through the Time Bloom Filter, we check the condition x.Time − M[h_i(x.TagID)] ≤ τ for all i ∈ {1, 2, ..., k} to know whether x is a duplicate or not (Line 1). Since all cells are initially set to 0, if the value of at least one cell is 0, x is not a duplicate regardless of the condition. If the condition x.Time − M[h_i(x.TagID)] ≤ τ is satisfied and the value of M[h_i(x.TagID)] is not 0 for all i ∈ {1, 2, ..., k}, x is probably a duplicate, and we drop it (Line 2). Otherwise, we send the non-duplicate data to the application (Lines 3–4). Finally, we update the cells to x.Time whether x is a duplicate or not (Line 5). If we updated the cells only when the data is not a duplicate, we could not remove duplicate data of the same TagID that is generated repeatedly at intervals that are less than or equal to τ.

Example 1. Fig. 5 shows the state of the Time Bloom Filter after an RFID data stream S = {s_1, s_2, s_3} passes through the Time Bloom Filter, where s_1 = (ID1, Loc1, 1), s_2 = (ID2, Loc2, 12), and s_3 = (ID1, Loc1, 13). Suppose the hash functions are as shown in Fig. 5, the size of the array in the Time Bloom Filter is 8, k is 3, and τ is 10. To explain the Time Bloom Filter, we assume h_2(ID1) = h_1(ID2).
Consider how the Time Bloom Filter operates for the RFID data stream S = {s_1, s_2, s_3}. When s_1 arrives at the Time Bloom Filter, it checks whether s_1 is a duplicate or not. Since all values in M[0], M[5], and M[2] are initially 0, s_1 is not a duplicate. Therefore, s_1 is sent to the application. The Time Bloom Filter then sets M[0], M[5], and M[2] to 1, as in Fig. 5(a). When s_2 passes through the Time Bloom Filter, it is sent to the application since it is not a duplicate. The Time Bloom Filter sets M[5], M[7], and M[3] to 12, as shown in Fig. 5(b). In the case of M[5], the Time Bloom Filter sets it to 12 although 1 was set previously. When s_3 passes through the Time Bloom Filter, it is not a duplicate since 13 − M[0] > τ and 13 − M[2] > τ (M[0] and M[2] are the values in Fig. 5(b)). s_3 is sent to the application and the Time Bloom Filter sets M[0], M[5], and M[2] to 13, as in Fig. 5(c). In this example, since none of the elements is a duplicate, the application receives all data.

We can derive the false positive probability for the Time Bloom Filter in a similar way as for the Bloom Filter [4]. We provide the false positive probability for the Time Bloom Filter in Theorem 1.

[Fig. 5. The state of the Time Bloom Filter in Example 1. Panels (a), (b), and (c) show the state after s_1, s_2, and s_3, respectively.]
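The filtering steps described for Fig. 4 can be sketched in Python as follows. This is a minimal illustration rather than the paper's pseudocode: the class name `TimeBloomFilter`, the method `observe`, the salted SHA-256 hash family, and the optional `indexes` argument (used here to pin the cell positions to those named in Example 1) are all assumptions of the sketch, and τ is taken to be 10.

```python
import hashlib


class TimeBloomFilter:
    """Integer array of m cells; each cell holds the last detected time."""

    def __init__(self, m, k, tau, indexes=None):
        self.m, self.k, self.tau = m, k, tau
        self.cells = [0] * m  # 0 means "never written"; times are assumed > 0
        # indexes maps a TagID to its k cell positions (hypothetical hook).
        self._indexes = indexes or self._hashed_indexes

    def _hashed_indexes(self, tag_id):
        # Assumption of this sketch: salted SHA-256 as the k hash functions.
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{tag_id}".encode()).digest()[:8], "big")
            % self.m
            for i in range(self.k)
        ]

    def observe(self, tag_id, time):
        """Return True if the reading is reported as a duplicate (dropped)."""
        positions = self._indexes(tag_id)
        duplicate = all(
            self.cells[i] != 0 and time - self.cells[i] <= self.tau
            for i in positions
        )
        # Cells are updated whether or not the reading was a duplicate.
        for i in positions:
            self.cells[i] = time
        return duplicate


# Cell positions taken from Example 1: ID1 -> M[0], M[5], M[2]; ID2 -> M[5], M[7], M[3].
example_hashes = {"ID1": [0, 5, 2], "ID2": [5, 7, 3]}
tbf = TimeBloomFilter(m=8, k=3, tau=10, indexes=example_hashes.__getitem__)
dropped = [tbf.observe(tag, t) for tag, t in [("ID1", 1), ("ID2", 12), ("ID1", 13)]]
```

In this run no reading is dropped, and the array ends with 13 in cells 0, 2, and 5 and 12 in cells 3 and 7. Updating the cells on every reading is what lets the filter keep dropping a tag that is read repeatedly at gaps of at most τ.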

Theorem 1. The false positive probability for the Time Bloom Filter is $\left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^{k}$, where k is the number of hash functions, m is the number of cells, and n is the number of non-duplicate RFID data within τ time.

Proof. In order to derive the false positive probability, we assume that a non-duplicate RFID data stream comes to the Time Bloom Filter. In real applications, RFID data has many duplicates. However, duplicate RFID data sets the same cells in the Time Bloom Filter since the duplicates have the same TagID. Also, they set the cells to similar times since they are usually time-clustered. Therefore, the state of the Time Bloom Filter after the original RFID data passes through it is similar to its state after the non-duplicate RFID data (the original RFID data with duplicates removed) passes through it. Therefore, it is reasonable to assume that non-duplicate RFID data comes to the Time Bloom Filter.

Consider that, in the past, some RFID data passed through the Time Bloom Filter. Now, for any element x, we want to know the false positive probability. Since duplicates are defined within τ time, data that passed through the Time Bloom Filter more than τ time ago is meaningless. Thus, we consider only data within τ time with respect to x.Time. Let p_1 be the probability that x.Time − M[h_i(x.TagID)] ≤ τ for 1 ≤ i ≤ k, and let p_2 be the probability that x is a duplicate. Then,

The false positive probability for x = p_1 − p_2.

To derive p_1, we first derive the probability that x.Time − M[u] ≤ τ for some cell u. We assume that the hash functions have uniform distributions. For RFID data y within τ time, the probability that h_i(y.TagID) ≠ u for a given i is $1 - \frac{1}{m}$. The probability that h_i(y.TagID) ≠ u for all i ∈ {1, 2, ..., k} is $\left(1 - \frac{1}{m}\right)^{k}$. Since n is the number of RFID data which comes to the Time Bloom Filter within τ time, the probability of x.Time − M[u] > τ is $\left(1 - \frac{1}{m}\right)^{kn}$. The probability of x.Time − M[u] ≤ τ is $1 - \left(1 - \frac{1}{m}\right)^{kn}$. Therefore, the probability p_1 that x.Time − M[h_i(x.TagID)] ≤ τ for all i ∈ {1, 2, ..., k} is $\left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^{k}$.
[Fig. 6. An example for the intersection of time intervals in the Time Interval Bloom Filter. Each cell M[i] stores a Start Time and an End Time, and the intervals of the cells selected by h_1(x.TagID), ..., h_4(x.TagID) are intersected. (a) The case where the intersection is not empty. (b) The case where the intersection is empty.]

Since we assume that a non-duplicate RFID data stream comes to the Time Bloom Filter, p_2 is equal to 0. Therefore, the false positive probability for the Time Bloom Filter is $\left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^{k}$.

6. Time Interval Bloom Filters

In the previous section, we introduced Time Bloom Filters. However, since they are only a simple extension of Bloom Filters, we devise Time Interval Bloom Filters to improve on Time Bloom Filters. Since detection regions are distributed with proper ranges and RFID tags are detected only within or around detection regions, duplicate RFID data is location-clustered, and therefore time-clustered. That is, if some RFID tag is detected, RFID data with the same TagID will be detected many times during a period. Thus, we can represent the range of detected times of RFID data with the same TagID as a short time interval. Fig. 3(b) shows the structure of the Time Interval Bloom Filter. The start time and the end time of the i-th cell are denoted by M[i].StartTime and M[i].EndTime, respectively. StartTime and EndTime are initially set to 0 and −1, respectively.

Consider how the Time Interval Bloom Filter keeps StartTime and EndTime in a cell. For an easy explanation, we assume that the number of hash functions is 1 and that one cell in the Time Interval Bloom Filter is set by only one TagID. When RFID data x with TagID = 1 arrives at the Time Interval Bloom Filter first, we set both StartTime and EndTime of the cell corresponding to TagID = 1 to x.Time. That is, the initial time interval is a time point. When another RFID data y with TagID = 1 arrives at the Time Interval Bloom Filter, we change only EndTime of the cell corresponding to TagID = 1 to y.Time. Thus, the Time Interval Bloom Filter keeps the time interval corresponding to TagID = 1 in the cell. For RFID data x, if x.Time − EndTime is more than τ, x.Time − StartTime is also more than τ. Such a time interval is useless. In this case, we set both StartTime and EndTime to x.Time again to initialize the time interval.
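The interval bookkeeping just described, combined with the intersection test that the section goes on to describe for Fig. 7, can be sketched in Python as below. Everything named here is an assumption of the sketch rather than the paper's pseudocode: the class `TimeIntervalBloomFilter`, the method `observe`, the use of `None` for an empty cell in place of initial marker values, and the hand-picked cell positions, which are hypothetical and chosen only so that a fourth tag shares one cell with each of three earlier tags. τ is taken to be 10.

```python
import hashlib


class TimeIntervalBloomFilter:
    """Array of m cells; each cell holds a [StartTime, EndTime] interval."""

    def __init__(self, m, k, tau, indexes=None):
        self.m, self.k, self.tau = m, k, tau
        self.cells = [None] * m  # None marks a cell that was never written
        self._indexes = indexes or self._hashed_indexes

    def _hashed_indexes(self, tag_id):
        # Assumption of this sketch: salted SHA-256 as the k hash functions.
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{tag_id}".encode()).digest()[:8], "big")
            % self.m
            for i in range(self.k)
        ]

    def observe(self, tag_id, time):
        """Return True if the reading is reported as a duplicate (dropped)."""
        positions = self._indexes(tag_id)
        intervals = [self.cells[i] for i in positions]
        duplicate = False
        if all(iv is not None for iv in intervals):
            start = max(iv[0] for iv in intervals)  # intersection start
            end = min(iv[1] for iv in intervals)    # intersection end
            duplicate = start <= end and time - end <= self.tau
        for i in positions:
            cell = self.cells[i]
            if cell is None or time - cell[1] > self.tau:
                self.cells[i] = [time, time]  # (re)start the interval
            else:
                cell[1] = time                # extend the interval
        return duplicate


# Hypothetical cell positions: ID4 shares one cell each with ID2, ID1, and ID3.
positions = {"ID1": [0, 2, 5], "ID2": [4, 8, 11], "ID3": [7, 9, 12], "ID4": [4, 0, 12]}
tibf = TimeIntervalBloomFilter(m=14, k=3, tau=10, indexes=positions.__getitem__)
stream = [("ID1", 1), ("ID1", 11), ("ID2", 14), ("ID3", 15),
          ("ID2", 17), ("ID2", 18), ("ID4", 20)]
flags = [tibf.observe(tag, t) for tag, t in stream]
```

When ID4 arrives at time 20, its three cells hold the intervals [14, 18], [1, 11], and [15, 15], whose intersection is empty, so the reading is kept. A filter that stored only the last time per cell would see 18, 11, and 15 in those cells, all within τ of 20, and would drop the reading.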
Since one cell may be set by RFID data with different TagIDs, the time intervals of different TagIDs may be mixed in a cell. However, the interval [StartTime, EndTime] in the cell includes the detected times of all RFID data whose TagID corresponds to the cell. To know whether some RFID data x is a duplicate, we check whether the intersection of all time intervals corresponding to $h_1(x), h_2(x), \ldots, h_k(x)$ is empty. If some RFID data with TagID = x.TagID had arrived at the Time Interval Bloom Filter, all time intervals corresponding to x.TagID would include the detected time of that data, and the intersection would not be empty. Therefore, when the intersection is empty, we are sure that no RFID data with TagID = x.TagID arrived at the Time Interval Bloom Filter within τ time.

Fig. 6 illustrates how to determine whether some RFID data x is a duplicate. The coordinate on the right side of the Time Interval Bloom Filter in Fig. 6 represents the time intervals corresponding to RFID data x. In Fig. 6(a), the intersection of the four time intervals is [1,15]. Therefore, it is not empty and x is considered a duplicate. In Fig. 6(b), the intersection of the four time intervals is empty. Therefore, the Time Interval Bloom Filter reports that x is not a duplicate.

Fig. 7. Duplicate elimination algorithm of the Time Interval Bloom Filter.

Fig. 7 shows the algorithm to eliminate duplicates using the Time Interval Bloom Filter. When RFID data x arrives at the Time Interval Bloom Filter, we first check whether x is a duplicate. To do this, we check whether the intersection of the intervals corresponding to $h_1(x), h_2(x), \ldots, h_k(x)$ is not empty (Lines 1-7). If the intersection (intersectInterval in Line 6) lies before x.Time - τ (i.e., x.Time - intersectInterval.EndTime > τ), we regard it as empty since duplicates are defined within τ time. Therefore, when intersectInterval is not empty and x.Time - intersectInterval.EndTime ≤ τ, x is a duplicate and is dropped (Lines 7-8). Otherwise, x is sent to the application (Lines 9-10). We then update StartTime and EndTime (Lines 11-20). If StartTime = 0 (the case where data corresponding to the cell is inserted for the first time) or x.Time - EndTime > τ, we initialize both StartTime and EndTime to x.Time (Lines 13-17). Otherwise, we update EndTime and thereby extend the time interval (Lines 18-19).

Example 2. Consider an RFID data stream {(ID1, Loc1, 10), (ID1, Loc1, 11), (ID2, Loc2, 14), (ID3, Loc3, 15), (ID2, Loc4, 17), (ID2, Loc2, 18)}. Before explaining the operations of the Time Interval Bloom Filter, we explain those of the Time Bloom Filter. Fig. 8 shows the state of the Time Bloom Filter after the RFID data stream passes through it. Suppose that the hash functions are as shown in Figs. 8 and 9, the size of an array in the filters is 8, k is 3, and τ is 10. To explain the Time Interval Bloom Filter, we assume $h_1(ID1) = h_2(ID4)$, $h_1(ID2) = h_1(ID4)$, and $h_3(ID3) = h_3(ID4)$. If (ID4, Loc4, 20) passes through the Time Bloom Filter in Fig. 8, the Time Bloom Filter reports that it is a duplicate since $20 - M[h_1(ID4)] \le \tau$, $20 - M[h_2(ID4)] \le \tau$, and $20 - M[h_3(ID4)] \le \tau$. However, it is not a duplicate. That is, it is a false positive.

Fig. 8. The state of the Time Bloom Filter in Example 2.

Fig. 9. The state of the Time Interval Bloom Filter in Example 2.

Meanwhile, the Time Interval Bloom Filter keeps the start time and the end time. Fig. 9 shows the state of the Time Interval Bloom Filter after the RFID data stream above passes through it. Consider the case where (ID4, Loc4, 20) passes through the Time Interval Bloom Filter in Fig. 9. The intersection of the three intervals corresponding to ID4 (i.e., [10,11], [14,18], and [15,15]) is empty. Therefore, the Time Interval Bloom Filter reports that it is not a duplicate, and it is indeed not a duplicate.

For the Time Interval Bloom Filter, it is difficult to derive the false positive probability since it must be computed from the intersection of the time intervals. Instead of deriving the false positive probability for the Time Interval Bloom Filter, we provide Theorem 2. From Theorem 2, we know that the Time Interval Bloom Filter is better than the Time Bloom Filter in terms of the false positive probability.

Theorem 2. Consider the Time Bloom Filter and the Time Interval Bloom Filter which have the same number of cells and the same configuration for hash functions. Then, for any RFID data stream, the number of false positives of the Time Interval Bloom Filter is less than or equal to that of the Time Bloom Filter.

Proof. Assume that the same RFID data stream comes to the Time Bloom Filter and the Time Interval Bloom Filter. Since false positives are generated only when the Time Bloom Filter or the Time Interval Bloom Filter reports that given data is a duplicate, it suffices to prove that for any element x, if the Time Interval Bloom Filter reports that x is a duplicate, then the Time Bloom Filter also reports that x is a duplicate. Suppose that the Time Interval Bloom Filter reports that x is a duplicate.
Then,

$$\mathit{intersectInterval} \neq \emptyset \quad \text{and} \quad x.\mathit{Time} - \mathit{intersectInterval}.\mathit{EndTime} \le \tau \qquad (1)$$

Since intersectInterval is not empty in the Time Interval Bloom Filter, all intervals corresponding to x in the Time Interval Bloom Filter (i.e., $[s_i, e_i]$ for $1 \le i \le k$) include intersectInterval. Therefore,

$$e_i \ge \mathit{intersectInterval}.\mathit{EndTime} \ \text{ for } 1 \le i \le k \quad (\text{also, } s_i \le \mathit{intersectInterval}.\mathit{StartTime}) \qquad (2)$$

Consider how cells are updated in the Time Bloom Filter and the Time Interval Bloom Filter (i.e., Line 5 in Fig. 4 and Lines 11-20 in Fig. 7). We can easily see that $e_i$ in the Time Interval Bloom Filter is equal to $m_i$ in the Time Bloom Filter, where $e_i$ is defined in Line 4 of Fig. 7 and $m_i$ is $M[h_i(x.\mathit{TagID})]$ in the Time Bloom Filter. Thus, by Eq. (2),

$$m_i = e_i \ge \mathit{intersectInterval}.\mathit{EndTime}$$

Multiplying both sides by -1,

$$-m_i \le -\mathit{intersectInterval}.\mathit{EndTime}$$

Adding x.Time to both sides,

$$x.\mathit{Time} - m_i \le x.\mathit{Time} - \mathit{intersectInterval}.\mathit{EndTime}$$

Since $x.\mathit{Time} - \mathit{intersectInterval}.\mathit{EndTime} \le \tau$ by Eq. (1),

$$x.\mathit{Time} - m_i \le \tau$$

That is, all conditions in Line 1 of Fig. 4 are satisfied. Therefore, the Time Bloom Filter reports that x is a duplicate.

Space optimization

If the number of cells of the Time Bloom Filter is equal to that of the Time Interval Bloom Filter, the Time Interval Bloom Filter requires twice as much space as the Time Bloom Filter. However, since only the time interval within τ time is useful, we can optimize space for the Time Interval Bloom Filter. We keep StartTime and DiffTime (= EndTime - StartTime) instead of StartTime and EndTime, as shown in Fig. 10. Since the space for DiffTime is less than that for EndTime, we can reduce the space. However, DiffTime may become more than τ while the duplicate elimination algorithm of Fig. 7 runs. Therefore, we revise the algorithm in order to keep DiffTime less than or equal to τ. Using the revised algorithm in Fig. 11, which is explained below, we can always keep DiffTime less than or equal to τ. In practice, StartTime and EndTime require much more space than DiffTime because in most cases they record month, date, hour, minute, and second. Since τ is a small value, we do not need much space to store DiffTime.
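Before the space-optimized variant, the basic algorithms and Example 2 can be replayed in a short Python sketch. It is an illustration rather than the authors' implementation. The class and method names are hypothetical, and the initial Time Bloom Filter cell value (`NEVER`) is an assumption. The hash functions are replaced by a hypothetical lookup table `CELLS` with 9 cells, chosen only so that ID1, ID2, and ID3 occupy disjoint cells and ID4 collides with them as assumed in Example 2. Locations are omitted because neither filter uses them.

```python
NEVER = float("-inf")  # assumed initial value of a Time Bloom Filter cell


class TimeBloomFilter:
    def __init__(self, size, hashes, tau):
        self.cells = [NEVER] * size
        self.hashes = hashes  # callable: tag_id -> tuple of cell indices
        self.tau = tau

    def process(self, tag_id, t):
        """Return True if the reading is reported as a duplicate."""
        idx = self.hashes(tag_id)
        dup = all(t - self.cells[i] <= self.tau for i in idx)
        for i in idx:
            self.cells[i] = t  # every reading refreshes its cells
        return dup


class TimeIntervalBloomFilter:
    def __init__(self, size, hashes, tau):
        self.start = [0] * size   # 0 marks a cell that was never set
        self.end = [-1] * size    # [0, -1] is an empty interval
        self.hashes = hashes
        self.tau = tau

    def process(self, tag_id, t):
        """Return True if the reading is reported as a duplicate."""
        idx = self.hashes(tag_id)
        lo = max(self.start[i] for i in idx)  # intersection start
        hi = min(self.end[i] for i in idx)    # intersection end
        dup = lo <= hi and t - hi <= self.tau
        for i in idx:
            if self.start[i] == 0 or t - self.end[i] > self.tau:
                self.start[i] = self.end[i] = t  # (re)initialize the interval
            else:
                self.end[i] = t                  # extend the interval
        return dup


# Hypothetical cell assignment reproducing the collisions assumed in Example 2:
# h1(ID1) = h2(ID4), h1(ID2) = h1(ID4), h3(ID3) = h3(ID4).
CELLS = {"ID1": (0, 1, 2), "ID2": (3, 4, 5), "ID3": (6, 7, 8), "ID4": (3, 0, 8)}
STREAM = [("ID1", 10), ("ID1", 11), ("ID2", 14), ("ID3", 15), ("ID2", 17), ("ID2", 18)]

tbf = TimeBloomFilter(9, CELLS.__getitem__, tau=10)
tibf = TimeIntervalBloomFilter(9, CELLS.__getitem__, tau=10)
for tag, t in STREAM:
    tbf.process(tag, t)
    tibf.process(tag, t)

print("TBF reports ID4 at time 20 as duplicate:", tbf.process("ID4", 20))
print("TIBF reports ID4 at time 20 as duplicate:", tibf.process("ID4", 20))
```

Under these assumptions the Time Bloom Filter sees the values 18, 11, and 15 in the cells of ID4 and reports a duplicate, whereas the Time Interval Bloom Filter intersects [14,18], [10,11], and [15,15], finds the intersection empty, and lets the reading through.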

Fig. 10. Space optimization in the Time Interval Bloom Filter: (a) Time Interval Bloom Filter, storing StartTime and EndTime in each cell; (b) Time Interval Bloom Filter with space optimization, storing StartTime and DiffTime in each cell.

The duplicate elimination algorithm of the Time Interval Bloom Filter with space optimization is shown in Fig. 11. All StartTime and DiffTime values are initially set to 0 and -1, respectively. Like the algorithm TIBF(), the algorithm TIBF_OPT() first checks whether x is a duplicate, and then it drops x or sends x to the application (Lines 1-10). In TIBF_OPT(), EndTime can be calculated easily as StartTime + DiffTime (Line 4). To keep DiffTime less than or equal to τ, TIBF_OPT() updates cells differently from TIBF() (Lines 11-28). When StartTime = 0 or x.Time - (StartTime + DiffTime) > τ, we set StartTime to x.Time and DiffTime to 0 in order to initialize the time interval in the cell (Lines 13-17). Otherwise, we change StartTime and DiffTime. In the case that x.Time - StartTime > τ, part of the time interval is useless. To remove the useless part, we set StartTime to x.Time - τ and DiffTime to τ (Lines 22-23). If x.Time - StartTime ≤ τ, we set only DiffTime to x.Time - StartTime (Lines 25-26).

Fig. 11. Duplicate elimination algorithm of the Time Interval Bloom Filter with space optimization.
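The cell update of the space-optimized filter described above can be sketched in Python as follows. The function names are hypothetical, a cell is represented as a plain (StartTime, DiffTime) pair, and timestamps are assumed to be positive integers so that StartTime = 0 can mark a cell that was never set.

```python
def end_time(start_time, diff_time):
    """EndTime is not stored; it is recomputed as StartTime + DiffTime."""
    return start_time + diff_time


def update_cell_opt(start_time, diff_time, t, tau):
    """Return the new (StartTime, DiffTime) pair after a reading at time t."""
    if start_time == 0 or t - (start_time + diff_time) > tau:
        # Unset cell, or the stored interval ended more than tau ago:
        # restart the interval at the time point t.
        return t, 0
    if t - start_time > tau:
        # The oldest part of the interval is useless: clip it to [t - tau, t].
        return t - tau, tau
    # The whole interval still lies within tau of t: only extend its end.
    return start_time, t - start_time


cell = (0, -1)  # initial state of a cell
for t in (10, 15, 23, 40):
    cell = update_cell_opt(*cell, t, tau=10)
    print(t, cell, end_time(*cell))
```

In this trace with τ = 10, the reading at time 23 clips the interval [10, 15] to [13, 23], so DiffTime never exceeds τ, which is the property that lets DiffTime be stored in fewer bits than EndTime.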

Fig. 12. The detection model: the probability of detection as a function of the distance between tag and reader, with a major detection region and a minor detection region.

7. Parameter setting

The false positive probability of Bloom Filters is $(1 - e^{-kn/m})^k$, and k is set to $(\ln 2)\frac{m}{n}$ to minimize false positive errors, where m is the number of cells and n is the number of elements in a set. Similarly, we can find the k that minimizes false positive errors for Time Bloom Filters. The false positive probability of Time Bloom Filters is $(1 - e^{-kn/m})^k$, where n is the number of non-duplicate RFID data within τ time. Therefore, when $k = (\ln 2)\frac{m}{n}$, the false positive probability is minimized.

In this paper, we do not provide the false positive probability for Time Interval Bloom Filters. Instead, we provide Theorem 2. By Theorem 2, if the Time Bloom Filter and the Time Interval Bloom Filter have the same number of cells and the same configuration for hash functions, the Time Interval Bloom Filter is better than the Time Bloom Filter in terms of false positive errors. If we set k in the Time Interval Bloom Filter to $(\ln 2)\frac{m}{n}$ like the Time Bloom Filter, the two filters have the same configuration for hash functions. Therefore, when we set k in the Time Interval Bloom Filter to $(\ln 2)\frac{m}{n}$, we guarantee that the Time Interval Bloom Filter is better than the Time Bloom Filter if the two filters have the same number of cells. Thus, for both Time Bloom Filters and Time Interval Bloom Filters, we set k to $(\ln 2)\frac{m}{n}$, where n is the number of non-duplicate RFID data within τ time.

8. Experiments

In order to measure the effectiveness of our approaches, we conducted an experimental evaluation. There are some works that deal with the duplicate RFID data elimination problem; however, to the best of our knowledge, our work is the first to systematically deal with approximate duplicate elimination in RFID data streams.
Therefore, we compared our two approaches, the Time Bloom Filter and the Time Interval Bloom Filter. Also, we included Bloom Filters to observe their behavior when applied to the RFID data stream. We experimented on a Pentium 3 GHz PC with 1 GB main memory using Java.

Data sets

Since there does not exist a well-known RFID data set, we made a synthetic data generator similar to [15]. To simulate the detection of an RFID tag, we used the detection model in [15]. If the distance between a tag and an RFID reader is far enough, the tag is not detected. As the tag approaches the RFID reader, the tag is detected with a probability proportional to the distance. This region is called the minor detection region, as shown in Fig. 12 [15]. When the tag and the RFID reader are close to each other, the detection probability is constant. This region is called the major detection region.

Fig. 13. The tag generation model: tags are generated at the departure, pass the readers, and are removed at the destination.
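The detection model described above can be sketched as a small Python function. This is an assumption-laden illustration: the parameter names `major_range`, `max_range`, and `p_major` are hypothetical, and the probability in the minor detection region is assumed to fall off linearly from the constant major-region value down to zero at the edge of the detection range. The text only says that it is proportional to the distance.

```python
import random


def detection_probability(distance, major_range, max_range, p_major):
    """Probability that a reader detects a tag at the given distance."""
    if distance <= major_range:
        return p_major   # major detection region: constant probability
    if distance >= max_range:
        return 0.0       # too far away: never detected
    # Minor detection region: assumed linear fall-off towards max_range.
    return p_major * (max_range - distance) / (max_range - major_range)


def is_detected(distance, major_range, max_range, p_major, rng):
    """Draw one simulated reading attempt using the random generator rng."""
    p = detection_probability(distance, major_range, max_range, p_major)
    return rng.random() < p


rng = random.Random(0)
for d in (0.5, 2.0, 3.5):
    print(d, detection_probability(d, 1.0, 3.0, 0.9), is_detected(d, 1.0, 3.0, 0.9, rng))
```

A generator built on such a function would call `is_detected` once per reader and per time step, which is how several readers at one location produce duplicate readings of the same tag.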

Fig. 14. The processing rate for SData according to the number of tuples (BF, TBF, TIBF).

Fig. 15. The processing rate for MData according to the number of tuples (BF, TBF, TIBF).

Since our goal is to test whether duplicate data is eliminated effectively, we assume that tags move in a straight line as shown in Fig. 13. Tags are generated only at the departure and then move in a straight line with a velocity that is assigned randomly when the tags are generated. We assign the same velocity to tags generated at the same time since tags generally move together in real applications. When a tag reaches the destination, it disappears. Also, along the straight line there are many detection locations, and there may be multiple readers at a detection location in order to improve the detection ratio. We generate $10^7$, $2 \times 10^7$, $3 \times 10^7$, $4 \times 10^7$, $5 \times 10^7$, and more tuples, both in the case where there is a single RFID reader at each detection location and in the case where there are three RFID readers at each detection location. We call the former data SData and the latter data MData. MData has more duplicate data than SData.

Experimental results

We evaluate the Time Bloom Filter and the Time Interval Bloom Filter with respect to the processing rate and the error rate. (The Time Interval Bloom Filter in this section uses space optimization.) We first show experimental results according to the number of tuples and then those according to the space size.

Fig. 16. The error rate for SData according to the number of tuples (BF, TBF, TIBF).

Experiments according to the number of tuples

Figs. 14 and 15 show the processing rate according to the number of tuples. The processing rate is the number of tuples that are processed in a unit of time. We fix the amount of allowable memory (in bits). The Bloom Filter (BF) shows the best processing rate among all approaches for SData, as shown in Fig. 14, since it has the simplest structure. In the Bloom Filter, the processing rate increases when the number of tuples increases. The reason is that k in the Bloom Filter decreases when the number of tuples increases, because k is set to $(\ln 2)\frac{m}{n}$. If k is small, we set and scan only a few cells in the Bloom Filter. However, in the Time Bloom Filter (TBF) and the Time Interval Bloom Filter (TIBF), since k is not affected by the number of tuples, the processing rate is constant.

For MData (Fig. 15), the processing rate shows a result similar to that for SData (Fig. 14). The processing rates of the Time Bloom Filter and the Time Interval Bloom Filter are constant, and that of the Bloom Filter increases as the number of tuples increases. Also, in the case of the Bloom Filter, the processing rate for MData is less than that for SData because k for MData is larger than k for SData. Note that k is set to $(\ln 2)\frac{m}{n}$, where m is the number of cells and n is the number of distinct elements in a set.

To validate the effectiveness of the Time Bloom Filter and the Time Interval Bloom Filter, we evaluate the error rate according to the number of tuples. The error rate of the Bloom Filter is much worse than those of the other approaches for both SData (Fig. 16) and MData (Fig. 17), as we expected. The error rate of the Time Interval Bloom Filter is the best among all approaches and is less than .7% in all cases. The error rate of the Time Bloom Filter is a little worse than that of the Time Interval Bloom Filter. The error rates of all approaches increase as the number of tuples increases.

Fig. 17. The error rate for MData according to the number of tuples (BF, TBF, TIBF).

Fig. 18. The processing rate for SData according to the space size (BF, TBF, TIBF).

Experiments according to the space size

We evaluate the processing rate and the error rate according to the space size for a fixed number of tuples. Figs. 18 and 19 show how the processing rate changes as the space size increases. In the Bloom Filter, the Time Bloom Filter, and the Time Interval Bloom Filter, since k is set proportional to the space size, the processing rate tends to decrease as the space size increases. While the processing rate of the Bloom Filter for SData (Fig. 18) is much better than those of the other approaches, the processing rate of the Bloom Filter for MData (Fig. 19) is only a little better than those of the Time Bloom Filter and the Time Interval Bloom Filter. This is because k of the Bloom Filter for MData increases much more than that for SData, compared to the Time Bloom Filter and the Time Interval Bloom Filter.

Fig. 19. The processing rate for MData according to the space size (BF, TBF, TIBF).

The error rate according to the space size is shown in Figs. 20 and 21. In the Bloom Filter, the error rate stays almost constant as the space increases since most of the entries in the Bloom Filter are set to 1. However, the error rates of the Time Bloom Filter and the Time Interval Bloom Filter decrease considerably as the space increases. As shown in Figs. 20 and 21, the error rate of the Time Interval Bloom Filter is better than those of the other approaches in many cases. Since the Bloom Filter cannot change its cells dynamically, its error rate is high even for a small number of tuples.

Fig. 20. The error rate for SData according to the space size (BF, TBF, TIBF).

Optimal k

As in the Bloom Filter, k affects the error rate of the Time Bloom Filter and the Time Interval Bloom Filter. To investigate whether the setting of k proposed in Section 7 is valid, we evaluate the error rate as k changes, starting from 1. Figs. 22 and 23 show the error rate according to k for a fixed space size and a fixed number of tuples. In Figs. 22 and 23, the thick circle marks the k value evaluated by the formula in Section 7, and the double circle marks the k value evaluated by applying the formula of Bloom Filters. In Figs. 22 and 23, all k values corresponding to the thick circles are good choices in terms of error rates. However, the error rates corresponding to the double circles are much worse than the optimal error rates. We evaluated the error rate according to k for various space sizes and data sizes. As a result, we conclude that the k value proposed in Section 7 is a good choice in most cases although it is not always optimal.

Fig. 21. The error rate for MData according to the space size (BF, TBF, TIBF).
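The setting of k from Section 7 that this experiment evaluates can be computed as in the following Python sketch. The function names are hypothetical, and rounding k to the nearest positive integer is an assumption, since the text does not say how a fractional value of $(\ln 2)\frac{m}{n}$ is handled. Here m is the number of cells and n is the number of non-duplicate RFID data within τ time (or, for a plain Bloom Filter, the number of distinct elements).

```python
import math


def optimal_k(m, n):
    """Number of hash functions k = (ln 2) * m / n, rounded to an integer >= 1."""
    return max(1, round(math.log(2) * m / n))


def false_positive_probability(m, n, k):
    """The approximation (1 - e^(-k n / m))^k used for (Time) Bloom Filters."""
    return (1.0 - math.exp(-k * n / m)) ** k


m, n = 1000, 100
k = optimal_k(m, n)  # ln 2 * 10 is about 6.93, so k = 7
for candidate in (3, k, 12):
    print(candidate, false_positive_probability(m, n, candidate))
```

Because n is measured within a window of τ for the time-based filters but over the whole stream for a plain Bloom Filter, the two formulas yield different k for the same data, which is the difference between the thick and double circles discussed above.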

Fig. 22. The error rate for SData according to k (TBF, TIBF).

Fig. 23. The error rate for MData according to k (TBF, TIBF).

9. Conclusion

Generally, RFID data streams include numerous duplicate data. Since RFID data is produced in a stream, it is difficult to eliminate duplicate RFID data in one pass with a small amount of memory. Thus, we propose approximate duplicate elimination methods, Time Bloom Filters and Time Interval Bloom Filters, based on Bloom Filters. In the Time Bloom Filter, we replace the bit array of the Bloom Filter with an array of time information since the definition of duplicates in RFID data is related to the detected time. Also, to reduce false positives, the Time Interval Bloom Filter uses a time interval. Finally, experimental results show that the proposed approaches can remove duplicate RFID data in one pass with a small error.

Acknowledgments

We would like to thank the editor and anonymous reviewers. This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST).

References

[1] Manage data successfully with RFID anywhere edge processing.
[2] Y. Bai, F. Wang, P. Liu, Efficiently filtering RFID data streams, VLDB Workshop on Clean Databases, 2006.
[3] C. Ban, B. Hong, D. Kim, Time parameterized interval R-tree for tracing tags in RFID systems, DEXA, 2005.
[4] B.H. Bloom, Space/time trade-offs in hash coding with allowable errors, Communications of the ACM 13 (7) (1970).
[5] C. Bornhövd, T. Lin, S. Haller, J. Schaper, Integrating automatic data acquisition with business processes - experiences with SAP's Auto-ID infrastructure, VLDB, 2004.
[6] Z. Cao, C. Sutton, Y. Diao, P. Shenoy, Distributed inference and query processing for RFID tracking and monitoring, PVLDB 4 (5) (2011).
[7] B. Carbunar, M.K. Ramanathan, M. Koyutürk, C. Hoffmann, A. Grama, Redundant reader elimination in RFID systems, IEEE SECON, 2005.
[8] S.S. Chawathe, V. Krishnamurthy, S. Ramachandran, S. Sarma, Managing RFID data, VLDB, 2004.
[9] A. Chebotko, S. Lu, X. Fei, F. Fotouhi, RDFProv: a relational RDF store for querying and managing scientific workflow provenance, Data & Knowledge Engineering 69 (8) (2010).
[10] F. Deng, D. Rafiei, Approximately detecting duplicates for streaming data using stable bloom filters, SIGMOD, 2006.
[11] H. Garcia-Molina, J.D. Ullman, J. Widom, Database System Implementation, Prentice Hall International, Inc., Upper Saddle River, New Jersey, 2000.
[12] H. Gonzalez, J. Han, X. Li, FlowCube: constructing RFID FlowCubes for multi-dimensional analysis of commodity flows, VLDB, 2006.
[13] H. Gonzalez, J. Han, X. Li, D. Klabjan, Warehousing and analyzing massive RFID data sets, ICDE, 2006.
[14] Y. Hu, S. Sundara, T. Chorma, J. Srinivasan, Supporting RFID-based item tracking applications in Oracle DBMS using a bitmap datatype, VLDB, 2005.
[15] S.R. Jeffery, M.N. Garofalakis, M.J. Franklin, Adaptive cleaning for RFID data streams, VLDB, 2006.
[16] C.-H. Lee, C.-W. Chung, Efficient storage scheme and query processing for supply chain management using RFID, SIGMOD, 2008.
[17] C.-H. Lee, C.-W. Chung, RFID data processing in supply chain management using a path encoding scheme, IEEE Transactions on Knowledge and Data Engineering (TKDE) 23 (5) (2011).
[18] J. Lu, X. Meng, T.W. Ling, Indexing and querying XML using extended Dewey labeling scheme, Data & Knowledge Engineering 70 (1) (2011).
[19] H. Mahdin, J. Abawajy, An approach to filtering duplicate RFID data streams, Communications in Computer and Information Science 124 (2010).
[20] A. Metwally, D. Agrawal, A.E. Abbadi, Duplicate detection in click streams, WWW, 2005.
[21] J. Rao, S. Doraiswamy, H. Thakkar, L.S. Colby, A deferred cleansing method for RFID data analytics, VLDB, 2006.
[22] F. Wang, P. Liu, Temporal management of RFID data, VLDB, 2005.
[23] X. Wang, Q. Zhang, Y. Jia, Efficiently filtering duplicates over distributed data streams, International Conference on Computer Science and Software Engineering (CSSE), 2008.

Chun-Hee Lee received a PhD degree in computer science from the Korea Advanced Institute of Science and Technology (KAIST), Korea, in 2010. His research interests include sensor networks, stream data management, and graph databases.

Chin-Wan Chung received a B.S. degree in electrical engineering from Seoul National University, Korea, in 1973, and a Ph.D. degree in computer engineering from the University of Michigan, Ann Arbor, USA. From 1983 to 1993, he was a Senior Research Scientist and a Staff Research Scientist in the Computer Science Department at the General Motors Research Laboratories (GMR). Since 1993, he has been a professor in the Department of Computer Science at the Korea Advanced Institute of Science and Technology (KAIST), Korea. He has served on the program committees of major database and Web conferences including ACM SIGMOD, VLDB, IEEE ICDE, and WWW. He was an associate editor of ACM TOIT, and is an associate editor of WWW Journal. He will be the general chair of WWW 2014. His current research interests include the semantic Web, social networks, mobile Web, sensor networks and stream data management, and multimedia databases.

Real-Time Detection of Invisible Spreaders

Real-Time Detection of Invisible Spreaders Real-Tie Detection of Invisible Spreaders MyungKeun Yoon Shigang Chen Departent of Coputer & Inforation Science & Engineering University of Florida, Gainesville, FL 3, USA {yoon, sgchen}@cise.ufl.edu Abstract

More information

The optimization design of microphone array layout for wideband noise sources

The optimization design of microphone array layout for wideband noise sources PROCEEDINGS of the 22 nd International Congress on Acoustics Acoustic Array Systes: Paper ICA2016-903 The optiization design of icrophone array layout for wideband noise sources Pengxiao Teng (a), Jun

More information

Investigation of The Time-Offset-Based QoS Support with Optical Burst Switching in WDM Networks

Investigation of The Time-Offset-Based QoS Support with Optical Burst Switching in WDM Networks Investigation of The Tie-Offset-Based QoS Support with Optical Burst Switching in WDM Networks Pingyi Fan, Chongxi Feng,Yichao Wang, Ning Ge State Key Laboratory on Microwave and Digital Counications,

More information

Different criteria of dynamic routing

Different criteria of dynamic routing Procedia Coputer Science Volue 66, 2015, Pages 166 173 YSC 2015. 4th International Young Scientists Conference on Coputational Science Different criteria of dynaic routing Kurochkin 1*, Grinberg 1 1 Kharkevich

More information

OPTIMAL COMPLEX SERVICES COMPOSITION IN SOA SYSTEMS

OPTIMAL COMPLEX SERVICES COMPOSITION IN SOA SYSTEMS Key words SOA, optial, coplex service, coposition, Quality of Service Piotr RYGIELSKI*, Paweł ŚWIĄTEK* OPTIMAL COMPLEX SERVICES COMPOSITION IN SOA SYSTEMS One of the ost iportant tasks in service oriented

More information

ELEVATION SURFACE INTERPOLATION OF POINT DATA USING DIFFERENT TECHNIQUES A GIS APPROACH

ELEVATION SURFACE INTERPOLATION OF POINT DATA USING DIFFERENT TECHNIQUES A GIS APPROACH ELEVATION SURFACE INTERPOLATION OF POINT DATA USING DIFFERENT TECHNIQUES A GIS APPROACH Kulapraote Prathuchai Geoinforatics Center, Asian Institute of Technology, 58 Moo9, Klong Luang, Pathuthani, Thailand.

More information

Computer Aided Drafting, Design and Manufacturing Volume 26, Number 2, June 2016, Page 13

Computer Aided Drafting, Design and Manufacturing Volume 26, Number 2, June 2016, Page 13 Coputer Aided Drafting, Design and Manufacturing Volue 26, uber 2, June 2016, Page 13 CADDM 3D reconstruction of coplex curved objects fro line drawings Sun Yanling, Dong Lijun Institute of Mechanical

More information

Energy-Efficient Disk Replacement and File Placement Techniques for Mobile Systems with Hard Disks

Energy-Efficient Disk Replacement and File Placement Techniques for Mobile Systems with Hard Disks Energy-Efficient Disk Replaceent and File Placeent Techniques for Mobile Systes with Hard Disks Young-Jin Ki School of Coputer Science & Engineering Seoul National University Seoul 151-742, KOREA youngjk@davinci.snu.ac.kr

More information

Structural Balance in Networks. An Optimizational Approach. Andrej Mrvar. Faculty of Social Sciences. University of Ljubljana. Kardeljeva pl.

Structural Balance in Networks. An Optimizational Approach. Andrej Mrvar. Faculty of Social Sciences. University of Ljubljana. Kardeljeva pl. Structural Balance in Networks An Optiizational Approach Andrej Mrvar Faculty of Social Sciences University of Ljubljana Kardeljeva pl. 5 61109 Ljubljana March 23 1994 Contents 1 Balanced and clusterable

More information

Utility-Based Resource Allocation for Mixed Traffic in Wireless Networks

Utility-Based Resource Allocation for Mixed Traffic in Wireless Networks IEEE IFOCO 2 International Workshop on Future edia etworks and IP-based TV Utility-Based Resource Allocation for ixed Traffic in Wireless etworks Li Chen, Bin Wang, Xiaohang Chen, Xin Zhang, and Dacheng

More information

Performance Analysis of RAID in Different Workload

Performance Analysis of RAID in Different Workload Send Orders for Reprints to reprints@benthascience.ae 324 The Open Cybernetics & Systeics Journal, 2015, 9, 324-328 Perforance Analysis of RAID in Different Workload Open Access Zhang Dule *, Ji Xiaoyun,

More information

Enhancing Real-Time CAN Communications by the Prioritization of Urgent Messages at the Outgoing Queue

Enhancing Real-Time CAN Communications by the Prioritization of Urgent Messages at the Outgoing Queue Enhancing Real-Tie CAN Counications by the Prioritization of Urgent Messages at the Outgoing Queue ANTÓNIO J. PIRES (1), JOÃO P. SOUSA (), FRANCISCO VASQUES (3) 1,,3 Faculdade de Engenharia da Universidade

More information

Fair Energy-Efficient Sensing Task Allocation in Participatory Sensing with Smartphones

Fair Energy-Efficient Sensing Task Allocation in Participatory Sensing with Smartphones Advance Access publication on 24 February 2017 The British Coputer Society 2017. All rights reserved. For Perissions, please eail: journals.perissions@oup.co doi:10.1093/cojnl/bxx015 Fair Energy-Efficient

More information

QUERY ROUTING OPTIMIZATION IN SENSOR COMMUNICATION NETWORKS

QUERY ROUTING OPTIMIZATION IN SENSOR COMMUNICATION NETWORKS QUERY ROUTING OPTIMIZATION IN SENSOR COMMUNICATION NETWORKS Guofei Jiang and George Cybenko Institute for Security Technology Studies and Thayer School of Engineering Dartouth College, Hanover NH 03755

More information

COMPUTER GENERATED HOLOGRAMS Optical Sciences 627 W.J. Dallas (Monday, August 23, 2004, 12:38 PM) PART III: CHAPTER ONE DIFFUSERS FOR CGH S

COMPUTER GENERATED HOLOGRAMS Optical Sciences 627 W.J. Dallas (Monday, August 23, 2004, 12:38 PM) PART III: CHAPTER ONE DIFFUSERS FOR CGH S COPUTER GEERATED HOLOGRAS Optical Sciences 67 W.J. Dallas (onday, August 3, 004, 1:38 P) PART III: CHAPTER OE DIFFUSERS FOR CGH S Part III: Chapter One Page 1 of 8 Introduction Hologras for display purposes

More information

Designing High Performance Web-Based Computing Services to Promote Telemedicine Database Management System
