Modeling Object Flows from Distributed and Federated RFID Data Streams for Efficient Tracking and Tracing (Supplemental)


Yanbo Wu, Quan Z. Sheng, Member, IEEE, Hong Shen, and Sherali Zeadally

1 WHY CENTRALIZED MODELS DO NOT WORK

If data from different nodes are stored in a centralized database using the schema above, the definitions themselves can answer the queries, and proper indices can improve performance. However, significant performance issues remain in large-scale systems such as IoT, where the number of nodes and objects can be very large. We discuss below why a centralized solution does not scale by considering the characteristics of RFID data management.

Frequent Updates. Data in IoT are generated as streams, so updates are far more frequent than in traditional database management systems, which are optimized for read-heavy workloads. With a centralized database (or database cluster), frequent updates not only increase the storage cost but, more importantly, make index maintenance expensive. For example, for streaming data, a B+ tree index could be huge. The space cost of the index itself is $O(Num_{objects} \times Length_{path})$, where $Num_{objects}$ is the total number of objects in the whole network and $Length_{path}$ is the average number of nodes on the paths along which the objects move.

Row Level Security Requirement. Business applications often require high security. Even in a federated system, the partners want to control their own data. For example, a supermarket wants to hide the purchase information about the same product from one supplier to another, while each supplier can access the data related to its own products. This requires row-level authentication, which incurs a high space overhead.

Archiving. Like other time-series data, RFID data is sensitive to time: recent records are of more interest to analysts than older ones.
For storage efficiency, it is often necessary to archive old data in order to make room for new data. However, different participants may define oldness differently (i.e., when the data should be archived). As a result, archiving old records requires a range query on End combined with a selection on Node. Furthermore, because the records of different nodes are likely to be stored in different pages, deletion leaves these pages heavily fragmented. Rearranging the records in these pages can be very costly and time consuming.

Mining Efficiency. Stream mining techniques such as online aggregation, critical layers, or popular path materialization all require a significant amount of memory in a central server.

Y. Wu is with Beijing Jiaotong University. ybwu@bjtu.edu.cn
Q.Z. Sheng and H. Shen are with the University of Adelaide.
S. Zeadally is with the University of the District of Columbia.

2 AN EXAMPLE OF EEC SLOT

Fig. 1: Example of LTTF Slot (a slot $s_i$ split into the EECs $EEC_{i0}$ to $EEC_{i4}$)

Figure 1 shows an example structure of the slot $s_i$ in LTTF. Slot $s_i$ represents the data of the time interval $[(2^i - 1) T_0, (2^{i+1} - 1) T_0)$. It is split into $w_s$ EECs; in this example, $w_s$ is 5. For each EEC, we maintain a histogram that models the distribution of data from/to a neighbor. The current size of a slot is defined as the number of used EECs in it, so $s_i.size \le w_s$. We split the slots further into EECs and maintain a histogram for each EEC, instead of one histogram per slot, because we want finer control over memory usage: by splitting the slots, we can limit the memory usage of each slot by changing the value of $w_s$. The larger $w_s$ is, the more memory the model requires and the more accurate the model is. In the extreme case where $w_s$ is 1, the slots are not split and the accuracy is the lowest. The height of each bar in the histogram of an EEC is the volume of objects flowing from/to a neighbor during the time represented by that EEC. The x-axis of the histogram represents the neighbors.

The length of the model is defined as the number of slots within it. A model of length $l$ can store a history of $w_s w_e \sum_{i=0}^{l-1} 2^i$. We can calculate the length of the model that stores the history of the past $t$ time units using:

$$l = \log_2\left(\frac{t}{w_s w_e} + 1\right) = \log_2\left(\frac{t}{T_0} + 1\right) \qquad (1)$$

Suppose we define the width of an event cycle as an hour and each slot contains 24 EECs, i.e., $w_e$ is an hour and $w_s$ is 24. Then we need only $\lceil \log_2 365 \rceil = 9$ slots to store the history of one year. This space efficiency enables the slots to be stored in main memory.

Algorithm: Merge the Model: merge
Input: the slot being merged, $s_i$; the model $M$
Output: the merged model
 1: if $s_{i+1}$ does not exist in $M$
 2:     $s_{i+1} \leftarrow$ new slot, $M$.append($s_{i+1}$)
 3: end if
 4: if $s_{i+1}$ is full, merge($s_{i+1}$, $M$)
 5: for each neighbor $b_j$ in $s_i$
 6:     $h_{i+1,j} \leftarrow s_{i+1}[b_j]$
 7:     if $h_{i+1,j}$ is nil
 8:         $s_{i+1}[b_j] \leftarrow h_{i+1,j} \leftarrow$ new array of size $w_s$
 9:     end if
10:     move $h_{i+1,j}[1] \ldots h_{i+1,j}[w_s/2]$ to $h_{i+1,j}[w_s/2+1] \ldots h_{i+1,j}[w_s]$
11:     for $k \leftarrow 1$ to $w_s/2$
12:         $h_{i+1,j}[k] \leftarrow s_i[b_j][2k] + s_i[b_j][2k+1]$
13:     end for
14: end for
Fig. 2: Algorithm to Merge the LTTF Model

3 MERGING WITH THE NEXT SLOT

When a new event cycle fills the most recent time slot $s_0$, the second most recent slot $s_1$ is merged into the succeeding slots recursively. The merge algorithm in Figure 2 first checks whether the next slot in the model is full. If so, that slot is merged first (line 4). This process repeats recursively until either a non-full slot is found or all the existing slots are full. In the latter case, a new slot is created and appended to the model. The slot being merged is then integrated into the first non-full slot (lines 5-14). It is compressed by merging every two consecutive EECs into one (line 12). Note that this algorithm assumes that the slot width $w_s$ is even. This assumption does not affect the performance of the algorithm.

4 THE BUSINESS NEIGHBOR TREE

There are two critical issues in real-life applications. First, a node may go offline, with or without notifying the other nodes in the network, while objects are still moving from/to that node. In this case, the information about the connections relevant to that node becomes unavailable. Second, a node $n_i$ and one of its source nodes $n_j$ may both be direct source nodes of $n_k$. We call this situation the Sideway Problem. Because of it, the algorithms in Section 5 of the main file may fail to find the actual source node. As shown in Figure 3a, suppose object $o$ moved from $n_j$ to $n_i$, and then from $n_i$ to $n_k$. Both $n_i$ and $n_j$ are in $n_k$'s business neighbor set, and over the past few event cycles $n_j$ has had more objects moving to $n_k$. Since $n_j$ is queried first, it will be treated as the source node from which object $o$ moved to $n_k$. This is incorrect, because the source node should be $n_i$.

Fig. 3: Sideway Problem and the Source Node Tree. (a) Sideway Problem; (b) Source Tree. Nodes shown: $n_k$, $n_i$, $n_j$, $n_a$, $n_b$.

A source tree at a node $n_k$ is a subgraph of the whole network topology, rooted at $n_k$, that represents the topology of the nodes with either direct or indirect connections to $n_k$. Figure 3b depicts the source tree maintained at $n_k$ for the sub-network shown in Figure 3a. The root of the tree is $n_k$. Its child nodes are the direct source nodes of $n_k$ (i.e., $n_i$, $n_j$ and $n_b$ in Figure 3b). The source tree is maintained simultaneously with the building of the flow synopsis.
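Stepping back to the merge procedure of Fig. 2 and the model-length formula in Equation (1), both can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the `Slot` class, the 0-based histogram layout (index 0 holds the most recent EEC), the emptying of a slot once it has been merged, and the shifting of histograms for neighbors that appear only in the next slot are illustrative choices that Fig. 2 does not spell out. As in the text, $w_s$ is assumed to be even.

```python
import math


class Slot:
    """One LTTF slot holding, per neighbor, a histogram over w_s EECs."""

    def __init__(self, w_s):
        self.w_s = w_s      # maximum number of EECs (assumed even)
        self.size = 0       # number of EECs currently in use
        self.hist = {}      # neighbor id -> list of w_s object counts

    def is_full(self):
        return self.size >= self.w_s


def merge(model, i):
    """Merge slot i of `model` (a list of Slot) into slot i + 1."""
    src = model[i]
    w_s, half = src.w_s, src.w_s // 2
    if i + 1 == len(model):                      # Fig. 2, lines 1-3
        model.append(Slot(w_s))
    if model[i + 1].is_full():                   # Fig. 2, line 4
        merge(model, i + 1)
    dst = model[i + 1]
    # Assumption: neighbors present only in dst are shifted as well.
    for b in set(src.hist) | set(dst.hist):
        old = src.hist.get(b, [0] * w_s)
        h = dst.hist.setdefault(b, [0] * w_s)    # lines 6-9
        h[half:] = h[:half]                      # line 10: shift older EECs back
        for k in range(half):                    # lines 11-13: pairwise compression
            h[k] = old[2 * k] + old[2 * k + 1]
    dst.size = min(w_s, dst.size + half)
    src.hist.clear()                             # assumption: a merged slot is emptied
    src.size = 0


def model_length(t, w_s, w_e):
    """Number of slots needed for the past t time units, per Equation (1)."""
    return math.ceil(math.log2(t / (w_s * w_e) + 1))
```

With $w_e$ of one hour and $w_s = 24$, `model_length(365 * 24, 24, 1)` evaluates to 9, in line with the one-year example in Section 2.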
In the algorithm that gathers the source node information, each neighbor also includes its own source tree in the result set $R$. This source tree is then merged with the local source tree. In this way, the source tree is built recursively. This structure adds more connectivity to the network, which increases the stability of the system when some nodes leave. However, it does not require replication of any form. It also helps to solve the Sideway Problem. Essentially, the source tree gathers the nodes that often lie on the same paths in the object flows. It does not store item-level information or statistics from other nodes, and a node cannot obtain this information without permission from its neighbors. As a result, although a subgraph is stored locally, no confidential information is exposed.

Solving Node Leaving Problem. With the source node tree, when a neighbor $n_i$ leaves the network, we can query its direct source nodes $S(n_i)$ to check whether an object comes from $n_i$, because if an object comes from a node's direct source node $n_i$, it must also come from one of $n_i$'s direct source nodes. In the source node tree, the LTTFs for all the nodes in $S(n_i)$ are maintained, together with the LTTF for $n_i$ itself. If an object comes from a node $n_a$ in $S(n_i)$, it is used in the maintenance of the LTTFs for both $n_i$ and $n_a$. In this way, even when $n_i$ is offline, its LTTF is kept accurate. We do not remove $n_i$ from the tree when it leaves. Instead, we mark it as disconnected. After it is back online, we remove the disconnected mark. This is useful for dealing with temporary departures of nodes (e.g., a short server breakdown). The LTTFs for the nodes in $S(n_i)$ are then deleted because we no longer need them to build the flow synopsis. In this way, even when some nodes in the network leave, we can still respond to tracing queries and continue to build the flow synopsis. If $n_i$ is still offline after a certain configurable time, it is removed from the tree and its LTTF is deleted or archived.

Solving Sideway Problem. To solve the Sideway Problem, in the algorithms that answer tracing queries and build the flow synopsis, a node that has both a direct and an indirect connection to the root node is queried regardless of its order of probability. For example, in Figure 3b, $n_j$ is not only a direct source node of the root node $n_k$, but also has an indirect connection ($n_j \to n_i \to n_k$) to $n_k$. Hence $n_j$ is queried regardless of its position in the sorted neighbor list. After receiving the result sets, we compare the arrival timestamps of the objects that appear in the result sets of more than one neighbor and select the node with the latest timestamp as the source node. For example, if the arrival timestamp of an object is earlier at $n_j$ than at $n_i$, it is clear that the object was sent through the path $n_j \to n_i \to n_k$.

5 DETERMINING $w_s$ AND $w_e$

Generally, our model is a synopsis for the RFID stream.
A generic description of such a data structure is that it supports two operations, update and computeAnswer [1]. In our model, the update operation takes more time than computeAnswer, because it involves network queries (see Section 6). More importantly, update may be so slow that the histograms for the previous event cycle have not yet been constructed when construction of the histograms for the current event cycle starts. In this case, the most recent histogram in slot $s_0$ is actually an old one, and the most valuable data is not used because it is not ready yet. Figure 4 illustrates such a situation. At the end of the second event cycle (EC2), the construction of the histograms for EC1 has not finished (it will finish at $w_e + t_u$, where $t_u$ is the time used for the update). To avoid this problem, $w_e$ should be large enough, i.e., larger than the maximum possible value of $t_u$. But it should not be too large: if it is, the pattern of object flow may change within an event cycle, and the change is not captured.

Fig. 4: When the Update Time Is Too Long (timeline of event cycles with marks at 0, $w_e$, $w_e + t_u$ and $2w_e$)

$t_u$ is determined by the size of the sample, the number of business neighbors, and the size of the whole network, as discussed in Section 6. It is a dynamic value. A good way to determine $w_e$ would be to make it a function of these three parameters. However, maintaining the size of the whole network is not an easy task and is costly. In this work, we propose an adaptive approach. When a node joins the network, it sets $w_e$ to a default value. During the maintenance of the model, if $t_u$ (see footnote 1) becomes larger than $w_e$ or smaller than $w_e / (2d)$, we set $w_e \leftarrow d \cdot t_u$ ($d > 1$; $d$ is a constant between 1 and 2). In real applications, after the network becomes stable, $t_u$ will likely become stable too, so $w_e$ is almost a constant. Our algorithms, which are based on the assumption that $w_e$ is a constant, are not affected.
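The adaptive rule for $w_e$ described above can be sketched as a small function. This is an illustrative reading of the rule rather than the authors' code: the function name, the default `d=1.5`, and in particular the lower threshold `w_e / (2 * d)` are assumptions, since the threshold is hard to read in the source.

```python
def adapt_event_cycle_width(w_e, t_u, d=1.5):
    """Return the event-cycle width to use after observing update time t_u.

    w_e: current event-cycle width; t_u: observed model update time
    (same time unit); d: constant between 1 and 2 (assumed default 1.5).
    """
    if not 1 < d < 2:
        raise ValueError("d is expected to be a constant between 1 and 2")
    too_slow = t_u > w_e               # update did not finish within one cycle
    too_fast = t_u < w_e / (2 * d)     # assumed threshold: cycle far wider than needed
    if too_slow or too_fast:
        return d * t_u                 # re-centre w_e slightly above t_u
    return w_e                         # within the band: keep w_e unchanged
```

For instance, with `w_e = 60` and `d = 1.5`, an observed `t_u` of 70 widens the cycle to 105, a `t_u` of 10 narrows it to 15, and a `t_u` of 30 leaves it unchanged. Once $t_u$ stabilizes, the function keeps returning the same width, which matches the remark that $w_e$ becomes almost constant.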
To deal with the unstable phase of model construction, each event cycle in the model is parameterized with a timestamp $t_{end}$ for the end of the cycle.

The maximum number of EECs in a slot (i.e., $w_s$) influences the memory usage of the LTTF model. Generally, the memory usage of the model is $O(l \cdot w_s \cdot n)$, where $l$ is the length of the model and $n$ is the number of neighbors. $w_s$ can be large as long as memory is sufficient; a larger $w_s$ gives the model a finer granularity.

Footnote 1: $t_u$ is an observable variable.

6 PERFORMANCE ANALYSIS

6.1 Model Maintenance Cost

The time complexity of maintaining the TISH model consists of three parts: i) sampling the input, ii) querying the source nodes, and iii) updating the model. Suppose $t_u$ is the time used to update the model for an event cycle. We have:

$$t_u = t_{sample} + t_{query} + t_{update} \qquad (2)$$

Using the reservoir algorithm, we need to scan the input only once, and this can be done on the fly. So the cost of this part is $O(x)$, where $x$ is the total number of objects in the current event cycle. In the worst case, the model update algorithms (Section 4 of the main file) go through the whole model, with a complexity of $O(l)$, where $l$ is the number of slots in the model. Note that the movement of histograms shown in Figure 2 (line 10) can be avoided, and that $l$ is often small. The most costly part is querying the source nodes, as the network latency is typically much longer than the local processing time. Obviously, when querying the source nodes, the queries can be sent out simultaneously. However, the network traffic cost will then be $O(n)$, where $n$ is the number of neighbors. This is not economical, because the flow synopsis does not need to be built in real time. Indeed, the TISH model is updated only when an event cycle ends, and the maintenance time $t_u$ is controlled to be less than an event cycle, as discussed in Section 5. With the algorithm shown in Section 5 of the main file, we can save network traffic by first querying the nodes that are most likely to be the source node. Moreover, we can also avoid the underlying P2P queries when no new business neighbors join. Even if one does, after the first event cycle the new neighbor will be in the LTTF model with high probability. As a result, the network cost is reduced.

The accuracy of the model is very important for reducing the network traffic cost of maintaining the model. We represent the real distribution of objects' source nodes during a specific time period as $B$, which is the neighbor list sorted in descending order by the real distribution (e.g., $B$ in Figure 5). For example, in Figure 5, there are five possible source nodes ($N_1$ to $N_5$), with probabilities of 50% for $N_1$, 30% for $N_2$, and 20% for $N_3$. The distribution modeled by the TISH model over the same period of time is denoted as $B'$. Figure 5 depicts different scenarios for the accuracy of the TISH model.

Fig. 5: Example in Modeling Accuracy
$B$:                     $N_1$, $N_2$, $N_3$, $N_4$, $N_5$
$B'_1$ ($\delta = 0$):   $N_1$, $N_3$, $N_2$, $N_4$, $N_5$
$B'_2$ ($\delta = 2$):   $N_1$, $N_2$, $N_4$, $N_5$, $N_3$
$B'_3$ ($\delta = 1$):   $N_1$, $N_2$, $N_4$, $N_3$, $N_5$

Suppose that the $m$ objects whose source nodes we want to query (the value of $m$ does not matter in this discussion) come from $y$ different nodes. In this case, $y \le m$. In Figure 5, $y$ is 3, because in the real distribution $B$ the $m$ objects can only come from $N_1$, $N_2$ and $N_3$. In the best-case scenario, $y$ queries are needed.
This is because we have to query the first $y$ nodes in order to find the source nodes of all objects. $B'_1$ is one of the best scenarios: although the order of $N_2$ and $N_3$ differs from that in $B$, 3 queries are still enough to find the source nodes. In general, the best scenario happens as long as the first $y$ nodes in the model $B'$ are the same as the first $y$ nodes in the real distribution $B$, regardless of their order. In the worst-case scenario, all the nodes in the neighbor list are queried. $B'_2$ is one of these cases. The neighbor $N_3$, which is indeed a source node, is ranked last in $B'_2$, so we have to query all 5 neighbors to find the source nodes of all objects. In general, the worst case happens when one of the first $y$ nodes of $B$ is ranked last in $B'$. Let $\delta$ be the number of extra queries made in addition to the optimal value $y$. We can conclude that:

$$\delta = \max_{i=1}^{y} \big(B'.rank(B[i])\big) - y \qquad (3)$$

For example, in Figure 5, $\delta$ is 0 for $B'_1$ because the first 3 nodes in $B$ are still ranked in the first 3 in $B'_1$, so $\max_{i=1}^{3}(B'_1.rank(B[i]))$ is 3. For $B'_2$, $N_3$ (among the first 3 of $B$) is ranked 5th in $B'_2$, so $\max_{i=1}^{3}(B'_2.rank(B[i]))$ is 5 and $\delta$ is 2 (i.e., 5 - 3). Similarly, for $B'_3$, $N_3$ is ranked 4th in $B'_3$, so $\delta$ is 1. The goal of our model is to keep the average value of $\delta$ as small as possible. We will demonstrate through extensive simulation experiments (described in Section 7) that our proposed TISH model is very accurate in this sense.

6.2 Tracing and Tracking Cost

The cost of tracking and tracing consists of two parts. If the initiating node of the query is not on the movement path of the object, the query must first be redirected to a node on the path. The cost of this step is the lookup cost of the underlying P2P overlay. However, in real applications, it is rare for a query to be initiated by a third party, so in most cases this part of the cost does not exist. The second part is the query cost.
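Returning briefly to Equation (3) of Section 6.1, the number of extra queries can be checked against the three scenarios of Figure 5 with a few lines of Python. This is only a sketch: the function name and the list encoding of $B$ and $B'$ are illustrative, and the neighbor orders are written out as they can be read from Figure 5.

```python
def extra_queries(real_order, model_order, y):
    """Extra queries beyond the optimum y, following Equation (3).

    real_order:  neighbors sorted by the real distribution (B)
    model_order: neighbors sorted by the modeled distribution (B')
    y:           number of nodes the queried objects actually come from
    """
    # 1-based rank of every neighbor in the modeled order B'
    rank = {node: r for r, node in enumerate(model_order, start=1)}
    # the worst-ranked true source decides how far down B' we must query
    return max(rank[node] for node in real_order[:y]) - y


B = ["N1", "N2", "N3", "N4", "N5"]
B1 = ["N1", "N3", "N2", "N4", "N5"]
B2 = ["N1", "N2", "N4", "N5", "N3"]
B3 = ["N1", "N2", "N4", "N3", "N5"]

for name, model in (("B'1", B1), ("B'2", B2), ("B'3", B3)):
    print(name, extra_queries(B, model, y=3))
```

The three calls reproduce the values discussed in Section 6.1: no extra query for $B'_1$, two for $B'_2$, and one for $B'_3$.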
The total cost of a trace query ($t_{trace}$) is a function of the length of the object's moving path. In general, $t_{trace} = \sum_{i=1}^{l} t_i$, where $t_i$ is the time used to establish the $i$-th segment of the path and $l$ is the length of the path. We assume that $t_i$ is proportional to the number of queries ($q$) made, so $t_i = q \cdot T$, where $T$ is a constant representing the normal remote query processing time. Clearly, the cost is determined by $q$. It is easy to infer that if the object comes from the $j$-th neighbor in $B$, then $j$ queries are needed to find it. Thus, the optimal average value of $q$ is (in this section, we reuse the notation of Section 6.1):

$$q = \sum_{j=1}^{y} \big(j \cdot B[j].probability\big) \qquad (4)$$

i.e., the order of querying follows the actual order of the neighbors sorted by the probability that the object comes from them. The number of queries $q'$ when using the TISH model is:

$$q' = \sum_{j=1}^{y} \big(B'.rank(B[j]) \cdot B[j].probability\big) \qquad (5)$$

Fig. 6: Examples of Tracing Efficiency (for each source node, $B[j].probability$ and the number of queries under $B$, $B'_1$, $B'_2$ and $B'_3$, with the resulting $q$ or $q'$)

Figure 6 shows an example of the optimal cost and the cost with the TISH model, using the examples in Figure 5. If the actual order ($B$) is followed when querying the source node, we need on average only $0.5 \times 1 + 0.3 \times 2 + 0.2 \times 3 = 1.7$ queries. But with a different order, for example $B'_1$, in which the order of $N_2$ and $N_3$ is reversed, we have to use $0.5 \cdot B'_1.rank(N_1) + 0.3 \cdot B'_1.rank(N_2) + 0.2 \cdot B'_1.rank(N_3) = 0.5 \times 1 + 0.3 \times 3 + 0.2 \times 2 = 1.8$ queries. The goal of tracing query processing is to make $q'$ as close to $q$ as possible. We will show in Section 7 that the TISH model can keep the difference below 0.1, i.e., in most cases, $B'$ contains the same neighbors in the same order as $B$.

7 SUPPLEMENTAL EXPERIMENTS

7.1 Model Maintenance Cost

Fig. 7: Distribution of Number of Network Calls for Model Maintenance

An interesting performance evaluation is to investigate how many network calls are used per query by the tracking and tracing algorithm, and what distribution they follow. We illustrate the result as a histogram in Figure 7. The majority of the queries use fewer than 10 network calls (10 being the average fan-out) to maintain the TISH model. According to the analysis in Section 6.1, we expect the average number of network calls for model maintenance to be close to the effective fan-out, which is the average number of active (non-idle) connections over all the nodes (5 in our experiments). The average number of network calls in this experiment is 5.11, which is close to what we expect.

7.2 Effects of Constant c

Fig. 8: Number of Network Calls against c

We are interested in how the constant $c$ affects the performance of the model. In this experiment, we examined the performance of model maintenance with values of $c$ from 1 to 5. It is clear that when $c$ is large, the accuracy becomes worse, because with a larger $c$ more old data are included and they bias the model. So we limit the test to a reasonable range. We ran the tests with both the Constant Patterns-only and the Mixed Without Random settings defined in Section 6 of the main file. Figure 8 shows the result of this experiment. With the Mixed Without Random setting, the performance is best when $c$ is 3. This is because when $c$ is smaller than 3, only the two most recent event cycles are used. The model becomes too sensitive to changes in the patterns.
Small biases in the sampling are amplified in the model. When $c$ is greater than 3, more historical data are used to calculate the probabilities, which makes the model insensitive to changes in the patterns. However, with Constant Pattern, a small $c$ (less than 4 in our results) does not affect the performance. This is because the network topology does not change; the only dynamic change is the idle/active state switching of connections. If $c$ is too large, the TISH model becomes insensitive to these dynamic changes and the performance becomes worse.

8 ADDITIONAL LITERATURE REVIEW

8.1 Data Structures and Data Transformation

A raw RFID record is a triple (ID, time, location) [2], which is of little value until it is transformed into a form suitable for application-level interactions. In addition, such data carries implicit meanings and relationships with other RFID records, about which applications have to make appropriate inferences [15]. In one of the earliest efforts on RFID data modeling, Wang et al. propose an RFID data model that abstracts static and dynamic entities including object, reader, location and transaction [16], [17]. It models the interactions between them as either state-based or event-based relationships. The data model also provides a rule-based data filter engine. In [12], RFID applications are classified into a set of typical scenarios, and a generalized data modeling framework with constructs for each typical scenario is proposed. In [6], Hu et al. focus on path encoding using a Bitmap data type and demonstrate that significant storage savings can be achieved. In [7], a novel data structure and algorithms to efficiently encode, decode and query an object's moving path in an RFID database are proposed. The approach has been further improved very recently in [8] to cope with long moving paths. Finally, Lin et al. propose in [11] a Multi-Table model in which path and containment relationships can be defined.

8.2 Knowledge Discovery

In the RFID data warehousing approach proposed by Gonzalez et al. [4], the authors observe that individual objects tend to move and stay together (i.e., bulky object movements in supply chains). They propose a novel model to compress RFID data without information loss. The model consists of a hierarchy of highly compact summaries of data aggregated at different abstraction levels where analysis takes place. These summaries are represented as RFID-cuboids. Each RFID-cuboid records object movements and stores product information for each RFID object, information on objects that stay together at a location, and the path information necessary to link multiple stay records. This work is further improved in [3] by discovering the Gateway nodes, which have either high fan-in or high fan-out edges. The RFID-cuboids are created based on this discovery to save more space. This method is restricted to processing static data in centralized environments and is not suitable for processing RFID data streams. In [9], Lee and Park propose a dynamic tracing model for supply chain management, which considers not only the movements but also the combination and splitting of RFID-attached objects. In [13], an algorithm to extract frequent sequential patterns from RFID data streams is proposed. The authors also introduce a decision tree model to determine the prime movements of tagged objects using the extracted patterns. TMS-RFID [10] is a system for managing temporal RFID data, which extracts complex temporal event patterns over RFID streams. In [5], the authors propose a probability evaluation model and algorithms for moving range query processing. These models focus on different aspects of modeling RFID data, but they all share the limitations of RFID-cuboid. SPIRE [14] is a novel RFID data interpretation and compression system.
It extracts location and containment relationships over RFID streams by applying a probabilistic algorithm over a time-varying graph model. This model can be deployed in either centralized or distributed environments. However, it still requires full access to all the data. The BRIDGE project is an implementation of the EPCglobal framework. It includes a probabilistic model that predicts the future location of RFID-attached objects using a Markov chain. However, this model is designed specifically for supply chain management systems. It requires synchronization with a static supply chain model, which makes it inflexible.

REFERENCES

[1] B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom. Models and Issues in Data Stream Systems. In Proceedings of the 21st ACM Symposium on Principles of Database Systems (PODS '02), Madison, Wisconsin.
[2] S. S. Chawathe, V. Krishnamurthy, S. Ramachandran, and S. Sarma. Managing RFID Data. In Proceedings of the 30th International Conference on Very Large Data Bases (VLDB '04), Toronto, Canada, September.
[3] H. Gonzalez, J. Han, H. Cheng, X. Li, D. Klabjan, and T. Wu. Modeling Massive RFID Data Sets: A Gateway-Based Movement Graph Approach. TKDE, 22:90-104.
[4] H. Gonzalez, J. Han, X. Li, and D. Klabjan. Warehousing and Analyzing Massive RFID Data Sets. In Proceedings of the 22nd International Conference on Data Engineering (ICDE '06), Atlanta, Georgia, USA, April.
[5] Y. Gu, G. Yu, N. Guo, and Y. Chen. Probabilistic Moving Range Query over RFID Spatio-temporal Data Streams. In Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM '09), Hong Kong, China.
[6] Y. Hu, S. Sundara, T. Chorma, and J. Srinivasan. Supporting RFID-Based Item Tracking Applications in Oracle DBMS Using a Bitmap Datatype. In Proceedings of the 31st International Conference on Very Large Data Bases (VLDB '05), Trondheim, Norway, September.
[7] C.-H. Lee and C.-W. Chung. Efficient Storage Scheme and Query Processing for Supply Chain Management Using RFID. In Proceedings of the 2008 ACM International Conference on Management of Data (SIGMOD '08), Vancouver, Canada.
[8] C.-H. Lee and C.-W. Chung. RFID Data Processing in Supply Chain Management Using a Path Encoding Scheme. IEEE Transactions on Knowledge and Data Engineering, 23(5).
[9] D. Lee and J. Park. RFID-based Traceability in the Supply Chain. Industrial Management and Data Systems, 108(6).
[10] X. Li, J. Liu, Q. Z. Sheng, S. Zeadally, and W. Zhong. TMS-RFID: Temporal Management of Large-scale RFID Applications. Information Systems Frontiers, 13(4).
[11] D. Lin, H. G. Elmongui, E. Bertino, and B. C. Ooi. Data Management in RFID Applications. In Proceedings of the 18th International Conference on Database and Expert Systems Applications (DEXA '07), Regensburg, Germany.
[12] S. Liu, F. Wang, and P. Liu. Integrated RFID Data Modeling: An Approach for Querying Physical Objects in Pervasive Computing. In Proceedings of the 15th ACM International Conference on Information and Knowledge Management, Arlington, Virginia, USA.
[13] T. Nakahara, T. Uno, and K. Yada. Extracting Promising Sequential Patterns from RFID Data Using the LCM Sequence. In Proceedings of the 14th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems (KES '10), Cardiff, UK.
[14] Y. Nie, R. Cocci, Z. Cao, Y. Diao, and P. Shenoy. SPIRE: Efficient Data Interpretation and Compression over RFID Streams. IEEE Transactions on Knowledge and Data Engineering, 24(1).
[15] Q. Z. Sheng, X. Li, and S. Zeadally. Enabling Next-Generation RFID Applications: Solutions and Challenges. IEEE Computer, 41(9):21-28, September.
[16] F. Wang and P. Liu. Temporal Management of RFID Data. In Proceedings of the 31st International Conference on Very Large Data Bases (VLDB '05), Trondheim, Norway, September.
[17] F. Wang, S. Liu, and P. Liu. A Temporal RFID Data Model for Querying Physical Objects. Pervasive and Mobile Computing, 6(3).
