Distributed Spatial k-dominant Skyline Maintenance Using Computational Object Preservation

Size: px
Start display at page:

Download "Distributed Spatial k-dominant Skyline Maintenance Using Computational Object Preservation"

Transcription

1 International Journal of Networking and Computing ISSN (print) ISSN (online) Volume 3, Number 2, pages , July 213 Distributed Spatial k-dominant Skyline Maintenance Using Computational Object Preservation Md. Anisuzzaman Siddique, Asif Zaman, Md. Mahbubul Islam Computer Science and Engineering Department University of Rajshahi Rajshahi-625, Bangladesh anis and Yasuhiko Morimoto Graduate School of Engineering Hiroshima University Kagamiyama 1-7-1, Higashi-Hiroshima , Japan Received: February 2, 213 Revised: May 14, 213 Accepted: June 18, 213 Communicated by Sayaka Kamei Abstract The existing k-dominant skyline solutions are restricted to centralized query processors, limiting scalability, and imposing a single point of failure. To overcome those problems in this paper, we propose the computation and maintenance algorithms for spatial k-dominant skyline query processing in large-scale distributed environment. Where the underlying dataset is partitioned into geographically distant computing core (personal computer) that are connected to the coordinator (server). Our proposed techniques preserve the spatial k-dominant computation object itself into a serialized form. This preservation is done in client s core after completing a computational job successfully. When the issue of maintenance comes in action, preserve data object retrieves and use for computation. This procedure eliminates the necessity of intermediate re-send and re-computation of k-dominant skyline for the maintenance issue. Thus, we quantify the gain of data transferring consecutively into different cores to maximize the overall gain as well as the query or balancing the load on different cores fairly. Extensive performance study shows that proposed algorithms are efficient and robust to different data distributions. Keywords: Skyline, k-dominant Skyline, Load table, Maintenance 1 Introduction An user often wants to optimize their decision making and selection criteria across multiple attributes. In many cases, especially in multi-criteria decision making, there is no single best answer to a query. Skyline queries do not require an explicit preference function, which may be difficult for the user to define when the relative importance of the different criteria is vague. 244

2 International Journal of Networking and Computing ID Price Distance h h Distance h 1 Skyline h 2 h h h 3 h 5 h h 4 (a) Hotels (b) Skyline Price Figure 1: Skyline Example A skyline query retrieves a set of skyline objects so that the user can choose promising objects from them and make further inquiries. Skyline objects in a database are objects that are not dominated by any other objects in the database. Given an n-dimensional database DB, an object O i is said to be in skyline of DB if there is no other object O j (i j) in DB such that O j is better than O i in all dimensions. If there exists such O j, then we say that O i is dominated by O j or O j dominates O i. Figure 1 shows a typical example of skyline. The table in the figure is a list of hotels, each of which contains two numerical attributes: distance and price, for online booking. A user chooses a hotel from the list according to her/his preference. In this situation, her/his choice usually comes from the hotels in skyline, i.e., one of h 1, h 3, h 4 (see Figure 1 (b)). Therefore, such skyline query functions are important for several database applications, including customer information systems, decision support, data visualization, and so forth. A number of efficient algorithms for computing skyline objects have been reported in the literature [1, 2, 3, 4, 5]. In a high-dimensional dataset a skyline query often retrieves too many objects to investigate. To minimize the result and to find more important and meaningful objects, Chan et al. considered k-dominant skyline query [6]. They relaxed the definition of dominated so that an object is likely to be dominated by another. Given an n-dimensional database, an object O i is said to k-dominates another object O j (i j) if there are k (k n) dimensions in which O i is better than or equal to O j. A k-dominant skyline object is an object that is not k-dominated by any other objects. Until now the k-dominant skyline literature has focused on centralized databases. In practice, data are often collected from multiple sources that are geographically distant from each other. For example, assume a travel agency has numerous computing cores (data sources), each of which is located in a different city (New York, Los Angeles, Chicago, etc.), and manages only local properties. A query may demand the k-dominant skyline of the properties in specific source via coordinator (server). For this or similar type application, a distributed architecture is more suitable. This is because, compared to the centralized counterpart, it has smaller computation cost, it allows better local data management, smaller update cost, and higher tolerance to machine failures. In this paper, we study the efficient maintenance of spatial k-dominant skyline queries, where the underlying dataset is partitioned into geographically distant computing cores that are connected to the coordinator. Assume a system architecture, that consists of a server with a network access point, and several core connect to the server (Figure 2). The server can directly communicate with any core and request for k-dominant skyline computation. In such a setting, each time the server sends a k-dominant skyline request with new dataset or maintenance request to a core, all objects information that are not k-dominated have to be transferred to the server. Our previous work [16] suffers from data transmission point of view. The server needs to resend the dataset again and again for each update. For example, assume initially cores C 1, C 2, and C 3 are assigned to process three regional datasets R 1, R 2, and R 3 respectively. After completing the computational work when a 245

3 Distributed Spatial k-dominant Skyline Maintenance Using Computational Object Preservation C2 C1 C3 C7 C4 C6 C5 Figure 2: System architecture core C 1, becomes free then the new dataset D 4 will be assigned to it. In our previous work [16], for computational complexity, C 1 did not store internal data structure related to D 1. Next, when dataset D 1 requires some maintenance work then the server needs to assign a new core to perform the same computation and maintenance operation respectively. This procedure consumes a lot of computing power. The proposed work overcomes that problem and the controlling server need not to resend the dataset again and again for each maintenance issue. In this approach data computation object itself is stored as binary logic object data in the corresponding computing core. This technique usually used to store image objects. As computation object itself is stored, we don t have to worry about the complexity of storing internal data structure related to a dataset. Computational objects are stored and can be restored again if necessary. Wu et al. [29] proposed distributed skyline query processing algorithm (DSL), to support skyline search on a horizontally partitioned dataset. DSL is not applicable to our problem, where the underlying database DB is horizontally partitioned onto the participating cores in an arbitrary manner. Specifically, we allow each core to contain any subset of DB, rather than only the data falling in a particular region. It is worth mentioning that the partitioning scheme adopted by DSL incurs expensive network overhead for maintenance DB. For example, assume that core C 1 receives an insertion to DB that, however, falls in the region assigned to core C 2. Since the object must be retained at core C 2, an object transfer (from core C 1 to core C 2 ) is required. In general, every insertion may be accomplished by an object transfer, which is prohibitively costly in practice. Similar thing will happen for delete as well as update operations. The proposed work does not suffer from these drawbacks, because each core can manage its local dataset without any network transmission. Here are the contribution of this paper: (1) We extend k-dominant computation from centralized to distributed datasets. We propose a framework for the maintenance of distributed spatial k-dominant skyline queries over multiple cores, aiming to reduce data transmission overhead. (2) We introduce three novel algorithms for distributed k-dominant skyline queries. The registration and ranking power calculation, and the job distributed algorithms work in server site. For client site we develop computation core selection algorithm. (4) Moreover, all computing cores are reliable, failure of one or more computing core does not create any problem, the rest of the computing cores are responsible to conduct the computational work. (5) We conduct extensive experiments to perform the evaluation. The evaluation results show that the proposed techniques are efficient and scalable. Our algorithms consistently outperform against its competitor multicore based spatial k-dominant skyline () in all examined setups. The remainder of this paper is organized as follows. Section 2 discusses related work. Section 3 presents the notions and properties of regional k-dominant skyline computation. We provide detailed 246

4 International Journal of Networking and Computing examples and analysis of our algorithm in Section 4. We experimentally evaluate our algorithm in Section 5 under a variety of settings. Finally, Section 6 concludes the paper. 2 Related Work Our work is motivated by previous studies of skyline query processing, k-dominant skyline query processing as well as distributed k-dominant skyline query processing, which are reviewed in this section. 2.1 Skyline Query Processing Borzsonyi et al. first introduced the skyline operator over large databases and proposed three algorithms: Block-N ested-loops(bn L), Divide-and-Conquer(D&C), and B-tree-based schemes [2]. BNL compares each object of the database with every other object, and reports it as a result only if any other object does not dominate it. A window W is allocated in main memory, and the input relation is sequentially scanned. In this way, a block of skyline objects is produced in every iteration. In case the window saturates, a temporary file is used to store objects that cannot be placed in W. This file is used as the input to the next pass. D&C divides the dataset into several partitions such that each partition can fit into memory. Skyline objects for each individual partition are then computed by a main-memory skyline algorithm. The final skyline is obtained by merging the skyline objects for each partition. Chomicki et al. improved BNL by presorting, they proposed Sort-F ilter-skyline(sf S) as a variant of BNL [4]. Among index-based methods, Tan et al. proposed two progressive skyline computing methods Bitmap and Index [7]. In the Bitmap approach, every dimension value of a point is represented by a few bits. By applying bit-wise AND operation on these vectors, a given point can be checked if it is in the skyline without referring to other points. The index method organizes a set of d-dimensional objects into d lists such that an object O is assigned to list i if and only if its value at attribute i is the best among all attributes of O. Each list is indexed by a B-tree, and the skyline is computed by scanning the B-tree until an object that dominates the remaining entries in the B-trees is found. The current most efficient method is Branch-and-Bound Skyline(BBS), proposed by Papadias et al., which is a progressive algorithm based on the best-first nearest neighbor (BF-NN) algorithm [5]. Instead of searching for nearest neighbor repeatedly, it directly prunes using the R*-tree structure. Recently, more aspects of skyline computation have been explored. Vlachou et al. introduce the concept of extended skyline set, which contains all data elements that are necessary to answer a skyline query in any arbitrary subspace [9]. Fotiadou et al. mention about the efficient computation of extended skylines using bitmaps in [2]. Chan et al. introduce the concept of skyline frequency to facilitate skyline retrieval in high-dimensional spaces [11]. Tao et al. discuss skyline queries in arbitrary subspaces [12]. 2.2 k-dominant Skyline Query Processing Chan et al. introduced k-dominant skyline query [6]. They proposed three algorithms, namely, One-Scan Algorithm (OSA), Two-Scan Algorithm (TSA), and Sorted Retrieval Algorithm (SRA). OSA uses the property that a k-dominant skyline objects cannot be worse than any skyline object on more than k dimensions. This algorithm maintains the skyline objects in a buffer during the scan of the dataset and uses them to prune away objects that are k-dominated. TSA retrieves a candidate set of dominant skyline objects in the first scan by comparing every object with a set of candidates. The second scan verifies whether these objects are truly dominant skyline objects or not. This method turns out to be much more efficient than the one-scan method. A theoretical analysis is provided to show the reason for its superiority. The third algorithm, SRA is motivated by the rank aggregation algorithm proposed by Fagin et al., which pre-sorts data objects separately according to each dimension and then merges these ranked lists [8]. The above skyline as well as k-dominant methods are designed for centralized database and do not work in the distributed environment. Several attempts have been made to address distributed 247

5 Distributed Spatial k-dominant Skyline Maintenance Using Computational Object Preservation skyline retrieval, leading to a number of interesting methods [17, 14]. Balke et al. extended the skyline problem for the web in which the attributes of an object are distributed in different webaccessible servers. They assumes that a d-dimensional database is vertically partitioned into d lists, where each list contains the values of a dimension in ascending order. They also develop an enhanced solution that, instead of visiting the lists in a round-robin fashion, always accesses the most promising list, so that the least expansion is performed on each list to discover an anchor point p. Huang et al. [14] study skyline search on hand-held devices in a mobile ad-hoc network (MANET). Each device store a tiny horizontal fraction of the underlying relation, and is able to communicate only with its neighbors, i.e., devices within its contact range. The connection is ad-hoc because (i) two devices cease to be neighbors when they move out of each other s contact range, and (ii) new neighbors are formed when two devices come close to each other. The objective is to compute the skyline over the data of all the devices and report it at the querying device Q, by consuming as little bandwidth as possible. 2.3 Distributed k-dominant Skyline Distributed skyline computation has recently attracted considerable attention and has been studied in a variety of distributed systems, including web information systems [17], parallel systems [23], peer-to-peer systems [18, 2, 21, 24, 25, 27, 28, 29], mobile ad-hoc networks [26], as well as more generic distributed systems [19, 22]. Most of the existing approaches focus on highly distributed environments, such as peer-to-peer networks, assuming that all data sources store common attributes. Several approaches belong to this category, namely DSL [29], SSP [27], SKYPEER [24], and BITPEER [2]. DSL was proposed by Wu et al. [29] and it is the first paper that addresses constrained skyline query processing over disjoint data partitions by using a structured peer-to-peer overlay, namely CAN. Wang et al. [27] propose the SSP algorithm based on the use of a tree based peer-to-peer overlay (BATON) for assigning data to servers. SKYPEER [24] transforms the multidimensional data into one-dimensional values and utilizes a thresholding scheme in order to reduce the transferred data. Fotiadou et al. [2] propose BITPEER for supporting efficiently continuous subspace skylines in a distributed setting by using distributed bitmap indexes. However, unfortunately none of the above methods are designed for distributed k-dominant skyline computation. This is because the maintenance of k-dominant skyline for an update is much more difficult due to the intransitivity of the k-dominance relation. Assume that A k-dominates B and B k-dominates C. However, A does not always k-dominate C. Moreover, C may k-dominate A. Because of the intransitivity property, we have to compare each object against every other object to check the k-dominance. Therefore in [15], Siddique and Morimoto, resolved k-dominant skyline maintenance problem in a centralized dataset. We proposed M SKS [16] algorithm for spatial k-dominant skyline computation and maintenance. The server needs to resend the dataset again and again for each update. In this paper, we expand the k-dominant skyline queries to spatially distributed datasets and resolve the dataset resending problem. 3 Preliminaries This section discusses the distributed k-dominant skyline computation, maintenance problems, and associated properties. Our objective is to extract the local k-dominant skyline and deliver it to a server computer CR. We refer CR as the coordinator of the k-dominant skyline retrieval. Assume there are x regional datasets named R 1, R 2,, R x and y computing cores called C 1, C 2,, C y. CR can be any computer other than C 1, C 2,, C y. CR is able to initiate communication with each core directly. Assume each dataset is n-dimensional and D 1, D 2,, D n are the n attributes of each regional dataset. Let O 1, O 2,, O r be r objects (tuples) of a regional dataset, say R p (1 p x). We use O i.d j to denote the j-th dimension value of O i. 248

6 International Journal of Networking and Computing 3.1 k-dominance An object O i is said to dominate another object O j, which we denote as O i O j, if O i.d s O j.d s for all dimensions D s (s = 1,, n) and O i.d t < O j.d t for at least one dimension D t (1 t n). We call such O i as dominant object and such O j as dominated object between O i and O j. By contrast, an object O i is said to k-dominate another object O j, denoted as O i k O j, if O i.d s O j.d s in k dimensions among n dimensions and O i.d t < O j.d t in one dimension among the k dimensions. We call such O i as k-dominant object and such O j as k-dominated object between O i and O j. An object O i is said to have δ-domination power if there are δ dimensions in which O i is better than or equal to all other objects of R p. 3.2 Spatial k-dominant Skyline An object O i R p (1 p x) is said to be a spatial skyline object of R p if O i is not dominated by any other object in R p. Similarly, an object O i R p is said to be a spatial k-dominant skyline object of R p if O i is not k-dominated by any other object in R p. We denote a set of all spatial k-dominant skyline objects in R p as Sky k (R p ). Note that objects that have k-domination power must be k-dominant skyline objects but not vice versa. 4 Distributed Spatial k-dominant Skyline In this section, we present our algorithms for job distributing and computing spatial k-dominant skyline objects and maintaining the result when update occurs. There is a crucial difference between our network architecture and those in [14, 9, 29]. The architecture in [14, 9, 29] aim at supporting massive distributed environments with a huge number of data machines. In our scenario the cores number is moderately small. Our job distribution algorithms are useful for efficient k-dominant skyline retrieval in distributed environment. Job distribution is important for three reasons. (i) Efficient distributing method enhances user experience by preventing a long idle period. (ii) It allocates the best computing core to the largest regional dataset. (iii) It provides core allocation guarantee for every regional dataset. We illustrate job distributing technique among multiple cores in Section 4.1. Then show how to compute spatial k-dominant skyline for all k at a time in Section 4.2. Section 4.3 presents three types of maintenance solution. They are insertion, deletion, and update operation. The distributed system is designed in such a manner for enhanced computation speed. If a user tries to compute in single core, it will consume a lot of computing time as well as computing power. The user may complain about the overhead of transmitting full dataset each time a new computation is initiated. The proposed techniques do not suffer from transmission issue. 4.1 Job Distribution If the number of regional dataset is equal to or smaller than the available computing core (i.e., x y) then there is no issue to consider for job distribution. However, when x > y, (i.e., the number of regional dataset is larger than the available computing core) job distribution issues occur. We propose three algorithms for job distribution. They are: registration and ranking power calculation, job allocation, and computation core selection. Among the algorithms registration and ranking power calculation and job allocation algorithms work in server site and the remaining computation core selection algorithm works in client site. Registration and Ranking Power Calculation Suppose there are y computing cores called C 1, C 2,, C y for distributed spatial k-dominant skyline computation. Every node needs to register itself to the coordinator CR. The CR has an independent thread for registering and ranking computing cores. A core (say C m ) sends a registration request 249

7 Distributed Spatial k-dominant Skyline Maintenance Using Computational Object Preservation Algorithm 1: Registration and Ranking Power Calculation Create LoadTable(ID, IP Address, Port no., Computing power, Memory size, Network traffic, Rank power, Status) 1. Start and wait 1ms for a registration request. 2. If new request arrives then i. Assign a new ID I m for requesting core C m. ii. Store C m s C m (IP ), C m (CP ), C m (M) in standard database iii. Set I m status as FREE. Else i. Select all cores whose status are not BUSY (i.e., either FREE or DEAD ). ii. For each core C m do the following steps: a) Ping each C m (IP ) 4-6 times and keep average response time C m (RT ). b) If C m (RT ) = infinity, i.e., destination host is unreachable, then Set C m status as DEAD and continue. c) Calculate C m ranking power, RP m = 6 * C m (CP ) + C m (M) +(C m (RT ) 2 ). d) Store RP m in LoadTable. 3. Go to step 1. to CR. Then the independent thread of CR creates a load table named LoadTable(ID, IP Address, Port no., Computing power, Memory size, Network traffic, Rank power, Status). ID can be any value between 1 to y. Each core status can be either FREE, BUSY or DEAD. Computing power C m (CP ), memory size C m (M) are supplied by corresponding core during the registration procedure. method periodically monitors the traffic status during RP m calculation. The average ping time or response time C m (RT ), from controlling core CR to C m uses as an indication of traffic status. If the network is busy, the average ping time will be high. Again if C m is disconnected then the average ping time becomes infinity. In such situation C m is marked as a dead core, i.e., core status field of LoadTable for C m is set as DEAD. We consider three parameters such as computing power, memory size, and network traffic to calculate RP m of C m. C m (CP ) and C m (M) are provided by C m during the registration period and are stored in CR s LoadTable. Rank power RP m indicates the computing power of each individual core and is calculated using formula: RP m = 6 * C m (CP ) + C m (M) +(C m (RT ) 2 ). In each ideal time (after waiting for new request up to a certain period of time), CR continuously computes the ranking power of cores whose status are not BUSY. In all other situations the proposed technique calculates RP m and stores it in LoadTable. Registration and ranking power calculation algorithm is shown in Algorithm 1. Job Allocation The next challenging task is distributing computation job among cores. To do this CR creates a QueueTable(Data id, Data location, Job type, Job status, Processing core ID, Processing time). Data id can be any value such as A, B, C, etc. Data location is the location of the dataset. Job type can be either New or Maintenance. Job status can be set as either Unprocessed or Processed. Processing core ID is the id of the participating core. Processing time is the completion time of the assigned job. A job request may have two different operations: completely new processing (i.e., Spatial k-dominant skyline computation) or a maintenance operation. If the job request is completely new then the type field in QueueTable populate with New otherwise set Maintenance. When CR receives a new job. It places the job in queue and sets job data status as Unprocessed. In CR an independent thread runs continuously to search the queue in order to find whether or not any new job is being submitted. If a job or a set of jobs has status Unprocessed then the thread starts parsing other field values for each new job. Assume a new request for dataset R p has arrived in queue. CR needs to find a suitable core to assign the computation responsibility. If CR receives New request for the dataset R p then, it searches LoadTable to find a free core say, C m with status FREE to do the job. It is possible to have more than one core with FREE status. In such situation it selects the core C m with highest Rank power. After that it initiates a new child thread at CR to consult the computation process with C m. In the child thread, C m status 25

8 International Journal of Networking and Computing Algorithm 2: Job Allocation Create LoadTable (ID, IP Address, Port no., Computing power, Memory size, Network traffic, Rank power, Status) QueueTable(Data id, Data location, Job type, Job status, Processing core ID, Processing time) 1. Search QueueTable for all entries with job status Unprocessed. 2. If found then for all available dataset do the following steps a. Read the job type. b. If the job type is New then i. Search LoadTable for a core C m with status FREE. ii. If C m found then set Status = BUSY Initiate new thread for assign computation: # Make RMI call to do the computation. # Receive the result. # Store result into a standard database. # Update queue, set core ID = C m and Job status = Processed. # Mark C m Status = FREE. iii. Else continue c. Elseif the job type is maintenance then i. Search QueueTable for the core C q who processed the dataset last time. ii. If C q found then Search LoadTable to get the current C q status. If C q Status = FREE then # Update LoadTable, set C q Status = BUSY. # Initiate the maintenance job: (.) Make RMI call to do the computation. (.) Receive result. (.) Store result into standard database. (.) Update queue, set core ID = C m and Job status = Processed. (.) Mark C q Status = FREE. Else continue Else continue 3. Go to step 1. is changed from FREE to BUSY. CR transfers R p to C m, and waits for result. Waiting period may vary depending on the size of R m. After receiving the computation result, the resultant data is being preserved in database and C m status is marked as FREE. The computation time and computing core ID are also stored in CR s data storage system. In case of Maintenance request CR searches QueueTable for the core say, C q who has processed R p most recently. After finding the core C q, CR searches LoadTable to obtain the information on current status of C q. If C q is available i.e., FREE then CR initiates necessary steps to send the maintenance request to C q and waits for result. After receiving the result, it updates the local database. However if C q is not free, CR has to wait until it becomes FREE. During the time of initiating a maintenance job to core C q, LoadTables status field is changed from FREE to BUSY. Meanwhile the dataset R p status field in QueueTable is set to Processed. Algorithm 2 represents the job allocation algorithm. Populating Load and Queue Table This section discusses about how to populate LoadTable and QueueTable. Assume two computing core (core 1 and core 2) send their registration request to CR. These requests are registered to LoadTable as shown in Table 1. Suppose at a certain period of time CR receives three data processing requests. Among the requests two for new data processing and remaining one for maintenance. At this stage the QueueTable becomes Table 2. From QueueTable CR finds some jobs with Unprocessed status. LoadTable shows that both cores have FREE status and core 1 has 251

9 Distributed Spatial k-dominant Skyline Maintenance Using Computational Object Preservation Table 1: LoadTable after complete the registration task ID IP Add. Port no. Comp. Power Mem. Size Net. traffic Rank Power Status GHz 248 MB 6 ms FREE GHz 248 MB 3 ms 26. FREE Table 2: QueueTable after receives three jobs Data id Data Location Job type Job status Processing core ID Processing time A C:\Data\a.csv New Unprocessed NA NA B C:\Data\b.csv New Unprocessed NA NA A C:\Data\x.csv Maintenance Unprocessed NA NA the highest ranking power. Then CR assigns the job with data id A to core 1 and data id B to core 2, and updates LoadTable and QueueTable accordingly (see Tables 3 and 4). Since now no core has FREE status, no new job initialization is possible until any one becomes FREE. Suppose after 15 seconds, computation process of data A has been completed. CR preserves the result. Updated LoadTable and QueueTable are shown in Tables 5 and 6, respectively. Finally, the maintenance job upon dataset A is assigned to core 1. Similar procedure will repeat for large number of new dataset computation as well as maintenance. Computation Core Selection The most challenging task is to develop a mechanism for clients end computation. In this approach, client s core, say, C c, begins its operation by issuing a participation registration request to CR. After that C c waits until any computation request is received. As we mentioned previously, there exist two types computation requests: New and Maintenance. If C c receives a New request, it simply creates an object of computation class and initiates the spatial k-dominant skyline computation. After performing the calculation, C c preserves resultant data and some internal status data of computing object for future maintenance operation. Preserving only resultant and status data may cause some difficulties to re-initiate k-dominate computation as well as maintenance. To overcome those problems we preserve the computing object itself instead of resultant and status data. C c serializes the computing object, and preserves the object as Binary Logic Object Data (BLOD) into a database of file system. Then it returns the result to CR and waits for a new job request. In case of receiving a Maintenance computation request for regional R p dataset, C c searches its local database or file system for the computation object whose dataset match with R p. It should be found, because CR sent such request upon finding that C c has already completed such computation in recent time. C c de-serializes the BLOD data and constructs computation object. After performing required maintenance operation, C c sends back the resultant data and again preserves the serialized form of computation object. Computation core selection algorithm is shown Algorithm Spatial k-dominant Skyline Computation Assume there is a regional dataset R p (1 p x) as shown in Table 7. In order to prune unnecessary objects efficiently in the k-dominant skyline computation, we compute domination power of each Table 3: LoadTable after assign the jobs between two cores ID IP Add. Port no. Comp. Power Mem. Size Net. traffic Rank Power Status GHz 248 MB 6 ms BUSY GHz 248 MB 3 ms 26. BUSY 252

10 International Journal of Networking and Computing Table 4: QueueTable after assign the jobs Data id Data Location Job type Job status Processing core ID Processing time A C:\Data\a.csv New Processing 1 NA B C:\Data\b.csv New Processing 2 NA A C:\Data\x.csv Maintenance Unprocessed NA NA Table 5: LoadTable after completing the job of core 1 ID IP Add. Port no. Comp. Power Mem. Size Net. traffic Rank Power Status GHz 248 MB 6 ms FREE GHz 248 MB 3 ms 26. BUSY Table 6: QueueTable after completing the job of core 1 Data id Data Location Job type Job status Processing core ID Processing time A C:\Data\a.csv New Processed 1 15 B C:\Data\b.csv New Unprocessed 2 NA A C:\Data\x.csv Maintenance Unprocessed NA NA Algorithm 3: Computation Core Selection Standard database management system (MySQL) 1. Each core with Status = FREE sends computation request to CR. 2. If new processing request arrives from CR then a. Check the job type. b. If the job type is New then i. Initiate new computation class object. ii. Perform spatial k-dominant skyline computation. iii. Serialize the computation object. iv. Preserve the computation object into standard database. v. Return result to the CR. c. Elseif the job type is Maintenance Search local database management system for processing object i. If found then # De-serialize the stored object to make it alive. # Perform the maintenance operation. # Serialize the computation object. # Replace the previous serialized object with the new one. # Return result to CR. ii. Else return ERROR: OBJECT NOT FOUND. 4. Return to step

11 Distributed Spatial k-dominant Skyline Maintenance Using Computational Object Preservation Table 7: Symbolic Regional Dataset, R p Obj. D 1 D 2 D 3 D 4 D 5 D 6 O O O O O O O Table 8: Ordered Domination Table Obj. D 1 D 2 D 3 D 4 D 5 D 6 DP Sum DC IDX O O 5 O O 7 O O 7 O O 7 O O 7 O O 7 O O 7 object. We sort objects in descending order by domination power. If more than one objects have the same domination power then sort those objects in ascending order of the sum value. This order reflects how to k-dominate other objects. Table 8 is the sorted object sequence of Table 7, in which the column DP is the domination power and the column Sum is the sum of all values. In the sequence, object O 7 has the highest domination power 4. Note that object O 7 dominates all objects below it in four attributes, D 1, D 2, D 5, and D 6. After computing the sorted object sequence, we compute dominated counter (DC) and dominant index (IDX). Detailed algorithm can be found in [15]. The dominated counter (DC) indicates the maximum number of dominated dimensions by another object in R p. The dominant index (IDX) is the strongest dominator. That means IDX keeps the record of the corresponding strongest dominator for each object. The columns DC and IDX of Table 8 show the result of the procedure. Sky k (R p ) is a set of objects whose DC is less than k. According to the dominated counter, we can see that Sky 6 (R p ) = {O 7, O 5, O 2, O 6, O 3 }, Sky 5 (R p ) = {O 7, O 5 }, and Sky 4 (R p ) = {O 7 }. Since there is no object whose DC value is less than 3, thus Sky 3 (R p ) = { }. 4.3 Spatial k-dominant Skyline Maintenance In this section, we discuss the maintenance problem of Sky k (R p ) after an update has occurred in R p. In the maintenance phase, we keep a vector that contains the minimal value for each dimension, which we call the minimal vector. The minimal vector of Table 8 is {1, 2, 1, 1, 1, 2}. We also keep an inverted index of the dominant index column (IDX). Table 9 is the inverted index of Table 8. Lemma 1: Assume O 1.D s O 2.D s for all dimensions D s (s = 1,, n) for O 1, O 2 R p. If O 1 is not deleted from R p, O 2 will never be in the k-dominant skyline of R p. Proof: Since O 2 is n-dominated by O 1, it is also k-dominated by O 1 because (k n). Therefore, while O 2 is in the R p, there will be at least one object, namely O 1, that k-dominates it and consequently cannot become a k-dominant skyline. It implies that only Sky n (R p ) objects are relevant for the k-dominant skyline maintenance task. 254

12 International Journal of Networking and Computing Table 9: Inverted Index Obj. dominated O 5 O 7 O 7 O 5, O 4, O 2, O 6, O 3, O 1 Table 1: Ordered Domination Table after Insertions Obj. DP Sum DC IDX Obj. DP Sum DC IDX O O 5 O O 5 O O 7 O O 7 O O 7 O O 7 O O 7 O O 7 O O 7 O O 7 O O 2 O O 7 O O 7 O O 1 O O 7 O O 7 O O 7 O O 7 O O 7 Insertion Assume O I is inserted into R p. We maintain Sky k (R p ) by examining the dominated counter (DC) by scanning the sorted object sequence. If the DC value of an object is updated, we also update the inverted index. During the update procedure, if O I is n-dominated by another object, then we can stop the procedure immediately. After updating DC and IDX, we insert O I into the sorted object sequence. We use a skip list data structure for the sequence so that we can insert efficiently. Assume we insert O 8 = (6, 6, 6, 6, 6, 6) in the running example. By comparing with the first object O 7, we can find that O 8 is 6 dominated by O 7. Therefore, we immediately complete the update of DC and IDX. Then, we insert O 8 into the last position of the sorted sequence. Next, we insert O 9 = (3, 4, 4, 2, 2, 3). O 9 is not 6 dominated by any object and we can find that the strongest dominator of O 9 is O 2. Then, DC and IDX are updated as in Table 1 after inserting O 9. Next, we insert O 1 = (2, 2, 3, 1, 2, 3). O 1 is not 6-dominated by any object and the strongest dominator of O 1 is O 7. Moreover, O 1 becomes the strongest dominator of O 9. Then, DC and IDX are updated as in Table 1 after inserting O 1. Deletion Assume O D is deleted from R p. We check the inverted index to examine whether there is O D s record in the index. Note that in Table 9 Obj. column in each record is the strongest dominator for objects in the dominated column. Therefore, if there is no O D s record in the inverted index, we do not have to change the dominated counter (DC) for other objects. Otherwise, we initialize DC for each object in the O D s index record. Then, we perform the pairwise comparison. In this case, we do not have to examine DC for objects that are not in the O D s index record and therefore it is not costly. Consider the running example again. Assume we delete O 1. We examine DC of O 9 by scanning Table 1. The updated result is Table 1. Update In this section, we propose maintenance solution for update operation. Although one can handle update operation with consecutive deletion and insertion. However, single update operation is usually faster than consecutive delete and insert operation. Assume O U is updated from R p. Then, 255

13 Distributed Spatial k-dominant Skyline Maintenance Using Computational Object Preservation Table 11: Ordered Domination Table after Updates Obj. DP Sum DC IDX Obj. DP Sum DC IDX O O 5 O O 4 O O 7 O O 7 O O 7 O O 5 O O 7 O O 5 O O 7 O O 4 O O 3 O O 5 O O 7 O O 4 Table 12: Response time for heterogeneous cores Core Type Time (ms) Single Core Core Core i Core i Core i same as delete operation, we have to check the inverted index to examine whether there is O U s record in the index. If there is no O U s record in the inverted index, then we need one scan to revise the dominated counter (DC) for updated object as well as all other data objects. Otherwise, we initialize DC for each object from scratch. However, in this situation we argue that compared with non-idx objects, the number of IDX objects is very few and therefore update operation is also not very costly. Consider Table 8 and select a non-idx object such as O 3 for update. Assume we update D 5 value of this object from 5 to 1. Then after dominance checking with other objects we find that O 3 becomes the strongest dominator of object O 6. This update procedure is shown in Table 11. Again from Table 8 if we select an IDX object such as O 7. In this case, we update the values of D 1 and D 5 from 1 to 6. The revised result after this update is shown in Table Performance Evaluation We conduct a series of experiments to evaluate the effectiveness and efficiency of our proposed methods. There are no other existing techniques that deal with the problem of maintaining k- dominant skyline. So in this paper, we compare our methods against M SKS, which was proposed in [16]. Our heterogeneous simulation takes five different cores (PCs) that are connected to a coordinator (server). Processors of those cores are as follows: single core, core2, core i3, core i5, and core i7. They have 3GB main memory. We choose independent distribution with 6 dimensionality and 1k data cardinality to evaluate each core performance. Table 12 shows the computation time of each core. We observe that high power computing cores take less time than those of low power cores. This result establishes the necessity of rank power computation. Next, our homogeneous simulation takes similar type of cores. The number of cores is moderately small (from 2 to 8). Each of the cores has an Intel(R) Core2 Duo 2 GHz CPU and 3GB main memory. We evaluate our algorithms against M SKS using different types of datasets. As benchmark, we use the datasets proposed by Borzsony et al. [2], in which there are three types of synthetic data distributions: correlated, anti-correlated, and independent. The generation of the synthetic datasets is controlled by three parameters, n, Size, and Dist, where n is the number of attributes, Size is the total number of objects in the dataset, and Dist can be any of the three distributions. In addition, we have generated smaller synthetic datasets for all insertion experiments. For example to conduct insertion experiment on 1k synthetic dataset, we have also generated additional 1k dataset. As for delete and update operations, we choose random object from the experimental dataset. 256

14 International Journal of Networking and Computing 5.1 Spatial k-dominant Computation Varying Core Number We first study the effect of spatial k-dominant computation varying core number on our techniques. We use anti-correlated, independent, and correlated datasets with dimensionality n to 7, cardinality to 1k, and k to 6. The number of cores varies from 2 to 8. Figure 3(a), (b), and (c) shows the time to compute spatial k-dominant skyline. Figure 3 shows that the performance of proposed methods outperform than M SKS for all data distribution. 5.2 Effect of Data Distribution To study the effect of data distributions we use anti-correlated, independent, and correlated datasets with dimensionality n to 7, cardinality to 1k, k to 6, and core number to 8. figure 4(a), (b), and (c) shows the time to maintain spatial k-dominant skyline for the maintenance ranges from 1% to 5%. In this experiment, 1% maintenance for 1k dataset implies that there are 333 insertions, 333 deletions, and 334 updates (total 1k updates) occurred in the dataset. Figure 4 shows that the performance of both methods deteriorates significantly with the increase of maintenance size. We further notice that for anti-correlated dataset, many spatial k-dominant skyline objects retrieved as a result the maintenance cost of this distribution incurs high computational overheads. 5.3 Effect of Dimensionality For this experiment, we choose the anti-correlated datasets and the core number is 8. We set the data cardinality to 1k and vary dataset dimensionality n ranges from 4 to 8 and k from 3 to 7. Figure 5(a), (b), and (c) shows the update performance. The M SKS technique is highly affected by the curse of dimensionality, i.e., as the space becomes sparser its response time rapidly decreases. Figure 5 shows that if the dimensionality and the maintenance ratio increase the response time of proposed methods grows steadily, which is much less than that of. 5.4 Effect of Cardinality For this experiment, we select the anti-correlated datasets with varying dataset cardinality ranges from 1k to 2k, set the value of n to 7, k to 6, and core number to 8. Figure 6(a), (b), and (c) shows the time to maintain k-dominant skyline for maintenance ranges from 1% to 5%. The result shows that if the maintenance ratio and data cardinality increase the response time of proposed methods also increases. However, the response time is much smaller than that of M SKS. 6 Conclusion k-dominant skyline has side effect of taking larger time for centralized computation. Centralized system also has maintenance problem. To solve those problems, we consider regionally distributed k-dominant skyline and present an efficient architecture for computing the query result as well as the maintenance problem. We introduced registration and ranking power calculation, job allocation, and computation core selection algorithms for efficient load distribution and shortening the response time. Our algorithms consistently outperform against M SKS algorithm. Intensive experiments show the efficiency and scalability of our proposed methods. However, current architecture can handle small number of cores. We leave the work to support large number of cores for future research. References [1] T. Xia, D. Zhang, and Y. Tao, On Skylining with Flexible Dominance Relation, in Proceedings of ICDE, 28, pp

15 Distributed Spatial k-dominant Skyline Maintenance Using Computational Object Preservation Core Number a) Correlated Core Number b) Anti-correlated Core Number c) Independent Figure 3: Computation performance for different core number 258

16 International Journal of Networking and Computing % 2% 3% 4% 5% Maintenance a) Correlated % 2% 3% 4% 5% Maintenance b) Anti-correlated 1% 2% 3% 4% 5% Maintenance c) Independent Figure 4: Maintenance performance for different data distributions 259

17 Distributed Spatial k-dominant Skyline Maintenance Using Computational Object Preservation % 2% 3% 4% 5% Maintenance a) 4 dimension % 2% 3% 4% 5% Maintenance b) 6 dimension % 2% 3% 4% 5% Maintenance c) 8 dimension Figure 5: Maintenance performance for different dimensions 26

18 International Journal of Networking and Computing % 2% 3% 4% Maintenance a) 1k % 2% 3% 4% 5% Maintenance b) 15k % 2% 3% 4% 5% Maintenance c) 2k Figure 6: Maintenance performance for different data cardinality 261

19 Distributed Spatial k-dominant Skyline Maintenance Using Computational Object Preservation [2] S. Borzsonyi, D. Kossmann, and K. Stocker, The skyline operator, in Proceedings of ICDE, 21, pp [3] D. Kossmann, F. Ramsak, and S. Rost, Shooting stars in the sky: An online algorithm for skyline queries, in Proceedings of VLDB, 22, pp [4] J. Chomicki, P. Godfrey, J. Gryz, and D. Liang, Skyline with Presorting, in Proceedings of ICDE, 23, pp [5] D. Papadias, Y. Tao, G. Fu, and B. Seeger, Progressive skyline computation in database systems, in ACM Transactions on Database Systems, vol. 3(1), pp , March 25. [6] C. Y. Chan, H. V. Jagadish, K-L. Tan, A-K. H. Tung, and Z. Zhang, Finding k-dominant Skyline in High Dimensional Space, in Proceedings of ACM SIGMOD, 26, pp [7] K.-L. Tan, P.-K. Eng, and B. C. Ooi, Efficient Progressive Skyline Computation, in Proceedings of VLDB, 21, pp [8] R. Fagin, A. Lotem, and M. Naor, Optimal aggregation algorithms for middleware, in Proceedings of ACM PODS, 21, pp [9] A. Vlachou, C. Doulkeridis, Y. Kotidis, and M. Vazirgiannis, SKYPEER: Efficient Subspace Skyline Computation over Distributed Data, in Proceedings of ICDE, 27, pp [1] K. Fotiadou and E. Pitoura, BITPEER: Continuous Subspace Skyline Computation with Distributed Bitmap Indexes, in Proceedings of DaAMaP, 28, pp [11] C. Y. Chan, H. V. Jagadish, K-L. Tan, A-K. H. Tung, and Z. Zhang, On High Dimensional Skylines, in Proceedings of EDBT, 26, pp [12] Y. Tao, X. Xiao, and J. Pei, Subsky: Efficient Computation of Skylines in Subspaces, in Proceedings of ICDE, 26, pp [13] W.-T Balke, U. Guntzer, and J. X. Zheng, Efficient Distributed Skylining for Web Information Systems, in Proceedings of EDBT, 24, pp [14] Z. Huang, C. S. Jensen, H. Lu, and B. C. Ooi, Skyline Queries Against Mobile Lightweight Devices in Manets, in Proceedings of ICDE, 26, pp. 66. [15] M. A. Siddique and Y. Morimoto, Efficient Maintenance of all k-dominant Skyline Query Results for Frequently Updated Database, in The International Journal on Advances in Software, Vol. 3, No. 3 & 4, 21, pp [16] M. A. Siddique, A. Zaman, M. M. Islam, and Y. Morimoto, Multicore based Spatial k- dominant Skyline Computation, in Proceedings of the 3rd Int l Conference on Networking and Computing (ICNC), 212, pp [17] W.-T. Balke, U. Guntzer, and J. X. Zheng, Efficient distributed skylining for web information systems, in Proceedings of EDBT, 24, pp [18] L. Chen, B. Cui, H. Lu, L. Xu, and Q. Xu, isky: Efficient and progressive skyline computing in a structured P2P network, in Proceedings of ICDCS, 28, pp [19] B. Cui, H. Lu, Q. Xu, L. Chen, Y. Dai, and Y. Zhou, Parallel distributed processing of constrained skyline queries by filtering, in Proceedings of ICDE, 28, pp [2] K. Fotiadou and E. Pitoura, BITPEER: Continuous subspace skyline computation with distributed bitmap indexes, in Proceedings of DAMAP, 28, pp [21] K. Hose, C. Lemke, and K.-U. Sattler, Processing relaxed skylines in PDMS using distributed data summaries, in Proceedings of CIKM, 26, pp

Multicore based Spatial k-dominant Skyline Computation

Multicore based Spatial k-dominant Skyline Computation 2012 Third International Conference on Networking and Computing Multicore based Spatial k-dominant Skyline Computation Md. Anisuzzaman Siddique, Asif Zaman, Md. Mahbubul Islam Computer Science and Engineering

More information

k-dominant and Extended k-dominant Skyline Computation by Using Statistics

k-dominant and Extended k-dominant Skyline Computation by Using Statistics k-dominant and Extended k-dominant Skyline Computation by Using Statistics Md. Anisuzzaman Siddique Graduate School of Engineering Hiroshima University Higashi-Hiroshima, Japan anis_cst@yahoo.com Abstract

More information

Distributed Skyline Processing: a Trend in Database Research Still Going Strong

Distributed Skyline Processing: a Trend in Database Research Still Going Strong Distributed Skyline Processing: a Trend in Database Research Still Going Strong Katja Hose Max-Planck Institute for Informatics Saarbrücken, Germany Akrivi Vlachou Norwegian University of Science and Technology,

More information

Constrained Skyline Query Processing against Distributed Data Sites

Constrained Skyline Query Processing against Distributed Data Sites Constrained Skyline Query Processing against Distributed Data Divya.G* 1, V.Ranjith Naik *2 1,2 Department of Computer Science Engineering Swarnandhra College of Engg & Tech, Narsapuram-534280, A.P., India.

More information

Finding Skylines for Incomplete Data

Finding Skylines for Incomplete Data Proceedings of the Twenty-Fourth Australasian Database Conference (ADC 213), Adelaide, Australia Finding Skylines for Incomplete Data Rahul Bharuka P Sreenivasa Kumar Department of Computer Science & Engineering

More information

The Multi-Relational Skyline Operator

The Multi-Relational Skyline Operator The Multi-Relational Skyline Operator Wen Jin 1 Martin Ester 1 Zengjian Hu 1 Jiawei Han 2 1 School of Computing Science Simon Fraser University, Canada {wjin,ester,zhu}@cs.sfu.ca 2 Department of Computer

More information

DERIVING SKYLINE POINTS OVER DYNAMIC AND INCOMPLETE DATABASES

DERIVING SKYLINE POINTS OVER DYNAMIC AND INCOMPLETE DATABASES How to cite this paper: Ghazaleh Babanejad, Hamidah Ibrahim, Nur Izura Udzir, Fatimah Sidi, & Ali Amer Alwan. (2017). Deriving skyline points over dynamic and incomplete databases in Zulikha, J. & N. H.

More information

MapReduce-Based Computation of Area Skyline Query for Selecting Good Locations in a Map

MapReduce-Based Computation of Area Skyline Query for Selecting Good Locations in a Map DEIM Forum 2018 A3-2 MapReduce-Based Computation of Area Skyline Query for Selecting Good Locations in a Map Chen LI, Annisa, Asif ZAMAN, and Yasuhiko MORIMOTO Graduate School of Engineering, Hiroshima

More information

Finding k-dominant Skylines in High Dimensional Space

Finding k-dominant Skylines in High Dimensional Space Finding k-dominant Skylines in High Dimensional Space Chee-Yong Chan, H.V. Jagadish 2, Kian-Lee Tan, Anthony K.H. Tung, Zhenjie Zhang School of Computing 2 Dept. of Electrical Engineering & Computer Science

More information

Keywords Skyline, Multi-criteria Decision Making, User defined Preferences, Sorting-Based Algorithm, Performance.

Keywords Skyline, Multi-criteria Decision Making, User defined Preferences, Sorting-Based Algorithm, Performance. Volume 4, Issue 7, July 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Skyline Computation

More information

Diversity in Skylines

Diversity in Skylines Diversity in Skylines Yufei Tao Department of Computer Science and Engineering Chinese University of Hong Kong Sha Tin, New Territories, Hong Kong taoyf@cse.cuhk.edu.hk Abstract Given an integer k, a diverse

More information

SUBSKY: Efficient Computation of Skylines in Subspaces

SUBSKY: Efficient Computation of Skylines in Subspaces SUBSKY: Efficient Computation of Skylines in Subspaces Yufei Tao Department of Computer Science City University of Hong Kong Tat Chee Avenue, Hong Kong taoyf@cs.cityu.edu.hk Xiaokui Xiao Department of

More information

Processing Rank-Aware Queries in P2P Systems

Processing Rank-Aware Queries in P2P Systems Processing Rank-Aware Queries in P2P Systems Katja Hose, Marcel Karnstedt, Anke Koch, Kai-Uwe Sattler, and Daniel Zinn Department of Computer Science and Automation, TU Ilmenau P.O. Box 100565, D-98684

More information

A Study on Reverse Top-K Queries Using Monochromatic and Bichromatic Methods

A Study on Reverse Top-K Queries Using Monochromatic and Bichromatic Methods A Study on Reverse Top-K Queries Using Monochromatic and Bichromatic Methods S.Anusuya 1, M.Balaganesh 2 P.G. Student, Department of Computer Science and Engineering, Sembodai Rukmani Varatharajan Engineering

More information

Revisited Skyline Query Algorithms on Streams of Multidimensional Data

Revisited Skyline Query Algorithms on Streams of Multidimensional Data Revisited Skyline Query Algorithms on Streams of Multidimensional Data Alexander Tzanakas, Eleftherios Tiakas, Yannis Manolopoulos Aristotle University of Thessaloniki, Thessaloniki, Greece Abstract This

More information

Goal Directed Relative Skyline Queries in Time Dependent Road Networks

Goal Directed Relative Skyline Queries in Time Dependent Road Networks Goal Directed Relative Skyline Queries in Time Dependent Road Networks K.B. Priya Iyer 1 and Dr. V. Shanthi2 1 Research Scholar, Sathyabama University priya_balu_2002@yahoo.co.in 2 Professor, St. Joseph

More information

Efficient Computation of Group Skyline Queries on MapReduce

Efficient Computation of Group Skyline Queries on MapReduce GSTF Journal on Computing (JOC) DOI: 10.5176/2251-3043_5.1.356 ISSN:2251-3043 ; Volume 5, Issue 1; 2016 pp. 69-76 The Author(s) 2016. This article is published with open access by the GSTF. Efficient Computation

More information

A survey of skyline processing in highly distributed environments

A survey of skyline processing in highly distributed environments The VLDB Journal (2012) 21:359 384 DOI 10.1007/s00778-011-0246-6 REGULAR PAPER A survey of skyline processing in highly distributed environments Katja Hose Akrivi Vlachou Received: 14 December 2010 / Revised:

More information

Friend Recommendation by Using Skyline Query and Location Information

Friend Recommendation by Using Skyline Query and Location Information Bulletin of Networking, Computing, Systems, and Software www.bncss.org, ISSN 2186 5140 Volume 5, Number 1, pages 68 72, January 2016 Friend Recommendation by Using Skyline Query and Location Information

More information

AGiDS: A Grid-based Strategy for Distributed Skyline Query Processing

AGiDS: A Grid-based Strategy for Distributed Skyline Query Processing : A Grid-based Strateg for Distributed Skline Quer Processing João B. Rocha-Junior, Akrivi Vlachou, Christos Doulkeridis, and Kjetil Nørvåg Norwegian Universit of Science and Technolog {joao,vlachou,cdoulk,noervaag}@idi.ntnu.no

More information

WS-Sky: An Efficient and Flexible Framework for QoS-Aware Web Service Selection

WS-Sky: An Efficient and Flexible Framework for QoS-Aware Web Service Selection 202 IEEE Ninth International Conference on Services Computing WS-Sky: An Efficient and Flexible Framework for QoS-Aware Web Service Selection Karim Benouaret, Djamal Benslimane University of Lyon, LIRIS/CNRS

More information

SA-IFIM: Incrementally Mining Frequent Itemsets in Update Distorted Databases

SA-IFIM: Incrementally Mining Frequent Itemsets in Update Distorted Databases SA-IFIM: Incrementally Mining Frequent Itemsets in Update Distorted Databases Jinlong Wang, Congfu Xu, Hongwei Dan, and Yunhe Pan Institute of Artificial Intelligence, Zhejiang University Hangzhou, 310027,

More information

Communication-Efficient Distributed Skyline Computation

Communication-Efficient Distributed Skyline Computation Communication-Efficient Distributed Skyline Computation ABSTRACT Haoyu Zhang Indiana University Bloomington Bloomington, IN 47408, USA hz30@umail.iu.edu In this paper we study skyline queries in the distributed

More information

Adapting Skyline Computation to the MapReduce Framework: Algorithms and Experiments

Adapting Skyline Computation to the MapReduce Framework: Algorithms and Experiments Adapting Skyline Computation to the MapReduce Framework: Algorithms and Experiments Boliang Zhang 1,ShuigengZhou 1, and Jihong Guan 2 1 School of Computer Science, and Shanghai Key Lab of Intelligent Information

More information

Efficient Parallel Skyline Processing using Hyperplane Projections

Efficient Parallel Skyline Processing using Hyperplane Projections Efficient Parallel Skyline Processing using Hyperplane Projections Henning Köhler henning@itee.uq.edu.au Jing Yang 2 jingyang@ruc.edu.cn Xiaofang Zhou,2 zxf@uq.edu.au School of Information Technology and

More information

Keywords : Skyline Join, Skyline Objects, Non Reductive, Block Nested Loop Join, Cardinality, Dimensionality

Keywords : Skyline Join, Skyline Objects, Non Reductive, Block Nested Loop Join, Cardinality, Dimensionality Volume 4, Issue 3, March 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Skyline Evaluation

More information

Progressive Skylining over Web-Accessible Database

Progressive Skylining over Web-Accessible Database Progressive Skylining over Web-Accessible Database Eric Lo a Kevin Yip a King-Ip Lin b David W. Cheung a a Department of Computer Science, The University of Hong Kong b Division of Computer Science, The

More information

Finding both Aggregate Nearest Positive and Farthest Negative Neighbors

Finding both Aggregate Nearest Positive and Farthest Negative Neighbors Finding both Aggregate Nearest Positive and Farthest Negative Neighbors I-Fang Su 1, Yuan-Ko Huang 2, Yu-Chi Chung 3,, and I-Ting Shen 4 1 Dept. of IM,Fotech, Kaohsiung, Taiwan 2 Dept. of IC, KYU, Kaohsiung,

More information

Parallel Distributed Processing of Constrained Skyline Queries by Filtering

Parallel Distributed Processing of Constrained Skyline Queries by Filtering Parallel Distributed Processing of Constrained Skyline Queries by Filtering Bin Cui 1,HuaLu 2, Quanqing Xu 1, Lijiang Chen 1, Yafei Dai 1, Yongluan Zhou 3 1 Department of Computer Science, Peking University,

More information

Leveraging Set Relations in Exact Set Similarity Join

Leveraging Set Relations in Exact Set Similarity Join Leveraging Set Relations in Exact Set Similarity Join Xubo Wang, Lu Qin, Xuemin Lin, Ying Zhang, and Lijun Chang University of New South Wales, Australia University of Technology Sydney, Australia {xwang,lxue,ljchang}@cse.unsw.edu.au,

More information

Dynamic Skyline Queries in Metric Spaces

Dynamic Skyline Queries in Metric Spaces Dynamic Skyline Queries in Metric Spaces Lei Chen and Xiang Lian Department of Computer Science and Engineering Hong Kong University of Science and Technology Clear Water Bay, Kowloon Hong Kong, China

More information

On High Dimensional Skylines

On High Dimensional Skylines On High Dimensional Skylines Chee-Yong Chan, H.V. Jagadish, Kian-Lee Tan, Anthony K.H. Tung, and Zhenjie Zhang National University of Singapore & University of Michigan {chancy,tankl,atung,zhangzh2}@comp.nus.edu.sg,

More information

Best Keyword Cover Search

Best Keyword Cover Search Vennapusa Mahesh Kumar Reddy Dept of CSE, Benaiah Institute of Technology and Science. Best Keyword Cover Search Sudhakar Babu Pendhurthi Assistant Professor, Benaiah Institute of Technology and Science.

More information

A Study on External Memory Scan-based Skyline Algorithms

A Study on External Memory Scan-based Skyline Algorithms A Study on External Memory Scan-based Skyline Algorithms Nikos Bikakis,2, Dimitris Sacharidis 2, and Timos Sellis 3 National Technical University of Athens, Greece 2 IMIS, Athena R.C., Greece 3 RMIT University,

More information

Efficient Processing of Top-k Spatial Preference Queries

Efficient Processing of Top-k Spatial Preference Queries Efficient Processing of Top-k Spatial Preference Queries João B. Rocha-Junior, Akrivi Vlachou, Christos Doulkeridis, and Kjetil Nørvåg Department of Computer and Information Science Norwegian University

More information

QSkycube: Efficient Skycube Computation Using Point Based Space Partitioning

QSkycube: Efficient Skycube Computation Using Point Based Space Partitioning : Efficient Skycube Computation Using Point Based Space Partitioning Jongwuk Lee, Seung won Hwang epartment of Computer Science and Engineering Pohang University of Science and Technology (POSTECH) Pohang,

More information

A NOVEL APPROACH ON SPATIAL OBJECTS FOR OPTIMAL ROUTE SEARCH USING BEST KEYWORD COVER QUERY

A NOVEL APPROACH ON SPATIAL OBJECTS FOR OPTIMAL ROUTE SEARCH USING BEST KEYWORD COVER QUERY A NOVEL APPROACH ON SPATIAL OBJECTS FOR OPTIMAL ROUTE SEARCH USING BEST KEYWORD COVER QUERY S.Shiva Reddy *1 P.Ajay Kumar *2 *12 Lecterur,Dept of CSE JNTUH-CEH Abstract Optimal route search using spatial

More information

Computing Continuous Skyline Queries without Discriminating between Static and Dynamic Attributes

Computing Continuous Skyline Queries without Discriminating between Static and Dynamic Attributes Computing Continuous Skyline Queries without Discriminating between Static and Dynamic Attributes Ibrahim Gomaa, Hoda M. O. Mokhtar Abstract Although most of the existing skyline queries algorithms focused

More information

Searching of Nearest Neighbor Based on Keywords using Spatial Inverted Index

Searching of Nearest Neighbor Based on Keywords using Spatial Inverted Index Searching of Nearest Neighbor Based on Keywords using Spatial Inverted Index B. SATYA MOUNIKA 1, J. VENKATA KRISHNA 2 1 M-Tech Dept. of CSE SreeVahini Institute of Science and Technology TiruvuruAndhra

More information

Processing Skyline Queries in Temporal Databases

Processing Skyline Queries in Temporal Databases Processing Skyline Queries in Temporal Databases Christos Kalyvas Department of Information and Communication Systems Engineering, University of the Aegean, Samos, Greece chkalyvas@aegean.gr Theodoros

More information

ISSN: (Online) Volume 4, Issue 1, January 2016 International Journal of Advance Research in Computer Science and Management Studies

ISSN: (Online) Volume 4, Issue 1, January 2016 International Journal of Advance Research in Computer Science and Management Studies ISSN: 2321-7782 (Online) Volume 4, Issue 1, January 2016 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

More information

ERCIM Alain Bensoussan Fellowship Scientific Report

ERCIM Alain Bensoussan Fellowship Scientific Report ERCIM Alain Bensoussan Fellowship Scientific Report Fellow: Christos Doulkeridis Visited Location : NTNU Duration of Visit: March 01, 2009 February 28, 2010 I - Scientific activity The research conducted

More information

Closest Keywords Search on Spatial Databases

Closest Keywords Search on Spatial Databases Closest Keywords Search on Spatial Databases 1 A. YOJANA, 2 Dr. A. SHARADA 1 M. Tech Student, Department of CSE, G.Narayanamma Institute of Technology & Science, Telangana, India. 2 Associate Professor,

More information

Flexible and Efficient Resolution of Skyline Query Size Constraints

Flexible and Efficient Resolution of Skyline Query Size Constraints 1 Flexible and Efficient Resolution of Syline Query Size Constraints Hua Lu, Member, IEEE, Christian S. Jensen, Fellow, IEEE Zhenjie Zhang Student Member, IEEE Abstract Given a set of multi-dimensional

More information

A Real Time GIS Approximation Approach for Multiphase Spatial Query Processing Using Hierarchical-Partitioned-Indexing Technique

A Real Time GIS Approximation Approach for Multiphase Spatial Query Processing Using Hierarchical-Partitioned-Indexing Technique International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2017 IJSRCSEIT Volume 2 Issue 6 ISSN : 2456-3307 A Real Time GIS Approximation Approach for Multiphase

More information

PathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data

PathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data PathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data Enhua Jiao, Tok Wang Ling, Chee-Yong Chan School of Computing, National University of Singapore {jiaoenhu,lingtw,chancy}@comp.nus.edu.sg

More information

Nowadays data-intensive applications play a

Nowadays data-intensive applications play a Journal of Advances in Computer Engineering and Technology, 3(2) 2017 Data Replication-Based Scheduling in Cloud Computing Environment Bahareh Rahmati 1, Amir Masoud Rahmani 2 Received (2016-02-02) Accepted

More information

d1 d2 d d1 d2 d3 (b) The corresponding bitmap structure

d1 d2 d d1 d2 d3 (b) The corresponding bitmap structure Efficient Progressive Skyline Computation Kian-Lee Tan Pin-Kwang Eng Beng Chin Ooi Department of Computer Science National University of Singapore Abstract In this paper, we focus on the retrieval of a

More information

Dominant Graph: An Efficient Indexing Structure to Answer Top-K Queries

Dominant Graph: An Efficient Indexing Structure to Answer Top-K Queries Dominant Graph: An Efficient Indexing Structure to Answer Top-K Queries Lei Zou 1, Lei Chen 2 1 Huazhong University of Science and Technology 137 Luoyu Road, Wuhan, P. R. China 1 zoulei@mail.hust.edu.cn

More information

High Dimensional Indexing by Clustering

High Dimensional Indexing by Clustering Yufei Tao ITEE University of Queensland Recall that, our discussion so far has assumed that the dimensionality d is moderately high, such that it can be regarded as a constant. This means that d should

More information

Top-k Keyword Search Over Graphs Based On Backward Search

Top-k Keyword Search Over Graphs Based On Backward Search Top-k Keyword Search Over Graphs Based On Backward Search Jia-Hui Zeng, Jiu-Ming Huang, Shu-Qiang Yang 1College of Computer National University of Defense Technology, Changsha, China 2College of Computer

More information

FM-WAP Mining: In Search of Frequent Mutating Web Access Patterns from Historical Web Usage Data

FM-WAP Mining: In Search of Frequent Mutating Web Access Patterns from Historical Web Usage Data FM-WAP Mining: In Search of Frequent Mutating Web Access Patterns from Historical Web Usage Data Qiankun Zhao Nanyang Technological University, Singapore and Sourav S. Bhowmick Nanyang Technological University,

More information

PSON: A Parallelized SON Algorithm with MapReduce for Mining Frequent Sets

PSON: A Parallelized SON Algorithm with MapReduce for Mining Frequent Sets 2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming PSON: A Parallelized SON Algorithm with MapReduce for Mining Frequent Sets Tao Xiao Chunfeng Yuan Yihua Huang Department

More information

A Joint Replication-Migration-based Routing in Delay Tolerant Networks

A Joint Replication-Migration-based Routing in Delay Tolerant Networks A Joint -Migration-based Routing in Delay Tolerant Networks Yunsheng Wang and Jie Wu Dept. of Computer and Info. Sciences Temple University Philadelphia, PA 19122 Zhen Jiang Dept. of Computer Science West

More information

Secure and Advanced Best Keyword Cover Search over Spatial Database

Secure and Advanced Best Keyword Cover Search over Spatial Database Secure and Advanced Best Keyword Cover Search over Spatial Database Sweety Thakare 1, Pritam Patil 2, Tarade Priyanka 3, Sonawane Prajakta 4, Prof. Pathak K.R. 4 B. E Student, Dept. of Computer Engineering,

More information

SKYLINE QUERIES FOR MULTI-CRITERIA DECISION SUPPORT SYSTEMS SATYAVEER GOUD GUDALA. B.E., Osmania University, 2009 A REPORT

SKYLINE QUERIES FOR MULTI-CRITERIA DECISION SUPPORT SYSTEMS SATYAVEER GOUD GUDALA. B.E., Osmania University, 2009 A REPORT SKYLINE QUERIES FOR MULTI-CRITERIA DECISION SUPPORT SYSTEMS by SATYAVEER GOUD GUDALA B.E., Osmania University, 2009 A REPORT submitted in partial fulfillment of the requirements for the degree MASTER OF

More information

FuxiSort. Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc

FuxiSort. Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc Fuxi Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc {jiamang.wang, yongjun.wyj, hua.caihua, zhipeng.tzp, zhiqiang.lv,

More information

Nearest Neighbour Expansion Using Keyword Cover Search

Nearest Neighbour Expansion Using Keyword Cover Search Nearest Neighbour Expansion Using Keyword Cover Search [1] P. Sai Vamsi Aravind MTECH(CSE) Institute of Aeronautical Engineering, Hyderabad [2] P.Anjaiah Assistant Professor Institute of Aeronautical Engineering,

More information

Mining Distributed Frequent Itemset with Hadoop

Mining Distributed Frequent Itemset with Hadoop Mining Distributed Frequent Itemset with Hadoop Ms. Poonam Modgi, PG student, Parul Institute of Technology, GTU. Prof. Dinesh Vaghela, Parul Institute of Technology, GTU. Abstract: In the current scenario

More information

Parallel Mining of Maximal Frequent Itemsets in PC Clusters

Parallel Mining of Maximal Frequent Itemsets in PC Clusters Proceedings of the International MultiConference of Engineers and Computer Scientists 28 Vol I IMECS 28, 19-21 March, 28, Hong Kong Parallel Mining of Maximal Frequent Itemsets in PC Clusters Vong Chan

More information

Top-K Ranking Spatial Queries over Filtering Data

Top-K Ranking Spatial Queries over Filtering Data Top-K Ranking Spatial Queries over Filtering Data 1 Lakkapragada Prasanth, 2 Krishna Chaitanya, 1 Student, 2 Assistant Professor, NRL Instiute of Technology,Vijayawada Abstract: A spatial preference query

More information

Frequent Itemsets Melange

Frequent Itemsets Melange Frequent Itemsets Melange Sebastien Siva Data Mining Motivation and objectives Finding all frequent itemsets in a dataset using the traditional Apriori approach is too computationally expensive for datasets

More information

A CSP Search Algorithm with Reduced Branching Factor

A CSP Search Algorithm with Reduced Branching Factor A CSP Search Algorithm with Reduced Branching Factor Igor Razgon and Amnon Meisels Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, 84-105, Israel {irazgon,am}@cs.bgu.ac.il

More information

Efficient Index Based Query Keyword Search in the Spatial Database

Efficient Index Based Query Keyword Search in the Spatial Database Advances in Computational Sciences and Technology ISSN 0973-6107 Volume 10, Number 5 (2017) pp. 1517-1529 Research India Publications http://www.ripublication.com Efficient Index Based Query Keyword Search

More information

Peer-to-Peer Systems. Chapter General Characteristics

Peer-to-Peer Systems. Chapter General Characteristics Chapter 2 Peer-to-Peer Systems Abstract In this chapter, a basic overview is given of P2P systems, architectures, and search strategies in P2P systems. More specific concepts that are outlined include

More information

Continuous Processing of Preference Queries in Data Streams

Continuous Processing of Preference Queries in Data Streams Continuous Processing of Preference Queries in Data Streams Maria Kontaki, Apostolos N. Papadopoulos, and Yannis Manolopoulos Department of Informatics, Aristotle University 54124 Thessaloniki, Greece

More information

Similarity Search: A Matching Based Approach

Similarity Search: A Matching Based Approach Similarity Search: A Matching Based Approach Anthony K. H. Tung Rui Zhang Nick Koudas Beng Chin Ooi National Univ. of Singapore Univ. of Melbourne Univ. of Toronto {atung, ooibc}@comp.nus.edu.sg rui@csse.unimelb.edu.au

More information

Parallelizing Structural Joins to Process Queries over Big XML Data Using MapReduce

Parallelizing Structural Joins to Process Queries over Big XML Data Using MapReduce Parallelizing Structural Joins to Process Queries over Big XML Data Using MapReduce Huayu Wu Institute for Infocomm Research, A*STAR, Singapore huwu@i2r.a-star.edu.sg Abstract. Processing XML queries over

More information

Data Communication and Parallel Computing on Twisted Hypercubes

Data Communication and Parallel Computing on Twisted Hypercubes Data Communication and Parallel Computing on Twisted Hypercubes E. Abuelrub, Department of Computer Science, Zarqa Private University, Jordan Abstract- Massively parallel distributed-memory architectures

More information

Ranking Spatial Data by Quality Preferences

Ranking Spatial Data by Quality Preferences International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 10, Issue 7 (July 2014), PP.68-75 Ranking Spatial Data by Quality Preferences Satyanarayana

More information

9/23/2009 CONFERENCES CONTINUOUS NEAREST NEIGHBOR SEARCH INTRODUCTION OVERVIEW PRELIMINARY -- POINT NN QUERIES

9/23/2009 CONFERENCES CONTINUOUS NEAREST NEIGHBOR SEARCH INTRODUCTION OVERVIEW PRELIMINARY -- POINT NN QUERIES CONFERENCES Short Name SIGMOD Full Name Special Interest Group on Management Of Data CONTINUOUS NEAREST NEIGHBOR SEARCH Yufei Tao, Dimitris Papadias, Qiongmao Shen Hong Kong University of Science and Technology

More information

Analysis of Basic Data Reordering Techniques

Analysis of Basic Data Reordering Techniques Analysis of Basic Data Reordering Techniques Tan Apaydin 1, Ali Şaman Tosun 2, and Hakan Ferhatosmanoglu 1 1 The Ohio State University, Computer Science and Engineering apaydin,hakan@cse.ohio-state.edu

More information

Efficient Algorithm for Frequent Itemset Generation in Big Data

Efficient Algorithm for Frequent Itemset Generation in Big Data Efficient Algorithm for Frequent Itemset Generation in Big Data Anbumalar Smilin V, Siddique Ibrahim S.P, Dr.M.Sivabalakrishnan P.G. Student, Department of Computer Science and Engineering, Kumaraguru

More information

Mining Frequent Itemsets for data streams over Weighted Sliding Windows

Mining Frequent Itemsets for data streams over Weighted Sliding Windows Mining Frequent Itemsets for data streams over Weighted Sliding Windows Pauray S.M. Tsai Yao-Ming Chen Department of Computer Science and Information Engineering Minghsin University of Science and Technology

More information

Spatial Index Keyword Search in Multi- Dimensional Database

Spatial Index Keyword Search in Multi- Dimensional Database Spatial Index Keyword Search in Multi- Dimensional Database Sushma Ahirrao M. E Student, Department of Computer Engineering, GHRIEM, Jalgaon, India ABSTRACT: Nearest neighbor search in multimedia databases

More information

Dynamic Skyline Queries in Large Graphs

Dynamic Skyline Queries in Large Graphs Dynamic Skyline Queries in Large Graphs Lei Zou, Lei Chen 2, M. Tamer Özsu 3, and Dongyan Zhao,4 Institute of Computer Science and Technology, Peking University, Beijing, China, {zoulei,zdy}@icst.pku.edu.cn

More information

A Survey on Efficient Location Tracker Using Keyword Search

A Survey on Efficient Location Tracker Using Keyword Search A Survey on Efficient Location Tracker Using Keyword Search Prasad Prabhakar Joshi, Anand Bone ME Student, Smt. Kashibai Navale Sinhgad Institute of Technology and Science Kusgaon (Budruk), Lonavala, Pune,

More information

Implementation of Skyline Sweeping Algorithm

Implementation of Skyline Sweeping Algorithm Implementation of Skyline Sweeping Algorithm BETHINEEDI VEERENDRA M.TECH (CSE) K.I.T.S. DIVILI Mail id:veeru506@gmail.com B.VENKATESWARA REDDY Assistant Professor K.I.T.S. DIVILI Mail id: bvr001@gmail.com

More information

Mining Thick Skylines over Large Databases

Mining Thick Skylines over Large Databases Mining Thick Skylines over Large Databases Wen Jin 1,JiaweiHan,andMartinEster 1 1 School of Computing Science, Simon Fraser University {wjin,ester}@cs.sfu.ca Department of Computer Science, Univ. of Illinois

More information

Progressive Skyline Computation in Database Systems

Progressive Skyline Computation in Database Systems Progressive Skyline Computation in Database Systems DIMITRIS PAPADIAS Hong Kong University of Science and Technology YUFEI TAO City University of Hong Kong GREG FU JP Morgan Chase and BERNHARD SEEGER Philipps

More information

Load Balancing Algorithm over a Distributed Cloud Network

Load Balancing Algorithm over a Distributed Cloud Network Load Balancing Algorithm over a Distributed Cloud Network Priyank Singhal Student, Computer Department Sumiran Shah Student, Computer Department Pranit Kalantri Student, Electronics Department Abstract

More information

Maintenance of the Prelarge Trees for Record Deletion

Maintenance of the Prelarge Trees for Record Deletion 12th WSEAS Int. Conf. on APPLIED MATHEMATICS, Cairo, Egypt, December 29-31, 2007 105 Maintenance of the Prelarge Trees for Record Deletion Chun-Wei Lin, Tzung-Pei Hong, and Wen-Hsiang Lu Department of

More information

Nearest Neighbor Search on Vertically Partitioned High-Dimensional Data

Nearest Neighbor Search on Vertically Partitioned High-Dimensional Data Nearest Neighbor Search on Vertically Partitioned High-Dimensional Data Evangelos Dellis, Bernhard Seeger, and Akrivi Vlachou Department of Mathematics and Computer Science, University of Marburg, Hans-Meerwein-Straße,

More information

Adaptive Processing of Multi-Criteria Decision Support Queries

Adaptive Processing of Multi-Criteria Decision Support Queries Adaptive Processing of Multi-Criteria Decision Support Queries Shweta Srivastava, Venkatesh Raghavan and Elke A. Rundensteiner Microsoft Corporation, One Memorial Drive, Cambridge, Massachusetts, USA Data

More information

A Schema Extraction Algorithm for External Memory Graphs Based on Novel Utility Function

A Schema Extraction Algorithm for External Memory Graphs Based on Novel Utility Function DEIM Forum 2018 I5-5 Abstract A Schema Extraction Algorithm for External Memory Graphs Based on Novel Utility Function Yoshiki SEKINE and Nobutaka SUZUKI Graduate School of Library, Information and Media

More information

Efficiently Evaluating Skyline Queries on RDF Databases

Efficiently Evaluating Skyline Queries on RDF Databases Efficiently Evaluating Skyline Queries on RDF Databases Ling Chen, Sidan Gao, and Kemafor Anyanwu Semantic Computing Research Lab, Department of Computer Science North Carolina State University {lchen10,sgao,kogan}@ncsu.edu

More information

A Case for Merge Joins in Mediator Systems

A Case for Merge Joins in Mediator Systems A Case for Merge Joins in Mediator Systems Ramon Lawrence Kirk Hackert IDEA Lab, Department of Computer Science, University of Iowa Iowa City, IA, USA {ramon-lawrence, kirk-hackert}@uiowa.edu Abstract

More information

NOVEL CACHE SEARCH TO SEARCH THE KEYWORD COVERS FROM SPATIAL DATABASE

NOVEL CACHE SEARCH TO SEARCH THE KEYWORD COVERS FROM SPATIAL DATABASE NOVEL CACHE SEARCH TO SEARCH THE KEYWORD COVERS FROM SPATIAL DATABASE 1 Asma Akbar, 2 Mohammed Naqueeb Ahmad 1 M.Tech Student, Department of CSE, Deccan College of Engineering and Technology, Darussalam

More information

Compression of the Stream Array Data Structure

Compression of the Stream Array Data Structure Compression of the Stream Array Data Structure Radim Bača and Martin Pawlas Department of Computer Science, Technical University of Ostrava Czech Republic {radim.baca,martin.pawlas}@vsb.cz Abstract. In

More information

Improving the Efficiency of Fast Using Semantic Similarity Algorithm

Improving the Efficiency of Fast Using Semantic Similarity Algorithm International Journal of Scientific and Research Publications, Volume 4, Issue 1, January 2014 1 Improving the Efficiency of Fast Using Semantic Similarity Algorithm D.KARTHIKA 1, S. DIVAKAR 2 Final year

More information

Enhanced Performance of Database by Automated Self-Tuned Systems

Enhanced Performance of Database by Automated Self-Tuned Systems 22 Enhanced Performance of Database by Automated Self-Tuned Systems Ankit Verma Department of Computer Science & Engineering, I.T.M. University, Gurgaon (122017) ankit.verma.aquarius@gmail.com Abstract

More information

Answering Aggregate Queries Over Large RDF Graphs

Answering Aggregate Queries Over Large RDF Graphs 1 Answering Aggregate Queries Over Large RDF Graphs Lei Zou, Peking University Ruizhe Huang, Peking University Lei Chen, Hong Kong University of Science and Technology M. Tamer Özsu, University of Waterloo

More information

Assignment 5. Georgia Koloniari

Assignment 5. Georgia Koloniari Assignment 5 Georgia Koloniari 2. "Peer-to-Peer Computing" 1. What is the definition of a p2p system given by the authors in sec 1? Compare it with at least one of the definitions surveyed in the last

More information

Results and Discussions on Transaction Splitting Technique for Mining Differential Private Frequent Itemsets

Results and Discussions on Transaction Splitting Technique for Mining Differential Private Frequent Itemsets Results and Discussions on Transaction Splitting Technique for Mining Differential Private Frequent Itemsets Sheetal K. Labade Computer Engineering Dept., JSCOE, Hadapsar Pune, India Srinivasa Narasimha

More information

An Edge-Based Algorithm for Spatial Query Processing in Real-Life Road Networks

An Edge-Based Algorithm for Spatial Query Processing in Real-Life Road Networks An Edge-Based Algorithm for Spatial Query Processing in Real-Life Road Networks Ye-In Chang, Meng-Hsuan Tsai, and Xu-Lun Wu Abstract Due to wireless communication technologies, positioning technologies,

More information

Comparative Study of Subspace Clustering Algorithms

Comparative Study of Subspace Clustering Algorithms Comparative Study of Subspace Clustering Algorithms S.Chitra Nayagam, Asst Prof., Dept of Computer Applications, Don Bosco College, Panjim, Goa. Abstract-A cluster is a collection of data objects that

More information

Query Processing & Optimization

Query Processing & Optimization Query Processing & Optimization 1 Roadmap of This Lecture Overview of query processing Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Introduction

More information

Cost Models for Query Processing Strategies in the Active Data Repository

Cost Models for Query Processing Strategies in the Active Data Repository Cost Models for Query rocessing Strategies in the Active Data Repository Chialin Chang Institute for Advanced Computer Studies and Department of Computer Science University of Maryland, College ark 272

More information

Spatiotemporal Access to Moving Objects. Hao LIU, Xu GENG 17/04/2018

Spatiotemporal Access to Moving Objects. Hao LIU, Xu GENG 17/04/2018 Spatiotemporal Access to Moving Objects Hao LIU, Xu GENG 17/04/2018 Contents Overview & applications Spatiotemporal queries Movingobjects modeling Sampled locations Linear function of time Indexing structure

More information

ABSTRACT I. INTRODUCTION

ABSTRACT I. INTRODUCTION International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 3 ISS: 2456-3307 Hadoop Periodic Jobs Using Data Blocks to Achieve

More information