Data warehouse access using multi-agent system
Distrib Parallel Databases (2009) 25

Nader Kolsi, Abdelaziz Abdellatif, Khaled Ghedira

Published online: 21 February 2009
Springer Science+Business Media, LLC 2009

Abstract The approach that we propose in this paper deals with the dynamic distribution of the data warehouse (DWH) data on a set of servers. This distribution differs from the classical one, which depends on how the data is used: it consists in distributing data when a machine reaches its storage capacity limit. The proposed approach ensures scalability and exploits the storage and processing resources available in the organization using the DWH. Our approach is based on a multi-agent model combined with the scalable distribution proposed by the Scalable Distributed Data Structures. Our multi-agent model is made up of stationary agent classes (Client, Dispatcher, Domain and Server) and a mobile agent class (Messenger). These agents collaborate to carry out automatically the storage, splitting, redirection and access operations on the distributed DWH. In this paper, we focus on the global dynamic of the data access operation and we present the corresponding experimental results.

Keywords Data warehouse, Dynamic distribution, Data access, Multi-agent system, Mobile agent, Scalable and distributed data structures

Communicated by Ladjel Bellatreche.

N. Kolsi, Higher Institute of Business Administration of Sfax, Sfax, Tunisia, nader.kolsi@fsegs.rnu.tn
A. Abdellatif, University of Sciences of Tunis, Tunis, Tunisia, abdelaziz.abdellatif@fst.rnu.tn
K. Ghedira, National School of Informatics Sciences, Manouba Campus University, Tunis, Tunisia, khaled.ghedira@isg.rnu.tn
1 Introduction

The data warehouse (DWH), as defined by its inventor W.H. Inmon [19], is a collection of data that is subject-oriented, integrated, time-stamped and non-volatile, and that is used to support decision making. It is regarded as a repository of data collected from heterogeneous, autonomous and distributed sources, and it is used for analytical tasks in business. The DWH usually contains a very large amount of data, both because of the length of the period that it must cover (historical data) and because of the diversity of the sources from which the data are extracted.

The DWH is a principal component of the information systems of organizations, and it is the subject of much research. This research covers five main areas, as shown in [29]: (1) data warehouse modeling and design, (2) data warehouse architectures, (3) data warehouse maintenance, (4) operational issues, and (5) optimization. Our research focuses mainly on operational issues and optimization, but also touches on data warehouse architectures and design.

Our work aims at solving the problems of storage space and performance by: (1) developing a dynamic system that can manage the DWH automatically (data storage, data distribution on a set of servers, and data access), (2) taking advantage of the storage and processing resources available in the organization (processors, memory, hard disks, etc.), (3) reducing the data storage time, and (4) improving the query response time.

This paper is organized as follows. Section 2 gives an overview of related work and discusses the problems related to optimization. Section 3 presents the Scalable and Distributed Data Structures. In Sect. 4, we present the multi-agent system concepts. In Sect. 5, we describe the proposed multi-agent model. Section 6 details the global dynamic of the data access operation. In Sect. 7, the experimental results are reported. Finally, Sect. 8 gives a conclusion and an outlook on future work.
2 Related works

So far, the distribution of data warehouses has not attracted much attention in research. The use of a DWH with a distributed structure has appeared only with data marts [11, 18]. Although the use of small data marts (data warehouses) was the first attempt to solve the problems of space and performance, data marts are basically stand-alone and have data integration problems in a global data warehouse context. In addition, the performance of many distributed queries is normally poor, mainly due to load balancing problems. Furthermore, each individual data mart is primarily designed and tuned to answer the queries related to its own subject area, whereas the response to global queries depends on the global system tuning and the network speed. Consequently, most optimization research in the literature proposes solutions based on a centralized DWH or on a partitioning model that stores the fact table in pieces, instead of as one large monolithic object, on a set of I/O devices
with a multiprocessor machine or on a centralized database. The latter is very expensive because of the large setup costs, and it is not very flexible due to its centralized nature [5]. In this research, several query optimization techniques are proposed. These techniques can be classified in two categories [2]. The first category is redundant structures, such as materialized views and indexes [3, 15, 22]; these techniques compete for the same resource, namely storage, and incur maintenance overhead in the presence of updates [28]. The second category is non-redundant structures, such as horizontal partitioning [9]; these techniques do not require extra space, unlike those in the first category.

All these techniques are supported by current database management systems (DBMS). The improvements made to these systems for managing large amounts of data are not sufficient to keep up with the growth of the data in the DWH. In addition, the static data fragmentation schema currently used in these systems constitutes a major handicap. It is worth noting that, in our approach, we use both kinds of technique mentioned above (non-redundant and redundant structures). The horizontal partitioning technique is used to distribute the data warehouse on a set of machines. Materialized views and indexes are used on each individual machine, which must be tuned and optimized for performance.

Most research in the literature on data warehouse distribution proposes solutions based on the studies made on production databases under the name of very large databases. These solutions are based on the classical data distribution, which depends on data use and has a static distribution plan. Furthermore, this type of distribution is defined at the design phase. In [9], the authors propose a solution to make this distribution plan dynamic.
They present an algorithm to find the optimal vertical schema fragmentation based on particle swarm optimization. Other research [30, 31] uses abstract state machines [7] as a flexible and quality-oriented formal method to design and optimize a distributed DWH and OLAP (On-Line Analytical Processing) applications.

We have to point out that, in our approach, the data distribution that we consider is different from the commonly used ones [21]. It is not defined at the design phase; instead, it is imposed by the storage capacity. When a machine reaches its storage capacity limit, we add another one. Then, we distribute the data on the two machines to obtain a balanced load.

There are several ways to divide a relation horizontally. Typically, we can assign tuples to the processors in a round-robin fashion (round-robin partitioning), we can use hashing (hash partitioning), or we can assign tuples to the processors by ranges of values (range partitioning) [5]. In [5, 6, 8, 14], the authors use the Data Warehouse Striping (DWS) technique. The latter is a round-robin data partitioning approach especially designed for distributed data warehouse environments. With DWS, the fact table is distributed over an arbitrary number of machines, which is fixed at the beginning. Consequently, the queries are executed in parallel by all of the machines [8]. The round-robin distribution is simple to use and guarantees load balancing, although its major disadvantage is that we must have machines with
the same processing and storage capacities. Otherwise, some machines will be too busy and the others will be underused. In our approach, we use the range partitioning applied by the scalable and distributed data structures (see Sect. 3). So, the queries are executed in parallel not by all the machines but only by those that contain the necessary partitions. Furthermore, the data distribution is dynamic and automatic: each time a machine reaches its capacity limit, it starts the data distribution operation without needing an external intervention (administrator). Moreover, the number of machines used in our approach is not fixed. Therefore, the storage capacity of the DWH is theoretically unbounded, because we can add other machines dynamically at any moment. In the following section, we present the principle of the scalable and distributed data structures.

3 Scalable and Distributed Data Structures

The Scalable and Distributed Data Structures (SDDS) deal with the storage of a large amount of data on a set of interconnected machines. The SDDS principle consists in distributing the file contents in a way that allows us to benefit from the available memory on a set of interconnected machines [4, 10]. This distribution is based on the identifiers (keys): the keys residing on one machine must lie between a lower bound and an upper bound (see Sect. 5.1). As the content of the file grows, the file is split. This principle has been extended from files to operational databases [24, 26, 27]. The unbounded storage capacity and the dynamic data distribution are guaranteed by the principle of the SDDSs [23].

In the rest of this paper, we consider that the two terms splitting and distributing have the same meaning. In the following section, we present the multi-agent system concepts.
4 Multi-agent system

The agent paradigm is currently in vogue within many research domains. An agent can be a physical or virtual entity that acts autonomously (without the direct intervention of humans or others), on behalf of entities (person, organisation, etc.), in response to input from its environment. Agents have a social ability: they may communicate with the users, system resources and other agents as required in order to achieve their goals. Moreover, more advanced agents may cooperate with other agents to carry out tasks beyond the capability of a single agent. Agents therefore contain some level of intelligence, ranging from pre-defined rules up to self-learning artificial intelligence inference machines. This intelligence enables agents to act not only reactively, but sometimes also proactively.

An agent can be static or mobile. The latter is a particular class of agent with the ability, during execution, to migrate dynamically (code, data and execution state) from one machine to another, where it can resume its execution, in order to reach data or
remote resources. It has been suggested that mobile agent technology, amongst other things, can help to reduce network traffic and to overcome network latencies [17]. Moreover, mobile agents have shown high performance both when accessing data distributed on a set of interconnected machines [1] and when storing these data [20].

A multi-agent system (MAS) is a system composed of multiple autonomous agents, and it comprises the following elements [13]:
1. An environment E, that is, a space which generally has volume.
2. A set of situated objects O, that is to say, it is possible at a given moment to associate any object with a position in E.
3. An assembly of agents A, which are specific objects (a subset of O) representing the active entities in the system.
4. An assembly of relations R, which link objects (and therefore agents) to one another.
5. An assembly of operations Op, which allow the agents of A to perceive, produce, transform, and manipulate objects in O.
6. Operators with the task of representing the application of these operations and the reaction of the world to this attempt at modification, which we shall call the laws of the universe.

The following section presents the data distribution principle and the proposed multi-agent model.

5 Proposed model

The aim of our proposed model is to solve the problems in the DWH context using the resources available in the organization. These problems are related to data storage, splitting and access. According to the proposed approach, the DWH is distributed on a set of machines. In this case, data management needs the collaboration and the interaction of those machines in order to reply to the user's queries while ensuring the parallel processing of these queries. Thus, we have chosen to use the Multi-Agent System (MAS) with the mobile agents as essential actors.
The MAS makes it possible to follow the progress of the dynamic data distribution, facilitates the collaboration, the interaction and the independence of the different machines, and improves the parallel execution of the user queries. The use of mobile agents in the proposed solution is helpful because it allows: (1) decreasing the network load, (2) freeing client machines during the preparation of the results, which generally needs a long execution time, and (3) essentially, securing the data that are transported on the network (see Sect. 7).

We use the SDDS principle, based on data distribution through intervals (range partitioning), in order to distribute the data of the DWH on a set of machines. This type of distribution allows the decomposition of the DWH into a set of domains. Each domain can be stored on one or more machines according to its data size.
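As an illustration only, the interval-based distribution described above can be sketched in Python. The `Domain` type, the function names and the sample intervals are assumptions of this sketch and are not taken from the implemented prototypes; the sketch also assumes one integer key with non-overlapping intervals.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """Hypothetical domain: one contiguous key interval hosted on one machine."""
    name: str
    low: int   # inclusive lower bound of the key interval
    high: int  # inclusive upper bound of the key interval


def find_domain(domains, key):
    """Return the domain whose interval contains `key`, or None.

    `domains` must be sorted by `low` and must not overlap.
    """
    lows = [d.low for d in domains]
    i = bisect_right(lows, key) - 1
    if i >= 0 and domains[i].low <= key <= domains[i].high:
        return domains[i]
    return None


def domains_for_range(domains, low, high):
    """Return every domain whose interval intersects [low, high]."""
    return [d for d in domains if d.low <= high and low <= d.high]


domains = [Domain("M1", 0, 50), Domain("M2", 51, 99)]
print(find_domain(domains, 42).name)                         # M1
print([d.name for d in domains_for_range(domains, 40, 60)])  # ['M1', 'M2']
```

A point lookup touches a single domain, while a range predicate touches only the domains whose intervals intersect it, which is the property that lets a query run on a subset of the machines.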
5.1 Principle of data distribution

The DWH is horizontally distributed on a set of machines that have the same DBMS and the same star schema (see Fig. 1). Furthermore, on each machine, we can use materialized views and indexes to tune and optimize performance. The principle is to start with a single machine for which we define: (1) the storage capacity limit up to which the DBMS in use gives its highest performance (for data access and storage), and (2) both the lower bound and the upper bound for each fact table key. When this machine reaches its limit, we add another one and we distribute the data on the two machines to obtain a balanced load. In most cases, it is the fact table that undergoes the splitting operation, because of its large volume. The dimension tables are distributed when their key constitutes a distribution criterion. Otherwise, they are duplicated.

In Table 1, we present a data splitting scenario. Machine 1 starts the first splitting operation when it reaches its storage capacity limit. First, we search for the key value that gives two balanced partitions (e.g. on Product_Id, which is a two-digit integer). Then, we move the data belonging to the new interval to machine 2. Finally, we update the intervals. The second splitting operation is launched by machine 2 (e.g. on Date_Id, which is a date). The same process restarts whenever a machine reaches its capacity limit. The data distribution can continue according to the same criteria or to other ones (Customer_Id, Region_Id).

We notice that each SALES table record belongs to only one DWH partition. If we consider that each of these DWH partitions is stored in a separate database, we must, on the one hand, split the Date table and the Product table according to the same criteria used for the SALES table.
On the other hand, we duplicate the other tables in order to (1) facilitate the checking of the integrity constraints, (2) ensure the autonomy of the databases, and (3) improve the join time when we access the data.

Fig. 1 Distributed data warehouse

Table 1 Splitting scenario
Start First splitting Second splitting...
Machine 1 M1 M2 M1 M2 M3...
Customer Id [A, Z] [A, Z] [A, Z]... [A, Z] [A, Z]...
Production Id [0, 99] [0, 50] [51, 99] [51, 99] [51, 99]
Region Id [AA, ZZ] [AA, ZZ] [AA, ZZ] [AA, ZZ] [AA, ZZ]
Date Id [Jan, Dec] [Jan, Dec] [Jan, Dec] [Jan, Jun] [Jul, Dec]
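The search for a key value that gives two balanced partitions can be sketched as follows. This is a minimal sketch that assumes the rows of the full machine fit in memory and that the split point is the lower median of the key; the paper does not state how the prototypes choose the value, and `split_partition` is a hypothetical name.

```python
def split_partition(rows, key):
    """Split `rows` into two balanced halves on the column `key`.

    Returns (split_value, kept_rows, moved_rows): rows with key <= split_value
    stay on the full machine, rows with a larger key move to the new machine.
    Returns None when all rows share one key value, because no interval
    boundary on this key can separate them.
    """
    values = sorted(r[key] for r in rows)
    if not values or values[0] == values[-1]:
        return None
    split_value = values[(len(values) - 1) // 2]  # lower median of the key
    if split_value == values[-1]:
        # The median equals the maximum: fall back to the largest smaller value
        # so that the moved partition is not empty.
        split_value = max(v for v in values if v < values[-1])
    kept = [r for r in rows if r[key] <= split_value]
    moved = [r for r in rows if r[key] > split_value]
    return split_value, kept, moved


sales = [{"product_id": i, "sale_qt": 1} for i in range(100)]
split_value, kept, moved = split_partition(sales, "product_id")
print(split_value, len(kept), len(moved))  # 49 50 50
```

After such a split, the interval of the original machine would end at `split_value` and the interval of the new machine would start just above it, which corresponds to the "update the intervals" step of the scenario.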
The following part deals with the architecture of the proposed multi-agent model and the waiting database notion that we use in our approach.

5.2 The proposed multi-agent model

The proposed model consists of five static agent classes (Client, Dispatcher, Splitting, Domain and Server) and a mobile agent class (Messenger). Each agent class is defined by its knowledge (static or dynamic), its acquaintances (agents that it knows and with which it can communicate), and its behavior [12]. Figure 2 illustrates the interaction between the different agents.

The Client agents act as an interface between the user and the DWH management system (Dispatcher agent). The user uses the Client agent to send the data storage and the data access operations (queries) to the Dispatcher agent. Each Client agent has the Dispatcher agent as an acquaintance. Its static knowledge is made up of its name and its address. This agent class does not have dynamic knowledge.

The Dispatcher agent arranges the received operations according to their arrival order. These operations are handled by the Messenger agents. When the Dispatcher agent receives the operation results from the Messenger agents, it sends them to the Client agent, if the latter is connected. Otherwise, it saves them until the Client agent reconnects. The acquaintances of the Dispatcher agent are: (i) the Client agents which send queries, (ii) the Messenger agents which take charge of executing these operations, and (iii) the Splitting agent. Its static knowledge consists of its name and its address. Its dynamic knowledge is made up of a list containing all the Domain agents existing in the system and two waiting queues. The first queue is used to store operations received from the Client agents. The second one is used to store the results provided by the Messenger agents.
Then, the Dispatcher agent sends these results to the sending Client agent (as described above).

Fig. 2 The proposed multi-agent model architecture

The Messenger agents take charge of executing each operation found in the operations waiting queue of the Dispatcher agent. Each Messenger agent makes the
execution plan of this operation. Then, it visits all the Domain agents concerned with this operation. Finally, it gives the final results to the Dispatcher agent. Each Messenger agent has as acquaintances the Dispatcher agent and the Domain agents necessary to execute the operation. Its static knowledge is made up of its name and the maximum size of data that it can transport. This maximum depends on the network characteristics. The dynamic knowledge of the Messenger agent consists of: (i) the list of Domain agents to visit for executing the operation, (ii) the operation to execute, (iii) the list of data to store (if the operation is data storage), or the list of data collected from the visited Domain agents (if the operation is data access), and (iv) the size of the transported data. The Messenger agent has a very important role in our architecture because it allows: (1) reducing the message traffic on the network, (2) accelerating the data storage and access operations, and, essentially, (3) securing the data circulation on the network (see Sect. 7).

The Domain agents are responsible for sending the operations to the Server agents which they control. Then, they collect the replies sent by the Server agents and transmit the final result to the Messenger agent. The Domain agent has as acquaintances: (i) the Server agents that are under its control, (ii) the Messenger agents with which it has operations to execute, and (iii) the Splitting agent. Its static knowledge is composed of its name, its address, the disk space limit of each Server agent, the maximum number of Server agents it can manage, and the maximum size of data it can receive from the Messenger agents. This maximum depends on the machine characteristics (memory, processor, etc.). Its dynamic knowledge consists of the descendant list, the size of memorized data, and two waiting queues. The first queue is used to store the operations brought by the Messenger agents.
The second one is used to store the replies sent by the Server agents. Later on, the Domain agent sends them to the appropriate Messenger agent.

The Server agents execute the received operations and send the replies to the Domain agent. Each Server agent has as acquaintance the Domain agent to which it belongs. Its static knowledge is made up of its name and its address. Its dynamic knowledge is a waiting queue used to store the operations received from the Domain agent.

The Splitting agent is responsible for the splitting operations and for maintaining the data road map that allows finding the data location. The splitting operation is started when a machine reaches its storage capacity limit. The role of this agent consists of the following steps. First, it creates a new Domain agent when it receives a splitting request. Then, it informs the Domain agent that asked for the splitting of the location and the characteristics of the new one. Finally, it sends to the Dispatcher agent the new information concerning the two Domain agents in order to update the list of Domain agents. The Splitting agent has as acquaintances the Dispatcher agent and the Domain agents that ask for splitting. Its static knowledge consists of its name and its address. Its dynamic knowledge is the list of splitting requests sent by the Domain agents.

The Dispatcher agent manages a metabase which allows it to follow the evolution of the data distribution on the Domain agents, the network status and the load rate of the Messenger agents (see Fig. 3). This metabase is also used by the Messenger agents to make the execution plans of the received operations and determine the Domain agents
to visit. The Splitting agent also uses this metabase for the splitting operations and updates it at the end of each splitting operation. Furthermore, each Domain agent has its own metabase in order to follow the evolution of the data distribution on its descendants (Server agents) (see Fig. 3, the framed tables). In the following section, we detail the dynamic of the proposed model for the data access operation.

Fig. 3 Agent MetaBases tables

6 Multi-agent dynamic for the data access operation

The proposed model is designed to support the different management operations of the data warehouse, namely data storage, splitting, redirection and access. In this paper, we present only the data access operation and we do not consider the case where the system is interrupted. The sequence diagrams presented later describe both the interactions and the agent behaviors involved in the data access operation. The formalism used to represent these diagrams is MA-UML (Mobile Agent UML) [16], which is an extension of AUML (Agent UML) that allows modeling the mobile agent behaviors.

In this operation, the agents used are: the Client agents, the Dispatcher agent, the Messenger agents, the Domain agents, and the Server agents. These agents exchange different messages in order to accomplish the data access operation. This exchange is shown in the diagram presented in Fig. 4.

The data access operation starts when the users submit their queries to the Client agents. The latter send them to the Dispatcher agent. The Client agent is satisfied when it receives a result for each query sent. Otherwise, it sends the query again to the Dispatcher agent or, if the query contains syntax errors, it asks the user to correct them.

When receiving the queries, the Dispatcher agent assigns each query to a Messenger agent. If no Messenger agent is available, the Dispatcher agent creates one for
each query. The Dispatcher agent is satisfied when it receives a result for each query. This result is sent to the appropriate Client agent. If the latter is not connected, the Dispatcher agent places the received result in its results queue. The Dispatcher agent is unsatisfied if the Messenger agent informs it that there are syntax errors. In this case, the Dispatcher agent, in its turn, informs the Client agent which sent the query.

Fig. 4 Data access operation

The Messenger agent is in charge of the query execution. When receiving the query, it determines the list of Domain agents containing the data that answer the query. The Messenger agent uses the information available in the metabase and the WHERE clause of the query to determine these agents and their addresses. If this clause does not exist, the list contains all the Domain agents in the system. The Messenger agent clones itself once for each Domain agent to visit. Each cloned Messenger agent moves to one of the selected Domain agents. When it receives the reply from the visited agent, it returns to the original Messenger agent, sends it the partial result of the query and kills itself. The cloned Messenger agent is satisfied when it receives the reply from the visited Domain agent.

If the query has a GROUP BY clause and/or an ORDER BY clause, the original Messenger agent creates a temporary table, corresponding to the query, to save the received partial results. When it has received all the partial results, the original Messenger agent executes the query on the temporary table to get the final result, which is sent to the Dispatcher agent, and it drops the table. If the query does not have these two clauses, the original Messenger agent gathers the partial results and then sends the final result to the Dispatcher agent. The original Messenger agent is satisfied when all the cloned Messenger agents return with the partial results.
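The Messenger behavior described above (select the Domain agents from the WHERE clause, send one clone per Domain agent, merge the partial results) can be sketched as a scatter-gather routine. This is only an illustration under stated assumptions: threads stand in for mobile clones, an in-memory dictionary stands in for the metabase and the DWH partitions, and all names and sample rows are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical metabase: domain name -> (product_id interval, local SALES rows).
METABASE = {
    "D1": ((0, 50), [{"product_id": 10, "region_id": "AA", "sale_qt": 3},
                     {"product_id": 40, "region_id": "AB", "sale_qt": 5}]),
    "D2": ((51, 99), [{"product_id": 60, "region_id": "AA", "sale_qt": 7},
                      {"product_id": 90, "region_id": "AB", "sale_qt": 1}]),
}


def select_domains(where_range):
    """Pick the domains whose interval intersects the WHERE range (all if None)."""
    if where_range is None:
        return list(METABASE)
    low, high = where_range
    return [name for name, ((d_low, d_high), _) in METABASE.items()
            if d_low <= high and low <= d_high]


def clone_visit(name, where_range):
    """Work of one clone: a local SUM(sale_qt) grouped by region_id."""
    _, rows = METABASE[name]
    partial = defaultdict(int)
    for row in rows:
        pid = row["product_id"]
        if where_range is None or where_range[0] <= pid <= where_range[1]:
            partial[row["region_id"]] += row["sale_qt"]
    return dict(partial)


def run_query(where_range=None):
    """Scatter the query to the selected domains, then merge and sort partials."""
    targets = select_domains(where_range)
    if not targets:
        return []
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        partials = list(pool.map(lambda n: clone_visit(n, where_range), targets))
    merged = defaultdict(int)  # plays the role of the temporary table
    for partial in partials:
        for region, qty in partial.items():
            merged[region] += qty
    return sorted(merged.items())  # GROUP BY region_id ORDER BY region_id


print(run_query())         # [('AA', 10), ('AB', 6)]
print(run_query((0, 45)))  # [('AA', 3), ('AB', 5)]
```

With the range `(0, 45)` only `D1` is visited, which mirrors the idea that a query runs only on the machines holding the relevant partitions.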
When receiving the query from the cloned Messenger agent, the Domain agent verifies whether the data requested by the received query belongs to the Server agents
which are under its responsibility. If this condition is true, the Domain agent sends the query to the appropriate Server agents. Otherwise, the Domain agent forwards this query to the right Domain agent. The latter case occurs when a splitting operation happens before the query arrives. The Domain agent is satisfied when it receives the results from all the Server agents. These results are sent to the cloned Messenger agent.

The Server agent executes the query and sends the obtained result to the responsible Domain agent. It is satisfied when it has replied to all the received queries. If there are syntax errors, the Server agent is unsatisfied and it informs the Domain agent. In the following section, we present the results obtained for the data access operation.

7 Experimental evaluation

In order to validate our model for the data access operation, we have implemented three prototypes and we have measured the query execution time. One of them accesses data on a centralized database (DB). The others access data on a set of machines. As described below, we have run the experiments using one machine that sends the query and N (three, then five) machines that contain the DWH partitions. These machines have the same configuration: P4 and 256 MB (RAM). We have used JDeveloper10g as a development toolkit, Oracle as a DBMS, and IBM Aglets as a multi-agent platform. We have programmed an engine that inserts the data in the DWH partitions.

In the first prototype, we have used two machines (Client/Server) and we have programmed an engine which accesses the data stored on the server machine (centralized DWH) from the client machine. In the second prototype, we have programmed an access engine, without MAS, that accesses the data distributed on a set of machines (three, then five machines) using database links, etc. Each machine contains 1/N of the used data size. The given results (see Figs.
5 and 6) illustrate the aggregate functions (count, sum, avg, max, and min) with this type of query:

Select aggregate_function (s)
From ((Select aggregate_function (sale_qt) s From Sales@dwh1)
Union all (...)...
Union all (Select aggregate_function (sale_qt) s From Sales@dwhN));

Fig. 5 Experimental results with data size = 600 MB without Group by/order by
Fig. 6 Experimental results with data size = 2.1 GB without Group by/order by
Fig. 7 Experimental results with data size = 600 MB with Group by/order by

In the last prototype, we have programmed the MAS dynamic (see Sect. 6). In this prototype, the machines are used as follows: (1) on one of these machines we have placed the Dispatcher agent, the metabase (MB), the Client agent and the Messenger agents, and (2) on each of the other N machines, we have placed a Domain agent, a partition of the DWH database (DWHi) containing 1/N of the used data size, a MB and a Server agent. The query type used to get the given results (see Figs. 5 and 6) is:

Select aggregate_function (sale_qt) From Sales;

We have tested these prototypes using different data sizes: records equivalent to 600 MB and records equivalent to 2.1 GB. We have also tested our model (see Figs. 7 and 8) using this type of query:

Select... Group by region_id Order by region_id
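When partial aggregates computed on each partition are combined into a global result, sum, min and max can be merged by applying the same function again, but a global count is the sum of the local counts, and a global average is only equal to the average of the local averages when all partitions hold the same number of rows. A general merge can carry a (count, sum) pair per partition, as in the following sketch. The `Partial` type and the sample values are assumptions of this sketch; it is not the merge code of the prototypes.

```python
from dataclasses import dataclass


@dataclass
class Partial:
    """Aggregates of sale_qt computed locally on one partition (hypothetical)."""
    count: int
    total: float
    minimum: float
    maximum: float


def local_aggregate(values):
    """Compute the per-partition aggregates for a non-empty list of values."""
    return Partial(len(values), sum(values), min(values), max(values))


def merge(partials):
    """Combine per-partition aggregates into global count, sum, avg, min, max."""
    count = sum(p.count for p in partials)
    total = sum(p.total for p in partials)
    return {
        "count": count,
        "sum": total,
        "avg": total / count,  # global average from the global sum and count
        "min": min(p.minimum for p in partials),
        "max": max(p.maximum for p in partials),
    }


partitions = [[2, 4], [6], [8, 10, 12]]  # unequal partition sizes on purpose
result = merge([local_aggregate(p) for p in partitions])
print(result["avg"])  # 7.0
```

In this example the average of the three local averages (3, 6 and 10) would be about 6.33, whereas the true global average is 7.0; the two coincide when each partition contains exactly 1/N of the rows.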
Fig. 8 Experimental results with data size = 2.1 GB with Group by/order by

Table 2 Average gains percentage compared to the centralized DWH
Columns: Query without group by/order by (= 600 MB, = 2.1 GB); Query with group by/order by (= 600 MB, = 2.1 GB)
Rows: MAS and Distributed DWH without MAS, with 3 and with 5 machines

In Table 2, we present the average gains percentage obtained when we compare the distributed prototypes to the centralized prototype. When we compare the time needed to execute queries by the prototype using a distributed DWH without MAS to the time needed by the prototype using a centralized DWH, we remark that, in most cases, the average gains are positive. This is explained by the facts that: (1) the query accesses only a small part of the fact table, and (2) we execute the query on the fact table parts in parallel. These averages turn negative when we have a small data size distributed on a set of machines. In these cases, the data load time becomes a significant part of the query execution time.

We note that the average gains given by our model are the best. These gains result from reducing: (1) the network load (the Messenger agent encapsulates the partial result) and (2) the communications between machines (each machine executes the query locally). In Table 3, we give the time needed to execute each query step on each machine used, to demonstrate these gains. We take as examples the Avg query and the ALL functions query using 3 machines and data size = 2.1 GB.
Table 3 Execution time by query step and by machine
Columns: Avg query (M1, M2, M3); ALL functions query (M1, M2, M3)
(a) Without group by and order by, time in ms
Query execution (ServA)
Tuple transmission (MessA)
MAS coordination Execution time (DomA)
Total execution time
(b) With group by and order by, time in ms
Query execution (ServA)
Data grouping and sorting (ServA)
Tuple transmission (MessA)
Insertion partial results in temporary table (MessA)
MAS coordination Execution time (DomA)
Create the temporary table (MessA)
Made the final result (MessA)
Total execution time
ServA = Server Agent, MessA = Messenger Agent, DomA = Domain Agent

We note that the time needed to execute the query on each machine is equal to the time needed to execute the query on a centralized DWH (AVG query (a) = ms, AVG query (b) = ms, ALL query (a) = ms, ALL query (b) = ms) divided by three. In addition, the time required to transmit tuples, to coordinate the MAS and to make the final result increases slightly when the number of returned tuples increases. This time is approximately 6500 ms.

For the query without group by and order by clauses, when we distribute the data on 5 machines, the query execution time is reduced by approximately 1000 ms on average. But, for the query with group by and order by clauses, the time
15 Distrib Parallel Databases (2009) 25: of the query execution is reduced approximately by an average equal to 4500 ms. And, for these two query types, the time needed coordinate MAS and to make the final result increase by approximately 2500 ms. This is why, the time obtained for the same queries, when using 5 machines, is as follow: AVG query (a) = ms, AVG query (b) = ms, ALL query (a) = ms, ALL query (b) = ms. Our model not only gives the best access time but it also secures the data circulation on the network. In fact, we have made a function that the cloned Messenger agent executes, at each time, when it reaches one machine. This function allows to the cloned Messenger agent to check whether the address of the reached machine belongs to its address list. If the address is not found, the cloned Messenger agent tries to leave this machine. If it cannot leave this machine, it destroys the transported data and kills itself. 8 Conclusion In this article, we have presented some researches that deal with the data distribution in the DWH context and the multi-agent system. Then, we have described our proposed multi-agent model and its global dynamic concerning the data access operation. Finally, we have demonstrated the improvements obtained when we have used the MAS and the Messenger agents in the data access operation. We can conclude that when the number of used machines increases the average gains given by our model increase. But, we have to note that the increase in the number of used machines is relative to data size and the query complexity. Otherwise, if we have a few data distributed on a big number of machines, the circulation time makes by the cloned Messenger agents becomes sizeable and the centralized DWH access will be more efficient. These results will be considered to perform the data splitting operation. For each query, we estimate the execution time if we distribute the data on two machines. 
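The estimate mentioned above could be computed along the following lines. This is a rough sketch and not the authors' procedure: it assumes, following the observations on Table 3, that the local query time is the centralized time divided by the number of machines, and that tuple transmission, MAS coordination and construction of the final result add a fixed overhead. The function names and the input values are hypothetical.

```python
def estimate_distributed_ms(centralized_ms: float, machines: int,
                            overhead_ms: float) -> float:
    """Estimated query time on `machines` servers (assumed model).

    Local execution is taken as centralized_ms / machines; overhead_ms
    stands for tuple transmission, MAS coordination and building the
    final result.
    """
    if machines < 1:
        raise ValueError("machines must be >= 1")
    return centralized_ms / machines + overhead_ms


def should_split(centralized_ms: float, overhead_ms: float) -> bool:
    """True when the two-machine estimate is below the centralized time."""
    return estimate_distributed_ms(centralized_ms, 2, overhead_ms) < centralized_ms


# Hypothetical inputs in ms. The 6500 ms overhead is the figure reported
# for 3 machines and is reused here only as a placeholder for 2 machines.
print(should_split(30000.0, 6500.0))  # → True
print(should_split(8000.0, 6500.0))   # → False
```

Under this model the overhead dominates for short queries, which is consistent with the remark that centralized access stays more efficient when little data is spread over many machines.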
If this time is less than the time obtained when the data are centralized, we split the data. As near-future work, we will test our model with benchmarks (TPC-H and APB-1) and compare the results with those reported in the literature. We will also implement the query redirection process. Another future direction is to study how to make our system robust enough to deal with the momentary unavailability of one or more machines.

References

1. Arcangeli, J., Hameurlain, A., Migeon, F., Morvan, F.: Apport des agents mobiles à l'évaluation et l'optimisation de requêtes bases de données réparties à grande échelle. Technical Report, laboratory IRIT, Université Paul Sabatier (2002)
2. Bellatreche, L., Boukhalfa, K.: An evolutionary approach to schema partitioning selection in a data warehouse. In: DAWAK 2005
3. Bellatreche, L., Schneider, M., Lorinquer, H., Mohania, M.: Bringing together partitioning, materialized views and indexes to optimize performance of relational data warehouses. In: Proceeding of the International Conference on Data Warehousing and Knowledge Discovery (DAWAK 2004), September 2004
4. Bennour, F.: Les structures de données distribuées et scalables sous windows: tendance hachage linéaire. Doctoral Thesis, U. Paris 9
5. Bernardino, J., Madeira, H.: Data warehousing and OLAP: improving query performance using distributed computing. In: 12th Conference on Advanced Information Systems Engineering, Stockholm, Sweden
6. Bernardino, J., Furtado, P.S., Madeira, H.C.: Approximate query answering using data warehouse striping. J. Intell. Inf. Syst. 19(2) (2002)
7. Börger, E., Stärk, R.: Abstract State Machines. Springer, Berlin, Heidelberg, New York (2003)
8. Almeida, R., Vieira, J., Vieira, M., Madeira, H., Bernardino, J.: Efficient data distribution for DWS. In: Proc. of the 10th International Conference on Data Warehousing and Knowledge Discovery (DaWaK 08), Turin, Italy, September. Lecture Notes in Computer Science. Springer, Berlin (2008)
9. Derrar, H., Boussaïd, O., Ahmed-Nacer, M.: Une approche de répartition des données d'un entrepôt basée sur l'optimisation par essaim particulaire. In: 4èmes journées francophones sur les Entrepôts de Données et l'Analyse en ligne (EDA 2008), Toulouse, Juin 2008; RNTI, vol. B-4, Cépaduès, Toulouse
10. Diene, Litwin, W.: Performance measurements of RP*: a scalable distributed data structure for range partitioning. In: Int. Conf. on Information Society in the 21st Century: Emerging Techn. and New Challenges, Aizu City, Japan
11. Informatica white paper. Enterprise-scalable data marts: a new strategy for building and deploying fast, scalable data warehousing systems (1997)
12. Ferber, J.: Les Systemes Multi-Agents vers une Intelligence Collective. InterEditions, Paris (1995)
13. Ferber, J.: Multi-Agent System: An Introduction to Distributed Artificial Intelligence. Addison-Wesley, Longman, Harlow (1999)
14. Furtado, P.: Experimental evidence on partitioning in parallel data warehouses. In: DOLAP 04 Workshop of the Int'l Conference on Information and Knowledge Management (CIKM), Washington, November
15. Gupta, H.: Selection and maintenance of views in a data warehouse. Ph.D. thesis, Stanford University, September (1999)
16. Hachicha, H., Loukil, A., Ghédira, K.: MA-UML: une extension de A-UML aux agents mobiles. In: JFIADSMA 2002, Lille, France
17. Harrison, C.G., Chess, D.M., Kershenbaum, A.: Mobile agents: are they a good idea? Technical report, IBM Research Division (1995)
18. Hewlett-Packard white paper. HP Intelligent Warehouse (1997)
19. Inmon, W.: Building the data warehouse. QED Technical Publishing Group (1992)
20. Kolsi, N., Abdellatif, A., Ghedira, K.: Agent based dynamic data storage and distribution in data warehouses. In: KES-AMSTA
21. Kolsi, N., Ghedira, K., Abdellatif, A.: Utilisation d'un système multi-agents pour la répartition et la scalabilité des données d'un data warehouse. In: Acts of the Fourth Scientific Days, Tome 1, Borj El Amri Aviation School, Tunis, Tunisia, May
22. Kotidis, Y., Roussopoulos, N.: Dynamat: a dynamic view management system for data warehouses. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, June
23. Litwin, W., Neimat, M.A., Schneider, D.: RP*: a family of order-preserving scalable distributed data structures. In: 20th Intl. Conf. on Very Large Data Bases (VLDB)
24. Litwin, W., Risch, T., Schwarz, Th.: An architecture for a scalable distributed DBS: application to SQL server 2000, Extended abstract. In: 2nd Intl. Workshop on Cooperative Internet Computing (CIC 2002), Hong Kong, August
25. Narasayya, S.V.R., Yang, B.: Integrating vertical and horizontal partitioning into automated physical database design. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, June
26. Ndiaye, Y., Diene, A., Litwin, W., Risch, W.: AMOS-SDDS: a scalable distributed data manager for windows multicomputers. In: ISCA 14th Intl. Conf. on Par. and Distr. Computing Systems, Texas, USA, August 8-10
27. Sahri, S., Litwin, W., Schwartz, T.: An overview of a scalable distributed database system SD-SQL server. In: Bell, D., Hong, J. (eds.) Flexible and Efficient Information Handling: 23rd British National Conference on Databases, BNCOD 2006, Belfast, Northern Ireland, UK, July 2006, Proceedings. Lecture Notes in Computer Science, vol. 4942. Springer, Berlin, Heidelberg, New York (2006)
28. Surajit, S.C., Narasayya, V.R.: Automated selection of materialized views and indexes in Microsoft SQL server. In: Proceedings of the International Conference on Very Large Databases, September
29. Wu, M., Buchmann, A.: Research issues in data warehousing. In: BTW 97, March
30. Zhao, J., Ma, H.: Quality-assured design of on-line analytical processing systems using abstract state machines. In: Ehrich, H.-D., Schewe, K.-D. (eds.) Proceedings of the Fourth International Conference on Quality Software (QSIC 2004), Braunschweig, Germany. IEEE Computer Society Press, Los Alamitos (2004)
31. Zhao, J., Schewe, K.-D.: Using abstract state machines for distributed data warehouse design. In: Hartmann, S., Roddick, J. (eds.) Conceptual Modelling 2004, First Asia-Pacific Conference on Conceptual Modelling, Dunedin, New Zealand. CRPIT, vol. 31. Australian Computer Society, Sydney (2004)
Correctness Criteria Beyond Serializability Mourad Ouzzani Cyber Center, Purdue University http://www.cs.purdue.edu/homes/mourad/ Brahim Medjahed Department of Computer & Information Science, The University
More informationSomething to think about. Problems. Purpose. Vocabulary. Query Evaluation Techniques for large DB. Part 1. Fact:
Query Evaluation Techniques for large DB Part 1 Fact: While data base management systems are standard tools in business data processing they are slowly being introduced to all the other emerging data base
More informationMobile Element Scheduling for Efficient Data Collection in Wireless Sensor Networks: A Survey
Journal of Computer Science 7 (1): 114-119, 2011 ISSN 1549-3636 2011 Science Publications Mobile Element Scheduling for Efficient Data Collection in Wireless Sensor Networks: A Survey K. Indra Gandhi and
More informationRAMSES: a Reflective Middleware for Software Evolution
RAMSES: a Reflective Middleware for Software Evolution Walter Cazzola 1, Ahmed Ghoneim 2, and Gunter Saake 2 1 Department of Informatics and Communication, Università degli Studi di Milano, Italy cazzola@dico.unimi.it
More informationA Survey on Moving Towards Frequent Pattern Growth for Infrequent Weighted Itemset Mining
A Survey on Moving Towards Frequent Pattern Growth for Infrequent Weighted Itemset Mining Miss. Rituja M. Zagade Computer Engineering Department,JSPM,NTC RSSOER,Savitribai Phule Pune University Pune,India
More informationCREATING CUSTOMIZED DATABASE VIEWS WITH USER-DEFINED NON- CONSISTENCY REQUIREMENTS
CREATING CUSTOMIZED DATABASE VIEWS WITH USER-DEFINED NON- CONSISTENCY REQUIREMENTS David Chao, San Francisco State University, dchao@sfsu.edu Robert C. Nickerson, San Francisco State University, RNick@sfsu.edu
More informationHybrid Approach for the Maintenance of Materialized Webviews
Association for Information Systems AIS Electronic Library (AISeL) AMCIS 2010 Proceedings Americas Conference on Information Systems (AMCIS) 8-2010 Hybrid Approach for the Maintenance of Materialized Webviews
More informationUpdates through Views
1 of 6 15 giu 2010 00:16 Encyclopedia of Database Systems Springer Science+Business Media, LLC 2009 10.1007/978-0-387-39940-9_847 LING LIU and M. TAMER ÖZSU Updates through Views Yannis Velegrakis 1 (1)
More informationSpeed-up of Parallel Processing of Divisible Loads on k-dimensional Meshes and Tori
The Computer Journal, 46(6, c British Computer Society 2003; all rights reserved Speed-up of Parallel Processing of Divisible Loads on k-dimensional Meshes Tori KEQIN LI Department of Computer Science,
More informationA Mobile Agent-based Model for Service Management in Virtual Active Networks
A Mobile Agent-based Model for Service Management in Virtual Active Networks Fábio Luciano Verdi and Edmundo R. M. Madeira Institute of Computing, University of Campinas (UNICAMP), Campinas-SP, Brazil
More information