Agent Based Cloud Storage System

Prof. Dr. ABDEL-FATTAH HEGAZY *, Prof. Dr. AMR BADR **, MOHAMMED KASSAB *
* Department of Computer Science, Arab Academy for Science, Technology and Maritime Transport, Egypt
kassab.mohd@ymail.com
** Department of Computer Science, Cairo University, Egypt

Abstract: - Cloud computing technology is envisioned as the next-generation architecture of the IT enterprise. It is defined as a set of scalable data servers or chunk servers that provide computing and storage services to clients. Cloud storage is a relatively basic and widely applied service that provides users with stable, massive data storage space. Our research shows that the architecture of current cloud computing systems is moving from a central to a distributed one; the reason for this move is to avoid the bottleneck introduced when all data chunks must be indexed by a master index server. In this paper, we propose a new cloud storage architecture based on P2P using agents. The system is based on a new architecture with better scalability, fault tolerance and enhanced performance.

Keywords: - Cloud Computing, Architecture, Storage, P2P, Agents, Distributed Systems

1 Introduction

A cloud computing platform dynamically provisions servers as required. Servers in the cloud can be physical and/or virtual machines. Other computing resources, such as network devices, storage area networks, firewalls and other security devices, are included as well. This paper focuses on the storage service provisioned by the cloud. Some typical cloud systems, such as Google's GFS [1], Amazon's Elastic Compute Cloud [3] and IBM's Blue Cloud [2], all have a similar central architecture for storage, with a central entity that indexes or manages the distributed data storage entities. A centrally managed architecture simplifies the design and maintenance of the system, but the central entity becomes a bottleneck because of the frequent visits to it.
Although systems in practice use techniques such as backup and recovery to guard against a probable disaster at the central bottleneck, the flaw of such an architecture has not been resolved in essence. To overcome the bottleneck resulting from the central master entity used for indexing, another architecture based on P2P has been introduced, which provides a purely distributed data storage environment without any central managing entity. In this paper, we propose a cloud computing architecture based on agents that offers the benefits of the P2P architecture but with better performance.

The rest of the paper is organized as follows. Section 2 introduces related work on cloud storage systems and P2P storage systems. Section 3 describes a typical scenario to explain the architecture of our proposed cloud storage environment. Section 4 introduces the prototype used. Section 5 presents our results. Section 6 gives the conclusion and proposals for future work.

2 Related Works

In this section, we introduce related work on cloud computing systems and P2P storage architectures.

2.1 Google File System (Single Master)

The first to give prominence to the term cloud computing (and maybe to coin it) was Google's CEO Eric Schmidt, in late 2006 [4]. Google's computing platform [5] was first developed for Google's most important application, its search service [6], and has since been extended to other applications. Google's cloud computing infrastructure has four systems that are independent of, yet closely linked to, each other: the Google File System for distributed file storage, the MapReduce programming model for parallel Google applications [6], Chubby for the distributed lock mechanism [7] and BigTable for Google's large-scale distributed database [8].

ISSN: X 240 ISBN:

A GFS cluster consists of a single master and multiple chunk servers and is accessed by multiple clients, as shown in Fig. 1 [1]. Files are divided into fixed-size chunks. Chunk servers store chunks on local disks as Linux files and read or write chunk data specified by a chunk handle and byte range. For reliability, each chunk is replicated on multiple chunk servers, and hence the replicas must be synchronized in order to maintain consistency. The master maintains all file system metadata. This includes the namespace, access control information, the mapping from files to chunks, and the current locations of chunks. When a client wants to access some data on a chunk server, it first sends a request to the master, and the master replies with the corresponding chunk handle and the locations of the replicas. The client then sends a request to one of the replicas and fetches the data it wants [1].

Fig. 1 Architecture of Google File System

The above architecture is clearly based on a master-indexed storage system. The defect of such a centrally indexed architecture is that the GFS master becomes the bottleneck of the system: every request must first go through the master index before being directed to the target chunk server, which places the whole burden on the GFS master.

2.2 P2P Storage System

A distributed P2P network indexed by a distributed hash table (DHT) can resolve the bottleneck problem that results from a central index system. Since the management is distributed equally to every peer in the network, there is no bottleneck any more, but the new alternative brings a new problem: how to maintain the consistency of the replicas under read/write operations. Several P2P systems for distributed storage have been developed, such as Oasis [9], OM [10] and Sigma [11]. They keep the replicas consistent in different ways and index the data resources by DHT.
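To make the idea of a DHT index concrete, the following minimal sketch shows how a key can be assigned to the peer that indexes it without any master. It is an illustration under stated assumptions, not the scheme used by Oasis, OM or Sigma: the names `key_id`, `HashRing` and `index_peer`, the 32-bit ring, the use of SHA-1 and the peer names are all hypothetical, and the ring is searched with a global view, whereas a real DHT forwards the lookup hop by hop.

```python
import bisect
import hashlib


def key_id(name: str, bits: int = 32) -> int:
    """Hash a name onto the identifier ring [0, 2**bits)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class HashRing:
    """Toy DHT: a key is indexed by the first peer clockwise from its ID."""

    def __init__(self, peers):
        # Sort peers by ring position so that a lookup is a binary search.
        self._ring = sorted((key_id(p), p) for p in peers)
        self._ids = [pid for pid, _ in self._ring]

    def index_peer(self, name: str) -> str:
        """Return the peer responsible for the index entry of `name`."""
        pos = bisect.bisect_left(self._ids, key_id(name))
        return self._ring[pos % len(self._ring)][1]  # wrap around the ring


# Hypothetical peers and file name, for illustration only.
peers = [f"chunk-server-{i}" for i in range(5)]
ring = HashRing(peers)
owner = ring.index_peer("reports/2010/q1.dat")
print(owner)
```

Because every peer can compute the same hash, any peer can tell which node indexes a key, so no single node has to serve every lookup. Removing a peer only moves the keys that peer was indexing.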
In the following section, we discuss a cloud storage system based on P2P.

Cloud Based on P2P Architecture

The architecture shown in Fig. 2 [12] is divided into three roles. The client application is responsible for requesting data from the cloud. The gateway is the entity that directs requests and responses between the client application and the cloud and leads each request to the nearest node on the network. Finally, the chunk server is the entity that serves as both a data resource node and a P2P node. Each chunk server has three function module interfaces, as illustrated in Fig. 2: an index module, which is responsible for the part of the global resource index assigned to it by DHT; a route module, which passes a lookup request on by means of a next-hop routing table; and a data (chunk) module, which provides the data resources stored on the local machine.

Fig. 2 Cloud Based on P2P Architecture

Fig. 3 [12] shows that the index module holds a chain containing the data index information: pointers to all the data blocks with the same name ID are linked in a sub-chain. A pointer contains the address of a data block and the update version number of that block [12].

Fig. 3 Index Module Details

3 Cloud Based on Agents OE-P2P

3.1 Architecture

Fig. 4 Cloud Storage Based on Agent OE-P2P (labels in the figure: Client, Gateway, Even Sub-Cloud, Odd Sub-Cloud, Agent, Chunk Server)

3.2 Work Flow

In this section, we present the system work flow with a typical scenario. Before a client can do its work, the data blocks and their replicas must be uploaded to the chunk servers. Chunk servers are selected for storage in the same way as on the traditional P2P cloud computing platform.

1. The client sends a request for a data block, with its logic identifier, to the gateway.
2. The gateway analyzes the request, parses the identifier of the data block in it (such as a logic address), and converts it by the DHT algorithm into a 128-bit logic ID that can be recognized by the chunk server agents on the OE-P2P network.
3. The gateway directs the search request data package to one of the sub-clouds based on the logic ID. If the logic ID is even, the package is directed to the nearest node on the even sub-cloud; otherwise it is addressed to the nearest node on the odd sub-cloud.
4. The gateway constructs an OE-P2P search request data package that includes the logic ID and sends the request to the chunk server agents of the OE-P2P network.
5. The OE-P2P search request package is routed among the chunk servers following an OE-P2P search protocol such as Chord [13], CAN [13], Pastry [13] or Tapestry [13]. The chunk server agents now act as routing nodes of OE-P2P, and the routing interface is used.
6. The request reaches the server that holds the index information of the logic ID being searched for.
7. The index includes the pointers to all data replicas with the same ID. The chunk server agent now acts as an index server, and the index function interface plays its role. The chunk server selects the latest pointer by its version number; if there is more than one candidate, the server selects the nearest node by comparing the IP addresses of the client and the data servers, and then returns the best address to the client.
8. When the client gets the best address, it sends its request to that chunk server, which holds the data block. The chunk server now acts as a data provider, as on a traditional cloud storage platform.

3.3 Replication

In our case the cloud provides storage as a service (SaaS) to users [14], and a frequently met problem is mutual exclusion between writes and reads. In a centrally managed system such as GFS this can be resolved with a lock mechanism, but in a distributed system it is more complicated. In this section, we discuss replica consistency control. Here is an example of write consistency in our P2P cloud storage system:

1. The client finds the chunk server node that holds the index information of the target data block.
2. The client tells the index node that it will perform a write operation.
3. The index node checks the state of the chain of replicas to see whether another write is being processed. The state of the chain is either lock or unlock. If the state of the pointer with the latest version is unlock, the index node allows the write operation by returning the chunk server address of the latest version (if there are multiple candidates, the index node selects the node nearest to the client by comparing the IP addresses) and changes the state to lock. If the state is lock, the index node queues the requests until the state reverts to unlock or the lock times out; the first write request in the queue can then be carried out.
4. The client gets the address of the newest version, connects to that chunk server, writes the updated data to the block, and then sends a message to the index node to notify it that the write operation has finished.
5. When the index node receives the finish message, it increments the version number of the pointer to the block just modified by 1. A consistency update of all the replicas is then started.
6. After a replica server finishes its update, it sends an update response message to the index node, and the index node increases the version in that replica's pointer in the chain.
7. When all the pointers in the chain have been updated to the newest version, the state of the chain is set to unlock. If the lock period exceeds the time limit, the chain is forcibly reset to unlock, and the lock is said to have timed out. Once the chain has been set to unlock, any delayed update response messages are discarded, and the version of the corresponding pointer in the chain remains unaffected.
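The write protocol in steps 1 to 7 can be sketched as a small state machine kept by the index node. This is a minimal illustration, not the prototype's code: the names `Pointer`, `IndexChain`, `request_write`, `finish_write` and `replica_updated`, the 5-second time limit and the IP addresses are hypothetical. The sketch also assumes that every finish or update message carries the version it refers to, so that messages from a timed-out update can be recognized; the nearest-node tie-break and the re-dispatch of queued requests are left out.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Pointer:
    address: str      # chunk server holding this replica
    version: int = 0  # update version number of the replica


@dataclass
class IndexChain:
    """Index-node state for one logic ID (illustrative names only)."""
    pointers: list
    timeout: float = 5.0   # assumed lock time limit, in seconds
    locked: bool = False
    locked_at: float = 0.0
    target: int = 0        # version that every replica must reach
    waiting: list = field(default_factory=list)

    def request_write(self, client, now=None):
        """Step 3: grant the write, or queue the client while locked."""
        now = time.monotonic() if now is None else now
        # Step 7: a lock held longer than the time limit is reset by force.
        if self.locked and now - self.locked_at > self.timeout:
            self.locked = False
        if self.locked:
            self.waiting.append(client)
            return None
        self.locked, self.locked_at = True, now
        latest = max(self.pointers, key=lambda p: p.version)
        self.target = latest.version + 1
        return latest.address

    def finish_write(self, address, version):
        """Step 5: the writer has finished; bump the version of its pointer."""
        self._ack(address, version)

    def replica_updated(self, address, version):
        """Step 6: a replica reports that it has applied the update."""
        self._ack(address, version)

    def _ack(self, address, version):
        # Messages that belong to a timed-out update are discarded.
        if not self.locked or version != self.target:
            return
        for p in self.pointers:
            if p.address == address:
                p.version = self.target
        # Step 7: unlock once every pointer has reached the new version.
        if all(p.version == self.target for p in self.pointers):
            self.locked = False


chain = IndexChain([Pointer("10.0.0.1"), Pointer("10.0.0.2")])
addr = chain.request_write("client-a", now=0.0)     # lock granted
blocked = chain.request_write("client-b", now=1.0)  # chain locked: queued
chain.finish_write(addr, chain.target)
chain.replica_updated("10.0.0.2", chain.target)     # all replicas current
```

Passing `now` explicitly keeps the example deterministic; when it is omitted the sketch falls back to `time.monotonic()`.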
The forced unlock prevents the system from waiting indefinitely if something goes wrong with some replica servers during an update. Since the version of the data block on a delayed server is old, no client will later visit that server for the old data unless its update is confirmed. An agent-scheduling routine [15] is used to make sure that all replicas reach the same version and to avoid inconsistencies. The procedure is carried out at a regular time interval, at which all replicas are updated to the same version number.

4 Prototype System

We developed a prototype simulation system based on all three architectures discussed in this paper. The design is as follows: the operating system used is Windows XP, and the simulator was developed using C#.Net. The number of chunk servers, the number of clients, the operation types (read/write) and the number of operations are entered as parameters to the system.

5 Results

We implemented all three architectures and considered a LAN setting, in which the latencies on each node are taken into consideration in our simulation. Our simulation was fed with a varying number of clients in order to test both response time and throughput. Our test environment was composed of 5 servers holding 50 files distributed randomly; the number of clients was entered as a parameter taking the values 10, 50, 100, 200 and 400, and each client accessed 10 files applying both read and write operations.

Fig. 5: Average response time during write operation

In Fig. 5 we demonstrate that, with an increasing number of clients, OE-P2P shows a better response time in write mode. Google showed the highest response time because of the bottleneck resulting from its centralized architecture, whereas the search space of P2P was larger than the number of chunk servers searched in OE-P2P.

Fig. 6: Average throughput during write operation

In Fig. 6, OE-P2P and P2P show a higher number of bytes written than Google, due to the agent-scheduled replication procedure used to maintain consistency among replicas, but OE-P2P showed better results than P2P because of its smaller search space.

Fig. 7: Average response time during read operation

In Fig. 7 we demonstrate that, with an increasing number of clients, OE-P2P shows the lowest response time in read mode compared with P2P and Google, as OE-P2P is free from the bottleneck from which Google suffers and has a smaller search space than P2P.

Fig. 8: Average throughput during read operation

Fig. 8 illustrates that all three architectures show similar results for the number of bytes read: no delay occurs, because the agent-scheduled replication procedure is not used to maintain replicas during read operations.

6 Conclusion

The experimental results confirm that OE-P2P shows better results than both the Google and P2P architectures in terms of response time. This is because OE-P2P is free from the bottleneck resulting from the centralized architecture of Google, and because OE-P2P divides the search space, which results in a lower response time, as illustrated above. Google, however, showed better results in terms of throughput in write operations. We show that as the number of clients increases, the response time increases for both Google and P2P compared to OE-P2P, whereas Google and OE-P2P show better throughput than P2P for an increasing number of clients.

7 References

[1] Ghemawat S, Gobioff H, Leung ST. The Google file system. In: Proc. of the 19th ACM Symp. on Operating Systems Principles. New York: ACM Press, 2003, pp.
[2] Boss G, Malladi P, Quan D, Legregni L, Hall H. Cloud computing. IBM White Paper, dw/wes/hipods/cloud_computing_wp_final_8ct.pdf
[3] Amazon. Amazon Elastic Compute Cloud (Amazon EC2)
[4] Francesco Maria Aymerich, Gianni Fenu, Simone Surcis. An Approach to a Cloud Computing Network, ICADIWT, 2008, pp.
[5] Barroso LA, Dean J, Hölzle U. Web search for a planet: The Google cluster architecture. IEEE Micro, 2003, 23(2): pp.
[6] Dean J, Ghemawat S. MapReduce: Simplified data processing on large clusters. In: Proc. of the 6th Symp. on Operating System Design and Implementation. Berkeley: USENIX Association, pp.
[7] Burrows M. The Chubby lock service for loosely-coupled distributed systems. In: Proc. of the 7th USENIX Symp. on Operating Systems Design and Implementation. Berkeley: USENIX Association, pp.
[8] Chang F, Dean J, Ghemawat S, Hsieh WC, Wallach DA, Burrows M, Chandra T, Fikes A, Gruber RE. Bigtable: A distributed storage system for structured data. In: Proc. of the 7th USENIX Symp. on Operating Systems Design and Implementation. Berkeley: USENIX Association, pp.

[9] Oasis: M. Rodrig and A. Lamarca, Decentralized Weighted Voting for P2P Data Management, in Proc. of the 3rd ACM International Workshop on Data Engineering for Wireless and Mobile Access, 2003, pp.
[10] OM: H. Yu and A. Vahdat, Consistent and Automatic Replica Regeneration, in Proc. of First Symposium on Networked Systems Design and Implementation (NSDI '04),
[11] Sigma: S. Lin, Q. Lian, M. Chen, and Z. Zhang, A practical distributed mutual exclusion protocol in dynamic peer-to-peer systems, in Proc. of 3rd International Workshop on Peer-to-Peer Systems (IPTPS '04),
[12] Ke Xu, Meina Song, Xiaoqi Zhang, Junde Song, "A Cloud Computing Platform Based on P2P", ITIME '09 IEEE International Symposium on IT in Medicine & Education, pp.
[13] Rafit Izhak-Ratzin, Improving the BitTorrent Protocol Using Different Incentive Techniques, University of California, LA, Doctor of Philosophy in Computer Science, 2010
[14] A white paper produced by the Cloud Computing Use Case Discussion Group, 2009, Cloud Computing Use Cases, Version 2.0,
[15] E. Sarhan, A. Ghalwash, M. Khafagy, Agent Based Replication for Scaling Back-End Databases of Dynamic Content Web Sites, 12th WSEAS International Conference on COMPUTERS, Heraklion, Greece, pp.
