Advanced Science and Technology Letters, pp.67-73
http://dx.doi.org/0.4257/astl.204.50.

Batch Auditing for Multiclient Data in Multicloud Storage

Zhihua Xia, Xinhui Wang, Xingming Sun, Yafeng Zhu, Peng Ji and Jin Wang

Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science & Technology, Nanjing, 210044, China
School of Computer & Software, Nanjing University of Information Science & Technology, Nanjing, 210044, China

Abstract. Cloud storage enables users to outsource their data to cloud servers and enjoy on-demand, high-quality services. However, this new paradigm also introduces many challenges due to the security and integrity threats toward users' outsourced data. Recently, various remote integrity auditing methods have been proposed, but most of them only serve a single-cloud environment or audit each data file individually. In this paper, we develop an efficient auditing mechanism, which supports batch auditing for multiple data files in a multi-cloud environment. By utilizing the bilinear map, the proposed protocol achieves fully stateless and transparent verification. By constructing a sequence-enforced Merkle Hash Tree, the proposed protocol can resist the replace attack. In addition, our protocol protects the position information of the data blocks by generating fake data blocks to confuse the organizer. By computing intermediate values of the verification on the cloud servers, our method can greatly reduce the computing overhead of the auditor. The performance analysis shows that the proposed protocol is efficient.

1 Introduction

Cloud storage service is an important service of cloud computing, which has become a new profit growth point by relieving individuals and enterprises of the burden of storage management and maintenance. By remotely storing data in the cloud, users can access their data via networks at any time and from anywhere. However, many users are still hesitant to use this novel paradigm due to the security and integrity threats toward their outsourced data.
This is because data loss can occur in any infrastructure, no matter how reliable the measures taken by the cloud service providers (CSPs) [1]. Moreover, the CSPs could be dishonest. They may hide data loss accidents to maintain their reputation, or even discard data that has never or rarely been accessed to save storage space, while claiming that the data is still correctly stored in the cloud. Therefore, it is essential for cloud users to check the integrity and availability of their cloud data.

In order to address the issue above, various Provable Data Possession (PDP) protocols have been proposed [2-4]. PDP is a probabilistic proof technique for checking the availability and integrity of outsourced data by randomly sampling a few file blocks. Ateniese et al. [2] first defined the PDP model with public verification. They utilized RSA-based homomorphic linear authenticators (HLA) and suggested randomly sampling a few blocks of the file for verification. In order to support dynamic operations, Ateniese et al. [3] developed a partially dynamic version, scalable PDP, based on symmetric key cryptography. After that, Erway et al. [4] presented a skip list-based dynamic PDP model with fully dynamic data operations. Wang et al. also proposed a dynamic PDP scheme that combines Boneh-Lynn-Shacham (BLS) signature-based HLA with the Merkle Hash Tree (MHT) structure [5, 6]. The scheme supports both public stateless verification and fully dynamic data update. Their subsequent work [7] proposed a scheme supporting privacy-preserving public auditing, which was also extended to enable batch auditing. In other related work, Juels and Kaliski [8] describe a proofs of retrievability (POR) model, which can not only verify data possession but also ensure retrievability of the raw data files when an abnormality is detected.

ISSN: 2287-1233 ASTL Copyright 2014 SERSC

Most of the above PDP schemes mainly address integrity verification at a single CSP. As a more feasible application scenario, users may store their data in a multicloud in a distributed manner to reduce the threats to data integrity and availability [9]. In this scenario, a multicloud is composed of multiple private or public clouds. Each CSP has a different level of quality of service as well as a different cost associated with it. Hence, users can store their data files on more than one CSP according to the required level of security and their affordable budgets. Within a multicloud, an organization can offer and manage in-house and out-house resources [10, 11]. In this paper, we develop an efficient auditing mechanism, which supports batch auditing for multiple data files in a multi-cloud environment.
By utilizing the bilinear map, the proposed protocol can aggregate the verification tasks from different users to reduce the computing overhead of the auditor. By constructing a sequence-enforced Merkle Hash Tree, the proposed protocol can resist the replace attack. In addition, our protocol protects the position information of the data blocks by generating fake data blocks to confuse the organizer, so as to achieve fully stateless and transparent verification.

2 Problem Statement

We consider a multicloud storage service model which is adopted by some previous works [10, 11]. The model involves three different entities: cloud users, the multicloud and the third party auditor.

The cloud users have a number of data files to be stored in multiple clouds. They have the authority to access and manipulate the stored data.

The multicloud consists of multiple Cloud Service Providers (CSPs). They provide data storage service and have enough storage space and significant computation resources. In this paper, to reduce the communication burden of the verifier, one of the CSPs is designated as an organizer for auditing purposes. For example, the Zoho cloud in Fig. 1 is considered as an organizer. In our scheme, the organizer takes the responsibility to distribute the auditing challenge and aggregate the proofs from multiple clouds. In this paper, we suppose that the CSPs cannot communicate with each other, apart from the organizer, for auditing issues, and that the verifier can only contact the organizer.

The Third Party Auditor (TPA) has more powerful computation and communication abilities than regular cloud users. In a cloud storage system, neither the cloud service providers nor the users can be guaranteed to provide an unbiased auditing result. Thus, third party auditing is a natural choice. Moreover, by resorting to the TPA, users can be relieved from the burden of checking the integrity of outsourced data.

3 The Batch Auditing CPDP Protocol

In the proposed protocol, we need to use a bilinear map group system, two hash functions and a signature function. Let $S = (p, g, G_1, G_2, G_T, e)$ be a bilinear map group system with generator $g$, where $G_1$, $G_2$ and $G_T$ are multiplicative cyclic groups of prime order $p$, and $e: G_1 \times G_2 \rightarrow G_T$ is a bilinear map. Let $H(\cdot): \{0,1\}^* \rightarrow G_1$ be a secure map-to-point hash function, $h(\cdot): G_T \rightarrow Z_p$ be another hash function which maps elements of $G_T$ to $Z_p$, and $Sig()$ be the signature function. Let $\{U_k\}$ be the set of users, and $\{P_c\}$ be the set of CSPs. The number of CSPs is denoted as $C$.

3.1 Setup Phase

Each user runs the setup phase of the protocol as follows:

Step1: KeyGen( ). The $k$-th user $U_k$ generates a signing key pair $(ssk_k, spk_k)$. Select a random $\alpha_k \in Z_p$, and compute $\upsilon_k = g^{\alpha_k}$. Select $s$ random elements $\{u_{k,1}, u_{k,2}, \ldots, u_{k,s}\} \in G_1$. To sum up, the secret key is $sk_k = (\alpha_k, ssk_k)$ and the public parameters are $pk_k = (spk_k, g, \upsilon_k, \{u_{k,j}\}_{j \in [1,s]})$.

Step2: TagGen($sk_k$, $F_k$, $P$). The user $U_k$ splits his file $F_k$ into $n_k \times s$ sectors $F_k = \{m_{k,i,j}\}_{i \in [1, n_k], j \in [1, s]} \in Z_p^{n_k \times s}$, where the $i$-th block $m_{k,i}$ of user $U_k$ consists of $s$ sectors $\{m_{k,i,1}, m_{k,i,2}, \ldots, m_{k,i,s}\}$. The user computes $t_k = name_k \| n_k \| Sig_{ssk_k}(name_k \| n_k)$ as the file tag for $F_k$, where $name_k$ is the file name.
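The file splitting in Step2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prime `P` (here $2^{255} - 19$) merely stands in for the group order $p$, the 31-byte sector size, the zero padding and the `b"||"` separator are assumptions chosen for illustration, and the signing step $Sig_{ssk_k}$ is left out because the paper does not fix a signature scheme.

```python
from typing import List

# Assumption: a 255-bit prime standing in for the group order p.
P = 2**255 - 19
# Assumption: 31-byte (248-bit) sectors, so every sector value is below P.
SECTOR_BYTES = 31


def split_file(data: bytes, s: int) -> List[List[int]]:
    """Split data into n blocks of s sectors, each sector an element of Z_p."""
    block_bytes = s * SECTOR_BYTES
    n = max(1, -(-len(data) // block_bytes))        # ceiling division, n >= 1
    padded = data.ljust(n * block_bytes, b"\x00")   # zero-pad the last block
    blocks = []
    for i in range(n):
        block = padded[i * block_bytes:(i + 1) * block_bytes]
        sectors = [
            int.from_bytes(block[j * SECTOR_BYTES:(j + 1) * SECTOR_BYTES], "big")
            for j in range(s)
        ]
        blocks.append(sectors)
    return blocks


def file_tag_message(name: str, n: int) -> bytes:
    """Build name || n, the message that would be signed inside the file tag."""
    return name.encode("utf-8") + b"||" + str(n).encode("ascii")
```

The file tag would then be `file_tag_message(name, n)` followed by a signature over it under the user's signing key.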
Next, the user $U_k$ constructs an SMHT with a root $R_k$, where the leaf nodes are the ordered hash values of the data blocks $\{T_{k,i} = H(m_{k,i})\}_{i \in [1, n_k]}$, as described in part 2.3. The user also signs $R_k$ with the private key $\alpha_k$:

$$Sig(H(R_k)) = (H(R_k))^{\alpha_k}. \tag{1}$$
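How an ordered hash tree ties each block to its position can be sketched as follows. This is a plain binary Merkle tree, not the paper's sequence-enforced construction, whose details are not given in this section. SHA-256 stands in for $H$, duplicating the last node on odd levels is an assumption, and all function names are hypothetical. The authentication path plays the role of the auxiliary information later used to rebuild the root.

```python
import hashlib
from typing import List, Tuple


def h(data: bytes) -> bytes:
    """Stand-in for the hash function H (assumption: SHA-256)."""
    return hashlib.sha256(data).digest()


def build_levels(leaves: List[bytes]) -> List[List[bytes]]:
    """Return all tree levels, from the ordered leaf hashes up to the root."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:                      # duplicate last node on odd levels
            cur = cur + [cur[-1]]
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def auth_path(levels: List[List[bytes]], index: int) -> List[Tuple[bytes, bool]]:
    """Siblings on the path from leaf `index` to the root.
    The flag is True when the sibling sits to the left of the path node."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def root_from_path(leaf: bytes, path: List[Tuple[bytes, bool]]) -> bytes:
    """Recompute the root from one leaf and its authentication path."""
    node = h(leaf)
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node
```

Because the left/right flags fix the leaf position, presenting a valid block under another block's path yields a different root, which is the effect the replace-attack defence relies on.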
For each data block $m_{k,i}$ ($i \in [1, n_k]$), the user $U_k$ computes a data tag $\sigma_{k,i}$ as

$$\sigma_{k,i} = \Bigl(T_{k,i} \cdot \prod_{j=1}^{s} u_{k,j}^{m_{k,i,j}}\Bigr)^{\alpha_k}. \tag{2}$$

Besides, the user $U_k$ generates an additional random data block denoted as $m_{k,n'}$, which is also divided into $s$ sectors $\{m_{k,n',j}\}_{j \in [1,s]}$. Then the corresponding tag is computed as

$$\sigma_{k,n'} = \Bigl(H(R_k)^{1/C} \cdot \prod_{j=1}^{s} u_{k,j}^{m_{k,n',j}}\Bigr)^{\alpha_k}. \tag{3}$$

After all the parameters have been prepared, the user $U_k$ distributes the data block and tag pairs $(m_{k,i}, \sigma_{k,i})$ to the corresponding cloud service providers, and sends the random data block and its corresponding tag $(m_{k,n'}, \sigma_{k,n'})$ to each cloud service provider. In addition, the user $U_k$ sends the parameters $\{SMHT_k, Sig(H(R_k)), t_k\}$ to the organizer, and sends the public parameters $pk_k$ to the TPA. After data transmission, the user asks the TPA to conduct a confirmation audit to make sure that the data is correctly stored on all the servers. Once confirmed, the user can choose to delete the local copy of the data blocks, keeping only the secret key. From then on, the TPA can run the sampling audit periodically to check the data integrity for users.

3.2 Batch Audit Phase

Suppose that the TPA processes $K$ auditing sessions for $K$ distinct data files simultaneously. The data files are stored on $C$ CSPs ($\{P_c\}_{c \in [1,C]}$). The audit phase is executed as follows:

Step1: GenProof($\{m_{k,i}, \sigma_{k,i}\}$, $chal$). It is an interactive 5-move protocol among the CSPs, an organizer (O), and an auditor (TPA). This process is described in the following.

1) Retrieve (O → TPA): After receiving the verification request, the organizer sends the file tags $\{t_k\}_{k \in [1,K]}$ to the TPA. The TPA recovers $\{name_k\}_{k \in [1,K]}$ and $\{n_k\}_{k \in [1,K]}$ by using the public keys $\{spk_k\}_{k \in [1,K]}$, and verifies all the signatures. The verification quits by emitting FALSE if the file name verification fails.

2) Challenge1 (O ← TPA): If the file names are successfully verified, the TPA generates the challenge index-coefficient messages $\{Q_k = \{(i, v_i)\}_{i \in I_k}\}_{k \in [1,K]}$, where $I_k \subseteq [1, n_k]$ specifies the sampled blocks that will be verified, and $v_i \in Z_p$ is a random element. The TPA chooses a random element $\gamma \in Z_p$ and computes the set of challenge stamps $\{H_k = \upsilon_k^{\gamma}\}_{k \in [1,K]}$, where $\upsilon_k = g^{\alpha_k}$ is taken from the public parameters of $U_k$. To sum up, the TPA generates the challenge as follows, and sends it to the organizer.

$$chal = \bigl(\{Q_k\}_{k \in [1,K]}, \{H_k\}_{k \in [1,K]}\bigr). \tag{4}$$

3) Challenge2 (P ← O): Upon receiving the challenge $chal$ from the TPA, the organizer forwards $chal$ to each $P_c$.

4) Response1 (P → O): For each $Q_k$, each $P_c$ picks out $Q_{k,c} = \{(i, v_i) \mid m_{k,i} \in P_c\} \subseteq Q_k$. Here, the symbol $m_{k,i} \in P_c$ means that the data block $m_{k,i}$ is stored on $P_c$. Then, $P_c$ calculates the linear combination of the specified data blocks for each file as

$$\mu_{k,c,j} = m_{k,n',j} + \sum_{(i, v_i) \in Q_{k,c}} v_i \cdot m_{k,i,j} \in Z_p, \tag{5}$$

and generates the data proof $DP_c$ and the tag proof $TP_c$ as

$$DP_c = \prod_{k \in [1,K]} \prod_{j=1}^{s} e(u_{k,j}, H_k)^{\mu_{k,c,j}}, \qquad TP_c = \prod_{k \in [1,K]} \Bigl(\sigma_{k,n'} \cdot \prod_{(i, v_i) \in Q_{k,c}} \sigma_{k,i}^{v_i}\Bigr) \in G_1. \tag{6}$$

Finally, $P_c$ returns $(DP_c, TP_c)$ to the organizer.

5) Response2 (O → TPA): Upon receiving the proofs from all the CSPs, the organizer aggregates them into a response $(DP, TP)$, which is calculated as

$$DP = \prod_{c \in [1,C]} DP_c, \qquad TP = \prod_{c \in [1,C]} TP_c. \tag{7}$$

In addition, the organizer also provides the TPA with the AAI $\{\Omega_{k,i}\}_{i \in I_k, k \in [1,K]}$, which are the siblings of the nodes on the path from the leaves $\{T_{k,i}\}_{i \in I_k, k \in [1,K]}$ to the root $R_k$ of $SMHT_k$. Finally, the organizer responds to the TPA with the proof

$$Pr = \bigl\{DP, TP, \{T_{k,i}, \Omega_{k,i}\}_{i \in I_k, k \in [1,K]}, \{Sig(H(R_k))\}_{k \in [1,K]}\bigr\}. \tag{8}$$

Step2: BatchVerifyProof($pk$, $chal$, $Pr$). After receiving the proof $Pr$ from the organizer, the TPA starts to verify the proof. First, for each $k \in [1,K]$ and $i \in I_k$, the TPA verifies the position of $\{T_{k,i}\}$ by checking whether the equation $LEFT(T_i) = i$ holds. Second, the TPA constructs a verification root $R'_k$ by using $\{T_{k,i}, \Omega_{k,i}\}_{i \in I_k}$ for $k \in [1,K]$, and verifies the value of $R'_k$ by checking Eq. (9). Third, if both authentications above succeed, the TPA verifies data integrity by checking Eq. (10).

$$e(Sig(H(R_k)), g) \stackrel{?}{=} e(H(R'_k), \upsilon_k) \tag{9}$$

$$e(TP, g^{\gamma}) \stackrel{?}{=} DP \cdot \prod_{k \in [1,K]} e\Bigl(H(R_k) \cdot \prod_{i \in I_k} T_{k,i}^{v_i}, H_k\Bigr) \tag{10}$$

If the data blocks are not damaged by the cloud servers, Eq. (10) holds.

4 Security Analysis

Privacy-preserving property: Before computing and returning Response1, the CSPs authenticate the challenge requests with the certificate issued by the user on the TPA's public key. Thus, only the TPA can send the authentication request. Moreover, after receiving the final response from the organizer, the TPA first needs to validate the leaves of the SMHT, which is stored in the organizer. This guarantees that only the organizer can compute the final response. Besides, to protect data privacy, we use Eq. (6) to process the linear combination of the specified data blocks, which is the same as the random mask technique.

Transparent verification property: This paper introduces an organizer, who is responsible for interacting with the TPA. So, the TPA cannot learn the details of data storage. In this paper, the organizer is also considered untrusted. In order to prevent the organizer from learning the location of the data blocks, the user generates an additional data-tag pair, which is sent to each CSP in the setup phase of the protocol. In the audit phase, whichever data blocks are in the challenge request from the TPA, every CSP sends a response to the organizer, which conceals the details of data storage from the organizer. Thus, the proposed batch auditing protocol conceals the storage details of multiple CSPs from both the TPA and the organizer.

5 Conclusions

In this paper, we explore the collaborative integrity auditing of remote data stored in multiple clouds. A batch auditing PDP mechanism for the multi-cloud environment is proposed.
Our construction enables the TPA to perform multiple auditing tasks for multiple data files in multiple clouds. Meanwhile, transparent verification, fully stateless verification and security are also important objectives of the protocol. Utilizing the SMHT construction, the proposed protocol achieves fully stateless verification and dynamic data operation with integrity assurance. The paper uses a BLS-based homomorphic authenticator to equip the verification protocol, which enables the TPA to perform batch audits for multiple users based on the technique of bilinear aggregate signatures. In addition, by letting the cloud servers compute intermediate values of the verification for the auditor, our method can greatly reduce the computing overhead of the TPA. Finally, the security analysis shows that the proposed batch auditing protocol is secure, and our performance evaluation shows that it is much more efficient than individual auditing.

Acknowledgements. This work is supported by the NSFC (623206, 6734, 67342, 67336, 60325, 637332, 637333), GYHY20206033, 2030030, 203DFG2860, BC20302 and PAPD fund.

References

1. Goodson, G. R. et al.: Efficient Byzantine-tolerant erasure-coded storage. Dependable Systems and Networks, 2004 International Conference on, 2004, pp. 135-144.
2. Ateniese, G. et al.: Provable data possession at untrusted stores. Proceedings of the 14th ACM conference on Computer and communications security, 2007, pp. 598-609.
3. Ateniese, G. et al.: Scalable and efficient provable data possession. Proceedings of the 4th international conference on Security and privacy in communication networks, 2008, p. 9.
4. Erway, C. et al.: Dynamic provable data possession. Proceedings of the 16th ACM conference on Computer and communications security, 2009, pp. 213-222.
5. Wang, Q. et al.: Enabling public auditability and data dynamics for storage security in cloud computing. Parallel and Distributed Systems, IEEE Transactions on, vol. 22, pp. 847-859, 2011.
6. Boneh, D. et al.: Short signatures from the Weil pairing. Advances in Cryptology - ASIACRYPT 2001, Springer, 2001, pp. 514-532.
7. Wang, C. et al.: Privacy-preserving public auditing for secure cloud storage. 2013.
8. Juels, A., Kaliski Jr., B. S.: PORs: Proofs of retrievability for large files. Proceedings of the 14th ACM conference on Computer and communications security, 2007, pp. 584-597.
9. AlZain, M. A. et al.: Cloud computing security: from single to multi-clouds. System Science (HICSS), 2012 45th Hawaii International Conference on, 2012, pp. 5490-5499.
10. Zhu, Y. et al.: Collaborative integrity verification in hybrid clouds. Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom), 2011 7th International Conference on, 2011, pp. 191-200.
11. Zhu, Y. et al.: Cooperative provable data possession for integrity verification in multi-cloud storage. IEEE Transactions on Parallel and Distributed Systems, vol. 23(12), pp. 2231-2244, 2012.