International Journal of Computer Engineering and Applications, Volume XII, Special Issue, March 18, ISSN
SECURE DATA DEDUPLICATION FOR CLOUD STORAGE: A SURVEY

Vidya Kurtadikar 1, Chaitanya Atre 2, Dhanraj Gade 3, Ritwik Jadhav 4, Rohan Gandhi 5
Department of Information Technology, MITCOE, Pune, India.

ABSTRACT: Nowadays, there is a drastic increase in the demand for data storage, and with it the concept of cloud computing is on the rise. This enormous amount of data can be backed up on cloud storage, but doing so significantly increases the cost of storage and affects performance. Traditional storage of data introduces redundancies, and the concept of data deduplication was developed to address them. Data deduplication is an effective solution for eliminating redundancies: it uses hash values and index tables to detect and remove duplicate data. With the data deduplication process, an effective performance increase and a reduction in storage cost can be observed. In this paper, we discuss different data deduplication methods along with their advantages and disadvantages. We also propose an enhanced security method for the generated data chunks using a standard encryption algorithm.

Keywords: Data Deduplication, Cloud Storage, Hashing, Chunking, Redundant Data.

[1] INTRODUCTION

Nowadays, due to advancements in technology, the amount of digital data generated by applications is increasing at a fast rate. As storage systems have limited capacity, storing such a huge amount of data has posed a challenge. Recent International Data Corporation (IDC) studies indicate that in the past five years the volume of data has increased by almost nine times, to 7 ZB per year, and even more explosive growth is expected in the next ten years [2]. Such massive growth in storage is controlled by the technique of data deduplication [1].
This process identifies duplicate contents at the chunk level using hash values and deduplicates them.
Data deduplication can be implemented at the file level and at the chunk level. Chunk-level deduplication is generally preferred over file-level deduplication: in chunk-level deduplication, the fingerprints of individual chunks are compared and the redundant ones are deduplicated, whereas in file-level deduplication the whole file is compared with other files by checking its metadata. Chunking plays a significant role in determining the efficiency of a data deduplication algorithm, and the performance of the algorithm can be computed by analysing the size and number of chunks [1].

As stated in [1], data deduplication saves considerable storage space and money by optimizing storage space and bandwidth costs. A greener environment can also be obtained, as less space is required to house the data in primary and remote storage. Since less storage is maintained, a faster return on investment can be obtained. The process also helps in conserving network bandwidth and improving network efficiency.

[2] DIFFERENT APPROACHES OF DATA DEDUPLICATION

Data deduplication - often called intelligent compression or single-instance storage - is a process that eliminates redundant copies of data and reduces storage overhead. Data deduplication techniques guarantee that only a single unique instance of data gets stored in the backup system. In the process of data deduplication, we divide a file or block of data into multiple chunks and calculate a hash value for each chunk using a hash function such as SHA-1 or MD5 [3]. Using these hash values, we can compare a stored chunk with an incoming data chunk; if a match is found, we can conclude that an identical data chunk already exists in the storage system, and the duplicate chunk is replaced with a reference to the existing one.

Figure: 1. Diagram Illustrating the Data Deduplication Process

[2.1] STEPS OF DATA DEDUPLICATION

1. Creation of Chunks: The file is divided into chunks using one of the chunking methods - fixed-length chunking or variable-length chunking.
2. Hash Value Computation: Depending on the chunks formed, the hash values of these chunks are computed using one of the available hashing algorithms, such as SHA-1 or MD5.

3. Deduplication Process: The hash values of the data chunks, which can be called fingerprints, are stored in an index table. Duplicate data can be detected by making use of this index table: if a hash value matches an existing entry, the duplicate data is replaced with a reference to the original data chunk [1].

[2.2] HASH ALGORITHMS

Hash algorithms are central to the data deduplication process. The hash values computed for data chunks, generally called fingerprints, are used to eliminate redundancies in the data. The most commonly used hash algorithm in deduplication is SHA-1.

[2.3] SHA-1

SHA-1 is a cryptographic hash algorithm used in the deduplication process for computing the fingerprints of data chunks. SHA-1 is closely modelled after MD5. It produces a message digest of 160 bits by processing the input data in blocks of 512 bits each [4]. The 160-bit value computed for each data chunk is treated as unique and is used for eliminating redundancies in the data.

Figure: 2. Classification Tree of Data Deduplication

[2.4] SOURCE BASED DEDUPLICATION

Source deduplication is the process of eliminating redundancies from data before transferring that data to the target server. It provides numerous advantages, such as reduced bandwidth and storage usage, but it can be slower than target-based deduplication when large amounts of data are involved. Source deduplication works from the client side, in coordination with the server, to compare new data chunks with previously stored data chunks.
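The chunk-creation, fingerprinting, and index-table lookup described in sections [2.1]-[2.3] can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions (fixed-length chunks, an in-memory dict as both the index table and the chunk store), not the implementation of any system surveyed here:

```python
import hashlib

CHUNK_SIZE = 8 * 1024  # 8 KB fixed-length chunks (illustrative choice)

store = {}         # fingerprint -> chunk bytes (unique chunks only)
file_recipes = {}  # file name -> ordered list of fingerprints

def dedup_store(name, data):
    """Split data into chunks, fingerprint each with SHA-1, and store
    only chunks whose fingerprint is not already in the index."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        fp = hashlib.sha1(chunk).hexdigest()  # 160-bit fingerprint
        if fp not in store:                   # index-table lookup
            store[fp] = chunk                 # new unique chunk
        recipe.append(fp)                     # duplicate keeps only a reference
    file_recipes[name] = recipe

def restore(name):
    """Rebuild a file from its recipe of fingerprint references."""
    return b"".join(store[fp] for fp in file_recipes[name])

a = b"A" * 10000 + b"B" * 10000
dedup_store("f1.bin", a)
dedup_store("f2.bin", a)   # identical file: no new chunks stored
print(len(store))          # -> 3 unique chunks stored for the two files
```

Storing the second, identical file adds no new chunks to the store; only its recipe of fingerprint references is recorded, which is exactly the storage saving deduplication aims for.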
[2.5] TARGET BASED DEDUPLICATION

Target deduplication is the removal of redundancies from the data stream as it passes through an appliance or application placed between the source and the target storage server. Target deduplication reduces the amount of storage required but, unlike source deduplication, it does not reduce the amount of data that must be sent across the LAN or WAN during storage.

[2.6] INLINE DATA DEDUPLICATION

Inline deduplication eliminates data redundancies as the data enters the system. This process cuts down the bulk of the data and makes the system efficient. The benefit of inline deduplication is that the hash-value calculation and the search for redundant data are done before the data is actually written to the system.

[2.7] POST PROCESS DEDUPLICATION

Post-process deduplication, or asynchronous deduplication, is the analysis and removal of redundant data after the data has been written to the storage system. It offers advantages such as an efficient lookup process, since the hash-value calculation and the search for redundant data are performed after the data files are already stored on the storage system.

[2.8] FILE BASED & SUB-FILE BASED DEDUPLICATION

File-based deduplication works by calculating a single checksum of the complete file and comparing it with another file's checksum. It is simple and fast, but its deduplication efficiency is low, as it does not handle duplicate content found inside different files. Sub-file deduplication, in contrast, breaks the file into smaller fixed- or variable-sized data chunks and then uses a standard hash-based algorithm to find similar blocks.

[2.9] FIXED LENGTH AND VARIABLE LENGTH DEDUPLICATION

Fixed-length chunking splits files into equally sized chunks. The chunk boundaries are based on fixed offsets such as 4, 8, or 16 KB. It uses a simple checksum-based approach to find duplicates. The process is highly constrained and offers limited advantages.

Figure: 3. Fixed v/s Variable Length Chunks
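The limitation of fixed offsets can be made concrete with a short sketch (our illustration, not taken from the paper): inserting a single byte at the front of a file shifts every fixed-length boundary, so essentially no chunk fingerprints survive the edit:

```python
import hashlib
import random

def fixed_chunks(data, size=4096):
    """Split data at fixed offsets (0, size, 2*size, ...)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def fingerprints(chunks):
    return {hashlib.sha1(c).hexdigest() for c in chunks}

random.seed(1)
original = bytes(random.randrange(256) for _ in range(16384))  # 16 KB sample
edited = b"X" + original                                       # one byte inserted

before = fingerprints(fixed_chunks(original))
after = fingerprints(fixed_chunks(edited))
print(f"{len(before & after)} of {len(before)} fingerprints survive")  # 0 of 4
```

Every boundary after the insertion point moves by one byte, so all four original fingerprints change; a content-defined scheme, shown in the next section, avoids this.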
Variable-length chunking breaks files into chunks of varying sizes, placing boundaries based on the content of the file rather than on fixed offsets [5]. This method is used as an alternative to fixed-length chunking. Because chunk boundaries are derived from the content, it is not necessary to re-chunk the entire file whenever any data gets updated. With data broken down based on content, the efficiency of the deduplication process increases, as it is easy to recognize and eliminate redundant data chunks [5].

[3] VARIABLE LENGTH CHUNKING MECHANISMS

[3.1] RABIN KARP FINGERPRINT ALGORITHM

Rabin-Karp is a variable-length chunking algorithm that uses hash functions and the rolling-hash technique. A rolling hash (also known as recursive hashing or a rolling checksum) is a hash function whose input is hashed in a window that moves through the input. A rolling hash allows an algorithm to calculate a new hash value without rehashing the entire string: for example, when searching for a word in a text, as the algorithm shifts one letter to the right, it derives the new hash from the old hash with a constant-time update. The idea behind the Rabin-Karp chunking algorithm is to define a data window of finite, predefined size, say N, on the data stream and calculate a rolling hash over the window. If the hash matches a predefined "fingerprint" pattern, the Nth element is marked as a chunk boundary; if it does not, the window slides by one element and the rolling hash is recalculated. This repeats until a match is found [10].
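A minimal content-defined chunker in this spirit might look as follows. It is a sketch under our own assumptions: a simple polynomial rolling hash stands in for Rabin's irreducible-polynomial fingerprint, and minimum/maximum chunk-size guards keep chunk sizes bounded:

```python
import hashlib
import random

WINDOW = 16                      # rolling-hash window, in bytes
MASK = (1 << 11) - 1             # boundary when hash & MASK == MASK (~2 KB average)
MIN_SIZE, MAX_SIZE = 512, 8192   # chunk-size guards

def rolling_chunks(data):
    """Content-defined chunking with a simple polynomial rolling hash
    (a sketch of the idea, not Rabin's exact fingerprint function)."""
    chunks, start, h = [], 0, 0
    p = pow(31, WINDOW - 1, 1 << 32)   # weight of the byte leaving the window
    for i, b in enumerate(data):
        if i - start >= WINDOW:
            h = (h - data[i - WINDOW] * p) & 0xFFFFFFFF  # drop outgoing byte
        h = (h * 31 + b) & 0xFFFFFFFF                    # shift in the new byte
        size = i - start + 1
        if (size >= MIN_SIZE and (h & MASK) == MASK) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])             # mark chunk boundary
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks

random.seed(0)
data = bytes(random.randrange(256) for _ in range(50000))
edited = b"X" + data   # one byte inserted at the front

fa = {hashlib.sha1(c).hexdigest() for c in rolling_chunks(data)}
fb = {hashlib.sha1(c).hexdigest() for c in rolling_chunks(edited)}
print(len(fa & fb), "of", len(fa), "chunk fingerprints unchanged")
```

Because each boundary depends only on the bytes inside the window, the one-byte insertion changes the first chunk, but most later boundaries still fall at the same content positions, so most fingerprints survive; contrast this with the fixed-offset sketch above, where none did.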
[3.2] TWO THRESHOLD TWO DIVISORS ALGORITHM

The TTTD algorithm, proposed by HP Laboratories, Palo Alto, uses four parameters: the maximum threshold, the minimum threshold, the main divisor, and the second divisor. The maximum threshold eliminates very large chunks, while the minimum threshold eliminates very small ones; together, these two parameters control the variation in chunk sizes. The main divisor is used to keep the chunk size close to the expected chunk size, and the second divisor finds a breakpoint when the main divisor cannot. Breakpoints found by the second divisor are large and close to the maximum threshold [6].

[4] PERFORMANCE ANALYSIS

When comparing file-level chunking, fixed-length chunking, and variable-length chunking, it is observed that variable-length chunking provides the best results. The table below reports the deduplicated/undeduplicated size (in %) at different chunk sizes (8, 16, 32, 64 KB) for three chunking approaches: file level, fixed length, and variable length (Rabin-Karp) [5].
Table: 1. Performance of Different Length Deduplication Processes

               8 KB    16 KB   32 KB   64 KB
File Level      -       -       -       -
Fixed Length    -       -       -       -
Rabin Karp      -       -       -       -

In fixed-length chunking, each chunk has a particular fixed length. A problem arises when the file needs to be updated: inserting a line of text shifts all subsequent chunks, so no chunk following the insertion point is preserved. Hence, simple small updates radically change the chunks and deduplication becomes an issue. In variable-length chunking, by contrast, chunks are content-based and not of uniform length. When an update takes place, only the chunk where the new line of text is inserted or deleted changes; the remaining chunks are unaltered, and this is the biggest benefit of variable-length chunking. Hence, the proposed method uses a variable-length chunking algorithm.

Table: 2. Comparison of Various Deduplication Systems

Metrics               File Level   Fixed Size   Variable Size
Deduplication Ratio   Low          Better       Good
Processing Time       Medium       Less         High

Table 2 compares the deduplication approaches on different performance metrics. The deduplication ratio indicates how many redundancies are removed, and variable-sized deduplication performs much better than the others. In terms of processing time, variable-sized deduplication is the worst, owing to expensive variable-length chunking [9]: as chunks are of variable sizes, the time taken to process them is high.

[5] ENHANCED METHOD

As explained earlier, data deduplication is a technique which stores only unique data chunks and removes the redundant ones. Owing to this efficient storage process, deduplication is preferred over traditional storage systems. This paper focuses on presenting the studied data deduplication types.
We have enhanced the existing method for data deduplication, which works by eliminating redundant data chunks as they are stored in the storage or backup. Figure 4 shows the flow diagram of the method.
Figure: 4. Flowchart of the Deduplication Method

In this method, the data (a text file) which the client intends to upload to the storage is divided into chunks using a variable-length chunking algorithm, preferably Rabin-Karp fingerprinting. A hash value is then computed for each created chunk using the SHA-1 algorithm. The reason for selecting SHA-1 over a hashing algorithm such as MD5 is that SHA-1 is efficient and more secure, since its message digest is 160 bits. The unique chunks whose hash values have been computed are stored in the storage, and the hash values are stored in the index tables. When a new file is ready to upload, the hash values of its data chunks are compared with the values stored in the index tables. If a match is found, a reference to the original data chunk is stored and the index-table entry is updated; if no match is found, the data chunk is stored directly.

To enhance security, each data chunk is encrypted before it is stored in the backup. The need for security arises chiefly when the deduplication system is implemented in the cloud: security issues can be observed when information is processed on a cloud platform, because the user who uploads a file has no control over where the file is stored. There is therefore a possibility that the cloud service provider or a third-party application can handle and access the data, and hence the need for encryption. Before storing the data, the text is encrypted and then stored in the backup or target storage. The algorithm used for encryption is the Advanced Encryption Standard (AES). Before downloading the file, all the data chunks are decrypted with the provided key, merged into a single file, and then downloaded.
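The upload and download paths of this flow can be sketched as follows. This is our illustrative reconstruction, not the paper's implementation: fixed-size chunking stands in for Rabin-Karp, and because AES is not available in the Python standard library, a SHA-256 keystream XOR (clearly insecure) stands in for AES purely to keep the sketch self-contained; a real deployment would encrypt each chunk with AES via a crypto library. The names (`upload`, `download`, `keystream_xor`) are hypothetical.

```python
import hashlib

KEY = b"demo-key"  # illustrative key; real deployments need proper key management

def keystream_xor(chunk, key, nonce):
    """Stand-in for AES: XOR with a SHA-256-derived keystream.
    NOT secure - it only marks where AES encryption/decryption happens."""
    out = bytearray()
    counter = 0
    while len(out) < len(chunk):
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(chunk, out))

store, index, recipes = {}, set(), {}

def upload(name, data, chunk_size=4096):
    recipe = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        fp = hashlib.sha1(chunk).hexdigest()   # fingerprint of the plaintext chunk
        if fp not in index:                    # index-table lookup
            index.add(fp)
            # encrypt the unique chunk before it reaches the backup storage
            store[fp] = keystream_xor(chunk, KEY, fp.encode())
        recipe.append(fp)                      # duplicate keeps only a reference
    recipes[name] = recipe

def download(name):
    # decrypt every chunk with the key, then merge into a single file
    return b"".join(keystream_xor(store[fp], KEY, fp.encode())
                    for fp in recipes[name])

data = b"hello world " * 1000
upload("report.txt", data)
assert download("report.txt") == data
```

Fingerprints are computed on the plaintext chunks so that duplicates are still detected across uploads, while the stored bytes themselves are encrypted, matching the order of operations in Figure 4.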
[6] CONCLUSION

This paper presents a study of the data deduplication process for storage systems, together with an enhanced security method using standard encryption. Data deduplication methods are used to achieve cost-effective storage and effective use of network bandwidth, while encryption protects the data from unauthorized access. The central idea lies in removing the redundancies present in the data. It is one of the emerging concepts currently being implemented by cloud providers.
REFERENCES

[1] Subhanshi Singhal and Naresh Kumar, "A Survey on Data Deduplication," International Journal on Recent and Innovation Trends in Computing and Communication, May.
[2] Bo Mao, Hong Jiang, Suzhen Wu, Lei Tian, "Leveraging Data Deduplication to Improve Performance of Primary Storage Systems in the Cloud," IEEE Transactions on Computers, Vol. 65, No. 6, June.
[3] Golthi Tharunn, Gowtham Kommineni, Sarpella Sasank Varma, Akash Singh Verma, "Data Deduplication in Cloud Storage," International Journal of Advanced Engineering and Global Technology, Vol. 03, Issue 08, August.
[4] Chaitya B. Shah, Drashti R. Panchal, "Secured Hash Algorithm-1: Review Paper," International Journal for Advanced Research in Engineering and Technology, Vol. 02, Oct.
[5] A. Venish and K. Shiva Shankar, "Study of Chunking Algorithm in Data Deduplication."
[6] BingChun Chang, "A Running Time Improvement for Two Threshold Two Divisors Algorithm," MS thesis, SJSU ScholarWorks.
[7] J. Malhotra and J. Bakal, "A Survey and Comparative Study of Data Deduplication Techniques," International Conference on Pervasive Computing, Pune.
[8] Zuhair S. Al-Sagar, Mohammed S. Saleh, Aws Zuhair Sameen, "Optimizing Cloud Storage by Data Deduplication: A Study," International Research Journal of Engineering and Technology, Vol. 02, Issue 09, Dec.
[9] Daehee Kim, Sejun Song, Baek-Young Choi, "Data Deduplication for Data Optimization for Storage and Network Systems."
[10]
More informationA multilingual reference based on cloud pattern
A multilingual reference based on cloud pattern G.Rama Rao Department of Computer science and Engineering, Christu Jyothi Institute of Technology and Science, Jangaon Abstract- With the explosive growth
More informationDeduplication: The hidden truth and what it may be costing you
Deduplication: The hidden truth and what it may be costing you Not all deduplication technologies are created equal. See why choosing the right one can save storage space by up to a factor of 10. By Adrian
More informationIMAGE COMPRESSION USING HYBRID TRANSFORM TECHNIQUE
Volume 4, No. 1, January 2013 Journal of Global Research in Computer Science RESEARCH PAPER Available Online at www.jgrcs.info IMAGE COMPRESSION USING HYBRID TRANSFORM TECHNIQUE Nikita Bansal *1, Sanjay
More informationINTELLIGENT SUPERMARKET USING APRIORI
INTELLIGENT SUPERMARKET USING APRIORI Kasturi Medhekar 1, Arpita Mishra 2, Needhi Kore 3, Nilesh Dave 4 1,2,3,4Student, 3 rd year Diploma, Computer Engineering Department, Thakur Polytechnic, Mumbai, Maharashtra,
More informationA Review on various Location Management and Update Mechanisms in Mobile Communication
International Journal of Innovation and Scientific Research ISSN 2351-8014 Vol. 2 No. 2 Jun. 2014, pp. 268-274 2014 Innovative Space of Scientific Research Journals http://www.ijisr.issr-journals.org/
More informationDesign Tradeoffs for Data Deduplication Performance in Backup Workloads
Design Tradeoffs for Data Deduplication Performance in Backup Workloads Min Fu,DanFeng,YuHua,XubinHe, Zuoning Chen *, Wen Xia,YuchengZhang,YujuanTan Huazhong University of Science and Technology Virginia
More informationLOAD BALANCING AND DEDUPLICATION
LOAD BALANCING AND DEDUPLICATION Mr.Chinmay Chikode Mr.Mehadi Badri Mr.Mohit Sarai Ms.Kshitija Ubhe ABSTRACT Load Balancing is a method of distributing workload across multiple computing resources such
More informationANALYSIS OF AES ENCRYPTION WITH ECC
ANALYSIS OF AES ENCRYPTION WITH ECC Samiksha Sharma Department of Computer Science & Engineering, DAV Institute of Engineering and Technology, Jalandhar, Punjab, India Vinay Chopra Department of Computer
More informationCLUSTERING BIG DATA USING NORMALIZATION BASED k-means ALGORITHM
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology ISSN 2320 088X IMPACT FACTOR: 5.258 IJCSMC,
More informationScale-out Object Store for PB/hr Backups and Long Term Archive April 24, 2014
Scale-out Object Store for PB/hr Backups and Long Term Archive April 24, 2014 Gideon Senderov Director, Advanced Storage Products NEC Corporation of America Long-Term Data in the Data Center (EB) 140 120
More informationFog Computing. ICTN6875: Emerging Technology. Billy Short 7/20/2016
Fog Computing ICTN6875: Emerging Technology Billy Short 7/20/2016 Abstract During my studies here at East Carolina University, I have studied and read about many different t types of emerging technologies.
More informationData Deduplication Overview and Implementation
Data Deduplication Overview and Implementation Somefun Olawale Mufutau 1, Nwala Kenneth 2, Okonji Charles 3, Omotosho Olawale Jacob 4 1 Computer Science Department Babcock University, Ilisan Remo Ogun
More informationCLIENT DATA NODE NAME NODE
Volume 6, Issue 12, December 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Efficiency
More informationHow to Reduce Data Capacity in Objectbased Storage: Dedup and More
How to Reduce Data Capacity in Objectbased Storage: Dedup and More Dong In Shin G-Cube, Inc. http://g-cube.kr Unstructured Data Explosion A big paradigm shift how to generate and consume data Transactional
More informationWide Area Networking Technologies
Wide Area Networking Technologies TABLE OF CONTENTS INTRODUCTION... 1 TASK 1... 1 1.1Critically evaluate different WAN technologies... 1 1.2 WAN traffic intensive services... 4 TASK 2... 5 1.3 and 1.4
More informationEnhance Data De-Duplication Performance With Multi-Thread Chunking Algorithm. December 9, Xinran Jiang, Jia Zhao, Jie Zheng
Enhance Data De-Duplication Performance With Multi-Thread Chunking Algorithm This paper is submitted in partial fulfillment of the requirements for Operating System Class (COEN 283) Santa Clara University
More informationInternational Journal of Scientific Research and Reviews
Research article Available online www.ijsrr.org ISSN: 2279 0543 International Journal of Scientific Research and Reviews Asymmetric Digital Signature Algorithm Based on Discrete Logarithm Concept with
More informationLamassu: Storage-Efficient Host-Side Encryption
Lamassu: Storage-Efficient Host-Side Encryption Peter Shah, Won So Advanced Technology Group 9 July, 2015 1 2015 NetApp, Inc. All rights reserved. Agenda 1) Overview 2) Security 3) Solution Architecture
More informationSHHC: A Scalable Hybrid Hash Cluster for Cloud Backup Services in Data Centers
2011 31st International Conference on Distributed Computing Systems Workshops SHHC: A Scalable Hybrid Hash Cluster for Cloud Backup Services in Data Centers Lei Xu, Jian Hu, Stephen Mkandawire and Hong
More informationHEAD HardwarE Accelerated Deduplication
HEAD HardwarE Accelerated Deduplication Final Report CS710 Computing Acceleration with FPGA December 9, 2016 Insu Jang Seikwon Kim Seonyoung Lee Executive Summary A-Z development of deduplication SW version
More informationINTRODUCTION TO XTREMIO METADATA-AWARE REPLICATION
Installing and Configuring the DM-MPIO WHITE PAPER INTRODUCTION TO XTREMIO METADATA-AWARE REPLICATION Abstract This white paper introduces XtremIO replication on X2 platforms. XtremIO replication leverages
More informationComputation of Multiple Node Disjoint Paths
Chapter 5 Computation of Multiple Node Disjoint Paths 5.1 Introduction In recent years, on demand routing protocols have attained more attention in mobile Ad Hoc networks as compared to other routing schemes
More informationComputer Based Image Algorithm For Wireless Sensor Networks To Prevent Hotspot Locating Attack
Computer Based Image Algorithm For Wireless Sensor Networks To Prevent Hotspot Locating Attack J.Anbu selvan 1, P.Bharat 2, S.Mathiyalagan 3 J.Anand 4 1, 2, 3, 4 PG Scholar, BIT, Sathyamangalam ABSTRACT:
More informationCopyright 2010 EMC Corporation. Do not Copy - All Rights Reserved.
1 Using patented high-speed inline deduplication technology, Data Domain systems identify redundant data as they are being stored, creating a storage foot print that is 10X 30X smaller on average than
More informationBackup management with D2D for HP OpenVMS
OpenVMS Technical Journal V19 Backup management with D2D for HP OpenVMS Table of contents Overview... 2 Introduction... 2 What is a D2D device?... 2 Traditional tape backup vs. D2D backup... 2 Advantages
More informationThe Design of an Anonymous and a Fair Novel E-cash System
International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 2, Number 2 (2012), pp. 103-109 International Research Publications House http://www. ripublication.com The Design of
More informationInternational Journal of Computer Engineering and Applications,
International Journal of Computer Engineering and Applications, Volume XII, Issue I, Jan. 18, www.ijcea.com ISSN 2321-3469 SECURING TEXT DATA BY HIDING IN AN IMAGE USING AES CRYPTOGRAPHY AND LSB STEGANOGRAPHY
More informationCode Compression for RISC Processors with Variable Length Instruction Encoding
Code Compression for RISC Processors with Variable Length Instruction Encoding S. S. Gupta, D. Das, S.K. Panda, R. Kumar and P. P. Chakrabarty Department of Computer Science & Engineering Indian Institute
More informationComparison of FP tree and Apriori Algorithm
International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 10, Issue 6 (June 2014), PP.78-82 Comparison of FP tree and Apriori Algorithm Prashasti
More informationVirtual Machine Placement in Cloud Computing
Indian Journal of Science and Technology, Vol 9(29), DOI: 10.17485/ijst/2016/v9i29/79768, August 2016 ISSN (Print) : 0974-6846 ISSN (Online) : 0974-5645 Virtual Machine Placement in Cloud Computing Arunkumar
More informationAn Analysis of Most Effective Virtual Machine Image Encryption Technique for Cloud Security
An Analysis of Most Effective Virtual Machine Image Encryption Technique for Cloud Security Mr. RakeshNag Dasari Research Scholar, Department of computer science & Engineering, KL University, Green Fields,
More informationSURVEY ON SMART ANALYSIS OF CCTV SURVEILLANCE
International Journal of Computer Engineering and Applications, Volume XI, Special Issue, May 17, www.ijcea.com ISSN 2321-3469 SURVEY ON SMART ANALYSIS OF CCTV SURVEILLANCE Nikita Chavan 1,Mehzabin Shaikh
More informationThe Power of Prediction: Cloud Bandwidth and Cost Reduction
The Power of Prediction: Cloud Bandwidth and Cost Reduction Eyal Zohar Israel Cidon Technion Osnat(Ossi) Mokryn Tel-Aviv College Traffic Redundancy Elimination (TRE) Traffic redundancy stems from downloading
More informationDeduplication File System & Course Review
Deduplication File System & Course Review Kai Li 12/13/13 Topics u Deduplication File System u Review 12/13/13 2 Storage Tiers of A Tradi/onal Data Center $$$$ Mirrored storage $$$ Dedicated Fibre Clients
More informationVIDEO CLONE DETECTOR USING HADOOP
VIDEO CLONE DETECTOR USING HADOOP Abstract Ms. Nikita Bhoir 1 Ms. Akshata Kolekar 2 Internet has become the most important part in the people s day-to-day life. It is widely used due to abundant resources.
More informationISSN Vol.08,Issue.16, October-2016, Pages:
ISSN 2348 2370 Vol.08,Issue.16, October-2016, Pages:3146-3152 www.ijatir.org Public Integrity Auditing for Shared Dynamic Cloud Data with Group User Revocation VEDIRE AJAYANI 1, K. TULASI 2, DR P. SUNITHA
More informationReducing The De-linearization of Data Placement to Improve Deduplication Performance
Reducing The De-linearization of Data Placement to Improve Deduplication Performance Yujuan Tan 1, Zhichao Yan 2, Dan Feng 2, E. H.-M. Sha 1,3 1 School of Computer Science & Technology, Chongqing University
More informationPacket Classification Using Standard Access Control List
Packet Classification Using Standard Access Control List S.Mythrei 1, R.Dharmaraj 2 PG Student, Dept of CSE, Sri Vidya College of engineering and technology, Virudhunagar, Tamilnadu, India 1 Research Scholar,
More informationBenchmarking results of SMIP project software components
Benchmarking results of SMIP project software components NAILabs September 15, 23 1 Introduction As packets are processed by high-speed security gateways and firewall devices, it is critical that system
More informationStrategies for Single Instance Storage. Michael Fahey Hitachi Data Systems
Strategies for Single Instance Storage Michael Fahey Hitachi Data Systems Abstract Single Instance Strategies for Storage Single Instance Storage has become a very popular topic in the industry because
More informationEfficient Load Balancing and Disk Failure Avoidance Approach Using Restful Web Services
Efficient Load Balancing and Disk Failure Avoidance Approach Using Restful Web Services Neha Shiraz, Dr. Parikshit N. Mahalle Persuing M.E, Department of Computer Engineering, Smt. Kashibai Navale College
More informationA Novel Spatial Domain Invisible Watermarking Technique Using Canny Edge Detector
A Novel Spatial Domain Invisible Watermarking Technique Using Canny Edge Detector 1 Vardaini M.Tech, Department of Rayat Institute of Engineering and information Technology, Railmajra 2 Anudeep Goraya
More informationHash-Based String Matching Algorithm For Network Intrusion Prevention systems (NIPS)
Hash-Based String Matching Algorithm For Network Intrusion Prevention systems (NIPS) VINOD. O & B. M. SAGAR ISE Department, R.V.College of Engineering, Bangalore-560059, INDIA Email Id :vinod.goutham@gmail.com,sagar.bm@gmail.com
More informationParametric Search using In-memory Auxiliary Index
Parametric Search using In-memory Auxiliary Index Nishant Verman and Jaideep Ravela Stanford University, Stanford, CA {nishant, ravela}@stanford.edu Abstract In this paper we analyze the performance of
More informationCloud computing is an emerging IT paradigm that provides
JOURNAL OF L A T E X CLASS FILES, VOL. 6, NO. 1, JANUARY 27 1 CoRE: Cooperative End-to-End Traffic Redundancy Elimination for Reducing Cloud Bandwidth Cost Lei Yu, Haiying Shen, Karan Sapra, Lin Ye and
More information