Multi-level Byte Index Chunking Mechanism for File Synchronization


Ider Lkhagvasuren, Jung Min So, Jeong Gun Lee, Jin Kim and Young Woong Ko*
Dept. of Computer Engineering, Hallym University, Chuncheon, Korea
{Ider555, jso, jeonggun.lee, jinkim, yuko}@hallym.ac.kr

Abstract

In this paper, we propose a probabilistic algorithm for detecting duplicated data blocks over a low-bandwidth network. The algorithm identifies duplicated regions of the destination file and sends only the non-duplicated regions of data. The proposed system produces two Index-tables for a file, with chunk sizes of 4MB and 32KB, respectively. At the first level, the client rapidly detects large identical data blocks with the 4MB Index-table using the byte-index chunking approach. At the second level, we perform byte-index chunking with the 32KB Index-table over the non-duplicated areas left by the first-level similarity detection. This yields more accurate deduplication without consuming much additional time, because the second-level work is restricted to the non-duplicated areas. Experimental results show that the proposed approach reduces processing time significantly compared to fixed-size chunking, while its deduplication rate is as high as that of variable-size chunking.

Keywords: Deduplication, Cloud storage, Chunk, Index-table, Anchor byte

1. Introduction

The explosive growth of digital data causes storage crises, and data deduplication has become one of the hottest topics in data storage. Data deduplication is a method of reducing required storage capacity by eliminating duplicated data; by adopting it, we can store more files in the same capacity than before. Most existing deduplication solutions aim to remove duplicate data from storage systems using traditional chunk-level deduplication strategies. In Content-defined Chunking [1], block boundaries are determined by anchors based on the data patterns, which avoids the data-shifting problem of the Static Chunking approach. One well-known Content-defined Chunking system is LBFS [2], a network file system designed for low-bandwidth networks. Content-defined deduplication can achieve a high deduplication ratio, but it requires much more time than the other deduplication approaches. Static Chunking [3] is the fastest algorithm for detecting duplicated blocks, but its detection performance suffers from the boundary-shifting problem.

The main idea of this paper is to look up, in the destination file, the regions that are duplicated against the source file with high probability. The fundamental structure that expedites this search is a table named the Index-table (size 256×256). This two-dimensional matrix is used on the server as a reference for file chunks during the lookup process.

* Corresponding author: yuko@hallym.ac.kr

The server's file chunks are stored in the Index-table, indexed by their anchor byte values; each cell also stores metadata (the server file's chunk hashes and their indexes).

In this work, our key idea is to adapt multi-level byte-index chunking to large files. Byte-index chunking shows efficient and improved performance for small and medium sized files, but insufficient performance for large files exceeding several gigabytes. We therefore exploit a multi-level approach to enhance byte-index chunking. The proposed scheme divides files into two groups by file size. If the file size is over 5GB (in the current implementation), we use both a 4MB large Index-table and a 32KB small Index-table to accelerate duplicate detection; if the file size is below 5GB, we perform detection using only the 32KB Index-table. This separation has a large impact on performance for large data files. When looking for identical data in a large file (over 5GB), we start the lookup process (high-level byte-index chunking) with the 4MB Index-table. If we find a chunk that is likely to be duplicated using the 4MB Index-table, we then start low-level byte-index chunking with small chunks on the 32KB Index-table. The strategy of the proposed system is to find large duplicated regions first and perform the detailed deduplication afterwards.

The rest of this paper is organized as follows. In Section 2, we describe related work on data deduplication systems. In Section 3, we explain the design principles of the proposed byte-index chunking system and its implementation details. In Section 4, we show the performance evaluation results; we then conclude and discuss future research plans.

2. Related Works

There are several different data deduplication algorithms [3-7]: Static Chunking (SC), Content-defined Chunking (CDC), Whole-file Chunking (WFC), and delta encoding. Static Chunking divides files into a number of fixed-size blocks and then applies a hash function to create a hash key for each block. Venti [3] is a network storage system using Static Chunking, where a 160-bit SHA-1 hash key is used as the address of the data. This enforces a write-once policy, since no other data block can have the same address; the addresses of multiple writes of the same data are identical, so duplicate data is easily identified and each data block is stored only once. The main limitation of Static Chunking is the boundary-shift problem. For example, when new data is added to a file, all subsequent blocks in the file are rewritten and are likely to be considered different from those in the original file. It is therefore difficult to find duplicated blocks, which degrades deduplication performance.

In Content-defined Chunking, block boundaries are determined by anchors based on the data patterns, which avoids the data-shifting problem of the Static Chunking approach. One well-known Content-defined Chunking system is LBFS [2], a network file system designed for low-bandwidth networks. LBFS exploits similarities between files, or between versions of the same file, to save bandwidth: it avoids sending data over the network when the same data can already be found in the server's file system or the client's cache. Using this technique, LBFS achieves up to two orders of magnitude reduction in bandwidth utilization on common workloads compared to traditional network file systems.
Delta encoding [4] stores data in the form of differences between sequential versions of the data. Many backup systems adopt this scheme in order to give their users previous versions of the same file from earlier backups. This reduces the amount of data that has to be stored for the differing versions, as well as the cost of uploading each file that has been updated. The DRED system [5][6] uses the delta-encoding technique to implement a deduplication service, and can efficiently remove duplicated data in Web pages.
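As an illustration of the general idea (not the DRED algorithm itself), the following minimal Python sketch encodes a new version as copy/insert operations against the old version; all function names are our own.

```python
import difflib

def encode_delta(old: bytes, new: bytes):
    """Encode `new` as a list of copy/insert operations against `old`."""
    ops = []
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(
            a=old, b=new, autojunk=False).get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))   # reuse a run of old bytes
        elif j2 > j1:
            ops.append(("insert", new[j1:j2]))  # store only the new bytes
    return ops

def decode_delta(old: bytes, ops) -> bytes:
    """Rebuild the new version from the old version plus the delta."""
    out = bytearray()
    for op in ops:
        out += old[op[1]:op[1] + op[2]] if op[0] == "copy" else op[1]
    return bytes(out)
```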

3. System Design and Implementation

Figure 1 describes the overall structure of the proposed system, which performs data deduplication between the server and the client. In this work, we implemented a deduplication server employing a source-based approach that uses a refined and improved byte-index technique. The server maintains the metadata shared between server and client, and produces double Index-tables (size: 256×256) with chunk sizes of 4MB and 32KB (the 4MB table is used when the file size exceeds 5GB). To support the lookup process that finds data with a high probability of duplication, the Index-table provides the key information in a 256×256 table structure: it keeps chunk numbers, and each chunk's edge byte values are used as the row and column numbers of its cell. From this information we can very quickly find the parts of the data blocks that have a very high probability of being duplicates, which eventually allows efficient data deduplication that exploits the data file pattern, as shown in Figure 1.

Figure 1. System Architecture Overview: Multi-level Byte Index Approach

Suppose we have a file to synchronize. The file is divided into fixed-length blocks, and for each block we save on the server its SHA-1 hash value, its sequence number used as a reference (chunk-index), and the values of its two edge bytes (the left and right boundary bytes). As can be seen in Figure 2, the proposed system treats these edge bytes as anchor points.

Figure 2. Overview of Chunking with Anchor Points
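The chunking step can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation; the function and field names are our own.

```python
import hashlib

def build_chunk_metadata(path, chunk_size=32 * 1024):
    """Split the file into fixed-length chunks; for each chunk record its
    sequence number (chunk-index), SHA-1 hash, and the two anchor bytes."""
    metadata = []
    with open(path, "rb") as f:
        index = 0
        while True:
            chunk = f.read(chunk_size)
            if len(chunk) < chunk_size:     # drop the short tail for simplicity
                break
            metadata.append({
                "index": index,
                "sha1": hashlib.sha1(chunk).hexdigest(),
                "left": chunk[0],           # left anchor byte (0..255)
                "right": chunk[-1],         # right anchor byte (0..255)
            })
            index += 1
    return metadata
```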

After the chunking process, the proposed system builds the 256×256 Index-table for the synchronized file. For every chunk of the file, its chunk-index is placed into the corresponding cell of the Index-table: the value of the first anchor point (left edge byte) gives the horizontal index, and the value of the last anchor point (right edge byte) gives the vertical index into the Index-table.

Figure 3. Overview of Filling the Index-table

Figure 3 shows how the Index-table is filled with the reference points of the chunks. Although the server creates a full metadata list (chunk index, chunk hash, and the anchor-point values of each chunk), only the Index-table is sent to the client.

Predicting Duplicated Data in the Look-up Process

In the look-up process, the system aims to find the chunks that are expected to be duplicated ("highly probable duplicate chunks") using the Index-table. To make the search results more accurate, we do not search for chunks one by one; instead, at each offset in the modified file we look for two adjacent chunks at once. If the anchor bytes predict that some region of the destination file corresponds to two adjacent chunks, it means that this region not only has the same length as the corresponding chunks of the source file, but also stores the same byte values at every chunk boundary position. If we looked up just a single chunk with the Index-table (using only two anchor bytes), there would be plenty of offsets that are not duplicates at all, making the similarity process wasteful and adding unnecessary data processing. By contrast, it is a very rare occasion that a non-duplicate region passes the prediction test when we look up two adjacent chunks; the search becomes more accurate and avoids unnecessary time consumption. That is why we call such a region of the destination file a highly probable duplicate chunk.
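The Index-table construction of Figure 3 can be sketched as follows, reusing the metadata fields from the previous sketch. Cells hold lists so that chunks with identical anchor pairs are all preserved; this is our simplification, not necessarily the authors' layout.

```python
def build_index_table(metadata):
    """256x256 Index-table: the cell [left][right] collects the indexes of
    every chunk whose edge bytes have those values."""
    table = [[[] for _ in range(256)] for _ in range(256)]
    for m in metadata:
        table[m["left"]][m["right"]].append(m["index"])
    return table
```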

Figure 4. Duplicated Chunk Look-up Process Overview

The probability that a non-duplicate region passes this test by chance is only one in 4,294,967,296 (256^4) cases, since all four anchor bytes must match. Even so, a highly probable duplicate chunk is only likely to be a duplicate; after predicting it, we confirm whether it is actually duplicated by its SHA-1 value.

When the client receives the Index-table from the server, the whole-file lookup process starts to determine the highly probable duplicate chunks in the modified (destination) file. The process reads the destination file from beginning to end. At each reading step, i.e., at offset position i (the first boundary byte of the first chunk), we also read three more bytes: position i+k-1 (the last boundary byte of the first chunk), position i+k (the first boundary byte of the second chunk), and position i+2k-1 (the last anchor byte of the second chunk), where k is the chunk size.

Figure 5. Looking up Chunks which Might be Duplicated in the Modified File

The lookup process then performs as follows. First, the system takes the four anchor bytes (shown in Figure 5) and checks whether the Index-table cell addressed by the first two anchor bytes stores any value, to confirm that a first chunk index is present in the Index-table. If the corresponding cell does not store any value, it is safe to say that the current offset (the first boundary byte position of the first chunk) cannot be the first byte of a duplicate chunk, so the system shifts one byte to the right and continues the lookup. If the cell does store index values, we consider it a possible duplicate. The system then checks the second chunk boundary (the adjacent chunk) against the Index-table in the same way. If the cell indexed by the second chunk's boundary byte values does not store any value, the current offset is unlikely to be the first byte of a duplicate chunk, and we again shift one byte to the right. If that cell does contain chunk index values, we check whether any index found there is adjacent to one of the indexes found previously (in the cell for the first chunk's boundary bytes). If none are adjacent, the current offset is not the beginning of duplicate chunks, and we shift one byte to the right. If adjacent indexes are found, we regard this as data with a high probability of being a duplicate: we put the two adjacent chunks (whose first boundary byte begins at the current offset) into our suspicious list (the list of chunks determined to have a high probability of being duplicates), and shift the lookup position by 2k to look for the next chunks. This gives us plenty of opportunity to save time by avoiding reading unnecessary bytes.
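The double-anchor lookup loop just described might be sketched as follows; again this is illustrative Python with invented names, where `data` is the destination file's bytes, `table` is the Index-table from the previous sketch, and `k` is the chunk size.

```python
def find_suspicious_chunks(data, table, k):
    """Scan the destination file for pairs of adjacent chunks that are
    highly probable duplicates, using four anchor bytes per offset."""
    suspects = []              # (offset, first_chunk_index) pairs to verify via SHA-1
    i = 0
    while i + 2 * k <= len(data):
        first = table[data[i]][data[i + k - 1]]            # candidates for the first chunk
        second = table[data[i + k]][data[i + 2 * k - 1]]   # candidates for the adjacent chunk
        if first and second:
            adjacent = [j for j in first if j + 1 in second]   # adjacency test
            if adjacent:
                suspects.append((i, adjacent[0]))
                i += 2 * k     # skip past both candidate chunks
                continue
        i += 1                 # no candidate here: shift one byte right
    return suspects
```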

Although we determine the chunks that might be duplicated, this does not necessarily mean the chunks are duplicated precisely; that is confirmed by checking their hash values against the original file's chunk hashes. The client therefore computes the hashes of the suspicious chunks and sends them to the server to verify the duplication. The server receives the hash values and compares them with the corresponding chunk hashes of the original file to confirm whether each chunk is duplicated or not. If any suspicious chunk proves not to be a duplicate (an occasion with very low likelihood), the server sends the chunk back to the client for another lookup step. The client receives the non-duplicated chunk and performs the lookup process only within the range of that chunk, in the same way as before. This cycle of lookup and confirmation of suspicious chunks repeats until no non-duplicate chunks come back from the server or no suspicious chunks remain; note that the repeated passes cover only a limited range of data.

Multi-Level Byte Index Chunking

In the proposed system, double Index-tables are used when the file size is over a fixed threshold. The chunk size has an appreciable influence on the result of any deduplication approach: setting the chunk size too large accelerates deduplication but decreases the amount of identical data detected in the file, whereas a small chunk size gives a long processing time but a high detection rate of identical data. Therefore, the proposed system mixes small chunks and big chunks when a large file is being synchronized.

Figure 6. File Similarity Detection Process of the Multi-level Byte-Index Chunking Approach

When the synchronized file size is over 5GB, we use the multi-level byte-index chunking approach. The proposed system produces double Index-tables for the file, with chunk sizes of 4MB and 32KB respectively. At the first level, the client detects big identical regions very quickly, using the 4MB Index-table with the byte-index chunking approach; in fact, shifting by 4MB in the lookup process is what affords the fast performance. At the second level, we perform byte-index chunking with the 32KB Index-table over all the non-duplicated data left by the first-level similarity detection. This yields more accurate deduplication without consuming much additional time, because the work is restricted to the non-duplicated areas.

System Overview and Implementation

The main goal of the proposed system is to transfer only non-overlapping chunks while circumventing boundary sensitivity. The server performs a few processes in idle mode, which are already done before a file deduplication request arrives from a client: indexing the server file, calculating the hashes, and moving them into the server database are carried out first, while the system is idle.

Figure 7. Multi-level Byte-Index based Deduplication System when the File Size is Lower than 5GB

Figure 7 presents the process flow of general file deduplication (file size lower than 5GB). When the server receives a deduplication request from the client, it examines every chunk of the requested file, takes the chunk number index of each, and puts them into the Index-table; that is, it distributes all the chunk numbers of the requested file from the database into the 256×256 table (Index-table) described in the previous sections. After that, the server sends the Index-table to the client. The client receives the Index-table and examines the bytes of the client file to look for chunks with a high probability of duplication, and computes hash values for the probable duplicate chunks found by the lookup. The client sends these hash values to the server, which confirms by hash comparison whether the highly probable chunks are actually duplicated. If any chunk proves not to be duplicated, the non-duplicated chunk is sent back to the client for re-examination, to find other probable duplicate chunks within the range of that non-duplicated chunk only.

Figure 8 describes the deduplication process when the file size is over 5GB. The difference from the general process is that the proposed system performs the lookup twice, using the double Index-tables, to determine highly probable duplicated data. In other words, once the system finishes identifying duplicated data in the destination file using the 4MB-chunk Index-table, it repeats the previous process using the 32KB-chunk Index-table, but only on the non-duplicated regions identified by the first detection. Further processing is performed in the same way as the general case described previously.

Figure 8. Multi-level Byte-Index based Deduplication System when the File Size is over 5GB
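Combining the pieces, a minimal sketch of the two-level pass of Figures 7 and 8, reusing `find_suspicious_chunks` from the earlier sketch; the 5GB threshold and the chunk sizes follow the text, while the range bookkeeping and the merged result list are our own simplification.

```python
LARGE_FILE_THRESHOLD = 5 * 1024**3            # 5GB split point from the paper
LARGE_CHUNK, SMALL_CHUNK = 4 * 1024**2, 32 * 1024

def deduplicate(data, table_4mb, table_32kb):
    """First pass: 4MB chunks find big duplicate regions quickly.
    Second pass: 32KB chunks rescan only the regions left unmatched."""
    if len(data) <= LARGE_FILE_THRESHOLD:
        return find_suspicious_chunks(data, table_32kb, SMALL_CHUNK)

    coarse = find_suspicious_chunks(data, table_4mb, LARGE_CHUNK)
    suspects = list(coarse)
    # Each coarse hit covers two adjacent 4MB chunks starting at its offset.
    covered = sorted((off, off + 2 * LARGE_CHUNK) for off, _ in coarse)
    pos = 0
    for start, end in covered + [(len(data), len(data))]:
        if start > pos:  # a gap the coarse pass could not match
            for off, idx in find_suspicious_chunks(data[pos:start],
                                                   table_32kb, SMALL_CHUNK):
                suspects.append((pos + off, idx))
        pos = max(pos, end)
    return suspects
```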

4. Performance Evaluation

This section discusses the evaluation results of the proposed system through several experiments. First, we examined the deduplication capability of the proposed system in comparison with the content-defined chunking and fixed-size chunking approaches. Next, we measured the deduplication time of the several approaches under the same workloads. The server and client platforms consist of a 3 GHz Pentium 4 processor, a WD JS hard disk, and a 100 Mbps network. The software is implemented on Linux (Fedora Core 9). To perform a comprehensive analysis, we implemented several deduplication algorithms for comparison, including fixed-length chunking and content-defined chunking. We made the experimental data set by modifying a file in a random manner: we modified a data file with the lseek() function on Linux at randomly generated file offsets, and applied a patch to produce each test data file.

Table 1. Amount of Modified Data (version files of a given older version)

Number   Data Size   New Data   Overlap (%)
1                    634 MB
2                    1091 MB
3                    1613 MB
4                    2252 MB
5                    2727 MB
6                    3146 MB
7                    3715 MB
8                    4180 MB
9                    4702 MB

In Figure 9, we performed a deduplication experiment on deduplication capability while varying the duplication rate, examining how much duplication each chunking approach finds between the files in the workload. From the deduplication graph in Figure 9, we can see that the content-defined chunking approach, the fixed-size chunking approach, and byte-index chunking could each find the data redundancy between the original file and a file modified by a given percentage. Relative to the actual overlap of the modified file shown in the graph, the content-defined chunking approach produces the deduplication ratio closest to the true overlap; there is almost no difference between the actual overlapping amount and the amount found by content-defined chunking. The fixed-size chunking approach shows the lowest deduplication ratio because it is vulnerable to shifts inside the data stream. The multi-level byte-index chunking approach shows a much better result than fixed-size chunking, though it does not reach as high a ratio as content-defined chunking.

Figure 9. Deduplication Result Varying Overlapped Data Size

We also measured the deduplication speed while varying the modification percentage (Figure 10). We can see that the content-defined chunking approach takes the longest time, while the multi-level byte-index chunking approach is only slightly slower than fixed-size chunking in this experiment. From these results, we can conclude that multi-level byte-index chunking is a very practical approach compared to several well-known data deduplication algorithms.

Figure 10. Deduplication Performance Time Varying Overlapped Data Size

5. Conclusion

In this paper, we propose multi-level byte-index chunking for detecting duplicated data in large-scale network systems. The algorithm identifies the duplicated regions of a file and sends only the non-duplicated parts of the data. The key idea is to adapt the byte-index chunking approach for efficient metadata handling: files are classified by size, and regions highly likely to contain identical data blocks are predicted quickly using two Index-tables with different chunk sizes. The lookup process shifts by either a single byte or twice the chunk size, depending on whether a probable chunk exists at the current offset. The multi-level byte-index chunking approach can achieve a deduplication amount comparable to the content-defined approach.

Several issues remain open. First, our work is limited to simple data files whose redundant data blocks have spatial locality; if a file has many scattered modifications, overall performance will degrade. For future work, we plan to build a massive deduplication system with a huge number of files; in that case, handling the file similarity information needs a more elaborate scheme.

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2012R1A1A).

References

[1] K. Eshghi and H. Tang, "A framework for analyzing and improving content-based chunking algorithms", Hewlett-Packard Labs Technical Report TR, vol. 30, (2005).
[2] A. Muthitacharoen, B. Chen and D. Mazieres, "A low-bandwidth network file system", ACM SIGOPS Operating Systems Review, vol. 35, no. 5, (2001).
[3] S. Quinlan and S. Dorward, "Venti: a new approach to archival storage", Proceedings of the FAST 2002 Conference on File and Storage Technologies, (2002).
[4] M. Ajtai, R. Burns, R. Fagin, D. D. E. Long and L. Stockmeyer, "Compactly encoding unstructured inputs with differential compression", Journal of the Association for Computing Machinery, vol. 49, no. 3, (2002).
[5] F. Douglis and A. Iyengar, "Application-specific delta-encoding via resemblance detection", Proceedings of the USENIX Annual Technical Conference, (2003).
[6] P. Kulkarni, F. Douglis, J. LaVoie and J. Tracey, "Redundancy elimination within large collections of files", Proceedings of the USENIX Annual Technical Conference, USENIX Association, (2004).
[7] H. M. Jung, S. Y. Park, J. G. Lee and Y. W. Ko, "Efficient Data Deduplication System Considering File Modification Pattern", International Journal of Security and Its Applications, vol. 6, no. 2, (2012).

Authors

Ider Lkhagvasuren graduated from the Mongolian University of Science and Technology, and received his master's degree from the Department of Computer Engineering, Hallym University. He is currently working as a researcher at Hallym University. His research interests include data deduplication and cloud systems.

Jungmin So received the B.S. degree in computer engineering from Seoul National University in 2001, and the Ph.D. degree in computer science from the University of Illinois at Urbana-Champaign. He is currently an assistant professor in the Department of Computer Engineering, Hallym University. His research interests include wireless networking and mobile computing.

Jeong-Gun Lee received the B.S. degree in computer engineering from Hallym University in 1996, and the M.S. (1998) and Ph.D. degrees from the Gwangju Institute of Science and Technology (GIST), Korea. He is currently an assistant professor in the Computer Engineering department at Hallym University.

Jin Kim received an M.S. degree in computer science from the College of Engineering at Michigan State University in 1990, and a Ph.D. degree from Michigan State University in 1996. Since then he has been working as a professor of computer engineering at Hallym University. His research includes bioinformatics and data mining.

Young Woong Ko received both an M.S. and a Ph.D. in computer science from Korea University, Seoul, Korea, in 1999 and 2003, respectively. He is now a professor in the Department of Computer Engineering, Hallym University, Korea. His research interests include operating systems, embedded systems and multimedia systems.
