Data deduplication for Similar Files

Int'l Conf. Scientific Computing | CSC'17

Mohamad Zaini Nurshafiqah, Nozomi Miyamoto, Hikari Yoshii, Riichi Kodama, Itaru Koike, Toshiyuki Kinoshita
School of Computer Science, Tokyo University of Technology, Hachioji, Tokyo, Japan

Abstract - Recently, massive data growth and duplicate data in enterprise systems have led to the use of deduplication techniques. Deduplication is a powerful storage minimization technique that can be adopted to manage the maintenance issues caused by data growth. The target files for deduplication are divided into several parts (each part is called a block), and any duplicate blocks are eliminated. In the variable-length block method, a particular bit pattern (called a singularity) is used to decide the breakpoints of the blocks. Since multimedia data such as audio and image data are very large, the demand for data compression is high because of the space it saves. In this research, we extract files with high similarity using file similarity determination, and propose a similar file extraction method that eliminates duplication only for these files. Morphological analysis and cosine similarity are used to determine the similarity of text files. By deduplicating only files with high similarity, the time required for deduplication can be minimized without reducing the effect of deduplication too much. Experimental results confirmed that if deduplication is performed on files with a similarity of 0.5 or more, the reduction of the deduplication rate is suppressed to about 8% and the processing time can be shortened to about 1/3.3.

Keywords: data deduplication, variable-length block, similar file extraction, deduplication rate

1 Introduction

In recent years, the volume of file data in enterprise systems has greatly increased because of the growing popularity of multimedia data such as audio, animation, and video. Among these multimedia files, many exactly or nearly identical files may exist, and deduplication techniques are needed to minimize the file data volume by eliminating redundant data. Data deduplication is a file compaction technique commonly used in enterprise systems; it removes duplicates within and across files. The general concept has been successfully applied to file backup, virtual machine storage, WAN replication, and so on.

Data deduplication is a process that calculates the similarity of record pairs and merges them if similarity is detected. It can remove a huge amount of data by eliminating overlapping (redundant) data in large-scale servers or data storage. With data deduplication, only one representative of two or more overlapping data files, or of the identical areas of similar data, is preserved, while the overlapping data is replaced with links that point to the representative data (as shown in Figure 1). Replacing multiple overlapping data with links can greatly reduce the data storage size. As a result, the efficiency of data storage improves and the cost of data maintenance and storage decreases.

In this research, we extract files with high similarity using file similarity determination, and propose a similar file extraction method that eliminates duplication only for these files. The method improves the efficiency of deduplication by decomposing sentences into words using morphological analysis, extracting similar files by similarity determination using cosine similarity, and deduplicating only the similar files. Experiments confirmed that the similar file extraction method is an effective way to shorten the time required for deduplication without decreasing the effect of deduplication too much.

Fig. 1 Concept of data deduplication

2 Data deduplication

2.1 Two types of block

In the data deduplication technique, the target files are divided into several parts, each of which is called a block, and any duplicate blocks are eliminated. Because the files are divided into blocks and duplicates are searched for block by block, files do not need to be strictly identical, and high deduplication efficiency can be achieved.

There are two types of block: the fixed-length block, whose length is constant, and the variable-length block, whose length can change. In the fixed-length block method, when some data are inserted into a file, the blocks after the insertion position are shifted and can no longer be recognized as duplicates of the blocks in the original data (Figure 2 (a)). In the variable-length block method, on the other hand, even if some data are inserted into the file, the block length is adjusted to absorb the insertion, so the blocks after the insertion position can still be recognized as duplicate blocks and deduplicated (Figure 2 (b)). Thus, in the variable-length block method, the effect of deduplication is maintained even if some data have been inserted or deleted.

Figure 3 shows how the breakpoints of variable-length blocks are detected efficiently. First, the hash value of a small part of constant length, called a window, is calculated. The window indicates a candidate for the breakpoint of the block. When a particular bit pattern, called a singularity, is included in the hash value of the window, the candidate becomes a real breakpoint of the block. When the singularity is not included in the hash value, the candidate does not become a breakpoint. The effect of deduplication is affected by the singularity, especially by the singularity size. In our previous works [5] [6], we found that the optimum singularity size is 15 bits and the optimum window size is around 32 bytes. In this study, we used the same parameters.
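The weakness of fixed-length blocks described above can be illustrated with a small sketch. It is not the authors' implementation: the helper names `fixed_blocks` and `block_digests`, the 64-byte block length, the synthetic counter data, and the use of SHA-256 to identify blocks are all assumptions made for illustration.

```python
import hashlib


def fixed_blocks(data: bytes, block_len: int) -> list[bytes]:
    """Cut data into consecutive blocks of block_len bytes (the last may be shorter)."""
    return [data[i:i + block_len] for i in range(0, len(data), block_len)]


def block_digests(blocks: list[bytes]) -> set[str]:
    """Identify each block by its SHA-256 digest so that equal blocks compare equal."""
    return {hashlib.sha256(b).hexdigest() for b in blocks}


# Sample data (assumed): 1,024 four-byte counters, so all 64-byte blocks are distinct.
original = b"".join(i.to_bytes(4, "big") for i in range(1024))
modified = b"X" + original  # one byte inserted at the very beginning

before = block_digests(fixed_blocks(original, 64))
after = block_digests(fixed_blocks(modified, 64))
shared = before & after

print(len(before), "blocks before,", len(shared), "still recognized after the insertion")
```

With this sample data a single inserted byte shifts every block boundary, so none of the 64 original blocks is found again, which is the situation shown in Figure 2 (a).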
2.2 Variable-length block

The Rabin-Karp string search algorithm is used to find the singularity in the hash value of the window. The following parameters are used in the algorithm.

(1) Minimum file size (default: 40 bytes). Files smaller than this size are not targeted for deduplication.
(2) Minimum block length (default: 4,000 bytes)
(3) Maximum block length (default: 16,000 bytes)
(4) Window size (default: 32 bytes). The window is the unit for calculating a hash value.
(5) Singularity

Fig. 2 Two types of block: (a) Fixed-length block, (b) Variable-length block

When the singularity is included in the bit pattern of the hash value of the window, a breakpoint is found and a block is generated at this position. For a file whose size is between the minimum file size and the minimum block length, the whole file becomes a single block. When a file is larger than the minimum block length, a breakpoint is searched for.

As shown in Figure 3, the breakpoint search first creates a hash value for the window located at the minimum block length and checks whether it includes the singularity. When the hash value includes the singularity, a breakpoint is found and a block is generated at this position. When the hash value does not include the singularity, the window is shifted by one byte and the search is repeated. When no breakpoint has been found by the time the maximum block length is reached, a block of maximum length is generated at this position.

In the variable-length block method the block length can change, and the maximum and minimum block lengths are set so that extremely large or small blocks are not generated. The effect of deduplication is also affected by these maximum and minimum block lengths.
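The breakpoint search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the names `window_hash` and `split_blocks`, the hash base and modulus, and the reading of "the hash value includes the singularity" as "the low-order bits of the window hash equal a fixed pattern" are all hypothetical choices. The demonstration at the end uses smaller parameters than the defaults so that it runs quickly on a small random sample.

```python
import random

BASE = 257             # polynomial base of the rolling hash (assumed)
MOD = (1 << 61) - 1    # large prime modulus (assumed)


def window_hash(window: bytes) -> int:
    """Polynomial (Rabin-Karp style) hash of one window."""
    h = 0
    for byte in window:
        h = (h * BASE + byte) % MOD
    return h


def split_blocks(data: bytes, min_file: int = 40, min_len: int = 4000,
                 max_len: int = 16000, window: int = 32,
                 sing_bits: int = 15, singularity: int = 0) -> list[bytes]:
    """Split data into variable-length blocks at content-defined breakpoints."""
    assert window <= min_len < max_len
    if len(data) < min_file:
        return []                          # too small: not a deduplication target
    mask = (1 << sing_bits) - 1
    top = pow(BASE, window - 1, MOD)       # weight of the byte leaving the window
    blocks, start, n = [], 0, len(data)
    while n - start > min_len:
        end = start + min_len              # first candidate breakpoint
        limit = min(start + max_len, n)    # forced breakpoint at the maximum length
        h = window_hash(data[end - window:end])
        while end < limit and (h & mask) != singularity:
            # Shift the window by one byte and update the hash incrementally.
            h = ((h - data[end - window] * top) * BASE + data[end]) % MOD
            end += 1
        blocks.append(data[start:end])
        start = end
    if start < n:
        blocks.append(data[start:])        # remainder no longer than min_len
    return blocks


rng = random.Random(0)
sample = bytes(rng.getrandbits(8) for _ in range(50_000))
blocks = split_blocks(sample, min_len=400, max_len=1600, sing_bits=8)
print(len(blocks), min(map(len, blocks[:-1])), max(map(len, blocks)))
```

Because each breakpoint depends only on the bytes inside the window, an insertion early in a file tends to disturb only the blocks near it, whereas fixed-length boundaries all shift.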

Fig. 3 Breakpoint search

3 Related works

The effect of deduplication in the fixed-length block method with block lengths of 4 to 16 Kbytes was investigated in [1], and the efficiency of the variable-length method was discussed in [3]. The differences in the effect of deduplication between the variable-length block method and the fixed-length block method were reported in [2] for block lengths larger than 4 Kbytes and discussed in [4] for block lengths smaller than 4 Kbytes. In [5], a double layered deduplication method that combines the fixed-length method and the variable-length method was proposed. These studies investigated how the block length affects the effect of deduplication. In our previous work [6][7], we analyzed the relationship between the singularity size and the deduplication rate. In [8] and [9], we studied the efficiency of deduplication for firmware files and audio data files, respectively. In this research, we propose a method that shortens the processing time of deduplication without reducing the effect of duplicate elimination too much, by performing deduplication only on similar files.

4 Similar file extraction method

The process of deduplication consists of the following steps:

(1) Create blocks
(2) Detect block duplications
(3) Delete duplicates
(4) Create links

Since a large amount of processing is carried out, the load on the CPU is high. As the number of target files for deduplication increases, duplicated parts become easier to find. However, the increase in the number of files also increases the time needed to find the duplicated parts. In this research, we propose a similar file extraction method which, prior to deduplication, extracts highly similar files from the target file group and then performs deduplication only on those files.
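The four processing steps listed above can be sketched in a few lines. This is an illustrative sketch only: the function names `deduplicate` and `restore`, the file names, the pre-cut four-byte blocks, and the use of SHA-256 digests as links are assumptions and are not taken from the paper.

```python
import hashlib


def deduplicate(files: dict[str, list[bytes]]):
    """Keep one representative per distinct block and replace the rest with links."""
    store: dict[str, bytes] = {}        # digest -> representative block
    links: dict[str, list[str]] = {}    # file name -> digests of its blocks, in order
    for name, blocks in files.items():  # blocks were created beforehand (step 1)
        recipe = []
        for block in blocks:
            digest = hashlib.sha256(block).hexdigest()
            if digest not in store:     # step 2: detect block duplications
                store[digest] = block   # first occurrence becomes the representative
            recipe.append(digest)       # steps 3 and 4: drop the copy, keep a link
        links[name] = recipe
    return store, links


def restore(store: dict[str, bytes], recipe: list[str]) -> bytes:
    """Rebuild a file by following its links to the representative blocks."""
    return b"".join(store[d] for d in recipe)


# Hypothetical input: two files that share two of their three blocks.
files = {
    "a.doc": [b"AAAA", b"BBBB", b"CCCC"],
    "b.doc": [b"AAAA", b"XXXX", b"CCCC"],
}
store, links = deduplicate(files)
total = sum(len(b) for blocks in files.values() for b in blocks)
kept = sum(len(b) for b in store.values())
print(total, kept)
```

Every block of every target file has to be hashed and looked up, which is why the processing time grows with the number of target files.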
By not performing deduplication on files with low similarity, it is possible to reduce the time required for deduplication without reducing the effect of deduplication too much. However, with this method, some parts that are actually duplicated and could have been deduplicated may no longer be targeted, so the effect of deduplication may decrease. Through experiments, we evaluate both the reduction in the time needed for deduplication and the decrease in the efficiency of deduplication. In the similar file extraction method, similar document files (doc, docx) are extracted using morphological analysis and cosine similarity.

4.1 Morphological analysis

Morphological analysis is a method of separating sentences into meaningful words and identifying their parts of speech or content. In English, it is easy to divide a sentence into morphemes because sentences are written with the words separated by spaces, as in "I love you." On the other hand, a Japanese sentence with the same meaning is written without spaces between words and is harder to separate into morphemes, so matching words against a dictionary is required to perform morphological analysis.

4.2 Cosine similarity

Cosine similarity is a method of determining the similarity between two sentences by creating, for each sentence, a vector of the frequencies with which morphemes appear. The normalized inner product of the two vectors is taken as the similarity between the two sentences. For example, for sentence A "I live in a big house in Tokyo." and sentence B "I stay in a big hotel in Boston.", all morphemes that appear are listed (in this example: {I, live, in, a, big, house, Tokyo, stay, hotel, Boston}) and a vector of the frequency of each morpheme in each sentence is created (in this example: V_A = {1, 1, 2, 1, 1, 1, 1, 0, 0, 0} and V_B = {1, 0, 2, 1, 1, 0, 0, 1, 1, 1}). The normalized inner product of V_A and V_B is the cosine similarity between sentences A and B; in this example it is 0.7:

\[ \cos(V_A, V_B) = \frac{V_A \cdot V_B}{|V_A|\,|V_B|} = \frac{7}{\sqrt{10}\,\sqrt{10}} = \frac{7}{10} = 0.7 \]

Cosine similarity takes a value between 0 and 1. The closer the value is to 1, the more similar the two sentences are. However, even when the sentences being compared look similar, their meanings and contents are not necessarily similar. Therefore, a cosine similarity close to 1 does not guarantee that the two documents can be deduplicated; it only shows that there is a high possibility of deduplication.

5 Experimental results

5.1 Target files for deduplication

The effect of the similar file extraction method was examined through experiments. Using the variable-length block method, the deduplication rate was measured for document files (doc or docx extension) with the similar file extraction method applied. The higher the deduplication rate, the higher the effect of deduplication. The deduplication rate for the similar file extraction method was obtained from the following formula.

\[ \text{Deduplication rate} = \frac{\text{data size deleted by deduplication}}{\text{similar file size} + \text{non-similar file size}} \times 100\ (\%) \]

The denominator of the general deduplication rate is the "file size targeted for deduplication", which in this situation would be only the "similar file size". For the similar file extraction method, however, non-similar files should also be regarded as targets of deduplication, so they are included in the denominator of the deduplication rate.

Table 1 Target files for deduplication
Extension | Total data size (Byte) | Number of files
doc | 35,894, |

Fig. 4 Similarities of target files

Fig. 5 Categorize target files

For the similar file extraction method, each file was converted to txt and morphological analysis was conducted to break the contents down to the word level.
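The similarity determination of Section 4.2 can be reproduced with a short sketch using the two example sentences. Several points are assumptions: a regular expression stands in for a real morphological analyzer (which would be needed for Japanese text), the names `word_counts`, `cosine_similarity`, and `extract_similar` are hypothetical, document "C" is invented for the demonstration, and a file is treated as "similar" when its highest similarity to any other file reaches the threshold.

```python
import math
import re
from collections import Counter


def word_counts(text: str) -> Counter:
    """Frequency vector of words; a regex stands in for a morphological analyzer."""
    return Counter(re.findall(r"[A-Za-z]+", text))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Normalized inner product of two frequency vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_sq = sum(v * v for v in a.values()) * sum(v * v for v in b.values())
    return dot / math.sqrt(norm_sq) if norm_sq else 0.0


def extract_similar(docs: dict[str, str], threshold: float) -> set[str]:
    """Names of documents whose highest similarity to another document reaches threshold."""
    vectors = {name: word_counts(text) for name, text in docs.items()}
    return {
        name for name, vec in vectors.items()
        if any(cosine_similarity(vec, other) >= threshold
               for other_name, other in vectors.items() if other_name != name)
    }


docs = {
    "A": "I live in a big house in Tokyo.",
    "B": "I stay in a big hotel in Boston.",
    "C": "Completely unrelated words here",   # invented, shares no word with A or B
}
sim_ab = cosine_similarity(word_counts(docs["A"]), word_counts(docs["B"]))
print(round(sim_ab, 3), sorted(extract_similar(docs, 0.5)))  # 0.7 ['A', 'B']
```

Only the documents returned by `extract_similar` would then be passed on to block creation and duplicate detection.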
Then, similarity was determined using cosine similarity, and similar files were extracted in three stages: similarity of 0.3 or more, 0.4 or more, and 0.5 or more. The processing time and the deduplication rate were compared between two methods: deduplicating only the similar files, and deduplicating all files without performing the similar file search. The processing time does not include the time required for the similar file search. Table 1 shows the total data size and the number of files to be deduplicated. Fig. 4 shows the similarities of the 20 target files, and Fig. 5 shows the files arranged in order of maximum similarity, together with the files extracted at similarities of 0.3 or more, 0.4 or more, and 0.5 or more.

5.2 Experimental results

Table 2 shows the results of deduplication using the similar file extraction method.

Table 2 Results of deduplication
Attribute | Similarity range | Number of files | File size (Byte) | Exec. time (sec.) | Number of blocks | Reduced size (Byte) | Dedup. rate (%)
All files | | 20 | 35,894, | | ,138 | 1,873, |
Extracted files | 0.3 or more | 17 | 33,664, | | ,604 | 1,873, |
Extracted files | 0.4 or more | 16 | 32,934, | | ,429 | 1,873, |
Extracted files | 0.5 or more | 6 | 7,856, | | ,514 | 1,549, |

As a result of extracting similar files, the total number of blocks and the execution time required for deduplication decrease in proportion to the reduced file size. This is because deduplication was performed only on similar files, so the time spent searching for unnecessary block matches and creating links was reduced. At similarities of 0.3 or more and 0.4 or more, exactly the same blocks are deduplicated as when all files are deduplicated. Files that are judged not similar by the similarity determination are not targeted for deduplication, but this does not reduce the effect of deduplication.

Furthermore, when the similarity is 0.5 or more, the files extracted as similar are drastically reduced to 22% (about 1/4.6) of all files. The deduplication processing time was reduced to 30% (about 1/3.3), while the deduplication rate fell only from 5.22% to 4.32%, which is about 8% (about 1/1.2). This shows that files can be extracted effectively for deduplication. As described above, it was confirmed that the similar file extraction method can reduce the time required for deduplication without significantly reducing the effect of deduplication.

6 Conclusion

In this research, we proposed a similar file extraction method, which performs deduplication only on files with high similarity after a preliminary similarity determination. For text files such as doc and docx, similarity is determined using morphological analysis and cosine similarity. Then, deduplication is performed only on files with high similarity. This method shortens the time required for deduplication without reducing the effect of deduplication too much.
Experiments confirmed that if deduplication is performed on files with a similarity of 0.5 or more, the reduction of the deduplication rate is suppressed to about 8% and the processing time can be shortened to 1/3.3 or less. In other words, if deduplication is narrowed down to files with high similarity using the similar file extraction method, the processing time can be shortened without decreasing the effect of deduplication too much. In the future, we will expand the scope of application so that the proposed similar file extraction method can be applied not only to text files but also to other types of files.

Fig. 6 Results of deduplication rate and execution time

7 References

[1] Q. He, Z. Li, X. Zhang, Data deduplication techniques, Future Information Technology and Management Engineering (FITME) 2010, vol.1, pp , Oct
[2] C. Constantinescu, J. Glider, D. Chambliss, Mixing Deduplication and Compression on Active Data Sets, Data Compression Conference (DCC) 2011, pp , March 2011
[3] A.N. Yasa, P.C. Nagesh, Space savings and design considerations in variable length deduplication, ACM SIGOPS Operating Systems Review, Vol.46 Issue 3, pp.57-64, Dec
[4] M. Noorafiza, I. Koike, H. Yamasaki, A. Rizalhasrin, T. Kinoshita, Block Length Optimization in Data Deduplication Technique, Proceedings of the 10th International Conference on Scientific Computing (CSC2013), pp , July 2013

[5] H. Yamasaki, I. Koike, T. Kinoshita, Analysis of double layered deduplication efficiency, IPSJ SIGMPS Technical Report, Vol.2014-MPS-97 No.9, March 2014 (in Japanese)
[6] M. Ogiwara, M. Takaya, T. Kasuya, I. Koike, T. Kinoshita, Singularity Size Optimization in Data Deduplication Technique, Proceedings of the 2014 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA2014), pp , July 2014
[7] M. Noorafiza, M. Hirose, M. Takaya, I. Koike, T. Kinoshita, Optimum Singularity Size in Data Deduplication Technique, Proceedings of the 2015 International Conference on Scientific Computing (CSC2015), pp , July 2015
[8] N. Takeuchi, M. Hirose, M. Noorafiza, S. Takano, I. Koike, T. Kinoshita, Data Deduplication for Firmware Files, Proceedings of the 2016 International Conference on Scientific Computing (CSC2016), pp.14-19, July 2016
[9] MZ Nurshafiqah, H. Yoshii, F. Enomoto, I. Koike, T. Kinoshita, Data Deduplication for Audio Data Files, Proceedings of the 32nd International Conference on Computers and Their Applications (CATA2017), pp.17-21, April 2017


More information

Web Page Recommender System based on Folksonomy Mining for ITNG 06 Submissions

Web Page Recommender System based on Folksonomy Mining for ITNG 06 Submissions Web Page Recommender System based on Folksonomy Mining for ITNG 06 Submissions Satoshi Niwa University of Tokyo niwa@nii.ac.jp Takuo Doi University of Tokyo Shinichi Honiden University of Tokyo National

More information

Providing quality search in the electronic catalog of scientific library via Yandex.Server

Providing quality search in the electronic catalog of scientific library via Yandex.Server 42 Providing quality search in the electronic catalog of scientific library via Yandex.Server Boldyrev Petr 1[0000-0001-7346-6993] and Krylov Ivan 1[0000-0002-8377-1489] 1 Orenburg State University, Pobedy

More information

S. Indirakumari, A. Thilagavathy

S. Indirakumari, A. Thilagavathy International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2017 IJSRCSEIT Volume 2 Issue 2 ISSN : 2456-3307 A Secure Verifiable Storage Deduplication Scheme

More information

SAINT Integrated Signaling System with High Reliability and Safety

SAINT Integrated Signaling System with High Reliability and Safety Hitachi Review Vol. 57 (2008), No. 1 41 SAINT Signaling System with High Reliability and Safety Eiji Sasaki Shinji Hondo Tomohiro Ebuchi OVERVIEW: Trains could not run safely without signaling systems.

More information

Fast frame memory access method for H.264/AVC

Fast frame memory access method for H.264/AVC Fast frame memory access method for H.264/AVC Tian Song 1a), Tomoyuki Kishida 2, and Takashi Shimamoto 1 1 Computer Systems Engineering, Department of Institute of Technology and Science, Graduate School

More information

Generating All Solutions of Minesweeper Problem Using Degree Constrained Subgraph Model

Generating All Solutions of Minesweeper Problem Using Degree Constrained Subgraph Model 356 Int'l Conf. Par. and Dist. Proc. Tech. and Appl. PDPTA'16 Generating All Solutions of Minesweeper Problem Using Degree Constrained Subgraph Model Hirofumi Suzuki, Sun Hao, and Shin-ichi Minato Graduate

More information

STUDYING OF CLASSIFYING CHINESE SMS MESSAGES

STUDYING OF CLASSIFYING CHINESE SMS MESSAGES STUDYING OF CLASSIFYING CHINESE SMS MESSAGES BASED ON BAYESIAN CLASSIFICATION 1 LI FENG, 2 LI JIGANG 1,2 Computer Science Department, DongHua University, Shanghai, China E-mail: 1 Lifeng@dhu.edu.cn, 2

More information

De-dupe: It s not a question of if, rather where and when! What to Look for and What to Avoid

De-dupe: It s not a question of if, rather where and when! What to Look for and What to Avoid De-dupe: It s not a question of if, rather where and when! What to Look for and What to Avoid By Greg Schulz Founder and Senior Analyst, the StorageIO Group Author The Green and Virtual Data Center (CRC)

More information

Evaluating Auto Scalable Application on Cloud

Evaluating Auto Scalable Application on Cloud Evaluating Auto Scalable Application on Cloud Takashi Okamoto Abstract Cloud computing enables dynamic scaling out of system resources, depending on workloads and data volume. In addition to the conventional

More information

Position Sort. Anuj Kumar Developer PINGA Solution Pvt. Ltd. Noida, India ABSTRACT. Keywords 1. INTRODUCTION 2. METHODS AND MATERIALS

Position Sort. Anuj Kumar Developer PINGA Solution Pvt. Ltd. Noida, India ABSTRACT. Keywords 1. INTRODUCTION 2. METHODS AND MATERIALS Position Sort International Journal of Computer Applications (0975 8887) Anuj Kumar Developer PINGA Solution Pvt. Ltd. Noida, India Mamta Former IT Faculty Ghaziabad, India ABSTRACT Computer science has

More information

Embedded Descendent-Only Zerotree Wavelet Coding for Image Compression

Embedded Descendent-Only Zerotree Wavelet Coding for Image Compression Embedded Descendent-Only Zerotree Wavelet Coding for Image Compression Wai Chong Chia, Li-Minn Ang, and Kah Phooi Seng Abstract The Embedded Zerotree Wavelet (EZW) coder which can be considered as a degree-0

More information

Tree-Based Minimization of TCAM Entries for Packet Classification

Tree-Based Minimization of TCAM Entries for Packet Classification Tree-Based Minimization of TCAM Entries for Packet Classification YanSunandMinSikKim School of Electrical Engineering and Computer Science Washington State University Pullman, Washington 99164-2752, U.S.A.

More information

Variable Neighborhood Search Based Algorithm for University Course Timetabling Problem

Variable Neighborhood Search Based Algorithm for University Course Timetabling Problem Variable Neighborhood Search Based Algorithm for University Course Timetabling Problem Velin Kralev, Radoslava Kraleva South-West University "Neofit Rilski", Blagoevgrad, Bulgaria Abstract: In this paper

More information

Improvements and Implementation of Hierarchical Clustering based on Hadoop Jun Zhang1, a, Chunxiao Fan1, Yuexin Wu2,b, Ao Xiao1

Improvements and Implementation of Hierarchical Clustering based on Hadoop Jun Zhang1, a, Chunxiao Fan1, Yuexin Wu2,b, Ao Xiao1 3rd International Conference on Machinery, Materials and Information Technology Applications (ICMMITA 2015) Improvements and Implementation of Hierarchical Clustering based on Hadoop Jun Zhang1, a, Chunxiao

More information

CLEF-IP 2009: Exploring Standard IR Techniques on Patent Retrieval

CLEF-IP 2009: Exploring Standard IR Techniques on Patent Retrieval DCU @ CLEF-IP 2009: Exploring Standard IR Techniques on Patent Retrieval Walid Magdy, Johannes Leveling, Gareth J.F. Jones Centre for Next Generation Localization School of Computing Dublin City University,

More information

An Improved Document Clustering Approach Using Weighted K-Means Algorithm

An Improved Document Clustering Approach Using Weighted K-Means Algorithm An Improved Document Clustering Approach Using Weighted K-Means Algorithm 1 Megha Mandloi; 2 Abhay Kothari 1 Computer Science, AITR, Indore, M.P. Pin 453771, India 2 Computer Science, AITR, Indore, M.P.

More information

Encoding Words into String Vectors for Word Categorization

Encoding Words into String Vectors for Word Categorization Int'l Conf. Artificial Intelligence ICAI'16 271 Encoding Words into String Vectors for Word Categorization Taeho Jo Department of Computer and Information Communication Engineering, Hongik University,

More information

Job Re-Packing for Enhancing the Performance of Gang Scheduling

Job Re-Packing for Enhancing the Performance of Gang Scheduling Job Re-Packing for Enhancing the Performance of Gang Scheduling B. B. Zhou 1, R. P. Brent 2, C. W. Johnson 3, and D. Walsh 3 1 Computer Sciences Laboratory, Australian National University, Canberra, ACT

More information

MAPREDUCE FOR BIG DATA PROCESSING BASED ON NETWORK TRAFFIC PERFORMANCE Rajeshwari Adrakatti

MAPREDUCE FOR BIG DATA PROCESSING BASED ON NETWORK TRAFFIC PERFORMANCE Rajeshwari Adrakatti International Journal of Computer Engineering and Applications, ICCSTAR-2016, Special Issue, May.16 MAPREDUCE FOR BIG DATA PROCESSING BASED ON NETWORK TRAFFIC PERFORMANCE Rajeshwari Adrakatti 1 Department

More information

A Visualization Tool to Improve the Performance of a Classifier Based on Hidden Markov Models

A Visualization Tool to Improve the Performance of a Classifier Based on Hidden Markov Models A Visualization Tool to Improve the Performance of a Classifier Based on Hidden Markov Models Gleidson Pegoretti da Silva, Masaki Nakagawa Department of Computer and Information Sciences Tokyo University

More information

Leap-based Content Defined Chunking --- Theory and Implementation

Leap-based Content Defined Chunking --- Theory and Implementation Leap-based Content Defined Chunking --- Theory and Implementation Chuanshuai Yu, Chengwei Zhang, Yiping Mao, Fulu Li Huawei Technologies Co., Ltd. {yuchuanshuai, zhangchengwei, tony.mao, lifulu}@huawei.com

More information

An efficient access control method for composite multimedia content

An efficient access control method for composite multimedia content IEICE Electronics Express, Vol.7, o.0, 534 538 An efficient access control method for composite multimedia content Shoko Imaizumi,a), Masaaki Fujiyoshi,andHitoshiKiya Industrial Research Institute of iigata

More information

Modeling the Component Pickup and Placement Sequencing Problem with Nozzle Assignment in a Chip Mounting Machine

Modeling the Component Pickup and Placement Sequencing Problem with Nozzle Assignment in a Chip Mounting Machine Modeling the Component Pickup and Placement Sequencing Problem with Nozzle Assignment in a Chip Mounting Machine Hiroaki Konishi, Hidenori Ohta and Mario Nakamori Department of Information and Computer

More information

DELL EMC DATA DOMAIN SISL SCALING ARCHITECTURE

DELL EMC DATA DOMAIN SISL SCALING ARCHITECTURE WHITEPAPER DELL EMC DATA DOMAIN SISL SCALING ARCHITECTURE A Detailed Review ABSTRACT While tape has been the dominant storage medium for data protection for decades because of its low cost, it is steadily

More information

Web page recommendation using a stochastic process model

Web page recommendation using a stochastic process model Data Mining VII: Data, Text and Web Mining and their Business Applications 233 Web page recommendation using a stochastic process model B. J. Park 1, W. Choi 1 & S. H. Noh 2 1 Computer Science Department,

More information

MULTIMEDIA PROXY CACHING FOR VIDEO STREAMING APPLICATIONS.

MULTIMEDIA PROXY CACHING FOR VIDEO STREAMING APPLICATIONS. MULTIMEDIA PROXY CACHING FOR VIDEO STREAMING APPLICATIONS. Radhika R Dept. of Electrical Engineering, IISc, Bangalore. radhika@ee.iisc.ernet.in Lawrence Jenkins Dept. of Electrical Engineering, IISc, Bangalore.

More information

A NEW ROBUST IMAGE WATERMARKING SCHEME BASED ON DWT WITH SVD

A NEW ROBUST IMAGE WATERMARKING SCHEME BASED ON DWT WITH SVD A NEW ROBUST IMAGE WATERMARKING SCHEME BASED ON WITH S.Shanmugaprabha PG Scholar, Dept of Computer Science & Engineering VMKV Engineering College, Salem India N.Malmurugan Director Sri Ranganathar Institute

More information

Information Retrieval

Information Retrieval Multimedia Computing: Algorithms, Systems, and Applications: Information Retrieval and Search Engine By Dr. Yu Cao Department of Computer Science The University of Massachusetts Lowell Lowell, MA 01854,

More information

Reducing Replication Bandwidth for Distributed Document Databases

Reducing Replication Bandwidth for Distributed Document Databases Reducing Replication Bandwidth for Distributed Document Databases Lianghong Xu 1, Andy Pavlo 1, Sudipta Sengupta 2 Jin Li 2, Greg Ganger 1 Carnegie Mellon University 1, Microsoft Research 2 Document-oriented

More information

Comparisons of Efficient Implementations for DAWG

Comparisons of Efficient Implementations for DAWG Comparisons of Efficient Implementations for DAWG Masao Fuketa, Kazuhiro Morita, and Jun-ichi Aoe Abstract Key retrieval is very important in various applications. A trie and DAWG are data structures for

More information

International Journal of Video& Image Processing and Network Security IJVIPNS-IJENS Vol:10 No:02 7

International Journal of Video& Image Processing and Network Security IJVIPNS-IJENS Vol:10 No:02 7 International Journal of Video& Image Processing and Network Security IJVIPNS-IJENS Vol:10 No:02 7 A Hybrid Method for Extracting Key Terms of Text Documents Ahmad Ali Al-Zubi Computer Science Department

More information

Ontology Extraction from Heterogeneous Documents

Ontology Extraction from Heterogeneous Documents Vol.3, Issue.2, March-April. 2013 pp-985-989 ISSN: 2249-6645 Ontology Extraction from Heterogeneous Documents Kirankumar Kataraki, 1 Sumana M 2 1 IV sem M.Tech/ Department of Information Science & Engg

More information

Efficient Mining Algorithms for Large-scale Graphs

Efficient Mining Algorithms for Large-scale Graphs Efficient Mining Algorithms for Large-scale Graphs Yasunari Kishimoto, Hiroaki Shiokawa, Yasuhiro Fujiwara, and Makoto Onizuka Abstract This article describes efficient graph mining algorithms designed

More information

Applying the KISS Principle for the CLEF- IP 2010 Prior Art Candidate Patent Search Task

Applying the KISS Principle for the CLEF- IP 2010 Prior Art Candidate Patent Search Task Applying the KISS Principle for the CLEF- IP 2010 Prior Art Candidate Patent Search Task Walid Magdy, Gareth J.F. Jones Centre for Next Generation Localisation School of Computing Dublin City University,

More information

Domain Specific Search Engine for Students

Domain Specific Search Engine for Students Domain Specific Search Engine for Students Domain Specific Search Engine for Students Wai Yuen Tang The Department of Computer Science City University of Hong Kong, Hong Kong wytang@cs.cityu.edu.hk Lam

More information

Preliminary inspection about the profit of 3D data in public works

Preliminary inspection about the profit of 3D data in public works icccbe 2010 Nottingham University Press Proceedings of the International Conference on Computing in Civil and Building Engineering W Tizani (Editor) Preliminary inspection about the profit of 3D data in

More information

Search Engines. Information Retrieval in Practice

Search Engines. Information Retrieval in Practice Search Engines Information Retrieval in Practice All slides Addison Wesley, 2008 Web Crawler Finds and downloads web pages automatically provides the collection for searching Web is huge and constantly

More information

Image Classification Using Wavelet Coefficients in Low-pass Bands

Image Classification Using Wavelet Coefficients in Low-pass Bands Proceedings of International Joint Conference on Neural Networks, Orlando, Florida, USA, August -7, 007 Image Classification Using Wavelet Coefficients in Low-pass Bands Weibao Zou, Member, IEEE, and Yan

More information

What is database? Types and Examples

What is database? Types and Examples What is database? Types and Examples Visit our site for more information: www.examplanning.com Facebook Page: https://www.facebook.com/examplanning10/ Twitter: https://twitter.com/examplanning10 TABLE

More information

Complexity Reduced Mode Selection of H.264/AVC Intra Coding

Complexity Reduced Mode Selection of H.264/AVC Intra Coding Complexity Reduced Mode Selection of H.264/AVC Intra Coding Mohammed Golam Sarwer 1,2, Lai-Man Po 1, Jonathan Wu 2 1 Department of Electronic Engineering City University of Hong Kong Kowloon, Hong Kong

More information

CHAPTER 6. 6 Huffman Coding Based Image Compression Using Complex Wavelet Transform. 6.3 Wavelet Transform based compression technique 106

CHAPTER 6. 6 Huffman Coding Based Image Compression Using Complex Wavelet Transform. 6.3 Wavelet Transform based compression technique 106 CHAPTER 6 6 Huffman Coding Based Image Compression Using Complex Wavelet Transform Page No 6.1 Introduction 103 6.2 Compression Techniques 104 103 6.2.1 Lossless compression 105 6.2.2 Lossy compression

More information

Information Providing System for Commuters Unable to Get Home at the Time of Disaster by Constructing Local Network using Single-board Computers

Information Providing System for Commuters Unable to Get Home at the Time of Disaster by Constructing Local Network using Single-board Computers Int'l Conf. Par. and Dist. Proc. Tech. and Appl. PDPTA'17 221 Information Providing System for Commuters Unable to Get Home at the Time of Disaster by Constructing Local Network using Single-board Computers

More information

ChunkStash: Speeding Up Storage Deduplication using Flash Memory

ChunkStash: Speeding Up Storage Deduplication using Flash Memory ChunkStash: Speeding Up Storage Deduplication using Flash Memory Biplob Debnath +, Sudipta Sengupta *, Jin Li * * Microsoft Research, Redmond (USA) + Univ. of Minnesota, Twin Cities (USA) Deduplication

More information

A Method of Identifying the P2P File Sharing

A Method of Identifying the P2P File Sharing IJCSNS International Journal of Computer Science and Network Security, VOL.10 No.11, November 2010 111 A Method of Identifying the P2P File Sharing Jian-Bo Chen Department of Information & Telecommunications

More information

Delivery Context in MPEG-21

Delivery Context in MPEG-21 Delivery Context in MPEG-21 Sylvain Devillers Philips Research France Anthony Vetro Mitsubishi Electric Research Laboratories Philips Research France Presentation Plan MPEG achievements MPEG-21: Multimedia

More information

Backup and Recovery Best Practices

Backup and Recovery Best Practices Backup and Recovery Best Practices Session: 3 Track: ELA Services Skip Farmer Symantec 1 Backup System Infrastructure 2 Isolating Performance Issues 3 Virtual Machine Backups 4 Reporting - Opscenter Analytics

More information

ABSTRACT I. INTRODUCTION

ABSTRACT I. INTRODUCTION International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 3 ISS: 2456-3307 Hadoop Periodic Jobs Using Data Blocks to Achieve

More information

A Performance of Embedding Process for Text Steganography Method

A Performance of Embedding Process for Text Steganography Method A Performance of Embedding Process for Text Steganography Method BAHARUDIN OSMAN 1, ROSHIDI DIN 1, TUAN ZALIZAM TUAN MUDA 2, MOHD. NIZAM OMAR 1, School of Computing 1, School of Multimedia Technology and

More information

Improved Parallel Rabin-Karp Algorithm Using Compute Unified Device Architecture

Improved Parallel Rabin-Karp Algorithm Using Compute Unified Device Architecture Improved Parallel Rabin-Karp Algorithm Using Compute Unified Device Architecture Parth Shah 1 and Rachana Oza 2 1 Chhotubhai Gopalbhai Patel Institute of Technology, Bardoli, India parthpunita@yahoo.in

More information

Similarity Joins in MapReduce

Similarity Joins in MapReduce Similarity Joins in MapReduce Benjamin Coors, Kristian Hunt, and Alain Kaeslin KTH Royal Institute of Technology {coors,khunt,kaeslin}@kth.se Abstract. This paper studies how similarity joins can be implemented

More information

A Miniature-Based Image Retrieval System

A Miniature-Based Image Retrieval System A Miniature-Based Image Retrieval System Md. Saiful Islam 1 and Md. Haider Ali 2 Institute of Information Technology 1, Dept. of Computer Science and Engineering 2, University of Dhaka 1, 2, Dhaka-1000,

More information

Deduplication Storage System

Deduplication Storage System Deduplication Storage System Kai Li Charles Fitzmorris Professor, Princeton University & Chief Scientist and Co-Founder, Data Domain, Inc. 03/11/09 The World Is Becoming Data-Centric CERN Tier 0 Business

More information

Index Generation and Advanced Search Functions for. Muitimedia Presentation Material. Yahiko Kambayashi 3 Kaoru Katayama 3 Yasuhiro Kamiya 3

Index Generation and Advanced Search Functions for. Muitimedia Presentation Material. Yahiko Kambayashi 3 Kaoru Katayama 3 Yasuhiro Kamiya 3 Index Generation and Advanced Search Functions for Muitimedia Presentation Material Yahiko Kambayashi 3 Kaoru Katayama 3 Yasuhiro Kamiya 3 Osami Kagawa 33 3 Department of Information Science, Kyoto University

More information

CONCEPTUAL DESIGN FOR SOFTWARE PRODUCTS: SERVICE REQUEST PORTAL. Tyler Munger Subhas Desa

CONCEPTUAL DESIGN FOR SOFTWARE PRODUCTS: SERVICE REQUEST PORTAL. Tyler Munger Subhas Desa CONCEPTUAL DESIGN FOR SOFTWARE PRODUCTS: SERVICE REQUEST PORTAL Tyler Munger Subhas Desa Real World Problem at Cisco Systems Smart Call Home (SCH) is a component of Cisco Smart Services that offers proactive

More information

Desktop Crawls. Document Feeds. Document Feeds. Information Retrieval

Desktop Crawls. Document Feeds. Document Feeds. Information Retrieval Information Retrieval INFO 4300 / CS 4300! Web crawlers Retrieving web pages Crawling the web» Desktop crawlers» Document feeds File conversion Storing the documents Removing noise Desktop Crawls! Used

More information

International Journal of Scientific Research and Reviews

International Journal of Scientific Research and Reviews Research article Available online www.ijsrr.org ISSN: 2279 0543 International Journal of Scientific Research and Reviews Asymmetric Digital Signature Algorithm Based on Discrete Logarithm Concept with

More information

BEx Front end Performance

BEx Front end Performance BUSINESS INFORMATION WAREHOUSE BEx Front end Performance Performance Analyses of BEx Analyzer and Web Application in the Local and Wide Area Networks Environment Document Version 1.1 March 2002 Page 2

More information

An Efficient Approach for Color Pattern Matching Using Image Mining

An Efficient Approach for Color Pattern Matching Using Image Mining An Efficient Approach for Color Pattern Matching Using Image Mining * Manjot Kaur Navjot Kaur Master of Technology in Computer Science & Engineering, Sri Guru Granth Sahib World University, Fatehgarh Sahib,

More information

SHAPE SEGMENTATION FOR SHAPE DESCRIPTION

SHAPE SEGMENTATION FOR SHAPE DESCRIPTION SHAPE SEGMENTATION FOR SHAPE DESCRIPTION Olga Symonova GraphiTech Salita dei Molini 2, Villazzano (TN), Italy olga.symonova@graphitech.it Raffaele De Amicis GraphiTech Salita dei Molini 2, Villazzano (TN),

More information