Implementation of Robust Compression Technique using LZ77 Algorithm on Tensilica's Xtensa Processor


2016 International Conference on Information Technology

Implementation of Robust Compression Technique using LZ77 Algorithm on Tensilica's Xtensa Processor

Vasanthi D R and Anusha R, M.Tech (VLSI Design and Embedded Systems), CMR Institute of Technology, Bangalore-37, Karnataka, India (vasanthidr11@gmail.com, anusha005r@gmail.com); Vinay B K, Asst. Professor, ECE Dept., CMR Institute of Technology, Bangalore-37, Karnataka, India (vinay.9cool@gmail.com)

Abstract: SSD technology reads and writes data to an external device at high speed and low power. A lossless LZ77 data compression technique is used here to retain the information content while removing data redundancy; it employs a text window in conjunction with a look-ahead buffer to serve as a dictionary. In the proposed technique, strings found in the dictionary are encoded as a length and an offset. Compressing data before it is written means less data reaches the flash, which in turn improves the endurance and performance of the SSD. Pointers to the longest matched string are selected and encoded effectively to obtain greater compression; for unmatched characters the compression ratio is 3.12 times, which is better than the existing algorithm. The speed of secure data compression increases with the creation of a preferred search function for a hex sample of size 32 Kbyte. Even though compression may take extra time, the asymmetric scheme decompresses the data quickly while still compressing well.

Keywords: SSD; LZ77; Compression Ratio; Tensilica's Xtensa Processor; Xtensa SIMD Processor; FLIX Instruction

I. INTRODUCTION

An SSD's read access time is lower than an HDD's, whose access is measured in milliseconds, which is an advantage for the SSD. Rather than using disks and read/write heads, an SSD uses flash memory, so information is retained even when the power is off. For these reasons current technology uses SSDs more and more in place of HDDs. SSDs can be made smaller, use less power and make no noise, so they are reliable and last longer on a single battery charge than HDDs. In an SSD, LZ77 data compression is a key feature, giving high data throughput rates and low power consumption. Compression eliminates the redundancy of data in a reversible manner and increases entropy per symbol by reducing the size of the data [1]. Compression algorithms are categorized into lossless and lossy. In lossless compression, a file that is compressed and then decompressed reproduces the same file, whereas lossy compression cannot reproduce the exact file and returns only an approximation [2]. The dictionary used in this compression technique is not tied to the programming structure that holds the data table, and there is no external dictionary, which would otherwise cause problems when decompressing other data. The compression algorithm is therefore implemented as programming object library files, so that it can easily be reused and updated.

LZ77-type decoders expand the compressed file by expanding its "copy items" into longer strings, enabling "duplicate string elimination": only the first copy of a string is stored in the compressed file. Many decoder algorithms combine LZ77 with other compression algorithms and extract the "length" and the "distance" from a huge variety of decoder structures [3]. Compression can thus reduce the size of data, saving space and transmission time. It increases the effective storage capacity of devices, and an added advantage is that it shortens the write path: instead of writing complete blocks, only the few bytes of data that were actually modified are written.

II. LEMPEL ZIV (LZ77)

The LZ77 data compression technique was first described by Lempel and Ziv in 1977. The encoder examines the input sequence through a sliding window over the last n bytes of data that have been processed, and searches there for the longest match with the subsequent bytes.

Fig. 1: Encoding using the LZ77 approach.

The sliding window is divided into a search buffer and a look-ahead buffer, as shown in fig. 1. The search buffer holds a portion of the recently encoded sequence, while the look-ahead buffer holds the next portion of the sequence. In practice the search buffer is many times larger than the look-ahead buffer. The distance of the pointer back from the look-ahead buffer is called the offset [4], and the number of consecutive symbols in the search buffer that match consecutive symbols in the look-ahead buffer is called the length of the match. The encoder searches the search buffer for the longest match; once the longest match is found, the encoder encodes it as a triple (o, l, c). If the size of the search buffer is S, the size of the window is W and the size of the source alphabet is A, then the number of bits needed to encode a triple using fixed-length codes is given in equation (1):

⌈log2 S⌉ + ⌈log2 W⌉ + ⌈log2 A⌉   (1)

978-1-5090-3584-7/16 $31.00 (c) 2016 IEEE. DOI 10.1109/ICIT.2016.41
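The sliding-window encoding described in this section can be sketched in a few lines of Python. This is only an illustration of the LZ77 idea and of equation (1), not the paper's Xtensa C implementation; the buffer sizes and function names are our own.

```python
import math

def lz77_encode(data, search_size=255, lookahead_size=15):
    """Emit (offset, length, next_byte) triples using a sliding window."""
    i, out = 0, []
    while i < len(data):
        best_off, best_len = 0, 0
        # scan the search buffer for the longest match with the look-ahead buffer
        for j in range(max(0, i - search_size), i):
            length = 0
            while (length < lookahead_size and i + length < len(data) - 1
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_off, best_len = i - j, length
        out.append((best_off, best_len, data[i + best_len]))  # c = byte after the match
        i += best_len + 1
    return out

def lz77_decode(triples):
    """Rebuild the data by copying `length` bytes from `offset` back, then the literal."""
    out = bytearray()
    for off, length, c in triples:
        for _ in range(length):
            out.append(out[-off])
        out.append(c)
    return bytes(out)

def bits_per_triple(S, W, A):
    """Fixed-length cost of one (o, l, c) triple, per equation (1)."""
    return math.ceil(math.log2(S)) + math.ceil(math.log2(W)) + math.ceil(math.log2(A))
```

For example, with S = 256, W = 272 and A = 256, each triple costs 8 + 9 + 8 = 25 bits.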

For example, if the minimum match length is 3 characters, a sequence of 3 new bytes is typically the minimum searched, even when there is no current match. If no match of at least 3 characters is found, the first byte of the string is output as a literal byte; the window's start and end are then advanced by one, the next input byte is appended after the 2 remaining bytes, and a new search begins. If a 3-character match is found, new bytes are added one at a time to the end of the match string and further searches are made, to determine whether a longer string with a matching occurrence exists in the window.

LZ77 has three different possibilities in its coding process:
a) The next character to be encoded in the window has no match in the sequence [5].
b) The character in the search buffer is the same as the character in the look-ahead buffer, and the length of the match is found.
c) The matched string extends inside the look-ahead buffer.

The compression ratio is taken here as the ratio of the size before compression to the size after compression, so a larger ratio means more effective compression. When the compression ratio is higher, speed increases, and the process of sending and retrieving data between the host and the flash memory is faster. The compression ratio can therefore be increased by making the kernel encode the pointers to matched substrings and the offset of the longest match string effectively.

III. METHODOLOGY

Compression is an encoding process that makes data take less space, e.g. to reduce load on memory, disk or I/O. A lossless decoder reproduces the message exactly, whereas a lossy decoder reproduces the message approximately.

Fig. 2: Compression and decompression model.

The data sent to the compressor block is compressed under different models, and the compressed data is then the input to the decompressor block, from which the original data is retrieved, as shown in fig. 2.

A. Flowchart of LZ77

Fig. 3: Flow chart of LZ77 compression.

The file size is determined for the data or text taken for compression and decompression. Once the file size is determined, a lossless compression technique such as LZ77, LZ78, LZW or LZSS is applied to the data [6]. Here LZ77 lossless compression is used, since it compresses data in a simple and effective way. The compressed data file is then accessed by the user, who checks whether the compressed data is correct. If the data is correct, the user ends the program and the compressed file is deleted; if it is not, the sliding window slides to the next position and the search continues until a match is found. To retrieve the original data, the compressed file is decompressed.

Compression can be done using two kinds of codes: fixed-length codes and variable-length codes.

1) Fixed-Length Codes

A fixed-length code encodes a fixed number of source symbols into a fixed number of output symbols. It can be carried out using three methods.

a) Short bytes: The storage unit in short bytes is 5 bits. If the alphabet has at most 32 symbols, 5 bits per symbol are used.
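The 5-bit "short bytes" case (an alphabet of at most 32 symbols) can be sketched as follows; the helper names and the sample alphabet are ours, chosen only for illustration.

```python
def pack5(text, alphabet):
    """Encode each symbol of a <=32-symbol alphabet as a 5-bit fixed-length code."""
    assert len(alphabet) <= 32
    index = {s: i for i, s in enumerate(alphabet)}
    return ''.join(format(index[s], '05b') for s in text)

def unpack5(bits, alphabet):
    """Decode a string of 5-bit codes back to symbols."""
    return ''.join(alphabet[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5))
```

With a 27-symbol alphabet of lowercase letters plus blank, "hello world" packs into 55 bits instead of the 88 bits needed with 8-bit bytes, i.e. 5/8 of the original size.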

If the alphabet has more than 32 and at most 60 symbols, either of two cases can be used: codes 1-30 for the most frequent symbols (the base case) or codes 1-30 for the least frequent symbols (the shift case), with codes 0 and 31 used to shift back and forth (as on a typewriter). This type of compression works well when shifts do not occur often. Possible optimizations of the method are a temporary shift together with a shift lock, or multiple cases.

b) Bigrams/digrams: The storage unit in bigrams is 8 bits, covering codes 0 to 255. Codes 1 to 87 are used for single characters: blank, uppercase, lowercase, digits and 25 special characters. The remaining codes, 88 to 255, are used for bigrams, each of which is the combination of a master character and a combining character. There are 8 master characters (blank, A, E, I, O, N, T, U) and 21 combining characters (blank plus every letter except J, K, Q, X, Y, Z). The total number of codes is therefore 88 + 8 x 21 = 88 + 168 = 256. Bigrams are simple and fast and require little memory; the maximum compression is 50%.

c) n-grams: The storage unit in n-grams is 8 bits. This is similar to bigrams, but extended to cover sequences of 2 or more characters. The goal of n-grams is to encode each unit of length greater than one that occurs with high probability. This is common for two- and three-symbol words, and it can also capture longer phrases and names.

All three methods are simple and very effective when their assumptions are correct.

2) Variable-Length Codes

A variable-length code maps the source data to a variable number of bits. Variable-length codes allow a source to be compressed and decompressed with zero error (lossless data compression), so the data can be read back exactly.

a) Huffman codes: Huffman coding is a method for compressing data with a variable-length code [8]. The method assigns to the data a set of variable-length code words with the shortest average length, using the frequencies of occurrence. The process gathers probabilities for symbols (characters, words or a mix) and then builds a tree as follows:
- Find the two least-frequent symbols or nodes and join them under a parent node.
- Label the less probable branch 0 and the other branch 1.
- Continue the process until the tree contains all nodes and symbols.
Frequent symbols are usually nearer to the root, so they get short codes, and less frequent symbols are deeper, giving them longer codes.

b) Lempel-Ziv: Lempel-Ziv coding is based on the adaptive dictionary approach to variable-length coding. The dictionary is built from text that has already been encountered [10]; a good dictionary can be built if the text follows Zipf's law. Variants used in compression include LZ77, gzip, LZ78, LZW and UNIX compress. These variants differ in how the dictionary is built, how pointers are represented, and the limits on what the pointers may refer to.

IV. PROPOSED METHOD

In the previously existing method, a sequence is processed through a sliding window that contains both a search buffer and a look-ahead buffer. The search pointer slides backwards from the look-ahead buffer, and one of two outcomes is obtained: a match is found, or no match is found. Fig. 4 shows the encoding and decoding process. If a character in the look-ahead buffer has no match in the search buffer, the window is shifted by one position and the sequence is considered again until the length of a match is found. A triple containing the offset, the length of the match and a code word is thereby obtained for both the encoding and the decoding process.

Fig. 4: LZ77 compression and decompression.

In the proposed method, a set of data is considered in a sliding window where the search buffer is larger than the look-ahead buffer, as shown in fig. 5.
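The Huffman tree construction described under 2) a) above can be sketched by keeping a heap of partial code tables and repeatedly merging the two least-frequent entries. This is an illustrative version only; the function name and data layout are ours.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build Huffman codes: join the two least-frequent nodes under a parent,
    labelling one branch 0 and the other 1, until one tree remains."""
    freq = Counter(text)
    # heap entries: (weight, tie-breaker, {symbol: code-so-far})
    heap = [(f, i, {s: ''}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)  # least frequent node
        f2, _, c2 = heapq.heappop(heap)  # next least frequent node
        merged = {s: '0' + code for s, code in c1.items()}
        merged.update({s: '1' + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]
```

For "abracadabra" the frequent symbol 'a' ends up near the root with the shortest code, while 'c' and 'd' get longer codes; the total encoded length (frequency times code length, summed) is 23 bits.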

Fig. 5: LZ77 proposed method block diagram.

The search pointer starts at the left of the search buffer and searches until it finds a character that matches the character at the start of the look-ahead buffer. If a match is found, the encoding of literals stops and the length of the match is determined. The characters are copied up to the end of the match, and the character left over at the end is represented as the code word. A triple (o, l, c) is therefore obtained, where o is the offset, l is the length of the match and c is the code word; the same triple is decoded, so the offset, length of the match and code word must be the same for both the encoding and the decoding process. If the strings in the search buffer are long, the data is kept in memory so that the search buffer works faster and the search speed increases.

Text or data is taken from the host interface and transmitted along the write path during compression. The stored data is then uncompressed so as to retrieve the original data, which is sent back to the host interface. Finally the data is verified to check that the data sent and the data retrieved are the same. The total throughput of compression is that of a single stream, and the implementation is capable of gigabits per second. The latency grows from the start to the end of compression and decompression and depends on the Huffman code. Compression reduces the size of the original file; for example, the final result may be 1/3 the size of the file before compression.

V. CHALLENGES OF PROPOSED LZ77 IN SSD

During implementation in the SSD, the challenges faced by the proposed LZ77 data compression are:
a) Data compression helps reduce write amplification, i.e. the need to write whole blocks even though only a few bytes of data have changed.
b) The implementation of LZ77 is more efficient on Xtensa when Tensilica instructions are used.
c) The encoder must keep track of some amount of history, such as the last 2 KB, 4 KB or 32 KB.
d) As the performance of LZ77 in the SSD increases, the speed of the read/write cycle can be improved up to 8 bytes per cycle.

Fig. 6: Chart showing compression ratio for the proposed LZ77 family in SSD.

In the proposed LZ77 compression technique, the compression ratio achieved is 3.12 times, better than the other lossless compression techniques, as shown in fig. 6. The optimization of LZ77 data compression has been improved by 550% on the Xtensa processor. In the LZ77 output, a literal byte occupies 9 bits and a matched string of up to 258 bytes occupies 24 bits. The worst case for LZ77 is therefore 1/8 = 12.5% growth, and the best case is 1 - 24/(258 x 8), about a 98.8% reduction in size.

VI. IMPLEMENTATION RESULTS

The proposed LZ77 was evaluated on Tensilica's Xtensa processor for each block (compression and decompression), and different results and performance were obtained for each. The combination of the proposed LZ77 blocks yields better performance and compression ratio on the read path and the write path. The total compression ratio obtained for unmatched characters is 3.14 times, which is better than the existing algorithms, and the compression ratio for matched characters is double that for unmatched characters. Implementing LZ77 in the SSD improves the read/write speed, and thereby the capacity and reliability as well. The total code size obtained for the proposed LZ77 in profile cycles is 34764 bytes. The cycle counts obtained for the proposed LZ77 are: at 12 cycles, 1 execute with 5 unconditional instruction fetches and 6 unconditional loads; and at 22 cycles, 1 execute with 16 unconditional instruction fetches and 5 unconditional loads. The data was therefore securely compressed, robustly with respect to many other compression techniques.
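The worst-case and best-case figures quoted in section V above follow directly from the 9-bit literal and 24-bit match encodings; the arithmetic can be checked with plain Python (constants taken from the text, no paper-specific code):

```python
LITERAL_BITS = 9   # a literal byte costs 9 bits instead of 8
MATCH_BITS = 24    # one match token encodes up to 258 bytes
MAX_MATCH = 258

# Worst case: every input byte is emitted as a literal.
worst_growth = (LITERAL_BITS - 8) / 8              # 1/8 = 12.5% expansion

# Best case: a maximal 258-byte (2064-bit) match becomes one 24-bit token.
best_reduction = 1 - MATCH_BITS / (MAX_MATCH * 8)  # about 98.8% smaller
```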

The Xtensa C compiler additionally supports Tensilica's FLIX (Flexible Length Instruction eXtensions), which pack operations into VLIW (very long instruction word) instructions of 4 to 16 bytes. The level of performance and efficiency increases 10 to 100x at lower energy consumption. The multi-processor subsystem view shows each cascade for easy load assessment and re-partitioning guidance, and the call-graph view lets the entire application's caller hierarchy and callee functions be examined. From C source, the compiler can generate high-performance code for the Xtensa SIMD processor using FLIX instructions, and both hybrid-sampled and cycle-accurate profiling results can be obtained.

Fig. 7: Final build and results.

The robust LZ77 data compression code was built and debugged on Tensilica's Xtensa processor using the GNU-based compiler, which is highly customized to target the compact 16/24-bit Xtensa ISA, as shown in fig. 8.

Fig. 8: LZ77 output.

In multi-core profiling, fig. 10 shows the cycle-accurate results, obtained with user-defined write-path and read-path sizes of 4 KB, 8 KB, 16 KB and 32 KB, so that the host interface can write data through the write path during compression and read it through the read path during decompression. The hybrid-sampled results, shown in fig. 9, are obtained without user-defined sizes. The profiling information is used to optimize the application code, further reducing branch delays and improving in-lining.

Fig. 9: Multi-core profiling (hybrid sampled).

Fig. 10: Multi-core profiling (cycle accurate).

Fig. 11: Pipeline view of instruction stalls and latency issues (cycle accurate).

VII. CONCLUSION

The LZ77 algorithm explores a mechanism for compressing data on SSDs. The robust LZ77 data compression technique is a simple and effective approach to compressing data, and it uses the redundant nature of data to provide a good compression ratio. A 12% increase in LZ77 code size improves the performance to 8 bytes per cycle at a time. Due to the wide memory interface available at the local memories, processing can be extended to 16, 32 or 64 bytes at a time. The approximate average speed of LZ77 in the solid-state drive (SSD) controller's read/write cycles is 180 Mb/s. The compression ratio will be higher when speed increases, making the sending and retrieving of data between the host and the flash memory in SSDs faster.

References

[1] David Salomon, Data Compression: The Complete Reference, 4th Edition (with contributions by Giovanni Motta and David Bryant), Springer, December 2006.
[2] Adrian Traian Murgan and Radu Radescu, "A Comparison of Algorithms for Lossless Data Compression Using the Lempel-Ziv-Welch Type Methods", IEEE, pp. 105-111, 1996.
[3] Senthil Shanmugasundaram and Robert Lourdusamy, "A Comparative Study of Text Compression Algorithms", International Journal of Wisdom Based Computing, Vol. 1 (3), December 2011.
[4] Khalid Sayood, Introduction to Data Compression, 2nd Edition, Morgan Kaufmann, San Francisco, CA, 2000.
[5] C. Fraser, "An instruction for direct interpretation of LZ77-compressed programs", Technical Report MSR-TR-2002-90, September 2002.
[6] Sungjin Lee, Jihoon Park, Kermin Fleming and Arvind, "Improving Performance and Lifetime of Solid-State Drives Using Hardware-Accelerated Compression", IEEE Transactions on Consumer Electronics, Vol. 57, Issue 4, pp. 1732-1739, 2011.
[7] Hu Yuanfu and Wu Xunsen, "The methods of improving the compression algorithms", 3rd International Conference on Signal Processing, Vol. 1, pp. 698-701, 1996, DOI: 10.1109/ICSIGP.1996.567359.
[8] D. A. Huffman, "A method for the construction of minimum-redundancy codes", Proceedings of the Institute of Radio Engineers, 40 (9), pp. 1098-1101, September 1952.
[9] Rich Geldreich, Jr., "Simple Hashing LZ77 Sliding Dictionary Compression Program", PROG1.C, October 1993.
[10] Rastislav Lenhardt and Jyrki Alakuijala, "Gipfeli: High Speed Compression Algorithm", University of Oxford, United Kingdom.
[8] Huffman D.A., A method for the construction of minimumredundancy codes, Proceedings of the Institute of Radio Engineers, 40 (9), pp. 1098 1101, September 1952. [9] Simple Hashing LZ77 sliding Dictionary Compression Program, PROG1.C, by Rich Geldreih, Jr October, 1993. [10] Gipfeli High Speed Compreeion Algorithm Rastislav Lenhardt, and Jyrki Alakuijala, University of Oxford, united Kingdom. 153