DNA Inspired Bi-directional Lempel-Ziv-like Compression Algorithms

1 DNA Inspired Bi-directional Lempel-Ziv-like Compression Algorithms
Transmission Systems Group (TrSys)
Attiya Mahmood, Nazia Islam, Dawit Nigatu, Werner Henkel

2 Outline
Inspiration
Description of Bi-directional Algorithms: LZ77, LZ78, and LZW84
Results
Conclusion

3 Inspiration
Bi-directional Reading and Lempel-Ziv-like Processes in the DNA
Bi-directional processes: DNA Replication, Transcription, Translation
Lempel-Ziv-like process: Alternative Splicing

4 Bi-directional Processes: Replication
DNA has two ends in the two strands: 3' to 5'
Bi-directionality in reading in complementary strands
Figure: DNA replication ( node=photos/graphics&id=85151)

5 Bi-directional Processes: Gene Expression
Transcription
Figure: mRNA transcription process (en.wikipedia.org/wiki/transcription(genetics))

6 Bi-directional Processes: Gene Expression
Translation
>gi :c Escherichia coli str. K-12 substr. W3110, Complete genome
ATGAATGTTAATTACCTGAATGATTCAGATCTGGATTTTCTTCAGCATTGTAGTGAGGAACAGTTGGCAAATTTC
GCCCGATTGCTCACCCATAATGAAAAAGGCAAAACTCGCCTCTCCAGCGTACTGATGCGCAACGAACTGTTAAATC
GATGGAAGGGCATCCCGAGCAACATCGCCGCAACTGGCAGCTGATTGCCGGAGAATTACAGCATTTTGGTGGCGAT
AGTATCGCCAACAAACTGCGCGGACACGGTAAATTGTATCGGGCCATTTTGCTCGATGTTTCAAAGCGATTGAAGC
TGAAAGCCGACAAAGAGATGTCTACGTTTGAAATTGAGCAACAGTTACTGGAACAATTTCTGCGTAATACCTGGA
AGAAAATGGACGAGGAACATAAGCAGGAGTTTCTGCACGCGGTCGATGCCAGGGTGAATGAGCTGGAAGAGCTGC
TGCGCTGCTGATGAAAGACAAATTATTGGCAAAAGGTGTGTCGCATTTGCTTTCCAGCCAACTGACCCGCATTTTA
CGCACCCACGCAGCAATGAGCGTACTTGGGCATGGTTTGCTGCGCGGCGCGGGGCTGGGAGGCCCTGTAGGTGCGG
CACTAAATGGGGTTAAAGCGGTCAGCGGCAGCGCCTATCGCGTGACGATTCCAGCCGTACTGCAAATCGCCTGCCT
GCGCCGGATGGTTAGCGCCACTCAGGTCTGA
Figure: Genes htgA and yaaW in overlapping position on opposite strands (nucleotide sequence of gene yaaW of E. coli K12)

7 Alternative Splicing
Figure: Alternative splicing

8 Bi-directional Lempel-Ziv-like Algorithms
Bi-directional matching criterion
Cost: 1-bit direction flag
Flag bit 0: regular direction
Flag bit 1: reverse direction

9 Bi-directional LZ77
Data traversed in: memory and look-ahead buffers
Size of buffers limited
Output: (position, length, non-matching character, flag)
Text block to be compressed: ecabbaced

Encoder output    String at specified index
0,0,e,0           e
0,0,c,0           c
0,0,a,0           a
0,0,b,0           b
3,4,d,1           baced
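The LZ77 example above can be reproduced with a short sketch. This is not the authors' implementation. It assumes that `position` is the 0-based index in the already-coded text at which reading starts, that a reverse match reads leftwards from that index, that matches may not run into the look-ahead part, and that ties go to the forward direction. The function names and the `window` parameter are hypothetical.

```python
def bilz77_encode(text, window=4096):
    """Greedy bi-directional LZ77 sketch: (position, length, next_char, flag)."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        best_pos, best_len, best_flag = 0, 0, 0
        start = max(0, i - window)           # left edge of the memory buffer
        for p in range(start, i):
            # Regular direction: read text[p], text[p+1], ... (flag 0).
            k = 0
            while i + k < n - 1 and p + k < i and text[p + k] == text[i + k]:
                k += 1
            if k > best_len:
                best_pos, best_len, best_flag = p, k, 0
            # Reverse direction: read text[p], text[p-1], ... (flag 1).
            k = 0
            while i + k < n - 1 and p - k >= start and text[p - k] == text[i + k]:
                k += 1
            if k > best_len:
                best_pos, best_len, best_flag = p, k, 1
        # One symbol is always sent explicitly after the match.
        tokens.append((best_pos, best_len, text[i + best_len], best_flag))
        i += best_len + 1
    return tokens


def bilz77_decode(tokens):
    """Inverse of bilz77_encode under the same assumptions."""
    buf = []
    for pos, length, char, flag in tokens:
        step = -1 if flag else 1             # flag 1 means read leftwards
        for k in range(length):
            buf.append(buf[pos + step * k])
        buf.append(char)
    return "".join(buf)
```

Under these assumptions, `bilz77_encode("ecabbaced")` yields the five tuples of the table: the last one, `(3, 4, "d", 1)`, reads "bace" leftwards starting at index 3 of "ecab".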

14 Bi-directional LZ78
Dictionary-based approach
Dictionary size limited
Output: (index, next non-matching character, flag)
Text block to be compressed: ecbebbec

Encoder dictionary   Output   String at specified index   Decoder dictionary
1: e                 0,e,0    e                           1: e
2: c                 0,c,0    c                           2: c
3: b                 0,b,0    b                           3: b
4: eb                1,b,0    eb                          4: eb
5: bec               4,c,1    bec                         5: bec
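A minimal sketch of the LZ78 variant, written to reproduce the table above rather than to mirror the authors' code. It assumes an unbounded dictionary (the slide's size limit is ignored), 1-based indices with 0 meaning "no match", new phrases stored as they appear in the text, and forward matches preferred on ties. All names are hypothetical.

```python
def bilz78_encode(text):
    """Bi-directional LZ78 sketch: tokens are (index, next_char, flag)."""
    entries = []                             # entries[k] has dictionary index k + 1
    tokens = []
    i = 0
    while i < len(text):
        best_idx, best_len, best_flag = 0, 0, 0
        for idx, phrase in enumerate(entries, start=1):
            size = len(phrase)
            # Only longer matches matter, and one character must remain
            # after the match to be sent explicitly.
            if size <= best_len or i + size >= len(text):
                continue
            segment = text[i:i + size]
            if segment == phrase:            # regular direction
                best_idx, best_len, best_flag = idx, size, 0
            elif segment == phrase[::-1]:    # reverse direction
                best_idx, best_len, best_flag = idx, size, 1
        tokens.append((best_idx, text[i + best_len], best_flag))
        entries.append(text[i:i + best_len + 1])
        i += best_len + 1
    return tokens


def bilz78_decode(tokens):
    """Rebuilds the text and the same dictionary from the tokens."""
    entries, pieces = [], []
    for idx, char, flag in tokens:
        phrase = entries[idx - 1] if idx else ""
        if flag:
            phrase = phrase[::-1]
        phrase += char
        entries.append(phrase)
        pieces.append(phrase)
    return "".join(pieces)
```

For "ecbebbec" the last token is `(4, "c", 1)`: entry 4 ("eb") read in reverse gives "be", followed by the explicit "c", which matches dictionary entry 5 ("bec") in the table.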

20 Bi-directional LZW84
Initialized dictionary
Output: (index, flag)
Text block to be compressed: ecabaced

Encoder dictionary   Output   String at specified index   Decoder dictionary
27: ec               5,0      e                           27: e?
28: ca               3,0      c                           27: ec; 28: c?
29: ab               1,0      a                           28: ca; 29: a?
30: bac              29,1     ?a                          29: a?; 30: ?a?
31: ced              27,1     ce                          30: ?ac; 31: ce?

26 Bi-directional LZW84
Last dictionary entry cannot be used for reverse reading: its final character is still unknown to the decoder (the "?" in the decoder dictionary)
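The restriction on the last dictionary entry can be made concrete with a sketch. It is an illustration under assumptions, not the authors' code: the initial dictionary is taken to be the lowercase letters a to z at indices 1 to 26 (consistent with e = 5 and the first new entry at 27), the dictionary is unbounded, and the encoder simply refuses to use the most recently added entry in reverse. With that rule the parse of "ecabaced" differs from the table above, which appears to show the problematic case (29,1 right after entry 29 was created). All names are hypothetical.

```python
import string


def bilzw_encode(text):
    """Bi-directional LZW sketch: tokens are (index, flag)."""
    table = {ch: k + 1 for k, ch in enumerate(string.ascii_lowercase)}
    last = None                              # index of the newest entry
    max_len = 1
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            segment = text[i:i + size]
            if segment in table:             # regular direction
                token = (table[segment], 0)
                break
            mirrored = segment[::-1]
            # Reverse reading is allowed for every entry except the newest.
            if mirrored in table and table[mirrored] != last:
                token = (table[mirrored], 1)
                break
        else:
            raise ValueError("symbol outside the assumed a-z alphabet")
        tokens.append(token)
        if i + size < len(text):             # new entry: match plus next symbol
            table[text[i:i + size + 1]] = last = len(table) + 1
            max_len = max(max_len, size + 1)
        i += size
    return tokens


def bilzw_decode(tokens):
    """Inverse of bilzw_encode under the same assumptions."""
    table = {k + 1: ch for k, ch in enumerate(string.ascii_lowercase)}
    pieces, prev = [], None
    for idx, flag in tokens:
        if idx in table:
            piece = table[idx][::-1] if flag else table[idx]
        elif flag == 0 and prev is not None and idx == len(table) + 1:
            piece = prev + prev[0]           # classic LZW case: entry not yet complete
        else:
            raise ValueError("reverse use of an incomplete entry")
        if prev is not None:
            table[len(table) + 1] = prev + piece[0]
        pieces.append(piece)
        prev = piece
    return "".join(pieces)
```

With the restriction in place, "ecabaced" is parsed as e, c, a, b, ac, e, d: the encoder cannot use entry 29 ("ab") in reverse at the fourth step, so it emits "b" alone and later uses entry 28 ("ca") in reverse for "ac".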

27 Bi-directional LZW84: Symmetry
Text compression: ASCII characters in initialized dictionary
Bit pattern symmetry
Use flag bit to indicate inverted bit pattern
Initial dictionary is half the size of its original counterpart
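One possible reading of the symmetry idea, sketched under explicit assumptions: the slide does not define "inverted", so this example takes it to mean the bitwise complement of an 8-bit symbol. Under that reading every byte either lies in the lower half of the alphabet or is the complement of one that does, so 128 initial entries plus the flag cover all 256 values. The function names are hypothetical.

```python
def to_half_dictionary(byte):
    """Map an 8-bit symbol to (index, flag) with index in 0..127.

    Assumption: flag 1 means "take the bitwise complement of the entry".
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    if byte < 0x80:
        return byte, 0                       # stored pattern, used as is
    return byte ^ 0xFF, 1                    # complement falls in 0..127


def from_half_dictionary(index, flag):
    """Recover the original 8-bit symbol from (index, flag)."""
    return index ^ 0xFF if flag else index
```

For example, `to_half_dictionary(0xE5)` gives `(0x1A, 1)`, and `from_half_dictionary(0x1A, 1)` returns `0xE5`.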

28 Results
Figure: Comparison of conventional and modified LZ77 (axis labels: Compression Ratio, N SB; curves: Audio [F], Audio [FR], Image [F], Image [FR], Text [F], Text [FR])

29 Results
Figure: Comparison of conventional and modified LZ78 (axis labels: Compression Ratio, N SB; curves: Audio [F], Audio [FR], Image [F], Image [FR], Text [F], Text [FR])

30 Results
Figure: Comparison of conventional and modified LZW84 (axis labels: Compression Ratio, Finite Dictionary Size; curves: Audio [F], Audio [FR], Image [F], Image [FR], Text [F], Text [FR])

31 Conclusion
The proposed algorithms perform better when there is some symmetry in the data
Employ this for DNA compression using conjugate reverse relations, which may use non-binary flags
Analytical treatment of how the properties differ from those of the original Lempel-Ziv algorithms

32 Thank You!

DNA Inspired Bi-directional Lempel-Ziv-like Compression Algorithms. Attiya Mahmood, Nazia Islam, Dawit Nigatu, and Werner Henkel. Jacobs University Bremen, Electrical Engineering and Computer Science, Bremen


More information

Supporting Level 2 Functionality

Supporting Level 2 Functionality Supporting Level 2 Functionality Adobe Developer Support Technical Note #5110 31 March 1992 Adobe Systems Incorporated Adobe Developer Technologies 345 Park Avenue San Jose, CA 95110 http://partners.adobe.com/

More information

Optimized Compression and Decompression Software

Optimized Compression and Decompression Software 2015 IJSRSET Volume 1 Issue 3 Print ISSN : 2395-1990 Online ISSN : 2394-4099 Themed Section: Engineering and Technology Optimized Compression and Decompression Software Mohd Shafaat Hussain, Manoj Yadav

More information

Eval: A Gene Set Comparison System

Eval: A Gene Set Comparison System Masters Project Report Eval: A Gene Set Comparison System Evan Keibler evan@cse.wustl.edu Table of Contents Table of Contents... - 2 - Chapter 1: Introduction... - 5-1.1 Gene Structure... - 5-1.2 Gene

More information

On Data Latency and Compression

On Data Latency and Compression On Data Latency and Compression Joseph M. Steim, Edelvays N. Spassov, Kinemetrics, Inc. Abstract Because of interest in the capability of digital seismic data systems to provide low-latency data for Early

More information

CIS-331 Fall 2014 Exam 1 Name: Total of 109 Points Version 1

CIS-331 Fall 2014 Exam 1 Name: Total of 109 Points Version 1 Version 1 1. (24 Points) Show the routing tables for routers A, B, C, and D. Make sure you account for traffic to the Internet. Router A Router B Router C Router D Network Next Hop Next Hop Next Hop Next

More information

International Journal of Advanced Research in Computer Science and Software Engineering

International Journal of Advanced Research in Computer Science and Software Engineering ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: Enhanced LZW (Lempel-Ziv-Welch) Algorithm by Binary Search with

More information

More Bits and Bytes Huffman Coding

More Bits and Bytes Huffman Coding More Bits and Bytes Huffman Coding Encoding Text: How is it done? ASCII, UTF, Huffman algorithm ASCII C A T Lawrence Snyder, CSE UTF-8: All the alphabets in the world Uniform Transformation Format: a variable-width

More information

12 Algorithms in Bioinformatics I, WS 06, ZBIT, D. Huson, October 18, 2006

12 Algorithms in Bioinformatics I, WS 06, ZBIT, D. Huson, October 18, 2006 12 Algorithms in Bioinformatics I, WS 06, ZBIT, D. Huson, October 18, 2006 3 Sequence comparison by compression This chapter is based on the following articles, which are all recommended reading: X. Chen,

More information

LOSSLESS DATA COMPRESSION AND DECOMPRESSION ALGORITHM AND ITS HARDWARE ARCHITECTURE

LOSSLESS DATA COMPRESSION AND DECOMPRESSION ALGORITHM AND ITS HARDWARE ARCHITECTURE LOSSLESS DATA COMPRESSION AND DECOMPRESSION ALGORITHM AND ITS HARDWARE ARCHITECTURE V V V SAGAR 1 1JTO MPLS NOC BSNL BANGALORE ---------------------------------------------------------------------***----------------------------------------------------------------------

More information

DLD VIDYA SAGAR P. potharajuvidyasagar.wordpress.com. Vignana Bharathi Institute of Technology UNIT 3 DLD P VIDYA SAGAR

DLD VIDYA SAGAR P. potharajuvidyasagar.wordpress.com. Vignana Bharathi Institute of Technology UNIT 3 DLD P VIDYA SAGAR DLD UNIT III Combinational Circuits (CC), Analysis procedure, Design Procedure, Combinational circuit for different code converters and other problems, Binary Adder- Subtractor, Decimal Adder, Binary Multiplier,

More information

FPGA based Data Compression using Dictionary based LZW Algorithm

FPGA based Data Compression using Dictionary based LZW Algorithm FPGA based Data Compression using Dictionary based LZW Algorithm Samish Kamble PG Student, E & TC Department, D.Y. Patil College of Engineering, Kolhapur, India Prof. S B Patil Asso.Professor, E & TC Department,

More information

On the Suitability of Suffix Arrays for Lempel-Ziv Data Compression

On the Suitability of Suffix Arrays for Lempel-Ziv Data Compression On the Suitability of Suffix Arrays for Lempel-Ziv Data Compression Artur J. Ferreira 1,3 Arlindo L. Oliveira 2,4 Mário A. T. Figueiredo 3,4 1 Instituto Superior de Engenharia de Lisboa, Lisboa, PORTUGAL

More information

Nature Biotechnology: doi: /nbt Supplementary Figure 1

Nature Biotechnology: doi: /nbt Supplementary Figure 1 Supplementary Figure 1 Detailed schematic representation of SuRE methodology. See Methods for detailed description. a. Size-selected and A-tailed random fragments ( queries ) of the human genome are inserted

More information

Image Compression. cs2: Computational Thinking for Scientists.

Image Compression. cs2: Computational Thinking for Scientists. Image Compression cs2: Computational Thinking for Scientists Çetin Kaya Koç http://cs.ucsb.edu/~koc/cs2 koc@cs.ucsb.edu The course was developed with input from: Ömer Eǧecioǧlu (Computer Science), Maribel

More information

Tutorial for the Exon Ontology website

Tutorial for the Exon Ontology website Tutorial for the Exon Ontology website Table of content Outline Step-by-step Guide 1. Preparation of the test-list 2. First analysis step (without statistical analysis) 2.1. The output page is composed

More information

Repetition 1st lecture

Repetition 1st lecture Repetition 1st lecture Human Senses in Relation to Technical Parameters Multimedia - what is it? Human senses (overview) Historical remarks Color models RGB Y, Cr, Cb Data rates Text, Graphic Picture,

More information

A Comprehensive Review of Data Compression Techniques

A Comprehensive Review of Data Compression Techniques Volume-6, Issue-2, March-April 2016 International Journal of Engineering and Management Research Page Number: 684-688 A Comprehensive Review of Data Compression Techniques Palwinder Singh 1, Amarbir Singh

More information

Fast Two-Stage Lempel-Ziv Lossless Numeric Telemetry Data Compression Using a Neural Network Predictor

Fast Two-Stage Lempel-Ziv Lossless Numeric Telemetry Data Compression Using a Neural Network Predictor Journal of Universal Computer Science, vol. 10, no. 9 (2004), 1199-1211 submitted: 9/11/03, accepted: 9/6/04, appeared: 28/9/04 J.UCS Fast Two-Stage Lempel-Ziv Lossless Numeric Telemetry Data Compression

More information

CIS-331 Exam 2 Fall 2014 Total of 105 Points. Version 1

CIS-331 Exam 2 Fall 2014 Total of 105 Points. Version 1 Version 1 1. (20 Points) Given the class A network address 119.0.0.0 will be divided into a maximum of 15,900 subnets. a. (5 Points) How many bits will be necessary to address the 15,900 subnets? b. (5

More information

Indexing. CS6200: Information Retrieval. Index Construction. Slides by: Jesse Anderton

Indexing. CS6200: Information Retrieval. Index Construction. Slides by: Jesse Anderton Indexing Index Construction CS6200: Information Retrieval Slides by: Jesse Anderton Motivation: Scale Corpus Terms Docs Entries A term incidence matrix with V terms and D documents has O(V x D) entries.

More information

Lecture 6 Review of Lossless Coding (II)

Lecture 6 Review of Lossless Coding (II) Shujun LI (李树钧): INF-10845-20091 Multimedia Coding Lecture 6 Review of Lossless Coding (II) May 28, 2009 Outline Review Manual exercises on arithmetic coding and LZW dictionary coding 1 Review Lossy coding

More information

CIS-331 Exam 2 Spring 2016 Total of 110 Points Version 1

CIS-331 Exam 2 Spring 2016 Total of 110 Points Version 1 Version 1 1. (20 Points) Given the class A network address 121.0.0.0 will be divided into multiple subnets. a. (5 Points) How many bits will be necessary to address 8,100 subnets? b. (5 Points) What is

More information

Implementation of BDEA-A DNA cryptography for securing data in cloud

Implementation of BDEA-A DNA cryptography for securing data in cloud Implementation of BDEA-A DNA cryptography for securing data in cloud Hema N 1 Sonali Ulhas Kadwadkar 2 1 Assistant Professor, Dept of Information Science Engineering, RNSIT Bangalore, Karnataka, India

More information

Small-Space 2D Compressed Dictionary Matching

Small-Space 2D Compressed Dictionary Matching Small-Space 2D Compressed Dictionary Matching Shoshana Neuburger 1 and Dina Sokol 2 1 Department of Computer Science, The Graduate Center of the City University of New York, New York, NY, 10016 shoshana@sci.brooklyn.cuny.edu

More information

Implementation and Optimization of LZW Compression Algorithm Based on Bridge Vibration Data

Implementation and Optimization of LZW Compression Algorithm Based on Bridge Vibration Data Available online at www.sciencedirect.com Procedia Engineering 15 (2011) 1570 1574 Advanced in Control Engineeringand Information Science Implementation and Optimization of LZW Compression Algorithm Based

More information

A Method for Virtual Extension of LZW Compression Dictionary

A Method for Virtual Extension of LZW Compression Dictionary A Method for Virtual Extension of Compression Dictionary István Finta, Lóránt Farkas, Sándor Szénási and Szabolcs Sergyán Technology and Innovation, Nokia Networks, Köztelek utca 6, Budapest, Hungary Email:

More information

Digital Logic Lecture 4 Binary Codes

Digital Logic Lecture 4 Binary Codes Digital Logic Lecture 4 Binary Codes By Ghada Al-Mashaqbeh The Hashemite University Computer Engineering Department Outline Introduction. Character coding. Error detection codes. Gray code. Decimal coding.

More information