Lecture on Computer Networks
1 Lecture on Computer Networks Historical Development Copyright (c) 2008 Dr. Thomas Haenselmann (Saarland University, Germany). Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
2 Source coding in data networks Motivation for source coding in data networks This short introduction addresses the problem of how to code payload optimally for the network. Theoretically, for an error-free communication channel, no special precautions are necessary; we could simply forward the bits. More realistically, we know that the channel introduces errors. So it can make sense to send more bits than just the data itself, to account for the erroneous channel.
3 Source coding in data networks Error control: Detecting vs. Correcting Basically two variants are distinguished: Error detecting codes Error correcting codes (Forward Error Correction) Both variants need a certain amount of redundancy. Which variant makes sense in which situation? Error detection: Detection might be sufficient if the sink can ask the source for repeated transmission. Therefore, a feedback channel and a defined protocol are necessary. In some cases, erroneous data can simply be deleted without retransmission (e.g., IP-telephony).
4 Source coding in data networks Error control: Detecting vs. Correcting Error correcting codes: A larger amount of redundancy is necessary. This makes sense if no feedback channel is available or a retransmission would delay the packet considerably (e.g., telephony, video conferences). What is better: Forward error correction or retransmission?
5 Source coding in data networks Error control: The Hamming-Distance The Hamming-Distance of two codewords corresponds exactly to the number of bits in which the codewords differ. It is calculated using the XOR operation: XOR the two operands and count the 1-bits in the result. In other words: If two codewords have the Hamming-Distance d, then it is necessary to toggle d bits to transform one codeword into the other.
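The XOR-and-count computation described above can be sketched as follows (the function name hamming_distance is my own):

```cpp
#include <cstdint>

// Hamming distance of two codewords: XOR them, then count the 1-bits.
// Every set bit in the XOR result marks a position where the words differ.
int hamming_distance(uint32_t a, uint32_t b) {
    uint32_t x = a ^ b;
    int d = 0;
    while (x) { d += x & 1u; x >>= 1; }
    return d;
}
```

For example, 1011 and 1001 differ in exactly one bit position, so their distance is 1.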
6 Source coding in data networks Error control: The Hamming-Distance The Hamming-Distance of an entire code is defined as the smallest possible distance two arbitrary (but different) codewords of a code can have. To detect errors we need a code with valid and invalid words. To be able to detect d bit errors in a codeword the code must have a distance of d+1. Why does less distance not suffice?
7 Source coding in data networks Error control: Error correcting codes Error correcting codes need a distance of 2d+1 if d errors need to be detected and corrected. Why does less distance not suffice? With a distance of 2d+1 we need to toggle 2d+1 bits to get from one valid codeword to another. If at most d bits are toggled by errors, the received word still has distance d to the original codeword, but distance at least d+1 to every other valid codeword. Decoding to the nearest valid codeword therefore always recovers the original. Example (shown on the slide): 2 data bits are encoded by 10-bit codewords; compare the original code with the error correcting code. How many bit errors can be detected and corrected here?
8 Source coding in data networks Error control: Redundancy estimation of error correcting codes How many bits do we need at least for the correction of a bit error? We want to have 2^m valid codewords. r correction bits are needed. The error correcting code will have n = (m + r) bits in total. By toggling one of the n bits (redundant bits may be toggled, too) we get an (illegal) codeword with a distance of 1. That means, for each of the 2^m valid words we can create n invalid words. Hence follows (n + 1) * 2^m <= 2^n.
9 Source coding in data networks Error control: Redundancy estimation of error correcting codes How many bits do we need at least for the correction of a bit error? (n + 1) is composed of n invalid codewords (each created by one bit error) and one valid codeword. The number of words of the error correcting code stands on the right side of the inequality. n can be written as m data bits plus r correction bits: (m + r + 1) * 2^m <= 2^(m+r), and dividing by 2^m gives (m + r + 1) <= 2^r. With a given m we can estimate the number of correction bits a code needs in any case.
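A minimal sketch of this bound (the function name min_check_bits is my own): it searches for the smallest r satisfying m + r + 1 <= 2^r.

```cpp
// Lower bound on the number of check bits r needed to correct a single
// bit error in m data bits: smallest r with m + r + 1 <= 2^r.
int min_check_bits(int m) {
    int r = 1;
    while (m + r + 1 > (1 << r)) ++r;
    return r;
}
```

For m = 4 this yields r = 3 (the Hamming(7,4) code), and for m = 11 it yields r = 4 (the Hamming(15,11) code), so the bound is tight for these classic codes.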
10 Source coding in data networks Error control: The error correcting Hamming code The Hamming code achieves the minimum of the previous estimation. Algorithm: Number the bit positions starting with 1. All bits at positions that are powers of two (1, 2, 4, 8, ...) are check bits, the rest are data bits. The data bits are filled up from left to right with the actual data; the check bits only depend on the data bits. Which data bit influences which check bit?
11 Source coding in data networks Error control: The error correcting Hamming code To see this, convert the (position) number of a bit into its binary representation: 11 = 1011 (binary) = 8 + 2 + 1. Data bit 11 hence influences check bits 8, 2 and 1. Check bit 1 is of course also influenced by the data bits 3, 5, 7 and 9. By definition all check bits combined with their data bits have to exhibit an even (or odd, depending on what was agreed upon) parity.
12 Source coding in data networks Error control: The error correcting Hamming code Example (worked on the slide): a data word without check bits is converted into a codeword by inserting the check bits at positions 1, 2, 4 and 8 and computing each one from the parity of the data bits it covers. Correction procedure: First, a counter is set to zero. Then, the check bits are checked one by one. If one of them produces the wrong parity, the number of the corresponding check bit is added to the counter. In the end, the counter points to the toggled bit.
13 Source coding in data networks Error control: The error correcting Hamming code Another interpretation of the error recovery scheme: Why does the counter value reference the defective bit? Check bit 1 wrong? Yes: one of the bits 1, 3, 5, 7, 9, 11 must be defective. Check bit 2 wrong? No: of these candidates, only bits 1, 5 and 9 remain (3, 7 and 11 are also covered by check bit 2). Check bit 4 wrong? Yes: only bit 5 is left.
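The scheme above can be sketched for the small Hamming(7,4) case (function names and the bit-packing convention are my own; check bits sit at positions 1, 2 and 4, even parity):

```cpp
// Encode 4 data bits (positions 3, 5, 6, 7) into a 7-bit Hamming codeword.
// The returned int holds position p in bit (p - 1).
int hamming7_encode(int d3, int d5, int d6, int d7) {
    int c1 = d3 ^ d5 ^ d7;   // covers positions with binary digit 1 set: 3,5,7
    int c2 = d3 ^ d6 ^ d7;   // covers positions with binary digit 2 set: 3,6,7
    int c4 = d5 ^ d6 ^ d7;   // covers positions with binary digit 4 set: 5,6,7
    int bits[8] = {0, c1, c2, d3, c4, d5, d6, d7};
    int w = 0;
    for (int pos = 1; pos <= 7; ++pos)
        if (bits[pos]) w |= 1 << (pos - 1);
    return w;
}

// Recompute each parity; the failing check bits sum to the position of the
// toggled bit -- exactly the "counter" from the slide.
int hamming7_correct(int w) {
    int b[8];
    for (int pos = 1; pos <= 7; ++pos) b[pos] = (w >> (pos - 1)) & 1;
    int counter = 0;
    if (b[1] ^ b[3] ^ b[5] ^ b[7]) counter += 1;
    if (b[2] ^ b[3] ^ b[6] ^ b[7]) counter += 2;
    if (b[4] ^ b[5] ^ b[6] ^ b[7]) counter += 4;
    if (counter) w ^= 1 << (counter - 1);   // flip the defective bit back
    return w;
}
```

Toggling any single bit of a codeword and running the correction recovers the original word.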
14 Source coding in data networks Error control: Cyclic Redundancy Check (CRC) CRC is based on the idea of polynomial division. Remember:
(x^5 + x^3 + x + 1) : (x + 1) = x^4 - x^3 + 2x^2 - 2x + 3 - 2/(x + 1)
-(x^5 + x^4)
  -x^4 + x^3
-(-x^4 - x^3)
   2x^3 + x
 -(2x^3 + 2x^2)
  -2x^2 + x
-(-2x^2 - 2x)
   3x + 1
 -(3x + 3)
   -2   <- remainder or modulus
Check: [x^4 - x^3 + 2x^2 - 2x + 3 - 2/(x + 1)] * (x + 1) = x^5 + x^3 + x + 1. What's the difference between polynomial division and normal division?
15 Source coding in data networks Error control: Cyclic Redundancy Check A bit string is interpreted as a polynomial by numbering the bits consecutively and, if a bit is set, adding the corresponding term to the polynomial. In other words: Use the bits as coefficients. Example: the data bits 11010101 correspond to the polynomial 1*x^7 + 1*x^6 + 0*x^5 + 1*x^4 + 0*x^3 + 1*x^2 + 0*x^1 + 1*x^0 = x^7 + x^6 + x^4 + x^2 + 1.
16 Source coding in data networks Error control: Cyclic Redundancy Check The principle of CRC: Sender and recipient agree upon a divisor polynomial, also called generator polynomial. Then, g zeros are added to the message, g being the degree of the generator polynomial. In the next step, the sender divides the message (extended by g zeros), similar to the polynomial division. In most cases there will be a remainder, the result of the division is of no interest. The remainder is then subtracted from the message (being extended by the zeros). The resulting bit string is now transferred to the recipient. If the message was transmitted correctly, no remainder should emerge on the recipient's side. Why? Because the sender intentionally subtracted the remainder before sending the message. The g zeros which emerge after the division are interpreted by the recipient as an indication of an error free transmission. Note: The sender can safely subtract the remainder without harming the message. Why?
17 Source coding in data networks Error control: Cyclic Redundancy Check The only difference to normal polynomial division: Calculations are binary, and after the calculation of each digit a modulo 2 operation is performed! In other words: Always ignore the carry-over. This simplifies the addition and subtraction significantly. Discovery: Both operations, plus and minus, are equivalent to the XOR operation.
18 Source coding in data networks Error control: Cyclic Redundancy Check (CRC) Example (worked on the slide): Generator polynomial: (x^4 + x + 1) = 4th degree (5th order), as bit string: 10011. The message, extended by 4 zeros, is divided by 10011; the result of the division does not matter, only the 4-bit remainder. Special note: The division continues if the MSB (most significant bit) of the bit string currently being divided is set. Sometimes the bit string has to be extended by new bits (from the message) until the generator polynomial fits under it. Finally: message minus (XOR) remainder = transmitted message, which should generate no remainder when divided.
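The division-by-XOR procedure can be sketched on bit strings directly (the function name crc_remainder is my own; the message 1101011011 below is an assumed example, not taken from the slide):

```cpp
#include <string>

// CRC "division": append deg(G) zeros to the message, then XOR the
// generator under the bit string wherever the current leading bit is 1.
// message and generator are bit strings such as "1101011011" and "10011".
std::string crc_remainder(std::string message, const std::string& generator) {
    const size_t g = generator.size();       // g - 1 = degree of G(x)
    message.append(g - 1, '0');              // extend message by deg(G) zeros
    for (size_t i = 0; i + g <= message.size(); ++i)
        if (message[i] == '1')
            for (size_t j = 0; j < g; ++j)   // subtract (= XOR) the generator
                message[i + j] = (message[i + j] == generator[j]) ? '0' : '1';
    return message.substr(message.size() - (g - 1));
}
```

With the generator 10011 (x^4 + x + 1), the message 1101011011 yields the remainder 1110; XORing that remainder into the appended zeros gives the transmitted frame, which divides without remainder.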
19 Source coding in data networks Error control: (CRC) Recognized errors Which errors are recognized? For the following analysis separate the error from the message. Transmitted message (resp. the corresponding polynomial) including an error: M(x). Original message (error free): T(x). Separate the transmitted message into M(x) = T(x) + E(x), with E(x) being the isolated error. Every bit which is set in E stands for a toggled bit in M. A sequence from the first 1-bit to the last 1-bit is called a burst error. A burst error can occur anywhere in E. Question: Does the following division by the generator polynomial G(x) produce a remainder? If not, we cannot detect the error: [T(x) + E(x)] : G(x) = remainder-less? T(x) : G(x) is divisible without any remainder, because we constructed the message exactly for this property. The analysis is therefore reduced to the question whether E(x) : G(x) erroneously results in no remainder, thus passing undetected.
20 Source coding in data networks Error control: (CRC) Recognized errors Which errors are recognized? 1-bit error: The burst consists of only one error. If the generator polynomial has more than one coefficient, E(x) with a leading 1 followed by zeros cannot be divided without a remainder. So we are on the safe side with regard to 1-bit errors. Our generator polynomial is at least as good as a parity bit. Example: 1000(...)0 : 101 = ... continued as above; the single 1 can never be cancelled completely.
21 Source coding in data networks Error control: (CRC) Recognized errors Which errors are recognized? 2-bit error: A 2-bit error must look like this: x^i + x^j (with i > j), therefore x^j can be factored out, which results in x^j * (x^(i-j) + 1). It has already been shown that a generator polynomial with more than one term cannot divide the factor x^j. When is a term (x^k + 1) divided (with k = i - j)? For a given generator polynomial this has to be tested for 2-bit bursts of different lengths. Here, the error (inevitably) has the form 10(...)01. What follows is an example program to test whether the generator polynomial x^15 + x is useful for detecting 2-bit errors.
22 Source coding in data networks Error control: (CRC) Recognized errors Which errors are recognized?
#include <iostream>
#include <cstring>
using namespace std;

// tests divisibility by XOR division; implementation not shown on the slide
bool divisible(const char* bits, int len, const char* gen, int gen_len);

int main() {
    const int MAX_LENGTH = 60000;
    // bit string of the generator polynomial under test (previous slide)
    const char* generator = "1000000000000010";     // x^15 + x
    char* bit_string = new char[MAX_LENGTH + 1];
    for (int length = 2; length < MAX_LENGTH; length++) {
        if ((length % 100) == 0) cout << length << endl;
        for (int j = 1; j < length - 1; j++)        // clear bit string
            bit_string[j] = '0';
        bit_string[0] = '1';                        // error form 10(...)01
        bit_string[length - 1] = '1';
        bit_string[length] = 0;
        // test if divisible by the generator polynomial
        if (divisible(bit_string, length, generator, strlen(generator)) == true) {
            cout << "Division successful with length " << length << endl;
            break;
        } // if
    } // for
    delete[] bit_string;
} // main
23 Source coding in data networks Error control: (CRC) Recognized errors Which errors are recognized? Error polynomials with an odd number of terms: Speculation: If the generator polynomial contains the factor (x^1 + x^0), an error string with an odd number of bits cannot be divided. Proof by contradiction: Assume E(x) is divisible by (x^1 + x^0); then the factor can be extracted: E(x) = (x^1 + x^0) * Q(x). So far we only divided polynomials. Now, for the first time, we use them as functions and evaluate them at x = 1. Modulo 2, (x^1 + x^0) = (1 + 1) = 0, hence E(1) = 0 * Q(1) = 0. But E(x) was assumed to contain an odd number of terms, so evaluating it at x = 1 must give 1 (each term contributes a 1, and additions are done modulo 2), not 0. Consequently, the factor (x^1 + x^0) cannot be extracted: E(x) is not divisible by (x^1 + x^0) if it contains an odd number of terms (or error bits). Result: The generator polynomial should contain the factor (x^1 + x^0) to catch all errors with an odd number of bits.
24 Source coding in data networks Error control: (CRC) Recognized errors Which errors are recognized? Recognition of burst errors of length r: The burst error in E(x) could look like this: 0001(anything)1000. To move the last bit to the very right, a factor can be extracted: E(x) = x^i * (x^(r-1) + ... + 1), with i being the number of zeros to the right of the last 1. If the degree of the generator polynomial itself is r (so its bit string is r + 1 bits long), an error of the form (x^(r-1) + ... + 1) cannot be divided either, because the generator polynomial is larger than the error to be divided. Example (decimal system): 99 : 100 is not divisible without a remainder (result 0, remainder 99). Detecting burst errors of length r is trivial in the sense that the error itself simply occurs as the remainder of the division. Even if the burst is just as large as the generator polynomial (which means r + 1 bits), the division yields no remainder only if by chance the error coincides exactly with the generator polynomial. This is possible, but not very likely. Example (decimal system): 100 : 100 = 1 (remainder 0).
25 Source coding in data networks Principle of Data Fountains A number of sources generate a theoretically endless stream of packets which are derived from a file. All packets are pairwise different. source 1 (streaming file A), source 2 (streaming file A), ..., source n (streaming file A) -> sink (downloading file A). Given that the file consists of p packets, the sink needs to receive only any p+k packets to regenerate the file, no matter from which source, no matter which packets. Any choice of p+k packets will do (k < 5% of the file size). (Mystical, right?)
26 Source coding in data networks Principle of Data Fountains Mathematical foundation: Let F be a file split into chunks to form an n x m bit-matrix and let T be an invertible n x n bit-matrix (containing only 0s and 1s). T x F = M. The bit-matrix M is transmitted line-by-line over the net. At the receiver side, the original file is obtained by T^-1 x M = F. If T is the unit matrix, then the sender simply sends the file chunk-by-chunk, as known e.g. from FTP. Advantage: Nothing to compute. Disadvantage: Any missing chunk will corrupt the file.
27 Source coding in data networks Principle of Data Fountains Data fountains rather make use of a matrix T which is more densely populated with 1-bits. The gain is that we don't rely on each individual data chunk: any n received chunks (or slightly more) will do. Algorithm Sender: The sender splits a file into n data chunks (parts). Then, a random bit-vector of size n is created. The chunks with a corresponding 1 are XORed into a new chunk which is broadcast over the network. This process can be repeated endlessly to produce more and more packets from the same file.
28 Source coding in data networks Principle of Data Fountains Receiver: It collects n packets with the corresponding bit-vectors. The receiver knows the bit-vectors used by the sender because it uses exactly the same random number generator. After having obtained n packets with linearly independent bit-vectors (they form the matrix T), it inverts T and transforms the message M back into the original file F. (The random number sources have to be synchronized.)
29 Source coding in data networks Principle of Data Fountains Example (worked on the slide): data chunks, random bit-vectors and the resulting XOR packets, side by side for sender and receiver. The receiver starts decoding the incoming packets no sooner than it has obtained an invertible matrix. If this is not (yet) the case, the receiver gathers more packets (and more random data) until the inverse matrix can be calculated. The random bits are not sent but generated by two synchronized (e.g., by an index) random number generators.
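The receiver's inversion step can be sketched as Gaussian elimination over GF(2) (the function name fountain_decode, the packing of bit-vectors into integers, and the 3-chunk example are my own assumptions; chunks are shortened to a single byte each for illustration):

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Given n received packets -- each a bit-vector over the n source chunks
// (bit i of rows[k] set means chunk i was XORed in) plus the XOR-combined
// payload byte -- recover the original chunks. Returns an empty vector if
// the bit-vectors are not linearly independent (the receiver would then
// simply wait for more packets).
std::vector<uint8_t> fountain_decode(std::vector<uint32_t> rows,
                                     std::vector<uint8_t> payload, int n) {
    for (int col = 0; col < n; ++col) {
        int pivot = -1;                        // find a row with a 1 here
        for (int r = col; r < n; ++r)
            if (rows[r] >> col & 1u) { pivot = r; break; }
        if (pivot < 0) return {};              // matrix T is not invertible
        std::swap(rows[col], rows[pivot]);
        std::swap(payload[col], payload[pivot]);
        for (int r = 0; r < n; ++r)            // clear the column elsewhere
            if (r != col && (rows[r] >> col & 1u)) {
                rows[r] ^= rows[col];
                payload[r] ^= payload[col];    // same XOR applied to the data
            }
    }
    return payload;   // payload[i] now holds original chunk i
}
```

For instance, chunks 0x41, 0x42, 0x43 combined with the independent bit-vectors 011, 110, 111 produce the packets 0x03, 0x01, 0x40, from which the decoder recovers the three chunks.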
30 Source coding in data networks Principle of Data Fountains Pro: A missing packet means a missing random bit-vector. However, the next packet will likely fill the gap. In a traditional transmission, a missing part of a file can be replaced by no packet other than the missing one. Many sources can contribute without mutual cooperation. Con: Inverting the large matrix might be time- and memory-consuming. To generate a single new packet, the sender has to XOR about 50% of the entire file. Improvement: Don't generate purely random bits for the transform, but a matrix which is mostly populated along the diagonal.
31 Motivation for compression in data networks The bandwidth provided by the physical layer is usually not fully available to an application because of the protocol overhead introduced by each layer of the network stack. By optimizing protocols we can try to exploit the resources more efficiently. However, not transmitting redundant data bears a much larger potential for increased efficiency, often several orders of magnitude larger than the wasted network bandwidth. Question: Every once in a while we can read in the press that a compression algorithm has been invented which can (successfully) compress its own output again (repeatedly). In particular, it is usually claimed that random data can be compressed. Why is this unlikely? There is a long history of inventors and companies who claim to have achieved the above. For details see
32 Handling huge data volumes
Text: 1 page with 80 characters/line, 64 lines/page and 1 byte/char results in 80 * 64 * 1 * 8 = ca. 41 kbit/page.
Still image: 24 bits/pixel, 512 x 512 pixel/image results in 512 x 512 x 24 = ca. 6 Mbit/image.
Audio: CD quality, sampling rate 44.1 kHz, 16 bits per sample results in 44.1 x 16 = ca. 706 kbit/s. Stereo: 1.412 Mbit/s.
Video: Full-size frame 1024 x 768 pixel/frame, 24 bits/pixel, 30 frames/s results in 1024 x 768 x 24 x 30 = ca. 566 Mbit/s. More realistic: 360 x 240 pixel/frame, 360 x 240 x 24 x 30 = ca. 60 Mbit/s.
=> Storage and transmission of multimedia streams require compression!
33 Example 1: ABC -> 1; EE -> 2. Example 2: (worked on the slide). Note that in this example both algorithms lead to the same compression rate.
34 Run Length Coding Principle: Replace all repetitions of the same symbol in the text ("runs") by a repetition counter and the symbol. Example Text: AAAABBBAABBBBBCCCCCCCCDABCBAABBBBCCD Encoding: 4A3B2A5B8C1D1A1B1C1B2A4B2C1D As we can see, we can only expect a good compression rate when long runs occur frequently. Examples are long runs of blanks in text documents or leading white pixels in gray-scale images.
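The principle can be sketched in a few lines (the function name rle_encode is my own; counters are kept to a single decimal digit here, a real coder would cap or escape longer runs):

```cpp
#include <string>

// Run length coding as on the slide: every run of a repeated symbol
// becomes <counter><symbol>.
std::string rle_encode(const std::string& text) {
    std::string out;
    for (size_t i = 0; i < text.size(); ) {
        size_t run = 1;                      // length of the current run
        while (i + run < text.size() && text[i + run] == text[i] && run < 9)
            ++run;
        out += std::to_string(run);
        out += text[i];
        i += run;
    }
    return out;
}
```

Applied to the slide's example text it reproduces the encoding 4A3B2A5B8C1D1A1B1C1B2A4B2C1D; note how isolated characters grow to two symbols, which is why long runs are needed for a gain.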
35 When dealing with binary files we are sure that a run of 1s is always followed by a run of 0s and vice versa. It is thus sufficient to store the repetition counters only! Example: 000111101111 can be stored as 3, 4, 1, 4 (given the convention that the first counter refers to 0s).
36 Run Length Coding, Legal Issues - Beware of the patent trap!
Runlength encoding of the type (length, character): US Patent No: 4,586,027; Title: Method and system for data compression and restoration; Filed: 07-Aug-1984; Granted: 29-Apr-1986; Inventor: Tsukimaya et al.; Assignee: Hitachi.
Runlength encoding (length [<= 16], character): US Patent No: 4,872,009; Title: Method and apparatus for data compression and restoration; Filed: 07-Dec-1987; Granted: 03-Oct-1989; Inventor: Tsukimaya et al.; Assignee: Hitachi.
37 Variable Length Coding Classical character codes use the same number of bits for each character. When the frequency of occurrence is different for different characters, we can use fewer bits for frequent characters and more bits for rare characters. Example: Code 1 (binary, fixed length) assigns each of the characters A, B, C, D, E, ... a 5-bit codeword; encoding ABRACADABRA with constant bit length (= 5 bits) takes 11 x 5 = 55 bits. Code 2 (variable length, codewords shown on the slide) assigns shorter codewords to A, B, R, C, D according to their frequency.
38 Delimiters Code 2 can only be decoded unambiguously when delimiters are stored with the codewords. This can increase the size of the encoded string considerably. Idea: No codeword should be the prefix of another codeword! We will then no longer need delimiters. Code 3: A = 11, B = 00, R = 011, C = 010, D = 10. Encoded string for ABRACADABRA: 11 00 011 11 010 11 10 11 00 011 11 (25 bits).
39 Representation as a TRIE (or prefix tree) An obvious method is to represent such a code as a TRIE. In fact, any TRIE with M leaf nodes can be used to represent a code for a string containing M different characters. The figure on the next page shows two codes which can be used for ABRACADABRA. The code for each character is represented by the path from the root of the TRIE to that character, where 0 goes to the left and 1 goes to the right, as is the convention for TRIEs. The TRIE on the left corresponds to the encoding of ABRACADABRA on the previous page; the TRIE on the right generates an encoding which is two bits shorter.
40 Two Tries for our Example The TRIE representation guarantees indeed that no codeword is the prefix of another codeword. Thus the encoded bit string can be uniquely decoded.
41 Huffman Code Now the question arises how we can find the best variable-length code for given character frequencies (or probabilities). The algorithm that solves this problem was found by David Huffman in 1952. Algorithm Generate-Huffman-Code: 1. Determine the frequencies of the characters and mark the leaf nodes of a binary tree (to be built) with them. 2. Out of the tree nodes not yet marked as DONE, take the two with the smallest frequencies and compute their sum. Create a parent node for them and mark it with the sum. Mark the branch to the left son with 0, the one to the right son with 1. Mark the two son nodes as DONE. 3. When there is only one node not yet marked as DONE, stop (the tree is complete). Otherwise, continue with step 2.
42 Huffman Code, Example Probabilities of the characters: p(a) = 0.3; p(b) = 0.3; p(c) = 0.1; p(d) = 0.15; p(e) = 0.15. (The resulting tree is shown on the slide: c and d are merged first into a 25% node, that node is merged with e into a 40% node, a and b are merged into a 60% node, and the root combines both at 100%.)
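The construction can be sketched with a priority queue that always yields the two smallest open nodes (the function name huffman and the Node layout are my own; tree nodes are leaked for brevity):

```cpp
#include <map>
#include <queue>
#include <string>
#include <vector>

struct Node { double p; int ch; Node *l, *r; };

// Huffman's algorithm as stated above: repeatedly merge the two open
// nodes with the smallest frequencies. Returns character -> codeword.
std::map<char, std::string> huffman(const std::map<char, double>& freq) {
    auto cmp = [](Node* a, Node* b) { return a->p > b->p; };
    std::priority_queue<Node*, std::vector<Node*>, decltype(cmp)> q(cmp);
    for (auto& f : freq) q.push(new Node{f.second, f.first, nullptr, nullptr});
    while (q.size() > 1) {
        Node* a = q.top(); q.pop();                // two smallest frequencies
        Node* b = q.top(); q.pop();
        q.push(new Node{a->p + b->p, -1, a, b});   // parent marked with sum
    }
    std::map<char, std::string> code;
    // walk the tree: 0 to the left son, 1 to the right son
    struct Walker {
        void operator()(Node* n, std::string s, std::map<char, std::string>& c) {
            if (!n->l) { c[(char)n->ch] = s.empty() ? "0" : s; return; }
            (*this)(n->l, s + "0", c);
            (*this)(n->r, s + "1", c);
        }
    } walk;
    walk(q.top(), "", code);
    return code;
}
```

For the probabilities of this slide the codeword lengths come out as 2, 2, 3, 3, 2 bits for a, b, c, d, e (the concrete 0/1 assignment depends on tie-breaking).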
43 Huffman Code, why is it optimal? Characters with higher probabilities are closer to the root of the tree and thus have shorter codeword lengths; thus it is a good code. It is even the best possible code! Reason: The length of an encoded string equals the weighted outer path length of the Huffman tree. To compute the weighted outer path length we first compute the product of the weight (frequency counter) of a leaf node with its distance from the root. We then compute the sum of all these values over the leaf nodes. This is obviously the same as summing up the products of each character's codeword length with its frequency of occurrence. No other tree with the same frequencies attached to the leaf nodes has a smaller weighted path length than the Huffman tree.
44 Sketch of the Proof With a similar construction process, another tree could be built but without always combining the two nodes with the minimal frequencies. We can show by induction that no other such strategy will lead to a smaller weighted outer path length than the one that combines the minimal values in each step.
45 Decoding Huffman Codes (1) An obvious possibility is to use the TRIE: Read the input stream sequentially and traverse the TRIE until a leaf node is reached. When a leaf node is reached, output the character attached to it. To decode the next codeword, start again at the root of the TRIE. Observation: The input bit rate is constant, the output character rate is variable.
46 Decoding Huffman Codes (2) As an alternative we can use a decoding table. Creation of the decoding table: If the longest codeword has L bits, the table has 2^L entries. Let c_i be the codeword for character s_i. Let c_i have l_i bits. We then create 2^(L-l_i) entries in the table. In each of these entries the first l_i bits are equal to c_i, and the remaining bits take on all possible L-l_i binary combinations. At all these addresses of the table we enter s_i as the character recognized, and we remember l_i as the length of the codeword.
47 Decoding with the Table Algorithm Table-Based Huffman Decoder: 1. Read L bits from the input stream into a buffer. 2. Use the buffer as the address into the table and output the recognized character s_i. 3. Remove the first l_i bits from the buffer and pull in the next l_i bits from the input bit stream. 4. Continue with step 2. Observation: Table-based Huffman decoding is fast. The output character rate is constant, the input bit rate is variable.
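The table construction and the decoding loop above can be sketched as follows (the function name table_decode and the Entry struct are my own; the input is a '0'/'1' string for readability):

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

struct Entry { char ch; int len; };

// Table-based decoder: with L = longest codeword length, build 2^L entries;
// every L-bit window whose prefix equals codeword c_i maps to (s_i, l_i).
std::string table_decode(const std::string& bits,
                         const std::map<char, std::string>& code) {
    size_t L = 0;
    for (auto& c : code) L = std::max(L, c.second.size());
    std::vector<Entry> table(1u << L);
    for (auto& c : code) {
        size_t li = c.second.size();
        unsigned prefix = std::stoul(c.second, nullptr, 2) << (L - li);
        for (unsigned rest = 0; rest < (1u << (L - li)); ++rest)
            table[prefix | rest] = {c.first, (int)li};   // 2^(L-li) entries
    }
    std::string out;
    size_t pos = 0;
    while (pos < bits.size()) {
        unsigned idx = 0;            // L-bit window (zero-padded at the end)
        for (size_t k = 0; k < L; ++k)
            idx = idx << 1 | (pos + k < bits.size() && bits[pos + k] == '1');
        out += table[idx].ch;
        pos += table[idx].len;       // consume only l_i bits, then refill
    }
    return out;
}
```

With Code 3 from the earlier slide (A = 11, B = 00, R = 011, C = 010, D = 10, so L = 3 and the table has 8 entries), the 25-bit string for ABRACADABRA decodes correctly.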
48 Huffman Code, Comments A very good code for many practical purposes. Can only be used when the frequencies (or probabilities) of the characters are known in advance. Variation: Determine the character frequencies separately for each new document and store/transmit the code tree/table with the data. Note that a loss in optimality comes from the fact that each character must be encoded with an integer number of bits, and thus the codeword lengths do not match the frequencies exactly (consider a code for three characters A, B and C, each occurring with a frequency of 33%).
49 Critical Review of 0-Order Codes (1) The string ABCD ABCD ABCD ABCD is quite redundant but cannot be compressed by Huffman. Huffman is sometimes referred to as a 0-order entropy model, which means that each character is generated independently of its predecessor: no character makes a particular successor more likely. (The diagram on the slide connects every current character A, B, C, D with every possible next character A, B, C, D.)
50 Critical Review of 0-Order Codes (2) A string of the type ABCD ABCD ABCD ABCD could be better described by a 1-order entropy model: the current character gives a hint on what character is expected next. The example is trivial since each character uniquely determines the next one (A -> B, B -> C, C -> D and, ignoring the blanks, D -> A).
51 Critical Review of 0-Order Codes (3) Most sources produce characters with varying relative occurrences, which can be encoded with Huffman, but which also exhibit inter-character correlations. In the English language a c is often followed by an h, but not very often by a z. The example on the right consists of only four characters A, B, C, D, each with an assumed relative occurrence of 25% (used in the next slide). Once an A has occurred, only another A or a B will follow. As an exception, a D is always followed by another D. Let us see how many bits we must spend if this particular correlation is known.
52 Critical Review of 0-Order Codes (4) We assume that the occurrence of every character is equal (25%). Once it has occurred, the following two characters are equally likely (e.g., 50% for A->A and 50% for A->B).* Notation: X = current character; P(X) = probability for char. X; Y|X = Y occurred and X was its predecessor; P(Y|X) = probability for Y if X was the predecessor. Here: P(X) = 0.25 for X = A, B, C, D; P(A|A) = P(B|A) = P(B|B) = P(C|B) = P(C|C) = P(D|C) = 0.5 and P(D|D) = 1. For each pair with P(Y|X) = 0.5 we get log2(P(Y|X)) = -1, so P(Y|X) * log2(P(Y|X)) = -0.5; the pair with P(D|D) = 1 contributes 0. Summing -P(X) * P(Y|X) * log2(P(Y|X)) over all pairs yields H(Y|X) = 0.75 bits for coding a character (the first character must be given). *Note: The example is not realistic. It was chosen for easy calculation.
53 Critical Review of 0-Order Codes (5) Number of bits needed in 1-order entropy models in general: H(Y|X) = Sum_x P(x) * Sum_y P(y|x) * (-1) * log2(P(y|x)). This gives us the number of bits we need in order to code a y if we have seen an x before. P(y|x) is the probability of getting a character y next if we have currently seen an x. Note that in real-world examples this can very often be zero, because many combinations of characters simply never occur. This is why we can save on bits! In the outer sum, all possible occurrences of X are considered: each x occurs with probability P(x), and x determines the number of bits we need in order to code the following character y.
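The formula can be computed directly (the function name conditional_entropy is my own; zero probabilities are skipped, using the convention 0 * log 0 = 0):

```cpp
#include <cmath>
#include <vector>

// H(Y|X) = - sum_x P(x) * sum_y P(y|x) * log2(P(y|x)).
// p_x[i] is P(x_i); p_yx[i][j] is P(y_j | x_i).
double conditional_entropy(const std::vector<double>& p_x,
                           const std::vector<std::vector<double>>& p_yx) {
    double h = 0.0;
    for (size_t i = 0; i < p_x.size(); ++i)
        for (double p : p_yx[i])
            if (p > 0.0) h -= p_x[i] * p * std::log2(p);   // skip zero terms
    return h;
}
```

Plugging in the transition model of the previous slide (uniform P(x) = 0.25; A -> {A, B}, B -> {B, C}, C -> {C, D} each 50/50, D -> D with certainty) reproduces the 0.75 bits per character.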
54 Lempel-Ziv Code Lempel-Ziv codes are an example of the large group of dictionary-based codes. Dictionary: A table of character strings which is used in the encoding process. Example: The word lecture is found on page x4, line y4 of the dictionary. It can thus be encoded as (x4,y4). A sentence such as "this is a lecture" could perhaps be encoded as a sequence of tuples (x1,y1) (x2,y2) (x3,y3) (x4,y4).
55 Dictionary-Based Coding Techniques Static techniques: The dictionary exists before a string is encoded. It is not changed, neither in the encoding nor in the decoding process. Dynamic techniques: The dictionary is created on the fly during the encoding process, at the sending (and sometimes also at the receiving) side. Lempel and Ziv have proposed an especially brilliant dynamic, dictionary-based technique (1977). Variations of this technique are very widely used today for lossless compression. An example is LZW (Lempel/Ziv/Welch), which is used by the Unix compress command. The well-known TIFF format (Tag Image File Format) is also based on Lempel-Ziv coding.
56 Ziv-Lempel Coding, the Principle The current piece of the message can be encoded as a reference to an earlier (identical) piece of the message. This reference will usually be shorter than the piece itself. As the message is processed, the dictionary is created dynamically.
InitializeStringTable();
WriteCode(ClearCode);
w = the empty string;
for each character in string {
    K = GetNextCharacter();
    if (w + K is in the string table) {
        w = w + K;    /* string concatenation */
    } else {
        WriteCode(CodeFromString(w));
        AddTableEntry(w + K);
        w = K;
    }
}
WriteCode(CodeFromString(w));
57 LZW, Example 1, Encoding Alphabet: {A, B, C}. Message: A B A B C B A B A B. Encoded message: 1 2 4 3 5 8. Dictionary (index = code): 1 = A, 2 = B, 3 = C (initial); 4 = AB, 5 = BA, 6 = ABC, 7 = CB, 8 = BAB (added while encoding).
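The pseudocode of the previous slide can be made concrete like this (the function name lzw_encode is my own; codes start at 1 as in the example):

```cpp
#include <map>
#include <string>
#include <vector>

// LZW encoding: grow w while w + K is still in the table; otherwise emit
// the code for w, add w + K as a new table entry, and restart from K.
std::vector<int> lzw_encode(const std::string& msg,
                            const std::string& alphabet) {
    std::map<std::string, int> table;
    int next = 1;
    for (char c : alphabet) table[std::string(1, c)] = next++;
    std::vector<int> out;
    std::string w;
    for (char k : msg) {
        if (table.count(w + k)) {
            w += k;                       // longest match grows
        } else {
            out.push_back(table[w]);
            table[w + k] = next++;        // new dictionary entry
            w = std::string(1, k);
        }
    }
    if (!w.empty()) out.push_back(table[w]);
    return out;
}
```

Running it on the message ABABCBABAB over {A, B, C} reproduces the code sequence 1 2 4 3 5 8 and the dictionary entries 4 = AB, 5 = BA, 6 = ABC, 7 = CB, 8 = BAB from this slide.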
58 LZW Algorithm: Decoding (1) Note that the decoding algorithm also creates the dictionary dynamically; the dictionary is not transmitted!
while ((Code = GetNextCode()) != EofCode) {
    if (Code == ClearCode) {
        InitializeTable();
        Code = GetNextCode();
        if (Code == EofCode) break;
        WriteString(StringFromCode(Code));
        OldCode = Code;
    } /* end of ClearCode case */
    else if (IsInTable(Code)) {
        WriteString(StringFromCode(Code));
        AddStringToTable(StringFromCode(OldCode) + FirstChar(StringFromCode(Code)));
        OldCode = Code;
    }
59 LZW Algorithm: Decoding (2)
    else { /* code is not in table */
        OutString = StringFromCode(OldCode) + FirstChar(StringFromCode(OldCode));
        WriteString(OutString);
        AddStringToTable(OutString);
        OldCode = Code;
    }
} /* while */
60 LZW, Example 2, Decoding Alphabet: {A, B, C, D}. Transmitted code: 1 2 1 3 5 1. Decoded output: A B A C AB A. Dictionary (index, entry): 1 = A, 2 = B, 3 = C, 4 = D (initial); 5 = AB, 6 = BA, 7 = AC, 8 = CA, 9 = ABA (rebuilt while decoding).
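A concrete version of the decoder (the function name lzw_decode is my own; as in the pseudocode, the table is rebuilt on the fly and never transmitted):

```cpp
#include <map>
#include <string>
#include <vector>

// LZW decoding. The "code not in table" branch covers the case where the
// encoder used an entry it had only just created: the string is then the
// previous output plus its own first character.
std::string lzw_decode(const std::vector<int>& codes,
                       const std::string& alphabet) {
    std::map<int, std::string> table;
    int next = 1;
    for (char c : alphabet) table[next++] = std::string(1, c);
    std::string out, prev;
    for (int code : codes) {
        std::string s = table.count(code) ? table[code]
                                          : prev + prev[0];   // just-created code
        out += s;
        if (!prev.empty()) table[next++] = prev + s[0];       // rebuild entry
        prev = s;
    }
    return out;
}
```

Decoding the transmitted code 1 2 1 3 5 1 over {A, B, C, D} yields ABACABA and rebuilds exactly the dictionary entries 5 = AB, 6 = BA, 7 = AC, 8 = CA, 9 = ABA shown on this slide.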
61 LZW, Properties The dictionary is created dynamically during the encoding and decoding process. It is neither stored nor transmitted! The dictionary adapts dynamically to the properties of the character string. With length N of the original message, the encoding process is of complexity O(N). With length M of the encoded message, the decoding process is of complexity O(M). These are thus very efficient processes. Since several characters of the input alphabet are combined into one character of the code, M <= N.
62 Typical Compression Rates Typical examples of file sizes in % of the original size Type of file Encoded with Huffman Encoded with Lempel Ziv C source code 65 % 45 % machine code 80 % 55 % text 50 % 30 %
63 Arithmetic Coding From an information theory point of view, the Huffman code is not quite optimal since a codeword must always consist of an integer number of bits even if this does not correspond exactly to the frequency of occurrence of the character. Arithmetic coding solves this problem. Idea An entire message is represented by a floating point number out of the interval [0,1). For this purpose the interval [0,1) is repeatedly subdivided according to the frequency of the next symbol. Each new sub-interval represents one symbol. When the process is completed the shortest floating point number contained in the target interval is chosen as the representative for the message.
64 Arithmetic Coding, the Algorithm

1. Begin in front of the first character of the input stream, with the current interval set to [0,1).
2. Read the next character from the input stream. Subdivide the current interval according to the frequencies of all characters of the alphabet. Select the subinterval corresponding to the current character as the next current interval.
3. If you have reached the end of the input stream or the end symbol, go to step 4. Otherwise go to step 2.
4. From the current (final) interval, select the floating point number that you can represent in the computer with the smallest number of bits. This number is the encoding of the string.
65 Arithmetic Coding, the Decoding Algorithm

Algorithm Arithmetic Decoding:
1. Subdivide the interval [0,1) according to the character frequencies, as described in the encoding algorithm, up to the maximum size of a message.
2. The encoded floating point number uniquely identifies one particular subinterval.
3. This subinterval uniquely identifies one particular message. Output the message.
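This decoding process can be sketched in a few lines (a minimal illustration, assuming the message length is known in advance and probabilities are given as an ordered Python dict; the names are illustrative, not from the slides):

```python
def arithmetic_decode(value, probs, length):
    # Repeatedly subdivide [lo, hi) according to the symbol probabilities
    # and pick the subinterval that contains the encoded value.
    lo, hi = 0.0, 1.0
    out = []
    for _ in range(length):
        width = hi - lo
        cum = 0.0
        for symbol, p in probs.items():
            sub_lo = lo + cum * width
            sub_hi = lo + (cum + p) * width
            if sub_lo <= value < sub_hi:
                lo, hi = sub_lo, sub_hi
                out.append(symbol)
                break
            cum += p
    return "".join(out)

# p(A)=0.2, p(B)=0.3, p(C)=0.5; 0.125 lies in the interval for "ACB"
print(arithmetic_decode(0.125, {"A": 0.2, "B": 0.3, "C": 0.5}, 3))  # -> ACB
```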
66 Arithmetic Coding, Example

Alphabet = {A, B, C}
Frequencies (probabilities): p(A) = 0.2; p(B) = 0.3; p(C) = 0.5
Messages: ACB, AAB, ... (maximum size of a message is 3).

Encoding of the first block ACB:

Start:      [0, 1)        A: [0, 0.2)      B: [0.2, 0.5)     C: [0.5, 1)
After 'A':  [0, 0.2)      A: [0, 0.04)     B: [0.04, 0.1)    C: [0.1, 0.2)
After 'C':  [0.1, 0.2)    A: [0.1, 0.12)   B: [0.12, 0.15)   C: [0.15, 0.2)
After 'B':  final interval [0.12, 0.15), choose e.g. 0.125 (binary 0.001).
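The interval narrowing above can be checked with a small sketch of the encoder's interval computation (illustrative names; probabilities are passed as an ordered Python dict):

```python
def arithmetic_interval(message, probs):
    # Narrow the current interval [lo, hi) once per symbol, using the
    # cumulative probability of the symbols that precede it.
    lo, hi = 0.0, 1.0
    for ch in message:
        width = hi - lo
        cum = 0.0
        for symbol, p in probs.items():
            if symbol == ch:
                lo, hi = lo + cum * width, lo + (cum + p) * width
                break
            cum += p
    return lo, hi

lo, hi = arithmetic_interval("ACB", {"A": 0.2, "B": 0.3, "C": 0.5})
print(lo, hi)  # approximately 0.12 and 0.15, the final interval from the slide
```

Any number in the returned interval (e.g. 0.125 = binary 0.001) encodes the message.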
67 Arithmetic Coding, Implementation

So far we dealt with real numbers of (theoretically) infinite precision. How do we actually encode a message with many characters (like several megabytes)? If character A occurs with a probability of 20%, the number of digits for coding consecutive A's grows very fast: first A < 0.2, second A < 0.04, third A < 0.008, then 0.0016, 0.00032, and so on.

Let us assume that our processor has 8-bit wide registers. We map the interval [0,1) onto the register range:

0.0 -> 0 (decimal 0)    0.2 -> 51 (0.2 x 255 = 51)    0.5 -> 127 (0.5 x 255 = 127)    1.0 -> 255 (decimal 255)

so that A covers [0, 51), B covers [51, 127) and C covers [127, 255].

Note: We skip storing or transmitting the leading "0." since it is redundant.

Advantage: We have a binary fixed-point representation of fractions and can compute with them in the processor's registers.
68 Arithmetic Coding, Implementation

Disadvantage: Even when using 32-bit or 64-bit CPU registers we can only code a couple of characters.

Solution: As our interval gets smaller and smaller, we obtain a growing number of leading bits which have settled (they will never change). So we transmit them to the receiver and shift them out of our register to gain new bits.

Example: Interval of first A = [00000000, 00110011]. No matter which characters follow, the most significant two zeros of the lower and the upper bound will never change. So we store or transmit them and shift the remaining digits two places to the left. As a consequence we gain two new least significant digits: A = [000000??, 110011??]
69 Arithmetic Coding, Implementation

A = [000000??, 110011??]

How do we initialize the new digits prior to the ongoing encoding? Obviously, we want to keep the interval as large as possible. This is achieved by filling the lower bound with 0-bits and the upper bound with 1-bits:

A_new = [00000000, 11001111]

Note that always adding 0-bits to the lower bound and 1-bits to the upper bound introduces a small error, because the size of the interval no longer corresponds exactly to the probability of character A. However, we will not get into trouble as long as the encoding and the decoding side make the same mistake.
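This bit-shifting renormalization can be sketched as an 8-bit toy model matching the slide's 0.2 x 255 = 51 example (function names and the exact rounding in narrow() are illustrative assumptions, not the slides' definition):

```python
BITS = 8
TOP = (1 << BITS) - 1  # 255, i.e. 11111111

def narrow(lo, hi, p_lo, p_hi):
    # Map the probability subinterval [p_lo, p_hi) into the current
    # fixed-point interval [lo, hi].
    width = hi - lo
    return lo + int(p_lo * width), lo + int(p_hi * width)

def renormalize(lo, hi):
    # Emit leading bits that have settled (identical in lo and hi);
    # refill the lower bound with 0-bits and the upper bound with 1-bits.
    emitted = []
    while (lo >> (BITS - 1)) == (hi >> (BITS - 1)):
        emitted.append(lo >> (BITS - 1))
        lo = (lo << 1) & TOP
        hi = ((hi << 1) & TOP) | 1
    return lo, hi, emitted

lo, hi = narrow(0, TOP, 0.0, 0.2)   # first 'A': [0, 51] = [00000000, 00110011]
lo, hi, bits = renormalize(lo, hi)  # emits two settled 0-bits
print(bits, lo, hi)                 # [0, 0] 0 207  (i.e. [00000000, 11001111])
```

The renormalized interval [00000000, 11001111] is exactly the A_new shown above.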
70 Arithmetic Coding, Properties

The encoding depends on the probabilities (relative occurrences) of the characters. The higher the frequency, the larger the subinterval and the smaller the number of bits needed to represent it. The code length reaches the theoretical optimum: the number of bits used for each character need not be an integer, so it can match the real probability more closely than the Huffman code. We always need a terminal symbol to stop the decoding process. Problem: One bit error destroys the entire message.
Entropy Coding: the different probabilities with which single symbols appear are exploited to shorten the average code length by assigning shorter codes to more probable symbols => Morse, Huffman, Arithmetic Code.