Lecture on Computer Networks


1 Lecture on Computer Networks Historical Development Copyright (c) 2008 Dr. Thomas Haenselmann (Saarland University, Germany). Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.

2 Source coding in data networks Motivation for source coding in data networks This short introduction addresses the problem of how to code payload optimally for the network. Theoretically, for an error-free communication channel, no special precautions are necessary; we could simply forward the bits. More realistically, we know that the channel introduces errors, so it can make sense to send more bits than just the data itself to account for the erroneous channel.

3 Source coding in data networks Error control: Detecting vs. Correcting Basically, two variants are distinguished: error detecting codes and error correcting codes (Forward Error Correction). Both variants need a certain amount of redundancy. Which variant makes sense in which situation? Error detection: Detection might be sufficient if the sink can ask the source for a repeated transmission. For this, a feedback channel and a defined protocol are necessary. In some cases, erroneous data can simply be deleted without retransmission (e.g., IP-telephony).

4 Source coding in data networks Error control: Detecting vs. Correcting Error correcting codes: A larger amount of redundancy is necessary. This makes sense if no feedback channel is available or a retransmission would delay the packet considerably (e.g., telephony, video conferences). What is better: forward error correction or retransmission?

5 Source coding in data networks Error control: The Hamming-Distance The Hamming-Distance of two codewords corresponds exactly to the number of bits in which the codewords differ. It is calculated using the XOR operation, e.g. 10001001 XOR 10110001 = 00111000, i.e. distance 3. In other words: if two codewords have the Hamming-Distance d, then it is necessary to toggle d bits to transform one codeword into the other.
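A minimal C++ sketch (function name illustrative): XOR the two codewords and count the 1-bits.

#include <bitset>
#include <cstdint>
#include <iostream>

// Hamming distance = number of 1-bits in the XOR of the two codewords.
unsigned hamming_distance(std::uint32_t a, std::uint32_t b) {
    return std::bitset<32>(a ^ b).count();
}

int main() {
    // 10001001 XOR 10110001 = 00111000 -> distance 3
    std::cout << hamming_distance(0b10001001, 0b10110001) << std::endl;
}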

6 Source coding in data networks Error control: The Hamming-Distance The Hamming-Distance of an entire code is defined as the smallest possible distance two arbitrary (but different) codewords of a code can have. To detect errors we need a code with valid and invalid words. To be able to detect d bit errors in a codeword the code must have a distance of d+1. Why does less distance not suffice?

7 Source coding in data networks Error control: Error correcting codes Error correcting codes need a distance of 2d+1 if d errors need to be detected and corrected. Why does less distance not suffice? With a distance of 2d+1, 2d+1 bits have to be toggled to get from one valid codeword to another. If only d bits are toggled by errors, getting back to the original codeword costs exactly d toggles, whereas reaching any other valid codeword costs at least d+1 toggles. Hence, as long as at most d bit errors occur, the original codeword is always the uniquely nearest valid codeword. Example: 2 data bits are encoded by 10-bit codewords. Orig. code: 00, 01, 10, 11. Error-correcting code: 0000000000, 0000011111, 1111100000, 1111111111. How many bit errors can be detected and corrected here?

8 Source coding in data networks Error control: Redundancy estimation of error correcting codes How many bits do we need at least for the correction of a bit error? We want to have 2^m valid codewords; r correction bits are needed. The error correcting code will have n = (m + r) bits in total. By toggling one of the n bits (redundant bits may be toggled, too) we get an (illegal) codeword with a distance of 1. That means, for each of the 2^m valid words we can create n invalid words. Hence follows

(n + 1) * 2^m <= 2^n

9 Source coding in data networks Error control: Redundancy estimation of error correcting codes How many bits do we need at least for the correction of a bit error? (n + 1) is composed of n invalid words (each created by one bit error) plus the one valid codeword. The number of words of the error correcting code stands on the right side of the inequality. n can be written as m data bits plus r correction bits:

(m + r + 1) * 2^m <= 2^(m+r), hence (m + r + 1) <= 2^r

With a given m we can estimate the number of correction bits a code needs in any case.
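The inequality directly yields the minimal r for a given m; a small C++ sketch (assuming the estimate (m + r + 1) <= 2^r derived above):

#include <iostream>

// Smallest number of check bits r satisfying m + r + 1 <= 2^r.
int min_check_bits(int m) {
    int r = 1;
    while (m + r + 1 > (1 << r))
        ++r;
    return r;
}

int main() {
    for (int m : {4, 8, 16, 32, 64})
        std::cout << m << " data bits need at least "
                  << min_check_bits(m) << " check bits" << std::endl;
}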

10 Source coding in data networks Error control: The error correcting Hamming code The Hamming code achieves the minimum of the previous estimation. Algorithm: Number the bits starting at 1. All bits at positions that are powers of two (1, 2, 4, 8, ...) are check bits; the rest are data bits. The data bits are filled up from left to right with the actual data; the check bits only depend on the data bits. Which data bit influences which check bit?

11 Source coding in data networks Error control: The error correcting Hamming code To see this, convert the (order) number of a bit into its binary representation: 11 = 1011 (binary) = 8 + 2 + 1. Data bit 11 hence influences check bits 8, 2 and 1. Check bit 1 is of course also influenced by the data bits 3, 5, 7 and 9. By definition, all check bits combined with their data bits have to exhibit an even (or odd, depending on what was agreed upon) parity.

12 Source coding in data networks Error control: The error correcting Hamming code [Example on the slide: a data word is shown without check bits, then converted into the full codeword with the check bits filled in.] Correction procedure: First, a counter is set to zero. Then, the check bits are checked one by one. If one of them produces the wrong parity, the number of the corresponding check bit is added to the counter. In the end, the counter points to the toggled bit.
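A C++ sketch of the whole scheme with even parity (bit positions are 1-based as on the slides; the data word and the function names are illustrative):

#include <iostream>
#include <vector>

// Encode data bits into a Hamming codeword code[1..n]; positions that are
// powers of two carry check bits, even parity is assumed.
std::vector<int> hamming_encode(const std::vector<int>& data) {
    int m = data.size(), r = 1;
    while (m + r + 1 > (1 << r)) ++r;        // minimal number of check bits
    int n = m + r;
    std::vector<int> code(n + 1, 0);
    for (int pos = 1, d = 0; pos <= n; ++pos)
        if (pos & (pos - 1))                 // not a power of two: data bit
            code[pos] = data[d++];
    for (int p = 1; p <= n; p <<= 1) {       // check bit p covers all positions
        int parity = 0;                      // whose binary number contains p
        for (int pos = p + 1; pos <= n; ++pos)
            if (pos & p) parity ^= code[pos];
        code[p] = parity;                    // make the group parity even
    }
    return code;
}

// Correction procedure from the slide: add up the numbers of all check bits
// with wrong parity; the counter then points to the toggled bit (0 = none).
int defective_bit(const std::vector<int>& code) {
    int n = code.size() - 1, counter = 0;
    for (int p = 1; p <= n; p <<= 1) {
        int parity = 0;
        for (int pos = 1; pos <= n; ++pos)
            if (pos & p) parity ^= code[pos];
        if (parity) counter += p;
    }
    return counter;
}

int main() {
    std::vector<int> code = hamming_encode({1, 0, 1, 1}); // 4 data bits -> n = 7
    code[5] ^= 1;                                         // inject a 1-bit error
    std::cout << "defective bit: " << defective_bit(code) << std::endl; // 5
}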

13 Source coding in data networks Error control: The error correcting Hamming code Another interpretation of the error recovery scheme: Why does the counter value reference the defective bit? Check bit 1 wrong? yes: bits 3, 5, 7, 9 and 11 may be defective. Check bit 2 wrong? no: only bits 5 and 9 are left. Check bit 4 wrong? yes: only bit 5 is left.

14 Source coding in data networks Error control: Cyclic Redundancy Check (CRC) CRC is based on the idea of polynomial division. Remember:

(x^5 + x^3 + x + 1) : (x + 1) = x^4 - x^3 + 2x^2 - 2x + 3 + [-2/(x+1)]
- (x^5 + x^4)
  -----------
   -x^4 + x^3
- (-x^4 - x^3)
  ------------
    2x^3 + x
- (2x^3 + 2x^2)
  -------------
   -2x^2 + x
- (-2x^2 - 2x)
  ------------
    3x + 1
- (3x + 3)
  --------
   -2   <- remainder or modulus

Check: [x^4 - x^3 + 2x^2 - 2x + 3 - 2/(x+1)] * (x+1) = x^5 + x^3 + x + 1. What's the difference between polynomial division and normal division?

15 Source coding in data networks Error control: Cyclic Redundancy Check A bit string is interpreted as a polynomial by numbering the bits consecutively and, if a bit is set, adding the corresponding term to the polynomial. In other words: use the bits as coefficients. Example:

position:  7 6 5 4 3 2 1 0
data bits: 1 1 0 1 0 1 0 1

1*x^7 + 1*x^6 + 0*x^5 + 1*x^4 + 0*x^3 + 1*x^2 + 0*x^1 + 1*x^0 = x^7 + x^6 + x^4 + x^2 + 1 is the polynomial corresponding to the given data bits.

16 Source coding in data networks Error control: Cyclic Redundancy Check The principle of CRC: Sender and recipient agree upon a divisor polynomial, also called the generator polynomial. Then g zeros are appended to the message, g being the degree of the generator polynomial. In the next step, the sender divides the message (extended by the g zeros) by the generator polynomial, as in the polynomial division above. In most cases there will be a remainder; the quotient itself is of no interest. The remainder is then subtracted from the message (extended by the zeros). The resulting bit string is now transferred to the recipient. If the message was transmitted correctly, no remainder should emerge on the recipient's side. Why? Because the sender intentionally subtracted the remainder before sending the message. The g zeros which emerge after the division are interpreted by the recipient as an indication of an error-free transmission. Note: The sender can safely subtract the remainder without harming the message. Why? Because the remainder has at most g bits, the subtraction only changes the g appended zeros, never the message bits themselves.

17 Source coding in data networks Error control: Cyclic Redundancy Check The only difference from normal polynomial division: Calculations are binary, and after the calculation of each digit a modulo-2 operation is performed! In other words: always ignore the carry-over. This simplifies addition and subtraction significantly. Discovery: Both operations, plus and minus, are equivalent to the XOR operation.

18 Source coding in data networks Error control: Cyclic Redundancy Check (CRC) Example: Message: 1101011011. Generator polynomial: x^4 + x + 1 = 10011, 4th degree (5th order). Message extended by 4 zeros: 11010110110000. Division (the quotient 1100001010 does not matter):

11010110110000
10011               XOR
--------------
01001110110000
 10011              XOR
--------------
00000010110000
      10011         XOR
--------------
00000000101000
        10011       XOR
--------------
00000000001110      -> remainder = 1110

Special note: The division continues wherever the MSB (most significant bit) of the bit string currently being divided is set. Where it is 0, new bits (from the message) are pulled in until the generator polynomial fits under it again. 11010110110000 minus (XOR) 1110 = 11010110111110 = transmitted message, which generates no remainder when divided.
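The division is just a sequence of shift-and-XOR steps; a C++ sketch reproducing the example above (frame 1101011011, generator 10011):

#include <iostream>
#include <string>

// Modulo-2 division: returns the g-bit remainder of msg divided by gen.
// msg is assumed to be already extended by g zeros (g = degree of gen).
std::string crc_remainder(std::string msg, const std::string& gen) {
    std::size_t g = gen.size() - 1;
    for (std::size_t i = 0; i + gen.size() <= msg.size(); ++i)
        if (msg[i] == '1')                    // divide only where the MSB is set
            for (std::size_t j = 0; j < gen.size(); ++j)
                msg[i + j] = (msg[i + j] == gen[j]) ? '0' : '1';   // XOR
    return msg.substr(msg.size() - g);
}

int main() {
    std::string message = "1101011011", gen = "10011";    // x^4 + x + 1
    std::string rem = crc_remainder(message + "0000", gen);
    std::cout << "remainder: " << rem << std::endl;       // 1110
    // "subtracting" (XORing) the remainder onto the appended zeros:
    std::cout << "check: " << crc_remainder(message + rem, gen) << std::endl; // 0000
}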

19 Source coding in data networks Error control: (CRC) Recognized errors Which errors are recognized? For the following analysis, separate the error from the message: Let M(x) be the transmitted message (resp. the corresponding polynomial) including an error, and T(x) the original (error-free) message. Separate the transmitted message into M(x) = T(x) + E(x), with E(x) being the isolated error. Every bit which is set in E stands for a toggled bit in M. A sequence from the first 1-bit to the last 1-bit is called a burst error. A burst error can occur anywhere in E. Question: Does the following division by the generator polynomial G(x) produce a remainder? If not, we cannot detect the error. [T(x) + E(x)] : G(x) = remainder-free? T(x) : G(x) is divisible without any remainder, because we constructed the message exactly for this property. The analysis is therefore reduced to the question whether E(x) : G(x) erroneously results in no remainder, thus passing undetected.

20 Source coding in data networks Error control: (CRC) Recognized errors Which errors are recognized? 1-bit error: The burst consists of only one error bit. If the generator polynomial has more than one coefficient, E(x) with a leading 1 followed by zeros cannot be divided without a remainder. So we are on the safe side with regard to 1-bit errors. Our generator polynomial is at least as good as a parity bit. Example: 1000(...)0 : 101 = ... however far the division is continued as above, a remainder is always left.

21 Source coding in data networks Error control: (CRC) Recognized errors Which errors are recognized? 2-bit error: A 2-bit error must look like this: x^i + x^j (i > j), therefore x^j can be factored out, which results in x^j (x^(i-j) + 1). It has already been shown that a generator polynomial with more than one term cannot divide the factor x^j. When is a term (x^k + 1) divided? (with k = i - j) For a given generator polynomial this has to be tested for 2-bit bursts of different lengths. Here, the error (inevitably) has the form 10(...)01. What follows is an example program to test whether the generator polynomial x^15 + x^14 + 1 is useful for detecting 2-bit errors.

22 Source coding in data networks Error control: (CRC) Recognized errors Which errors are recognized?

#include <iostream>
#include <cstring>
using namespace std;

bool divisible(const char* bits, int len, const char* gen, int glen);

int main() {
    const int MAX_LENGTH = 60000;
    const char* generator = "1100000000000001";      // x^15 + x^14 + 1
    char* bit_string = new char[MAX_LENGTH + 1];

    for (int length = 2; length < MAX_LENGTH; length++) {
        if ((length % 100) == 0)
            cout << length << endl;

        for (int j = 1; j < length - 1; j++)         // clear bit string
            bit_string[j] = '0';
        bit_string[0] = '1';                         // error of the form 10(...)01
        bit_string[length - 1] = '1';
        bit_string[length] = 0;

        // test if divisible by the generator polynomial
        if (divisible(bit_string, length, generator, strlen(generator)) == true) {
            cout << "Division successful with length " << length << endl;
            break;
        }
    } // for

    delete[] bit_string;
} // main
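The divisible() function is not shown on the slide; a minimal sketch of what it might look like, doing the modulo-2 division from the previous slides and reporting only whether a remainder is left:

#include <string>

// Modulo-2 division (XOR steps as in the CRC example); returns true if the
// bit string leaves no remainder when divided by the generator polynomial.
bool divisible(const char* bits, int len, const char* gen, int glen) {
    std::string work(bits, bits + len);
    for (int i = 0; i + glen <= len; ++i)
        if (work[i] == '1')
            for (int j = 0; j < glen; ++j)
                work[i + j] = (work[i + j] == gen[j]) ? '0' : '1';   // XOR
    return work.find('1') == std::string::npos;   // all zeros: no remainder
}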

23 Source coding in data networks Error control: (CRC) Recognized errors Which errors are recognized? Error polynomials with an odd number of terms: Speculation: If the generator polynomial contains the factor (x + 1), an error string with an odd number of bits cannot be divided. Proof by contradiction: Assume E(x) has an odd number of terms and is divisible by (x + 1); then the factor can be extracted: E(x) = (x + 1) * Q(x). So far we only divided polynomials. Now, for the first time, we use them as functions and evaluate them for x = 1 (additions are still done modulo 2). On the one hand, (1 + 1) * Q(1) = 0 * Q(1) = 0. On the other hand, E(1) = 1, because E(x) contains an odd number of terms, each evaluating to 1. Thus, the result should have been 1, not 0. As a consequence, the factor (x + 1) cannot be extracted, and E(x) is not divisible by (x + 1) if it contains an odd number of terms (or error bits). Result: The generator polynomial should contain the factor (x + 1) to catch all errors with an odd number of bits.

24 Source coding in data networks Error control: (CRC) Recognized errors Which errors are recognized? Recognition of burst errors of length r: The burst error in E(x) could look like this: 000 1 (anything) 1 000. To move the last 1-bit to the very right, a factor can be extracted: E(x) = x^i * (x^(r-1) + ... + 1), with i being the number of zeros on the right side of the last 1. If the degree of the generator polynomial itself is r (hence it has r + 1 coefficients), the error of the form (x^(r-1) + ... + 1) cannot be divided either, because the generator polynomial is larger than the error to be divided. Example (decimal system): 99 : 100 is not divisible without a remainder (result 0, remainder 99). Detecting burst errors of length r is trivial in the sense that the error itself simply falls out at the end of the division as the remainder. Even if the burst is just as large as the generator polynomial (which means r + 1 bits), the division yields no remainder only if by chance the error coincides exactly with the generator polynomial. This is possible, but not very likely. Example (decimal system): 100 : 100 = 1 (remainder 0).

25 Source coding in data networks Principle of Data Fountains A number of sources generate a theoretically endless stream of packets which are derived from a file. All packets are pairwise different: source 1 (streaming file A), source 2 (streaming file A), ..., source n (streaming file A) -> sink (downloading file A). Given that the file consists of p packets, the sink needs to receive only any p+k packets to regenerate the file, no matter from which source and no matter which packets. Any choice of p+k packets will do (k < 5% of the file size). (Mystical, right?)

26 Source coding in data networks Principle of Data Fountains Mathematical foundation: Let F be a file split into chunks to form an n x m bit-matrix and let T be an invertible n x n bit-matrix (containing only 0 and 1).

T x F = M

The bit-matrix M is transmitted line-by-line over the net. At the receiver side, the original file is obtained by

T^-1 x M = F

If T is the identity matrix, then the sender simply sends the file chunk-by-chunk, as known from, e.g., FTP. Advantage: nothing to compute. Disadvantage: any missing chunk will corrupt the file.

27 Source coding in data networks Principle of Data Fountains Data fountains rather make use of a matrix T which is more densely populated with 1-bits. The gain is that we do not rely on each individual data chunk: any n chunks (or slightly more) will do. Algorithm, sender side: The sender splits a file into n data chunks (parts). Then, a random bit-vector of size n is created. The chunks with a corresponding 1-bit are XORed into a new chunk which is broadcast over the network. This process can be repeated endlessly to produce more and more packets from the same file.
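A C++ sketch of the sender side (the seed models the synchronized random number generator; all names are illustrative, not part of the slides):

#include <cstdint>
#include <random>
#include <vector>

using Chunk = std::vector<std::uint8_t>;

// Produce one encoded packet: the XOR of all chunks selected by a
// pseudo-random bit-vector. The receiver regenerates the same bit-vector
// from the same seed (e.g., a packet index).
Chunk encode_packet(const std::vector<Chunk>& chunks, std::uint64_t seed,
                    std::vector<bool>& bitvec_out) {
    std::mt19937_64 rng(seed);
    std::bernoulli_distribution coin(0.5);
    Chunk packet(chunks[0].size(), 0);
    bitvec_out.assign(chunks.size(), false);
    for (std::size_t i = 0; i < chunks.size(); ++i)
        if (coin(rng)) {
            bitvec_out[i] = true;
            for (std::size_t j = 0; j < packet.size(); ++j)
                packet[j] ^= chunks[i][j];   // XOR the selected chunk in
        }
    return packet;
}

int main() {
    std::vector<Chunk> chunks = {{0x01, 0x02}, {0x30, 0x40}, {0x55, 0x66}};
    std::vector<bool> bitvec;
    Chunk packet = encode_packet(chunks, 42, bitvec);  // packet with index 42
    (void)packet;  // would be broadcast over the network
}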

28 Source coding in data networks Principle of Data Fountains Receiver: It collects n packets with the corresponding bit-vectors. The receiver knows the bit-vectors used by the sender because it uses exactly the same random number generator. After having obtained n packets with linearly independent bit-vectors (forming the matrix T), it inverts T and transforms the message M back into the original file F. (The random number sources have to be synchronized.)

29 Source coding in data networks Principle of Data Fountains Example [Figure: the sender XORs the data chunks selected by each random bit-vector into packets; the receiver collects the packets together with the corresponding bit-vectors.] The receiver cannot start decoding the incoming packets before it has an invertible matrix. If this is not (yet) the case, the receiver gathers more packets (and more random data) until the inverse matrix can be calculated. The random bits are not sent but generated by two synchronized (e.g., by an index) random number generators.

30 Source coding in data networks Principle of Data Fountains Pro: A missing packet means a missing random bit-vector. However, the next packet will likely fill the gap. In a traditional transmission, a missing part of a file can be replaced by no packet other than the missing one. Many sources can contribute without mutual cooperation. Con: Inverting the large matrix might be time- and memory-consuming. To generate a single new packet, the sender has to XOR about 50% of the entire file. Improvement: don't use fully random bits for the transform but a matrix which is mostly populated along the diagonal.

31 Motivation for compression in data networks The bandwidth provided by the physical layer is usually not fully available to an application because of protocol overhead introduced by each layer of the network stack. By optimizing protocols we can try to exploit the resources better. However, not transmitting redundant data bears a much larger potential for increased efficiency, often several orders of magnitude larger than the wasted network bandwidth. Question: Every once in a while we can read in the press that a compression algorithm has been invented which can (successfully) compress its own output again (repeatedly). In particular, it is usually claimed that random data can be compressed. Why is this unlikely? There is a long history of inventors and companies who claim to have achieved the above. For details see

32 Handling huge data volumes
Text: 1 page with 80 characters/line, 64 lines/page and 1 byte/char results in 80 * 64 * 1 * 8 = ca. 41 kbit/page.
Still image: 24 bits/pixel, 512 x 512 pixel/image results in 512 x 512 x 24 = ca. 6 Mbit/image.
Audio: CD quality, sampling rate 44.1 kHz, 16 bits per sample results in 44.1 x 16 = 706 kbit/s. Stereo: 1.412 Mbit/s.
Video: Full-size frame 1024 x 768 pixel/frame, 24 bits/pixel, 30 frames/s results in 1024 x 768 x 24 x 30 = 566 Mbit/s. More realistic: 360 x 240 pixel/frame: 360 x 240 x 24 x 30 = 60 Mbit/s.
=> Storage and transmission of multimedia streams require compression!

33 [Slide largely lost to extraction; only fragments survive.] Example 1: ABC -> 1; EE -> 2. Example 2: (lost). Note that in this example both algorithms lead to the same compression rate.

34 Run Length Coding Principle: Replace all repetitions of the same symbol in the text ("runs") by a repetition counter and the symbol. Example Text: AAAABBBAABBBBBCCCCCCCCDABCBAABBBBCCD Encoding: 4A3B2A5B8C1D1A1B1C1B2A4B2C1D As we can see, we can only expect a good compression rate when long runs occur frequently. Examples are long runs of blanks in text documents or leading white pixels in gray-scale images.
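A C++ sketch of the encoder, reproducing the encoding above:

#include <iostream>
#include <string>

// Replace each run of identical symbols by (repetition counter, symbol).
std::string rle_encode(const std::string& text) {
    std::string out;
    for (std::size_t i = 0; i < text.size();) {
        std::size_t run = 1;
        while (i + run < text.size() && text[i + run] == text[i]) ++run;
        out += std::to_string(run) + text[i];
        i += run;
    }
    return out;
}

int main() {
    std::cout << rle_encode("AAAABBBAABBBBBCCCCCCCCDABCBAABBBBCCD") << std::endl;
    // prints 4A3B2A5B8C1D1A1B1C1B2A4B2C1D
}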

35 When dealing with binary files we are sure that a run of 1s is always followed by a run of 0s and vice versa. It is thus sufficient to store the repetition counters only! Example [the bit string on the slide is lost]: if it is agreed that the first counter refers to 0s, then 0000011110011111 can be encoded as 5, 4, 2, 5.

36 Run Length Coding, Legal Issues - Beware of the patent trap!
Run-length encoding of the type (length, character):
US Patent No: 4,586,027
Title: Method and system for data compression and restoration
Filed: 07-Aug-1984, Granted: 29-Apr-1986
Inventor: Tsukiyama et al., Assignee: Hitachi

Run-length encoding (length [<= 16], character):
US Patent No: 4,872,009
Title: Method and apparatus for data compression and restoration
Filed: 07-Dec-1987, Granted: 03-Oct-1989
Inventor: Tsukiyama et al., Assignee: Hitachi

37 Variable Length Coding Classical character codes use the same number of bits for each character. When the frequency of occurrence differs between characters, we can use fewer bits for frequent characters and more bits for rare characters. Example Code 1: the position of the letter in the alphabet as a 5-bit binary number (A = 00001, B = 00010, C = 00011, D = 00100, E = 00101, ..., R = 10010). Encoding of ABRACADABRA with constant bit length (= 5 bits): 11 characters x 5 bits = 55 bits. Code 2 (variable length), e.g.: A = 0, B = 1, R = 01, C = 10, D = 11. Encoding: 0 1 01 0 10 0 11 0 1 01 0 (only 15 bits — but where does one codeword end and the next begin?)

38 Delimiters Code 2 can only be decoded unambiguously when delimiters are stored with the codewords. This can increase the size of the encoded string considerably. Idea: No codeword should be the prefix of another codeword! We will then no longer need delimiters. Code 3: A = 11, B = 00, R = 011, C = 010, D = 10. Encoded string: 1100011110101110110001111 (25 bits).

39 Representation as a TRIE (or prefix tree) An obvious method is to represent such a code as a TRIE. In fact, any TRIE with M leaf nodes can be used to represent a code for a string containing M different characters. The figure on the next page shows two codes which can be used for ABRACADABRA. The code for each character is represented by the path from the root of the TRIE to that character, where 0 goes to the left and 1 goes to the right, as is the convention for TRIEs. The TRIE on the left corresponds to the encoding of ABRACADABRA on the previous page; the TRIE on the right generates an encoding which is two bits shorter.

40 Two Tries for our Example The TRIE representation guarantees indeed that no codeword is the prefix of another codeword. Thus the encoded bit string can be uniquely decoded.

41 Huffman Code Now the question arises how we can find the best variable-length code for given character frequencies (or probabilities). The algorithm that solves this problem was found by David Huffman in 1952. Algorithm Generate-Huffman-Code:
1. Determine the frequencies of the characters and mark the leaf nodes of a binary tree (to be built) with them.
2. Out of the tree nodes not yet marked as DONE, take the two with the smallest frequencies and compute their sum. Create a parent node for them and mark it with the sum. Mark the branch to the left son with 0, the one to the right son with 1. Mark the two son nodes as DONE.
3. When there is only one node not yet marked as DONE, stop (the tree is complete). Otherwise, continue with step 2.
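A C++ sketch of the algorithm; a min-priority queue serves as the set of nodes not yet marked DONE (allocated nodes are not freed, for brevity):

#include <iostream>
#include <map>
#include <queue>
#include <string>
#include <vector>

struct Node {
    double freq;
    char symbol;                       // meaningful for leaves only
    Node* left = nullptr;
    Node* right = nullptr;
};

struct ByFreq {                        // orders the queue by smallest frequency
    bool operator()(const Node* a, const Node* b) const { return a->freq > b->freq; }
};

// Step 2 of the algorithm, repeated: merge the two smallest nodes.
Node* build_huffman(const std::map<char, double>& freqs) {
    std::priority_queue<Node*, std::vector<Node*>, ByFreq> open;
    for (auto [c, f] : freqs) open.push(new Node{f, c});
    while (open.size() > 1) {
        Node* l = open.top(); open.pop();
        Node* r = open.top(); open.pop();
        open.push(new Node{l->freq + r->freq, 0, l, r});  // parent marked with sum
    }
    return open.top();
}

// 0 marks the branch to the left son, 1 the branch to the right son.
void print_codes(const Node* n, const std::string& code) {
    if (!n->left) { std::cout << n->symbol << ": " << code << std::endl; return; }
    print_codes(n->left, code + "0");
    print_codes(n->right, code + "1");
}

int main() {
    print_codes(build_huffman({{'A', 0.3}, {'B', 0.3}, {'C', 0.1},
                               {'D', 0.15}, {'E', 0.15}}), "");
}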

42 Huffman Code, Example Probabilities of the characters: p(A) = 0.3; p(B) = 0.3; p(C) = 0.1; p(D) = 0.15; p(E) = 0.15. [Figure: the resulting Huffman tree. C (10%) and D (15%) are merged first into a 25% node, which is then merged with E (15%) into a 40% node; A and B (30% each) form a 60% node; the root combines both into 100%.]

43 Huffman Code, why is it optimal? Characters with higher probabilities are closer to the root of the tree and thus have shorter codeword lengths; thus it is a good code. It is even the best possible code! Reason: The length of an encoded string equals the weighted outer path length of the Huffman tree. To compute the weighted outer path length we first compute the product of the weight (frequency counter) of a leaf node with its distance from the root. We then compute the sum of all these values over the leaf nodes. This is obviously the same as summing up the products of each character's codeword length with its frequency of occurrence. No other tree with the same frequencies attached to the leaf nodes has a smaller weighted path length than the Huffman tree.

44 Sketch of the Proof With a similar construction process, another tree could be built but without always combining the two nodes with the minimal frequencies. We can show by induction that no other such strategy will lead to a smaller weighted outer path length than the one that combines the minimal values in each step.

45 Decoding Huffman Codes (1) An obvious possibility is to use the TRIE: Read the input stream sequentially and traverse the TRIE until a leaf node is reached. When a leaf node is reached, output the character attached to it. To decode the next character, start again at the root of the TRIE. Observation: The input bit rate is constant, the output character rate is variable.

46 Decoding Huffman Codes (2) As an alternative we can use a decoding table. Creation of the decoding table: If the longest codeword has L bits, the table has 2^L entries. Let c_i be the codeword for character s_i. Let c_i have l_i bits. We then create 2^(L-l_i) entries in the table. In each of these entries the first l_i bits are equal to c_i, and the remaining bits take on all possible L-l_i binary combinations. At all these addresses of the table we enter s_i as the character recognized, and we remember l_i as the length of the codeword.
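A C++ sketch of the table construction (here with code 3 from the earlier slides, so L = 3; names are illustrative):

#include <cstdint>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

struct Entry { char symbol; int code_len; };

// Each codeword c_i of length l_i fills the 2^(L-l_i) table addresses whose
// first l_i bits equal c_i.
std::vector<Entry> build_table(const std::vector<std::pair<char, std::string>>& code,
                               int L) {
    std::vector<Entry> table(std::size_t(1) << L);
    for (const auto& [sym, bits] : code) {
        int l = static_cast<int>(bits.size());
        std::uint32_t prefix =
            static_cast<std::uint32_t>(std::stoul(bits, nullptr, 2)) << (L - l);
        for (std::uint32_t k = 0; k < (1u << (L - l)); ++k)
            table[prefix | k] = {sym, l};
    }
    return table;
}

int main() {
    auto table = build_table({{'A', "11"}, {'B', "00"}, {'R', "011"},
                              {'C', "010"}, {'D', "10"}}, 3);
    std::cout << table[0b011].symbol << std::endl;   // buffer 011 -> 'R', length 3
}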

47 Decoding with the Table Algorithm Table-Based Huffman Decoder:
1. Read L bits from the input stream into a buffer.
2. Use the buffer as the address into the table and output the recognized character s_i.
3. Remove the first l_i bits from the buffer and pull in the next l_i bits from the input bit stream.
4. Continue with step 2.
Observation: Table-based Huffman decoding is fast. The output character rate is constant, the input bit rate is variable.

48 Huffman Code, Comments A very good code for many practical purposes. Can only be used when the frequencies (or probabilities) of the characters are known in advance. Variation: Determine the character frequencies separately for each new document and store/transmit the code tree/table with the data. Note that a loss in optimality comes from the fact that each character must be encoded with an integer number of bits, and thus the codeword lengths do not match the frequencies exactly (consider a code for three characters A, B and C, each occurring with a frequency of 33%).

49 Critical Review of 0-Order Codes (1) The string ABCD ABCD ABCD ABCD is quite redundant but cannot be compressed by Huffman. Huffman is sometimes referred to as a 0-order entropy model, which means that each character is generated independently of its predecessor. No character makes a particular successor more likely. [Figure: transition diagram; from each current character A, B, C, D every next character A, B, C, D is equally likely.]

50 Critical Review of 0-Order Codes (2) A string of the type ABCD ABCD ABCD ABCD could be better described by a 1-order entropy model: the current character gives a hint on what character is expected next. [Figure: transition diagram; each current character has exactly one successor: A -> B, B -> C, C -> D, D -> A.] The example above is trivial since each character uniquely determines the next one.

51 Critical Review of 0-Order Codes (3) Most sources produce characters with varying relative occurrences which can be encoded with Huffman, but which also exhibit inter-character correlations. In the English language a c is often followed by an h, but not very often by a z. The example on the right consists of only four characters A, B, C, D, each with an assumed relative occurrence of 25% (used in the next slide). Once an A has occurred, only another A or a B will follow. As an exception, a D is always followed by another D. Let us see how many bits we must spend if this particular correlation is known.

52 Critical Review of 0-Order Codes (4) We assume that the occurrence of every character is equal (25%). Once it has occurred, the following two characters are equally likely (e.g., 50% for A->A and 50% for A->B).* Notation: X = current character; P(X) = probability of character X; Y|X = Y occurred and X was its predecessor; P(Y|X) = probability of Y if X was its predecessor.

X =                      A           B           C           D
P(X) =                   0.25        0.25        0.25        0.25
Y|X =                    A|A  B|A    B|B  C|B    C|C  D|C    D|D
P(Y|X) =                 0.5  0.5    0.5  0.5    0.5  0.5    1.0
-log2(P(Y|X)) =          1    1      1    1      1    1      0
Sum P(Y|X)(-log2(..)) =  1           1           1           0

H(Y|X) = 0.25*1 + 0.25*1 + 0.25*1 + 0.25*0 = 0.75 bits for coding a character (the first character must be given).

*Note: The example is not realistic. It was chosen for easy calculation.

53 Critical Review of 0-Order Codes (5) Number of bits needed in 1-order entropy models in general:

H(Y|X) = Σ_x P(x) * Σ_y P(y|x) * (-1) * log2(P(y|x))

P(y|x) is the probability of getting a character y next if we have currently seen an x; the inner sum is therefore the number of bits we need in order to code a Y if an X occurred already. In the outer sum, all possible occurrences of X are considered: x can be considered to be the first character, occurring with probability P(x), which determines the next character y. Note that in real-world examples P(y|x) can be zero very often, because many combinations of characters simply never occur. This is why we can save on bits!
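A short C++ check of this formula against the example from the previous slide (the transition probabilities are the ones assumed there):

#include <cmath>
#include <iostream>
#include <vector>

// H(Y|X) = sum_x P(x) * sum_y P(y|x) * (-1) * log2(P(y|x))
double conditional_entropy(const std::vector<double>& px,
                           const std::vector<std::vector<double>>& pyx) {
    double h = 0.0;
    for (std::size_t x = 0; x < px.size(); ++x)
        for (double p : pyx[x])
            if (p > 0.0)                        // combinations that never occur
                h += px[x] * p * -std::log2(p); // contribute nothing
    return h;
}

int main() {
    // rows = current character A, B, C, D; columns = next character A, B, C, D
    std::vector<std::vector<double>> pyx = {{0.5, 0.5, 0.0, 0.0},
                                            {0.0, 0.5, 0.5, 0.0},
                                            {0.0, 0.0, 0.5, 0.5},
                                            {0.0, 0.0, 0.0, 1.0}};
    std::cout << conditional_entropy({0.25, 0.25, 0.25, 0.25}, pyx)
              << " bits" << std::endl;          // prints 0.75 bits
}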

54 Lempel-Ziv Code Lempel-Ziv codes are an example of the large group of dictionary-based codes. Dictionary: A table of character strings which is used in the encoding process. Example: The word lecture is found on page x4, line y4 of the dictionary. It can thus be encoded as (x4,y4). A sentence such as this is a lecture could perhaps be encoded as a sequence of tuples (x1,y1) (x2,y2) (x3,y3) (x4,y4).

55 Dictionary-Based Coding Techniques Static techniques: The dictionary exists before a string is encoded. It is not changed, neither in the encoding nor in the decoding process. Dynamic techniques: The dictionary is created on the fly during the encoding process, at the sending (and sometimes also at the receiving) side. Lempel and Ziv have proposed an especially brilliant dynamic, dictionary-based technique (1977). Variations of this technique are used very widely today for lossless compression. An example is LZW (Lempel/Ziv/Welch), which is invoked with the Unix compress command. The well-known TIFF format (Tag Image File Format) is also based on Lempel-Ziv coding.

56 Ziv-Lempel Coding, the Principle The current piece of the message can be encoded as a reference to an earlier (identical) piece of the message. This reference will usually be shorter than the piece itself. As the message is processed, the dictionary is created dynamically.

InitializeStringTable();
WriteCode(ClearCode);
w = the empty string;
for each character in string {
    K = GetNextCharacter();
    if w + K is in the string table {
        w = w + K;    /* string concatenation */
    } else {
        WriteCode(CodeFromString(w));
        AddTableEntry(w + K);
        w = K;
    }
}
WriteCode(CodeFromString(w));

57 LZW, Example 1, Encoding Alphabet: {A, B, C} Message: A B A B C B A B A B

w     K    w+K   in table?   output    new entry
-     A    A     yes         -         -
A     B    AB    no          1 (A)     4 = AB
B     A    BA    no          2 (B)     5 = BA
A     B    AB    yes         -         -
AB    C    ABC   no          4 (AB)    6 = ABC
C     B    CB    no          3 (C)     7 = CB
B     A    BA    yes         -         -
BA    B    BAB   no          5 (BA)    8 = BAB
B     A    BA    yes         -         -
BA    B    BAB   yes         -         -
BAB   end  -     -           8 (BAB)   -

Encoded message: 1 2 4 3 5 8. Dictionary: 1 = A, 2 = B, 3 = C, 4 = AB, 5 = BA, 6 = ABC, 7 = CB, 8 = BAB.

58 LZW Algorithm: Decoding (1) Note that the decoding algorithm also creates the dictionary dynamically; the dictionary is not transmitted!

while ((Code = GetNextCode()) != EofCode) {
    if (Code == ClearCode) {
        InitializeTable();
        Code = GetNextCode();
        if (Code == EofCode)
            break;
        WriteString(StringFromCode(Code));
        OldCode = Code;
    } /* end of ClearCode case */
    else {
        if (IsInTable(Code)) {
            WriteString(StringFromCode(Code));
            AddStringToTable(StringFromCode(OldCode) +
                             FirstChar(StringFromCode(Code)));
            OldCode = Code;
        }

59 LZW Algorithm: Decoding (2)

        else { /* code is not in table */
            OutString = StringFromCode(OldCode) +
                        FirstChar(StringFromCode(OldCode));
            WriteString(OutString);
            AddStringToTable(OutString);
            OldCode = Code;
        }
    }
}

60 LZW, Example 2, Decoding Alphabet: {A, B, C, D} Transmitted code: 1 2 1 3 5 9

code   oldcode   output   new entry
1      -         A        -
2      1         B        5 = AB
1      2         A        6 = BA
3      1         C        7 = AC
5      3         AB       8 = CA
9      5         ABA      9 = ABA  (code not yet in table: old string + its first char)

Decoded message: A B A C AB ABA. Dictionary: 1 = A, 2 = B, 3 = C, 4 = D, 5 = AB, 6 = BA, 7 = AC, 8 = CA, 9 = ABA.

61 LZW, Properties The dictionary is created dynamically during the encoding and decoding process. It is neither stored nor transmitted! The dictionary adapts dynamically to the properties of the character string. With length N of the original message, the encoding process is of complexity O(N). With length M of the encoded message, the decoding process is of complexity O(M). These are thus very efficient processes. Since several characters of the input alphabet are combined into one character of the code, M <= N.
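A compact C++ version of the encoder (a hash map provides the dictionary lookups, giving the O(N) behaviour; codes start at 1 as in Example 1):

#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

std::vector<int> lzw_encode(const std::string& msg, const std::string& alphabet) {
    std::unordered_map<std::string, int> dict;
    for (std::size_t i = 0; i < alphabet.size(); ++i)
        dict[std::string(1, alphabet[i])] = i + 1;   // initial single-char codes
    std::vector<int> out;
    std::string w;
    for (char k : msg) {
        if (dict.count(w + k)) {
            w += k;                                  // extend the current match
        } else {
            out.push_back(dict[w]);
            int next_code = static_cast<int>(dict.size()) + 1;
            dict[w + k] = next_code;                 // new dictionary entry
            w = std::string(1, k);
        }
    }
    out.push_back(dict[w]);
    return out;
}

int main() {
    for (int c : lzw_encode("ABABCBABAB", "ABC"))
        std::cout << c << ' ';                       // prints 1 2 4 3 5 8
    std::cout << std::endl;
}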

62 Typical Compression Rates Typical examples of file sizes in % of the original size:

Type of file     Encoded with Huffman    Encoded with Lempel-Ziv
C source code    65%                     45%
machine code     80%                     55%
text             50%                     30%

63 Arithmetic Coding From an information theory point of view, the Huffman code is not quite optimal since a codeword must always consist of an integer number of bits even if this does not correspond exactly to the frequency of occurrence of the character. Arithmetic coding solves this problem. Idea An entire message is represented by a floating point number out of the interval [0,1). For this purpose the interval [0,1) is repeatedly subdivided according to the frequency of the next symbol. Each new sub-interval represents one symbol. When the process is completed the shortest floating point number contained in the target interval is chosen as the representative for the message.

64 Arithmetic Coding, the Algorithm
1. Begin in front of the first character of the input stream, with the current interval set to [0,1).
2. Read the next character from the input stream. Subdivide the current interval according to the frequencies of all characters of the alphabet. Select the subinterval corresponding to the current character as the next current interval.
3. If you reach the end of the input stream or the end symbol, go to step 4. Otherwise go to step 2.
4. From the current (final) interval, select the floating point number that you can represent in the computer with the smallest number of bits. This number is the encoding of the string.
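A C++ sketch of the interval subdivision with plain doubles (fine for short messages; choosing the cheapest representable number in the final interval is left out):

#include <iostream>
#include <map>
#include <string>
#include <utility>

// Repeatedly subdivide [low, high) according to the character frequencies
// and keep the sub-interval of the character that was read.
std::pair<double, double> arith_encode(const std::string& msg,
                                       const std::map<char, double>& prob) {
    double low = 0.0, high = 1.0;
    for (char c : msg) {
        double range = high - low, cum = 0.0;
        for (auto [sym, p] : prob) {
            if (sym == c) {
                high = low + range * (cum + p);   // upper end of sub-interval
                low = low + range * cum;          // lower end of sub-interval
                break;
            }
            cum += p;
        }
    }
    return {low, high};
}

int main() {
    auto [low, high] = arith_encode("ACB", {{'A', 0.2}, {'B', 0.3}, {'C', 0.5}});
    std::cout << "[" << low << ", " << high << ")" << std::endl; // [0.12, 0.15)
}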

65 Arithmetic Coding, the Decoding Algorithm Algorithm Arithmetic Decoding Subdivide the interval [0,1) according to the character frequencies, as described in the encoding algorithm, up to the maximum size of a message. The encoded floating point number uniquely identifies one particular subinterval. This subinterval uniquely identifies one particular message. Output the message.

66 Arithmetic Coding, Example Alphabet = {A,B,C}. Frequencies (probabilities): p(A) = 0.2; p(B) = 0.3; p(C) = 0.5. Messages: ACB, AAB, ... (maximum size of a message is 3). Encoding of the first block ACB:

Interval [0, 1):      A: [0, 0.2)     B: [0.2, 0.5)    C: [0.5, 1)     -> read A, keep [0, 0.2)
Interval [0, 0.2):    A: [0, 0.04)    B: [0.04, 0.1)   C: [0.1, 0.2)   -> read C, keep [0.1, 0.2)
Interval [0.1, 0.2):  A: [0.1, 0.12)  B: [0.12, 0.15)  C: [0.15, 0.2)  -> read B, keep [0.12, 0.15)

Final interval: [0.12, 0.15); choose e.g. 0.125 (binary 0.001, the representable number with the fewest bits).

67 Arithmetic Coding, Implementation So far we dealt with real numbers of (theoretically) infinite precision. How do we actually encode a message with many characters (like several megabytes)? If character A occurs with a probability of 20%, the number of digits for coding consecutive As grows very fast: first A < 0.2, second A < 0.04, third A < 0.008, and so on. Let us assume that our processor has 8-bit wide registers. The interval boundaries 0, 0.2, 0.5 and 1 then become the register values 0 (decimal 0), 51 (0.2 x 255 = 51), 127 (0.5 x 255 = 127) and 255 (decimal 255), so A covers [0, 51), B [51, 127) and C [127, 255]. Note: We skip storing or transmitting the leading "0." since it is redundant. Advantage: We have a binary fixed-point representation of the fractions and we can compute with them in the processor's registers.

68 Arithmetic Coding, Implementation Disadvantage: Even when using 32-bit or 64-bit CPU registers we can only code a couple of characters. Solution: Once our interval gets smaller and smaller, we will obtain a growing number of leading bits which have settled (they will never change). So we will transmit them to the receiver and shift them out of our register to gain new bits. Example: Interval of the first A = [00000000, 00110011]. No matter which characters follow, the most significant two zeros of the lower and the upper bound will never change. So we store or transmit them and shift the rest two digits to the left. As a consequence we gain two new least significant digits: A = [000000??, 110011??]

69 Arithmetic Coding, Implementation A = [000000??, 110011??] How do we initialize the new digits prior to the ongoing encoding? Obviously, we want to keep the interval as large as possible. This is achieved by filling the lower bound with 0-bits and the upper bound with 1-bits: A_new = [00000000, 11001111] Note that always adding 0-bits to the lower bound and 1-bits to the upper bound introduces a small error, because the size of the interval no longer corresponds exactly to the probability of character A. However, we will not get into trouble as long as the encoding and the decoding side make the same mistake.

70 Arithmetic Coding, Properties The encoding depends on the probabilities (relative occurrences) of the characters. The higher the frequency, the larger the subinterval and the smaller the number of bits needed to represent it. The code length reaches the theoretical optimum: the number of bits used for each character need not be an integer. It can approach the real probability better than the Huffman code. We always need a terminal symbol to stop the encoding process. Problem: One bit error destroys the entire message.


CS 493: Algorithms for Massive Data Sets Dictionary-based compression February 14, 2002 Scribe: Tony Wirth LZ77 CS 493: Algorithms for Massive Data Sets February 14, 2002 Dictionary-based compression Scribe: Tony Wirth This lecture will explore two adaptive dictionary compression schemes: LZ77 and LZ78. We use the

More information

Data Compression Fundamentals

Data Compression Fundamentals 1 Data Compression Fundamentals Touradj Ebrahimi Touradj.Ebrahimi@epfl.ch 2 Several classifications of compression methods are possible Based on data type :» Generic data compression» Audio compression»

More information

UNIT-II. Part-2: CENTRAL PROCESSING UNIT

UNIT-II. Part-2: CENTRAL PROCESSING UNIT Page1 UNIT-II Part-2: CENTRAL PROCESSING UNIT Stack Organization Instruction Formats Addressing Modes Data Transfer And Manipulation Program Control Reduced Instruction Set Computer (RISC) Introduction:

More information

FAULT TOLERANT SYSTEMS

FAULT TOLERANT SYSTEMS FAULT TOLERANT SYSTEMS http://www.ecs.umass.edu/ece/koren/faulttolerantsystems Part 6 Coding I Chapter 3 Information Redundancy Part.6.1 Information Redundancy - Coding A data word with d bits is encoded

More information

Networking Link Layer

Networking Link Layer Networking Link Layer ECE 650 Systems Programming & Engineering Duke University, Spring 2018 (Link Layer Protocol material based on CS 356 slides) TCP/IP Model 2 Layer 1 & 2 Layer 1: Physical Layer Encoding

More information

Scribe: Virginia Williams, Sam Kim (2016), Mary Wootters (2017) Date: May 22, 2017

Scribe: Virginia Williams, Sam Kim (2016), Mary Wootters (2017) Date: May 22, 2017 CS6 Lecture 4 Greedy Algorithms Scribe: Virginia Williams, Sam Kim (26), Mary Wootters (27) Date: May 22, 27 Greedy Algorithms Suppose we want to solve a problem, and we re able to come up with some recursive

More information

David Rappaport School of Computing Queen s University CANADA. Copyright, 1996 Dale Carnegie & Associates, Inc.

David Rappaport School of Computing Queen s University CANADA. Copyright, 1996 Dale Carnegie & Associates, Inc. David Rappaport School of Computing Queen s University CANADA Copyright, 1996 Dale Carnegie & Associates, Inc. Data Compression There are two broad categories of data compression: Lossless Compression

More information

Some portions courtesy Robin Kravets and Steve Lumetta

Some portions courtesy Robin Kravets and Steve Lumetta CSE 123 Computer Networks Fall 2009 Lecture 4: Data-Link I: Framing and Errors Some portions courtesy Robin Kravets and Steve Lumetta Administrative updates I m Im out all next week no lectures, but You

More information

ELEC 691X/498X Broadcast Signal Transmission Winter 2018

ELEC 691X/498X Broadcast Signal Transmission Winter 2018 ELEC 691X/498X Broadcast Signal Transmission Winter 2018 Instructor: DR. Reza Soleymani, Office: EV 5.125, Telephone: 848 2424 ext.: 4103. Office Hours: Wednesday, Thursday, 14:00 15:00 Slide 1 In this

More information

Fault-Tolerant Computing

Fault-Tolerant Computing Fault-Tolerant Computing Dealing with Mid-Level Impairments Oct. 2007 Error Detection Slide 1 About This Presentation This presentation has been prepared for the graduate course ECE 257A (Fault-Tolerant

More information

More Bits and Bytes Huffman Coding

More Bits and Bytes Huffman Coding More Bits and Bytes Huffman Coding Encoding Text: How is it done? ASCII, UTF, Huffman algorithm ASCII C A T Lawrence Snyder, CSE UTF-8: All the alphabets in the world Uniform Transformation Format: a variable-width

More information

CHW 261: Logic Design

CHW 261: Logic Design CHW 261: Logic Design Instructors: Prof. Hala Zayed Dr. Ahmed Shalaby http://www.bu.edu.eg/staff/halazayed14 http://bu.edu.eg/staff/ahmedshalaby14# Slide 1 Slide 2 Slide 3 Digital Fundamentals CHAPTER

More information

CS422 Computer Networks

CS422 Computer Networks CS422 Computer Networks Lecture 3 Data Link Layer Dr. Xiaobo Zhou Department of Computer Science CS422 DataLinkLayer.1 Data Link Layer Design Issues Services Provided to the Network Layer Provide service

More information

Figure-2.1. Information system with encoder/decoders.

Figure-2.1. Information system with encoder/decoders. 2. Entropy Coding In the section on Information Theory, information system is modeled as the generationtransmission-user triplet, as depicted in fig-1.1, to emphasize the information aspect of the system.

More information

ENEE x Digital Logic Design. Lecture 3

ENEE x Digital Logic Design. Lecture 3 ENEE244-x Digital Logic Design Lecture 3 Announcements Homework due today. Homework 2 will be posted by tonight, due Monday, 9/2. First recitation quiz will be tomorrow on the material from Lectures and

More information

Huffman Code Application. Lecture7: Huffman Code. A simple application of Huffman coding of image compression which would be :

Huffman Code Application. Lecture7: Huffman Code. A simple application of Huffman coding of image compression which would be : Lecture7: Huffman Code Lossless Image Compression Huffman Code Application A simple application of Huffman coding of image compression which would be : Generation of a Huffman code for the set of values

More information

Lossless compression II

Lossless compression II Lossless II D 44 R 52 B 81 C 84 D 86 R 82 A 85 A 87 A 83 R 88 A 8A B 89 A 8B Symbol Probability Range a 0.2 [0.0, 0.2) e 0.3 [0.2, 0.5) i 0.1 [0.5, 0.6) o 0.2 [0.6, 0.8) u 0.1 [0.8, 0.9)! 0.1 [0.9, 1.0)

More information

Module 2: Computer Arithmetic

Module 2: Computer Arithmetic Module 2: Computer Arithmetic 1 B O O K : C O M P U T E R O R G A N I Z A T I O N A N D D E S I G N, 3 E D, D A V I D L. P A T T E R S O N A N D J O H N L. H A N N E S S Y, M O R G A N K A U F M A N N

More information

Functional Programming in Haskell Prof. Madhavan Mukund and S. P. Suresh Chennai Mathematical Institute

Functional Programming in Haskell Prof. Madhavan Mukund and S. P. Suresh Chennai Mathematical Institute Functional Programming in Haskell Prof. Madhavan Mukund and S. P. Suresh Chennai Mathematical Institute Module # 02 Lecture - 03 Characters and Strings So, let us turn our attention to a data type we have

More information

15 Data Compression 2014/9/21. Objectives After studying this chapter, the student should be able to: 15-1 LOSSLESS COMPRESSION

15 Data Compression 2014/9/21. Objectives After studying this chapter, the student should be able to: 15-1 LOSSLESS COMPRESSION 15 Data Compression Data compression implies sending or storing a smaller number of bits. Although many methods are used for this purpose, in general these methods can be divided into two broad categories:

More information

Data link layer functions. 2 Computer Networks Data Communications. Framing (1) Framing (2) Parity Checking (1) Error Detection

Data link layer functions. 2 Computer Networks Data Communications. Framing (1) Framing (2) Parity Checking (1) Error Detection 2 Computer Networks Data Communications Part 6 Data Link Control Data link layer functions Framing Needed to synchronise TX and RX Account for all bits sent Error control Detect and correct errors Flow

More information

Error Detection Codes. Error Detection. Two Dimensional Parity. Internet Checksum Algorithm. Cyclic Redundancy Check.

Error Detection Codes. Error Detection. Two Dimensional Parity. Internet Checksum Algorithm. Cyclic Redundancy Check. Error Detection Two types Error Detection Codes (e.g. CRC, Parity, Checksums) Error Correction Codes (e.g. Hamming, Reed Solomon) Basic Idea Add redundant information to determine if errors have been introduced

More information

DLD VIDYA SAGAR P. potharajuvidyasagar.wordpress.com. Vignana Bharathi Institute of Technology UNIT 1 DLD P VIDYA SAGAR

DLD VIDYA SAGAR P. potharajuvidyasagar.wordpress.com. Vignana Bharathi Institute of Technology UNIT 1 DLD P VIDYA SAGAR UNIT I Digital Systems: Binary Numbers, Octal, Hexa Decimal and other base numbers, Number base conversions, complements, signed binary numbers, Floating point number representation, binary codes, error

More information

Multimedia Systems. Part 20. Mahdi Vasighi

Multimedia Systems. Part 20. Mahdi Vasighi Multimedia Systems Part 2 Mahdi Vasighi www.iasbs.ac.ir/~vasighi Department of Computer Science and Information Technology, Institute for dvanced Studies in asic Sciences, Zanjan, Iran rithmetic Coding

More information

Bits, Words, and Integers

Bits, Words, and Integers Computer Science 52 Bits, Words, and Integers Spring Semester, 2017 In this document, we look at how bits are organized into meaningful data. In particular, we will see the details of how integers are

More information

CS473-Algorithms I. Lecture 11. Greedy Algorithms. Cevdet Aykanat - Bilkent University Computer Engineering Department

CS473-Algorithms I. Lecture 11. Greedy Algorithms. Cevdet Aykanat - Bilkent University Computer Engineering Department CS473-Algorithms I Lecture 11 Greedy Algorithms 1 Activity Selection Problem Input: a set S {1, 2,, n} of n activities s i =Start time of activity i, f i = Finish time of activity i Activity i takes place

More information

Ad hoc and Sensor Networks Chapter 6: Link layer protocols. Holger Karl

Ad hoc and Sensor Networks Chapter 6: Link layer protocols. Holger Karl Ad hoc and Sensor Networks Chapter 6: Link layer protocols Holger Karl Goals of this chapter Link layer tasks in general Framing group bit sequence into packets/frames Important: format, size Error control

More information

Computer and Network Security

Computer and Network Security CIS 551 / TCOM 401 Computer and Network Security Spring 2009 Lecture 6 Announcements First project: Due: 6 Feb. 2009 at 11:59 p.m. http://www.cis.upenn.edu/~cis551/project1.html Plan for Today: Networks:

More information

Encoding. A thesis submitted to the Graduate School of University of Cincinnati in

Encoding. A thesis submitted to the Graduate School of University of Cincinnati in Lossless Data Compression for Security Purposes Using Huffman Encoding A thesis submitted to the Graduate School of University of Cincinnati in a partial fulfillment of requirements for the degree of Master

More information

Digital Fundamentals

Digital Fundamentals Digital Fundamentals Tenth Edition Floyd Chapter 2 2009 Pearson Education, Upper 2008 Pearson Saddle River, Education NJ 07458. All Rights Reserved Decimal Numbers The position of each digit in a weighted

More information

2 nd Week Lecture Notes

2 nd Week Lecture Notes 2 nd Week Lecture Notes Scope of variables All the variables that we intend to use in a program must have been declared with its type specifier in an earlier point in the code, like we did in the previous

More information