Digital Image Representation

Chapter 2: Representation of Multimedia Data
- Audio Technology
- Images and Graphics
- Video Technology
Chapter 3: Multimedia Systems: Communication Aspects and Services
Chapter 4: Multimedia Systems: Storage Aspects

Images and Graphics
- Digital image representation
- Image formats and color models
- JPEG
- Image synthesis and graphics systems

A digital image is a spatial representation of an object (2D, 3D scene or another image, real or virtual).

Definition of a digital image:
Let $I, J, K \subset \mathbb{Z}$ be finite intervals. Let $G \subset \mathbb{N}$ with $|G| < \infty$ be the grey scale level / color depth (intensity value of a picture element = a pixel) of the image.
(1) A 2D-image is a function $f: I \times J \to G$
(2) A 3D-image is a function $f: I \times J \times K \to G$
(3) If $G = \{0, 1\}$, the function is a binary (or bit) image, otherwise it is a pixel image

The resolution depends on the size of I and J (and K) and describes the number of pixels per row resp. column.

Example:
To display a 525-line television picture (NTSC) without noticeable degradation with a Video Graphics Array (VGA) video controller, 640 x 480 pixels and 256 discrete grey levels give an array of 307,200 8-bit numbers and a total of 2,457,600 bit.

Image Representation

An Image Capturing Format is specified by:
- spatial resolution (pixel x pixel) and
- color encoding (bits per pixel)
Example: captured image of a DVD video with 4:3 picture size:
- spatial resolution: 768 x 576 pixel
- color encoding: 1-bit (binary image), 8-bit (color or grayscale), 24-bit (color, RGB)

An Image Storing Format is a 2-dimensional array of values representing the image in a bitmap or pixmap, respectively. This is also called raster graphics.
Each field of a bitmap holds a single binary digit. Each field of a pixmap may hold:
- 3 numbers representing the intensities of the red, green, and blue components of the color
- 3 numbers representing indices into tables of red, green and blue intensities
- A single number as index into a table of color triples
- A single number as index into any other data structure that represents a color / color system
Further properties can be assigned to the whole image: width, height, depth, version, etc.

Color Models

Why store values for red, green and blue?
- Color perception by the human brain is possible through the additive composition of red, green and blue light (RGB system).
- The relative intensities of the RGB values are transmitted to the monitor, where they are reproduced at each point in time.
- On a computer monitor, each pixel is given as an overlay of those three image tones with different intensities; by this, any color can be reproduced.

But: another possible color model is CMYK.
- When printing an image, other color components are used: cyan, magenta, yellow and key (black), which together can also reproduce all colors.
- Thus, much image processing software and also some image storing formats support this model as well.
Color Models

Another possibility is to use a different representation of color information by means of the YUV system, where
- Y is the brightness (or luminance) information
- U and V are color difference signals (chrominance)
- Y, U and V are functions of R, G and B

Why? The human eye is more sensitive to brightness than to chrominance. Separating the brightness information from the color information and coding the more important luminance with more bits than the chrominance can save bits in the representation format.

Usual scheme:
\[ Y = 0.30\,R + 0.59\,G + 0.11\,B \]
(the color sensitivity of the human eye is considered)
\[ U = c_1 (B - Y); \quad V = c_2 (R - Y) \]
$c_1, c_2$ = constants reflecting perception aspects of the human eye and the human brain.

Possible coding (YUV signal):
\[ Y = 0.30\,R + 0.59\,G + 0.11\,B \]
\[ U = (B - Y) \cdot 0.493 = -0.148\,R - 0.291\,G + 0.439\,B \]
\[ V = (R - Y) \cdot 0.877 = 0.614\,R - 0.517\,G - 0.096\,B \]
- This is a system of 3 equations for determining Y, U, V from R, G, B, or for recalculating R, G, B from Y, U, V.
- The resolution of Y is more important than the resolution of U and V.
- Spend more bits for Y than for U and V (Y : U : V = 4 : : ).
- The weighting factors in the calculation of the Y signal compensate the color perception imbalance of the human eye.

Image Formats

Lots of different image formats are in use today, e.g.
- GIF (Graphics Interchange Format): Compressed with some basic lossless compression techniques to 5% of original picture without loss. Supports 24-bit colors.
- BMP (Bitmap): Device-independent representation of an image; uses the RGB color model, without compression. Color depth up to 24-bit, with the additional option of specifying a color table to use.
- TIFF (Tagged Image File Format): Supports grey levels, RGB, and the CMYK color model. Also supports lots of different compression methods. Additionally contains a descriptive part with properties a display should provide to show the image.
- PostScript: Images are described without reference to special properties such as resolution. A nice feature for printers, but hard to include into documents where you have to know the image size.
- JPEG (Joint Photographic Experts Group): Lots of possible compressions, mostly with loss.

Why Compression?

High-resolution image: e.g. 1024 x 768 pixel, 24 bit color depth:
1024 x 768 x 24 = 18,874,368 bit
- Image formats like GIF: lossless compression (entropy encoding) for reducing the data amount while keeping image quality.
- JPEG: lossy compression; removes some image details to achieve a higher compression rate by suppressing higher frequencies. Combined with lossless techniques. Trade-off between file size and quality.

JPEG is a joint standard of ISO and ITU-T:
- In June 1987, an adaptive transformation coding technique based on DCT was adopted for JPEG.
- In 1992, JPEG became an ISO international standard.
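The YUV coding from the Color Models slide can be sketched in a few lines. This is only an illustration under the coefficients quoted there (0.30 / 0.59 / 0.11 for Y, 0.493 and 0.877 for the difference signals); the function names `rgb_to_yuv` and `yuv_to_rgb` are hypothetical and chosen for this example.

```python
def rgb_to_yuv(r, g, b):
    """Forward conversion with the weights from the slide."""
    y = 0.30 * r + 0.59 * g + 0.11 * b   # luminance
    u = 0.493 * (b - y)                  # blue colour difference
    v = 0.877 * (r - y)                  # red colour difference
    return y, u, v


def yuv_to_rgb(y, u, v):
    """Inverse conversion: solve the three equations for R, G, B."""
    b = y + u / 0.493
    r = y + v / 0.877
    g = (y - 0.30 * r - 0.11 * b) / 0.59
    return r, g, b


if __name__ == "__main__":
    # A grey pixel carries no chrominance: U and V are (numerically) zero.
    print(rgb_to_yuv(100, 100, 100))
    # Round trip for a saturated colour.
    print(yuv_to_rgb(*rgb_to_yuv(200, 30, 90)))
```

Because the three weights of Y sum to 1, any grey value (R = G = B) maps to Y equal to that value and to vanishing colour differences, which is why the chrominance channels can be coded more coarsely.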
JPEG

Requirements on JPEG:
- JPEG implementation: independent of image size; applicable to any image and pixel aspect ratio
- Color representation: JPEG applies to color and grey-scaled still images
- Image content: of any complexity, with any statistical characteristics

Properties of JPEG:
- State-of-the-art regarding compression factor and image quality
- Runs on as many available standard processors as possible
- Compression mechanisms are available as software-only packages or together with specific hardware support; the use of specialized hardware should speed up image decompression
- The encoded data stream has a fixed interchange format
- Fast coding is also used for video sequences: Motion JPEG

How could we compress?

Entropy encoding
- The data stream is considered to be a simple digital sequence without semantics
- Lossless coding; the decompression process regenerates the data completely
- Used regardless of the media's specific characteristics
- Examples: run-length encoding, Huffman encoding, arithmetic encoding

Source encoding
- Semantics of the data are taken into account
- Lossy coding (encoded data are not identical with the original data)
- The degree of compression depends on the data contents
- Example: Discrete Cosine Transformation (DCT) as transformation technique of the spatial domain into the two-dimensional frequency domain

Hybrid encoding
- Used by most multimedia systems
- Combination of entropy and source encoding
- Examples: JPEG, MPEG, H.261

Compression Steps in JPEG

[Figure: Uncompressed image → Image preparation (pixel, block, MCU) → Image processing (predictor, DCT) → Quantization (approximation of real numbers by rational numbers) → Entropy encoding (run-length, Huffman, arithmetic) → Compressed image]
MCU: Minimum Coded Unit; DCT: Discrete Cosine Transform

Image preparation
- Analog-to-digital conversion
- Image division into blocks of N x N pixels
- Suitable structuring and ordering of image information

Image processing (source encoding)
- Transformation from the time (here: spatial) domain to the frequency domain using DCT
- In principle no compression itself, but computation of new coefficients as input for the compression process

Quantization
- Mapping of real numbers into rational numbers (approximation)
- A certain loss of precision will in general be unavoidable

Entropy encoding
- Lossless compression of a sequential digital data stream
The Principle

[Figure: Encoder: Original image → Transformation → Quantization (get rid of invisible details; the details cannot be reconstructed) → Encode (Huffman, run-length encoding) → JPEG picture. The JPEG decoder does the opposite: table-based decoding → Dequantization → Retransformation → Original image.]
- Without quantization, the encoding gain would be very poor (or non-existent)
- Transformation and retransformation must be inverse to each other
- Task of the transformation: produce a picture representation which may be encoded with a high gain of reduction

Variants of Image Compression

JPEG is not a single format, but it can be chosen from a number of modes:
- Lossy sequential DCT-based mode (baseline process)
  - Must be supported by every JPEG implementation
  - Block, MCU, FDCT, run-length, Huffman
- Expanded lossy DCT-based mode
  - Enhancement to the baseline process by adding progressive encoding
- Lossless mode
  - Low compression ratio, perfect reconstruction of the original image
  - No DCT, but differential encoding by prediction
- Hierarchical mode
  - Accommodates images of different resolutions
  - Selects its algorithms from the three other modes

First Step: Image Preparation

General image model:
- Independence from image parameters like size and pixel ratio
- Description of most of the well-known picture representations
- A source picture consists of 1 to 255 components (planes) $C_i$
- Components may be assigned to RGB or YUV values; for example, $C_1$ may be assigned to the red color information
- Each component $C_i$ can have a different number of superpixels $X_i$, $Y_i$ (a superpixel is a rectangle of pixels which all have the same value)

Picture Preparation - Components

The resolution of the components may be different:
[Figure: components $C_1, C_2, \dots, C_N$, each a plane of $X_i \times Y_i$ superpixels]
- A grey-scale image consists (in most cases) of a single component
- The RGB color representation has three components with equal resolution: $X_1 = X_2 = X_3$, $Y_1 = Y_2 = Y_3$
- YUV color image processing uses $Y_1 = 4\,Y_2 = 4\,Y_3$ and $X_1 = 4\,X_2 = 4\,X_3$
Image Preparation - Dimensions

The dimensions of a compressed image are defined by
- X (maximum of all $X_i$),
- Y (maximum of all $Y_i$),
- $H_i$ and $V_i$ (relative horizontal and vertical sampling ratios for each component i), with
\[ H_i = \frac{X_i}{\min_j X_j} \quad\text{and}\quad V_i = \frac{Y_i}{\min_j Y_j} \]
$H_i$ and $V_i$ must be integers in the range of 1 to 4. This restriction is needed for the interleaving of components.

Example: Y = 4 pixels, X = 6 pixels
- $C_1$: $X_1 = 6$, $Y_1 = 4$, $H_1 = 2$, $V_1 = 2$
- $C_2$: $X_2 = 6$, $Y_2 = 2$, $H_2 = 2$, $V_2 = 1$
- $C_3$: $X_3 = 3$, $Y_3 = 2$, $H_3 = 1$, $V_3 = 1$

Image Preparation - Data Ordering

An image is divided into several components which can be processed one by one. But: how to prepare a component for processing?
- Observation for most parts of an image: there is not much difference between the values in a rectangle of N x N pixels
- For further processing: divide each component of an image into blocks of N x N pixels
- Thus, the image is divided into data units (blocks):
  - Lossless mode uses one pixel as one data unit
  - Lossy mode uses blocks of 8 x 8 pixels (with 8 or 12 bits per pixel)

Non-interleaved data ordering:
- The easiest, but not the most convenient sequence of data processing
- Data units are processed component by component
- For one component, the processing order is left-to-right and top-to-bottom
- With the non-interleaved technique, an RGB-encoded image is processed by:
  - First the red component only
  - Then the blue component, followed by the green component
- This is (for speed reasons) less suitable than data unit interleaving

Interleaved Data Ordering

Often more suitable: interleave data units.
- Interleaving means: don't process all blocks component by component, but mix data units from all components
- Interleaved data units of different components are combined to Minimum Coded Units (MCUs)
- If all components have the same resolution: an MCU consists of one data unit for each component
- If components have different resolutions:
  1. For each component, regions of data units are determined; data units in one region are ordered left-to-right and top-to-bottom
  2. Each component consists of the same number of regions
  3. An MCU consists of one region in each component
- Up to 4 components can be encoded in interleaved mode (according to JPEG)
- Each MCU consists of at most ten data units
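The sampling-ratio formulas and the MCU size limit can be checked with a small sketch. The helper names `sampling_factors` and `data_units_per_mcu` are hypothetical; the rules encoded here (integer ratios between 1 and 4, at most 4 interleaved components, at most ten data units per MCU) are the ones stated on the slides.

```python
def sampling_factors(dims):
    """dims: list of (X_i, Y_i) per component -> list of (H_i, V_i)."""
    min_x = min(x for x, _ in dims)
    min_y = min(y for _, y in dims)
    factors = []
    for x, y in dims:
        if x % min_x or y % min_y:
            raise ValueError("sampling ratios must be integers")
        h, v = x // min_x, y // min_y
        if not (1 <= h <= 4 and 1 <= v <= 4):
            raise ValueError("H_i and V_i must lie in the range 1..4")
        factors.append((h, v))
    return factors


def data_units_per_mcu(factors):
    """One region of H_i x V_i data units per component forms an MCU."""
    if len(factors) > 4:
        raise ValueError("at most 4 components can be interleaved")
    total = sum(h * v for h, v in factors)
    if total > 10:
        raise ValueError("an MCU may contain at most ten data units")
    return total


if __name__ == "__main__":
    # Example from the slide: C1 = 6 x 4, C2 = 6 x 2, C3 = 3 x 2
    factors = sampling_factors([(6, 4), (6, 2), (3, 2)])
    print(factors, data_units_per_mcu(factors))
```

For the three-component example this yields the factors (2, 2), (2, 1), (1, 1), i.e. 4 + 2 + 1 = 7 data units in each MCU.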
Image Preparation - MCUs

MCU example with four components $C_1, C_2, C_3, C_4$:
- $C_1$: $H_1 = 2$, $V_1 = 2$ (data units $a_{ij}$)
- $C_2$: $H_2 = 2$, $V_2 = 1$ (data units $b_{ij}$)
- $C_3$: $H_3 = 1$, $V_3 = 2$ (data units $c_{ij}$)
- $C_4$: $H_4 = 1$, $V_4 = 1$ (data units $d_{ij}$)

MCUs (9 data units per MCU):
MCU 1 = $a_{00}\,a_{01}\,a_{10}\,a_{11}\;b_{00}\,b_{01}\;c_{00}\,c_{10}\;d_{00}$
MCU 2 = $a_{02}\,a_{03}\,a_{12}\,a_{13}\;b_{02}\,b_{03}\;c_{01}\,c_{11}\;d_{01}$
MCU 3 = $a_{04}\,a_{05}\,a_{14}\,a_{15}\;b_{04}\,b_{05}\;c_{02}\,c_{12}\;d_{02}$
MCU 4 = $a_{20}\,a_{21}\,a_{30}\,a_{31}\;b_{10}\,b_{11}\;c_{20}\,c_{30}\;d_{10}$

where
\[ H_i = \frac{x_i}{\min_j x_j}, \quad V_i = \frac{y_i}{\min_j y_j} \]

Compression Steps in JPEG

Result of image preparation:
- A sequence of 8 x 8 blocks; the order is defined by the MCUs
- The samples are encoded with 8 bit/pixel
Next step: image processing by source encoding.

Source Encoding - Transformation

Encoding by transformation: data are transformed into another mathematical domain, which is more suitable for compression.
- The inverse transformation must exist and must be easy to calculate
- Most widely known example: Fourier transformation (cf. Fast Fourier Transformation, FFT)
\[ F_{uv} = \sum_{x=0}^{m-1} \sum_{y=0}^{n-1} f_{xy}\; e^{-2\pi i \frac{ux}{m}}\; e^{-2\pi i \frac{vy}{n}} \]
- The parameters m and n indicate the granularity
- Most effective transformation for image compression: Discrete Cosine Transformation (DCT)
\[ F_{uv} = \delta_{nm} \sum_{x=0}^{m-1} \sum_{y=0}^{n-1} f_{xy} \cos\frac{\pi u (2x+1)}{2m} \cos\frac{\pi v (2y+1)}{2n} \]

Discrete Cosine Transformation

Let $f_{xy}$ be a pixel (x, y) in the original picture ($0 \le x \le N-1$; $0 \le y \le N-1$).
\[ F_{uv} := \gamma_N\, c_u\, c_v \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f_{xy} \cos\frac{(2x+1)\,u\,\pi}{2N} \cos\frac{(2y+1)\,v\,\pi}{2N}, \quad u, v \in \{0, \dots, N-1\} \]
\[ c_u = \begin{cases} \frac{1}{\sqrt{2}} & u = 0 \\ 1 & u > 0 \end{cases} \]
- $f_{xy}$: space domain (i.e. "geometric")
- $F_{uv}$: frequency domain (indicates how fast the information moves inside the rectangle)
- $F_{00}$ is the lowest frequency in both directions, i.e. a measure of the average pixel value
- $F_{uv}$ with small total frequency (i.e. u+v small) are (in general) larger than $F_{uv}$ with large u+v
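Retransformation: Inverse Cosine Transformation

\[ f_{xy} = \delta_N \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} c_u\, c_v\, F_{uv} \cos\frac{(2x+1)\,u\,\pi}{2N} \cos\frac{(2y+1)\,v\,\pi}{2N} \]

Simplest example (just for demonstration): let $f_{xy} = f$ = constant. Then
\[ F_{00} = \gamma_N \cdot \tfrac{1}{2} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f \cos(0)\cos(0) = \gamma_N \frac{N^2}{2} f, \quad \text{all other } F_{uv} = 0 \]
\[ f = f_{00} = \delta_N \sum_{u} \sum_{v} c_u\, c_v\, F_{uv} \cos(\dots)\cos(\dots) = \delta_N\, c_0\, c_0\, F_{00} = \delta_N\, \gamma_N \frac{N^2}{4} f \overset{!}{=} f \]
If $\delta_N = \gamma_N$, then $\gamma_N = \frac{2}{N}$.

Example N = 8 (standard): $\gamma_8 = \frac{1}{4}$.

Example N = 2 ($\gamma_2 = 1$):
\[ F_{uv} = c_u\, c_v \sum_{x=0}^{1} \sum_{y=0}^{1} f_{xy} \cos\frac{(2x+1)\,u\,\pi}{4} \cos\frac{(2y+1)\,v\,\pi}{4} \]
\[ = c_u\, c_v \left[ f_{00} \cos\frac{u\pi}{4}\cos\frac{v\pi}{4} + f_{01} \cos\frac{u\pi}{4}\cos\frac{3\pi v}{4} + f_{10} \cos\frac{3\pi u}{4}\cos\frac{\pi v}{4} + f_{11} \cos\frac{3\pi u}{4}\cos\frac{3\pi v}{4} \right] \]
\[ F_{00} = \tfrac{1}{2} \left[ f_{00} + f_{01} + f_{10} + f_{11} \right], \quad \text{i.e. } F_{00} \approx 2f \text{ if } f_{xy} \approx f \]
Transformed values can be much smaller than the original values:
\[ F_{01} = c_0\, c_1 \left[ f_{00} \cos\frac{\pi}{4} + f_{01} \cos\frac{3\pi}{4} + f_{10} \cos\frac{\pi}{4} + f_{11} \cos\frac{3\pi}{4} \right] = \tfrac{1}{2} \left[ f_{00} - f_{01} + f_{10} - f_{11} \right] \]
Positive and negative terms cancel, i.e. if $f_{xy} \approx f$ then $F_{01} \approx 0$.

Baseline Process - Image Processing

First step of image processing:
- Samples are encoded with 8 bits/pixel; each pixel is an integer in the range [0, 255]
- Pixel values are shifted to the range [-128, 127] (2-complement representation)
- Data units of 8 x 8 pixel values are defined by $f_{xy} \in [-128, 127]$, where x, y are in the range [0, 7]
- Each value is transformed using the Forward DCT (FDCT):
\[ F_{uv} = \frac{1}{4}\, c_u\, c_v \sum_{x=0}^{7} \sum_{y=0}^{7} f_{xy} \cos\frac{(2x+1)\,u\,\pi}{16} \cos\frac{(2y+1)\,v\,\pi}{16} \]
\[ \text{where } c_{u/v} = \begin{cases} \frac{1}{\sqrt{2}} & \text{for } u/v = 0 \\ 1 & \text{otherwise} \end{cases} \quad\text{and } u, v \in [0, 7] \]
- The cosine expressions are independent of $f_{xy}$, so fast calculation is possible
- Result: from 64 coefficients $f_{xy}$ we get 64 coefficients $F_{uv}$ in the frequency domain

Meaning of Coefficients

How can DCT be useful for JPEG? The $F_{uv}$ for larger values of u and v are often very small.
[Figure: 8 x 8 block → transformation to frequencies; coefficients $F_{00}, F_{01}, F_{02}, \dots$ arranged from low to high frequency in both directions]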
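The FDCT and its inverse for N = 8 can be written down directly from the formulas. The following is a deliberately naive sketch (four nested loops, no fast algorithm); the function names `fdct` and `idct` are chosen for this example, and blocks are assumed to be plain 8 x 8 lists of numbers.

```python
import math

N = 8


def c(k):
    """Normalisation factor c_u / c_v from the DCT definition."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def cos_term(x, u):
    return math.cos((2 * x + 1) * u * math.pi / (2 * N))


def fdct(f):
    """Forward DCT of an 8 x 8 block f[x][y] -> F[u][v]."""
    return [[0.25 * c(u) * c(v) * sum(f[x][y] * cos_term(x, u) * cos_term(y, v)
                                      for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct(F):
    """Inverse DCT of F[u][v] -> f[x][y] (same factor 1/4 as the FDCT)."""
    return [[0.25 * sum(c(u) * c(v) * F[u][v] * cos_term(x, u) * cos_term(y, v)
                        for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


if __name__ == "__main__":
    flat = [[10] * N for _ in range(N)]
    F = fdct(flat)
    print(round(F[0][0], 6))   # constant block: only the DC coefficient is non-zero
```

For a constant block with value f the sketch reproduces the demonstration from the slide: $F_{00} = \gamma_8 \cdot \frac{N^2}{2} f = 8f$ and all other coefficients vanish up to rounding error.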
Baseline Process - Image Processing

Coefficient $F_{00}$: DC-coefficient
- Corresponds to the lowest frequency in both dimensions
- Determines the fundamental color of the data unit of 64 pixels
- Normally the values for $F_{00}$ are very similar in neighboring blocks

Other coefficients ($F_{uv}$ for u+v > 0): AC-coefficients
- Non-zero frequency in one or both dimensions

Reconstruction of the image: Inverse DCT (IDCT)
- If FDCT and IDCT could be calculated with full precision, DCT would be lossless
- In practice the precision is restricted (real numbers!), thus DCT is lossy; different implementations of a JPEG decoder may produce different images

Reason for the transformation:
- Experience shows that many AC-coefficients have a value of almost zero, i.e. they are zero after quantization, so entropy encoding may lead to significant data reduction.

Compression Steps in JPEG

Result of image processing: 8 x 8 blocks of DC/AC coefficients.
Until now, no compression is done; this task is enabled by quantization.

Baseline Process - Quantization

Observation: how to enforce that even more values are zero?
Answer: by quantization. Divide $F_{uv}$ by the quantum $Q_{uv}$ and take the nearest integer as the result:
\[ F^Q_{uv} = \left[ \frac{F_{uv}}{Q_{uv}} \right] \]
Dequantization:
\[ F^*_{uv} = F^Q_{uv} \cdot Q_{uv} \quad \text{(only an approximation of } F_{uv}\text{)} \]
[Figure: N x N coefficient block before and after quantization: smaller values, most values are zero]

Example: N = 8; quantization step = 2, $Q_{uv} = 2(u+v) + 3$
\[ Q_{uv} = \begin{pmatrix} 3 & 5 & 7 & \cdots & 17 \\ 5 & 7 & 9 & \cdots & 19 \\ \vdots & & & & \vdots \\ 17 & 19 & 21 & \cdots & 31 \end{pmatrix} \]

Quantization process:
- Divide the DCT-coefficient value $F_{uv}$ by an integer number $Q_{uv}$ and round the result to the nearest integer
- Quantization of all DCT-coefficients results in a lossy transformation: some image details given by higher frequencies are cut off
- The JPEG application provides a table with 64 entries, each used for the quantization of one DCT-coefficient, so each coefficient can be adjusted separately
- A high compression factor is achievable at the expense of image quality: large quantization numbers give high data reduction, but the information loss increases
- No default values for quantization tables are specified in JPEG
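Quantization and dequantization with the example table $Q_{uv} = 2(u+v) + 3$ fit into a short sketch. The function names are hypothetical, and rounding ties upward in `nearest_int` is an assumption made here; the slides only say "nearest integer".

```python
import math

N = 8

# Example quantization table from the slide: Q_uv = 2 (u + v) + 3
Q = [[2 * (u + v) + 3 for v in range(N)] for u in range(N)]


def nearest_int(x):
    """Round to the nearest integer (ties go upward; an assumption)."""
    return math.floor(x + 0.5)


def quantize(F):
    """F^Q_uv = [F_uv / Q_uv]"""
    return [[nearest_int(F[u][v] / Q[u][v]) for v in range(N)] for u in range(N)]


def dequantize(FQ):
    """F*_uv = F^Q_uv * Q_uv, only an approximation of F_uv."""
    return [[FQ[u][v] * Q[u][v] for v in range(N)] for u in range(N)]


if __name__ == "__main__":
    F = [[0.0] * N for _ in range(N)]
    F[0][0], F[0][1], F[7][7] = 86.0, -18.0, 10.0   # made-up coefficients
    FQ = quantize(F)
    print(FQ[0][0], FQ[0][1], FQ[7][7])
    print(dequantize(FQ)[0][0], dequantize(FQ)[0][1])
```

The made-up high-frequency coefficient 10 at position (7, 7) is divided by 31 and vanishes, while the DC value 86 comes back as 87: the reconstruction error per coefficient is bounded by half the corresponding table entry.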
Quantization Example

[Figure: worked example on one 8 x 8 block, shown as a sequence of matrices:
- Input values from an exemplary grey-scale image. First: subtract 128 from each element. Then: perform the FDCT.
- FDCT output values $F_{uv}$ (for space reasons only the integer part), with the DC coefficient in the top-left corner
- Quantization matrix for quality level 2
- Quantized matrix
- $F^*_{uv}$: reconstruction after dequantization; effects of quantization: individual reconstructed coefficients differ from the correct values
- Reconstructed image after performing the inverse DCT
- Error in reconstruction: indication of quality loss]
Problem of Quantization

- Cutting off higher frequencies leads to partly wrong color information; the higher the quantization coefficients, the more disturbance there is in an 8 x 8 block
- Result: edges of blocks can be seen

Compression Steps in JPEG

Result of quantization: 8 x 8 blocks of DC/AC coefficients with lots of zeros.
How to process and encode the data efficiently?

Baseline Process - Entropy Encoding

Initial step: map the 8 x 8 block of transformed values $F^Q_{uv}$ to a 64-element vector which can be further processed by entropy encoding.
- DC-coefficients determine the basic color of the data units in an image; the variation between the DC-coefficients of successive data units is typically small. The DC-coefficient is therefore encoded as the difference between the current coefficient and the previous one.
- AC-coefficients: the processing order uses the zig-zag sequence
[Figure: zig-zag ordering of an 8 x 8 block, starting at the DC-coefficient and moving towards the AC-coefficients of higher frequencies]
- Coefficients with lower frequencies are encoded first, followed by higher frequencies. Result: a sequence of similar data bytes, which allows efficient entropy encoding.

Example Entropy Encoding

[Figure: quantized 8 x 8 block and the coefficient sequence obtained from it in zig-zag order, ending in a long run of zeros]
- DC-coefficient: code the coefficient of one block as the difference to that of the previous one
- AC-coefficients: consider each block separately; order the data using the zig-zag sequence to achieve long sequences of zero values
- Entropy encoding:
  - Run-length encoding of zero values of quantized AC-coefficients
  - Huffman encoding on DC- and AC-coefficients
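The mapping of a quantized block to a 64-element vector and the differential coding of DC-coefficients can be sketched as follows. The function names are illustrative, blocks are assumed to be 8 x 8 lists indexed as `block[row][col]`, and starting the DC prediction at 0 for the first block is an assumption of this sketch.

```python
def zigzag_order(n=8):
    """Index pairs (row, col) of an n x n block in zig-zag sequence."""
    def key(pos):
        row, col = pos
        diagonal = row + col
        # Odd diagonals run top-right to bottom-left, even ones the other way.
        return (diagonal, row if diagonal % 2 else col)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag_scan(block):
    """Map an n x n block to a vector: DC first, then AC by rising frequency."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def dc_differences(dc_values):
    """Encode each DC-coefficient as difference to the previous block's DC."""
    previous = 0   # assumed start value for the first block
    diffs = []
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


if __name__ == "__main__":
    print(zigzag_order()[:10])
    print(dc_differences([86, 88, 85]))
```

Since quantization leaves non-zero values mostly in the top-left corner of a block, the scan collects them at the front of the vector and pushes the zeros into one long trailing run.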
Run-length Encoding

Run-length encoding is a content-dependent coding technique:
- Sequences of the same bytes are replaced by the number of their occurrences
- A special flag byte is used which doesn't occur in the byte stream itself

Coding procedure:
- If a byte occurs at least four consecutive times, the number of occurrences minus 4 (offset = 4) is counted
- The compressed data contain this byte, followed by the special flag and the number of occurrences minus 4
- As a consequence: representation of 4 to 259 bytes with three bytes is possible (with corresponding compression effect)

Example with ! as special flag:
Uncompressed sequence:     ABCCCCCCCCDEFGGG
Run-length coded sequence: ABC!4DEFGGG
The offset is 4, since for smaller blocks there would be no reduction effect; e.g. with offset 3: D!0 for DDD (both strings have the same length).

Run-length Encoding in JPEG

JPEG works similarly:
- The zero value is the only one appearing in longer sequences, thus a more efficient coding is used that only compresses zero sequences: code non-zero coefficients together with their run-length, i.e. the number of zeros preceding the non-zero value
- Run-length $\in \{0, \dots, 15\}$, i.e. 4 bit for representing the length of zero sequences
- Coded sequence: run-length, size, amplitude, with
  - run-length: number of subsequent zero-coefficients
  - size: number of bits used for representing the following coefficient
  - amplitude: value of that following coefficient, using size bits
- Adapting the size of a coefficient's representation to its value achieves a further compression, because most coefficients for higher frequencies have very small values
- If (run-length, size) = (15, 0), then there are more than 15 zeros after each other
- (0, 0) = EoB symbol (End of Block) indicates the termination of the actual rectangle (EoB is very frequently used)

Example

Size i | Amplitude
1      | -1, 1
2      | -3, -2, 2, 3
3      | -7, ..., -4, 4, ..., 7
4      | -15, ..., -8, 8, ..., 15
i      | $-2^i + 1, \dots, -2^{i-1},\ 2^{i-1}, \dots, 2^i - 1$
10     | -1023, ..., -512, 512, ..., 1023

For instance, a value from the fourth row is represented by size = 4 and a 4-bit amplitude.
[Table: 4-bit 2-complement representation of the values -8 ... 7 (other representations are possible)]

A sequence of 35 zeros followed by the value 121 is encoded by
(15, 0), (15, 0), (5, 7, 57)
i.e. 35 zeros in all, followed by a value represented using 7 bit. With 7 bit, 121 is 64 + 57.
In a second step, the string may be reduced further by Huffman encoding principles.

Huffman Encoding

The Huffman code is an optimal code using the minimum number of bits for a string of data with given probabilities per character.
- Statistical encoding method:
  - For each character, a probability of occurrence is known by encoder and decoder
  - Frequently occurring characters are coded with shorter strings than seldom occurring characters
  - Successive characters are coded independently of each other
- The resulting code is prefix-free, so unique decoding is guaranteed
- A binary tree is constructed to determine the Huffman codewords of the characters:
  - Leaves represent the characters that are to be encoded
  - Nodes contain the occurrence probability of the characters belonging to the subtree
  - Edges of the tree are assigned 0 and 1
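The generic flag-byte run-length scheme described above (offset 4, flag `!`) can be sketched on byte strings. The names `rle_encode` and `rle_decode` are hypothetical, and the count is stored as a raw byte value here, so the slide's textual `C!4` appears as the three bytes `C`, `!`, `0x04`.

```python
FLAG = ord("!")           # flag byte; must not occur in the input data
OFFSET = 4                # runs shorter than this are copied unchanged
MAX_RUN = OFFSET + 255    # 259: longest run one (byte, flag, count) triple can hold


def rle_encode(data: bytes) -> bytes:
    if FLAG in data:
        raise ValueError("flag byte occurs in the input")
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < MAX_RUN:
            run += 1
        if run >= OFFSET:
            out += bytes([data[i], FLAG, run - OFFSET])   # byte, flag, count - 4
        else:
            out += data[i:i + run]                        # too short: copy as is
        i += run
    return bytes(out)


def rle_decode(coded: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(coded):
        if i + 1 < len(coded) and coded[i + 1] == FLAG:
            out += bytes([coded[i]]) * (coded[i + 2] + OFFSET)
            i += 3
        else:
            out.append(coded[i])
            i += 1
    return bytes(out)


if __name__ == "__main__":
    coded = rle_encode(b"ABCCCCCCCCDEFGGG")
    print(coded, rle_decode(coded))
```

The eight C's collapse into three bytes, while the three G's stay untouched because a triple would not be shorter than the run itself.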
Huffman Encoding

Algorithm for computing the Huffman code:
1.) List all characters as well as their frequencies
2.) Select the two list elements with the smallest frequency and remove them from the list
3.) Make them the leaves of a tree, whereby the probabilities of both elements are added; place the tree into the list
4.) Repeat steps 2 and 3 until the list contains only one element
5.) Mark all edges: the edge from a father to its left son with one bit value, the edge to its right son with the other (0 and 1)
The code words result from the path from the root to the leaves.

Huffman Encoding Example

Suppose that characters A, B, C, D and E occur with probabilities
p(A) = 0.27, p(B) = 0.36, p(C) = 0.16, p(D) = 0.14, p(E) = 0.07
[Figure: Huffman tree with the inner nodes p(ED) = 0.21, p(CED) = 0.37, p(AB) = 0.63 and the root p(ADCEB) = 1.00]
Resulting code: A, B and C receive code words w(x) of 2 bits, D and E code words of 3 bits.

Huffman Encoding in JPEG

Coding of run-length ($\in \{0, \dots, 15\}$) and size ($\in \{0, \dots, 10\}$):
- (i, j): i preceding zeros ($0 \le i \le 15$) in front of a non-zero value coded with j bits
- The table has 16 x 10 + 2 = 162 entries with significantly different occurrence probabilities
- EoB is relatively frequent
- ZRL: at least 16 successive zeros, i.e. ZRL = (15, 0)
- Some values such as (15, 10) are extremely rare: 15 preceding zeros in front of a very large value is practically impossible. The same holds for most of the combinations in the table.
- Thus: Huffman coding of the table entries will lead to significant further compression.

[Table: rows = run-length 0 ... 15, columns = size 0 ... 10. Entry (0, 0) is EoB, entry (15, 0) is ZRL, all other entries with size 0 are impossible; the remaining entries are the pairs (i, j).]

- Different Huffman tables for (run-length, size) are used for different 8 x 8 blocks, based on their contents
- Thus the coding begins with an HTN (Huffman table number)
- The coding of amplitudes may also change from block to block
- Amplitude codes are stored in the preceding (run-length, size) coding table
- An 8 x 8 block is thus coded as follows:
  [VLC, DC coefficient, sequence of (run-length, size, amplitude) for the AC coefficients]
  VLC = variable length code: contains the actual HTN + the actual VLI (Variable Length Integer), i.e. the coding method for the next amplitude
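The five-step Huffman algorithm can be sketched with a priority queue. This is one possible implementation, not necessarily the construction used on the slides: the name `huffman_code` is hypothetical, and labelling the first-removed (less probable) subtree with 0 and the other with 1 is an assumption, so the bit patterns may differ from the slide's tree while the code word lengths agree.

```python
import heapq
from itertools import count


def huffman_code(probabilities):
    """Return {character: code word} for a dict of character probabilities."""
    tiebreak = count()   # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {ch: ""}) for ch, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Steps 2 and 3: take the two least probable subtrees and join them.
        p0, _, left = heapq.heappop(heap)
        p1, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


if __name__ == "__main__":
    probs = {"A": 0.27, "B": 0.36, "C": 0.16, "D": 0.14, "E": 0.07}
    code = huffman_code(probs)
    for ch in sorted(code):
        print(ch, code[ch])
    print(sum(probs[ch] * len(code[ch]) for ch in probs))   # average code length
```

For the example probabilities the merge order is E+D (0.21), C+ED (0.37), A+B (0.63) and finally CED+AB (1.00), which gives an average code word length of about 2.21 bits per character.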
Alternative to Huffman: Arithmetic Coding

Characteristics:
- Achieves optimality (coding rate) as the Huffman coding does
- Difference to Huffman: the entire data stream has an assigned probability, which consists of the probabilities of the contained characters. Coding a character takes place with consideration of all previous characters.
- The data are coded as an interval of real numbers between 0 and 1. Each value within the interval can be used as code word.
- The minimum length of the code is determined by the assigned probability.
- Disadvantage: the data stream can be decoded only as a whole.

Arithmetic Coding: Example

Code the data ACAB with $p_A = 0.5$, $p_B = 0.2$, $p_C = 0.3$:
- Start: A = [0, 0.5), B = [0.5, 0.7), C = [0.7, 1.0)
- After A: $p_{AA} = 0.25$, $p_{AB} = 0.1$, $p_{AC} = 0.15$; AC = [0.35, 0.5)
- After AC: $p_{ACA} = 0.075$, $p_{ACB} = 0.03$, $p_{ACC} = 0.045$; ACA = [0.35, 0.425), ACB = [0.425, 0.455), ACC = [0.455, 0.5)
- After ACA: $p_{ACAA} = 0.0375$, $p_{ACAB} = 0.015$, $p_{ACAC} = 0.0225$; ACAB = [0.3875, 0.4025)
ACAB can be coded by each binary number from the interval [0.3875, 0.4025), rounded up to $-\log_2(p_{ACAB}) = 6.06$, i.e. 7 bit.

Variants of Image Compression

JPEG is not a single format, but it can be chosen from a number of modes:
- Lossy sequential DCT-based mode (baseline process)
  - Presented before, but not the only method
- Expanded lossy DCT-based mode
  - Enhancement to the baseline process by adding progressive encoding
- Lossless mode
  - Low compression ratio, perfect reconstruction of the original image
  - No DCT, but differential encoding
- Hierarchical mode
  - Accommodates images of different resolutions
  - Selects its algorithms from the three other modes

Variants: Expanded Lossy DCT-based Mode

With sequential encoding as in the baseline process, the whole image is coded and decoded in a single run.
An alternative to sequential encoding is progressive encoding, done in the entropy encoding step. Two alternatives for progressive encoding are possible:
- Spectral selection: at first, coefficients of low frequencies are passed to entropy encoding; coefficients of higher frequencies are processed in successive runs
- Successive approximation: all coefficients are transferred in one run, but most-significant bits are encoded prior to less-significant bits

12 possible coding alternatives in the expanded mode:
- Using sequential encoding, spectral selection, or successive approximation (3 variants)
- Using Huffman or arithmetic encoding (2 variants)
- Using 8 or 12 bits for representing the samples (2 variants)
Most popular mode: sequential display mode with 8 bits/sample and Huffman encoding.
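Returning to the arithmetic coding example (ACAB with $p_A = 0.5$, $p_B = 0.2$, $p_C = 0.3$): the interval narrowing can be reproduced exactly with rational arithmetic. This is a minimal sketch; the function names are hypothetical, and picking the smallest fitting binary fraction as code word is just one valid choice, since any number inside the interval would do.

```python
from fractions import Fraction
from math import ceil, log2


def cumulative_ranges(probs):
    """Assign each character (in dict order) its sub-interval start and width."""
    ranges, start = {}, Fraction(0)
    for ch, p in probs.items():
        ranges[ch] = (start, p)
        start += p
    return ranges


def encode_interval(message, probs):
    """Narrow [0, 1) character by character; returns (low, high)."""
    ranges = cumulative_ranges(probs)
    low, width = Fraction(0), Fraction(1)
    for ch in message:
        start, p = ranges[ch]
        low += width * start
        width *= p
    return low, low + width


def code_word(low, high):
    """Shortest-possible binary fraction 0.b1b2... lying inside [low, high)."""
    n = ceil(-log2(float(high - low)))     # minimum length from the probability
    while True:
        k = ceil(low * 2 ** n)             # smallest n-bit fraction >= low
        if Fraction(k, 2 ** n) < high:
            return format(k, "0{}b".format(n))
        n += 1                             # no n-bit fraction fits: use one more bit


if __name__ == "__main__":
    probs = {"A": Fraction(1, 2), "B": Fraction(1, 5), "C": Fraction(3, 10)}
    low, high = encode_interval("ACAB", probs)
    print(float(low), float(high), code_word(low, high))
```

The sketch ends at the interval [0.3875, 0.4025) derived on the slide and finds a 7-bit code word inside it, in line with $-\log_2(0.015) \approx 6.06$.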
Expanded Lossy DCT-based Mode (Example)

Sequential encoding: the image is coded and decoded in a single run.
[Figure: Step 1, Step 2, Step 3 of sequential decoding]
Progressive encoding: the image is coded and decoded in refining steps.
[Figure: Step 1, Step 2, Step 3 of progressive decoding]

Variants: Lossless Mode

Lossless mode uses differential encoding (differential encoding is also known as prediction or relative encoding):
- Suited for a sequence of characters whose values are different from zero, but which do not differ much
- Calculate only the difference with respect to the previous value (used also for DC-coefficients)

Differential encoding for still images:
- Avoid using DCT/quantization
- Instead: calculation of differences between nearby pixels or pixel groups
- Edges are represented by large values
- Areas with similar luminance and chrominance are represented by small values
- A homogeneous area is represented by a large number of zeros, so further compression with run-length encoding is possible, as for DCT

Variants: Lossless Mode

- Uses data units of single pixels for image preparation
- Any precision between 2 and 16 bits/pixel can be used
- Image processing and quantization use a predictive technique instead of transformation encoding
- 8 predictors are specified for each pixel X by means of a combination of the already known adjacent samples A, B, and C
[Figure: Uncompressed data → Predictor → Entropy encoder → Compressed data; pixel X with its adjacent samples A, B and C]
- The actual predictor should give the best approximation of X by the already known values A, B, C:

Selection | Prediction
0         | no prediction
1         | X = A
2         | X = B
3         | X = C
4         | X = A + B - C
5         | X = A + (B - C)/2
6         | X = B + (A - C)/2
7         | X = (A + B)/2

- The number of the chosen predictor and the difference between the prediction and the actual value are passed to entropy encoding (Huffman or arithmetic encoding)
- Example:
  - (4, 0): X is exactly given by A + B - C
  - (7, 1): X is (A + B)/2 + 1

Variants: Hierarchical Mode

The hierarchical mode uses either the lossy DCT-based algorithms or the lossless compression technique.
- The idea: encoding of an image at different resolutions
- Algorithm:
  - The image is initially sampled at a low resolution
  - Subsequently, the resolution is raised and the compressed image is subtracted from the previous result
  - The process is repeated until the full resolution of the image is obtained in a compressed form
- Disadvantage: requires substantially more storage capacity
- Advantage: the compressed image is immediately available at different resolutions, so scaling becomes cheap
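The predictor table of the lossless mode translates almost literally into code. In this sketch the names `PREDICTORS`, `encode_pixel` and `decode_pixel` are hypothetical; treating "no prediction" as a prediction of 0 and using floor division for the "/2" terms are assumptions made for the illustration.

```python
# Predictors 0..7 as functions of the already known neighbours A, B, C.
PREDICTORS = {
    0: lambda a, b, c: 0,                  # no prediction (assumed: predict 0)
    1: lambda a, b, c: a,
    2: lambda a, b, c: b,
    3: lambda a, b, c: c,
    4: lambda a, b, c: a + b - c,
    5: lambda a, b, c: a + (b - c) // 2,   # "/2" taken as floor division
    6: lambda a, b, c: b + (a - c) // 2,
    7: lambda a, b, c: (a + b) // 2,
}


def encode_pixel(x, a, b, c, selection):
    """Return (predictor number, difference between actual and predicted value)."""
    return selection, x - PREDICTORS[selection](a, b, c)


def decode_pixel(selection, difference, a, b, c):
    """Exact inverse of encode_pixel: the reconstruction is lossless."""
    return PREDICTORS[selection](a, b, c) + difference


def best_predictor(x, a, b, c):
    """Pick the predictor with the smallest absolute difference for this pixel."""
    return min(PREDICTORS, key=lambda s: abs(x - PREDICTORS[s](a, b, c)))


if __name__ == "__main__":
    a, b, c = 100, 110, 105                # made-up neighbour values
    print(encode_pixel(105, a, b, c, 4))   # X exactly given by A + B - C
    print(encode_pixel(106, a, b, c, 7))   # X is (A + B)/2 + 1
```

With the made-up neighbours A = 100, B = 110, C = 105, the two calls reproduce the slide's examples (4, 0) and (7, 1); only these small differences are handed on to the entropy encoder.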