Outline

Chapter 2: Basics
- Audio Technology
- Images and Graphics
- Video and Animation

2.2: Images and Graphics
- Digital image representation
- Image formats and color models
- JPEG, JPEG 2000
- Image synthesis and graphics systems
- Image analysis

Chapter 3: Multimedia Systems - Communication Aspects and Services
Chapter 4: Multimedia Systems - Storage Aspects
Chapter 5: Multimedia Usage

Digital Image Representation

A digital image is a spatial representation of an object (a 2D or 3D scene, or another image, real or virtual).

Definition of a digital image: Let $I, J, K \subset \mathbb{Z}$ be finite intervals. Let $G \subset \mathbb{N}$ with $|G| < \infty$ be the set of grey-scale levels (the color depth), i.e. the intensity values a picture element (pixel) of the image can take.
(1) A 2D image is a function $f: I \times J \to G$.
(2) A 3D image is a function $f: I \times J \times K \to G$.
(3) If $G = \{0, 1\}$, the function is a binary (or bit) image; otherwise it is a pixel image.

The resolution depends on the size of I and J (and K) and gives the number of pixels per row and per column.

Example: To display a 525-line television picture (NTSC) without noticeable degradation on a Video Graphics Array (VGA) video controller, 640 x 480 pixels and 256 discrete grey levels give an array of 307,200 8-bit numbers, for a total of 2,457,600 bits.

Image Representation

An image capturing format is specified by its spatial resolution (pixel x pixel) and its color encoding (bits per pixel).

Example: captured image of a DVD video with 4:3 picture size:
- spatial resolution: 768 x 576 pixels
- color encoding: 1-bit (binary image), 8-bit (color or grayscale), 24-bit (color, RGB)

An image storing format is a 2-dimensional array of values representing the image as a bitmap or a pixmap, also called raster graphics. Each field of a bitmap holds a single binary digit. Each field of a pixmap may hold:
- 3 numbers representing the intensities of the red, green, and blue components of the color
- 3 numbers representing indices into tables of red, green, and blue intensities
- a single number used as an index into a table of color triples
- a single number used as an index into any other data structure that represents a color or color system

Further properties can be assigned to the image as a whole: width, height, depth, version, etc.

Color Models (Lehrstuhl für Informatik 4)

Why store values for red, green, and blue?
- The human brain perceives color through the additive composition of red, green, and blue light (RGB system).
- The relative intensities of the RGB values are transmitted to the monitor, where they are reproduced at each point in time.
- On a computer monitor, each pixel is an overlay of these three image tones with different intensities; in this way, any displayable color can be reproduced.

But there is another possible color model: CMYK.
- When printing an image, other color components are used: cyan, magenta, yellow, and key (black, which adds contrast). Together these can also reproduce the full range of colors.
- Thus, much image processing software and some image storing formats also support this model.

Color Models

Another possibility is to represent color information by means of the YUV system, where
- Y is the brightness (or luminance) information
- U and V are color difference signals (chrominance)
- Y, U, and V are functions of R, G, and B

Why? The human eye is more sensitive to brightness than to chrominance. Separating the brightness information from the color information makes it possible to code the more important luminance with more bits than the chrominance, which saves bits in the representation format.

Color Models

Usual scheme:
$$Y = 0.30\,R + 0.59\,G + 0.11\,B$$
(the weights reflect the color sensitivity of the human eye)
$$U = c_1 (B - Y), \qquad V = c_2 (R - Y)$$
where $c_1, c_2$ are constants reflecting perception aspects of the human eye and the human brain.

Possible coding (YUV signal):
$$Y = 0.30\,R + 0.59\,G + 0.11\,B$$
$$U = 0.493\,(B - Y) = -0.148\,R - 0.291\,G + 0.439\,B$$
$$V = 0.877\,(R - Y) = 0.614\,R - 0.517\,G - 0.096\,B$$

- This is a system of 3 equations for determining Y, U, V from R, G, B, or for recalculating R, G, B from Y, U, V.
- The resolution of Y is more important than the resolution of U and V, so spend more bits on Y than on U and V (Y : U : V = 4 : 2 : 2).
- The weighting factors in the calculation of the Y signal compensate for the unequal color sensitivity of the human eye.
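The three equations can be tried out directly. The following is a minimal sketch, assuming the weights 0.30 / 0.59 / 0.11 and the constants 0.493 and 0.877 from the slide; the function names `rgb_to_yuv` and `yuv_to_rgb` are illustrative only and not part of any standard API.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB triple to (Y, U, V) using the weights from the slide."""
    y = 0.30 * r + 0.59 * g + 0.11 * b
    u = 0.493 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def yuv_to_rgb(y, u, v):
    """Invert rgb_to_yuv by solving the three equations for R, G, B."""
    b = y + u / 0.493
    r = y + v / 0.877
    g = (y - 0.30 * r - 0.11 * b) / 0.59
    return r, g, b


if __name__ == "__main__":
    # A grey pixel has (almost exactly) zero chrominance.
    print(rgb_to_yuv(100, 100, 100))
    # Converting there and back returns the original triple up to rounding.
    print(yuv_to_rgb(*rgb_to_yuv(10, 200, 30)))
```

For a grey pixel (R = G = B) both color difference signals vanish, which is why the chrominance channels of many natural images carry comparatively little information.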

Image Formats

Many different image formats are in use today, e.g.:
- GIF (Graphics Interchange Format): compressed with basic lossless compression techniques to a fraction of the original size. Supports a palette of up to 256 colors chosen from the 24-bit RGB color space.
- BMP (Bitmap): device-independent representation of an image; uses the RGB color model, without compression. Color depth up to 24 bits, with the additional option of specifying a color table.
- TIFF (Tagged Image File Format): supports grey levels, RGB, and the CMYK color model, as well as many different compression methods. Additionally contains a descriptive part listing the properties a display should provide to show the image.
- PostScript: images are described without reference to device properties such as resolution. A nice feature for printers, but hard to include in documents where the image size must be known.
- JPEG (Joint Photographic Experts Group): many possible compression modes, mostly lossy.

Why Compression?

High-resolution image: e.g. 1024 x 768 pixels, 24-bit color depth: 1024 · 768 · 24 = 18,874,368 bits.

- Image formats like GIF: lossless compression (entropy encoding) reduces the data amount while keeping the image quality.
- JPEG: lossy compression removes some image details to achieve a higher compression rate by suppressing higher frequencies. It is combined with lossless techniques. The result is a trade-off between file size and quality.

JPEG is a joint standard of ISO and ITU-T.
- In June 1987, an adaptive transformation coding technique based on the DCT was adopted for JPEG.
- In 1992, JPEG became an ISO international standard.

JPEG

Implementation:
- Independent of image size
- Applicable to any image and pixel aspect ratio

Color representation:
- JPEG applies to color and grey-scale still images

Image content:
- Of any complexity, with any statistical characteristics

Properties of JPEG:
- State of the art regarding compression factor and image quality
- Runs on as many available standard processors as possible
- Compression mechanisms are available as software-only packages or together with specific hardware support; specialized hardware should speed up image decompression
- The encoded data stream has a fixed interchange format
- Fast coding is also used for video sequences: Motion JPEG

How Could We Compress?

Entropy encoding:
- The data stream is considered a simple digital sequence without semantics
- Lossless coding: the decompression process regenerates the data completely
- Used regardless of the medium's specific characteristics
- Examples: run-length encoding, Huffman encoding, arithmetic encoding

Source encoding:
- The semantics of the data are taken into account
- Lossy coding (encoded data are not identical with the original data)
- The degree of compression depends on the data contents
- Example: Discrete Cosine Transformation (DCT), which transforms the spatial domain into the two-dimensional frequency domain

Hybrid encoding:
- Used by most multimedia systems
- Combination of entropy and source encoding
- Examples: JPEG, MPEG, H.261

Compression Steps in JPEG

[Figure: Uncompressed image → Image preparation (pixel, block, MCU) → Image processing (DCT, predictor) → Quantization (approximation of real numbers by rational numbers) → Entropy encoding (run-length, Huffman, arithmetic) → Compressed image]

MCU: Minimum Coded Unit. DCT: Discrete Cosine Transform.

Compression Steps in JPEG

Image preparation:
- Analog-to-digital conversion
- Division of the image into blocks of N x N pixels
- Suitable structuring and ordering of the image information

Image processing (source encoding):
- Transformation from the spatial (time) domain to the frequency domain using the DCT
- In principle no compression in itself, but computation of new coefficients as input for the compression process

Quantization:
- Mapping of real numbers to rational numbers (approximation)
- A certain loss of precision is in general unavoidable

Entropy encoding:
- Lossless compression of a sequential digital data stream

The Principle

[Figure, encoder: Original → Transformation → Quantization (with quantization table; gets rid of invisible details) → Encoding (Huffman, run-length encoding) → JPEG picture]
[Figure, decoder (the opposite direction): JPEG picture → Decoder → Dequantization → Retransformation → Original (the discarded details cannot be reconstructed)]

- Without quantization, the encoding gain would be very poor (or nonexistent).
- Transformation and retransformation must be inverse to each other.
- Task of the transformation: produce a picture representation which can be encoded with a high reduction gain.

Variants of Image Compression

JPEG is not a single format; one of several modes can be chosen:
- Lossy sequential DCT-based mode (baseline process): must be supported by every JPEG implementation; uses block, MCU, FDCT, run-length, and Huffman
- Expanded lossy DCT-based mode: enhancement of the baseline process by adding progressive encoding
- Lossless mode: low compression ratio, but perfect reconstruction of the original image; no DCT, but differential encoding by prediction
- Hierarchical mode: accommodates images of different resolutions; selects its algorithms from the three other modes

First Step: Image Preparation

General image model:
- Independence from image parameters like size and pixel ratio
- Description of most of the well-known picture representations
- The source picture consists of 1 to 255 components (planes) $C_i$
- Components may be assigned to RGB or YUV values; for example, $C_1$ may be assigned to the red color information
- Each component $C_i$ can have a different number of superpixels $X_i$, $Y_i$ (a superpixel is a rectangle of pixels which all have the same value)

[Figure: components $C_1, C_2, C_3, \dots, C_N$; each component $C_i$ is a grid of $X_i$ by $Y_i$ superpixels]

Picture Preparation - Components

The resolution of the components may be different.

[Figure: component planes with identical and with differing resolutions $X_i$, $Y_i$]

- A grey-scale image consists (in most cases) of a single component
- An RGB color representation has three components with equal resolution
- YUV color image processing uses $Y_1 = 4 Y_2 = 4 Y_3$ and $X_1 = 4 X_2 = 4 X_3$

Image Preparation - Dimensions

The dimensions of a compressed image are defined by X (the maximum of all $X_i$), Y (the maximum of all $Y_i$), and $H_i$ and $V_i$ (the relative horizontal and vertical sampling ratios of each component i), with
$$H_i = \frac{X_i}{\min_j X_j} \qquad \text{and} \qquad V_i = \frac{Y_i}{\min_j Y_j}$$

$H_i$ and $V_i$ must be integers in the range of 1 to 4. This restriction is needed for the interleaving of components.

Example: Y = 4 pixels, X = 6 pixels
- $C_1$: $X_1 = 6$, $Y_1 = 4$, so $H_1 = 2$, $V_1 = 2$
- $C_2$: $X_2 = 6$, $Y_2 = 2$, so $H_2 = 2$, $V_2 = 1$
- $C_3$: $X_3 = 3$, $Y_3 = 2$, so $H_3 = 1$, $V_3 = 1$
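The sampling ratios follow mechanically from the component sizes. Below is a small sketch, assuming the definition $H_i = X_i / \min_j X_j$ and $V_i = Y_i / \min_j Y_j$ given on the slide; the function name `sampling_factors` and the component sizes in the usage line are illustrative assumptions.

```python
def sampling_factors(sizes):
    """Compute (H_i, V_i) for each component.

    sizes: list of (X_i, Y_i) tuples, one per component.
    Raises ValueError if a ratio is not an integer in the range 1..4.
    """
    min_x = min(x for x, _ in sizes)
    min_y = min(y for _, y in sizes)
    factors = []
    for x, y in sizes:
        if x % min_x or y % min_y:
            raise ValueError("sampling ratios must be integers")
        h, v = x // min_x, y // min_y
        if not (1 <= h <= 4 and 1 <= v <= 4):
            raise ValueError("H_i and V_i must lie in the range 1..4")
        factors.append((h, v))
    return factors


if __name__ == "__main__":
    # Three components of sizes 6x4, 6x2 and 3x2 (X_i, Y_i).
    print(sampling_factors([(6, 4), (6, 2), (3, 2)]))
```

The two checks mirror the restriction stated above: ratios outside 1 to 4, or non-integer ratios, cannot be interleaved.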

Image Preparation - Data Ordering

An image is divided into several components which can be processed one by one. But how is a component prepared for processing?
- Observation for most parts of an image: the values within a rectangle of N x N pixels do not differ much
- For further processing: divide each component of the image into blocks of N x N pixels

Thus, the image is divided into data units (blocks):
- Lossless mode uses one pixel as one data unit
- Lossy mode uses blocks of 8 x 8 pixels (with 8 or 12 bits per pixel)

Image Preparation - Data Ordering

Non-interleaved data ordering:
- The easiest, but not the most convenient, sequence of data processing
- Data units are processed component by component
- Within one component, the processing order is left-to-right and top-to-bottom

With the non-interleaved technique, an RGB-encoded image is processed as follows:
- First the red component only
- Then the blue component, followed by the green component

For speed reasons, this is less suitable than data unit interleaving.

Interleaved Data Ordering

Often more suitable: interleave the data units.
- Interleaving means: do not process all blocks component by component, but mix data units from all components
- Interleaved data units of different components are combined into Minimum Coded Units (MCUs)
- If all components have the same resolution, an MCU consists of one data unit from each component
- If components have different resolutions:
  1. For each component, regions of data units are determined; the data units in one region are ordered left-to-right and top-to-bottom
  2. Each component consists of the same number of regions
  3. An MCU consists of one region from each component
- Up to 4 components can be encoded in interleaved mode (according to JPEG)
- Each MCU consists of at most ten data units

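Image Preparation - MCUs

MCU example: four components $C_1, C_2, C_3, C_4$

[Figure: grids of data units $a_{ij}$, $b_{ij}$, $c_{ij}$, $d_{ij}$ for the four components, with the regions belonging to each MCU marked]

- $C_1$: $H_1 = 2$, $V_1 = 2$
- $C_2$: $H_2 = 2$, $V_2 = 1$
- $C_3$: $H_3 = 1$, $V_3 = 2$
- $C_4$: $H_4 = 1$, $V_4 = 1$

MCUs (9 data units per MCU):
- MCU 1 = $a_{00}\, a_{01}\, a_{10}\, a_{11}\; b_{00}\, b_{01}\; c_{00}\, c_{10}\; d_{00}$
- MCU 2 = $a_{02}\, a_{03}\, a_{12}\, a_{13}\; b_{02}\, b_{03}\; c_{01}\, c_{11}\; d_{01}$
- MCU 3 = $a_{04}\, a_{05}\, a_{14}\, a_{15}\; b_{04}\, b_{05}\; c_{02}\, c_{12}\; d_{02}$
- MCU 4 = $a_{20}\, a_{21}\, a_{30}\, a_{31}\; b_{10}\, b_{11}\; c_{20}\, c_{30}\; d_{10}$

where $a_{ij}$ are the data units of $C_1$, $b_{ij}$ those of $C_2$, $c_{ij}$ those of $C_3$, and $d_{ij}$ those of $C_4$, and
$$H_i = \frac{x_i}{\min_j x_j}, \qquad V_i = \frac{y_i}{\min_j y_j}$$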
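The ordering rule behind the MCU example can be sketched in a few lines. This assumes that the region of component i inside one MCU is $H_i$ data units wide and $V_i$ data units high, and that MCUs are visited left-to-right and top-to-bottom; the function name `mcu_sequence` and the string labels are illustrative only.

```python
def mcu_sequence(components, mcu_rows, mcu_cols):
    """List the data units of every MCU in interleaved order.

    components: list of (label, H, V) tuples.
    Returns a list of MCUs; each MCU is a list of names such as "a01"
    (label, row index, column index; unambiguous only for indices < 10).
    """
    mcus = []
    for r in range(mcu_rows):
        for c in range(mcu_cols):
            units = []
            for label, h, v in components:
                # One region of this component: V rows by H columns,
                # ordered left-to-right and top-to-bottom.
                for dy in range(v):
                    for dx in range(h):
                        units.append(f"{label}{r * v + dy}{c * h + dx}")
            mcus.append(units)
    return mcus


if __name__ == "__main__":
    comps = [("a", 2, 2), ("b", 2, 1), ("c", 1, 2), ("d", 1, 1)]
    for number, mcu in enumerate(mcu_sequence(comps, 2, 3), start=1):
        print(f"MCU {number} =", " ".join(mcu))
```

With three MCUs per row, the fourth MCU starts the second row of regions, which is why its first data unit is a20 rather than a06.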

Compression Steps in JPEG

[Figure: Uncompressed image → Image preparation (pixel, block, MCU) → Image processing (DCT) → Quantization (approximation of real numbers by rational numbers) → Entropy encoding (run-length, Huffman, arithmetic) → Compressed image]

- Result of image preparation: a sequence of 8 x 8 blocks, whose order is defined by the MCUs
- The samples are encoded with 8 bits/pixel
- Next step: image processing by source encoding

Source Encoding - Transformation

Encoding by transformation: the data are transformed into another mathematical domain which is more suitable for compression. The inverse transformation must exist and must be easy to calculate.

Most widely known example: the Fourier transformation (computed efficiently by the Fast Fourier Transformation, FFT)
$$F_{uv} = \sum_{x=0}^{m-1} \sum_{y=0}^{n-1} f_{xy}\; e^{\frac{2\pi i}{m} u x}\; e^{\frac{2\pi i}{n} v y}$$
The parameters m and n indicate the granularity.

Most effective transformation for image compression: the Discrete Cosine Transformation (DCT)
$$F_{uv} = \delta_{nm} \sum_{x=0}^{m-1} \sum_{y=0}^{n-1} f_{xy}\, \cos\frac{\pi u (2x+1)}{2m}\, \cos\frac{\pi v (2y+1)}{2n}$$

Discrete Cosine Transformation

Let $f_{xy}$ be the pixel (x, y) in the original picture ($0 \le x \le N-1$; $0 \le y \le N-1$).
$$F_{uv} := \gamma_N\, c_u c_v \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f_{xy}\, \cos\frac{(2x+1)\,u\pi}{2N}\, \cos\frac{(2y+1)\,v\pi}{2N}, \qquad u, v \in \{0, \dots, N-1\}$$
$$c_u = \begin{cases} \frac{1}{\sqrt{2}} & u = 0 \\ 1 & u > 0 \end{cases}$$

- $f_{xy}$: space domain (i.e. "geometric")
- $F_{uv}$: frequency domain (indicates how fast the information changes inside the rectangle)
- $F_{00}$ is the lowest frequency in both directions, i.e. a measure of the average pixel value
- $F_{uv}$ with small total frequency (i.e. u+v small) are in general larger than $F_{uv}$ with large u+v

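Retransformation: Inverse Cosine Transformation

$$f_{xy} = \delta_N \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} c_u c_v\, F_{uv}\, \cos\frac{(2x+1)\,u\pi}{2N}\, \cos\frac{(2y+1)\,v\pi}{2N}$$

Simplest example (just for demonstration): let $f_{xy} = f$ = constant. Then
$$F_{00} = \gamma_N\, c_0 c_0 \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f \cos(0)\cos(0) = \gamma_N\, \frac{N^2}{2}\, f, \qquad \text{all other } F_{uv} = 0$$
$$f_{00} = \delta_N \sum_{u} \sum_{v} c_u c_v\, F_{uv} \cos(\dots)\cos(\dots) = \delta_N\, c_0 c_0\, F_{00} = \delta_N\, \gamma_N\, \frac{N^2}{4}\, f \overset{!}{=} f$$
Hence $\delta_N \gamma_N = 4/N^2$; if $\delta_N = \gamma_N$, then $\gamma_N = 2/N$.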

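Example

N = 8 (standard): $\gamma_8 = \frac{1}{4}$, so $F_{uv} = \frac{1}{4}\, c_u c_v \sum \sum \dots$

N = 2: $\gamma_2 = 1$, so
$$F_{uv} = c_u c_v \sum_{x=0}^{1} \sum_{y=0}^{1} f_{xy}\, \cos\frac{(2x+1)\,u\pi}{4}\, \cos\frac{(2y+1)\,v\pi}{4}$$
$$= c_u c_v \left[ f_{00} \cos\frac{u\pi}{4}\cos\frac{v\pi}{4} + f_{01} \cos\frac{u\pi}{4}\cos\frac{3\pi v}{4} + f_{10} \cos\frac{3\pi u}{4}\cos\frac{\pi v}{4} + f_{11} \cos\frac{3\pi u}{4}\cos\frac{3\pi v}{4} \right]$$
$$F_{00} = \tfrac{1}{2}\left[ f_{00} + f_{01} + f_{10} + f_{11} \right], \quad \text{i.e. } \approx 2f \text{ if } f_{xy} \approx f$$

Transformed values can be much smaller than the original values:
$$F_{11} = f_{00}\cos\frac{\pi}{4}\cos\frac{\pi}{4} + f_{01}\cos\frac{\pi}{4}\cos\frac{3\pi}{4} + f_{10}\cos\frac{3\pi}{4}\cos\frac{\pi}{4} + f_{11}\cos\frac{3\pi}{4}\cos\frac{3\pi}{4} = \tfrac{1}{2}\left( f_{00} - f_{01} - f_{10} + f_{11} \right)$$
The sum mixes positive and negative terms, i.e. if $f_{xy} \approx f$, then $F_{11} \approx 0$.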

Baseline Process - Image Processing

How can the DCT be useful for JPEG? The $F_{uv}$ for larger values of u and v are often very small.

First step of image processing:
- Samples are encoded with 8 bits/pixel; each pixel is an integer in the range [0, 255]
- Pixel values are shifted to the range [-128, 127] (two's-complement representation)
- Data units of 8 x 8 pixel values are defined by $f_{xy} \in [-128, 127]$, where x, y are in the range [0, 7]
- Each value is transformed using the Forward DCT (FDCT):
$$F_{uv} = \frac{1}{4}\, c_u c_v \sum_{x=0}^{7} \sum_{y=0}^{7} f_{xy}\, \cos\frac{(2x+1)\,u\pi}{16}\, \cos\frac{(2y+1)\,v\pi}{16}$$
$$\text{where } c_u, c_v = \begin{cases} \frac{1}{\sqrt{2}} & \text{for } u, v = 0 \\ 1 & \text{otherwise} \end{cases} \quad \text{and } u, v \in [0, 7]$$
- The cosine expressions are independent of $f_{xy}$, so fast calculation is possible
- Result: from 64 coefficients $f_{xy}$ we get 64 coefficients $F_{uv}$ in the frequency domain
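The FDCT formula and its inverse can be checked numerically. The sketch below is a direct, unoptimized implementation for N = 8 in pure Python (four nested loops rather than a fast algorithm); the names `fdct`, `idct`, and `c` are illustrative, and the sample block is made up for the demonstration.

```python
import math

N = 8


def c(k):
    """Normalisation factor: 1/sqrt(2) for index 0, 1 otherwise."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def fdct(block):
    """Forward DCT of an 8x8 block f[x][y] -> F[u][v]."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out


def idct(coeffs):
    """Inverse DCT of an 8x8 coefficient block F[u][v] -> f[x][y]."""
    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    s += (c(u) * c(v) * coeffs[u][v]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[x][y] = 0.25 * s
    return out


if __name__ == "__main__":
    flat = [[50] * N for _ in range(N)]  # constant block, already level-shifted
    print(round(fdct(flat)[0][0], 6))    # only the DC coefficient is non-zero
```

For a constant block, all of the signal ends up in the DC coefficient (8 times the pixel value for N = 8), in line with the constant-image demonstration for general N.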

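Meaning of Coefficients

[Figure: transformation of an 8 x 8 block to frequencies. The coefficients $F_{00}, F_{01}, F_{10}, \dots$ are arranged from low/low frequency (top left) to high/high frequency (bottom right), with low/high and high/low combinations in the other two corners.]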

Baseline Process - Image Processing

Coefficient $F_{00}$: DC-coefficient
- Corresponds to the lowest frequency in both dimensions
- Determines the fundamental color of the data unit of 64 pixels
- Normally the values of $F_{00}$ are very similar in neighboring blocks

Other coefficients ($F_{uv}$ for u+v > 0): AC-coefficients
- Non-zero frequency in one or both dimensions

Reconstruction of the image: Inverse DCT (IDCT)
- If FDCT and IDCT could be calculated with full precision, the DCT would be lossless
- In practice, precision is restricted (real numbers!), thus the DCT is lossy; different implementations of a JPEG decoder may produce different images

Reason for the transformation: experience shows that many AC-coefficients have a value of almost zero, i.e. they are zero after quantization, so entropy encoding may lead to significant data reduction.

Compression Steps in JPEG

[Figure: Uncompressed image → Image preparation (pixel, block, MCU) → Image processing (predictor, DCT) → Quantization (approximation of real numbers by rational numbers) → Entropy encoding (run-length, Huffman, arithmetic) → Compressed image]

- Result of image processing: 8 x 8 blocks of DC/AC coefficients
- Until now, no compression has been done; this task is enabled by quantization

Quantization

Observation: the values $F_{uv}$ become smaller as u and v grow from 0 towards N-1.

How to enforce that even more values are zero? Answer: by quantization. Divide $F_{uv}$ by the quantum $Q_{uv}$ and take the nearest integer as the result:
$$F^Q_{uv} = \left[ F_{uv} / Q_{uv} \right]$$
Dequantization:
$$F^*_{uv} = F^Q_{uv} \cdot Q_{uv} \quad \text{(only an approximation of } F_{uv}\text{; most values are zero)}$$

Example: N = 8; quantization step = 2, $Q_{uv} = 2(u+v) + 3$
$$Q = \begin{pmatrix} 3 & 5 & 7 & 9 & \cdots & 17 \\ 5 & 7 & 9 & & & \vdots \\ 7 & 9 & & & & \\ 9 & & & & & \\ \vdots & & & & & \\ 17 & \cdots & & & & 31 \end{pmatrix}$$

Baseline Process - Quantization

Quantization process:
- Divide the DCT-coefficient value $F_{uv}$ by an integer number $Q_{uv}$ and round the result to the nearest integer
- Quantization of all DCT-coefficients results in a lossy transformation: some image details given by higher frequencies are cut off
- A JPEG application provides a table with 64 entries, each used for the quantization of one DCT-coefficient, so each coefficient can be adjusted separately
- A high compression factor is achievable at the expense of image quality: large quantization numbers give high data reduction, but the information loss increases
- No default values for quantization tables are specified in JPEG
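Quantization and dequantization are simple element-wise operations. The following sketch assumes a table of the form $Q_{uv} = \text{step} \cdot (u+v) + 3$ with step = 2; the helper names and the two sample coefficient values are hypothetical and chosen only to show the rounding effect.

```python
import math


def quant_table(n=8, step=2, base=3):
    """Build the example table Q[u][v] = step * (u + v) + base."""
    return [[step * (u + v) + base for v in range(n)] for u in range(n)]


def round_half_away(x):
    """Round to the nearest integer, ties away from zero."""
    sign = 1 if x >= 0 else -1
    return sign * int(math.floor(abs(x) + 0.5))


def quantize(coeffs, table):
    """F^Q[u][v] = nearest integer to F[u][v] / Q[u][v]."""
    n = len(coeffs)
    return [[round_half_away(coeffs[u][v] / table[u][v]) for v in range(n)]
            for u in range(n)]


def dequantize(quantized, table):
    """F*[u][v] = F^Q[u][v] * Q[u][v] (an approximation of F[u][v])."""
    n = len(quantized)
    return [[quantized[u][v] * table[u][v] for v in range(n)] for u in range(n)]


if __name__ == "__main__":
    Q = quant_table()
    F = [[0.0] * 8 for _ in range(8)]
    F[0][0] = 100.0   # hypothetical DC coefficient
    F[7][7] = 12.0    # hypothetical small high-frequency coefficient
    FQ = quantize(F, Q)
    print(FQ[0][0], FQ[7][7])
    print(dequantize(FQ, Q)[0][0])
```

The small high-frequency value is mapped to zero because its quantum is large, while the DC value survives with a small rounding error; this is the loss that the entropy coder later exploits.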

Example

[Figure: 8 x 8 matrix of input values from an exemplary grey-scale image]

First: subtract 128 from each element.

[Figure: 8 x 8 matrix of level-shifted values]

Then: perform the FDCT.

Example

[Figure: 8 x 8 matrix of FDCT output values $F_{uv}$; for space reasons only the integer part of each value is shown. The top-left entry is the DC coefficient.]

[Figure: quantization matrix for the chosen quality level, $Q_{uv} = 2(u+v) + 3$, i.e. first row 3, 5, 7, ..., 17 and last row 17, 19, 21, ..., 31]

Example

Effects of quantization:

[Figure: quantized matrix $F^Q_{uv}$ next to the reconstruction $F^*_{uv}$ after dequantization. Individual entries are annotated with the correct value they replace, as an indication of the quality loss.]

Example

[Figure: 8 x 8 matrix of the reconstructed image after performing the inverse DCT]

[Figure: 8 x 8 matrix of the error in reconstruction]

Problem of Quantization

- Cutting off the higher frequencies leads to partly wrong color information
- The higher the quantization coefficients, the more distortion there is in an 8 x 8 block
- Result: the edges of the blocks become visible

Compression Steps in JPEG

[Figure: Uncompressed image → Image preparation (pixel, block, MCU) → Image processing (predictor, DCT) → Quantization (approximation of real numbers by rational numbers) → Entropy encoding (run-length, Huffman, arithmetic) → Compressed image]

- Result of quantization: 8 x 8 blocks of DC/AC coefficients with lots of zeros
- How can the data be processed and encoded efficiently?

Baseline Process - Entropy Encoding

- Initial step: map the 8 x 8 block of transformed values $F^Q_{uv}$ to a 64-element vector which can be further processed by entropy encoding
- DC-coefficients determine the basic color of the data units; the variation between the DC-coefficients of successive data units is typically small, so the DC-coefficient is encoded as the difference between the current coefficient and the previous one
- AC-coefficients: the processing order uses the zig-zag sequence

[Figure: zig-zag path through the 8 x 8 block, starting at the DC-coefficient and moving through the AC-coefficients towards higher frequencies]

Coefficients with lower frequencies are encoded first, followed by higher frequencies. Result: a sequence of similar data bytes, which allows efficient entropy encoding.

Example

Zig-zag ordering:

[Figure: quantized 8 x 8 block with the zig-zag path, and the resulting coefficient sequence]

- DC-coefficient: code the coefficient of one block as the difference to that of the previous block
- AC-coefficients: consider each block separately; order the data using the zig-zag sequence to achieve long sequences of zero values

Entropy encoding:
- Run-length encoding of the zero values of the quantized AC-coefficients
- Huffman encoding of the DC- and AC-coefficients
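The zig-zag sequence can be generated by walking the anti-diagonals of the block and alternating direction. The sketch below assumes (row, column) indexing with the DC coefficient at (0, 0); the function names are illustrative.

```python
def zigzag_indices(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):              # s = row + col selects one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run from bottom-left to top-right, odd ones the other way.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def zigzag(block):
    """Flatten a square block into a vector following the zig-zag order."""
    return [block[r][c] for r, c in zigzag_indices(len(block))]


if __name__ == "__main__":
    print(zigzag_indices()[:10])
    # The numbers in this small block are placed so that zig-zag reads 1..9.
    print(zigzag([[1, 2, 6],
                  [3, 5, 7],
                  [4, 8, 9]]))
```

Because low frequencies sit near the top-left corner, this ordering pushes the (mostly zero) high-frequency coefficients to the end of the vector, where they form long zero runs.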

Run-length Encoding

Run-length encoding is a content-dependent coding technique:
- Sequences of the same byte are replaced by the number of their occurrences
- A special flag byte is used which does not occur in the byte stream itself

Coding procedure:
- If a byte occurs at least four consecutive times, the number of occurrences minus 4 (offset = 4) is counted
- The compressed data contain this byte, followed by the special flag and the number of occurrences minus 4

As a consequence, 4 to 259 bytes can be represented with three bytes (with the corresponding compression effect).

Example with ! as special flag:
- Uncompressed sequence: ABCCCCCCCCDEFGGG
- Run-length coded sequence: ABC!4DEFGGG

The offset is 4 because for shorter runs there would be no reduction effect; e.g. with offset 3: D!0 versus DDD (both strings have the same length).
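The coding procedure can be sketched as follows. This is a minimal illustration, assuming the flag byte `!` never occurs in the input and that the count is stored as one raw byte (so the slide's `ABC!4DEFGGG` appears here with the count byte `\x04` rather than the character `4`); the names `rle_encode` and `rle_decode` are illustrative.

```python
FLAG = ord("!")   # assumed not to occur in the input data
OFFSET = 4        # shortest run that is worth replacing


def rle_encode(data: bytes) -> bytes:
    """Replace runs of 4..259 equal bytes by (byte, FLAG, run length - 4)."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        run = 1
        while i + run < len(data) and data[i + run] == byte and run < OFFSET + 255:
            run += 1
        if run >= OFFSET:
            out += bytes([byte, FLAG, run - OFFSET])
        else:
            out += bytes([byte]) * run
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Inverse of rle_encode."""
    out = bytearray()
    i = 0
    while i < len(data):
        if i + 2 < len(data) and data[i + 1] == FLAG:
            out += bytes([data[i]]) * (data[i + 2] + OFFSET)
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


if __name__ == "__main__":
    packed = rle_encode(b"ABCCCCCCCCDEFGGG")
    print(packed)
    print(rle_decode(packed))
```

The three G's at the end stay uncompressed because a run shorter than the offset would not become shorter when replaced by a three-byte code.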

Run-length Encoding

JPEG proceeds similarly:
- The zero value is the only one appearing in longer sequences, so a more efficient coding compresses only zero sequences: nonzero coefficients are coded together with their run-length, i.e. the number of zeros preceding the nonzero value
- Run-length ∈ {0, ..., 15}, i.e. 4 bits for representing the length of zero sequences
- Coded sequence: run-length, size, amplitude, with
  - run-length: number of preceding zero coefficients
  - size: number of bits used for representing the following coefficient
  - amplitude: value of that following coefficient, using size bits
- Adapting the size of a coefficient's representation to its value achieves further compression, because most coefficients for higher frequencies have very small values
- If (run-length, size) = (15, 0), then there are more than 15 zeros after each other
- (0, 0) = EoB symbol (End of Block), which indicates the termination of the current rectangle (EoB is used very frequently)

Example

Size i    Amplitude
1         -1, 1
2         -3, -2, 2, 3
3         -7, ..., -4, 4, ..., 7
4         -15, ..., -8, 8, ..., 15
i         -2^i + 1, ..., -2^(i-1), 2^(i-1), ..., 2^i - 1
10        -1023, ..., -512, 512, ..., 1023

A coefficient is represented by its size and an amplitude within that size class; a value of size 4, for instance, has its amplitude coded in 4 bits.

[Figure: table of the 4-bit amplitude codes in two's-complement representation (other representations are possible)]

The sequence consisting of 35 zeros followed by the value 121 is encoded by
(15, 0), (15, 0), (5, 7), 57
i.e. 35 zeros in all, followed by a value represented using 7 bits. With 7 bits, 121 is 64 + 57.

In a second step, the string may be reduced further by Huffman encoding principles.
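The (run-length, size, amplitude) scheme can be sketched as below. Two assumptions are made and should be read as such: the sketch follows the baseline JPEG convention in which the symbol (15, 0) stands for a run of 16 zeros (so the counts may differ from the slide's own 35-zero example), and the amplitude is kept as the plain signed value instead of being packed into `size` bits. The names `size_category` and `encode_ac` are illustrative.

```python
def size_category(value):
    """Number of bits needed for |value|; matches the size table (0 for value 0)."""
    return abs(value).bit_length()


def encode_ac(ac_coeffs):
    """Turn zig-zag ordered AC coefficients into run-length symbols.

    Returns a list containing (run, size, value) triples, (15, 0) for a
    run of 16 zeros, and a final (0, 0) End-of-Block marker if the block
    ends with zeros.
    """
    symbols = []
    last = max((i for i, c in enumerate(ac_coeffs) if c != 0), default=-1)
    run = 0
    for coeff in ac_coeffs[:last + 1]:
        if coeff == 0:
            run += 1
            if run == 16:
                symbols.append((15, 0))   # zero-run-length symbol
                run = 0
        else:
            symbols.append((run, size_category(coeff), coeff))
            run = 0
    if last < len(ac_coeffs) - 1:
        symbols.append((0, 0))            # End of Block
    return symbols


if __name__ == "__main__":
    print(encode_ac([5, 0, 0, -3] + [0] * 59))
    print(encode_ac([0] * 35 + [121] + [0] * 27))
```

Everything after the last nonzero coefficient collapses into the single End-of-Block symbol, which is why EoB is so frequent in practice.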

Huffman Encoding

The Huffman code is an optimal code: it uses the minimum number of bits for a string of data with given probabilities per character.

Statistical encoding method:
- For each character, a probability of occurrence is known to the encoder and the decoder
- Frequently occurring characters are coded with shorter strings than rarely occurring characters
- Successive characters are coded independently of each other
- The resulting code is prefix-free, so unique decoding is guaranteed

A binary tree is constructed to determine the Huffman codewords of the characters:
- Leaves represent the characters that are to be encoded
- Nodes contain the occurrence probability of the characters belonging to the subtree
- Edges of the tree are labeled 0 and 1

Huffman Encoding

Algorithm for computing the Huffman code:
1.) List all characters together with their frequencies
2.) Select the two list elements with the smallest frequencies and remove them from the list
3.) Make them the leaves of a tree whose root carries the sum of the two probabilities; place the tree into the list
4.) Repeat steps 2 and 3 until the list contains only one element
5.) Mark all edges: the edge from a father to its left son with one bit value, the edge to its right son with the other (0 and 1)

The code words result from the paths from the root to the leaves.

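Huffman Encoding - Example

Suppose that the characters A, B, C, D, and E occur with the probabilities
p(A) = 0.27, p(B) = 0.36, p(C) = 0.16, p(D) = 0.14, p(E) = 0.07

Resulting tree:
- p(ADCEB) = 1.00
  - p(CED) = 0.37
    - p(C) = 0.16
    - p(ED) = 0.21
      - p(E) = 0.07
      - p(D) = 0.14
  - p(AB) = 0.63
    - p(A) = 0.27
    - p(B) = 0.36

Resulting code: A, B, and C each receive a code word w(x) of 2 bits; D and E each receive a code word of 3 bits.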
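The algorithm can be sketched with a priority queue. The following is a minimal illustration, assuming the five probabilities of the example; the function name `huffman_code` is illustrative, and the assignment of 0 to the first-removed subtree and 1 to the second is an arbitrary choice, so the exact bit patterns may differ from the slide while the code word lengths do not.

```python
import heapq
from itertools import count


def huffman_code(probabilities):
    """Build a Huffman code.

    probabilities: dict mapping symbol -> probability (or frequency).
    Returns a dict mapping symbol -> code word (string of '0'/'1').
    """
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {sym: "0" for sym in probabilities}
    while len(heap) > 1:
        # Steps 2 and 3: remove the two least probable entries and merge them.
        p0, _, left = heapq.heappop(heap)
        p1, _, right = heapq.heappop(heap)
        merged = {sym: "0" + word for sym, word in left.items()}
        merged.update({sym: "1" + word for sym, word in right.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


if __name__ == "__main__":
    code = huffman_code({"A": 0.27, "B": 0.36, "C": 0.16, "D": 0.14, "E": 0.07})
    for symbol in sorted(code):
        print(symbol, code[symbol])
```

The merge order reproduces the tree of the example: E and D are joined first, then C with ED, then A with B, and finally the two remaining subtrees.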

Huffman Encoding in JPEG

Coding of run-length (∈ {0, ..., 15}) and size (∈ {0, ..., 10}):
- (i, j): i preceding zeros (0 ≤ i ≤ 15) in front of a nonzero value coded with j bits
- The table has 16 · 10 + 2 = 162 entries with significantly different occurrence probabilities
- EoB is relatively frequent
- ZRL: at least 16 successive zeros, i.e. ZRL = (15, 0)
- Some values such as (15, 10) are extremely rare: 15 preceding zeros in front of a very large value is practically impossible. The same holds for most of the combinations in the table.

Thus, Huffman coding of the table entries leads to significant further compression.

[Figure: table with rows run-length 0, ..., 15 and columns size 0, ..., 10. Entry (0, 0) is EoB, entry (15, 0) is ZRL, the other entries (i, 0) are impossible, and the remaining entries are the symbols (i, j) up to (15, 10).]

Huffman Encoding in JPEG

- Different Huffman tables for (run-length, size) are used for different 8 x 8 blocks, based on their contents; thus the coding begins with an HTN (Huffman table number)
- The coding of amplitudes may also change from block to block; amplitude codes are stored in the preceding (run-length, size) coding table
- An 8 x 8 block is thus coded as follows:
  [VLC, DC coefficient, sequence of (run-length, size, amplitude) for the AC coefficients]
- VLC = variable length code: contains the current HTN and the current VLI (Variable Length Integer), i.e. the coding method for the next amplitude

Alternative to Huffman: Arithmetic Coding

Characteristics:
- Achieves optimality (in terms of coding rate), as Huffman coding does
- Difference to Huffman: the entire data stream has an assigned probability, which is composed of the probabilities of the contained characters. A character is coded with consideration of all previous characters.
- The data are coded as an interval of real numbers between 0 and 1. Each value within the interval can be used as a code word.
- The minimum length of the code is determined by the assigned probability.
- Disadvantage: the data stream can be decoded only as a whole.

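Arithmetic Coding: Example

Code the data ACAB with $p_A = 0.5$, $p_B = 0.2$, $p_C = 0.3$.

Step 1: [0, 1) is split into A = [0, 0.5), B = [0.5, 0.7), C = [0.7, 1).
Step 2 (after A): $p_{AA} = 0.25$, $p_{AB} = 0.1$, $p_{AC} = 0.15$, so AA = [0, 0.25), AB = [0.25, 0.35), AC = [0.35, 0.5). The intervals of B and C are split in the same way into BA, BB, BC and CA, CB, CC.
Step 3 (after AC): $p_{ACA} = 0.075$, $p_{ACB} = 0.03$, $p_{ACC} = 0.045$, so ACA = [0.35, 0.425), ACB = [0.425, 0.455), ACC = [0.455, 0.5).
Step 4 (after ACA): $p_{ACAA} = 0.0375$, $p_{ACAB} = 0.015$, $p_{ACAC} = 0.0225$, so ACAA = [0.35, 0.3875), ACAB = [0.3875, 0.4025), ACAC = [0.4025, 0.425).

ACAB can be coded by any binary number from the interval [0.3875, 0.4025), with a length of $-\log_2(p_{ACAB}) = 6.06$ rounded up, i.e. 7 bits.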
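The interval narrowing of the example can be reproduced exactly with rational arithmetic. This is a sketch under the assumption that the symbols are laid out in the order A, B, C inside every interval; the function names are illustrative, and `code_word` simply picks the smallest suitable binary fraction, which is one of many valid code words.

```python
import math
from fractions import Fraction


def arithmetic_interval(message, probs):
    """Return the interval [low, high) assigned to message.

    probs: dict symbol -> Fraction; its insertion order defines the layout.
    """
    low, width = Fraction(0), Fraction(1)
    for symbol in message:
        cumulative = Fraction(0)
        for candidate, p in probs.items():
            if candidate == symbol:
                low += width * cumulative   # move to the symbol's sub-interval
                width *= p                  # and shrink to its size
                break
            cumulative += p
        else:
            raise KeyError(symbol)
    return low, low + width


def code_word(low, high):
    """Smallest binary fraction with ceil(-log2(width)) bits inside [low, high), if any."""
    bits = math.ceil(-math.log2(high - low))
    k = math.ceil(low * 2 ** bits)
    if Fraction(k, 2 ** bits) >= high:
        raise ValueError("no code word of this length lies in the interval")
    return format(k, f"0{bits}b")


if __name__ == "__main__":
    probs = {"A": Fraction(1, 2), "B": Fraction(1, 5), "C": Fraction(3, 10)}
    low, high = arithmetic_interval("ACAB", probs)
    print(float(low), float(high))
    print(code_word(low, high))
```

Using `Fraction` avoids the rounding drift that floating-point interval bounds would accumulate over long messages; practical coders instead work with scaled integers.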

Variants of Image Compression

JPEG is not a single format; one of several modes can be chosen:
- Lossy sequential DCT-based mode (baseline process): presented before, but not the only method
- Expanded lossy DCT-based mode: enhancement of the baseline process by adding progressive encoding
- Lossless mode: low compression ratio, but perfect reconstruction of the original image; no DCT, but differential encoding
- Hierarchical mode: accommodates images of different resolutions; selects its algorithms from the three other modes

Variants: Expanded Lossy DCT-based Mode

With sequential encoding, as in the baseline process, the whole image is coded and decoded in a single run. An alternative to sequential encoding is progressive encoding, done in the entropy encoding step. Two alternatives for progressive encoding are possible:
- Spectral selection: at first, the coefficients of low frequencies are passed to entropy encoding; the coefficients of higher frequencies are processed in successive runs
- Successive approximation: all coefficients are transferred in one run, but the most significant bits are encoded before the less significant bits

12 possible coding alternatives exist in the expanded mode:
- Using sequential encoding, spectral selection, or successive approximation (3 variants)
- Using Huffman or arithmetic encoding (2 variants)
- Using 8 or 12 bits for representing the samples (2 variants)

Most popular mode: sequential display mode with 8 bits/sample and Huffman encoding.

Expanded Lossy DCT-based Mode (Example)

Sequential encoding: the image is coded and decoded in a single run.
[Figure: Step 1, Step 2, Step 3 of sequential decoding]

Progressive encoding: the image is coded and decoded in refining steps.
[Figure: Step 1, Step 2, Step 3 of progressive decoding]

Variants: Lossless Mode

The lossless mode uses differential encoding (also known as prediction or relative encoding):
- Suited to sequences of characters whose values are different from zero, but which do not differ much
- Calculate only the difference with respect to the previous value (also used for DC-coefficients)

Differential encoding for still images:
- Avoid using DCT and quantization
- Instead: calculate the differences between nearby pixels or pixel groups
- Edges are represented by large values
- Areas with similar luminance and chrominance are represented by small values
- A homogeneous area is represented by a large number of zeros, so further compression with run-length encoding is possible, as for the DCT

Variants: Lossless Mode

- Uses data units of single pixels for image preparation
- Any precision between 2 and 16 bits/pixel can be used
- Image processing and quantization use a predictive technique instead of transformation encoding
- 8 predictors are specified for each pixel X by means of a combination of the already known adjacent samples A, B, and C

[Figure: Uncompressed data → Predictor → Entropy encoder → Compressed data; the pixel X with its known neighbours A, B, and C]

Predictor    Predicted value for X
0            no prediction
1            A
2            B
3            C
4            A + B - C
5            A + (B - C)/2
6            B + (A - C)/2
7            (A + B)/2

The chosen predictor should give the best approximation of X from the already known values A, B, C. The number of the chosen predictor and the difference between the prediction and the actual value are passed to entropy encoding (Huffman or arithmetic encoding).

Example:
- (4, 0): X is exactly given by A + B - C
- (7, 1): X is (A + B)/2 + 1
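The predictor table translates into a small lookup. The sketch below assumes the neighbour positions used in the JPEG standard (A to the left of X, B above X, C above-left) and integer samples with floor division for the halving; the function names and the sample values are illustrative.

```python
def predict(mode, a, b, c):
    """Predicted value of X for predictor number mode (0..7)."""
    predictors = {
        0: 0,                   # no prediction
        1: a,
        2: b,
        3: c,
        4: a + b - c,
        5: a + (b - c) // 2,
        6: b + (a - c) // 2,
        7: (a + b) // 2,
    }
    return predictors[mode]


def encode_sample(mode, x, a, b, c):
    """Return (predictor number, prediction error) for the actual value x."""
    return mode, x - predict(mode, a, b, c)


def decode_sample(mode, error, a, b, c):
    """Recover x from the predictor number and the prediction error."""
    return predict(mode, a, b, c) + error


if __name__ == "__main__":
    print(encode_sample(4, 110, 100, 120, 110))  # X equals A + B - C
    print(encode_sample(7, 111, 100, 120, 110))  # X is (A + B)/2 + 1
```

Because the decoder already knows A, B, and C when it reaches X, transmitting only the (usually small) prediction error is enough to reconstruct the pixel exactly.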

Variants: Hierarchical Mode

The hierarchical mode uses either the lossy DCT-based algorithms or the lossless compression technique.

The idea: encoding of an image at different resolutions.

Algorithm:
- The image is initially sampled at a low resolution
- Subsequently, the resolution is raised and the compressed image is subtracted from the previous result
- The process is repeated until the full resolution of the image is obtained in compressed form

Disadvantage: requires substantially more storage capacity.
Advantage: the compressed image is immediately available at different resolutions, so scaling becomes cheap.

JPEG 2000

Improvements over the original JPEG standard:
- 16-bit color depth per channel (up to about 281 trillion colors)
- Progressive mode is not only an option, but mandatory
- Definition of Regions of Interest: choose an image region which will be compressed less than the rest of the image. Thus, images can be coded individually, improving the subjective image quality.
- Integration of watermarks: invisibly embed additional information which can be recognized by certain programs; the watermarks are designed so that they cannot be removed from the image
- Resync markers: improve fault tolerance and error correction by setting markers; if a transmission error corrupts the data, not all of the following data are lost, as they would be in normal JPEG

Most important: better compression and a faster coding and decoding process by using wavelets instead of the DCT. Wavelets are transforming functions that describe how fast the image information changes. No pixel blocks are used in compression; infinite functions are applied to finite image regions. Thus wavelets can describe edges and hard changes better than the DCT.

Wavelets
Wavelets are mathematical tools that allow functions to be decomposed in a hierarchical way, i.e. a function is described by [overall shape, first detail, second detail, ...], e.g. by (most important features, refinements, less important topics, ...).
Thus, the most important aspects are available with a relatively small number of bits.
Applications of wavelets: image compression, image editing, animation, signal processing, ...

Haar Transforms
The simplest example of wavelet technology is the Haar transform. There are
- one-dimensional Haar wavelet transforms (which allow the compression of the representation of piecewise constant functions),
- two-dimensional Haar wavelet transforms (for image compression, ...),
- higher-dimensional Haar wavelet transforms.
Example (one-dimensional): a simple example to show how the number of bits needed for a representation can be reduced by transformation and compression.
Let a string of pixels be given as follows: 9 7 3 5 6 8 6 4
Then we do (recursively):
- calculate the average of successive pairs,
- calculate the distance from the average (the "detail").

1D Haar Transforms
In the first step of this procedure we get: [average_1; average_2; ...; detail_1; detail_2; ...]
Thereafter we get: [average_{1,2}; detail_{1,2}; ...], where average_{1,2} is the mean value of average_1 and average_2, and detail_{1,2} is the distance of average_1 from average_{1,2}.
This procedure may be continued. The detail values become less and less important.
From the transformed string we can fully reconstruct the original.
Very often the details are very small and may be suppressed. In such a case we cannot exactly reconstruct the original, but the errors are comparatively small.

Example

    Resolution   Averages           Details
    8            9 7 3 5 6 8 6 4
    4            8 4 7 5            +1 -1 -1 +1
    2            6 6                2 1
    1            6                  0

The last row holds the global average (6) and the detail coefficient of the coarsest resolution (0); the row for resolution 4 holds the detail coefficients of the finest resolution.
Thus the sequence [9, 7, 3, 5, 6, 8, 6, 4] has been transformed to [6; 0; 2, 1; 1, -1, -1, 1].

Example
Reconstruction from [6; 0; 2, 1; 1, -1, -1, 1]:

    6 = 6 + 0      6 = 6 - 0
    8 = 6 + 2      4 = 6 - 2
    7 = 6 + 1      5 = 6 - 1
    9 = 8 + 1      7 = 8 - 1
    3 = 4 + (-1)   5 = 4 - (-1)
    6 = 7 + (-1)   8 = 7 - (-1)
    6 = 5 + 1      4 = 5 - 1

If we suppressed the finest detail coefficients (i.e. 1, -1, -1, 1), we would reconstruct 8 8 4 4 7 7 5 5 instead of 9 7 3 5 6 8 6 4.
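The averaging and differencing example can be reproduced with a short Python sketch. The function names are illustrative, and the code assumes that the input length is a power of two.

```python
def haar_decompose(values):
    """Unnormalized 1D Haar transform.

    Returns [global average, coarsest detail, ..., finest details].
    """
    averages = list(values)
    details = []
    while len(averages) > 1:
        pairs = list(zip(averages[0::2], averages[1::2]))
        step_details = [(a - b) / 2 for a, b in pairs]
        averages = [(a + b) / 2 for a, b in pairs]
        details = step_details + details  # coarser details go in front
    return averages + details


def haar_reconstruct(coeffs):
    """Invert haar_decompose by repeatedly applying avg + detail, avg - detail."""
    averages = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(averages)]
        refined = []
        for avg, det in zip(averages, details):
            refined += [avg + det, avg - det]
        pos += len(averages)
        averages = refined
    return averages


coeffs = haar_decompose([9, 7, 3, 5, 6, 8, 6, 4])
lossy = haar_reconstruct(coeffs[:4] + [0, 0, 0, 0])  # finest details suppressed
```

Here `coeffs` holds the transformed sequence of the example, and `lossy` is the reconstruction 8 8 4 4 7 7 5 5 obtained without the finest details.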

Generalization
Generalization to piecewise constant functions (instead of strings):
Let $V^j$ be the vector space of all functions which are piecewise constant on the $2^j$ equal subintervals of $[0,1)$.
Every one-dimensional image with $2^j$ pixels can be considered as an element of $V^j$ (e.g. a two-pixel image has two constant parts over the intervals $[0, 0.5)$ and $[0.5, 1)$).
Obviously, $V^3$ is a refinement of $V^2$ etc.: $V^0 \subset V^1 \subset V^2 \subset V^3 \subset \ldots$
A basis for the vector space $V^j$ is a set of functions by which all functions of $V^j$ may be represented as linear combinations of the basis functions.
[Figure: an element of $V^3$. The piecewise constant sections might be an approximation of a function f(x) according to a given norm.]

Generalization
A simple basis of $V^j$ (there are many other alternatives for a basis) is:
$\phi^j_i(x) := 1$ if $\frac{i}{2^j} \le x < \frac{i+1}{2^j}$, and $\phi^j_i(x) := 0$ otherwise, for $i = 0, \ldots, 2^j - 1$.
Definition:
1. The inner product of two functions $f, g \in V^j$ is defined as $\langle f \mid g \rangle := \int_0^1 f(x)\, g(x)\, dx$.
2. The $L^2$ norm is defined by $\|u\| := \sqrt{\langle u \mid u \rangle} = \sqrt{\int_0^1 u(x)\, u(x)\, dx}$.
Very often we try to approximate a given function $f(x)$ by a function $\tilde f(x)$ according to the $L^2$ norm, i.e. we look for $\tilde f(x)$ such that $\|f(x) - \tilde f(x)\|$ is minimal over all admissible $\tilde f(x)$.

Generalization
In the following we will use another basis for $V^j$ than the very simple basis which consists only of the elementary components shown before. This will lead us to the concept of a Haar wavelet basis.
Construction procedure of a suitable basis for $V^{j+1}$ from a basis of $V^j$:
- Suppose that we already have a basis for $V^j$.
- Then define a new vector space $W^j$ as the orthogonal complement of $V^j$ in $V^{j+1}$, i.e. $W^j$ := vector space of functions in $V^{j+1}$ which are orthogonal to all functions in $V^j$ ("orthogonal" means that $\langle u \mid v \rangle = 0$ if $u \in V^j$, $v \in W^j$).
- $W^j$ is the "detail" of $V^{j+1}$ which cannot be expressed by $V^j$.
- The basis functions of $V^j$ together with the basis functions of $W^j$ form a basis of $V^{j+1}$.

Generalization
Definition: The elements of a basis of $W^j$ (i.e. linearly independent functions $\psi^j_i(x)$ which span $W^j$) are called wavelets.
Immediate consequences:
1. The $\psi^j_i(x)$ together with the basis of $V^j$ are a basis of $V^{j+1}$.
2. $\psi^j_i(x)$ is orthogonal to $\phi^j_k(x)$, i.e. $\int_0^1 \psi^j_i(x)\, \phi^j_k(x)\, dx = 0$.
The detail coefficients are the coefficients of the wavelet basis functions.

Generalization
Wavelets for $W^j$ (i.e. elements of $V^{j+1}$ which are orthogonal to the basis elements of $V^j$) are denoted as Haar wavelets.
Haar wavelets of $W^j$ (there would be other possibilities for basis functions), for $i = 0, \ldots, 2^j - 1$:
$\psi^j_i(x) := +1$ if $\frac{i}{2^j} \le x < \frac{i + 1/2}{2^j}$
$\psi^j_i(x) := -1$ if $\frac{i + 1/2}{2^j} \le x < \frac{i+1}{2^j}$
$\psi^j_i(x) := 0$ otherwise

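Example
[Figure: the four basis functions $\phi^2_0, \ldots, \phi^2_3$ of $V^2$ (boxes of height 1) next to the four Haar wavelets $\psi^2_0, \ldots, \psi^2_3$ for $W^2$ (each +1 on the first half and -1 on the second half of its interval).]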

Example
Using these functions, we can redo the string example 9 7 3 5 6 8 6 4 as follows, with S(x) = string written as a function of x:
$S(x) = 9\phi^3_0(x) + 7\phi^3_1(x) + 3\phi^3_2(x) + 5\phi^3_3(x) + 6\phi^3_4(x) + 8\phi^3_5(x) + 6\phi^3_6(x) + 4\phi^3_7(x)$
$= 8\phi^2_0(x) + 4\phi^2_1(x) + 7\phi^2_2(x) + 5\phi^2_3(x) + \psi^2_0(x) - \psi^2_1(x) - \psi^2_2(x) + \psi^2_3(x)$
$= 6\phi^1_0(x) + 6\phi^1_1(x) + 2\psi^1_0(x) + \psi^1_1(x) + \psi^2_0(x) - \psi^2_1(x) - \psi^2_2(x) + \psi^2_3(x)$
$= 6\phi^0_0(x) + 0\,\psi^0_0(x) + 2\psi^1_0(x) + \psi^1_1(x) + \psi^2_0(x) - \psi^2_1(x) - \psi^2_2(x) + \psi^2_3(x)$

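Example
Graphical representation:
$6 \times \phi^0_0 + 0 \times \psi^0_0 + 2 \times \psi^1_0 + 1 \times \psi^1_1 + 1 \times \psi^2_0 - 1 \times \psi^2_1 - 1 \times \psi^2_2 + 1 \times \psi^2_3$ = 9 7 3 5 6 8 6 4
[Figure: the eight weighted basis functions and their sum, the original string.]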

Example
Starting with the basis function $\phi^0_0$ for $V^0$, we can refine it by the $W^0$ detail coefficient, i.e. by $\psi^0_0$, to the basis $\phi^0_0, \psi^0_0$ for $V^1$.
This in turn can be refined by the two detail coefficients $\psi^1_0, \psi^1_1$ of $W^1$ to the basis $\phi^0_0, \psi^0_0, \psi^1_0, \psi^1_1$ of $V^2$.
With four more detail coefficients of $W^2$ we get a basis for $V^3$ consisting of the eight basis functions $\phi^0_0;\ \psi^0_0;\ \psi^1_0, \psi^1_1;\ \psi^2_0, \psi^2_1, \psi^2_2, \psi^2_3$.
With eight more detail coefficients of $W^3$ we get a basis for $V^4$, etc.
This procedure leads to a better approximation of a given function (or of a 1D image).

Approximation Of Continuous Functions
Approximation of a continuous function f(x) (dotted line) by averages and detail coefficients. We start with $V^0$. Then:
$V^0$-appr. + $W^0$-detail = $V^1$-appr.
$V^1$-appr. + $W^1$-detail = $V^2$-appr.
$V^2$-appr. + $W^2$-detail = $V^3$-appr.
$V^3$-appr. + $W^3$-detail = $V^4$-appr.
We can also work backwards, i.e. start with the $V^4$ approximation and go to the $V^3$ approximation by suppressing the $W^3$ detail coefficients, etc.
[Figure: the $V^4$, $V^3$, $V^2$, $V^1$ and $V^0$ approximations of f(x) (the $V^0$ approximation is the average of f(x)), each shown next to the $W^3$, $W^2$, $W^1$ and $W^0$ detail coefficients that separate it from the next finer one.]

Approximation Of Continuous Functions
All Haar basis functions $\phi^0_0, \psi^0_0, \psi^1_0, \psi^1_1, \ldots$ are orthogonal to each other (of course not to themselves), i.e. $\langle f \mid g \rangle = 0$ if $f \ne g$ are basis functions.
Orthogonality to all other basis functions is not necessarily valid for other systems of basis functions.
In addition to orthogonality we can provide orthonormality, i.e. $\langle f \mid f \rangle = 1$. The basis functions
$\phi^{j*}_i(x) := \sqrt{2^j}\, \phi^j_i(x)$ and $\psi^{j*}_i(x) := \sqrt{2^j}\, \psi^j_i(x)$
are orthonormal.
If the basis functions are orthonormalized, then the detail coefficients and the average coefficient with superscript j have to be multiplied by $2^{-j/2}$. In our example we then get:
$S(x) = 6\phi^0_0 + 0\,\psi^0_0 + 2\psi^1_0 + \psi^1_1 + \psi^2_0 - \psi^2_1 - \psi^2_2 + \psi^2_3$
$= 6\phi^{0*}_0 + 0\,\psi^{0*}_0 + \frac{2}{\sqrt 2}\psi^{1*}_0 + \frac{1}{\sqrt 2}\psi^{1*}_1 + \frac12\psi^{2*}_0 - \frac12\psi^{2*}_1 - \frac12\psi^{2*}_2 + \frac12\psi^{2*}_3$

Decomposition
Decomposition of a sequence of h numbers together with a normalization: each original coefficient with superscript j is multiplied by 2^(-j/2).

    proc DecompositionStep(C: array[1..h] of reals)
      for i := 1 to h/2 do
        C'[i]     := (C[2i-1] + C[2i]) / sqrt(2)
        C'[h/2+i] := (C[2i-1] - C[2i]) / sqrt(2)
      end for
      C := C'
    end proc

    proc Decomposition(C: array[1..h] of reals)
      C := C / sqrt(h)        // normalize input coefficients
      while h > 1 do
        DecompositionStep(C[1..h])
        h := h/2
      end while
    end proc
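A minimal Python sketch of the two procedures follows. It assumes 0-based lists whose length is a power of two and that every step divides the pairwise sums and differences by sqrt(2); the function names are lower-case variants chosen for this illustration.

```python
import math


def decomposition_step(c):
    """One normalized Haar step over the whole list c."""
    half = len(c) // 2
    sums = [(c[2 * i] + c[2 * i + 1]) / math.sqrt(2) for i in range(half)]
    diffs = [(c[2 * i] - c[2 * i + 1]) / math.sqrt(2) for i in range(half)]
    return sums + diffs


def decomposition(values):
    """Normalized 1D Haar decomposition; returns a new list of coefficients."""
    h = len(values)
    c = [v / math.sqrt(h) for v in values]  # normalize input coefficients
    while h > 1:
        c[:h] = decomposition_step(c[:h])   # transform the averages part only
        h //= 2
    return c


coeffs = decomposition([9, 7, 3, 5, 6, 8, 6, 4])
```

For the example string this yields the coefficients 6, 0, 2/sqrt(2), 1/sqrt(2), 1/2, -1/2, -1/2, 1/2, i.e. the unnormalized coefficients with superscript j scaled by 2^(-j/2).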

Usage for Data Compression
Haar wavelets for data compression: Suppose that a function f(x) is given by a linear combination of m basis functions, e.g. by a linear combination of m Haar wavelet functions:
$f(x) = \sum_{i=1}^{m} c_i\, u_i(x)$, for example with $u_1(x) = \phi^0_0(x), \ldots, u_8(x) = \psi^2_3(x)$.
We want to reduce the number of coefficients, i.e. we look for a function $\tilde f(x)$ which:
1. is similar to f(x), i.e. $\|f(x) - \tilde f(x)\| \le \varepsilon$ for some norm;
2. may be represented by fewer coefficients (possibly with other basis functions $\tilde u_i(x)$ instead of $u_i(x)$): $\tilde f(x) = \sum_{i=1}^{\tilde m} \tilde c_i\, \tilde u_i(x)$ with $\tilde m < m$.
Finding the best $\tilde f(x)$ is a difficult problem if all possible $\tilde u_i(x)$ are taken into account. Thus we restrict ourselves to the Haar wavelet basis.

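Data Compression
If we do not change the basis functions but only reduce the number of coefficients, then it is easy to show: if the basis functions are orthonormal, then for $\tilde m < m$ coefficients and for the $L^2$ norm, i.e. $\|f - \tilde f\|^2 = \langle f(x) - \tilde f(x) \mid f(x) - \tilde f(x) \rangle$, the error is minimized if we keep those $\tilde m$ of the coefficients $c_1, \ldots, c_m$ whose absolute values are the largest.
Proof: Let $\sigma(1), \ldots, \sigma(m)$ be a permutation of $1, \ldots, m$ and let $\tilde f(x) = \sum_{i=1}^{\tilde m} c_{\sigma(i)}\, u_{\sigma(i)}$. Then
$\|f(x) - \tilde f(x)\|^2 = \Big\langle \sum_{i=\tilde m+1}^{m} c_{\sigma(i)} u_{\sigma(i)} \,\Big|\, \sum_{j=\tilde m+1}^{m} c_{\sigma(j)} u_{\sigma(j)} \Big\rangle$
$= \sum_{i=\tilde m+1}^{m} \sum_{j=\tilde m+1}^{m} c_{\sigma(i)}\, c_{\sigma(j)}\, \langle u_{\sigma(i)} \mid u_{\sigma(j)} \rangle = \sum_{i=\tilde m+1}^{m} (c_{\sigma(i)})^2$,
since the $u_i$ are orthonormal.
The error is minimized if the coefficients with the smallest absolute values are used for $c_{\sigma(\tilde m+1)}, \ldots, c_{\sigma(m)}$.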

Data Compression
Approximation of f(x) by $V^3$ and $W^3$, i.e. by 8 averages $\phi^3_0(x), \ldots, \phi^3_7(x)$ and 8 details $\psi^3_0(x), \ldots, \psi^3_7(x)$. These are orthogonal functions (for simplicity the * has been suppressed).
And (further on) by the 16 coefficients of $\phi^0_0;\ \psi^0_0;\ \psi^1_0, \psi^1_1;\ \psi^2_0, \ldots, \psi^2_3;\ \psi^3_0, \ldots, \psi^3_7$, where $\phi^0_0$ carries the global average.
[Figure: a sequence of pictures showing the effect (i.e. the loss of exactness) as more and more coefficients are eliminated.]

2D Haar Wavelets
We have shown how to use one-dimensional Haar wavelets for the representation and compression of one-dimensional functions. Now we generalize to 2D images.
First we show how to apply the (averaging + detail) technique to 2D images.
First approach: standard decomposition
1. Apply the 1D approach to each row. Result: average + details of different resolutions for each row (see previous examples).
2. Apply the 1D approach to each column of the results obtained by step 1.
I.e.: first the rows, then the columns.
[Figure: after step 1 each row holds its average followed by its detail coefficients; after step 2 the global average sits in the top-left corner and all other entries are detail coefficients.]

2D Haar Wavelets
Second approach: nonstandard decomposition. Alternate between rows and columns.
1. Out of the n values per row: calculate n/2 averages + n/2 details.
2. Take the result and do the same for the columns, i.e. calculate m/2 averages + m/2 details for the m values per column.
3. Repeat steps 1 and 2 for the averages part, i.e. for the upper-left corner.
4. Do that recursively until only one global average is left over.
Remark: This technique works only for square images, i.e. n x n pixels.
[Figure: result of steps 1 and 2: averages in the upper-left quarter, detail coefficients elsewhere.]

2D Haar Wavelet Basis
There are two methods for the construction of a two-dimensional basis:
1. The standard construction
2. The nonstandard construction
Regarding $V^i$, which has $2^i$ basis functions in 1D, namely
$\phi^0_0;\ \psi^0_0;\ \psi^1_0, \psi^1_1;\ \psi^2_0, \ldots, \psi^2_3;\ \ldots;\ \psi^{i-1}_0, \ldots, \psi^{i-1}_{2^{i-1}-1}$,
we get $2^i \times 2^i$ basis functions for $V^i$ in 2D by combining the 1D basis functions with each other.
Example: for $V^2$ the 2D basis will consist of 4 x 4 = 16 basis functions (since there are 4 basis functions $\phi^0_0, \psi^0_0, \psi^1_0, \psi^1_1$ in $V^2$).

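The Standard Construction
Let $u_1, \ldots, u_{2^i}$ be the 1D basis functions. Then $v_{j_1 j_2}(x,y) = u_{j_1}(x)\, u_{j_2}(y)$ is a basis function for 2D images, for $j_1, j_2 \in \{1, \ldots, 2^i\}$.
There are $2^i \times 2^i = 2^{2i}$ basis functions for $V^i$ in 2D.
[Figure: a 1D wavelet in x, a 1D wavelet in y, and their product shown as a sign pattern on the unit square; "+" means +1, "-" means -1.]
Doing that for all combinations of $\phi^0_0(x), \psi^0_0(x), \psi^1_0(x), \psi^1_1(x)$ with $\phi^0_0(y), \psi^0_0(y), \psi^1_0(y), \psi^1_1(y)$, we get all 16 basis functions for $V^2$ in 2D.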

The Nonstandard Construction
The nonstandard construction of a two-dimensional basis defines a two-dimensional scaling function
$\phi\phi(x,y) := \phi(x)\, \phi(y)$
and three wavelet functions
$\phi\psi(x,y) := \phi(x)\, \psi(y)$
$\psi\phi(x,y) := \psi(x)\, \phi(y)$
$\psi\psi(x,y) := \psi(x)\, \psi(y)$
Furthermore we denote scaling by a superscript j and horizontal and vertical translations by a pair of subscripts k and l.
The nonstandard basis consists of $\phi\phi^0_{0,0}(x,y) := \phi\phi(x,y)$ as well as of scaled and translated versions of the wavelet functions $\phi\psi$, $\psi\phi$, and $\psi\psi$. The result is
$\phi\psi^j_{kl}(x,y) := 2^j\, \phi\psi(2^j x - k,\ 2^j y - l)$
$\psi\phi^j_{kl}(x,y) := 2^j\, \psi\phi(2^j x - k,\ 2^j y - l)$
$\psi\psi^j_{kl}(x,y) := 2^j\, \psi\psi(2^j x - k,\ 2^j y - l)$

Nonstandard Construction Of 2D Haar Wavelets
[Figure: sign patterns of the 16 nonstandard basis functions for $V^2$: the function $\phi\phi^0_{0,0}(x,y)$ for the global average, and five functions each of type $\phi\psi$, $\psi\phi$ and $\psi\psi$ (one at the coarse level and four translated copies at the finer level).]

The Standard Construction
The standard construction of a 2D Haar wavelet basis for $V^2$: we calculate the products of the one-dimensional basis functions $\phi^0_0, \psi^0_0, \psi^1_0, \psi^1_1$ for both dimensions.
A detail: the portion of the square $[0,1] \times [0,1]$ where the basis function is different from zero is not a square for all 16 functions.
If we apply the standard construction to an orthonormal basis in one dimension, we get an orthonormal basis in two dimensions.
[Figure: sign patterns of the 16 standard basis functions, i.e. all products of $\phi^0_0, \psi^0_0, \psi^1_0, \psi^1_1$ in x with $\phi^0_0, \psi^0_0, \psi^1_0, \psi^1_1$ in y; $\phi^0_0(x)\,\phi^0_0(y)$ is the function for the global average.]

Comparison
The standard decomposition of an m x m image requires one-dimensional transforms on all rows and then on all columns: $4(m^2 - m)$ assignment operations are needed.
The nonstandard decomposition requires only $\frac{8}{3}(m^2 - 1)$ assignment operations. For m = 256, for example, this is 261,120 versus 174,760 assignments.
In the nonstandard case, all basis functions have square support (support = area where the function is nonzero). This is not the case for the standard basis functions.

Decomposition
As a reminder: the procedures DecompositionStep and Decomposition, which decompose a sequence of h numbers together with a normalization, are needed unchanged in the 2-dimensional decomposition.

Decomposition
The standard decomposition of a 2D picture of h x w pixels:

    proc StandardDecompositionStep(C: array[1..h, 1..w] of reals)
      for row := 1 to h do
        Decomposition(C[row, 1..w])
      end for
      for col := 1 to w do
        Decomposition(C[1..h, col])
      end for
    end proc

The nonstandard decomposition of a 2D picture of h x h pixels (square images only):

    proc NonstandardDecompositionStep(C: array[1..h, 1..h] of reals)
      C := C/h                  // normalize input coefficients
      while h > 1 do
        for row := 1 to h do
          DecompositionStep(C[row, 1..h])
        end for
        for col := 1 to h do
          DecompositionStep(C[1..h, col])
        end for
        h := h/2
      end while
    end proc
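Both 2D procedures can be sketched in Python on nested lists. This is an illustration under stated assumptions: side lengths are powers of two, each 1D step divides by sqrt(2), the column loop of the standard variant runs over all w columns, and the nonstandard variant halves h after each row and column pass. The function names are hypothetical.

```python
import math


def decomposition_step(c):
    """One normalized 1D Haar step over the whole list c."""
    half = len(c) // 2
    sums = [(c[2 * i] + c[2 * i + 1]) / math.sqrt(2) for i in range(half)]
    diffs = [(c[2 * i] - c[2 * i + 1]) / math.sqrt(2) for i in range(half)]
    return sums + diffs


def decomposition(values):
    """Full normalized 1D Haar decomposition of one row or column."""
    h = len(values)
    c = [v / math.sqrt(h) for v in values]
    while h > 1:
        c[:h] = decomposition_step(c[:h])
        h //= 2
    return c


def standard_decomposition(image):
    """Fully decompose every row, then every column of the result."""
    rows = [decomposition(row) for row in image]
    cols = [decomposition(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major


def nonstandard_decomposition(image):
    """Alternate one row step and one column step on the averages quadrant."""
    h = len(image)
    if any(len(row) != h for row in image):
        raise ValueError("nonstandard decomposition needs a square image")
    c = [[v / h for v in row] for row in image]  # normalize input coefficients
    while h > 1:
        for r in range(h):
            c[r][:h] = decomposition_step(c[r][:h])
        for col in range(h):
            column = decomposition_step([c[r][col] for r in range(h)])
            for r in range(h):
                c[r][col] = column[r]
        h //= 2
    return c
```

In both variants the top-left coefficient is the global average of the image, and the sum of the squared coefficients equals the mean squared pixel value, because each step is an orthonormal transform applied after the input normalization.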

Decomposition
Instead of the unnormalized Haar wavelet functions
$\phi^j_i(x) := \phi(2^j x - i)$ and $\psi^j_i(x) := \psi(2^j x - i)$, where
$\phi(x) := 1$ for $0 \le x < 1$, and 0 otherwise,
$\psi(x) := 1$ for $0 \le x < \frac12$, $-1$ for $\frac12 \le x < 1$, and 0 otherwise,
we can use the normalized Haar wavelet functions
$\phi^{j*}_i(x) := \sqrt{2^j}\, \phi(2^j x - i)$ and $\psi^{j*}_i(x) := \sqrt{2^j}\, \psi(2^j x - i)$.
As a compensation we have to multiply each unnormalized coefficient with superscript j by $2^{-j/2}$.
Example:
Unnormalized version: coefficients = (6; 0; 2, 1; 1, -1, -1, 1), basis functions = $(\phi^0_0;\ \psi^0_0;\ \psi^1_0, \psi^1_1;\ \psi^2_0, \psi^2_1, \psi^2_2, \psi^2_3)$
Normalized version: coefficients = $(6;\ 0;\ \frac{2}{\sqrt2}, \frac{1}{\sqrt2};\ \frac12, -\frac12, -\frac12, \frac12)$, basis functions = $(\phi^{0*}_0;\ \psi^{0*}_0;\ \psi^{1*}_0, \psi^{1*}_1;\ \psi^{2*}_0, \psi^{2*}_1, \psi^{2*}_2, \psi^{2*}_3)$

Decomposition
Comment on input normalization:
In DecompositionStep, $C'[i] := (C[2i-1] + C[2i])/\sqrt2$, i.e. the average $(C[2i-1] + C[2i])/2$ is multiplied by $\sqrt2$. At the beginning we normalize by $1/\sqrt h$.
Example: Let $h = 2^n$, e.g. $h = 2^4 = 16$ basis functions $(\phi^0_0;\ \psi^0_0;\ \ldots;\ \psi^3_7)$. Then the highest superscript is $n - 1$, e.g. 3.
We normalize first: $C := C/\sqrt h = C/\sqrt{2^n} = C \cdot 2^{-n/2}$.
Then in the first run we multiply the first detail by $\sqrt2$, so the coefficient is normalized by $\sqrt2 \cdot 2^{-n/2} = 2^{-(n-1)/2}$, as required for superscript $n - 1$.
The same holds for the averages. In successive runs the other detail levels and the global average are correctly evaluated by means of successive multiplications with $\sqrt2$.

2D Image Compression
2D image compression with 2D Haar basis functions (generalization of the corresponding technique for 1D images):
Step 1: Compute coefficients $c_1, \ldots, c_m$ which represent an image in a normalized 2D Haar basis (Image = $c_1 u_1(x,y) + c_2 u_2(x,y) + \ldots + c_m u_m(x,y)$).
Step 2: Sort $c_1, \ldots, c_m$ in order of decreasing magnitude. Result: $c_{\sigma(1)}, \ldots, c_{\sigma(m)}$.
Step 3: Find the smallest $\tilde m$ with $\sum_{i=\tilde m+1}^{m} (c_{\sigma(i)})^2 \le \varepsilon^2$, where $\varepsilon$ is the allowed error regarding the $L^2$ norm, i.e. $\|f - \tilde f\| \le \varepsilon$.
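Steps 2 and 3 can be sketched directly with a sort. This is a minimal illustration that assumes the coefficients already belong to an orthonormal basis, so that the squared L2 error is the sum of the squared discarded coefficients; the function name is hypothetical.

```python
def truncate_by_error(coeffs, eps):
    """Keep the fewest largest-magnitude coefficients with L2 error <= eps.

    Returns (number of kept coefficients, list with the others set to 0.0).
    """
    # Step 2: indices sorted by decreasing magnitude
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    # Step 3: drop coefficients from the small end while the error stays allowed
    tail = 0.0
    keep = len(coeffs)
    while keep > 0 and tail + coeffs[order[keep - 1]] ** 2 <= eps * eps:
        tail += coeffs[order[keep - 1]] ** 2
        keep -= 1
    kept = set(order[:keep])
    return keep, [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


normalized = [6, 0, 2 ** 0.5, 0.5 ** 0.5, 0.5, -0.5, -0.5, 0.5]
count, compressed = truncate_by_error(normalized, 1.05)
```

For the normalized coefficients of the 1D example and an allowed error of 1.05, three coefficients are kept: the zero coefficient and the four finest details (squared error 1.0) are dropped, while dropping the next one would raise the squared error to 1.5.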

Decomposition
A fast procedure (binary search) for finding the threshold below which coefficients are negligible:

    proc Compress(C: array[1..m] of reals, ε: real)
      τ_min := min{ |C[i]| }
      τ_max := max{ |C[i]| }
      do
        τ := (τ_min + τ_max)/2
        s := 0
        for i := 1 to m do
          if |C[i]| < τ then s := s + (C[i])^2
        end for
        if s < ε^2 then τ_min := τ else τ_max := τ
      until τ_min ≈ τ_max
      for i := 1 to m do
        if |C[i]| < τ then C[i] := 0
      end for
    end proc
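The binary search can be sketched in Python as follows. Two details are assumptions of this sketch rather than part of the slide: the loop stops after a fixed tolerance or iteration limit, and the final cut uses the lower bound of the interval, which is the largest threshold known to keep the squared error below ε².

```python
def compress(coeffs, eps, tol=1e-9, max_iter=200):
    """Zero out small coefficients so that the squared L2 error stays below eps**2."""
    magnitudes = [abs(c) for c in coeffs]
    t_min, t_max = min(magnitudes), max(magnitudes)
    for _ in range(max_iter):
        if t_max - t_min <= tol:
            break
        tau = (t_min + t_max) / 2
        # squared error if every coefficient smaller in magnitude than tau is dropped
        s = sum(c * c for c in coeffs if abs(c) < tau)
        if s < eps * eps:
            t_min = tau   # error still small: try a larger threshold
        else:
            t_max = tau   # error too large: lower the threshold
    return [0.0 if abs(c) < t_min else c for c in coeffs]


normalized = [6, 0, 2 ** 0.5, 0.5 ** 0.5, 0.5, -0.5, -0.5, 0.5]
result = compress(normalized, 1.05)
```

On the normalized coefficients of the 1D example with ε = 1.05, the search converges to a threshold just below 1/sqrt(2), so the four finest details are set to zero and the three largest coefficients survive.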

Decomposition
This procedure starts with coefficients C[1], ..., C[m] and finds the smallest $\tilde m$ for which
$\sum_{i=\tilde m+1}^{m} (c_{\sigma(i)})^2 \le \varepsilon^2$,
where $\sigma(1), \ldots, \sigma(m)$ is a permutation of $1, \ldots, m$ such that $|c_{\sigma(1)}| \ge |c_{\sigma(2)}| \ge \ldots \ge |c_{\sigma(m)}|$, and where $\varepsilon$ is a tolerable $L^2$ error.
The algorithm works as follows:
1. It starts with a threshold τ which is the average of the smallest and the largest coefficient magnitude.
2. It computes the squared $L^2$ error that would result if all coefficients smaller in magnitude than τ were discarded.
3. If the squared error is < $\varepsilon^2$, we continue with the right half of the interval.
4. If the squared error is ≥ $\varepsilon^2$, we continue with the left half of the interval.

Decomposition
Let (without any loss of generality) $c_1, c_2, \ldots, c_m$ be already sorted such that $|c_1|\ (= \tau_{max}) \ge |c_2| \ge \ldots \ge |c_m|\ (= \tau_{min})$.
Let s := squared error caused by discarding all coefficients smaller in magnitude than τ, i.e. all coefficients between $\tau_{min}$ and τ.
Case 1: $s < \varepsilon^2$ (still a very small error)
We could possibly discard even more than those coefficients. Hence we drop the part of the interval below τ: we set $\tau_{new,min} := \tau$ and restart with the new interval $[\tau_{new,min}, \tau_{max}]$.
Case 2: $s > \varepsilon^2$ (error too large)
By discarding all coefficients below the current τ we would discard too many. The coefficients with a magnitude of at least τ must be kept anyway. We set $\tau_{new,max} := \tau$ and restart with the new interval $[\tau_{min}, \tau_{new,max}]$.

Multiresolution Analysis
Multiresolution analysis means analyzing a signal at different frequencies, giving different resolutions.
Consider a nested set of vector spaces $V^0 \subset V^1 \subset V^2 \subset \ldots$
Basis functions of $V^j$ are sometimes called scaling functions.
$W^j$ := orthogonal complement of $V^j$ in $V^{j+1}$ (orthogonal according to some definition of an inner product).
The functions which we choose as a basis for $W^j$ are called wavelets. Example (Haar wavelets): $\psi^2_0, \psi^2_1, \psi^2_2, \psi^2_3$.
Matrix notation: If the bases of $V^j$ and $W^j$ consist of $m_j$ and $n_j$ elements respectively, then we may combine the basis functions into single row matrices:
$\Phi^j(x) := [\phi^j_0(x), \ldots, \phi^j_{m_j - 1}(x)]$; $\Psi^j(x) := [\psi^j_0(x), \ldots, \psi^j_{n_j - 1}(x)]$

Matrix Formulation
Some immediate consequences:
1. $V^j \oplus W^j = V^{j+1}$ with $W^j$ orthogonal to $V^j$, hence $m_j + n_j = m_{j+1}$.
2. $V^{j-1} \subset V^j$: the elements of $V^{j-1}$ must be expressible as linear combinations of the finer functions of $V^j$, i.e. the scaling functions must be refinable.
Example: scaling functions of $V^1$: $\phi^1_0, \phi^1_1$; scaling functions of $V^2$: $\phi^2_0, \phi^2_1, \phi^2_2, \phi^2_3$. We have:
$\phi^1_0 = \phi^2_0 + \phi^2_1$
$\phi^1_1 = \phi^2_2 + \phi^2_3$
There must be a matrix $P^j$ with
(a) $\Phi^{j-1}(x) = \Phi^j(x)\, P^j$
where $P^j$ is an $m_j \times m_{j-1}$ matrix.

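Matrix Formulation
3. $W^{j-1} \subset V^j$: elements of $W^{j-1}$ can be written as linear combinations of the scaling functions of $V^j$. There is an $m_j \times n_{j-1}$ matrix $Q^j$ with
(b) $\Psi^{j-1}(x) = \Phi^j(x)\, Q^j$
Example (unnormalized Haar basis): $\Phi^1(x) = \Phi^2(x)\, P^2$ and $\Psi^1(x) = \Phi^2(x)\, Q^2$ with
$P^2 = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}$ and $Q^2 = \begin{bmatrix} 1 & 0 \\ -1 & 0 \\ 0 & 1 \\ 0 & -1 \end{bmatrix}$
4. (a) and (b) can be combined to:
$[\Phi^{j-1} \mid \Psi^{j-1}] = \Phi^j\, [P^j \mid Q^j]$
In our example:
$[\phi^1_0\ \ \phi^1_1\ \ \psi^1_0\ \ \psi^1_1] = [\phi^2_0\ \ \phi^2_1\ \ \phi^2_2\ \ \phi^2_3] \begin{bmatrix} 1 & 0 & 1 & 0 \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & -1 \end{bmatrix}$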

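Matrix Formulation
5. All functions of $\Phi^{j-1}(x)$ must be orthogonal to all functions of $\Psi^{j-1}(x)$:
$[\langle \Phi^{j-1} \mid \Psi^{j-1} \rangle] = 0$ (the matrix whose (k,l) entry is $\langle \phi^{j-1}_k \mid \psi^{j-1}_l \rangle$)
Substituting (b) gives: $[\langle \Phi^{j-1} \mid \Phi^j \rangle]\, Q^j = 0$
6. $[\langle \Phi^{j-1} \mid \Phi^j \rangle]\, Q^j = 0$ is a homogeneous system of equations. The set of its solutions is called the null space of $[\langle \Phi^{j-1} \mid \Phi^j \rangle]$.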

Matrix Formulation
The columns of $Q^j$ are a basis for the null space. There are many alternatives for $Q^j$, thus there are many different wavelet bases for a given $W^j$.
The Haar wavelets are defined by the additional requirement that the number of consecutive nonzero values per column of $Q^j$ is minimal. For the unnormalized Haar basis, for example:
$Q^2 = \begin{bmatrix} 1 & 0 \\ -1 & 0 \\ 0 & 1 \\ 0 & -1 \end{bmatrix}$
Definition:
- Orthogonal wavelet basis: all functions (wavelets + scaling functions) are orthogonal to each other. Example: the Haar basis.
- Semi-orthogonal wavelet basis: the wavelets are orthogonal to the scaling functions but not to each other. Example: the spline basis.

Matrix Formulation
A matrix notation for the approximation of functions:
We have shown earlier that (and how) it is possible to approximate a function f(x) in $V^j$ by
$f(x) = c^j_1 u^j_1(x) + \ldots + c^j_{m_j} u^j_{m_j}(x)$
where the $u^j_i(x)$ are basis functions of $V^j$.
The coefficients $c^j_i$ might be, for example, pixel colors or y-coordinates of the function f(x).
We can write the coefficients $c^j_i$ as $C^j = [c^j_1, \ldots, c^j_{m_j}]^T$, so that
$f(x) = U^j C^j$ where $U^j = [u^j_1(x), \ldots, u^j_{m_j}(x)]$

Matrix Formulation
We may want to express f(x) within $V^{j-1}$ (a lower-resolution version, i.e. less accurate, with a lower number of coefficients: $m_{j-1}$ instead of $m_j$ coefficients).
The standard procedure for creating the $m_{j-1}$ coefficients of $C^{j-1}$ out of the $m_j$ coefficients of $C^j$ consists of:
- linear filtering of the coefficients of $C^j$
- down-sampling
This may be expressed as a matrix equation: $C^{j-1} = A^j C^j$, where $A^j$ is an $m_{j-1} \times m_j$ matrix.
Since $C^{j-1}$ is smaller than $C^j$, this filtering process loses some detail. The lost detail may be expressed in a second column matrix $D^{j-1}$ with $D^{j-1} = B^j C^j$, where $B^j$ is an $n_{j-1} \times m_j$ matrix.

Matrix Formulation
$C^{j-1} = A^j C^j$: low-resolution version of $C^j$
$D^{j-1} = B^j C^j$: detail of $C^j$ which cannot be expressed by $C^{j-1}$
If $A^j$ and $B^j$ are appropriately chosen, $C^j$ may be recovered from $C^{j-1}$ and $D^{j-1}$:
$C^j = P^j C^{j-1} + Q^j D^{j-1}$
The process of calculating $C^{j-1}$ and $D^{j-1}$ from $C^j$ is called decomposition of $C^j$, or analysis.
The process of calculating $C^j$ from $C^{j-1}$ and $D^{j-1}$ is called reconstruction of $C^j$, or synthesis.
The global procedure looks as follows:

    C^j --A^j--> C^(j-1) --A^(j-1)--> C^(j-2) ... C^1 --A^1--> C^0
      \--B^j--> D^(j-1)   \--B^(j-1)--> D^(j-2)      \--B^1--> D^0

This recursive procedure is called a filter bank.
The coefficients $C^j$ can be reconstructed from $C^0, D^0, D^1, D^2, \ldots, D^{j-1}$. This sequence is called the wavelet transform. It has the same size as $C^j$.

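Example
For the (unnormalized) Haar basis of $V^2$ we have
$A^2 = \frac12 \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}$, $B^2 = \frac12 \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{bmatrix}$
$C^1 = A^2 C^2$, $D^1 = B^2 C^2$
$A^2$ is the averaging operation, $B^2$ is the differencing operation (this explains the factor ½).
A general relation which must be satisfied by the matrices $A^j$ and $B^j$ is:
$[\Phi^{j-1} \mid \Psi^{j-1}] \begin{bmatrix} A^j \\ B^j \end{bmatrix} = \Phi^j$
Thus, since $[\Phi^{j-1} \mid \Psi^{j-1}] = \Phi^j\, [P^j \mid Q^j]$ (shown earlier), we must have
$\begin{bmatrix} A^j \\ B^j \end{bmatrix} = [P^j \mid Q^j]^{-1}$, i.e. $[P^j \mid Q^j]$ must be invertible.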
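The relation between the analysis matrices A, B and the synthesis matrices P, Q can be checked numerically. The sketch below assumes the unnormalized Haar matrices for the step from V^2 to V^1 (averaging and differencing with factor ½) and uses plain nested lists; the helper names are illustrative.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Assumed unnormalized Haar matrices for j = 2
P2 = [[1, 0], [1, 0], [0, 1], [0, 1]]
Q2 = [[1, 0], [-1, 0], [0, 1], [0, -1]]
A2 = [[0.5, 0.5, 0, 0], [0, 0, 0.5, 0.5]]
B2 = [[0.5, -0.5, 0, 0], [0, 0, 0.5, -0.5]]


def analyze(c2):
    """Decomposition (analysis): C^1 = A^2 C^2 and D^1 = B^2 C^2."""
    column = [[v] for v in c2]
    c1 = [row[0] for row in matmul(A2, column)]
    d1 = [row[0] for row in matmul(B2, column)]
    return c1, d1


def synthesize(c1, d1):
    """Reconstruction (synthesis): C^2 = P^2 C^1 + Q^2 D^1."""
    pc = matmul(P2, [[v] for v in c1])
    qd = matmul(Q2, [[v] for v in d1])
    return [p[0] + q[0] for p, q in zip(pc, qd)]


c1, d1 = analyze([8, 4, 7, 5])
restored = synthesize(c1, d1)
```

Applied to the level-2 averages 8 4 7 5 of the string example, analysis gives the averages 6 6 and the details 2 1, and synthesis restores 8 4 7 5. Stacking A2 over B2 and multiplying by [P2 | Q2] gives the identity matrix.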

Multiresolution Analysis
How do we choose a transformation technique that is suitable for a particular application? The following steps have to be made:
1. Select the scaling functions $\Phi^j(x)$ (j = 0, 1, ...) (this defines $V^j$ and $P^j$).
2. Select an inner product for $V^0, V^1, \ldots$ (this defines the $L^2$ norm as well as $W^j$).
3. Select a set of wavelets $\Psi^j(x)$ which are a basis for $W^j$ (j = 0, 1, ...) (this defines $Q^j$).
Remark: $P^j$ and $Q^j$ define $A^j$ and $B^j$, since $\begin{bmatrix} A^j \\ B^j \end{bmatrix} = [P^j \mid Q^j]^{-1}$.

Multiresolution Analysis
It is desirable for reasons of data compression that:
1. the wavelets are an orthogonal basis for $W^j$;
2. the wavelets have a small support (the support of a function f(x) is the set of points where $f(x) \ne 0$).
But: orthogonality very often comes with large support. Thus it may be better to sacrifice orthogonality.
An example are the spline wavelets, which
- have minimum support,
- are not orthogonal to each other (except for degree = 0).

B-Spline Wavelets
Haar type wavelets have many advantages:
- simplicity
- orthogonality
- very small supports
- non-overlapping scaling functions for a given level j
- non-overlapping wavelets for a given level j
However, Haar transforms are not well suited for animation and for curve editing since they are not smooth enough.
Therefore, we are interested in wavelets which have several continuous derivatives. Such wavelets can be derived from piecewise polynomial splines (B-splines).
B-spline scaling functions and B-spline wavelets can be defined for different degrees d = 0, d = 1, d = 2, ...
The higher the degree, the smoother the transform. With degree d, the scaling functions have d - 1 continuous derivatives. d = 0 is the Haar transform.

Non-Uniform B-Spline Scaling Functions
The non-uniform B-spline basis functions of degree d are constructed as follows:
Choose positive integers k and d ($k \ge d$) and values $x_0 \le \ldots \le x_{k+d+1}$. These values are called knots.
Then we define recursively the non-uniform B-spline basis functions of degree d:
$N^0_i(x) := 1$ if $x_i \le x < x_{i+1}$, and 0 otherwise
$N^r_i(x) := \frac{x - x_i}{x_{i+r} - x_i}\, N^{r-1}_i(x) + \frac{x_{i+r+1} - x}{x_{i+r+1} - x_{i+1}}\, N^{r-1}_{i+1}(x)$
(i = 0, ..., k; r = 1, ..., d)
Remark: If a denominator is 0, then the whole term is defined to be zero.

Uniform B-Spline Scaling Functions
Endpoint-interpolating B-splines of degree d on the interval [0,1] are defined by setting the first d + 1 knots to 0 and the last d + 1 knots to 1.
$N^d_0(x), \ldots, N^d_k(x)$ form a basis for the space of piecewise polynomials of degree d with d - 1 continuous derivatives.
Finally, uniformly spaced B-splines are constructed by selecting $k = 2^j + d - 1$ and by making the interior knots $x_{d+1}, \ldots, x_k$ equally spaced.
This gives $2^j + d$ B-spline functions $N^d_0(x), \ldots, N^d_{2^j+d-1}(x)$ which are a basis for a particular degree d and a level j, i.e. for $V^j(d)$.
If $V^j(d)$ denotes the space which is spanned by the B-spline scaling functions of degree d with $2^j$ uniform intervals, then the spaces $V^0(d), V^1(d), \ldots$ are nested: $V^0(d) \subset V^1(d) \subset V^2(d) \subset \ldots$
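The recursive definition and the endpoint-interpolating uniform knot vector can be sketched in Python. The function names are illustrative; the knot count follows k = 2^j + d - 1 with knots x_0, ..., x_{k+d+1}, and terms with a zero denominator are treated as zero.

```python
def bspline_basis(i, r, x, knots):
    """Value of the degree-r B-spline basis function with index i at x."""
    if r == 0:
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left_den = knots[i + r] - knots[i]
    right_den = knots[i + r + 1] - knots[i + 1]
    left = 0.0
    right = 0.0
    if left_den != 0:    # a term with denominator 0 is defined to be zero
        left = (x - knots[i]) / left_den * bspline_basis(i, r - 1, x, knots)
    if right_den != 0:
        right = (knots[i + r + 1] - x) / right_den * bspline_basis(i + 1, r - 1, x, knots)
    return left + right


def uniform_knots(j, d):
    """Endpoint-interpolating knots on [0, 1] with 2**j equal interior intervals."""
    n = 2 ** j
    interior = [i / n for i in range(1, n)]
    return [0.0] * (d + 1) + interior + [1.0] * (d + 1)


def scaling_functions(j, d, x):
    """Values of all 2**j + d scaling functions of degree d and level j at x."""
    knots = uniform_knots(j, d)
    return [bspline_basis(i, d, x, knots) for i in range(2 ** j + d)]
```

For j = 1 and d = 2, `scaling_functions(1, 2, 0.25)` returns four values that sum to 1. For d = 0 the functions reduce to the box functions of the Haar basis.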

B-Spline Scaling Functions
Example: j = 1, i.e. $2^1 = 2$ subintervals: $[0, \frac12)$, $[\frac12, 1)$. Degrees d = 0, d = 1, d = 2, d = 3.
B-spline scaling functions: $N^d_0(x), \ldots, N^d_{2^j+d-1}(x)$
The functions $N^d_i(x)$ have d - 1 continuous derivatives. For each degree d they are a basis for $V^1(d)$.
[Figure: the B-spline scaling functions for j = 1 in four panels: degree 0 ($N^0_0, N^0_1$), degree 1 ($N^1_0, \ldots, N^1_2$), degree 2 ($N^2_0, \ldots, N^2_3$), degree 3 ($N^3_0, \ldots, N^3_4$).]

Scaling and Wavelet Functions for $V^3(d)$
In the following we show B-spline scaling functions for j = 3 and for degrees
- d = 0 (Haar wavelet functions)
- d = 1 (linear splines)
- d = 2 (quadratic splines)
- d = 3 (cubic splines)
We get $2^j + d = 8 + d$ scaling functions and $2^j = 8$ wavelets in each case.
The wavelets are determined by matrices $Q^j$ which satisfy $[\langle \Phi^{j-1} \mid \Phi^j \rangle]\, Q^j = 0$. The solution of this system of equations is not unique.

B-Spline Scaling Functions for $V^3(0)$
Degree d = 0 (Haar wavelets): not a continuous function, since $\lim_{\varepsilon \to 0} f(x + \varepsilon) \ne \lim_{\varepsilon \to 0} f(x - \varepsilon)$ for $x = \frac18, \frac28, \ldots, \frac78$.
[Figure: 8 scaling functions, 8 wavelets]

B-Spline Scaling Functions for $V^3(1)$
Degree d = 1 (linear B-spline wavelets)
[Figure: 9 scaling functions, 8 wavelets]

B-Spline Scaling Functions for $V^3(2)$
Degree d = 2 (quadratic B-spline): can be continuously differentiated once.
[Figure: 10 scaling functions, 8 wavelets]

B-Spline Scaling Functions for $V^3(3)$
Degree d = 3 (cubic B-spline): can be continuously differentiated two times.
[Figure: 11 scaling functions, 8 wavelets]
There are $2^j + d$ scaling functions if the unit interval [0,1) is subdivided into $2^j$ subintervals. The scaling functions have d - 1 continuous derivatives.

Wavelets in JPEG 2000
Using wavelets, images can be compressed better than with DCT. The decomposition used for compression here is the nonstandard decomposition.

Decomposition
[Figure: wavelet decomposition of an image into regions: coarser details; vertical information (fine details); horizontal information (fine details); diagonal information (finest details).]