CHAPTER 3

ENCODING VIDEO SEQUENCES IN FRACTAL BASED COMPRESSION

3.1 Introduction

Day by day, the demand for higher and faster technologies is rapidly increasing. Although the technologies available now are considered far more advanced than those of 30-40 years ago, people are still looking for improvements and enhancements. In the last twenty years, computers have developed and their price has reached a level where almost anyone can buy one. Nowadays, before purchasing a computer, customers are concerned about two things: (1) the speed of the CPU; and (2) the storage and memory capacity. Image and video compression helps to reduce the required storage capacity and allows faster transmission rates. Many compression techniques [117, 118] have been introduced and developed, such as the Joint Photographic Experts Group (JPEG) and Graphics Interchange Format (GIF) formats for image compression, as well as the Moving Picture Experts Group (MPEG) standards for video compression [102]. Although some of these techniques show significant performance in decreasing the cost of transmitting and storing data, the search for alternatives continues.

Fractal image or video compression [85] is a compression method based on the self-similarity between different portions of an image. It could be revolutionary in the world of data compression because of its high compression rate compared with other methods; however, it suffers from some problems, such as the time taken for encoding [93]. The fractal dimension is a promising feature proposed to characterize roughness and self-similarity in an image sequence. Roughness usually results from edge components in the spatial domain and from movement in the temporal domain, while self-similarity corresponds to both spatial and temporal redundancy. This observation serves as a motivation for employing the fractal dimension as the major feature discriminator.

The theory of iterated contractive transformations is utilized to record the transformation function for every partition between the domain and range blocks [94, 95, 96]. The results reported are a compression rate of 0.68 bits/pixel (a compression ratio of 11.76) with an SNR of 27.7 dB in the two-dimensional case, and compression ratios of 41.80 and 74.39 with PSNR around 29 dB and 33 dB, respectively, in the three-dimensional case. In order to meet the compression requirements of diverse image sequence characteristics, a robust technique is proposed based on the fractal dimensions of the ensemble luminance and chrominance, respectively, relative to a reference frame inside a Group Of Pictures (GOP). The proposed method is tested on image sequences containing various motion dynamics. The performance in terms of the compression ratio is tabulated, and the PSNR value in all three color channels for each reconstructed frame is illustrated.

3.2 Fractal Color Image Compression

Little work has been done on fractal color image compression compared with the work done on fractal gray-scale image compression [48]. The need for color image compression has been gaining importance in recent times due to large-scale multimedia applications. Conventional fractal compression schemes can easily be extended to color image compression, as a color image is usually represented in multiple channels such as the Red, Green and Blue (RGB) components [118]. Thus, each channel of a color image can be compressed as a gray-level image. Hurtgen, Mols and Simon [63] proposed a fractal transform coding of color images. This kind of encoding lacks the possibility of considering similarities between the three color planes, so a good compression ratio is not achieved. A fair population of the domain pool is needed to obtain a high-quality interpolation for zooming purposes. To exploit the spectral redundancy in the RGB components, the root mean square (RMS) error measure in gray-scale space can be extended to a 3-dimensional color space for fractal-based color image coding [29].

Experiments show that a compression ratio improvement of 1.5 can be obtained using a vector distortion measure in fractal coding with a fixed image partition, as compared to separate fractal coding of the RGB images. However, since the RGB space is not perceptually uniform, it was decided to use another color space, called CIE-Lab.

The color space

A color space is a mathematical representation of a set of colors. The three most popular color models are RGB (used in computer graphics), YIQ, YUV or YCbCr (used in video systems) and CMYK (used in color printing). However, none of these color spaces is directly related to the intuitive notions of hue, saturation and brightness, which are the basis of our color perception. This resulted in the temporary pursuit of other models, such as HSI and HSV, intended to simplify programming, processing and end-user manipulation while getting closer to the way colors are actually represented in our brain. Indeed, mathematically, all these color spaces can be derived from the RGB information supplied by devices such as cameras and scanners.

The RGB color space is widely used throughout computer graphics, since red, green and blue are the three primary additive colors (individual components are added together to form a desired color) and are represented by a three-dimensional Cartesian coordinate system [80]. The triangle drawn between the three primaries is called the Maxwell triangle. The intersection point of a color vector with this triangle gives an indication of the hue and saturation of the color in terms of the distances of the point from the vertices of the triangle.

Figure 3.1 The RGB color space representation

The RGB color space is the most prevalent choice for computer graphics because color displays use red, green and blue to create the desired color. Therefore, the choice of the RGB color space simplifies the architecture and design of the system. Also, a system designed using the RGB color space can take advantage of a large number of existing software routines, since this color space has been around for a number of years. However, RGB is not very efficient when dealing with real-world images. All three RGB components need to be of equal bandwidth to generate any color within the RGB color cube. The result of this is a frame buffer that has the same pixel depth and display resolution for each RGB component. Also, processing an image in the RGB color space is usually not the most efficient method. For example, to modify the intensity or color of a given pixel, the three RGB values must be read from the frame buffer, the intensity or color calculated, the desired modifications performed, and the new RGB values calculated and written back to the frame buffer. If the system has access to an image stored directly in an intensity-and-color format, these processing steps would be faster. The most common such systems are the YUV and YCbCr color spaces. The YUV color space is used by the PAL (Phase Alternation Line), NTSC (National Television System Committee) and SECAM composite color video standards.

The equations that describe the direct transformation from RGB to YUV are

    Y =  0.299 R + 0.587 G + 0.114 B
    U = -0.147 R - 0.289 G + 0.436 B = 0.492 (B - Y)    (3.1)
    V =  0.615 R - 0.515 G - 0.100 B = 0.877 (R - Y)

and, for the inverse transformation,

    R = Y + 1.140 V
    G = Y - 0.395 U - 0.581 V    (3.2)
    B = Y + 2.032 U

For digital RGB values in the range 0-255, Y has a range of 0-255, U a range of 0 to ±112 and V a range of 0 to ±157. Like the RGB color space, the YUV space is not uniform with respect to the HVS. A color space is said to be perceptually uniform if a small perturbation of a value is perceived in the same way over the whole range of possible variation of that value. Using a non-perceptually-uniform space such as RGB has the drawback that measures computed for digital video processing will not agree with the Human Vision System, since distances between RGB values are not uniform with respect to the HVS. Starting from these considerations, the Commission Internationale de l'Eclairage (CIE) defined a uniform color model.

Danciu and Hart [29] presented a comparative study of fractal color image compression in the CIE-Lab color space against Jacquin's iterated transform technique applied to 3-dimensional color. It has been shown that the use of a uniform color space yields compressed images with less noticeable color distortion than other methods. Since there are three types of color photoreceptor cone cells in the retina, each with a different spectral response curve, all colors can be completely described by three numbers, corresponding to the outputs of the cone cells. The CIE workgroup therefore defined the XYZ tristimulus values, with which all visible colors can be represented using only positive values of X, Y and Z.

For applications where it is important to be able to measure differences between colors in a way that matches perceptual similarity as well as possible, the perceptually uniform color spaces find their best field of use. The CIE-Lab color space is designed such that the perceived difference between single, nearby colors corresponds to the Euclidean distance between the color coordinates. The (nonlinear) conversion from RGB to CIE-Lab starts from the XYZ tristimulus values, given by

    X = 0.412453 R + 0.357580 G + 0.180423 B
    Y = 0.212671 R + 0.715160 G + 0.072169 B    (3.3)
    Z = 0.019334 R + 0.119193 G + 0.950227 B

Figure 3.2 CIE-Lab color space representation

Luminance varies from 0 (black) to 100 (white), and the a and b components vary from -50 to +50, representing the color variation along the red-green and blue-yellow axes, respectively. The two-dimensional fractal encoder works on gray-level images, that is, on one layer of a color decomposition. To encode a color image, it should therefore be decomposed into three color layers, such as separate RGB or YIQ layers [72].
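
As an illustration of the color space conversions above, the following Python sketch (an illustration only, not part of the cited works; the function names are invented here) applies the RGB-to-YUV transformation of equation (3.1) and the RGB-to-XYZ transformation of equation (3.3) to a single pixel, with R, G, B given as 8-bit values for YUV and scaled to [0, 1] for XYZ.

import numpy as np

def rgb_to_yuv(r, g, b):
    # Direct transformation of equation (3.1); inputs in the range 0-255.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.147 * r - 0.289 * g + 0.436 * b   # = 0.492 (B - Y)
    v = 0.615 * r - 0.515 * g - 0.100 * b    # = 0.877 (R - Y)
    return y, u, v

def rgb_to_xyz(rgb):
    # Linear part of the RGB -> CIE-Lab conversion, equation (3.3);
    # rgb is a length-3 array scaled to [0, 1].
    m = np.array([[0.412453, 0.357580, 0.180423],
                  [0.212671, 0.715160, 0.072169],
                  [0.019334, 0.119193, 0.950227]])
    return m @ np.asarray(rgb, dtype=float)

print(rgb_to_yuv(128, 128, 128))      # grey pixel: Y = 128, U and V close to 0
print(rgb_to_xyz([0.5, 0.5, 0.5]))    # XYZ of the same grey, normalized inputs

The XYZ values would then be mapped to the L, a, b coordinates through the standard CIE nonlinearity to obtain the perceptually uniform representation described above.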

Figure 3.3 CIE-Lab color space

Every image is partitioned into square domain and range blocks. It is assumed that:

- range blocks have a constant size equal to R;
- domain blocks have a constant size equal to D, which is twice the size of a range block;
- the domain pool is made of non-overlapping domain blocks.

Given a starting image μ_org, the encoding process is, for every range block R_i of the partition obtained from the starting image μ_org:

1) extract the range block R_i;
2) find the fractal transform of this block;
3) save the fractal code τ_i;
end

The procedure involved in step (2) is given below.

2.a) From the input range block R_i, compute its luminance l_R and its contrast C_R. The luminance l_R and the contrast C_R of the range block R_i are computed in the following way.

Luminance: l_R is the minimum luminance value among all the pixels that compose R_i:

    l_R = min over 1 ≤ x ≤ R, 1 ≤ y ≤ R of P_R(x, y)

where P_R(x, y) is a generic pixel of R_i and R is the size of the range block.

Contrast: C_R is obtained as the overall sum of the pixels inside R_i after the luminance l_R has been subtracted from every pixel's value.

2.b) Compute the normalized range block R̄ using the luminance and the contrast. The normalization R̄ of the range block R_i is obtained as follows: from every pixel belonging to R_i, the luminance l_R is subtracted (luminance shift); every pixel is then divided by the contrast C_R (contrast scaling). For the example range block used in Figure 3.4 (pixel values 10, 7, 5 and 4), it is easy to notice that l_R = 4; to obtain the contrast, the luminance is subtracted from every pixel and the results are summed:

    C_R = (10 - 4) + (7 - 4) + (5 - 4) + (4 - 4) = 10    (3.4)

For every domain block D_j inside the domain pool, steps 2.c) to 2.h) below are executed.

2.c) Subsample D_j to match the range block size. The generic domain block D_j is subsampled by a factor of two (as stated in the assumptions) so that it has the same dimensions as the range block R_i.

2.d) Compute its luminance l_D and its contrast C_D (the same procedure as in 2.a), now executed on D_j).

2.e) Compute the normalized domain block D̄ using l_D and C_D (the same procedure as in 2.b)). An isometry k is chosen.

2.f) Compute the isometry D̄_k starting from D̄.

2.g) Compute the error between R̄ and D̄_k. A metric is needed to compare ranges and transformed domains; the simple MSE between the pixels of R̄ and D̄_k is used as a first approximation:

    MSE(R̄, D̄_k) = (1 / (R·R)) Σ_{y=1..R} Σ_{x=1..R} [ P_R̄(x, y) - P_D̄k(x, y) ]²    (3.5)

2.h) If the error is a local minimum, save τ_i, composed of the information on the isometry and the domain block used.

2.i) End of the isometry loop; end of the domain block loop.

Finally, the fractal code τ_i of the range block R_i is obtained considering:

- the coordinates (X_D, Y_D) that identify, on the encoded image, the domain block D_j (found as in 2.g)) that minimizes the MSE error measure;
- the isometry k that minimizes the error measure;
- the luminances l_R and l_D and the contrast ratio

    C = C_R / C_D    (3.6)

Figure 3.4 An example of normalization: a) the range block; b) a three-dimensional representation of the range block; c) luminance shift; d) contrast

The allowed isometries, for a block of size n with pixel indices (i, j), are:

- identity: τ1, (i, j) → (i, j)
- horizontal reflection: τ2, (i, j) → (i, n - j)
- vertical reflection: τ3, (i, j) → (n - i, j)
- first diagonal reflection: τ4, (i, j) → (j, i)
- second diagonal reflection: τ5, (i, j) → (n - j, n - i)
- 90° counter-clockwise rotation: τ6, (i, j) → (n - j, i)
- 180° counter-clockwise rotation: τ7, (i, j) → (n - i, n - j)
- 270° counter-clockwise rotation: τ8, (i, j) → (j, n - i)

All these steps are needed to obtain an efficient matching between the range blocks R_i and all the isometries of the subsampled domain blocks D_j. The overall process is explained using, as an example, the 2x2 range block with pixel values

    10  7
     5  4

Given the following range block and domain block (the domain block is already subsampled to match the range block size), it is noticed that they seem completely different from each other for every isometry applied to D:

    R = | 10   7 |        D = | 19  13 |
        |  5   4 |            |  9   7 |

The luminances of the respective blocks are l_R and l_D. As stated in 2.b), the luminance is subtracted from every pixel of each block, the contrasts are computed, and the contrast scaling is applied:

    (R - l_R) / C_R,    (D - l_D) / C_D    (3.7)

with

    l_R = 4,  l_D = 7,  C_R = 10,  C_D = 20    (3.8)

so that both normalized blocks become

    | 0.6  0.3 |
    | 0.1  0   |

It is seen that the initial blocks, which seemed different, are now identical. All the operations made are reversible.
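
The normalization and matching steps 2.a)-2.h) can be summarized in a short Python sketch (a minimal illustration only, with invented function names, using the 2x2 blocks of the worked example; in a full encoder the domain block would first be subsampled 2:1 and the search repeated over the whole domain pool):

import numpy as np

def normalize(block):
    # Steps 2.a)/2.b): luminance l = minimum pixel value, contrast c = sum of
    # the luminance-shifted pixels; the normalized block is (block - l) / c.
    l = block.min()
    c = (block - l).sum()
    norm = (block - l) / c if c != 0 else (block - l)
    return norm, l, c

def isometries(block):
    # The eight isometries of a square block: the four rotations of the block
    # itself and the four rotations of its transpose (the reflections).
    for b in (block, block.T):
        for k in range(4):
            yield np.rot90(b, k)

def match(range_block, domain_block):
    # Steps 2.d)-2.h): normalize both blocks, try every isometry and keep the
    # one with the smallest MSE (equation 3.5).  The domain block is assumed
    # to be already subsampled to the range-block size, as in the example.
    r_norm, l_r, c_r = normalize(range_block)
    d_norm, l_d, c_d = normalize(domain_block)
    errors = [np.mean((r_norm - d_k) ** 2) for d_k in isometries(d_norm)]
    k = int(np.argmin(errors))
    contrast_ratio = c_r / c_d if c_d != 0 else 0.0   # equation (3.6)
    return k, errors[k], l_r, l_d, contrast_ratio

R = np.array([[10.0, 7.0], [5.0, 4.0]])
D = np.array([[19.0, 13.0], [9.0, 7.0]])
print(match(R, D))

Running the sketch on the example blocks returns the identity isometry with zero error, l_R = 4, l_D = 7 and contrast ratio C = 0.5, confirming the hand computation above. A full encoder would repeat this comparison over every domain block of the pool and keep the (domain position, isometry, l_R, l_D, C) tuple with the smallest error as the fractal code τ_i.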

Bidimensional Fractal Decoder

Once the fractal code τ has been computed with the method described above, a fractal approximation of its initial attractor (the initial image) μ_org can be re-obtained. It is assumed that:

- range blocks have a constant size R' = βR;
- domain blocks have a constant size D', which is twice the size of the range blocks;
- the starting image of the decoding process, μ_0, has a size that is β times the size of the initially encoded image μ_org.

The factor β is the zooming factor used at the decoding stage to obtain decoded images of different sizes. Given the fractal code τ and a starting image μ_0, the decoding stage consists of the following steps:

1) for m < α iterations
2)   for every τ_i of τ
3)     read from τ_i the domain block coordinates (x_D, y_D)
4)     extract the domain block D situated at the position (β·x_D, β·y_D) of μ_m
5)     compute a subsampled approximation D_sub of D to match the size of R'
6)     apply the isometry k extracted from τ_i to D_sub, obtaining D_sub_k
7)     extract l_R, l_D and C from τ_i
8)     apply the luminance values and the contrast to D_sub_k, obtaining R_decoded
9)     write R_decoded on μ_{m+1}
     end τ_i loop
   end iteration loop

The previous steps mean:

1) This loop cycles the whole decoding procedure α times: it is the cycle that generates the sequence of images converging towards the attractor (μ_org, or an expanded version of it).
2) This loop decodes every fractal transform τ_i, each related to a range block R_i.
3) Reads from τ_i the coordinates (x_D, y_D) of the domain block D. This domain is the best transformed domain that the encoder found in the original image μ_org for R_i.
4) Since the whole image is zoomed by a factor β, all the coordinates must be shifted by the same amount.
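
A condensed rendering of this decoding loop in Python (illustrative only; it assumes the fractal code is stored as a list of tuples (range position, domain position, isometry index, l_R, l_D, C) matching the encoder sketch given earlier, and it ignores the zoom factor β for brevity):

import numpy as np

def apply_isometry(block, k):
    # Same ordering as the encoder sketch: indices 0-3 are rotations of the
    # block, indices 4-7 are rotations of its transpose (the reflections).
    b = block if k < 4 else block.T
    return np.rot90(b, k % 4)

def decode(codes, shape, block, iterations=8):
    # Starting from an arbitrary image, every pass rewrites each range block
    # from its transformed domain block; the image sequence converges towards
    # the attractor (the decoded approximation of the original image).
    img = np.zeros(shape)
    for _ in range(iterations):
        new = img.copy()
        for (ry, rx), (dy, dx), k, l_r, l_d, c in codes:
            d = img[dy:dy + 2 * block, dx:dx + 2 * block][::2, ::2]  # 2:1 subsample
            d = apply_isometry(d, k)
            new[ry:ry + block, rx:rx + block] = c * (d - l_d) + l_r  # undo normalization
        img = new
    return img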

3.3 Video Compression

A video is a sequence of images; therefore, the same method used for compressing images can be applied to video by compressing each frame separately (this is called intra-frame coding). Though this technique looks very simple, it is impractical because it requires a very large memory space to store the data. The best way to achieve better compression is to take advantage of the similarities between the video frames. There are two main functions used in video coding [80]:

1. Prediction: create a prediction of the current frame based on one or more previously transmitted frames.
2. Compensation: subtract the prediction from the current frame to produce a residual frame.

3.3.1 Frame Differencing

The idea of frame differencing in video coding is to produce a residual frame by subtracting the previous frame from the current frame; the residual frame is then mostly zero data, with light and dark areas (light indicates positive residual data and dark indicates negative residual data). Because of the similarity between the frames (most of the pixels in the previous frame are equal to the pixels in the current frame), the portions of the residual frame containing zero data outnumber the light and dark areas. Therefore, since most of the residual frame is zero data, the compression efficiency is further improved if the residual frame is compressed instead of the current frame.

The frame-differencing method in video coding faces a major problem, which is best illustrated by the following example. In the encoding and decoding process there is no prediction for the first frame; the problem starts with the second frame, when the encoder uses the first frame as the prediction and encodes the resulting residual frame. The decoded first frame is not exactly the same as the input frame, which leads to a small error in the prediction of the second frame at the decoder. This error grows as it is carried forward to subsequent frames, and the result is low quality in the decoded video sequence [92, 99].
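
A minimal sketch of frame differencing (frames as 8-bit numpy arrays; the helper names are illustrative, not taken from any cited implementation):

import numpy as np

def residual(current, previous):
    # Frame differencing: subtract the previous (decoded) frame from the
    # current frame; the result is mostly zero wherever the scene is static.
    return current.astype(np.int16) - previous.astype(np.int16)

def reconstruct(residual_frame, previous):
    # The decoder adds the residual back to its own previous reconstruction;
    # any mismatch between the encoder's and decoder's references accumulates
    # from frame to frame, which is the drift problem described above.
    return np.clip(previous.astype(np.int16) + residual_frame, 0, 255).astype(np.uint8)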

3.3.2 Motion-compensated Prediction

When the previous frame and the current frame are not really similar, or when there is a big change between the frames, the compression may not be significant. This is due to movement in the video scene. For this type of frame, another prediction method, called motion-compensated prediction, is used, in which a better prediction is achieved by estimating the movement and compensating for it. Motion compensation is similar to frame differencing with two extra steps (sketched below):

1. Motion estimation: comparing a region in the current frame with the neighboring regions of the previous decoded frame and finding the best match.
2. Motion compensation: subtracting the matching region from the current region.

The encoder sends the location of the best match to the decoder, which performs the same motion-compensation operation while decoding the current frame. The residual frame in motion compensation contains less data compared with frame differencing (higher compression); however, motion compensation is computationally very intensive.
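
A sketch of the two steps using an exhaustive (full-search) block matching over a small search window; the block size, search range and SAD criterion are illustrative choices, not prescribed by the text:

import numpy as np

def motion_estimate(current, reference, by, bx, block=16, search=7):
    # Motion estimation: compare the current block with neighboring positions
    # in the previous decoded frame and keep the best (minimum-SAD) match.
    cur = current[by:by + block, bx:bx + block].astype(np.int32)
    best, best_vec = None, (0, 0)
    h, w = reference.shape
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + block > h or x + block > w:
                continue
            cand = reference[y:y + block, x:x + block].astype(np.int32)
            sad = np.abs(cur - cand).sum()
            if best is None or sad < best:
                best, best_vec = sad, (dy, dx)
    return best_vec

def motion_compensate(current, reference, by, bx, vec, block=16):
    # Motion compensation: subtract the matched region from the current block;
    # the encoder sends vec (the location of the best match) to the decoder.
    dy, dx = vec
    ref_block = reference[by + dy:by + dy + block, bx + dx:bx + dx + block]
    return current[by:by + block, bx:bx + block].astype(np.int16) - ref_block.astype(np.int16)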

3.4 Fractal Video Compression

For fractal video compression, there are two extensions of still-image compression [83]: frame-based compression and cube-based compression.

In frame-based compression, video clips and motion pictures are naturally divided into segments according to scene changes. Each segment begins with an initial frame, called an intra-coded frame or I-frame. Each subsequent frame can then be coded mainly using motion codes by referencing its preceding frame; such a frame, predicted from its predecessor, is called a P-frame. The I-frames and P-frames are also called coarse frames, and the frames added between any two of the I-frames and P-frames are called bidirectional frames or B-frames (Figure 3.5). Each B-frame is coded using the prediction from both coarse frames immediately before and after it.

In a 2-dimensional fractal video compression system, the I-frames are compressed using an image compression technique. The I-frames are coded by 2-to-1 local and global self-referencing fractal codes. While decoding such an I-frame, a hidden frame is created in each iteration. For example, if an I-frame F is created in ten iterations from some initial image F_0 with self-referencing fractal codes, ten consecutive P-frames can be set up that have the same set of codes as the I-frame, but are referenced to the preceding frame instead of to the frame itself. Then, starting from the same initial image F_0, the tenth P-frame is clearly the same I-frame obtained in the first procedure. As a result, a fractally represented I-frame can be replaced by a sequence of P-frames if a time delay is allowed.

Figure 3.5 Video clip frames

In cube-based compression, image sequences are partitioned into groups of frames, and every group of frames is partitioned into non-overlapping cubes of ranges and domains. The compression codes are computed and stored for every cube. Every group of frames is called a GOF. Each GOF can be compressed and decompressed separately as an entity. Taking the temporal axis along the sequence, every GOF can be considered as a large cuboid. In fractal compression, each GOF is partitioned into non-overlapping small cuboids. Each such cuboid is called a range cuboid and denoted R. The sizes of the edges of R may differ; in particular, the edge in the temporal direction may differ from those in the horizontal and vertical directions. In order to obtain the approximate transformation of R, another, overlapping partition is necessary, whose small parts are called domain cuboids.

The horizontal and the vertical edges of the domain cuboids are twice as large as those of the range cuboids, but the temporal edge of a domain cuboid is the same as that of the range cuboids. The cuboid algorithm of fractal video compression is given below:

i. Partition the motion image sequence into a series of GOFs. For each GOF, the following steps are performed.
ii. Partition the GOF into range cuboids and domain cuboids. The horizontal and the vertical edges of the domain cuboids are twice as large as those of the range cuboids; the temporal edge of the domain cuboids is the same size as that of the range cuboids.
iii. For each range cuboid R, the following steps are performed:
   a) All domain cuboids are brought to the same size as R in the three directions.
   b) Compute the scale factor α and the offset factor β of D and the rms error between R and αD + βI (a least-squares sketch of this step is given below).
   c) Choose the optimal approximation R ≈ αD + βI that has the minimal rms error.
   d) Store α, β and the location of D of the optimal approximation as the compression code of R.

The frame-based compression can obtain a high compression ratio, but the compression of the current frame depends on the previously decompressed image, so there is a delay between frames during decompression and errors may propagate between frames. The cube-based method can obtain decompressed images of high quality. However, if transmission errors are considered, the adaptive partition is not suitable because the partition information may be lost during transmission.
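
Step b) of the cuboid algorithm amounts to a least-squares fit of the range cuboid R against the domain cuboid D (assumed here to have already been reduced to the same size as R, as in step a)); a minimal numpy sketch, with an invented function name:

import numpy as np

def fit_cuboid(R, D):
    # Least-squares scale and offset for the approximation R ~ alpha * D + beta,
    # together with the rms error of that approximation.
    d, r = D.ravel().astype(np.float64), R.ravel().astype(np.float64)
    var = np.var(d)
    alpha = 0.0 if var == 0 else np.cov(d, r, bias=True)[0, 1] / var
    beta = r.mean() - alpha * d.mean()
    rms = np.sqrt(np.mean((r - (alpha * d + beta)) ** 2))
    return alpha, beta, rms

# For every range cuboid, the encoder keeps the (alpha, beta, domain position)
# of the domain cuboid giving the minimal rms error, as in steps c) and d).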

If a video sequence S is composed of n frames,

    S = ⋃_{i=1..n} μ_i    (3.9)

an encoded stream can be obtained by simply searching for appropriate transformations τ_i on every single frame, so that the output video is

    S' = ⋃_{i=1..n} τ_i(μ_0) ≈ S    (3.10)

μ_0 being an arbitrary initial frame. In fact, most of the frames are correlated with each other along time. To exploit this redundancy, another technique can be used to encode a video fractally: for every frame i+1, the domain blocks are searched using frame i as the search pool. The video sequence is first analysed to find which frames are highly correlated with each other. An MSE ratio is used to divide the sequence into groups of pictures (GOP), and the first frame of each GOP is then used as the domain pool for the rest of the GOP frames (a sketch of this segmentation step is given below). The GOP being composed of p frames, a subsequence is encoded as

    S'_m = ⋃_{i=1..p} τ_i(μ_0),   with τ_i : D_1 → R_i    (3.11)

and the complete sequence is

    S' = ⋃_{m=1..t} S'_m    (3.12)

Every frame in the packet is partitioned into range blocks, and a transformation of a domain block taken from the first frame of the packet is chosen as the best approximation of every range block of the current frame. At decoding time, the process is inverted: starting from a blank frame, all the other frames of a packet are reconstructed using the transformation set obtained during the encoding stage.
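
A sketch of the GOP segmentation step; the text prescribes an MSE-based criterion but not its exact form, so a plain MSE against the first frame of the current GOP, with an arbitrary threshold, is used here as a stand-in:

import numpy as np

def split_into_gops(frames, threshold=200.0):
    # Start a new GOP whenever the MSE between the current frame and the first
    # frame of the current GOP exceeds the threshold; the first frame of each
    # GOP then serves as the domain pool for the remaining frames of that GOP.
    gops, current = [], [frames[0]]
    for f in frames[1:]:
        mse = np.mean((f.astype(np.float64) - current[0].astype(np.float64)) ** 2)
        if mse > threshold:
            gops.append(current)
            current = [f]
        else:
            current.append(f)
    gops.append(current)
    return gops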

The direct extension of the two-dimensional fractal encoding is to consider the sequence of frames, i.e. the overall sequence, as a three-dimensional object whose third axis represents time. In fractal video coding using this three-dimensional extension, range and domain blocks become three-dimensional objects: range and domain cubes. The process is straightforward: the video sequence is partitioned into range and domain cubes and, for every range cube, a transformed domain cube is searched for that minimizes the error measure and is the best approximation of it.

3.5 Experiments and Results

The proposed method has been applied to four video sequences: "Foreman" (144x144, 300 frames, 30 frames per second (fps)), "Mobile" (144x144, 300 frames, 30 fps), "Mother_daughter" (144x144, 300 frames, 30 fps) and "Suzie" (144x144, 150 frames, 30 fps). The test sequences are given as input to the video encoder, and parameters such as the CR and the PSNR value are calculated for each sequence. The CR is calculated as

    CR = (original_file_size - compressed_file_size) / original_file_size    (3.13)

The video coding system uses the PSNR value to assess the video quality, given by

    PSNR = 10 log10 (255² / MSE) dB    (3.14)
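
A direct transcription of equations (3.13) and (3.14) in Python (frames as 8-bit arrays; CR, as defined here, is the fraction of the original file size that is saved, which matches the values below 1 reported in the tables that follow):

import numpy as np

def compression_ratio(original_size, compressed_size):
    # Equation (3.13): relative reduction of the file size.
    return (original_size - compressed_size) / original_size

def psnr(original, reconstructed):
    # Equation (3.14), for 8-bit frames.
    mse = np.mean((original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)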

DCT-based transformation (used in the existing method) and DWT-based transformation (Haar filter), as well as the fractal transform, have been applied to the YUV sequences Foreman, Mobile, Mother_daughter and Suzie. The performance parameters measured include the PSNR values in each case, along with the compression ratio, bits per frame and encoding time. The PSNR value, compression ratio, encoding time and average bits per frame of the DCT, DWT and fractal transforms for the "Foreman" and "Mobile" video sequences are compared in Table 3.1. The results show that the proposed technique is 1.13 to 1.195 times faster than the existing algorithm. As far as the luminance PSNR value is concerned, the proposed technique achieves a 2.07 to 2.5 dB improvement. Comparable performance to the existing technique is also obtained for the CR.

Table 3.1 Performance comparison of DCT-based, DWT-based and fractal-based transformations for the "Foreman" and "Mobile" video sequences

                          Foreman                              Mobile
Parameters                DCT        DWT        Fractal        DCT        DWT        Fractal
PSNR Y (dB)               33.68      35.75      35.78          32.05      34.48      34.56
PSNR U (dB)               33.80      36.19      36.03          31.89      34.47      34.55
PSNR V (dB)               33.87      36.00      36.07          31.92      34.48      34.56
Encoding time (sec)       1676.104   1451.217   1380.325       1959.355   1639.257   1718.904
Compression ratio         0.8768     0.8768     0.8768         0.7712     0.7722     0.7722
Average bits/frame        11245624   11176528   11166248       20766232   20704344   20706376

Table 3.2 Performance comparison of DCT-based, DWT-based and fractal-based transformations for the "Mother_daughter" and "Suzie" video sequences

                          Mother_daughter                      Suzie
Parameters                DCT        DWT        Fractal        DCT        DWT        Fractal
PSNR Y (dB)               34.48      36.76      36.79          34.37      36.30      36.34
PSNR U (dB)               34.13      36.19      36.23          34.44      36.49      36.54
PSNR V (dB)               34.18      36.23      36.27          34.45      36.47      36.52
Encoding time (sec)       1554.032   1183.067   1182.283       651.373    581.145    593.079
Compression ratio         0.9135     0.9135     0.9142         0.9237     0.9246     0.9244
Average bits/frame        7851616    7766600    7766552        3475376    3440328    3446912

The comparison of the PSNR value, CR, encoding time and average bits per frame of the DCT, DWT and fractal transforms for the "Mother_daughter" and "Suzie" video sequences is shown in Table 3.2. The results show that this technique is 1.098 to 1.314 times faster than the existing algorithm. Comparable compression-ratio performance to the existing technique is also obtained, and moderate improvements are obtained for the chrominance PSNR values. The proposed technique achieves a 1.098 dB to 1.314 dB improvement in the luminance PSNR value, as shown in Table 3.2, and considerable improvements are also obtained for the chrominance PSNR values.

The performance comparison of the average bits required for motion information is given in Table 3.3. The performance comparison of the average MSE per pixel required for motion information is presented in Table 3.4. The average number of search points per block for the Car-phone video sequence is shown in Figure 3.6. The performance comparisons of DCT-based, DWT-based and fractal-based transformations for the Foreman and Mobile video sequences are shown in Tables 3.5 and 3.6.

Table 3.3 Performance comparison of average bits required for motion information

Video sequence        Full search    Fast search    Proposed algorithm
Foreman               1783.59        1654.1         1642.86
Container             371.67         357.69         352.96
Claire                478.29         459.14         434.46
Car-phone             1749.2         1613.77        1591.02
Miss America          469.97         459.48         450.76
Mobile                2725.27        2645.25        1642.86
Mother-daughter       755.67         722.13         721.01
News                  917.43         848.69         827.76
Salesman              671.48         633.08         634.46
Silent                1009.31        946.76         931.47
Suzie                 1115.64        1059.01        1039.62

The performance evaluations show that the improved transformation technique is 1.13 to 1.195 times faster than the existing algorithm. As far as the luminance PSNR value is concerned, the proposed technique achieves a 2.07 to 2.51 dB improvement. An improved CR is also obtained compared with the existing technique.

Table 3.4 Performance comparison of average MSE per pixel required for motion information

Video sequence        Full search    Fast search    Proposed algorithm
Foreman               2.5601         2.5621         2.5622
Container             2.5602         2.5641         2.5631
Claire                2.5610         2.5651         2.5633
Car-phone             2.5612         2.5633         2.5622
Miss America          2.5602         2.5606         2.5612
Mobile                2.5621         2.5611         2.5613
Mother-daughter       2.5611         2.5612         2.5611
News                  2.5605         2.5651         2.5623
Salesman              2.5604         2.5621         2.5618
Silent                2.5610         2.5677         2.5624
Suzie                 2.5602         2.5612         2.5633

Figure 3.6 Average number of search points per block (proposed vs. existing algorithm) for (a) the Car-phone video sequence and (b) the Container video sequence

Table 3.5 Performance comparison of DCT-based, DWT-based and fractal-based transformations for the Foreman video sequence

Parameters                DCT         DWT         Fractal
PSNR Y (dB)               33.68       35.75       35.78
PSNR U (dB)               33.80       36.19       36.03
PSNR V (dB)               33.87       36.00       36.07
Encoding time (sec)       1676.104    1451.217    1380.325
Compression ratio         0.8768      0.8768      0.8768
Average bits/frame        11245624    11176528    11166248

Table 3.6 Performance comparison of DCT-based, DWT-based and fractal-based transformations for the Mobile video sequence

Parameters                DCT         DWT         Fractal
PSNR Y (dB)               32.05       34.48       34.56
PSNR U (dB)               31.89       34.47       34.55
PSNR V (dB)               31.92       34.48       34.56
Encoding time (sec)       1959.355    1639.257    1718.904
Compression ratio         0.7712      0.7722      0.7722
Average bits/frame        20766232    20704344    20706376

Table 3.7 Performance comparison of DCT-based, DWT-based and fractal-based transformations for the Mother-daughter video sequence

Parameters                DCT         DWT         Fractal
PSNR Y (dB)               34.48       36.76       36.79
PSNR U (dB)               34.13       36.19       36.23
PSNR V (dB)               34.18       36.23       36.27
Encoding time (sec)       1554.032    1183.067    1182.283
Compression ratio         0.9135      0.9135      0.9142
Average bits/frame        7851616     7766600     7766552

Table 3.8 Performance comparison of DCT-based, DWT-based and fractal-based transformations for the Suzie video sequence

Parameters                DCT         DWT         Fractal
PSNR Y (dB)               34.37       36.30       36.34
PSNR U (dB)               34.44       36.49       36.54
PSNR V (dB)               34.45       36.47       36.52
Encoding time (sec)       651.373     581.145     593.079
Compression ratio         0.9237      0.9246      0.9244
Average bits/frame        3475376     3440328     3446912

3.6 Conclusion

The results presented here are comparable to, and in several respects better than, those of other coding schemes. Most current video coding schemes use motion compensation. Here, the fractal techniques used in image coding are applied to video; in video coding, however, advantage is taken of the similarity between the frames and, because of that, higher compression rates are obtained compared with image compression. Experimental results show that this scheme provides superior performance in terms of PSNR as well as subjective quality at low bit-rates. Moreover, the blocking artifacts are reduced significantly and the visual quality of the reconstructed frames is better compared with the other compression methods.