FRACTAL COMPRESSION USAGE FOR I FRAMES IN MPEG4

Angel Radanov Kanchev

FKTT, Technical University of Sofia, Todor Alexandrov bvd 14, 1303 Sofia, Bulgaria, phone: +3592 9306413, e-mail: angel_kanchev@mail.bg

Keywords: MPEG-4, video compression, fractal compression, bitstream, Iterated Function Systems

Abstract

The MPEG-4 standard [1] can be extended with new algorithms and supports the addition of custom data to its bit stream. However, compatibility problems can appear between new extended encoders and existing decoders. The problem can be even worse: an extended decoder could treat data intended for another extended decoder as its own. This article examines the use of fractal compression in a way that achieves maximum compatibility with standard decoders and with other extended decoders. For simplicity, fractal compression [2] is applied to I key frames (I-VOP). The standard approach is to define a new visual object type ("Fractal") and add it to the bit stream. The corresponding I frame can then be skipped or marked as not coded. Non-standard solutions also exist, and the final solution in this article is non-standard.

1. INTRODUCTION

Fractal compression as defined in [2] (using local Iterated Function Systems) can be applied to any rectangular image. For simplicity we use it on I frames, because they can be treated as independent images in the video frame sequence. In MPEG-4 the 2D representation of a video object at a particular moment of time is called a Video Object Plane (VOP). The VOP is the analogue of the frame in MPEG-1 and MPEG-2 (but more generalized). The problem is that the VOP does not support data extensions, i.e. it is impossible to define a new VOP type (F-VOP) that contains compressed fractal data. The
place to put the compressed data therefore turns out to be a serious problem for the use of fractal compression.

Section 2 investigates the bit stream format in order to find all possible ways of adding new data. Section 3 chooses the best solution for adding data (best with respect to compatibility). The results of the experiments are given in section 4 and the conclusions in section 5.

2. INVESTIGATION

The MPEG bit stream is an object hierarchy represented as a linear sequence of bits. An object is represented by the universal start code followed by a unique object code, a header and sub-objects (if there are any). The standard allows an arbitrary object to be placed wherever a universal start code may appear. In this way the hierarchy can be broken in any way, but the decoder must not block, crash or otherwise fail.

According to [1] the most general object of the Visual part of MPEG-4 (i.e. the root of each visual hierarchy) is the Visual Object Sequence (VOS). As shown in Fig.1, the VOS consists of these fields: Profile and Level, User Data and a sequence of sub-objects called Visual Objects (VO). In all figures {} denotes a sequence and [] denotes an optional field.

VOS: Profile & Level | User Data | {VO}
Fig.1 Visual Object Sequence data

User Data is a special type of object that contains consecutive bytes (which must not emulate the universal start code). It is a place where arbitrary data can be put, so it can be used as a container for fractal data.

As shown in Fig.2, a Visual Object consists of VO Type, Video Signal Type (only if the object is of Video type), User Data and Object Data.

VO: VO Type | [Video Signal Type] | User Data | Object Data
Fig.2 Visual Object data

Object Data depends on the Visual Object Type. VO Type can be: Video, Still Texture, Mesh, FBA (Face and Body Animation), 3D Mesh, or an additionally defined type. We can use this to define a Fractal VO Type whose Object Data will contain
the fractal bit stream. For VO Type Video there is a field Video Signal Type that can be "PAL", "NTSC", "SECAM", "MAC" or Unspecified. It could be modified to a fractal Video Signal Type and used with the corresponding Object Data.

The Object Data for VO Type Video is called the Video Object Layer (VOL), see Fig.3. Its first bit shows whether the VOL has a short header. A short header means that there is no more VOL data and that a VOP with short header follows. A VOL with normal header can be a Fine Granularity Scalability (FGS) layer or a Base Video Object Layer. We are interested in a VOL with normal header that contains Base layer information.

VOL
  Header: Short | Normal
  Layer: FGS | Base
Fig.3 Possible Video Object Layer types

Fig.4 shows the content of the Base VOL. It contains User Data and Video Object Planes (VOP) as sub-objects.

Base VOL: Shape Data | User Data | {VOP}
Fig.4 Base layer

Fig.5 shows the data in one VOP. Of interest is the field Coding Type, which can take the values "I", "P", "B" and "S". We could use it to define a new F-VOP with fractal data in it. Notice that there is no User Data in the VOP.

VOP: Coding Type | VOP Coded | Shape Info | Sprite Info | Quant Params | Macroblock data
Fig.5 Video Object Plane

In our case (Visual Object of type Video, Video Object Layer with normal header and Base layer) the object hierarchy is as shown in Fig.6.
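
The start-code structure described at the beginning of this section (universal start code, then a unique object code, with unknown objects tolerated by the decoder) can be illustrated with a minimal scanner. This is only a sketch under stated assumptions: the object code values are assumed from ISO/IEC 14496-2 [1] and should be checked against it, the function names are hypothetical, and any code not in the small table is simply reported as "unknown".

```python
from typing import Iterator, List, Tuple

# Universal start code: 23 zero bits followed by a single one bit.
START_CODE_PREFIX = b"\x00\x00\x01"

# Object codes ASSUMED from ISO/IEC 14496-2; verify against the standard.
NAMED_CODES = {
    0xB0: "visual_object_sequence",
    0xB2: "user_data",
    0xB3: "group_of_vop",
    0xB5: "visual_object",
    0xB6: "vop",
}


def classify(code: int) -> str:
    """Map the byte that follows the start code prefix to an object name."""
    if code <= 0x1F:
        return "video_object"
    if 0x20 <= code <= 0x2F:
        return "video_object_layer"
    return NAMED_CODES.get(code, "unknown")


def scan_objects(stream: bytes) -> Iterator[Tuple[str, int, bytes]]:
    """Yield (name, offset, body) for every start code found in the stream."""
    pos = stream.find(START_CODE_PREFIX)
    while pos != -1 and pos + 3 < len(stream):
        code = stream[pos + 3]
        # The body runs until the next start code prefix (or end of stream).
        nxt = stream.find(START_CODE_PREFIX, pos + 4)
        end = nxt if nxt != -1 else len(stream)
        yield classify(code), pos, stream[pos + 4:end]
        pos = nxt


def known_objects(stream: bytes) -> List[Tuple[str, int, bytes]]:
    """Skip objects with unknown codes, as a tolerant decoder would."""
    return [obj for obj in scan_objects(stream) if obj[0] != "unknown"]
```

Because every object is located purely by its start code, a decoder built this way can step over an object it does not recognize and continue with the next one, which is the behaviour the compatibility discussion in section 3 relies on.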

VOS -> VO -> VOL -> {VOP}
Fig.6 The object hierarchy of Simple Profiles

In the implementation that we use, the organization is one VOS for each I key frame (as in Fig.6, the VOP sequence runs until the next I-VOP). This means that each I-VOP is the first VOP in a new hierarchy as in Fig.6, and the number of VOS is equal to the number of I frames. Knowing the organization of the bit stream, we can choose the best way to add data to it.

3. CHOOSING THE BEST SOLUTION

3.1. Constraints

Before looking for possible solutions, let us review the constraints on changing fields:
1. Coding Type of the VOP: it is 2 bits wide and all possible values are already used.
2. Video Signal Type: it affects only the interpretation of the data in an ordinary VOP, so we should not modify it.

3.2. Solutions, advantages and disadvantages

Let us review all possible ways of adding data to the bit stream:

1. Define a new visual object type "Fractal" and use it instead of the Video VO.
This is the way defined in the standard. The problem is that two new extended encoders could use the same value for their VO types (VO Type is only a 4-bit field). An extended decoder would then try to interpret data created by the wrong encoder. Also, for backward compatibility it is not a good idea to add new unknown objects (such as a Fractal Visual Object).

2. Leave the hierarchy unaffected and use User Data instead.
Any of the following solutions guarantees no conflict with current encoders and possible future ones.

2.1. In Visual Object Sequence
One VOS may contain many I-VOPs. In order to find out how many fractal frames to put in this User Data we would have to do additional buffering. Nevertheless it can be used, because our encoder emits one VOS for each I-VOP.

2.2. In Visual Object
As with the VOS, one VO could have many I-VOPs. The VO is at a lower level in the hierarchy, so this is less likely than for the VOS.

2.3. In Video Object Layer
The Video Object Layer supports User Data only when a normal header and the Base layer are used.

2.4. In Group of Video Object Planes (GVOP)
It is possible to group VOPs in the Video Object Layer using a Group of VOPs. The GVOP object supports User Data. The problem is that the encoder itself could use GVOP, and this object requires additional fields to be filled in that we do not need.

2.5. In a not-coded VOP
Each VOP has a bit flag indicating whether it is coded or not. The problem is that divx5 uses not-coded frames as markers and for statistics.

3.3. The best solution

Solutions 2.1 and 2.2 appear to cause the fewest problems, so they are the best with respect to compatibility. Because solution 2.2 is closer to the VOP level of the hierarchy, it is the one used.

When using User Data it is important to avoid emulating the universal start code (binary: 23 zero bits followed by a single one bit, or hex: 0x000001). For this reason one marker bit ("1") is added after every 22 bits of the fractal encoder's output. To distinguish our User Data from the User Data of other encoders, we put four characters with the combined value "Frac" at the beginning. This is a typical technique for distinguishing data in the MPEG standard.

4. RESULTS

The codec extended with fractal compression and decompression is the open source Advanced Simple Profile codec XviD. Fig.7 shows the data flow in the encoder. If the current frame is of I type, XviD generates a Visual Object Sequence with only one Visual Object. This Visual Object has a Video Object Layer whose first Video Object Plane is an I-VOP. If the fractal algorithm is activated, the fractal data is added as User Data in the same Visual Object and the I-VOP is skipped.

[Diagram blocks: XviD, Decision, I Frame, P/B Frame]
Fig.7 Data flow in the encoder

The decoder parses this User Data and decodes it if it exists. For an ordinary decoder the situation is just as if the I-VOP were missing (see Fig.8). The frame shown comes after a fractal coded I-VOP, so P and B frames should update it.
Because this decoder sees no I frame, the macroblocks (in the lower left corner) update the picture that preceded the missing I-VOP. Fig.9 shows the same frame as Fig.8, but with the film decoded by the extended decoder. Now the fractal I frame is decoded, so the following macroblocks (in the lower left corner) update a valid picture. The fractal codec used works only with gray scale images, so for a color film the color effect seen in Fig.9 appears. For a comparison of a JPEG I frame and a fractal I frame see Fig.10 and Fig.11. Fig.11 shows that fractal compression has lost some details.
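
The User Data packing from section 3.3, on which the extended encoder and decoder rely, can be sketched as follows: a "Frac" tag followed by the fractal payload with one marker bit after every 22 payload bits. This is an illustrative sketch, not the code used in the experiments. All function names are hypothetical, and several details are assumptions that the text does not specify: a marker is also written after a final partial group, the last byte is padded with one bits, and the payload length is assumed to be known to the decoder and is passed in explicitly.

```python
from typing import Optional

START_CODE_PREFIX = b"\x00\x00\x01"  # universal start code (hex 0x000001)
TAG = b"Frac"                        # four-character identifier (section 3.3)
GROUP = 22                           # payload bits between marker bits (section 3.3)


def _to_bits(data: bytes) -> str:
    """Bytes to a string of '0'/'1' characters, most significant bit first."""
    return "".join(f"{byte:08b}" for byte in data)


def _to_bytes(bits: str) -> bytes:
    """Inverse of _to_bits for bit strings that are a multiple of 8 long."""
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def stuff(payload: bytes) -> bytes:
    """Insert a '1' marker after every group of up to 22 payload bits."""
    bits = _to_bits(payload)
    stuffed = "".join(bits[i:i + GROUP] + "1" for i in range(0, len(bits), GROUP))
    # ASSUMPTION: pad the last byte with one bits so no zero run is extended.
    stuffed += "1" * (-len(stuffed) % 8)
    return _to_bytes(stuffed)


def unstuff(data: bytes, payload_len: int) -> bytes:
    """Drop every 23rd bit and keep the first payload_len bytes."""
    bits = _to_bits(data)
    kept = "".join(bits[i:i + GROUP] for i in range(0, len(bits), GROUP + 1))
    return _to_bytes(kept[:payload_len * 8])


def build_user_data(payload: bytes) -> bytes:
    """Body of the User Data object: tag followed by the stuffed payload."""
    return TAG + stuff(payload)


def parse_user_data(body: bytes, payload_len: int) -> Optional[bytes]:
    """Return the fractal payload, or None if the User Data is not ours."""
    if not body.startswith(TAG):
        return None
    return unstuff(body[len(TAG):], payload_len)
```

Since a marker bit follows at most 22 payload bits, the body can never contain 23 consecutive zero bits, so the byte pattern 0x000001 cannot occur inside it even when the payload is all zeros. A decoder that does not find the "Frac" tag treats the User Data as belonging to another encoder and ignores it.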

Fig.8   Fig.9   Fig.10   Fig.11

Table 1 shows the sizes of the compressed demo movies. Note that the file with fractal compression is not always smaller; this is because fractal compression is very sensitive to the statistics of the image.

TABLE 1 (COMPARISON TABLE)
File name               Colored   Fractal compression   File Size, KB
COL_AA_Fractal.avi      Y         Y                     3 964
COL_AA_NoFractal.avi    Y         N                     3 958
COL_NS_Fractal.avi      Y         Y                     4 170
COL_NS_NoFractal.avi    Y         N                     4 254
GS_AA_Fractal.avi       N         Y                     3 924
GS_AA_NoFractal.avi     N         N                     3 918
GS_NS_Fractal.avi       N         Y                     4 090
GS_NS_NoFractal.avi     N         N                     4 160

5. CONCLUSIONS

The most compatible solution for using fractal compression has been found. The fractal codec used still needs color support and optimization. In addition, a decision procedure is needed to determine whether a frame will compress well with the fractal algorithm.

6. REFERENCES

[1] MPEG-4, ISO/IEC 14496, Coding of audio-visual objects (Part 2: Visual).
[2] M. Barnsley and L. Hurd, Fractal Image Compression, AK Peters Ltd, Wellesley, Massachusetts, 1993.