
Part 1 of 4 | March 2004 | www.securitysales.com

Essentials of Digital Video Compression

By Bob Wimmer, Video Security Consultants, cctvbob@aol.com

AT A GLANCE
Compression is the art of removing information the viewer does not need.
Compression is necessary to maximize the recording, storing and transmitting of digital images.
Some methods eliminate irrelevant portions of images, while others nix redundant parts.
JPEG, MPEG and the H.26* series are the most prevalent compression standards.
The quality, speed and storage required by an application dictate compression choices.

Effective recording, retrieving, storing and transmitting are critical to exploiting the full benefits and capabilities of digital video. The first place to start is with a solid knowledge of the methods and standards of video compression.

Welcome to the first of a four-part series designed to educate readers on the fast-moving and ever-changing world of digital video as found in the security industry. Part 1, brought to you by Honeywell Security, explains the need for video compression and the many different compression methods and standards used throughout most of the security industry.

Compression Facilitates Video Transmission

In its basic form, compression is the art of removing information viewed as irrelevant to the viewer. In this case, the viewer is a dealer, systems integrator or anyone else who relies on high-quality recorded images. The amount and type of information removed varies from system to system and can be controlled by system setup parameters.

But why do we need compression? To help answer this question, let's evaluate the requirements for storing or transmitting a single minute of composite video to a remote location. Without compression, storing this information would require a minimum of 1.66GB of space. In the case of remote viewing via a 56Kbps modem, it would take more than 2 1/2 days to transmit from point A to point B.
This is completely unacceptable in today's surveillance community, which depends on permanent storage of security video as well as the ability to view this information remotely.
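The arithmetic behind those figures is easy to reproduce. A quick sketch, using the article's 1.66GB-per-minute figure and the stated 56Kbps modem rate:

```python
# Back-of-envelope check of the storage and transmission figures above.
bytes_per_minute = 1.66e9      # one uncompressed minute of video (article's figure)
modem_bits_per_sec = 56_000    # 56Kbps dial-up modem

transmit_seconds = bytes_per_minute * 8 / modem_bits_per_sec
transmit_days = transmit_seconds / 86_400

print(f"Transmitting one uncompressed minute: {transmit_days:.1f} days")  # roughly 2.7 days
```

At 2.7 days per minute of video, the "more than 2 1/2 days" claim checks out.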

There has always been a trade-off between video quality and file size. If you want the best image quality, you have to deal with an enormous file size. Decreasing the file size by 50 percent, for example, costs some image quality but creates a smaller file that is more conducive to recording or transmitting video signals. By no means is the process behind compression easy; a tremendous amount of mathematical complexity underlies the different compression methods. A quick description of the basic parts will help clarify some of the theory and explanations discussed later in this article.

Lossless Bests Lossy, But Compresses Less

Signal analysis separates the video signal into many parts, or subparts, classified by their importance to the image's visual quality. Following this analysis, the next stage is the quantizer. Quantization is simply the process of decreasing the number of bits needed to store a set of values, or transformed coefficients as they are called in data compression language. Since quantization is a many-to-one mapping that reduces the precision of those values, it is known as a lossy process (as opposed to lossless) and is the main source of compression in most image coding schemes. There is a trade-off between image quality and degree of quantization: a large quantization step size can produce unacceptably large image distortion.

Lossy compression actually eliminates some of the data in the image and, therefore, provides greater compression ratios than lossless compression. Thus, the trade-off is file size vs. image quality.

[Block diagram of JPEG compression: 8 X 8 pixel block, video analyzer, quantizer, binary encoder, output data stream.]

Lossless compression, on the other hand, consists of those techniques guaranteed to generate an exact duplicate of the input data stream after a compress/expand cycle. No information is lost, hence the name lossless. However, this method can only achieve a modest amount of compression. The lossless compression of images is important in fields such as medical imaging and remote sensing, where data integrity is essential. Compression ratios for lossless codes, including variable-length encoding, are typically listed as an average of 4:1.

[Lossy vs. lossless: a lossy reconstruction contains degradation relative to the original because redundant information is discarded, but it achieves much higher compression; a lossless reconstruction is numerically identical to the original but achieves only modest compression. Source: ACM]

Each image is assigned a numeric code in which common events or information are assigned only a few bits, while rare or uncommon events are assigned more bits. The steps that create this output data stream are divided into signal analysis, quantization and variable-length encoding.

In variable-length encoding, prior to the writing of the image, the information is aligned according to frequency, which plays an important role in the image compression process. For the most part, lower frequencies, which occur more often, are placed at the front, while higher frequencies are placed at the end. In any file, certain characters are used more than others. In general, we can attain significant savings by using variable-length prefix codes that take advantage of the relative frequencies of the symbols in the messages to be encoded.
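The quantize-then-variable-length-encode pipeline described above can be sketched in a few lines. The coefficient values and step size below are invented for illustration and are not from any standard:

```python
from collections import Counter
import heapq

def quantize(coeffs, step):
    # Many-to-one mapping: nearby values collapse to the same integer.
    # This is the lossy step and the main source of compression.
    return [round(c / step) for c in coeffs]

def huffman_codes(symbols):
    # Variable-length prefix codes: frequent symbols get fewer bits.
    heap = [[freq, i, {sym: ""}] for i, (sym, freq) in enumerate(Counter(symbols).items())]
    heapq.heapify(heap)
    count = len(heap)
    if count == 1:  # degenerate single-symbol case
        return {next(iter(heap[0][2])): "0"}
    while len(heap) > 1:
        lo, hi = heapq.heappop(heap), heapq.heappop(heap)
        merged = {s: "0" + c for s, c in lo[2].items()}   # 0 branch
        merged.update({s: "1" + c for s, c in hi[2].items()})  # 1 branch
        heapq.heappush(heap, [lo[0] + hi[0], count, merged])
        count += 1
    return heap[0][2]

coeffs = [130.2, 4.1, 3.7, -2.8, 0.4, 0.2, -0.3, 0.1]  # illustrative values
q = quantize(coeffs, step=4)   # -> [33, 1, 1, -1, 0, 0, 0, 0]
codes = huffman_codes(q)
# The most common quantized value (0) gets the shortest code.
assert len(codes[0]) <= min(len(c) for c in codes.values())
```

Note how quantization turns four small coefficients into the same symbol (0), which the prefix coder then represents in the fewest bits, exactly the "common events get few bits" behavior described above.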

Huffman compression is an example of variable-length encoding. Huffman's algorithm compresses files by assigning shorter codes to frequently used characters and longer codes to characters that are used less often. Using binary codes (sequences of zeros and ones that uniquely represent a character), the number of bits required per character depends on how many characters must be represented. One bit can represent two characters; for example, 0 represents the first character and 1 the second. Two bits can represent four characters, and so on.

Spatial Reduction vs. Spectral Reduction

Spatial reduction is the reduction of the correlation between neighboring pixel values. Spectral reduction is the reduction of the correlation between different color planes or bands within an image.

Full Image, Conditional Among Analyzing Methods

There are several methods of analyzing a video image. The first is full image compression. This approach usually relates to Joint Photographic Experts Group (JPEG) and wavelet compression schemes, in which the entire image is analyzed, compressed and stored. In most cases, this form of analysis provides only a limited amount of compression, meaning larger image files.

With conditional compression, only the changes from one image to an adjacent image are analyzed and compressed. This method is usually associated with Moving Picture Experts Group (MPEG) and modified MPEG compression methods.

[Full vs. conditional images: in the full image example, each image is compressed and stored; in the conditional image example, only the changes are compressed and stored.]

Reduction Can Be Accomplished Via Irrelevancy, Redundancy

The major image reduction schemes are irrelevancy reduction and redundancy reduction. Irrelevancy reduction omits parts of the video signal that are not noticed or perceived by the signal receiver, which in this case is the human eye. Research into the human visual system (HVS) has proven that small color changes are perceived less accurately than small changes in brightness, so why bother saving this information? It is also known that low-frequency changes are more noticeable to the human eye than high-frequency changes. (Low frequencies carry the coarser, more noticeable features of a video image, whereas higher frequencies usually relate to its finer details.)

Redundancy reduction is accomplished by removing duplication from the signal source, found either within a single image or between multiple images of a video stream. The first of three redundancy reduction methods is spatial reduction: the reduction of the correlation between neighboring pixel values. As seen in the illustration, the data stream can be reduced to a single value for each of the four quadrants. Although this is a very simple example, it shows one of the basic approaches to redundancy reduction.

The next reduction method is spectral reduction: the reduction of the correlation between color planes or bands within an image. As an example, consider the blue sky in the illustration. Many areas of that sky have the same numeric value. Therefore, the amount

of stored information needed to reproduce that same image can be reduced in the decompression mode of operation.

The last area is known as temporal reduction. This is the correlation between adjacent frames in a sequence, and it is the basis for MPEG as well as the H.26* series of compression methods. In temporal reduction, two types of image arrangements are viewed. The first is a full representation of the viewed image. This is known as the I-frame and is encoded as a single image, with no reference to any past or future images. In some circles, it is also referred to as the key frame. The concept behind the temporal method is that if there is no movement, why bother saving the information? Conversely, any movement will be detected and the compression process will begin.

[History of video standards (source: LSI Logic Corp., 2003): ITU-T standards H.261, H.263, H.263+, H.263++; MPEG standards MPEG-1, MPEG-2, MPEG-4; joint ITU-T/MPEG standards MPEG-2 and H.264/MPEG-4 AVC, spanning 1988 through 2004. Video compression standards have been around for almost two decades; H.264/MPEG-4 AVC is the most recent development.]

Compression Methods Fool the Human Eye

There are four methods of compression: discrete cosine transform (DCT), vector quantization (VQ), fractal compression and discrete wavelet transform (DWT).

DCT is a lossy compression algorithm that samples the image at regular intervals. It analyzes the frequency components and discards those that do not affect the image as perceived by the human eye. JPEG, MPEG, H.261, H.263 and H.264 are a few compression standards that incorporate DCT.

VQ is also a lossy compression method, but it looks at an array of values instead of individual values. Vector quantization then generalizes what it sees, compresses redundant information and tries to retain the desired information as close to the original as possible.

Fractal compression is a form of VQ. However, this type of compression locates and compresses self-similar sections of an image using fractal algorithms. (Fractal compression is a generalization of an information-free, object-based compression scheme rather than a quantization matrix. It uses a set that is repetitive in shape, but not in size.)

DWT compresses an image by frequency ranges. It filters the entire image, both high and low frequencies, and repeats this procedure several times. Wavelet compression works on the entire image, which differs from many DCT methods.

Major Standards Include JPEG, MPEG and H.26*

Now that we are familiar with the different compression theories and the ways video information is reduced, we can apply this knowledge to the industry's various compression standards. Since many video compression equipment manufacturers have developed their own standards, we will only look at the major compression standards presently approved. These include JPEG, JPEG2000, MPEG-1, MPEG-2, MPEG-4, H.263, H.264, super motion image compression technology (SMICT) and wavelet.

JPEG Is Capable of 20:1 Image Size Reduction

JPEG is a lossy compression method, meaning the decompressed image isn't quite the same as the one you started with. JPEG is designed to exploit known limitations of the human eye, notably the fact that small color changes are perceived less accurately than small changes in brightness. Thus, it is intended for compressing images that will be viewed by humans. Data compression is achieved by concentrating on the lower spatial frequencies. According to the standard, a modest compression of 20:1 can be achieved with only a small amount of image degradation. However, if you plan to machine-analyze your information, the small errors generated by JPEG may cause problems.

The Joint Photographic Experts Group has approved the next standard for image compression, known as JPEG2000, based on wavelet compression algorithms.
By setting the mother wavelet for image compression and decompression ahead of time as part of the standard, JPEG2000 will be able to provide resolution at a compression of 200:1.

Similar to JPEG, MPEG Has Many Permutations

There are many different areas to the MPEG compression standard. Each area has its own special features, and improvements that add to the existing standard are constantly being incorporated. However, the basics are similar for all versions.
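JPEG and the MPEG family share the same 8 X 8 DCT core described earlier. A naive, unoptimized sketch straight from the textbook DCT-II formula (real codecs use fast factorizations):

```python
import math

N = 8  # JPEG/MPEG block size

def dct2d(block):
    # Naive 2-D DCT-II over an 8x8 block of pixel values.
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

# A flat (constant) block: all the energy lands in the single DC coefficient
# out[0][0], and every other coefficient is zero. This is the redundancy the
# transform exposes: 64 identical pixels become one number worth keeping.
flat = [[128] * N for _ in range(N)]
coeffs = dct2d(flat)
print(round(coeffs[0][0]))   # 1024
print(round(coeffs[3][5]))   # 0
```

Quantization then discards the small high-frequency coefficients, which is where the visually "irrelevant" detail goes.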

[MPEG theory: the relationship between I-, P- and B-frames in motion prediction for a typical group of pictures, e.g. I B B P B B P B B P.]

An I-frame is encoded as a single image, with no reference to any past or future frames. A P-frame is encoded relative to the past reference frame; a reference frame can be a P- or an I-frame. A B-frame is encoded relative to the past reference frame, the future reference frame, or both; the future reference frame is the closest following reference frame (I or P).

MPEG incorporates the same compression method as JPEG (DCT). However, MPEG is based on the group-of-images concept, defined in terms of the I-frame, P-frame and B-frame. The I-frame (intra) provides the starting or access point and offers only a small amount of compression. P-frames (predicted) are coded with reference to a previous picture, which can be either an I-frame or another P-frame. B-frames (bidirectional) are intended to be compressed with a low bit rate, using both previous and future references. B-frames are never used as references themselves. The relationship between the three frame types is described in the MPEG standard; however, the standard does not limit the number of B-frames between two references, or the number of images between two I-frames.

As previously mentioned, there are many different forms of the MPEG standard. The MPEG-1 standard has a resolution of 352 X 240 pixels at 30 images per second and uses progressive scanning. It is designed for bit rates up to 1.5Mbps, with compression ratios listed as 27:1.

The MPEG-2 standard has a resolution of 720 X 480 pixels and supports both progressive and interlaced scanning. (Interlaced scanning is the method used in the CCTV industry to produce images on monitors.) The most significant improvement over MPEG-1 is its ability to efficiently compress interlaced video. It is capable of coding standard-definition television at bit rates from about 3Mbps to 15Mbps, as well as high-definition television.
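The 27:1 ratio quoted for MPEG-1 above follows from simple bit-rate arithmetic. One assumption is ours, not the article's: 16 bits per pixel for the uncompressed source, a common figure for 4:2:2 sampled video.

```python
# Sanity check of MPEG-1's quoted 27:1 compression ratio.
width, height, fps = 352, 240, 30   # MPEG-1 resolution and frame rate (from the article)
bits_per_pixel = 16                 # assumed uncompressed depth (4:2:2), our assumption

raw_bps = width * height * fps * bits_per_pixel   # ~40.5 Mbps uncompressed
target_bps = 1.5e6                                # MPEG-1's design bit rate
print(f"compression ratio ~= {raw_bps / target_bps:.0f}:1")   # ~27:1
```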
Compression ratios for MPEG-2 vary. The ratio depends on the type of signal and the mix of B-, P- and I-frames. On average, it can range from 50:1 to 100:1.

The MPEG-4 standard is used for multimedia and Web compression because it is designed for low bit-rate transmission. MPEG-4 is based on object-based compression: individual objects within a scene are tracked separately and compressed together. This method offers a very efficient compression ratio that is scalable from 20:1 up to 300:1. In today's CCTV industry, more and more manufacturers are turning to MPEG-4 for remote viewing of compressed video images. At press time, two additional standards, MPEG-7 and MPEG-21, were under consideration.

H.26* Series Frequently Deployed for Remote Viewing

This group of compression standards is the product of the telecommunications industry and has also been adopted by the security industry for remote viewing of video information. The H.263 video compression algorithm is designed for low bit-rate communications. The video source-coding algorithm of H.263 is a hybrid of inter-picture prediction, which exploits temporal redundancy, and transform coding of the remaining signal, which reduces spatial redundancy. H.263 can achieve picture quality as high as H.261 with 30 to 50 percent of the bit usage. Because of its low resolutions and low bit rates for transmitting video images, H.263 also outperforms MPEG-1 and MPEG-2 in this role. The compression ratio of H.263 can reach up to 200:1.

Manufacturers have made many advances during the past year in the compression standards offered. H.264 is one of those advancements. In theory, H.264 is based on block transforms and motion-compensated predictive coding. Motion estimation is used to identify and eliminate the temporal redundancies that exist between individual pictures.
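A toy sketch of that temporal-redundancy idea: store a reference frame in full, then only the pixels that change. The 1-D "frames" and threshold are invented for illustration; real codecs predict motion-compensated blocks, not raw pixels.

```python
def frame_delta(reference, current, threshold=4):
    # Keep only pixels that changed meaningfully since the reference frame.
    # Everything else is "no movement, so why bother saving it".
    return [(i, cur) for i, (ref, cur) in enumerate(zip(reference, current))
            if abs(cur - ref) > threshold]

def reconstruct(reference, delta):
    frame = list(reference)
    for i, value in delta:
        frame[i] = value
    return frame

ref = [10, 10, 10, 200, 200, 10, 10, 10]   # invented frame: a bright object
cur = [10, 10, 10, 10, 10, 200, 200, 10]   # the object moved two pixels right
delta = frame_delta(ref, cur)
print(delta)                                # [(3, 10), (4, 10), (5, 200), (6, 200)]
# Exact here because every change exceeds the threshold; only 4 of 8 pixels
# had to be stored.
assert reconstruct(ref, delta) == cur
```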
H.264 leverages today's processing power to provide improved coding techniques, including multiple reference frames and variable block sizes for motion compensation, intraframe prediction, an integer transform, an in-the-loop deblocking filter and improved entropy coding. H.264 is also referred to as MPEG-4 AVC (advanced video coding) or MPEG-4 Part 10. This compression standard introduces smaller block sizes, greater flexibility and greater precision in motion vectors.
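Bit-rate savings like H.264's translate directly into storage savings, which is the arithmetic behind storage comparisons such as the one later in this article. The bit rates below are illustrative round numbers of ours, not figures from the article:

```python
def storage_mb(bitrate_bps, minutes):
    # Storage = bit rate x duration; divide by 8 for bytes, by 1e6 for MB.
    return bitrate_bps * minutes * 60 / 8 / 1e6

# Illustrative: 90 minutes of DVD-quality video at assumed bit rates.
for codec, bps in [("MPEG-2", 4.0e6), ("H.264", 1.8e6)]:
    print(f"{codec}: {storage_mb(bps, 90):,.0f} MB")
```

At these assumed rates, MPEG-2 needs 2,700 MB against roughly 1,200 MB for H.264, i.e., more than twice the storage for the same program length.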

Super Motion Image Technology Stores According to Changes

The super motion image compression technology (SMICT) standard has almost the same characteristics as H.264. Based on redundancy and motion, it combines DSP hardware compression with CPU software compression. Utilizing an intelligent nonlinear super motion codec, SMICT analyzes the motion changes that occur within the frame, eliminates the redundant portion of the image that need not be stored, and compresses the delta, or change, based on motion. This compression method has a ratio of up to 2,400:1.

Wavelet Technique Uses Filtering Schemes

The wavelet compression standard does not use DCT, but instead relies on frequency filtering. The advantage of wavelet compression is that, in contrast to JPEG and MPEG, its algorithm does not divide the image into blocks, but analyzes the entire image. This characteristic allows it to obtain good compression ratios while maintaining optimal image quality. The filtering schemes rely on the image parts that are not noticed by the human eye. The more often filtering occurs, the smaller the overall file size of the images, and the lower the image quality will be when decompressed. With the addition of JPEG2000 (mentioned previously), the approach taken by the Joint Photographic Experts Group is changing the way compression standards are being considered.

Let Quality, Speed and Storage Be Your Guide

Image compression plays a very important part in the digital storage and transmission of video images. Most of the equipment offered today gives operators the capability to set compression ratios to meet their image needs. You may not see compression ratio settings on any of the software setup screens; the setting may instead appear as the term image quality, which is related to this function.
A setting of high image quality represents low compression, while low quality settings indicate high compression of the signal.

[Chart: video storage requirements in MB for MPEG-2, MPEG-4 (ASP) and H.264. The storage requirement for 90 minutes of DVD-quality video is more than twice as high for MPEG-2 as it is for H.264.]

[Photos: a 3MB original compared with a 19KB JPEG2000 file and a 19KB JPEG file, showing the improved image quality of JPEG2000 over JPEG at the same file size.]

With all of the different types of reduction methods available for video images and the many different compression standards, it is no wonder many people are confused by digital video storage and transmission equipment. With each reduction method or compression standard, there is one item to keep in mind: the quality of the reproduced image, whether from a storage device such as a DVR or from a remote location, will depend on the application of that system. Not every standard or compression method is designed to meet all requirements. When selecting your system, keep in mind that if the image quality, speed of remote viewing and storage requirements are what you expected, then you have made the right choice.

The next installment of this four-part series will cover storage devices and their applications in the world of digital video.

Robert (Bob) Wimmer is president of Video Security Consultants and has more than 33 years of experience in CCTV. He has been a training consultant for several of the industry's leading CCTV manufacturers and other organizations. He has also written numerous articles on CCTV applications and advancing equipment technology.

For bulk reprints (500 or more), call (310) 533-2400. Courtesy of EE Times.
