Edge-Enhanced Image Coding for Low Bit Rates


Dirck Schilling and Pamela Cosman
Department of Electrical and Computer Engineering, University of California at San Diego
9500 Gilman Drive, Mail Code 0407, La Jolla, California, 92093
dschilli@ucsd.edu, pcosman@ece.ucsd.edu

Abstract

Many current progressive wavelet-based image coders attempt to achieve the greatest reduction in mean squared error (MSE) with each bit sent. In so doing, they tend to send information on the lowest-frequency wavelet coefficients first. At very low bit rates, images compressed by these coders are therefore dominated by low-frequency information and blotchy artifacts. These effects combine to hamper recognition of objects in the images. In this paper, we present a new progressive image coder which employs edge enhancement with the goal of improving the visual appearance and recognizability of compressed images at very low bit rates. Important edges in the original image are captured and transmitted as side information together with a traditional wavelet coder bit stream. The decoder combines the two complementary information sources in a manner which, for certain image classes, can yield highly recognizable images at very low bit rates.

1. Introduction

Recent progressive wavelet-based image coders such as SPIHT [1] and EZW [2] share the characteristic that, at very low bit rates, the decoded image appears blurred and blotchy. By transmitting the wavelet coefficients in strict bit plane order, the coder tends to send low-frequency information first. At the very lowest bit rates, the square regions of support of the wavelet bases are clearly visible as stairstep artifacts (blotches) in the developing image. These factors delay recognition by the viewer of the image's content. For many images, however, the information required to identify the image content to a viewer can be represented by very few lines.
An example can be seen in the way a caricature artist is able to create, with only a few strokes, a sketch which is nevertheless instantly recognizable as a specific person. In the same way, a progressive image coder should be able to convey the central meaning of an image within the earliest part of the progression, filling in the remaining details as time and bandwidth permit. The example of the sketch artist suggests that one way to achieve this goal might be by transmitting the primary edge information first.

2. Proposed Approach

Our coder combines SPIHT (or any other progressive wavelet coder) with edge enhancement, with the goal of allowing faster human recognition of the progressively decoded images. A block diagram is shown in Figure 1. The source image is passed through an edge detector, which identifies prominent edges likely to aid in human recognition. Lines are extracted from the identified edge pixels, and the line segment endpoints can be encoded using standard techniques for line graphics, such as modified multiring chain coding. The encoded edge information is sent as part of the image header. Encoding then proceeds in the usual SPIHT fashion. The decoder enhances edges in the output image by combining the edge location information with pixel intensity information available from the progressive SPIHT bit stream.

Figure 1: Block diagram of edge enhancing coder. The source image x(i, j) feeds edge detection and edge coding alongside SPIHT encoding; the decoder combines the decoded edges with the SPIHT reconstruction x̂(i, j) to produce the enhanced reconstruction.

The coder has the following characteristics:

- The decoder uses the edge information in combination with the SPIHT-encoded bit stream to achieve a natural-appearing result, rather than merely superimposing lines over the developing image.
- Edges are enhanced (sharpened) without obscuring features coming into focus near the enhanced edges.
- Stairstep artifacts in diagonal and curved edges, caused by the rectangular shape of the wavelet bases, are largely removed in the enhanced edges, unlike with standard edge-sharpening algorithms.
- The compact edge information can be sent in the image header and protected from errors. No further enhancement information need be sent later in the bit stream.

3. Edge Extraction

The edge enhancing progressive coder requires that the edges of interest either be known in advance or detected prior to encoding. The edge enhancement step is independent of the algorithms used for edge detection and line extraction; accordingly, existing, proven methods have been chosen for these portions of the coder. Edge detection is performed by a robust and effective method known as SUSAN (Smallest Univalue Segment Assimilating Nucleus) [3]. This method is based on a nonlinear local region-finding procedure which has several advantages over derivative-based methods such as Sobel or Canny edge detection: it is insensitive to noise, provides good edge localization independently of mask size, and reports sharp corners with good connectivity. Line segment extraction is carried out using methods described by Rosin and West in [4]. The current implementation approximates curved edges by sets of linked straight line segments; however, the enhancement technique is applicable to higher-order representations such as arcs, ellipses and polynomials.

The primary goal of our approach is to improve recognition by enhancing only the most important edges in the image. There is no widely accepted measure for the importance of an edge; in most situations, however, a longer edge will convey more information than a shorter one. The edge detection methods described above intrinsically filter out much noise, but still report short, sharply defined edges where they exist. In images containing detailed or cluttered areas, such short edges often occur in closely grouped clusters. Coding these edges is costly, and enhancing them is not likely to improve recognition significantly.
We therefore include a further filtering step in the edge detection procedure, which removes edges that are short, have few endpoints, and are distant from other edges.

4. Edge Coding

Numerous methods for encoding line graphics have been described; some of these focus primarily on long continuous curves, long straight lines, or closed contours, cases that are relatively infrequent in natural images. One challenge of the application described here is that the edge information packet must be kept small in relation to the progressive portion of the transmitted stream, which at the bit rates of interest is already tiny. Only a few edges can be transmitted, and most of these consist of very few line segments. There is little information available for adaptive methods to work with.

Figure 2: Multiring chain coding grid structure, with the radius 13 ring omitted for clarity. Dots indicate indexed grid points; X's are points on the input curve. The circled grid point indicates the next output point, and the shaded area is the error between the input and output curves for the current step.

4.1 Modified Multiring Chain Code

One efficient method of encoding edge information is the modified multiring chain code with Fibonacci spacing described in [5]. This method was originally proposed for coding human signatures, but can be applied to any form of line-based graphics. The method is summarized in Figure 2. Given a base unit of length $l$, a structure of concentric square rings is defined whose radii are the product of $l$ and the Fibonacci numbers 1, 2, 3, 5, 8 and 13. Grid points are located along each ring with a spacing of $l$, so a ring of radius $r\,l$ carries $8r$ points and the six rings together yield a total of 256 grid points. The input curve consists of an ordered sequence of points $X = \{x_0, x_1, \ldots, x_n\}$, which is approximated by the output sequence $Y = \{y_0, y_1, \ldots, y_M\}$. At each step in the procedure, the multiring structure is positioned with its origin on the previously coded output point $y_{j-1}$.
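The grid geometry just described can be checked numerically. The following is a minimal Python sketch, not the authors' implementation: the function names, the choice of starting each ring at the point directly to the right of the origin, and the counter-clockwise walking order are all assumptions made for illustration. Only the ring radii, the spacing $l$, and the total of 256 points come from the text.

```python
import math
from typing import List, Tuple

# Ring radii in units of the base length l (from the text).
FIB_RADII = (1, 2, 3, 5, 8, 13)


def ring_points(r: int) -> List[Tuple[int, int]]:
    """Grid points (in units of l) on the square ring of half-side r.

    Assumption: the walk starts at (r, 0) and proceeds counter-clockwise.
    The text does not specify the numbering origin or direction.
    """
    pts = []
    x, y = r, 0
    # Five legs: up the right side, across the top, down the left side,
    # across the bottom, and back up to just below the start point.
    for dx, dy, steps in ((0, 1, r), (-1, 0, 2 * r), (0, -1, 2 * r),
                          (1, 0, 2 * r), (0, 1, r)):
        for _ in range(steps):
            pts.append((x, y))
            x += dx
            y += dy
    return pts


def build_multiring_grid(l: float):
    """Return (points, ring_starts): all grid points scaled by l, and the
    index of the first point on each ring, from the inner ring outward."""
    points, ring_starts = [], []
    for r in FIB_RADII:
        ring_starts.append(len(points))
        points.extend((px * l, py * l) for px, py in ring_points(r))
    return points, ring_starts


def max_reach(points) -> float:
    """Largest distance from the grid origin to any grid point."""
    return max(math.hypot(px, py) for px, py in points)
```

Under these assumptions, `build_multiring_grid(1.5)` returns 256 points, so each multiring index fits in one byte before entropy coding, and the farthest grid point is a corner of the outer ring at distance $13\sqrt{2}\,l$ from the origin.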
The intersection of the input curve with each ring is examined in succession from the outer ring inward. The nearest grid point to this intersection is chosen as the next candidate output point $y_{cand}$. If no point along $X$ between $y_{j-1}$ and $y_{cand}$ is further than $l$ from the line segment $(y_{j-1}, y_{cand})$, then $y_j$ becomes $y_{cand}$, and the encoder outputs its multiring index; otherwise the next ring inward is tried. This approach encodes the input curve nearly losslessly, in that no point on the input curve is further than $l$ from the approximating output curve. Points on the output curve are at most $13\sqrt{2}\,l$ apart, so the choice of $l$ represents a tradeoff between accuracy of the output curve and the number of output points (number of ring indices) required to transmit it. For our implementation, we used $l = 1.5$ pixels.

The ring indices are entropy coded to take advantage of the predictability in continuous curves. In our implementation, after each ring index is sent, the grid points are renumbered such that the indices corresponding to "straight ahead" (0, 8, 24, 48, 88 and 152, the first index on each ring) are positioned along the line extending in the direction of the previously encoded line segment. The remaining indices are shifted by a corresponding amount around their respective rings. The result is that each index reflects an angular offset relative to the previously coded line segment, rather than an absolute offset. The distribution of index occurrences is thereby skewed toward smaller angular changes, improving compression of the indices.

4.2 Coding of Start Points and Single Segments

The multiring chain coder must be initialized with the starting point of each edge. For this purpose we employ arithmetic differential offset coding of the edge start points. The edges are first sorted by length. The encoder computes the means of the horizontal and vertical offsets between edge start points, and transmits these along with the total number of edges in an edge packet header. The encoder then arithmetically encodes the differences between the means and the horizontal and vertical offsets to each successive edge from the previous one. Integer arithmetic coding is carried out using techniques similar to those proposed for the JBIG2 standard in [6].
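The differential offset step for start points can be illustrated with a short Python sketch. This is an illustration under stated assumptions rather than the paper's implementation: the function names are hypothetical, the first offset is measured from the image origin (0, 0), the mean offsets are rounded to integers, and the arithmetic coder that would consume the residuals is not shown.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def encode_start_points(starts: List[Point]):
    """Return (mean_dx, mean_dy, residuals) for edge start points.

    `starts` is assumed to be already ordered by edge length, as in the text.
    Assumption: the first offset is taken relative to the origin (0, 0).
    """
    if not starts:
        return 0, 0, []
    prev_x, prev_y = 0, 0
    offsets = []
    for x, y in starts:
        offsets.append((x - prev_x, y - prev_y))
        prev_x, prev_y = x, y
    n = len(offsets)
    # Assumption: means are rounded so that the residuals stay integers.
    mean_dx = round(sum(dx for dx, _ in offsets) / n)
    mean_dy = round(sum(dy for _, dy in offsets) / n)
    # These residuals are what an integer arithmetic coder would encode.
    residuals = [(dx - mean_dx, dy - mean_dy) for dx, dy in offsets]
    return mean_dx, mean_dy, residuals


def decode_start_points(mean_dx: int, mean_dy: int,
                        residuals: List[Point]) -> List[Point]:
    """Invert encode_start_points by accumulating mean plus residual."""
    x, y = 0, 0
    starts = []
    for rx, ry in residuals:
        x += rx + mean_dx
        y += ry + mean_dy
        starts.append((x, y))
    return starts
```

Subtracting the mean centers the residuals near zero, which is the property an entropy coder can exploit; the header carries only the two means and the edge count.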
The multiring chain code is particularly effective for long, smooth curves. It performs less well for edges consisting of only one line segment, since no information is available from which to predict the angle of the line segment. For this reason we use differential offset coding to transmit both the start and end points of these edges.

4.3 Coding Performance

Table 1 shows the cost of coding the edge information for several test images. Costs are shown both in total bits and in bits per original point (bpp). Three methods for coding the ring indices are compared: Huffman coding, arithmetic coding with no initial data, and arithmetic coding with initial index probabilities computed from several training images. In both the Huffman and trained arithmetic cases, the training sets used to generate index probabilities did not include the test images. In the arithmetic cases, the coder state included context from prior indices. Arithmetic coding with initial probabilities performed best in most cases except those with the smallest edge packets. Untrained arithmetic coding performed worse than Huffman coding on average, due primarily to the small amount of data to be coded: the arithmetic coder has too little time to adapt. A disadvantage of the arithmetic technique is that it requires a significant amount of memory at both the encoder and decoder for maintaining state, because of the use of prior index context.

5. Edge Enhancement

We now describe the approach used by the decoder to enhance the edges transmitted in the edge packet. For the current implementation it is assumed that the edges of interest are predominantly of the step or single-sided ramp type. Extensions are possible, however, which allow other types of edges, such as narrow ridges, to be handled. The basic principle of the edge enhancement procedure is illustrated in Figure 3. The original edge, shown in profile in part A of Figure 3, is reconstructed by the wavelet coder at a low bit rate as B.
The decoder smooths B to obtain C; alternatively it retains an earlier version of B in memory. The decoder obtains the location of the edge from the image header.

Image      Orig. Points   Huffman        Untrained Arithmetic   Trained Arithmetic
airplane   243            1593 (6.6)     1683 (6.9)             1534 (6.3)
angels     296            2058 (7.0)     2129 (7.2)             1976 (6.7)
apples     1018           5866 (5.8)     5979 (5.9)             5860 (5.8)
aranda     665            3643 (5.5)     3709 (5.6)             3498 (5.3)
bora       386            2335 (6.0)     2486 (6.4)             2388 (6.2)
mtilt      65             963 (14.8)     962 (14.8)             1011 (15.6)
orange     402            2337 (5.8)     2451 (6.1)             2328 (5.8)
picnic     241            1903 (7.9)     1994 (8.3)             1988 (8.3)
wbird      374            2390 (6.4)     2441 (6.5)             2348 (6.3)
wgrass     45             837 (18.6)     790 (17.6)             830 (18.4)
Mean       373.5          2393 (8.43)    2462 (8.53)            2376 (8.45)

Table 1: Edge packet coding costs for several images. Costs are in total bits; figures in parentheses are bits per original edge point.
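The profile arithmetic of this section (profiles B through F of Figure 3) can be sketched in one dimension. The Python example below is a simplified illustration, not the decoder described in the paper: the moving-average smoother, the function and parameter names (`enhance_profile`, `edge_pos`, `reach`, `smooth_width`), and the restriction to a single isolated edge are assumptions. It takes a reconstructed profile B, smooths it to C, builds a step-shaped target D from the values of C at a fixed distance on either side of the edge, forms the offset E = D - C, and returns F = B + E.

```python
import numpy as np


def enhance_profile(b, edge_pos: int, reach: int, smooth_width: int = 5):
    """Return (c, d, e, f) for a 1-D profile b with one edge at edge_pos.

    Assumptions: smooth_width is odd, the smoother is a moving average,
    and no other edge lies within `reach` samples of edge_pos.
    """
    b = np.asarray(b, dtype=float)
    # C: smoothed version of the reconstructed profile B.
    kernel = np.ones(smooth_width) / smooth_width
    padded = np.pad(b, smooth_width // 2, mode="edge")
    c = np.convolve(padded, kernel, mode="valid")
    # D: step-shaped target, flat on each side at the value C takes
    # `reach` samples away from the edge; equal to C outside that window.
    lo = max(edge_pos - reach, 0)
    hi = min(edge_pos + reach, len(b) - 1)
    d = c.copy()
    d[lo:edge_pos] = c[lo]
    d[edge_pos:hi + 1] = c[hi]
    # E: enhancement offsets; F: enhanced profile.
    e = d - c
    f = b + e
    return c, d, e, f
```

Because the offset is computed against the smoothed profile C and then added to B, any detail in B that the smoother removed (B - C) survives in F, which is consistent with the stated aim of sharpening edges without obscuring nearby features as they come into focus.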

Figure 3: Edge enhancement procedure. Panels: A. Original edge profile; B. Reconstructed at low bit rate; C. Smoothed profile; D. Target edge profile; E. Enhancement offset values; F. Enhanced low rate edge profile.

Figure 4: Pixel enhancement for edge k, showing a pixel $x_k$ and its target pixel $t_{x,k}$. Pixels in the shaded area belong to edge k.

Based on the intensity values of C at a fixed distance from the edge on either side, a target profile D is computed. In the current implementation, this profile is a nearly ideal step; however, other profiles may be chosen. The difference between D and C yields E, the enhancement profile. Finally, E is added to the reconstructed profile B, resulting in the enhanced edge profile F.

In practice, several complications to the above procedure arise, for instance when two or more edges fall within twice this distance of one another. The intent of the target profile D is to determine valid pixel intensities to use for the high and low sides of the edge. Thus, the desired target pixel intensities should not be drawn from the opposite side of a neighboring edge. To handle this, the decoder computes a distance transform, labeling each image pixel with its distance from the nearest edge and the identity of that edge. An efficient distance transform suitable for this purpose is the 5-7-11 Chamfer algorithm [7]. For each edge, the enhancement procedure is performed on all pixels that lie within this distance of the edge and are not closer to another edge. The enhancement target value for a pixel $x_k$ belonging to edge k is drawn from the target pixel $t_{x,k}$, where $t_{x,k}$ is the pixel belonging to edge k which is farthest from the edge along the normal line from $x_k$ to the edge (Figure 4). Examples of the output of our current implementation are shown in Figure 5 and Figure 6.

6. Discussion

In many progressive image transmission applications, such as fast browsing, it is important that the user obtain an understanding of the image contents as early as possible in the progression.
The goal of improving recognition at low bit rates may take precedence over that of improving fidelity at higher bit rates. In such applications it is advantageous to place a higher priority on edge information than on low-frequency information when ordering the progressive bit stream. We have described an edge-enhancing image coder which, for certain image classes, allows highly recognizable images to be reproduced at very low bit rates. Our current work is focused on improving the edge coding in order to limit the edge location information to as small a percentage of the total bit stream as possible. We expect that more sophisticated techniques for removing visually unimportant edges will aid in this goal. We further plan to carry out human observer experiments along the lines of [8] to demonstrate the effect of this approach on image recognizability.

References

[1] A. Said and W. Pearlman, "A New, Fast, and Efficient Image Codec Based on Set Partitioning in Hierarchical Trees," IEEE Trans. on Circuits and Systems for Video Technology, 6(3):243-250, 1996.
[2] J. Shapiro, "Embedded Image Coding Using Zerotrees of Wavelet Coefficients," IEEE Trans. on Signal Processing, 41(12):3445-3462, 1993.
[3] S. M. Smith and J. M. Brady, "SUSAN - A New Approach to Low Level Image Processing," Intl. Journal of Computer Vision, 23(1):45-78, 1997.
[4] P. Rosin and G. West, "Nonparametric Segmentation of Curves into Various Representations," IEEE Trans. on Pattern Analysis and Machine Intelligence, 17(12), 1995.
[5] T. Berger and D. Miller, "Method and an Apparatus for Electronically Compressing a Transaction with a Human Signature," U.S. patent 5091975, 1992.
[6] ISO/IEC JTC1/SC29/WG1 N1093, "JBIG2 Committee Draft," November 1998.
[7] G. Borgefors, "Distance Transformations in Digital Images," Computer Vision, Graphics and Image Processing, 34(3):344-371, 1986.
[8] D. Schilling, P. Cosman and C. Berry, "Image Recognition in Single-Scale and Multiscale Decoders," Proceedings of the 32nd Asilomar Conference on Signals, Systems and Computers, Vol. 1, pages 477-481, November 1998.

Figure 5: Example of enhancement procedure. (a): Original 512x402 image. (b): SPIHT-coded image at 0.0099 bpp. (c): Edge-enhanced image at 0.0099 bpp. There were 65 edge points. The wavelet coding uses 0.0050 bpp and the arithmetically encoded edge data requires 0.0049 bpp.

Figure 6: Example of enhancement procedure. (a): Original 482x388 image. (b): SPIHT-coded image at 0.0298 bpp. (c): Edge-enhanced image at 0.0298 bpp. There were 386 edge points. The wavelet coding uses 0.0170 bpp and the arithmetically encoded edge data requires 0.0128 bpp.