Image Segmentation Techniques for Object-Based Coding


Junaid Ahmed, Joseph Bosworth, and Scott T. Acton
The Oklahoma Imaging Laboratory
School of Electrical and Computer Engineering
Oklahoma State University
{ajunaid,bosworj,sacton}@okstate.edu
http://oil.okstate.edu

This work was supported by the National Aeronautics and Space Administration under EPSCOR grant NCC5-171.

Abstract

Two image segmentation methods are presented and compared in terms of rate-distortion within an object-based coding scheme. The LOMO segmentation exploits the relationship between mathematical morphology and local monotonicity to produce a multiscale segmentation. The process is a morphological analog of the Laplacian of Gaussian. The level set approach uses area morphology to generate segmented regions having a specified minimum area. Segments are optimally chosen from the connected components of the image level sets. A simple object-based coding scheme using the discrete cosine transform is used to avoid the artifacts that conventional block-based coding produces at segment boundaries. Results of each segmentation method are given and compared to one another and to conventional JPEG coding in terms of rate-distortion and the presence of boundary artifacts.

1. Introduction

A persistent problem in image segmentation research is the lack of an effective measure of segmentation quality. Different methods of segmentation exist, utilizing different image characteristics, e.g., shape, texture, and motion. These methods perform differently depending on the application and are often compared only subjectively. One important application is clear, however. With the emergence of new standards such as MPEG-4, object-based coding is a rapidly developing field, allowing greater functionality than previous encoding methods. We therefore contend that segmentation quality can be measured, in this context, by the performance in such coding schemes, i.e., by rate-distortion.
Although no standard segmentation technique is widely accepted for object-based coding, the coding results depend critically on the segmentation. To complicate matters, different algorithms have been developed for the object-based coding itself. Here we present one such method, using a combination of block-based classification, JPEG compression, and the method of successive projection onto convex sets as given in [1].

The remainder of the paper is organized as follows. In Section 2, we discuss the object-based coding method. In Sections 3 and 4, two novel grayscale segmentation methods are given: locally monotonic (LOMO) segmentation and level set-based segmentation. Finally, in Section 5, sample results are given, along with the corresponding rate-distortion measures. The results compare favorably with those of JPEG in terms of boundary quality and object-based functionality.

2. Object-Based Coding Theory

In the object-based coding method used in this investigation, a rectangular block is used to bound each segmented object. Each block is then divided into 8x8 pixel sub-blocks for further processing. Three types of sub-blocks are possible. First, sub-blocks that contain no part of the segment are ignored (not coded). Second, sub-blocks in which all constituent pixels are part of the object are coded in a manner similar to standard JPEG, using quantized discrete cosine transform (DCT) coefficients. Third, sub-blocks containing segment boundary pixels require special processing to ensure the preservation of the segment boundary.
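The three-way classification of sub-blocks can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the segment is supplied as a binary mask over its bounding block, and the function name `classify_sub_blocks` and the labels "outside", "interior", and "boundary" are hypothetical. Sub-blocks that overhang the mask edge are assumed to be padded with background.

```python
from typing import List

BLOCK = 8  # sub-block size stated in the text


def classify_sub_blocks(mask: List[List[bool]], block: int = BLOCK) -> List[List[str]]:
    """Label every block x block tile of a binary object mask.

    mask[r][c] is True where the pixel belongs to the segment.
    Tiles overhanging the mask edge are treated as padded with
    background (an assumption made for this sketch).
    """
    rows, cols = len(mask), len(mask[0])
    labels = []
    for r0 in range(0, rows, block):
        row_labels = []
        for c0 in range(0, cols, block):
            # Count the object pixels that fall inside this tile.
            inside = sum(
                1
                for r in range(r0, min(r0 + block, rows))
                for c in range(c0, min(c0 + block, cols))
                if mask[r][c]
            )
            if inside == 0:
                row_labels.append("outside")    # type 1: not coded
            elif inside == block * block:
                row_labels.append("interior")   # type 2: JPEG-like DCT coding
            else:
                row_labels.append("boundary")   # type 3: needs special processing
        labels.append(row_labels)
    return labels
```

For a 16x24 mask whose object fills the first twelve columns, each row of tiles is labeled interior, boundary, outside, so only the middle column of tiles would need the boundary processing.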

The objective in processing sub-blocks of the third type is to reduce the number of coefficients needed to represent the object while preserving its edges. We use the successive projection onto convex sets approach. The first convex set can be defined using the energy compaction property of the DCT and the fact that the effect of the boundary lies mostly within the high frequency components of the transform. We set the high frequency coefficients to zero and retain only the low frequency coefficients. The second convex set is obtained from the boundary of the object in that particular sub-block. In an iterative manner, pixels that are part of the object are restored to their original values, while those outside of the object are given by the reconstruction defined by the DCT coefficients. The sub-block is successively projected onto the two convex sets in this manner, reducing the high frequency components inherent in the boundary. The sub-block is then encoded using the resulting DCT coefficients. Finally, the boundaries of each object are encoded using a lossless chain code. Thus, boundary information is encoded separately from intra-segment DCT coefficients.

3. LOMO Segmentation

Here, we present the first of two segmentations for use in the object-based coding scheme. This segmentation is derived from a linear combination of morphological filters and is designed to generate signals with the property of local monotonicity. We now summarize the motivation behind this segmentation method and outline the algorithmic steps of its implementation.

In the one-dimensional (1-D) case, a signal is locally monotonic (LOMO) of a given degree or scale n if and only if it is either non-increasing or non-decreasing within every window of length n.
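The 1-D definition translates directly into a sliding-window test. The sketch below is only an illustration of the definition; the function names `is_monotonic` and `is_lomo` are hypothetical, and a signal no longer than the window is assumed to count as a single window.

```python
from typing import Sequence


def is_monotonic(window: Sequence[float]) -> bool:
    """True if the window is non-decreasing or non-increasing."""
    pairs = list(zip(window, window[1:]))
    non_decreasing = all(a <= b for a, b in pairs)
    non_increasing = all(a >= b for a, b in pairs)
    return non_decreasing or non_increasing


def is_lomo(signal: Sequence[float], n: int) -> bool:
    """True if the signal is locally monotonic of degree n.

    Every window of n consecutive samples must be monotonic.
    """
    if n < 2:
        raise ValueError("the window length n must be at least 2")
    if len(signal) <= n:
        # Assumption: a short signal is treated as one window.
        return is_monotonic(signal)
    return all(
        is_monotonic(signal[i:i + n]) for i in range(len(signal) - n + 1)
    )
```

For example, the signal 0, 1, 2, 2, 1, 0 passes the test for n = 3 but fails for n = 4, because the window 1, 2, 2, 1 both rises and falls.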
This definition can be restated in terms of morphology: a discrete 1-D signal is locally monotonic of a given degree n if and only if it is a root signal of both the open ($\circ$) and close ($\bullet$) filters simultaneously, assuming a zero-valued symmetric structuring element k of length n-1. In [2], this definition is generalized from 1-D to higher dimensions by requiring that a LOMO-n signal f be a root signal of the filter

$$f \leftarrow (f \circ k) + (f \bullet k) - f. \tag{1}$$

Such signals may be derived from general inputs by iterative application of (1) until convergence. In practice, LOMO signals can be obtained by the alternative filter

$$f \leftarrow \tfrac{1}{2}\left[(f \circ k) \bullet k + (f \bullet k) \circ k\right], \tag{2}$$

which converges more rapidly to a root signal.

The LOMO segmentation method relies upon the generation of a multiscale representation, or scale-space, of such locally monotonic images. Beginning with the original image, each scale-space layer is derived by the iterative application of (2) to the image of the previous scale. As described in [2], this filtering procedure possesses many desirable scaling properties, as well as edge localization and robustness to noise.

At each given scale, edge detection is performed. The edge detection employs an unbiased (self-dual) combination of the morphological erode ($\ominus$) and dilate ($\oplus$) filters, referred to as the morphological Laplacian:

$$d = (f \oplus k) + (f \ominus k) - 2f, \tag{3}$$

where the structuring element k is circular of diameter n-3 (guaranteeing the equivalence to a discrete second derivative approximation in the 1-D case [2]). Zero-crossings of this difference function d represent edges, and (3) can be viewed as an analogy to the linear Laplacian of Gaussian (LoG) edge detection: a morphological approximation of a Laplacian operator acts on an appropriately chosen layer of a morphological scale-space. Insignificant edges are removed by a threshold on the morphological gradient (structuring element diameter n-1).
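The filtering and edge-detection steps can be illustrated in 1-D with flat structuring elements. The sketch below rests on stated assumptions: the alternative filter is taken to be the average of open-close and close-open (an assumed normalization), the structuring element is a flat window of odd length `m`, windows are clipped at the signal borders, and all function names are hypothetical.

```python
from typing import List, Sequence


def dilate(f: Sequence[float], m: int) -> List[float]:
    """Flat dilation: running maximum over a window of odd length m."""
    r = m // 2
    return [max(f[max(0, i - r): i + r + 1]) for i in range(len(f))]


def erode(f: Sequence[float], m: int) -> List[float]:
    """Flat erosion: running minimum over a window of odd length m."""
    r = m // 2
    return [min(f[max(0, i - r): i + r + 1]) for i in range(len(f))]


def opening(f: Sequence[float], m: int) -> List[float]:
    return dilate(erode(f, m), m)


def closing(f: Sequence[float], m: int) -> List[float]:
    return erode(dilate(f, m), m)


def loco_step(f: Sequence[float], m: int) -> List[float]:
    """One pass of the averaged open-close / close-open filter."""
    open_close = closing(opening(f, m), m)
    close_open = opening(closing(f, m), m)
    return [(a + b) / 2 for a, b in zip(open_close, close_open)]


def loco_root(f: Sequence[float], m: int, max_iter: int = 100) -> List[float]:
    """Iterate the filter until the signal stops changing (a root signal)."""
    current = [float(v) for v in f]
    for _ in range(max_iter):
        updated = loco_step(current, m)
        if updated == current:
            break
        current = updated
    return current


def morphological_laplacian(f: Sequence[float], m: int) -> List[float]:
    """d = dilation + erosion - 2f, evaluated sample by sample."""
    return [d + e - 2 * v for d, e, v in zip(dilate(f, m), erode(f, m), f)]


def zero_crossings(d: Sequence[float]) -> List[int]:
    """Indices i where d[i] and d[i + 1] have opposite signs."""
    return [i for i in range(len(d) - 1) if d[i] * d[i + 1] < 0]
```

With `m = 3`, an isolated impulse such as 0, 0, 7, 0, 0, 0 is flattened in a single pass, while a step edge is already a root signal and is left unchanged; the morphological Laplacian of that step changes sign exactly at the edge. A 2-D version would replace the running windows with circular neighborhoods.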
Edges detected at a given scale n are then dilated by a circular structuring element of diameter n-1 and then thinned. This process alleviates a difficulty of zero-crossing detection by closing edge gaps that occur at junctions between regions.

With each single-scale segmentation complete, segments are then linked through scale from coarse to fine. A given segment is associated with the segment in the next-higher layer with which it has the greatest spatial intersection. In this manner, the boundaries of an initial coarse-scale segmentation are refined to finer and finer detail, until the final segmentation is reached.

4. Area Morphology and Level Set-Based Segmentation

A disadvantage of segmentation approaches such as anisotropic diffusion or the watershed algorithm is the inability to prescribe the minimum region area in the segmentation. These segmentation techniques may be sensitive to small regions of high contrast. Using an area morphology approach, we can extract segments from the image that exceed a minimum area.

Area morphology is based on the area of connected components (regions) in the level sets of an image [3]. In a threshold decomposition of the image I, an associated level set L(I, t) is the set of pixels that meet a given threshold t:

$$(x, y) \in L(I, t) \quad \text{if} \quad I(x, y) \ge t.$$

For a discrete range of K intensities 0, ..., K-1,

$$I(x, y) = \sum_{i=1}^{K-1} \mathbf{1}\left[(x, y) \in L(I, i)\right],$$

where $\mathbf{1}[\cdot]$ is the set indicator function; the level set at i = 0 contains every pixel and is therefore left out of the sum.

The area open operator, denoted by $\circ^{s}(I)$, removes all connected components within the level set L(I, t) that do not have a minimum area of s. An area close operator, denoted by $\bullet^{s}(I)$, removes connected components in the complemented level set $L^{c}(I, t)$ that do not possess the minimum area s; that is, area close removes small regions of zeros. Applied to each level set in the image (for each intensity in the range of possible intensities), area open removes small bright objects, while area close removes small dark objects. The area open-close (AOC) operator, denoted by $\bullet^{s}(\circ^{s}(I))$, is defined as the area open operator followed by the area close operator.

We use the AOC operator to construct a scale-space for level set-based image segmentation. Processing can be implemented independently on each image level set; the grayscale image is reconstructed by a standard stacking operation after processing the level sets.

Given the AOC-scaled image, a segmentation may be derived. In such a segmentation, only the connected components within image level sets (or complements of level sets) can define a segment. The scale/area parameter of the AOC scale-space representation defines the minimum area of a segment. The segmentation algorithm finds the combination of segments that minimizes the sum of the internal intensity variances of the segments. The goal of the level set approach is thus to provide a segmentation that maximizes region homogeneity. The approach guarantees closed regions and boundary preservation, which are important features in object-based coding.

5. Results and Conclusions

For the original 56x56 pixel 8-bit grayscale images of Figures 1, 4, and 7, we show example results for each segmentation method. Figures 2, 5, and 8 show the LOMO segmentations, and Figures 3, 6, and 9 show the results of the level set segmentation method. Figure 10 shows the rate-distortion curves corresponding to the Old Central image (Figure 1) for the two segmentation methods and JPEG compression. Similarly, Figures 11 and 12 show the rate-distortion curves for the cameraman and swan images.

These rate-distortion curves show that the level set method gives better compression than the LOMO segmentation method. This difference is more evident in the curve for the swan image and less distinct in the other two. Both object-based coding methods show poorer rate-distortion performance than JPEG, but this is because JPEG does not carry the extra overhead of the segmentation. (Note that the compression here is due solely to quantization of DCT coefficients; no Huffman or run-length coding is used.) For the object-based methods, some compression is necessarily sacrificed for the added object-based functionality and the removal of visually undesirable artifacts at segment boundaries.

Figure 16 shows a close-up of a region within the original image of Figure 13. Within the decoded JPEG image of Figure 17, blocking artifacts and a loss of edge localization are noticeable. These artifacts are avoided in the object-based results shown in Figures 14 and 15.

In conclusion, the level set segmentation method tends to outperform the LOMO segmentation in terms of rate-distortion within an object-based coding scheme. While the object-based method fails to give better overall compression than the block-based JPEG method, it provides superior boundary preservation and increased functionality for object-based searches.

6. References

[1] H.H. Chen, M.R. Civanlar, and B.G. Haskell, "A block transform coder for arbitrarily shaped image segments," Proc. IEEE Int. Conf. on Image Processing, Austin, Texas, November 13-16, 1994.
[2] J. Bosworth and S.T. Acton, "Morphological Image Segmentation by Local Monotonicity," Proc. of the Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, California, October 4-7, 1999.
[3] D.P. Mukherjee and S.T. Acton, "Document page segmentation using multiscale clustering," Proc. IEEE Int. Conf. on Image Processing, Kobe, Japan, Oct. 5-9, 1999.

Figure 1: Old Central
Figure 2: LOMO
Figure 3: Level set
Figure 4: Cameraman
Figure 5: LOMO
Figure 6: Level set
Figure 7: Swan
Figure 8: LOMO
Figure 9: Level set
Figure 10: Rate-Distortion Curve for the Image Old Central
Figure 11: Rate-Distortion Curve for the Image Cameraman
Figure 12: Rate-Distortion Curve for the Image Swan
Figure 13: Original Image
Figure 14: Close-up of decoded image using LOMO segmentation
Figure 15: Close-up of decoded image using level set segmentation
Figure 16: Close-up of Original
Figure 17: Close-up of decoded JPEG