An Adaptive and Deterministic Method for Initializing the Lloyd-Max Algorithm


Jared Vicory and M. Emre Celebi
Department of Computer Science, Louisiana State University, Shreveport, LA, USA
Corresponding author: e-mail: ecelebi@lsus.edu, telephone: 1-318-795-4281.

ABSTRACT

Gray-level quantization (reduction) is an important operation in image processing and analysis. The Lloyd-Max algorithm (LMA) is a classic scalar quantization algorithm that can be used for gray-level reduction with minimal mean squared distortion. However, the algorithm is known to be very sensitive to the choice of initial centers. In this paper, we introduce an adaptive and deterministic algorithm to initialize the LMA for gray-level quantization. Experiments on a diverse set of publicly available test images demonstrate that the presented method outperforms the commonly used uniform initialization method.

Keywords: Gray-level reduction, scalar quantization, Lloyd-Max algorithm, center initialization

1. INTRODUCTION

Gray-level quantization (reduction) is an important operation in image processing and analysis. The objective is to reduce the number of unique gray levels in an image from an initial value of L to a desired number K (K < L) with minimal distortion. In most applications, 8-bit pixels in an image can be reduced to 4 bits or fewer without introducing perceivable distortion. Immediate applications of gray-level quantization include: (i) image compression, (ii) image segmentation, (iii) image analysis, and (iv) content-based image retrieval.

The Lloyd-Max algorithm (LMA) [1,2] is a well-known scalar quantization algorithm that can be used for gray-level reduction. This algorithm is known to be optimal when the distortion measure is taken as the mean squared error (MSE). While originally developed for the discretization of analog signals so that they can be easily stored and manipulated, the LMA can also be used to quantize discrete data, such as an image histogram.

The first step in the LMA is to select an initial set of K points, or centers, from the data set to be quantized. The data can then be partitioned around these initial centers in such a way that each partition consists of the set of points that are best represented by their respective center in the MSE sense. It can be shown that for two adjacent centers c_i and c_(i+1), the optimal partition boundary lying between them is given by their average, i.e. (c_i + c_(i+1))/2. In addition, for a given partition, the optimal center is given by the centroid of that partition, which is calculated as the weighted average of the bins that lie between the partition's boundaries. Once a new set of centers is obtained by calculating the centroid of each partition, new partition boundaries can be obtained by averaging the new centers. Each iteration of this procedure yields a new set of centers, which gives an MSE less than or equal to that of the previous iteration. Therefore, repeating this procedure until the change in MSE becomes negligible yields the final set of centers. In the case of gray-level quantization, each pixel in the original image is then mapped to the nearest center. This results in a quantized image, which is an optimal representation of the original image given the initial set of centers.
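To make the iteration concrete, the following is a minimal Python sketch of the histogram-based LMA as described above. The function and variable names are ours, not part of the original paper, and the convergence tolerance is an illustrative choice.

    import numpy as np

    def lloyd_max(hist, centers, tol=1e-6, max_iter=1000):
        # hist: normalized 256-bin gray-level histogram (NumPy array).
        # centers: initial set of K centers.
        levels = np.arange(len(hist), dtype=float)
        centers = np.sort(np.asarray(centers, dtype=float))
        prev_mse = np.inf
        for _ in range(max_iter):
            # Optimal boundaries between adjacent centers are their averages.
            boundaries = (centers[:-1] + centers[1:]) / 2.0
            # Assign each histogram bin to the partition it falls into.
            part = np.searchsorted(boundaries, levels)
            # The optimal center of each partition is its histogram-weighted centroid.
            for i in range(len(centers)):
                mask = part == i
                weight = hist[mask].sum()
                if weight > 0:
                    centers[i] = (levels[mask] * hist[mask]).sum() / weight
            # Stop when the decrease in MSE becomes negligible.
            mse = float(((levels - centers[part]) ** 2 * hist).sum())
            if prev_mse - mse < tol:
                break
            prev_mse = mse
        return centers, mse

Quantizing an image then amounts to mapping each pixel to the nearest of the returned centers.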
The performance of the LMA depends heavily on the choice of the initial centers. While the algorithm is guaranteed to reach a local minimum of the MSE, this minimum may be far from the global minimum (the ideal quantization) if the initial centers are chosen poorly. Therefore, an intelligent initialization method will not only enhance the quality of the quantized images, but also accelerate the convergence of the algorithm.

The simplest way to determine the initial centers is to select them randomly from among the points in the data set. This method has two serious drawbacks. First, while it is possible to obtain the global minimum MSE for the input image, it is equally likely to obtain the worst possible result. Therefore, over time, the performance of a random initialization scheme will be only mediocre.

Second, this scheme is nondeterministic, allowing the same image to be quantized in multiple ways, some of which may be very different. This can be undesirable for applications such as content-based image retrieval, where it is important to quantize an image the same way every time so that it can be accurately compared with previously quantized images.

Scheunders [3] proposed an initialization method based on genetic algorithms. This method gives better results than random initialization; however, it is still nondeterministic, and the genetic optimization procedure is computationally demanding. An alternative initialization method involves selecting the centers so that they are uniformly distributed throughout the data set. This commonly used method has the advantage of being deterministic, but it disregards the distribution of the data, which is likely to yield suboptimal results in many cases.

In this paper, we present an adaptive and deterministic initialization method for the LMA. Our method is loosely based on the farthest-first heuristic [4,5], which is commonly used to initialize the multidimensional variant of the LMA, i.e. the Linde-Buzo-Gray (LBG) algorithm [6]. The rest of the paper is organized as follows. Section 2 describes the proposed initialization method. Section 3 presents the comparison of the proposed method with the commonly used uniform initialization method. Finally, Section 4 gives the conclusions.

2. PROPOSED INITIALIZATION METHOD

In order to initialize the LMA adaptively, we modify the farthest-first heuristic [4,5]. According to this heuristic, the first center c_1 is calculated as the mean of the data vectors, and the i-th center c_i is chosen to be the point that has the largest minimum distance to the previously selected centers c_1, c_2, ..., c_(i-1). In the n-dimensional case, the distance function is often chosen as the squared Euclidean (L2) distance. When dealing with scalar data, a different distance measure is required. In this study, the distance of center c to histogram bin b with height h is calculated using the following Gaussian-weighted function:

    d(c, b) = exp(((c - b)/255)^2) * exp(h^(3/2))

The advantage of this function is that it weights bins not only by their distance from each center (difference in gray levels), but also by their height (the percentage of pixels that fall into the bin).

There are several ways to choose the first center, including the bin with the largest height, the median of the histogram, and the center bin. In this study, the centroid of the histogram, which represents the mean gray level of the input image, is chosen as the first center. The pseudocode of the overall initialization method is given in Algorithm 1 below.

Algorithm 1: Proposed Initialization Algorithm

    input : h[0..255] (normalized histogram of the input image)
    output: C = {c_1, c_2, ..., c_K} (K cluster centers)

    // The first center is given by the histogram centroid.
    c_1 = 0
    for g = 0 to 255 do
        c_1 = c_1 + g * h[g]

    // Iterate over the required number of centers.
    for i = 2 to K do
        max_dist = -infinity; max_index = 0
        // Iterate over the histogram bins.
        for j = 0 to 255 do
            // Iterate over the previously selected centers to find the
            // distance from bin j to its nearest center.
            min_dist = +infinity
            for k = 1 to i - 1 do
                dist = exp(((c_k - j)/255)^2) * exp(h[j]^(3/2))
                if dist < min_dist then min_dist = dist
            // Keep the bin whose minimum distance is largest.
            if max_dist < min_dist then
                max_dist = min_dist; max_index = j
        c_i = max_index

It is important to note that this algorithm does not produce the centers in sorted order. Since K is often a small number, the sorting of the centers can be accomplished easily by a simple algorithm such as insertion sort [7].
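For illustration, a direct Python transcription of Algorithm 1 might look as follows. This is our sketch, not the authors' code: the name init_centers is ours, and the final sort stands in for the insertion sort mentioned above.

    import numpy as np

    def init_centers(hist, K):
        # hist: normalized 256-bin histogram (NumPy array). Returns K centers.
        levels = np.arange(256, dtype=float)
        # The first center is the histogram centroid (mean gray level).
        centers = [float((levels * hist).sum())]
        for _ in range(1, K):
            best_bin, best_score = 0, -np.inf
            for j in range(256):
                # Distance from bin j to its nearest previously selected
                # center, weighted by the height of the bin.
                d = min(np.exp(((c - j) / 255.0) ** 2) * np.exp(hist[j] ** 1.5)
                        for c in centers)
                if d > best_score:
                    best_score, best_bin = d, j
            centers.append(float(best_bin))
        # Algorithm 1 does not produce sorted centers; sort before use.
        return np.sort(np.array(centers))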
3. EXPERIMENTAL RESULTS AND DISCUSSION

The proposed initialization method was compared to the uniform initialization method on 16 images taken from the USC-SIPI Image Database [8]. Table 1 compares the methods for 4, 8, 12, and 16 gray levels on six representative images. The columns, from left to right, give the number of quantization levels (K), the MSE obtained by the uniform initialization method, the MSE obtained by the proposed method, and the percent improvement (or degradation) achieved by the proposed method.

Performance statistics over the entire set of 16 images are as follows. The proposed method outperformed uniform initialization in 73% of the cases. In the cases where the proposed method performed better, the average MSE improvement was 10.8%, whereas in the remaining cases the average MSE degradation was 5.34%. Overall, the proposed method obtained an average improvement of 6.49% in MSE.

In summary, the proposed method generally outperforms uniform initialization with respect to distortion minimization. Furthermore, in the cases where it gives inferior results, the discrepancy between the two methods is insignificant. This was expected, since uniform initialization allocates the quantization levels without regard to the gray-level distribution of the image, whereas the proposed method performs the allocation adaptively.

Figures 1 and 2 show sample quantization results for the Truck and Tiffany images and the corresponding error images, respectively. The error image for a particular initialization method was obtained by taking the pixelwise absolute difference between the original and quantized images. In order to obtain a better visualization, pixel values of the error images were multiplied by 8 and then negated.
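A minimal sketch of this visualization step, assuming 8-bit grayscale NumPy arrays (the helper name error_image is ours):

    import numpy as np

    def error_image(original, quantized):
        # Pixelwise absolute difference, amplified by 8, then negated.
        diff = np.abs(original.astype(np.int32) - quantized.astype(np.int32))
        return (255 - np.clip(8 * diff, 0, 255)).astype(np.uint8)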

It can be seen that the proposed method produces visually pleasing results, with less prominent contouring (see, for example, the top road in the Truck image and the neck area in the Tiffany image) and less distortion.

4. CONCLUSIONS

In this paper, we introduced an effective Lloyd-Max initialization algorithm for gray-level quantization. In contrast to other popular initialization schemes, this algorithm is adaptive, deterministic, and computationally efficient. Experiments on a large set of test images demonstrated that the presented method generally outperforms the commonly used uniform initialization method with respect to distortion minimization.

ACKNOWLEDGMENTS

This publication was made possible by grants from the Louisiana Board of Regents (LEQSF2008-11-RD-A-12) and the US National Science Foundation (0959583, 1117457).

Table 1. MSE comparison of the LMA initialization methods

     K    Uniform    Proposed    Delta (%)

Airplane (216 gray levels)
     4     131.39      129.64        1.33
     8      35.80       33.88        5.34
    12      18.60       19.87       -6.84
    16      10.59        9.04       14.69

House (227 gray levels)
     4     171.35      171.52       -0.10
     8      48.53       47.04        3.07
    12      22.71       21.86        3.72
    16      14.40       13.74        4.59

Lenna (216 gray levels)
     4     162.67      162.53        0.09
     8      43.99       41.92        4.69
    12      20.83       20.83        0.00
    16      14.91       12.92       13.31

Splash (234 gray levels)
     4     236.52      236.52        0.00
     8      54.00       49.01        9.24
    12      23.03       22.90        0.53
    16      13.78       13.38        2.93

Tiffany (179 gray levels)
     4     162.40      162.40        0.00
     8      32.32       32.96       -1.99
    12      22.43       20.66        7.91
    16      19.25        9.88       48.66

Truck (144 gray levels)
     4      71.95       72.41       -0.65
     8      43.50       29.21       32.84
    12      11.95       10.33       13.54
    16       6.50        6.45        0.77

Figure 1. Quantization results for the Truck image (K = 8): (a) original image; (b) uniform initialization; (c) error image; (d) proposed initialization; (e) error image.

Figure 2. Quantization results for the Tiffany image (K = 16): (a) original image; (b) uniform initialization; (c) error image; (d) proposed initialization; (e) error image.

REFERENCES

[1] Max, J., "Quantizing for Minimum Distortion," IRE Transactions on Information Theory 6(1), 7-12 (1960).
[2] Lloyd, S., "Least Squares Quantization in PCM," IEEE Transactions on Information Theory 28(2), 129-136 (1982).
[3] Scheunders, P., "A Genetic Lloyd-Max Image Quantization Algorithm," Pattern Recognition Letters 17(5), 547-556 (1996).
[4] Gonzalez, T. F., "Clustering to Minimize the Maximum Intercluster Distance," Theoretical Computer Science 38, 293-306 (1985).
[5] Hochbaum, D. and Shmoys, D., "A Best Possible Heuristic for the k-Center Problem," Mathematics of Operations Research 10(2), 180-184 (1985).
[6] Linde, Y., Buzo, A., and Gray, R., "An Algorithm for Vector Quantizer Design," IEEE Transactions on Communications 28(1), 84-95 (1980).
[7] Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C., [Introduction to Algorithms], The MIT Press, third ed. (2009).
[8] Weber, A., "The USC-SIPI Image Database," http://sipi.usc.edu/database/ (last accessed: November 20, 2011).