Automatic Techniques for Gridding cDNA Microarray Images

Naima Kaabouch (1), Member, IEEE, and Hamid Shahbazkia (2)
(1) Department of Electrical Engineering, University of North Dakota, Grand Forks, ND 58202-7165
(2) University of Algarve, FCT, Campus de Gambelas, 8000 Faro, Portugal

Abstract: Microarray technology is considered an important instrument and a powerful new technology for large-scale gene sequence and gene expression analysis. One of the major challenges of this technique is the image processing phase. The accuracy of this phase has an important impact on the accuracy and effectiveness of the subsequent gene expression and identification analysis. The processing can be organized mainly into four steps: gridding, spot isolation, segmentation, and quantification. Although several commercial software packages are now available, microarray image analysis still requires some intervention by the user, and thus a certain level of image processing expertise. This paper describes and compares four techniques that perform automatic gridding and spot isolation. The techniques are based on template matching, on the sum and standard deviation profiles of the image, and on the derivatives of these profiles. Experimental results show that the derivative of the sum profile is more accurate than the other techniques for both good and poor quality microarray images.

Index Terms: Automatic Gridding, Image Processing, Microarray Image Analysis.

I. INTRODUCTION

Microarray technology is considered an important instrument and a powerful new technology for large-scale gene sequence and gene expression analysis. The technique evolved from E. Southern's technique in the 1970s [1] and was modernized in the last decade by two key innovations. The first, pioneered by P. Brown [2], was the use of a nonporous solid support to facilitate miniaturization and fluorescence-based detection. The second was the development of methods for high-density spatial synthesis of oligonucleotides [3]. The resulting microarray images contain thousands of genes that can be used to survey DNA and RNA variations [4], and microarrays may someday become a standard tool for both molecular biology research and genomic clinical diagnosis, such as cancer diagnosis [5, 6] and diabetes diagnosis [7, 8].

One of the major challenges of the microarray technique is the image processing phase. The purpose of this phase is to extract each spotted DNA sequence as well as to obtain background estimates and quality measures. The accuracy of this phase has an important impact on the accuracy and effectiveness of the subsequent gene expression and identification analysis. The processing in this phase can be organized mainly into three steps: 1) gridding and extraction of individual spots; 2) segmentation of the spot images; 3) quantification. The performance of these steps is critical, since it directly affects the strategy and quality of downstream microarray data analysis. Because of the noise that affects microarray images, the image processing is analytically complex and labor intensive. It is therefore highly important that these steps be automated, in order to save time and resources and especially to remove user-dependent variations.

Commercial systems that perform some of these steps, such as ScanAlyze, GenePix, and QuantArray, already exist. However, most of these software systems are semiautomatic. For gridding, for example, they require the end user to specify the geometry of the array, such as the number of rows and columns. This manual setting works for very good images, but for average and poor quality images manual gridding is time consuming and rarely done correctly, which can affect not only the spot extraction but also the measured grayscale intensity of each spot and, hence, the quality of the gene expression data.

A few research papers have addressed automatic gridding. Jain et al. [9] used a gridding algorithm based on axis projections of image intensity. Yang et al. [10] used template matching and seeded region growing methods for gridding. Other authors proposed morphological methods for grid segmentation [11]. Since these approaches use templates or employ axis projections as a central component, irregular and overlapping grid layouts may cause problems. In this paper we describe and compare different techniques that address the problem of automatic gridding and spot extraction. These techniques are based on template matching, on the horizontal and vertical sum and standard deviation profiles, and on the derivatives of these profiles.

II. METHODOLOGY

Microarray images are grouped in subarrays, with each subarray containing several hundred spots. Thus, to extract each spot, the image first has to be segmented into subarrays. We implemented and compared several techniques to divide the image into subarrays and to extract each spot: template matching; horizontal and vertical sum profiles; horizontal and vertical standard deviation profiles; and derivatives of the horizontal and vertical sum profiles or of the standard deviation profiles.

A. Template matching

This technique uses a predefined image template to grid the subarray data image, thereby isolating each spot.

B. Horizontal and vertical sum profiles

The maxima and minima of the horizontal and vertical sum signals point to the locations of the subarrays and of the gaps between them, respectively (Figs. 3 and 4). Thus, the gridding can be done by locating the middle of the gaps between subarrays. To extract spots, the same technique is then applied to each subarray image, with the maxima representing the spots and the minima the gaps between them. The horizontal sum profile is obtained by adding, for each column, the pixel intensities over all rows. The sum profile is defined as follows:

$$S(i) = \sum_{j=1}^{N} I_{ij} \qquad (1)$$

where i and j are the indices of the image columns and rows, respectively, N is the number of image rows, and I_{ij} is the intensity of the pixel located at column i and row j.

C. Horizontal and vertical standard deviation

This technique uses the same methodology as in section B, but applies it to the standard deviation profiles. These profiles are also used to locate the middle of the gaps between subarrays and between spots. The horizontal profile is obtained by computing the standard deviation of the pixel intensities for every column. The horizontal standard deviation is defined as follows:

$$\sigma(i) = \left( \frac{1}{N} \sum_{j=1}^{N} \left( I_{ij} - \bar{I}_i \right)^2 \right)^{1/2} \qquad (2)$$

where N and M are the numbers of image rows and columns, respectively, i and j are the indices of the image columns and rows, respectively, and I_{ij} is the intensity of the pixel located at column i and row j. \bar{I}_i represents the mean of the intensity values of column i and is defined as follows:

$$\bar{I}_i = \frac{1}{N} \sum_{j=1}^{N} I_{ij} \qquad (3)$$

D. First derivatives of the standard deviation or the sum profiles

Because the subarrays are uniformly distributed, as are the spots in each subarray, the horizontal and vertical signals of any profile are periodic. However, because of the noise, the backgrounds of the microarray data images are not uniform and, thus, the gridding requires setting a threshold that depends on the quality of the image. This threshold can be avoided by computing the derivative of these profiles. The numerical derivative of a profile P is computed using the following equation:

$$P'(i) = \frac{\partial P(i)}{\partial i} \approx P(i) - P(i-1) \qquad (4)$$

where i is the index along the profile (the column index for a horizontal profile, the row index for a vertical profile). The periods of the horizontal and vertical profiles give the width and the height of the spot images, respectively. Although this technique can be used to grid the image into subarrays, the algorithm was applied only to extract spots from subarrays. The algorithm is as follows:

1) Compute the first and second derivatives of the horizontal profile.
2) Locate the pixels (x1, x2, ..., xn) where the first derivative is zero and the second derivative is positive (see Fig. 2).
3) Use these coordinates to divide the subarray vertically.
4) Compute the first and second derivatives of the vertical profile.
5) Locate the pixels where the first derivative is zero and the second derivative is positive.
6) Use these coordinates to divide the subarray horizontally.

Using techniques B, C, and D, the gridding is done by detecting the middle of the gaps between the signals corresponding to the subarrays, both horizontally and vertically. Once the subarrays are isolated, the same technique can be applied horizontally and vertically to each subarray in order to isolate the spots.
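As an illustration of the profiles of sections B and C, the following minimal NumPy sketch computes the horizontal and vertical sum and standard deviation profiles. It assumes the image is stored as a 2-D array indexed as image[row, column] and that the standard deviation uses the 1/N normalization; the function names sum_profiles and std_profiles are hypothetical and are not taken from the original work.

```python
import numpy as np

def sum_profiles(image):
    """Return (horizontal, vertical) sum profiles of a 2-D image.

    Assumption: image[j, i] holds the pixel at row j, column i.
    """
    img = np.asarray(image, dtype=float)
    horizontal = img.sum(axis=0)  # one value per column i (sum over rows)
    vertical = img.sum(axis=1)    # one value per row j (sum over columns)
    return horizontal, vertical

def std_profiles(image):
    """Return (horizontal, vertical) standard deviation profiles.

    np.std defaults to the population form (division by the sample count).
    """
    img = np.asarray(image, dtype=float)
    return img.std(axis=0), img.std(axis=1)

# Tiny hand-checkable image: 2 rows, 3 columns.
toy = np.array([[1, 2, 3],
                [3, 2, 1]])
h_sum, v_sum = sum_profiles(toy)
h_std, v_std = std_profiles(toy)
```

On a real subarray, peaks of either profile would mark rows or columns of spots and valleys would mark the gaps between them.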

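The derivative-based spot extraction of section D can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: on a discrete profile the first derivative is rarely exactly zero, so the sketch looks for a sign change of the first difference from negative to positive (the discrete counterpart of a zero first derivative with a positive second derivative) and takes the middle of any flat valley. The names profile_minima and grid_lines are hypothetical, and a noisy image would likely need smoothing first, which is not shown.

```python
import numpy as np

def profile_minima(profile):
    """Indices of local minima of a 1-D profile.

    A minimum is a falling run, an optional flat run, then a rising run;
    the middle of the flat run is reported.
    """
    p = np.asarray(profile, dtype=float)
    d = np.diff(p)  # d[k] = p[k + 1] - p[k]
    minima = []
    k, n = 0, len(d)
    while k < n:
        if d[k] < 0:
            while k < n and d[k] < 0:   # walk down the falling edge
                k += 1
            start = k                   # first index of the valley floor
            while k < n and d[k] == 0:  # walk along a flat valley
                k += 1
            if k < n and d[k] > 0:      # valley is closed by a rising edge
                minima.append((start + k) // 2)
        else:
            k += 1
    return minima

def grid_lines(subarray):
    """Column and row cut positions for one subarray (image[row, column])."""
    img = np.asarray(subarray, dtype=float)
    x_cuts = profile_minima(img.sum(axis=0))  # vertical cut lines
    y_cuts = profile_minima(img.sum(axis=1))  # horizontal cut lines
    return x_cuts, y_cuts

# Synthetic subarray: 2 rows x 3 columns of square "spots".
rows = np.array([0, 1, 1, 0, 0, 0, 1, 1, 0])
cols = np.array([0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0])
img = np.outer(rows, cols)
x_cuts, y_cuts = grid_lines(img)
```

Only valleys enclosed by spots on both sides are reported, so the image borders are not returned as cut positions.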
III. RESULTS

We have implemented several techniques to grid cDNA microarray images and compared their performances using a set of microarray images of different qualities. Fig. 3 shows a typical cDNA microarray image obtained from yeast and containing 16 subarrays. Each subarray contains 384 spots, for a total of 6,144 spots that have to be isolated automatically. This image is an example of an average quality image, in which the brightness of the spots is highly non-uniform. Most of the spots have intensity levels close to the background level, which makes gridding and spot extraction a difficult and tedious task.

Figs. 4 a) and 4 b) show the horizontal and vertical profiles of the sum and standard deviation, respectively, for the microarray data image of Fig. 3. These profiles are binarized by setting a threshold level that depends on the quality of the image. The resulting profile is then used to determine the middle of the gaps between signals, as shown in Fig. 4. The gridding results show high efficiency in segmenting the image into subarrays; however, the techniques based on the horizontal sum and standard deviation profiles are not fully automatic and can fail when used on average and poor quality images. This failure can occur because the gaps between signals corresponding to adjacent spots are very small and are easily affected by the noise, as shown in Fig. 5 and Fig. 6. Additionally, end users are required to set a threshold level for the noise and, hence, need some expertise in image processing. The template matching technique also gives good results but fails for images that present geometric distortion.

Setting the threshold level is avoided by computing the derivative of the horizontal sum and standard deviation profiles. As shown in Fig. 7 and Fig. 8, which correspond to the derivatives of the horizontal standard deviation and sum profiles, no threshold is needed. In each period, the maximum and minimum of the signal correspond to the left and right edges of the spot. The period of the signals in Fig. 7 and Fig. 8 gives the width of the spot image, while the period of the vertical signal gives the height of the spot image.

Figs. 9 and 10 give the result of the gridding for subarray number one, on the top left of Fig. 3, using the sum profile (or standard deviation profile) and the derivative of the sum profile (or derivative of the standard deviation profile), respectively. Comparing the last rows of the two images, one can conclude that techniques B and C fail to grid correctly, because several cells in the last row of Fig. 9 contain more than one spot, whereas the cells in the last row of Fig. 10, obtained using the derivative profile, contain only one spot per cell. The derivative technique showed better results than the other techniques, both in gridding and in extracting individual spots. Of the two derivative variants, the derivative of the sum profile gives stronger signals and, thus, better results in gridding and extracting spots.

Fig. 1. Methodology for gridding an image consisting of four subarrays.
Fig. 2. Methodology of gridding a subarray using the first derivative of the sum or standard deviation profiles.
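For comparison, the threshold-based gridding used with the sum and standard deviation profiles in this section can be sketched as below: the profile is binarized and the middle of each gap between two signal runs is returned. This is a minimal sketch under assumptions, with a hypothetical function name gap_midpoints and made-up profile values chosen only to show how the result depends on the threshold.

```python
import numpy as np

def gap_midpoints(profile, threshold):
    """Middle index of every below-threshold gap lying between two signals."""
    p = np.asarray(profile, dtype=float)
    is_signal = p > threshold      # binarized profile
    mids = []
    gap_start = None               # index where the current gap began
    seen_signal = False            # ignore the leading border gap
    for k, s in enumerate(is_signal):
        if s:
            if gap_start is not None and seen_signal:
                mids.append((gap_start + k - 1) // 2)
            gap_start = None
            seen_signal = True
        elif gap_start is None:
            gap_start = k
    return mids

# Clean profile: both gaps are found with a low threshold.
clean = [0, 5, 5, 0, 0, 0, 5, 5, 0, 0, 0, 5, 5, 0]
clean_mids = gap_midpoints(clean, threshold=2)

# Raised background in the first gap: the same threshold misses it,
# and a higher, image-dependent threshold is needed to recover it.
noisy = [1, 5, 5, 3, 3, 3, 5, 5, 1, 1, 1, 5, 5, 1]
missed = gap_midpoints(noisy, threshold=2)
recovered = gap_midpoints(noisy, threshold=4)
```

In this toy case the first gap disappears when the background rises above the chosen threshold, which is the kind of user-tuned dependence the derivative profiles are meant to avoid.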

Fig. 3. Example of a microarray image containing 16 subarrays.
Fig. 4. Typical profiles of a) the horizontal and b) the vertical standard deviation and sum of Fig. 3.
Fig. 5. Profile of the horizontal standard deviation.
Fig. 6. Profile of the horizontal sum.

Fig. 7. Derivative of the horizontal standard deviation profile.
Fig. 8. Derivative of the horizontal sum profile.
Fig. 9. Gridded subarray number one in Fig. 3 using the horizontal and vertical sum (or standard deviation) profiles.
Fig. 10. Gridded subarray number one in Fig. 3 using the derivative of the horizontal and vertical sum (or standard deviation) profiles.

CONCLUSION

In this work, we have presented a new technique that automatically subdivides microarray images into subarrays and isolates the spots in each subarray image. We have compared the robustness and performance of this technique with other existing algorithms using a set of cDNA microarray images of different qualities and with different spot shapes. This evaluation and comparison showed that the derivative of the sum profile technique is the most accurate for gridding all types of microarray data images. The proposed technique is not affected by the noise or the shape of the spots, and does not require any threshold level. However, if the image or the subarrays are misaligned, an alignment algorithm is required before gridding the image or the subarrays. Our objective in the future is to address this issue and to develop criteria to evaluate the proposed algorithms.

REFERENCES

[1] E. M. Southern, "Detection of specific sequences among DNA fragments separated by gel electrophoresis," J. Mol. Biol., vol. 98, pp. 503-517, 1975.
[2] M. Schena, D. Shalon, R. W. Davis, and P. O. Brown, "Quantitative monitoring of gene expression patterns with a complementary DNA microarray," Science, vol. 270, pp. 467-470, 1995.
[3] S. P. A. Fodor, J. L. Read, M. C. Pirrung, L. Stryer, A. T. Lu, and D. Solas, "Light-directed, spatially addressable parallel chemical synthesis," Science, vol. 251, pp. 767-773, 1991.
[4] E. S. Lander, "Array of hope," Nature Genetics Suppl., vol. 21, pp. 3-4, Jan. 1999.
[5] A. Alizadeh, M. Eisen, P. O. Brown, and L. M. Staudt, "Probing lymphocyte biology by genomic-scale gene expression analysis," J. Clin. Immunol., vol. 18, pp. 373-379, 1998.
[6] C. M. Perou, S. S. Jeffrey, M. van de Rijn, et al., "Distinctive gene expression patterns in human mammary epithelial cells and breast cancers," in Proc. Nat. Acad. Sci., vol. 96, 1999, pp. 9212-9217.
[7] L. M. Scearce et al., "Functional genomics of the endocrine pancreas: The pancreas clone set and PancChip, new resources for diabetes research," Diabetes, vol. 51, pp. 1997-2002, 2002.
[8] S. T. Nadler, J. P. Stoehr, K. L. Schueler, G. Tanimoto, B. S. Yandel, and A. D. Attie, "The expression of adipogenic genes is decreased in obesity and diabetes mellitus," Proc. Nat. Acad. Sci., vol. 97, pp. 11371-11376, 2000.
[9] A. Jain, T. Tokuyasu, A. Snijders, R. Segraves, D. Albertson, and D. Pinkel, "Fully automatic quantification of microarray image data," Genome Res., vol. 12, no. 2, pp. 325-332, 2002.
[10] Y. H. Yang, M. J. Buckley, S. Dudoit, and T. P. Speed, "Comparison of methods for image analysis on cDNA microarray data," J. Comput. Graph. Statist., vol. 11, pp. 108-136, 2002.
[11] R. Hirata, J. Barrera, and R. F. Hashimoto, "Segmentation of microarray images by mathematical morphology," Real-Time Imaging, vol. 8, no. 6, pp. 491-505, 2002.