From microarray images to Biological knowledge. Junior Barrera BIOINFO-USP DCC/IME-USP
|
|
- Miles Riley
- 6 years ago
- Views:
Transcription
1 From microarray images to Biological knowledge Junior Barrera BIOINFO-USP DCC/IME-USP
2 Team Hugo A. Armelin Junior Barrera Helena Brentaini Marcel Brun Y. Chen Edward R. Dougherty Roberto M. Cesar Jr. Daniel Dantas Gustavo Esteves Marco D. Gubitoso Nina S. T. Hirata Roberto Hirata Jr Luiz F. Reis Paulo S. Silva Sandro de Souza Walter Trepode...
3 Layout Introduction Chip design Image analysis Normalization Genes Signature Clustering Genetic networks An environment for knowledge discovery
4 Introduction
5 Knowledge evolution in genetics Gene expression
6 Knowledge evolution in genetics
7 Data acquisition
8 Data acquisition
9 Analysis Phases G. Signature Measure Normal, Clustering Clustering Clustering G. R. Net.
10 Biological questions What genes are enough to separate a cancerous tissue from a normal one? What genes can separate to types of cancer? What genes differentiate two types of chickens or trees? What genes regulate a pathway?
11 Chip design
12 Clone Selection 1. Clustering for the same gene 5 Known gene 3 2. Choice of a clone representing the gene 5 3 Cuidado com sítios de Poliadenilação!!
13 Affect hybridization process Clone size - they should have similar size Clone position in the gene - they should come from similar regions Clone similarity - they should not have large similar regions
14 Hybridization cdnas Captura das Imagens Cy5 Misturas de cdnas marcados CY3 e CY5 Cy3
15 Image Analysis
16 A scanned image of a microarray slide
17 Processo de Segmentação de Imagens de Microarrays Hirata R, Barrera J, Hashimoto R, Dantas D, Esteves G. In press, 2002.
18 Raw data to the gene expression estimation step
19 Available solutions Scanalyze: usually doesn t t find misaligned spots. SpotFinder(TIGR): subarrays must be placed manually. Arrayvision: very good on locating misaligned spots; many options. UCSF Spot: : does everything automatically if the image is perfect. Quantarray,, F-scanF scan, Dapple, Genepix, Imagene etc. All of them require user interaction to some level.
20 Our aim... Is to reduce the user interaction, doing the job automatically and measuring correctly the relative mrna concentrations. This will make the process cheaper and faster. User interaction makes the segmentation subjective. Eliminating that, the results may be more reproducible.
21 Our software
22 Parameter setting In this window the user sets parameters for a whole family of arrays He can save in a file for reusing them
23 A vertical image profile is the sum of the spots values of each image line
24 The subarray gridding Is done by filtering the horizontal and vertical profiles
25 And finally taking the local minima of the filtered profile
26 the same is done with the horizontal profile. Here the result
27 Spots gridding is done separately for each subarray
28 The profile filtering is simpler having just one step, and also uses local minima
29 The spots detection step is basically the application of the Watershed operator
30 To avoid oversegmentation the image must be filtered
31 The filtered image also gives markers that will be used in Watershed
32 We give as input to the Watershed the markers, grid and the filtered image gradient
33 Here the resulting grid in white and spots cortours in light blue
34 Segmentation example
35 Segmentation example
36 Raw data to the gene expression estimation step The raw data of a spot consists on: the pixels values of both channels inside its rectangular region of interest which pixels belong to foreground or backround Foreground is the region with spotted cdna Background is the region without it.
37 Raw data to the gene expression estimation step
38 Gene expression estimation Is to find a value that represents the relative quantity of mrna in the two samples.
39 Some techniques to estimate gene expression Linear regression or least-squares squares fit of the values of pixels in the two channels.
40 Some techniques to estimate gene expression (ch1i-ch1b) ch1b) / (ch2i-ch2b) ch2b) where chxi is the estimated foreground intensity and chxb is the estimated backround intensity of channel X.
41 Some techniques to estimate gene expression To estimate chxi and chxb we can do: mean or median of all pixels in the foreground and background. mean or median of some percentiles in the foreground and background (fixed region method) mean or median of higher percentiles of all the pixels in the rectangle to estimate chxi and of lower percentiles to estimate the chxb.. Foreground and background information is ignored (histogram method)
42 Gene expression generation The program saves the expression data in a tab separated text file The file has the same format of the ones generated by ScanAlyze
43 Validation We made controlled experiments to test the expression estimation techniques. The objective of the experiment was to test how expression was affected by: position in the slide dilution of cdna length of mrna fragments being marked with cy3 or cy5
44 Validation We spotted microarrays with 32 blocks, each block with 6 genes x 5 dilutions x 2 repetitions + 4 landmarks = 64 spots We made six slides like this and, onto them, we poured six different mrna soups: Dilution gene 43A 43B 44A 44B 45A 45B Irf Trp ST IL Q Lys
45 Validation Here each point is the value of a spot obtained by the fixed region method. Spots from different dilutions are grouped. The black ones are from the three bigger mrna fragments, and the red, from the three smaller. sqrt( ( (ch1i A -ch1b A ) x (ch2i B -ch2b B ) ) sqrt( ( (ch2i A -ch2b A ) x (ch1i B -ch1b B ) )
46 Validation And here is the best result, obtained with the histogram method. sqrt( ( (ch1i A -ch1b A ) x (ch2i B -ch2b B ) ) sqrt( ( (ch2i A -ch2b A ) x (ch1i B -ch1b B ) )
47 Normalization
48 Normalization The expected expression of the gene IRF was 1.0 but the expression found was 1.6 This is due to the physical properties of the dyes.
49 Normalization When we have a single slide, we must eliminate the constant k assuming, when appropriate, that we can normalize all the spots using the expression of a housekeeping gene we can normalize assuming that the mean of normalized expression rates is one x = k (ch1i ch1b) (ch2i ch2b)
50 Normalization by swap Consists on eliminating the influence of the dyes properties by using two slides, and swapping the dye used to label the mrna sample. Use it if you find the single slide normalization hypotheses too strong.
51 Normalization by swap Better results can be achieved by doing swap experiments. x = k (ch1i A ch1b A ) = (ch2i B ch2b B ) 1 (ch2i A ch2b A ) (ch1i B ch1b B ) k
52 Normalization by swap Better results can be achieved by doing swap experiments. x = (ch1i (ch2i A A ch1b ch2b A A ) ) (ch2i (ch1i B B ch2b ch1b B B ) )
53 Genes Signature
54 Cancer tissue characterization Problem: from a small set (60) of microarrays, find a minimum number of genes that are enough to separate cancer A and B.
55 v erb b2 avian erythroblastic leukemia viral oncogene homolog LINEAR CLASSIFIER (DISPERSED GAUSSIAN) w/ σ = the tumor sample from Patient phosphofructokinase, platelet 7 th ( ) BRCA1 BRCA2/sporadic 1 th ( )
56 Approach randomize data compute classifier using genes subsets measure error for different dispersions choose the subset that balance small error and high dispersion. A supercomputer is required.
57 Linear classifier Dispersion centered in the sample Flat round dispersion model Error computed analytically (faster) X i [R n-1 ] 2 R h 2 r k R x k h X j [R 1 ] Misclassification error, ε n (h,r) Hyper-plane section: V n-1
58 Robustness analysis r k r j
59 Error curves under various dispersion levels, σ σ = 0.20 σ = 0.30 σ = 0.40 σ = 0.50 σ = 0.60 σ = 0.70 misclassification erorr, ε # of variables, d
60 Linear programming optimization
61
62
63
64 Steps The best linear classifier uses about genes Genes used are eliminated and the best linear classifier is computed, more genes are separated The procedure is repeated till having about 100 genes The full search is applied in the selected subset of genes
65 Separation
66 Validation Expression of chosen subsets of genes are measured several times in low cost experiments If the experiments reveal compact clusters the subset of genes chosen should be correct.
67 Clustering
68 Time course data 6 time course data C1 C2 C3 4 Clustered by dendrogram 2 Original clusters log2(ratio) 0 Dendrogram time KL plot multidimensional space different views of Microarray data KL plot mis-classified C1 C2 C
69 Example No error! Tighter clusters due to small variance Results from Fuzzy c-means
70 Gene Regulation Networks
71 Cell peptide other signals GENES NETWORK mrna Transcr. Translat. Proteins Pool Metabolic Pathways
72 Cell Cycle Modeling Division Steps x1fp... x1 z u1 y1fp x1f y1f I u2 u5 p w1fp vfp w2fp w1f v w2f y2fp w1 w2 x3f x4f y1f y2f y2f y1 y2 z Forward Signal Feedback to p Feedback to previous layer... x6fp x6f x6
73 Modeling Dynamical Systems
74 Simulator Architecture
75
76
77
78 Knockout x1fp... x1 Division Steps z u1 y1fp x1f y1f I u2 u5 p w1fp vfp w2fp w1f v w2f y2fp w1 w2 x3f x4f y1f y2f y2f y1 y2 z Forward Signal Feedback to p Feedback to previous layer... x6fp x6f x6
79
80
81 An environment for knowledge discovery
82 Objected oriented database P1 P2 Pn Pi : analytical and mining procedures (kernel parallel)
83 Integrated Environment Genbank database Another databases Access control module Clinical data Sequence module P1 microarray module P1 analitical operational P2 P3 analitical operational P2 P3 query Pn Pn query query
84 Users 1 Analysis tools writer SpreadSheet Dataminig 2 M_database MOLAP/ROLAP server 3 Metadata Data Warehouse Data Mart Integrator Wrapper Wrapper Wrapper 4 IMS RDMS Others
85 GRID Computer - DCC-IME-USP Internet and Intranet E4000 E4000:Databases and web application services (by SEFAZ) 10 processors 16 G-bytes main memory 1 T-bytes disk Processing services E3500 V880 V880?... (by CAGE-FAPESP) 4 processors 8 gigabytes (by Malaria-FAPESP) 16 processors 32 G-bytes (by eucalipto-cnpq???) 16 processors 32 G-bytes
86 Σ What genes regulate the pathway A->B->C->D? Wet Lab Proteome Transcriptome Genome Pathways
Clustering Algorithms: Can anything be Concluded?
Clustering Algorithms: Can anything be Concluded? Edward R. Dougherty, Seungchan Kim Texas A&M University Junior Barrera, Marcel Brun, Roberto Marcondes Universidade de Sao Paulo Yidong Chen, Michael Bittner,
More information/ Computational Genomics. Normalization
10-810 /02-710 Computational Genomics Normalization Genes and Gene Expression Technology Display of Expression Information Yeast cell cycle expression Experiments (over time) baseline expression program
More informationSVM Classification in -Arrays
SVM Classification in -Arrays SVM classification and validation of cancer tissue samples using microarray expression data Furey et al, 2000 Special Topics in Bioinformatics, SS10 A. Regl, 7055213 What
More informationIntroduction to GE Microarray data analysis Practical Course MolBio 2012
Introduction to GE Microarray data analysis Practical Course MolBio 2012 Claudia Pommerenke Nov-2012 Transkriptomanalyselabor TAL Microarray and Deep Sequencing Core Facility Göttingen University Medical
More informationMicro-array Image Analysis using Clustering Methods
Micro-array Image Analysis using Clustering Methods Mrs Rekha A Kulkarni PICT PUNE kulkarni_rekha@hotmail.com Abstract Micro-array imaging is an emerging technology and several experimental procedures
More informationAutomatic Techniques for Gridding cdna Microarray Images
Automatic Techniques for Gridding cda Microarray Images aima Kaabouch, Member, IEEE, and Hamid Shahbazkia Department of Electrical Engineering, University of orth Dakota Grand Forks, D 58202-765 2 University
More informationEECS490: Digital Image Processing. Lecture #22
Lecture #22 Gold Standard project images Otsu thresholding Local thresholding Region segmentation Watershed segmentation Frequency-domain techniques Project Images 1 Project Images 2 Project Images 3 Project
More informationMicroarray Data Analysis (V) Preprocessing (i): two-color spotted arrays
Microarray Data Analysis (V) Preprocessing (i): two-color spotted arrays Preprocessing Probe-level data: the intensities read for each of the components. Genomic-level data: the measures being used in
More informationMICROARRAY IMAGE SEGMENTATION USING CLUSTERING METHODS
Mathematical and Computational Applications, Vol. 5, No. 2, pp. 240-247, 200. Association for Scientific Research MICROARRAY IMAGE SEGMENTATION USING CLUSTERING METHODS Volkan Uslan and Đhsan Ömür Bucak
More informationNature Publishing Group
Figure S I II III 6 7 8 IV ratio ssdna (S/G) WT hr hr hr 6 7 8 9 V 6 6 7 7 8 8 9 9 VII 6 7 8 9 X VI XI VIII IX ratio ssdna (S/G) rad hr hr hr 6 7 Chromosome Coordinate (kb) 6 6 Nature Publishing Group
More informationFuzzy C-means with Bi-dimensional Empirical Mode Decomposition for Segmentation of Microarray Image
www.ijcsi.org 316 Fuzzy C-means with Bi-dimensional Empirical Mode Decomposition for Segmentation of Microarray Image J.Harikiran 1, D.RamaKrishna 2, M.L.Phanendra 3, Dr.P.V.Lakshmi 4, Dr.R.Kiran Kumar
More informationGoal-oriented Schema in Biological Database Design
Goal-oriented Schema in Biological Database Design Ping Chen Department of Computer Science University of Helsinki Helsinki, Finland 00014 EMAIL: pchen@cs.helsinki.fi Abstract In this paper, I reviewed
More informationIntroduction to Data Mining of Microarrays using the MicroArray Explorer
Introduction to Data Mining of Microarrays using the MicroArray Explorer Peter F. Lemkin Lab. Experimental & Computational Biology, CCR, NCI Frederick, MD 21702 MAExplorer: http://www.lecb.ncifcrf.gov/maexplorer
More informationExploratory data analysis for microarrays
Exploratory data analysis for microarrays Jörg Rahnenführer Computational Biology and Applied Algorithmics Max Planck Institute for Informatics D-66123 Saarbrücken Germany NGFN - Courses in Practical DNA
More informationEECS 730 Introduction to Bioinformatics Microarray. Luke Huan Electrical Engineering and Computer Science
EECS 730 Introduction to Bioinformatics Microarray Luke Huan Electrical Engineering and Computer Science http://people.eecs.ku.edu/~jhuan/ GeneChip 2011/11/29 EECS 730 2 Hybridization to the Chip 2011/11/29
More informationCARMAweb users guide version Johannes Rainer
CARMAweb users guide version 1.0.8 Johannes Rainer July 4, 2006 Contents 1 Introduction 1 2 Preprocessing 5 2.1 Preprocessing of Affymetrix GeneChip data............................. 5 2.2 Preprocessing
More informationPreprocessing -- examples in microarrays
Preprocessing -- examples in microarrays I: cdna arrays Image processing Addressing (gridding) Segmentation (classify a pixel as foreground or background) Intensity extraction (summary statistic) Normalization
More information9/29/13. Outline Data mining tasks. Clustering algorithms. Applications of clustering in biology
9/9/ I9 Introduction to Bioinformatics, Clustering algorithms Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Outline Data mining tasks Predictive tasks vs descriptive tasks Example
More informationTutorial:OverRepresentation - OpenTutorials
Tutorial:OverRepresentation From OpenTutorials Slideshow OverRepresentation (about 12 minutes) (http://opentutorials.rbvi.ucsf.edu/index.php?title=tutorial:overrepresentation& ce_slide=true&ce_style=cytoscape)
More informationNormalization: Bioconductor s marray package
Normalization: Bioconductor s marray package Yee Hwa Yang 1 and Sandrine Dudoit 2 October 30, 2017 1. Department of edicine, University of California, San Francisco, jean@biostat.berkeley.edu 2. Division
More informationAnalyzing ICAT Data. Analyzing ICAT Data
Analyzing ICAT Data Gary Van Domselaar University of Alberta Analyzing ICAT Data ICAT: Isotope Coded Affinity Tag Introduced in 1999 by Ruedi Aebersold as a method for quantitative analysis of complex
More informationPathway Studio Quick Start Guide
Pathway Studio Quick Start Guide This Quick Start Guide is for users of the Pathway Studio 4.0 pathway analysis software. The Quick Start Guide demonstrates the key features of the software and provides
More informationStatistics 202: Data Mining. c Jonathan Taylor. Week 8 Based in part on slides from textbook, slides of Susan Holmes. December 2, / 1
Week 8 Based in part on slides from textbook, slides of Susan Holmes December 2, 2012 1 / 1 Part I Clustering 2 / 1 Clustering Clustering Goal: Finding groups of objects such that the objects in a group
More informationcdna Microarray Genome Image Processing Using Fixed Spot Position
American Journal of Applied Sciences 3 (2): 1730-1734, 2006 ISSN 1546-9239 2006 Science Publications cdna Microarray Genome Image Processing Using Fixed Spot Position Basim Alhadidi, Hussam Nawwaf Fakhouri
More informationTutorial. RNA-Seq Analysis of Breast Cancer Data. Sample to Insight. November 21, 2017
RNA-Seq Analysis of Breast Cancer Data November 21, 2017 Sample to Insight QIAGEN Aarhus Silkeborgvej 2 Prismet 8000 Aarhus C Denmark Telephone: +45 70 22 32 44 www.qiagenbioinformatics.com AdvancedGenomicsSupport@qiagen.com
More informationBioconductor exercises 1. Exploring cdna data. June Wolfgang Huber and Andreas Buness
Bioconductor exercises Exploring cdna data June 2004 Wolfgang Huber and Andreas Buness The following exercise will show you some possibilities to load data from spotted cdna microarrays into R, and to
More informationAnalysis of (cdna) Microarray Data: Part I. Sources of Bias and Normalisation
Analysis of (cdna) Microarray Data: Part I. Sources of Bias and Normalisation MICROARRAY ANALYSIS My (Educated?) View 1. Data included in GEXEX a. Whole data stored and securely available b. GP3xCLI on
More informationClustering. CS294 Practical Machine Learning Junming Yin 10/09/06
Clustering CS294 Practical Machine Learning Junming Yin 10/09/06 Outline Introduction Unsupervised learning What is clustering? Application Dissimilarity (similarity) of objects Clustering algorithm K-means,
More informationDiscovery Net : A UK e-science Pilot Project for Grid-based Knowledge Discovery Services. Patrick Wendel Imperial College, London
Discovery Net : A UK e-science Pilot Project for Grid-based Knowledge Discovery Services Patrick Wendel Imperial College, London Data Mining and Exploration Middleware for Distributed and Grid Computing,
More informationMAGE-ML: MicroArray Gene Expression Markup Language
MAGE-ML: MicroArray Gene Expression Markup Language Links: - Full MAGE specification: http://cgi.omg.org/cgi-bin/doc?lifesci/01-10-01 - MAGE-ML Document Type Definition (DTD): http://cgi.omg.org/cgibin/doc?lifesci/01-11-02
More informationIntroduction to Systems Biology II: Lab
Introduction to Systems Biology II: Lab Amin Emad NIH BD2K KnowEnG Center of Excellence in Big Data Computing Carl R. Woese Institute for Genomic Biology Department of Computer Science University of Illinois
More informationCLUSTERING IN BIOINFORMATICS
CLUSTERING IN BIOINFORMATICS CSE/BIMM/BENG 8 MAY 4, 0 OVERVIEW Define the clustering problem Motivation: gene expression and microarrays Types of clustering Clustering algorithms Other applications of
More informationSegmentation
Lecture 6: Segmentation 24--4 Robin Strand Centre for Image Analysis Dept. of IT Uppsala University Today What is image segmentation? A smörgåsbord of methods for image segmentation: Thresholding Edge-based
More informationWindows. RNA-Seq Tutorial
Windows RNA-Seq Tutorial 2017 Gene Codes Corporation Gene Codes Corporation 525 Avis Drive, Ann Arbor, MI 48108 USA 1.800.497.4939 (USA) +1.734.769.7249 (elsewhere) +1.734.769.7074 (fax) www.genecodes.com
More informationMATH3880 Introduction to Statistics and DNA MATH5880 Statistics and DNA Practical Session Monday, 16 November pm BRAGG Cluster
MATH3880 Introduction to Statistics and DNA MATH5880 Statistics and DNA Practical Session Monday, 6 November 2009 3.00 pm BRAGG Cluster This document contains the tasks need to be done and completed by
More informationECS 234: Data Analysis: Clustering ECS 234
: Data Analysis: Clustering What is Clustering? Given n objects, assign them to groups (clusters) based on their similarity Unsupervised Machine Learning Class Discovery Difficult, and maybe ill-posed
More informationTHE preceding chapters were all devoted to the analysis of images and signals which
Chapter 5 Segmentation of Color, Texture, and Orientation Images THE preceding chapters were all devoted to the analysis of images and signals which take values in IR. It is often necessary, however, to
More informationNetwork Analysis, Visualization, & Graphing TORonto (NAViGaTOR) User Documentation
Network Analysis, Visualization, & Graphing TORonto (NAViGaTOR) User Documentation Jurisica Lab, Ontario Cancer Institute http://ophid.utoronto.ca/navigator/ November 10, 2006 Contents 1 Introduction 2
More informationDouble Self-Organizing Maps to Cluster Gene Expression Data
Double Self-Organizing Maps to Cluster Gene Expression Data Dali Wang, Habtom Ressom, Mohamad Musavi, Cristian Domnisoru University of Maine, Department of Electrical & Computer Engineering, Intelligent
More informationAutomated Bioinformatics Analysis System on Chip ABASOC. version 1.1
Automated Bioinformatics Analysis System on Chip ABASOC version 1.1 Phillip Winston Miller, Priyam Patel, Daniel L. Johnson, PhD. University of Tennessee Health Science Center Office of Research Molecular
More informationHow do microarrays work
Lecture 3 (continued) Alvis Brazma European Bioinformatics Institute How do microarrays work condition mrna cdna hybridise to microarray condition Sample RNA extract labelled acid acid acid nucleic acid
More informationAcquisition Description Exploration Examination Understanding what data is collected. Characterizing properties of data.
Summary Statistics Acquisition Description Exploration Examination what data is collected Characterizing properties of data. Exploring the data distribution(s). Identifying data quality problems. Selecting
More informationBiclustering for Microarray Data: A Short and Comprehensive Tutorial
Biclustering for Microarray Data: A Short and Comprehensive Tutorial 1 Arabinda Panda, 2 Satchidananda Dehuri 1 Department of Computer Science, Modern Engineering & Management Studies, Balasore 2 Department
More informationReview of feature selection techniques in bioinformatics by Yvan Saeys, Iñaki Inza and Pedro Larrañaga.
Americo Pereira, Jan Otto Review of feature selection techniques in bioinformatics by Yvan Saeys, Iñaki Inza and Pedro Larrañaga. ABSTRACT In this paper we want to explain what feature selection is and
More informationComputer Vision Group Prof. Daniel Cremers. 8. Boosting and Bagging
Prof. Daniel Cremers 8. Boosting and Bagging Repetition: Regression We start with a set of basis functions (x) =( 0 (x), 1(x),..., M 1(x)) x 2 í d The goal is to fit a model into the data y(x, w) =w T
More informationIncorporating Known Pathways into Gene Clustering Algorithms for Genetic Expression Data
Incorporating Known Pathways into Gene Clustering Algorithms for Genetic Expression Data Ryan Atallah, John Ryan, David Aeschlimann December 14, 2013 Abstract In this project, we study the problem of classifying
More informationVector Xpression 3. Speed Tutorial: III. Creating a Script for Automating Normalization of Data
Vector Xpression 3 Speed Tutorial: III. Creating a Script for Automating Normalization of Data Table of Contents Table of Contents...1 Important: Please Read...1 Opening Data in Raw Data Viewer...2 Creating
More informationSegmentation
Lecture 6: Segmentation 215-13-11 Filip Malmberg Centre for Image Analysis Uppsala University 2 Today What is image segmentation? A smörgåsbord of methods for image segmentation: Thresholding Edge-based
More informationIs deformable image registration a solved problem?
Is deformable image registration a solved problem? Marcel van Herk On behalf of the imaging group of the RT department of NKI/AVL Amsterdam, the Netherlands DIR 1 Image registration Find translation.deformation
More informationExcel R Tips. is used for multiplication. + is used for addition. is used for subtraction. / is used for division
Excel R Tips EXCEL TIP 1: INPUTTING FORMULAS To input a formula in Excel, click on the cell you want to place your formula in, and begin your formula with an equals sign (=). There are several functions
More informationMachine Learning. Computational biology: Sequence alignment and profile HMMs
10-601 Machine Learning Computational biology: Sequence alignment and profile HMMs Central dogma DNA CCTGAGCCAACTATTGATGAA transcription mrna CCUGAGCCAACUAUUGAUGAA translation Protein PEPTIDE 2 Growth
More informationGene signature selection to predict survival benefits from adjuvant chemotherapy in NSCLC patients
1 Gene signature selection to predict survival benefits from adjuvant chemotherapy in NSCLC patients 1,2 Keyue Ding, Ph.D. Nov. 8, 2014 1 NCIC Clinical Trials Group, Kingston, Ontario, Canada 2 Dept. Public
More informationModern Medical Image Analysis 8DC00 Exam
Parts of answers are inside square brackets [... ]. These parts are optional. Answers can be written in Dutch or in English, as you prefer. You can use drawings and diagrams to support your textual answers.
More informationSVM CLASSIFICATION AND ANALYSIS OF MARGIN DISTANCE ON MICROARRAY DATA. A Thesis. Presented to. The Graduate Faculty of The University of Akron
SVM CLASSIFICATION AND ANALYSIS OF MARGIN DISTANCE ON MICROARRAY DATA A Thesis Presented to The Graduate Faculty of The University of Akron In Partial Fulfillment of the Requirements for the Degree Master
More informationRegion-based Segmentation
Region-based Segmentation Image Segmentation Group similar components (such as, pixels in an image, image frames in a video) to obtain a compact representation. Applications: Finding tumors, veins, etc.
More informationAnalyzing Genomic Data with NOJAH
Analyzing Genomic Data with NOJAH TAB A) GENOME WIDE ANALYSIS Step 1: Select the example dataset or upload your own. Two example datasets are available. Genome-Wide TCGA-BRCA Expression datasets and CoMMpass
More informationInformatica Data Explorer Performance Tuning
Informatica Data Explorer Performance Tuning 2011 Informatica Corporation. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise)
More informationClustering Color/Intensity. Group together pixels of similar color/intensity.
Clustering Color/Intensity Group together pixels of similar color/intensity. Agglomerative Clustering Cluster = connected pixels with similar color. Optimal decomposition may be hard. For example, find
More informationGene Clustering & Classification
BINF, Introduction to Computational Biology Gene Clustering & Classification Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Introduction to Gene Clustering
More information0 Graphical Analysis Use of Excel
Lab 0 Graphical Analysis Use of Excel What You Need To Know: This lab is to familiarize you with the graphing ability of excels. You will be plotting data set, curve fitting and using error bars on the
More informationWorkload Characterization Techniques
Workload Characterization Techniques Raj Jain Washington University in Saint Louis Saint Louis, MO 63130 Jain@cse.wustl.edu These slides are available on-line at: http://www.cse.wustl.edu/~jain/cse567-08/
More informationTable Of Contents: xix Foreword to Second Edition
Data Mining : Concepts and Techniques Table Of Contents: Foreword xix Foreword to Second Edition xxi Preface xxiii Acknowledgments xxxi About the Authors xxxv Chapter 1 Introduction 1 (38) 1.1 Why Data
More informationIntroduction to the Bioconductor marray package : Input component
Introduction to the Bioconductor marray package : Input component Yee Hwa Yang 1 and Sandrine Dudoit 2 April 30, 2018 Contents 1. Department of Medicine, University of California, San Francisco, jean@biostat.berkeley.edu
More informationFeature Selection in Knowledge Discovery
Feature Selection in Knowledge Discovery Susana Vieira Technical University of Lisbon, Instituto Superior Técnico Department of Mechanical Engineering, Center of Intelligent Systems, IDMEC-LAETA Av. Rovisco
More informationOut-of-sample extension of diffusion maps in a computer-aided diagnosis system. Application to breast cancer virtual slide images.
Out-of-sample extension of diffusion maps in a computer-aided diagnosis system. Application to breast cancer virtual slide images. Philippe BELHOMME Myriam OGER Jean-Jacques MICHELS Benoit PLANCOULAINE
More informationIntroduction to Medical Imaging (5XSA0) Module 5
Introduction to Medical Imaging (5XSA0) Module 5 Segmentation Jungong Han, Dirk Farin, Sveta Zinger ( s.zinger@tue.nl ) 1 Outline Introduction Color Segmentation region-growing region-merging watershed
More informationAutomated Data Pipelines & Intelligent Microscopy
Galaxy Workshop Freiburg University Automated Data Pipelines & Intelligent Microscopy RNAi Screening Facility BioQuant, Heidelberg University Jürgen Reymann Scientific Infrastructure Automated Data Pipelines
More informationTutorial 1: Exploring the UCSC Genome Browser
Last updated: May 12, 2011 Tutorial 1: Exploring the UCSC Genome Browser Open the homepage of the UCSC Genome Browser at: http://genome.ucsc.edu/ In the blue bar at the top, click on the Genomes link.
More informationNorbert Schuff VA Medical Center and UCSF
Norbert Schuff Medical Center and UCSF Norbert.schuff@ucsf.edu Medical Imaging Informatics N.Schuff Course # 170.03 Slide 1/67 Objective Learn the principle segmentation techniques Understand the role
More informationGene expression & Clustering (Chapter 10)
Gene expression & Clustering (Chapter 10) Determining gene function Sequence comparison tells us if a gene is similar to another gene, e.g., in a new species Dynamic programming Approximate pattern matching
More information10 Microarray Analysis Using the MicroArray Explorer
10 Microarray Analysis Using the MicroArray Explorer Peter F. Lemkin, Gregory C. Thornwall, Jai Evans 3 (1) Laboratory of Experimental and Computational Biology, CCR, NCI- Frederick, Frederick, MD 21702;
More informationRPPanalyzer (Version 1.0.3) Analyze reverse phase protein array data User s Guide
RPPanalyzer (Version 1.0.3) Analyze reverse phase protein array data User s Guide Heiko Mannsperger and Stephan Gade German Cancer Research Center Heidelberg, Germany October 1, 2012 Contents 1 Introduction
More informationROTS: Reproducibility Optimized Test Statistic
ROTS: Reproducibility Optimized Test Statistic Fatemeh Seyednasrollah, Tomi Suomi, Laura L. Elo fatsey (at) utu.fi March 3, 2016 Contents 1 Introduction 2 2 Algorithm overview 3 3 Input data 3 4 Preprocessing
More informationIntroduction to Bioinformatics AS Laboratory Assignment 2
Introduction to Bioinformatics AS 250.265 Laboratory Assignment 2 Last week, we discussed several high-throughput methods for the analysis of gene expression in cells. Of those methods, microarray technologies
More informationTIGR ExpressConverter
TIGR ExpressConverter (Version 1.7) January, 2005 Microarray Software Group The Institute for Genomic Research Table of Contents Introduction ---------------------------------------------------------------------------
More informationContents. Foreword to Second Edition. Acknowledgments About the Authors
Contents Foreword xix Foreword to Second Edition xxi Preface xxiii Acknowledgments About the Authors xxxi xxxv Chapter 1 Introduction 1 1.1 Why Data Mining? 1 1.1.1 Moving toward the Information Age 1
More informationOrganizing, cleaning, and normalizing (smoothing) cdna microarray data
Organizing, cleaning, and normalizing (smoothing) cdna microarray data All product names are given as examples only and they are not endorsed by the USDA or the University of Illinois. INTRODUCTION The
More informationSEEK User Manual. Introduction
SEEK User Manual Introduction SEEK is a computational gene co-expression search engine. It utilizes a vast human gene expression compendium to deliver fast, integrative, cross-platform co-expression analyses.
More informationContents. ! Data sets. ! Distance and similarity metrics. ! K-means clustering. ! Hierarchical clustering. ! Evaluation of clustering results
Statistical Analysis of Microarray Data Contents Data sets Distance and similarity metrics K-means clustering Hierarchical clustering Evaluation of clustering results Clustering Jacques van Helden Jacques.van.Helden@ulb.ac.be
More informationImage Analysis - Lecture 5
Texture Segmentation Clustering Review Image Analysis - Lecture 5 Texture and Segmentation Magnus Oskarsson Lecture 5 Texture Segmentation Clustering Review Contents Texture Textons Filter Banks Gabor
More informationGiri Narasimhan. CAP 5510: Introduction to Bioinformatics. ECS 254; Phone: x3748
CAP 5510: Introduction to Bioinformatics Giri Narasimhan ECS 254; Phone: x3748 giri@cis.fiu.edu www.cis.fiu.edu/~giri/teach/bioinfs07.html 3/3/08 CAP5510 1 Gene g Probe 1 Probe 2 Probe N 3/3/08 CAP5510
More informationAGA User Manual. Version 1.0. January 2014
AGA User Manual Version 1.0 January 2014 Contents 1. Getting Started... 3 1a. Minimum Computer Specifications and Requirements... 3 1b. Installation... 3 1c. Running the Application... 4 1d. File Preparation...
More informationClustering Jacques van Helden
Statistical Analysis of Microarray Data Clustering Jacques van Helden Jacques.van.Helden@ulb.ac.be Contents Data sets Distance and similarity metrics K-means clustering Hierarchical clustering Evaluation
More informationLecture: Segmentation I FMAN30: Medical Image Analysis. Anders Heyden
Lecture: Segmentation I FMAN30: Medical Image Analysis Anders Heyden 2017-11-13 Content What is segmentation? Motivation Segmentation methods Contour-based Voxel/pixel-based Discussion What is segmentation?
More informationBreeding Guide. Customer Services PHENOME-NETWORKS 4Ben Gurion Street, 74032, Nes-Ziona, Israel
Breeding Guide Customer Services PHENOME-NETWORKS 4Ben Gurion Street, 74032, Nes-Ziona, Israel www.phenome-netwoks.com Contents PHENOME ONE - INTRODUCTION... 3 THE PHENOME ONE LAYOUT... 4 THE JOBS ICON...
More informationTopics of the talk. Biodatabases. Data types. Some sequence terminology...
Topics of the talk Biodatabases Jarno Tuimala / Eija Korpelainen CSC What data are stored in biological databases? What constitutes a good database? Nucleic acid sequence databases Amino acid sequence
More informationPerformance Evaluation of the TINA Medical Image Segmentation Algorithm on Brainweb Simulated Images
Tina Memo No. 2008-003 Internal Memo Performance Evaluation of the TINA Medical Image Segmentation Algorithm on Brainweb Simulated Images P. A. Bromiley Last updated 20 / 12 / 2007 Imaging Science and
More informationIntroduction to Cancer Genomics
Introduction to Cancer Genomics Gene expression data analysis part I David Gfeller Computational Cancer Biology Ludwig Center for Cancer research david.gfeller@unil.ch 1 Overview 1. Basic understanding
More informationCANCER PREDICTION USING PATTERN CLASSIFICATION OF MICROARRAY DATA. By: Sudhir Madhav Rao &Vinod Jayakumar Instructor: Dr.
CANCER PREDICTION USING PATTERN CLASSIFICATION OF MICROARRAY DATA By: Sudhir Madhav Rao &Vinod Jayakumar Instructor: Dr. Michael Nechyba 1. Abstract The objective of this project is to apply well known
More informationData Clustering Hierarchical Clustering, Density based clustering Grid based clustering
Data Clustering Hierarchical Clustering, Density based clustering Grid based clustering Team 2 Prof. Anita Wasilewska CSE 634 Data Mining All Sources Used for the Presentation Olson CF. Parallel algorithms
More informationDimension reduction : PCA and Clustering
Dimension reduction : PCA and Clustering By Hanne Jarmer Slides by Christopher Workman Center for Biological Sequence Analysis DTU The DNA Array Analysis Pipeline Array design Probe design Question Experimental
More informationIntroduction to R and Statistical Data Analysis
Microarray Center Introduction to R and Statistical Data Analysis PART II Petr Nazarov petr.nazarov@crp-sante.lu 22-11-2010 OUTLINE PART II Descriptive statistics in R (8) sum, mean, median, sd, var, cor,
More informationImage Processing: Final Exam November 10, :30 10:30
Image Processing: Final Exam November 10, 2017-8:30 10:30 Student name: Student number: Put your name and student number on all of the papers you hand in (if you take out the staple). There are always
More informationGeneralized Principal Component Analysis CVPR 2007
Generalized Principal Component Analysis Tutorial @ CVPR 2007 Yi Ma ECE Department University of Illinois Urbana Champaign René Vidal Center for Imaging Science Institute for Computational Medicine Johns
More informationGLIDER: Gradient Landmark-Based Distributed Routing for Sensor Networks. Stanford University. HP Labs
GLIDER: Gradient Landmark-Based Distributed Routing for Sensor Networks Qing Fang Jie Gao Leonidas J. Guibas Vin de Silva Li Zhang Stanford University HP Labs Point-to-Point Routing in Sensornets Routing
More informationPoints Lines Connected points X-Y Scatter. X-Y Matrix Star Plot Histogram Box Plot. Bar Group Bar Stacked H-Bar Grouped H-Bar Stacked
Plotting Menu: QCExpert Plotting Module graphs offers various tools for visualization of uni- and multivariate data. Settings and options in different types of graphs allow for modifications and customizations
More informationScienceDirect. Fuzzy clustering algorithms for cdna microarray image spots segmentation.
Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 46 (2015 ) 417 424 International Conference on Information and Communication Technologies (ICICT 2014) Fuzzy clustering
More informationCommittee: Dr. Rosemary Renaut 1 Professor Department of Mathematics and Statistics, Director Computational Biosciences PSM Arizona State University
Evaluation of Gene Selection Using Support Vector Machine Recursive Feature Elimination A report presented in fulfillment of internship requirements of the CBS PSM Degree Committee: Dr. Rosemary Renaut
More informationDNA microarrays [1] are used to measure the expression
IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 24, NO. 7, JULY 2005 901 Mixture Model Analysis of DNA Microarray Images K. Blekas*, Member, IEEE, N. P. Galatsanos, Senior Member, IEEE, A. Likas, Senior Member,
More informationGeneSifter.Net User s Guide
GeneSifter.Net User s Guide 1 2 GeneSifter.Net Overview Login Upload Tools Pairwise Analysis Create Projects For more information about a feature see the corresponding page in the User s Guide noted in
More information