A systematic performance evaluation of clustering methods for single-cell RNA-seq data Supplementary Figures and Tables
|
|
- Charla Paul
- 5 years ago
- Views:
Transcription
1 A systematic performance evaluation of clustering methods for single-cell RNA-seq data Supplementary Figures and Tables Angelo Duò, Mark D Robinson, Charlotte Soneson
2 Supplementary Table : Overview of the datasets used in the study. Data set Total # cells Total # features Filtering Retained # cells Retained # features Koh 8,98 KohTCC 8,98 Kumar 6,9 KumarTCC 6 8, SimKumareasy,66 SimKumarhard 99,68 SimKumar8hard 99,6 Trapnell, TrapnellTCC 7 68,9 Zhengmixeq,99,68 Zhengmixuneq 6,98 6, Zhengmix8eq,99,76,898,898 MDrop,898 8,9 8,9 MDrop 8,9 6, 6,6 MDrop 6,6 6 8, 6 8, MDrop 6 8,,6,6 MDrop,6 99,6 99,6 MDrop 99,6 99,6 99,6 MDrop 99,6,, MDrop, 7 68,9 7 68,9 MDrop 7 68,9,,6,,7 MDrop,,7 6,,6,79,6 MDrop,8,6,97,7,798,7 MDrop,66,7
3 Zhengmixeq Zhengmixuneq Zhengmix8eq cd.monocytes cd.monocytes dimension dimension dimension cd.monocytes cd.t.helper cd6.nk memory.t naive.t dimension dimension Kumar dimension Trapnell Koh H7_derived_APS Dgcr8 knockout mouse serum+lif v6. mouse i+lif v6. mouse serum+lif H7_derived_DLtM T T T8 dimension dimension dimension H7_derived_DCntrlDrmmtm H7_derived_DLLpPXM H7_derived_ESMT H7_derived_MPS H7_derived_Sclrtm H7_dreived_D._Smtmrs H7hESC dimension dimension SimKumareasy dimension SimKumarhard SimKumar8hard Group Group Group Group Group Group dimension Group dimension dimension Group Group Group Group Group Group Group6 Group7 Group8 dimension dimension KohTCC dimension TrapnellTCC KumarTCC H7_derived_APS H7_derived_DCntrlDrmmtm H7_derived_DLLpPXM H7_derived_ESMT H7_derived_MPS H7_derived_Sclrtm T T T8 Dgcr8 knockout mouse serum+lif v6. mouse i+lif v6. mouse serum+lif H7_dreived_D._Smtmrs dimension H7_derived_DLtM dimension dimension H7hESC dimension dimension dimension Supplementary Figure : t-sne representations of all data sets used in the study (before gene filtering). The cells are colored by the annotation used to define the ground truth, to which all clusterings are compared.
4 Zhengmixeq Zhengmixuneq MDrop MDrop.6% 6.8% MDrop 7.%.%.%.7% 9.% 9.8%.6%.6%.8%.%.%.%.88%.% 6%.9% Kumar Trapnell MDrop Koh MDrop.9% 9.% MDrop.%.9% 8.9%.% 6.%.% 7.%.%.9%.%.89% 6.%.86%.% 8.7%.%.%.8%.7% SimKumareasy SimKumarhard MDrop %.% 7.%.6% Zhengmix8eq SimKumar8hard MDrop.99% 6%.% 6.9%.6%.%.7%.9% KohTCC 7.% MDrop % MDrop MDrop 8.%.7%.%.6%.%.% 6.8%.79% 8% 6%.%.7%.% KumarTCC.69% 8.% % TrapnellTCC MDrop.9%.7% 6.% 6.%.%.% 9.7% 7.9% % Supplementary Figure : Overlaps between the features retained by the different gene filtering approaches, for each of the data sets. The numbers represent percentages of the total number of features selected by at least one of the approaches.
5 Koh KohTCC Kumar KumarTCC SimKumareasy SimKumarhard SimKumar8hard Trapnell TrapnellTCC Zhengmixeq Zhengmixuneq Zhengmix8eq..7 Median Adjusted Rand Index filtered filtered filteredmdrop Number of clusters SC SCsvm Supplementary Figure : Adjusted Rand Index (ARI) as a function of the number of clusters (k), for each of the data sets (columns) and gene filterings (rows). The vertical lines indicate the true number of populations in the respective data sets.
6 A sce_filtered_zhengmixeq B sce_filtered_zhengmixeq True class True class cd.monocytes cd.monocytes dimension Clusters dimension Clusters C dimension sce_filtered_zhengmixuneq D dimension sce_filtered_trapnell SC True class cd.monocytes True class T T dimension Clusters dimension T8 Clusters dimension dimension Supplementary Figure : t-sne plots showing examples of data set/clustering method combinations with a large disagreement between the inferred and true clusters. The respective data sets, gene filtering methods and clustering methods are indicated above each plot. The true clusters are indicated with different shapes, and the inferred clusters in different colors. 6
7 Kumar KumarTCC SimKumareasy KohTCC Koh Zhengmixuneq SimKumarhard Zhengmixeq SimKumar8hard Zhengmix8eq TrapnellTCC Trapnell filtered filtered filteredmdrop Median ARI..7.. SCsvm SC SCsvm SC SCsvm SC Supplementary Figure : Adjusted Rand Index (ARI) at the number of clusters that gives the best performance. Some methods failed to return any clustering for certain data sets (indicated by white squares). Association between number of detected features and cluster assignment Koh KohTCC Kumar KumarTCC 7 SimKumareasy SimKumarhard SimKumar8hard Trapnell 8 6 TrapnellTCC Zhengmixeq Zhengmixuneq Zhengmix8eq SC SCsvm SC SCsvm SC SCsvm SC SCsvm SC SCsvm log(adjusted p value) filtering filtered filtered filteredmdrop Supplementary Figure 6: Association between the number of detected features and the cluster assignment, for all data sets and filterings, with the true number of clusters. Only one of the five runs is shown. Values are adjusted p-values (using the Benjamini-Hochberg method) from a Kruskal-Wallis test. 7
8 Koh KohTCC Kumar KumarTCC SimKumareasy SimKumarhard SimKumar8hard Trapnell TrapnellTCC Zhengmixeq Zhengmixuneq Zhengmix8eq Elapsed time (s) filtered filtered filteredmdrop SC SCsvm Number of clusters Supplementary Figure 7: Actual run times versus the imposed number of clusters, for each data set and gene filtering method. 8
9 Koh KohTCC Kumar KumarTCC SimKumareasy SimKumarhard SimKumar8hard Trapnell TrapnellTCC Zhengmixeq Zhengmixuneq Zhengmix8eq Elapsed time (s) filtered filtered filteredmdrop SC SCsvm SC SCsvm SC SCsvm SC SCsvm SC SCsvm SC SCsvm SC SCsvm SC SCsvm SC SCsvm SC SCsvm SC SCsvm SC SCsvm Supplementary Figure 8: Actual run times, across all numbers of clusters, for each data set and gene filtering method. 9
10 Median ARI. MDrop_MDrop.8 _.6 MDrop MDrop. SCsvm SC Supplementary Figure 9: Comparison of the clusterings obtained after different gene filterings. The values represent the median Adjusted Rand Index (ARI) across the data sets and filterings, for the true number of clusters, for each pair of filterings. Koh KohTCC Kumar KumarTCC SimKumareasy SimKumarhard SimKumar8hard Trapnell TrapnellTCC Zhengmixeq Zhengmixuneq Zhengmix8eq Entropy filtered filtered filteredmdrop SC SCsvm Number of clusters Supplementary Figure : Entropy for different imposed numbers of clusters. The true number of clusters is indicated by the vertical line, and the cross represents the entropy of the true partition.
11 filtered filtered filteredmdrop Difference between estimated and true k SC SCsvm SC SCsvm SC SCsvm Supplementary Figure : Difference between estimated and true number of clusters, for methods including a function to estimate the number of clusters. Kumar KumarTCC filtered filtered filteredmdrop SimKumareasy KohTCC Koh SimKumar8hard SimKumarhard Zhengmixeq Zhengmixuneq Zhengmix8eq Median ARI TrapnellTCC Trapnell SCsvm SC SCsvm SC SCsvm SC Supplementary Figure : Adjusted Rand Index (ARI) at the number of clusters estimated by the indicated methods. Empty squares indicate that the estimated number of clusters was outside the evaluated range (- for data sets with less than or equal to 7 true clusters, - for data sets with more than 7 true clusters.
Review: Identification of cell types from single-cell transcriptom. method
Review: Identification of cell types from single-cell transcriptomes using a novel clustering method University of North Carolina at Charlotte October 12, 2015 Brief overview Identify clusters by merging
More informationClustering. RNA-seq: What is it good for? Finding Similarly Expressed Genes. Data... And Lots of It!
RNA-seq: What is it good for? Clustering High-throughput RNA sequencing experiments (RNA-seq) offer the ability to measure simultaneously the expression level of thousands of genes in a single experiment!
More informationMinitab Notes for Activity 1
Minitab Notes for Activity 1 Creating the Worksheet 1. Label the columns as team, heat, and time. 2. Have Minitab automatically enter the team data for you. a. Choose Calc / Make Patterned Data / Simple
More informationSupplementary Material. Cell type-specific termination of transcription by transposable element sequences
Supplementary Material Cell type-specific termination of transcription by transposable element sequences Andrew B. Conley and I. King Jordan Controls for TTS identification using PET A series of controls
More informationApplied Regression Modeling: A Business Approach
i Applied Regression Modeling: A Business Approach Computer software help: SAS SAS (originally Statistical Analysis Software ) is a commercial statistical software package based on a powerful programming
More informationsrap: Simplified RNA-Seq Analysis Pipeline
srap: Simplified RNA-Seq Analysis Pipeline Charles Warden October 30, 2017 1 Introduction This package provides a pipeline for gene expression analysis. The normalization function is specific for RNA-Seq
More informationChIP-seq hands-on practical using Galaxy
ChIP-seq hands-on practical using Galaxy In this exercise we will cover some of the basic NGS analysis steps for ChIP-seq using the Galaxy framework: Quality control Mapping of reads using Bowtie2 Peak-calling
More informationFrequency Distributions
Displaying Data Frequency Distributions After collecting data, the first task for a researcher is to organize and summarize the data so that it is possible to get a general overview of the results. Remember,
More informationMath 121 Project 4: Graphs
Math 121 Project 4: Graphs Purpose: To review the types of graphs, and use MS Excel to create them from a dataset. Outline: You will be provided with several datasets and will use MS Excel to create graphs.
More informationWindows. RNA-Seq Tutorial
Windows RNA-Seq Tutorial 2017 Gene Codes Corporation Gene Codes Corporation 525 Avis Drive, Ann Arbor, MI 48108 USA 1.800.497.4939 (USA) +1.734.769.7249 (elsewhere) +1.734.769.7074 (fax) www.genecodes.com
More informationmirnet Tutorial Starting with expression data
mirnet Tutorial Starting with expression data Computer and Browser Requirements A modern web browser with Java Script enabled Chrome, Safari, Firefox, and Internet Explorer 9+ For best performance and
More informationPackage SC3. November 27, 2017
Type Package Title Single-Cell Consensus Clustering Version 1.7.1 Author Vladimir Kiselev Package SC3 November 27, 2017 Maintainer Vladimir Kiselev A tool for unsupervised
More informationPackage scmap. March 29, 2019
Type Package Package scmap March 29, 2019 Title A tool for unsupervised projection of single cell RNA-seq data Version 1.4.1 Author Vladimir Kiselev Maintainer Vladimir Kiselev
More informationPackage SC3. September 29, 2018
Type Package Title Single-Cell Consensus Clustering Version 1.8.0 Author Vladimir Kiselev Package SC3 September 29, 2018 Maintainer Vladimir Kiselev A tool for unsupervised
More informationStatistical Pattern Recognition
Statistical Pattern Recognition Features and Feature Selection Hamid R. Rabiee Jafar Muhammadi Spring 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2/ Agenda Features and Patterns The Curse of Size and
More informationDifferential Expression Analysis at PATRIC
Differential Expression Analysis at PATRIC The following step- by- step workflow is intended to help users learn how to upload their differential gene expression data to their private workspace using Expression
More informationSNPViewer Documentation
SNPViewer Documentation Module name: Description: Author: SNPViewer Displays SNP data plotting copy numbers and LOH values Jim Robinson (Broad Institute), gp-help@broad.mit.edu Summary: The SNPViewer displays
More informationAutomated Bioinformatics Analysis System on Chip ABASOC. version 1.1
Automated Bioinformatics Analysis System on Chip ABASOC version 1.1 Phillip Winston Miller, Priyam Patel, Daniel L. Johnson, PhD. University of Tennessee Health Science Center Office of Research Molecular
More informationPackage SIMLR. December 1, 2017
Version 1.4.0 Date 2017-10-21 Package SIMLR December 1, 2017 Title Title: SIMLR: Single-cell Interpretation via Multi-kernel LeaRning Maintainer Daniele Ramazzotti Depends
More informationHow to Make Graphs in EXCEL
How to Make Graphs in EXCEL The following instructions are how you can make the graphs that you need to have in your project.the graphs in the project cannot be hand-written, but you do not have to use
More informationSEEK User Manual. Introduction
SEEK User Manual Introduction SEEK is a computational gene co-expression search engine. It utilizes a vast human gene expression compendium to deliver fast, integrative, cross-platform co-expression analyses.
More informationClustering Analysis of Simple K Means Algorithm for Various Data Sets in Function Optimization Problem (Fop) of Evolutionary Programming
Clustering Analysis of Simple K Means Algorithm for Various Data Sets in Function Optimization Problem (Fop) of Evolutionary Programming R. Karthick 1, Dr. Malathi.A 2 Research Scholar, Department of Computer
More informationUnsupervised Learning : Clustering
Unsupervised Learning : Clustering Things to be Addressed Traditional Learning Models. Cluster Analysis K-means Clustering Algorithm Drawbacks of traditional clustering algorithms. Clustering as a complex
More information16 - Comparing Groups
16 - Comparing Groups Contents 16 - COMPARING GROUPS... 1 QUALITATIVE DATA: CONTENT OF CODED SEGMENTS... 1 QUANTITATIVE DATA: CODE FREQUENCIES... 3 16 - Comparing Groups MAXQDA allows you to compare different
More informationTutorial: RNA-Seq Analysis Part II (Tracks): Non-Specific Matches, Mapping Modes and Expression measures
: RNA-Seq Analysis Part II (Tracks): Non-Specific Matches, Mapping Modes and February 24, 2014 Sample to Insight : RNA-Seq Analysis Part II (Tracks): Non-Specific Matches, Mapping Modes and : RNA-Seq Analysis
More informationHow to use the DEGseq Package
How to use the DEGseq Package Likun Wang 1,2 and Xi Wang 1. October 30, 2018 1 MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST /Department of Automation, Tsinghua University. 2
More informationCharts in Excel 2003
Charts in Excel 2003 Contents Introduction Charts in Excel 2003...1 Part 1: Generating a Basic Chart...1 Part 2: Adding Another Data Series...3 Part 3: Other Handy Options...5 Introduction Charts in Excel
More informationNature Methods: doi: /nmeth Supplementary Figure 1
Supplementary Figure 1 Schematic representation of the Workflow window in Perseus All data matrices uploaded in the running session of Perseus and all processing steps are displayed in the order of execution.
More informationITMO Ecole de Bioinformatique Hands-on session: smallrna-seq N. Servant 21 rd November 2013
ITMO Ecole de Bioinformatique Hands-on session: smallrna-seq N. Servant 21 rd November 2013 1. Data and objectives We will use the data from GEO (GSE35368, Toedling, Servant et al. 2011). Two samples were
More informationCHAPTER 6. The Normal Probability Distribution
The Normal Probability Distribution CHAPTER 6 The normal probability distribution is the most widely used distribution in statistics as many statistical procedures are built around it. The central limit
More informationSupplementary text S6 Comparison studies on simulated data
Supplementary text S Comparison studies on simulated data Peter Langfelder, Rui Luo, Michael C. Oldham, and Steve Horvath Corresponding author: shorvath@mednet.ucla.edu Overview In this document we illustrate
More informationExpression Analysis with the Advanced RNA-Seq Plugin
Expression Analysis with the Advanced RNA-Seq Plugin May 24, 2016 Sample to Insight CLC bio, a QIAGEN Company Silkeborgvej 2 Prismet 8000 Aarhus C Denmark Telephone: +45 70 22 32 44 www.clcbio.com support-clcbio@qiagen.com
More informationCanadian National Longitudinal Survey of Children and Youth (NLSCY)
Canadian National Longitudinal Survey of Children and Youth (NLSCY) Fathom workshop activity For more information about the survey, see: http://www.statcan.ca/ Daily/English/990706/ d990706a.htm Notice
More informationPage 1. Graphical and Numerical Statistics
TOPIC: Description Statistics In this tutorial, we show how to use MINITAB to produce descriptive statistics, both graphical and numerical, for an existing MINITAB dataset. The example data come from Exercise
More informationPROMO 2017a - Tutorial
PROMO 2017a - Tutorial Introduction... 2 Installing PROMO... 2 Step 1 - Importing data... 2 Step 2 - Preprocessing... 6 Step 3 Data Exploration... 9 Step 4 Clustering... 13 Step 5 Analysis of sample clusters...
More informationRNA-Seq Analysis With the Tuxedo Suite
June 2016 RNA-Seq Analysis With the Tuxedo Suite Dena Leshkowitz Introduction In this exercise we will learn how to analyse RNA-Seq data using the Tuxedo Suite tools: Tophat, Cuffmerge, Cufflinks and Cuffdiff.
More informationProjection with Public Data (PPD)
Projection with Public Data (PPD) Goal To compare users 16S rrna data with published datasets by processing and normalization them together, and projecting into 3D PCoA plot for visual comparative analysis
More informationPackage ALDEx2. April 4, 2019
Type Package Package ALDEx2 April 4, 2019 Title Analysis Of Differential Abundance Taking Sample Variation Into Account Version 1.14.1 Date 2017-10-11 Author Greg Gloor, Ruth Grace Wong, Andrew Fernandes,
More informationPackage M3Drop. August 23, 2018
Version 1.7.0 Date 2016-07-15 Package M3Drop August 23, 2018 Title Michaelis-Menten Modelling of Dropouts in single-cell RNASeq Author Tallulah Andrews Maintainer Tallulah Andrews
More informationDetecting Novel Associations in Large Data Sets
Detecting Novel Associations in Large Data Sets J. Hjelmborg Department of Biostatistics 5. februar 2013 Biostat (Biostatistics) Detecting Novel Associations in Large Data Sets 5. februar 2013 1 / 22 Overview
More informationUsing the Kolmogorov-Smirnov Test for Image Segmentation
Using the Kolmogorov-Smirnov Test for Image Segmentation Yong Jae Lee CS395T Computational Statistics Final Project Report May 6th, 2009 I. INTRODUCTION Image segmentation is a fundamental task in computer
More informationHIERARCHICAL DATA VISUALIZATION AS A TOOL FOR DEVELOPING STUDENT UNDERSTANDING OF VARIATION OF DATA GENERATED IN SIMULATIONS
HIERARCHICAL DATA VISUALIZATION AS A TOOL FOR DEVELOPING STUDENT UNDERSTANDING OF VARIATION OF DATA GENERATED IN SIMULATIONS William Concord Consortium, USA wfinzer@concord.org Data Games is a data exploration
More informationProject 11 Graphs (Using MS Excel Version )
Project 11 Graphs (Using MS Excel Version 2007-10) Purpose: To review the types of graphs, and use MS Excel 2010 to create them from a dataset. Outline: You will be provided with several datasets and will
More informationVersion 1.0. Reference manual: RaceID (Rare Cell Type Identification)
Version.0 Reference manual: RaceID (Rare Cell Type Identification) 0. Prerequisites... I. Input data... II. Preprocessing of input data... 3 III. k-means clustering... 3 IV. Outlier identification... 8
More informationSlides URL.
MDS vs t SNE 1 2 Slides URL https://git.embl.de/klaus/mds_vs_tsne 3 Multi Dimensional Scaling (MDS) MDS: given distances can we infer (Eucledian) coordinates? yes!, via the double centering trick distances
More informationTechnical University of Munich. Exercise 8: Neural Networks
Technical University of Munich Chair for Bioinformatics and Computational Biology Protein Prediction I for Computer Scientists SoSe 2018 Prof. B. Rost M. Bernhofer, M. Heinzinger, D. Nechaev, L. Richter
More informationStatistics 202: Data Mining. c Jonathan Taylor. Clustering Based in part on slides from textbook, slides of Susan Holmes.
Clustering Based in part on slides from textbook, slides of Susan Holmes December 2, 2012 1 / 1 Clustering Clustering Goal: Finding groups of objects such that the objects in a group will be similar (or
More informationPackage EMDomics. November 11, 2018
Package EMDomics November 11, 2018 Title Earth Mover's Distance for Differential Analysis of Genomics Data The EMDomics algorithm is used to perform a supervised multi-class analysis to measure the magnitude
More informationDREM. Dynamic Regulatory Events Miner (v1.0.9b) User Manual
DREM Dynamic Regulatory Events Miner (v1.0.9b) User Manual Jason Ernst (jernst@cs.cmu.edu) Ziv Bar-Joseph Machine Learning Department School of Computer Science Carnegie Mellon University Contents 1 Introduction
More informationHow do microarrays work
Lecture 3 (continued) Alvis Brazma European Bioinformatics Institute How do microarrays work condition mrna cdna hybridise to microarray condition Sample RNA extract labelled acid acid acid nucleic acid
More informationPackage BayesMetaSeq
Type Package Package BayesMetaSeq March 19, 2016 Title Bayesian hierarchical model for RNA-seq differential meta-analysis Version 1.0 Date 2016-01-29 Author Tianzhou Ma, George Tseng Maintainer Tianzhou
More information3.3 Hardware Karnaugh Maps
2P P = P = 3.3 Hardware UIntroduction A Karnaugh map is a graphical method of Boolean logic expression reduction. A Boolean expression can be reduced to its simplest form through the 4 simple steps involved
More informationBGGN-213: FOUNDATIONS OF BIOINFORMATICS (Lecture 14)
BGGN-213: FOUNDATIONS OF BIOINFORMATICS (Lecture 14) Genome Informatics (Part 1) https://bioboot.github.io/bggn213_f17/lectures/#14 Dr. Barry Grant Nov 2017 Overview: The purpose of this lab session is
More informationCompClustTk Manual & Tutorial
CompClustTk Manual & Tutorial Brandon King Copyright c California Institute of Technology Version 0.1.10 May 13, 2004 Contents 1 Introduction 1 1.1 Purpose.............................................
More informationEvaluation and comparison of gene clustering methods in microarray analysis
Evaluation and comparison of gene clustering methods in microarray analysis Anbupalam Thalamuthu 1 Indranil Mukhopadhyay 1 Xiaojing Zheng 1 George C. Tseng 1,2 1 Department of Human Genetics 2 Department
More informationfastgear Manual Contents
fastgear Manual fastgear was developed in a collaborative project by Rafal Mostowy, Nicholas J. Croucher, Cheryl P. Andam, Jukka Corander, William P. Hanage, and Pekka Marttinen* Contents Introduction...
More informationChapter 2 Assignment (due Thursday, April 19)
(due Thursday, April 19) Introduction: The purpose of this assignment is to analyze data sets by creating histograms and scatterplots. You will use the STATDISK program for both. Therefore, you should
More informationStatistics with a Hemacytometer
Statistics with a Hemacytometer Overview This exercise incorporates several different statistical analyses. Data gathered from cell counts with a hemacytometer is used to explore frequency distributions
More informationTutorial. RNA-Seq Analysis of Breast Cancer Data. Sample to Insight. November 21, 2017
RNA-Seq Analysis of Breast Cancer Data November 21, 2017 Sample to Insight QIAGEN Aarhus Silkeborgvej 2 Prismet 8000 Aarhus C Denmark Telephone: +45 70 22 32 44 www.qiagenbioinformatics.com AdvancedGenomicsSupport@qiagen.com
More informationStatistical Pattern Recognition
Statistical Pattern Recognition Features and Feature Selection Hamid R. Rabiee Jafar Muhammadi Spring 2013 http://ce.sharif.edu/courses/91-92/2/ce725-1/ Agenda Features and Patterns The Curse of Size and
More informationEXERCISE: GETTING STARTED WITH SAV
Sequencing Analysis Viewer (SAV) Overview 1 EXERCISE: GETTING STARTED WITH SAV Purpose This exercise explores the following topics: How to load run data into SAV How to explore run metrics with SAV Getting
More informationUser s Guide. Using the R-Peridot Graphical User Interface (GUI) on Windows and GNU/Linux Systems
User s Guide Using the R-Peridot Graphical User Interface (GUI) on Windows and GNU/Linux Systems Pitágoras Alves 01/06/2018 Natal-RN, Brazil Index 1. The R Environment Manager...
More informationCURIOUS BROWSERS: Automated Gathering of Implicit Interest Indicators by an Instrumented Browser
CURIOUS BROWSERS: Automated Gathering of Implicit Interest Indicators by an Instrumented Browser David Brown Mark Claypool Computer Science Department Worcester Polytechnic Institute Worcester, MA 01609,
More informationChapter 2 Assignment (due Thursday, October 5)
(due Thursday, October 5) Introduction: The purpose of this assignment is to analyze data sets by creating histograms and scatterplots. You will use the STATDISK program for both. Therefore, you should
More informationVariable Selection for Consistent Clustering
Variable Selection for Consistent Clustering Ron Yurko Rebecca Nugent Sam Ventura Department of Statistics Carnegie Mellon University Classification Society, 2017 Typical Questions in Cluster Analysis
More informationNUMERICAL COMPUTING For Finance Using Excel. Sorting and Displaying Data
NUMERICAL COMPUTING For Finance Using Excel Sorting and Displaying Data Outline 1 Sorting data Excel Sort tool (sort data in ascending or descending order) Simple filter (by ROW, COLUMN, apply a custom
More informationClustering CS 550: Machine Learning
Clustering CS 550: Machine Learning This slide set mainly uses the slides given in the following links: http://www-users.cs.umn.edu/~kumar/dmbook/ch8.pdf http://www-users.cs.umn.edu/~kumar/dmbook/dmslides/chap8_basic_cluster_analysis.pdf
More informationStatistical Pattern Recognition
Statistical Pattern Recognition Features and Feature Selection Hamid R. Rabiee Jafar Muhammadi Spring 2012 http://ce.sharif.edu/courses/90-91/2/ce725-1/ Agenda Features and Patterns The Curse of Size and
More informationAUTOMATIC BLASTOMERE DETECTION IN DAY 1 TO DAY 2 HUMAN EMBRYO IMAGES USING PARTITIONED GRAPHS AND ELLIPSOIDS
AUTOMATIC BLASTOMERE DETECTION IN DAY 1 TO DAY 2 HUMAN EMBRYO IMAGES USING PARTITIONED GRAPHS AND ELLIPSOIDS Amarjot Singh 1, John Buonassisi 1, Parvaneh Saeedi 1, Jon Havelock 2 1. School of Engineering
More informationAccelerated Life Testing Module Accelerated Life Testing - Overview
Accelerated Life Testing Module Accelerated Life Testing - Overview The Accelerated Life Testing (ALT) module of AWB provides the functionality to analyze accelerated failure data and predict reliability
More informationUniversity of Florida CISE department Gator Engineering. Clustering Part 2
Clustering Part 2 Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville Partitional Clustering Original Points A Partitional Clustering Hierarchical
More informationPackage GREP2. July 25, 2018
Type Package Package GREP2 July 25, 2018 Title GEO RNA-Seq Experiments Processing Pipeline Version 1.0.1 Maintainer Naim Al Mahi An R based pipeline to download and process Gene Expression
More informationQuantification. Part I, using Excel
Quantification In this exercise we will work with RNA-seq data from a study by Serin et al (2017). RNA-seq was performed on Arabidopsis seeds matured at standard temperature (ST, 22 C day/18 C night) or
More informationEBSeqHMM: An R package for identifying gene-expression changes in ordered RNA-seq experiments
EBSeqHMM: An R package for identifying gene-expression changes in ordered RNA-seq experiments Ning Leng and Christina Kendziorski April 30, 2018 Contents 1 Introduction 1 2 The model 2 2.1 EBSeqHMM model..........................................
More informationPredicting ground-level scene Layout from Aerial imagery. Muhammad Hasan Maqbool
Predicting ground-level scene Layout from Aerial imagery Muhammad Hasan Maqbool Objective Given the overhead image predict its ground level semantic segmentation Predicted ground level labeling Overhead/Aerial
More informationPRACTICE EXERCISES. Family Utility Expenses
PRACTICE EXERCISES Family Utility Expenses Your cousin, Rita Dansie, wants to analyze her family's utility expenses for 2012. She wants to save money during months when utility expenses are lower so that
More informationPackage SEPA. September 26, 2018
Type Package Title SEPA Version 1.10.0 Date 2015-04-17 Author Zhicheng Ji, Hongkai Ji Maintainer Zhicheng Ji Package SEPA September 26, 2018 Given single-cell RNA-seq data and true experiment
More informationStep-by-Step Guide to Basic Genetic Analysis
Step-by-Step Guide to Basic Genetic Analysis Page 1 Introduction This document shows you how to clean up your genetic data, assess its statistical properties and perform simple analyses such as case-control
More informationPage Segmentation by Web Content Clustering
Page Segmentation by Web Content Clustering Sadet Alcic Heinrich-Heine-University of Duesseldorf Department of Computer Science Institute for Databases and Information Systems May 26, 20 / 9 Outline Introduction
More informationStatistics 202: Data Mining. c Jonathan Taylor. Week 8 Based in part on slides from textbook, slides of Susan Holmes. December 2, / 1
Week 8 Based in part on slides from textbook, slides of Susan Holmes December 2, 2012 1 / 1 Part I Clustering 2 / 1 Clustering Clustering Goal: Finding groups of objects such that the objects in a group
More informationPractical OmicsFusion
Practical OmicsFusion Introduction In this practical, we will analyse data, from an experiment which aim was to identify the most important metabolites that are related to potato flesh colour, from an
More informationPackage monocle. March 10, 2014
Package monocle March 10, 2014 Type Package Title Analysis tools for single-cell expression experiments. Version 0.99.0 Date 2013-11-19 Author Cole Trapnell Maintainer Cole Trapnell
More informationCellular Automata. Cellular Automata contains three modes: 1. One Dimensional, 2. Two Dimensional, and 3. Life
Cellular Automata Cellular Automata is a program that explores the dynamics of cellular automata. As described in Chapter 9 of Peak and Frame, a cellular automaton is determined by four features: The state
More informationUnderstanding Clustering Supervising the unsupervised
Understanding Clustering Supervising the unsupervised Janu Verma IBM T.J. Watson Research Center, New York http://jverma.github.io/ jverma@us.ibm.com @januverma Clustering Grouping together similar data
More informationApplied Regression Modeling: A Business Approach
i Applied Regression Modeling: A Business Approach Computer software help: SPSS SPSS (originally Statistical Package for the Social Sciences ) is a commercial statistical software package with an easy-to-use
More informationPackage clusterseq. R topics documented: June 13, Type Package
Type Package Package clusterseq June 13, 2018 Title Clustering of high-throughput sequencing data by identifying co-expression patterns Version 1.4.0 Depends R (>= 3.0.0), methods, BiocParallel, bayseq,
More informationStatistics Lecture 6. Looking at data one variable
Statistics 111 - Lecture 6 Looking at data one variable Chapter 1.1 Moore, McCabe and Craig Probability vs. Statistics Probability 1. We know the distribution of the random variable (Normal, Binomial)
More informationSupplementary Information for: Global similarity and local divergence in human and mouse gene co-expression networks
Supplementary Information for: Global similarity and local divergence in human and mouse gene co-expression networks Panayiotis Tsaparas 1, Leonardo Mariño-Ramírez 2, Olivier Bodenreider 3, Eugene V. Koonin
More informationTitle: Connect It: Performance Optimization Session #: 521 Speaker: J.R. Horton Company: HP OpenView ITSM
Title: Connect It: Performance Optimization Session #: 521 Speaker: J.R. Horton Company: HP OpenView ITSM Agenda Connect It Overview Optimization features How to tune scenario performance Other considerations
More information1. Setting up & data entry
1. Setting up & data entry Begin by typing date, event name, and height in row 31, columns A-D. We selected 31 simply to leave some room at the top of the document, as that is where your timeline will
More informationSupplementary Figures
Supplementary Figures re co rd ed ch annels darkf ield MPM2 PI cell cycle phases G1/S/G2 prophase metaphase anaphase telophase bri ghtfield 55 pixel Supplementary Figure 1 Images of Jurkat cells captured
More informationBackground. Parallel Coordinates. Basics. Good Example
Background Parallel Coordinates Shengying Li CSE591 Visual Analytics Professor Klaus Mueller March 20, 2007 Proposed in 80 s by Alfred Insellberg Good for multi-dimensional data exploration Widely used
More informationA formal concept analysis approach to consensus clustering of multi-experiment expression data
Hristoskova et al. BMC Bioinformatics 214, 15:151 METHODOLOGY ARTICLE Open Access A formal concept analysis approach to consensus clustering of multi-experiment expression data Anna Hristoskova 1*, Veselka
More informationSTEM. Short Time-series Expression Miner (v1.1) User Manual
STEM Short Time-series Expression Miner (v1.1) User Manual Jason Ernst (jernst@cs.cmu.edu) Ziv Bar-Joseph Center for Automated Learning and Discovery School of Computer Science Carnegie Mellon University
More informationAll About PlexSet Technology Data Analysis in nsolver Software
All About PlexSet Technology Data Analysis in nsolver Software PlexSet is a multiplexed gene expression technology which allows pooling of up to 8 samples per ncounter cartridge lane, enabling users to
More informationMulti-View 3D Object Detection Network for Autonomous Driving
Multi-View 3D Object Detection Network for Autonomous Driving Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, Tian Xia CVPR 2017 (Spotlight) Presented By: Jason Ku Overview Motivation Dataset Network Architecture
More informationTables in Microsoft Word
Tables in Microsoft Word In this lesson we re going to create and work with Tables in Microsoft Word. Tables are used to improve the organisation and presentation of data in your documents. If you haven
More informationSA+ Spreadsheets. Fig. 1
SongSeq User Manual 1- SA+ Spreadsheets 2- Make Template And Sequence a. Template Selection b. Syllables Identification: Using One Pair of Features c. Loading Target Files d. Viewing Results e. Identification
More informationCS145: INTRODUCTION TO DATA MINING
CS145: INTRODUCTION TO DATA MINING Clustering Evaluation and Practical Issues Instructor: Yizhou Sun yzsun@cs.ucla.edu November 7, 2017 Learnt Clustering Methods Vector Data Set Data Sequence Data Text
More informationJoint Embeddings of Shapes and Images. 128 dim space visualized by t-sne
MDS Embedding MDS takes as input a distance matrix D, containing all N N pair of distances between elements xi, and embed the elements in N dimensional space such that the inter distances Dij are preserved
More information