SWIFT: SCALABLE WEIGHTED ITERATIVE FLOW-CLUSTERING TECHNIQUE


SWIFT: Scalable Weighted Iterative Flow-Clustering Technique
Iftekhar Naim, Gaurav Sharma, Suprakash Datta, James S. Cavenaugh, Jyh-Chiang E. Wang, Jonathan A. Rebhahn, Sally A. Quataert, and Tim R. Mosmann
University of Rochester, Rochester, NY; York University, Toronto, ON
FlowCAP Summit

OUTLINE
1 INTRODUCTION: Flow Cytometry (FC) Data Analysis; Automated Multivariate Clustering of FC Data
2 SWIFT METHOD FOR FC DATA ANALYSIS: SWIFT Algorithm; Weighted Iterative Sampling Based EM; Bimodality Splitting; Graph-Based Merging
3 DOES IT WORK? Does It Work? How Do We Know It Works?
4 FLOWCAP CONTEST: Results on FlowCAP Datasets; A Few Thoughts for FlowCAP II
5 CONCLUSION


FLOW CYTOMETRY (FC) OVERVIEW
Rapid multivariate analysis of individual cells.
High-throughput data generation (descriptions of ~1 million cells).
High dimensionality (~20 measurements per cell).
FIGURE: Flow cytometry system (fluorochrome, antibody, antigen, cell).

FC DATA ANALYSIS
Traditionally, FC data are analyzed by manual gating:
- Subjective; scales poorly with increasing dimensions
- 1D/2D projections may not represent the full picture
- Inaccurate for overlapping clusters
FIGURE: Manual gating for overlapping clusters: (a) two overlapping clusters, (b) combined view, (c) manual gating.
Automated multivariate clustering is desirable for FC data analysis: repeatable, nonsubjective, and comprehends multivariate structure.

CHALLENGES OF AUTOMATED CLUSTERING OF FC DATA
- Large FC datasets (~1 million events)
- High dimensionality (20 or more dimensions)
- Very small clusters that are important in immunological analysis (~100 cells out of millions)
- Overlapping clusters and background noise
Our goal: design an automated clustering method capable of addressing these challenges.

MANY DIFFERENT CLUSTERING METHODS
Partitional clustering splits into soft methods (mixture models, fuzzy clustering) and hard methods (k-means); other families include grid-based clustering, spectral clustering, ...

MODEL-BASED CLUSTERING FOR FC DATA
Model-based clustering offers several advantages:
- Soft clustering: comprehends overlapping clusters and background noise
- BUT computationally expensive, and the choice of model imposes limitations
Recent proposals for statistical model-based FC clustering: Chan et al. [2008], Lo et al. [2008], Finak et al. [2009], Pyne et al. [2009].
We propose a computationally efficient model-based clustering method, SWIFT (Naim et al. [2010]), that offers two advantages:
- Scalability: faster computation + less memory usage
- Detection of small populations: 100 cells out of 1 million


SWIFT ALGORITHM FOR FC DATA CLUSTERING
SWIFT is a three-stage algorithm:
1 Weighted Iterative Sampling Based EM: Gaussian mixture model clustering + novel weighted iterative sampling; the Bayesian Information Criterion (BIC) selects the number of components.
2 Bimodality Splitting: split any cluster that is bimodal in any dimension or any principal component. Useful for clustering high-dimensional data.
3 Graph-Based Merging: merge overlapping Gaussians (Hennig [2009], Finak et al. [2009], Baudry et al. [2010]). Allows representation of non-Gaussian clusters.

CLUSTERING STRATEGY: SWIFT
1. GMM clustering with sampling for k ∈ [Kmin, Kmax]; BIC decides the number of Gaussians (K̂).
2. Split bimodal clusters until unimodal, yielding K_split clusters.
3. Graph-based merging using overlap/entropy criteria, yielding K_entropy clusters.
4. Soft clustering for the K_entropy clusters.


STAGE 1: GAUSSIAN MIXTURE MODEL CLUSTERING
Gaussian mixture model (GMM) clustering is chosen among the model-based methods:
- Faster than other model-based clustering methods
- Closed-form update equations
Expectation Maximization (EM) algorithm for parameter estimation. Computational complexity of each iteration: O(Nkd²), where
- N = the number of data vectors in the dataset
- k = the number of Gaussian components
- d = the dimension of each data vector
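The closed-form EM updates can be sketched in NumPy. This is an illustrative single iteration, not the SWIFT implementation; it shows where the O(Nkd²) per-iteration cost comes from.

```python
import numpy as np

def em_step(X, weights, means, covs):
    """One EM iteration for a full-covariance Gaussian mixture model.

    X: (N, d) data, weights: (k,), means: (k, d), covs: (k, d, d).
    Cost is O(N k d^2), dominated by the Mahalanobis distances in the
    E-step and the covariance updates in the M-step.
    """
    N, d = X.shape
    k = len(weights)
    log_resp = np.empty((N, k))
    for j in range(k):                                   # E-step
        diff = X - means[j]
        prec = np.linalg.inv(covs[j])
        maha = np.einsum('ni,ij,nj->n', diff, prec, diff)  # O(N d^2)
        _, logdet = np.linalg.slogdet(covs[j])
        log_resp[:, j] = (np.log(weights[j])
                          - 0.5 * (maha + logdet + d * np.log(2 * np.pi)))
    log_resp -= log_resp.max(axis=1, keepdims=True)      # numerical stability
    gamma = np.exp(log_resp)
    gamma /= gamma.sum(axis=1, keepdims=True)            # responsibilities
    Nj = gamma.sum(axis=0)                               # M-step (closed form)
    new_weights = Nj / N
    new_means = (gamma.T @ X) / Nj[:, None]
    new_covs = np.empty_like(covs)
    for j in range(k):
        diff = X - new_means[j]
        new_covs[j] = (gamma[:, j, None] * diff).T @ diff / Nj[j]
        new_covs[j] += 1e-6 * np.eye(d)                  # keep invertible
    return new_weights, new_means, new_covs, gamma
```

Iterating this step to convergence for each candidate k, and scoring each fit with BIC, recovers the model-selection loop of Stage 1.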

STAGE 1: SAMPLING FOR SCALABILITY
Operate on a smaller subsample of the dataset for better computational performance.
Challenge: poor representation of smaller clusters.
FIGURE: (a) 4 Gaussians with 150K, 100K, 50K and 150 data points; (b) after 10% sampling.
Solution: weighted iterative sampling
- Faster computation
- Better detection of small clusters

STAGE 1: WEIGHTED ITERATIVE SAMPLING BASED EM
Flowchart:
1. FCS dataset X; subsample S from X.
2. GMM fitting to S using EM.
3. Fix the p largest clusters and add them to F, the set of clusters whose parameters are fixed (initially F = ∅).
4. Resample S from X with probability P(X^(i) is selected in S) ∝ 1 − Σ_{l∈F} γ_l^(i).
5. If not all clusters are fixed, repeat from step 2; otherwise perform a few EM iterations on the full X and output the model parameters θ.
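The reweighted resampling step in the flowchart can be sketched as follows; the function name and fixed sample size are illustrative assumptions, not SWIFT internals.

```python
import numpy as np

def resample_indices(gamma_fixed, sample_size, rng):
    """Draw the next subsample for weighted iterative sampling.

    gamma_fixed: (N, |F|) posterior responsibilities of the clusters in F
    whose parameters have already been fixed.  An event that is well
    explained by the fixed clusters (responsibility mass near 1) is
    unlikely to be drawn again, so each round of resampling concentrates
    on the remaining, smaller clusters.
    """
    p = 1.0 - gamma_fixed.sum(axis=1)     # residual responsibility per event
    p = np.clip(p, 0.0, None)             # guard against rounding below zero
    p = p / p.sum()
    return rng.choice(len(p), size=sample_size, replace=False, p=p)
```

With an empty F the residuals are all 1, so the first round reduces to uniform random sampling, matching the first-sample slide below.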

Example: 4 Gaussian clusters with 150K, 100K, 50K and 150 data points.

WEIGHTED ITERATIVE SAMPLING: FIRST SAMPLE
FIGURE: (a) first sample; (b) clustering the first sample. Uniform random sampling.

WEIGHTED ITERATIVE SAMPLING: SECOND SAMPLE
FIGURE: (c) second sample; (d) clustering the second sample. Sampling probability: 1 − Σ_{l∈{1}} γ_l^(i).

WEIGHTED ITERATIVE SAMPLING: THIRD SAMPLE
FIGURE: (e) third sample; (f) clustering the third sample. Sampling probability: 1 − Σ_{l∈{1,2}} γ_l^(i).

WEIGHTED ITERATIVE SAMPLING: LAST SAMPLE
FIGURE: (g) last sample; (h) final clustering. Sampling probability: 1 − Σ_{l∈{1,2,3}} γ_l^(i).


STAGE 2: BIMODALITY SPLITTING
Motivated by biology: separation along only one dimension can be significant.
Clustering is challenging for high-dimensional data. Problem: the curse of dimensionality. Discrimination in one dimension can be obfuscated by strong similarity in the other dimensions.
Gaussian mixture modeling of high-dimensional data sometimes yields small clusters that are bimodal in one or two dimensions.
Solution: detect bimodal clusters and split them.

STAGE 2: BIMODALITY SPLITTING
Bimodality detection: detect clusters that are bimodal in
- any given dimension, or
- any principal component.
Perform 1-D kernel density estimation and compute the number of modes. Split each bimodal cluster until all subclusters are unimodal.
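The 1-D mode-counting test can be sketched with a simple Gaussian kernel density estimate; the fixed bandwidth here is a simplifying assumption, not the bandwidth rule used by SWIFT.

```python
import numpy as np

def count_modes(x, bandwidth=1.0, grid_size=256):
    """Count modes of a 1-D sample via a Gaussian kernel density estimate.

    Bimodality splitting applies a test like this to each dimension (and
    each principal component) of a candidate cluster: more than one mode
    means the cluster should be split.
    """
    grid = np.linspace(x.min() - 3 * bandwidth,
                       x.max() + 3 * bandwidth, grid_size)
    # KDE: sum of Gaussian kernels centred on the data points
    dens = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / bandwidth) ** 2).sum(axis=1)
    inner = dens[1:-1]
    # a mode is a grid point strictly higher than both of its neighbours
    return int(np.sum((inner > dens[:-2]) & (inner > dens[2:])))
```

Applying `count_modes` to every dimension of a cluster, and to its principal-component projections, and splitting whenever the count exceeds one, mirrors the splitting loop described above.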


STAGE 3: GRAPH-BASED MERGING
Merging of overlapping Gaussian components allows representing non-Gaussian clusters.
FIGURE: (a) after fitting 10 Gaussians; (b) after merging down to 2 clusters.

STAGE 3: GRAPH-BASED MERGING
Merging criterion: Normalized Overlap measure (NO), a Jaccard index. Let E_i be the ellipsoid approximating the i-th Gaussian. Then

    NO(i, j) = Vol(E_i ∩ E_j) / Vol(E_i ∪ E_j)    (1)

Merge the pair (i*, j*) such that

    (i*, j*) = argmax_{(i,j)} NO(i, j)    (2)

Stopping criterion: merge until there are no significant changes in entropy (Finak et al. [2009], Baudry et al. [2010]):

    Ent(K) = − Σ_{i=1}^{n} Σ_{j=1}^{K} γ_j^(i) log(γ_j^(i))    (3)
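The entropy stopping rule of Eq. (3) can be illustrated with a small sketch: merging two components sums their responsibility columns, and the soft-assignment entropy drops sharply when the merged components were heavily overlapping. Function names are illustrative, not from the SWIFT code.

```python
import numpy as np

def soft_entropy(gamma):
    """Ent(K) = -sum_i sum_j gamma_j^(i) log gamma_j^(i), as in Eq. (3)."""
    g = np.clip(gamma, 1e-12, 1.0)        # avoid log(0)
    return float(-(g * np.log(g)).sum())

def merge_pair(gamma, i, j):
    """Merge components i < j by summing their responsibility columns."""
    merged = np.delete(gamma, j, axis=1)  # copy without column j
    merged[:, i] += gamma[:, j]
    return merged
```

Two fully overlapping components give each event responsibilities (0.5, 0.5), so merging them removes n·log 2 of entropy; well-separated components contribute almost no entropy, so merging them changes Ent(K) very little, which is why the entropy plot flattens at the right number of clusters.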

STAGE 3: GRAPH-BASED MERGING
FIGURE: Merging sequence for 5 Gaussian clusters.


DOES IT WORK?
Experiment: cluster high-dimensional FC data.
Dataset: 544,000 events, 21 dimensions.
SWIFT output: 191 Gaussians (Gaussian fitting + bimodality splitting); 143 clusters (post merging).

FIGURE: 544,000 events, 21 dimensions, 143 clusters.

HOW DO WE KNOW IT WORKS?
Experiments to produce datasets with ground truth (Rochester Human Immunology Center).
Electronic mixture of human cells and mouse cells:
- Two data files: human cells only and mouse cells only
- Human data file: 276,418 events; mouse data file: 267,582 events; total: 544,000 events
- Stained using both human and mouse antibodies
- Human/mouse label is known for every cell
Examine every cluster: human only? mouse only? or both?

FIGURE: Fractional membership (human or mouse) for each cluster; x-axis: cluster numbers.

SMALL CLUSTER DETECTION
Electronic human-mouse mixtures with varying proportions of human cells. Five datasets: 50%, 25%, 10%, 1%, and 0.1% human cells.
Sensitivity analysis:
- Precision = TP / (TP + FP)
- Recall = TP / (TP + FN)
TABLE: Precision, recall, and number of detected human clusters for each mixing proportion (50%: 84 human clusters; 25%: 59; 10%: 38; 0.1%: 4).
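With the human/mouse label known for every event, the two measures above can be computed directly from the predicted and true event sets; a minimal sketch (the set-based bookkeeping is an assumption, not the paper's code):

```python
def precision_recall(predicted, actual):
    """Precision = TP/(TP+FP), Recall = TP/(TP+FN) over sets of event ids.

    predicted: event ids assigned to human clusters by the algorithm.
    actual: event ids that are truly human (known from the electronic mix).
    """
    tp = len(predicted & actual)   # human events correctly called human
    fp = len(predicted - actual)   # mouse events mislabeled as human
    fn = len(actual - predicted)   # human events that were missed
    return tp / (tp + fp), tp / (tp + fn)
```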

ADVANTAGES OF SWIFT
- Scalable memory and computation time: complexity of each EM iteration reduced from O(Nkd²) to O(nkd²), where n = sample size
- Better resolution of small clusters
- Capable of detecting non-Gaussian clusters
- Works well for overlapping clusters (true for all model-based methods)


CLUSTERING RESULTS: GVHD DATASET
GvHD dataset: Data Sample 001.fcs, 6 dimensions.
13 Gaussians using BIC; 11 clusters after merging.

CLUSTERING RESULTS: GVHD DATASET
FIGURES: GvHD clustering results shown in FL2.H vs. FL1.H and SSC.H vs. FSC.H projections.

FEW THOUGHTS FOR FLOWCAP II
FlowCAP-I datasets were relatively small:
- Fewer than 100,000 events
- Maximum 12 dimensions
- Number of clusters usually smaller than 25
Suggestions for FlowCAP II:
- Introduce larger datasets: 1 million events and 20 dimensions are common
- Introduce different tasks and corresponding performance measures: detection of very small clusters; detection of overlapping populations
- Gold standard for validation? Manual gating is focused rather than exhaustive, and does not comprehend overlapping populations
- Electronically mixed datasets (e.g., the human/mouse dataset) for objective evaluation


CONCLUSION
SWIFT: a scalable algorithm for FC data clustering.
Posterior-sampling-based EM + bimodality splitting + graph-based merging.
Advantages: lower computational complexity + better small-cluster resolution.
Extensible to other soft clustering methods: mixtures of t or skewed-t distributions, or fuzzy clustering.
Further speed-up can be achieved by combining with parallelization, e.g., on GPUs (Suchard et al. [2010], Espenshade et al. [2009]).
Future work: improve stability and robustness; cross-sample cluster matching for biological inference.

COLLABORATORS
SWIFT: Naim, Sharma, Datta, Cavenaugh, Rebhahn, Wang, Mosmann
Acceleration: Pangborn, Cavenaugh, von Laszewski
GAFF: Rebhahn, Cavenaugh, Naim, Sharma, Mosmann
Rochester Human Immunology Center: Quataert, Mosmann
Other projects (NYICE Influenza, Lymphoma, RPBIP, Immunocompromised, Asthma, ACE, Autoimmunity): Bernstein, Quataert; Treanor, Topham, Sant, Kim, Whittaker, Mosmann; Sanz, Looney, Mosmann, Ritchlin, Anolik, Quataert; Georas, Looney, Mosmann; Sanz, Fowell, Looney, Quataert, Mosmann

REFERENCES
J.-P. Baudry, A. E. Raftery, G. Celeux, K. Lo, and R. Gottardo. Combining mixture components for clustering. Journal of Computational and Graphical Statistics, 19(2), 2010.
C. Chan, F. Feng, J. Ottinger, D. Foster, M. West, and T. B. Kepler. Statistical mixture modeling for cell subtype identification in flow cytometry. Cytometry Part A, 2008.
J. Espenshade, A. Pangborn, G. von Laszewski, D. Roberts, and J. S. Cavenaugh. Accelerating partitional algorithms for flow cytometry on GPUs. 2009.
G. Finak, R. Gottardo, R. Brinkman, et al. Merging mixture components for cell population identification in flow cytometry. Advances in Bioinformatics, 2009.
C. Hennig. Methods for merging Gaussian mixture components. Advances in Data Analysis and Classification, 2009.
K. Lo, R. R. Brinkman, and R. Gottardo. Automated gating of flow cytometry data via robust model-based clustering. Cytometry Part A, 73, 2008.
I. Naim, S. Datta, G. Sharma, J. Cavenaugh, and T. Mosmann. SWIFT: Scalable weighted iterative sampling for flow cytometry clustering. In Proc. IEEE Intl. Conf. Acoustics, Speech and Signal Processing, Dallas, TX, USA, Mar. 2010.
S. Pyne et al. Automated high-dimensional flow cytometric data analysis. PNAS, 106(21):8519, 2009.


More information

Tri-modal Human Body Segmentation

Tri-modal Human Body Segmentation Tri-modal Human Body Segmentation Master of Science Thesis Cristina Palmero Cantariño Advisor: Sergio Escalera Guerrero February 6, 2014 Outline 1 Introduction 2 Tri-modal dataset 3 Proposed baseline 4

More information

Data Clustering Hierarchical Clustering, Density based clustering Grid based clustering

Data Clustering Hierarchical Clustering, Density based clustering Grid based clustering Data Clustering Hierarchical Clustering, Density based clustering Grid based clustering Team 2 Prof. Anita Wasilewska CSE 634 Data Mining All Sources Used for the Presentation Olson CF. Parallel algorithms

More information

The K-modes and Laplacian K-modes algorithms for clustering

The K-modes and Laplacian K-modes algorithms for clustering The K-modes and Laplacian K-modes algorithms for clustering Miguel Á. Carreira-Perpiñán Electrical Engineering and Computer Science University of California, Merced http://faculty.ucmerced.edu/mcarreira-perpinan

More information

Cluster Analysis. Ying Shen, SSE, Tongji University

Cluster Analysis. Ying Shen, SSE, Tongji University Cluster Analysis Ying Shen, SSE, Tongji University Cluster analysis Cluster analysis groups data objects based only on the attributes in the data. The main objective is that The objects within a group

More information

Keywords Clustering, Goals of clustering, clustering techniques, clustering algorithms.

Keywords Clustering, Goals of clustering, clustering techniques, clustering algorithms. Volume 3, Issue 5, May 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com A Survey of Clustering

More information

10/14/2017. Dejan Sarka. Anomaly Detection. Sponsors

10/14/2017. Dejan Sarka. Anomaly Detection. Sponsors Dejan Sarka Anomaly Detection Sponsors About me SQL Server MVP (17 years) and MCT (20 years) 25 years working with SQL Server Authoring 16 th book Authoring many courses, articles Agenda Introduction Simple

More information

Density estimation. In density estimation problems, we are given a random from an unknown density. Our objective is to estimate

Density estimation. In density estimation problems, we are given a random from an unknown density. Our objective is to estimate Density estimation In density estimation problems, we are given a random sample from an unknown density Our objective is to estimate? Applications Classification If we estimate the density for each class,

More information

Clustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin

Clustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin Clustering K-means Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, 2014 Carlos Guestrin 2005-2014 1 Clustering images Set of Images [Goldberger et al.] Carlos Guestrin 2005-2014

More information

The Projected Dip-means Clustering Algorithm

The Projected Dip-means Clustering Algorithm Theofilos Chamalis Department of Computer Science & Engineering University of Ioannina GR 45110, Ioannina, Greece thchama@cs.uoi.gr ABSTRACT One of the major research issues in data clustering concerns

More information

TWO-STEP SEMI-SUPERVISED APPROACH FOR MUSIC STRUCTURAL CLASSIFICATION. Prateek Verma, Yang-Kai Lin, Li-Fan Yu. Stanford University

TWO-STEP SEMI-SUPERVISED APPROACH FOR MUSIC STRUCTURAL CLASSIFICATION. Prateek Verma, Yang-Kai Lin, Li-Fan Yu. Stanford University TWO-STEP SEMI-SUPERVISED APPROACH FOR MUSIC STRUCTURAL CLASSIFICATION Prateek Verma, Yang-Kai Lin, Li-Fan Yu Stanford University ABSTRACT Structural segmentation involves finding hoogeneous sections appearing

More information

Unsupervised Learning. Clustering and the EM Algorithm. Unsupervised Learning is Model Learning

Unsupervised Learning. Clustering and the EM Algorithm. Unsupervised Learning is Model Learning Unsupervised Learning Clustering and the EM Algorithm Susanna Ricco Supervised Learning Given data in the form < x, y >, y is the target to learn. Good news: Easy to tell if our algorithm is giving the

More information

Bioimage Informatics

Bioimage Informatics Bioimage Informatics Lecture 14, Spring 2012 Bioimage Data Analysis (IV) Image Segmentation (part 3) Lecture 14 March 07, 2012 1 Outline Review: intensity thresholding based image segmentation Morphological

More information

Introduction to Mobile Robotics

Introduction to Mobile Robotics Introduction to Mobile Robotics Clustering Wolfram Burgard Cyrill Stachniss Giorgio Grisetti Maren Bennewitz Christian Plagemann Clustering (1) Common technique for statistical data analysis (machine learning,

More information

Clustering and Dissimilarity Measures. Clustering. Dissimilarity Measures. Cluster Analysis. Perceptually-Inspired Measures

Clustering and Dissimilarity Measures. Clustering. Dissimilarity Measures. Cluster Analysis. Perceptually-Inspired Measures Clustering and Dissimilarity Measures Clustering APR Course, Delft, The Netherlands Marco Loog May 19, 2008 1 What salient structures exist in the data? How many clusters? May 19, 2008 2 Cluster Analysis

More information

Part I. Hierarchical clustering. Hierarchical Clustering. Hierarchical clustering. Produces a set of nested clusters organized as a

Part I. Hierarchical clustering. Hierarchical Clustering. Hierarchical clustering. Produces a set of nested clusters organized as a Week 9 Based in part on slides from textbook, slides of Susan Holmes Part I December 2, 2012 Hierarchical Clustering 1 / 1 Produces a set of nested clusters organized as a Hierarchical hierarchical clustering

More information

Transfer Learning for Automatic Gating of Flow Cytometry Data

Transfer Learning for Automatic Gating of Flow Cytometry Data Transfer Learning for Automatic Gating of Flow Cytometry Data Gyemin Lee Lloyd Stoolman Department of Electrical Engineering and Computer Science Department of Pathology University of Michigan, Ann Arbor,

More information

Clustering. Supervised vs. Unsupervised Learning

Clustering. Supervised vs. Unsupervised Learning Clustering Supervised vs. Unsupervised Learning So far we have assumed that the training samples used to design the classifier were labeled by their class membership (supervised learning) We assume now

More information

Machine Learning (BSMC-GA 4439) Wenke Liu

Machine Learning (BSMC-GA 4439) Wenke Liu Machine Learning (BSMC-GA 4439) Wenke Liu 01-25-2018 Outline Background Defining proximity Clustering methods Determining number of clusters Other approaches Cluster analysis as unsupervised Learning Unsupervised

More information

Supervised vs. Unsupervised Learning

Supervised vs. Unsupervised Learning Clustering Supervised vs. Unsupervised Learning So far we have assumed that the training samples used to design the classifier were labeled by their class membership (supervised learning) We assume now

More information

COMP 465: Data Mining Still More on Clustering

COMP 465: Data Mining Still More on Clustering 3/4/015 Exercise COMP 465: Data Mining Still More on Clustering Slides Adapted From : Jiawei Han, Micheline Kamber & Jian Pei Data Mining: Concepts and Techniques, 3 rd ed. Describe each of the following

More information

Clustering. Shishir K. Shah

Clustering. Shishir K. Shah Clustering Shishir K. Shah Acknowledgement: Notes by Profs. M. Pollefeys, R. Jin, B. Liu, Y. Ukrainitz, B. Sarel, D. Forsyth, M. Shah, K. Grauman, and S. K. Shah Clustering l Clustering is a technique

More information

Clustering with Confidence: A Low-Dimensional Binning Approach

Clustering with Confidence: A Low-Dimensional Binning Approach Clustering with Confidence: A Low-Dimensional Binning Approach Rebecca Nugent and Werner Stuetzle Abstract We present a plug-in method for estimating the cluster tree of a density. The method takes advantage

More information

Parallel and Hierarchical Mode Association Clustering with an R Package Modalclust

Parallel and Hierarchical Mode Association Clustering with an R Package Modalclust Parallel and Hierarchical Mode Association Clustering with an R Package Modalclust Surajit Ray and Yansong Cheng Department of Mathematics and Statistics Boston University 111 Cummington Street, Boston,

More information

SYDE Winter 2011 Introduction to Pattern Recognition. Clustering

SYDE Winter 2011 Introduction to Pattern Recognition. Clustering SYDE 372 - Winter 2011 Introduction to Pattern Recognition Clustering Alexander Wong Department of Systems Design Engineering University of Waterloo Outline 1 2 3 4 5 All the approaches we have learned

More information

Client Dependent GMM-SVM Models for Speaker Verification

Client Dependent GMM-SVM Models for Speaker Verification Client Dependent GMM-SVM Models for Speaker Verification Quan Le, Samy Bengio IDIAP, P.O. Box 592, CH-1920 Martigny, Switzerland {quan,bengio}@idiap.ch Abstract. Generative Gaussian Mixture Models (GMMs)

More information

Machine Learning Department School of Computer Science Carnegie Mellon University. K- Means + GMMs

Machine Learning Department School of Computer Science Carnegie Mellon University. K- Means + GMMs 10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University K- Means + GMMs Clustering Readings: Murphy 25.5 Bishop 12.1, 12.3 HTF 14.3.0 Mitchell

More information

Introduction to Machine Learning

Introduction to Machine Learning Department of Computer Science, University of Helsinki Autumn 2009, second term Session 8, November 27 th, 2009 1 2 3 Multiplicative Updates for L1-Regularized Linear and Logistic Last time I gave you

More information

CS Introduction to Data Mining Instructor: Abdullah Mueen

CS Introduction to Data Mining Instructor: Abdullah Mueen CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen LECTURE 8: ADVANCED CLUSTERING (FUZZY AND CO -CLUSTERING) Review: Basic Cluster Analysis Methods (Chap. 10) Cluster Analysis: Basic Concepts

More information

CS839: Probabilistic Graphical Models. Lecture 10: Learning with Partially Observed Data. Theo Rekatsinas

CS839: Probabilistic Graphical Models. Lecture 10: Learning with Partially Observed Data. Theo Rekatsinas CS839: Probabilistic Graphical Models Lecture 10: Learning with Partially Observed Data Theo Rekatsinas 1 Partially Observed GMs Speech recognition 2 Partially Observed GMs Evolution 3 Partially Observed

More information

K-means and Hierarchical Clustering

K-means and Hierarchical Clustering K-means and Hierarchical Clustering Xiaohui Xie University of California, Irvine K-means and Hierarchical Clustering p.1/18 Clustering Given n data points X = {x 1, x 2,, x n }. Clustering is the partitioning

More information

Cluster analysis formalism, algorithms. Department of Cybernetics, Czech Technical University in Prague.

Cluster analysis formalism, algorithms. Department of Cybernetics, Czech Technical University in Prague. Cluster analysis formalism, algorithms Jiří Kléma Department of Cybernetics, Czech Technical University in Prague http://ida.felk.cvut.cz poutline motivation why clustering? applications, clustering as

More information

Model-Based Clustering for Online Crisis Identification in Distributed Computing

Model-Based Clustering for Online Crisis Identification in Distributed Computing Model-Based Clustering for Crisis Identification in Distributed Computing Dawn Woodard Operations Research and Information Engineering Cornell University with Moises Goldszmidt Microsoft Research 1 Outline

More information

Critical Assessment of Automated Flow Cytometry Data Analysis Techniques

Critical Assessment of Automated Flow Cytometry Data Analysis Techniques Critical Assessment of Automated Flow Cytometry Data Analysis Techniques Nima Aghaeepour, Greg Finak, The FlowCAP Consortium, The DREAM Consortium, Holger Hoos, Tim R. Mosmann, Raphael Gottardo, Ryan R.

More information

University of Florida CISE department Gator Engineering. Clustering Part 2

University of Florida CISE department Gator Engineering. Clustering Part 2 Clustering Part 2 Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville Partitional Clustering Original Points A Partitional Clustering Hierarchical

More information

Mondrian Processes for Flow Cytometry Analysis

Mondrian Processes for Flow Cytometry Analysis Mondrian Processes for Flow Cytometry Analysis Disi Ji, Eric Nalisnick, Padhraic Smyth Department of Computer Science University of California, Irvine {disij, enalisni, p.smyth}@uci.edu Abstract Analysis

More information

Application of Principal Components Analysis and Gaussian Mixture Models to Printer Identification

Application of Principal Components Analysis and Gaussian Mixture Models to Printer Identification Application of Principal Components Analysis and Gaussian Mixture Models to Printer Identification Gazi. Ali, Pei-Ju Chiang Aravind K. Mikkilineni, George T. Chiu Edward J. Delp, and Jan P. Allebach School

More information

COMP 551 Applied Machine Learning Lecture 13: Unsupervised learning

COMP 551 Applied Machine Learning Lecture 13: Unsupervised learning COMP 551 Applied Machine Learning Lecture 13: Unsupervised learning Associate Instructor: Herke van Hoof (herke.vanhoof@mail.mcgill.ca) Slides mostly by: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp551

More information

Clustering CS 550: Machine Learning

Clustering CS 550: Machine Learning Clustering CS 550: Machine Learning This slide set mainly uses the slides given in the following links: http://www-users.cs.umn.edu/~kumar/dmbook/ch8.pdf http://www-users.cs.umn.edu/~kumar/dmbook/dmslides/chap8_basic_cluster_analysis.pdf

More information

Expectation Maximization!

Expectation Maximization! Expectation Maximization! adapted from: Doug Downey and Bryan Pardo, Northwestern University and http://www.stanford.edu/class/cs276/handouts/lecture17-clustering.ppt Steps in Clustering Select Features

More information

Robust Event Boundary Detection in Sensor Networks A Mixture Model Based Approach

Robust Event Boundary Detection in Sensor Networks A Mixture Model Based Approach Robust Event Boundary Detection in Sensor Networks A Mixture Model Based Approach Min Ding Department of Computer Science The George Washington University Washington DC 20052, USA Email: minding@gwu.edu

More information

Warped Mixture Models

Warped Mixture Models Warped Mixture Models Tomoharu Iwata, David Duvenaud, Zoubin Ghahramani Cambridge University Computational and Biological Learning Lab March 11, 2013 OUTLINE Motivation Gaussian Process Latent Variable

More information

Hidden Markov Models. Gabriela Tavares and Juri Minxha Mentor: Taehwan Kim CS159 04/25/2017

Hidden Markov Models. Gabriela Tavares and Juri Minxha Mentor: Taehwan Kim CS159 04/25/2017 Hidden Markov Models Gabriela Tavares and Juri Minxha Mentor: Taehwan Kim CS159 04/25/2017 1 Outline 1. 2. 3. 4. Brief review of HMMs Hidden Markov Support Vector Machines Large Margin Hidden Markov Models

More information

Generative and discriminative classification techniques

Generative and discriminative classification techniques Generative and discriminative classification techniques Machine Learning and Category Representation 2014-2015 Jakob Verbeek, November 28, 2014 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.14.15

More information

Machine Learning. B. Unsupervised Learning B.1 Cluster Analysis. Lars Schmidt-Thieme

Machine Learning. B. Unsupervised Learning B.1 Cluster Analysis. Lars Schmidt-Thieme Machine Learning B. Unsupervised Learning B.1 Cluster Analysis Lars Schmidt-Thieme Information Systems and Machine Learning Lab (ISMLL) Institute for Computer Science University of Hildesheim, Germany

More information

Hard clustering. Each object is assigned to one and only one cluster. Hierarchical clustering is usually hard. Soft (fuzzy) clustering

Hard clustering. Each object is assigned to one and only one cluster. Hierarchical clustering is usually hard. Soft (fuzzy) clustering An unsupervised machine learning problem Grouping a set of objects in such a way that objects in the same group (a cluster) are more similar (in some sense or another) to each other than to those in other

More information

COMS 4771 Clustering. Nakul Verma

COMS 4771 Clustering. Nakul Verma COMS 4771 Clustering Nakul Verma Supervised Learning Data: Supervised learning Assumption: there is a (relatively simple) function such that for most i Learning task: given n examples from the data, find

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Clustering and EM Barnabás Póczos & Aarti Singh Contents Clustering K-means Mixture of Gaussians Expectation Maximization Variational Methods 2 Clustering 3 K-

More information

Latent Variable Models and Expectation Maximization

Latent Variable Models and Expectation Maximization Latent Variable Models and Expectation Maximization Oliver Schulte - CMPT 726 Bishop PRML Ch. 9 2 4 6 8 1 12 14 16 18 2 4 6 8 1 12 14 16 18 5 1 15 2 25 5 1 15 2 25 2 4 6 8 1 12 14 2 4 6 8 1 12 14 5 1 15

More information

Random projection for non-gaussian mixture models

Random projection for non-gaussian mixture models Random projection for non-gaussian mixture models Győző Gidófalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 92037 gyozo@cs.ucsd.edu Abstract Recently,

More information

Collective Entity Resolution in Relational Data

Collective Entity Resolution in Relational Data Collective Entity Resolution in Relational Data I. Bhattacharya, L. Getoor University of Maryland Presented by: Srikar Pyda, Brett Walenz CS590.01 - Duke University Parts of this presentation from: http://www.norc.org/pdfs/may%202011%20personal%20validation%20and%20entity%20resolution%20conference/getoorcollectiveentityresolution

More information

INF4820 Algorithms for AI and NLP. Evaluating Classifiers Clustering

INF4820 Algorithms for AI and NLP. Evaluating Classifiers Clustering INF4820 Algorithms for AI and NLP Evaluating Classifiers Clustering Erik Velldal & Stephan Oepen Language Technology Group (LTG) September 23, 2015 Agenda Last week Supervised vs unsupervised learning.

More information

CSE 5243 INTRO. TO DATA MINING

CSE 5243 INTRO. TO DATA MINING CSE 5243 INTRO. TO DATA MINING Cluster Analysis: Basic Concepts and Methods Huan Sun, CSE@The Ohio State University 09/25/2017 Slides adapted from UIUC CS412, Fall 2017, by Prof. Jiawei Han 2 Chapter 10.

More information

CSE 5243 INTRO. TO DATA MINING

CSE 5243 INTRO. TO DATA MINING CSE 5243 INTRO. TO DATA MINING Cluster Analysis: Basic Concepts and Methods Huan Sun, CSE@The Ohio State University Slides adapted from UIUC CS412, Fall 2017, by Prof. Jiawei Han 2 Chapter 10. Cluster

More information

Optimization of Observation Membership Function By Particle Swarm Method for Enhancing Performances of Speaker Identification

Optimization of Observation Membership Function By Particle Swarm Method for Enhancing Performances of Speaker Identification Proceedings of the 6th WSEAS International Conference on SIGNAL PROCESSING, Dallas, Texas, USA, March 22-24, 2007 52 Optimization of Observation Membership Function By Particle Swarm Method for Enhancing

More information

MACHINE LEARNING: CLUSTERING, AND CLASSIFICATION. Steve Tjoa June 25, 2014

MACHINE LEARNING: CLUSTERING, AND CLASSIFICATION. Steve Tjoa June 25, 2014 MACHINE LEARNING: CLUSTERING, AND CLASSIFICATION Steve Tjoa kiemyang@gmail.com June 25, 2014 Review from Day 2 Supervised vs. Unsupervised Unsupervised - clustering Supervised binary classifiers (2 classes)

More information

Clustering. CS294 Practical Machine Learning Junming Yin 10/09/06

Clustering. CS294 Practical Machine Learning Junming Yin 10/09/06 Clustering CS294 Practical Machine Learning Junming Yin 10/09/06 Outline Introduction Unsupervised learning What is clustering? Application Dissimilarity (similarity) of objects Clustering algorithm K-means,

More information

COMP5318 Knowledge Management & Data Mining Assignment 1

COMP5318 Knowledge Management & Data Mining Assignment 1 COMP538 Knowledge Management & Data Mining Assignment Enoch Lau SID 20045765 7 May 2007 Abstract 5.5 Scalability............... 5 Clustering is a fundamental task in data mining that aims to place similar

More information

Clustering and The Expectation-Maximization Algorithm

Clustering and The Expectation-Maximization Algorithm Clustering and The Expectation-Maximization Algorithm Unsupervised Learning Marek Petrik 3/7 Some of the figures in this presentation are taken from An Introduction to Statistical Learning, with applications

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2009 CS 551, Spring 2009 c 2009, Selim Aksoy (Bilkent University)

More information

Cluster Evaluation and Expectation Maximization! adapted from: Doug Downey and Bryan Pardo, Northwestern University

Cluster Evaluation and Expectation Maximization! adapted from: Doug Downey and Bryan Pardo, Northwestern University Cluster Evaluation and Expectation Maximization! adapted from: Doug Downey and Bryan Pardo, Northwestern University Kinds of Clustering Sequential Fast Cost Optimization Fixed number of clusters Hierarchical

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2008 CS 551, Spring 2008 c 2008, Selim Aksoy (Bilkent University)

More information

Lec 08 Feature Aggregation II: Fisher Vector, Super Vector and AKULA

Lec 08 Feature Aggregation II: Fisher Vector, Super Vector and AKULA Image Analysis & Retrieval CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W 4-5:15pm@Bloch 0012 Lec 08 Feature Aggregation II: Fisher Vector, Super Vector and AKULA Zhu Li Dept of CSEE,

More information

Learning Generative Graph Prototypes using Simplified Von Neumann Entropy

Learning Generative Graph Prototypes using Simplified Von Neumann Entropy Learning Generative Graph Prototypes using Simplified Von Neumann Entropy Lin Han, Edwin R. Hancock and Richard C. Wilson Department of Computer Science The University of York YO10 5DD, UK Graph representations

More information

INF4820, Algorithms for AI and NLP: Evaluating Classifiers Clustering

INF4820, Algorithms for AI and NLP: Evaluating Classifiers Clustering INF4820, Algorithms for AI and NLP: Evaluating Classifiers Clustering Erik Velldal University of Oslo Sept. 18, 2012 Topics for today 2 Classification Recap Evaluating classifiers Accuracy, precision,

More information

Content-based image and video analysis. Machine learning

Content-based image and video analysis. Machine learning Content-based image and video analysis Machine learning for multimedia retrieval 04.05.2009 What is machine learning? Some problems are very hard to solve by writing a computer program by hand Almost all

More information

Association Rule Mining and Clustering

Association Rule Mining and Clustering Association Rule Mining and Clustering Lecture Outline: Classification vs. Association Rule Mining vs. Clustering Association Rule Mining Clustering Types of Clusters Clustering Algorithms Hierarchical:

More information

Chapter 6 Continued: Partitioning Methods

Chapter 6 Continued: Partitioning Methods Chapter 6 Continued: Partitioning Methods Partitioning methods fix the number of clusters k and seek the best possible partition for that k. The goal is to choose the partition which gives the optimal

More information