Reorganizing the data by sample
Kevin R. Coombes
23 March 2011

Contents

1 Executive Summary
  1.1 Introduction
    1.1.1 Aims/Objectives
  1.2 Methods
    1.2.1 Description of Data
    1.2.2 Statistical Methods
  1.3 Results
  1.4 Conclusions
2 Details
3 Appendix

1 Executive Summary

1.1 Introduction

This report describes the analysis of a dataset from HapMap, a member of the laboratory of HapMap. This dataset was acquired using Illumina 610K Quad v1 chips. The main goal of the study is to identify genetic abnormalities that are associated with clinical outcome (including overall survival and time-to-treatment). This is the zeroth part of a series of related reports.

1.1.1 Aims/Objectives

There are two specific goals for the present report. First, we compute (if necessary) and record the desired sample names that will be attached to everything we do later. Second, we reconfigure the data (which starts out stored by chromosome) to store it instead by sample, which will make data handling much more efficient for all of the later steps.
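The second goal can be illustrated on a toy table. The following is only a minimal sketch under stated assumptions, not the report's actual code: the helper `dice_chromosome`, the sample names `S1` and `S2`, the column suffixes, and the output directory under `tempdir()` are all hypothetical and chosen for illustration.

```r
# Toy stand-in for one by-chromosome table: three annotation columns, then
# three measurement columns per sample (all names here are illustrative).
curchr <- data.frame(
  Name = c("rs1", "rs2"),
  Chr = c("13", "13"),
  Position = c(100L, 200L),
  S1.GType = c("AA", "AB"),
  S1.Log.R.Ratio = c(0.01, -0.02),
  S1.B.Allele.Freq = c(0.00, 0.51),
  S2.GType = c("BB", "AB"),
  S2.Log.R.Ratio = c(0.03, 0.00),
  S2.B.Allele.Freq = c(1.00, 0.49),
  stringsAsFactors = FALSE
)

# Hypothetical helper: write one file per sample for this chromosome and
# return the file paths, named by sample.
dice_chromosome <- function(curchr, samples, suffixes, chrname, outdir) {
  paths <- character(0)
  for (s in samples) {
    # annotation columns plus this sample's own measurement columns
    keep <- c("Name", "Chr", "Position", paste(s, suffixes, sep = "."))
    samdir <- file.path(outdir, s)
    if (!file.exists(samdir)) dir.create(samdir, recursive = TRUE)
    fd <- file.path(samdir, paste("chr", chrname, ".tsv", sep = ""))
    write.table(curchr[, keep], file = fd, quote = FALSE, sep = "\t",
                row.names = FALSE)
    paths[s] <- fd
  }
  paths
}

outdir <- file.path(tempdir(), "chrbysample_demo")
paths <- dice_chromosome(curchr, samples = c("S1", "S2"),
                         suffixes = c("GType", "Log.R.Ratio", "B.Allele.Freq"),
                         chrname = 13, outdir = outdir)

# Reading one sample back touches only that sample's small file.
s1 <- read.table(paths[["S1"]], sep = "\t", header = TRUE)
```

After this step, a later analysis that needs one sample on one chromosome reads a single small file instead of parsing the full by-chromosome table, which is the efficiency gain the second goal refers to.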
00-dataDicer

1.2 Methods

1.2.1 Description of Data

The dataset contains measurements on 225 normal controls. Extensive clinical follow-up is available.

1.2.2 Statistical Methods

Raw data were processed in BeadStudio or GenomeStudio to yield genotype calls, log R ratios (LRR), and B allele frequencies (BAF) for each SNP in each sample. Since the study does not include matched normal DNA, the BeadStudio computations were performed relative to the pool of 120 HapMap samples run by Illumina.

1.3 Results

The reconfigured data is stored in per-chromosome files within per-sample subdirectories of the ChrBySample directory.

1.4 Conclusions

We are ready to start our own processing of the data.

2 Details

We need to run a double loop over chromosomes and samples in order to process all the data. We start with a block of code that loads the LRR and BAF data for one chromosome, which is denoted by the variable chrname. This code also extracts a list of sample names if one has not yet been created.

> curdir <- paste("chr", chrname, sep = "_")
> curchrfile <- file.path(.inputdir, paste("full Data Table_Chr_", chrname,
+     ".txt", sep = ""))
> curchr <- read.table(curchrfile, sep = "\t", header = TRUE)
> # sort by position if any consecutive pair of SNPs is out of order
> if (any(diff(curchr$Position) < 0)) curchr <- curchr[order(curchr$Position),
+     ]
> if (!exists(".STARTCOL")) {
+     # genotype columns mark the start of each sample's block of columns
+     g <- grep("GType", colnames(curchr))
+     .BS.COLS.PER.PT <- median(diff(g))
+     .STARTCOL <- g[1]
+     rm(g)
+ }

We use this code to load the data from chromosome 13.

> memory.limit(2048)
[1]
> chrname <- .CHRN

Using the chromosome data, we can determine the short sample names. (This list may not be needed; it depends on how the raw image files were named when they were initially generated.)

> shorten <- function(name) {
+     strsplit(name, "_")[[1]][2]
+ }
> if (!exists("samplenames")) {
+     # column index of the first (genotype) column for each sample
+     samplenames <- seq(.STARTCOL, ncol(curchr), .BS.COLS.PER.PT)
+     shortnames <- names(samplenames) <- sub("\\.GType", "", colnames(curchr)[samplenames])
+     if (.MAKESHORTER) {
+         shortnames <- sapply(names(samplenames), shorten)
+     }
+     else if (exists("fixnames")) {
+         shortnames <- sapply(names(samplenames), fixnames)
+     }
+     else {
+         names(shortnames) <- shortnames
+     }
+     # mark repeated short names so that they remain distinguishable
+     dup <- duplicated(shortnames)
+     if (any(dup))
+         shortnames[dup] <- paste(shortnames[dup], "dup", sep = "-")
+ }
> save(samplenames, shortnames, file = "allsamplenames.rda")

We create a directory to store the results.

> if (!file.exists("chrbysample")) dir.create("chrbysample")

The main loop simply reads the data for each chromosome, parses it out by sample, and then writes the data for each sample to a separate file.

> wext <- NULL
> for (chrname in c(1:22, 'X', 'Y', 'XY', 'MT')) {
+   cat(paste("loading chromosome", chrname, "\n"), file=stderr())
+   <<load.one.chromosome>>
+   for (i in 1:length(samplenames)) {
+     longname <- names(shortnames)[i]
+     # key point: don't want to get prefixes
+     # without the final '.', looking for "CLZ.1" also gets "CLZ.11"
+     ln <- paste(longname, '\\.', sep='')
+     # but there is STILL a problem if one sample is called "X257"
+     # and another is called "X257.Rep2"
+     if (is.null(wext)) {
+       w0 <- which(regexpr(ln, colnames(curchr)) > 0)
+       wext <- sub(ln, "", colnames(curchr)[w0])
+     }
+     targets <- c("Name", "Chr", "Position",
+                  paste(longname, wext, sep='.'))
+     cid <- shortnames[i]
+     cat(paste("\twriting", cid, "\n"), file=stderr())
+     w <- which(colnames(curchr) %in% targets)
+     if (length(w) < 5) stop(paste("bad value, cid =", cid, "longname =", longname))
+     if (length(w) != 3+length(wext)) stop(paste("bad value, cid =", cid, "longname =", longname))
+     # make sure a directory exists for this sample
+     samdir <- file.path("chrbysample", cid)
+     if (!file.exists(samdir)) dir.create(samdir)
+     # create by-sample file
+     fd <- file.path(samdir, paste("chr", chrname, ".tsv", sep=""))
+     write.table(curchr[,w], file=fd, quote=FALSE, sep="\t", row.names=FALSE)
+   }
+   rm(curchr)
+   gc()
+ }

3 Appendix

This analysis was run in the following directory:

> getwd()
[1] "c:/snp-hapmap/analysis03"

Note that \\mdadqsfs02 is the standard institutional location for storing data and analyses; N: is the name given to that location on this machine.

This analysis was run in the following software environment:

> sessionInfo()
R version ( )
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
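A note on the main loop in Section 2: its comments flag a case that prefix matching does not handle, namely one sample called "X257" and another called "X257.Rep2". One possible workaround is sketched below. It is not part of the original analysis; the helper `own_columns` and the toy column names are hypothetical, and the sketch assumes that every column belonging to a sample is named as the sample name followed by a period and a suffix. It uses `startsWith`, which is available in base R from version 3.3.0.

```r
# Hypothetical helper: indices of the columns that belong to exactly one
# sample, excluding columns of any other sample whose name extends it.
own_columns <- function(longname, all_names, cols) {
  prefix <- paste0(longname, ".")
  mine <- startsWith(cols, prefix)
  # other samples whose names begin with "<longname>.", e.g. "X257.Rep2"
  longer <- all_names[startsWith(all_names, prefix)]
  for (other in longer) {
    mine <- mine & !startsWith(cols, paste0(other, "."))
  }
  which(mine)
}

# Toy column names (illustrative only)
cols <- c("Name", "Chr", "Position",
          "X257.GType", "X257.Log.R.Ratio", "X257.B.Allele.Freq",
          "X257.Rep2.GType", "X257.Rep2.Log.R.Ratio", "X257.Rep2.B.Allele.Freq")
all_names <- c("X257", "X257.Rep2")

naive <- grep("X257\\.", cols)                     # 4 5 6 7 8 9 (too many)
first <- own_columns("X257", all_names, cols)      # 4 5 6
rep2 <- own_columns("X257.Rep2", all_names, cols)  # 7 8 9
```

The design choice here is to compare fixed strings rather than regular expressions, so periods in sample names need no escaping, and to resolve the ambiguity by consulting the full list of sample names rather than the column names alone.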