The H3ABioNet GWAS Pipeline

University of the Witwatersrand, Johannesburg
School of Electrical and Information Engineering

Scott Hazelhurst

1 Introduction

Need to build pipelines

- Many bioinformatics workflows are complex
  - multiple steps, software dependencies
  - multiple parameters
- Constraints of good scientific practice
  - the analysis needs to be re-run often while understanding the data
  - reproducible by others
  - portable

Workflow/Pipeline

- Packaging of the steps in a complex analysis
  - automate the steps
  - not a black box: the user needs to understand the individual steps
- Use appropriate software technology to support this
  - Nextflow
  - Containerisation (Docker/Singularity)

Pan-African Bioinformatics Network for H3Africa

- Supports the work of the Human Heredity and Health in Africa (H3A) Consortium
  - build informatics and bioinformatics capacity
  - support H3A projects
- More than 30 nodes in 15 African countries
- Many activities: pipelines work is just one of them

2 Pipelines

H3ABioNet Pipelines Project

- Developed a set of pipelines to support common workflows
  - next generation sequence variant calling
  - 16S RNA analysis
  - imputation
  - GWAS

3 H3A GWAS

H3A GWAS pipeline

- Three separate workflows
  - conversion from Illumina Top/Bottom format
  - quality control
  - basic association study

3.1 Installing

Set-up

Must install:

- Java 8
- Nextflow
- Either all the bioinformatics tools and environments, or Docker/Singularity
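The set-up list above can be checked before the first run. The sketch below is not part of the workflow: the function name check_tools and the executable names (java, nextflow, docker, singularity) are assumptions about how the tools appear on your PATH, and it does not verify that the Java version is 8.

```python
import shutil


def check_tools(required=("java", "nextflow"),
                containers=("docker", "singularity")):
    """Report which executables can be found on PATH.

    Returns (found, missing, has_container):
      found         - dict mapping each name to True/False
      missing       - required names that were not found
      has_container - True if at least one container runtime was found
    """
    found = {name: shutil.which(name) is not None
             for name in (*required, *containers)}
    missing = [name for name in required if not found[name]]
    has_container = any(found[name] for name in containers)
    return found, missing, has_container


if __name__ == "__main__":
    found, missing, has_container = check_tools()
    for name, ok in found.items():
        print(f"{name:12s} {'found' if ok else 'not found'}")
    if missing:
        print("Install before continuing:", ", ".join(missing))
    if not has_container:
        print("No container runtime found: the bioinformatics tools "
              "must then be installed directly.")
```

If neither Docker nor Singularity is reported, the third item in the list applies: all the tools the workflow calls must be installed on the machine itself.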

Method 1: Using git

Especially if you're going to extend the workflow.

    git clone

This creates a folder called h3agwas with the scripts inside it.

Method 2: Using Nextflow

    nextflow pull h3abionet/h3agwas

3.2 Quality control

- Remove duplicate SNPs
- Remove SNPs and individuals with high missingness; filter on HWE and MAF
- Remove outliers on sample heterozygosity
- Remove related individuals
- Remove SNPs on HWE, differential missingness
- Produce reports

Running a QC

Controlled by a config file:

    params.input_dir = "input"
    params.input_pat = "sample"
    params.output = "test-qc"
    params.output_dir = "output"
    params.case_control = "/datab/awigengwas/aux/data.phe"
    params.case_control_col = "bmi_case_control"
    params.batch = "/datab/awigengwas/aux/awifam.phe"
    params.batch_col = "batch"
    params.phenotype = "/datab/awigengwas/aux/data.phe"
    params.pheno_col = "site"
    params.sexinfo_available = true
    params.pi_hat = 0.18

Main components

- Input directory and file
- Output directory and file
- Batch analysis: strongly recommended
- By phenotype (e.g., site): strongly recommended

- Case-control: binary, compulsory
- QC cut-offs

You need a phenotype file with headers.

Batch analysis is intended for cases where the DNA from the experiment is collected, shipped or genotyped separately. In QC you need to be sure that there aren't significant differences between the different batches.

Phenotype analysis allows you to do more detailed batch analysis. Depending on the phenotype chosen, there may or may not be an overall genotype difference. For example, in the AWI-Gen study we chose collection site as the phenotype of interest. Here we expect very significant overall genotype differences because our sites are in very different parts of Africa. The point of doing phenotype analysis for us is (a) to ensure that there are no batch effects at any site, and (b) to explore missingness and QC at the site level. You could also choose sex as the phenotype, in which case you would not see an overall genotype difference.

The case-control column is compulsory: you need to choose a phenotype that is binary. We do not expect the genotypes overall to differ between cases and controls; if there is an overall difference there is likely a QC problem. If you don't have a suitable binary case-control phenotype, make one up, even by randomly assigning participants to different groups.

For each of the above, you need to specify the file name where the data can be found and the column (the first line of the file must have headers).
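Because each analysis is identified by a file and a column name, a misspelt column is an easy mistake to make. The sketch below checks a phenotype file's header before the workflow is run. It is only an illustration: the helper names are hypothetical, the file is assumed to be whitespace-delimited, and the sample rows (including the FID and IID columns and the 1/2 coding) are made up.

```python
import io


def check_columns(handle, wanted):
    """Return the wanted column names that are absent from the header line."""
    header = handle.readline().split()
    return [col for col in wanted if col not in header]


def distinct_values(handle, column):
    """Collect the distinct values found in one column (header on line 1)."""
    header = handle.readline().split()
    idx = header.index(column)
    return {line.split()[idx] for line in handle if line.strip()}


if __name__ == "__main__":
    # Made-up phenotype data; a real run would use open("data.phe") instead.
    sample = (
        "FID IID site batch bmi_case_control\n"
        "F1 I1 A b1 1\n"
        "F2 I2 B b2 2\n"
    )
    # Column names as they might appear in the config file.
    print(check_columns(io.StringIO(sample),
                        ["site", "batch", "bmi_case_control"]))
    # A misspelt name is reported rather than failing inside the workflow.
    print(check_columns(io.StringIO(sample), ["batches"]))
    # A case-control column should have exactly two distinct values.
    values = distinct_values(io.StringIO(sample), "bmi_case_control")
    print(sorted(values), "binary" if len(values) == 2 else "not binary")
```

If the real file codes missing values with an extra symbol, the binary check would need to ignore that symbol; this sketch does not.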
Usually the batch, phenotype and case-control data will be in the same file under different column names, but you may have multiple files.

To get the workflow:

    nextflow pull h3abionet/h3agwas

Then to run:

    #either
    nextflow run -c my.config h3abionet/h3agwas/plink-qc
    #or
    nextflow run -c my.config h3abionet/h3agwas/plink-qc

Example run:

    nextflow run -c my.config h3abionet/h3agwas/plink-qc
    N E X T F L O W ~ version
    Launching ../plink-qc.nf [pedantic_kare] - revision: ac5217ecef
    Sexinfo available command
    [warm up] executor > local
    [fe/c0582e] Submitted process > inmd5 (1)
    [d4/1e3bbb] Submitted process > getduplicatemarkers (1)
    [f4/ed17d8] Submitted process > removeduplicatesnps (1)
    [ad/6d480c] Submitted process > getinitmaf (1)
    [8b/40df13] Submitted process > getx (1)
    [57/ef097e] Submitted process > identifyindivdiscsexinfo (1)
    [91/59ea3c] Submitted process > generateindivmissingnessplot (1)
    [ab/e1643d] Submitted process > generatesnpmissingnessplot (1)
    [d8/9d32e8] Submitted process > removeqcphase1 (1)
    ....
    [e5/2817fb] Submitted process > generatemafplot (1)
    [d4/e6d01c] Submitted process > producereports (1)

The output report is called output/test-qc.pdf

Options for other execution environments:

    -profile pbs
    -profile docker
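Each "Submitted process" line in the console output above names one task and gives the short hash that prefixes its work directory. As a rough sketch (the function name and the regular expression are my own, written against the log lines shown here rather than any documented log format), the lines can be pulled out like this:

```python
import re

# Matches lines such as: [fe/c0582e] Submitted process > inmd5 (1)
SUBMITTED = re.compile(
    r"\[(?P<tag>[0-9a-f]{2}/[0-9a-f]{6})\] Submitted process > (?P<name>\S+)"
)


def submitted_processes(text):
    """Return (process name, work-directory tag) pairs in submission order."""
    return [(m.group("name"), m.group("tag")) for m in SUBMITTED.finditer(text)]


if __name__ == "__main__":
    # Three lines copied from the example run.
    log = (
        "[fe/c0582e] Submitted process > inmd5 (1)\n"
        "[d4/1e3bbb] Submitted process > getduplicatemarkers (1)\n"
        "[d4/e6d01c] Submitted process > producereports (1)\n"
    )
    for name, tag in submitted_processes(log):
        print(f"{name:24s} work/{tag}...")
```

The tag is useful when a step misbehaves: the task's files live under the work directory whose path starts with that tag.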

Exercise

Set up

1. You should be logged into the head node of the cluster cream.core.wits.ac.za if you can; otherwise log on to the other machine.
2. Create a directory for this exercise.
3. Copy the file /global/gwas-workshop/plinkdata/ga14.config into this directory.
4. Fetch a copy of the workflow. You can choose either method above, but if you are not familiar with git I suggest you do the following:

    nextflow pull h3abionet/h3agwas

The first time you run Nextflow it will need to download some software dependencies. The command nextflow pull fetches the workflow software from GitHub. (As an aside, the software is put in a hidden directory ~/.nextflow, but you don't need to look at it.)

Config file

Edit the config file. I've set it up for you for the first run, so you shouldn't need to change anything. You'll get a chance to play around later. But at this stage read the config file carefully to understand what's there. Remember you can read the extensive documentation at https://github.com/h3abionet/h3agwas.

Note that you don't have to copy the input data: this can stay where it is in the shared directory. Just specify where it is. I've suggested that the output directory be qcdata. You don't have to create it as Nextflow will do that for you (though you can if you want).

There are many default parameters. To view all the parameters that will be applied when you run the job:

    nextflow -c ga14.config config h3abionet/h3agwas/plink-qc.nf

Running the workflow

To run the workflow you'll need to specify the following:

- the name of the script to run
- the configuration file
- the mode of running
  - if you are on cream.core.wits.ac.za, specify that you want to use the PBS scheduler to run jobs with the -profile pbs option
  - if you are on the other machine, don't specify an option

You will use one of the two lines below, depending on which machine you are on:

    # if you are on cream
    nextflow run h3abionet/h3agwas/plink-qc.nf -c ga14.config -profile pbs

    # if you are on the other machine
    nextflow run h3abionet/h3agwas/plink-qc.nf -c ga14.config

Report

When the workflow completes you will be told where the report can be found. Use scp to copy the file from the cluster to your machine; you can then open and view it. This is a very small and somewhat artificial example, so some of the pictures may look a little odd.

Experiment

Try different parameters and see what effect the change has. When you run the workflow again use the -resume option; then only those parts of the workflow that need to be re-run will be re-run.

Association study

plink-assoc.nf

- The association workflow is very experiment dependent: data, population structure, covariates, question
- A basic workflow is implemented for an initial study
- It can be extended

Config file

Example config file:

    params.input_dir = "/spaces/scott/assoc/agt"
    params.input_pat = "25"
    params.output = "allgemma"
    params.output_dir = "assocresults"
    params.data = "/data/h3agwas/data.csv"
    params.covariates = "age,sex"
    params.pheno = "bmi_c/np.log,wst_hip_r_c,standing_height_mm"
    params.gemma_num_cores = 8
    params.gemma = 1
    params.linear = 1

To view the full configuration:

    nextflow -c assoc.config config h3abionet/h3agwas/plink-assoc.nf
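When experimenting with parameters it can help to see at a glance what a config file sets. The sketch below reads only the simple `params.name = value` lines used in this handout. It is not a Groovy parser, and parse_params and convert are hypothetical helper names. The `/np.log` suffix on a phenotype is kept as part of the string, since its meaning is not explained here.

```python
import re

# One simple assignment per line, e.g.  params.output_dir = "assocresults"
ASSIGN = re.compile(r"^\s*params\.(\w+)\s*=\s*(.+?)\s*$")


def convert(raw):
    """Turn the right-hand side into a str, bool, int or float."""
    if len(raw) >= 2 and raw[0] == raw[-1] == '"':
        return raw[1:-1]
    if raw in ("true", "false"):
        return raw == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw  # anything more complex is left untouched


def parse_params(text):
    """Collect the params.* assignments of a config file into a dict."""
    params = {}
    for line in text.splitlines():
        match = ASSIGN.match(line)
        if match:
            params[match.group(1)] = convert(match.group(2))
    return params


if __name__ == "__main__":
    config = '''
params.output_dir = "assocresults"
params.covariates = "age,sex"
params.pheno="bmi_c/np.log,wst_hip_r_c,standing_height_mm"
params.gemma_num_cores = 8
'''
    params = parse_params(config)
    print("phenotypes:", params["pheno"].split(","))
    print("covariates:", params["covariates"].split(","))
    print("cores:", params["gemma_num_cores"])
```

For the parameters actually applied, including defaults, the `nextflow ... config` command remains the authoritative view; this is only a quick look at one file.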

Running the workflow

    nextflow run h3abionet/h3agwas/plink-assoc.nf \
        -c assoc.config

The output will be found in the assocresults directory because that's specified in the assoc.config file.

Note that you may need to change the above. If you are running on a cluster (e.g. cream.core.wits.ac.za) you need to use the pbs profile:

    nextflow run h3abionet/h3agwas/plink-assoc.nf -c assoc.config -profile pbs

And if you are using Docker you will say:

    nextflow run h3abionet/h3agwas/plink-assoc.nf -c assoc.config -profile docker

Exercise

Copy the file /global/gwas-workshop/plinkdata/assoc.config to your local directory. This has a simple set-up for the association testing. Read through the file to understand the basics; again, the full documentation can be found on the site.

In this example you will use the QCed data you produced in the last step. If you changed the output name or directory in the previous exercise, you will have to change the input name and/or directory in this exercise.

In this example we will do the basic χ² test, with Bonferroni correction and permutation testing (permutation testing is a default; you can turn it off). We'll test two phenotypes, pheno and sex, and use sex as a covariate. Remember this is a very small data set, so we are unlikely to find statistically significant results.

Now look at the full configuration.

4 Updating the workflow

The workflows are updated with bug fixes.

If you got the workflow using Nextflow:

    nextflow pull h3abionet/h3agwas

If you used git, change to the workflow directory and run:

    git pull
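The run commands in this handout differ only in the script, the config file, the profile and whether -resume is given. A small helper can assemble them so that the same call works on the cluster, with Docker, or locally. This is a sketch: nextflow_run_command is a hypothetical name, and only options that appear in this handout are used.

```python
import shlex


def nextflow_run_command(script, config, profile=None, resume=False):
    """Build the argument list for a `nextflow run` invocation."""
    args = ["nextflow", "run", script, "-c", config]
    if profile:
        # e.g. "pbs" on the cluster or "docker" when using containers
        args += ["-profile", profile]
    if resume:
        # re-run only the parts of the workflow that need it
        args.append("-resume")
    return args


if __name__ == "__main__":
    cmd = nextflow_run_command("h3abionet/h3agwas/plink-assoc.nf",
                               "assoc.config", profile="pbs")
    print(shlex.join(cmd))
```

The list form can be passed directly to subprocess.run(cmd, check=True) if you want a script to launch the workflow rather than just print the command.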

5 Problems

If something goes wrong, it may be difficult to understand why:

- workflow
- data

As an example, here I deliberately give the wrong column name for the batch, which causes one of the scripts to fail. We get:

    ERROR ~ Error executing process > batchproc (1)

    Caused by:
      Process batchproc (1) terminated with an error exit status (1)

    Command executed [/spaces/scott/h3abionet/h3agwas/templates/batchreport.py]:

      #!/usr/bin/env python3
      from __future__ import print_function
      import argparse
      import sys
      import re

      (followed by several hundred lines of text)

    Command exit status:
      1

    Command output:
      (empty)

    Command error:
      Traceback (most recent call last):
        File ".command.sh", line 577, in <module>
          bfrm, btext = getbatchanalysis()
        File ".command.sh", line 550, in getbatchanalysis
          result = miss_vals(ifrm,bfrm,args.batch_col,args.sexcheck_report)
        File ".command.sh", line 188, in miss_vals
          g = pd.merge(pfrm,ifrm,left_index=True,right_index=True,how='inner').groupby(pheno_c
        File "/opt/exp_soft/python36/lib/python3.6/site-packages/pandas/core/generic.py", line
          **kwargs)
        File "/opt/exp_soft/python36/lib/python3.6/site-packages/pandas/core/groupby.py", line
          return klass(obj, by, **kwds)
        File "/opt/exp_soft/python36/lib/python3.6/site-packages/pandas/core/groupby.py", line
          mutated=self.mutated)
        File "/opt/exp_soft/python36/lib/python3.6/site-packages/pandas/core/groupby.py", line
          raise KeyError(gpr)
      KeyError: 'batches'

    Work dir:
      /spaces/scott/h3abionet/h3agwas/test/work/cf/335b6d21ad75841e1e d3d

    Tip: when you have fixed the problem you can continue the execution appending to the nextfl

     -- Check .nextflow.log file for details
    WARN: Killing pending tasks (1)
    test]$

The error is obscure, but if you look you can see the mistake. You are also given the work directory for the process that failed. You can change to this directory and explore what happened:

- All input files will be there
- The script that failed will be found in .command.sh
- The output is in .command.out and the error in .command.err
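Exploring the work directory usually starts with the last few lines of .command.err and .command.out. The sketch below does that for any work directory; the function name and the demo directory are made up, and only the file names come from the text above.

```python
import tempfile
from pathlib import Path


def tail_of_task_files(work_dir, names=(".command.out", ".command.err"),
                       lines=5):
    """Return the last few lines of each task file, or None if it is absent."""
    result = {}
    for name in names:
        path = Path(work_dir) / name
        if path.exists():
            text = path.read_text(errors="replace")
            result[name] = text.splitlines()[-lines:]
        else:
            result[name] = None
    return result


if __name__ == "__main__":
    # Demo on a throwaway directory standing in for a real work directory.
    with tempfile.TemporaryDirectory() as demo_dir:
        (Path(demo_dir) / ".command.err").write_text(
            "Traceback (most recent call last):\n"
            "    raise KeyError(gpr)\n"
            "KeyError: 'batches'\n"
        )
        for name, tail in tail_of_task_files(demo_dir).items():
            print(name, "->", tail if tail is not None else "(not present)")
```

In the failed run above, the final line of .command.err is the KeyError naming the column that could not be found, which points straight at the misnamed batch column in the config file.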


G E T T I N G S TA R T E D W I T H G I T G E T T I N G S TA R T E D W I T H G I T A A R O N H O O V E R & B R A D M I N C H J A N U A R Y 2 2, 2 0 1 8 1 Why use a version control system? Much of this document was blatantly cribbed from Allen

More information

SmartCVS Tutorial. Starting the putty Client and Setting Your CVS Password

SmartCVS Tutorial. Starting the putty Client and Setting Your CVS Password SmartCVS Tutorial Starting the putty Client and Setting Your CVS Password 1. Open the CSstick folder. You should see an icon or a filename for putty. Depending on your computer s configuration, it might

More information

2. Take a few minutes to look around the site. The goal is to familiarize yourself with a few key components of the NCBI.

2. Take a few minutes to look around the site. The goal is to familiarize yourself with a few key components of the NCBI. 2 Navigating the NCBI Instructions Aim: To become familiar with the resources available at the National Center for Bioinformatics (NCBI) and the search engine Entrez. Instructions: Write the answers to

More information

Computer Networks - A Simple HTTP proxy -

Computer Networks - A Simple HTTP proxy - Computer Networks - A Simple HTTP proxy - Objectives The intent of this assignment is to help you gain a thorough understanding of: The interaction between browsers and web servers The basics of the HTTP

More information

1. Summary statistics test_gwas. This file contains a set of 50K random SNPs of the Subjective Well-being GWAS of the Netherlands Twin Register

1. Summary statistics test_gwas. This file contains a set of 50K random SNPs of the Subjective Well-being GWAS of the Netherlands Twin Register Quality Control for Genome-Wide Association Studies Bart Baselmans & Meike Bartels Boulder 2017 Setting up files and directories To perform a quality control protocol in a Genome-Wide Association Meta

More information

git-pr Release dev2+ng5b0396a

git-pr Release dev2+ng5b0396a git-pr Release 0.2.1.dev2+ng5b0396a Mar 20, 2017 Contents 1 Table Of Contents 3 1.1 Installation................................................ 3 1.2 Usage...................................................

More information

Docker at Lyft Speeding up development Matthew #dockercon

Docker at Lyft Speeding up development Matthew #dockercon Docker at Lyft Speeding up development Matthew Leventi @mleventi #dockercon Lyft Engineering Lyft Engineering Organization - Rapidly growing headcount - Fluid teams - Everyone does devops Technology -

More information

Prof. Konstantinos Krampis Office: Rm. 467F Belfer Research Building Phone: (212) Fax: (212)

Prof. Konstantinos Krampis Office: Rm. 467F Belfer Research Building Phone: (212) Fax: (212) Director: Prof. Konstantinos Krampis agbiotec@gmail.com Office: Rm. 467F Belfer Research Building Phone: (212) 396-6930 Fax: (212) 650 3565 Facility Consultant:Carlos Lijeron 1/8 carlos@carotech.com Office:

More information

LECTURE 0: Introduction and Background

LECTURE 0: Introduction and Background 1 LECTURE 0: Introduction and Background September 10, 2012 1 Computational science The role of computational science has become increasingly significant during the last few decades. It has become the

More information

CMSC 201 Spring 2017 Lab 12 Recursion

CMSC 201 Spring 2017 Lab 12 Recursion CMSC 201 Spring 2017 Lab 12 Recursion Assignment: Lab 12 Recursion Due Date: During discussion, May 1st through May 4th Value: 10 points (8 points during lab, 2 points for Pre Lab quiz) This week s lab

More information

Configuring Thunderbird for GMail

Configuring Thunderbird for GMail Configuring Thunderbird for GMail There are a couple of settings that need to be changed on Gmail before you can add the account to Thunderbird. 1) Log in to Gmail and click on Settings (which looks like

More information

Essential Skills for Bioinformatics: Unix/Linux

Essential Skills for Bioinformatics: Unix/Linux Essential Skills for Bioinformatics: Unix/Linux SHELL SCRIPTING Overview Bash, the shell we have used interactively in this course, is a full-fledged scripting language. Unlike Python, Bash is not a general-purpose

More information

CONTINUOUS INTEGRATION; TIPS & TRICKS

CONTINUOUS INTEGRATION; TIPS & TRICKS CONTINUOUS INTEGRATION; TIPS & TRICKS BIO I DO TECH THINGS I DO THINGS I DO THINGS BLUE OCEAN BEEP BEEP REFACTOR PEOPLE S HOUSES MY TIPS & TRICKS FOR CI - CI Infrastructure - CI Architecture - Pipeline

More information

How to version control like a pro: a roadmap to your reproducible & collaborative research

How to version control like a pro: a roadmap to your reproducible & collaborative research How to version control like a pro: a roadmap to your reproducible & collaborative research The material in this tutorial is inspired by & adapted from the Software Carpentry lesson on version control &

More information

Git & Github Fundamental by Rajesh Kumar.

Git & Github Fundamental by Rajesh Kumar. Git & Github Fundamental by Rajesh Kumar About me Rajesh Kumar DevOps Architect @RajeshKumarIN www.rajeshkumar.xyz www.scmgalaxy.com 2 What is git Manage your source code versions Who should use Git Anyone

More information

Note: Who is Dr. Who? You may notice that YARN says you are logged in as dr.who. This is what is displayed when user

Note: Who is Dr. Who? You may notice that YARN says you are logged in as dr.who. This is what is displayed when user Run a YARN Job Exercise Dir: ~/labs/exercises/yarn Data Files: /smartbuy/kb In this exercise you will submit an application to the YARN cluster, and monitor the application using both the Hue Job Browser

More information

Setting up GitHub Version Control with Qt Creator*

Setting up GitHub Version Control with Qt Creator* Setting up GitHub Version Control with Qt Creator* *This tutorial is assuming you already have an account on GitHub. If you don t, go to www.github.com and set up an account using your buckeyemail account.

More information

Git. Presenter: Haotao (Eric) Lai Contact:

Git. Presenter: Haotao (Eric) Lai Contact: Git Presenter: Haotao (Eric) Lai Contact: haotao.lai@gmail.com 1 Acknowledge images with white background is from the following link: http://marklodato.github.io/visual-git-guide/index-en.html images with

More information

Importing and Merging Data Tutorial

Importing and Merging Data Tutorial Importing and Merging Data Tutorial Release 1.0 Golden Helix, Inc. February 17, 2012 Contents 1. Overview 2 2. Import Pedigree Data 4 3. Import Phenotypic Data 6 4. Import Genetic Data 8 5. Import and

More information

Lab 08. Command Line and Git

Lab 08. Command Line and Git Lab 08 Command Line and Git Agenda Final Project Information All Things Git! Make sure to come to lab next week for Python! Final Projects Connect 4 Arduino ios Creative AI Being on a Team - How To Maximize

More information

Project 1 Balanced binary

Project 1 Balanced binary CMSC262 DS/Alg Applied Blaheta Project 1 Balanced binary Due: 7 September 2017 You saw basic binary search trees in 162, and may remember that their weakness is that in the worst case they behave like

More information

Python for Earth Scientists

Python for Earth Scientists Python for Earth Scientists Andrew Walker andrew.walker@bris.ac.uk Python is: A dynamic, interpreted programming language. Python is: A dynamic, interpreted programming language. Data Source code Object

More information

Axiom Analysis Suite Release Notes (For research use only. Not for use in diagnostic procedures.)

Axiom Analysis Suite Release Notes (For research use only. Not for use in diagnostic procedures.) Axiom Analysis Suite 4.0.1 Release Notes (For research use only. Not for use in diagnostic procedures.) Axiom Analysis Suite 4.0.1 includes the following changes/updates: 1. For library packages that support

More information

Remote Workflow Enactment using Docker and the Generic Execution Framework in EUDAT

Remote Workflow Enactment using Docker and the Generic Execution Framework in EUDAT Remote Workflow Enactment using Docker and the Generic Execution Framework in EUDAT Asela Rajapakse Max Planck Institute for Meteorology EUDAT receives funding from the European Union's Horizon 2020 programme

More information

Step-by-Step Guide to Basic Genetic Analysis

Step-by-Step Guide to Basic Genetic Analysis Step-by-Step Guide to Basic Genetic Analysis Page 1 Introduction This document shows you how to clean up your genetic data, assess its statistical properties and perform simple analyses such as case-control

More information

Easy Windows Working with Disks, Folders, - and Files

Easy Windows Working with Disks, Folders, - and Files Easy Windows 98-3 - Working with Disks, Folders, - and Files Page 1 of 11 Easy Windows 98-3 - Working with Disks, Folders, - and Files Task 1: Opening Folders Folders contain files, programs, or other

More information

2 Initialize a git repository on your machine, add a README file, commit and push

2 Initialize a git repository on your machine, add a README file, commit and push BioHPC Git Training Demo Script First, ensure that git is installed on your machine, and you have configured an ssh key. See the main slides for instructions. To follow this demo script open a terminal

More information

Introduction to Git and GitHub for Writers Workbook February 23, 2019 Peter Gruenbaum

Introduction to Git and GitHub for Writers Workbook February 23, 2019 Peter Gruenbaum Introduction to Git and GitHub for Writers Workbook February 23, 2019 Peter Gruenbaum Table of Contents Preparation... 3 Exercise 1: Create a repository. Use the command line.... 4 Create a repository...

More information

ToCatchAThief c ryan campbell & jenn coughlan 7/23/2018

ToCatchAThief c ryan campbell & jenn coughlan 7/23/2018 ToCatchAThief c ryan campbell & jenn coughlan 7/23/2018 Welcome to the To Catch a Thief: With Data! walkthrough! https://bioconductor.org/packages/devel/ bioc/vignettes/snprelate/inst/doc/snprelatetutorial.html

More information

Assignment 0. Nothing here to hand in

Assignment 0. Nothing here to hand in Assignment 0 Nothing here to hand in The questions here have solutions attached. Follow the solutions to see what to do, if you cannot otherwise guess. Though there is nothing here to hand in, it is very

More information

MarkLogic Server. Flexible Replication Guide. MarkLogic 9 May, Copyright 2018 MarkLogic Corporation. All rights reserved.

MarkLogic Server. Flexible Replication Guide. MarkLogic 9 May, Copyright 2018 MarkLogic Corporation. All rights reserved. Flexible Replication Guide 1 MarkLogic 9 May, 2017 Last Revised: 9.0-1, May, 2017 Copyright 2018 MarkLogic Corporation. All rights reserved. Table of Contents Table of Contents Flexible Replication Guide

More information

Voyant Connect User Guide

Voyant Connect User Guide Voyant Connect User Guide WELCOME TO VOYANT CONNECT 3 INSTALLING VOYANT CONNECT 3 MAC INSTALLATION 3 WINDOWS INSTALLATION 4 LOGGING IN 4 WINDOWS FIRST LOGIN 6 MAKING YOUR CLIENT USEFUL 6 ADDING CONTACTS

More information

Helpful Galaxy screencasts are available at:

Helpful Galaxy screencasts are available at: This user guide serves as a simplified, graphic version of the CloudMap paper for applicationoriented end-users. For more details, please see the CloudMap paper. Video versions of these user guides and

More information

Computer Basics 1/6/16. Computer Organization. Computer systems consist of hardware and software.

Computer Basics 1/6/16. Computer Organization. Computer systems consist of hardware and software. Hardware and Software Computer Basics TOPICS Computer Organization Data Representation Program Execution Computer Languages Computer systems consist of hardware and software. Hardware includes the tangible

More information