Convert Dosages to Genotypes Author: Autumn Laughbaum, Golden Helix, Inc.

Similar documents
Recalling Genotypes with BEAGLECALL Tutorial

Importing and Merging Data Tutorial

Creating and Using Genome Assemblies Tutorial

Tutorial 3 - Performing a Change-Point Analysis in Excel

2. create the workbook file

HaploHMM - A Hidden Markov Model (HMM) Based Program for Haplotype Inference Using Identified Haplotypes and Haplotype Patterns

Genetic Analysis. Page 1

ANSYS, Inc. February 8, ACT Extensions (Apps): Installation Procedure

Operating instructions for MixtureCalc v1.2 (Freeware Version)

DRAFT COPY 10/6/2011. Accessing the ALCOTEST Instrument Upload Data - NJSP Public Website page -

Distributed Processing

Microsoft Office Illustrated. Getting Started with Excel 2007

QTX. Tutorial for. by Kim M.Chmielewicz Kenneth F. Manly. Software for genetic mapping of Mendelian markers and quantitative trait loci.

BEAGLECALL 1.0. Brian L. Browning Department of Medicine Division of Medical Genetics University of Washington. 15 November 2010

Intro to NGS Tutorial

500K Data Analysis Workflow using BRLMM

Microsoft Excel 2010 Tutorial

ABI PRISM GeneMapper Software Version 3.0 SNP Genotyping

Introduction to StatKey Getting Data Into StatKey

DIPSorter User Manual. In combination with Biotype s Mentype DIPplex PCR Amplification Kit Software version: 1.0.2

Microsoft Official Academic Course Windows Vista Configuration, Exam ISBN: ERRATA

MAILMERGE WORD MESSAGES

Introduction. Inserting and Modifying Tables. Word 2010 Working with Tables. To Insert a Blank Table: Page 1

lab MS Excel 2010 active cell

Importing a CROWNWeb Data Quality Improvement Report.csv File into Excel

ELAI user manual. Yongtao Guan Baylor College of Medicine. Version June Copyright 2. 3 A simple example 2

Installation Instructions. Your Guide to Installing and Getting Started with WinSteam Version 4.0

Table of Contents. Part I Introduction. Part II FSM Ribbon. Part III FSM Context Menu. Part IV Trial Balance. Part V Building a report

Microsoft Excel 2013: Excel Basics June 2014

Chemistry 30 Tips for Creating Graphs using Microsoft Excel

File Management Utility User Guide

Beginning Excel for Windows

2. INTRODUCTORY EXCEL

HPE ALM Excel Add-in. Microsoft Excel Add-in Guide. Software Version: Go to HELP CENTER ONLINE

LocateXT Version 1.3 Quick Start

ITEC102 INFORMATION TECHNOLOGIES

Limits of Jedox Software Components

Manage data. Qlik Sense November 2017 Copyright QlikTech International AB. All rights reserved.

Tutorial 1: Getting Started with Excel

Excel 2. Module 2 Formulas & Functions

Package lodgwas. R topics documented: November 30, Type Package

Defender Desktop Login GrIDsure Token User Guide

Crystal Reports XI Release 2

Data protection and security in QlikView

Microsoft Excel 2007

SAMPLE ICDL 5.0. International Computer Driving Licence. Module 4 - Spreadsheets Using Microsoft Excel 2010

Functions in Excel. Structure of a function: Basic Mathematical Functions. Arithmetic operators: Comparison Operators:

Axiom Analysis Suite Release Notes (For research use only. Not for use in diagnostic procedures.)

Uploading Scantron Scores to CourseWeb

Excel 2016 Basics for Windows

KIP Track System User Guide

PivotTables & Charts for Health

Table of Contents. Part I Introduction. Part II FSM Ribbon. Part III FSM Context Menu. Part IV Trial Balance. Part V Building a report

Step-by-Step Guide to Basic Genetic Analysis

If you finish the work for the day go to QUIA and review any objective you feel you need help with.

From Non-Spatial Data to Spatial Data. Geocoding & Georeferencing in ArcGIS

EDAConnect-Dashboard User s Guide Version 3.4.0

Development of linkage map using Mapmaker/Exp3.0

Importing data in a database with levels

User s Manual. DL-Term. IM B9852UF-01E 4rd Edition

An Introduction to Management Science, 12e. Instructions for Using Excel 2007

Introduction to the workbook environment

Starting ParTEST. Select Start, Programs ParTEST ParTEST Enter your User Name and password

ADMS-8100 Programming Software for the Yaesu FT-8100

Using Scantron ParLAN 6.5 for the First Time:

Configuring GeneMapper ID Software

Excel 2016 Basics for Mac

Statistical Analysis for Genetic Epidemiology (S.A.G.E.) Version 6.4 Graphical User Interface (GUI) Manual

Intro to Excel. To start a new workbook, click on the Blank workbook icon in the middle of the screen.

KRS-590 Programming Software for the Kenwood TS-590

DataPro Quick Start Guide

EASYLABEL Net Print Server

Importing Career Standards Benchmark Scores

Affymetrix Genotyping Console 3.0 User Manual

Running Mekorma MICR on Windows Vista

WIRELESS DATABASE VIEWER PLUS FOR ANDROID DEVICES: USER GUIDE PRODUCT VERSION: 2.4

USER S MANUAL. - Security Server. Security Server. Security Server. smar. First in Fieldbus MAY / 06 VERSION 8 FOUNDATION

WCS-R10 Programming Software for the Icom IC-R10

Hands-On Lab. Lab: Developing BI Applications. Lab version: Last updated: 2/23/2011

APK-SR8 Programming Software for the Alinco DX-SR8

1 Introduction. ThinPrint Client Installation Page 1

The University of New Orleans WebSTAR (PeopleSoft Learning Solutions v 9.0): Basic Navigation Training Manual

Getting the Most from your Microsoft Excel

The OmniVista User Interface

Using the Calyx Point PDS (Point Data Server) with WorkCenter CRM

MOBILEDATABASE USER GUIDE PRODUCT VERSION: 1.0

Autostart. Anne-Marie Mahfouf

17 - VARIABLES... 1 DOCUMENT AND CODE VARIABLES IN MAXQDA Document Variables Code Variables... 1

The following content has been imported from Legacy Help systems and is in the process of being checked for accuracy.

N2KExtractor. NMEA 2000 Data Extractor Software. User s Manual

Manage data. Qlik Sense June 2018 Copyright QlikTech International AB. All rights reserved.

DOWNLOAD PDF VBA MACRO TO PRINT MULTIPLE EXCEL SHEETS TO ONE

WIRELESS DATABASE VIEWER PLUS (ENTERPRISE EDITION) SERVER SIDE USER GUIDE

Introduction to Microsoft Excel 2010 Quick Reference Sheet

Affymetrix GeneChip DNA Analysis Software

How to Import Part Numbers to Proman

CHAPTER 1 GETTING STARTED

Excel. Attention! The main benefits of using Excel actions are:

What s New in Jet Reports 2010 R2

ADMS-4B Programming Software for the Yaesu FT-857 / FT-897

Transcription:

Convert Dosages to Genotypes Author: Autumn Laughbaum, Golden Helix, Inc. Overview This script converts allelic dosage values to genotypes based on user-specified thresholds. The dosage data may be in Single- or Double-Dosage format and may have samples in the row labels or column headers. If the samples are in the column headers, the spreadsheet may contain map information and allele translation values. This script is designed to run on a spreadsheet that contains dosage data and has already been imported into SVS. Thus, it is expected that the user import their dosage data using one of the many SVS import tools. In particular, the Import Text and Import Third-Party tools may be used for this purpose. Recommended Directory Location Save the script to the following directory: *..\Application Data\Golden Helix SVS\UserScripts\Spreadsheet\Edit\ Note: The Application Data folder is a hidden folder on Windows operating systems and its location varies between XP and Vista. The easiest way to locate this directory on your computer is to open SVS and select Tools >Open Folder > UserScripts Folder. If saved to the proper folder, this script will be accessible from the spreadsheet Edit menu. User Specified Options First User Prompt: Dosage format: o Single-Dosage Format Choose this option if each sample has only one row or column (depending on orientation) per marker. Generally in this case, the numeric values will range from 0 to 2. o Double-Dosage Format - Choose this option if each sample has two rows or columns per marker that contain the genotype dosages, p1 and p2, for A_A and A_B. The third genotype dosage (B_B) is calculated as 1-p1-p2. Dosages in this format generally range from 0 to 1. The dosage columns/rows that correspond to the same sample must be adjacent. If the sample names do not match, the two names are concatenated with -. Samples are found in o Row Labels Choose this option if the row labels contain the sample names and the column headers contain the marker names.

o Column Headers Choose this option if the row labels contain the marker names and the column headers contain the sample names. Numeric types to convert o The spreadsheet is scanned for appropriate column types (Real, Integer and Binary) and the types and counts are displayed. Check the column types that contain the dosage data. o Note: If a dosage column were to contain only 0 and 1 the importer would detect it as binary. The same can be applied to columns containing 0,1 and 2 that are detected as integer. On the other hand, if the spreadsheet contains additional non-dosage columns (such as marker map data), you may want to uncheck some column types. Second User Prompt: Threshold Values: o If data is in single-dosage format, the user must specify three ranges for A_A, A_B, and B_B, such that if the dosage value for a cell falls within the range, the genotype is called. If the dosage value is not within any range, a missing genotype is called. o If the data is in double-dosage format, the user must specify one threshold value, such that if one of p1, p2 or p3 is larger than that value, the corresponding genotype is called. Note that if two of the three dosages are larger than that value, precedence is given in the order of p1 > p2 > p3. o Note: In both cases, <= and >= are used to compare dosage values against threshold values. Additional options for spreadsheets that have samples in columns: o Create Marker Map from columns if the spreadsheet contains chromosome and position information in columns, a marker map can be created and applied to the converted spreadsheet. The column containing the marker names must be categorical and can also be the row labels. The column containing the chromosome must be categorical or integer and can also be the row labels. The column containing the position must be integer. o Translate Alleles if the spreadsheet contains allele translation values for A and B (or A1 and A2), the user can optionally code the genotypes with the translated alleles. Using the Script 1. Open a spreadsheet containing dosage data in Real, Integer and/or Binary columns. 2. Select Edit > Convert Dosages to Genotypes. 3. CASE 1: Samples in Row Labels, Single-Dosage Format (see Figure 1).

Figure 1. Dosage data in Single-Dosage format with samples in row labels a. In the first dialog, leave Spreadsheet is in Single-Dosage format (one column/row per sample) checked. b. Also leave Row Labels selected after Samples are found in. c. Since this spreadsheet contains only real columns, this is the only numeric type available. Leave it checked and click Next>. d. In the next prompt, specify minimum and maximum threshold values for A_A, A_B and B_B. Click Convert. e. The resulting spreadsheet contains the converted genotypes and is a child of the original spreadsheet.

4. CASE 2: Samples in Row Labels, Double-Dosage Format a. In the first dialog, uncheck Spreadsheet is in Single-Dosage format (one column/row per sample). b. Leave Row Labels selected after Samples are found in. c. Since this spreadsheet contains only real columns, this is the only numeric type available. Leave it checked and click Next>.

d. In the next prompt, specify the minimum dosage value for a called genotype. Click Convert. e. The resulting spreadsheet contains the converted genotypes and is a child of the original spreadsheet. 5. CASE 3: Samples in Column Headers, Single-Dosage Format. a. In the first dialog, leave Spreadsheet is in Single-Dosage format (one column/row per sample) checked. b. Select Column Headers after Samples are found in.

c. This spreadsheet contains integer and real columns that contain dosage data. Leave both types checked and click Next>. d. In the next prompt, specify the minimum and maximum dosage values for each genotype. e. This spreadsheet also contains allele translation columns for A and B. Check Translate Alleles and choose the appropriate columns (this may be auto-detected and checked in your own spreadsheet). f. The resulting spreadsheet contains the converted genotypes (with translated alleles) and is a child of the original spreadsheet.

6. CASE 4: Samples in Column Headers, Double-Dosage Format. a. In the first dialog, uncheck Spreadsheet is in Single-Dosage format (one column/row per sample). b. Select Column Headers after Samples are found in. c. This spreadsheet has real columns that contain dosage data and an integer column with position. Since you do not want to convert position, uncheck Integer columns (1) click Next>. d. In the next prompt, specify the minimum dosage value for a called genotype. e. This spreadsheet contains marker map information so check Create Marker Map from columns and choose the appropriate columns (these may be auto-detected and checked in your own spreadsheet).

f. This spreadsheet also contains allele translation columns for A and B. Check Translate Alleles and choose the appropriate columns. g. The resulting marker-mapped spreadsheet contains the converted genotypes (with translated alleles) and is a child of the original spreadsheet.