TRAPPIST: A toolkit for comparative analysis and visualization of genomic regions

Size: px
Start display at page:

Download "TRAPPIST: A toolkit for comparative analysis and visualization of genomic regions"

Transcription

1 TRAPPIST: A toolkit for comparative analysis and visualization of genomic regions Geraldine A. Van der Auwera, PhD "

2 TRAPPIST: A toolkit for comparative analysis and visualization of genomic regions full-featured application Geraldine A. Van der Auwera, PhD Shankar Ambady "

3 The source code of Life Evolution = 4 bn years of forking without version tracking and you thought legacy Fortran code was a pain

4 Two distinct issues Getting the code from the repository (living beings) Reverse-engineering the code (zero documentation!) Extraction, sequencing, assembly Experimentation, mutagenesis + (comparative) sequence analysis

5 Top issue for getting NGS is outscaling Moore s Law Lincoln Stein via C. Titus 2011

6 Top issue for getting NGS is outscaling Moore s Law WHATEVER Lincoln Stein via C. Titus 2011

7 Evolving process of rev-eng No genomes entirely experimental Make random mutants, trace back effect to gene of interest One genome some predictive filtering Design mutants, long iterative process Many related genomes much better predictive filtering Nature s mutants, drastically reduced iterative process

8 Nature s mutants (example) pxo1 rep2 tra1 repx tra2 Anthrax PAI tra3 pbcxo1 p03bb102_179 pah820_272 pah187_270 NZ_ACMR0 NZ_ACMH0 IS075 pbc10987 NZ_ACMC0 NZ_ACMS0 NZ_ACMT0 NZ_ACNI0 VD022 Schrouff NZ_ACMO0 TIAC129 NZ_ACNJ0 NZ_ABDM0 NZ_ACNB0 NZ_ACLY0 NZ_ACMP0 NZ_ACNK0 pbc239 NZ_ACNE0 NZ_ABDA0 NZ_ACLV0 NZ_ACLT0 NZ_ACNF0 NZ_ACNA0

9 Typical analysis process BLAST All done through separate GUIs poor batching, no automation, no chaining

10 Programmatic access The servers can be accessed with scripts*, and there are awesome libraries that provide wrappers, data structures etc. But here s the rub * (there are a few GUI pipeline apps but usability is an issue </diplomatic>)

11 What s a command line? Exhibit A: Experimental Biologist

12 TRAPPIST Totally Rad Analysis Pipelines Python Super Tool "

13 TRAPPIST Totally Rad Analysis Pipelines Python Super Tool "

14 TRAPPIST Totally Rad Analysis Pipelines Python Super Tool "

15 genome DBs list of genomes CONTIG FISHER reference list of targets list of genes Optional Analyses HOST HK_SET EXTRACTOR H_K sets Data Input ] [ PHYLOGENY CONSTRUCT ] [ reference + constructs Core Analysis OR CONSERVATION SORTER segment sets SEQUENCE VARIATION plasmids STRUCTURE COMPARATOR CONTENT FUNCTIONS Example pipeline / workflow (my research) host trees plasmid trees PHYLOGENY CONGRUENCE

16 Do it manually? Exhibit B: Lazy Postdoc (me)

17 Long story short Collection of one-time scripts Toolkit Full-featured application DNA-based OS?

18 Fundamental requirements Design, assembly and modification of pipelines / workflows Automated execution, parameter / output versioning, provenance data bundling, interactive visualization

19 Staging / Execution

20 Staging system Basic requirements intuitive flexible extensible + Pre-assembled workflows / pipelines + Hooks for external / roll-your-own functions

21 Staging area - workflows

22 Workflow components TRAPPIST provides discrete task components for every step of analysis: Initial inputs selection Data processing steps (existing algorithms) Graphical output Component I/O relies on matching ports with data object classes Forces validation of data type/format (not up to user)

23 Component representation

24 Interacting with components

25 Connecting components

26 Connecting components

27 Connecting components

28 Execution system Progressive / modular / dependency-aware Parameter set versioning linked to output versioning Users more likely to try various parameters to test assumptions

29 Flow control rule Component ports have fill status If all its inputs are filled, component is OK to execute " add component to execution queue I m OK to go!

30 Database architecture Workflow 1 Staging DB Execution DB Workflow 2 Central DB Staging DB Execution DB

31

32 Staging DB DB schema + data dump sufficient to fully describe a workflow

33 Execution DB

34 Enforcing good practices Provenance bundle including: Workflow schema Parameter sets Code version info Executable papers! Reproducibility! Science!

35 Interactive visualization

36 U haz questions?

Package gsalib. R topics documented: February 20, Type Package. Title Utility Functions For GATK. Version 2.1.

Package gsalib. R topics documented: February 20, Type Package. Title Utility Functions For GATK. Version 2.1. Package gsalib February 20, 2015 Type Package Title Utility Functions For GATK Version 2.1 Date 2014-12-09 Author Maintainer Geraldine Van der Auwera This package contains

More information

Genome Browser. Background and Strategy

Genome Browser. Background and Strategy Genome Browser Background and Strategy Contents What is a genome browser? Purpose of a genome browser Examples Structure Extra Features Contents What is a genome browser? Purpose of a genome browser Examples

More information

Version Control for PL/SQL

Version Control for PL/SQL Version Control for PL/SQL What is the problem? How did we solve it? Implementation Strategies Demo!! Customer Spotlight Success Story: (In other words, this really works. :-) ) Rhenus Logistics, leading

More information

TBtools, a Toolkit for Biologists integrating various HTS-data

TBtools, a Toolkit for Biologists integrating various HTS-data 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 TBtools, a Toolkit for Biologists integrating various HTS-data handling tools with a user-friendly interface Chengjie Chen 1,2,3*, Rui Xia 1,2,3, Hao Chen 4, Yehua

More information

Building (Better) Data Pipelines using Apache Airflow

Building (Better) Data Pipelines using Apache Airflow Building (Better) Data Pipelines using Apache Airflow Sid Anand (@r39132) QCon.AI 2018 1 About Me Work [ed s] @ Co-Chair for Maintainer of Spare time 2 Apache Airflow What is it? 3 Apache Airflow : What

More information

Two Examples of Datanomic. David Du Digital Technology Center Intelligent Storage Consortium University of Minnesota

Two Examples of Datanomic. David Du Digital Technology Center Intelligent Storage Consortium University of Minnesota Two Examples of Datanomic David Du Digital Technology Center Intelligent Storage Consortium University of Minnesota Datanomic Computing (Autonomic Storage) System behavior driven by characteristics of

More information

Object vs Image-based Testing Producing Automated GUI Tests to Withstand Change

Object vs Image-based Testing Producing Automated GUI Tests to Withstand Change Object vs Image-based Testing Producing Automated GUI Tests to Withstand Change Handling Application Change Script maintenance, and handling application change, is one of the highest impact factors when

More information

CLC Server. End User USER MANUAL

CLC Server. End User USER MANUAL CLC Server End User USER MANUAL Manual for CLC Server 10.0.1 Windows, macos and Linux March 8, 2018 This software is for research purposes only. QIAGEN Aarhus Silkeborgvej 2 Prismet DK-8000 Aarhus C Denmark

More information

Reproducible & Transparent Computational Science with Galaxy. Jeremy Goecks The Galaxy Team

Reproducible & Transparent Computational Science with Galaxy. Jeremy Goecks The Galaxy Team Reproducible & Transparent Computational Science with Galaxy Jeremy Goecks The Galaxy Team 1 Doing Good Science Previous talks: performing an analysis setting up and scaling Galaxy adding tools libraries

More information

Automating Geodatabase Creation with Geoprocessing

Automating Geodatabase Creation with Geoprocessing Automating Geodatabase Creation with Geoprocessing Russell Brennan Ian Wittenmyer Esri UC 2014 Technical Workshop Assumptions Geodatabase fundamentals Experience with geoprocessing (GP) Understanding of

More information

Sequencing Data. Paul Agapow 2011/02/03

Sequencing Data. Paul Agapow 2011/02/03 Webservices for Next Generation Sequencing Data Paul Agapow 2011/02/03 Aims Assumed parameters: Must have a system for non-technical users to browse and manipulate their Next Generation Sequencing (NGS)

More information

Properties of Biological Networks

Properties of Biological Networks Properties of Biological Networks presented by: Ola Hamud June 12, 2013 Supervisor: Prof. Ron Pinter Based on: NETWORK BIOLOGY: UNDERSTANDING THE CELL S FUNCTIONAL ORGANIZATION By Albert-László Barabási

More information

An Introduction to Software Architecture. David Garlan & Mary Shaw 94

An Introduction to Software Architecture. David Garlan & Mary Shaw 94 An Introduction to Software Architecture David Garlan & Mary Shaw 94 Motivation Motivation An increase in (system) size and complexity structural issues communication (type, protocol) synchronization data

More information

Chapter 2. Operating-System Structures

Chapter 2. Operating-System Structures Chapter 2 Operating-System Structures 2.1 Chapter 2: Operating-System Structures Operating System Services User Operating System Interface System Calls Types of System Calls System Programs Operating System

More information

Outline. Motivation. Introduction of GAs. Genetic Algorithm 9/7/2017. Motivation Genetic algorithms An illustrative example Hypothesis space search

Outline. Motivation. Introduction of GAs. Genetic Algorithm 9/7/2017. Motivation Genetic algorithms An illustrative example Hypothesis space search Outline Genetic Algorithm Motivation Genetic algorithms An illustrative example Hypothesis space search Motivation Evolution is known to be a successful, robust method for adaptation within biological

More information

Performing whole genome SNP analysis with mapping performed locally

Performing whole genome SNP analysis with mapping performed locally BioNumerics Tutorial: Performing whole genome SNP analysis with mapping performed locally 1 Introduction 1.1 An introduction to whole genome SNP analysis A Single Nucleotide Polymorphism (SNP) is a variation

More information

Execo tutorial Grid 5000 school, Grenoble, January 2016

Execo tutorial Grid 5000 school, Grenoble, January 2016 Execo tutorial Grid 5000 school, Grenoble, January 2016 Simon Delamare Matthieu Imbert Laurent Pouilloux INRIA/CNRS/LIP ENS-Lyon 03/02/2016 1/34 1 introduction 2 execo, core module 3 execo g5k, Grid 5000

More information

Reverse Engineering Swift Apps. Michael Gianarakis Rootcon X 2016

Reverse Engineering Swift Apps. Michael Gianarakis Rootcon X 2016 Reverse Engineering Swift Apps Michael Gianarakis Rootcon X 2016 # whoami @mgianarakis Director of SpiderLabs APAC at Trustwave SecTalks Organiser (@SecTalks_BNE) Flat Duck Justice Warrior #ducksec Motivation

More information

ICAT Job Portal. a generic job submission system built on a scientific data catalog. IWSG 2013 ETH, Zurich, Switzerland 3-5 June 2013

ICAT Job Portal. a generic job submission system built on a scientific data catalog. IWSG 2013 ETH, Zurich, Switzerland 3-5 June 2013 ICAT Job Portal a generic job submission system built on a scientific data catalog IWSG 2013 ETH, Zurich, Switzerland 3-5 June 2013 Steve Fisher, Kevin Phipps and Dan Rolfe Rutherford Appleton Laboratory

More information

Drupal 8 THE VIDER ITY APPR OACH

Drupal 8 THE VIDER ITY APPR OACH Drupal 8 THE VIDER ITY APPROACH Introduction DR UPAL 8: THE VIDER ITY APPROACH Viderity focuses on designing the Total User Experience for Drupal sites, using a user-centered design approach Traditionally,

More information

Genome Assembly and De Novo RNAseq

Genome Assembly and De Novo RNAseq Genome Assembly and De Novo RNAseq BMI 7830 Kun Huang Department of Biomedical Informatics The Ohio State University Outline Problem formulation Hamiltonian path formulation Euler path and de Bruijin graph

More information

Embedding CIPRES Science Gateway Capabilities in Phylogenetics Software Environments

Embedding CIPRES Science Gateway Capabilities in Phylogenetics Software Environments Embedding CIPRES Science Gateway Capabilities in Phylogenetics Software Environments Mark A. Miller San Diego Supercomputer Center Phylogenetics is the study of the diversification of life on the planet

More information

Using Galaxy to provide Tools for the Analysis of diverse Local Datasets. 6 Key Insights

Using Galaxy to provide Tools for the Analysis of diverse Local Datasets. 6 Key Insights 5/25/11 Using Galaxy to provide Tools for the Analysis of diverse Local Datasets 6 Key Insights Hans-Rudolf Hotz (hrh@fmi.ch) Friedrich Miescher Institute for Biomedical Research Basel, Switzerland background

More information

Experimental Psychology Lab Practice tidy cooperation

Experimental Psychology Lab Practice tidy cooperation Experimental Psychology Lab Practice tidy cooperation today s topics 1 folder structure 2 version control 3 git 4 markdown 2 Folder structure ::: how not to manually produced version history clutter no

More information

Toad for Oracle Suite 2017 Functional Matrix

Toad for Oracle Suite 2017 Functional Matrix Toad for Oracle Suite 2017 Functional Matrix Essential Functionality Base Xpert Module (add-on) Developer DBA Runs directly on Windows OS Browse and navigate through objects Create and manipulate database

More information

Biocomputing II Coursework guidance

Biocomputing II Coursework guidance Biocomputing II Coursework guidance I refer to the database layer as DB, the middle (business logic) layer as BL and the front end graphical interface with CGI scripts as (FE). Standardized file headers

More information

HybridCheck User Manual

HybridCheck User Manual HybridCheck User Manual Ben J. Ward February 2015 HybridCheck is a software package to visualise the recombination signal in assembled next generation sequence data, and it can be used to detect recombination,

More information

Galaxy. Daniel Blankenberg The Galaxy Team

Galaxy. Daniel Blankenberg The Galaxy Team Galaxy Daniel Blankenberg The Galaxy Team http://galaxyproject.org Overview What is Galaxy? What you can do in Galaxy analysis interface, tools and datasources data libraries workflows visualization sharing

More information

E. coli functional genotyping: predicting phenotypic traits from whole genome sequences

E. coli functional genotyping: predicting phenotypic traits from whole genome sequences BioNumerics Tutorial: E. coli functional genotyping: predicting phenotypic traits from whole genome sequences 1 Aim In this tutorial we will screen genome sequences of Escherichia coli samples for phenotypic

More information

L4/Darwin: Evolving UNIX. Charles Gray Research Engineer, National ICT Australia

L4/Darwin: Evolving UNIX. Charles Gray Research Engineer, National ICT Australia L4/Darwin: Evolving UNIX Charles Gray Research Engineer, National ICT Australia charles.gray@nicta.com.au Outline 1. Project Overview 2. BSD on the Mach microkernel 3. Porting Darwin to the L4 microkernel

More information

Next Generation Sequencing quality trimming (NGSQTRIM)

Next Generation Sequencing quality trimming (NGSQTRIM) Next Generation Sequencing quality trimming (NGSQTRIM) Danamma B.J 1, Naveen kumar 2, V.G Shanmuga priya 3 1 M.Tech, Bioinformatics, KLEMSSCET, Belagavi 2 Proprietor, GenEclat Technologies, Bengaluru 3

More information

Computer Systems Performance Analysis and Benchmarking (37-235)

Computer Systems Performance Analysis and Benchmarking (37-235) Computer Systems Performance Analysis and Benchmarking (37-235) Analytic Modeling Simulation Measurements / Benchmarking Lecture by: Prof. Thomas Stricker Assignments/Projects: Christian Kurmann Textbook:

More information

[v1.2] SIMBA docs LGCM Federal University of Minas Gerais 2015

[v1.2] SIMBA docs LGCM Federal University of Minas Gerais 2015 [v1.2] SIMBA docs LGCM Federal University of Minas Gerais 2015 2 Summary 1. Introduction... 3 1.1 Why use SIMBA?... 3 1.2 How does SIMBA work?... 4 1.3 How to download SIMBA?... 4 1.4 SIMBA VM... 4 2.

More information

CHAPTER 2: SYSTEM STRUCTURES. By I-Chen Lin Textbook: Operating System Concepts 9th Ed.

CHAPTER 2: SYSTEM STRUCTURES. By I-Chen Lin Textbook: Operating System Concepts 9th Ed. CHAPTER 2: SYSTEM STRUCTURES By I-Chen Lin Textbook: Operating System Concepts 9th Ed. Chapter 2: System Structures Operating System Services User Operating System Interface System Calls Types of System

More information

Chapter 2: Operating-System Structures

Chapter 2: Operating-System Structures Chapter 2: Operating-System Structures Chapter 2: Operating-System Structures Operating System Services User Operating System Interface System Calls Types of System Calls System Programs Operating System

More information

Multimedia Information Systems Design Patterns & Web Frameworks (Part 1) VU ( ) Christoph Trattner

Multimedia Information Systems Design Patterns & Web Frameworks (Part 1) VU ( ) Christoph Trattner Multimedia Information Systems Design Patterns & Web Frameworks (Part 1) VU (707.020) Christoph Trattner Know-Center @Know-Center, TUG, Austria 1 Today s plan! PART 1: First partial exam (30mins) PART

More information

An Introduction to Software Architecture By David Garlan & Mary Shaw 94

An Introduction to Software Architecture By David Garlan & Mary Shaw 94 IMPORTANT NOTICE TO STUDENTS These slides are NOT to be used as a replacement for student notes. These slides are sometimes vague and incomplete on purpose to spark a class discussion An Introduction to

More information

Portal Application Deployment Scripting

Portal Application Deployment Scripting Portal Application Deployment Scripting Graham Harper, IBM ISSL Senior Application Architect Contents Deployment scripting in context What is a portal application? Portal application components Applying

More information

Data Entry, and Manipulation. DataONE Community Engagement & Outreach Working Group

Data Entry, and Manipulation. DataONE Community Engagement & Outreach Working Group Data Entry, and Manipulation DataONE Community Engagement & Outreach Working Group Lesson Topics Best Practices for Creating Data Files Data Entry Options Data Integration Best Practices Data Manipulation

More information

Turkit: Human Computation Algorithms on Mechanical Turk. Greg Little, Lydia B. Chilton, Max Goldman, Robert C. Miller

Turkit: Human Computation Algorithms on Mechanical Turk. Greg Little, Lydia B. Chilton, Max Goldman, Robert C. Miller Turkit: Human Computation Algorithms on Mechanical Turk Greg Little, Lydia B. Chilton, Max Goldman, Robert C. Miller Human computation Algorithms Tasks that build on each other. Human computation Algorithms

More information

Agile Data Management Challenges in Enterprise Big Data Landscape

Agile Data Management Challenges in Enterprise Big Data Landscape Agile Data Management Challenges in Enterprise Big Data Landscape Eric Simon, SAP Big Data October, 2017 1 Evolution Towards Enterprise Big Data Landscape administrator Data analyst Athena Redshift #123

More information

Certkiller.P questions

Certkiller.P questions Certkiller.P2140-020.59 questions Number: P2140-020 Passing Score: 800 Time Limit: 120 min File Version: 4.8 http://www.gratisexam.com/ P2140-020 IBM Rational Enterprise Modernization Technical Sales Mastery

More information

Release Notes. Version Gene Codes Corporation

Release Notes. Version Gene Codes Corporation Version 4.10.1 Release Notes 2010 Gene Codes Corporation Gene Codes Corporation 775 Technology Drive, Ann Arbor, MI 48108 USA 1.800.497.4939 (USA) +1.734.769.7249 (elsewhere) +1.734.769.7074 (fax) www.genecodes.com

More information

arxiv: v1 [cs.se] 7 Dec 2018

arxiv: v1 [cs.se] 7 Dec 2018 arxiv:1812.03149v1 [cs.se] 7 Dec 2018 Extending ROOT through Modules Oksana Shadura 1,, Brian Paul Bockelman 1,, and Vassil Vassilev 2, 1 University of Nebraska Lincoln, 1400 R St, Lincoln, NE 68588, United

More information

Topic 10. WebE - Information Design

Topic 10. WebE - Information Design Topic 10 WebE - Information Design Information Design Model: Summary Three key issues: 1. Content. What content is available? Content Objects? Data Structures? 2. Composition. What views on that content

More information

Supercomputing made super human

Supercomputing made super human Supercomputing made super human The New Age of Accelerated Computing: A History of Innovation and Optimization in Computing Steve Hebert, Cofounder and CEO, Nimbix 2 1880 census had taken eight years to

More information

Managing Your Biological Data with Python

Managing Your Biological Data with Python Chapman & Hall/CRC Mathematical and Computational Biology Series Managing Your Biological Data with Python Ailegra Via Kristian Rother Anna Tramontano CRC Press Taylor & Francis Group Boca Raton London

More information

<Insert Picture Here> JavaFX 2.0

<Insert Picture Here> JavaFX 2.0 1 JavaFX 2.0 Dr. Stefan Schneider Chief Technologist ISV Engineering The following is intended to outline our general product direction. It is intended for information purposes only,

More information

Getting Started. April Strand Life Sciences, Inc All rights reserved.

Getting Started. April Strand Life Sciences, Inc All rights reserved. Getting Started April 2015 Strand Life Sciences, Inc. 2015. All rights reserved. Contents Aim... 3 Demo Project and User Interface... 3 Downloading Annotations... 4 Project and Experiment Creation... 6

More information

Managing & Accessing Data Effectively with Databases

Managing & Accessing Data Effectively with Databases Managing & Accessing Data Effectively with Databases RCS 2017 Brown Bag Series Bob Freeman, PhD Director, Research TechOps HBS 12 April, 2017 1 Overview RCS Services & whoami Benefits and drawbacks of

More information

Webinar optislang & ANSYS Workbench. Dynardo GmbH

Webinar optislang & ANSYS Workbench. Dynardo GmbH Webinar optislang & ANSYS Workbench Dynardo GmbH 1 1. Introduction 2. Process Integration and variation studies 6. Signal Processing 5. ANSYS Mechanical APDL in optislang 3. optislang inside ANSYS 4. ANSYS

More information

CSC 408F/CSC2105F Lecture Notes

CSC 408F/CSC2105F Lecture Notes CSC 408F/CSC2105F Lecture Notes These lecture notes are provided for the personal use of students taking CSC 408H/CSC 2105H in the Fall term 2004/2005 at the University of Toronto. Copying for purposes

More information

SABLE: Agent Support for the Consolidation of Enterprise-Wide Data- Oriented Simulations

SABLE: Agent Support for the Consolidation of Enterprise-Wide Data- Oriented Simulations : Agent Support for the Consolidation of Enterprise-Wide Data- Oriented Simulations Brian Blake The MITRE Corporation Center for Advanced Aviation System Development 1820 Dolley Madison Blvd. McLean, VA

More information

NeuroLOG WP1 Sharing Data & Metadata

NeuroLOG WP1 Sharing Data & Metadata Software technologies for integration of process and data in medical imaging NeuroLOG WP1 Sharing Data & Metadata Franck MICHEL Paris, May 18 th 2010 NeuroLOG ANR-06-TLOG-024 http://neurolog.polytech.unice.fr

More information

Achieving the Goals of the DoD Netcentric Data Strategy Using Embarcadero All-Access

Achieving the Goals of the DoD Netcentric Data Strategy Using Embarcadero All-Access White Paper Achieving the Goals of the DoD Netcentric Data Strategy Using Embarcadero All-Access By: Ron Lewis, CDO Technologies April 2010 Corporate Headquarters EMEA Headquarters Asia-Pacific Headquarters

More information

Version Control for PL/SQL

Version Control for PL/SQL Version Control for PL/SQL Customer Spotlight Success Story: Rhenus Logistics, leading logistics service company from Germany, uses this solution. Manages over 20,000 packages Packages are spread over

More information

Future, Past & Present of a Message

Future, Past & Present of a Message Whitepaper Future, Past & Present of a Message Whitepaper Future, Past and Present of a Message by Patrick De Wilde i What is wrong with the above title? No, I do not mean to write about The Message in

More information

Genome Browsers Guide

Genome Browsers Guide Genome Browsers Guide Take a Class This guide supports the Galter Library class called Genome Browsers. See our Classes schedule for the next available offering. If this class is not on our upcoming schedule,

More information

Concurrency, Mutual Exclusion and Synchronization C H A P T E R 5

Concurrency, Mutual Exclusion and Synchronization C H A P T E R 5 Concurrency, Mutual Exclusion and Synchronization C H A P T E R 5 Multiple Processes OS design is concerned with the management of processes and threads: Multiprogramming Multiprocessing Distributed processing

More information

MODIFYING TRADITIONAL APPLICATIONS FOR MORE COST EFFECTIVE DEPLOYMENT

MODIFYING TRADITIONAL APPLICATIONS FOR MORE COST EFFECTIVE DEPLOYMENT MODIFYING TRADITIONAL APPLICATIONS FOR MORE COST EFFECTIVE DEPLOYMENT Eric H. Nielsen Ph.D. VP SaaS Platform Architecture CA Technologies e.h.nielsen @ ieee.org IEEE Software Technology Conference STC

More information

NVIDIA DLI HANDS-ON TRAINING COURSE CATALOG

NVIDIA DLI HANDS-ON TRAINING COURSE CATALOG NVIDIA DLI HANDS-ON TRAINING COURSE CATALOG Valid Through July 31, 2018 INTRODUCTION The NVIDIA Deep Learning Institute (DLI) trains developers, data scientists, and researchers on how to use artificial

More information

arxiv: v1 [cs.db] 28 Sep 2017

arxiv: v1 [cs.db] 28 Sep 2017 A Practical Python API for Querying AFLOWLIB Conrad W. Rosenbrock Department of Physics and Astronomy, Brigham Young University, Provo, Utah 84602, USA. (Dated: October 3, 2017) arxiv:1710.00813v1 [cs.db]

More information

Object-Oriented Systems. Development: Using the Unified Modeling Language

Object-Oriented Systems. Development: Using the Unified Modeling Language Object-Oriented Systems Development: Using the Unified Modeling Language Chapter 3: Object-Oriented Systems Development Life Cycle Goals The software development process Building high-quality software

More information

Advanced Java Testing. What s next?

Advanced Java Testing. What s next? Advanced Java Testing What s next? Vincent Massol, February 2018 Agenda Context & Current status quo Coverage testing Testing for backward compatibility Mutation testing Environment testing Context: XWiki

More information

Data management is fun. Casey Dunn Assistant Professor Ecology and Evolutionary Biology

Data management is fun. Casey Dunn Assistant Professor Ecology and Evolutionary Biology Data management is fun Casey Dunn Assistant Professor Ecology and Evolutionary Biology What is science? The study of the natural world through observation and experiment. Reproducible study. Prove it isn

More information

Introduction to Operating Systems. Jo, Heeseung

Introduction to Operating Systems. Jo, Heeseung Introduction to Operating Systems Jo, Heeseung Today's Topics What is OS? History of OS 2 Operating System? Computer systems internals 3 Why do we learn OS? To graduate? To make a better OS or system Functionality

More information

Scientific Workflow Tools. Daniel Crawl and Ilkay Altintas San Diego Supercomputer Center UC San Diego

Scientific Workflow Tools. Daniel Crawl and Ilkay Altintas San Diego Supercomputer Center UC San Diego Scientific Workflow Tools Daniel Crawl and Ilkay Altintas San Diego Supercomputer Center UC San Diego 1 escience Today Increasing number of Cyberinfrastructure (CI) technologies Data Repositories: Network

More information

Hybrid Parallelization of the MrBayes & RAxML Phylogenetics Codes

Hybrid Parallelization of the MrBayes & RAxML Phylogenetics Codes Hybrid Parallelization of the MrBayes & RAxML Phylogenetics Codes Wayne Pfeiffer (SDSC/UCSD) & Alexandros Stamatakis (TUM) February 25, 2010 What was done? Why is it important? Who cares? Hybrid MPI/OpenMP

More information

Change Management for the ArcGIS Platform for Local Government. Ayan Mitra Seth Lewis

Change Management for the ArcGIS Platform for Local Government. Ayan Mitra Seth Lewis Change Management for the ArcGIS Platform for Local Government Ayan Mitra Seth Lewis What is Change Management? Process used to ensure that changes to a product or system are introduced in a controlled

More information

Integrating large, fast-moving, and heterogeneous data sets in biology.

Integrating large, fast-moving, and heterogeneous data sets in biology. Integrating large, fast-moving, and heterogeneous data sets in biology. C. Titus Brown Asst Prof, CSE and Microbiology; BEACON NSF STC Michigan State University ctb@msu.edu Introduction Background: Modeling

More information

1 Abstract. 2 Introduction. 3 Requirements

1 Abstract. 2 Introduction. 3 Requirements 1 Abstract 2 Introduction This SOP describes the HMP Whole- Metagenome Annotation Pipeline run at CBCB. This pipeline generates a 'Pretty Good Assembly' - a reasonable attempt at reconstructing pieces

More information

Computational Databases: Inspirations from Statistical Software. Linnea Passing, Technical University of Munich

Computational Databases: Inspirations from Statistical Software. Linnea Passing, Technical University of Munich Computational Databases: Inspirations from Statistical Software Linnea Passing, linnea.passing@tum.de Technical University of Munich Data Science Meets Databases Data Cleansing Pipelines Fuzzy joins Data

More information

Finding the appropriate method, with a special focus on: Mapping and alignment. Philip Clausen

Finding the appropriate method, with a special focus on: Mapping and alignment. Philip Clausen Finding the appropriate method, with a special focus on: Mapping and alignment Philip Clausen Background Most people choose their methods based on popularity and history, not by reasoning and research.

More information

Finishing Circular Assemblies. J Fass UCD Genome Center Bioinformatics Core Thursday April 16, 2015

Finishing Circular Assemblies. J Fass UCD Genome Center Bioinformatics Core Thursday April 16, 2015 Finishing Circular Assemblies J Fass UCD Genome Center Bioinformatics Core Thursday April 16, 2015 Assembly Strategies de Bruijn graph Velvet, ABySS earlier, basic assemblers IDBA, SPAdes later, multi-k

More information

Phylogeny Yun Gyeong, Lee ( )

Phylogeny Yun Gyeong, Lee ( ) SpiltsTree Instruction Phylogeny Yun Gyeong, Lee ( ylee307@mail.gatech.edu ) 1. Go to cygwin-x (if you don t have cygwin-x, you can either download it or use X-11 with brand new Mac in 306.) 2. Log in

More information

CompClustTk Manual & Tutorial

CompClustTk Manual & Tutorial CompClustTk Manual & Tutorial Brandon King Copyright c California Institute of Technology Version 0.1.10 May 13, 2004 Contents 1 Introduction 1 1.1 Purpose.............................................

More information

NETWORK POLICY FUNCTION VIRTUALIZATION VIA SDN AND PACKET PROCESSING

NETWORK POLICY FUNCTION VIRTUALIZATION VIA SDN AND PACKET PROCESSING Review of the Air Force Academy No 3 (27) 2014 NETWORK POLICY FUNCTION VIRTUALIZATION VIA SDN AND PACKET PROCESSING Titus Constantin BĂLAN Transilvania University of Braşov Abstract: So far affecting mostly

More information

The Future of Programming Languages. Will Crichton CS /28/18

The Future of Programming Languages. Will Crichton CS /28/18 The Future of Programming Languages Will Crichton CS 242 11/28/18 Thesis: The future of performance optimization is better programming models, not better optimizers. Thesis: programming languages The future

More information

ProM 6: The Process Mining Toolkit

ProM 6: The Process Mining Toolkit ProM 6: The Process Mining Toolkit H.M.W. Verbeek, J.C.A.M. Buijs, B.F. van Dongen, W.M.P. van der Aalst Department of Mathematics and Computer Science, Eindhoven University of Technology P.O. Box 513,

More information

Composing, Reproducing, and Sharing Simula5ons

Composing, Reproducing, and Sharing Simula5ons Composing, Reproducing, and Sharing Simula5ons Daniel Mosse {mosse,childers}@cs.pi

More information

Thoughts on simulation project management Andrew Davison UNIC, CNRS FACETS CodeJam #2 Gif sur Yvette, 5th-8th May 2008

Thoughts on simulation project management Andrew Davison UNIC, CNRS FACETS CodeJam #2 Gif sur Yvette, 5th-8th May 2008 Thoughts on simulation project management Andrew Davison UNIC, CNRS FACETS CodeJam #2 Gif sur Yvette, 5th-8th May 2008 Outline 1 Reproducible research, drowning in data and other problems 2 Solutions 3

More information

Luigi Build Data Pipelines of batch jobs. - Pramod Toraskar

Luigi Build Data Pipelines of batch jobs. - Pramod Toraskar Luigi Build Data Pipelines of batch jobs - Pramod Toraskar I am a Principal Solution Engineer & Pythonista with more than 8 years of work experience, Works for a Red Hat India an open source solutions

More information

Tn-seq Explorer 1.2. User guide

Tn-seq Explorer 1.2. User guide Tn-seq Explorer 1.2 User guide 1. The purpose of Tn-seq Explorer Tn-seq Explorer allows users to explore and analyze Tn-seq data for prokaryotic (bacterial or archaeal) genomes. It implements a methodology

More information

CASE TOOLS LAB VIVA QUESTION

CASE TOOLS LAB VIVA QUESTION 1. Define Object Oriented Analysis? VIVA QUESTION Object Oriented Analysis (OOA) is a method of analysis that examines requirements from the perspective of the classes and objects found in the vocabulary

More information

IBM High Performance Computing Toolkit

IBM High Performance Computing Toolkit IBM High Performance Computing Toolkit Pidad D'Souza (pidsouza@in.ibm.com) IBM, India Software Labs Top 500 : Application areas (November 2011) Systems Performance Source : http://www.top500.org/charts/list/34/apparea

More information

M 100 G 3000 M 3000 G 100. ii) iii)

M 100 G 3000 M 3000 G 100. ii) iii) A) B) RefSeq 1 Other Alignments 180000 1 1 Simulation of Kim et al method Human Mouse Rat Fruitfly Nematode Best Alignment G estimate 1 80000 RefSeq 2 G estimate C) D) 0 350000 300000 250000 0 150000 Interpretation

More information

USER FRIENDLY HIGH PRODUCTIVITY COMPUTATIONAL WORKFLOWS USING THE VISION/HPC PROTOTYPE

USER FRIENDLY HIGH PRODUCTIVITY COMPUTATIONAL WORKFLOWS USING THE VISION/HPC PROTOTYPE USER FRIENDLY HIGH PRODUCTIVITY COMPUTATIONAL WORKFLOWS USING THE VISION/HPC PROTOTYPE J.H. Unpingco Ohio Supercomputer Center Columbus, OH 43212 I. ABSTRACT HPCs (high-performance computers) utilize multiple

More information

Guidelines for deploying PHP applications

Guidelines for deploying PHP applications Guidelines for deploying PHP applications By Kevin Schroeder Technology Evangelist Zend Technologies About me Kevin Schroeder Technology Evangelist for Zend Programmer Sys Admin Author IBM i Programmer

More information

Introduction of NAREGI-PSE implementation of ACS and Replication feature

Introduction of NAREGI-PSE implementation of ACS and Replication feature Introduction of NAREGI-PSE implementation of ACS and Replication feature January. 2007 NAREGI-PSE Group National Institute of Informatics Fujitsu Limited Utsunomiya University National Research Grid Initiative

More information

CS4470: Intro to UI Software CS6456: Principles of UI Software. Fall 2006 Keith Edwards

CS4470: Intro to UI Software CS6456: Principles of UI Software. Fall 2006 Keith Edwards CS4470: Intro to UI Software CS6456: Principles of UI Software Fall 2006 Keith Edwards Today s Agenda Introductions Me TA You Class Overview Syllabus Resources Class Policies 2 Introductions Instructor

More information

Honours/Master/PhD Thesis Projects Supervised by Dr. Yulei Sui

Honours/Master/PhD Thesis Projects Supervised by Dr. Yulei Sui Honours/Master/PhD Thesis Projects Supervised by Dr. Yulei Sui Projects 1 Information flow analysis for mobile applications 2 2 Machine-learning-guide typestate analysis for UAF vulnerabilities 3 3 Preventing

More information

High-Performance Algorithm Engineering for Computational Phylogenetics

High-Performance Algorithm Engineering for Computational Phylogenetics High-Performance Algorithm Engineering for Computational Phylogenetics Bernard M.E. Moret moret@cs.unm.edu Department of Computer Science University of New Mexico Albuquerque, NM 87131 High-Performance

More information

IBM PDTools for z/os. Update. Hans Emrich. Senior Client IT Professional PD Tools + Rational on System z Technical Sales and Solutions IBM Systems

IBM PDTools for z/os. Update. Hans Emrich. Senior Client IT Professional PD Tools + Rational on System z Technical Sales and Solutions IBM Systems IBM System z AD Tage 2017 IBM PDTools for z/os Update Hans Emrich Senior Client IT Professional PD Tools + Rational on System z Technical Sales and Solutions IBM Systems hans.emrich@de.ibm.com 2017 IBM

More information

Seminar III: R/Bioconductor

Seminar III: R/Bioconductor Leonardo Collado Torres lcollado@lcg.unam.mx Bachelor in Genomic Sciences www.lcg.unam.mx/~lcollado/ August - December, 2009 1 / 25 Class outline Working with HTS data: a simulated case study Intro R for

More information

Containerizing GPU Applications with Docker for Scaling to the Cloud

Containerizing GPU Applications with Docker for Scaling to the Cloud Containerizing GPU Applications with Docker for Scaling to the Cloud SUBBU RAMA FUTURE OF PACKAGING APPLICATIONS Turns Discrete Computing Resources into a Virtual Supercomputer GPU Mem Mem GPU GPU Mem

More information

COMBAT TB. An integrated environment for Tuberculosis data analysis

COMBAT TB. An integrated environment for Tuberculosis data analysis COMBAT TB An integrated environment for Tuberculosis data analysis Worldwide: more than 10 million infected 1.8 million deaths in 2015 Majority of disease burden in Africa and Asia 1% of SA population

More information

The iplant Data Commons

The iplant Data Commons The iplant Data Commons Using irods to Facilitate Data Dissemination, Discovery, and Reproducibility Jeremy DeBarry, jdebarry@iplantcollaborative.org Tony Edgin, tedgin@iplantcollaborative.org Nirav Merchant,

More information

ICT & Computing Progress Grid

ICT & Computing Progress Grid ICT & Computing Progress Grid Pupil Progress ion 9 Select, Algorithms justify and apply appropriate techniques and principles to develop data structures and algorithms for the solution of problems Programming

More information

INTRODUCTION TO NEXTFLOW

INTRODUCTION TO NEXTFLOW INTRODUCTION TO NEXTFLOW Paolo Di Tommaso, CRG NETTAB workshop - Roma October 25th, 2016 @PaoloDiTommaso Research software engineer Comparative Bioinformatics, Notredame Lab Center for Genomic Regulation

More information

Dynamic Cuda with F# HPC GPU & F# Meetup. March 19. San Jose, California

Dynamic Cuda with F# HPC GPU & F# Meetup. March 19. San Jose, California Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff daniel.egloff@quantalea.net +41 44 520 01 17 +41 79 430 03 61 About Us! Software development and consulting company!

More information