Representing Data Digitally
Anne Condon
September 6, 2007

warm-up exercise
pick two examples of data in your everyday life*
- in what media is the data represented?
- is the data converted from one representation to another as you use it? how?
- how does a particular representation of the data influence what you can do with it?
- might any errors arise when you use the data?
*one example should be computer-related, one not... try for examples no one else might think of

goals for today
- be able to define a data representation scheme
- understand, and start working on, course learning goals on data representation
let's start with two examples

example from nature
- proteins are critical to life
- organisms need ways to store and transmit descriptions of proteins
- proteins: beads on a necklace, with 20 different bead types (amino acids)
- what medium, and what representation scheme, does nature use?
news/science_ed/structlife/
DNA
- storage medium in the cell
- a (double-stranded) bead necklace with four different kinds of beads (bases, nucleotides): A, C, G, T

genes
- to keep our body functioning, proteins are constantly manufactured in our cells
- genes, which are segments of our DNA, store descriptions for proteins
- the genetic code specifies how a protein (sequence of amino acids) can be represented as DNA (sequence of bases)

the genetic code (partial table needed for example)
TTC phenylalanine; TTA, TTG, CTT, CTC, CTA, CTG leucine; ATT, ATA isoleucine; GTT, GTA valine
TCC, TCA, TCG serine; CCT, CCC, CCA, CCG proline; ACT, ACA threonine; GCT, GCA alanine
TAC tyrosine; TAA, TAG stop; CAT, CAC histidine; CAA, CAG glutamine; AAT asparagine; AAA lysine; GAT aspartic acid; GAA glutamic acid
TGC cysteine; TGA stop; TGG tryptophan; CGT, CGC, CGA, CGG arginine; AGT serine; AGA arginine; GGT, GGA glycine

the genetic code: example
one code for
methionine isoleucine phenylalanine aspartic acid glycine
is
ATGATCTTTGACGGG
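The example above can be sketched in a few lines of Python. This is only an illustration: the function names `decode_dna` and `encode_protein` are made up here, and the table holds just the five codons the example needs (their assignments follow the standard genetic code).

```python
# Partial codon table: only the codons used in the example above.
CODON_TO_AMINO_ACID = {
    "ATG": "methionine",
    "ATC": "isoleucine",
    "TTT": "phenylalanine",
    "GAC": "aspartic acid",
    "GGG": "glycine",
}


def decode_dna(dna):
    """Split a DNA string into 3-base codons and look each one up."""
    if len(dna) % 3 != 0:
        raise ValueError("DNA length must be a multiple of 3")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return [CODON_TO_AMINO_ACID[codon] for codon in codons]


def encode_protein(amino_acids):
    """Pick one codon per amino acid by reversing the partial table."""
    amino_acid_to_codon = {aa: codon for codon, aa in CODON_TO_AMINO_ACID.items()}
    return "".join(amino_acid_to_codon[aa] for aa in amino_acids)


protein = ["methionine", "isoleucine", "phenylalanine", "aspartic acid", "glycine"]
dna = encode_protein(protein)
print(dna)                         # ATGATCTTTGACGGG
print(decode_dna(dna) == protein)  # True
```

The slide says "one code" because most amino acids have several codons in the full table, so the same protein can usually be written as DNA in more than one way; the reversed dictionary here simply picks the single codon available in the partial table.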
suppose you have no paper, and need to describe a connect-the-dots representation using your voice. how could you do it?
[figure: a sequence of numbers giving the dimension of the grid (in cm), followed by the coordinates of the first through fifth dots; numeric values not recoverable]

a continuous-line drawing can be represented as
- a sequence of dots drawn on a page
- a sequence of numbers, which lists the coordinates of the dots on a 2D grid (preceded by the dimension of the grid in cm)

activity
make a continuous-line drawing on a piece of paper, and represent it as best you can using a sequence of dots
- how did you decide on the number of dots to use? where to position the dots?
- what principles would you suggest, in general, for selecting the dots to represent a picture?
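A minimal sketch of the number-sequence scheme described above: grid dimensions first, then the coordinates of each dot in drawing order. The function names and the sample dots are hypothetical; the numeric values from the slide figure were not recoverable, so these are invented for illustration.

```python
def encode_drawing(width_cm, height_cm, dots):
    """Flatten a drawing into numbers: grid dimensions, then x, y for each dot."""
    numbers = [width_cm, height_cm]
    for x, y in dots:
        numbers.extend([x, y])
    return numbers


def decode_drawing(numbers):
    """Recover the grid dimensions and the ordered list of dots."""
    width_cm, height_cm = numbers[0], numbers[1]
    coords = numbers[2:]
    if len(coords) % 2 != 0:
        raise ValueError("each dot needs both an x and a y coordinate")
    dots = [(coords[i], coords[i + 1]) for i in range(0, len(coords), 2)]
    return width_cm, height_cm, dots


# Hypothetical five-dot drawing on a 10 cm x 10 cm grid.
dots = [(1, 1), (5, 9), (9, 1), (1, 6), (9, 6)]
numbers = encode_drawing(10, 10, dots)
print(numbers)  # [10, 10, 1, 1, 5, 9, 9, 1, 1, 6, 9, 6]
print(decode_drawing(numbers) == (10, 10, dots))  # True
```

Reading the list aloud is one way to describe the drawing by voice. The dots themselves round-trip exactly, but any curve in the original drawing is only approximated by straight segments between dots, which is why the choice of how many dots to use, and where, matters.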
a data representation scheme
a description of how data of one type (source data) can be represented using data of another type (encoded data)
source data -> representation scheme -> encoded data

why study data representation schemes?
- central to CS: designing representation schemes to balance engineering (e.g. errors) with usability (e.g. aesthetic) considerations is a major activity
- useful: facility with data helps in other fields, and in everyday life
- fun: designing and critiquing representation schemes is creative, and gives you a new lens with which to view your world

example: data representation schemes in CS
polygonal representation of surfaces (see work of Alla Sheffer in CS)
this is the 3D version of connect-the-dots!

course learning goals on data representation
- describe properties of data representation schemes that can be found in many contexts of the world around you
- critique properties of data representation schemes, from the standpoint of usability and engineering considerations, given information about the context in which the scheme is used
- engage in design of data representation schemes, for example by proposing modifications that address shortcomings of given representation schemes
- put your knowledge to practical use, for example, in making decisions about representing your own data, or deciding you want to go into CS, or applying the knowledge in your own field
scheme property 1: digital vs not digital
- digital scheme: encoded data is digital (source data may be digital or analog)
- digital data: comprised of symbols over a finite alphabet
- analog data: not digital, e.g. continuous line

scheme property 2: lossless vs lossy
- lossless: source data can be reconstructed exactly from encoded data
- lossy: not lossless

scheme property 3: robustness in the face of errors
this one is easiest to explain in context... let's go back to one example

the genetic code: robustness in the face of errors
what if a DNA base is copied incorrectly?
ATGATCTTTGACGGG -> ATGATCTCTGACGGG
what if a DNA base is deleted?
ATGATCTTTGACGGG -> ATGATCTTGACGGG
critique: what might this tell us about the cell's translation machinery?

summary: some properties of data representation schemes
- digital or not
- lossless or lossy
- robustness in the face of errors

recall: goals for today
- be able to define a data representation scheme
- understand, and start working on, course learning goals on data representation
do you remember the goals?
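To make the two error cases concrete, here is a small sketch that reads each DNA string three bases at a time. The `translate` helper is hypothetical, and it assumes the simplest possible reader: start at the first base and never resynchronize. The codon table is partial and contains only the codons that occur in these three strings (standard genetic code assignments).

```python
# Partial codon table: just the codons that appear in the strings below.
CODONS = {
    "ATG": "methionine", "ATC": "isoleucine", "TTT": "phenylalanine",
    "GAC": "aspartic acid", "GGG": "glycine",
    "TCT": "serine", "TTG": "leucine", "ACG": "threonine",
}


def translate(dna):
    """Read codons left to right; return the amino acids and any leftover bases."""
    usable = len(dna) - len(dna) % 3
    amino_acids = [CODONS[dna[i:i + 3]] for i in range(0, usable, 3)]
    leftover = dna[usable:]
    return amino_acids, leftover


original = "ATGATCTTTGACGGG"
substituted = "ATGATCTCTGACGGG"  # one base copied incorrectly (T -> C)
deleted = "ATGATCTTGACGGG"       # one base deleted

print(translate(original))
# (['methionine', 'isoleucine', 'phenylalanine', 'aspartic acid', 'glycine'], '')
print(translate(substituted))
# (['methionine', 'isoleucine', 'serine', 'aspartic acid', 'glycine'], '')
print(translate(deleted))
# (['methionine', 'isoleucine', 'leucine', 'threonine'], 'GG')
```

Under this reading model the substitution changes a single amino acid, while the deletion shifts every later codon boundary, so everything after the deleted base is read differently.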
recap: a data representation scheme
a description of how data of one type (source data) can be represented using data of another type (encoded data)
source data -> representation scheme -> encoded data

recap: course learning goals on data representation
- describe properties of data representation schemes that can be found in many contexts of the world around you
- critique properties of data representation schemes, from the standpoint of usability and engineering considerations, given information about the context in which the scheme is used
- engage in design of data representation schemes, for example by proposing modifications that address shortcomings of given representation schemes
- put your knowledge to practical use, for example, in making decisions about representing your own data, or deciding you want to go into CS, or applying the knowledge in your own field