CSCI 1820 Notes. Scribes: tl40. February 26 - March 02, Estimating size of graphs used to build the assembly.


CSCI 1820 Notes
Scribes: tl40
February 26 - March 02, 2018

Chapter 2. Genome Assembly Algorithms
2.1. Statistical Theory
2.2. Algorithmic Theory
Idury-Waterman Algorithm
Estimating size of graphs used to build the assembly.
de Bruijn graphs combinatorics.

DNA Assembly Problem

We have a target DNA molecule of length $G$ bp, which we do not know and want to reconstruct. We know the sequences of a number of fragments of the target DNA, $f_1, f_2, \ldots, f_N$. Assume that they all have length $L$; the same results hold when $L$ is the average length of the fragments. We can record their consecutive order. We use de Bruijn graphs in the algorithms that assemble the target DNA sequence.

Remark
1. The graph has the $(k-1)$-mers as nodes and the $k$-mers as directed edges, so it depends on the choice of $k$.
2. The graph is Eulerian, which means that the in-degree and the out-degree are the same at every node, so there exists at least one Eulerian cycle; in fact, there are exponentially many different ones.
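The construction in the Remark can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `de_bruijn_graph` and the toy fragments are hypothetical, fragments are assumed error-free, and each distinct $k$-mer contributes exactly one edge.

```python
from collections import defaultdict

def de_bruijn_graph(fragments, k):
    """Map each (k-1)-mer to the list of (k-1)-mers it points to.

    Every distinct k-mer found in the fragments becomes one directed edge
    from its (k-1)-prefix to its (k-1)-suffix.
    """
    kmers = set()
    for fragment in fragments:
        for i in range(len(fragment) - k + 1):
            kmers.add(fragment[i:i + k])
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)

# Toy input (hypothetical): two overlapping fragments, k = 3.
graph = de_bruijn_graph(["ACGTAC", "GTACGG"], 3)
for node, successors in sorted(graph.items()):
    print(node, "->", successors)
```

Nodes that have no outgoing edge (here `GG`) appear only as successors, not as keys, which keeps the sketch short.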

Estimate the size of the de Bruijn graph based on k-mers

Setting:
$k$: size of the mers.
$G$: size of the target DNA.
$N$: number of fragments.
$L$: length of the fragments.
$a = NL/G$: coverage.
$T = |f_1| + |f_2| + \cdots + |f_N| \approx N(L - k + 2)$: number of $(k-1)$-mers in the fragments.
$r$: DNA single-base error rate (sequencing error rate), about 2% in substitutions.

Theorem. Let $L' = L - k + 2$ and $R = 1 - (1 - r)^{k-1}$. Classify the $(k-1)$-mers in the de Bruijn graph (sequence graph) as TRUE if they are derived from a correctly read fragment region, and FALSE otherwise. Assume the following:
1. The error rate is small.
2. No two fragments generate the same FALSE $(k-1)$-mer, that is, FALSE $(k-1)$-mers appear at most once among the fragments.
Then, in the associated de Bruijn graph (sequence graph), the expected number of vertices is

$$E[|V|] = RT + \left[1 - e^{-a(1-R)}\right] L'.$$

Proof. Let $X_\alpha$ be the number of fragments that cover the $(k-1)$-long region $\alpha, \alpha + 1, \ldots, \alpha + k - 2$ of the target DNA sequence. Then each $X_\alpha$ is a Poisson random variable with mean $a$, and $E\left[\sum_\alpha X_\alpha\right] = T$. Since the error rate is uniform, the probability that the sequencing machine reads a single base correctly is $1 - r$, so the probability that an entire $(k-1)$-mer is read correctly is $(1 - r)^{k-1}$. Hence the probability that a $(k-1)$-mer is read incorrectly is $R = 1 - (1 - r)^{k-1}$; we can call $R$ the $(k-1)$-mer error rate. By the second assumption, every incorrectly read $(k-1)$-mer is a distinct FALSE vertex, so

$$E[|\mathrm{FALSE}|] = RT.$$
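The quantities defined so far, together with the formula stated in the theorem, are easy to evaluate numerically. The sketch below is only an illustration: the function name and the sample values are assumptions, and $L'$ is passed in as a parameter (`l_prime`) exactly as the theorem defines it rather than being recomputed.

```python
import math

def expected_vertices(a, r, k, T, l_prime):
    """Evaluate the theorem's formula E|V| = R*T + (1 - exp(-a(1-R))) * L'.

    a: coverage, r: per-base error rate, k: mer size,
    T: number of (k-1)-mers in the fragments, l_prime: L' from the theorem.
    Returns (R, E|FALSE|, E|TRUE|, E|V|).
    """
    R = 1.0 - (1.0 - r) ** (k - 1)          # (k-1)-mer error rate
    expected_false = R * T                   # FALSE vertices
    expected_true = (1.0 - math.exp(-a * (1.0 - R))) * l_prime  # TRUE vertices
    return R, expected_false, expected_true, expected_false + expected_true

# Hypothetical sample values, chosen only to show the call.
R, e_false, e_true, e_total = expected_vertices(a=8.0, r=0.02, k=21, T=10_000, l_prime=500)
print(f"R = {R:.4f}, E|FALSE| = {e_false:.1f}, E|TRUE| = {e_true:.1f}, E|V| = {e_total:.1f}")
```

Even with a 2% per-base error rate, $R$ grows quickly with $k$, which is why the FALSE term can dominate the vertex count.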

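The expected number of TRUE $(k-1)$-mers is equal to the expected number of positions $\alpha$ such that at least one of the fragments contains no sequencing errors in the $(k-1)$-long region $\alpha, \alpha + 1, \ldots, \alpha + k - 2$. If $i$ fragments cover the region, the probability that all of them contain an error there is $R^i$. So, for a Poisson random variable $X$ with mean $a$,

$$
\begin{aligned}
E[|\mathrm{TRUE}|] &= L' \sum_{i \ge 0} (1 - R^i)\, P(X = i) \\
&= L' \left( \sum_{i \ge 0} P(X = i) - \sum_{i \ge 0} P(X = i)\, R^i \right) \\
&= L' \left( \sum_{i \ge 0} \frac{e^{-a} a^i}{i!} - \sum_{i \ge 0} \frac{e^{-a} a^i R^i}{i!} \right) \\
&= L' \left( e^{-a} \sum_{i \ge 0} \frac{a^i}{i!} - e^{-a} \sum_{i \ge 0} \frac{(aR)^i}{i!} \right) \\
&= L' \left( 1 - e^{-a} e^{aR} \right) \\
&= L' \left( 1 - e^{-a(1-R)} \right).
\end{aligned}
$$

Therefore,

$$E[|V|] = E[|\mathrm{TRUE}|] + E[|\mathrm{FALSE}|] = RT + \left(1 - e^{-a(1-R)}\right) L'.$$

Theory of Combinatorics on Words

Definition. Let $A$ be the alphabet, $A^*$ the set of sequences over $A$ including the empty word $\epsilon$, and $A^+$ the set of sequences without $\epsilon$. For a sequence $u$, write $u^m = uu \cdots u$ ($m$ copies). A word $w \in A^+$ is called primitive if $w = u^m$ for some $u \in A^+$ implies $m = 1$. Two words $x$ and $y$ are conjugate if there exist words $u, v$ such that $x = uv$ and $y = vu$ (they are cyclic shifts of one another).

Definition. A Lyndon word is a primitive word that is strictly smaller, in lexicographic order, than each of its proper suffixes (equivalently, than each of its nontrivial rotations).

Lemma. Let $\mathcal{L}$ be the set of all Lyndon words. If $l < m$ for $l, m \in \mathcal{L}$, then $lm \in \mathcal{L}$.

Theorem (Factorization). Any word can be factorized uniquely as a non-increasing product of Lyndon words.

Example. abracadabra = (abracad)(abr)(a)

Theorem (Fredricksen & Maiorana). Let $l_1 < l_2 < \cdots < l_m$ be the increasing sequence of Lyndon words of length dividing $n$. Then $l_1 l_2 \cdots l_m$ is the lexicographically first de Bruijn cycle of order $n$ (all $n$-mers appear in it).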
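Both theorems can be checked on small inputs. The sketch below is an illustration, not part of the lecture: it uses Duval's algorithm (a standard linear-time method, not named in these notes) for the factorization, and a deliberately naive brute-force enumeration of Lyndon words for the Fredricksen-Maiorana construction. All function names are hypothetical.

```python
from itertools import product

def lyndon_factorization(s):
    """Factor s into a non-increasing product of Lyndon words (Duval's algorithm)."""
    factors = []
    i, n = 0, len(s)
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors

def is_lyndon(w):
    """True if w is strictly smaller than every nontrivial rotation of itself."""
    return all(w < w[i:] + w[:i] for i in range(1, len(w)))

def de_bruijn_cycle(alphabet, n):
    """Concatenate, in lexicographic order, all Lyndon words whose length divides n.

    Brute force over all words, so only suitable for small n.
    """
    words = []
    for d in range(1, n + 1):
        if n % d == 0:
            for letters in product(sorted(alphabet), repeat=d):
                w = "".join(letters)
                if is_lyndon(w):
                    words.append(w)
    return "".join(sorted(words))

print(lyndon_factorization("abracadabra"))  # ['abracad', 'abr', 'a']
print(de_bruijn_cycle("ab", 3))             # aaababbb
```

Reading `aaababbb` cyclically gives the eight windows aaa, aab, aba, bab, abb, bbb, bba, baa, that is, every 3-mer over {a, b} exactly once.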

De Bruijn Graphs Combinatorics & Eulerian Cycles (Building Genome Assemblies)

Example. A de Bruijn graph of order $n = 3$ over $A = \{a, b\}$:

[Figure: de Bruijn graph of order $n = 3$ over $A = \{a, b\}$.]

Definition. For a given graph $G = (V, E)$ and $v \in V$,
$d^-(v)$ = the in-degree = number of edges coming in;
$d^+(v)$ = the out-degree = number of edges going out.

Definition. A spanning tree directed at vertex $v$ has the following properties:
1. It is connected.
2. For every vertex $u \neq v$, there is a directed path in the spanning tree from $u$ to $v$.

Remark (How many Eulerian orders are in the de Bruijn graph?). There is a connection between Eulerian cycles and spanning trees directed towards a fixed node $v$, which gives a relationship between the number of Eulerian cycles and the number of such spanning trees.

Example. Two directed spanning trees directed towards the vertex $bb$:

[Figure: two spanning trees directed towards the vertex $bb$.]
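The spanning trees in the example can also be listed by brute force. The following sketch is an illustration under stated assumptions: the function names are hypothetical, vertices are the $(n-1)$-mers and edges the $n$-mers as in the notes, and each non-root vertex simply tries every non-loop outgoing edge.

```python
from itertools import product

def de_bruijn_edges(alphabet, n):
    """One edge per n-mer, from its (n-1)-prefix to its (n-1)-suffix."""
    return [("".join(w[:-1]), "".join(w[1:])) for w in product(alphabet, repeat=n)]

def spanning_trees_towards(edges, root):
    """List all spanning trees in which every vertex has a directed path to root."""
    vertices = sorted({u for u, _ in edges} | {v for _, v in edges})
    others = [v for v in vertices if v != root]
    # Each non-root vertex keeps exactly one outgoing edge (self-loops excluded).
    choices = [[e for e in edges if e[0] == u and e[1] != u] for u in others]
    trees = []
    for pick in product(*choices):
        successor = dict(pick)
        reaches_root = True
        for start in others:
            seen, u = set(), start
            while u != root:
                if u in seen:          # walked into a cycle that misses the root
                    reaches_root = False
                    break
                seen.add(u)
                u = successor[u]
            if not reaches_root:
                break
        if reaches_root:
            trees.append(sorted(pick))
    return trees

for tree in spanning_trees_towards(de_bruijn_edges("ab", 3), "bb"):
    print(tree)
```

For the order-3 graph over {a, b} this prints two trees: one with edges aa→ab, ab→bb, ba→aa and one with edges aa→ab, ab→bb, ba→ab.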

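Definition. A directed graph $G = (V, E)$ is strongly connected if for all $u, v \in V$ there exists a directed path from $u$ to $v$.

Theorem. A strongly connected graph $G = (V, E)$ is Eulerian if and only if $d^-(v) = d^+(v)$ for every $v \in V$.

Definition. An Eulerian cycle is a cycle on a graph that visits every edge exactly once.

Theorem (BEST). For an Eulerian directed graph $G = (V, E)$, define

$$\Pi(G) = \prod_{v \in V} \left(d^+(v) - 1\right)!.$$

For a vertex $v \in V$, let $t(G)$ be the number of spanning trees oriented towards $v$. Then the number of Eulerian cycles is equal to $t(G) \cdot \Pi(G)$.

Theorem. The number of de Bruijn cycles of order $n$ (all $n$-mers appear in the de Bruijn cycle) over a $k$-letter alphabet is

$$N(n, k) = \frac{(k!)^{k^{n-1}}}{k^n}.$$

Table 1: Some Examples. Columns: $n$, $N(n, 2)$, $N(n, 3)$, $N(n, 4)$, $N(n, 5)$. [Table entries lost in transcription; the only surviving value is 24.]

Theorem (The Matrix-Tree Theorem). Let $G$ be a graph on a set of vertices $V$, let $M = (M_{vw})$ be the adjacency matrix of $G$, where $M_{vw}$ is the number of edges from $v$ to $w$, let $D$ be the diagonal matrix with $D_{vv} = \sum_{w \in V} M_{vw}$, and let $L = D - M$ be the Laplacian matrix of $G$. Let $K_v(G)$ be the determinant of the matrix $C_v$ obtained from $L$ by removing the row and the column indexed by $v$. Then, for any $v \in V$, the number of spanning trees of $G$ oriented towards $v$ is $K_v(G)$.

Example. Let $G$ be the de Bruijn graph of order $n = 3$ over $A = \{a, b\}$, so $V = \{aa, ab, ba, bb\}$. With rows and columns ordered $aa, ab, ba, bb$, the adjacency matrix $M$, the matrix $D$, and the Laplacian matrix $L$ are

$$
M = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix}, \quad
D = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 2 \end{pmatrix}, \quad
L = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 2 & -1 & -1 \\ -1 & -1 & 2 & 0 \\ 0 & 0 & -1 & 1 \end{pmatrix}.
$$

Therefore, the number of spanning trees directed towards $bb$ is 2, by the following computation:

$$
C_{bb} = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 2 & -1 \\ -1 & -1 & 2 \end{pmatrix}, \qquad
\det C_{bb} = 1 \cdot (4 - 1) + 1 \cdot (0 - 1) = 2.
$$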
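The Matrix-Tree Theorem, the BEST theorem, and the closed formula for $N(n, k)$ can be cross-checked on small de Bruijn graphs. The sketch below is an illustration, not the lecture's method: the function names are hypothetical, the alphabet is taken to be $\{0, \ldots, k-1\}$, and the determinant is computed with fraction-free (Bareiss) elimination so that all arithmetic stays in exact integers.

```python
from itertools import product
from math import factorial

def int_det(matrix):
    """Exact determinant of an integer matrix (Bareiss elimination)."""
    m = [row[:] for row in matrix]
    n = len(m)
    if n == 0:
        return 1                      # determinant of the empty matrix
    sign, prev = 1, 1
    for i in range(n - 1):
        if m[i][i] == 0:              # find a row to swap in as pivot
            swap = next((r for r in range(i + 1, n) if m[r][i] != 0), None)
            if swap is None:
                return 0
            m[i], m[swap] = m[swap], m[i]
            sign = -sign
        for r in range(i + 1, n):
            for c in range(i + 1, n):
                m[r][c] = (m[r][c] * m[i][i] - m[r][i] * m[i][c]) // prev
        prev = m[i][i]
    return sign * m[n - 1][n - 1]

def eulerian_cycles_de_bruijn(k, n):
    """Count Eulerian cycles of the order-n de Bruijn graph via BEST + Matrix-Tree."""
    vertices = list(product(range(k), repeat=n - 1))
    index = {v: i for i, v in enumerate(vertices)}
    size = len(vertices)
    M = [[0] * size for _ in range(size)]
    for w in product(range(k), repeat=n):          # one edge per n-mer
        M[index[w[:-1]]][index[w[1:]]] += 1
    out_degree = [sum(row) for row in M]
    laplacian = [[(out_degree[i] if i == j else 0) - M[i][j] for j in range(size)]
                 for i in range(size)]
    minor = [row[:-1] for row in laplacian[:-1]]   # drop the last vertex's row and column
    trees = int_det(minor)                         # spanning trees towards that vertex
    pi = 1
    for d in out_degree:
        pi *= factorial(d - 1)
    return trees * pi

def closed_form(n, k):
    """N(n, k) = (k!)^(k^(n-1)) / k^n."""
    return factorial(k) ** (k ** (n - 1)) // k ** n

for k, n in [(2, 3), (2, 4), (3, 2)]:
    print(k, n, eulerian_cycles_de_bruijn(k, n), closed_form(n, k))
```

For $k = 2$, $n = 3$ the minor is the matrix $C_{bb}$ of the example (with $bb$ as the last vertex), so both counts equal 2.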

Characterization of Graphs with Eulerian Circuits

Characterization of Graphs with Eulerian Circuits Eulerian Circuits 3. 73 Characterization of Graphs with Eulerian Circuits There is a simple way to determine if a graph has an Eulerian circuit. Theorems 3.. and 3..2: Let G be a pseudograph that is connected

More information

De Bruijn Sequences and De Bruijn Graphs for a general language

De Bruijn Sequences and De Bruijn Graphs for a general language De Bruijn Sequences and De Bruijn Graphs for a general language Eduardo Moreno 1 Institut Gaspard Monge, Université de Marne-la-Vallée, Champs-sur-Marne, 77454 Marne-la-Vallée cedex 2, France. Departamento

More information

Small-Space 2D Compressed Dictionary Matching

Small-Space 2D Compressed Dictionary Matching Shoshana Neuburger and Dina Sokol City University of New York Problem Definition Compressed 2D Dictionary Matching Input: Compressed text of uncompressed size n n. Dictionary containing k compressed patterns.

More information

Graph Algorithms in Bioinformatics

Graph Algorithms in Bioinformatics Graph Algorithms in Bioinformatics Computational Biology IST Ana Teresa Freitas 2015/2016 Sequencing Clone-by-clone shotgun sequencing Human Genome Project Whole-genome shotgun sequencing Celera Genomics

More information

CSCI2950-C Lecture 4 DNA Sequencing and Fragment Assembly

CSCI2950-C Lecture 4 DNA Sequencing and Fragment Assembly CSCI2950-C Lecture 4 DNA Sequencing and Fragment Assembly Ben Raphael Sept. 22, 2009 http://cs.brown.edu/courses/csci2950-c/ l-mer composition Def: Given string s, the Spectrum ( s, l ) is unordered multiset

More information

Reducing Genome Assembly Complexity with Optical Maps

Reducing Genome Assembly Complexity with Optical Maps Reducing Genome Assembly Complexity with Optical Maps Lee Mendelowitz LMendelo@math.umd.edu Advisor: Dr. Mihai Pop Computer Science Department Center for Bioinformatics and Computational Biology mpop@umiacs.umd.edu

More information

Sequencing. Computational Biology IST Ana Teresa Freitas 2011/2012. (BACs) Whole-genome shotgun sequencing Celera Genomics

Sequencing. Computational Biology IST Ana Teresa Freitas 2011/2012. (BACs) Whole-genome shotgun sequencing Celera Genomics Computational Biology IST Ana Teresa Freitas 2011/2012 Sequencing Clone-by-clone shotgun sequencing Human Genome Project Whole-genome shotgun sequencing Celera Genomics (BACs) 1 Must take the fragments

More information

On Universal Cycles of Labeled Graphs

On Universal Cycles of Labeled Graphs On Universal Cycles of Labeled Graphs Greg Brockman Harvard University Cambridge, MA 02138 United States brockman@hcs.harvard.edu Bill Kay University of South Carolina Columbia, SC 29208 United States

More information

02-711/ Computational Genomics and Molecular Biology Fall 2016

02-711/ Computational Genomics and Molecular Biology Fall 2016 Literature assignment 2 Due: Nov. 3 rd, 2016 at 4:00pm Your name: Article: Phillip E C Compeau, Pavel A. Pevzner, Glenn Tesler. How to apply de Bruijn graphs to genome assembly. Nature Biotechnology 29,

More information

CME 305: Discrete Mathematics and Algorithms Instructor: Reza Zadeh HW#3 Due at the beginning of class Thursday 03/02/17

CME 305: Discrete Mathematics and Algorithms Instructor: Reza Zadeh HW#3 Due at the beginning of class Thursday 03/02/17 CME 305: Discrete Mathematics and Algorithms Instructor: Reza Zadeh (rezab@stanford.edu) HW#3 Due at the beginning of class Thursday 03/02/17 1. Consider a model of a nonbipartite undirected graph in which

More information

Eulerian Tours and Fleury s Algorithm

Eulerian Tours and Fleury s Algorithm Eulerian Tours and Fleury s Algorithm CSE21 Winter 2017, Day 12 (B00), Day 8 (A00) February 8, 2017 http://vlsicad.ucsd.edu/courses/cse21-w17 Vocabulary Path (or walk): describes a route from one vertex

More information

DNA Sequencing. Overview

DNA Sequencing. Overview BINF 3350, Genomics and Bioinformatics DNA Sequencing Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Backgrounds Eulerian Cycles Problem Hamiltonian Cycles

More information

Michał Dębski. Uniwersytet Warszawski. On a topological relaxation of a conjecture of Erdős and Nešetřil

Michał Dębski. Uniwersytet Warszawski. On a topological relaxation of a conjecture of Erdős and Nešetřil Michał Dębski Uniwersytet Warszawski On a topological relaxation of a conjecture of Erdős and Nešetřil Praca semestralna nr 3 (semestr letni 2012/13) Opiekun pracy: Tomasz Łuczak On a topological relaxation

More information

Math 778S Spectral Graph Theory Handout #2: Basic graph theory

Math 778S Spectral Graph Theory Handout #2: Basic graph theory Math 778S Spectral Graph Theory Handout #: Basic graph theory Graph theory was founded by the great Swiss mathematician Leonhard Euler (1707-178) after he solved the Königsberg Bridge problem: Is it possible

More information

CME 305: Discrete Mathematics and Algorithms Instructor: Reza Zadeh HW#3 Due at the beginning of class Thursday 02/26/15

CME 305: Discrete Mathematics and Algorithms Instructor: Reza Zadeh HW#3 Due at the beginning of class Thursday 02/26/15 CME 305: Discrete Mathematics and Algorithms Instructor: Reza Zadeh (rezab@stanford.edu) HW#3 Due at the beginning of class Thursday 02/26/15 1. Consider a model of a nonbipartite undirected graph in which

More information

(for more info see:

(for more info see: Genome assembly (for more info see: http://www.cbcb.umd.edu/research/assembly_primer.shtml) Introduction Sequencing technologies can only "read" short fragments from a genome. Reconstructing the entire

More information

MAT 145: PROBLEM SET 4

MAT 145: PROBLEM SET 4 MAT 145: PROBLEM SET 4 DUE TO FRIDAY FEB 22 Abstract. This problem set corresponds to the sixth week of the Combinatorics Course in the Winter Quarter 2019. It was posted online on Friday Feb 15 and is

More information

Path Finding in Graphs. Problem Set #2 will be posted by tonight

Path Finding in Graphs. Problem Set #2 will be posted by tonight Path Finding in Graphs Problem Set #2 will be posted by tonight 1 From Last Time Two graphs representing 5-mers from the sequence "GACGGCGGCGCACGGCGCAA" Hamiltonian Path: Eulerian Path: Each k-mer is a

More information

IDBA A Practical Iterative de Bruijn Graph De Novo Assembler

IDBA A Practical Iterative de Bruijn Graph De Novo Assembler IDBA A Practical Iterative de Bruijn Graph De Novo Assembler Yu Peng, Henry C.M. Leung, S.M. Yiu, and Francis Y.L. Chin Department of Computer Science, The University of Hong Kong Pokfulam Road, Hong Kong

More information

Genome Reconstruction: A Puzzle with a Billion Pieces Phillip E. C. Compeau and Pavel A. Pevzner

Genome Reconstruction: A Puzzle with a Billion Pieces Phillip E. C. Compeau and Pavel A. Pevzner Genome Reconstruction: A Puzzle with a Billion Pieces Phillip E. C. Compeau and Pavel A. Pevzner Outline I. Problem II. Two Historical Detours III.Example IV.The Mathematics of DNA Sequencing V.Complications

More information

K 4 C 5. Figure 4.5: Some well known family of graphs

K 4 C 5. Figure 4.5: Some well known family of graphs 08 CHAPTER. TOPICS IN CLASSICAL GRAPH THEORY K, K K K, K K, K K, K C C C C 6 6 P P P P P. Graph Operations Figure.: Some well known family of graphs A graph Y = (V,E ) is said to be a subgraph of a graph

More information

Counting the number of spanning tree. Pied Piper Department of Computer Science and Engineering Shanghai Jiao Tong University

Counting the number of spanning tree. Pied Piper Department of Computer Science and Engineering Shanghai Jiao Tong University Counting the number of spanning tree Pied Piper Department of Computer Science and Engineering Shanghai Jiao Tong University 目录 Contents 1 Complete Graph 2 Proof of the Lemma 3 Arbitrary Graph 4 Proof

More information

2 Eulerian digraphs and oriented trees.

2 Eulerian digraphs and oriented trees. 2 Eulerian digraphs and oriented trees. A famous problem which goes back to Euler asks for what graphs G is there a closed walk which uses every edge exactly once. (There is also a version for non-closed

More information

IDBA - A Practical Iterative de Bruijn Graph De Novo Assembler

IDBA - A Practical Iterative de Bruijn Graph De Novo Assembler IDBA - A Practical Iterative de Bruijn Graph De Novo Assembler Yu Peng, Henry Leung, S.M. Yiu, Francis Y.L. Chin Department of Computer Science, The University of Hong Kong Pokfulam Road, Hong Kong {ypeng,

More information

The Matrix-Tree Theorem and Its Applications to Complete and Complete Bipartite Graphs

The Matrix-Tree Theorem and Its Applications to Complete and Complete Bipartite Graphs The Matrix-Tree Theorem and Its Applications to Complete and Complete Bipartite Graphs Frankie Smith Nebraska Wesleyan University fsmith@nebrwesleyan.edu May 11, 2015 Abstract We will look at how to represent

More information

Genome Sequencing Algorithms

Genome Sequencing Algorithms Genome Sequencing Algorithms Phillip Compaeu and Pavel Pevzner Bioinformatics Algorithms: an Active Learning Approach Leonhard Euler (1707 1783) William Hamilton (1805 1865) Nicolaas Govert de Bruijn (1918

More information

BLAST & Genome assembly

BLAST & Genome assembly BLAST & Genome assembly Solon P. Pissis Tomáš Flouri Heidelberg Institute for Theoretical Studies May 15, 2014 1 BLAST What is BLAST? The algorithm 2 Genome assembly De novo assembly Mapping assembly 3

More information

Definition: A graph G = (V, E) is called a tree if G is connected and acyclic. The following theorem captures many important facts about trees.

Definition: A graph G = (V, E) is called a tree if G is connected and acyclic. The following theorem captures many important facts about trees. Tree 1. Trees and their Properties. Spanning trees 3. Minimum Spanning Trees 4. Applications of Minimum Spanning Trees 5. Minimum Spanning Tree Algorithms 1.1 Properties of Trees: Definition: A graph G

More information

Proposition 1. The edges of an even graph can be split (partitioned) into cycles, no two of which have an edge in common.

Proposition 1. The edges of an even graph can be split (partitioned) into cycles, no two of which have an edge in common. Math 3116 Dr. Franz Rothe June 5, 2012 08SUM\3116_2012t1.tex Name: Use the back pages for extra space 1 Solution of Test 1.1 Eulerian graphs Proposition 1. The edges of an even graph can be split (partitioned)

More information

Small-Space 2D Compressed Dictionary Matching

Small-Space 2D Compressed Dictionary Matching Small-Space 2D Compressed Dictionary Matching Shoshana Neuburger 1 and Dina Sokol 2 1 Department of Computer Science, The Graduate Center of the City University of New York, New York, NY, 10016 shoshana@sci.brooklyn.cuny.edu

More information

PACKING DIGRAPHS WITH DIRECTED CLOSED TRAILS

PACKING DIGRAPHS WITH DIRECTED CLOSED TRAILS PACKING DIGRAPHS WITH DIRECTED CLOSED TRAILS PAUL BALISTER Abstract It has been shown [Balister, 2001] that if n is odd and m 1,, m t are integers with m i 3 and t i=1 m i = E(K n) then K n can be decomposed

More information

Math 776 Graph Theory Lecture Note 1 Basic concepts

Math 776 Graph Theory Lecture Note 1 Basic concepts Math 776 Graph Theory Lecture Note 1 Basic concepts Lectured by Lincoln Lu Transcribed by Lincoln Lu Graph theory was founded by the great Swiss mathematician Leonhard Euler (1707-178) after he solved

More information

58093 String Processing Algorithms. Lectures, Autumn 2013, period II

58093 String Processing Algorithms. Lectures, Autumn 2013, period II 58093 String Processing Algorithms Lectures, Autumn 2013, period II Juha Kärkkäinen 1 Contents 0. Introduction 1. Sets of strings Search trees, string sorting, binary search 2. Exact string matching Finding

More information

Four-Regular Graphs with Rigid Vertices Associated to DNA Recombination

Four-Regular Graphs with Rigid Vertices Associated to DNA Recombination Four-Regular Graphs with Rigid Vertices Associated to DNA Recombination Jonathan Burns, Egor Dolzhenko, Nataša Jonoska, Tilahun Muche, Masahico Saito Department of Mathematics and Statistics University

More information

Genome Assembly Using de Bruijn Graphs. Biostatistics 666

Genome Assembly Using de Bruijn Graphs. Biostatistics 666 Genome Assembly Using de Bruijn Graphs Biostatistics 666 Previously: Reference Based Analyses Individual short reads are aligned to reference Genotypes generated by examining reads overlapping each position

More information

Definition 2 (Projective plane). A projective plane is a class of points, and a class of lines satisfying the axioms:

Definition 2 (Projective plane). A projective plane is a class of points, and a class of lines satisfying the axioms: Math 3181 Name: Dr. Franz Rothe January 30, 2014 All3181\3181_spr14h2.tex Homework has to be turned in this handout. The homework can be done in groups up to three due February 11/12 2 Homework 1 Definition

More information

DO NOT RE-DISTRIBUTE THIS SOLUTION FILE

DO NOT RE-DISTRIBUTE THIS SOLUTION FILE Professor Kindred Math 104, Graph Theory Homework 2 Solutions February 7, 2013 Introduction to Graph Theory, West Section 1.2: 26, 38, 42 Section 1.3: 14, 18 Section 2.1: 26, 29, 30 DO NOT RE-DISTRIBUTE

More information

Genus Ranges of 4-Regular Rigid Vertex Graphs

Genus Ranges of 4-Regular Rigid Vertex Graphs Genus Ranges of 4-Regular Rigid Vertex Graphs Dorothy Buck Department of Mathematics Imperial College London London, England, UK d.buck@imperial.ac.uk Nataša Jonoska Egor Dolzhenko Molecular and Computational

More information

Problem Set 2 Solutions

Problem Set 2 Solutions Problem Set 2 Solutions Graph Theory 2016 EPFL Frank de Zeeuw & Claudiu Valculescu 1. Prove that the following statements about a graph G are equivalent. - G is a tree; - G is minimally connected (it is

More information

4 Remainder Cordial Labeling of Some Graphs

4 Remainder Cordial Labeling of Some Graphs International J.Math. Combin. Vol.(08), 8-5 Remainder Cordial Labeling of Some Graphs R.Ponraj, K.Annathurai and R.Kala. Department of Mathematics, Sri Paramakalyani College, Alwarkurichi-67, India. Department

More information

spanning trees of tree graphs Philippe Biane, CNRS-IGM-Université Paris-Est Firenze, May

spanning trees of tree graphs Philippe Biane, CNRS-IGM-Université Paris-Est Firenze, May Spanning trees of tree graphs, CNRS-IGM-Université Paris-Est Firenze, May 18 2015 joint work with Guillaume Chapuy, CNRS-LIAFA-Université Paris 7 V, E=directed graph w x e v Q=Laplacian matrix, indexed

More information

Genome 373: Genome Assembly. Doug Fowler

Genome 373: Genome Assembly. Doug Fowler Genome 373: Genome Assembly Doug Fowler What are some of the things we ve seen we can do with HTS data? We ve seen that HTS can enable a wide variety of analyses ranging from ID ing variants to genome-

More information

Graph Theory. Chapter 4.

Graph Theory. Chapter 4. Graph Theory. Chapter 4. Wandering. Here is an algorithm, due to Tarry, that constructs a walk in a connected graph, starting at any vertex v 0, traversing each edge exactly once in each direction, and

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 13. An Introduction to Graphs

Discrete Mathematics for CS Spring 2008 David Wagner Note 13. An Introduction to Graphs CS 70 Discrete Mathematics for CS Spring 2008 David Wagner Note 13 An Introduction to Graphs Formulating a simple, precise specification of a computational problem is often a prerequisite to writing a

More information

Eulerian tours. Russell Impagliazzo and Miles Jones Thanks to Janine Tiefenbruck. April 20, 2016

Eulerian tours. Russell Impagliazzo and Miles Jones Thanks to Janine Tiefenbruck.  April 20, 2016 Eulerian tours Russell Impagliazzo and Miles Jones Thanks to Janine Tiefenbruck http://cseweb.ucsd.edu/classes/sp16/cse21-bd/ April 20, 2016 Seven Bridges of Konigsberg Is there a path that crosses each

More information

Chapter 2. Splitting Operation and n-connected Matroids. 2.1 Introduction

Chapter 2. Splitting Operation and n-connected Matroids. 2.1 Introduction Chapter 2 Splitting Operation and n-connected Matroids The splitting operation on an n-connected binary matroid may not yield an n-connected binary matroid. In this chapter, we provide a necessary and

More information

Graph Algorithms Using Depth First Search

Graph Algorithms Using Depth First Search Graph Algorithms Using Depth First Search Analysis of Algorithms Week 8, Lecture 1 Prepared by John Reif, Ph.D. Distinguished Professor of Computer Science Duke University Graph Algorithms Using Depth

More information

Generating (n,2) De Bruijn Sequences with Some Balance and Uniformity Properties. Abstract

Generating (n,2) De Bruijn Sequences with Some Balance and Uniformity Properties. Abstract Generating (n,) De Bruijn Sequences with Some Balance and Uniformity Properties Yi-Chih Hsieh, Han-Suk Sohn, and Dennis L. Bricker Department of Industrial Management, National Huwei Institute of Technology,

More information

Sequence Assembly Required!

Sequence Assembly Required! Sequence Assembly Required! 1 October 3, ISMB 20172007 1 Sequence Assembly Genome Sequenced Fragments (reads) Assembled Contigs Finished Genome 2 Greedy solution is bounded 3 Typical assembly strategy

More information

FastA & the chaining problem

FastA & the chaining problem FastA & the chaining problem We will discuss: Heuristics used by the FastA program for sequence alignment Chaining problem 1 Sources for this lecture: Lectures by Volker Heun, Daniel Huson and Knut Reinert,

More information

FastA and the chaining problem, Gunnar Klau, December 1, 2005, 10:

FastA and the chaining problem, Gunnar Klau, December 1, 2005, 10: FastA and the chaining problem, Gunnar Klau, December 1, 2005, 10:56 4001 4 FastA and the chaining problem We will discuss: Heuristics used by the FastA program for sequence alignment Chaining problem

More information

1 Random Walks on Graphs

1 Random Walks on Graphs Lecture 7 Com S 633: Randomness in Computation Scribe: Ankit Agrawal In the last lecture, we looked at random walks on line and used them to devise randomized algorithms for 2-SAT and 3-SAT For 2-SAT we

More information

Jo Ellis-Monaghan * St. Michaels College, Colchester, VT Irasema Sarmiento CINVESTAV, Mexico

Jo Ellis-Monaghan * St. Michaels College, Colchester, VT Irasema Sarmiento CINVESTAV, Mexico Jo Ellis-Monaghan * St. Michaels College, Colchester, VT 05439 Irasema Sarmiento CINVESTAV, Mexico e-mail: jellis-monaghan@smcvt.edu, website: http://academics.smcvt.edu/jellis-monaghan The motivating

More information

Counting the Number of Fixed Points in the Phase Space of Circ n

Counting the Number of Fixed Points in the Phase Space of Circ n Counting the Number of Fixed Points in the Phase Space of Circ n Adam Reevesman February 24, 2015 This paper will discuss a method for counting the number of fixed points in the phase space of the Circ

More information

Algorithms for Bioinformatics

Algorithms for Bioinformatics Adapted from slides by Alexandru Tomescu, Leena Salmela and Veli Mäkinen, which are partly from http://bix.ucsd.edu/bioalgorithms/slides.php 582670 Algorithms for Bioinformatics Lecture 3: Graph Algorithms

More information

Math Summer 2012

Math Summer 2012 Math 481 - Summer 2012 Final Exam You have one hour and fifty minutes to complete this exam. You are not allowed to use any electronic device. Be sure to give reasonable justification to all your answers.

More information

Encoding/Decoding, Counting graphs

Encoding/Decoding, Counting graphs Encoding/Decoding, Counting graphs Russell Impagliazzo and Miles Jones Thanks to Janine Tiefenbruck http://cseweb.ucsd.edu/classes/sp16/cse21-bd/ May 13, 2016 11-avoiding binary strings Let s consider

More information

Mathematics and Statistics, Part A: Graph Theory Problem Sheet 1, lectures 1-4

Mathematics and Statistics, Part A: Graph Theory Problem Sheet 1, lectures 1-4 1. Draw Mathematics and Statistics, Part A: Graph Theory Problem Sheet 1, lectures 1-4 (i) a simple graph. A simple graph has a non-empty vertex set and no duplicated edges. For example sketch G with V

More information

Math 485, Graph Theory: Homework #3

Math 485, Graph Theory: Homework #3 Math 485, Graph Theory: Homework #3 Stephen G Simpson Due Monday, October 26, 2009 The assignment consists of Exercises 2129, 2135, 2137, 2218, 238, 2310, 2313, 2314, 2315 in the West textbook, plus the

More information

On Modularity Clustering. Group III (Ying Xuan, Swati Gambhir & Ravi Tiwari)

On Modularity Clustering. Group III (Ying Xuan, Swati Gambhir & Ravi Tiwari) On Modularity Clustering Presented by: Presented by: Group III (Ying Xuan, Swati Gambhir & Ravi Tiwari) Modularity A quality index for clustering a graph G=(V,E) G=(VE) q( C): EC ( ) EC ( ) + ECC (, ')

More information

Reducing Genome Assembly Complexity with Optical Maps Mid-year Progress Report

Reducing Genome Assembly Complexity with Optical Maps Mid-year Progress Report Reducing Genome Assembly Complexity with Optical Maps Mid-year Progress Report Lee Mendelowitz LMendelo@math.umd.edu Advisor: Dr. Mihai Pop Computer Science Department Center for Bioinformatics and Computational

More information

Exercise set 2 Solutions

Exercise set 2 Solutions Exercise set 2 Solutions Let H and H be the two components of T e and let F E(T ) consist of the edges of T with one endpoint in V (H), the other in V (H ) Since T is connected, F Furthermore, since T

More information

Structural and spectral properties of minimal strong digraphs

Structural and spectral properties of minimal strong digraphs Structural and spectral properties of minimal strong digraphs C. Marijuán J. García-López, L.M. Pozo-Coronado Abstract In this article, we focus on structural and spectral properties of minimal strong

More information

Graph Algorithms. Tours in Graphs. Graph Algorithms

Graph Algorithms. Tours in Graphs. Graph Algorithms Graph Algorithms Tours in Graphs Graph Algorithms Special Paths and Cycles in Graphs Euler Path: A path that traverses all the edges of the graph exactly once. Euler Cycle: A cycle that traverses all the

More information

Graphs and Puzzles. Eulerian and Hamiltonian Tours.

Graphs and Puzzles. Eulerian and Hamiltonian Tours. Graphs and Puzzles. Eulerian and Hamiltonian Tours. CSE21 Winter 2017, Day 11 (B00), Day 7 (A00) February 3, 2017 http://vlsicad.ucsd.edu/courses/cse21-w17 Exam Announcements Seating Chart on Website Good

More information

2 Solution of Homework

2 Solution of Homework Math 3181 Name: Dr. Franz Rothe February 6, 2014 All3181\3181_spr14h2.tex Homework has to be turned in this handout. The homework can be done in groups up to three due February 11/12 2 Solution of Homework

More information

RESEARCH TOPIC IN BIOINFORMANTIC

RESEARCH TOPIC IN BIOINFORMANTIC RESEARCH TOPIC IN BIOINFORMANTIC GENOME ASSEMBLY Instructor: Dr. Yufeng Wu Noted by: February 25, 2012 Genome Assembly is a kind of string sequencing problems. As we all know, the human genome is very

More information

Discrete mathematics , Fall Instructor: prof. János Pach

Discrete mathematics , Fall Instructor: prof. János Pach Discrete mathematics 2016-2017, Fall Instructor: prof. János Pach - covered material - Lecture 1. Counting problems To read: [Lov]: 1.2. Sets, 1.3. Number of subsets, 1.5. Sequences, 1.6. Permutations,

More information

Combinatorics MAP363 Sheet 1. Mark: /10. Name. Number. Hand in by 19th February. date marked: / /2007
