From [4] D. Gusfield. Algorithms on Strings, Trees and Sequences. Cambridge University Press, New York, 1997.
|
|
- Robert Watson
- 5 years ago
- Views:
Transcription
1 Multiple Alignment From [4] D. Gusfield. Algorithms on Strings, Trees and Sequences. Cambridge University Press, New York, Multiple Sequence Alignment Introduction Very active research area Very important: extracting and representing biologically important similarities from a set of strings Algorithms for three basic variants of the problem Profile representation Consensus Sequence representation Signature representation 2 1
2 Multiple Sequence Alignment Shows multiple similarities: Common structure of protein product Common function Common evolutionary process Family of proteins => more information than single sequences Characterization: signatures of protein families 3 Multiple Sequence Alignment Characterization: signatures of protein families Identification of potential members of these families Similarities may be very faint and dispersed become more obvious in a comparison of related sequences Can be used to do better searches: The inverse problem of 2-string alignment: Goal is to deduce unknown conserved sub patterns from a set of strings that is known to be biologically related. Compare to 2-string alignment Common sub patterns Matching strings that may not have been known to be biologically related 4 2
3 Biological Background: Transcription Gene Expression from DNA to Protein Translation Transcription 1 st Fact of Biological Sequence Analysis [4]: Sequence similarity usually implies significant structural or functional similarity - 2-sequence comparison - Biological database search - BUT: the reverse is not true! 5 Sequence similarity 2 nd Fact of Biological Sequence Analysis [4]: Evolutionary and functionally related molecular strings can differ significantly throughout much of the string and yet preserve: 1. the same three dimensional structures 2. the same two-dimensional substructures (motifs, domains) 3. the same active sites 4. the same or related dispersed residues (DNA or amino acid) 6 3
4 Evolutionary Conservation Evolutionary preserved features: 3-dimensional structures (well preserved) 2-dimensional substructures (well preserved) active sites: functions (less common) Amino acid sequence (least common) Three biological uses: Representation of protein families Identification of conserved sequence features correlating with structure and function Deduction of evolutionary history 7 Example: Hemoglobin Hemoglobin: An almost universal protein found in insects, mammals, etc. 4 chains of ~140 amino acids Functions the same in both insects as well as mammals: binds and transports oxygen Insects and mammals diverged ~600 million years ago => On average 100 amino acids mutations per chain Pair wise alignment: human - chimpanzee equal Mammal - mammal suggests functional similarity Insect-mammal very little similarity! Secondary and 3-dimensional structure well preserved 8 4
5 human beta globin horse beta globin human alpha globin horse alpha globin cyano haemoglobin whale myoglobin leghaemoglobin Multiple Alignment α-helices Alignments of globins produced by Clustal. 9 From: Protein Data Bank 10 5
6 aligning several sequences acidic ribosomal protein P0 homolog (L10E) encoded by the Rplp0 gene 11 Multiple Alignment Examples Related sequences have so few conserved or dispersed matching amino acids: statistically indistinguishable from the best alignment of 2 random strings. For example this is true for: Hemoglobin, immunoglobulin (antibody proteins) E-Cadherin (adhesion molecule) Compare to: 1 DNA nucleotide change: Sickle cell anemia 12 6
7 Definition (Global) Multiple Alignment A multiple alignment of a set of strings {S 1,S 2,,S k } is a series of strings S`1,S`2,,S`k such that 1. S`1 = S`2 = = S`k (all S`i have the same length) 2. For every i: S`i is an extension of S i obtained by insertion of spaces. A multiple alignment of strings ACBCBD, CADDB, and ACABCD: Note: similarly, local multiple alignment (fro substrings) can be defined. 13 Multiple Alignment Definition Induced Pair-Wise Alignment: Given a multiple alignment M, the induced pair-wise alignment of two string S i and S is obtained from M by removing all rows except rows i an. Note: two opposing spaces can be removed Definition Score of Induced Pair-Wise Alignment: The score of an induced pair-wise alignment is determined using any chosen scoring scheme for 2-string alignment in the standard manner. 14 7
8 Multiple Alignment: Sum-of-Pairs Definition: Sum of Pairs Score The sum-of-pairs (SP) score of a multiple alignment M is the sum of the scores of pair-wise global alignments induced by M. Problem: The SP Alignment Problem Compute a global multiple alignment M with minimum sum-of-pairs score. Note: Intuitively this is a reasonable score, but: no theoretical ustification! 15 An Exact Solution of the SP Alignment Problem A Dynamic Programming Approach The best multiple alignment of r sequences is calculated using an r-dimensional hypercube D, where: Each dimension is formed by one of the r sequences. The nodes ( 1, 2,, r ) of the hypercube hold the best score D( 1, 2,, r ) for aligning the prefixes of the sequences x 1, x 2,, x r of lengths 1, 2,, r, respectively. 16 8
9 Multiple Alignment: Dynamic Programming The best score D can be calculated using dynamic programming on the following recursion: D( 0,0,...,0) 0 Where D(, 2,..., r ) min { D( 1 1, 2 2,..., ) ( 1,,..., )} r r r x 1 2x 2 rx 1 r {0,1}, 0 with ρ the cost function (for example Sum of Pairs (SP), i.e. pairwise summation of the induced alignment at that position), and r ( 1, 2,..., r ) {0,1 } Note: There is no generally accepted score for a multiple alignment. We will discuss some candidates. Biological relevance of the multiple alignment is of maor importance. 17 Multiple Alignment: Dynamic Programming (4,3,2) ε =(1,0,1) (3,3,1) (3,2,0) ε =(0,1,1) (1,0,0) (2,1,0) ε =(1,1,0) ε = (1,1,0) Space complexity O(n r ).(O(r-dimensional cost function ρ calculation)); Time complexity O(2 r.n r ) (consider 2 r previously calculated values) 18 9
10 Exact Multiple Alignment Complexity The Exact Multiple Alignment problem solved by Dynamic Programming has: Space complexity O(n r ).(O(ρ calculation)) Time complexity O(2 r.n r ) This exact solution using dynamic programming is only useful for a small number of strings, i.e., a small r. In general: The exact multiple alignment problem using sum-of-pairs or evolutionary-tree scoring is NP-complete [9]. 19 Scoring Metrics Possible ρ functions: Distance from Consensus Let C be the consensus sequence, then the total distance is defined as: i {1,..., r} Evolutionary Tree Alignment Define D(S v,s w ) as the number of changes between S v and S w, then the weight of an evolutionary tree is defined as: Then the weight of the lightest evolutionary tree that can be constructed from the sequences is taken. Sum of Pairs Defined as the sum of pairwise distances between all pairs of sequences: D( S i, C) {1,..., r}, i D( S i, S ) ( v, w) T D( S v, S w ) 20 10
11 Faster Dynamic Programming A heuristic by Carrillo and Lipman, 1988 [1] Implemented in MSA, Lipman et al., 1989 [6] Idea: For similar strings the alignment path is close to the diagonal (of the hypercube). Upper bound on best alignment-cost enables pruning of the alignments that are a priori too costly 21 Implementation of the MSA Heuristic Let A an alignment of sequences x 1,x 2,,x r. A the pair of rows x i and x from A. c(a ) the induced cost of the pair wise alignment of sequences x i and x. (alignment of space against space is ignored, cost function D), then the total cost of A, c(a) can be defined as: c( A) c( Assume known that for an optimal alignment A * : c(a * ) c, then: * * * * c' c( A ) c( A ) c( A ) c( A ) Hence: c( A c( A ) {1,..., r}, i A i, ) {1,..., r}, i D( x, x * u, v i ) {1,..., r}, i,( ) ( u, v) * u, v ) c' D( x x ) {1,..., r}, i,( ) ( u, v) u, v {1,..., r}, i,( ) ( u, v) optimal pair-wise alignment 22 11
12 Implementation of the MSA Heuristic Assume known that for an optimal alignment A * : c(a * ) c, then: * c( A ) c' D( x, x u, v i ) {1,..., r}, i,( ) ( u, v) Now define: B( u, v) c' D( x i, x ) {1,..., r}, i,( ) ( u, v) Note that given c` we can calculate B(u,v) by calculating optimal pair-wise alignment cost D(x i, x ) for each i and. 23 Implementation of the MSA Heuristic Now consider a node from the hyper-cube: q = (i 1,i 2,,i u =s,,i v =t,,i r ) then clearly (s,t) is the proection of q onto the (u,v)-plane If the best alignment A * goes through q then the proection of the A*-path to the (u,v)-plane, A* u,v, goes through (s,t). Now let best(u,v) s,t an upper bound on the optimal score for an alignment through (s,t) in the (u,v)-plane, then: best(u,v) s,t c(a* u,v ) B(u,v) B( u, v) c' D( x i, x ) Let d(k 1,k 2 ) the cost of matching characters k 1 and k 2 then: best(u,v) s,t = D(x u,1,,x u,i-1, x v,1,,x v,-1 ) + d(x u,i, x v, ) + D(x u,i+1,,x u,n(u), x v,+1,,x v,n(v) ) Can be computed by forward dynamic programming, keeping a queue of nodes with uncompleted D-values. {1,..., r}, i,( ) ( u, v) 24 12
13 Multiple Sequence Alignment Approximations Sketch computation by forward dynamic programming while keeping a queue of nodes with uncompleted D-values: The algorithm will set the D value of the cell w at the head of the queue and remove that cell. It also updates the shortest distance from cell (0,..., 0) to all neighbors of w, whose D value can be influenced by w. If any of the neighbors is not yet in the queue, it is added to the end of the queue. If at some point best(u,v) s,t > B(u, v), then it is clear that the best alignment A* cannot pass through node (i 1, i 2,..., i u = s,..., i v = t,..., i r ) for any i 1, i 2,..., i u 1, i u+1,..., i v 1, i v+1,..., i r => These cells can be discarded from the computation Note: the bound c is found by using heuristics giving promising solutions first. In practice, MSA [6] can align 6 sequences of length The Center Star Method for (SP) Alignment. An approximation algorithm for the optimal alignment under the Sum of Pairs metric [4, pp ] achieving an approximation ratio of 2. Sum of Pairs metric Defined as the sum of induced pair-wise distances between all pairs of sequences: D( S i, S ) Where D(S,T) the best score of aligning S with T, using a scoring function d(x,y) defined by: d(-,-)=0 d(x,y) = d(y,x) d(x,y) d(x,z) + d(z,y) (triangle inequality) {1,..., r}, i 26 13
14 star alignment - We put a string in center (which one?) - Compute pair-wise alignments - Extend pairs into multiple alignment - Where once a gap, always a gap Complexity: O(k 2 n 2 ) k sequences, n length Theory proves: within factor 2 of optimal alignment 27 Star Alignment Problem: The Sum of Pairs alignment problem Input: A set of sequences {S 1,S 2,,S k } Question: Compute a global multiple alignment M with minimum sum-of-pairs score. Definition: Given a set of k strings S, define a center string S c (element of S) as a string that minimizes: Definition: Define the center star to be a star tree of k nodes, with the center node S c and with each of the remaining k-1 nodes labeled by a distinct string from S\{S c }. S S D ( S, S ) c Definition: Define the multiple alignment M c of the set of strings S to be the multiple alignment consistent with the center star
15 The Center Star Method for (SP) Alignment. S 4 S 5 S 2 S 3 S 6 S 1 A generic center start for 6 strings with center string S C = S 3. [4, p349]. 29 The Center Star Algorithm Star Alignment 1. Find S t from S minimizing and let M={S t }. 2. Add the sequences in S\{S t } to M one by one such that the alignment of every newly added sequence with S t is optimal. Add spaces when needed, to all pre-aligned sequences. it D ( S i, S t ) 30 15
16 star alignment 1 ATTGCCATT 2 ATGGCCATT 3 ATCCAATTTT 4 ATCTTCTT 5 ACTGACC 1 ATTGCCATT 2 ATGGCCATT 1 ATTGCCATT-- 3 ATC-CAATTTT 1 ATTGCCATT 4 ATCTTC-TT 1 ATTGCCATT 5 ACTGACC-- <= center 1 ATTGCCATT-- 2 ATGGCCATT-- 3 ATC-CAATTTT 1 ATTGCCATT-- 2 ATGGCCATT-- 3 ATC-CAATTTT 4 ATCTTC-TT-- 1 ATTGCCATT-- 2 ATGGCCATT-- 3 ATC-CAATTTT 4 ATCTTC-TT-- 5 ACTGACC The Center Star Algorithm as D(x,y) is the optimal distance between x and y Multiple alignment of the center start produces a multiple alignment with a value at most 2(k-1)/k times the optimal value
17 The Center Star Algorithm (S 1 minimizes this sum) any other S i does worse * Triangle inequality: D(S i,s ) D(S i,s 1 ) + D(S 1,S ) 33 Multiple Alignment with Consensus Consensus string S* does not need to be An element of the set of strings S
18 Multiple Alignment with Consensus Steiner string capture common characteristics of the set S. S* not necessarily element of S. Approximation algorithm with worst case approximation ratio Multiple Alignment with Consensus Multiple alignment with consensus string A B C D E D 36 18
19 Multiple Alignment: Consensus Strings S * the Steiner String that minimizes the consensus error. Lemma 4.2 shows that actually you can use a string from the set S as a consensus string as an approximation. S c of the previous algorithm will do. 37 Multiple Alignment: Consensus Strings Triangle inequality Furthermore: 38 19
20 Consensus Strings from Multiple Alignment Column (character) wise consensus. ( See Theorem 4.3 ) 39 Common Alignment Methods Aligning a String to a Profile Iterative Pair-wise Alignment Progressive Alignment Feng-Doolittle (1987)[2] CLUSTALW 40 20
21 Example: Signature Profiles Helicases protein to unwound DNA for further read for duplication, transcription, recombination or repair. Werner s syndrome an aging disease is believed to be due to a gene that codes for a helicase protein. A Signature Profile for Helicases Conserved sequence signatures or motifs Some of these motifs are unique identifiers for helicases Maybe functional units 41 Multiple Alignment Profile Multiple Alignment Profile: Character frequencies given per column p i (a) is the fraction of a s in column i p(a) is the fraction of a s overall Log likelihood ratio log( p i (a)/p(a) ) can be used
22 Aligning a String to a Profile consensus Profile for Classical Chromo Domains [10]. 43 Iterative Pair-wise Alignment Algorithm 1. Align some pair 2. While (not done) (a) Pick an unaligned string which is near some aligned one(s). (b) Align with the profile of the previously aligned group. Resulting new spaces are inserted into all strings in the group. This approach uses pair-wise alignment scores to iteratively add one additional string to a growing multiple alignment. 1. We start by aligning the two strings whose edit distance is the minimum over all pairs of strings. 2. Then we iteratively consider the string with the smallest distance to any of the strings already in the multiple alignment
23 Progressive Alignment Feng-Doolittle (1987) CLUSTALW 45 Feng-Doolittle (1987)[2] Feng-Doolittle Algorithm (sketch): 1. Calculate the pairwise alignment scores, and convert them to distances. 2. Use an incremental clustering algorithm [3] to construct a tree from the distances. 3. Traverse the nodes in their order of addition to the tree, progressively aligning the sequences. This way, the most similar pair is aligned first, followed by the addition of the next most similar sequence or set of sequences. Features of the Feng-Doolittle Heuristic: The order of aligning of sequences, or sets of sequences, is determined by their highest scoring pairwise alignment. Once a gap, always a gap 46 23
24 CLUSTALW ClustalW is a software package for multiple alignment Implemention of an algorithm of Thompson, Higgins, Gibson 1994 [8]. The basic idea is the same as in the Feng-Doolittle algorithm: k 1. Calculate the pairwise alignment scores, and convert them to 2 distances. 2. Use a neighbor-oining algorithm to build a tree from the distances. 3. Align sequence - sequence, sequence - profile, profile - profile in decreasing similarity order. This algorithm makes use of many ad-hoc rules such as weighting, different matrix scores and special gap scores. 47 Alignment Tree by CLUSTALW 48 24
25 Multiple Sequence Alignment: HMM known Multiple Sequence Alignment Problem: Given sequence S 1,, S n, how can they be optimally aligned? Assume a profile HMM P is known, then: - Align each sequence S i to the profile separately - Accumulate the obtained alignments to a multiple alignment - Hereby insert runs are not aligned. (Just leftustify insert regions.) 49 Multiple Sequence Alignment: HMM unknown Multiple Sequence Alignment Problem: Given sequence S 1,, S n, how can they be optimally aligned? Assume a profile HMM P is not known, then obtain an HMM profile P from S 1,, S n as follows: - Choose a length L for the profile HMM and intialize the transition and emission probabilities. - Train the HMM using Baum- Welch on all sequences S 1,, S n. Now obtain the multiple alignment using this HMM P as in the previous case: - Align each sequence S i to the profile separately - Accumulate the obtained alignments to a multiple alignment - Hereby insert runs are not aligned. (Just left-ustify insert regions.) 50 25
26 Bibliography [1] H. Carrillo and D. Lipmann. The multiple sequence alignment problem in biology. SIAM J. Appl. Math, 48: , [2] D. Feng and R. F. Doolittle. Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J. Mol. Evol., 25: , [3] W. M. Fitch and E. Margoliash. Construction of phylogenetic trees. science, 15: , [4] D. Gusfield. Algorithms on Strings, Trees and Sequences. Cambridge University Press, New York, [5] T. Jiang, L. Wang, and E. L. Lawler. Approximation algorithms for tree alignment with a given phylogeny. Algorithmica, 16: , [6] D. J. Lipman, S. Altshul, and J. Kececiogly. A tool for multiple sequence alignment. Proc. Natl. Academy Science, 86: , [7] M. Murata, J.S. Richardson, and J.L. Sussman. Three protein alignment. Medical Information Sciences, 231:9, [8] J. D. Thompson, D. G. Higgins, and T. J. Gibson. Clustal w: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res, 22: , [9] L. Wang and T. Jiang. On the complexity of multiple sequence alignment. Computational Biology, 1: , [10] [11] [12] L. Sterck, ProCoGen Training Workshop, Umea, Hidden Markov Models Profiles 52 26
27 model structure A G C T many states & fully connected training seldom works because of local maxima use knowledge about the problem e.g. linear model for profile alignment 53 silent states skipping nodes [square states: emitting] silent states (no emission) linear size vs. quadratic size but less modeling possibilities high/low 54 27
28 Forward Algorithm: silent states: Forward Algorithm q as before transition / emission q for silent states no silent loops (!): update in topological order 55 profile alignment (1) begin M 2 end Case 1: ungapped transition prob s 1 trivial alignment of the HMM to sequence VGAHAGEY VTGNVDEV VEADVAGH VKSNDVAD VYSTVETS FNANIPKH IAGNGAGV profile HMM P has a dedicated topology e i (b) = probability of observing symbol b at pos i 56 28
29 profile alignment (2) Case 2: with unmatched inserts I insert state M M +1 match states VGA--HAGEY VNA--NVDEV VEA--DVAGH VKG--NYDED VYS--TYETS FNA--NIPKH IAGADNGAGV emission: background probabilities or based on alignment affine gap model open gap extension 57 profile alignment (3) Case 3: with unmatched inserts and deletions VGA--HAGEY V----NVDEV VEA--DVAGH VKG------D VYS--TYETS FNA--NIPKH IAGADNGAGV I M M +1 D -1 D insert state match states delete state (silent) M -1 M M +1 adapt Viterbi => 58 29
30 HMM for profiles / multiple alignment Deletion (D) Insertion (I) D I Viterbi Match (M) begin M end M Y v ( i) em ( x i ). max v 1( i 1). ty Y M, I, D 1 I Y v ( i) p( xi ). max v ( i 1). ty I Y M, I, D 1 D Y v ( i) max v 1( i). ty D Y M, I, D 1 M same level same position 59 given multiple alignment Insertion / Deletion states profile alignment M i : VGA--HAGEY V----NVDEV VEA--DVAGH VKG------D VYS--TYETS FNA--NIPKH IAGADNGAGV counting transitions M 1 M / 10 M 1 I / 10 M 1 D / 10 emissions F 1+1 2/ 27 I 1+1 2/ 27 V 5+1 6/ 27 other 17x 0+1 1/ 27 Laplace correction, i.e., adding 1 for each frequency to avoid dividing by
Multiple Alignment. From [4] D. Gusfield. Algorithms on Strings, Trees and Sequences. Cambridge University Press, New York, 1997.
Multiple Alignment From [4] D. Gusfield. Algorithms on Strings, Trees and Sequences. Cambridge University Press, New York, 1997. 1 Multiple Sequence Alignment Introduction Very active research area Very
More informationSpecial course in Computer Science: Advanced Text Algorithms
Special course in Computer Science: Advanced Text Algorithms Lecture 8: Multiple alignments Elena Czeizler and Ion Petre Department of IT, Abo Akademi Computational Biomodelling Laboratory http://www.users.abo.fi/ipetre/textalg
More informationMultiple Sequence Alignment: Multidimensional. Biological Motivation
Multiple Sequence Alignment: Multidimensional Dynamic Programming Boston University Biological Motivation Compare a new sequence with the sequences in a protein family. Proteins can be categorized into
More informationCISC 889 Bioinformatics (Spring 2003) Multiple Sequence Alignment
CISC 889 Bioinformatics (Spring 2003) Multiple Sequence Alignment Courtesy of jalview 1 Motivations Collective statistic Protein families Identification and representation of conserved sequence features
More informationProfiles and Multiple Alignments. COMP 571 Luay Nakhleh, Rice University
Profiles and Multiple Alignments COMP 571 Luay Nakhleh, Rice University Outline Profiles and sequence logos Profile hidden Markov models Aligning profiles Multiple sequence alignment by gradual sequence
More informationGenome 559: Introduction to Statistical and Computational Genomics. Lecture15a Multiple Sequence Alignment Larry Ruzzo
Genome 559: Introduction to Statistical and Computational Genomics Lecture15a Multiple Sequence Alignment Larry Ruzzo 1 Multiple Alignment: Motivations Common structure, function, or origin may be only
More informationStephen Scott.
1 / 33 sscott@cse.unl.edu 2 / 33 Start with a set of sequences In each column, residues are homolgous Residues occupy similar positions in 3D structure Residues diverge from a common ancestral residue
More informationPROTEIN MULTIPLE ALIGNMENT MOTIVATION: BACKGROUND: Marina Sirota
Marina Sirota MOTIVATION: PROTEIN MULTIPLE ALIGNMENT To study evolution on the genetic level across a wide range of organisms, biologists need accurate tools for multiple sequence alignment of protein
More informationHidden Markov Models Review and Applications. hidden Markov model. what we see model M = (,Q,T) states Q transition probabilities e Ax
Hidden Markov Models Review and Applications 1 hidden Markov model what we see x y model M = (,Q,T) states Q transition probabilities e Ax t AA e Ay observation observe states indirectly emission probabilities
More information3.4 Multiple sequence alignment
3.4 Multiple sequence alignment Why produce a multiple sequence alignment? Using more than two sequences results in a more convincing alignment by revealing conserved regions in ALL of the sequences Aligned
More informationMultiple Sequence Alignment. Mark Whitsitt - NCSA
Multiple Sequence Alignment Mark Whitsitt - NCSA What is a Multiple Sequence Alignment (MA)? GMHGTVYANYAVDSSDLLLAFGVRFDDRVTGKLEAFASRAKIVHIDIDSAEIGKNKQPHV GMHGTVYANYAVEHSDLLLAFGVRFDDRVTGKLEAFASRAKIVHIDIDSAEIGKNKTPHV
More informationGLOBEX Bioinformatics (Summer 2015) Multiple Sequence Alignment
GLOBEX Bioinformatics (Summer 2015) Multiple Sequence Alignment Scoring Dynamic Programming algorithms Heuristic algorithms CLUSTAL W Courtesy of jalview Motivations Collective (or aggregate) statistic
More informationMultiple sequence alignment. November 20, 2018
Multiple sequence alignment November 20, 2018 Why do multiple alignment? Gain insight into evolutionary history Can assess time of divergence by looking at the number of mutations needed to change one
More informationComputational Genomics and Molecular Biology, Fall
Computational Genomics and Molecular Biology, Fall 2015 1 Sequence Alignment Dannie Durand Pairwise Sequence Alignment The goal of pairwise sequence alignment is to establish a correspondence between the
More informationMultiple Sequence Alignment (MSA)
I519 Introduction to Bioinformatics, Fall 2013 Multiple Sequence Alignment (MSA) Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Outline Multiple sequence alignment (MSA) Generalize
More informationEECS730: Introduction to Bioinformatics
EECS730: Introduction to Bioinformatics Lecture 06: Multiple Sequence Alignment https://upload.wikimedia.org/wikipedia/commons/thumb/7/79/rplp0_90_clustalw_aln.gif/575px-rplp0_90_clustalw_aln.gif Slides
More informationMultiple Sequence Alignment Sum-of-Pairs and ClustalW. Ulf Leser
Multiple Sequence Alignment Sum-of-Pairs and ClustalW Ulf Leser This Lecture Multiple Sequence Alignment The problem Theoretical approach: Sum-of-Pairs scores Practical approach: ClustalW Ulf Leser: Bioinformatics,
More informationChapter 6. Multiple sequence alignment (week 10)
Course organization Introduction ( Week 1,2) Part I: Algorithms for Sequence Analysis (Week 1-11) Chapter 1-3, Models and theories» Probability theory and Statistics (Week 3)» Algorithm complexity analysis
More informationRecent Research Results. Evolutionary Trees Distance Methods
Recent Research Results Evolutionary Trees Distance Methods Indo-European Languages After Tandy Warnow What is the purpose? Understand evolutionary history (relationship between species). Uderstand how
More informationLecture 5: Multiple sequence alignment
Lecture 5: Multiple sequence alignment Introduction to Computational Biology Teresa Przytycka, PhD (with some additions by Martin Vingron) Why do we need multiple sequence alignment Pairwise sequence alignment
More informationSpecial course in Computer Science: Advanced Text Algorithms
Special course in Computer Science: Advanced Text Algorithms Lecture 6: Alignments Elena Czeizler and Ion Petre Department of IT, Abo Akademi Computational Biomodelling Laboratory http://www.users.abo.fi/ipetre/textalg
More informationHIDDEN MARKOV MODELS AND SEQUENCE ALIGNMENT
HIDDEN MARKOV MODELS AND SEQUENCE ALIGNMENT - Swarbhanu Chatterjee. Hidden Markov models are a sophisticated and flexible statistical tool for the study of protein models. Using HMMs to analyze proteins
More informationLecture 5: Markov models
Master s course Bioinformatics Data Analysis and Tools Lecture 5: Markov models Centre for Integrative Bioinformatics Problem in biology Data and patterns are often not clear cut When we want to make a
More informationPrinciples of Bioinformatics. BIO540/STA569/CSI660 Fall 2010
Principles of Bioinformatics BIO540/STA569/CSI660 Fall 2010 Lecture 11 Multiple Sequence Alignment I Administrivia Administrivia The midterm examination will be Monday, October 18 th, in class. Closed
More informationMachine Learning. Computational biology: Sequence alignment and profile HMMs
10-601 Machine Learning Computational biology: Sequence alignment and profile HMMs Central dogma DNA CCTGAGCCAACTATTGATGAA transcription mrna CCUGAGCCAACUAUUGAUGAA translation Protein PEPTIDE 2 Growth
More informationDynamic Programming in 3-D Progressive Alignment Profile Progressive Alignment (ClustalW) Scoring Multiple Alignments Entropy Sum of Pairs Alignment
Dynamic Programming in 3-D Progressive Alignment Profile Progressive Alignment (ClustalW) Scoring Multiple Alignments Entropy Sum of Pairs Alignment Partial Order Alignment (POA) A-Bruijin (ABA) Approach
More informationMULTIPLE SEQUENCE ALIGNMENT
MULTIPLE SEQUENCE ALIGNMENT Multiple Alignment versus Pairwise Alignment Up until now we have only tried to align two sequences. What about more than two? A faint similarity between two sequences becomes
More informationBiology 644: Bioinformatics
A statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states in the training data. First used in speech and handwriting recognition In
More information15-780: Graduate Artificial Intelligence. Computational biology: Sequence alignment and profile HMMs
5-78: Graduate rtificial Intelligence omputational biology: Sequence alignment and profile HMMs entral dogma DN GGGG transcription mrn UGGUUUGUG translation Protein PEPIDE 2 omparison of Different Organisms
More informationCSE 549: Computational Biology
CSE 549: Computational Biology Phylogenomics 1 slides marked with * by Carl Kingsford Tree of Life 2 * H5N1 Influenza Strains Salzberg, Kingsford, et al., 2007 3 * H5N1 Influenza Strains The 2007 outbreak
More informationMultiple Sequence Alignment II
Multiple Sequence Alignment II Lectures 20 Dec 5, 2011 CSE 527 Computational Biology, Fall 2011 Instructor: Su-In Lee TA: Christopher Miles Monday & Wednesday 12:00-1:20 Johnson Hall (JHN) 022 1 Outline
More informationComputational Molecular Biology
Computational Molecular Biology Erwin M. Bakker Lecture 3, mainly from material by R. Shamir [2] and H.J. Hoogeboom [4]. 1 Pairwise Sequence Alignment Biological Motivation Algorithmic Aspect Recursive
More informationToday s Lecture. Multiple sequence alignment. Improved scoring of pairwise alignments. Affine gap penalties Profiles
Today s Lecture Multiple sequence alignment Improved scoring of pairwise alignments Affine gap penalties Profiles 1 The Edit Graph for a Pair of Sequences G A C G T T G A A T G A C C C A C A T G A C G
More informationMultiple sequence alignment. November 2, 2017
Multiple sequence alignment November 2, 2017 Why do multiple alignment? Gain insight into evolutionary history Can assess time of divergence by looking at the number of mutations needed to change one sequence
More informationChapter 8 Multiple sequence alignment. Chaochun Wei Spring 2018
1896 1920 1987 2006 Chapter 8 Multiple sequence alignment Chaochun Wei Spring 2018 Contents 1. Reading materials 2. Multiple sequence alignment basic algorithms and tools how to improve multiple alignment
More informationA New Approach For Tree Alignment Based on Local Re-Optimization
A New Approach For Tree Alignment Based on Local Re-Optimization Feng Yue and Jijun Tang Department of Computer Science and Engineering University of South Carolina Columbia, SC 29063, USA yuef, jtang
More information1. R. Durbin, S. Eddy, A. Krogh und G. Mitchison: Biological sequence analysis, Cambridge, 1998
7 Multiple Sequence Alignment The exposition was prepared by Clemens GrÃP pl, based on earlier versions by Daniel Huson, Knut Reinert, and Gunnar Klau. It is based on the following sources, which are all
More information1. R. Durbin, S. Eddy, A. Krogh und G. Mitchison: Biological sequence analysis, Cambridge, 1998
7 Multiple Sequence Alignment The exposition was prepared by Clemens Gröpl, based on earlier versions by Daniel Huson, Knut Reinert, and Gunnar Klau. It is based on the following sources, which are all
More informationReconstructing long sequences from overlapping sequence fragment. Searching databases for related sequences and subsequences
SEQUENCE ALIGNMENT ALGORITHMS 1 Why compare sequences? Reconstructing long sequences from overlapping sequence fragment Searching databases for related sequences and subsequences Storing, retrieving and
More informationFastA & the chaining problem
FastA & the chaining problem We will discuss: Heuristics used by the FastA program for sequence alignment Chaining problem 1 Sources for this lecture: Lectures by Volker Heun, Daniel Huson and Knut Reinert,
More informationBioinformatics explained: Smith-Waterman
Bioinformatics Explained Bioinformatics explained: Smith-Waterman May 1, 2007 CLC bio Gustav Wieds Vej 10 8000 Aarhus C Denmark Telephone: +45 70 22 55 09 Fax: +45 70 22 55 19 www.clcbio.com info@clcbio.com
More informationAn Analysis of Pairwise Sequence Alignment Algorithm Complexities: Needleman-Wunsch, Smith-Waterman, FASTA, BLAST and Gapped BLAST
An Analysis of Pairwise Sequence Alignment Algorithm Complexities: Needleman-Wunsch, Smith-Waterman, FASTA, BLAST and Gapped BLAST Alexander Chan 5075504 Biochemistry 218 Final Project An Analysis of Pairwise
More informationFastA and the chaining problem, Gunnar Klau, December 1, 2005, 10:
FastA and the chaining problem, Gunnar Klau, December 1, 2005, 10:56 4001 4 FastA and the chaining problem We will discuss: Heuristics used by the FastA program for sequence alignment Chaining problem
More informationMultiple Sequence Alignment Based on Profile Alignment of Intermediate Sequences
Multiple Sequence Alignment Based on Profile Alignment of Intermediate Sequences Yue Lu and Sing-Hoi Sze RECOMB 2007 Presented by: Wanxing Xu March 6, 2008 Content Biology Motivation Computation Problem
More informationLectures by Volker Heun, Daniel Huson and Knut Reinert, in particular last years lectures
4 FastA and the chaining problem We will discuss: Heuristics used by the FastA program for sequence alignment Chaining problem 4.1 Sources for this lecture Lectures by Volker Heun, Daniel Huson and Knut
More informationAs of August 15, 2008, GenBank contained bases from reported sequences. The search procedure should be
48 Bioinformatics I, WS 09-10, S. Henz (script by D. Huson) November 26, 2009 4 BLAST and BLAT Outline of the chapter: 1. Heuristics for the pairwise local alignment of two sequences 2. BLAST: search and
More informationMultiple Sequence Alignment Sum-of-Pairs and ClustalW. Ulf Leser
Multiple Sequence lignment Sum-of-Pairs and ClustalW Ulf Leser This Lecture Multiple Sequence lignment The problem Theoretical approach: Sum-of-Pairs scores Practical approach: ClustalW Ulf Leser: Bioinformatics,
More informationBLAST: Basic Local Alignment Search Tool Altschul et al. J. Mol Bio CS 466 Saurabh Sinha
BLAST: Basic Local Alignment Search Tool Altschul et al. J. Mol Bio. 1990. CS 466 Saurabh Sinha Motivation Sequence homology to a known protein suggest function of newly sequenced protein Bioinformatics
More informationSequence analysis Pairwise sequence alignment
UMF11 Introduction to bioinformatics, 25 Sequence analysis Pairwise sequence alignment 1. Sequence alignment Lecturer: Marina lexandersson 12 September, 25 here are two types of sequence alignments, global
More informationBasic Local Alignment Search Tool (BLAST)
BLAST 26.04.2018 Basic Local Alignment Search Tool (BLAST) BLAST (Altshul-1990) is an heuristic Pairwise Alignment composed by six-steps that search for local similarities. The most used access point to
More informationUSING AN EXTENDED SUFFIX TREE TO SPEED-UP SEQUENCE ALIGNMENT
IADIS International Conference Applied Computing 2006 USING AN EXTENDED SUFFIX TREE TO SPEED-UP SEQUENCE ALIGNMENT Divya R. Singh Software Engineer Microsoft Corporation, Redmond, WA 98052, USA Abdullah
More informationBioinformatics explained: BLAST. March 8, 2007
Bioinformatics Explained Bioinformatics explained: BLAST March 8, 2007 CLC bio Gustav Wieds Vej 10 8000 Aarhus C Denmark Telephone: +45 70 22 55 09 Fax: +45 70 22 55 19 www.clcbio.com info@clcbio.com Bioinformatics
More informationComputational Molecular Biology
Computational Molecular Biology Erwin M. Bakker Lecture 2 Materials used from R. Shamir [2] and H.J. Hoogeboom [4]. 1 Molecular Biology Sequences DNA A, T, C, G RNA A, U, C, G Protein A, R, D, N, C E,
More informationMultiple Sequence Alignment Using Reconfigurable Computing
Multiple Sequence Alignment Using Reconfigurable Computing Carlos R. Erig Lima, Heitor S. Lopes, Maiko R. Moroz, and Ramon M. Menezes Bioinformatics Laboratory, Federal University of Technology Paraná
More informationComparison of Phylogenetic Trees of Multiple Protein Sequence Alignment Methods
Comparison of Phylogenetic Trees of Multiple Protein Sequence Alignment Methods Khaddouja Boujenfa, Nadia Essoussi, and Mohamed Limam International Science Index, Computer and Information Engineering waset.org/publication/482
More informationPoMSA: An Efficient and Precise Position-based Multiple Sequence Alignment Technique
PoMSA: An Efficient and Precise Position-based Multiple Sequence Alignment Technique Sara Shehab 1,a, Sameh Shohdy 1,b, Arabi E. Keshk 1,c 1 Department of Computer Science, Faculty of Computers and Information,
More informationA multiple alignment tool in 3D
Outline Department of Computer Science, Bioinformatics Group University of Leipzig TBI Winterseminar Bled, Slovenia February 2005 Outline Outline 1 Multiple Alignments Problems Goal Outline Outline 1 Multiple
More informationBLAST, Profile, and PSI-BLAST
BLAST, Profile, and PSI-BLAST Jianlin Cheng, PhD School of Electrical Engineering and Computer Science University of Central Florida 26 Free for academic use Copyright @ Jianlin Cheng & original sources
More informationEvolutionary tree reconstruction (Chapter 10)
Evolutionary tree reconstruction (Chapter 10) Early Evolutionary Studies Anatomical features were the dominant criteria used to derive evolutionary relationships between species since Darwin till early
More informationspaces (gaps) in each sequence so that they all have the same length. The following is a simple example of an alignment of the sequences ATTCGAC, TTCC
SALSA: Sequence ALignment via Steiner Ancestors Giuseppe Lancia?1 and R. Ravi 2 1 Dipartimento Elettronica ed Informatica, University of Padova, lancia@dei.unipd.it 2 G.S.I.A., Carnegie Mellon University,
More informationEECS730: Introduction to Bioinformatics
EECS730: Introduction to Bioinformatics Lecture 04: Variations of sequence alignments http://www.pitt.edu/~mcs2/teaching/biocomp/tutorials/global.html Slides adapted from Dr. Shaojie Zhang (University
More informationParsimony-Based Approaches to Inferring Phylogenetic Trees
Parsimony-Based Approaches to Inferring Phylogenetic Trees BMI/CS 576 www.biostat.wisc.edu/bmi576.html Mark Craven craven@biostat.wisc.edu Fall 0 Phylogenetic tree approaches! three general types! distance:
More informationMULTIPLE SEQUENCE ALIGNMENT SOLUTIONS AND APPLICATIONS
MULTIPLE SEQUENCE ALIGNMENT SOLUTIONS AND APPLICATIONS By XU ZHANG A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE
More informationDynamic Programming User Manual v1.0 Anton E. Weisstein, Truman State University Aug. 19, 2014
Dynamic Programming User Manual v1.0 Anton E. Weisstein, Truman State University Aug. 19, 2014 Dynamic programming is a group of mathematical methods used to sequentially split a complicated problem into
More informationIn this section we describe how to extend the match refinement to the multiple case and then use T-Coffee to heuristically compute a multiple trace.
5 Multiple Match Refinement and T-Coffee In this section we describe how to extend the match refinement to the multiple case and then use T-Coffee to heuristically compute a multiple trace. This exposition
More informationData Mining Technologies for Bioinformatics Sequences
Data Mining Technologies for Bioinformatics Sequences Deepak Garg Computer Science and Engineering Department Thapar Institute of Engineering & Tecnology, Patiala Abstract Main tool used for sequence alignment
More informationJET 2 User Manual 1 INSTALLATION 2 EXECUTION AND FUNCTIONALITIES. 1.1 Download. 1.2 System requirements. 1.3 How to install JET 2
JET 2 User Manual 1 INSTALLATION 1.1 Download The JET 2 package is available at www.lcqb.upmc.fr/jet2. 1.2 System requirements JET 2 runs on Linux or Mac OS X. The program requires some external tools
More informationImportant Example: Gene Sequence Matching. Corrigiendum. Central Dogma of Modern Biology. Genetics. How Nucleotides code for Amino Acids
Important Example: Gene Sequence Matching Century of Biology Two views of computer science s relationship to biology: Bioinformatics: computational methods to help discover new biology from lots of data
More informationHMMConverter A tool-box for hidden Markov models with two novel, memory efficient parameter training algorithms
HMMConverter A tool-box for hidden Markov models with two novel, memory efficient parameter training algorithms by TIN YIN LAM B.Sc., The Chinese University of Hong Kong, 2006 A THESIS SUBMITTED IN PARTIAL
More informationBiologically significant sequence alignments using Boltzmann probabilities
Biologically significant sequence alignments using Boltzmann probabilities P Clote Department of Biology, Boston College Gasson Hall 16, Chestnut Hill MA 0267 clote@bcedu Abstract In this paper, we give
More informationSequence alignment theory and applications Session 3: BLAST algorithm
Sequence alignment theory and applications Session 3: BLAST algorithm Introduction to Bioinformatics online course : IBT Sonal Henson Learning Objectives Understand the principles of the BLAST algorithm
More informationLecture 2 Pairwise sequence alignment. Principles Computational Biology Teresa Przytycka, PhD
Lecture 2 Pairwise sequence alignment. Principles Computational Biology Teresa Przytycka, PhD Assumptions: Biological sequences evolved by evolution. Micro scale changes: For short sequences (e.g. one
More informationNear Neighbor Search in High Dimensional Data (1) Dr. Anwar Alhenshiri
Near Neighbor Search in High Dimensional Data (1) Dr. Anwar Alhenshiri Scene Completion Problem The Bare Data Approach High Dimensional Data Many real-world problems Web Search and Text Mining Billions
More informationDynamic Programming Course: A structure based flexible search method for motifs in RNA. By: Veksler, I., Ziv-Ukelson, M., Barash, D.
Dynamic Programming Course: A structure based flexible search method for motifs in RNA By: Veksler, I., Ziv-Ukelson, M., Barash, D., Kedem, K Outline Background Motivation RNA s structure representations
More informationLesson 13 Molecular Evolution
Sequence Analysis Spring 2000 Dr. Richard Friedman (212)305-6901 (76901) friedman@cuccfa.ccc.columbia.edu 130BB Lesson 13 Molecular Evolution In this class we learn how to draw molecular evolutionary trees
More informationBioinformatics for Biologists
Bioinformatics for Biologists Sequence Analysis: Part I. Pairwise alignment and database searching Fran Lewitter, Ph.D. Director Bioinformatics & Research Computing Whitehead Institute Topics to Cover
More information8/19/13. Computational problems. Introduction to Algorithm
I519, Introduction to Introduction to Algorithm Yuzhen Ye (yye@indiana.edu) School of Informatics and Computing, IUB Computational problems A computational problem specifies an input-output relationship
More informationSmall Libraries of Protein Fragments through Clustering
Small Libraries of Protein Fragments through Clustering Varun Ganapathi Department of Computer Science Stanford University June 8, 2005 Abstract When trying to extract information from the information
More informationCounting the Number of Eulerian Orientations
Counting the Number of Eulerian Orientations Zhenghui Wang March 16, 011 1 Introduction Consider an undirected Eulerian graph, a graph in which each vertex has even degree. An Eulerian orientation of the
More informationBasics of Multiple Sequence Alignment
Basics of Multiple Sequence Alignment Tandy Warnow February 10, 2018 Basics of Multiple Sequence Alignment Tandy Warnow Basic issues What is a multiple sequence alignment? Evolutionary processes operating
More informationNew String Kernels for Biosequence Data
Workshop on Kernel Methods in Bioinformatics New String Kernels for Biosequence Data Christina Leslie Department of Computer Science Columbia University Biological Sequence Classification Problems Protein
More informationUsing Hidden Markov Models to Detect DNA Motifs
San Jose State University SJSU ScholarWorks Master's Projects Master's Theses and Graduate Research Spring 5-13-2015 Using Hidden Markov Models to Detect DNA Motifs Santrupti Nerli San Jose State University
More information24 Grundlagen der Bioinformatik, SS 10, D. Huson, April 26, This lecture is based on the following papers, which are all recommended reading:
24 Grundlagen der Bioinformatik, SS 10, D. Huson, April 26, 2010 3 BLAST and FASTA This lecture is based on the following papers, which are all recommended reading: D.J. Lipman and W.R. Pearson, Rapid
More informationCS273: Algorithms for Structure Handout # 4 and Motion in Biology Stanford University Thursday, 8 April 2004
CS273: Algorithms for Structure Handout # 4 and Motion in Biology Stanford University Thursday, 8 April 2004 Lecture #4: 8 April 2004 Topics: Sequence Similarity Scribe: Sonil Mukherjee 1 Introduction
More informationAlgorithms for Bioinformatics
Adapted from slides by Leena Salmena and Veli Mäkinen, which are partly from http: //bix.ucsd.edu/bioalgorithms/slides.php. 582670 Algorithms for Bioinformatics Lecture 6: Distance based clustering and
More informationLecture 4: January 1, Biological Databases and Retrieval Systems
Algorithms for Molecular Biology Fall Semester, 1998 Lecture 4: January 1, 1999 Lecturer: Irit Orr Scribe: Irit Gat and Tal Kohen 4.1 Biological Databases and Retrieval Systems In recent years, biological
More information11/17/2009 Comp 590/Comp Fall
Lecture 20: Clustering and Evolution Study Chapter 10.4 10.8 Problem Set #5 will be available tonight 11/17/2009 Comp 590/Comp 790-90 Fall 2009 1 Clique Graphs A clique is a graph with every vertex connected
More informationLecture 10: Local Alignments
Lecture 10: Local Alignments Study Chapter 6.8-6.10 1 Outline Edit Distances Longest Common Subsequence Global Sequence Alignment Scoring Matrices Local Sequence Alignment Alignment with Affine Gap Penalties
More informationECE521: Week 11, Lecture March 2017: HMM learning/inference. With thanks to Russ Salakhutdinov
ECE521: Week 11, Lecture 20 27 March 2017: HMM learning/inference With thanks to Russ Salakhutdinov Examples of other perspectives Murphy 17.4 End of Russell & Norvig 15.2 (Artificial Intelligence: A Modern
More informationSequence alignment algorithms
Sequence alignment algorithms Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, February 23 rd 27 After this lecture, you can decide when to use local and global sequence alignments
More informationGene regulation. DNA is merely the blueprint Shared spatially (among all tissues) and temporally But cells manage to differentiate
Gene regulation DNA is merely the blueprint Shared spatially (among all tissues) and temporally But cells manage to differentiate Especially but not only during developmental stage And cells respond to
More informationAlignMe Manual. Version 1.1. Rene Staritzbichler, Marcus Stamm, Kamil Khafizov and Lucy R. Forrest
AlignMe Manual Version 1.1 Rene Staritzbichler, Marcus Stamm, Kamil Khafizov and Lucy R. Forrest Max Planck Institute of Biophysics Frankfurt am Main 60438 Germany 1) Introduction...3 2) Using AlignMe
More informationBUNDLED SUFFIX TREES
Motivation BUNDLED SUFFIX TREES Luca Bortolussi 1 Francesco Fabris 2 Alberto Policriti 1 1 Department of Mathematics and Computer Science University of Udine 2 Department of Mathematics and Computer Science
More informationHow Do We Measure Protein Shape? A Pattern Matching Example. A Simple Pattern Matching Algorithm. Comparing Protein Structures II
How Do We Measure Protein Shape? omparing Protein Structures II Protein function is largely based on the proteins geometric shape Protein substructures with similar shapes are likely to share a common
More informationDynamic Programming Part I: Examples. Bioinfo I (Institut Pasteur de Montevideo) Dynamic Programming -class4- July 25th, / 77
Dynamic Programming Part I: Examples Bioinfo I (Institut Pasteur de Montevideo) Dynamic Programming -class4- July 25th, 2011 1 / 77 Dynamic Programming Recall: the Change Problem Other problems: Manhattan
More informationLecture 9: Core String Edits and Alignments
Biosequence Algorithms, Spring 2005 Lecture 9: Core String Edits and Alignments Pekka Kilpeläinen University of Kuopio Department of Computer Science BSA Lecture 9: String Edits and Alignments p.1/30 III:
More informationGenetics/MBT 541 Spring, 2002 Lecture 1 Joe Felsenstein Department of Genome Sciences Phylogeny methods, part 1 (Parsimony and such)
Genetics/MBT 541 Spring, 2002 Lecture 1 Joe Felsenstein Department of Genome Sciences joe@gs Phylogeny methods, part 1 (Parsimony and such) Methods of reconstructing phylogenies (evolutionary trees) Parsimony
More informationGribskov Profile. Hidden Markov Models. Building a Hidden Markov Model #$ %&
Gribskov Profile #$ %& Hidden Markov Models Building a Hidden Markov Model "! Proteins, DNA and other genomic features can be classified into families of related sequences and structures How to detect
More informationSpace Efficient Linear Time Construction of
Space Efficient Linear Time Construction of Suffix Arrays Pang Ko and Srinivas Aluru Dept. of Electrical and Computer Engineering 1 Laurence H. Baker Center for Bioinformatics and Biological Statistics
More informationDynamic Programming: Sequence alignment. CS 466 Saurabh Sinha
Dynamic Programming: Sequence alignment CS 466 Saurabh Sinha DNA Sequence Comparison: First Success Story Finding sequence similarities with genes of known function is a common approach to infer a newly
More information