A Constraint-Based Approach to Structure Prediction for Simplified Protein Models that Outperforms Other Existing Methods
|
|
- Rudolph Robbins
- 6 years ago
- Views:
Transcription
1 1 A Constraint-Based Approach to Structure Prediction for Simplified Protein Models that Outperforms Other Existing Methods Rolf Backofen and Sebastian Will Chair for Bioinformatics Albert-Ludiwgs-Universität Freiburg
2 Overview computer scientist are interested in methods method: constraint-based structure prediction lattice models basic model of HP-type models subproblems: bounds, hydropbic cores, threading bioinformatics are interested in applications as well results and applications degeneracy of sequences finding protein-likes sequences with unique ground state comparing different models (cubic/fcc, HP-model with HPNX)
3 Overview computer scientist are interested in methods method: constraint-based structure prediction lattice models basic model of HP-type models subproblems: bounds, hydropbic cores, threading bioinformatics are interested in applications as well results and applications degeneracy of sequences finding protein-likes sequences with unique ground state comparing different models (cubic/fcc, HP-model with HPNX)
4 Structure Prediction as Optimization Problem searched: structure (conformation) of minimal (free) energy huge search space hence: only parts of the search space considered generate-and-test generate approximation here: broad exploration of search space starting points for fine-tuning hierarchical approaches GPSQPTYPG DDAPVEDLI RFYDNLQQY LNVVTRHRY search in low resolution model often: low-resolution model = lattice model improvement: biolog. knowledge, molecular dynamics
5 Structure Prediction as Optimization Problem searched: structure (conformation) of minimal (free) energy huge search space hence: only parts of the search space considered generate-and-test generate approximation here: broad exploration of search space starting points for fine-tuning hierarchical approaches GPSQPTYPG DDAPVEDLI RFYDNLQQY LNVVTRHRY search in low resolution model often: low-resolution model = lattice model improvement: biolog. knowledge, molecular dynamics
6 Structure Prediction as Optimization Problem searched: structure (conformation) of minimal (free) energy huge search space hence: only parts of the search space considered generate-and-test generate approximation here: broad exploration of search space starting points for fine-tuning hierarchical approaches GPSQPTYPG DDAPVEDLI RFYDNLQQY LNVVTRHRY search in low resolution model often: low-resolution model = lattice model improvement: biolog. knowledge, molecular dynamics
7 4 Previous Prediction Approaches for Latttice Models sometimes: heuristic approaches advantages: disadvantages: chain growth algorithms genetic algorithms... fast only for structure prediction mostly: monte-carlo/simulated annealing advantages: easy to adapt if ergodic, then known distribution disadvantages: also: complete enumeration advantages: disadvantages: for HP-model, optimal solution nearly never found most approaches are not ergodic direct exploration of landscape very short sequences, only D
8 5 Idea of Lattice Models trade-off: choose between models, that closely resembles proteins structure BUT no hope of ever (algorithmically) finding the native structure models, that crudely resembles proteins structure BUT we can find the native structure BUT SO FAR: we cannot find the native structure either here: BUT we can find the native structure using constraint programming
9 5 Idea of Lattice Models trade-off: choose between models, that closely resembles proteins structure BUT no hope of ever (algorithmically) finding the native structure models, that crudely resembles proteins structure BUT we can find the native structure BUT SO FAR: we cannot find the native structure either here: BUT we can find the native structure using constraint programming
10 5 Idea of Lattice Models trade-off: choose between models, that closely resembles proteins structure BUT no hope of ever (algorithmically) finding the native structure models, that crudely resembles proteins structure BUT we can find the native structure BUT SO FAR: we cannot find the native structure either here: BUT we can find the native structure using constraint programming
11 5 Idea of Lattice Models trade-off: choose between models, that closely resembles proteins structure BUT no hope of ever (algorithmically) finding the native structure models, that crudely resembles proteins structure BUT we can find the native structure BUT SO FAR: we cannot find the native structure either here: BUT we can find the native structure using constraint programming
12 5 Idea of Lattice Models trade-off: choose between models, that closely resembles proteins structure BUT no hope of ever (algorithmically) finding the native structure models, that crudely resembles proteins structure BUT we can find the native structure BUT SO FAR: we cannot find the native structure either here: BUT we can find the native structure using constraint programming
13 5 Idea of Lattice Models trade-off: choose between models, that closely resembles proteins structure BUT no hope of ever (algorithmically) finding the native structure models, that crudely resembles proteins structure BUT we can find the native structure BUT SO FAR: we cannot find the native structure either here: BUT we can find the native structure using constraint programming
14 6 Lattice Models lattice models: usually only backbone positions = positions on lattice self-avoiding: no steric conflicts often used lattices: cubic face-centered-cubic BUT: search for native conformation = NP-complete which lattice should be used?
15 The FCC FCC = face-centered cubic lattice Kepler s conjecture: FCC=densest packing of balls proved just recently (after 400 years) [Bagci,Jernigan,Bahar 00]: clusters of near neighbours in proteins... the neighbours are not distributed in a uniform, less dense way, but rather in a clustered dense way, occupying positions that closely approximate those of a distorted FCC packing We confirm that lattices with large coordination numbers provide better 7 fits to protein structures...
16 The FCC FCC = face-centered cubic lattice Kepler s conjecture: FCC=densest packing of balls proved just recently (after 400 years) [Bagci,Jernigan,Bahar 00]: clusters of near neighbours in proteins... the neighbours are not distributed in a uniform, less dense way, but rather in a clustered dense way, occupying positions that closely approximate those of a distorted FCC packing We confirm that lattices with large coordination numbers provide better 7 fits to protein structures...
17 The FCC FCC = face-centered cubic lattice Kepler s conjecture: FCC=densest packing of balls proved just recently (after 400 years) [Bagci,Jernigan,Bahar 00]: clusters of near neighbours in proteins... the neighbours are not distributed in a uniform, less dense way, but rather in a clustered dense way, occupying positions that closely approximate those of a distorted FCC packing We confirm that lattices with large coordination numbers provide better 7 fits to protein structures...
18 The FCC FCC = face-centered cubic lattice Kepler s conjecture: FCC=densest packing of balls proved just recently (after 400 years) [Bagci,Jernigan,Bahar 00]: clusters of near neighbours in proteins... the neighbours are not distributed in a uniform, less dense way, but rather in a clustered dense way, occupying positions that closely approximate those of a distorted FCC packing We confirm that lattices with large coordination numbers provide better 7 fits to protein structures...
19 The FCC FCC = face-centered cubic lattice Kepler s conjecture: FCC=densest packing of balls proved just recently (after 400 years) [Bagci,Jernigan,Bahar 00]: clusters of near neighbours in proteins... the neighbours are not distributed in a uniform, less dense way, but rather in a clustered dense way, occupying positions that closely approximate those of a distorted FCC packing We confirm that lattices with large coordination numbers provide better 7 fits to protein structures...
20 8 Relation to Proteins hydrophob (H) polar (P) how does it related to proteins? hydrophobic and polar (hydrophilic) amino acids hydrophobic are densely packed alphabet: H = Hydrophobic P = Polar (hydrophilic) in the following: search for conformation with densest hydrophobic packing = max. number of contacts between hydrophobic AA (green) HP-model of Ken Dill: folding of sequences consisting of H and P
21 8 Relation to Proteins hydrophob (H) polar (P) how does it related to proteins? hydrophobic and polar (hydrophilic) amino acids hydrophobic are densely packed alphabet: H = Hydrophobic P = Polar (hydrophilic) in the following: search for conformation with densest hydrophobic packing = max. number of contacts between hydrophobic AA (green) HP-model of Ken Dill: folding of sequences consisting of H and P
22 8 Relation to Proteins hydrophob (H) polar (P) how does it related to proteins? hydrophobic and polar (hydrophilic) amino acids hydrophobic are densely packed alphabet: H = Hydrophobic P = Polar (hydrophilic) in the following: search for conformation with densest hydrophobic packing = max. number of contacts between hydrophobic AA (green) HP-model of Ken Dill: folding of sequences consisting of H and P
23 8 Relation to Proteins hydrophob (H) polar (P) how does it related to proteins? hydrophobic and polar (hydrophilic) amino acids hydrophobic are densely packed alphabet: H = Hydrophobic P = Polar (hydrophilic) in the following: search for conformation with densest hydrophobic packing = max. number of contacts between hydrophobic AA (green) HP-model of Ken Dill: folding of sequences consisting of H and P
24 8 Relation to Proteins hydrophob (H) polar (P) how does it related to proteins? hydrophobic and polar (hydrophilic) amino acids hydrophobic are densely packed alphabet: H = Hydrophobic P = Polar (hydrophilic) in the following: search for conformation with densest hydrophobic packing = max. number of contacts between hydrophobic AA (green) HP-model of Ken Dill: folding of sequences consisting of H and P
25 9 General Approach Algorithm consist of three steps: Step 1 and are precomputation steps Step 1: compute lower energy bounds estimate contacts (within layers, between layers) Step : construct hydrophobic cores use bounds from last step, precomputed Step 3: thread sequence to hydrophobic cores of size n. using constraint propagation Step 1 Step Step 3
26 9 General Approach Algorithm consist of three steps: Step 1 and are precomputation steps Step 1: compute lower energy bounds estimate contacts (within layers, between layers) Step : construct hydrophobic cores use bounds from last step, precomputed Step 3: thread sequence to hydrophobic cores of size n. using constraint propagation Step 1 Step Step 3
27 9 General Approach Algorithm consist of three steps: Step 1 and are precomputation steps Step 1: compute lower energy bounds estimate contacts (within layers, between layers) Step : construct hydrophobic cores use bounds from last step, precomputed Step 3: thread sequence to hydrophobic cores of size n. using constraint propagation Step 1 Step Step 3
28 9 General Approach Algorithm consist of three steps: Step 1 and are precomputation steps Step 1: compute lower energy bounds estimate contacts (within layers, between layers) Step : construct hydrophobic cores use bounds from last step, precomputed Step 3: thread sequence to hydrophobic cores of size n. using constraint propagation Step 1 Step Step 3
29 9 General Approach Algorithm consist of three steps: Step 1 and are precomputation steps Step 1: compute lower energy bounds estimate contacts (within layers, between layers) Step : construct hydrophobic cores use bounds from last step, precomputed Step 3: thread sequence to hydrophobic cores of size n. using constraint propagation Step 1 Step Step 3
30 9 General Approach Algorithm consist of three steps: Step 1 and are precomputation steps Step 1: compute lower energy bounds estimate contacts (within layers, between layers) Step : construct hydrophobic cores use bounds from last step, precomputed Step 3: thread sequence to hydrophobic cores of size n. using constraint propagation Step 1 Step Step 3
31 10 Example for a Constraint-Problem: Sudoku every number from exactly once in every row every column every block
32 10 Example for a Constraint-Problem: Sudoku every number from exactly once in every row every column every block
33 Constraints-Formulation Variablen X n k,l Constraints {0, 1} for every row k, column l and number n {1... 9}. 9 X = 1 1, für Reihe 5 1 l in Reihe 5 X = 1 5,l 9 l in Reihe 5 X = 1 5,l 11 similar for all columns and blocks
34 Constraints-Formulation Variablen X n k,l Constraints {0, 1} for every row k, column l and number n {1... 9}. 9 X = 1 1, für Reihe 5 1 l in Reihe 5 X = 1 5,l 9 l in Reihe 5 X = 1 5,l 11 similar for all columns and blocks
35 Constraint-Propagation k = 1..9 X = 1 k, new values are provable correct iteration till stable state if not solved: Search ?1? 3 for one variable: split over all possible values followed by Propagation eine 1 in keine 1 in Reihe 1,Spalte 8 Reihe 1, Spalte 8 1 (X = 1) 1, naive search (generate-and-test): 9 53 steps 1 Constraint-Programming: automatisation of progagation and search
36 Constraint-Propagation k = 1..9 X = 1 k, new values are provable correct iteration till stable state if not solved: Search ?1? 3 for one variable: split over all possible values followed by Propagation eine 1 in keine 1 in Reihe 1,Spalte 8 Reihe 1, Spalte 8 1 (X = 1) 1, naive search (generate-and-test): 9 53 steps 1 Constraint-Programming: automatisation of progagation and search
37 Constraint-Propagation k = 1..9 X = 1 k, new values are provable correct iteration till stable state if not solved: Search ?1? 3 for one variable: split over all possible values followed by Propagation eine 1 in keine 1 in Reihe 1,Spalte 8 Reihe 1, Spalte 8 1 (X = 1) 1, naive search (generate-and-test): 9 53 steps 1 Constraint-Programming: automatisation of progagation and search
38 Constraint-Propagation k = 1..9 X = 1 k, new values are provable correct iteration till stable state if not solved: Search ?1? 3 for one variable: split over all possible values followed by Propagation eine 1 in keine 1 in Reihe 1,Spalte 8 Reihe 1, Spalte 8 1 (X = 1) 1, naive search (generate-and-test): 9 53 steps 1 Constraint-Programming: automatisation of progagation and search
39 Constraint-Propagation k = 1..9 X = 1 k, new values are provable correct iteration till stable state if not solved: Search ?1? 3 for one variable: split over all possible values followed by Propagation eine 1 in keine 1 in Reihe 1,Spalte 8 Reihe 1, Spalte 8 1 (X = 1) 1, naive search (generate-and-test): 9 53 steps 1 Constraint-Programming: automatisation of progagation and search
40 Constraint-Propagation k = 1..9 X = 1 k, new values are provable correct iteration till stable state if not solved: Search ?1? 3 for one variable: split over all possible values followed by Propagation eine 1 in keine 1 in Reihe 1,Spalte 8 Reihe 1, Spalte 8 1 (X = 1) 1, naive search (generate-and-test): 9 53 steps 1 Constraint-Programming: automatisation of progagation and search
41 13 Constraint-based Formulation constraint problem C Pr : position of i-th amino acid: X i, Y i, Z i [1... n] constraints describe Self-Avoiding Walks (X i, Y i, Z i ) (X j, Y j, Z j ) and (X i, Y i, Z i ) (X i+1, Y i+1, Z i+1 ) = 1 constraint-based optimization: distributing over aminoacid positions C Pr X 1 = X 1 X 4 = 3 C Pr & X 1 = problems redundant constraints and search strategy [Backofen:98] symmetry breaking [Backofen&Will:99] bound for number of of HH-contacts [Backofen:00a,03] new constraints, propagation [Backofen:Will:01]
42 13 Constraint-based Formulation constraint problem C Pr : position of i-th amino acid: X i, Y i, Z i [1... n] constraints describe Self-Avoiding Walks (X i, Y i, Z i ) (X j, Y j, Z j ) and (X i, Y i, Z i ) (X i+1, Y i+1, Z i+1 ) = 1 constraint-based optimization: distributing over aminoacid positions C Pr X 1 = X 1 X 4 = 3 C Pr & X 1 = problems redundant constraints and search strategy [Backofen:98] symmetry breaking [Backofen&Will:99] bound for number of of HH-contacts [Backofen:00a,03] new constraints, propagation [Backofen:Will:01]
43 13 Constraint-based Formulation constraint problem C Pr : position of i-th amino acid: X i, Y i, Z i [1... n] constraints describe Self-Avoiding Walks (X i, Y i, Z i ) (X j, Y j, Z j ) and (X i, Y i, Z i ) (X i+1, Y i+1, Z i+1 ) = 1 constraint-based optimization: distributing over aminoacid positions C Pr X 1 = X 1 X 4 = 3 C Pr & X 1 = problems redundant constraints and search strategy [Backofen:98] symmetry breaking [Backofen&Will:99] bound for number of of HH-contacts [Backofen:00a,03] new constraints, propagation [Backofen:Will:01]
44 13 Constraint-based Formulation constraint problem C Pr : position of i-th amino acid: X i, Y i, Z i [1... n] constraints describe Self-Avoiding Walks (X i, Y i, Z i ) (X j, Y j, Z j ) and (X i, Y i, Z i ) (X i+1, Y i+1, Z i+1 ) = 1 constraint-based optimization: distributing over aminoacid positions C Pr X 1 = X 1 X 4 = 3 C Pr & X 1 = problems redundant constraints and search strategy [Backofen:98] symmetry breaking [Backofen&Will:99] bound for number of of HH-contacts [Backofen:00a,03] new constraints, propagation [Backofen:Will:01]
45 13 Constraint-based Formulation constraint problem C Pr : position of i-th amino acid: X i, Y i, Z i [1... n] constraints describe Self-Avoiding Walks (X i, Y i, Z i ) (X j, Y j, Z j ) and (X i, Y i, Z i ) (X i+1, Y i+1, Z i+1 ) = 1 constraint-based optimization: distributing over aminoacid positions C Pr X 1 = X 1 X 4 = 3 C Pr & X 1 = problems redundant constraints and search strategy [Backofen:98] symmetry breaking [Backofen&Will:99] bound for number of of HH-contacts [Backofen:00a,03] new constraints, propagation [Backofen:Will:01]
46 13 Constraint-based Formulation constraint problem C Pr : position of i-th amino acid: X i, Y i, Z i [1... n] constraints describe Self-Avoiding Walks (X i, Y i, Z i ) (X j, Y j, Z j ) and (X i, Y i, Z i ) (X i+1, Y i+1, Z i+1 ) = 1 constraint-based optimization: distributing over aminoacid positions C Pr X 1 = X 1 X 4 = 3 C Pr & X 1 = problems redundant constraints and search strategy [Backofen:98] symmetry breaking [Backofen&Will:99] bound for number of of HH-contacts [Backofen:00a,03] new constraints, propagation [Backofen:Will:01]
47 14 Problem 1: Frame Sequences
48 15 Bounds for FCC FCC models proteins better: 1.5 Å RMSD [Park&Levitt95] BUT: almost nothing was known approximation: 60% of optimum [Agarwala et al.98] only trivial bounds: 6 number of H-amino acids. approach: layer contacts interlayer contacts 4-point 3-point x=1 x= x=3 n 1 =1 n =5 n 3 =
49 15 Bounds for FCC FCC models proteins better: 1.5 Å RMSD [Park&Levitt95] BUT: almost nothing was known approximation: 60% of optimum [Agarwala et al.98] only trivial bounds: 6 number of H-amino acids. approach: layer contacts interlayer contacts 4-point 3-point x=1 x= x=3 n 1 =1 n =5 n 3 =
50 15 Bounds for FCC FCC models proteins better: 1.5 Å RMSD [Park&Levitt95] BUT: almost nothing was known approximation: 60% of optimum [Agarwala et al.98] only trivial bounds: 6 number of H-amino acids. approach: layer contacts interlayer contacts 4-point 3-point x=1 x= x=3 n 1 =1 n =5 n 3 =
51 16 Bound on Layer Contact relation between surface and contacts 4n = H-contacts + H-surface number of H-neighbours contacts to Ps or solution positions relation to frame H-surface = a + b (a, b)= (height,width) horizontal surface 1 3 vertical surface 1 minimal surface for n Hs = minimal frame (a, b) around n point a = n b = n a
52 16 Bound on Layer Contact relation between surface and contacts 4n = H-contacts + H-surface number of H-neighbours contacts to Ps or solution positions relation to frame H-surface = a + b (a, b)= (height,width) horizontal surface 1 3 vertical surface 1 minimal surface for n Hs = minimal frame (a, b) around n point a = n b = n a
53 16 Bound on Layer Contact relation between surface and contacts 4n = H-contacts + H-surface number of H-neighbours contacts to Ps or solution positions relation to frame H-surface = a + b (a, b)= (height,width) horizontal surface 1 3 vertical surface 1 minimal surface for n Hs = minimal frame (a, b) around n point a = n b = n a
54 16 Bound on Layer Contact relation between surface and contacts 4n = H-contacts + H-surface number of H-neighbours contacts to Ps or solution positions relation to frame H-surface = a + b (a, b)= (height,width) horizontal surface 1 3 vertical surface 1 minimal surface for n Hs = minimal frame (a, b) around n point a = n b = n a
55 16 Bound on Layer Contact relation between surface and contacts 4n = H-contacts + H-surface number of H-neighbours contacts to Ps or solution positions relation to frame H-surface = a + b (a, b)= (height,width) horizontal surface 1 3 vertical surface 1 minimal surface for n Hs = minimal frame (a, b) around n point a = n b = n a
56 16 Bound on Layer Contact relation between surface and contacts 4n = H-contacts + H-surface number of H-neighbours contacts to Ps or solution positions relation to frame H-surface = a + b (a, b)= (height,width) horizontal surface 1 3 vertical surface 1 minimal surface for n Hs = minimal frame (a, b) around n point a = n b = n a
57 16 Bound on Layer Contact relation between surface and contacts 4n = H-contacts + H-surface number of H-neighbours contacts to Ps or solution positions relation to frame H-surface = a + b (a, b)= (height,width) horizontal surface 1 3 vertical surface 1 minimal surface for n Hs = minimal frame (a, b) around n point a = n b = n a
58 16 Bound on Layer Contact relation between surface and contacts 4n = H-contacts + H-surface number of H-neighbours contacts to Ps or solution positions relation to frame H-surface = a + b (a, b)= (height,width) horizontal surface 1 3 vertical surface 1 minimal surface for n Hs = minimal frame (a, b) around n point a = n b = n a
59 17 Recursion for Bound b b b1 a1 n1 n a n = b1 a1 n1 + + n a B (n,n1,a1,b1) C B LC (n1,a1,b1) B ILC ( n1,a1,b1 n,a,b ) 3 4 B C (n n1,n,a,b) B C (n, n 1, a 1, b 1 ): contacts in core with n elements and first layer E 1 : n 1, a 1, b 1 = B LC (n 1, a 1, b 1 ) contacts in layer E 1 + B ILC (n 1, a 1, b 1, n, a, b ) contacts between layers E 1 and E : n, a, b + B C (n n 1, n, a, b ) contacts in core with n n 1 elements and first layer E
60 18 Bound on Interlayer Contacts recall: we need an bound on interlayer contacts point 4 point 3 point x=1 x= n1=5 n=3 but: we are given only frames b1= b= a1=3 a= x=1 n1=5 x= n=3 bound number of 4, 3, and 1 points, given frames
61 19 Bound on Number of 4-, 3-, - and 1-Points problem: number of 4-, 3-, - and 1-points in x = i + 1 depends on exact position of Hs in x = i n i = 8 a i = height b i = width needed: parameters, which determine the number of 4-, 3-, - and 1-points Lemma let l be the number of 3-points. Then: number of 4 = n i + 1 a i b i number of = a i + b i l 4 number of 1 = l + 4. for l, there is an upper bound [Backofen00]
62 19 Bound on Number of 4-, 3-, - and 1-Points problem: number of 4-, 3-, - and 1-points in x = i + 1 depends on exact position of Hs in x = i n i = 8 a i = height b i = width needed: parameters, which determine the number of 4-, 3-, - and 1-points Lemma let l be the number of 3-points. Then: number of 4 = n i + 1 a i b i number of = a i + b i l 4 number of 1 = l + 4. for l, there is an upper bound [Backofen00]
63 19 Bound on Number of 4-, 3-, - and 1-Points problem: number of 4-, 3-, - and 1-points in x = i + 1 depends on exact position of Hs in x = i n i = 8 a i = height b i = width needed: parameters, which determine the number of 4-, 3-, - and 1-points Lemma let l be the number of 3-points. Then: number of 4 = n i + 1 a i b i number of = a i + b i l 4 number of 1 = l + 4. for l, there is an upper bound [Backofen00]
64 19 Bound on Number of 4-, 3-, - and 1-Points problem: number of 4-, 3-, - and 1-points in x = i + 1 depends on exact position of Hs in x = i n i = 8 a i = height b i = width needed: parameters, which determine the number of 4-, 3-, - and 1-points Lemma let l be the number of 3-points. Then: number of 4 = n i + 1 a i b i number of = a i + b i l 4 number of 1 = l + 4. for l, there is an upper bound [Backofen00]
65 19 Bound on Number of 4-, 3-, - and 1-Points problem: number of 4-, 3-, - and 1-points in x = i + 1 depends on exact position of Hs in x = i n i = 8 a i = height b i = width needed: parameters, which determine the number of 4-, 3-, - and 1-points Lemma let l be the number of 3-points. Then: number of 4 = n i + 1 a i b i number of = a i + b i l 4 number of 1 = l + 4. for l, there is an upper bound [Backofen00]
66 0 Bounds on the Number l of 3-Points 3-point: from the side from the top 3 point = 3 point observation: l can also be calculated from the frame i3= i4= i=3 i1=1 3 point (next laxer) bound: calculate max. number of diagonals optimal placement: balance numbers between edges
67 0 Bounds on the Number l of 3-Points 3-point: from the side from the top 3 point = 3 point observation: l can also be calculated from the frame i3= i4= i=3 i1=1 3 point (next laxer) bound: calculate max. number of diagonals optimal placement: balance numbers between edges
68 0 Bounds on the Number l of 3-Points 3-point: from the side from the top 3 point = 3 point observation: l can also be calculated from the frame i3= i4= i=3 i1=1 3 point (next laxer) bound: calculate max. number of diagonals optimal placement: balance numbers between edges
69 0 Bounds on the Number l of 3-Points 3-point: from the side from the top 3 point = 3 point observation: l can also be calculated from the frame i3= i4= i=3 i1=1 3 point (next laxer) bound: calculate max. number of diagonals optimal placement: balance numbers between edges
70 0 Bounds on the Number l of 3-Points 3-point: from the side from the top 3 point = 3 point observation: l can also be calculated from the frame i3= i4= i=3 i1=1 3 point (next laxer) bound: calculate max. number of diagonals optimal placement: balance numbers between edges
71 0 Bounds on the Number l of 3-Points 3-point: from the side from the top 3 point = 3 point observation: l can also be calculated from the frame i3= i4= i=3 i1=1 3 point (next laxer) bound: calculate max. number of diagonals optimal placement: balance numbers between edges
72 0 Bounds on the Number l of 3-Points 3-point: from the side from the top 3 point = 3 point observation: l can also be calculated from the frame i3= i4= i=3 i1=1 3 point (next laxer) bound: calculate max. number of diagonals optimal placement: balance numbers between edges
73 0 Bounds on the Number l of 3-Points 3-point: from the side from the top 3 point = 3 point observation: l can also be calculated from the frame i3= i4= i=3 i1=1 3 point (next laxer) bound: calculate max. number of diagonals optimal placement: balance numbers between edges
74 0 Bounds on the Number l of 3-Points 3-point: from the side from the top 3 point = 3 point observation: l can also be calculated from the frame i3= i4= i=3 i1=1 3 point (next laxer) bound: calculate max. number of diagonals optimal placement: balance numbers between edges
75 1 Problem : Enumerate Hydrophobic Cores =
76 Enumerating Hydrophobic Cores constraint variables: boolean variable for every position ˆ= (pnt( p) = 1) ˆ= (pnt( p) = 0) ˆ= (pnt( p) undef.) contact variable for each neighboring position ˆ= con( p, q) = 1 constraints: pnt( p) = number of Hs p frames if optimal, then no caveats...
77 3 Enumerating Hydrophobic Cores remaining problem: relative positions of frames subproblems: symmetries later many subproblems solved several times do not use fixed frame position global bind frame positions by surrounding cube more pruning: optimal core must have optimal frame-sequence in any direction constructive disjunction
78 3 Enumerating Hydrophobic Cores remaining problem: relative positions of frames subproblems: symmetries later many subproblems solved several times do not use fixed frame position global bind frame positions by surrounding cube more pruning: optimal core must have optimal frame-sequence in any direction constructive disjunction
79 3 Enumerating Hydrophobic Cores remaining problem: relative positions of frames subproblems: symmetries later many subproblems solved several times do not use fixed frame position global bind frame positions by surrounding cube more pruning: optimal core must have optimal frame-sequence in any direction constructive disjunction
80 3 Enumerating Hydrophobic Cores remaining problem: relative positions of frames subproblems: symmetries later many subproblems solved several times do not use fixed frame position global bind frame positions by surrounding cube more pruning: optimal core must have optimal frame-sequence in any direction constructive disjunction
81 3 Enumerating Hydrophobic Cores remaining problem: relative positions of frames subproblems: symmetries later many subproblems solved several times do not use fixed frame position global bind frame positions by surrounding cube more pruning: optimal core must have optimal frame-sequence in any direction constructive disjunction
82 3 Enumerating Hydrophobic Cores remaining problem: relative positions of frames subproblems: symmetries later many subproblems solved several times do not use fixed frame position global bind frame positions by surrounding cube more pruning: optimal core must have optimal frame-sequence in any direction constructive disjunction
83 4 Problem 3: Threading Sequence onto Hydrophobic Cores =
84 5 New Constraints for Threading threading: given core, find a sequence of monomer through it main problem: self-avoiding walks new constraint: SAWalk(x 1,..., x m ) walk self-avoiding walk (SAWalk) Pos. used twice problem: complete handling for SAWalk(x 1,..., x m ) is hard therefore: approximate SAWalks k-avoiding walks 4- but not 5-avoiding
85 5 New Constraints for Threading threading: given core, find a sequence of monomer through it main problem: self-avoiding walks new constraint: SAWalk(x 1,..., x m ) walk self-avoiding walk (SAWalk) Pos. used twice problem: complete handling for SAWalk(x 1,..., x m ) is hard therefore: approximate SAWalks k-avoiding walks 4- but not 5-avoiding
86 5 New Constraints for Threading threading: given core, find a sequence of monomer through it main problem: self-avoiding walks new constraint: SAWalk(x 1,..., x m ) walk self-avoiding walk (SAWalk) Pos. used twice problem: complete handling for SAWalk(x 1,..., x m ) is hard therefore: approximate SAWalks k-avoiding walks 4- but not 5-avoiding
87 5 New Constraints for Threading threading: given core, find a sequence of monomer through it main problem: self-avoiding walks new constraint: SAWalk(x 1,..., x m ) walk self-avoiding walk (SAWalk) Pos. used twice problem: complete handling for SAWalk(x 1,..., x m ) is hard therefore: approximate SAWalks k-avoiding walks 4- but not 5-avoiding
88 5 New Constraints for Threading threading: given core, find a sequence of monomer through it main problem: self-avoiding walks new constraint: SAWalk(x 1,..., x m ) walk self-avoiding walk (SAWalk) Pos. used twice problem: complete handling for SAWalk(x 1,..., x m ) is hard therefore: approximate SAWalks k-avoiding walks 4- but not 5-avoiding
89 6 Example: 3-Avoiding psinglets: HPH-subsequence in cubic lattice: has strong influence on core caveat caveat-freeness by path constraint remaining invalid case excluded by 3-avoidingness
90 6 Example: 3-Avoiding psinglets: HPH-subsequence in cubic lattice: has strong influence on core caveat caveat-freeness by path constraint remaining invalid case excluded by 3-avoidingness
91 6 Example: 3-Avoiding psinglets: HPH-subsequence in cubic lattice: has strong influence on core caveat caveat-freeness by path constraint remaining invalid case excluded by 3-avoidingness
92 6 Example: 3-Avoiding psinglets: HPH-subsequence in cubic lattice: has strong influence on core caveat caveat-freeness by path constraint remaining invalid case excluded by 3-avoidingness
93 6 Example: 3-Avoiding psinglets: HPH-subsequence in cubic lattice: has strong influence on core caveat caveat-freeness by path constraint remaining invalid case excluded by 3-avoidingness
94 7 Problem 4: Symmetry Breaking X 1 = 1 X 1 1 X 1 = 3 solved, but skipped here!
95 Comparison of Results small selection of previous approaches: authors model dim. maxlen algorithm comment [Yue& Dill PhysRevE93] cubic HP 3 36 branch-and-bound optimality proven [Yue&Dill PNAS95] cubic HP 3 88 branch-and-bound optimality proven [Sazhin et al. 01] cubic HP, FCC 3 34 branch-and-bound not always optimal [Cui et al. PNAS0] square HP 18 compl. enum [Hart&Istrail JCB97] FCC side chain 3 approximation 86% of optimum 3 [Agarwala et al. JMB97] FCC HP 3 approximation of optimum 5 our results: native conformation up to length proof of optimality number of conformations of length n: 4.5 n threading on 100-Hs core seq. length runtime S s S s S s S s search space handled bigger 8 only existing non-heuristic algorithm for FCC
96 Comparison of Results small selection of previous approaches: authors model dim. maxlen algorithm comment [Yue& Dill PhysRevE93] cubic HP 3 36 branch-and-bound optimality proven [Yue&Dill PNAS95] cubic HP 3 88 branch-and-bound optimality proven [Sazhin et al. 01] cubic HP, FCC 3 34 branch-and-bound not always optimal [Cui et al. PNAS0] square HP 18 compl. enum [Hart&Istrail JCB97] FCC side chain 3 approximation 86% of optimum 3 [Agarwala et al. JMB97] FCC HP 3 approximation of optimum 5 our results: native conformation up to length proof of optimality number of conformations of length n: 4.5 n threading on 100-Hs core seq. length runtime S s S s S s S s search space handled bigger 8 only existing non-heuristic algorithm for FCC
97 Comparison of Results small selection of previous approaches: authors model dim. maxlen algorithm comment [Yue& Dill PhysRevE93] cubic HP 3 36 branch-and-bound optimality proven [Yue&Dill PNAS95] cubic HP 3 88 branch-and-bound optimality proven [Sazhin et al. 01] cubic HP, FCC 3 34 branch-and-bound not always optimal [Cui et al. PNAS0] square HP 18 compl. enum [Hart&Istrail JCB97] FCC side chain 3 approximation 86% of optimum 3 [Agarwala et al. JMB97] FCC HP 3 approximation of optimum 5 our results: native conformation up to length proof of optimality number of conformations of length n: 4.5 n threading on 100-Hs core seq. length runtime S s S s S s S s search space handled bigger 8 only existing non-heuristic algorithm for FCC
98 Comparison of Results small selection of previous approaches: authors model dim. maxlen algorithm comment [Yue& Dill PhysRevE93] cubic HP 3 36 branch-and-bound optimality proven [Yue&Dill PNAS95] cubic HP 3 88 branch-and-bound optimality proven [Sazhin et al. 01] cubic HP, FCC 3 34 branch-and-bound not always optimal [Cui et al. PNAS0] square HP 18 compl. enum [Hart&Istrail JCB97] FCC side chain 3 approximation 86% of optimum 3 [Agarwala et al. JMB97] FCC HP 3 approximation of optimum 5 our results: native conformation up to length proof of optimality number of conformations of length n: 4.5 n threading on 100-Hs core seq. length runtime S s S s S s S s search space handled bigger 8 only existing non-heuristic algorithm for FCC
99 prediction of one optimal structure Runtimes (sequence length 48, Harvard sequences from [Yue et al., 1995]) Nr. sequence CPSP PERM 1 HPH P H 4 PH 3 P H P HPH 3 PHPH P H P 3 HP 8 H 0,1 s 6,9 min H 4 PH PH 5 P HP H P HP 6 HP HP 3 HP H P H 3 PH 0,1 s 40,5 min 3 PHPH PH 6 P HPHP HPH PHPHP 3 HP H P H P HPHP HP 4,5 s 100, min 4 P HP 3 HPH 4 P H 4 PH PH 3 P HPHPHP HP 6 H PH PH 1,8 s 74,7 min 5 H 3 P 3 H PHPH PH PH PHP 7 HPHP HP 3 HP H 6 PH 1,7 s 59, min 6 PHP 4 HPH 3 PHPH 4 PH PH P 3 HPHP 3 H 3 P H P H P 3 H 1,1 s 144,7 min 7 PHPH P HPH 3 P H PH P 3 H 5 P HPH PHPHP 4 HP HPHP 7,3 s 84,0 min 8 PH PH 3 PH 4 P H 3 P 6 HPH P H PHP 3 H PHPHPH P 3 1,5 s 6,6 min 9 PHPHP 4 HPHPHP HPH 6 P H 3 PHP HPH P HPH 3 P 4 H 0,3 s 140,0 min 10 PH P 6 H P 3 H 3 PHP HPH P HP HP H P H 7 P H 0,1 s 18,3 min 9 CPSP: our approach, constraint-based PERM [Bastolla et al., 1998]: stochastic optimization PERM=pruned-enriched Rosenbluth method
100 30 Applications structure prediction investigation of landscape properties degeneracy of sequences finding protein-likes sequences with unique ground state comparing different models (cubic/fcc, HP-model with HPNX)
101 31 Degeneracy degeneracy (g) of a sequence = number of structures with lowest energy known: HP-model has high degeneracy unknow: how high is it? are there sequences with g=1 (unique ground state, protein-like )? how does it compare to other models (FCC, HPNX)? how do neutral nets look like? degeneracy: can only be tested via two algorithms Sequence degeneracy found by CHCC [Yue et al] our approach HPHHPPHHHHPHHHPPHHPPHPHHHPHPHHPPHHPPPHPPPPPPPPHH 1, 500, , 677, 113 HHHHPHHPHHHHHPPHPPHHPPHPPPPPPHPPHPPPHPPHHPPHHHPH 14, 000 8, 180 PHPHHPHHHHHHPPHPHPPHPHHPHPHPPPHPPHHPPHHPPHPHPPHP 5, 000 5, 090 PHHPPPPPPHHPPPHHHPHPPHPHHPPHPPHPPHHPPHHHHHHHPPHH 188, , 751
102 31 Degeneracy degeneracy (g) of a sequence = number of structures with lowest energy known: HP-model has high degeneracy unknow: how high is it? are there sequences with g=1 (unique ground state, protein-like )? how does it compare to other models (FCC, HPNX)? how do neutral nets look like? degeneracy: can only be tested via two algorithms Sequence degeneracy found by CHCC [Yue et al] our approach HPHHPPHHHHPHHHPPHHPPHPHHHPHPHHPPHHPPPHPPPPPPPPHH 1, 500, , 677, 113 HHHHPHHPHHHHHPPHPPHHPPHPPPPPPHPPHPPPHPPHHPPHHHPH 14, 000 8, 180 PHPHHPHHHHHHPPHPHPPHPHHPHPHPPPHPPHHPPHHPPHPHPPHP 5, 000 5, 090 PHHPPPPPPHHPPPHHHPHPPHPHHPPHPPHPPHHPPHHHHHHHPPHH 188, , 751
103 Application: Design of protein-like Sequences find sequences with exactly one optimal structure stochastic local search node: accepted sequences edges: 8 7 simulation step/mutation Degeneracy Step 1
104 33 Run Time Requirements at every step: calculation/estimation of degeneracy (using our CPFL) but: runtime depends on degeneracy good news: runtime grows only linearly with degeneracy Time/s e+00 e+05 4e+05 6e+05 8e+05 1e+06 Degeneracy
105 Example: Sequences with Unique Ground-State length 64: length 80: 34 Note: previously it was assumed that HP-model has none g=1 sequences
106 Three Typical Runs a) b) c) a) 81 b) 19 c)
107 Degeneracy: FCC vs. Cubic log-degeneracy cubic HP-model: Frequency Log Degeneracy log-degeneracy FCC HP-model: Frequency Log Degeneracy
108 37 Degeneracy: HP-Model vs. HPNX-model HPNX: P=positive N=negative X=neutral should reduce the degeneracy How much? preliminary results HP: approx % of all random sequences are uniquely folding. HPNX: approx..6% of all random sequences are uniquely folding. Note: 50% H monomers example for reduction: sequence S HPNX: HXNNHHHHXHXHHNXNHXHHNHPPXHP corresp. HP: HPPPHHHHPHPHHPPPHPHHPHPPPHP
109 38 S HP-sequence: 4 out of 97
110 39 S HPNX-sequence: the 4 native ones
111 40 Connectivity of Neutral Nets S6 S S13 S4 S1 S1 S33 S36 S3 S3 S0 S31 S18 S5 S7 S8 S17 S9 S35 S4 S1 S7 S30 S9 S34 S S3 S5 S16 S6 S14 S15 S19 S8 S11 S10
112 41 WWW-Page
113 4 Conclusion constraint-based approach to protein folding guaranteed to find optima models: HP-like models: HP, HPNX lattices: cubic, FCC applications: properties of landscape degeneracy neutral nets folding tunnel
114 43 Acknowledgment Sebastian Will Erich Bornberg-Bauer Peter Stadler Michael Wolfinger
CONSTRAINT APPROACH FOR PROTEIN STRUCTURE PREDICTION IN THE SIDE CHAIN HP MODEL
CONSTRAINT APPROACH FOR PROTEIN STRUCTURE PREDICTION IN THE SIDE CHAIN HP MODEL SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE AT ALBERT-LUDWIGS UNIVERSITY OF
More informationA Branch and Bound Algorithm for the Protein Folding Problem in the HP Lattice Model
Article A Branch and Bound Algorithm for the Protein Folding Problem in the HP Lattice Model Mao Chen* and Wen-Qi Huang School of Computer Science and Technology, Huazhong University of Science and Technology,
More informationVisualization of pinfold simulations
Visualization of pinfold simulations Sebastian Pötzsch Faculty of Mathematics and Computer Science University of Leipzig Overview 1. Introduction 1.1 Protein folding problem 1.2 HP-model 1.3 Pinfold simulation
More informationCombinatorial Problems on Strings with Applications to Protein Folding
Combinatorial Problems on Strings with Applications to Protein Folding Alantha Newman 1 and Matthias Ruhl 2 1 MIT Laboratory for Computer Science Cambridge, MA 02139 alantha@theory.lcs.mit.edu 2 IBM Almaden
More informationProtein Design in the 2D HP Model
rotein Design in the 2D Model A Monte-Carlo Iterative Design Approach Reza Lotun and Camilo Rostoker {rlotun,rostokec}@cs.ubc.ca Department of Computer Science, UBC 1 resentation Outline 1. Review of proteins
More informationLattice and O-Lattice Side Chain Models of Protein Folding: Linear Time Structure Prediction Better Than 86% of Optimal.
Lattice and O-Lattice Side Chain Models of Protein Folding: Linear Time Structure Prediction Better Than 86% of Optimal (Extended Abstract) William E. Hart Sorin Istrail y Abstract This paper considers
More informationConformations of Proteins on Lattice Models. Jiangbo Miao Natalie Kantz
Conformations of Proteins on Lattice Models Jiangbo Miao Natalie Kantz Lattice Model The lattice model offers a discrete space that limits the infinite number of protein conformations to the lattice space.
More informationHardware-Software Codesign
Hardware-Software Codesign 4. System Partitioning Lothar Thiele 4-1 System Design specification system synthesis estimation SW-compilation intellectual prop. code instruction set HW-synthesis intellectual
More informationA FILTERING TECHNIQUE FOR FRAGMENT ASSEMBLY- BASED PROTEINS LOOP MODELING WITH CONSTRAINTS
A FILTERING TECHNIQUE FOR FRAGMENT ASSEMBLY- BASED PROTEINS LOOP MODELING WITH CONSTRAINTS F. Campeotto 1,2 A. Dal Palù 3 A. Dovier 2 F. Fioretto 1 E. Pontelli 1 1. Dept. Computer Science, NMSU 2. Dept.
More informationarxiv: v1 [cs.ce] 31 Oct 2013
A Hybrid Local Search for Simplified Protein Structure Prediction Swakkhar Shatabda, M.A. Hakim Newton, Duc Nghia Pham and Abdul Sattar arxiv:1310.8583v1 [cs.ce] 31 Oct 2013 Abstract Protein structure
More informationApplication of Constraint Programming Techniques for Structure Prediction of Lattice Proteins with Extended Alphabets Rolf Backofen, Sebastian Will In
Application of Constraint Programming Techniques for Structure Prediction of Lattice Proteins with Extended Alphabets Rolf Backofen, Sebastian Will Institut fur Informatik, LMU Munchen Oettingenstrae 67,
More informationExample: Map coloring
Today s s lecture Local Search Lecture 7: Search - 6 Heuristic Repair CSP and 3-SAT Solving CSPs using Systematic Search. Victor Lesser CMPSCI 683 Fall 2004 The relationship between problem structure and
More informationLocal Search for CSPs
Local Search for CSPs Alan Mackworth UBC CS CSP February, 0 Textbook. Lecture Overview Domain splitting: recap, more details & pseudocode Local Search Time-permitting: Stochastic Local Search (start) Searching
More informationA Parallel Implementation of a Higher-order Self Consistent Mean Field. Effectively solving the protein repacking problem is a key step to successful
Karl Gutwin May 15, 2005 18.336 A Parallel Implementation of a Higher-order Self Consistent Mean Field Effectively solving the protein repacking problem is a key step to successful protein design. Put
More informationML phylogenetic inference and GARLI. Derrick Zwickl. University of Arizona (and University of Kansas) Workshop on Molecular Evolution 2015
ML phylogenetic inference and GARLI Derrick Zwickl University of Arizona (and University of Kansas) Workshop on Molecular Evolution 2015 Outline Heuristics and tree searches ML phylogeny inference and
More informationJoint Entity Resolution
Joint Entity Resolution Steven Euijong Whang, Hector Garcia-Molina Computer Science Department, Stanford University 353 Serra Mall, Stanford, CA 94305, USA {swhang, hector}@cs.stanford.edu No Institute
More informationData Mining Technologies for Bioinformatics Sequences
Data Mining Technologies for Bioinformatics Sequences Deepak Garg Computer Science and Engineering Department Thapar Institute of Engineering & Tecnology, Patiala Abstract Main tool used for sequence alignment
More informationAn Improved Ant Colony Optimisation Algorithm for the 2D HP Protein Folding Problem
An Improved Ant Colony Optimisation Algorithm for the 2D HP Protein Folding Problem Alena Shmygelska and Holger H. Hoos Department of Computer Science, University of British Columbia, Vancouver, B.C.,
More informationCOMBINATORIAL ALGORITHMS FOR BIPOLE SELF-ASSEMBLY ON LATTICES WITH APPLICATIONS TO THE LIPID BILAYER
COMBINATORIAL ALGORITHMS FOR BIPOLE SELF-ASSEMBLY ON LATTICES WITH APPLICATIONS TO THE LIPID BILAYER KSHITIJ LAURIA AND SORIN ISTRAIL Abstract. Phospholipid bilayer self-assembly is modeled as a discrete
More informationA Hybrid Population based ACO Algorithm for Protein Folding
A Hybrid Population based ACO Algorithm for Protein Folding Torsten Thalheim, Daniel Merkle, Martin Middendorf, Abstract A hybrid population based Ant Colony Optimization (ACO) algorithm PFold-P-ACO for
More informationSpecial course in Computer Science: Advanced Text Algorithms
Special course in Computer Science: Advanced Text Algorithms Lecture 8: Multiple alignments Elena Czeizler and Ion Petre Department of IT, Abo Akademi Computational Biomodelling Laboratory http://www.users.abo.fi/ipetre/textalg
More informationConstraint Satisfaction Problems
Constraint Satisfaction Problems Search and Lookahead Bernhard Nebel, Julien Hué, and Stefan Wölfl Albert-Ludwigs-Universität Freiburg June 4/6, 2012 Nebel, Hué and Wölfl (Universität Freiburg) Constraint
More informationMidterm Examination CS540-2: Introduction to Artificial Intelligence
Midterm Examination CS540-2: Introduction to Artificial Intelligence March 15, 2018 LAST NAME: FIRST NAME: Problem Score Max Score 1 12 2 13 3 9 4 11 5 8 6 13 7 9 8 16 9 9 Total 100 Question 1. [12] Search
More informationPseudo code of algorithms are to be read by.
Cs502 Quiz No1 Complete Solved File Pseudo code of algorithms are to be read by. People RAM Computer Compiler Approach of solving geometric problems by sweeping a line across the plane is called sweep.
More informationLast topic: Summary; Heuristics and Approximation Algorithms Topics we studied so far:
Last topic: Summary; Heuristics and Approximation Algorithms Topics we studied so far: I Strength of formulations; improving formulations by adding valid inequalities I Relaxations and dual problems; obtaining
More informationCISC 889 Bioinformatics (Spring 2003) Multiple Sequence Alignment
CISC 889 Bioinformatics (Spring 2003) Multiple Sequence Alignment Courtesy of jalview 1 Motivations Collective statistic Protein families Identification and representation of conserved sequence features
More information6 Randomized rounding of semidefinite programs
6 Randomized rounding of semidefinite programs We now turn to a new tool which gives substantially improved performance guarantees for some problems We now show how nonlinear programming relaxations can
More informationA Complete and Effective Move Set for Simplified Protein Folding
A Complete and Effective Move Set for Simplified Protein Folding Neal Lesh Mitsubishi Electric Research Laboratories 201 Broadway Cambridge, MA, 02139 lesh@merl.com Michael Mitzenmacher Sue Whitesides
More informationTask Description: Finding Similar Documents. Document Retrieval. Case Study 2: Document Retrieval
Case Study 2: Document Retrieval Task Description: Finding Similar Documents Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade April 11, 2017 Sham Kakade 2017 1 Document
More informationAnytime AND/OR Depth-first Search for Combinatorial Optimization
Anytime AND/OR Depth-first Search for Combinatorial Optimization Lars Otten and Rina Dechter Dept. of Computer Science University of California, Irvine Outline AND/OR Search Spaces. Conflict: Decomposition
More informationDecision Problems. Observation: Many polynomial algorithms. Questions: Can we solve all problems in polynomial time? Answer: No, absolutely not.
Decision Problems Observation: Many polynomial algorithms. Questions: Can we solve all problems in polynomial time? Answer: No, absolutely not. Definition: The class of problems that can be solved by polynomial-time
More informationParsimony-Based Approaches to Inferring Phylogenetic Trees
Parsimony-Based Approaches to Inferring Phylogenetic Trees BMI/CS 576 www.biostat.wisc.edu/bmi576.html Mark Craven craven@biostat.wisc.edu Fall 0 Phylogenetic tree approaches! three general types! distance:
More informationLecture overview. Knowledge-based systems in Bioinformatics, 1MB602, Goal based agents. Search terminology. Specifying a search problem
Lecture overview Knowledge-based systems in ioinformatics, 1M602, 2006 Lecture 6: earch oal based agents earch terminology pecifying a search problem earch considerations Uninformed search euristic methods
More informationBranch & Bound (B&B) and Constraint Satisfaction Problems (CSPs)
Branch & Bound (B&B) and Constraint Satisfaction Problems (CSPs) Alan Mackworth UBC CS 322 CSP 1 January 25, 2013 P&M textbook 3.7.4 & 4.0-4.2 Lecture Overview Recap Branch & Bound Wrap up of search module
More informationA Re-examination of Limited Discrepancy Search
A Re-examination of Limited Discrepancy Search W. Ken Jackson, Morten Irgens, and William S. Havens Intelligent Systems Lab, Centre for Systems Science Simon Fraser University Burnaby, B.C., CANADA V5A
More informationEmpirical Comparisons of Fast Methods
Empirical Comparisons of Fast Methods Dustin Lang and Mike Klaas {dalang, klaas}@cs.ubc.ca University of British Columbia December 17, 2004 Fast N-Body Learning - Empirical Comparisons p. 1 Sum Kernel
More informationTrees, Trees and More Trees
Trees, Trees and More Trees August 9, 01 Andrew B. Kahng abk@cs.ucsd.edu http://vlsicad.ucsd.edu/~abk/ How You ll See Trees in CS Trees as mathematical objects Trees as data structures Trees as tools for
More informationRouting. Robust Channel Router. Figures taken from S. Gerez, Algorithms for VLSI Design Automation, Wiley, 1998
Routing Robust Channel Router Figures taken from S. Gerez, Algorithms for VLSI Design Automation, Wiley, 1998 Channel Routing Algorithms Previous algorithms we considered only work when one of the types
More informationLocal Search and Optimization Chapter 4. Mausam (Based on slides of Padhraic Smyth, Stuart Russell, Rao Kambhampati, Raj Rao, Dan Weld )
Local Search and Optimization Chapter 4 Mausam (Based on slides of Padhraic Smyth, Stuart Russell, Rao Kambhampati, Raj Rao, Dan Weld ) 1 Outline Local search techniques and optimization Hill-climbing
More informationThis is the search strategy that we are still using:
About Search This is the search strategy that we are still using: function : if a solution has been found: return true if the CSP is infeasible: return false for in : if : return true return false Let
More informationA Deterministic Optimization Approach to Protein Sequence Design Using Continuous Models
Sung K. Koh Mechanical Engineering and Applied Mechanics University of Pennsylvania Philadelphia, 19104-6315, USA G. K. Ananthasuresh Mechanical Engineering and Applied Mechanics University of Pennsylvania
More informationMachine Learning for Software Engineering
Machine Learning for Software Engineering Introduction and Motivation Prof. Dr.-Ing. Norbert Siegmund Intelligent Software Systems 1 2 Organizational Stuff Lectures: Tuesday 11:00 12:30 in room SR015 Cover
More informationOnline Graph Exploration
Distributed Computing Online Graph Exploration Semester thesis Simon Hungerbühler simonhu@ethz.ch Distributed Computing Group Computer Engineering and Networks Laboratory ETH Zürich Supervisors: Sebastian
More informationChapter S:II. II. Search Space Representation
Chapter S:II II. Search Space Representation Systematic Search Encoding of Problems State-Space Representation Problem-Reduction Representation Choosing a Representation S:II-1 Search Space Representation
More informationData mining with sparse grids
Data mining with sparse grids Jochen Garcke and Michael Griebel Institut für Angewandte Mathematik Universität Bonn Data mining with sparse grids p.1/40 Overview What is Data mining? Regularization networks
More informationMidterm Examination CS 540-2: Introduction to Artificial Intelligence
Midterm Examination CS 54-2: Introduction to Artificial Intelligence March 9, 217 LAST NAME: FIRST NAME: Problem Score Max Score 1 15 2 17 3 12 4 6 5 12 6 14 7 15 8 9 Total 1 1 of 1 Question 1. [15] State
More information8/19/13. Computational problems. Introduction to Algorithm
I519, Introduction to Introduction to Algorithm Yuzhen Ye (yye@indiana.edu) School of Informatics and Computing, IUB Computational problems A computational problem specifies an input-output relationship
More informationOn the Space-Time Trade-off in Solving Constraint Satisfaction Problems*
Appeared in Proc of the 14th Int l Joint Conf on Artificial Intelligence, 558-56, 1995 On the Space-Time Trade-off in Solving Constraint Satisfaction Problems* Roberto J Bayardo Jr and Daniel P Miranker
More informationA CSP Search Algorithm with Reduced Branching Factor
A CSP Search Algorithm with Reduced Branching Factor Igor Razgon and Amnon Meisels Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, 84-105, Israel {irazgon,am}@cs.bgu.ac.il
More informationOn Lattice Protein Structure Prediction Revisited
1 On Lattice Protein Structure Prediction Revisited Ivan Dotu Manuel Cebrián Pascal Van Hentenryck Peter Clote Abstract Protein structure prediction is regarded as a highly challenging problem both for
More informationFramework for Design of Dynamic Programming Algorithms
CSE 441T/541T Advanced Algorithms September 22, 2010 Framework for Design of Dynamic Programming Algorithms Dynamic programming algorithms for combinatorial optimization generalize the strategy we studied
More informationConstraint Satisfaction Problems
Constraint Satisfaction Problems CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2013 Soleymani Course material: Artificial Intelligence: A Modern Approach, 3 rd Edition,
More informationLocal Search and Optimization Chapter 4. Mausam (Based on slides of Padhraic Smyth, Stuart Russell, Rao Kambhampati, Raj Rao, Dan Weld )
Local Search and Optimization Chapter 4 Mausam (Based on slides of Padhraic Smyth, Stuart Russell, Rao Kambhampati, Raj Rao, Dan Weld ) 1 2 Outline Local search techniques and optimization Hill-climbing
More informationLocal Search and Optimization Chapter 4. Mausam (Based on slides of Padhraic Smyth, Stuart Russell, Rao Kambhampati, Raj Rao, Dan Weld )
Local Search and Optimization Chapter 4 Mausam (Based on slides of Padhraic Smyth, Stuart Russell, Rao Kambhampati, Raj Rao, Dan Weld ) 1 2 Outline Local search techniques and optimization Hill-climbing
More informationAlgorithms for Integer Programming
Algorithms for Integer Programming Laura Galli November 9, 2016 Unlike linear programming problems, integer programming problems are very difficult to solve. In fact, no efficient general algorithm is
More informationNotes for Lecture 24
U.C. Berkeley CS170: Intro to CS Theory Handout N24 Professor Luca Trevisan December 4, 2001 Notes for Lecture 24 1 Some NP-complete Numerical Problems 1.1 Subset Sum The Subset Sum problem is defined
More informationIdentifying network modules
Network biology minicourse (part 3) Algorithmic challenges in genomics Identifying network modules Roded Sharan School of Computer Science, Tel Aviv University Gene/Protein Modules A module is a set of
More informationExploring the HP Model for Protein Folding
Exploring the HP Model for Protein Folding A Major Qualifying Project Report submitted to the Faculty of WORCESTER POLYTECHNIC INSTITUTE in partial fulfillment of the requirements for the Degree of Bachelor
More informationN-Queens problem. Administrative. Local Search
Local Search CS151 David Kauchak Fall 2010 http://www.youtube.com/watch?v=4pcl6-mjrnk Some material borrowed from: Sara Owsley Sood and others Administrative N-Queens problem Assign 1 grading Assign 2
More informationPredicting Protein Folding Paths. S.Will, , Fall 2011
Predicting Protein Folding Paths Protein Folding by Robotics Probabilistic Roadmap Planning (PRM): Thomas, Song, Amato. Protein folding by motion planning. Phys. Biol., 2005 Aims Find good quality folding
More information4 INFORMED SEARCH AND EXPLORATION. 4.1 Heuristic Search Strategies
55 4 INFORMED SEARCH AND EXPLORATION We now consider informed search that uses problem-specific knowledge beyond the definition of the problem itself This information helps to find solutions more efficiently
More informationData Mining. Covering algorithms. Covering approach At each stage you identify a rule that covers some of instances. Fig. 4.
Data Mining Chapter 4. Algorithms: The Basic Methods (Covering algorithm, Association rule, Linear models, Instance-based learning, Clustering) 1 Covering approach At each stage you identify a rule that
More informationClustering. CE-717: Machine Learning Sharif University of Technology Spring Soleymani
Clustering CE-717: Machine Learning Sharif University of Technology Spring 2016 Soleymani Outline Clustering Definition Clustering main approaches Partitional (flat) Hierarchical Clustering validation
More informationOn the Space-Time Trade-off in Solving Constraint Satisfaction Problems*
On the Space-Time Trade-off in Solving Constraint Satisfaction Problems* Roberto J. Bayardo Jr. and Daniel P. Miranker Department of Computer Sciences and Applied Research Laboratories The University of
More informationOn the Rectangle Escape Problem
On the Rectangle Escape Problem A. Ahmadinejad S. Assadi E. Emamjomeh-Zadeh S. Yazdanbod H. Zarrabi-Zadeh Abstract Motivated by the bus escape routing problem in printed circuit boards, we study the following
More informationTwo-dimensional Totalistic Code 52
Two-dimensional Totalistic Code 52 Todd Rowland Senior Research Associate, Wolfram Research, Inc. 100 Trade Center Drive, Champaign, IL The totalistic two-dimensional cellular automaton code 52 is capable
More information4 Search Problem formulation (23 points)
4 Search Problem formulation (23 points) Consider a Mars rover that has to drive around the surface, collect rock samples, and return to the lander. We want to construct a plan for its exploration. It
More informationExamples of P vs NP: More Problems
Examples of P vs NP: More Problems COMP1600 / COMP6260 Dirk Pattinson Australian National University Semester 2, 2017 Catch Up / Drop in Lab When Fridays, 15.00-17.00 Where N335, CSIT Building (bldg 108)
More informationIntroduction to Randomized Algorithms
Introduction to Randomized Algorithms Gopinath Mishra Advanced Computing and Microelectronics Unit Indian Statistical Institute Kolkata 700108, India. Organization 1 Introduction 2 Some basic ideas from
More informationThe MIP-Solving-Framework SCIP
The MIP-Solving-Framework SCIP Timo Berthold Zuse Institut Berlin DFG Research Center MATHEON Mathematics for key technologies Berlin, 23.05.2007 What Is A MIP? Definition MIP The optimization problem
More informationCost Optimal Parallel Algorithm for 0-1 Knapsack Problem
Cost Optimal Parallel Algorithm for 0-1 Knapsack Problem Project Report Sandeep Kumar Ragila Rochester Institute of Technology sr5626@rit.edu Santosh Vodela Rochester Institute of Technology pv8395@rit.edu
More informationConstraint Programming
Depth-first search Let us go back to foundations: DFS = Depth First Search Constraint Programming Roman Barták Department of Theoretical Computer Science and Mathematical Logic 2 3 4 5 6 7 8 9 Observation:
More informationCS 473: Algorithms. Ruta Mehta. Spring University of Illinois, Urbana-Champaign. Ruta (UIUC) CS473 1 Spring / 36
CS 473: Algorithms Ruta Mehta University of Illinois, Urbana-Champaign Spring 2018 Ruta (UIUC) CS473 1 Spring 2018 1 / 36 CS 473: Algorithms, Spring 2018 LP Duality Lecture 20 April 3, 2018 Some of the
More informationIntroduction VLSI PHYSICAL DESIGN AUTOMATION
VLSI PHYSICAL DESIGN AUTOMATION PROF. INDRANIL SENGUPTA DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING Introduction Main steps in VLSI physical design 1. Partitioning and Floorplanning l 2. Placement 3.
More informationLecture 18. Questions? Monday, February 20 CS 430 Artificial Intelligence - Lecture 18 1
Lecture 18 Questions? Monday, February 20 CS 430 Artificial Intelligence - Lecture 18 1 Outline Chapter 6 - Constraint Satisfaction Problems Path Consistency & Global Constraints Sudoku Example Backtracking
More informationCS 581. Tandy Warnow
CS 581 Tandy Warnow This week Maximum parsimony: solving it on small datasets Maximum Likelihood optimization problem Felsenstein s pruning algorithm Bayesian MCMC methods Research opportunities Maximum
More informationSearch and Optimization
Search and Optimization Search, Optimization and Game-Playing The goal is to find one or more optimal or sub-optimal solutions in a given search space. We can either be interested in finding any one solution
More informationNew Implementation for the Multi-sequence All-Against-All Substring Matching Problem
New Implementation for the Multi-sequence All-Against-All Substring Matching Problem Oana Sandu Supervised by Ulrike Stege In collaboration with Chris Upton, Alex Thomo, and Marina Barsky University of
More informationHeuristic Optimization Introduction and Simple Heuristics
Heuristic Optimization Introduction and Simple Heuristics José M PEÑA (jmpena@fi.upm.es) (Universidad Politécnica de Madrid) 1 Outline 1. What are optimization problems? 2. Exhaustive vs. Heuristic approaches
More informationGiovanni De Micheli. Integrated Systems Centre EPF Lausanne
Two-level Logic Synthesis and Optimization Giovanni De Micheli Integrated Systems Centre EPF Lausanne This presentation can be used for non-commercial purposes as long as this note and the copyright footers
More informationNeighborhood Selection in Constraint-Based Local Search for Protein Structure Prediction
Neighborhood Selection in Constraint-Based Local Search for Protein Structure Prediction Author Shatabda, Swakkhar, Newton, MAHakim, Sattar, Abdul Published 2013 Conference Title Lecture Notes in Computer
More informationReview of the Robust K-means Algorithm and Comparison with Other Clustering Methods
Review of the Robust K-means Algorithm and Comparison with Other Clustering Methods Ben Karsin University of Hawaii at Manoa Information and Computer Science ICS 63 Machine Learning Fall 8 Introduction
More informationLearning factorizations in estimation of distribution algorithms using affinity propagation
Learning factorizations in estimation of distribution algorithms using affinity propagation Roberto Santana 1, Pedro Larrañaga 2,and Jose A. Lozano 1 1 Intelligent Systems Group Department of Computer
More informationCSC384 Test 1 Sample Questions
CSC384 Test 1 Sample Questions October 27, 2015 1 Short Answer 1. Is A s search behavior necessarily exponentially explosive?. That is, does its search time always grow at least exponentially with the
More informationLattice and Off-Lattice Side Chain Models of Protein Folding: Linear Time Structure Prediction Better Than 86% of Optimal (Extended Abstract)*
Lattice and OffLattice Side Chain Models of Protein Folding Linear Time Structure Prediction Better Than 86% of Optimal (Extended Abstract)* William E Hart+ Sorin strail$ August 9 1996 Abstract This paper
More informationTechnical University of Munich. Exercise 8: Neural Networks
Technical University of Munich Chair for Bioinformatics and Computational Biology Protein Prediction I for Computer Scientists SoSe 2018 Prof. B. Rost M. Bernhofer, M. Heinzinger, D. Nechaev, L. Richter
More informationWhat is Search For? CSE 473: Artificial Intelligence. Example: N-Queens. Example: N-Queens. Example: Map-Coloring 4/7/17
CSE 473: Artificial Intelligence Constraint Satisfaction Dieter Fox What is Search For? Models of the world: single agent, deterministic actions, fully observed state, discrete state space Planning: sequences
More informationToday. CS 188: Artificial Intelligence Fall Example: Boolean Satisfiability. Reminder: CSPs. Example: 3-SAT. CSPs: Queries.
CS 188: Artificial Intelligence Fall 2007 Lecture 5: CSPs II 9/11/2007 More CSPs Applications Tree Algorithms Cutset Conditioning Today Dan Klein UC Berkeley Many slides over the course adapted from either
More informationUninformed Search Methods. Informed Search Methods. Midterm Exam 3/13/18. Thursday, March 15, 7:30 9:30 p.m. room 125 Ag Hall
Midterm Exam Thursday, March 15, 7:30 9:30 p.m. room 125 Ag Hall Covers topics through Decision Trees and Random Forests (does not include constraint satisfaction) Closed book 8.5 x 11 sheet with notes
More informationEECS730: Introduction to Bioinformatics
EECS730: Introduction to Bioinformatics Lecture 06: Multiple Sequence Alignment https://upload.wikimedia.org/wikipedia/commons/thumb/7/79/rplp0_90_clustalw_aln.gif/575px-rplp0_90_clustalw_aln.gif Slides
More informationAdministrative. Local Search!
Administrative Local Search! CS311 David Kauchak Spring 2013 Assignment 2 due Tuesday before class Written problems 2 posted Class participation http://www.youtube.com/watch? v=irhfvdphfzq&list=uucdoqrpqlqkvctckzqa
More informationWhat is a Pattern Database? Pattern Databases. Success Story #1. Success Story #2. Robert Holte Computing Science Dept. University of Alberta
What is a Pattern Database Pattern Databases Robert Holte Computing Science Dept. University of Alberta November, 00 PDB = a heuristic stored as a lookup table Invented by Culberson and Schaeffer (99)
More informationAlgorithms for Bioinformatics
Adapted from slides by Leena Salmena and Veli Mäkinen, which are partly from http: //bix.ucsd.edu/bioalgorithms/slides.php. 582670 Algorithms for Bioinformatics Lecture 6: Distance based clustering and
More informationCode generation for modern processors
Code generation for modern processors Definitions (1 of 2) What are the dominant performance issues for a superscalar RISC processor? Refs: AS&U, Chapter 9 + Notes. Optional: Muchnick, 16.3 & 17.1 Instruction
More informationSolving lexicographic multiobjective MIPs with Branch-Cut-Price
Solving lexicographic multiobjective MIPs with Branch-Cut-Price Marta Eso (The Hotchkiss School) Laszlo Ladanyi (IBM T.J. Watson Research Center) David Jensen (IBM T.J. Watson Research Center) McMaster
More informationCS 331: Artificial Intelligence Local Search 1. Tough real-world problems
CS 331: Artificial Intelligence Local Search 1 1 Tough real-world problems Suppose you had to solve VLSI layout problems (minimize distance between components, unused space, etc.) Or schedule airlines
More informationCoping with the Limitations of Algorithm Power Exact Solution Strategies Backtracking Backtracking : A Scenario
Coping with the Limitations of Algorithm Power Tackling Difficult Combinatorial Problems There are two principal approaches to tackling difficult combinatorial problems (NP-hard problems): Use a strategy
More information=1, and F n. for all n " 2. The recursive definition of Fibonacci numbers immediately gives us a recursive algorithm for computing them:
Wavefront Pattern I. Problem Data elements are laid out as multidimensional grids representing a logical plane or space. The dependency between the elements, often formulated by dynamic programming, results
More informationCode generation for modern processors
Code generation for modern processors What are the dominant performance issues for a superscalar RISC processor? Refs: AS&U, Chapter 9 + Notes. Optional: Muchnick, 16.3 & 17.1 Strategy il il il il asm
More informationBinary recursion. Unate functions. If a cover C(f) is unate in xj, x, then f is unate in xj. x
Binary recursion Unate unctions! Theorem I a cover C() is unate in,, then is unate in.! Theorem I is unate in,, then every prime implicant o is unate in. Why are unate unctions so special?! Special Boolean
More information