Network Based Models For Analysis of SNPs Yalta Opt

Network Based Models For Analysis of SNPs
Yalta Optimization Conference 2010: Network Science
Zeynep Ertem*, Sergiy Butenko*, Clare Gill**
*Department of Industrial and Systems Engineering, **Department of Animal Science, Texas A&M University, College Station, TX 77843-3131
July 29, 2010

Outline
1 Introduction to Graph Theory
2
3
4
5

Introduction to Graph Theory
G = (V, E) is a simple undirected graph.
V = {1, 2, ..., n} is the set of vertices.
E ⊆ V × V is the set of edges (arcs, lines).
A subset C ⊆ V is called a clique if G(C) is complete, i.e. it has all possible edges.

Introduction to Graph Theory
A graph without cycles is acyclic.
A graph is connected if there is a path between any pair of vertices.
A tree is a simple, undirected, connected, acyclic graph.

Introduction to Graph Theory
Ḡ = (V, Ē) is the complement graph of G = (V, E), where Ē = {(i, j) : i, j ∈ V, i ≠ j and (i, j) ∉ E}.
For S ⊆ V, G(S) = (S, E ∩ (S × S)) is the subgraph induced by S.

Introduction to Graph Theory
A subset C ⊆ V is called a clique if G(C) is complete, i.e. it has all possible edges.
A subset I ⊆ V is called an independent set (stable set, vertex packing) if G(I) has no edges.
A clique (independent set) is said to be
maximal, if it is not a subset of any larger clique (independent set);
maximum, if there is no larger clique (independent set) in the graph.

k-plex
Given a positive integer k, a k-plex is a subset of vertices C such that each vertex v ∈ C is adjacent to all but at most k vertices in C. Here v itself counts among the k, so each v ∈ C has at least |C| − k neighbors in C.
A 1-plex is a clique.
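The definition can be checked directly with the degree condition: every vertex of C needs at least |C| − k neighbors inside C. The sketch below is only an illustration; the function name `is_k_plex`, the adjacency-set representation and the 4-cycle example are assumptions, not part of the talk.

```python
def is_k_plex(adj, subset, k):
    """Return True if `subset` is a k-plex of the graph described by `adj`.

    adj maps each vertex to the set of its neighbours
    (simple undirected graph, no self-loops).
    """
    members = set(subset)
    # Each member needs at least |C| - k neighbours inside C.
    return all(len(adj[v] & members) >= len(members) - k for v in members)


# Hypothetical example: the 4-cycle 1-2-3-4-1.
adj = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}

print(is_k_plex(adj, {1, 2}, 1))        # True: a single edge is a clique (1-plex)
print(is_k_plex(adj, {1, 2, 3, 4}, 1))  # False: the 4-cycle is not a clique
print(is_k_plex(adj, {1, 2, 3, 4}, 2))  # True: each vertex misses only one other vertex
```

With k = 1 the condition reduces to "adjacent to every other vertex of C", which is the clique definition.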

Graph Theory Basics
α(G) is the independence (stability) number of G.
ω(G) is the clique number of G.
VC ⊆ V is a vertex cover if every edge has at least one endpoint in VC.

Graph Theory Basics
I is a maximum independent set of G ⇔ I is a maximum clique of Ḡ ⇔ V \ I is a minimum vertex cover of G.
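On a small graph these equivalences can be confirmed by brute force: the independence number of G, the clique number of Ḡ, and |V| minus the size of a minimum vertex cover of G should coincide. The following is a sketch on a made-up 5-cycle; all names are illustrative, and the enumeration is exponential, so it is only meant for tiny examples.

```python
from itertools import combinations

# Hypothetical example graph: the 5-cycle 1-2-3-4-5-1.
V = [1, 2, 3, 4, 5]
E = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4), (4, 5), (1, 5)]}
# Edge set of the complement graph: all vertex pairs not in E.
E_bar = {frozenset(p) for p in combinations(V, 2)} - E


def is_independent(S, edges):
    return not any(frozenset(p) in edges for p in combinations(S, 2))


def is_clique(S, edges):
    return all(frozenset(p) in edges for p in combinations(S, 2))


def is_vertex_cover(S, edges):
    # Every edge must share at least one endpoint with S.
    return all(e & S for e in edges)


def subsets(items):
    for size in range(len(items) + 1):
        for cand in combinations(items, size):
            yield set(cand)


alpha = max(len(S) for S in subsets(V) if is_independent(S, E))
omega_bar = max(len(S) for S in subsets(V) if is_clique(S, E_bar))
tau = min(len(S) for S in subsets(V) if is_vertex_cover(S, E))

print(alpha, omega_bar, len(V) - tau)  # 2 2 2
```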

Graph Theory Problems
Maximum clique problem (MCP)
Finding the largest k-plex in a given graph G
Maximum independent set problem (MISP)
Minimum vertex cover problem (MVC)
The MC, MIS and MVC problems are NP-hard.

Graph Theory Problems and SNPs
Constructing a complete map of all SNPs occurring in the human genome is one of the most important goals.
What are those?

SNP: Single-Nucleotide Polymorphism
A DNA sequence variation.
Relevant to disease development and to the response to pathogens, drugs...
SNPs do not cause diseases; however, they determine susceptibility to them.
They occur between members of a species or between paired chromosomes in an individual.

The difference is in a single nucleotide: ATTCGA vs. ATTTGA.
Human DNA contains more than 10 million SNPs; however, only a few million have been discovered so far.
Types of variation: substitution, deletion, insertion.
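For the substitution case, the differing site in the two example sequences can be located by comparing them position by position. This is a minimal sketch; the function name is made up, and it assumes the two sequences are already aligned and of equal length, so insertions and deletions are not handled.

```python
def substitution_sites(seq_a, seq_b):
    """Return (position, base_a, base_b) for each mismatch; positions are 1-based."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


print(substitution_sites("ATTCGA", "ATTTGA"))  # [(4, 'C', 'T')]
```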

SNPs occur once in every 100 to 300 nucleotides in the genome.
Two of every three SNPs are substitutions between C and T.
90% of genetic variation is attributed to SNPs.

SNP in Genomic Sequences

Tag SNPs
A tag SNP is a representative SNP in a region of the genome with high linkage disequilibrium.
With the help of tag SNPs it is possible to identify genetic variation without genotyping every SNP in a chromosomal region.
This reduces experimental cost.

Linkage Disequilibrium
Nonrandom association of alleles at two or more loci.
Can be calculated as the difference between observed and expected allelic frequencies.
Alleles: two or more different bases can occur at a SNP; these bases are called alleles, e.g. A and C.
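One common way to make "observed minus expected" concrete for two biallelic loci is the coefficient D = p_AB − p_A·p_B, with the squared correlation r² = D² / (p_A(1 − p_A)·p_B(1 − p_B)) as a normalized measure. The sketch below computes both from a list of phased haplotypes; the function name and the eight-haplotype sample are invented for illustration and are not data from the talk.

```python
def linkage_disequilibrium(haplotypes, allele_1, allele_2):
    """Compute (D, r^2) for allele_1 at locus 1 and allele_2 at locus 2.

    haplotypes: list of (allele at locus 1, allele at locus 2) tuples.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == allele_1 for h in haplotypes) / n          # frequency at locus 1
    p_b = sum(h[1] == allele_2 for h in haplotypes) / n          # frequency at locus 2
    p_ab = sum(h == (allele_1, allele_2) for h in haplotypes) / n  # observed joint frequency
    d = p_ab - p_a * p_b                                         # observed minus expected
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")            # undefined if a locus is fixed
    return d, r2


# Made-up sample of 8 haplotypes.
sample = [("A", "C")] * 4 + [("G", "T")] * 3 + [("A", "T")]
d, r2 = linkage_disequilibrium(sample, "A", "C")
print(d, round(r2, 3))  # 0.1875 0.6
```

When the two loci are independent, p_AB equals p_A·p_B and both D and r² are 0.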

Linkage Disequilibrium
Generally, LD is high when the distance between two SNPs is low.
It is influenced by a number of factors:
genetic linkage
selection
the rate of recombination
the rate of mutation
non-random mating
population structure

Linkage Disequilibrium

Data
With the help of the HapMap Project (Haplotype Map Project), genomic data can be mapped.
Data set: cattle chromosome 1.

Data
Different threshold values ranging from 0.1 to 0.5 are used for the r² values.
Largest k-plexes are found. But how?
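The slides do not spell out how the thresholds turn r² values into a network. A plausible reading, stated here as an assumption, is that each SNP is a vertex and two SNPs are joined by an edge when their pairwise r² reaches the threshold. The sketch below builds such a graph from a made-up 4×4 r² matrix; the function name and the use of ≥ rather than > are illustrative choices.

```python
def threshold_graph(r2, threshold):
    """Adjacency sets of the graph with an edge i-j whenever r2[i][j] >= threshold.

    r2 is a symmetric matrix (list of lists) of pairwise r^2 values.
    """
    n = len(r2)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if r2[i][j] >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


# Made-up symmetric r^2 matrix for 4 SNPs (not data from the talk).
r2 = [
    [1.0, 0.6, 0.3, 0.05],
    [0.6, 1.0, 0.45, 0.2],
    [0.3, 0.45, 1.0, 0.12],
    [0.05, 0.2, 0.12, 1.0],
]

for t in (0.1, 0.3, 0.5):
    adj_t = threshold_graph(r2, t)
    n_edges = sum(len(nb) for nb in adj_t.values()) // 2
    print(t, n_edges)
```

Raising the threshold can only remove edges, so under this reading the graphs for larger thresholds are subgraphs of those for smaller ones.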

How to find k-plexes?
k-plexes were introduced in social network analysis by Seidman and Foster (1978).
A clique is the ideal model of a cohesive subgroup, but it is not practical due to its restrictive nature; k-plexes relax it.
Östergård's algorithm (1999): branch-and-bound algorithm for the maximum-weight clique problem.
Balasundaram et al. (2009): branch-and-cut algorithm for k-plex detection.
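To show what "finding the largest k-plex" means computationally, here is a plain backtracking search with a simple size bound. It is not Östergård's algorithm and not the branch-and-cut method of Balasundaram et al.; it is only a small exact sketch that relies on the fact that every subset of a k-plex is again a k-plex. All names and the example graph are hypothetical, and the running time is exponential.

```python
def is_k_plex(adj, members, k):
    # Each member needs at least |C| - k neighbours inside C.
    return all(len(adj[v] & members) >= len(members) - k for v in members)


def maximum_k_plex(adj, k):
    """Return one largest k-plex of the graph `adj` (vertex -> set of neighbours)."""
    vertices = sorted(adj)
    best = set()

    def extend(current, start):
        nonlocal best
        if len(current) > len(best):
            best = set(current)
        for idx in range(start, len(vertices)):
            # Bound: even taking every remaining vertex cannot beat the incumbent.
            if len(current) + (len(vertices) - idx) <= len(best):
                return
            candidate = current | {vertices[idx]}
            # k-plexes are hereditary: an infeasible set stays infeasible
            # when more vertices are added, so this branch can be cut.
            if is_k_plex(adj, candidate, k):
                extend(candidate, idx + 1)

    extend(set(), 0)
    return best


# Hypothetical 5-vertex graph: triangle 0-1-2, vertex 3 adjacent to 0, 1 and 4.
adj = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1}, 3: {0, 1, 4}, 4: {3}}

print(len(maximum_k_plex(adj, 1)))  # 3 (a maximum clique)
print(len(maximum_k_plex(adj, 2)))  # 4 (vertices 0, 1, 2, 3)
```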

Results
Threshold   # of SNPs in 2-plex   # of SNPs in 3-plex   # of SNPs in 4-plex
0.1         62                    63                    63
0.15        57                    58                    58
0.2         50                    51                    51
0.25        41                    41                    42
0.3         40                    41                    41
0.35        40                    41                    41
0.4         40                    40                    40
0.45        40                    39                    39
0.5         39                    39                    39

Results
The largest clusters are very stable and well defined in terms of cliques.
If we increase k, the clusters do not change drastically.
There is high correlation between the associated SNPs on cattle chromosome 1.

Thank You.