A quick review. Which molecular processes/functions are involved in a certain phenotype (e.g., disease, stress response, etc.)

Gene expression profiling: a quick review. Which molecular processes/functions are involved in a certain phenotype (e.g., disease, stress response)? The Gene Ontology (GO) Project provides a shared vocabulary/annotation; its terms are linked in a complex structure. Enrichment analysis: find the most differentially expressed genes, then identify over-represented annotations using a modified Fisher's exact test.
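The over-representation step can be illustrated with a small calculation. The sketch below computes a plain one-sided Fisher's exact (hypergeometric) p-value, not the modified variant the slide mentions. The function name and the parameters N, K, n and k are assumptions chosen for illustration, and the gene counts in the usage line are made up.

```python
from math import comb

def enrichment_p_value(N, K, n, k):
    """One-sided hypergeometric p-value P(X >= k).

    N: genes in the background set
    K: background genes carrying the annotation
    n: genes in the selected (e.g., differentially expressed) list
    k: selected genes carrying the annotation
    """
    total = comb(N, n)            # ways to choose the selected list
    upper = min(K, n)             # largest possible overlap
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical numbers: 20000 genes, 200 annotated, 100 selected, 8 in both.
p = enrichment_p_value(20000, 200, 100, 8)
print(f"p = {p:.3g}")
```

A small p-value suggests the annotation appears in the selected list more often than chance would predict. In practice the p-values for many annotations would still need adjustment for multiple testing.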

Gene Set Enrichment Analysis (a quick review, cont.). Calculates a score for the enrichment of an entire set of genes. It does not require setting a cutoff, it identifies the set of relevant genes, and it provides a more robust statistical framework. GSEA steps: 1. Calculation of an enrichment score (ES) for each functional category. 2. Estimation of significance level. 3. Adjustment for multiple hypotheses testing.

Biological Networks Analysis: Introduction and Dijkstra's algorithm. Genome 373 Genomic Informatics, Elhanan Borenstein

Biological networks What is a network? What networks are used in biology? Why do we need networks (and network theory)? How do we find the shortest path between two nodes?

What is a network? A map of interactions or relationships; a collection of nodes and links (edges).

A long history of network/graph theory! Network theory: social sciences (and biological sciences); mostly 20th century; modeling real-life systems; measuring structure and topology. Graph theory: computer science; since the 18th century; modeling abstract systems; solving graph-related questions.

Leonhard Euler (1707-1783). The Seven Bridges of Königsberg, published by Leonhard Euler in 1736, is considered the first paper in graph theory.

Edge properties and special topologies. Edges: directed/undirected; weighted/non-weighted; simple edges/hyperedges. Special topologies: trees; directed acyclic graphs (DAGs); bipartite networks.

Transcriptional regulatory networks. Reflect the cell's genetic regulatory circuitry. Nodes: transcription factors (TFs) and genes. Edges: from a TF to the genes it regulates. Directed; weighted?; almost bipartite. Derived through: chromatin IP; microarrays; computational prediction.

Metabolic networks. Reflect the set of biochemical reactions in a cell. Nodes: metabolites. Edges: biochemical reactions. Directed; weighted?; hyperedges? Derived through: knowledge of biochemistry; metabolic flux measurements; homology. Example: S. cerevisiae, 106 metabolites, 1149 reactions.

Protein-protein interaction (PPI) networks. Reflect the cell's molecular interactions and signaling pathways (the interactome). Nodes: proteins. Edges: interactions(?). Undirected. Derived through high-throughput experiments (protein complex-IP (Co-IP); yeast two-hybrid) and computational prediction. Example: S. cerevisiae, 4389 proteins, 14319 interactions.

Other networks in biology/medicine

Non-biological networks. Computer-related networks: WWW; Internet backbone; communications and IP. Social networks: friendship (Facebook; clubs); citations / information flow; co-authorships (papers); co-occurrence (movies; jazz). Transportation: highway systems; airline routes. Electronic/logic circuits. Many, many more.

The Kevin Bacon Number Game. Example: Tropic Thunder (2008). Chain 1: actors Tom Cruise, Robert Downey Jr., Gwyneth Paltrow, Hope Davis, Kevin Bacon; films Tropic Thunder, Iron Man, Proof, Flatliners. Chain 2: actors Tom Cruise, Robert Downey Jr., Frank Langella, Kevin Bacon; films Tropic Thunder, Iron Man, Frost/Nixon.

The Paul Erdős Number Game

The shortest path problem. Unweighted version: find the minimal number of links connecting node A to node B in an undirected network. Examples: how many friends lie between you and someone on Facebook (six degrees of separation, Erdős number, Kevin Bacon number); how far apart two genes are in an interaction network; what the shortest (and most likely) infection path is. Weighted version: find the shortest (cheapest) path between two nodes in a weighted directed graph. Examples: GPS; Google Maps.
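For the unweighted version of the problem, counting links is enough, and a breadth-first search does this directly. The following is a minimal sketch under the assumption that the network is stored as a dictionary mapping each node to a list of its neighbors. The function name `hop_distance` and the toy network are hypothetical.

```python
from collections import deque

def hop_distance(adjacency, source, target):
    """Return the minimal number of links from source to target, or None."""
    dist = {source: 0}                 # nodes reached so far and their hop count
    queue = deque([source])
    while queue:
        node = queue.popleft()         # nodes leave the queue in order of distance
        if node == target:
            return dist[node]
        for neighbor in adjacency.get(node, ()):
            if neighbor not in dist:   # first time reached = fewest hops
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return None                        # target is not reachable from source

# Toy undirected network (each link is listed in both directions).
adjacency = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
    "Z": [],
}
print(hop_distance(adjacency, "A", "E"))
```

Breadth-first search ignores edge weights, so it does not solve the weighted version of the problem.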

Dijkstra's Algorithm. Edsger Wybe Dijkstra (1930-2002): "Computer Science is no more about computers than astronomy is about telescopes."

Dijkstra's algorithm solves the single-source shortest path problem: find the shortest path from a single source to ALL nodes in the network. It works on both directed and undirected networks, and on both weighted and non-weighted networks, provided edge weights are non-negative. Approach: iterative (maintain the shortest path found so far to each intermediate node). It is a greedy algorithm, but it is still guaranteed to provide an optimal solution.

Dijkstra's algorithm
1. Initialize:
   i. Assign a distance value, D, to each node. Set D to zero for the start node and to infinity for all others.
   ii. Mark all nodes as unvisited.
   iii. Set the start node as the current node.
2. For each of the current node's unvisited neighbors:
   i. Calculate the tentative distance, D_t, through the current node.
   ii. If D_t is smaller than D (the previously recorded distance): D ← D_t.
   iii. Once all neighbors have been checked, mark the current node as visited (note: its shortest distance has been found).
3. Set the unvisited node with the smallest distance as the next "current node" and continue from step 2.
4. Once all nodes are marked as visited, finish.

Dijkstra's algorithm: a simple synthetic network. The network has six nodes, A to F, and A is the start node. The figure did not survive transcription. The edge weights that can be read off the arithmetic in the worked example are A-B 9, A-C 3, C-B 4, C-D 3, C-E 2, D-F 5 and E-F 12. The original figure contains additional edges whose endpoints cannot be recovered here; none of them produces a shorter path.
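The steps above can be sketched in Python as follows. This is an illustrative implementation, not the course's own code. It uses a priority queue (`heapq`) to find the unvisited node with the smallest distance instead of scanning all nodes, and it assumes non-negative weights. The edge list contains only the edges whose weights can be inferred from the worked example, so it is an assumption about the original figure.

```python
import heapq
from math import inf

def dijkstra(graph, start):
    """graph: {node: {neighbor: weight}}. Returns (dist, prev, visit_order)."""
    dist = {node: inf for node in graph}      # step 1.i: D = infinity ...
    prev = {node: None for node in graph}     # recorded path (predecessor)
    dist[start] = 0                           # ... and zero for the start node
    visited = set()                           # step 1.ii: all nodes unvisited
    order = []
    heap = [(0, start)]                       # step 1.iii: start is current
    while heap:
        d, node = heapq.heappop(heap)         # step 3: smallest unvisited D
        if node in visited:
            continue                          # stale queue entry, skip it
        visited.add(node)                     # step 2.iii: distance is final
        order.append(node)
        for neighbor, weight in graph[node].items():
            if neighbor in visited:
                continue
            tentative = d + weight            # step 2.i: D_t through current
            if tentative < dist[neighbor]:    # step 2.ii: keep the smaller
                dist[neighbor] = tentative
                prev[neighbor] = node
                heapq.heappush(heap, (tentative, neighbor))
    return dist, prev, order

# Edges inferred from the worked example (assumed, undirected).
edges = [("A", "B", 9), ("A", "C", 3), ("C", "B", 4), ("C", "D", 3),
         ("C", "E", 2), ("D", "F", 5), ("E", "F", 12)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, {})[v] = w
    graph.setdefault(v, {})[u] = w

dist, prev, order = dijkstra(graph, "A")
print(dist)
print(order)
```

The priority queue changes only how step 3 finds the next node, so the distances and the visiting order are the same as with a plain scan.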

Dijkstra's algorithm: worked example, starting from node A.

Initialization: mark A (start) as the current node. D: A=0, B=∞, C=∞, D=∞, E=∞, F=∞.

Check unvisited neighbors of A: B is 0+9 vs. ∞; C is 0+3 vs. ∞. Update D and record the path: B=9, C=3. Mark A as visited.

Mark C as current (the unvisited node with the smallest D). Check unvisited neighbors of C: B is 3+4 vs. 9; D is 3+3 vs. ∞; E is 3+2 vs. ∞. Update distances and record paths: B=7, D=6, E=5. Mark C as visited. Note: the distance to C is final!

Mark E as the current node and check its unvisited neighbors. Update D and record the path: F=17. Mark E as visited.

Mark D as the current node and check its unvisited neighbors. Update D and record the path (note: the path has changed): F=11. Mark D as visited.

Mark B as the current node and check its neighbors. No updates. Mark B as visited.

Mark F as current. Mark F as visited.

Recorded distances after each node is visited:

Visited      A    B    C    D    E    F
(initial)    0    ∞    ∞    ∞    ∞    ∞
A            0    9    3    ∞    ∞    ∞
C            0    7    3    6    5    ∞
E            0    7    3    6    5    17
D            0    7    3    6    5    11
B            0    7    3    6    5    11
F            0    7    3    6    5    11

We are done! We now have the shortest path from A to each node (both length and path): A=0, B=7, C=3, D=6, E=5, F=11. The recorded paths together form a spanning tree rooted at A. This is a shortest-path tree, which is not necessarily a minimum spanning tree. Will we always get a tree? Can you prove it?
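To make "both length and path" concrete, the sketch below walks the recorded predecessors backwards from a target node. The predecessor map is hard-coded from the updates in the worked example (B, D and E were last updated through C, F through D, and C through A). The name `path_to` is hypothetical.

```python
# Predecessor of each node on its shortest path from A (from the worked example).
prev = {"A": None, "B": "C", "C": "A", "D": "C", "E": "C", "F": "D"}

def path_to(prev, target):
    """Follow predecessors back to the start and return the path in order."""
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]

# Every node except the start contributes exactly one edge to the tree.
tree_edges = [(parent, node) for node, parent in prev.items() if parent is not None]

print(path_to(prev, "F"))
print(tree_edges)
```

Because each reachable non-start node stores exactly one predecessor, the result has one edge fewer than it has nodes. That is a useful starting point for the proof the slide asks for.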

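Computational Representation of Networks. Example network with nodes A, B, C, D.

List of edges: (ordered) pairs of nodes: [ (A,C), (C,B), (D,B), (D,C) ]

Connectivity matrix (row = source node, column = target node):

     A  B  C  D
A    0  0  1  0
B    0  0  0  0
C    0  1  0  0
D    0  1  1  0

Object oriented: each node object stores its name and a list of pointers to its neighbors (ngr). Name:A ngr: p1. Name:B ngr: (none). Name:C ngr: p1. Name:D ngr: p1, p2.

Which is the most useful representation?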
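The three representations can be built side by side for the slide's four-node example. This is a minimal sketch: the class name `Node` and the variable names are assumptions, and Python object references stand in for the pointers p1 and p2.

```python
nodes = ["A", "B", "C", "D"]

# 1. List of edges: ordered (source, target) pairs.
edges = [("A", "C"), ("C", "B"), ("D", "B"), ("D", "C")]

# 2. Connectivity matrix: matrix[i][j] == 1 if there is an edge i -> j.
index = {name: i for i, name in enumerate(nodes)}
matrix = [[0] * len(nodes) for _ in nodes]
for source, target in edges:
    matrix[index[source]][index[target]] = 1

# 3. Object oriented: each node holds references to its neighbors.
class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []   # plays the role of "ngr" on the slide

objects = {name: Node(name) for name in nodes}
for source, target in edges:
    objects[source].neighbors.append(objects[target])

for row_name, row in zip(nodes, matrix):
    print(row_name, row)
print([n.name for n in objects["D"].neighbors])
```

Which one is most useful depends on the task. The matrix gives constant-time edge lookup but needs space quadratic in the number of nodes, while the edge list and the object form stay compact for sparse networks.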