Reconstructing Boolean Networks from Noisy Gene Expression Data
2004 8th International Conference on Control, Automation, Robotics and Vision, Kunming, China, 6-9th December 2004

Zheng Yun and Kwoh Chee Keong
School of Computer Engineering, Nanyang Technological University, Nanyang Avenue, Singapore

Abstract

In recent years, considerable interest has been given to modelling gene regulatory networks (GRNs), especially their architectures. Boolean networks (BLNs) are a good choice for obtaining the architectures of GRNs when the accessible data sets are limited. Various algorithms have been introduced to reconstruct Boolean networks from gene expression profiles, which are always noisy. However, few efforts have been dedicated to the noise problem in learning BLNs. In this paper, we introduce a novel way of sifting noise from gene expression data. Noise causes indefinite states in the learned BLNs, but the correct BLNs can still be obtained with incompletely specified Karnaugh maps. Experiments on both synthetic and yeast gene expression data show that the method can detect noise and reconstruct the original models in some cases.

Keywords: gene regulatory networks, Boolean networks, reverse engineering, Karnaugh maps

1 Introduction

With the availability of genome-wide gene expression data [1, 2], considerable interest has been given to modelling GRNs [3-9], which are assumed to be the underlying mechanisms that regulate different gene expression patterns. Real gene expression levels are usually assumed to follow sigmoidal functions, which have proved computationally expensive to simulate. One practical approach is to idealize the sigmoid curve with a step function, since a step function can model the non-linear part of the sigmoid curve. The resulting models are BLNs.
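The idealization described above can be illustrated with a minimal sketch. The logistic form of the sigmoid and the names `sigmoid`, `step_function`, `threshold` and `steepness` are assumptions made for illustration only; the paper does not fix a particular sigmoid.

```python
import math


def sigmoid(x, threshold=0.5, steepness=10.0):
    """Logistic response to a regulator level x (assumed form, not from the paper)."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - threshold)))


def step_function(x, threshold=0.5):
    """Boolean idealization: the gene is ON (1) at or above the threshold, else OFF (0)."""
    return 1 if x >= threshold else 0


# As the sigmoid gets steeper, it approaches the step function away from the threshold.
for level in (0.1, 0.3, 0.7, 0.9):
    print(level, round(sigmoid(level, steepness=200.0), 6), step_function(level))
```

The step function discards the graded part of the response and keeps only the switch-like behaviour, which is what allows a gene to be treated as a binary variable.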
When BLNs are used to model GRNs, genes are represented by binary variables with two values, ON (1) and OFF (0), meaning that the gene is turned on or turned off respectively. The regulatory relationships between genes are expressed by the Boolean functions attached to each variable. Formally, a BLN $G(V, F)$ consists of a set $V = \{X_1, \ldots, X_n\}$ of nodes representing genes and a set $F = \{f_1, \ldots, f_n\}$ of Boolean functions, where a Boolean function $f_i(X_{i1}, \ldots, X_{ik})$ with inputs from specified nodes $X_{i1}, \ldots, X_{ik}$ at time step $t$ is assigned to the node $X_i$ at time step $t + 1$, as shown in the following equation.

$$X_i(t + 1) = f_i(X_{i1}(t), \ldots, X_{ik}(t)), \quad 1 \le i \le n \qquad (1)$$

The state of a BLN is expressed by the state vector of its nodes. We use $v(t) = \{x_1, \ldots, x_n\}$ to represent the state of a BLN at time $t$, and $v(t + 1) = \{x'_1, \ldots, x'_n\}$ to represent the state of the BLN at time $t + 1$. $\{x'_1, \ldots, x'_n\}$ is calculated from $\{x_1, \ldots, x_n\}$ with Equation 1. A state transition pair is $v(t) \rightarrow v(t + 1)$.

In reference [10], we introduced the DFL (Discrete Function Learning) algorithm to learn qualitative models of GRNs from discretized microarray gene expression data. In that method, the expression data are assumed to be generated by the functions $f_i$, and a reverse engineering method based on information theory is used to recover these functions from the gene expression data.

Gene expression data are always noisy. In this paper, we introduce a method called the ɛ function to deal with noise in the data sets. We show that after the BLNs are learned from noisy data, some of the noise in the Boolean functions can be detected and removed with incompletely specified Karnaugh maps.

The rest of this paper is organized as follows. In the next section, we introduce the theoretical foundation of learning functional relations from data. We briefly introduce the DFL algorithm with an example in section 3.
Then, we propose a new concept called the ɛ function to deal with noise in the data sets in section 4. Finally, we summarize the work of this paper in the last section.

2 Foundation of Information Theory

Our approach is based on information theory. First, we introduce the following theorem, which is the theoretical foundation of our algorithm.

Theorem 2.1 If the mutual information between X and Y is equal to the entropy of Y, i.e., $I(X; Y) = H(Y)$, then Y is a function of X.

Yeung [11] gave a proof of the following theorem.

Theorem 2.2 $H(Y|X) = 0$ if and only if Y is a function of X.

Since $I(X; Y) = H(Y) - H(Y|X)$, the condition $I(X; Y) = H(Y)$ forces $H(Y|X) = 0$, so Theorem 2.1 follows directly from Theorem 2.2.

3 Methods

In this section, we begin with a formal definition of the problem of reconstructing qualitative GRN models from state transition pairs. Then, we briefly introduce the DFL algorithm to solve this problem. For a detailed analysis of the DFL algorithm, refer to reference [10].

3.1 Problem definition

The problem of inferring the BLN model of a GRN from input-output transition pairs (time series of gene expression) is defined as follows.

Definition 3.1 Let $V = \{X_1, \ldots, X_n\}$. Given a transition table $T = \{v(t) \rightarrow v(t + 1)\}$, where $v(t)$ is the state vector of the GRN model at time $t$, find a set of discrete functions $F = \{f_1, f_2, \ldots, f_n\}$ so that $X_i(t + 1)$ ($X'_i$ hereafter) is calculated from $f_i$ as follows

$$X_i(t + 1) = f_i(X_{i1}(t), \ldots, X_{ik}(t)),$$

where $t$ goes from 1 to a limited constant. If the functions in F are Boolean functions, then the GRN model is a BLN; otherwise the GRN model is a GLF or PLDE model.

By Theorem 2.1, solving the problem in Definition 3.1 amounts to finding a group of genes $X(t) = \{X_{i1}(t), \ldots, X_{ik}(t)\}$ such that the mutual information between $X(t)$ and $X'_i$ is equal to the entropy of $X'_i$. The problem therefore reduces to a search over all subsets of $V = \{X_1, \ldots, X_n\}$. There are $2^n$ such subsets in total, so an exhaustive search takes time exponential in $n$. Fortunately, for GRNs, each gene is estimated to interact on average with four to eight other genes [12]. That is to say, it is sufficient to consider the subsets whose cardinalities are bounded by a small integer.

3.2 Search method

The main steps of the DFL algorithm are listed in Table 1. In the DFL algorithm, we use the following definition, called supersets.
Definition 3.2 Let X be a subset of $V = \{X_1, \ldots, X_n\}$. Then $\Delta_i(X)$ of X are the supersets of X such that $X \subset \Delta_i(X)$ and $|\Delta_i(X)| = |X| + i$, where $|X|$ denotes the cardinality of X.

Table 1: The DFL algorithm.

  Algorithm: DFL(V, k, T)
  Input: V with n genes, indegree k, T = {v(t) → v(t + 1)}, t = 1, ..., N.
  Output: F = {f_1, f_2, ..., f_n}
  Begin:
    L ← all single element subsets of V;
    Tree.FirstNode ← L;
    for every gene Y ∈ V {
      calculate H(Y);   // from T
      D ← 1;            // initial depth
      F.add(Sub(Y, Tree, H(Y), D, k));
    }
    return F;
  End

The Sub() subroutine is listed in Table 2.

Table 2: The subroutine of the DFL algorithm.

  Algorithm: Sub(Y, Tree, H, D, k)
  Input: Y, Tree, entropy H(Y), current depth D, indegree k
  Output: function Y = f(X)
  Begin:
    L ← Tree.DthNode;
    for every element X ∈ L {
      calculate I(X; Y);
      if (I(X; Y) == H) {
        extract Y = f(X) from T;   (*)
        return Y = f(X);
      }
    }
    sort L according to I;
    for every element X ∈ L {
      if (D < k) {
        D ← D + 1;
        Tree.DthNode ← Δ_1(X);
        return Sub(Y, Tree, H, D, k);
      }
    }
    return Fail(Y);
  End
  (*) By deleting unrelated variables and duplicate rows in T.

To clarify the heuristic underlying the DFL algorithm, let us consider a BLN consisting of four genes, as shown in Figure 1. The function of each gene is listed in Table 3. The set of all genes is V = {A, B, C, D}, and we use X to denote subsets of V.

Figure 1: The wiring diagram of a BLN model, where n = 4, k_max = 4. X' denotes the state of X in the next time step.

Table 3: Boolean functions of the example, where + is the logical OR operation and · is the logical AND operation.

  Gene   Rule
  A      A' = B
  B      B' = A + C
  C      C' = (B · C) + (C · D) + (B · D)
  D      D' = (A · B) + (C · D)

One of the commonly used algorithms to infer BLNs from data is the REVEAL algorithm [13]. As shown in Figure 2, the REVEAL algorithm uses an exhaustive search: it first searches the subsets with only one gene, then the subsets with two genes, and so on. If the gene under consideration is Y, the REVEAL algorithm calculates the mutual information between each subset X and Y. If $I(X; Y) = H(Y)$, it extracts the function rules from the original transition table T and stops the calculation for Y. It then proceeds to find the Boolean function for the next gene.

Compared with the REVEAL algorithm, the DFL algorithm uses a better heuristic when searching for the target combination. An example is given in Figure 2, which shows the search procedures for finding the Boolean function of D in the example of Figure 1 and Table 3. First, the DFL algorithm searches the first layer and sorts all subsets on that layer. It finds that {A} shares the largest mutual information with D among the subsets on the first layer. Then, the DFL algorithm searches through $\Delta_1(A), \ldots, \Delta_{k-1}(A)$, always deciding the search order of $\Delta_{i+1}(A)$ based on the calculation results for $\Delta_i(A)$. If it still has not found the target subset, which satisfies the requirement of Theorem 2.1, by the kth layer, the DFL algorithm returns to the first layer. At that point, the first node on the first layer and all its $\Delta_1, \ldots, \Delta_{k-1}$ supersets have already been checked. It continues with the second node on the first layer (and all its $\Delta_1, \ldots, \Delta_{k-1}$ supersets), then the third one, and so on, until it reaches the end of the first layer.

  {}
  {A} {B} {C} {D}
  {A,B} {A,C} {A,D} {B,C} {B,D} {C,D}
  {A,B,C} {A,B,D} {A,C,D} {B,C,D}
  {A,B,C,D}

Figure 2: Search procedures of the DFL algorithm and the REVEAL algorithm when finding the Boolean function of D in Figure 1. The solid line is for the DFL algorithm, the dashed line is for the REVEAL algorithm. The combinations with a black dot under them are the subsets which share the largest mutual information with D on their layers.
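The criterion of Theorem 2.1 and the layer-by-layer search can be sketched on the four-gene example. This is a minimal illustration under stated assumptions, not the authors' implementation: it performs the exhaustive, REVEAL-style search rather than the DFL ordering, it reads the rules of Table 3 without negations, and the names `step`, `entropy`, `mutual_information` and `find_regulators` are hypothetical.

```python
from collections import Counter
from itertools import combinations, product
from math import isclose, log2

GENES = "ABCD"


def step(state):
    """Synchronous update of Equation 1 with the rules of Table 3."""
    a, b, c, d = state
    return (b,                              # A' = B
            a | c,                          # B' = A + C
            (b & c) | (c & d) | (b & d),    # C' = BC + CD + BD
            (a & b) | (c & d))              # D' = AB + CD


# Noise-free transition table: all 16 pairs v(t) -> v(t + 1).
TABLE = [(s, step(s)) for s in product((0, 1), repeat=4)]


def entropy(values):
    """Empirical Shannon entropy (in bits) of a list of hashable values."""
    counts = Counter(values)
    total = len(values)
    return -sum(n / total * log2(n / total) for n in counts.values())


def mutual_information(xs, ys):
    """I(X; Y) = H(X) + H(Y) - H(X, Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def find_regulators(table, target, max_indegree):
    """Return the first subset X, searched layer by layer, with I(X; Y) = H(Y)."""
    ys = [nxt[target] for _, nxt in table]
    h = entropy(ys)
    for size in range(1, max_indegree + 1):
        for subset in combinations(range(len(GENES)), size):
            xs = [tuple(cur[i] for i in subset) for cur, _ in table]
            if isclose(mutual_information(xs, ys), h, abs_tol=1e-9):
                return "".join(GENES[i] for i in subset)
    return None  # no subset within the indegree bound satisfies Theorem 2.1


for index, gene in enumerate(GENES):
    print(gene + "'", "<-", find_regulators(TABLE, index, 4))
# A' <- B
# B' <- AC
# C' <- BCD
# D' <- ABCD
```

The floating-point comparison uses a small tolerance because the entropies are computed from logarithms; on noise-free data the equality of Theorem 2.1 holds up to rounding.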
The REVEAL algorithm first searches the first layer (the subsets with one gene), then the second layer, and so on. Finally, it finds the target subset {A, B, C, D} at the fourth layer. The DFL algorithm uses a different heuristic. First, it searches the first layer and finds that {A}, with a black dot under it, shares the largest mutual information with D among the subsets on the first layer. Then, it continues to search $\Delta_1(A)$ on the second layer. These calculations continue in the same way until the target combination {A, B, C, D} is found on the fourth layer.

4 Experiments and Results

Due to the noise in gene expression data, the requirement of Theorem 2.1 may not be satisfied strictly. Therefore, some regulatory relations cannot be identified successfully. In this section, we introduce the concept of the ɛ function to overcome the problems caused by noise in the data sets. We further show that some of the noise in the data sets of BLNs can be detected and removed with incompletely specified Karnaugh maps.

4.1 The definition of ɛ function

In Theorem 2.1, the exact functional relation results from the strict equality between the entropy of Y, $H(Y)$, and the mutual information of X and Y, $I(X; Y)$. However, this equality is often broken by noisy data, such as microarray gene expression data. In these cases, we can relax the requirement to obtain a compromise result. As shown in Figure 3, by defining a significant factor ɛ, if the difference between $H(Y)$ and $I(X; Y)$ is no more than $\epsilon \times H(Y)$, then we say Y is an ɛ function of X. Formally, we define the ɛ function as follows.

Definition 4.1 If $H(Y) - I(X; Y) \le \epsilon \times H(Y)$, then $Y = f_\epsilon(X)$, where ɛ is a significant factor.
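The two steps of this section, relaxing Theorem 2.1 by ɛ and then cleaning the learned rule with unspecified entries, can be sketched end to end. The sketch makes several assumptions: it uses the C rule of Table 3 with one corrupted pair (1111) → (0000) appended to the 16 noise-free pairs, it picks ɛ = 0.2 purely for illustration, and it replaces the hand-drawn Karnaugh map with a brute-force search for a smallest cover by prime implicants. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations, product
from math import log2


def entropy(values):
    counts = Counter(values)
    total = len(values)
    return -sum(n / total * log2(n / total) for n in counts.values())


def mutual_information(xs, ys):
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def is_eps_function(xs, ys, eps):
    """Definition 4.1: H(Y) - I(X; Y) <= eps * H(Y), with a rounding tolerance."""
    h = entropy(ys)
    return h - mutual_information(xs, ys) <= eps * h + 1e-12


def function_table(xs, ys):
    """Split observed inputs into ON, OFF and unspecified (conflicting) sets."""
    seen = {}
    for x, y in zip(xs, ys):
        seen.setdefault(x, set()).add(y)
    on = {x for x, v in seen.items() if v == {1}}
    off = {x for x, v in seen.items() if v == {0}}
    unspecified = {x for x, v in seen.items() if len(v) == 2}
    return on, off, unspecified


def covers(cube, point):
    """A cube fixes some variables (0 or 1) and leaves the others free (None)."""
    return all(c is None or c == p for c, p in zip(cube, point))


def minimal_sop(on, off, n):
    """Smallest set of prime implicants covering ON and touching no OFF input."""
    valid = [c for c in product((0, 1, None), repeat=n)
             if not any(covers(c, p) for p in off)]
    primes = [c for c in valid
              if not any(d != c and covers(d, c) for d in valid)]
    for size in range(len(primes) + 1):
        for choice in combinations(primes, size):
            if all(any(covers(c, p) for c in choice) for p in on):
                return set(choice)


def to_expression(cubes, names):
    terms = ["".join(n if v else "~" + n for n, v in zip(names, cube) if v is not None)
             for cube in cubes]
    return " + ".join(sorted(terms))


# Inputs (B, C, D) and next state C' for all 16 states, plus the corrupted pair.
states = list(product((0, 1), repeat=4))
inputs = [s[1:] for s in states] + [(1, 1, 1)]
outputs = [(b & c) | (c & d) | (b & d) for _, b, c, d in states] + [0]

print(is_eps_function(inputs, outputs, 0.0))   # False: strict equality is broken
print(is_eps_function(inputs, outputs, 0.2))   # True: accepted as an eps function
on, off, unspecified = function_table(inputs, outputs)
print(unspecified)                              # {(1, 1, 1)}
print(to_expression(minimal_sop(on, off, 3), "BCD"))  # BC + BD + CD
```

Any input that is not in the OFF set may be covered freely, which is how an unspecified Karnaugh-map entry is treated when groups are merged. In this example the conflicting input 111 is absorbed by all three product terms, so the original rule is recovered.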
Figure 3: The Venn diagram of $H(X)$, $H(Y)$ and $I(X; Y)$ when $Y = f(X)$. (a) The noiseless case, where the mutual information between X and Y is the entropy of Y. (b) The noisy case, where the entropy of Y is not strictly equal to the mutual information between X and Y. The shaded region results from the noise.

The ɛ function means that if the area of the shaded region is smaller than or equal to $\epsilon \times H(Y)$, then $Y = f_\epsilon(X)$. The significant factor ɛ is adjustable for different noise levels in the data sets.

4.2 Results of synthetic data

Again, we use the example in Figure 1 and Table 3 to show that some of the noise in BLNs can be detected and removed with incompletely specified Karnaugh maps. We consider the learning of C here. In our experiment, we add one wrong transition pair (1111) → (0000) to the transition table. There are 16 lines in the original transition table of the example, so the noise rate here is 1/17. The DFL algorithm successfully finds the correct network architecture in Figure 1 when ɛ = [value missing in the source]. The learned Boolean function table of C is listed in Table 4. In Table 4, there are two lines with the same input 111 of BCD. Since C is a deterministic function of BCD, one of these two rows results from noise.

Table 4: The learned function table of C, with columns BCD and C. [Table entries lost; a footnote marks the row that is the noise.]

Then, we draw a Karnaugh map for the function table, as in Figure 4. In Figure 4, the noise entry of Table 4 produces an unspecified state in the Karnaugh map. However, the noise is correctly detected and removed once the merging rules of the incompletely specified Karnaugh map are applied. As shown in Figure 4, the final Boolean function of C is C = BC + BD + CD, which is the original function in Table 3. The Boolean functions for A, B and D can be correctly obtained in the same way.

Figure 4: The incompletely specified Karnaugh map of the learned C. The - in the figure is an unspecified state incurred by the noise.

4.3 Results of real data

In this section, we use the gene expression data of the yeast Saccharomyces cerevisiae cell cycle from Cho et al. [14], which cover approximately two full cell cycles. In [15], Lee et al. reported a GRN related to the yeast cell cycle. The GRN consists of 11 well-known yeast cell cycle regulators: Mbp1, Swi4, Swi6, Mcm1, Fkh1, Fkh2, Ndd1, Swi5, Ace2, Skn7 and Stb1.

We discretize the data set in [14] to two levels, then rearrange these expression values into state-transition pairs such that the expression values at the current time step are treated as the output of the expression values at the prior time step. Finally, we apply the DFL algorithm to the obtained transition table. The learned models are shown in Figure 5.

Figure 5: The learned GRN model. (a) The number of discrete levels for the gene expression data is 2, the indegree of the GRN is set to 5 and ɛ is 0. (b) As in (a), but with ɛ = 0.2. The regulators are represented by ovals. A directed edge from Gene A to Gene B means that Gene A is a regulator of Gene B. The solid edges represent regulatory relations that have been verified by other approaches. The dashed edges represent regulatory relations that have not been verified.

All regulatory relations represented by solid edges in Figure 5 are verified in references [15, 16]. For instance, Swi4 transcription is regulated in late G1 by MBF (a complex of Mbp1 and Swi6) [16]. In Figure 5, this regulatory relation is identified in both (a) and (b). In Figure 5 (a), the DFL algorithm cannot find the regulators of Fkh2 due to the noise in the data. However, two regulators of Fkh2, Mbp1 and Ndd1, are successfully found in Figure 5 (b). The learned Boolean function of Fkh2 is listed in Table 5, where N, M and F represent the expression levels of Ndd1, Mbp1 and Fkh2 respectively.

Table 5: The learned function table of F, with columns NM and F. [Table entries lost.] One of these two lines is the noise.

After applying the merging rules of incompletely specified Karnaugh maps, we obtain F = N. However, Fkh2 is controlled by many other regulators, such as Mbp1, Skn7, Swi4, Ace2, Fkh1, Mcm1 and Fkh2 itself [15]. Thus, we include Mbp1 in the Boolean function of Fkh2. Finally, we have F = N + M. We conjecture that the other regulators do not appear in the Boolean function of F because of the limited size of the data set, which contains the results of only 17 experiments.

5 Discussions

We have introduced a new concept called the ɛ function to deal with noise when learning BLNs from gene expression data sets. Much like a P value in statistical methods, the ɛ function method can approximate the original function with a known and adjustable precision. We have also shown that some of the noise in the ɛ functions of BLNs can be detected and removed with incompletely specified Karnaugh maps. In reference [10], we show that the ɛ function method can also be applied to functions other than Boolean functions.

However, some kinds of noise cannot be removed by this method. For example, if the noise causes BCD = (101) to become an unspecified state, then the final Boolean function of C would be C = BC + CD, which is incorrect. In these cases, the ɛ functions still give the correct architectures of the BLNs, although the precise Boolean functions are incorrect. The network architectures of GRN models are important, because they can guide future research by biologists. As shown by Yuh et al.
[17, 18], the prior models of the Endo16 gene of the sea urchin embryo were used to guide future research. In return, the original GRN model of Endo16 was refined by further experiments.

References

[1] J. DeRisi, V. Iyer, and P. Brown, "Exploring the Metabolic and Genetic Control of Gene Expression on a Genomic Scale," Science, vol. 278, no. 5338.
[2] P. Spellman, G. Sherlock, M. Zhang, V. Iyer, K. Anders, M. Eisen, P. Brown, D. Botstein, and B. Futcher, "Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization," Molecular Biology of the Cell, vol. 9.
[3] P. D'haeseleer, S. Liang, and R. Somogyi, "Genetic networks inference: from co-expression clustering to reverse engineering," Bioinformatics, vol. 16, no. 8.
[4] P. Smolen, D. Baxter, and J. Byrne, "Modeling transcriptional control in gene network: Methods, recent results, and future directions," Bulletin of Mathematical Biology, vol. 62.
[5] D. Endy and R. Brent, "Modelling cellular behavior," Nature, vol. 409, no. 6818.
[6] J. Hasty, D. McMillen, F. Isaacs, and J. Collins, "Computational studies of gene regulatory networks: in numero molecular biology," Nature Reviews Genetics, vol. 2, no. 4.
[7] H. Bolouri and E. Davidson, "Modeling transcriptional regulatory networks," BioEssays, vol. 24.
[8] J. Hasty, D. McMillen, and J. Collins, "Engineered gene circuits," Nature, vol. 420.
[9] H. de Jong, "Modeling and simulation of genetic regulatory systems: A literature review," Journal of Computational Biology, vol. 9, no. 1.
[10] Y. Zheng and C. K. Kwoh, "Dynamic algorithm for inferring qualitative models of gene regulatory networks," in Proceedings of 3rd Computer Society Bioinformatics Conference, CSB 2004.
[11] R. W. Yeung, A First Course in Information Theory. New York, NY: Kluwer Academic/Plenum Publishers.
[12] M. Arnone and E. Davidson, "The hardwiring of development: organization and function of genomic regulatory systems," Development, vol. 124.
[13] S. Liang, S. Fuhrman, and R. Somogyi, "Reveal, a general reverse engineering algorithm for genetic network architectures," in Proceedings of Pacific Symposium on Biocomputing 98, vol. 3.
[14] R. J. Cho, M. J. Campbell, E. A. Winzeler, L. Steinmetz, A. Conway, L. Wodicka, T. G. Wolfsberg, A. E. Gabrielian, D. Landsman, D. J. Lockhart, and R. W. Davis, "A genome-wide transcriptional analysis of the mitotic cell cycle," Molecular Cell, vol. 2.
[15] T. I. Lee, N. J. Rinaldi, F. Robert, D. T. Odom, Z. Bar-Joseph, G. K. Gerber, N. M. Hannett, C. T. Harbison, C. M. Thompson, I. Simon, J. Zeitlinger, E. G. Jennings, H. L. Murray, D. B. Gordon, B. Ren, J. J. Wyrick, J.-B. Tagne, T. L. Volkert, E. Fraenkel, D. K. Gifford, and R. A. Young, "Transcriptional Regulatory Networks in Saccharomyces cerevisiae," Science, vol. 298, no. 5594.
[16] I. Simon, J. Barnett, N. Hannett, C. Harbison, N. J. Rinaldi, T. Volkert, J. Wyrick, J. Zeitlinger, D. Gifford, T. Jaakkola, and R. Young, "Serial regulation of transcriptional regulators in the yeast cell cycle," Cell, vol. 106.
[17] C.-H. Yuh, H. Bolouri, and E. Davidson, "Genomic Cis-Regulatory Logic: Experimental and Computational Analysis of a Sea Urchin Gene," Science, vol. 279, no. 5358.
[18] C.-H. Yuh, H. Bolouri, J. Bower, and E. Davidson, Computational Modeling of Genetic and Biochemical Networks, ch. "A logical model of cis-regulatory control in eukaryotic system." Cambridge, MA: MIT Press.
Supervised Clustering of Yeast Gene Expression Data
Supervised Clustering of Yeast Gene Expression Data In the DeRisi paper five expression profile clusters were cited, each containing a small number (7-8) of genes. In the following examples we apply supervised
More informationISSN Article. A Feature Subset Selection Method Based On High-Dimensional Mutual Information
Entropy 2011, 13, 860-901; doi:10.3390/e13040860 OPEN ACCESS entropy ISSN 1099-4300 www.mdpi.com/journal/entropy Article A Feature Subset Selection Method Based On High-Dimensional Mutual Information Zheng
More informationCompClustTk Manual & Tutorial
CompClustTk Manual & Tutorial Brandon King Copyright c California Institute of Technology Version 0.1.10 May 13, 2004 Contents 1 Introduction 1 1.1 Purpose.............................................
More informationPacific Symposium on Biocomputing 4:17-28 (1999)
IDENTIFICATION OF GENETIC NETWORKS FROM A SMALL NUMBER OF GENE EXPRESSION PATTERNS UNDER THE BOOLEAN NETWORK MODEL Tatsuya AKUTSU, Satoru MIYANO Human Genome Center, Institute of Medical Science, University
More informationIdentifying Decision Lists with the Discrete Function Learning Algorithm
Identifying Decision Lists with the Discrete Function Learning Algorithm Zheng Yun BIRC, School of Comp. Eng. Nanyang Technological University Singapore 639798, +65-67906613 pg04325488@ntu.edu.sg Kwoh
More informationGene Expression Clustering with Functional Mixture Models
Gene Expression Clustering with Functional Mixture Models Darya Chudova, Department of Computer Science University of California, Irvine Irvine CA 92697-3425 dchudova@ics.uci.edu Eric Mjolsness Department
More informationMissing Data Estimation in Microarrays Using Multi-Organism Approach
Missing Data Estimation in Microarrays Using Multi-Organism Approach Marcel Nassar and Hady Zeineddine Progress Report: Data Mining Course Project, Spring 2008 Prof. Inderjit S. Dhillon April 02, 2008
More informationBoolean Network Modeling
Boolean Network Modeling Bioinformatics: Sequence Analysis COMP 571 - Spring 2015 Luay Nakhleh, Rice University Gene Regulatory Networks Gene regulatory networks describe the molecules involved in gene
More informationConstructing Bayesian Network Models of Gene Expression Networks from Microarray Data
Constructing Bayesian Network Models of Gene Expression Networks from Microarray Data Peter Spirtes a, Clark Glymour b, Richard Scheines a, Stuart Kauffman c, Valerio Aimale c, Frank Wimberly c a Department
More informationLearning Regulatory Networks from Sparsely Sampled Time Series Expression Data
Learning Regulatory Networks from Sparsely Sampled Time Series Expression Data Anshul Kundaje Dept. of Electrical Engineering Columbia University Tony Jebara Machine Learning Laboratory Dept. of Computer
More informationAlgorithms for Bounded-Error Correlation of High Dimensional Data in Microarray Experiments
Algorithms for Bounded-Error Correlation of High Dimensional Data in Microarray Experiments Mehmet Koyutürk, Ananth Grama, and Wojciech Szpankowski Department of Computer Sciences, Purdue University West
More informationPublication d-side publications. Reprinted with permission.
Publication 6 Jarkko Venna, and Samuel Kaski. Visualizing gene interaction graphs with local multidimensional scaling. In Michel Verleysen, editor, Proceedings of the 14th European Symposium on Artificial
More informationEstimating Error-Dimensionality Relationship for Gene Expression Based Cancer Classification
1 Estimating Error-Dimensionality Relationship for Gene Expression Based Cancer Classification Feng Chu and Lipo Wang School of Electrical and Electronic Engineering Nanyang Technological niversity Singapore
More informationA New Approach to Analyzing Gene Expression Time Series Data
A New Approach to Analyzing Gene Expression Time Series Data Ziv Bar-Joseph Georg Gerber David K. Gifford Tommi S. Jaakkola MIT Lab for Computer Science and MIT AI Lab 200 Technology Square, Cambridge,
More informationEFFICIENT SYNTHESIS OF A CLASS OF BOOLEAN PROGRAMS FROM I-O DATA: APPLICATION TO GENETIC NETWORKS
EFFICIENT SYNTHESIS OF A CLASS OF BOOLEAN PROGRAMS FROM I-O DATA: APPLICATION TO GENETIC NETWORKS RUTH CHARNEY, JACQUES COHEN, AND AURÉLIEN RIZK Abstract. The paper addresses the problem of synthesizing
More information/ Computational Genomics. Normalization
10-810 /02-710 Computational Genomics Normalization Genes and Gene Expression Technology Display of Expression Information Yeast cell cycle expression Experiments (over time) baseline expression program
More informationDKT 122/3 DIGITAL SYSTEM 1
Company LOGO DKT 122/3 DIGITAL SYSTEM 1 BOOLEAN ALGEBRA (PART 2) Boolean Algebra Contents Boolean Operations & Expression Laws & Rules of Boolean algebra DeMorgan s Theorems Boolean analysis of logic circuits
More informationVisualizing Gene Clusters using Neighborhood Graphs in R
Theresa Scharl & Friedrich Leisch Visualizing Gene Clusters using Neighborhood Graphs in R Technical Report Number 16, 2008 Department of Statistics University of Munich http://www.stat.uni-muenchen.de
More informationUnit 7 Number System and Bases. 7.1 Number System. 7.2 Binary Numbers. 7.3 Adding and Subtracting Binary Numbers. 7.4 Multiplying Binary Numbers
Contents STRAND B: Number Theory Unit 7 Number System and Bases Student Text Contents Section 7. Number System 7.2 Binary Numbers 7.3 Adding and Subtracting Binary Numbers 7.4 Multiplying Binary Numbers
More informationCoordinated Perspectives and Enhanced Force-Directed Layout for the Analysis of Network Motifs
Coordinated Perspectives and Enhanced Force-Directed Layout for the Analysis of Network Motifs Christian Klukas Falk Schreiber Henning Schwöbbermeyer Leibniz Institute of Plant Genetics and Crop Plant
More informationGate Level Minimization Map Method
Gate Level Minimization Map Method Complexity of hardware implementation is directly related to the complexity of the algebraic expression Truth table representation of a function is unique Algebraically
More informationA Novel Gene Network Inference Algorithm Using Predictive Minimum Description Length Approach
The University of Southern Mississippi The Aquila Digital Community Faculty Publications 5-28-2010 A Novel Gene Network Inference Algorithm Using Predictive Minimum Description Length Approach Vijender
More informationIntroduction to Mfuzz package and its graphical user interface
Introduction to Mfuzz package and its graphical user interface Matthias E. Futschik SysBioLab, Universidade do Algarve URL: http://mfuzz.sysbiolab.eu and Lokesh Kumar Institute for Advanced Biosciences,
More informationFast Calculation of Pairwise Mutual Information for Gene Regulatory Network Reconstruction
Fast Calculation of Pairwise Mutual Information for Gene Regulatory Network Reconstruction Peng Qiu, Andrew J. Gentles and Sylvia K. Plevritis Department of Radiology, Stanford University, Stanford, CA
More information4 KARNAUGH MAP MINIMIZATION
4 KARNAUGH MAP MINIMIZATION A Karnaugh map provides a systematic method for simplifying Boolean expressions and, if properly used, will produce the simplest SOP or POS expression possible, known as the
More informationCombinational Logic Circuits
Chapter 3 Combinational Logic Circuits 12 Hours 24 Marks 3.1 Standard representation for logical functions Boolean expressions / logic expressions / logical functions are expressed in terms of logical
More informationClustering Techniques
Clustering Techniques Bioinformatics: Issues and Algorithms CSE 308-408 Fall 2007 Lecture 16 Lopresti Fall 2007 Lecture 16-1 - Administrative notes Your final project / paper proposal is due on Friday,
More informationContinuous Representations of Time Series Gene Expression Data
Continuous Representations of Time Series Gene Expression Data Ziv BarJoseph Georg Gerber David K. Gifford MIT Laboratory for Computer Science 200 Technology Square, Cambridge, MA 02139 zivbj,georg,gifford
More informationSpecifying logic functions
CSE4: Components and Design Techniques for Digital Systems Specifying logic functions Instructor: Mohsen Imani Slides from: Prof.Tajana Simunic and Dr.Pietro Mercati We have seen various concepts: Last
More informationA quick review. The clustering problem: Hierarchical clustering algorithm: Many possible distance metrics K-mean clustering algorithm:
The clustering problem: partition genes into distinct sets with high homogeneity and high separation Hierarchical clustering algorithm: 1. Assign each object to a separate cluster.. Regroup the pair of
More information1.4 Euler Diagram Layout Techniques
1.4 Euler Diagram Layout Techniques Euler Diagram Layout Techniques: Overview Dual graph based methods Inductive methods Drawing with circles Including software demos. How is the drawing problem stated?
More informationDouble Self-Organizing Maps to Cluster Gene Expression Data
Double Self-Organizing Maps to Cluster Gene Expression Data Dali Wang, Habtom Ressom, Mohamad Musavi, Cristian Domnisoru University of Maine, Department of Electrical & Computer Engineering, Intelligent
More informationBiological Networks Analysis
Biological Networks Analysis Introduction and Dijkstra s algorithm Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein The clustering problem: partition genes into distinct
More informationAssignment (3-6) Boolean Algebra and Logic Simplification - General Questions