Bayesian Belief Networks for Data Mining. Harald Steck and Volker Tresp. Siemens AG, Corporate Technology. Information and Communications

Size: px
Start display at page:

Download "Bayesian Belief Networks for Data Mining. Harald Steck and Volker Tresp. Siemens AG, Corporate Technology. Information and Communications"

Transcription

1 Bayesian Belief Networks for Data Mining Harald Stek and Volker Tresp Siemens AG, Corporate Tehnology Information and Communiations Munih, Germany fharald.stek, Abstrat In this paper we present a novel onstraint based strutural learning algorithm for ausal networks. A set of onditional independene and dependene statements (CIDS) is derived from the data whih desribes the relationships among the variables. Although we impliitly assume that there exists a perfet map for the true, yet unknown, distribution, there does not need to be a perfet map for the CIDSs derived from the limited data. The reason is that the distribution of limited data might dier from the true probability distribution due to sampling noise. We derive a neessary ondition for the existene of a perfet map given a set of CIDSs and utilize it to hek for inonsistenies. If an inonsisteny is deteted, the algorithm nds all Bayesian networks with a minimum number of edges suh that a maximum number of CIDSs is represented in eah of the multiple solutions. The advantages of our approah are illustrated using the alarm network data set. 1 INTRODUCTION A Bayesian belief network represents onditional independenes in the underlying probability distribution of the data in the form of a direted ayli graph (DAG). The DAG: A! B C states, for example, that A and C are independent if B is not known, but A and C beome dependent as soon as the state of B is xed. If all onditional independenes in the probability distribution are represented in the DAG and vie versa, then the Bayesian network is said to be a perfet map of the probability distribution, and is alled a also with Tehnial University of Munih, Department of Computer Siene, Munih, Germany ausal network [Pearl 1988]. Typially a Bayesian network is onstruted from expert knowledge although reently there has been an inreasing interest in learning the struture of Bayesian networks from data for data mining appliations. The basi idea is to display probabilisti dependenes and independenes derived from the data in a onise way by a Bayesian network whih has the potential of providing muh more information about a domain than visualizations solely based on orrelations and distane measures. Bayesian networks an be used to display ausal models whih an be understood by humans more intuitively than models with undireted edges like Markov networks. Model unertainty is a serious problem in strutural learning of Bayesian network models and omes into play if there does not exist a perfet map of the probability distribution of the data. If a limited data set is given, its probability distribution might dier from the true one due to sampling noise. For strutural learning we assume that the true probability distribution is suh that there exists a perfet map. It is important to present to the user truefully the strutural unertainties sine otherwise he or she might draw inorret onlusions from the presented struture. In a Bayesian approah, strutural unertainty an be represented to the user by presenting multiple solutions whih have obtained a high sore. The disadvantage is that one an never be sure that one has found all solutions with a high sore. Furthermore, it is very diult for a user to study multiple solutions visually and to draw meaningful onlusions. In addition, the omputationally ost of nding the K-best solutions is even higher than for nding only the network with the best sore. Therefore we have foused on an approximate way of strutural learning without losing model unertainty in the result. In the next setion, we give a short summary of strutural learning. Then we present a proposition whih serves as a neessary ondition for the existene of a perfet map given a set of onditional independene

2 and dependene statements (CIDS). This proposition will be used to hek for inonsistenies among the estimated CIDSs whih then might lead to multiple solutions. Then we present the algorithm utilizing this proposition when heking the CIDSs for onsisteny and omment on its omplexity. We lose with results of the omputer experiments. 2 STRUCTURAL LEARNING There are two main approahes to strutural learning of Bayesian belief networks. In the Bayesian approah [Cooper and Herskovits 1992, Hekerman et al. 1994, Hekerman 1995] a global ost funtion { the posterior probability of the (entire) network { is maximized. This approah requires an involved searh for the best struture and is therefore omputationally very expensive and not diretly appliable to data mining appliations. In this paper we pursue a onstraint based approah similar to the ones in [Wermuth and Lauritzen 1983, Fung and Crawford 1990, Spirtes et al. 1993, Suzuki 1996, Cheng et al. 1997]. The basi idea of those algorithms is to derive a set of CIDSs from the data without taking into aount the Bayesian network struture. The Bayesian network is then onstruted from the CIDSs in a later step. This is done in standard algorithms by removing an edge in the Bayesian network whenever the orresponding pair of variables is found onditional independent. There an, however, our inonsistenies among the CIDSs derived from limited data when reonstruting the entire network (up to equivalene), i.e. not all CIDSs an be represented in a perfet map simultaneously. These inonsistenies reet model unertainty, i.e. there might exist more than one Bayesian network model desribing the probability distribution of the data well (aording to some measure). This indiates how reliable one partiular struture learned from the database is. These multiple solutions have usually many edges in ommon and dier in the presene of a few edges only whih we all inonsistent edges, sine they are related to inonsistenies among the CIDSs. The set of inonsistent edges an further be divided into subsets suh that the presene or absene of edges belonging to dierent subsets is independent of eah other in the multiple solutions. Suh a subset we all an ambiguous region. Hene, we visualize the set of solutions in a single graph (f. Figure 3), in whih the edges ommon to all networks are skethed by solid lines, whereas the edges belonging to the same ambiguous region are depited by dashed lines of the same style. The possible strutures in eah of the ambiguous regions annot be depited in suh a graph. They are hene stated in the gure aption. The diretions of the edges are speied in a later stage by this kind of onstraint based approah. These possibly multiple solutions for the network strutures do not belong to the same equivalene lass, sine they dier in the presene or absene of ertain edges. So the multiple solutions have to be distinguished from networks belonging to the same equivalene lass. The advantage of our onstraint based approah is that it an systematially onstrut the set of all Bayesian networks. Furthermore, it an take advantage of the fat that the multiple solutions typially have many edges in ommon and dier only in a few inonsistent edges whih an be partitioned into independent ambiguous regions. Regarding the derivation of the CIDSs from the data set, the omputation time of this approah relative to the global Bayesian approah is appreiably shorter for sparse network strutures, i.e. when the probability distribution exhibits many onditional independenes. For dense networks, this may not hold, but Bayesian network models might not be the most suitable model in suh a situation, anyway. The omputation time an additionally be redued by arrying it out in parallel. This an be realized in a simple master and slave sheme. 3 NECESSARY PATH CONDITION The algorithm we depit in this paper makes use of the following proposition as a neessary ondition for the existene of a perfet map given a set of CIDSs derived from the data. Essentially, the proposition requires that if two variables 1 a, b only beome independent by onditioning on an additional variable, then there must be a path onneting them via in the graph. Otherwise the dependene without onditioning on would be unexplained. If there were no neessary ondition at all, then one would arrive at the SGS or PC algorithm [Spirtes et al. 1993]. Proposition: Let V be the set of variables and P a probability distribution with the perfet map G: If two variables a; b 2 V are onditionally independent in P given a set of variables S V nfa; bg and they are dependent given any (proper) subset S 0 S, then there is no edge between a and b in G and there exist (undireted) paths between eah s 2 S and a not rossing b as well as between eah s 2 S and b not rossing a. 1 For brevity, we use nodes and variables synonymously and denote both with small letters, sine there is a one-toone orrespondene.

3 (B) learned networks (equivalene lasses) (C): SGS and PC algorithm (I) (II) (A) a b original network D(a,b),D(a,b {}), D(a,), I(a, {b}), D(b,),D(b, {a}) D(a,b),D(a,b {}), D(a,), I(a, {b}), D(b,), I(b, {a}) D(a,b),D(a,b {}), I(a,), I(a, {b}), I(b,), I(b, {a}) (I): (II): (III): a a a b b b a or a b b (III) 10(D): proposed 100 algorithm (I) sample size (II) (III) sample size Figure 1: From the original network (A) we randomly generate data sets of various sizes (sample size between 10 and 10000). The distribution of the variables is given by b = " b, a = m ab b + " a and = m b b + " with the parameters m ab = 3, m b = 0:3 and with Gaussian distributed noise " i for i 2 fa; b; g of unit variane and zero mean. For omparison with the SGS and PC algorithms we used the test on vanishing partial orrelations with a signiane level of 0:01 in this example in order to derive the onditional independene and dependene statements (CIDSs) from the data sets. For details see setion (3). Proof: This proposition follows immediately from the denitions of the perfet map and of the d-separation (see for instane [Pearl 1988]). Given two variables a and b and a set S V nfa; bg, assume that there is a variable v 2 S suh that there is no (undireted) path between a and v not rossing b in the perfet map G. Then, if a and b are d-separated given S, they are also d-separated given Snfvg. Hene, onsidering the probability distribution P, if a and b are independent onditional on S, then they are also independent onditional on Snfvg. QED. Neessary Path Condition: In general one annot expet that an arbitrary set of CIDSs an be represented in a perfet map, sine a perfet map implies ertain relations among the CIDSs to hold. Therefore it is desirable to have neessary and suient onditions at hand for eiently heking the CIDSs for onsisteny, i.e. if all the CIDSs an simultaneously be displayed in one perfet map. Sine we like to retain the property of the learning algorithm to sequentially learn the skeleton of the Bayesian network, i.e. rst the presene of its edges, then the orientations of the edges and nally the parameters of the model, a suient ondition for the existene of a perfet map when onstruting the skeleton annot be available. The above proposition provides, however, a neessary ondition for the existene of a perfet map whih means that it an be used to hek for inonsistenies among the CIDS in the sense that if an inonsisteny is found there annot exist a perfet map representing all the CIDSs. In networks involving a large number of variables, there might be several estimated onditional independene statements (CIS) for eah pair of variables a, b. Therefore we propose the following Neessary Path Condition to be used to hek for inonsistenies in the algorithm: For eah absent edge [a; b] in the network there has to be represented at least one { not all { CIS I(a; bjs) { with the orresponding D(a; bjs 0 ) for all S 0 S { in the perfet map aording to the above proposition. This is less strit than the proposition itself, but makes the algorithm more robust, sine an inonsisteny is only deteted if there is an edges for whih no CIDSs an be found onsistent. Example (f. Figure 1): In Figure 1 it is shown in a toy model of three variables a, b and that it is essential to hek the set of CIDSs derived from limited data on onsisteny with the Bayesian network model, sine eah onditional independene statement (CIS) or onditional dependene statement (CDS) is derived from the data without taking into aount the model at all. In (B) we ompare the resulting networks onstruted from (possibly inonsistent) sets of CIDSs by two dierent algorithms: First, the SGS and PC algorithms (dashed arrows) whih assumes an edge absent in the model whenever a onditional independene is derived from the data without heking for onsisteny. Seond, the proposed algorithm (solid arrows) heking for onsisteny. In the rst senario (f. network (I) in (B)), assume that the only onditional independene statement de-

4 rived from the data is I(a; jfbg) (together with the (onditional) dependene statements D(a; b), D(a; ), D(b; ), D(a; bjfg), D(b; jfag)). Aording to the proposition this requires that there must exist a path between a and b as well as between b and, whereas the edge [a; ] 2 must be absent. Apparently, there exists a perfet map, namely omprising the edges [a; b] and [b; ], i.e. the set of CIDSs is onsistent. In the seond senario (f. networks (II) in (B)), assume the two CISs I(a; jfbg) and I(b; jfag) { and the CDSs D(a; b), D(a; ), D(b; ), D(a; bjfg) { are found from the data. The edge [a; b] is required by these CIDSs, but the CIS I(a; jfbg) together with D(a; ) additionally requires the presene of the edge [b; ] and the absene of the edge [a; ], whereas the other CIS requires just the ontrary. Hene, there annot exist a perfet map fullling all the CIDSs simultaneously. For example, the set of CIDSs orresponding to a perfet map ontaining only the edge [a; b] (as found by the SGS or PC algorithm) omprises I(a; ) and I(b; ) whih violates both D(a; ) and D(b; ). If one assumes now that the inonsistenies among the CIDSs are solely due to sampling noise in the limited data set, it makes sense to searh for all possible Bayesian networks with a minimum number of edges eah of whih onstruted from a maximal onsistent subset of the CIDSs derived from the data. In this example, it is apparent that exatly one of the two CISs annot be represented in a perfet map. Consequently, one nds two possible network strutures: the edge [a; b] is ommon to both, and they ontain edge [a; ] or [b; ], respetively. Neither of the two networks represents all the CISs whih were derived from the data. Note that the orret ausal network, i.e. the perfet map of the { in general unknown { true set of CIDSs is among the multiple solutions in this example. Finally, the set of CIDSs omprising the unonditional CISs I(a; ) and I(b; ) are onsistent and lead to a network with only the edge [a; b] being present (f. network (III) in (B)), i.e. the original network struture ould not be reovered from these CIDSs whih tend to be derived from small data sets. In (C) and (D) the resulting network strutures depending on the sample size is shown. While our approah yields multiple solutions (f. (II) in (D)) for "medium sized" data sets (sample size between a. 70 and a. 700), the SGS or PC algorithms nd the network omprising the single edge [a; b] (f. (C)). 2 In this notation for undireted edges, the order of the variables does not matter, i.e. [a; b] = [b; a]. 4 ALGORITHM Before we go into details of the proposed algorithm, we give a short overview of its main steps. After the set of onditional independene and dependene statements (CIDS) has been derived from the data the proposed algorithm applies the Neessary Path Condition presented above in order to hek, if a perfet map an exist given the set of CIDSs. If inonsistenies are deteted, the algorithm searhes for all possible Bayesian networks as desribed in detail in setion 4.3. For eah of the multiple solutions, the diretions of the edges are xed in the last step. All these networks ontain the same number of edges. Among those are all the edge resulting from the the SGS algorithm 3. Eah of the minimal solutions found by the algorithm represents an equivalene lass, and is a andidate for being a perfet map of the (unknown) true probability distribution of the data set. 4.1 DERIVING THE CIDSs The algorithm builds up the set of onditional independene and dependene statements from a given data set in the rst step. This an eiently be done by relying on asymptoti results whih appear reasonably aurate in our experiments, for example by alulating the Bayesian Information Criterion or a test statisti for eah pair of variables given a subset of the remaining variables. Considering independene tests it is apparent that CDSs derived from the data are more reliable than the CISs. This is beause the null hypothesis of independene an only by falsied but not veried by a test. This indiates that network strutures learned by a onstraint based approah tend to remove too many edges, in partiular given small data sets. As will be shown below, this tendeny of removing too many edges an largely be redued by heking the CIDSs for onsisteny. For onveniene we do not denote the onditional dependene statements, i.e. if a statement is not in the set of CISs (and annot be inferred from it by ombining some of the CISs aording to the Faithfulness and Markov Conditions as desribed 3 In this paper, we fous on how to onstrut the skeleton of a Bayesian network from the CIDSs rather than how to derive the CIDSs themselves. The CIDS derived by the SGS and PC algorithms an dier, sine the latter failitates heuristis. Here, we ompare our algorithm with the SGS algorithm, beause the experiments arried out here are based on the CIDSs derived by the SGS algorithm. Of ourse, our algorithm an also be applied to the CIDSs derived by the PC algorithm, and the same statements apply as for the SGS algorithm.

5 Figure 2: The alarm network ontains 37 variables and 46 edges. The numbering of the variables is hosen as in [Cheng et al. 1997]. in [Spirtes et al. 1993]), it is understood that this implies a (onditional) dependene. 4.2 RULES For eah given CIS of the form I(a; bjs) with D(a; bjs 0 ) 8S 0 S the proposed Neessary Path Condition requires the absene of the edge [a; b] and the presene of the paths between a and eah variable s 2 S as well as between b and eah variable s 2 S. We represent this onstraint on the absene or presene of ertain edges by a rule. Suh a rule is of the form X Y, where X denotes an edge and Y is a (possibly empty) set of edges. Aording to the Neessary Path Condition, the above CIS with the CDSs is translated into a rule as follows: X = [a; b] and Y S = f[a; s]; [b; s]g. It an be interpreted in the s2s way that edge X an only be absent in the Bayesian network, if the edges in the set Y are present. Sine the above proposition requires ertain paths rather than ertain edges to be present, this is absorbed in the fat that new rules an be generated by substituting rules into eah other as follows: Given two rules X Y and W Z then a new rule an be generated for edge X, if the edges W 2 Y and X =2 Z, namely X (Z [ YnfW g). Therefore an edge an be absent, if there is at least one rule fullled (f. Neessary Path Condition). One the set of rules is derived from the set of CIDSs, the assoiated perfet map an be onstruted. If there is a Bayesian network suh that for all edges a rule is fullled, then there might exist a perfet map assoiated with the estimated probability distribution. If there are some edges for whih none of the given rules an be fullled by a Bayesian network, we all the set of rules and the set of CIDSs inonsistent. In this ase, there does not exist a perfet map assoiated with the estimated probability distribution. 4.3 MULTIPLE SOLUTIONS Our algorithm nds multiple solutions, when there does not exist a perfet map of the estimated CIDSs aording to the Neessary Path Condition. If inonsistenies among the CIDS are found by the algorithm, it makes sense to onsider these inonsistenies to be present due to sampling noise in the limited data set and to retain the assumption that there exists a perfet map assoiated with the (unknown) set of (true) CIDSs. Therefore the algorithm searhes for all possible minimum subsets of the set of CIDSs whih are onsistent in the above sense, i.e. the algorithm searhes for all possible network strutures with a minimum number of edges suh that for a maximum number of edges a rule is fullled. Eah of these possible networks is a andidate for being the perfet map of the (unknown) set of true CIDSs. It turns out that the edges of the resulting networks an be divided into three main groups: If there is no rule X Y for an edge X, then no estimated CIS was found, and hene this edge is present in the network. If there an be rendered a rule X Y for edge X suh that Y is empty or ontains only edges whih are present in the perfet map, then we all it a onsistent edge whih is absent. An edge for whih no suh rule an be generated indiates that there does not exist a perfet map given the set of CIDSs. Suh edges we all inonsistent and they might be present or absent in some of the multiple solutions. The multiple solutions dier only in the presenes of the the latter edges. As we will see from the experiments (in setion 5), the inonsistent edges an usually be further subdivided: They an be partitioned into sets of edges whose presenes depend on eah other, whereas edges belonging

6 Figure 3: This graph skethes the multiple solutions learned from a data set of size with the signiane level 0:01 before the diretions of the edges are added. Solid lines denote edges whih are present in all the possible network strutures. The multiple solutions dier in the edges belonging to the 4 ambiguous regions whih are depited in dierent line styles. The possible strutures for eah region are as follows: In region (A) either edge [11; 12] or edge [12; 32] is present. In region (B) either [14; 27] or [27; 33] is present. In region (C) either [15; 22] or [22; 35] is present. In region (D) either the single edge [18; 26] or the two edges [6; 18] and [3; 26] are present; here, the only minimum struture is the one omprising the single edge [18; 26]. Hene, there are two minimum strutures in eah of the regions (A), (B) and (C), and one minimum struture in region (D). Therefore, the overall number of multiple solutions is edges have orretly been identied whih are present in all the multiple solutions. Additional 4 edges have been found due to the Neessary Path Condition implemented in our algorithm. The networks among the multiple solutions whih are losest to the original one ontain 45 (out of 46) orret edges, and only one edge is missing, namely either [15; 22] or [22; 35]. The resulting network of the SGS algorithm 3 ontains the same 41 edges, whih are ommon to all multiple solutions of our algorithm, so that 5 edges are missing. to dierent sets do not depend on eah other. Eah of suh a set we all an ambiguous region. There might be several of suh regions. Tehnially speaking, the algorithm nds all the edges belonging to the same ambiguous region in the following way: Only the rules for inonsistent edges are onsidered. If for two inonsistent edges X and W there an be rendered rules X Y and W Z suh that X 2 Z and W 2 Y, then they are grouped into the same ambiguous region, beause it might not be possible to fulll both of those rules simultaneously in a Bayesian network. This an also be seen as searhing for yles in a direted ayli graph (DAG) whih is generated from the rules: Eah node of that graph represents an inonsistent edge of the Bayesian network, and for eah rule X Y edges are present pointing from eah node Y 2 Y to node X in that DAG. Our algorithm takes advantage of the fat that the multiple solutions dier only in the ambiguous regions and that they are independent of eah other. Sine eah of suh a region usually ontains only a few edges, searhing for all possible strutures suh that the number of onsistent CISs is maximum an be done very ef- iently: Simply by arrying out an exhaustive searh in eah ambiguous region separately. The number of dierent network strutures is then given by the produt of the number of dierent strutures in eah ambiguous region. 4.4 FINDING DIRECTIONS OF EDGES Constraint based algorithms of this kind have the property that they an be split up into several steps. First, the algorithm nds the (undireted) edges whih are present in the Bayesian network. We have foused on that part in this paper. In the seond step, diretions are added to those edges whih an be derived from the data so that the equivalene lasses are identied. This an be done like in [Spirtes et al. 1993], for example. The fat that a data set is of limited size might, however, give rise to additional inonsistenies among the estimated CIDSs regarding the diretions of the edges. This inreases additionally the number of multiple solutions whih might dier in the diretions of their edges, although they have the same edges in ommon. We do not present any details on that issue in this paper.

7 4.5 COMPLEXITY Calulating the Bayesian information riterion or a test statisti for eah pair of variables given every subset of the remaining variables is intratable for large numbers of variables. Sine not all of those omputations are usually neessary for onstruting the Bayesian network from data, the omplexity of the problem an be redued by applying heuristis (see for instane [Spirtes et al. 1993]). Carrying out the omputations for all pairs of variables in asending size of the onditioning set and up to a ertain maximum order only, redues the omplexity greatly, i.e. it beomes polynomial in jv j. For CISs of high orders whih are not omputed from the data set it is assumed that exatly those are true whih an be inferred from the CISs of lower orders aording to the Faithfulness and Markov Conditions as desribed in [Spirtes et al. 1993]. Calulations relying on asymptoti results might yield more unreliable results for higher orders, anyway. Conditioning only on neighbors of the pair of variables (in the undireted graph) is an additional heuristi to speed up the derivation of the CIDSs. These heuristis require the algorithm of keeping trak of the network struture when deriving the CIDSs. After nishing all alulations of a ertain order of the onditioning set, a new intermediate version of the network an be built up based on the results available so far. Then, this version an be used to deide, if arrying out omputations on higher orders is neessary, and if so, what the neighbors of eah node are. Finding all possible strutures of the network from the set of rules is, in prinipal, intratable for a large number of variables, too. As we found in our experiments, however, there are edges whih are present in all the multiple solutions and only a few edges whih belong to ambiguous regions. Sine the strutures in dierent ambiguous regions are independent of eah other, all the possible strutures an be found for eah ambiguous region separately, whih speeds up alulation very muh. Hene, the problem is exponential in the number of rules involved in the largest ambiguous region of the network. In the experiments we found that the size of the ambiguous regions is muh smaller than the total number of edges. Therefore, nding all the possibilities in eah ambiguous region beomes tratable. For example, in our alarm network experiments, it took less than a minute on a Sun UltraSPARC-II to generate all the possible strutures given the set of CIDSs. It turned out in our experiments that almost all the alulation time is onsumed for deriving the CIDSs from the data and only a small fration is needed for onstruting all the multiple solutions. 5 EXPERIMENTS The alarm network [Beinlih et al. 1989] has evolved as kind of a benhmark for strutural learning of Bayesian belief networks. We used the alarm network from Norsis Corp. [Netia Alarm Network] to generate randomly data sets of various size. The variables are disrete and their number of states ranges from two to four. The size of the sample data was varied between and 1000, whih is muh smaller than the number of ongurations in the joint state spae. orret edges # edges # sample size (x1000) (b) 5 (a) sample size (x1000) Figure 4: (a) The number of orret edges learned from data depends on the sample size. The networks among the multiple solutions whih have the most edges in ommon with the original network ontain almost all the orret edges even for small sample sizes (dots). The number of those edges is signiantly loser to the orret number of 46 edges than is the number of edges whih are ommon to all the learned multiple solutions (triangles). The latter edges are idential with the edges found by the SGS algorithm 3. (b) The dierene of the two urves in (a) is the number of edges whih are simultaneously present in the ambiguous regions of any one of the multiple solutions (dots). The number of edges missing in all the multiple solutions rises for smaller sample sizes (squares), but stays at smaller values than the number of edges being present in the ambiguous regions due to the proposed Neessary Path Condition. In ontrast, the sum of those edges (dots + squares) is missing in the network resulting from the SGS algorithm 3. The number of erroneously added edges stays at small values (rosses) due to a small signiane level of 0:01. Eah point represents the mean of 5 experiments of the alarm network.

8 multiple solutions # ambiguous regions # sample size (x1000) sample size (x1000) of view, the estimated probability distribution of a larger data set is expeted to be more similar to the (unknown) true distribution so that the struture of the ausal network an uniquely be determined. As the size of the data set shrinks, a unique ausal network annot be derived any more and the number of possible networks of the data inreases. Hene, the number of multiple solutions found by the algorithm indiates in a way, if the size of the data set is suiently large for learning the network struture with ertainty. As the size of the data set dereases, not only inreases the number of multiple solutions, but so does also the number of ambiguous regions, whereas the number of edges involved in eah ambiguous region inreases only slowly. Figure 5: As the size of the data set dereases, the number of multiple solutions inreases and so does the number of ambiguous regions (f. inset). Eah point represents the mean of 5 experiments of the alarm network with a signiane level of 0:01. In order to derive the onditional independenes, we used the likelihood ratio test with an asymptoti 2 distribution as desribed in [Spirtes et al. 1993] for disrete variables. A typial result of strutural learning from a data set of limited size is skethed in Figure 3. For this data set of size the algorithm deteted that there does not exist a perfet map suh that all the CIDSs derived from the data an be fullled aording to the Neessary Path Condition. The multiple solutions dier in the presene of the edges belonging to 4 ambiguous regions, for eah of whih the possible strutures are depited in the aption. The networks among the multiple solutions of our algorithm whih are losest to the original network ontain 45 orret edges (out of 46) whereas the SGS algorithm 3 yields a network with only 41 orret edges. In partiular for fairly small data sets the dierenes in the solutions of those two algorithms inrease. In our approah, the number of missing edges grows muh more slowly with the sample size dereasing than they do in the resulting network of the SGS algorithm, beause the number of edges present in the ambiguous regions inreases (f. Figure 4). The number of edges erroneously present stays partiularly small due to the fat that our algorithm allows to use a small signifiane level when applying onditional independene tests. The number of multiple solutions depends on the size of the data set (f. Figure 5). From a frequentist point 6 CONCLUSIONS Strutural learning of ausal networks based on this kind of onstraint based approah an be split up into several steps whih are arried out sequentially. First, it is learned whih (undireted) edges are present in the network, then their diretions are xed and eventually the values of the parameters are adapted to the data set. In this paper we fous on the rst step, deiding whih edges are present and absent in the situation that only a limited amount of data is available. The proposition presented here serves as a neessary ondition for the existene of a perfet map given a set of onditional independene and dependene statements (CIDS). It essentially states that if an edge is absent in the perfet map, ertain other paths are required to be present. The proposed algorithm heks the set of CIDSs on onsisteny with the Bayesian network model aording to the Neessary Path Condition. If inonsistenies are found, it is assumed that they are solely due to sampling noise in the limited data set and that there nevertheless exists a perfet map of the true, yet unknown, probability distribution. Therefore, the algorithm searhes for all network strutures whih ontain a minimum number of edges and represent a maximum number of onsistent CIDSs. This results in multiple solutions. It turned out in our experiments that all the multiple solutions have many edges in ommon. There are also some edges (whih we all inonsistent edges) in whih the strutures of the multiple solutions dier from eah other. Usually, they an be grouped together in what we all an ambiguous region. In eah of whih the possibly multiple strutures an be found eiently, sine the ambiguous regions are independent of eah other and usually involve only a small number of edges. The

9 overall number of multiple solutions is given by the produt of the number of dierent minimum strutures in eah of the ambiguous regions. We found in the experiments that the number of multiple solutions inreases when the size of the data set dereases. Furthermore, also the number of ambiguous regions rises with a dereasing number of data sets. The multiple solutions of our algorithm ontain all the edges whih are present in the network found by the SGS algorithm 3 [Spirtes et al. 1993] as well as some additional edges, sine an estimated CIS does not neessarily lead to the absene of the orresponding edge in the Bayesian network. When the size of the data set dereases, the overall number of orret edges in the resulting networks of our algorithm drops muh more gradually than it does in the network found by the SGS algorithm. Depending on the properties (e.g. size) of the data set, we found in our experiments that the number of edges present in the networks learned by our approah an be larger than in the result of the SGS or PC algorithms by 0? 30%. In Bayesian approahes like [Cooper and Herskovits 1992, Hekerman et al. 1994, Hekerman 1995], a ost funtion for the entire network is evaluated. This an therefore be alled a global approah. Constraint based algorithms like [Spirtes et al. 1993, Suzuki 1996, Cheng et al. 1997] whih remove all edges for whih a onditional independene statement an be derived from the data do not take into aount the struture of the Bayesian network at all and an therefore be onsidered loal. The algorithm presented here is also onstraint based, but heks the set of CIDSs for onsisteny by requiring the presene of ertain paths for eah edge being absent in the network. Therefore, our approah is not ompletely loal, but takes into aount the neighborhood of eah edge. The size of suh a neighborhood an vary as it depends on the number and lengths of the required paths, for example. From an appliation point of view we might state that a suient number of data for arriving at a unique solution is rarely ever available. If the number of data is very small, one annot really expet too muh to start with. In the intermediate region however, our new approah nds out that this is struture whih ould not be uniquely identied and provides a lear statement about its unertainty. Without displaying this unertain struture muh is lost in the interpretation of the data. In onlusion, we believe that Bayesian networks will play an inreasing role in data mining appliations where they are apable of displaying eiently the important dependenes in a domain. The onstraint based approah is superior the global Bayesian approah in terms of learning speed. We have extended the onstraint based approah to truefully display strutural ambiguities whih we feel is an important step towards gaining the aeptane of the user. Aknowledgements We would like to thank Reimar Hofmann for fruitful disussions. This work has been supported by an Ernst-von-Siemens grant and is part of the PRONEL projet (PRObabilisti NEtworks and Learning) funded by the EU (Esprit Projet number 28932, partners: Siemens, HUGIN, Aalborg University, TIGA). Referenes [Pearl 1988] J. Pearl. Probabilisti Reasoning in Intelligent Systems, Morgan and Kaufman, San Mateo, [Wermuth and Lauritzen 1983] N. Wermuth and S. Lauritzen. Graphial and reursive models for ontingeny tables, Biometrika 72, pp , [Cooper and Herskovits 1992] G. F. Cooper and E. Herskovits. A Bayesian Method for the Indution of Probabilisti Network from Data, Mahine Learning, Vol. 9, pp , July [Hekerman 1995] D. Hekerman. A Tutorial on Learning Bayesian Networks, Tehnial Report MSR-TR-95-06, Mirosoft Researh, [Hekerman et al. 1994] D. Hekerman, D. Geiger and D. M. Chikering. Learning Bayesian networks: the ombination of knowledge and statistial data, Tehnial Report MSR-TR-94-09, Mirosoft Researh, [Spirtes et al. 1993] P. Spirtes, C. Glymour and R. Sheines. Causation, Predition, and Searh, Springer, Leture Notes in Statistis 81, 1993; TETRAD.BOOK/book.html [Cheng et al. 1997] J. Cheng, D. A. Bell, W. Liu. An Algorithm for Bayesian Belief Network Constrution from Data, AI & STAT, 1997; Learning Belief Networks from Data: An Information Theory Based Approah, CIKM, [Fung and Crawford 1990] R. M. Fung and S. L. Crawford. Construtor: A System for the Indution of Probabilisti Models, AAAI-90 Proeedings. Eighth National Conferene

10 on Artiial Intelligene, MIT Press, 1990, Vol. 2, pp [Suzuki 1996] J. Suzuki. Learning Bayesian Belief Networks Based on the Minimum Desription Length Priniple: An Eient Algorithm Using the B & B Tehnique, Proeedings of the international onferene on mahine learning, Bally, Italy, [Beinlih et al. 1989] I. A. Beinlih, H. J. Suermondt, R. M. Chavez and G. F. Cooper. The ALARM monitoring system: A ase study with two probabilisti inferene tehniques for belief networks, Proeedings of the Seond European Conferene on Artiial Intelligene in Mediine, pp , London,UK, [Netia Alarm Network] Alarm Network from the Network Library of Norsys Software Corp.,

Extracting Partition Statistics from Semistructured Data

Extracting Partition Statistics from Semistructured Data Extrating Partition Statistis from Semistrutured Data John N. Wilson Rihard Gourlay Robert Japp Mathias Neumüller Department of Computer and Information Sienes University of Strathlyde, Glasgow, UK {jnw,rsg,rpj,mathias}@is.strath.a.uk

More information

The Minimum Redundancy Maximum Relevance Approach to Building Sparse Support Vector Machines

The Minimum Redundancy Maximum Relevance Approach to Building Sparse Support Vector Machines The Minimum Redundany Maximum Relevane Approah to Building Sparse Support Vetor Mahines Xiaoxing Yang, Ke Tang, and Xin Yao, Nature Inspired Computation and Appliations Laboratory (NICAL), Shool of Computer

More information

Learning Convention Propagation in BeerAdvocate Reviews from a etwork Perspective. Abstract

Learning Convention Propagation in BeerAdvocate Reviews from a etwork Perspective. Abstract CS 9 Projet Final Report: Learning Convention Propagation in BeerAdvoate Reviews from a etwork Perspetive Abstrat We look at the way onventions propagate between reviews on the BeerAdvoate dataset, and

More information

NONLINEAR BACK PROJECTION FOR TOMOGRAPHIC IMAGE RECONSTRUCTION. Ken Sauer and Charles A. Bouman

NONLINEAR BACK PROJECTION FOR TOMOGRAPHIC IMAGE RECONSTRUCTION. Ken Sauer and Charles A. Bouman NONLINEAR BACK PROJECTION FOR TOMOGRAPHIC IMAGE RECONSTRUCTION Ken Sauer and Charles A. Bouman Department of Eletrial Engineering, University of Notre Dame Notre Dame, IN 46556, (219) 631-6999 Shool of

More information

Automatic Physical Design Tuning: Workload as a Sequence Sanjay Agrawal Microsoft Research One Microsoft Way Redmond, WA, USA +1-(425)

Automatic Physical Design Tuning: Workload as a Sequence Sanjay Agrawal Microsoft Research One Microsoft Way Redmond, WA, USA +1-(425) Automati Physial Design Tuning: Workload as a Sequene Sanjay Agrawal Mirosoft Researh One Mirosoft Way Redmond, WA, USA +1-(425) 75-357 sagrawal@mirosoft.om Eri Chu * Computer Sienes Department University

More information

the data. Structured Principal Component Analysis (SPCA)

the data. Structured Principal Component Analysis (SPCA) Strutured Prinipal Component Analysis Kristin M. Branson and Sameer Agarwal Department of Computer Siene and Engineering University of California, San Diego La Jolla, CA 9193-114 Abstrat Many tasks involving

More information

Approximate logic synthesis for error tolerant applications

Approximate logic synthesis for error tolerant applications Approximate logi synthesis for error tolerant appliations Doohul Shin and Sandeep K. Gupta Eletrial Engineering Department, University of Southern California, Los Angeles, CA 989 {doohuls, sandeep}@us.edu

More information

Abstract. Key Words: Image Filters, Fuzzy Filters, Order Statistics Filters, Rank Ordered Mean Filters, Channel Noise. 1.

Abstract. Key Words: Image Filters, Fuzzy Filters, Order Statistics Filters, Rank Ordered Mean Filters, Channel Noise. 1. Fuzzy Weighted Rank Ordered Mean (FWROM) Filters for Mixed Noise Suppression from Images S. Meher, G. Panda, B. Majhi 3, M.R. Meher 4,,4 Department of Eletronis and I.E., National Institute of Tehnology,

More information

Semi-Supervised Affinity Propagation with Instance-Level Constraints

Semi-Supervised Affinity Propagation with Instance-Level Constraints Semi-Supervised Affinity Propagation with Instane-Level Constraints Inmar E. Givoni, Brendan J. Frey Probabilisti and Statistial Inferene Group University of Toronto 10 King s College Road, Toronto, Ontario,

More information

A Novel Validity Index for Determination of the Optimal Number of Clusters

A Novel Validity Index for Determination of the Optimal Number of Clusters IEICE TRANS. INF. & SYST., VOL.E84 D, NO.2 FEBRUARY 2001 281 LETTER A Novel Validity Index for Determination of the Optimal Number of Clusters Do-Jong KIM, Yong-Woon PARK, and Dong-Jo PARK, Nonmembers

More information

An Alternative Approach to the Fuzzifier in Fuzzy Clustering to Obtain Better Clustering Results

An Alternative Approach to the Fuzzifier in Fuzzy Clustering to Obtain Better Clustering Results An Alternative Approah to the Fuzziier in Fuzzy Clustering to Obtain Better Clustering Results Frank Klawonn Department o Computer Siene University o Applied Sienes BS/WF Salzdahlumer Str. 46/48 D-38302

More information

Constructing Transaction Serialization Order for Incremental. Data Warehouse Refresh. Ming-Ling Lo and Hui-I Hsiao. IBM T. J. Watson Research Center

Constructing Transaction Serialization Order for Incremental. Data Warehouse Refresh. Ming-Ling Lo and Hui-I Hsiao. IBM T. J. Watson Research Center Construting Transation Serialization Order for Inremental Data Warehouse Refresh Ming-Ling Lo and Hui-I Hsiao IBM T. J. Watson Researh Center July 11, 1997 Abstrat In typial pratie of data warehouse, the

More information

Pipelined Multipliers for Reconfigurable Hardware

Pipelined Multipliers for Reconfigurable Hardware Pipelined Multipliers for Reonfigurable Hardware Mithell J. Myjak and José G. Delgado-Frias Shool of Eletrial Engineering and Computer Siene, Washington State University Pullman, WA 99164-2752 USA {mmyjak,

More information

Exploring the Commonality in Feature Modeling Notations

Exploring the Commonality in Feature Modeling Notations Exploring the Commonality in Feature Modeling Notations Miloslav ŠÍPKA Slovak University of Tehnology Faulty of Informatis and Information Tehnologies Ilkovičova 3, 842 16 Bratislava, Slovakia miloslav.sipka@gmail.om

More information

A DYNAMIC ACCESS CONTROL WITH BINARY KEY-PAIR

A DYNAMIC ACCESS CONTROL WITH BINARY KEY-PAIR Malaysian Journal of Computer Siene, Vol 10 No 1, June 1997, pp 36-41 A DYNAMIC ACCESS CONTROL WITH BINARY KEY-PAIR Md Rafiqul Islam, Harihodin Selamat and Mohd Noor Md Sap Faulty of Computer Siene and

More information

Naïve Bayesian Rough Sets Under Fuzziness

Naïve Bayesian Rough Sets Under Fuzziness IJMSA: Vol. 6, No. 1-2, January-June 2012, pp. 19 25 Serials Publiations ISSN: 0973-6786 Naïve ayesian Rough Sets Under Fuzziness G. GANSAN 1,. KRISHNAVNI 2 T. HYMAVATHI 3 1,2,3 Department of Mathematis,

More information

Cluster-Based Cumulative Ensembles

Cluster-Based Cumulative Ensembles Cluster-Based Cumulative Ensembles Hanan G. Ayad and Mohamed S. Kamel Pattern Analysis and Mahine Intelligene Lab, Eletrial and Computer Engineering, University of Waterloo, Waterloo, Ontario N2L 3G1,

More information

splitting tehniques that partition live ranges have been proposed to solve both the spilling problem[5][8] and the assignment problem[8][9]. The parti

splitting tehniques that partition live ranges have been proposed to solve both the spilling problem[5][8] and the assignment problem[8][9]. The parti Load/Store Range Analysis for Global Register Alloation Priyadarshan Kolte and Mary Jean Harrold Department of Computer Siene Clemson University Abstrat Live range splitting tehniques divide the live ranges

More information

Graph-Based vs Depth-Based Data Representation for Multiview Images

Graph-Based vs Depth-Based Data Representation for Multiview Images Graph-Based vs Depth-Based Data Representation for Multiview Images Thomas Maugey, Antonio Ortega, Pasal Frossard Signal Proessing Laboratory (LTS), Eole Polytehnique Fédérale de Lausanne (EPFL) Email:

More information

Using Augmented Measurements to Improve the Convergence of ICP

Using Augmented Measurements to Improve the Convergence of ICP Using Augmented Measurements to Improve the onvergene of IP Jaopo Serafin, Giorgio Grisetti Dept. of omputer, ontrol and Management Engineering, Sapienza University of Rome, Via Ariosto 25, I-0085, Rome,

More information

Capturing Large Intra-class Variations of Biometric Data by Template Co-updating

Capturing Large Intra-class Variations of Biometric Data by Template Co-updating Capturing Large Intra-lass Variations of Biometri Data by Template Co-updating Ajita Rattani University of Cagliari Piazza d'armi, Cagliari, Italy ajita.rattani@diee.unia.it Gian Lua Marialis University

More information

Algorithms, Mechanisms and Procedures for the Computer-aided Project Generation System

Algorithms, Mechanisms and Procedures for the Computer-aided Project Generation System Algorithms, Mehanisms and Proedures for the Computer-aided Projet Generation System Anton O. Butko 1*, Aleksandr P. Briukhovetskii 2, Dmitry E. Grigoriev 2# and Konstantin S. Kalashnikov 3 1 Department

More information

A Partial Sorting Algorithm in Multi-Hop Wireless Sensor Networks

A Partial Sorting Algorithm in Multi-Hop Wireless Sensor Networks A Partial Sorting Algorithm in Multi-Hop Wireless Sensor Networks Abouberine Ould Cheikhna Department of Computer Siene University of Piardie Jules Verne 80039 Amiens Frane Ould.heikhna.abouberine @u-piardie.fr

More information

Using Game Theory and Bayesian Networks to Optimize Cooperation in Ad Hoc Wireless Networks

Using Game Theory and Bayesian Networks to Optimize Cooperation in Ad Hoc Wireless Networks Using Game Theory and Bayesian Networks to Optimize Cooperation in Ad Ho Wireless Networks Giorgio Quer, Federio Librino, Lua Canzian, Leonardo Badia, Mihele Zorzi, University of California San Diego La

More information

Detecting Outliers in High-Dimensional Datasets with Mixed Attributes

Detecting Outliers in High-Dimensional Datasets with Mixed Attributes Deteting Outliers in High-Dimensional Datasets with Mixed Attributes A. Koufakou, M. Georgiopoulos, and G.C. Anagnostopoulos 2 Shool of EECS, University of Central Florida, Orlando, FL, USA 2 Dept. of

More information

Plot-to-track correlation in A-SMGCS using the target images from a Surface Movement Radar

Plot-to-track correlation in A-SMGCS using the target images from a Surface Movement Radar Plot-to-trak orrelation in A-SMGCS using the target images from a Surfae Movement Radar G. Golino Radar & ehnology Division AMS, Italy ggolino@amsjv.it Abstrat he main topi of this paper is the formulation

More information

Gradient based progressive probabilistic Hough transform

Gradient based progressive probabilistic Hough transform Gradient based progressive probabilisti Hough transform C.Galambos, J.Kittler and J.Matas Abstrat: The authors look at the benefits of exploiting gradient information to enhane the progressive probabilisti

More information

Gray Codes for Reflectable Languages

Gray Codes for Reflectable Languages Gray Codes for Refletable Languages Yue Li Joe Sawada Marh 8, 2008 Abstrat We lassify a type of language alled a refletable language. We then develop a generi algorithm that an be used to list all strings

More information

Weak Dependence on Initialization in Mixture of Linear Regressions

Weak Dependence on Initialization in Mixture of Linear Regressions Proeedings of the International MultiConferene of Engineers and Computer Sientists 8 Vol I IMECS 8, Marh -6, 8, Hong Kong Weak Dependene on Initialization in Mixture of Linear Regressions Ryohei Nakano

More information

Multi-Piece Mold Design Based on Linear Mixed-Integer Program Toward Guaranteed Optimality

Multi-Piece Mold Design Based on Linear Mixed-Integer Program Toward Guaranteed Optimality INTERNATIONAL CONFERENCE ON MANUFACTURING AUTOMATION (ICMA200) Multi-Piee Mold Design Based on Linear Mixed-Integer Program Toward Guaranteed Optimality Stephen Stoyan, Yong Chen* Epstein Department of

More information

On - Line Path Delay Fault Testing of Omega MINs M. Bellos 1, E. Kalligeros 1, D. Nikolos 1,2 & H. T. Vergos 1,2

On - Line Path Delay Fault Testing of Omega MINs M. Bellos 1, E. Kalligeros 1, D. Nikolos 1,2 & H. T. Vergos 1,2 On - Line Path Delay Fault Testing of Omega MINs M. Bellos, E. Kalligeros, D. Nikolos,2 & H. T. Vergos,2 Dept. of Computer Engineering and Informatis 2 Computer Tehnology Institute University of Patras,

More information

Algorithms for External Memory Lecture 6 Graph Algorithms - Weighted List Ranking

Algorithms for External Memory Lecture 6 Graph Algorithms - Weighted List Ranking Algorithms for External Memory Leture 6 Graph Algorithms - Weighted List Ranking Leturer: Nodari Sithinava Sribe: Andi Hellmund, Simon Ohsenreither 1 Introdution & Motivation After talking about I/O-effiient

More information

Sequential Incremental-Value Auctions

Sequential Incremental-Value Auctions Sequential Inremental-Value Autions Xiaoming Zheng and Sven Koenig Department of Computer Siene University of Southern California Los Angeles, CA 90089-0781 {xiaominz,skoenig}@us.edu Abstrat We study the

More information

特集 Road Border Recognition Using FIR Images and LIDAR Signal Processing

特集 Road Border Recognition Using FIR Images and LIDAR Signal Processing デンソーテクニカルレビュー Vol. 15 2010 特集 Road Border Reognition Using FIR Images and LIDAR Signal Proessing 高木聖和 バーゼル ファルディ Kiyokazu TAKAGI Basel Fardi ヘンドリック ヴァイゲル Hendrik Weigel ゲルド ヴァニーリック Gerd Wanielik This paper

More information

Trajectory Tracking Control for A Wheeled Mobile Robot Using Fuzzy Logic Controller

Trajectory Tracking Control for A Wheeled Mobile Robot Using Fuzzy Logic Controller Trajetory Traking Control for A Wheeled Mobile Robot Using Fuzzy Logi Controller K N FARESS 1 M T EL HAGRY 1 A A EL KOSY 2 1 Eletronis researh institute, Cairo, Egypt 2 Faulty of Engineering, Cairo University,

More information

Self-Adaptive Parent to Mean-Centric Recombination for Real-Parameter Optimization

Self-Adaptive Parent to Mean-Centric Recombination for Real-Parameter Optimization Self-Adaptive Parent to Mean-Centri Reombination for Real-Parameter Optimization Kalyanmoy Deb and Himanshu Jain Department of Mehanial Engineering Indian Institute of Tehnology Kanpur Kanpur, PIN 86 {deb,hjain}@iitk.a.in

More information

with respect to the normal in each medium, respectively. The question is: How are θ

with respect to the normal in each medium, respectively. The question is: How are θ Prof. Raghuveer Parthasarathy University of Oregon Physis 35 Winter 8 3 R EFRACTION When light travels from one medium to another, it may hange diretion. This phenomenon familiar whenever we see the bent

More information

Drawing lines. Naïve line drawing algorithm. drawpixel(x, round(y)); double dy = y1 - y0; double dx = x1 - x0; double m = dy / dx; double y = y0;

Drawing lines. Naïve line drawing algorithm. drawpixel(x, round(y)); double dy = y1 - y0; double dx = x1 - x0; double m = dy / dx; double y = y0; Naïve line drawing algorithm // Connet to grid points(x0,y0) and // (x1,y1) by a line. void drawline(int x0, int y0, int x1, int y1) { int x; double dy = y1 - y0; double dx = x1 - x0; double m = dy / dx;

More information

Outline: Software Design

Outline: Software Design Outline: Software Design. Goals History of software design ideas Design priniples Design methods Life belt or leg iron? (Budgen) Copyright Nany Leveson, Sept. 1999 A Little History... At first, struggling

More information

Query Optimization for Structured Documents Based on Knowledge. on the Document Type Denition. Institute of Inf. Systems. ETH Zentrum Dolivostr.

Query Optimization for Structured Documents Based on Knowledge. on the Document Type Denition. Institute of Inf. Systems. ETH Zentrum Dolivostr. Query Optimization for Strutured Douments Based on Knowledge on the Doument Type Denition Klemens Bohm Karl Aberer Institute of Inf. Systems GMD{IPSI ETH Zentrum Dolivostr. 15 8092 Zurih, Switzerland 64293

More information

A Novel Bit Level Time Series Representation with Implication of Similarity Search and Clustering

A Novel Bit Level Time Series Representation with Implication of Similarity Search and Clustering A Novel Bit Level Time Series Representation with Impliation of Similarity Searh and lustering hotirat Ratanamahatana, Eamonn Keogh, Anthony J. Bagnall 2, and Stefano Lonardi Dept. of omputer Siene & Engineering,

More information

mahines. HBSP enhanes the appliability of the BSP model by inorporating parameters that reet the relative speeds of the heterogeneous omputing omponen

mahines. HBSP enhanes the appliability of the BSP model by inorporating parameters that reet the relative speeds of the heterogeneous omputing omponen The Heterogeneous Bulk Synhronous Parallel Model Tiani L. Williams and Rebea J. Parsons Shool of Computer Siene University of Central Florida Orlando, FL 32816-2362 fwilliams,rebeag@s.uf.edu Abstrat. Trends

More information

What are Cycle-Stealing Systems Good For? A Detailed Performance Model Case Study

What are Cycle-Stealing Systems Good For? A Detailed Performance Model Case Study What are Cyle-Stealing Systems Good For? A Detailed Performane Model Case Study Wayne Kelly and Jiro Sumitomo Queensland University of Tehnology, Australia {w.kelly, j.sumitomo}@qut.edu.au Abstrat The

More information

Parallelizing Frequent Web Access Pattern Mining with Partial Enumeration for High Speedup

Parallelizing Frequent Web Access Pattern Mining with Partial Enumeration for High Speedup Parallelizing Frequent Web Aess Pattern Mining with Partial Enumeration for High Peiyi Tang Markus P. Turkia Department of Computer Siene Department of Computer Siene University of Arkansas at Little Rok

More information

An Optimized Approach on Applying Genetic Algorithm to Adaptive Cluster Validity Index

An Optimized Approach on Applying Genetic Algorithm to Adaptive Cluster Validity Index IJCSES International Journal of Computer Sienes and Engineering Systems, ol., No.4, Otober 2007 CSES International 2007 ISSN 0973-4406 253 An Optimized Approah on Applying Geneti Algorithm to Adaptive

More information

Video Data and Sonar Data: Real World Data Fusion Example

Video Data and Sonar Data: Real World Data Fusion Example 14th International Conferene on Information Fusion Chiago, Illinois, USA, July 5-8, 2011 Video Data and Sonar Data: Real World Data Fusion Example David W. Krout Applied Physis Lab dkrout@apl.washington.edu

More information

C 2 C 3 C 1 M S. f e. e f (3,0) (0,1) (2,0) (-1,1) (1,0) (-1,0) (1,-1) (0,-1) (-2,0) (-3,0) (0,-2)

C 2 C 3 C 1 M S. f e. e f (3,0) (0,1) (2,0) (-1,1) (1,0) (-1,0) (1,-1) (0,-1) (-2,0) (-3,0) (0,-2) SPECIAL ISSUE OF IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION: MULTI-ROBOT SSTEMS, 00 Distributed reonfiguration of hexagonal metamorphi robots Jennifer E. Walter, Jennifer L. Welh, and Nany M. Amato Abstrat

More information

HEXA: Compact Data Structures for Faster Packet Processing

HEXA: Compact Data Structures for Faster Packet Processing Washington University in St. Louis Washington University Open Sholarship All Computer Siene and Engineering Researh Computer Siene and Engineering Report Number: 27-26 27 HEXA: Compat Data Strutures for

More information

Colouring contact graphs of squares and rectilinear polygons de Berg, M.T.; Markovic, A.; Woeginger, G.

Colouring contact graphs of squares and rectilinear polygons de Berg, M.T.; Markovic, A.; Woeginger, G. Colouring ontat graphs of squares and retilinear polygons de Berg, M.T.; Markovi, A.; Woeginger, G. Published in: nd European Workshop on Computational Geometry (EuroCG 06), 0 Marh - April, Lugano, Switzerland

More information

COMBINATION OF INTERSECTION- AND SWEPT-BASED METHODS FOR SINGLE-MATERIAL REMAP

COMBINATION OF INTERSECTION- AND SWEPT-BASED METHODS FOR SINGLE-MATERIAL REMAP Combination of intersetion- and swept-based methods for single-material remap 11th World Congress on Computational Mehanis WCCM XI) 5th European Conferene on Computational Mehanis ECCM V) 6th European

More information

Chemical, Biological and Radiological Hazard Assessment: A New Model of a Plume in a Complex Urban Environment

Chemical, Biological and Radiological Hazard Assessment: A New Model of a Plume in a Complex Urban Environment hemial, Biologial and Radiologial Haard Assessment: A New Model of a Plume in a omplex Urban Environment Skvortsov, A.T., P.D. Dawson, M.D. Roberts and R.M. Gailis HPP Division, Defene Siene and Tehnology

More information

Probabilistic Classification of Image Regions using an Observation-Constrained Generative Approach

Probabilistic Classification of Image Regions using an Observation-Constrained Generative Approach 9 Probabilisti Classifiation of mage Regions using an Observation-Constrained Generative Approah Sanjiv Kumar, Alexander C. Loui 2, and Martial Hebert The Robotis nstitute, Carnegie Mellon University,

More information

Fast Rigid Motion Segmentation via Incrementally-Complex Local Models

Fast Rigid Motion Segmentation via Incrementally-Complex Local Models Fast Rigid Motion Segmentation via Inrementally-Complex Loal Models Fernando Flores-Mangas Allan D. Jepson Department of Computer Siene, University of Toronto {mangas,jepson}@s.toronto.edu Abstrat The

More information

CS269I: Incentives in Computer Science Lecture #19: Time-Inconsistent Planning

CS269I: Incentives in Computer Science Lecture #19: Time-Inconsistent Planning CS269I: Inentives in Computer Siene Leture #9: Time-Inonsistent Planning Tim Roughgarden Deember 5, 26 Utility Theory and Behavioral Eonomis In almost all of the models that we ve studied in this ourse,

More information

Parametric Abstract Domains for Shape Analysis

Parametric Abstract Domains for Shape Analysis Parametri Abstrat Domains for Shape Analysis Xavier RIVAL (INRIA & Éole Normale Supérieure) Joint work with Bor-Yuh Evan CHANG (University of Maryland U University of Colorado) and George NECULA (University

More information

Multiple-Criteria Decision Analysis: A Novel Rank Aggregation Method

Multiple-Criteria Decision Analysis: A Novel Rank Aggregation Method 3537 Multiple-Criteria Deision Analysis: A Novel Rank Aggregation Method Derya Yiltas-Kaplan Department of Computer Engineering, Istanbul University, 34320, Avilar, Istanbul, Turkey Email: dyiltas@ istanbul.edu.tr

More information

Calculation of typical running time of a branch-and-bound algorithm for the vertex-cover problem

Calculation of typical running time of a branch-and-bound algorithm for the vertex-cover problem Calulation of typial running time of a branh-and-bound algorithm for the vertex-over problem Joni Pajarinen, Joni.Pajarinen@iki.fi Otober 21, 2007 1 Introdution The vertex-over problem is one of a olletion

More information

Detection and Recognition of Non-Occluded Objects using Signature Map

Detection and Recognition of Non-Occluded Objects using Signature Map 6th WSEAS International Conferene on CIRCUITS, SYSTEMS, ELECTRONICS,CONTROL & SIGNAL PROCESSING, Cairo, Egypt, De 9-31, 007 65 Detetion and Reognition of Non-Oluded Objets using Signature Map Sangbum Park,

More information

A {k, n}-secret Sharing Scheme for Color Images

A {k, n}-secret Sharing Scheme for Color Images A {k, n}-seret Sharing Sheme for Color Images Rastislav Luka, Konstantinos N. Plataniotis, and Anastasios N. Venetsanopoulos The Edward S. Rogers Sr. Dept. of Eletrial and Computer Engineering, University

More information

An Interactive-Voting Based Map Matching Algorithm

An Interactive-Voting Based Map Matching Algorithm Eleventh International Conferene on Mobile Data Management An Interative-Voting Based Map Mathing Algorithm Jing Yuan* University of Siene and Tehnology of China Hefei, China yuanjing@mail.ust.edu.n Yu

More information

CleanUp: Improving Quadrilateral Finite Element Meshes

CleanUp: Improving Quadrilateral Finite Element Meshes CleanUp: Improving Quadrilateral Finite Element Meshes Paul Kinney MD-10 ECC P.O. Box 203 Ford Motor Company Dearborn, MI. 8121 (313) 28-1228 pkinney@ford.om Abstrat: Unless an all quadrilateral (quad)

More information

KERNEL SPARSE REPRESENTATION WITH LOCAL PATTERNS FOR FACE RECOGNITION

KERNEL SPARSE REPRESENTATION WITH LOCAL PATTERNS FOR FACE RECOGNITION KERNEL SPARSE REPRESENTATION WITH LOCAL PATTERNS FOR FACE RECOGNITION Cuiui Kang 1, Shengai Liao, Shiming Xiang 1, Chunhong Pan 1 1 National Laboratory of Pattern Reognition, Institute of Automation, Chinese

More information

Query Evaluation Overview. Query Optimization: Chap. 15. Evaluation Example. Cost Estimation. Query Blocks. Query Blocks

Query Evaluation Overview. Query Optimization: Chap. 15. Evaluation Example. Cost Estimation. Query Blocks. Query Blocks Query Evaluation Overview Query Optimization: Chap. 15 CS634 Leture 12 SQL query first translated to relational algebra (RA) Atually, some additional operators needed for SQL Tree of RA operators, with

More information

A Fast Sub-pixel Motion Estimation Algorithm for. H.264/AVC Video Coding

A Fast Sub-pixel Motion Estimation Algorithm for. H.264/AVC Video Coding A Fast Sub-pixel Motion Estimation Algorithm or H.64/AVC Video Coding Weiyao Lin Krit Panusopone David M. Baylon Ming-Ting Sun 3 Zhenzhong Chen 4 and Hongxiang Li 5 Institute o Image Communiation and Inormation

More information

Rotation Invariant Spherical Harmonic Representation of 3D Shape Descriptors

Rotation Invariant Spherical Harmonic Representation of 3D Shape Descriptors Eurographis Symposium on Geometry Proessing (003) L. Kobbelt, P. Shröder, H. Hoppe (Editors) Rotation Invariant Spherial Harmoni Representation of 3D Shape Desriptors Mihael Kazhdan, Thomas Funkhouser,

More information

Wide-baseline Multiple-view Correspondences

Wide-baseline Multiple-view Correspondences Wide-baseline Multiple-view Correspondenes Vittorio Ferrari, Tinne Tuytelaars, Lu Van Gool, Computer Vision Group (BIWI), ETH Zuerih, Switzerland ESAT-PSI, University of Leuven, Belgium {ferrari,vangool}@vision.ee.ethz.h,

More information

Implementing Load-Balanced Switches With Fat-Tree Networks

Implementing Load-Balanced Switches With Fat-Tree Networks Implementing Load-Balaned Swithes With Fat-Tree Networks Hung-Shih Chueh, Ching-Min Lien, Cheng-Shang Chang, Jay Cheng, and Duan-Shin Lee Department of Eletrial Engineering & Institute of Communiations

More information

Directed Rectangle-Visibility Graphs have. Abstract. Visibility representations of graphs map vertices to sets in Euclidean space and

Directed Rectangle-Visibility Graphs have. Abstract. Visibility representations of graphs map vertices to sets in Euclidean space and Direted Retangle-Visibility Graphs have Unbounded Dimension Kathleen Romanik DIMACS Center for Disrete Mathematis and Theoretial Computer Siene Rutgers, The State University of New Jersey P.O. Box 1179,

More information

Department of Electrical and Computer Engineering University of Wisconsin Madison. Fall

Department of Electrical and Computer Engineering University of Wisconsin Madison. Fall Department of Eletrial and Computer Engineering University of Wisonsin Madison ECE 553: Testing and Testable Design of Digital Systems Fall 2014-2015 Assignment #2 Date Tuesday, September 25, 2014 Due

More information

DETECTION METHOD FOR NETWORK PENETRATING BEHAVIOR BASED ON COMMUNICATION FINGERPRINT

DETECTION METHOD FOR NETWORK PENETRATING BEHAVIOR BASED ON COMMUNICATION FINGERPRINT DETECTION METHOD FOR NETWORK PENETRATING BEHAVIOR BASED ON COMMUNICATION FINGERPRINT 1 ZHANGGUO TANG, 2 HUANZHOU LI, 3 MINGQUAN ZHONG, 4 JIAN ZHANG 1 Institute of Computer Network and Communiation Tehnology,

More information

Adapting K-Medians to Generate Normalized Cluster Centers

Adapting K-Medians to Generate Normalized Cluster Centers Adapting -Medians to Generate Normalized Cluster Centers Benamin J. Anderson, Deborah S. Gross, David R. Musiant Anna M. Ritz, Thomas G. Smith, Leah E. Steinberg Carleton College andersbe@gmail.om, {dgross,

More information

arxiv: v1 [cs.db] 13 Sep 2017

arxiv: v1 [cs.db] 13 Sep 2017 An effiient lustering algorithm from the measure of loal Gaussian distribution Yuan-Yen Tai (Dated: May 27, 2018) In this paper, I will introdue a fast and novel lustering algorithm based on Gaussian distribution

More information

System-Level Parallelism and Throughput Optimization in Designing Reconfigurable Computing Applications

System-Level Parallelism and Throughput Optimization in Designing Reconfigurable Computing Applications System-Level Parallelism and hroughput Optimization in Designing Reonfigurable Computing Appliations Esam El-Araby 1, Mohamed aher 1, Kris Gaj 2, arek El-Ghazawi 1, David Caliga 3, and Nikitas Alexandridis

More information

Unsupervised Stereoscopic Video Object Segmentation Based on Active Contours and Retrainable Neural Networks

Unsupervised Stereoscopic Video Object Segmentation Based on Active Contours and Retrainable Neural Networks Unsupervised Stereosopi Video Objet Segmentation Based on Ative Contours and Retrainable Neural Networks KLIMIS NTALIANIS, ANASTASIOS DOULAMIS, and NIKOLAOS DOULAMIS National Tehnial University of Athens

More information

Manipulation of Graphs, Algebras and Pictures. Essays Dedicated to Hans-Jörg Kreowski on the Occasion of His 60th Birthday

Manipulation of Graphs, Algebras and Pictures. Essays Dedicated to Hans-Jörg Kreowski on the Occasion of His 60th Birthday Eletroni Communiations of the EASST Volume 26 (2010) Manipulation of Graphs, Algebras and Pitures Essays Dediated to Hans-Jörg Kreowski on the Oasion of His 60th Birthday Autonomous Units for Solving the

More information

A scheme for racquet sports video analysis with the combination of audio-visual information

A scheme for racquet sports video analysis with the combination of audio-visual information A sheme for raquet sports video analysis with the ombination of audio-visual information Liyuan Xing a*, Qixiang Ye b, Weigang Zhang, Qingming Huang a and Hua Yu a a Graduate Shool of the Chinese Aadamy

More information

ASL Recognition Based on a Coupling Between HMMs and 3D Motion Analysis

ASL Recognition Based on a Coupling Between HMMs and 3D Motion Analysis An earlier version of this paper appeared in the proeedings of the International Conferene on Computer Vision, pp. 33 3, Mumbai, India, January 4 7, 18 ASL Reognition Based on a Coupling Between HMMs and

More information

This fact makes it difficult to evaluate the cost function to be minimized

This fact makes it difficult to evaluate the cost function to be minimized RSOURC LLOCTION N SSINMNT In the resoure alloation step the amount of resoures required to exeute the different types of proesses is determined. We will refer to the time interval during whih a proess

More information

Analysis of input and output configurations for use in four-valued CCD programmable logic arrays

Analysis of input and output configurations for use in four-valued CCD programmable logic arrays nalysis of input and output onfigurations for use in four-valued D programmable logi arrays J.T. utler H.G. Kerkhoff ndexing terms: Logi, iruit theory and design, harge-oupled devies bstrat: s in binary,

More information

Improved Vehicle Classification in Long Traffic Video by Cooperating Tracker and Classifier Modules

Improved Vehicle Classification in Long Traffic Video by Cooperating Tracker and Classifier Modules Improved Vehile Classifiation in Long Traffi Video by Cooperating Traker and Classifier Modules Brendan Morris and Mohan Trivedi University of California, San Diego San Diego, CA 92093 {b1morris, trivedi}@usd.edu

More information

A Fast Kernel-based Multilevel Algorithm for Graph Clustering

A Fast Kernel-based Multilevel Algorithm for Graph Clustering A Fast Kernel-based Multilevel Algorithm for Graph Clustering Inderjit Dhillon Dept. of Computer Sienes University of Texas at Austin Austin, TX 78712 inderjit@s.utexas.edu Yuqiang Guan Dept. of Computer

More information

3-D IMAGE MODELS AND COMPRESSION - SYNTHETIC HYBRID OR NATURAL FIT?

3-D IMAGE MODELS AND COMPRESSION - SYNTHETIC HYBRID OR NATURAL FIT? 3-D IMAGE MODELS AND COMPRESSION - SYNTHETIC HYBRID OR NATURAL FIT? Bernd Girod, Peter Eisert, Marus Magnor, Ekehard Steinbah, Thomas Wiegand Te {girod eommuniations Laboratory, University of Erlangen-Nuremberg

More information

The Implementation of RRTs for a Remote-Controlled Mobile Robot

The Implementation of RRTs for a Remote-Controlled Mobile Robot ICCAS5 June -5, KINEX, Gyeonggi-Do, Korea he Implementation of RRs for a Remote-Controlled Mobile Robot Chi-Won Roh*, Woo-Sub Lee **, Sung-Chul Kang *** and Kwang-Won Lee **** * Intelligent Robotis Researh

More information

An Efficient and Scalable Approach to CNN Queries in a Road Network

An Efficient and Scalable Approach to CNN Queries in a Road Network An Effiient and Salable Approah to CNN Queries in a Road Network Hyung-Ju Cho Chin-Wan Chung Dept. of Eletrial Engineering & Computer Siene Korea Advaned Institute of Siene and Tehnology 373- Kusong-dong,

More information

Simulation of Crystallographic Texture and Anisotropie of Polycrystals during Metal Forming with Respect to Scaling Aspects

Simulation of Crystallographic Texture and Anisotropie of Polycrystals during Metal Forming with Respect to Scaling Aspects Raabe, Roters, Wang Simulation of Crystallographi Texture and Anisotropie of Polyrystals during Metal Forming with Respet to Saling Aspets D. Raabe, F. Roters, Y. Wang Max-Plank-Institut für Eisenforshung,

More information

Dubins Path Planning of Multiple UAVs for Tracking Contaminant Cloud

Dubins Path Planning of Multiple UAVs for Tracking Contaminant Cloud Proeedings of the 17th World Congress The International Federation of Automati Control Dubins Path Planning of Multiple UAVs for Traking Contaminant Cloud S. Subhan, B.A. White, A. Tsourdos M. Shanmugavel,

More information

Improved Circuit-to-CNF Transformation for SAT-based ATPG

Improved Circuit-to-CNF Transformation for SAT-based ATPG Improved Ciruit-to-CNF Transformation for SAT-based ATPG Daniel Tille 1 René Krenz-Bååth 2 Juergen Shloeffel 2 Rolf Drehsler 1 1 Institute of Computer Siene, University of Bremen, 28359 Bremen, Germany

More information

Partial Character Decoding for Improved Regular Expression Matching in FPGAs

Partial Character Decoding for Improved Regular Expression Matching in FPGAs Partial Charater Deoding for Improved Regular Expression Mathing in FPGAs Peter Sutton Shool of Information Tehnology and Eletrial Engineering The University of Queensland Brisbane, Queensland, 4072, Australia

More information

Adaptive Implicit Surface Polygonization using Marching Triangles

Adaptive Implicit Surface Polygonization using Marching Triangles Volume 20 (2001), Number 2 pp. 67 80 Adaptive Impliit Surfae Polygonization using Marhing Triangles Samir Akkouhe Eri Galin L.I.G.I.M L.I.G.I.M Eole Centrale de Lyon Université Claude Bernard Lyon 1 B.P.

More information

Alleviating DFT cost using testability driven HLS

Alleviating DFT cost using testability driven HLS Alleviating DFT ost using testability driven HLS M.L.Flottes, R.Pires, B.Rouzeyre Laboratoire d Informatique, de Robotique et de Miroéletronique de Montpellier, U.M. CNRS 5506 6 rue Ada, 34392 Montpellier

More information

Recommendation Subgraphs for Web Discovery

Recommendation Subgraphs for Web Discovery Reommation Subgraphs for Web Disovery Arda Antikaioglu Department of Mathematis Carnegie Mellon University aantika@andrew.mu.edu R. Ravi Tepper Shool of Business Carnegie Mellon University ravi@mu.edu

More information

Dynamic Algorithms Multiple Choice Test

Dynamic Algorithms Multiple Choice Test 3226 Dynami Algorithms Multiple Choie Test Sample test: only 8 questions 32 minutes (Real test has 30 questions 120 minutes) Årskort Name Eah of the following 8 questions has 4 possible answers of whih

More information

2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media,

2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any urrent or future media, inluding reprinting/republishing this material for advertising

More information

Triangles. Learning Objectives. Pre-Activity

Triangles. Learning Objectives. Pre-Activity Setion 3.2 Pre-tivity Preparation Triangles Geena needs to make sure that the dek she is building is perfetly square to the brae holding the dek in plae. How an she use geometry to ensure that the boards

More information

Z Combinatorial Filters: Sensor Beams, Obstacles, and Possible Paths

Z Combinatorial Filters: Sensor Beams, Obstacles, and Possible Paths Z Combinatorial Filters: Sensor Beams, Obstales, and Possible Paths BENJAMIN TOVAR, Northwestern University FRED COHEN, University of Rohester LEONARDO BOBADILLA, University of Illinois JUSTIN CZARNOWSKI,

More information

Relevance for Computer Vision

Relevance for Computer Vision The Geometry of ROC Spae: Understanding Mahine Learning Metris through ROC Isometris, by Peter A. Flah International Conferene on Mahine Learning (ICML-23) http://www.s.bris.a.uk/publiations/papers/74.pdf

More information

Real-time Container Transport Planning with Decision Trees based on Offline Obtained Optimal Solutions

Real-time Container Transport Planning with Decision Trees based on Offline Obtained Optimal Solutions Real-time Container Transport Planning with Deision Trees based on Offline Obtained Optimal Solutions Bart van Riessen vanriessen@ese.eur.nl Eonometri Institute Erasmus Shool of Eonomis Erasmus University

More information

Stochastic Constraint Programming

Stochastic Constraint Programming Stohasti Constraint Programming Toby Walsh Abstrat. To model ombinatorial deision problems involving unertainty and probability, we introdue stohasti onstraint programming. Stohasti onstraint programs

More information

Boosted Random Forest

Boosted Random Forest Boosted Random Forest Yohei Mishina, Masamitsu suhiya and Hironobu Fujiyoshi Department of Computer Siene, Chubu University, 1200 Matsumoto-ho, Kasugai, Aihi, Japan {mishi, mtdoll}@vision.s.hubu.a.jp,

More information

Volume 3, Issue 9, September 2013 International Journal of Advanced Research in Computer Science and Software Engineering

Volume 3, Issue 9, September 2013 International Journal of Advanced Research in Computer Science and Software Engineering Volume 3, Issue 9, September 2013 ISSN: 2277 128X International Journal of Advaned Researh in Computer Siene and Software Engineering Researh Paper Available online at: www.ijarsse.om A New-Fangled Algorithm

More information