Particle-based Variational Inference for Continuous Systems

Alexander T. Ihler, Dept. of Computer Science, Univ. of California, Irvine
Andrew J. Frank, Dept. of Computer Science, Univ. of California, Irvine
Padhraic Smyth, Dept. of Computer Science, Univ. of California, Irvine

Abstract

Since the development of loopy belief propagation, there has been considerable work on advancing the state of the art for approximate inference over distributions defined on discrete random variables. Improvements include guarantees of convergence, approximations that are provably more accurate, and bounds on the results of exact inference. However, extending these methods to continuous-valued systems has lagged behind. While several methods have been developed to use belief propagation on systems with continuous values, recent advances for discrete variables have not as yet been incorporated. In this context we extend a recently proposed particle-based belief propagation algorithm to provide a general framework for adapting discrete message-passing algorithms to inference in continuous systems. The resulting algorithms behave similarly to their purely discrete counterparts, extending the benefits of these more advanced inference techniques to the continuous domain.

1 Introduction

Graphical models have proven themselves to be an effective tool for representing the underlying structure of probability distributions and organizing the computations required for exact and approximate inference. Early examples of the use of graph structure for inference include join or junction trees [1] for exact inference, Markov chain Monte Carlo (MCMC) methods [2], and variational methods such as mean field and structured mean field approaches [3]. Belief propagation (BP), originally proposed by Pearl [1], has gained in popularity as a method of approximate inference, and in the last decade has led to a number of more sophisticated algorithms based on conjugate dual formulations and free energy approximations [4, 5, 6].
However, the progress on approximate inference in systems with continuous random variables has not kept pace with that for discrete random variables. Some methods, such as MCMC techniques, are directly applicable to continuous domains, while others such as belief propagation have approximate continuous formulations [7, 8]. Sample-based representations, such as are used in particle filtering, are particularly appealing as they are relatively easy to implement, have few numerical issues, and have no inherent distributional assumptions. Our aim is to extend particle methods to take advantage of recent advances in approximate inference algorithms for discrete-valued systems.

Several recent algorithms provide significant advantages over loopy belief propagation. Double-loop algorithms such as CCCP [9] and UPS [10] use the same approximations as BP but guarantee convergence. More general approximations can be used to provide theoretical bounds on the results of exact inference [5, 3] or are guaranteed to improve the quality of approximation [6], allowing an informed trade-off between computation and accuracy. Like belief propagation, they can be formulated as local message-passing algorithms on the graph, making them amenable to parallel computation [11] or inference in distributed systems [12, 13].

In short, the algorithmic characteristics of these recently-developed algorithms are often better, or at least more flexible, than those of BP. However, these methods have not been applied to continuous random variables, and in fact this subject was one of the open questions posed at a recent NIPS workshop [14].

In order to develop particle-based approximations for these algorithms, we focus on one particular technique for concreteness: tree-reweighted belief propagation (TRW) [5]. TRW represents one of the earliest of a recent class of inference algorithms for discrete systems, but as we discuss in Section 2.2 the extensions of TRW can be incorporated into the same framework if desired.

The basic idea of our algorithm is simple and extends previous particle formulations of exact inference [15] and loopy belief propagation [16]. We use collections of samples drawn from the continuous state space of each variable to define a discrete problem, lifting the inference task from the original space to a restricted, discrete domain on which TRW can be performed. At any point, the current results of the discrete inference can be used to re-select the sample points from a variable's continuous domain. This iterative interaction between the sample locations and the discrete messages produces a dynamic discretization that adapts itself to the inference results.

We demonstrate that TRW and similar methods can be naturally incorporated into the lifted, discrete phase of particle belief propagation and that they confer similar benefits on the continuous problem as hold in truly discrete systems. To this end we measure the performance of the algorithm on an Ising grid, an analogous continuous model, and the sensor localization problem. In each case, we show that tree-reweighted particle BP exhibits behavior similar to TRW and produces significantly more robust marginal estimates than ordinary particle BP.

2 Graphical Models and Inference

Graphical models provide a convenient formalism for describing structure within a probability distribution $p(x)$ defined over a set of variables $X = \{x_1, \ldots, x_n\}$.
This structure can then be applied to organize computations over $p(x)$ and construct efficient algorithms for many inference tasks, including optimization to find a maximum a posteriori (MAP) configuration, marginalization, or computing the likelihood of observed data.

2.1 Factor Graphs

Factor graphs [17] are a particular type of graphical model that describes the factorization structure of the distribution $p(x)$ using a bipartite graph consisting of factor nodes and variable nodes. Specifically, suppose such a graph $G$ consists of factor nodes $F = \{f_1, \ldots, f_m\}$ and variable nodes $X = \{x_1, \ldots, x_n\}$. Let $X_u \subseteq X$ denote the neighbors of factor node $f_u$ and $F_s \subseteq F$ denote the neighbors of variable node $x_s$. Then, $G$ is consistent with a distribution $p(x)$ if and only if

$$p(x_1, \ldots, x_n) = \frac{1}{Z} \prod_{u=1}^{m} f_u(X_u). \tag{1}$$

In a common abuse of notation, we use the same symbols to represent each variable node and its associated variable $x_s$, and similarly for each factor node and its associated function $f_u$. Each factor $f_u$ corresponds to a strictly positive function over a subset of the variables. The graph connectivity captures the conditional independence structure of $p(x)$, enabling the development of efficient exact and approximate inference algorithms [1, 17, 18]. The quantity $Z$, called the partition function, is also of importance in many problems; for example, in normalized distributions such as Bayes nets, it corresponds to the probability of evidence and can be used for model comparison.

A common inference problem is that of computing the marginal distributions of $p(x)$. Specifically, for each variable $x_s$ we are interested in computing the marginal distribution

$$p_s(x_s) = \int_{X \setminus x_s} p(x) \, dX.$$

For discrete-valued variables $X$, the integral is replaced by a summation.

When the variables are discrete and the graph $G$ representing $p(x)$ forms a tree ($G$ has no cycles), marginalization can be performed efficiently using the belief propagation or sum-product algorithm [1, 17]. For inference in more general graphs, the junction tree algorithm [19] creates a

tree-structured hypergraph of $G$ and then performs inference on this hypergraph. The computational complexity of this process is $O(n d^b)$, where $d$ is the number of possible values for each variable and $b$ is the maximal clique size of the hypergraph. Unfortunately, for even moderate values of $d$, this complexity becomes prohibitive for even relatively small $b$.

2.2 Approximate Inference

Loopy BP [1] is a popular alternative to exact methods and proceeds by iteratively passing messages between variable and factor nodes in the graph as though the graph were a tree (ignoring cycles). The algorithm is exact when the graph is tree-structured and can provide excellent approximations in some cases even when the graph has loops. However, in other cases loopy BP may perform poorly, have multiple fixed points, or fail to converge at all.

Many of the more recent varieties of approximate inference are framed explicitly as an optimization of local approximations over locally defined cost functions. Variational or free-energy based approaches convert the problem of exact inference into the optimization of a free energy function over the set of realizable marginal distributions $\mathcal{M}$, called the marginal polytope [18]. Approximate inference then corresponds to approximating the constraint set and/or energy function. Formally,

$$\max_{\mu \in \mathcal{M}} \; \mathbb{E}_\mu[\log P(X)] + H(\mu) \;\approx\; \max_{\mu \in \hat{\mathcal{M}}} \; \mathbb{E}_\mu[\log P(X)] + \hat{H}(\mu)$$

where $H$ is the entropy of the distribution corresponding to $\mu$. Since the solution $\mu$ may not correspond to the marginals of any consistent joint distribution, these approximate marginals are typically referred to as pseudomarginals. If both the constraints in $\hat{\mathcal{M}}$ and the approximate entropy $\hat{H}$ decompose locally on the graph, the optimization process can be interpreted as a message-passing procedure, and is often performed using fixed-point equations like those of BP. Belief propagation can be understood in this framework as corresponding to an outer approximation $\hat{\mathcal{M}} \supseteq \mathcal{M}$ enforcing local consistency and the Bethe approximation to $H$ [4].
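For reference, the two ingredients of the BP approximation can be written out explicitly. The following is a sketch in the factor-graph notation of Section 2.1; the symbols $\mu_s$ and $\mu_u$ (pseudomarginals over a single variable $x_s$ and over the scope $X_u$ of factor $f_u$) are introduced here for illustration and are not defined in the text above.

```latex
% Local-consistency (outer) relaxation of the marginal polytope
\hat{\mathcal{M}} = \Big\{ \mu \ge 0 \;:\;
    \sum_{x_s} \mu_s(x_s) = 1, \quad
    \sum_{X_u \setminus x_s} \mu_u(X_u) = \mu_s(x_s)
    \;\; \text{for all } u \text{ and all } x_s \in X_u \Big\}

% Bethe entropy approximation on a factor graph
\hat{H}_{\mathrm{Bethe}}(\mu) = \sum_{u=1}^{m} H(\mu_u)
    \;-\; \sum_{s=1}^{n} \big( |F_s| - 1 \big) \, H(\mu_s)
```

When $G$ is a tree, the local-consistency set coincides with the marginal polytope and the Bethe expression equals the true entropy, which is consistent with BP being exact on trees.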
This viewpoint provides a clear path to directly improve upon the properties of BP, leading to a number of different algorithms. For example, CCCP [9] and UPS [10] make the same approximations but use an alternative, direct optimization procedure to ensure convergence. Fractional belief propagation [20] corresponds to a more general Bethe-like approximation with additional parameters, which can be modified to ensure that the cost function is convex and used with convergent algorithms [21]. A special case includes tree-reweighted belief propagation [5], which both ensures convexity and provides an upper bound on the partition function $Z$. The approximation of $\mathcal{M}$ can also be improved using cutting plane methods, which include additional, higher-order consistency constraints on the pseudomarginals [6]. Other choices of local cost functions lead to alternative families of approximations [8].

Overall, these advances have provided significant improvements in the state of the art for approximate inference in discrete-valued systems. They provide increased flexibility, theoretical bounds on the results of exact inference, and can provably increase the quality of the estimates. However, these advances have not been carried over into the continuous domain.

For concreteness, in the rest of the paper we will use tree-reweighted belief propagation (TRW) [5] as our inference method of choice, although the same ideas can be applied to any of the discussed inference algorithms. As we will see shortly, the details specific to TRW are nicely encapsulated and can be swapped out for those of another algorithm with minimal effort.

The fixed-point equations for TRW lead to a message-passing algorithm similar to BP, defined by

$$m_{x_s \to f_u}(x_s) \propto \frac{\prod_{f_v \in F_s} m_{f_v \to x_s}(x_s)^{\rho_v}}{m_{f_u \to x_s}(x_s)}, \qquad m_{f_u \to x_s}(x_s) \propto \int_{X_u \setminus x_s} f_u(X_u)^{1/\rho_u} \prod_{x_t \in X_u \setminus x_s} m_{x_t \to f_u}(x_t). \tag{2}$$

The parameters $\rho_v$ are called edge weights or appearance probabilities. For TRW, the $\rho$ are required to correspond to the fractional occurrence rates of the edges in some collection of tree-structured subgraphs of $G$. The choice of $\rho$ affects the quality of the approximation; the tightest upper bound can be obtained via a convex optimization of $\rho$ which computes the pseudomarginals as an inner loop.
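To make the update (2) concrete for discrete variables, the following is a minimal sketch in Python. The function name `reweighted_bp`, the data layout (each factor as a `(scope, table)` pair), the parallel update schedule, and the fixed iteration count are all assumptions made for illustration; they are not taken from the paper. Setting every weight to 1 gives ordinary loopy BP.

```python
import itertools
import math


def reweighted_bp(factors, rho, n_states=2, n_iters=50):
    """Parallel message passing following eq. (2) on a discrete factor graph.

    factors: list of (scope, table); scope is a tuple of variable ids and
             table maps each joint assignment (a tuple) to a positive value.
    rho:     one weight per factor (hypothetical layout; rho = 1 is loopy BP).
    Returns a dict mapping each variable id to its normalized belief.
    """
    variables = sorted({s for scope, _ in factors for s in scope})
    nbrs = {s: [u for u, (scope, _) in enumerate(factors) if s in scope]
            for s in variables}
    # Factor-to-variable messages, initialized to uniform.
    m_fx = {(u, s): [1.0 / n_states] * n_states
            for u, (scope, _) in enumerate(factors) for s in scope}

    def normalize(vals):
        z = sum(vals)
        return [v / z for v in vals]

    for _ in range(n_iters):
        # Variable-to-factor messages: reweighted product divided by the
        # message coming from the destination factor.
        m_xf = {}
        for (u, s) in m_fx:
            m_xf[(u, s)] = normalize([
                math.prod(m_fx[(v, s)][k] ** rho[v] for v in nbrs[s])
                / m_fx[(u, s)][k]
                for k in range(n_states)])
        # Factor-to-variable messages: sum over the other variables of
        # f_u^(1/rho_u) times the incoming variable-to-factor messages.
        new_m_fx = {}
        for u, (scope, table) in enumerate(factors):
            for pos, s in enumerate(scope):
                msg = [0.0] * n_states
                for assign in itertools.product(range(n_states),
                                                repeat=len(scope)):
                    term = table[assign] ** (1.0 / rho[u])
                    for q, t in enumerate(scope):
                        if q != pos:
                            term *= m_xf[(u, t)][assign[q]]
                    msg[assign[pos]] += term
                new_m_fx[(u, s)] = normalize(msg)
        m_fx = new_m_fx

    return {s: normalize([math.prod(m_fx[(v, s)][k] ** rho[v] for v in nbrs[s])
                          for k in range(n_states)])
            for s in variables}


# Usage on a three-variable chain (a tree), where rho = 1 recovers exact marginals.
pair = {(a, b): (0.9 if a == b else 0.1) for a in (0, 1) for b in (0, 1)}
chain = [((0,), {(0,): 0.7, (1,): 0.3}), ((0, 1), pair), ((1, 2), pair)]
chain_beliefs = reweighted_bp(chain, rho=[1.0, 1.0, 1.0])
```

On a graph with cycles, valid TRW weights must come from a distribution over spanning trees; for example, on a single cycle of three pairwise factors each edge appears in two of the three spanning trees, so each pairwise weight would be 2/3.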

3 Continuous Random Variables

For continuous-valued random variables, many of these algorithms cannot be applied directly. In particular, any reasonably fine-grained discretization produces a discrete variable whose domain size $d$ is quite large. The domain size is typically exponential in the dimension of the variable and the complexity of the message-passing algorithms is $O(n d^b)$, where $n$ is the total number of variables and $b$ is the number of variables in the largest factor. Thus, the computational cost can quickly become intractable even with pairwise factors over low dimensional variables. Our goal is to adapt the algorithms of Section 2.2 to perform efficient approximate inference in such systems.

For time-series problems, in which $G$ forms a chain, a classical solution is to use sequential Monte Carlo approximations, generally referred to as particle filtering [22]. These methods use samples to define an adaptive discretization of the problem with fine granularity in regions of high probability. The stochastic nature of the discretization is simple to implement and enables probabilistic assurances of quality including convergence rates which are independent of the problem's dimensionality. (In sufficiently few dimensions, deterministic adaptive discretizations can also provide a competitive alternative, particularly if the factors are analytically tractable [23, 24].)

3.1 Particle Representations for Message-Passing

Particle-based approximations have been extended to loopy belief propagation as well. For example, in the nonparametric belief propagation (NBP) algorithm [7], the BP messages are represented as Gaussian mixtures and message products are approximated by drawing samples, which are then smoothed to form new Gaussian mixture distributions. A key aspect of this approach is the fact that the product of several mixtures of Gaussians is also a mixture of Gaussians, and thus can be sampled from with relative ease.
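The closure property mentioned above can be checked numerically in one dimension. The sketch below uses the standard identity for the product of two Gaussian densities; the function names and the representation of a mixture as a list of `(weight, mean, variance)` triples are illustrative assumptions, not part of NBP as published.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gaussian_product(mean1, var1, mean2, var2):
    """N(x; mean1, var1) * N(x; mean2, var2) = scale * N(x; mean, var)."""
    var = var1 * var2 / (var1 + var2)
    mean = (mean1 * var2 + mean2 * var1) / (var1 + var2)
    scale = gaussian_pdf(mean1, mean2, var1 + var2)
    return scale, mean, var


def mixture_product(mix_a, mix_b):
    """Product of two mixtures; returns K_a * K_b unnormalized components."""
    out = []
    for wa, ma, va in mix_a:
        for wb, mb, vb in mix_b:
            scale, mean, var = gaussian_product(ma, va, mb, vb)
            out.append((wa * wb * scale, mean, var))
    return out


def mixture_eval(mix, x):
    """Evaluate a (possibly unnormalized) mixture at x."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in mix)


# Usage: a bimodal mixture times a single broad Gaussian.
mix_a = [(0.5, 0.0, 0.04), (0.5, 1.0, 0.04)]
mix_b = [(1.0, 0.4, 0.25)]
mix_ab = mixture_product(mix_a, mix_b)
```

Note that the number of components multiplies with each additional factor in the product, which is why sampling-based approximations of the product are used in practice.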
However, it is difficult to see how to extend this algorithm to more general message-passing algorithms, since for example the TRW fixed point equations (2) involve ratios and powers of messages, which do not have a simple form for Gaussian mixtures and may not even form finitely integrable functions.

Instead, we adapt a recent particle belief propagation (PBP) algorithm [16] to work on the tree-reweighted formulation. In PBP, samples (particles) are drawn for each variable, and each message is represented as a set of weights over the available values of the target variable. At a high level, the procedure iterates between sampling particles from each variable's domain, performing inference over the resulting discrete problem, and adaptively updating the sampling distributions. This process is illustrated in Figure 1.

Formally, we define a proposal distribution $W_s(x_s)$ for each variable $x_s$ such that $W_s(x_s)$ is non-zero over the domain of $x_s$. Note that we may rewrite the factor message computation (2) as an importance reweighted expectation:

$$m_{f_u \to x_s}(x_s) \propto \mathbb{E}_{X_u \setminus x_s} \left[ f_u(X_u)^{1/\rho_u} \prod_{x_t \in X_u \setminus x_s} \frac{m_{x_t \to f_u}(x_t)}{W_t(x_t)} \right] \tag{3}$$

Let $\vec{u}$ index the variables that are neighbors of factor $f_u$, so that $X_u = \{x_{u_1}, \ldots, x_{u_b}\}$. Then, after sampling particles $\{x_s^{(1)}, \ldots, x_s^{(N)}\}$ from $W_s(x_s)$, we can index a particular assignment of particle values to the variables in $X_u$ with $X_u^{(\vec{j})} = [x_{u_1}^{(j_1)}, \ldots, x_{u_b}^{(j_b)}]$. We then obtain a finite-sample approximation of the factor message in the form

$$m_{f_u \to x_{u_k}}\big(x_{u_k}^{(j)}\big) \propto \frac{1}{N^{b-1}} \sum_{\vec{i} \,:\, i_k = j} f_u\big(X_u^{(\vec{i})}\big)^{1/\rho_u} \prod_{l \neq k} \frac{m_{x_{u_l} \to f_u}\big(x_{u_l}^{(i_l)}\big)}{W_{u_l}\big(x_{u_l}^{(i_l)}\big)} \tag{4}$$

In other words, we construct a Monte Carlo approximation to the integral using importance weighted samples from the proposal. Each of the values in the message then represents an estimate of the continuous function (2) evaluated at a single particle. Observe that the sum is over $N^{b-1}$ elements, and hence the complexity of computing an entire factor message is $O(N^b)$; this could be made more efficient at the price of increased stochasticity by summing over a random subsample of the vectors

$\vec{i}$.

[Figure 1: Schematic view of particle-based inference. (1) Samples for each variable provide a dynamic discretization of the continuous space; (2) inference proceeds by optimization or message-passing in the discrete space; (3) the resulting local functions can be used to change the proposals $W_s(\cdot)$ and choose new sample locations for each variable.]

Likewise, we compute variable messages and beliefs as simple point-wise products:

$$m_{x_s \to f_u}\big(x_s^{(j)}\big) \propto \frac{\prod_{f_v \in F_s} m_{f_v \to x_s}\big(x_s^{(j)}\big)^{\rho_v}}{m_{f_u \to x_s}\big(x_s^{(j)}\big)}, \qquad b_s\big(x_s^{(j)}\big) \propto \prod_{f_v \in F_s} m_{f_v \to x_s}\big(x_s^{(j)}\big)^{\rho_v} \tag{5}$$

This parallels the development in [16], except here we use factor weights $\rho$ to compute messages according to TRW rather than standard loopy BP.

Just as in discrete problems, it is often desirable to obtain estimates of the log partition function for use in goodness-of-fit testing or model comparison. Our implementation of TRW-PBP gives us a stochastic estimate of an upper bound on the true partition function. Using other message passing approaches that fit into this framework, such as mean field, can provide a similar lower bound. These bounds provide a possible alternative to Monte Carlo estimates of marginal likelihood [25].

3.2 Rao-Blackwellized Estimates

Quantities about $x_s$ such as expected values under the pseudomarginal can be computed using the samples $x_s^{(i)}$. However, for any given variable node $x_s$, the incoming messages to $x_s$ given in (4) are defined in terms of the importance weights and sampled values of the neighboring variables. Thus, we can compute an estimate of the messages and beliefs defined in (4)-(5) at arbitrary values of $x_s$, simply by evaluating (4) at that point. This allows us to perform Rao-Blackwellization, conditioning on the samples at the neighbors of $x_s$ rather than using $x_s$'s samples directly. Using this trick we can often get much higher quality estimates from the inference for small $N$.
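The following sketch illustrates the estimator (4) for the pairwise case ($b = 2$) and the belief product (5), evaluated at arbitrary query points so that the same routine serves both the particle update and the Rao-Blackwellized estimate. The function names and argument layout are hypothetical, chosen for illustration only.

```python
import math


def factor_message(f, rho_u, query_points, nbr_particles, nbr_msgs, nbr_proposal):
    """Importance-weighted estimate of m_{f_u -> x_s} as in eq. (4), for b = 2.

    f(x_s, x_t):   positive pairwise factor
    rho_u:         weight of this factor
    query_points:  values of x_s at which to evaluate the message; these may be
                   x_s's own particles or any other points (e.g. a grid)
    nbr_particles: particles x_t^(i) drawn from the neighbor's proposal W_t
    nbr_msgs:      m_{x_t -> f_u}(x_t^(i)) at those particles
    nbr_proposal:  W_t(x_t^(i)) at those particles
    """
    n = len(nbr_particles)
    estimates = []
    for xs in query_points:
        total = sum(f(xs, xt) ** (1.0 / rho_u) * m / w
                    for xt, m, w in zip(nbr_particles, nbr_msgs, nbr_proposal))
        estimates.append(total / n)
    return estimates


def belief(incoming_msgs, rhos):
    """Normalized belief of eq. (5) from message values at a common set of points."""
    n_points = len(incoming_msgs[0])
    vals = []
    for j in range(n_points):
        b = 1.0
        for msg, rho in zip(incoming_msgs, rhos):
            b *= msg[j] ** rho
        vals.append(b)
    z = sum(vals)
    return [v / z for v in vals]


# Usage: belief on a small grid, conditioning on a single neighbor particle at 0.
def gauss_pair(xs, xt):
    return math.exp(-0.5 * (xs - xt) ** 2)

grid = [-1.0, 0.0, 1.0]
msg_on_grid = factor_message(gauss_pair, 1.0, grid,
                             nbr_particles=[0.0], nbr_msgs=[1.0], nbr_proposal=[1.0])
rb_belief = belief([msg_on_grid], [1.0])
```

Because `query_points` is decoupled from the neighbor's particles, the cost of evaluating one message on a grid of `d` points is proportional to `d * N` in this pairwise case.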
In particular, if the variables' state spaces are sufficiently small that they can be discretized (for example, in 3 or fewer dimensions the discretized domain size $d$ may be manageable) but the resulting factor domain size, $d^b$, is intractably large, we can evaluate (4) on the discretized grid for only $O(d N^{b-1})$. More generally, we can substitute a larger number of samples $N' \gg N$ with cost that grows only linearly in $N'$.

3.3 Resampling and Proposal Distributions

Another critical point is that the efficiency of this procedure hinges on the quality of the proposal distributions $W_s$. Unfortunately, this forms a circular problem: $W_s$ must be chosen to perform inference, but the quality of $W_s$ depends on the distribution and its pseudomarginals. This interdependence motivates an attempt to learn the sampling distributions in an online fashion, adaptively updating them based on the results of the partially completed inference procedure. Note that this procedure depends on the same properties as Rao-Blackwellized estimates: that we be able to compute our messages and beliefs at a new set of points given the message weights at the other nodes.

Both [15] and [16] suggest using the current belief at each iteration to form a new proposal distribution. In [15], parametric density estimates are formed using the message-weighted samples at the current iteration, which form the sampling distributions for the next phase. In [16], a short Metropolis-Hastings MCMC sequence is run at a single node, using the Rao-Blackwellized belief estimate to compute an acceptance probability. A third possibility is to use a sampling/importance

resampling (SIR) procedure, drawing a large number of samples, computing weights, and probabilistically retaining only $N$. In our experiments we draw samples from the current beliefs, as approximated by Rao-Blackwellized estimation over a fine grid of particles. For variables in more than 2 dimensions, we recommend the Metropolis-Hastings approach.

[Figure 2: 2-D Ising model performance. $L_1$ error (vertical axis) versus $\eta$ (horizontal axis) for PBP (left) and TRW-PBP (center) for varying numbers of particles; (right) PBP and TRW-PBP juxtaposed to reveal the gap for high $\eta$.]

4 Ising-like Models

The Ising model corresponds to a graphical model, typically a grid, over binary-valued variables with pairwise factors. Originating in statistical physics, similar models are common in many applications including image denoising and stereo depth estimation. Ising models are well understood, and provide a simple example of how BP can fail and the benefits of more general forms such as TRW. We initially demonstrate the behavior of our particle-based algorithms on a small ($3 \times 3$) lattice of binary-valued variables to compare with the exact discrete implementations, then show that the same observed behavior arises in an analogous continuous-valued problem.

4.1 Ising model

Our factors consist of single-variable and pairwise functions, given by

$$f(x_s) = \begin{bmatrix} 0.5 & 0.5 \end{bmatrix}, \qquad f(x_s, x_t) = \begin{bmatrix} \eta & 1 - \eta \\ 1 - \eta & \eta \end{bmatrix} \tag{6}$$

for $\eta > 0.5$. By symmetry, it is easy to see that the true marginal of each variable is uniform, $[0.5 \;\; 0.5]$. However, around $\eta \approx 0.78$ there is a phase transition; the uniform fixed point becomes unstable and several others appear, becoming more skewed toward one state or another as $\eta$ increases. As the strength of coupling in an Ising model increases, the performance of BP often degrades sharply, while TRW is comparatively robust and remains near the true marginals [5].

Figure 2 shows the performance of PBP and TRW-PBP on this model. Each data point represents the median $L_1$ error between the beliefs and the true marginals, across all nodes and 4 randomly initialized trials, after 5 iterations.
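The symmetry argument for uniform true marginals can be confirmed by brute-force enumeration on the $3 \times 3$ lattice, which has only $2^9 = 512$ joint states. This is a small sketch, assuming the pairwise factor takes the value $\eta$ when neighbors agree and $1 - \eta$ when they disagree; the function name `ising_marginals` is hypothetical.

```python
import itertools


def ising_marginals(eta, side=3):
    """Exact marginals P(x_s = 1) and partition function for a side x side grid."""
    nodes = [(r, c) for r in range(side) for c in range(side)]
    index = {node: i for i, node in enumerate(nodes)}
    # Nearest-neighbor edges of the lattice (down and right from each node).
    edges = []
    for r, c in nodes:
        if r + 1 < side:
            edges.append((index[(r, c)], index[(r + 1, c)]))
        if c + 1 < side:
            edges.append((index[(r, c)], index[(r, c + 1)]))

    marg = [0.0] * len(nodes)
    z = 0.0
    for state in itertools.product((0, 1), repeat=len(nodes)):
        weight = 0.5 ** len(nodes)  # product of the uniform local factors
        for s, t in edges:
            weight *= eta if state[s] == state[t] else 1.0 - eta
        z += weight
        for i, v in enumerate(state):
            if v == 1:
                marg[i] += weight
    return [m / z for m in marg], z


# Usage: strong coupling still yields uniform exact marginals.
exact_marg, exact_z = ising_marginals(0.9)
```

Enumeration of this kind is only feasible for very small lattices, but it provides the ground truth against which approximate beliefs can be scored.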
The left plot (BP) clearly shows the phase shift; in contrast, the error of TRW remains low even for very strong interactions. In both cases, as $N$ increases the particle versions of the algorithms converge to their discrete equivalents.

4.2 Continuous grid model

The results for discrete systems, and their corresponding intuition, carry over naturally into continuous systems as well. To illustrate on an interpretable analogue of the Ising model, we use the same graph structure but with real-valued variables, and factors given by:

$$f(x_s) = \exp\left( -\frac{x_s^2}{2\sigma_l^2} \right) + \exp\left( -\frac{(x_s - 1)^2}{2\sigma_l^2} \right), \qquad f(x_s, x_t) = \exp\left( -\frac{|x_s - x_t|^2}{2\sigma_p^2} \right). \tag{7}$$

Local factors consist of bimodal Gaussian mixtures centered at 0 and 1, while pairwise factors encourage similarity using a zero-mean Gaussian on the distance between neighboring variables. We set $\sigma_l = 0.2$ and vary $\sigma_p$ analogously to $\eta$ in the discrete model. Since all potentials are Gaussian mixtures, the joint distribution is also a Gaussian mixture and can be computed exactly.
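As a small illustration of this model, the sketch below evaluates the factors and the unnormalized log density of a joint configuration on the $3 \times 3$ grid. It assumes local modes at 0 and 1 and a local width of 0.2; the function names and the row-major layout of the configuration are hypothetical.

```python
import math

SIGMA_L = 0.2  # assumed local-factor width


def local_factor(x, sigma_l=SIGMA_L):
    """Bimodal local factor with modes at 0 and 1."""
    return (math.exp(-x ** 2 / (2.0 * sigma_l ** 2))
            + math.exp(-(x - 1.0) ** 2 / (2.0 * sigma_l ** 2)))


def pair_factor(xs, xt, sigma_p):
    """Pairwise factor encouraging neighboring variables to be similar."""
    return math.exp(-(xs - xt) ** 2 / (2.0 * sigma_p ** 2))


def log_unnormalized_density(x, sigma_p, side=3):
    """Log of the product of all factors; x is a row-major list of side*side values."""
    total = sum(math.log(local_factor(v)) for v in x)
    for r in range(side):
        for c in range(side):
            i = r * side + c
            if r + 1 < side:
                total += math.log(pair_factor(x[i], x[i + side], sigma_p))
            if c + 1 < side:
                total += math.log(pair_factor(x[i], x[i + 1], sigma_p))
    return total


# Usage: the all-0 and all-1 configurations are equally likely (reflection
# symmetry x -> 1 - x), while a checkerboard configuration is penalized.
all_zero = log_unnormalized_density([0.0] * 9, sigma_p=0.5)
all_one = log_unnormalized_density([1.0] * 9, sigma_p=0.5)
checker = log_unnormalized_density([0.0, 1.0] * 4 + [0.0], sigma_p=0.5)
```

The reflection symmetry plays the same role here as the state-flip symmetry in the discrete Ising model: the exact marginals are symmetric bimodal densities.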

[Figure 3: Continuous grid model performance. $L_1$ error (vertical axis) versus a log-transformed $\sigma_p$ axis for PBP (left) and TRW-PBP (center) for varying numbers of particles; (right) PBP and TRW-PBP juxtaposed to reveal the gap for low $\sigma_p$.]

Figure 3 shows the results of running PBP and TRW-PBP on the continuous grid model, demonstrating similar characteristics to the discrete model. The left panel reveals that our continuous grid model also induces a phase shift in PBP, much like that of the Ising model. For sufficiently small values of $\sigma_p$ (large values on our transformed axis), the beliefs in PBP collapse to unimodal distributions with a large $L_1$ error. In contrast, TRW-PBP avoids this collapse and maintains multi-modal distributions throughout; its primary source of error (.2 at 5 particles) corresponds to overdispersed bimodal beliefs. This is expected in attractive models, in which BP tends to overcount information, leading to underestimates of variance; TRW removes some of this overcounting and may overestimate uncertainty.

As mentioned in Section 3.1, we can use the results of TRW-PBP to compute an upper bound on the log partition function. We implement naive mean field within this same framework to achieve a lower bound as well. The resulting bounds, computed for a continuous grid model in which mean field collapses to a single mode, are shown in Figure 4. With sufficiently many particles, the values produced by TRW-PBP and MF inference bound the true value, as they should. With only 2 particles per variable, however, TRW-PBP occasionally fails and yields upper bounds below the true value. This is not surprising; the consistency guarantees associated with the importance-reweighted expectation take effect only when $N$ is sufficiently large.

[Figure 4: Bounds on the log partition function.]

5 Sensor Localization

We also demonstrate the presence of these effects in a simulation of a real-world application.
Sensor localization considers the task of estimating the position of a collection of sensors in a network given noisy estimates of a subset of the distances between pairs of sensors, along with known positions for a small number of anchor nodes. Typical localization algorithms operate by optimizing to find the most likely joint configuration of sensor positions. A classical model consists of (at a minimum) three anchor nodes, and a Gaussian model on the noise in the distance observations.

In [12], this problem is formulated as a graphical model and an alternative solution is proposed using nonparametric belief propagation to perform approximate marginalization. A significant advantage of this approach is that by providing approximate marginals, we can estimate the degree of uncertainty in the sensor positions. Gauging this uncertainty can be particularly important when the distance information is sufficiently ambiguous that the posterior belief is multi-modal, since in this case the estimated sensor position may be quite far from its true value. Unfortunately, belief propagation is not ideal for identifying multimodality, since the model is essentially attractive. BP may underestimate the degree of uncertainty in the marginal distributions and (as in the case of the Ising-like models in the previous section) collapse into a single mode, providing beliefs which are misleadingly overconfident.

Figure 5 shows a set of sensor configurations where this is the case. The distance observations induce a fully connected graph; the edges are omitted for clarity. In this network the anchor nodes are nearly collinear. This induces a bimodal uncertainty about the location of the remaining nodes:

the configuration in which they are all reflected across the crooked line formed by the anchors is nearly as likely as the true configuration. Although this example is anecdotal, it reflects a situation which can arise regularly in practice [26].

[Figure 5: Sensor location belief at the target node; each panel marks the anchor, mobile, and target nodes. (a) Exact belief computed using importance sampling. (b) PBP collapses and represents only one of the two modes. (c) TRW-PBP overestimates the uncertainty around each mode, but represents both.]

Figure 5a shows the true marginal distribution for one node, estimated exhaustively using importance sampling with $5 \times 10^6$ samples. It shows a clear bimodal structure: a slightly larger mode near the sensor's true location and a smaller mode at a point corresponding to the reflection. In this system there is not enough information in the measurements to resolve the sensor positions. We compare these marginals to the results found using PBP.

Figure 5b displays the Rao-Blackwellized belief estimate for one node after 2 iterations of PBP with each variable represented by particles. Only one mode is present, suggesting that PBP's beliefs have collapsed, just as in the highly attractive Ising model. Examination of the other nodes' beliefs (not shown for space) confirms that all are unimodal distributions centered around their reflected locations. It is worth noting that PBP converged to the alternative set of unimodal beliefs (supporting the true locations) in about half of our trials. Such an outcome is only slightly better; an accurate estimate of confidence is equally important.

The corresponding belief estimate generated by TRW-PBP is shown in Figure 5c. It is clearly bimodal, with significant probability mass supporting both the true and reflected locations. Also, each of the two modes is less concentrated than the belief in 5b. As with the continuous grid model we see increased stability at the price of conservative overdispersion. Again, similar effects occur for the other nodes in the network.
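The reflection ambiguity that gives rise to the second mode can be illustrated with a few lines of geometry. The sketch below is not the experimental setup of the paper; the anchor and sensor coordinates are made up for illustration, and the helper names are hypothetical. When the anchors are exactly collinear, a sensor and its mirror image across the anchor line are at identical distances from every anchor; when the anchors are only nearly collinear, the distances differ slightly, so noisy range measurements may not separate the two hypotheses.

```python
import math


def reflect_across_line(p, a, b):
    """Reflect the 2-D point p across the line through points a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    foot = (a[0] + t * dx, a[1] + t * dy)  # orthogonal projection of p
    return (2.0 * foot[0] - p[0], 2.0 * foot[1] - p[1])


def distances(p, anchors):
    """Euclidean distances from p to each anchor."""
    return [math.dist(p, a) for a in anchors]


# Hypothetical coordinates, for illustration only.
sensor = (0.7, 1.2)
collinear = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
nearly_collinear = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.05)]

mirror = reflect_across_line(sensor, collinear[0], collinear[1])
gap_exact = max(abs(d1 - d2) for d1, d2 in
                zip(distances(sensor, collinear), distances(mirror, collinear)))
gap_near = max(abs(d1 - d2) for d1, d2 in
               zip(distances(sensor, nearly_collinear),
                   distances(mirror, nearly_collinear)))
```

Here `gap_exact` is zero up to rounding, while `gap_near` is small but positive; whether the two modes can be distinguished then depends on how this gap compares with the measurement noise.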
6 Conclusion

We propose a framework for extending recent advances in discrete approximate inference for application to continuous systems. The framework directly integrates reweighted message passing algorithms such as TRW into the lifted, discrete phase of PBP. Furthermore, it allows us to iteratively adjust the proposal distributions, providing a discretization that adapts to the results of inference, and allows us to use Rao-Blackwellized estimates to improve our final belief estimates.

We consider the particular case of TRW and show that its benefits carry over directly to continuous problems. Using an Ising-like system, we argue that phase transitions exist for particle versions of BP similar to those found in discrete systems, and that TRW significantly improves the quality of the estimate in those regimes. This improvement is highly relevant to approximate marginalization for sensor localization tasks, in which it is important to accurately represent the posterior uncertainty.

The flexibility in the choice of message passing algorithm makes it easy to consider several instantiations of the framework and use the one best suited to a particular problem. Furthermore, future improvements in message-passing inference algorithms on discrete systems can be directly incorporated into continuous problems.

Acknowledgements: This material is based upon work partially supported by the Office of Naval Research under MURI grant N

References

[1] J. Pearl. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, San Mateo, 1988.
[2] S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. PAMI, 6(6):721-741, November 1984.
[3] M. Jordan, Z. Ghahramani, T. Jaakkola, and L. Saul. An introduction to variational methods for graphical models. Machine Learning, 37:183-233, 1999.
[4] J. Yedidia, W. Freeman, and Y. Weiss. Constructing free energy approximations and generalized belief propagation algorithms. Technical Report 2004-040, MERL, May 2004.
[5] M. Wainwright, T. Jaakkola, and A. Willsky. A new class of upper bounds on the log partition function. IEEE Trans. Info. Theory, 51(7), July 2005.
[6] D. Sontag and T. Jaakkola. New outer bounds on the marginal polytope. In NIPS 20. MIT Press, Cambridge, MA, 2008.
[7] E. Sudderth, A. Ihler, W. Freeman, and A. Willsky. Nonparametric belief propagation. In CVPR, 2003.
[8] T. Minka. Divergence measures and message passing. Technical Report 2005-173, Microsoft Research Ltd, January 2005.
[9] A. Yuille. CCCP algorithms to minimize the Bethe and Kikuchi free energies: convergent alternatives to belief propagation. Neural Comput., 14(7):1691-1722, 2002.
[10] Y.-W. Teh and M. Welling. The unified propagation and scaling algorithm. In NIPS.
[11] J. Gonzalez, Y. Low, and C. Guestrin. Residual splash for optimally parallelizing belief propagation. In Artificial Intelligence and Statistics (AISTATS), pages 177-184, Clearwater Beach, Florida, April 2009.
[12] A. Ihler, J. Fisher, R. Moses, and A. Willsky. Nonparametric belief propagation for self-calibration in sensor networks. IEEE J. Select. Areas Commun., pages 809-819, April 2005.
[13] J. Schiff, D. Antonelli, A. Dimakis, D. Chu, and M. Wainwright. Robust message-passing for statistical inference in sensor networks. In IPSN, pages 109-118, April 2007.
[14] A. Globerson, D. Sontag, and T. Jaakkola. Approximate inference: How far have we come? (NIPS 08 Workshop), gamir/inference-workshop.html.
[15] D. Koller, U. Lerner, and D. Angelov. A general algorithm for approximate inference and its application to hybrid Bayes nets. In UAI 15, 1999.
[16] A. Ihler and D. McAllester. Particle belief propagation. In AI & Statistics: JMLR W&CP, volume 5, April 2009.
[17] F. Kschischang, B. Frey, and H.-A. Loeliger. Factor graphs and the sum-product algorithm. IEEE Trans. Info. Theory, 47(2):498-519, February 2001.
[18] M. Wainwright and M. Jordan. Graphical models, exponential families, and variational inference. Technical Report 629, UC Berkeley Dept. of Statistics, September 2003.
[19] S. L. Lauritzen and D. J. Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems. J. Royal Stat. Soc. B, 1988.
[20] W. Wiegerinck and T. Heskes. Fractional belief propagation. In NIPS 15.
[21] T. Hazan and A. Shashua. Convergent message-passing algorithms for inference over general graphs with convex free energies. In UAI 24, July 2008.
[22] M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Trans. Signal Processing, 50(2):174-188, February 2002.
[23] J. Coughlan and H. Shen. Dynamic quantization for belief propagation in sparse spaces. Comput. Vis. Image Underst., 106(1):47-58, 2007.
[24] M. Isard, J. MacCormick, and K. Achan. Continuously-adaptive discretization for message-passing algorithms. In NIPS 21.
[25] S. Chib. Marginal likelihood from the Gibbs output. JASA, 90(432):1313-1321, 1995.
[26] D. Moore, J. Leonard, D. Rus, and S. Teller. Robust distributed network localization with noisy range measurements. In 2nd Int'l Conf. on Emb. Networked Sensor Sys. (SenSys 04), pages 50-61, 2004.
