CHAPTER 8
INFERENCE

Now that the CTBN framework has been detailed, this chapter turns to inference. It begins with the most common kinds of queries and the difficulties of exact inference. It then discusses how to optimize inference so as to exploit the computational advantages of the independencies encoded in the factored CTBN representation, and describes an approximate version of expectation propagation, a cluster-graph-based message-passing algorithm.
8.1 CTBN QUERIES

Recall from the earlier chapters that a CTBN is a compact representation of the joint intensity matrix of a Markov process, and this joint intensity matrix can, in principle, be used to answer queries directly. For example, if we possess the full joint distribution at t = 5, we can compute the distribution over drowsiness at t = 5, conditioned, say, on the drug having been absorbed, as well as the distribution over joint pain at t = 5. It is also easy to compute the joint distribution at any point in time given a series of point observations: we compute the joint distribution at the time of the first observation, condition on that observation, and carry the result forward to the next observation, and so on. As an example, assume that our patient took the drug at t = 0, ate after an hour (t = 1), and felt drowsy three hours after eating (t = 4). The distribution over joint pain six hours after taking the drug (t = 6) can be computed by computing the joint distribution at time 1, conditioning that distribution on the observation of eating, and using the result as an initial distribution with which to compute the joint distribution 3 hours later. After conditioning on the observation of drowsiness, the result is again used as an initial distribution with which to compute the joint distribution 2 hours after that. Marginalizing that joint distribution gives the distribution over joint pain given the sequence of evidence. Even though the observations are irregularly spaced, we need only one propagation per observation time. Note also that we can compute the joint distribution between any two points in time, so conditioning on evidence at the later point in time yields propagation of evidence in
reverse chronological order. We can also calculate the distribution over the time of entry into a subsystem of states [Nodelman, U., & Horvitz, E. (2003)]. For example, we can compute the distribution over the time at which the pain fades, taking as the initial distribution the state in which the person has joint pain after taking the drug.

8.2 DIFFICULTIES WITH EXACT INFERENCE

In principle, then, queries can always be answered from the complete joint intensity matrix, but its size is exponential in the number of variables. As with Bayesian networks, we would therefore like to perform inference in a decomposed way that follows the CTBN graphical structure. Unfortunately, similar to the entanglement problem we saw in DBNs, where variables become correlated over a few time slices, the variables of a CTBN become correlated as the system evolves over time. Consider the simple chain X → Y → Z. At any instant, the intensity for Z depends only on Y; given the full distribution over the trajectory of Y, the distribution over trajectories of Z can be recovered while ignoring X. But a complete distribution over continuous-time trajectories is a complex object, so we might hope to summarize Y's behavior compactly, for instance by its stationary distribution. This hope is groundless, however. Consider the following intensity matrices:
Q_Y  = [ -1   1 ]        Q'_Y = [ -10   10 ]
       [  2  -2 ]               [  20  -20 ]

Q_Z|y1 = [ -3    3 ]     Q_Z|y2 = [ -5   5 ]
         [ 15  -15 ]              [  4  -4 ]

The CTBN with graph Y → Z using Q_Y and the one using Q'_Y give Y the same stationary distribution, yet they yield different results for Z. To see why, recall that the distribution at the next instant is obtained by multiplying by the appropriate matrix for each infinitesimal moment of time: at every instant, the value of Y determines which of Q_Z|y1 or Q_Z|y2 is applied. Y's stationary distribution tells us only how often each matrix is used. But these matrices do not commute, so switching between Y's values frequently produces a different result than switching less frequently: the order of the multiplications matters, not just the total number of times each matrix was used. Thus there is no fixed-size summary of Y's distribution of this form that retains enough information. To overcome this drawback, we use expectation propagation, which provides approximate message passing over a cluster graph.

8.3 OVERVIEW OF THE ALGORITHM

package sensorenergy;

public class Main {

    public static void main(String[] args) {
        // application code goes here
    }
}

8.4 CENTRAL OPERATIONS

import java.io.*;
import java.awt.*;
import java.awt.image.*;
import java.math.*;
import javax.imageio.ImageIO;
import javax.swing.*;
import javax.swing.plaf.FileChooserUI;

public class BaseImageSensorData extends javax.swing.JFrame {

    /* here we create a new frame called BaseImageSensorData */
    public BaseImageSensorData() {
        initComponents();
        this.setDefaultCloseOperation(JFrame.HIDE_ON_CLOSE);
    }

    private Image img;
    private BufferedImage bi;

    @SuppressWarnings("unchecked")
    // <editor-fold defaultstate="collapsed" desc="Generated Code">//GEN-BEGIN:initComponents

8.4.1 INCORPORATING EVIDENCE INTO CIMS

    private void initComponents() {
        jLabel1 = new javax.swing.JLabel();
        jButton1 = new javax.swing.JButton();
        jButton2 = new javax.swing.JButton();
        jLabel2 = new javax.swing.JLabel();
        jLabel3 = new javax.swing.JLabel();
        jButton3 = new javax.swing.JButton();

        setDefaultCloseOperation(javax.swing.WindowConstants.EXIT_ON_CLOSE);

        jLabel1.setText("Image Sensed by Sensor");
        jLabel1.setBorder(new javax.swing.border.MatteBorder(null));

        jButton1.setText("Select An Image Sensed by Sensor");
        jButton1.addActionListener(new java.awt.event.ActionListener() {
            public void actionPerformed(java.awt.event.ActionEvent evt) {
                jButton1ActionPerformed(evt);
            }
        });

        jButton2.setText("Recover");
        jButton2.addActionListener(new java.awt.event.ActionListener() {
            public void actionPerformed(java.awt.event.ActionEvent evt) {
                jButton2ActionPerformed(evt);
            }
        });

        jLabel2.setText("Recovered");
        jLabel2.setBorder(new javax.swing.border.MatteBorder(null));

        jLabel3.setText("jLabel3");
        jLabel3.setBorder(javax.swing.BorderFactory.createLineBorder(new java.awt.Color(0, 0, 0)));

        jButton3.setText("Draw");
        jButton3.addActionListener(new java.awt.event.ActionListener() {
            public void actionPerformed(java.awt.event.ActionEvent evt) {
                jButton3ActionPerformed(evt);
            }
        });

        javax.swing.GroupLayout layout = new javax.swing.GroupLayout(getContentPane());
        getContentPane().setLayout(layout);
        layout.setHorizontalGroup(
            layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
            .addGroup(layout.createSequentialGroup()
                .addGap(68, 68, 68)
                .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.TRAILING, false)
                    .addComponent(jLabel3, javax.swing.GroupLayout.Alignment.LEADING, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, Short.MAX_VALUE)
                    .addComponent(jLabel2, javax.swing.GroupLayout.Alignment.LEADING, javax.swing.GroupLayout.DEFAULT_SIZE, 301, Short.MAX_VALUE))
                .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.UNRELATED)
                .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
                    .addGroup(layout.createSequentialGroup()
                        .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING, false)
                            .addComponent(jButton2, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, Short.MAX_VALUE)
                            .addComponent(jButton1, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, Short.MAX_VALUE))
                        .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED)
                        .addComponent(jLabel1, javax.swing.GroupLayout.PREFERRED_SIZE, 301, javax.swing.GroupLayout.PREFERRED_SIZE))
                    .addComponent(jButton3))
                .addContainerGap())
        );
        layout.setVerticalGroup(
            layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
            .addGroup(layout.createSequentialGroup()
                .addGap(23, 23, 23)
                .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
                    .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
                        .addComponent(jLabel1, javax.swing.GroupLayout.PREFERRED_SIZE, 267, javax.swing.GroupLayout.PREFERRED_SIZE)
                        .addComponent(jLabel2, javax.swing.GroupLayout.PREFERRED_SIZE, 267, javax.swing.GroupLayout.PREFERRED_SIZE))
                    .addGroup(layout.createSequentialGroup()
                        .addComponent(jButton1)
                        .addGap(18, 18, 18)
                        .addComponent(jButton2)))
                .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.UNRELATED)
                .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
                    .addComponent(jLabel3, javax.swing.GroupLayout.PREFERRED_SIZE, 277, javax.swing.GroupLayout.PREFERRED_SIZE)
                    .addComponent(jButton3))
                .addContainerGap(32, Short.MAX_VALUE))
        );

        pack();
    }// </editor-fold>//GEN-END:initComponents

8.4.2 MARGINALIZING CIMS

    public BufferedImage[] bis;
    public int nrows = 10;
    public int ncols = 10;

    private void jButton1ActionPerformed(java.awt.event.ActionEvent evt) {//GEN-FIRST:event_jButton1ActionPerformed
        try {
            JFileChooser fc = new JFileChooser();
            fc.showOpenDialog(this);
            String fname = fc.getSelectedFile().getAbsolutePath();
            //JOptionPane.showMessageDialog(fname, "Selected File");
            jLabel1.setIcon(new ImageIcon(fname));
            img = ((ImageIcon) jLabel1.getIcon()).getImage();
            img = img.getScaledInstance(100, 100, Image.SCALE_DEFAULT);
            jLabel1.setIcon(new ImageIcon(img));
            img = img.getScaledInstance(jLabel1.getWidth(), jLabel1.getHeight(), Image.SCALE_DEFAULT);
            jLabel1.setIcon(new ImageIcon(img));
            File f = new File(fname);
            bi = imageToBufferedImage(img);
            JPanel panel = new JPanel();
            ImageSplitFrame isf = new ImageSplitFrame();
            try {
                bis = isf.split(f, nrows, ncols);
                for (int i = 0; i < nrows * ncols; i++) {
                    Image im = bis[i].getScaledInstance(bis[i].getWidth(), bis[i].getHeight(), Image.SCALE_DEFAULT);
                    panel.add(new JLabel(new ImageIcon(im)));
                }
                JFrame frame = new JFrame("Display multiple images from files.");
                frame.getContentPane().add(panel);
                frame.pack();
                frame.setVisible(true);
                frame.setDefaultCloseOperation(JFrame.HIDE_ON_CLOSE);
            } catch (Exception ex) {
            }
        } catch (Exception ex)
        {
        }
    }//GEN-LAST:event_jButton1ActionPerformed

    private void jButton2ActionPerformed(java.awt.event.ActionEvent evt) {//GEN-FIRST:event_jButton2ActionPerformed
        ImageSplitFrame isf = new ImageSplitFrame();
        BufferedImage b = isf.merge(bis, nrows, ncols);
        Image im = b.getScaledInstance(jLabel2.getWidth(), jLabel2.getHeight(), Image.SCALE_DEFAULT);
        jLabel2.setIcon(new ImageIcon(im));
    }//GEN-LAST:event_jButton2ActionPerformed

    private void jButton3ActionPerformed(java.awt.event.ActionEvent evt) {//GEN-FIRST:event_jButton3ActionPerformed
        BufferedImage bmp = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D gr = bmp.createGraphics();
        //gr.drawImage(image, 0, 0, chunkWidth, chunkHeight, chunkWidth * y, chunkHeight * x, chunkWidth * y + chunkWidth, chunkHeight * x + chunkHeight, null);
        gr.drawLine(0, 0, 50, 50);
        Image im = bmp.getScaledInstance(bmp.getWidth(), bmp.getHeight(), Image.SCALE_DEFAULT);
        jLabel3.setIcon(new ImageIcon(im));
        gr.dispose();
    }//GEN-LAST:event_jButton3ActionPerformed

    private BufferedImage imageToBufferedImage(Image im) {
        BufferedImage bi1 = new BufferedImage(im.getWidth(null), im.getHeight(null), BufferedImage.TYPE_INT_RGB);
        Graphics bg = bi1.getGraphics();
        bg.drawImage(im, 0, 0, null);
        bg.dispose();
        return bi1;
    }

    public static void main(String args[]) {
        java.awt.EventQueue.invokeLater(new Runnable() {
            public void run() {
                new BaseImageSensorData().setVisible(true);
            }
        });
    }

    private javax.swing.JButton jButton1;
    private javax.swing.JButton jButton2;
    private javax.swing.JButton jButton3;
    private javax.swing.JLabel jLabel1;
    private javax.swing.JLabel jLabel2;
    private javax.swing.JLabel jLabel3;
}
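The two CIM operations named in the headings above, incorporating point evidence and marginalizing, can also be sketched numerically. The sketch below (illustrative only, not the book's code) restricts a joint intensity matrix to the rows and columns consistent with evidence B = b1; the state ordering used here is an assumption made for the example.

```python
# Sketch: restricting a joint intensity matrix to the states consistent
# with observed evidence B = b1.  The 6x6 matrix follows the worked
# example for the graph A -> B; the state ordering is assumed.

Q_AB = [
    [-11,   2,   6,  0,  3,  0],
    [  1,  -8,   0,  5,  0,  2],
    [  5,   0, -10,  2,  3,  0],
    [  0,   4,   1, -7,  0,  2],
    [  4,   0,   3,  0, -9,  2],
    [  0,   3,   0,  2,  1, -6],
]

def restrict(Q, keep):
    """Keep only the rows and columns whose indices are in `keep`."""
    return [[Q[i][j] for j in keep] for i in keep]

# Under the ordering assumed here, B = b1 corresponds to indices 4 and 5.
Q_A_b1 = restrict(Q_AB, [4, 5])
print(Q_A_b1)                       # [[-9, 2], [1, -6]]

# The rows of the restricted matrix no longer sum to zero: the missing
# mass is the intensity of transitions that leave the evidence set
# (i.e., transitions that change the value of B).
print([sum(row) for row in Q_A_b1])  # [-7, -5]
```

The strictly negative row sums are exactly the effect discussed in the example that follows: conditioning on B = b1 removes the intensities that carry the system out of the b1-consistent states.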
Example 8.4.1: The joint intensity matrix for the graph A → B described in Chapter 3 is

Q_AB = [ -11    2    6    0    3    0 ]
       [   1   -8    0    5    0    2 ]
       [   5    0  -10    2    3    0 ]
       [   0    4    1   -7    0    2 ]
       [   4    0    3    0   -9    2 ]
       [   0    3    0    2    1   -6 ]

To derive the sub-matrix corresponding to the instantiations consistent with B = b1, we keep only the rows and columns for states in which B = b1:

Q_A,b1 = [ -9   2 ]
         [  1  -6 ]

Because the intensities for transitions out of B = b1 have been removed, the rows of Q_A,b1 sum to negative values rather than zero.

8.5 EXPECTATION PROPAGATION

With these basic operations on CIMs in hand, we can define a message-propagation algorithm for CTBNs. Because the projection step used in passing messages is only approximate, we apply expectation propagation, iterating the approximation to improve the estimates.

8.5.1 EP FOR SEGMENTS

We begin by formulating the expectation propagation message-passing algorithm for a single segment of the trajectory with constant, continuous evidence; the generalization to multiple segments follows. First, a cluster tree for the graph G is constructed. This procedure is exactly the same as for Bayesian networks; cycles do not
introduce new issues. We simply moralize the graph, connecting all parents of a node with undirected edges, and then make all the remaining edges undirected; a cycle simply turns into a loop in the resulting undirected graph. We then select a set of clusters Ci. These clusters can be selected so as to produce a clique tree for the graph, using any standard method for constructing such trees, or we can construct a loopy cluster graph and use generalized belief propagation; the message-passing scheme is the same in both cases. Let Ai ⊆ Ci be the set of variables whose factors we associate with cluster Ci, let Ni be the set of neighboring clusters of Ci, and let Si,j be the set of variables in Ci ∩ Cj. For each cluster Ci, we also compute the initial distribution P0(Ci) using standard BN inference on the network B [Nodelman, U., & Horvitz, E. (2003)].

Example 8.5.1:
Consider the chain A → B → C → D, where every variable is binary, with CIMs

Q_A = [ -1   1 ]
      [  1  -1 ]

Q_B|a1 = [  -1    1 ]     Q_B|a2 = [ -10   10 ]
         [  10  -10 ]              [   1   -1 ]

where Q_C|B and Q_D|C have the same parameterization as Q_B|A. Thus A switches randomly between states a1 and a2, and each child tries to match the behavior of its parent. Suppose we have a uniform initial distribution over all variables except D, which starts in state d1 and remains in that state for unit time (T = 1). Our cluster tree is AB — BC — CD, and our initial potentials are computed from these CIMs.
After convergence, if we use π1 to compute the distribution over A at time 1, we get [.703 .297]. If we perform exact inference by amalgamating all the factors and exponentiating, we get [.738 .262]. In principle this algorithm can be generalized to do smoothing over trajectories; however, a number of complications arise. For example, message passing can lead to a potential with a negative intensity in an off-diagonal entry, and when that happens it is not clear how to proceed [Nodelman, U., & Horvitz, E. (2003)].

8.6 DEMONSTRATION OF THE USE OF BAYESIAN NETWORK INFERENCE

Here we assume that all the nodes in the Bayesian network are boolean, taking the value 0 or 1. Assume the network contains 4 nodes, with edges B → A, C → A, and A → D. The probability of each node is given below for reference:

p(B=1) = 0.01
p(C=1) = 0.001
p(A=1 | B=0, C=0) = 0.01
p(A=1 | B=0, C=1) = 0.5
p(A=1 | B=1, C=0) = 0.9
p(A=1 | B=1, C=1) = 0.99
p(D=1 | A=0) = 0.2
p(D=1 | A=1) = 0.5

#include <iostream>
#include <dlib/graph.h>
#include <dlib/directed_graph.h>
#include <dlib/bayes_utils.h>
#include <dlib/graph_utils.h>

#define show cout   // output alias used throughout this listing

using namespace dlib;
using namespace std;

int main()
{
    try
    {
        using namespace bayes_node_utils;
        directed_graph<bayes_node>::kernel_1a_c bnw;
        enum nodes { A = 0, B = 1, C = 2, D = 3 };

        bnw.set_number_of_nodes(4);
        bnw.add_edge(A, D);
        bnw.add_edge(B, A);
        bnw.add_edge(C, A);
        set_node_num_values(bnw, A, 2);
        set_node_num_values(bnw, B, 2);
        set_node_num_values(bnw, C, 2);
        set_node_num_values(bnw, D, 2);

        assignment parent_state;
        set_node_probability(bnw, B, 1, parent_state, 0.01);
        set_node_probability(bnw, B, 0, parent_state, 1 - 0.01);
        set_node_probability(bnw, C, 1, parent_state, 0.001);
        set_node_probability(bnw, C, 0, parent_state, 1 - 0.001);

        parent_state.add(B, 1);
        parent_state.add(C, 1);
        set_node_probability(bnw, A, 1, parent_state, 0.99);
        set_node_probability(bnw, A, 0, parent_state, 1 - 0.99);

        parent_state[B] = 1;
        parent_state[C] = 0;
        set_node_probability(bnw, A, 1, parent_state, 0.9);
        set_node_probability(bnw, A, 0, parent_state, 1 - 0.9);

        parent_state[B] = 0;
        parent_state[C] = 1;
        set_node_probability(bnw, A, 1, parent_state, 0.5);
        set_node_probability(bnw, A, 0, parent_state, 1 - 0.5);

        parent_state[B] = 0;
        parent_state[C] = 0;
        set_node_probability(bnw, A, 1, parent_state, 0.01);
        set_node_probability(bnw, A, 0, parent_state, 1 - 0.01);

        parent_state.clear();
        parent_state.add(A, 1);
        set_node_probability(bnw, D, 1, parent_state, 0.5);
        set_node_probability(bnw, D, 0, parent_state, 1 - 0.5);
        parent_state[A] = 0;
        set_node_probability(bnw, D, 1, parent_state, 0.2);
        set_node_probability(bnw, D, 0, parent_state, 1 - 0.2);
        typedef dlib::set<unsigned long>::compare_1b_c set_type;
        typedef graph<set_type, set_type>::kernel_1a_c join_tree_type;
        join_tree_type join_tree;

        create_moral_graph(bnw, join_tree);
        create_join_tree(join_tree, join_tree);

        bayesian_network_join_tree solution(bnw, join_tree);

        show << "Using the join-tree algorithm:\n";
        show << "p(A=1) = " << solution.probability(A)(1) << "\n";
        show << "p(A=0) = " << solution.probability(A)(0) << "\n";
        show << "p(B=1) = " << solution.probability(B)(1) << "\n";
        show << "p(B=0) = " << solution.probability(B)(0) << "\n";
        show << "p(C=1) = " << solution.probability(C)(1) << "\n";
        show << "p(C=0) = " << solution.probability(C)(0) << "\n";
        show << "p(D=1) = " << solution.probability(D)(1) << "\n";
        show << "p(D=0) = " << solution.probability(D)(0) << "\n";
        show << "\n\n\n";

        set_node_value(bnw, C, 1);
        set_node_as_evidence(bnw, C);

        bayesian_network_join_tree solution_with_evidence(bnw, join_tree);

        show << "Using the join-tree algorithm:\n";
        show << "p(A=1 | C=1) = " << solution_with_evidence.probability(A)(1) << "\n";
        show << "p(A=0 | C=1) = " << solution_with_evidence.probability(A)(0) << "\n";
        show << "p(B=1 | C=1) = " << solution_with_evidence.probability(B)(1) << "\n";
        show << "p(B=0 | C=1) = " << solution_with_evidence.probability(B)(0) << "\n";
        show << "p(C=1 | C=1) = " << solution_with_evidence.probability(C)(1) << "\n";
        show << "p(C=0 | C=1) = " << solution_with_evidence.probability(C)(0) << "\n";
        show << "p(D=1 | C=1) = " << solution_with_evidence.probability(D)(1) << "\n";
        show << "p(D=0 | C=1) = " << solution_with_evidence.probability(D)(0) << "\n";
        show << "\n\n\n";

        set_node_value(bnw, A, 0);
        set_node_value(bnw, B, 0);
        set_node_value(bnw, D, 0);

        bayesian_network_gibbs_sampler sampler;
        unsigned long Acount = 0;
        unsigned long Bcount = 0;
        unsigned long Ccount = 0;
        unsigned long Dcount = 0;
        const long rounds = 2000;
        for (long i = 0; i < rounds; ++i)
        {
            sampler.sample_graph(bnw);
            if (node_value(bnw, A) == 1) Acount = Acount + 1;
            if (node_value(bnw, B) == 1) Bcount = Bcount + 1;
            if (node_value(bnw, C) == 1) Ccount = Ccount + 1;
            if (node_value(bnw, D) == 1) Dcount = Dcount + 1;
        }

        show << "Using the approximate Gibbs sampler algorithm:\n";
        show << "p(A=1 | C=1) = " << (double)Acount / (double)rounds << "\n";
        show << "p(B=1 | C=1) = " << (double)Bcount / (double)rounds << "\n";
        show << "p(C=1 | C=1) = " << (double)Ccount / (double)rounds << "\n";
        show << "p(D=1 | C=1) = " << (double)Dcount / (double)rounds << "\n";
    }
    catch (std::exception& ew)
    {
        show << "exception thrown:\n";
        show << ew.what() << "\n";
        show << "hit enter to terminate\n";
        cin.get();
    }
    return 0;
}

8.7 DISCUSSION
In this chapter, inference for CTBNs has been discussed at length. Exact inference requires summarizing distributions over entire trajectories, which is infeasible in general. We have also discussed how to optimize inference so as to exploit the computational advantages of the independencies encoded in the factored CTBN representation, and described an approximate version of expectation propagation, a cluster-graph-based message-passing algorithm. Finally, we demonstrated the use of inference in Bayesian networks with suitable code.
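The forward-propagation scheme of Section 8.1, computing the distribution at a later time via the matrix exponential, conditioning on an observation, and continuing forward, can be sketched end to end for a single two-state process. This is an illustrative sketch only: the intensity matrix, times, and observation below are assumed for the example, not taken from the text, and the matrix exponential is computed by a simple truncated Taylor series (adequate for small matrices and short intervals).

```python
# Sketch of exact forward propagation for a two-state Markov process:
# P(t) = P(0) * exp(Q t); condition on an observation; continue forward.
# Q, the times, and the observation are hypothetical example values.

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_exp(Q, t, terms=60):
    """exp(Q t) via truncated Taylor series: sum_k (Qt)^k / k!."""
    n = len(Q)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_mul(term, [[q * t / k for q in row] for row in Q])
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def propagate(p, Q, t):
    """Push a distribution p forward t time units under intensity matrix Q."""
    P = mat_exp(Q, t)
    return [sum(p[i] * P[i][j] for i in range(len(p))) for j in range(len(p))]

Q = [[-1.0, 1.0], [2.0, -2.0]]   # assumed intensity matrix (rows sum to 0)
p0 = [0.5, 0.5]

p1 = propagate(p0, Q, 1.0)       # joint distribution at t = 1
# Observe state 1 at t = 1: zero out inconsistent states and renormalize.
obs = [p1[0], 0.0]
obs = [x / sum(obs) for x in obs]
# Continue forward two more time units from the conditioned distribution.
p3 = propagate(obs, Q, 2.0)
print(p3)                         # ~ [0.6675, 0.3325]
```

Each observation costs one propagation, matching the point made in Section 8.1: irregularly spaced observations require only one conditioning-and-propagation step per observation time.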