Estimation of Bilateral Connections in a Network: Copula vs. Maximum Entropy
Pallavi Baral and Jose Pedro Fique
Department of Economics, Indiana University at Bloomington
1st Annual CIRANO Workshop on Networks in Trade and Finance, November 9, 2012
Motivation
We begin with examples of some complex networks.
Figure: Fedwire Payment Network
Figure: Brazilian Interbank Network
Each network within this complex network looks like a star.
Figure: Star Interbank Network
However, often we do not have access to data on the interbank linkages. The interbank network data available usually looks like the following:
Figure: Interbank Network Data
So we have the aggregate-level data but not the micro-level data. In terms of a matrix, we have the following representation:
Figure: Matrix Representation of Interbank Linkage Data
where the X_ij denote a connection between bank i and bank j, a_i = Σ_j X_ij and l_i = Σ_i X_ij.
In a more generic setup we have, for any network, the following representation, where the a_ij denote a connection between nodes i and j, a_i. = Σ_j a_ij (row sums), a_.j = Σ_i a_ij (column sums), and S_G = Σ_i a_i. = Σ_j a_.j.
Figure: Matrix Representation of Any Network Data
And we only have access to the marginals of this matrix, i.e., a_i. = Σ_j a_ij and a_.j = Σ_i a_ij.
Figure: Example of Marginals of Network Data
Main Problem
The primary question is then: is there a way to estimate or simulate the joint distribution (the a_ij) from these marginals? And the answer is YES. Upper and Worms (2004) proposed a method based on Maximum Entropy (ME, henceforth).
Maximum Entropy
The standard ME estimation divides the total number or weight of connections equally among all other nodes. It then uses the RAS algorithm proposed by Schneider (1990) to re-balance the matrix. Matrix re-balancing is required to ensure that the individual elements sum to the original overall value of the connections. Mistrulli (2010) provided a description of the shortcomings of ME, which serves as our motivation. In particular, we will show that for complex systems such as networks, there are cases where ME doesn't perform well. Further, we propose an alternative methodology based on copulas.
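A minimal sketch of the ME-plus-RAS procedure described above. The function and parameter names are our own; zeroing the diagonal (no self-connections, as is usual for interbank data) is our assumption, and it is what makes the RAS re-balancing step necessary.

```python
import numpy as np

def max_entropy_estimate(row_sums, col_sums, zero_diagonal=True,
                         tol=1e-10, max_iter=10_000):
    """ME prior plus RAS re-balancing: start from the outer-product
    guess X_ij = a_i. * a_.j / S_G, optionally zero the diagonal
    (assumption: no self-loops), then iteratively rescale rows and
    columns until both sets of marginals are matched."""
    r = np.asarray(row_sums, dtype=float)
    c = np.asarray(col_sums, dtype=float)
    X = np.outer(r, c) / r.sum()        # maximum-entropy initial guess
    if zero_diagonal:
        np.fill_diagonal(X, 0.0)
    for _ in range(max_iter):           # RAS: iterative proportional fitting
        X *= (r / X.sum(axis=1))[:, None]   # match row sums
        X *= (c / X.sum(axis=0))[None, :]   # match column sums
        if np.allclose(X.sum(axis=1), r, atol=tol):
            break
    return X
```

Without the zeroed diagonal the outer-product guess already satisfies both marginals exactly, so the loop exits immediately; with it, RAS converges in a handful of iterations for consistent marginals.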
Copulas
A copula is a cumulative distribution function with uniform marginals. What makes a copula attractive? A copula can express complex dependence structures using the marginals as a reference case and the corresponding dependence parameter. Raschke et al. (2010) use copulas to generate random networks for given dependence parameters.
The main differences between Raschke et al. (2010) and our work are as follows.
1. They generate a random network, whereas we propose a method applicable to directed networks via a sequential search using an estimated dependence parameter rather than an assumed value.
2. They do not address how the structure of the data affects the performance of their methods.
3. We provide various scenarios in terms of different data structures and analyze the performance of copula-based methods relative to the standard ME approach (our benchmark).
Proposed Methodology
We propose two methods based on copulas: direct and indirect. We simulate networks via a Monte Carlo simulation technique and compare their performance to that of ME. We use an error measure that is directly based on the distance between the true and the simulated bilateral connections.
The direct method fits a copula to the data and generates the joint distribution over connections based on the estimated dependence parameter. The indirect method is based on constrained optimization of the error between the matrix to be estimated and an initial guess. Fernandez-Vazquez (2010) uses a similar method but with Cross Entropy. However, the cross-entropy fitness measure has a major disadvantage: in cases where connections are underestimated, it produces a negative number, biasing the overall assessment of the fitness of the model.
Cross Entropy
The main differences between Fernandez-Vazquez (2010) and our indirect methodology are as follows.
1. We construct the initial guess as a hybrid of two copulas.
2. Instead of the Kullback-Leibler measure of divergence, we use our error measure.
3. We use additional constraints that impose restrictions on the structure of the network.
Core-Periphery Model
To impose structural constraints on our simulation process we use the Core-Periphery model setup. Suppose we are able to partition the entire network into blocks: {Block 1, Block 2, Block 3, Block 4}.
Figure: Dividing A Matrix into Blocks
The Core-Periphery model assumes that there are certain key nodes in the network (the core) that have a relatively larger impact on the way connections are formed and sustained. The remaining nodes are more peripheral and have less of an impact on the manner in which dependencies arise within the network. For example, we can term Block 1 the core-core (CC) connections and Block 4 the periphery-periphery (PP) connections. Block 2 then holds the core-periphery (CP) connections and Block 3 the periphery-core (PC) connections.
The sum of the number/value of connections in each block may be treated as an indicator of the relative importance of the block; we call this the share of a block. The ratio of the number of nodes in each block to the total number of nodes denotes the size of the block. We assume that the size of CC is smaller than that of PP.
Data Asymmetry
We define asymmetry of data in terms of variable block share and block size. Equal block shares = symmetry in shares; equal block sizes = symmetry in sizes. This notion can also be applied to represent underlying community structures of networks.
Main Findings of the Paper
We apply our methods to both sparse and dense networks. For sparse networks, we find that ME and our methods do equally well. For dense networks, we find that the degree of asymmetry in the data affects the performance of both ME and our methods. In particular, the higher the asymmetry, the better our methods perform, up to a threshold size and share of CC.
Direct Method - Algorithm
First, we present the steps involved in fitting a copula to the data. Since we have two vectors as our marginals, we use a bivariate copula. Specifically, we use the functional form given by a Gumbel copula.
The Gumbel copula belongs to the extreme-value family. For the bivariate case it may be expressed as
C_G(a_.i, a_i.) = exp(−[(−ln a_.i)^θ + (−ln a_i.)^θ]^(1/θ))
The parameter θ controls the strength of dependence. The upper tail dependence for this copula is 2 − 2^(1/θ), while the lower tail dependence is 0.
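The formula above can be evaluated directly; this small sketch (names are our own) makes the role of θ concrete: θ = 1 gives independence, C(u, v) = uv, and larger θ pushes C(u, v) toward the comonotone bound min(u, v).

```python
import math

def gumbel_copula(u, v, theta):
    """Bivariate Gumbel copula C_G(u, v; theta), theta >= 1:
    C(u, v) = exp(-[(-ln u)^theta + (-ln v)^theta]^(1/theta))."""
    if not (0 < u < 1 and 0 < v < 1):
        raise ValueError("u and v must lie in (0, 1)")
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-s ** (1.0 / theta))
```

For example, gumbel_copula(0.5, 0.5, 1.0) returns 0.25 (independence), while at theta = 50 the same point is close to min(0.5, 0.5) = 0.5.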
Standard Copula Algorithm
Plot the densities of the marginal sums of rows and columns (i.e., the available data). Based on these plots, infer the nature of the distribution of the marginals (e.g., whether or not it is a mixture). Transform these marginals into uniform distributions (using kernel density estimation), which is required for them to serve as inputs into the copula function. Thereafter, fit a copula to the transformed data using maximum likelihood estimation of the dependence parameter.
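The uniform transform in the third step is a probability-integral transform through a kernel-smoothed CDF. A sketch, assuming scipy is available (a plain rank transform is a common lighter alternative):

```python
import numpy as np
from scipy.stats import gaussian_kde

def to_uniform(sample):
    """Probability-integral transform via a Gaussian-KDE CDF:
    maps each observation to an approximately Uniform(0, 1) value,
    as required by the copula fitting step."""
    sample = np.asarray(sample, dtype=float)
    kde = gaussian_kde(sample)
    u = np.array([kde.integrate_box_1d(-np.inf, x) for x in sample])
    # keep values strictly inside (0, 1) for the copula
    return np.clip(u, 1e-6, 1 - 1e-6)

x = np.random.default_rng(0).exponential(size=200)
u = to_uniform(x)
```

The transform is monotone, so the ordering of the data (and hence its rank-based dependence) is preserved.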
The ML estimation undertaken in this step makes the parametric assumption that the copula is a function of some dependence parameter θ. Our aim is to estimate θ in order to deduce the nature of dependence. The copula distribution function is given by C(a_i., a_.j; θ), and we denote the copula density by c_θ(a_i., a_.j). The log-likelihood is
ln L(θ | a_i., a_.j) = Σ_{k=1}^{N} ln c_θ(F̂_1(a_ik), F̂_2(a_kj))   (1)
We then use this estimate of the dependence parameter, θ̂, to generate a matrix of cumulative probabilities.
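The paper maximizes (1) directly. As a lighter stand-in that needs no copula density, the Gumbel θ can also be recovered from Kendall's τ, since for the Gumbel family τ = 1 − 1/θ; the sketch below (our own names, scipy assumed) uses that inversion on a synthetic positively dependent pair.

```python
import numpy as np
from scipy.stats import kendalltau

def fit_gumbel_theta(x, y):
    """Moment estimator for the Gumbel dependence parameter:
    tau = 1 - 1/theta for the Gumbel family, so theta = 1/(1 - tau).
    (The paper uses ML on (1) instead; this is a simple substitute.)"""
    tau, _ = kendalltau(x, y)
    tau = min(tau, 1 - 1e-9)            # guard against tau == 1
    return max(1.0, 1.0 / (1.0 - tau))  # Gumbel requires theta >= 1

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = x + 0.5 * rng.normal(size=500)      # positively dependent pair
theta_hat = fit_gumbel_theta(x, y)
```

Since Kendall's τ is rank-based, this estimate is unchanged by the marginal-to-uniform transform of the previous step.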
Direct Method Algorithm
Begin by rescaling the matrix of probabilities derived from the copula estimation into a stochastic matrix. Then apply the RAS algorithm to re-balance the matrix. The RAS algorithm is an iterative proportional fitting procedure for estimating the bilateral connections (the a_ij) of the connection matrix such that the marginal sums remain unchanged. Then calculate the error measure
ε = Σ_i Σ_j |â_ij − a_ij| / Σ_i Σ_j a_ij   (2)
where â_ij is the estimated a_ij. The error measure returns the sum of the differences between the estimated and the true values as a percentage of total connections.
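Equation (2) is a one-liner in code; this sketch (function name is our own) is the yardstick used throughout the comparisons below.

```python
import numpy as np

def error_measure(A_hat, A):
    """Equation (2): sum of absolute deviations between estimated and
    true bilateral connections, as a share of total connections."""
    A_hat = np.asarray(A_hat, dtype=float)
    A = np.asarray(A, dtype=float)
    return np.abs(A_hat - A).sum() / A.sum()
```

For instance, overestimating a single link by 1 in a network with total connections 5 gives an error of 0.2.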
Then estimate the matrix of connections using the maximum entropy method and calculate its error measure. Compare both error measures: the fit with the lower error measure is the better fit.
Indirect Method Algorithm
First we estimate the matrix of connections using the maximum entropy method. Now, suppose we believe the data is a combination of two different copulas in the ratio x to 1 − x. Then use the direct copula estimation method to fit each copula to the data separately. Using these estimates we construct Q_0 as follows.
1. First insert values extracted from the appropriate copula estimates into x% of the cells of Q_0.
2. Then for the remaining (1 − x)% we use the other copula's estimates.
For example, suppose we were looking at a 5x5 Q_0 matrix with, say, 60% of values from the first copula. Then we take the first 15 (0.6 × 25) estimates, i.e., the first 3 rows and 5 columns, from the matrix of the first copula and use them as the first 3 rows and 5 columns of Q_0. The remaining part of Q_0 then comes from the other copula in a similar way.
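The row-stacking in the example above can be sketched as follows (names and the constant stand-in matrices are our own; in practice C1 and C2 would be the two fitted copula estimates):

```python
import numpy as np

def hybrid_initial_guess(C1, C2, x):
    """Build Q0 by taking the first x share of rows from the first
    copula's estimate C1 and the remaining rows from the second
    copula's estimate C2, matching the 3-rows-of-5 example."""
    n = C1.shape[0]
    k = int(round(x * n))               # rows taken from C1
    return np.vstack([C1[:k, :], C2[k:, :]])

C1 = np.full((5, 5), 0.1)   # stand-in for the first copula estimate
C2 = np.full((5, 5), 0.9)   # stand-in for the second copula estimate
Q0 = hybrid_initial_guess(C1, C2, 0.6)   # 60% from the first copula
```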
So Q_0 may be represented as
Figure: Hybrid of Copulas As Initial Guess
Once Q_0 has been constructed, conduct a constrained minimization of the distance
‖G − Q_0‖   (3)
subject to
R · P = B   (4)
where
1. G is the matrix to be estimated, with typical element {a_ij};
2. R is the vector of restrictions imposed on G;
3. P is the matrix of probabilities;
4. B is the target vector.
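A sketch of this constrained fit using scipy's SLSQP solver. Taking the restrictions in (4) to be the marginal (row and column sum) constraints is our own concrete choice for illustration, as is the squared-distance objective.

```python
import numpy as np
from scipy.optimize import minimize

def indirect_fit(Q0, row_sums, col_sums):
    """Minimize ||G - Q0||^2 subject to marginal restrictions,
    a concrete stand-in for the constraints R.P = B in (4)."""
    n = Q0.shape[0]
    r = np.asarray(row_sums, dtype=float)
    c = np.asarray(col_sums, dtype=float)

    def objective(g):
        return np.sum((g - Q0.ravel()) ** 2)

    cons = [
        {"type": "eq", "fun": lambda g: g.reshape(n, n).sum(axis=1) - r},
        # drop one column constraint: it is implied by the others
        {"type": "eq",
         "fun": lambda g: g.reshape(n, n).sum(axis=0)[:-1] - c[:-1]},
    ]
    res = minimize(objective, Q0.ravel(), method="SLSQP",
                   bounds=[(0, None)] * (n * n), constraints=cons)
    return res.x.reshape(n, n)

Q0 = np.ones((3, 3))
G = indirect_fit(Q0, row_sums=[2., 4., 6.], col_sums=[3., 4., 5.])
```

With a quadratic objective and linear constraints this is a well-behaved problem, so SLSQP converges to the projection of Q_0 onto the feasible set.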
Thereafter, extract the Q that is the minimizer. Calculate its error measure and compare it to the one derived from maximum entropy. The better performer will have the lower error measure.
Monte Carlo Simulations - Data Generating Processes
We concentrate on intra- and inter-block asymmetries, and introduce parameters, called inter and intra, to vary them. We use two different DGPs: DGP 1 and DGP 2.
Monte Carlo Simulations - DGP 1
DGP 1 imposes that the entries in the PP block are all 0s. The values in the CC block are high (e.g., between 5000 and 10000), and the values in the CP and PC blocks are lower than CC but higher than PP. These values (except the PP and CC blocks) are varied using the inter and intra parameters.
Figure: Graphical Representation of DGP 1
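A sketch of one DGP 1 draw. The CC range (5000 to 10000) is from the text; the CP/PC range used here is our own illustrative choice, as are the function and parameter names.

```python
import numpy as np

def simulate_dgp1(n, core_size, rng):
    """One DGP 1 network: CC entries are large (U(5000, 10000), as
    in the text), PP entries are exactly 0, and CP/PC entries fall
    in between (U(500, 5000) is our illustrative choice)."""
    A = np.zeros((n, n))
    k = core_size
    A[:k, :k] = rng.uniform(5000, 10000, size=(k, k))     # CC block
    A[:k, k:] = rng.uniform(500, 5000, size=(k, n - k))   # CP block
    A[k:, :k] = rng.uniform(500, 5000, size=(n - k, k))   # PC block
    # PP block A[k:, k:] stays 0 by construction
    return A

A = simulate_dgp1(n=10, core_size=3, rng=np.random.default_rng(1))
```

Row and column sums of A then play the role of the observed marginals fed to ME and to the copula methods.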
Monte Carlo Simulations - DGP 2
DGP 2 imposes that the entries in the PP block are NOT 0s, and the values in the CC block are higher than under DGP 1. These values (except the CC block) are varied using the inter and intra parameters.
Figure: Graphical Representation of DGP 2
Monte Carlo Simulations - Findings Under DGP 1
We conducted 1000 simulations for fixed network sizes but variable core sizes, and 1000 simulations for variable network sizes but fixed core sizes.
Under the direct method with variable intra-block asymmetry, the following represents the difference between the error measures of ME and the copula-based direct method.
Figure: Performance of Gumbel vs. ME under DGP 1 with Variable Intra Block Asymmetry
The graphs clearly indicate that the difference is negative, implying that ME performs better. However, as the network size increases, there seems to be an upward trend. Potential reasons:
1. The share of PP is 0.
2. The size and share of CC are high, but not high enough to make the data significantly asymmetric.
So we check with DGP 2, which addresses both these points.
143 Monte Carlo Simulations - Findings Under DGP 2 The following graph clearly indicates an upward trend as the CC and PP values are increased under DGP 2, implying that the direct method performs much better. Figure: Performance of Gumbel vs. ME under DGP 2 with Variable Intra Block Asymmetry
145 Monte Carlo Simulations - Findings Under DGP 2 However, as we reduce the PP values towards 0, a downward trend starts to appear. Figure: Performance of Gumbel vs. ME under DGP 2 with Variable Intra Block Asymmetry with Lower PP
147 Monte Carlo Simulations - Findings On the other hand, under DGP 1 and DGP 2, as we change the inter-block asymmetry we observe a distinct negative trend. Figure: Performance of Gumbel vs. ME under DGP 1 and DGP 2 with Variable Inter Block Asymmetry
149 Monte Carlo Simulations - Findings Potential Reasons: 1 Beyond a point (in particular, under DGP 2), as we increase the parameter inter, we end up balancing the shares of all blocks. 2 As such, we end up making the data more symmetric. 3 As the data becomes more symmetric, ME starts to perform better. 4 As long as the data remains asymmetric, the copula-based method performs better. 5 Observations are similar even when we vary core sizes.
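Since asymmetry is the driver of these results, it is worth quantifying it directly. One simple option (our illustration, not the measure used in the paper) is the relative norm of the antisymmetric part of the matrix:

```python
import numpy as np

def asymmetry_index(A):
    """Relative size of the antisymmetric part of A: 0 for a perfectly
    symmetric matrix, growing toward 1 as the data becomes one-sided."""
    A = np.asarray(A, float)
    return np.linalg.norm(A - A.T) / np.linalg.norm(A + A.T)
```

For a symmetric matrix the index is 0; for a strictly one-directional matrix such as [[0, 1], [0, 0]] it is 1.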
155 Monte Carlo Simulations - Findings For Smaller Networks We observe the following for variable shares of CC and the performance of the copula-based vs. ME methods under DGP 1. Figure: Performance of Gumbel vs. ME under DGP 1 with Variable Inter Block Asymmetry For Small Networks
157 Monte Carlo Simulations - Findings For Smaller Networks We observe the following for variable shares of CC and the performance of the copula-based vs. ME methods under DGP 2. Figure: Performance of Gumbel, Hybrid vs. ME under DGP 2 with Variable Inter Block Asymmetry For Small Networks
159 Monte Carlo Simulations - Findings For Smaller Networks With DGP 1, we continue to get mixed results. With DGP 2, both the direct and the indirect methods do better than ME even for smaller network sizes. In particular, as we increase the values of the PP block while keeping the values of CC constant, the copula-based methods perform much better. Potential Reasons: 1 When we increase the values of PP, we increase the asymmetry in the data. 2 Rising intra-block asymmetry further increases the overall asymmetry in the data, reinforcing (1).
165 Application to Dense Network - BIS The Bank for International Settlements (BIS) data provides a bilateral description of interbank exposures at the country level. Figure: Structure of BIS Data
168 Application to Dense Network - BIS Since the BIS data has a given degree of asymmetry, we analyze its various sub-matrices. We look at the following matrices extracted from the BIS matrix of connections: 6 6, 7 7, 8 8, 10 10, 11 11, 13 13, 15 15, and the full dataset.
171 Application to Dense Network - BIS For instance, a sub-matrix of the BIS data looks like the following. Figure: Structure of BIS Matrix
173 Application to Dense Network - BIS We observe that for 6 6 with 5 nodes in the core, the share of CC is around 96%, CP = 0%, PC = %, PP = %, i.e., the data is highly asymmetric. At 21 21, the shares of the blocks are very similar, i.e., CC = 34.5%, CP = %, PC = % and PP = %, implying the data is highly symmetric. As the data becomes more asymmetric, we observe that the copula-based methods do better. In particular, the threshold is 11 11, with block shares CC = %, CP = %, PC = % and PP = %, below which the copula-based methods do better and above which the ME approach does better.
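The block shares quoted above can be computed directly from a matrix once the core is identified. A minimal sketch, assuming the core nodes are listed first:

```python
import numpy as np

def block_shares(A, n_core):
    """Share of total exposure falling in each block (CC, CP, PC, PP),
    assuming the first n_core indices are the core nodes."""
    A = np.asarray(A, float)
    c, p = slice(0, n_core), slice(n_core, None)
    total = A.sum()
    return {"CC": A[c, c].sum() / total,
            "CP": A[c, p].sum() / total,
            "PC": A[p, c].sum() / total,
            "PP": A[p, p].sum() / total}
```

A CC share near 1 (as in the 6 6 case) flags a highly asymmetric matrix, while roughly equal shares across the four blocks (as in the 21 21 case) flag a symmetric one.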
191 Application to Dense Network - BIS This is represented in the following figures. Figure: Performance of Copula Based Methods w.r.t BIS data, Variable Core Share. Figure: Performance of Copula Based Methods w.r.t BIS data, Variable Core Size
195 Application to Sparse Network - emid e-MID is the Italian-based European reference electronic market for liquidity trading. This platform provides anonymized data on euro-denominated unsecured interbank transactions. We use a sample of e-MID data comprising 43 trading days recorded in the last two months of 2011. We compare the performance of the copula-based methods, the ME approach, and a random network generation method in which two nodes are connected at random with a binomial probability. Our preliminary results are reported here.
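Our reading of that random benchmark can be sketched as follows (the details are an assumption, not the exact procedure on the slide): each directed link exists independently with a fixed binomial probability, and the total traded volume is then spread evenly over the realized links.

```python
import numpy as np

def random_network(total_volume, n, p, seed=0):
    """Binomial random-network benchmark: draw each off-diagonal link
    independently with probability p, then spread total_volume evenly
    over the links that were realized."""
    rng = np.random.default_rng(seed)
    links = rng.random((n, n)) < p
    np.fill_diagonal(links, False)  # no self-loops
    X = np.zeros((n, n))
    if links.any():
        X[links] = total_volume / links.sum()
    return X
```

Because link placement ignores the margins entirely, this benchmark only matches the network's total volume, which is consistent with its weak performance below.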
201 Application to Sparse Network - emid Data Description The following figure shows the number of non-zero entries (our nodes). Figure: emid Data Description
204 Application to Sparse Network - emid Data Description Here is what we find: Figure: Performance of ME, Copula Based Direct Method and Random Network Generation Method
206 Application to Sparse Network - emid Data Description We observe that both the ME and the copula-based direct method outperform the random network generation method. It may be deduced that the random network generation approach is not sufficient to solve the problem. The ME and copula-based direct methods perform almost exactly the same! Does that mean that the impact of asymmetry vanishes as the matrices become very sparse? Intuitively, YES.
More informationExam Design and Analysis of Algorithms for Parallel Computer Systems 9 15 at ÖP3
UMEÅ UNIVERSITET Institutionen för datavetenskap Lars Karlsson, Bo Kågström och Mikael Rännar Design and Analysis of Algorithms for Parallel Computer Systems VT2009 June 2, 2009 Exam Design and Analysis
More informationMSA220 - Statistical Learning for Big Data
MSA220 - Statistical Learning for Big Data Lecture 13 Rebecka Jörnsten Mathematical Sciences University of Gothenburg and Chalmers University of Technology Clustering Explorative analysis - finding groups
More informationNote Set 4: Finite Mixture Models and the EM Algorithm
Note Set 4: Finite Mixture Models and the EM Algorithm Padhraic Smyth, Department of Computer Science University of California, Irvine Finite Mixture Models A finite mixture model with K components, for
More informationUnderstanding Clustering Supervising the unsupervised
Understanding Clustering Supervising the unsupervised Janu Verma IBM T.J. Watson Research Center, New York http://jverma.github.io/ jverma@us.ibm.com @januverma Clustering Grouping together similar data
More informationEfficient Feature Learning Using Perturb-and-MAP
Efficient Feature Learning Using Perturb-and-MAP Ke Li, Kevin Swersky, Richard Zemel Dept. of Computer Science, University of Toronto {keli,kswersky,zemel}@cs.toronto.edu Abstract Perturb-and-MAP [1] is
More informationStephen Scott.
1 / 33 sscott@cse.unl.edu 2 / 33 Start with a set of sequences In each column, residues are homolgous Residues occupy similar positions in 3D structure Residues diverge from a common ancestral residue
More informationClustering CS 550: Machine Learning
Clustering CS 550: Machine Learning This slide set mainly uses the slides given in the following links: http://www-users.cs.umn.edu/~kumar/dmbook/ch8.pdf http://www-users.cs.umn.edu/~kumar/dmbook/dmslides/chap8_basic_cluster_analysis.pdf
More informationPerformance Models for Evaluation and Automatic Tuning of Symmetric Sparse Matrix-Vector Multiply
Performance Models for Evaluation and Automatic Tuning of Symmetric Sparse Matrix-Vector Multiply University of California, Berkeley Berkeley Benchmarking and Optimization Group (BeBOP) http://bebop.cs.berkeley.edu
More informationReflexive Regular Equivalence for Bipartite Data
Reflexive Regular Equivalence for Bipartite Data Aaron Gerow 1, Mingyang Zhou 2, Stan Matwin 1, and Feng Shi 3 1 Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada 2 Department of Computer
More informationStatistical Physics of Community Detection
Statistical Physics of Community Detection Keegan Go (keegango), Kenji Hata (khata) December 8, 2015 1 Introduction Community detection is a key problem in network science. Identifying communities, defined
More informationPackage NetworkRiskMeasures
Type Package Package NetworkRiskMeasures Title Risk Measures for (Financial) Networks Version 0.1.2 March 26, 2017 Author Carlos Cinelli , Thiago Cristiano Silva
More informationClustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin
Clustering K-means Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, 2014 Carlos Guestrin 2005-2014 1 Clustering images Set of Images [Goldberger et al.] Carlos Guestrin 2005-2014
More information(X 1:n η) 1 θ e 1. i=1. Using the traditional MLE derivation technique, the penalized MLEs for η and θ are: = n. (X i η) = 0. i=1 = 1.
EXAMINING THE PERFORMANCE OF A CONTROL CHART FOR THE SHIFTED EXPONENTIAL DISTRIBUTION USING PENALIZED MAXIMUM LIKELIHOOD ESTIMATORS: A SIMULATION STUDY USING SAS Austin Brown, M.S., University of Northern
More information(Sparse) Linear Solvers
(Sparse) Linear Solvers Ax = B Why? Many geometry processing applications boil down to: solve one or more linear systems Parameterization Editing Reconstruction Fairing Morphing 2 Don t you just invert
More information2.3 Algorithms Using Map-Reduce
28 CHAPTER 2. MAP-REDUCE AND THE NEW SOFTWARE STACK one becomes available. The Master must also inform each Reduce task that the location of its input from that Map task has changed. Dealing with a failure
More informationSingular Value Decomposition, and Application to Recommender Systems
Singular Value Decomposition, and Application to Recommender Systems CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Recommendation
More informationNetworks and Algebraic Statistics
Networks and Algebraic Statistics Dane Wilburne Illinois Institute of Technology UC Davis CACAO Seminar Davis, CA October 4th, 2016 dwilburne@hawk.iit.edu (IIT) Networks and Alg. Stat. Oct. 2016 1 / 23
More informationCHAPTER 6 IDENTIFICATION OF CLUSTERS USING VISUAL VALIDATION VAT ALGORITHM
96 CHAPTER 6 IDENTIFICATION OF CLUSTERS USING VISUAL VALIDATION VAT ALGORITHM Clustering is the process of combining a set of relevant information in the same group. In this process KM algorithm plays
More informationClustering. SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic
Clustering SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic Clustering is one of the fundamental and ubiquitous tasks in exploratory data analysis a first intuition about the
More informationClustering. Supervised vs. Unsupervised Learning
Clustering Supervised vs. Unsupervised Learning So far we have assumed that the training samples used to design the classifier were labeled by their class membership (supervised learning) We assume now
More informationUnsupervised Learning : Clustering
Unsupervised Learning : Clustering Things to be Addressed Traditional Learning Models. Cluster Analysis K-means Clustering Algorithm Drawbacks of traditional clustering algorithms. Clustering as a complex
More informationNested Sampling: Introduction and Implementation
UNIVERSITY OF TEXAS AT SAN ANTONIO Nested Sampling: Introduction and Implementation Liang Jing May 2009 1 1 ABSTRACT Nested Sampling is a new technique to calculate the evidence, Z = P(D M) = p(d θ, M)p(θ
More informationTag-based Social Interest Discovery
Tag-based Social Interest Discovery Xin Li / Lei Guo / Yihong (Eric) Zhao Yahoo!Inc 2008 Presented by: Tuan Anh Le (aletuan@vub.ac.be) 1 Outline Introduction Data set collection & Pre-processing Architecture
More informationBeyond community detection on undirected, unweighted graphs
Beyond community detection on undirected, unweighted graphs Vipul Pandey vpandey1@stanford.edu Juthika Dabholkar juthika@stanford.edu November 17, 2011 Rex Kirshner rbk@stanford.edu 1 Abstract unity detection
More informationEvaluating Machine-Learning Methods. Goals for the lecture
Evaluating Machine-Learning Methods Mark Craven and David Page Computer Sciences 760 Spring 2018 www.biostat.wisc.edu/~craven/cs760/ Some of the slides in these lectures have been adapted/borrowed from
More informationUnsupervised Learning and Clustering
Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2009 CS 551, Spring 2009 c 2009, Selim Aksoy (Bilkent University)
More informationClustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin
Clustering K-means Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, 2014 Carlos Guestrin 2005-2014 1 Clustering images Set of Images [Goldberger et al.] Carlos Guestrin 2005-2014
More informationVisual Representations for Machine Learning
Visual Representations for Machine Learning Spectral Clustering and Channel Representations Lecture 1 Spectral Clustering: introduction and confusion Michael Felsberg Klas Nordberg The Spectral Clustering
More informationSupervised vs. Unsupervised Learning
Clustering Supervised vs. Unsupervised Learning So far we have assumed that the training samples used to design the classifier were labeled by their class membership (supervised learning) We assume now
More informationGAMs semi-parametric GLMs. Simon Wood Mathematical Sciences, University of Bath, U.K.
GAMs semi-parametric GLMs Simon Wood Mathematical Sciences, University of Bath, U.K. Generalized linear models, GLM 1. A GLM models a univariate response, y i as g{e(y i )} = X i β where y i Exponential
More informationExplore Co-clustering on Job Applications. Qingyun Wan SUNet ID:qywan
Explore Co-clustering on Job Applications Qingyun Wan SUNet ID:qywan 1 Introduction In the job marketplace, the supply side represents the job postings posted by job posters and the demand side presents
More informationAcquisition Description Exploration Examination Understanding what data is collected. Characterizing properties of data.
Summary Statistics Acquisition Description Exploration Examination what data is collected Characterizing properties of data. Exploring the data distribution(s). Identifying data quality problems. Selecting
More informationGaussian and Exponential Architectures in Small-World Associative Memories
and Architectures in Small-World Associative Memories Lee Calcraft, Rod Adams and Neil Davey School of Computer Science, University of Hertfordshire College Lane, Hatfield, Herts AL1 9AB, U.K. {L.Calcraft,
More informationRandom Graph Model; parameterization 2
Agenda Random Graphs Recap giant component and small world statistics problems: degree distribution and triangles Recall that a graph G = (V, E) consists of a set of vertices V and a set of edges E V V.
More informationThe Cross-Entropy Method for Mathematical Programming
The Cross-Entropy Method for Mathematical Programming Dirk P. Kroese Reuven Y. Rubinstein Department of Mathematics, The University of Queensland, Australia Faculty of Industrial Engineering and Management,
More informationGenerative and discriminative classification techniques
Generative and discriminative classification techniques Machine Learning and Category Representation 2014-2015 Jakob Verbeek, November 28, 2014 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.14.15
More informationFMA901F: Machine Learning Lecture 3: Linear Models for Regression. Cristian Sminchisescu
FMA901F: Machine Learning Lecture 3: Linear Models for Regression Cristian Sminchisescu Machine Learning: Frequentist vs. Bayesian In the frequentist setting, we seek a fixed parameter (vector), with value(s)
More informationData Mining. Clustering. Hamid Beigy. Sharif University of Technology. Fall 1394
Data Mining Clustering Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1394 1 / 31 Table of contents 1 Introduction 2 Data matrix and
More informationLecture 8: The EM algorithm
10-708: Probabilistic Graphical Models 10-708, Spring 2017 Lecture 8: The EM algorithm Lecturer: Manuela M. Veloso, Eric P. Xing Scribes: Huiting Liu, Yifan Yang 1 Introduction Previous lecture discusses
More informationStatistical Methods for Network Analysis: Exponential Random Graph Models
Day 2: Network Modeling Statistical Methods for Network Analysis: Exponential Random Graph Models NMID workshop September 17 21, 2012 Prof. Martina Morris Prof. Steven Goodreau Supported by the US National
More information1. Estimation equations for strip transect sampling, using notation consistent with that used to
Web-based Supplementary Materials for Line Transect Methods for Plant Surveys by S.T. Buckland, D.L. Borchers, A. Johnston, P.A. Henrys and T.A. Marques Web Appendix A. Introduction In this on-line appendix,
More informationMobility Models. Larissa Marinho Eglem de Oliveira. May 26th CMPE 257 Wireless Networks. (UCSC) May / 50
Mobility Models Larissa Marinho Eglem de Oliveira CMPE 257 Wireless Networks May 26th 2015 (UCSC) May 2015 1 / 50 1 Motivation 2 Mobility Models 3 Extracting a Mobility Model from Real User Traces 4 Self-similar
More informationCONDITIONAL SIMULATION OF TRUNCATED RANDOM FIELDS USING GRADIENT METHODS
CONDITIONAL SIMULATION OF TRUNCATED RANDOM FIELDS USING GRADIENT METHODS Introduction Ning Liu and Dean S. Oliver University of Oklahoma, Norman, Oklahoma, USA; ning@ou.edu The problem of estimating the
More informationHomework 4: Clustering, Recommenders, Dim. Reduction, ML and Graph Mining (due November 19 th, 2014, 2:30pm, in class hard-copy please)
Virginia Tech. Computer Science CS 5614 (Big) Data Management Systems Fall 2014, Prakash Homework 4: Clustering, Recommenders, Dim. Reduction, ML and Graph Mining (due November 19 th, 2014, 2:30pm, in
More information