Exploiting Social Network Structure for Person-to-Person Sentiment Analysis (Supplementary Material to [WPLP14])

Similar documents
On vertex-coloring edge-weighting of graphs

Graph Theory Day Four

MC 302 GRAPH THEORY 10/1/13 Solutions to HW #2 50 points + 6 XC points

1 The Traveling Salesperson Problem (TSP)

Traveling Salesman Problem (TSP) Input: undirected graph G=(V,E), c: E R + Goal: find a tour (Hamiltonian cycle) of minimum cost

Definition: A graph G = (V, E) is called a tree if G is connected and acyclic. The following theorem captures many important facts about trees.

CS473-Algorithms I. Lecture 13-A. Graphs. Cevdet Aykanat - Bilkent University Computer Engineering Department

Theorem 2.9: nearest addition algorithm

Small Survey on Perfect Graphs

Lecture 5: Graphs. Rajat Mittal. IIT Kanpur

Introduction III. Graphs. Motivations I. Introduction IV

CS261: A Second Course in Algorithms Lecture #16: The Traveling Salesman Problem

Vertex Magic Total Labelings of Complete Graphs 1

Part II. Graph Theory. Year

Graph Theory. ICT Theory Excerpt from various sources by Robert Pergl

Lecture 7. s.t. e = (u,v) E x u + x v 1 (2) v V x v 0 (3)

Monotone Paths in Geometric Triangulations

HW Graph Theory SOLUTIONS (hbovik) - Q

On Acyclic Vertex Coloring of Grid like graphs

Vertex Odd Divisor Cordial Labeling for Vertex Switching of Special Graphs

PACKING DIGRAPHS WITH DIRECTED CLOSED TRAILS

Introduction to Graph Theory

Lecture 8: The Traveling Salesman Problem

4. Simplicial Complexes and Simplicial Homology

Hardness of Subgraph and Supergraph Problems in c-tournaments

Assignment 4 Solutions of graph problems

1 Minimum Spanning Trees (MST) b 2 3 a. 10 e h. j m

Minimal comparability completions of arbitrary graphs

Topic: Local Search: Max-Cut, Facility Location Date: 2/13/2007

Properly Colored Paths and Cycles in Complete Graphs

On the Balanced Case of the Brualdi-Shen Conjecture on 4-Cycle Decompositions of Eulerian Bipartite Tournaments

Preimages of Small Geometric Cycles

Solutions to In-Class Problems Week 4, Fri

A Reduction of Conway s Thrackle Conjecture

CPS 102: Discrete Mathematics. Quiz 3 Date: Wednesday November 30, Instructor: Bruce Maggs NAME: Prob # Score. Total 60

The NP-Completeness of Some Edge-Partition Problems

NP and computational intractability. Kleinberg and Tardos, chapter 8

Math 776 Graph Theory Lecture Note 1 Basic concepts

CMSC 380. Graph Terminology and Representation

Pantographic polygons

Number Theory and Graph Theory

Graphs (MTAT , 6 EAP) Lectures: Mon 14-16, hall 404 Exercises: Wed 14-16, hall 402

Math 778S Spectral Graph Theory Handout #2: Basic graph theory

On Rainbow Cycles in Edge Colored Complete Graphs. S. Akbari, O. Etesami, H. Mahini, M. Mahmoody. Abstract

Two Characterizations of Hypercubes

13 th Annual Johns Hopkins Math Tournament Saturday, February 19, 2011 Automata Theory EUR solutions

CHAPTER 2. Graphs. 1. Introduction to Graphs and Graph Isomorphism

A Note On The Sparing Number Of The Sieve Graphs Of Certain Graphs

Max-Cut and Max-Bisection are NP-hard on unit disk graphs

Discharging and reducible configurations

Scalable Clustering of Signed Networks Using Balance Normalized Cut

1 Homophily and assortative mixing

Lecture 22 Tuesday, April 10

1. The following graph is not Eulerian. Make it into an Eulerian graph by adding as few edges as possible.

Ma/CS 6b Class 26: Art Galleries and Politicians

Graph Theory Questions from Past Papers

Binary Decision Diagrams

EDGE-COLOURED GRAPHS AND SWITCHING WITH S m, A m AND D m

Complexity and approximation of satisfactory partition problems

On Modularity Clustering. Group III (Ying Xuan, Swati Gambhir & Ravi Tiwari)

Crossing bridges. Crossing bridges Great Ideas in Theoretical Computer Science. Lecture 12: Graphs I: The Basics. Königsberg (Prussia)

Graphs. Pseudograph: multiple edges and loops allowed

CME 305: Discrete Mathematics and Algorithms Instructor: Reza Zadeh HW#3 Due at the beginning of class Thursday 02/26/15

On the Relationships between Zero Forcing Numbers and Certain Graph Coverings

EECS 1028 M: Discrete Mathematics for Engineers

DO NOT RE-DISTRIBUTE THIS SOLUTION FILE

Lecture 2 September 3

Math 443/543 Graph Theory Notes

Algorithms for finding the minimum cycle mean in the weighted directed graph

Characterizing Graphs (3) Characterizing Graphs (1) Characterizing Graphs (2) Characterizing Graphs (4)

Elements of Graph Theory

7.3 Spanning trees Spanning trees [ ] 61

CS 473: Algorithms. Ruta Mehta. Spring University of Illinois, Urbana-Champaign. Ruta (UIUC) CS473 1 Spring / 36

by conservation of flow, hence the cancelation. Similarly, we have

GENERALIZED CPR-GRAPHS AND APPLICATIONS

Approximation Algorithms

ON THE STRUCTURE OF SELF-COMPLEMENTARY GRAPHS ROBERT MOLINA DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE ALMA COLLEGE ABSTRACT

1 Variations of the Traveling Salesman Problem

Algebraic Graph Theory- Adjacency Matrix and Spectrum

The Encoding Complexity of Network Coding

Problem 3.1 (Building up geometry). For an axiomatically built-up geometry, six groups of axioms needed:

Gracefulness of a New Class from Copies of kc 4 P 2n and P 2 * nc 3

1. Lecture notes on bipartite matching February 4th,

Extremal Graph Theory: Turán s Theorem

Matching and Factor-Critical Property in 3-Dominating-Critical Graphs

1 Digraphs. Definition 1

Trees. 3. (Minimally Connected) G is connected and deleting any of its edges gives rise to a disconnected graph.

Embeddability of Arrangements of Pseudocircles into the Sphere

Star coloring planar graphs from small lists

Notes for Recitation 9

Best known solution time is Ω(V!) Check every permutation of vertices to see if there is a graph edge between adjacent vertices

Testing Isomorphism of Strongly Regular Graphs

Graph Algorithms Using Depth First Search

Maximal Monochromatic Geodesics in an Antipodal Coloring of Hypercube

Discrete mathematics , Fall Instructor: prof. János Pach

Communication Networks I December 4, 2001 Agenda Graph theory notation Trees Shortest path algorithms Distributed, asynchronous algorithms Page 1

Approximation Algorithms: The Primal-Dual Method. My T. Thai

γ(ɛ) (a, b) (a, d) (d, a) (a, b) (c, d) (d, d) (e, e) (e, a) (e, e) (a) Draw a picture of G.

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science Algorithms For Inference Fall 2014

Vertex Magic Total Labelings of Complete Graphs

Transcription:

Exploiting Social Network Structure for Person-to-Person Sentiment Analysis (Supplementary Material to [WPLP14]) Robert West, Hristo S. Paskov, Jure Leskovec, Christopher Potts Stanford University west@cs.stanford.edu, hpaskov@stanford.edu, jure@cs.stanford.edu, cgpotts@stanford.edu 1 Exploiting symmetries (Sec. 3.) Our definition of a triangle t = {e 1,e,e 3 } for an undirected signed graph G = (V,E,x) expresses each triangle as an unordered set of edges. However, we implicitly impose an ordering on the edges of t when we form x t = (x e1,x e,x e3 ) because the vector x t is an ordered object. As this ordering is arbitrary, d : {0,1} 3 R + must be invariant to the order of its arguments, i.e., for any permutation matrix π, d(x t ) = d(x t π). This requirement implies that there are only 4 possible values that d can take on: d(0,0,0),d(1,0,0) = d(0,1,0) = d(0,0,1),d(0,1,1) = d(1,0,1) = d(1,1,0), and d(1,1,1). We always use this restriction when learning or specifying d. Extension of the model to directed graphs (Sec. 3.) We now describe the TRIANGLE BALANCE problem for a directed signed graph G = (V,E,x). Since there is no global reference node, edge directionality does not influence the edge cost function c(x e, p e ) = λ 1 (1 p e )x e +λ 0 p e (1 x e ). However, when forming our triangle cost function, we now distinguish between two basic types of triangles: acyclic and cyclic. Since each triangle t is specified as an unordered set of edges, t is isomorphic to one of these basic types. We allow for different costs for each triangle type/sign combination by extending the definition of d to take the triangle type, σ t {CYCLIC, ACYCLIC}, as an argument. Our overall objective is then c(x e, p e ) + d(x t,σ t ). (1) e E t T Similarly to the undirected case, symmetries exist that d must take into account. When t is cyclical, the order of the edges does not matter and the same symmetries as in the undirected case must be obeyed. However, when t is acyclical each directed edge plays a different role in the triangle, and order matters. As such, we will assume that x t is formed by always taking edges in the same canonical order when t is acyclical. Moreover, d takes on 8 distinct values when its second argument is ACYCLIC, yielding a total of 1 distinct values for d when G is directed. 3 NP-hardness of TRIANGLE BALANCE (Sec. 3.3) We prove that TRIANGLE BALANCE is NP-hard for undirected graphs. It then follows immediately that the problem is NP-hard for directed graphs, too, since every undirected graph can be expressed as a directed graph, 1 which implies that the directed case is at least as hard as the undirected case. 1 For instance, by fixing some order v 1,...,v n on the vertices V = {v 1,...,v n } and representing an undirected edge e = {v i,v j } as the directed edge (v i,v j ) if i < j, and (v j,v i ) otherwise. 1

Further, in our problem formulation in Sec. 3. (Eq. ) of the paper, the cost of an edge e, c(x e, p e ), is defined as a mere shortcut for λ 1 (1 p e )x e +λ 0 p e (1 x e ), where p e [0,1], λ 1 R +, and λ 0 R + are free parameters associated with the problem instance, and x e {0,1} is the optimization variable. That is, an edge e s cost of being positive (negative) is entirely determined by the sentiment model s output p e and λ 1 (λ 0 ). For ease of exposition, we will first assume that edge costs are defined directly on a per-edge basis; i.e., each edge e has costs c(x e,e) associated immediately with it (for x e {0,1}), without assuming parameters p e, λ 1, and λ 0. We write c 1 (e) for c(1,e), and c 0 (e) for c(0, e). Having proved NP-hardness for this more general case, we will provide an argument why NP-hardness even holds for the more special case where edge costs are constrained to be of the form λ 1 (1 p e )x e + λ 0 p e (1 x e ). The TRIANGLE BALANCE problem is therefore defined as follows. Problem. TRIANGLE BALANCE Instance. An undirected graph G = (V,E), a cost function c 1 : E R, specifying the cost of each edge if it is positive, a cost function c 0 : E R, specifying the cost of each edge if it is negative, a triangle cost function d : {0,1} 3 R, where d(x t ) = d(x 1,x,x 3 ) defines the cost of a triangle t whose edges have the signs x t = (x 1,x,x 3 ). Task. Find edge signs x {0,1} E of minimum cost H(x) = H E (x) + H T (x), () where the total cost H(x) is the sum of the total edge cost H E (x) = x e c 1 (e) + (1 x e )c 0 (e) (3) e E and the total triangle cost H T (x) = t T d(x t ). (4) Theorem 1. TRIANGLE BALANCE is NP-hard. Proof. We prove NP-hardness by reduction from TWO-LEVEL SPIN GLASS, which is known to be NP-hard and defined as follows [Bar8]. Problem. TWO-LEVEL SPIN GLASS Instance. A two-level grid (cf. Fig. 1) Ḡ = ( V,Ē), an edge cost function c : Ē { 1,0,+1}. Task. Find vertex signs x { 1,+1} V of minimum cost H( x) = c(u,v) x u x v. (5) (u,v) Ē

Figure 1: A two-level grid [Bar8]. c Sol LeWitt: Cubic-Modular Wall Structure, Black, 1966, Museum of Modern Art, New York. Edge costs: 1 0 +1 v* TWO-LEVEL SPIN GLASS TRIANGLE BALANCE Figure : Illustration of the reduction from TWO-LEVEL SPIN GLASS to TRIANGLE BALANCE. (Dangling vertices with no incident blue or yellow edges not shown on right.) As a matter of notation, for a graph G = (V,E), we assume some order v 1,...,v n on the vertices V = {v 1,...,v n } and represent an undirected edge e = {v i,v j } as the pair (v i,v j ) if i < j, and (v j,v i ) otherwise. Here is a table of correspondences that motivates our reduction from TWO-LEVEL SPIN GLASS to TRIANGLE BALANCE: TWO-LEVEL SPIN GLASS vertex labels vertex costs (all-zero) edge costs TRIANGLE BALANCE edge labels edge costs triangle costs So the idea is to map vertices from the TWO-LEVEL SPIN GLASS instance to edges in the TRIANGLE BALANCE instance, and edges in the TWO-LEVEL SPIN GLASS instance to triangles in the TRIANGLE BALANCE instance. The reduction is illustrated in Fig.. Formally, we transform a TWO-LEVEL SPIN GLASS instance (Ḡ = ( V,Ē), c) into a TRIANGLE BALANCE instance (G = (V,E),c 1,c 0,d) as follows: 3

V = V {v }, where v / V (6) E = {e Ē : c(e) 0} {(v,v ) : v V } (7) 0 if e / Ē c 1 (e) = 0 if e Ē and c(e) = +1 (8) Ē + 1 if e Ē and c(e) = 1 0 if e / Ē c 0 (e) = 0 if e Ē and c(e) = 1 (9) Ē + 1 if e Ē and c(e) = +1 { 1 if x1 + x d(x 1,x,x 3 ) = + x 3 {1,3} (10) +1 if x 1 + x + x 3 {0,} We need to prove that each optimal solution to the original problem induces an optimal solution to the transformed problem, and vice versa. Our first observation is that, in the transformed problem, there is 1. exactly one new edge (v,v ) E \ Ē for each old vertex v V (red edges in Fig. ), creating. exactly one new triangle {(v 1,v ),(v 1,v ),(v,v )} for each old non-zero edge (v 1,v ) E Ē. Further, since every two-level grid Ḡ is triangle-free, the set of triangles in G corresponds exactly to the set of non-zero edges in Ḡ. Due to item 1, there is a one-to-one correspondence between labelings of the old (black in Fig. ) vertices V and labelings of the new (red in Fig. ) edges E \ Ē. Further, the old edge cost function c induces a labeling of the old (blue and yellow) edges E Ē. So, for each labeling x of the old vertices V, we can define exactly one corresponding labeling x of the edges E of the transformed problem: { xu if (u,v) E \ Ē, i.e., if v = v, x (u,v) = (11) c(u,v) if (u,v) E Ē. We will show below that x has cost H(x) = H( x). Also, any x that labels the old edges E Ē differently than (11) will have cost H(x ) Ē + 1 Ē = Ē + 1 > H( x) = H(x), due to (10) and the third case of (9), and because the cost H( x) in the original problem is never more than Ē. Hence, any optimal solution x must label the old edges E Ē as in (11), which means that each optimal solution x to the old problem is in a one-to-one correspondence with an optimal solution x to the transformed problem, and min x H(x) = H(x ) = H( x ) = min x H( x). To see that H(x) = H( x) for the x-to-x transformation of (11), first observe that the total edge cost H E (x) (cf. (3)) is 0, since new edges are always free (case 1 of (9)) and old edges are free if they are signed according to c (case of (9)). Regarding the triangle cost H T (x) (cf. (4)), observe that a triangle (which always consists of one old edge and two new edges) has an even number of negative edges if and only if its old edge is positive and its two new edges are of the same sign, or its old edge is negative and its two new edges are of opposite signs. Therefore (cf. (10)), the number of triangles with cost ±1 equals the number of edges (u,v) of the original problem for which c(u,v) x u x v = 1, or equivalently, c(u,v) x u x v = ±1. So the total triangle cost H T (x) of the transformed problem equals the total cost H( x) of the original problem (cf. (5)). Adding edge and triangle costs, we obtain the total cost of the transformed problem as H(x) = H E (x) + H T (x) = 0 + H( x) = H( x). To conclude, we show that the problem is NP-hard even when we constrain the edge costs c(x e, p e ) to be of the form λ 1 (1 p e )x e + λ 0 p e (1 x e ), respectively, as assumed in Eq. of the paper (cf. the 4

discussion at the beginning of this section). In this case, instead of defining c 1 (e) and c 0 (e) for each e E, we need to define p e for each e E, as well as λ 1 and λ 0. We define 1 if e / Ē p e = 1 if e Ē and c(e) = +1 0 if e Ē and c(e) = 1 λ 1 = Ē + 1, λ 0 = Ē + 1. Since c 1 (e) = c(1, p e ) = λ 1 (1 p e ) and c 0 (e) = c(0, p e ) = λ 0 p e, we can simply use the following edge costs in the same proof as above: Ē + 1 if e / Ē c 1 (e) = 0 if e Ē and c(e) = +1 (1) Ē + 1 if e Ē and c(e) = 1 Ē + 1 if e / Ē c 0 (e) = Ē + 1 if e Ē and c(e) = +1 (13) 0 if e Ē and c(e) = 1 The same proof as above will be valid, with the small difference that now the costs of the optimal solutions to the TWO-LEVEL SPIN GLASS and TRIANGLE BALANCE instances will differ by a constant V ( Ē + 1 ), as there are V edges in E \ Ē, each of which contributes a cost of Ē + 1. A final remark: Although TRIANGLE BALANCE was defined for general triangle-cost functions d in Eq. 1 of the paper, our reduction shows that even the special case of d that rewards triangles with an even, and punishes triangles with an odd, number of negative edges is NP-hard. This is noteworthy since it is the notion of social balance originally considered in sociology [Hei46]. References [Bar8] Francisco Barahona. On the computational complexity of Ising spin glass models. Journal of Physics A: Mathematical and General, 15(10):341 353, 198. [Hei46] Fritz Heider. Attitudes and cognitive organization. The Journal of Psychology, 1(1):107 11, 1946. [WPLP14] Robert West, Hristo S. Paskov, Jure Leskovec, and Christopher Potts. Exploiting social network structure for person-to-person sentiment analysis. Transactions of the Association for Computational Linguistics,, 014. 5