arxiv: v1 [cs.ds] 12 Jun 2017


Streaming Non-monotone Submodular Maximization: Personalized Video Summarization on the Fly

Baharan Mirzasoleiman (ETH Zurich), Stefanie Jegelka (MIT), Andreas Krause (ETH Zurich)

Abstract

The need for real-time analysis of rapidly produced data streams (e.g., video and image streams) has motivated the design of streaming algorithms that can efficiently extract and summarize useful information from massive data on the fly. Such problems can often be reduced to maximizing a submodular set function subject to various constraints. While efficient streaming methods have recently been developed for monotone submodular maximization, in a wide range of applications, such as video summarization, the underlying utility function is non-monotone, and various constraints are often imposed on the optimization problem to account for privacy or personalization. We develop the first efficient single-pass streaming algorithm, STREAMING LOCAL SEARCH, with a constant-factor $(4p-1)/(4p(8p+2d-1))$ approximation guarantee for maximizing a non-monotone submodular function under a p-system and d knapsack constraints, with memory independent of the data size. In our experiments, we show that for the video summarization problem, our streaming method, while achieving practically the same performance, runs more than 1700 times faster than previous work.

1 Introduction

Data summarization, the task of efficiently extracting a representative subset of manageable size from a large dataset, has become an important goal in machine learning and information retrieval. Submodular maximization has recently been explored as a natural abstraction for many data summarization tasks, including image summarization [1], scene summarization [2], document and corpus summarization [3], active set selection in non-parametric learning [4] and training data compression [5].
Submodularity is an intuitive notion of diminishing returns, stating that selecting any given element earlier helps more than selecting it later. Given a set of constraints on the desired summary, and a (pre-designed or learned) submodular utility function $f$ that quantifies the representativeness $f(S)$ of a subset $S$ of items, data summarization can be naturally reduced to a constrained submodular optimization problem.

In this paper, we are motivated by applications of non-monotone submodular maximization. In particular, we consider video summarization in a streaming setting, where video frames are produced at a fast pace, and we want to keep an updated summary of the video so far, with little or no memory overhead. This has important applications, e.g., in surveillance cameras, wearable cameras, and astro video cameras, where the massive volume of rapidly produced data makes it impractical for computational units to analyze and store it in main memory. The same framework can be applied more generally in many settings where we need to extract a small subset of data from a large stream to train or update a machine learning model. At the same time, various constraints may be imposed by the underlying summarization application. These may range from a simple limit on the size of the summary to more complex restrictions such as focusing on particular individuals or objects, or excluding them from the summary. These requirements often arise in real-world scenarios to address privacy concerns (e.g., in the case of surveillance cameras) or personalization (according to users' interests).

In machine learning, Determinantal Point Processes (DPPs) have been proposed as computationally efficient methods for selecting a diverse subset from a ground set of items [6]. They have recently shown great success for video summarization [7], as well as problems like document summarization [6] and information retrieval [8]. While finding the most likely configuration (MAP) is NP-hard, the DPP probability is a log-submodular function, and submodular optimization techniques can be used to find a near-optimal solution. In general, this submodular function is highly non-monotone, and we need techniques for maximizing a non-monotone submodular function in the streaming setting. Although efficient streaming methods have recently been developed for maximizing a monotone submodular function $f$ under a variety of constraints, there is no effective solution for non-monotone submodular maximization under general types of constraints in the streaming setting.

In this work, we provide STREAMING LOCAL SEARCH, the first single-pass streaming algorithm for non-monotone submodular function maximization subject to the intersection of a p-system and d knapsack constraints. Our approach builds on local search, a widely used technique for maximizing non-monotone submodular functions in batch mode. Local search, however, needs multiple passes over the input, and hence does not directly extend to the streaming setting, where we are only allowed to make a single pass over the data. STREAMING LOCAL SEARCH provides a constant-factor $(4p-1)/(4p(8p+2d-1))$ approximation to the optimum solution, using $O(pk\log^2(k)/\epsilon^2)$ memory and $O(pk^2\log^2(k)/\epsilon^2)$ update time per element, where $k$ is the size of the largest feasible solution. Using parallel computation, the update time can be reduced to $O(pk^2)$, making our approach an appealing solution in real-time scenarios.
We show that for video summarization, our algorithm leads to streaming solutions that provide competitive utility when compared with those obtained via centralized methods, at a small fraction of the computational cost, i.e., more than 1700 times faster.

2 Related Work

Video summarization aims to retain diverse and representative frames according to criteria such as representativeness, diversity, interestingness, or importance of the frames [9, 10, 11]. This often requires hand-crafting to combine the criteria effectively. Recently, [7] proposed a supervised subset selection method using DPPs. Despite its superior performance, this method uses an exhaustive search for MAP inference, which makes it inapplicable for producing real-time summaries.

Local search has been widely used for submodular maximization subject to various constraints. This includes the analysis of greedy and local search by Nemhauser et al. [12], providing a $1/(p+1)$ approximation for monotone submodular maximization under p matroid constraints. Among the most recent results for non-monotone submodular maximization are a $(1+O(1/\sqrt{p}))p$-approximation subject to p-independence system constraints [13], a $(1/5-\epsilon)$ approximation under d knapsack constraints [14], and a $(p+1)(2p+2d+1)/p$-approximation for maximizing a general submodular function subject to a p-system and d knapsack constraints [15].

Streaming algorithms for submodular maximization have gained increasing attention for producing online summaries from data streams. Recently, Badanidiyuru et al. [16] proposed a single-pass streaming algorithm for monotone maximization that yields a $1/2-\epsilon$ approximation and needs $O(k\log k/\epsilon)$ memory. Chakrabarti and Kale [17] developed a single-pass algorithm for monotone functions over intersections of p matroids, achieving a $1/4p$ approximation guarantee. However, the required memory increases polylogarithmically with the size of the data. Finally, Chekuri et al. [18] presented a deterministic and a randomized algorithm for maximizing monotone and non-monotone submodular functions subject to a broader range of constraints, namely p-matchoids. Their methods give an $\Omega(1/p)$ approximation (in expectation) using $O(k\log k/\epsilon^2)$ memory, where $k$ is the size of the largest feasible solution.

3 Problem Statement

We consider the problem of summarizing a stream of data by selecting, on the fly, a subset that maximizes a utility function $f : 2^V \to \mathbb{R}_+$. The utility function is defined on subsets of the entire stream $V$ and, for each $S \subseteq V$, $f(S)$ quantifies how well $S$ represents $V$. We assume that $f$ is submodular, a property that holds for many widely used utility functions. This means that for any two sets $S \subseteq T \subseteq V$ and any element $e \in V \setminus T$ we have

$f(S \cup \{e\}) - f(S) \geq f(T \cup \{e\}) - f(T).$
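The diminishing-returns inequality can be checked by brute force on a tiny ground set. The sketch below is only an illustration under stated assumptions: it uses a graph cut function (a standard example of a non-negative, non-monotone submodular function, not the utility used in this paper), and the graph `EDGES` and all helper names are hypothetical.

```python
from itertools import combinations

# Hypothetical 4-node graph, used only for illustration.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
GROUND = range(4)


def f(subset):
    """Cut value: number of edges with exactly one endpoint in `subset`."""
    return sum(1 for u, v in EDGES if (u in subset) != (v in subset))


def powerset(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def is_submodular(func, ground):
    """Check f(S+e) - f(S) >= f(T+e) - f(T) for all S <= T and e not in T."""
    ground = frozenset(ground)
    for t in powerset(ground):
        for s in powerset(t):
            for e in ground - t:
                if func(s | {e}) - func(s) < func(t | {e}) - func(t):
                    return False
    return True


def is_monotone(func, ground):
    """Check that adding an element never decreases the value."""
    ground = frozenset(ground)
    return all(func(s | {e}) >= func(s)
               for s in powerset(ground) for e in ground - s)


print(is_submodular(f, GROUND), is_monotone(f, GROUND))
```

On this graph the cut function passes the submodularity check but fails the monotonicity check, since selecting every node cuts no edge at all. The enumeration is exponential in the ground set size, so it is only meant for sanity checks on toy instances.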

We denote the marginal gain of adding an element $e \in V$ to a summary $S \subseteq V$ by $f_S(e) = f(S \cup \{e\}) - f(S)$. The function $f$ is monotone if $f_S(e) \geq 0$ for all $S \subseteq V$ and $e \in V \setminus S$. Here, we allow $f$ to be non-monotone. Many data summarization applications can be cast as an instance of constrained submodular maximization under a set $\zeta \subseteq 2^V$ of constraints:

$S^* = \arg\max_{S \in \zeta} f(S).$

In this work, we consider a wide set of hereditary constraints, where any subset of a feasible set is also feasible, together with knapsack constraints. Common examples of hereditary constraints are matroids, matchoids and p-systems.

A matroid $M$ is a pair $(V, I)$ where $V$ is a finite (ground) set, and $I \subseteq 2^V$ is a family of independent subsets of $V$ satisfying the following two properties: (i) for any $A \subseteq B \subseteq V$, $B \in I$ implies that $A \in I$ (heredity property), and (ii) if $A, B \in I$ and $|B| > |A|$, there is an element $e \in B \setminus A$ such that $A \cup \{e\} \in I$. The maximal independent sets of $M$ share a common cardinality, called the rank of $M$. A uniform matroid is the family of all subsets of size at most $k$. In a partition matroid, we have a collection of disjoint sets $B_i$ and integers $0 \leq k_i \leq |B_i|$, where a set $A$ is independent if for every index $i$ we have $|A \cap B_i| \leq k_i$.

A p-matchoid generalizes matchings and intersections of matroids. For $q$ matroids $M_l = (V_l, I_l)$, $l \in [q]$, defined over overlapping ground sets, it requires that every element $e \in V$ is a member of $V_l$ for at most $p$ indices. Finally, a p-system is the most general type of constraint we consider in this paper. It requires that if $A, B \in I$ are two maximal sets, then $|A| \leq p|B|$.

A knapsack constraint is defined by a cost function $c : V \to \mathbb{R}_+$. A set $S \subseteq V$ is said to satisfy the knapsack constraint if $c(S) = \sum_{e \in S} c(e) \leq 1$.

The goal in this paper is to maximize a (non-monotone) submodular function $f$ subject to a set of constraints $\zeta$ defined by the intersection of a p-independence system $(V, I)$ and $d$ knapsacks. In other words, we would like to find a set $S \in I$ that maximizes $f$ where for each knapsack $c_i$, $i \in [d]$, we have $\sum_{e \in S} c_i(e) \leq 1$.
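To make the constraint family concrete, here is a small sketch of a membership oracle for one simple instance: a partition matroid intersected with knapsack constraints. The data (`BLOCK_OF`, `LIMITS`, `COSTS`) and the function names are hypothetical and chosen only for illustration; a general p-system oracle would replace `in_partition_matroid`.

```python
from itertools import combinations

# Hypothetical instance: 6 frames split into two segments (blocks).
BLOCK_OF = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
LIMITS = {0: 1, 1: 2}            # at most k_i frames from block i
COSTS = [{0: 0.5, 1: 0.5, 2: 0.5, 3: 0.5, 4: 0.25, 5: 0.25}]  # d = 1 knapsack


def in_partition_matroid(subset, block_of, limits):
    """Independent iff |A ∩ B_i| <= k_i for every block i."""
    counts = {}
    for e in subset:
        block = block_of[e]
        counts[block] = counts.get(block, 0) + 1
        if counts[block] > limits[block]:
            return False
    return True


def within_knapsacks(subset, costs, budget=1.0):
    """Every knapsack i must satisfy sum_{e in S} c_i(e) <= 1."""
    return all(sum(cost[e] for e in subset) <= budget for cost in costs)


def is_feasible(subset, block_of=BLOCK_OF, limits=LIMITS, costs=COSTS):
    """Membership oracle for the intersection of both constraint types."""
    return (in_partition_matroid(subset, block_of, limits)
            and within_knapsacks(subset, costs))


def all_subsets_feasible(subset):
    """Heredity check: every subset of a feasible set is feasible."""
    items = list(subset)
    return all(is_feasible(set(combo))
               for size in range(len(items) + 1)
               for combo in combinations(items, size))


print(is_feasible({0, 4, 5}), is_feasible({0, 1}), is_feasible({0, 3, 4}))
```

Here `{0, 4, 5}` is feasible, `{0, 1}` violates the block limit, and `{0, 3, 4}` exceeds the knapsack budget. Both constraint types are downward closed, which is what `all_subsets_feasible` checks on a given set.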
We assume that the ground set $V = \{e_1, \dots, e_n\}$ is received from the stream in some arbitrary order. At each point $t$ in time, the algorithm may maintain a memory $M_t \subseteq V$ of points, and must be ready to output a candidate feasible solution $S_t \subseteq M_t$ such that $S_t \in \zeta$. Upon receiving an element $e_t$ from the stream, the algorithm may elect to 1) insert it into its memory, 2) discard some elements in its memory and accept $e_t$ instead, or 3) discard $e_t$.

4 Video Summarization with DPPs

Suppose that we are receiving a stream of video frames, e.g., from a surveillance or a wearable camera, and we wish to select a subset of frames that concisely represents all the diversity in the video. The DPP is an appealing tool for modeling diversity in such applications. DPPs [19] are distributions over subsets with a preference for diversity, and have been successfully applied to video summarization [7], as well as problems like document summarization [6] and information retrieval [8].

Formally, a DPP $P$ on a set of items $V = \{1, 2, \dots, N\}$ defines a discrete probability distribution on $2^V$ (the set of all subsets of $V$), such that the probability of observing a subset $S \subseteq V$ is

$P(Y = S) = \frac{\det(L_S)}{\det(I + L)}, \quad (1)$

where $L$ is a positive semidefinite kernel matrix, $L_S \equiv [L_{ij}]_{i,j \in S}$ is the restriction of $L$ to the entries indexed by elements of $S$, and $I$ is the $N \times N$ identity matrix. In order to find the most diverse and informative feasible subset, we need to solve the NP-hard problem of finding $\arg\max_{S \in I} \det(L_S)$ [20], where $I \subseteq 2^V$ is a given family of feasible solutions. However, the logarithm $f(S) = \log\det(L_S)$ is a (non-monotone) submodular function [6], and we can apply submodular maximization techniques.

Various constraints can be imposed while maximizing the above non-monotone submodular utility function. In its simplest form, we can partition the video into $T$ segments, and define a diversity-reinforcing partition matroid to select at most $k$ frames from each segment.
In another example, various content-based constraints can be applied: e.g., we can use object recognition to select at most $k_i$ frames showing person $i$ in the video, or to find a summary that is focused on a particular person or object. Finally, to improve the quality of the produced summaries, the cost of a frame can be chosen as a function of its quality, such as resolution, contrast, luminance, or the probability that the given frame contains an object.
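As a minimal sketch of the utility $f(S) = \log\det(L_S)$ from this section, the code below evaluates it on a hand-made 3-by-3 kernel using only the standard library. The kernel values are hypothetical (items 0 and 1 are near-duplicates, item 2 is unrelated to both), and the small elimination routine merely stands in for a numerical library call.

```python
import math

# Hypothetical PSD kernel: items 0 and 1 are highly similar, item 2 is not.
KERNEL = [
    [2.0, 1.8, 0.0],
    [1.8, 2.0, 0.0],
    [0.0, 0.0, 2.0],
]


def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result          # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def log_det_utility(kernel, subset):
    """f(S) = log det(L_S); the empty set has determinant 1, so f = 0."""
    idx = sorted(subset)
    sub = [[kernel[i][j] for j in idx] for i in idx]
    d = det(sub)
    return math.log(d) if d > 0 else float("-inf")


for s in [{0}, {0, 1}, {0, 2}]:
    print(sorted(s), log_det_utility(KERNEL, s))
```

Adding the near-duplicate item 1 to `{0}` lowers the value, while adding the dissimilar item 2 raises it, which illustrates both the preference for diversity and the non-monotonicity. Note that this raw log-determinant can be negative on such a kernel, whereas the problem statement assumes a non-negative $f$; the sketch does not address that.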

Algorithm 1 STREAMING LOCAL SEARCH for Independence Systems
Input: $f : 2^E \to \mathbb{R}_+$, a membership oracle for independence systems $I \subset 2^E$, and a streaming algorithm INDSTREAM for independence systems with $\alpha$-approximation guarantee.
Output: A set $S \subseteq E$ satisfying $S \in I$.
1: for $t = 1$ to $T$ do
2:   $D_0 \leftarrow \{e_t\}$
3:   for $i = 1$ to $\lceil 1/\alpha \rceil$ do   (LOCAL SEARCH iterations)
4:     $[D_i, S_i] = $ INDSTREAM$_i(D_{i-1})$   ($D_i$ is the set discarded by INDSTREAM$_i$)
5:     $S'_i = $ UNCONSTRAINED-MAX$(S_i)$
6:   end for
7:   $S_t = \arg\max_i \{f(S_i), f(S'_i)\}$
8: end for
9: Return $S_t$

5 Streaming algorithm for constrained submodular maximization

In this section, we describe our streaming algorithm for maximizing a non-monotone submodular function subject to the intersection of a p-system and d knapsack constraints. Our approach builds on local search, which is a powerful and widely used technique for maximizing non-monotone submodular functions. It starts from a candidate solution $S$ and iteratively increases the value of the solution by either including a new element in $S$ or discarding one of the elements of $S$ [21]. Gupta et al. [22] showed that similar results can be obtained with much lower complexity by using algorithms for monotone submodular maximization, which, however, are run multiple times. Despite their effectiveness, these algorithms need multiple passes over the input and do not directly extend to the streaming setting, where we are only allowed to make a single pass over the data. In the sequel, we show how local search can be implemented in a single pass in the streaming setting.

5.1 STREAMING LOCAL SEARCH for independence systems

The simple yet crucial observation underlying the approach of Gupta et al. [22] is the following. The solution obtained by approximation algorithms for monotone submodular functions often satisfies $f(S) \geq \alpha f(S \cup C^*)$, where $\alpha > 0$ and $C^*$ is the optimal solution. In the monotone case $f(S \cup C^*) \geq f(C^*)$, and we get the desired result. However, this does not hold for non-monotone functions.
But, if $f(S \cup C^*)$ provides a good fraction of the optimal solution, then we can find a near-optimal solution by pruning elements in $S$ using unconstrained maximization. This still retains a feasible set, since the constraints are downward closed. Otherwise, if $f(S \cup C^*) \leq \epsilon\,\mathrm{OPT}$, then running another round of the algorithm on the remainder of the ground set will lead to a good solution.

Backed by the above intuition, we try to build multiple disjoint solutions simultaneously within a single pass over the data. Let INDSTREAM be a single-pass streaming algorithm for monotone submodular maximization under independence systems, with approximation factor $\alpha$. Upon receiving a new element from the stream, INDSTREAM can choose (1) to insert it into its memory, (2) to replace one or a subset of the elements in its memory with it, or otherwise (3) the element gets discarded and cannot be used later by the algorithm. The key insight for our approach is that it is possible to build other solutions from the elements discarded by INDSTREAM. Consider a chain of $q = \lceil 1/\alpha \rceil$ instances of our streaming algorithm, i.e., {INDSTREAM$_1$, ..., INDSTREAM$_q$}. Any element $e$ received from the stream is first passed to INDSTREAM$_1$. If INDSTREAM$_1$ discards $e$, or adds $e$ to its solution and instead discards a set of elements from its memory, then we pass the set $D_1$ of discarded elements on to be processed by INDSTREAM$_2$. Similarly, if a set of elements $D_2$ is discarded by INDSTREAM$_2$, we pass them to INDSTREAM$_3$, and so on. The elements discarded by the last instance INDSTREAM$_q$ are discarded forever.

Theorem 5.1. Let INDSTREAM be a streaming algorithm for monotone submodular maximization under a p-system constraint with approximation guarantee $\alpha$. Algorithm 1 returns a set $S \in I$ with

$f(S) \geq \frac{1-\alpha}{2/\alpha - 1}\,\mathrm{OPT}.$

We make Theorem 5.1 concrete by an example: Chekuri et al. [18] proposed a $1/4p$-approximation algorithm for maximizing a monotone submodular function under a p-matchoid constraint in the streaming setting. Using this algorithm as INDSTREAM in our STREAMING LOCAL SEARCH, we obtain the following result:

Corollary 5.2. With STREAMING GREEDY of [18] as INDSTREAM, STREAMING LOCAL SEARCH yields a solution $S \in I$ with approximation guarantee $(4p-1)/(4p(8p-1))$, using $O(pk\log^2(k)/\epsilon^2)$ memory and $O(pk^2\log^2(k)/\epsilon^2)$ update time per element, where $I$ are the independent sets of the p-matchoid constraint, and $k$ is the size of the largest feasible solution.

5.2 STREAMING LOCAL SEARCH for independence systems and d-knapsacks

Algorithm 2 STREAMING LOCAL SEARCH for Independence Systems and d-Knapsacks
Input: $f : 2^E \to \mathbb{R}_+$, a membership oracle for independence systems $I \subset 2^E$, $d$ knapsack-cost functions $c_i : E \to [0, 1]$, and an upper bound $k$ on the cardinality of the largest feasible solution.
Output: A set $S \subseteq E$ satisfying $S \in I$ and $c_i(S) \leq 1$ for all $i$.
1: for $t = 1$ to $T$ do
2:   $D_0 \leftarrow \{e_t\}$
3:   $m = \max(m, f(e_t))$, $e_m = \arg\max_t f(e_t)$, $\gamma = \frac{2(1-\alpha)m}{2/\alpha + 2d - 1}$
4:   $R_t = \{\gamma, (1+\epsilon)\gamma, (1+\epsilon)^2\gamma, (1+\epsilon)^3\gamma, \dots, \gamma k\}$
5:   for $\rho \in R_t$ in parallel do
6:     for $i = 1$ to $\lceil 1/\alpha \rceil$ do   (LOCAL SEARCH)
7:       $[D_i, S_i] = $ INDSTREAMDENSITY$_i(D_{i-1}, \rho)$   (picks elements only if $f_{S_i}(e) / \sum_{j=1}^{d} c_{je} \geq \rho$)
8:       $S'_i = $ UNCONSTRAINED-MAX$(S_i)$
9:     end for
10:    $S_\rho = \arg\max_i \{f(S_i), f(S'_i), f(\{e_m\})\}$
11:  end for
12:  $S_t = \arg\max_{\rho \in R_t} f(S_\rho)$
13: end for
14: Return $S_t$

To respect multiple knapsack constraints in addition to the p-system, we integrate the idea of a density threshold [23, 24] into our local search algorithm. We use a (fixed) density threshold $\rho$ to restrict the INDSTREAM algorithm to only pick elements if the function value per unit size of the selected elements is above the given threshold. We call this new algorithm INDSTREAMDENSITY. The threshold should be carefully chosen to be below the value/size ratio of the optimal solution. To do so, we need to know (a good approximation to) the value of the optimal solution OPT.
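The chain of instances from Section 5.1, combined with the density test just described, can be sketched for one fixed threshold $\rho$ as follows. This is a simplified illustration under stated assumptions, not the algorithm analyzed in the paper: `DensityStream` is a hypothetical stand-in for INDSTREAMDENSITY that only ever adds elements (it never swaps, unlike STREAMING GREEDY of [18]), `double_greedy` is the deterministic double greedy heuristic used here as a stand-in for UNCONSTRAINED-MAX, the `feasible` oracle is assumed to cover both the independence system and the knapsacks, and the candidate solutions are compared once at the end of the stream rather than after every element.

```python
def double_greedy(f, items):
    """Deterministic double greedy, a stand-in for UNCONSTRAINED-MAX."""
    kept, pool = frozenset(), frozenset(items)
    for u in items:
        gain_add = f(kept | {u}) - f(kept)
        gain_drop = f(pool - {u}) - f(pool)
        if gain_add >= gain_drop:
            kept = kept | {u}
        else:
            pool = pool - {u}
    return kept


class DensityStream:
    """Toy single-pass instance: add an element only if the set stays
    feasible and its marginal gain per unit cost is at least rho."""

    def __init__(self, f, feasible, cost, rho):
        self.f, self.feasible, self.cost, self.rho = f, feasible, cost, rho
        self.solution = frozenset()

    def process(self, element):
        """Return the list of elements this instance discards."""
        candidate = self.solution | {element}
        gain = self.f(candidate) - self.f(self.solution)
        if (self.feasible(candidate) and gain > 0
                and gain >= self.rho * self.cost(element)):
            self.solution = candidate
            return []
        return [element]


def streaming_local_search(stream, f, feasible, cost, rho, num_instances):
    """Pass each element down the chain; whatever instance i discards
    becomes the input of instance i + 1."""
    chain = [DensityStream(f, feasible, cost, rho)
             for _ in range(num_instances)]
    for element in stream:
        discarded = [element]
        for instance in chain:
            passed_on = []
            for e in discarded:
                passed_on.extend(instance.process(e))
            discarded = passed_on
            if not discarded:
                break
    candidates = []
    for instance in chain:
        candidates.append(instance.solution)
        # Pruning keeps feasibility because the constraints are hereditary.
        candidates.append(double_greedy(f, sorted(instance.solution)))
    return max(candidates, key=f)


# Hypothetical toy instance: cut function on a 4-node graph, at most 2 items.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]


def cut_value(subset):
    return sum(1 for u, v in EDGES if (u in subset) != (v in subset))


result = streaming_local_search(
    stream=range(4), f=cut_value, feasible=lambda s: len(s) <= 2,
    cost=lambda e: 0.5, rho=1.0, num_instances=2)
print(sorted(result), cut_value(result))
```

On this toy stream the first instance rejects element 1 because its marginal gain is zero, and the second instance picks it up, which is the behavior the chain is designed for. Running one such chain per threshold value and keeping the best result would correspond to the outer parallel loop of Algorithm 2.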
To obtain a rough estimate of OPT, it suffices to know the maximum value $m = \max_{e \in V} f(e)$ of any singleton element: submodularity implies that $m \leq \mathrm{OPT} \leq km$, where $k$ is an upper bound on the cardinality of the largest feasible solution satisfying all constraints. We update the value of the maximum singleton element on the fly [16], and lazily instantiate the thresholds to $\log(k)/\epsilon$ different possible values, for each of which we run the local search in parallel. We show that for at least one of the discretized density thresholds we obtain a good enough solution.

Theorem 5.3. STREAMING LOCAL SEARCH (outlined in Alg. 2) has an approximation guarantee

$f(S) \geq \frac{(1-\alpha)(1-\epsilon)}{2/\alpha + 2d - 1}\,\mathrm{OPT},$

with update time $O(T\log(k)/(\alpha\epsilon))$ per element, where $k$ is an upper bound on the size of the largest feasible solution, and $T$ is the update time of the INDSTREAM algorithm.

Corollary 5.4. By using STREAMING GREEDY of [18], we get that STREAMING LOCAL SEARCH has an approximation ratio $(1+\epsilon)(4p)(8p+2d-1)/(4p-1)$ with $O(pk\log^2(k)/\epsilon^2)$ memory and update time $O(pk^2\log^2(k)/\epsilon^2)$ per element.

Table 1: Performance of various video summarization methods on YouTube and OVP. Columns, left to right: seqDPP (Linear, N. Nets), FANTOM (Linear, N. Nets), STREAMING LS (Linear, N. Nets).
YouTube  F  57.8±.5 6.3± ±.5 6.3± ± ±.5
YouTube  P  54.2± ±.6 54.±.5 59.± ± ±.6
YouTube  R  69.8± ±.5 7.± ±.5 7.± ±.5
OVP      F  75.5± ± ±.3 78.± ± ±.5
OVP      P  77.5±.5 75.± ±.3 75.± ±.2 7.8±.7
OVP      R  78.4± ± ± ± ± ±.2

6 Experiments

In this section, we apply STREAMING LOCAL SEARCH to video summarization in the streaming setting. The main goal of this section is to validate our theoretical results and demonstrate the effectiveness of STREAMING LOCAL SEARCH in practical scenarios where existing algorithms are incapable of providing desirable solutions. We compare the performance of our streaming algorithm with that of [7], and with the centralized method, FANTOM, for maximizing non-monotone submodular functions under a p-system and d knapsack constraints [15].

Dataset. For our experiments, we use the Open Video Project (OVP) and the YouTube datasets with 50 and 39 videos, respectively [25]. We use the pruned video frames as described in [7], where one frame is uniformly sampled per second, and uninformative frames are removed. Each video frame is then associated with a feature vector that consists of Fisher vectors [26] computed from SIFT features [27], contextual features, and features computed from the frame saliency map [28]. The sizes of the feature vectors $v_i$ are 861 and 1581 for the OVP and YouTube datasets, respectively. The DPP kernel $L$ (cf. Section 4) can be parametrized and learned via maximum likelihood estimation [7].
To compare the performance of our algorithm with the method of [7], we use both a linear transformation, i.e., $L_{ij} = v_i^T W^T W v_j$, and a non-linear transformation using a one-hidden-layer neural network, i.e., $L_{ij} = z_i^T W^T W z_j$ where $z_i = \tanh(U v_i)$, and $\tanh(\cdot)$ stands for the hyperbolic tangent transfer function. The parameters $W$, or $U$ and $W$, are learned on 80% of the videos, selected uniformly at random. Following [7] for evaluation, we treat each of the 5 human-created summaries per video as ground truth for that video.

Sequential DPP. To capture the sequential structure in video data, [7] proposed the sequential DPP. Here, a long video sequence is partitioned into $T$ disjoint yet consecutive short segments, and at time $t \in \{1, \dots, T\}$, a DPP is imposed over two neighboring segments. The conditional distribution of the selected subset from segment $t$ is thus given by

$P(S_t \mid S_{t-1}) = \frac{\det(L_{S_t \cup S_{t-1}})}{\det(I_t + L_{S_{t-1} \cup V_t})},$

where $V_t$ is the set of all video frames in segment $t$, and $I_t$ is a diagonal matrix in which the elements corresponding to $S_{t-1}$ are zeros and the elements corresponding to $V_t$ are 1. Intuitively, the sequential DPP only captures the diversity between the frames in segment $t$ and the selected subset $S_{t-1}$ from the immediate past segment $t-1$. MAP inference for the sequential DPP is as hard as for the standard DPP, but submodular optimization techniques can be used to find approximate solutions. In our experiments, we use the sequential DPP as the utility function in all the algorithms.

Results. Figures 1a, 1g show the ratio of the F-score obtained by STREAMING LOCAL SEARCH and FANTOM vs. the F-score obtained by the method of [7] for varying segment size, using linear embeddings on the YouTube and OVP datasets. It can be observed that our streaming method is able to obtain the same quality of solution as the centralized baselines. Figures 1b, 1h show the speedup of STREAMING LOCAL SEARCH and FANTOM over the method of [7], for varying segment size.
We note that both FANTOM and STREAMING LOCAL SEARCH show an exponential speedup as the segment size increases. Interestingly, STREAMING LOCAL SEARCH is able to obtain a similar quality of summary, but more than 1700 times faster than [7], and more than 20 times faster than FANTOM for larger segment sizes. This makes our streaming method an appealing solution for extracting real-time summaries. Note that in real-world scenarios, video frames are received at a fast pace, and thus we need to use a larger segment size in practice. Moreover, unlike the centralized baselines, which need to first buffer an entire segment and then produce summaries, our method generates real-time summaries after receiving each video frame. This capability is of significant importance in privacy-sensitive applications.

Figure 1: Performance of STREAMING LOCAL SEARCH compared to the other benchmarks. Panels: (a)-(c) YouTube Linear, (d)-(f) YouTube N. Nets, (g)-(i) OVP Linear, (j)-(l) OVP N. Nets. a), g) show the ratio of the F-score obtained by STREAMING LOCAL SEARCH and FANTOM vs. the F-score obtained by the method of [7], using linear embeddings on the YouTube and OVP datasets. d), j) show similar quantities using nonlinear features from a one-hidden-layer neural network. b), e), h), k) show the speedup of STREAMING LOCAL SEARCH and FANTOM over the method of [7]. c), f), i), l) show the utility vs. running time for STREAMING LOCAL SEARCH, FANTOM, and random selection.
Figures 1d, 1j show similar behaviour when using a nonlinear hidden representation, where a one-hidden-layer neural network is used to infer a hidden representation for each frame. It can be seen that while using non-linear representations generally improves the quality of the solution, a similar exponential speedup is achieved by our streaming method (Fig. 1e, 1k). We also compared the utility and running time of our algorithm to those of FANTOM using the original DPP (cf. Section 4) for producing summaries of length 5% of the video length. Again, our method achieved competitive performance with much less running time (Figures 1c, 1f, 1i, 1l). Finally, we compared the performance of our algorithm with that of [7] by reporting the F-score, Precision, and Recall in Table 1, for segment size = 10. We see that STREAMING LOCAL SEARCH is able to produce summaries competitive with the exhaustive search [7] and the centralized baselines.

Figure 2: Summary focused on judges, and singer for YouTube video 6.

Figure 3: Summary produced by method of [7] (top row), vs. STREAMING LOCAL SEARCH (middle row), and a user selected summary (bottom row), for YouTube video 5.

Using constraints to generate customized summaries. In our second experiment, we show how constraints can be applied to generate customized summaries. We apply STREAMING LOCAL SEARCH to YouTube video 6, which is part of the America's Got Talent series. It features a singer and three judges in the judging panel. Here, we produced two sets of summaries using different constraints. The top row in Fig. 2 shows a summary focused on the judges. Here we considered 3 uniform matroid constraints to limit the number of chosen frames containing each of the judges. The limits for all the matroid constraints are set to 3. To produce real-time summaries while receiving the video, we used the Viola-Jones algorithm [29] to detect faces in each video frame, and trained a multiclass support vector machine using histograms of oriented gradients (HOG) to recognize different faces. The bottom row in Fig. 2 shows another summary focused on the singer using one matroid constraint. To further enhance the quality of the summaries, we assigned different weights to the frames based on the probability for each frame to contain objects, using selective search [30].
Assigning a higher cost to the frames with a low probability of containing objects lets us filter out uninformative and blurry frames, and produce a summary closer to the human-created summaries, as shown in Figure 3.

7 Conclusion

We have developed the first streaming algorithm, STREAMING LOCAL SEARCH, for maximizing non-monotone submodular functions subject to a p-system and d knapsack constraints. We have shown its application to video summarization for producing online summaries in the streaming setting. Our experimental results show that our method is able to speed up the summarization task by a factor of more than 1700, while achieving performance similar to the baselines. This makes it an appealing approach for real-time summarization tasks. We note that our method is applicable to any summarization task with a non-monotone submodular utility function. Given the importance of submodular optimization to numerous data mining and machine learning applications, we believe our result is an important step towards providing real-time summaries and prediction.

References

[1] Sebastian Tschiatschek, Rishabh K Iyer, Haochen Wei, and Jeff A Bilmes. Learning mixtures of submodular functions for image collection summarization. In Advances in Neural Information Processing Systems, pages 1413-1421, 2014.
[2] Ian Simon, Noah Snavely, and Steven M Seitz. Scene summarization for online image collections. In Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on, pages 1-8. IEEE, 2007.
[3] Hui Lin and Jeff Bilmes. A class of submodular functions for document summarization. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1. Association for Computational Linguistics, 2011.
[4] Baharan Mirzasoleiman, Amin Karbasi, Rik Sarkar, and Andreas Krause. Distributed submodular maximization: Identifying representative elements in massive data. In Advances in Neural Information Processing Systems, 2013.
[5] Kai Wei, Rishabh Iyer, and Jeff Bilmes. Submodularity in data subset selection and active learning. In Proceedings of the International Conference on Machine Learning (ICML), 2015.
[6] Alex Kulesza, Ben Taskar, et al. Determinantal point processes for machine learning. Foundations and Trends in Machine Learning, 5(2-3):123-286, 2012.
[7] Boqing Gong, Wei-Lun Chao, Kristen Grauman, and Fei Sha. Diverse sequential subset selection for supervised video summarization. In Advances in Neural Information Processing Systems, 2014.
[8] Jennifer Gillenwater, Alex Kulesza, and Ben Taskar. Discovering diverse and salient threads in document collections. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics, 2012.
[9] Chong-Wah Ngo, Yu-Fei Ma, and Hong-Jiang Zhang. Automatic video summarization by graph modeling. In Computer Vision, 2003. Proceedings. Ninth IEEE International Conference on, pages 104-109. IEEE, 2003.
[10] Tiecheng Liu and John Kender. Optimization algorithms for the selection of key frame sequences of variable length. Computer Vision - ECCV 2002.
[11] Yong Jae Lee, Joydeep Ghosh, and Kristen Grauman. Discovering important people and objects for egocentric video summarization. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012.
[12] George L Nemhauser, Laurence A Wolsey, and Marshall L Fisher. An analysis of approximations for maximizing submodular set functions I. Mathematical Programming, 14(1), 1978.
[13] Moran Feldman, Christopher Harshaw, and Amin Karbasi. Greed is good: Near-optimal submodular maximization via greedy optimization. arXiv preprint arXiv:1704.01652, 2017.
[14] Jon Lee, Vahab S Mirrokni, Viswanath Nagarajan, and Maxim Sviridenko. Non-monotone submodular maximization under matroid and knapsack constraints. In Proceedings of the Forty-First Annual ACM Symposium on Theory of Computing. ACM, 2009.
[15] Baharan Mirzasoleiman, Ashwinkumar Badanidiyuru, and Amin Karbasi. Fast constrained submodular maximization: Personalized data summarization. In ICML'16: Proceedings of the 33rd International Conference on Machine Learning (ICML), 2016.
[16] Ashwinkumar Badanidiyuru, Baharan Mirzasoleiman, Amin Karbasi, and Andreas Krause. Streaming submodular maximization: Massive data summarization on the fly. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2014.
[17] Amit Chakrabarti and Sagar Kale. Submodular maximization meets streaming: Matchings, matroids, and more. Mathematical Programming, 154(1-2), 2015.
[18] Chandra Chekuri, Shalmoli Gupta, and Kent Quanrud. Streaming algorithms for submodular function maximization. In International Colloquium on Automata, Languages, and Programming. Springer, 2015.
[19] Odile Macchi. The coincidence approach to stochastic point processes. Advances in Applied Probability, 7(1):83-122, 1975.
[20] Chun-Wa Ko, Jon Lee, and Maurice Queyranne. An exact algorithm for maximum entropy sampling. Operations Research, 43(4):684-691, 1995.
[21] Uriel Feige, Vahab S Mirrokni, and Jan Vondrák. Maximizing non-monotone submodular functions. SIAM Journal on Computing, 40(4):1133-1153, 2011.
[22] Anupam Gupta, Aaron Roth, Grant Schoenebeck, and Kunal Talwar. Constrained non-monotone submodular maximization: Offline and secretary algorithms. In International Workshop on Internet and Network Economics. Springer, 2010.
[23] Ashwinkumar Badanidiyuru and Jan Vondrák. Fast algorithms for maximizing submodular functions. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics, 2014.
[24] Maxim Sviridenko. A note on maximizing a submodular set function subject to a knapsack constraint. Operations Research Letters, 32(1):41-43, 2004.
[25] Sandra Eliza Fontes De Avila, Ana Paula Brandão Lopes, Antonio da Luz, and Arnaldo de Albuquerque Araújo. VSUMM: A mechanism designed to produce static video summaries and a novel evaluation method. Pattern Recognition Letters, 32(1):56-68, 2011.
[26] Florent Perronnin and Christopher Dance. Fisher kernels on visual vocabularies for image categorization. In Computer Vision and Pattern Recognition, 2007. CVPR'07. IEEE Conference on, pages 1-8. IEEE, 2007.
[27] David G Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91-110, 2004.
[28] Esa Rahtu, Juho Kannala, Mikko Salo, and Janne Heikkilä. Segmenting salient objects from images and videos. Computer Vision - ECCV 2010, 2010.
[29] Paul Viola and Michael J Jones. Robust real-time face detection. International Journal of Computer Vision, 57(2):137-154, 2004.
[30] Jasper RR Uijlings, Koen EA Van De Sande, Theo Gevers, and Arnold WM Smeulders. Selective search for object recognition. International Journal of Computer Vision, 104(2):154-171, 2013.
[31] Niv Buchbinder, Moran Feldman, Joseph Seffi, and Roy Schwartz. A tight linear time (1/2)-approximation for unconstrained submodular maximization. SIAM Journal on Computing, 44(5):1384-1412, 2015.

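Supplementary Materials

A Analysis of STREAMING LOCAL SEARCH

Proof of Theorem 5.1

Proof. For each $i \in [1, 1/\alpha]$, by assumption we have
\[ f(S_i) \ge \alpha f(S_i \cup C_i), \tag{2} \]
where $C_i = C^* \cap X_i$ for all $i \in [1, 1/\alpha]$, and $X_i$ is the subset of elements processed by INDSTREAM$_i$. Therefore, $C_1 = C^*$. Also, for each $i$, using the tight $1/2$-approximation algorithm for unconstrained maximization from [31], we get
\[ f(S'_i) \ge \frac{1}{2} f(S_i \cap C_i). \tag{3} \]
Now, via an argument similar to the one used in [15], we show by induction that for all $t \in [2, 1/\alpha]$ we have
\[ \sum_{i=1}^{t} \big[ f(S_i \cup C_i) + (t-i) f(S_i \cap C_i) \big] \ge (t-1) f(C^*) + f\Big(\bigcup_{i=1}^{t} S_i \cup C^*\Big). \tag{4} \]
Using submodularity, for the base case $t = 2$ we get
\begin{align*}
f(S_1 \cup C_1) + f(S_2 \cup C_2) + f(S_1 \cap C_1)
&\ge f(S_1 \cup S_2 \cup C_1) + f(C_2) + f(S_1 \cap C_1) \\
&\ge f(S_1 \cup S_2 \cup C^*) + f(C^*).
\end{align*}
Now, we prove the inductive case:
\begin{align*}
\sum_{i=1}^{t} \big[ f(S_i \cup C_i) + (t-i) f(S_i \cap C_i) \big]
&= \sum_{i=1}^{t-1} \big[ f(S_i \cup C_i) + (t-1-i) f(S_i \cap C_i) \big] + f(S_t \cup C_t) + \sum_{i=1}^{t-1} f(S_i \cap C_i) \\
&\ge (t-2) f(C^*) + f\Big(\bigcup_{i=1}^{t-1} S_i \cup C^*\Big) + f(S_t \cup C_t) + \sum_{i=1}^{t-1} f(S_i \cap C^*) \\
&\ge (t-2) f(C^*) + f\Big(\bigcup_{i=1}^{t} S_i \cup C^*\Big) + f(C_t) + \sum_{i=1}^{t-1} f(S_i \cap C^*) \\
&\ge (t-2) f(C^*) + f\Big(\bigcup_{i=1}^{t} S_i \cup C^*\Big) + f(C^*) \\
&= (t-1) f(C^*) + f\Big(\bigcup_{i=1}^{t} S_i \cup C^*\Big). \tag{5}
\end{align*}
The first inequality follows from Eq. 4 applied to $t-1$ (the induction hypothesis), and the last two inequalities follow from submodularity. Here $S_i \cap C^* = S_i \cap C_i$, since $S_i \subseteq X_i$.

Multiplying Eq. 2 by $1/\alpha$ and Eq. 3 by $2(1/\alpha - i)$, and using Eq. 5 together with the non-negativity of $f$, we get
\[ \sum_{i=1}^{1/\alpha} \Big[ \frac{1}{\alpha} f(S_i) + 2\Big(\frac{1}{\alpha} - i\Big) f(S'_i) \Big] \ge \sum_{i=1}^{1/\alpha} \Big[ f(S_i \cup C_i) + \Big(\frac{1}{\alpha} - i\Big) f(S_i \cap C^*) \Big] \ge \Big(\frac{1}{\alpha} - 1\Big) f(C^*). \]
Taking a maximum over the terms on the left-hand side, we get the following inequality:
\[ \Big[ \frac{1}{\alpha^2} + 2 \sum_{i=1}^{1/\alpha} \Big(\frac{1}{\alpha} - i\Big) \Big] \max_i \max\big(f(S_i), f(S'_i)\big) \ge \Big(\frac{1}{\alpha} - 1\Big) f(C^*). \]
Hence,
\[ \max_i \max\big(f(S_i), f(S'_i)\big) \ge \frac{\alpha(1-\alpha)}{2-\alpha} f(C^*). \]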
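The last simplification in the proof of Theorem 5.1 can be checked with exact rational arithmetic. The sketch below is only an illustration: it assumes that $1/\alpha$ is an integer $r$ (the number of INDSTREAM instances), and the function names are hypothetical helpers, not part of the paper.

```python
from fractions import Fraction


def averaging_ratio(r: int) -> Fraction:
    """Ratio implied by the averaging step, assuming alpha = 1/r with integer r."""
    alpha = Fraction(1, r)
    # Coefficient of max_i max(f(S_i), f(S'_i)): 1/alpha^2 + 2 * sum_i (1/alpha - i).
    lhs_coeff = Fraction(r) / alpha + 2 * sum(r - i for i in range(1, r + 1))
    # Coefficient of f(C*) on the right-hand side: 1/alpha - 1.
    rhs_coeff = Fraction(r - 1)
    return rhs_coeff / lhs_coeff


def closed_form(alpha: Fraction) -> Fraction:
    """Closed form alpha * (1 - alpha) / (2 - alpha) stated at the end of the proof."""
    return alpha * (1 - alpha) / (2 - alpha)


# The two expressions agree for every integer r = 1/alpha that we try.
for r in range(2, 50):
    assert averaging_ratio(r) == closed_form(Fraction(1, r))
```

For example, with $\alpha = 1/4$ the bracketed coefficient is $16 + 12 = 28$ and the right-hand coefficient is $3$, so both expressions give $3/28$.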

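Proof of Theorem 5.3

Proof. Consider an optimal solution $C^*$ and set
\[ \rho^* = \frac{2(1-\alpha)}{2/\alpha + 2d - 1} f(C^*). \]
By submodularity we know that $m \le f(C^*) \le mk$, where $k$ is an upper bound on the cardinality of the largest feasible solution, and $m$ is the maximum value of any singleton element. Hence
\[ \frac{2m(1-\alpha)}{2/\alpha + 2d - 1} \le \rho^* \le \frac{2mk(1-\alpha)}{2/\alpha + 2d - 1}. \tag{6} \]
Thus there is a run of the algorithm with density threshold $\rho \in R$ such that $\rho \le \rho^* \le (1+\varepsilon)\rho$. For the run of the algorithm corresponding to $\rho$, we call the solution of the first instance of INDSTREAMDENSITY $S_\rho$.

If INDSTREAMDENSITY terminates by exceeding some knapsack capacity, we know that for one of the knapsacks $i \in [d]$ we have $c_i(S_\rho) > 1$, and hence also $\sum_{i=1}^{d} c_i(S_\rho) > 1$. On the other hand, the extra density threshold we used for selecting the elements tells us that for any $j \in S_\rho$, we have
\[ \frac{f_{S_\rho}(j)}{\sum_{i=1}^{d} c_{ij}} \ge \rho, \]
i.e., the marginal gain of every element added to the solution $S_\rho$ was greater than or equal to $\rho \sum_{i=1}^{d} c_{ij}$. Therefore, we get that
\[ f(S_\rho) \ge \sum_{j \in S_\rho} \Big( \rho \sum_{i=1}^{d} c_{ij} \Big) > \rho. \]
Note that $S_\rho$ is not a feasible solution, as it exceeds the $i$-th knapsack capacity. However, the solution before adding the last element $j$ to $S_\rho$, i.e. $T_\rho = S_\rho \setminus \{j\}$, and the last element itself are both feasible solutions, and by submodularity, the better of the two has value at least
\[ \max\{ f(T_\rho), f(\{j\}) \} \ge \frac{\rho}{2} \ge \frac{1-\alpha}{(1+\varepsilon)(2/\alpha + 2d - 1)} f(C^*). \]

On the other hand, if INDSTREAMDENSITY terminates without exceeding any knapsack capacity, we divide $C^*$ into two sets. Let $C^*_{<\rho}$ be the set of elements from $C^*$ which cannot be added because their density is below the threshold, i.e., $f_{S_\rho}(j) / \sum_{i=1}^{d} c_{ij} < \rho$, and let $C^*_{\ge\rho}$ be the set of elements from $C^*$ which cannot be added due to the independence system constraints. Then
\[ f_{S_\rho}(C^*_{<\rho}) \le \sum_{e \in C^*_{<\rho}} \rho \sum_{i=1}^{d} c_{ie} = \rho \sum_{i=1}^{d} \sum_{e \in C^*_{<\rho}} c_{ie} \le d\rho \le d\rho^* = \frac{2d(1-\alpha)}{2/\alpha + 2d - 1} f(C^*). \tag{7} \]
On the other hand, since $S_\rho$ is a feasible solution for INDSTREAM without any density threshold, from Eq. 2 we know that $f(S_\rho) \ge \alpha f(S_\rho \cup C^*_{\ge\rho})$, and thus we obtain
\[ f_{S_\rho}(C^*_{\ge\rho}) = f(S_\rho \cup C^*_{\ge\rho}) - f(S_\rho) \le \Big( \frac{1}{\alpha} - 1 \Big) f(S_\rho) = \frac{1-\alpha}{\alpha} f(S_\rho). \tag{8} \]
Adding Eq. 7 and Eq. 8, and using submodularity, we get
\[ f(S_\rho \cup C^*) - f(S_\rho) \le f_{S_\rho}(C^*_{<\rho}) + f_{S_\rho}(C^*_{\ge\rho}) \le \frac{1-\alpha}{\alpha} f(S_\rho) + d\rho. \]
Therefore,
\[ f(S_\rho) \ge \alpha f(S_\rho \cup C^*) - \alpha d \rho. \]
Using an induction similar to the one in the proof of Theorem 5.1, we show that the solution of one of the INDSTREAM$_i$ instances has the desired approximation guarantee. We multiply the last inequality by $1/\alpha$ and Eq. 3 by $2(1/\alpha - i)$ and, using Eq. 5, get
\[ \sum_{i=1}^{1/\alpha} \Big[ \frac{1}{\alpha} f(S_i) + 2\Big(\frac{1}{\alpha} - i\Big) f(S'_i) \Big] \ge \sum_{i=1}^{1/\alpha} \Big[ f(S_i \cup C_i) - d\rho + \Big(\frac{1}{\alpha} - i\Big) f(S_i \cap C^*) \Big] \ge \Big(\frac{1}{\alpha} - 1\Big) f(C^*) - \frac{d\rho}{\alpha}. \]
Taking a maximum over the terms on the left-hand side, we get the following inequality:
\[ \Big[ \frac{1}{\alpha^2} + 2 \sum_{i=1}^{1/\alpha} \Big(\frac{1}{\alpha} - i\Big) \Big] \max_i \max\big(f(S_i), f(S'_i)\big) \ge \Big(\frac{1}{\alpha} - 1\Big) f(C^*) - \frac{d\rho}{\alpha}. \]
Hence,
\[ \max_i \max\big(f(S_i), f(S'_i)\big) \ge \frac{\alpha(1-\alpha)}{2-\alpha} f(C^*) - \frac{\alpha d \rho}{2-\alpha}. \]
Substituting the bound $\rho \le \rho^*$ and the value of $\rho^*$ from Eq. 6, we get the desired result:
\begin{align*}
\max_i \max\big(f(S_i), f(S'_i)\big)
&\ge \frac{\alpha(1-\alpha)}{2-\alpha} f(C^*) - \frac{2\alpha d(1-\alpha)}{(2-\alpha)(2/\alpha + 2d - 1)} f(C^*) \\
&= \frac{1-\alpha}{2/\alpha + 2d - 1} f(C^*) \ge \frac{1-\alpha}{(1+\varepsilon)(2/\alpha + 2d - 1)} f(C^*).
\end{align*}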
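The final substitution in the proof of Theorem 5.3 can be checked in the same way. The sketch below uses exact fractions to confirm that the independence-system term minus the knapsack penalty collapses to $(1-\alpha)/(2/\alpha + 2d - 1)$. It also evaluates that expression for $\alpha = 1/(4p)$, which is an assumption made here for illustration; under it, the result appears consistent with the constant quoted in the abstract. All function names are hypothetical.

```python
from fractions import Fraction


def rho_star_coeff(alpha: Fraction, d: int) -> Fraction:
    """Coefficient of f(C*) in the definition of rho*."""
    return 2 * (1 - alpha) / (2 / alpha + 2 * d - 1)


def substituted_bound(alpha: Fraction, d: int) -> Fraction:
    """Bound obtained after replacing rho by rho* in the last inequality."""
    independence_term = alpha * (1 - alpha) / (2 - alpha)
    knapsack_penalty = alpha * d * rho_star_coeff(alpha, d) / (2 - alpha)
    return independence_term - knapsack_penalty


def simplified_bound(alpha: Fraction, d: int) -> Fraction:
    """Simplified form (1 - alpha) / (2/alpha + 2d - 1)."""
    return (1 - alpha) / (2 / alpha + 2 * d - 1)


def ratio_for_p_system(p: int, d: int) -> Fraction:
    """Simplified bound under the illustrative assumption alpha = 1/(4p)."""
    return simplified_bound(Fraction(1, 4 * p), d)


# Both forms agree, and the overflow case (rho*/2) yields the same coefficient.
for p in range(1, 6):
    for d in range(0, 6):
        a = Fraction(1, 4 * p)
        assert substituted_bound(a, d) == simplified_bound(a, d)
        assert rho_star_coeff(a, d) / 2 == simplified_bound(a, d)
        assert ratio_for_p_system(p, d) == Fraction(4 * p - 1, 4 * p * (8 * p + 2 * d - 1))
```

The second assertion shows why $\rho^*$ is chosen as it is: the bound from the knapsack-overflow case, $\rho^*/2$, and the bound from the no-overflow case coincide, so neither case dominates (up to the $1+\varepsilon$ factor from the threshold grid).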


Unlabeled equivalence for matroids representable over finite fields Unlabeled equivalence for matroids representable over finite fields November 16, 2012 S. R. Kingan Department of Mathematics Brooklyn College, City University of New York 2900 Bedford Avenue Brooklyn,

More information

Prices and Auctions in Markets with Complex Constraints

Prices and Auctions in Markets with Complex Constraints Conference on Frontiers of Economics and Computer Science Becker-Friedman Institute Prices and Auctions in Markets with Complex Constraints Paul Milgrom Stanford University & Auctionomics August 2016 1

More information

Convex Optimization MLSS 2015

Convex Optimization MLSS 2015 Convex Optimization MLSS 2015 Constantine Caramanis The University of Texas at Austin The Optimization Problem minimize : f (x) subject to : x X. The Optimization Problem minimize : f (x) subject to :

More information

Treewidth and graph minors

Treewidth and graph minors Treewidth and graph minors Lectures 9 and 10, December 29, 2011, January 5, 2012 We shall touch upon the theory of Graph Minors by Robertson and Seymour. This theory gives a very general condition under

More information

A Provably Good Approximation Algorithm for Rectangle Escape Problem with Application to PCB Routing

A Provably Good Approximation Algorithm for Rectangle Escape Problem with Application to PCB Routing A Provably Good Approximation Algorithm for Rectangle Escape Problem with Application to PCB Routing Qiang Ma Hui Kong Martin D. F. Wong Evangeline F. Y. Young Department of Electrical and Computer Engineering,

More information

Mathematical and Algorithmic Foundations Linear Programming and Matchings

Mathematical and Algorithmic Foundations Linear Programming and Matchings Adavnced Algorithms Lectures Mathematical and Algorithmic Foundations Linear Programming and Matchings Paul G. Spirakis Department of Computer Science University of Patras and Liverpool Paul G. Spirakis

More information

Introduction to Approximation Algorithms

Introduction to Approximation Algorithms Introduction to Approximation Algorithms Subir Kumar Ghosh School of Technology & Computer Science Tata Institute of Fundamental Research Mumbai 400005, India ghosh@tifr.res.in Overview 1. Background 2.

More information

Explore Co-clustering on Job Applications. Qingyun Wan SUNet ID:qywan

Explore Co-clustering on Job Applications. Qingyun Wan SUNet ID:qywan Explore Co-clustering on Job Applications Qingyun Wan SUNet ID:qywan 1 Introduction In the job marketplace, the supply side represents the job postings posted by job posters and the demand side presents

More information

Efficient Sequential Algorithms, Comp309. Problems. Part 1: Algorithmic Paradigms

Efficient Sequential Algorithms, Comp309. Problems. Part 1: Algorithmic Paradigms Efficient Sequential Algorithms, Comp309 Part 1: Algorithmic Paradigms University of Liverpool References: T. H. Cormen, C. E. Leiserson, R. L. Rivest Introduction to Algorithms, Second Edition. MIT Press

More information

Noisy Submodular Maximization via Adaptive Sampling with Applications to Crowdsourced Image Collection Summarization

Noisy Submodular Maximization via Adaptive Sampling with Applications to Crowdsourced Image Collection Summarization Noisy Submodular Maximization via Adaptive Sampling with Applications to Crowdsourced Image Collection Summarization Adish Singla ETH Zurich adish.singla@inf.ethz.ch Sebastian Tschiatschek ETH Zurich sebastian.tschiatschek@inf.ethz.ch

More information

A Hybrid Recursive Multi-Way Number Partitioning Algorithm

A Hybrid Recursive Multi-Way Number Partitioning Algorithm Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence A Hybrid Recursive Multi-Way Number Partitioning Algorithm Richard E. Korf Computer Science Department University

More information

Comp Online Algorithms

Comp Online Algorithms Comp 7720 - Online Algorithms Notes 4: Bin Packing Shahin Kamalli University of Manitoba - Fall 208 December, 208 Introduction Bin packing is one of the fundamental problems in theory of computer science.

More information

Integer Programming ISE 418. Lecture 7. Dr. Ted Ralphs

Integer Programming ISE 418. Lecture 7. Dr. Ted Ralphs Integer Programming ISE 418 Lecture 7 Dr. Ted Ralphs ISE 418 Lecture 7 1 Reading for This Lecture Nemhauser and Wolsey Sections II.3.1, II.3.6, II.4.1, II.4.2, II.5.4 Wolsey Chapter 7 CCZ Chapter 1 Constraint

More information

Lecture 2. 1 Introduction. 2 The Set Cover Problem. COMPSCI 632: Approximation Algorithms August 30, 2017

Lecture 2. 1 Introduction. 2 The Set Cover Problem. COMPSCI 632: Approximation Algorithms August 30, 2017 COMPSCI 632: Approximation Algorithms August 30, 2017 Lecturer: Debmalya Panigrahi Lecture 2 Scribe: Nat Kell 1 Introduction In this lecture, we examine a variety of problems for which we give greedy approximation

More information

Progress Towards the Total Domination Game 3 4 -Conjecture

Progress Towards the Total Domination Game 3 4 -Conjecture Progress Towards the Total Domination Game 3 4 -Conjecture 1 Michael A. Henning and 2 Douglas F. Rall 1 Department of Pure and Applied Mathematics University of Johannesburg Auckland Park, 2006 South Africa

More information

Simplified clustering algorithms for RFID networks

Simplified clustering algorithms for RFID networks Simplified clustering algorithms for FID networks Vinay Deolalikar, Malena Mesarina, John ecker, Salil Pradhan HP Laboratories Palo Alto HPL-2005-163 September 16, 2005* clustering, FID, sensors The problem

More information

FOUR EDGE-INDEPENDENT SPANNING TREES 1

FOUR EDGE-INDEPENDENT SPANNING TREES 1 FOUR EDGE-INDEPENDENT SPANNING TREES 1 Alexander Hoyer and Robin Thomas School of Mathematics Georgia Institute of Technology Atlanta, Georgia 30332-0160, USA ABSTRACT We prove an ear-decomposition theorem

More information

Chapter 15 Introduction to Linear Programming

Chapter 15 Introduction to Linear Programming Chapter 15 Introduction to Linear Programming An Introduction to Optimization Spring, 2015 Wei-Ta Chu 1 Brief History of Linear Programming The goal of linear programming is to determine the values of

More information

Scheduling Unsplittable Flows Using Parallel Switches

Scheduling Unsplittable Flows Using Parallel Switches Scheduling Unsplittable Flows Using Parallel Switches Saad Mneimneh, Kai-Yeung Siu Massachusetts Institute of Technology 77 Massachusetts Avenue Room -07, Cambridge, MA 039 Abstract We address the problem

More information

Video annotation based on adaptive annular spatial partition scheme

Video annotation based on adaptive annular spatial partition scheme Video annotation based on adaptive annular spatial partition scheme Guiguang Ding a), Lu Zhang, and Xiaoxu Li Key Laboratory for Information System Security, Ministry of Education, Tsinghua National Laboratory

More information