Approximate Integration of Streaming data
1 Approximate Integration of Streaming Data
Michel de Rougemont, Guillaume Vimont
University Paris II & IRIF
2 Plan
1. Approximation for data warehouses
   - Boolean queries
   - Analytic queries
2. Streaming data warehouses
   - Reservoir sampling
   - Community in graphs via uniform sampling
   - A good approximation for some random graphs
3. Data integration for streams
   - Compress streams with «good representations» with high probability
   - Define the integration from these compressed forms
   - The «value» of the data increases with the integration
3 1. OLAP Queries for a Data Warehouse
OLAP queries (analytic queries): filter, dimensions, measure, aggregation
- Dimension: Channel
- Measure: Sentiment Analysis
- Aggregation: Sum
Data warehouse: Tweets with Sentiment Analysis (measure in [0, 9])
4 Approximation of OLAP queries
First result: PBS 200 (2/5), CNN 300 (3/5)
Second result: PBS 225, CNN 275
The result is a distribution $\mu$.
Distance between two distributions: $L_1$ distance, $|\mu - \hat{\mu}|_1 = 0.1$, i.e. 10%
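The 10% figure on this slide can be checked by hand. The sketch below assumes the first pair of counts (PBS 200, CNN 300) is the exact answer and the second pair (PBS 225, CNN 275) is the approximate one; the helper names `normalize` and `l1_distance` are illustrative, not from the slides.

```python
def normalize(counts):
    """Turn raw counts per dimension value into a probability distribution."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}


def l1_distance(mu, mu_hat):
    """L1 distance between two distributions given as dicts key -> probability."""
    keys = set(mu) | set(mu_hat)
    return sum(abs(mu.get(k, 0.0) - mu_hat.get(k, 0.0)) for k in keys)


# Assumed reading of the slide: first table exact, second table approximate.
exact = normalize({"PBS": 200, "CNN": 300})    # (0.4, 0.6)
approx = normalize({"PBS": 225, "CNN": 275})   # (0.45, 0.55)

distance = l1_distance(exact, approx)
print(round(distance, 3))  # 0.1
```

Each channel is off by 0.05, so the two errors add up to 0.1.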
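5 Uniform samples, Weighted samples
Take N uniform samples:
$\mathrm{Prob}_\Omega[\, |\mu - \hat{\mu}| < \varepsilon \,] > 1 - \delta$ when $N = O(\log(1/\delta) \cdot c / \varepsilon^2)$
Example: $\mathrm{Prob}_\Omega[\, |\mu - \hat{\mu}| < 0.1 \,] > 0.9$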
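To get a feel for the bound, here is a minimal sketch that evaluates the sample size for given ε and δ and then estimates a distribution from that many uniform samples. The slide does not give the constant c, so `c=1.0` is purely an assumption, and the O(.) is treated as an equality for illustration. Sampling is done with replacement, and the population reuses the PBS/CNN counts from the previous slide.

```python
import math
import random
from collections import Counter


def sample_size(eps, delta, c=1.0):
    """Evaluate N = c * log(1/delta) / eps^2 (c is an assumed constant)."""
    return math.ceil(c * math.log(1.0 / delta) / eps ** 2)


def estimate_distribution(population, n, rng):
    """Draw n uniform samples (with replacement) and return their frequencies."""
    counts = Counter(rng.choice(population) for _ in range(n))
    return {key: value / n for key, value in counts.items()}


population = ["PBS"] * 200 + ["CNN"] * 300
n = sample_size(eps=0.1, delta=0.1)
mu_hat = estimate_distribution(population, n, random.Random(0))
print(n, mu_hat)
```

With ε = 0.1 and δ = 0.1 the formula gives a few hundred samples, independent of the size of the population.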
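6 Approximation of Queries
Assume a large $G$ and a small random subgraph $\hat{G}$. Approximate $Q$ on $G$ by $\hat{Q}$ on $\hat{G}$:
1. Approximate an OLAP query $Q$ (distribution): $\mathrm{Prob}_\Omega[\, |Q_G - \hat{Q}_{\hat{G}}| < \varepsilon \,] > 1 - \delta$
2. Approximate a graph property $Q()$: $\mathrm{Prob}_\Omega[\, Q()_G = \hat{Q}()_{\hat{G}} \,] > 1 - \delta$
3. Approximate a graph search property $Q(x)$: $\mathrm{Prob}_\Omega[\, Q(x)_G \cap \hat{Q}(x)_{\hat{G}} \neq \emptyset \,] > 1 - \delta$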
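7 Refinement of Approximation
Approximate $Q$ on $G$ by $\hat{Q}$ on $\hat{G}$:
Strict approximation of a graph property $Q()$: $\mathrm{Prob}_\Omega[\, Q()_G = \hat{Q}()_{\hat{G}} \,] > 1 - \delta$
Property tester, $(\varepsilon, \delta)$-approximation:
- If $Q$ is true on $G$, then $\hat{Q}$ is true on $\hat{G}$.
- If $Q$ is $\varepsilon$-far from $G$, then $\mathrm{Prob}_\Omega[\, \hat{Q}()_{\hat{G}} \text{ false} \,] > 1 - \delta$.
The size of $\hat{G}$ depends only on $(\varepsilon, \delta)$.
Note: $Q$ is $\varepsilon$-far from $G$ if $\mathrm{dist}(Q, G) > \varepsilon$.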
8 2. Streaming data
Assume a stream of tuples of $G$: $t_1, t_2, \ldots, t_n$.
We cannot store all the tuples: keep a subset $\hat{G}$ of size $N$.
Tool: reservoir sampling to approximate an OLAP query $Q$.
Reservoir sampling keeps k edges, either with a uniform distribution or with a weighted distribution.
Theorem 1: If $N > O(\log(1/\delta) \cdot c / \varepsilon^2)$, then $\mathrm{Prob}_\Omega[\, |Q_G - \hat{Q}_{\hat{G}}| < \varepsilon \,] > 1 - \delta$
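For the uniform case, reservoir sampling can be sketched as follows. This is the classical one-pass scheme (often called Algorithm R), shown as an illustration rather than as the authors' implementation. The weighted variant mentioned on the slide is not covered, and the function name and the `rng` parameter are assumptions.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep k items of a stream so that each item seen so far is equally
    likely to be in the reservoir. Uses O(k) memory and one pass."""
    if rng is None:
        rng = random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(range(10_000), k=100, rng=random.Random(42))
print(len(sample))
```

The reservoir never holds more than k items, whatever the stream length, which is what makes it usable when the tuples cannot all be stored.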
9 Streaming Graph edges
Assume a stream of edges of $G$: $e_1, e_2, \ldots, e_n$.
We cannot store all the edges: keep a subgraph $\hat{G}$ and the «most recent» $\hat{G}_t$.
Tools: a reservoir for $\hat{G}$ and a window reservoir for $\hat{G}_t$.
Store: all the nodes (MySQL) and only the edges in the reservoirs.
Complexity: avoid storing $O(n^2)$ edges.
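The slide does not say how the window reservoir is built. One simple way to keep a sample of «most recent» edges is a tumbling window, where the reservoir is reset every `window` edges. This is a deliberately simplified stand-in rather than the authors' construction; the class name and its parameters are hypothetical.

```python
import random


class TumblingWindowReservoir:
    """Uniform reservoir over the current block of `window` edges.
    The reservoir is emptied whenever a new block starts."""

    def __init__(self, k, window, rng=None):
        self.k = k
        self.window = window
        self.rng = rng if rng is not None else random.Random()
        self.seen_in_window = 0
        self.reservoir = []

    def add(self, edge):
        if self.seen_in_window == self.window:
            # A new block starts: forget the old sample.
            self.reservoir = []
            self.seen_in_window = 0
        if len(self.reservoir) < self.k:
            self.reservoir.append(edge)
        else:
            # Standard reservoir step, restricted to the current block.
            j = self.rng.randint(0, self.seen_in_window)
            if j < self.k:
                self.reservoir[j] = edge
        self.seen_in_window += 1


recent = TumblingWindowReservoir(k=10, window=100, rng=random.Random(1))
for i in range(250):
    recent.add((i, i + 1))
print(recent.reservoir)
```

After 250 edges with a window of 100, the sample only contains edges from the third block (index 200 and above). A true sliding window needs a more careful scheme than this.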
10 Examples of Queries
Assume a stream of Twitter edges of $G$: $e_1, e_2, \ldots, e_n$.
We cannot store all the edges: keep a subgraph $\hat{G}$.
Boolean queries:
- Q() :- is there one community (dense subgraph)?
- Q() :- are there k disjoint communities?
Search queries:
- Q(x) :- x is in a community (largest dense subgraph $(V, E)$ such that $|E| > \alpha \cdot |V|^2$)
Analytic queries:
- Q :- the distribution of the sizes of the communities
11 Streaming Graph edges
Assume a stream of edges of $G$: $e_1, e_2, \ldots, e_n$.
We cannot store all the edges: keep a subgraph $\hat{G}$ and the «most recent» $\hat{G}_t$.
Search for a community $Q(x)$.
Algorithm: keep the large connected components in $\hat{G}$: $\hat{Q}(x)_{\hat{G}}$
Goal: $\mathrm{Prob}_\Omega[\, Q(x)_G \cap \hat{Q}(x)_{\hat{G}} \neq \emptyset \,] > 1 - \delta$
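The algorithm on this slide (keep the large connected components of the sampled edges) can be sketched with a union-find structure. The threshold `min_size` that decides what counts as «large» is not given in the slides and is an assumption, as are all function names below.

```python
from collections import defaultdict


def connected_components(edges):
    """Return the connected components (as sets of nodes) of an edge list."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v

    groups = defaultdict(set)
    for node in parent:
        groups[find(node)].add(node)
    return list(groups.values())


def large_components(edges, min_size):
    """Components with at least min_size nodes (min_size is an assumed knob)."""
    return [c for c in connected_components(edges) if len(c) >= min_size]


def in_community(x, edges, min_size):
    """Approximate search query: is x in some large component of the sample?"""
    return any(x in c for c in large_components(edges, min_size))


reservoir_edges = [(1, 2), (2, 3), (3, 1), (4, 5)]
print(large_components(reservoir_edges, min_size=3))
```

Here `reservoir_edges` stands for the edges currently held in the reservoir; only the triangle on nodes 1, 2, 3 is large enough to be reported.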
12 Uniform Sampling: nodes vs. edges
[Figure panels: Nodes, Edges]
Edges witness the concentration. Equivalently, nodes could be chosen according to their degree distribution.
13 Random graphs
Erdős-Rényi: G(n,p)
Preferential Attachment: PA(n,m)
Degree distribution, power law: $\mathrm{Prob}_\Omega[\, \mathrm{degree}(x) = i \,] = c / i^2$
Example: [15,6,4,3,3,2] as histogram
Concentration property: O( m/2) nodes of high degrees
S is concentrated if, for every v in S, the majority of the edges of v are in S. Then S is a community.
We can build a graph with several communities of different sizes.
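The concentration condition can be written as a small check. This sketch reads «Majority» as a strict majority of the edges incident to v having their other endpoint in S, and treats a node with no edges as not satisfying the condition; both choices are assumptions, as is the function name.

```python
from collections import defaultdict


def is_concentrated(nodes, edges):
    """True if every node of S has a strict majority of its edges inside S.
    Assumes an undirected simple graph given as a list of (u, v) pairs."""
    s = set(nodes)
    total = defaultdict(int)   # number of edges incident to each node of S
    inside = defaultdict(int)  # those whose other endpoint is also in S
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            if a in s:
                total[a] += 1
                if b in s:
                    inside[a] += 1
    return all(2 * inside[v] > total[v] for v in s)


edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(is_concentrated({1, 2, 3}, edges))  # True
print(is_concentrated({3, 4}, edges))     # False
```

In the example, node 3 has three edges, two of them inside {1, 2, 3}, so the triangle is concentrated. In {3, 4} only one of node 3's three edges stays inside.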
14 Graph with 2 communities of the same size
15 Uniform Sampling on the edges
[Figure: components $C_{1,i}$, $C_{2,i}$]
Algorithm: keep the large connected components of $\hat{G}_t$ at times $\lambda, 2\lambda, \ldots$
Theorem 2: If a graph $G$ follows a power law and has «a concentration property», then $\mathrm{Prob}_\Omega[\, Q(x)_G \cap \hat{Q}(x)_{\hat{G}} \neq \emptyset \,] > 1 - \delta$
16 3. Integration of Streaming data
Several streams: $S_1, S_2, \ldots$
For each stream $S_j$: we compress the stream to $(V, C_j)$, where $C_j$ is the union of the large connected components, $C_j = \bigcup_i C_{i,j}$.
All the nodes are stored in a MySQL database.
17 2 streams
18 Integration of Streaming data
How can we correlate two streams of edges?
Store the large connected components of the reservoir, at discrete times, for each stream.
Let $V_{C_1}$, $V_{C_2}$ be the nodes of the communities.
Nodes correlation (t): $|V_1 \cap V_2| / \max(|V_1|, |V_2|)$
Communities correlation (t): $|V_{C_1} \cap V_{C_2}| / \max(|V_{C_1}|, |V_{C_2}|)$
[Plot over time t: Series 1, Series 2]
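Both correlation measures have the same shape: the size of the overlap divided by the size of the larger set. The sketch below assumes the operator lost between the two sets in the transcript is an intersection. The function name and the convention of returning 0.0 for two empty sets are illustrative choices.

```python
def correlation(nodes_a, nodes_b):
    """|A ∩ B| / max(|A|, |B|), with 0.0 when both sets are empty."""
    a, b = set(nodes_a), set(nodes_b)
    if not a and not b:
        return 0.0
    return len(a & b) / max(len(a), len(b))


# Nodes seen in the communities of two streams at some time t (made-up data).
stream_1_nodes = {"u1", "u2", "u3", "u4"}
stream_2_nodes = {"u3", "u4", "u5"}

print(correlation(stream_1_nodes, stream_2_nodes))  # 0.5
```

Applied to all stored nodes it gives the nodes correlation; applied to the nodes of the large components it gives the communities correlation. Evaluating it at each discrete time t yields the two series plotted on the slide.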
19 Conclusion
1. Approximation of queries
   - Boolean, search, analytic queries
2. Streaming data
   - Streaming tuples
   - Streaming edges: dynamic community detection
3. Data integration for streams
   - Compress streams with «good representations» with high probability
   - Define the integration from these compressed forms
   - The «value» of the data increases with the integration