FMC: An Approach for Privacy Preserving OLAP


Ming Hua, Shouzhi Zhang, Wei Wang, Haofeng Zhou, Baile Shi
Fudan University, China
{minghua, shouzhi_zhang, weiwang, haofzhou,

Abstract. Preserving private information while still supporting thorough analysis is a significant issue in OLAP systems. One of the challenges is to prevent a sensitive value from being inferred through more aggregated, non-sensitive data. This paper presents a novel algorithm, FMC, that eliminates the inference problem by hiding additional data besides the sensitive information itself, and proves that this additional information is both necessary and sufficient. The approach therefore provides as much information as possible to users while preserving security. The strategy does not affect the online performance of the OLAP system. Systematic analysis and experimental comparison are provided to show the effectiveness and feasibility of FMC.

1 Introduction

Online analytic processing (OLAP) is an important infrastructure for advanced data analysis and knowledge discovery. Most previous studies on OLAP focus on OLAP models, data cube and data warehouse construction, maintenance and compression, and efficient query answering methods. It is equally critical to investigate the problem of privacy preservation in OLAP query answering.

Example 1 (Motivation). Consider a table of patient cases in some hospitals, as shown in Table 1.

Table 1. A table about the patient cases

Hospital    Disease         Number of cases
Forest      Lung cancer     16
Forest      Diabetes        63
Memorial    Diabetes        87
Memorial    Heart attack    32

[Fig. 1. The data cubes based on Table 1. Aggregate cells: <f,*> = 79, <m,*> = 119, <*,l> = 16, <*,d> = 150, <*,h> = 32, <*,*> = 198. Base cells: <f,l>, <f,d>, <m,d>, <m,h>.]

Suppose the hospitals do not want to make the case counts of individual diseases public, but agree to share the total number of all cases in a hospital and the total number of cases of a certain disease across all hospitals. That is, in the data cube based on Table 1, the values of cells <f,l>, <f,d>, <m,d> and <m,h> should be hidden from users (as shown in Figure 1; <f,l> stands for the cell <Forest, Lung cancer>, and likewise for the other cells).

A simple and direct security policy is to decline all access to the sensitive cells. However, such a declining-direct-access policy is insufficient to preserve privacy. Since only part of the measure is hidden, the structure of the cube can be recovered from the remaining columns of the fact table, so the sensitive values can be revealed through other, unprotected cells. For example, the value of <f,l> is exactly the same as that of <*,l>, since <*,l> aggregates only this record. Moreover, subtracting the value of <f,l> from that of <f,*> discloses the value of <f,d> (79 - 16 = 63).

The problem now becomes: can we devise a better security policy so that privacy is strictly preserved? Moreover, we want such a policy to hide as little information as possible. We call this the privacy preserving OLAP problem.

In this paper, we tackle the problem by hiding a minimal set of unprotected cells involved in determining the values of confidential cells, so that the precondition of information leakage no longer holds. For example, if we hide the cells <*,l> and <*,h> in Figure 1, the values of the sensitive cells <f,l>, <f,d>, <m,d> and <m,h> can never be obtained by accessing only the remaining unprotected cells.

Compared to the privacy control problems in statistical databases and data mining, the privacy preserving OLAP problem poses several new challenges, and we make the following contributions.
1) Sensitive data items can be distributed at different granularity levels in OLAP. We propose a general model and solution that can handle this case.
2) It is crucial for OLAP systems to provide users with as much information as possible while protecting the sensitive data. We prove that our algorithm hides only the necessary data.
3) OLAP applications usually require short response time.
We eliminate the inference before users interact with the system, so the algorithm does not affect the online performance of the OLAP system.

The rest of the paper is organized as follows. In Section 2, we formulate the problem of privacy preserving OLAP. Section 3 provides an overview of the solution. The key techniques are discussed in Sections 4 and 5. Extensive experimental results are reported in Section 6. Finally, we draw the conclusion in Section 7.

1.1 Related Work

Inference control methods in statistical databases are classified into two categories [1]. Restriction based techniques include auditing all queries [2], suppressing sensitive data [3] and so on. Perturbation based techniques include adding noise to the source data or to the outputs to reduce the precision of detail data [4].

Inference control for OLAP systems has received less attention. However, Lingyu Wang et al. have systematically studied this problem: [1] derives sufficient conditions for non-compromisability in sum-only data cubes; [5] discusses the inference problem caused by multi-dimensional range queries; [6] proposes a method to eliminate both unauthorized accesses and malicious inferences.

2 Problem Definition

A data cube consists of a set of dimensions and measures with aggregate functions defined on it. In this paper, we mainly focus on the SUM function. Each node of the data cube is called a cuboid, and a tuple in the cuboid is called a cell.

Two cuboids $C_1$ and $C_2$ follow the partial order (i.e., $C_1 \preceq C_2$) iff on each dimension, either they share the same attribute, or $C_2$ has a higher-level attribute in the dimension hierarchy. In this case, we say $C_2$ is an ancestor of $C_1$, and $C_1$ is a descendant of $C_2$. $C_2$ is a father of $C_1$, and correspondingly $C_1$ is a son of $C_2$, if $C_1 \preceq C_2$ and there is no other cuboid $C'$ such that $C_1 \preceq C'$ and $C' \preceq C_2$. These definitions apply to cells as well. In Example 1, the cuboids satisfy <Hospital, Disease> $\preceq$ <Hospital, *>, and the cells satisfy <f,l> $\preceq$ <f,*>.

As determined by the multi-dimensional data model, access control in OLAP systems applies to cuboids and cells. We define the confidential information as a forbidden set of the form $\{c_1, \dots, c_m\}$, where $c_i$ is a cell of the data cube. We assume that the forbidden set includes all the confidential cells and their descendants, since a confidential cell can also be computed by simply aggregating all its descendants. All the cells not included in the forbidden set compose the available set, which is accessible to users. For example, the available set in Example 1 includes all the cells except <f,l>, <f,d>, <m,d> and <m,h>.

However, we have shown in Example 1 that some confidential information (such as <f,l> and <f,d>) can be obtained by combining the cells in the available set. We define the available set, together with all the information derived from it, as the available set closure.

Definition 1 [Available Set Closure]. Given an available set $A$, the available set closure $C(A)$ is defined as:
1. If cell $c \in A$, then $c \in C(A)$;
2. If cell $c \in C(A)$, then $k \cdot c \in C(A)$, where $k$ is a real number;
3. If cells $c_1, c_2 \in C(A)$, then $c_1 + c_2 \in C(A)$.

When the available set closure and the forbidden set intersect, inference occurs. In this case, we also say that the forbidden set is compromised. The cells in the available set that cause the inference are called the source of the inference.

Definition 2 [Compromisability]. Given a data cube $L$ and a forbidden set $F$ in $L$, $F$ is compromised when $C(L - F) \cap F \neq \emptyset$.

To prevent compromise, we hide some cells in the source, so that no sensitive cell can be computed from the incomplete source. However, the hidden cells may themselves be inferred from cells at higher granularity levels, so more cells must be hidden to protect them. Finally we obtain a set of cells, in addition to the forbidden set, such that no cell outside them causes inference to the cells inside.
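Definitions 1 and 2 can be checked mechanically on the hospital cube of the motivating example. The following is a minimal sketch, not the authors' implementation: it assumes each cell is represented by a 0/1 vector over the four base cells (anticipating the vector code of Section 4, and extended here to <*,*> as an assumption), so that a cell lies in the available set closure exactly when its vector lies in the linear span of the available vectors. The names `rank`, `in_span`, `CODES` and `compromised` are hypothetical.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of row vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def in_span(vectors, target):
    """True if target is a linear combination of vectors (closure membership)."""
    vectors = list(vectors)
    return rank(vectors + [target]) == rank(vectors)

# Assumed encoding: one coordinate per base cell, in the order
# <f,l>, <f,d>, <m,d>, <m,h>; an aggregate cell has 1 where it covers a base cell.
CODES = {
    "<f,l>": (1, 0, 0, 0), "<f,d>": (0, 1, 0, 0),
    "<m,d>": (0, 0, 1, 0), "<m,h>": (0, 0, 0, 1),
    "<f,*>": (1, 1, 0, 0), "<m,*>": (0, 0, 1, 1),
    "<*,l>": (1, 0, 0, 0), "<*,d>": (0, 1, 1, 0), "<*,h>": (0, 0, 0, 1),
    "<*,*>": (1, 1, 1, 1),
}
FORBIDDEN = ["<f,l>", "<f,d>", "<m,d>", "<m,h>"]

def compromised(hidden=()):
    """Forbidden cells that fall inside the closure of the available set."""
    available = [CODES[c] for c in CODES
                 if c not in FORBIDDEN and c not in hidden]
    return [c for c in FORBIDDEN if in_span(available, CODES[c])]

print(compromised())                         # every forbidden cell can be inferred
print(compromised(hidden=("<*,l>", "<*,h>")))  # nothing can be inferred any more
```

Exact rational arithmetic is used instead of floating point so that the span test does not depend on a numerical tolerance.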

Definition 3 [Minimal Cover (MC)]. Given a data cube $L$ and a forbidden set $F$ in $L$, a set $S$ is defined as the minimal cover of $F$ (represented as $MC(F)$) if:
1. $S \subseteq L - F$;
2. $C(L - F - S) \cap (F + S) = \emptyset$;
3. $\forall S' \subset S$, $C(L - F - S') \cap (F + S') \neq \emptyset$.

The minimal cover is a subset of the available set. The second condition requires that, after hiding the minimal cover, the remaining cells cause inference neither to the minimal cover nor to the forbidden set. The third condition states that no proper subset of the minimal cover satisfies the second condition, which guarantees that every cell in the minimal cover is indispensable for eliminating the inference.

Problem Statement. Given a data cube $L$ and a forbidden set $F$, the privacy preserving OLAP problem is to find a minimal cover $MC(F)$ of $F$, which prevents $F$ from being compromised while withholding as little information as possible.

3 Overview of Privacy Preserving OLAP Procedure

From the definitions, it is clear that the minimal cover must be free of inference both to the forbidden set and to itself; otherwise, one can disclose sensitive information by first inferring the values of the minimal cover and then inferring the forbidden set. A subset of the minimal cover that is only free of inference to the forbidden set is called the minimal partial cover. We take the following two steps: first find the minimal partial cover of the forbidden set, and then extend it to the minimal cover to guarantee security.

Step 1. Finding the minimal partial cover for the forbidden set. We find the minimal partial cover MPC of the forbidden set by linear system theory, such that hiding MPC eliminates all inference directed at the forbidden set, while hiding any proper subset of MPC does not.

Step 2. Extending the minimal partial cover to the minimal cover. We then take the MPC found in Step 1 as the new forbidden set, and repeat finding the minimal partial cover for the newly hidden cells until no more cells need to be hidden.
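The three conditions of the minimal cover definition can be verified by brute force on a small cube. The sketch below does this for the hospital cube of the motivating example; it is an illustration under stated assumptions rather than the paper's algorithm. Cells are assumed to be encoded as 0/1 vectors over the four base cells, closure membership is tested as linear-span membership, and `CUBE`, `leaks` and `is_minimal_cover` are hypothetical names.

```python
from fractions import Fraction
from itertools import combinations

def rank(vectors):
    """Rank of a list of row vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def in_span(vectors, target):
    vectors = list(vectors)
    return rank(vectors + [target]) == rank(vectors)

# Assumed encoding over the base cells <f,l>, <f,d>, <m,d>, <m,h>.
CUBE = {
    "<f,l>": (1, 0, 0, 0), "<f,d>": (0, 1, 0, 0),
    "<m,d>": (0, 0, 1, 0), "<m,h>": (0, 0, 0, 1),
    "<f,*>": (1, 1, 0, 0), "<m,*>": (0, 0, 1, 1),
    "<*,l>": (1, 0, 0, 0), "<*,d>": (0, 1, 1, 0), "<*,h>": (0, 0, 0, 1),
    "<*,*>": (1, 1, 1, 1),
}
F = {"<f,l>", "<f,d>", "<m,d>", "<m,h>"}

def leaks(cube, hidden):
    """True if some hidden cell lies in the closure of the visible cells."""
    visible = [v for c, v in cube.items() if c not in hidden]
    return any(in_span(visible, cube[c]) for c in hidden)

def is_minimal_cover(cube, forbidden, cover):
    forbidden, cover = set(forbidden), set(cover)
    # Condition 1: the cover is drawn from the available set.
    if cover & forbidden or not cover <= set(cube):
        return False
    # Condition 2: hiding F + S leaves nothing in F + S inferable.
    if leaks(cube, forbidden | cover):
        return False
    # Condition 3: every proper subset S' still allows some inference.
    for size in range(len(cover)):
        for subset in combinations(sorted(cover), size):
            if not leaks(cube, forbidden | set(subset)):
                return False
    return True

print(is_minimal_cover(CUBE, F, {"<*,l>", "<*,h>"}))           # True
print(is_minimal_cover(CUBE, F, {"<*,l>", "<*,h>", "<*,d>"}))  # False: not minimal
```

Enumerating all proper subsets is exponential in the size of the cover, so a check of this kind is only practical for small examples.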
4 Finding Minimal Partial Cover

In this section, we discuss how to find the minimal partial cover for a forbidden set. First, we define the vector code to represent each cell in the cuboid as follows.

Definition 5 [Vector Code]. Given a cuboid $C$, the vector code $c^v$ for a cell $c$ in $C$ or in one of $C$'s father cuboids is defined as $(a_1, \dots, a_n)$, where $n$ is the number of cells in $C$, and

$a_i = 1$ if $c$ is the $i$-th cell in $C$ ($c \in C$) or $c$ aggregates the $i$-th cell in $C$ ($c \in Father(C)$);
$a_i = 0$ otherwise.
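The vector code can be computed directly from cell coordinates. The following minimal sketch assumes that a cell is written as a tuple of dimension values with "*" standing for an aggregated dimension, and that the base cuboid lists its cells in a fixed order; `vector_code` and `BASE` are hypothetical names used only for illustration.

```python
def vector_code(cell, base_cells):
    """0/1 vector with a 1 at position i when `cell` is, or aggregates,
    the i-th cell of the base cuboid."""
    return [
        1 if all(a == "*" or a == b for a, b in zip(cell, base)) else 0
        for base in base_cells
    ]

# Cells of the cuboid <Hospital, Disease> from the hospital example,
# in an assumed order: <f,l>, <f,d>, <m,d>, <m,h>.
BASE = [("f", "l"), ("f", "d"), ("m", "d"), ("m", "h")]

print(vector_code(("f", "*"), BASE))  # [1, 1, 0, 0]
print(vector_code(("f", "l"), BASE))  # [1, 0, 0, 0]
print(vector_code(("*", "d"), BASE))  # [0, 1, 1, 0]
```

A single matching rule covers both branches of the definition: a cell of the cuboid matches only itself, while a father cell matches every cell it aggregates.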

For example, in the cuboid <Hospital, Disease> in Figure 1, with its cells ordered <f,l>, <f,d>, <m,d>, <m,h>, the vector code of cell <f,*> is (1,1,0,0), and the vector code of <f,l> is (1,0,0,0).

The cell $c$ corresponding to $c^v$ can be inferred from $c_1, \dots, c_n$ if the vector codes $c^v_1, \dots, c^v_n$ can be linearly combined into the vector code $c^v$. To determine whether this happens, we discuss the following three cases for the solution of equation (1), where $x_1, \dots, x_n$ are real numbers:

$$x_1 c^v_1 + \dots + x_n c^v_n = [c^v_1, \dots, c^v_n]\,[x_1, \dots, x_n]^T = c^v \qquad (1)$$

Equation (1) has no solutions. The cell $c$ corresponding to $c^v$ cannot be computed from any other cells, so no additional information needs to be hidden.

Equation (1) has only one non-zero solution. $c^v$ can be computed from one particular combination of $c^v_1, \dots, c^v_n$. If $x_i, \dots, x_j$ are the non-zero components of the solution, then the corresponding cells $c_i, \dots, c_j$ are indispensable for inferring $c$. Therefore, hiding just one of $c_i, \dots, c_j$ prevents the inference.

Equation (1) has more than one non-zero solution. To eliminate all the inference, we need to hide one cell whose corresponding component of the solution $X$ is always non-zero. If there is no such cell, we need to find a set of cells at least one of which is used in each solution.

4.1 An Example

Based on linear system theory [7], we develop a method to eliminate the inference to certain cells. The method is illustrated in the following example.

Example 2. We try to find the minimal partial cover for cell <f,d> in Example 1, with the same security requirements. Suppose $c_1$ = <f,*>, $c_2$ = <m,*>, $c_3$ = <*,l>, $c_4$ = <*,d>, and $c_5$ = <*,h>. The corresponding vector codes are $c^v_1, \dots, c^v_5$.

1. We construct the equation by setting $A = [c^v_1, \dots, c^v_5]$ and $b = c^v$ (the vector code of <f,d>):

$$AX = \begin{bmatrix} 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 \end{bmatrix} X = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} \qquad (2)$$

2. The solution of equation (2) is $X = X_0 + k X_1$, where $X_0 = [1, 0, -1, 0, 0]^T$, $X_1 = [-1, -1, 1, 1, 1]^T$, and $k$ is a real number. If the $i$-th component of $X$ is non-zero, then $c^v_i$ is used to compute <f,d>. For example, if we take $k = 0$, then $X = [1, 0, -1, 0, 0]^T$ (i.e., $c^v = c^v_1 - c^v_3$), which is exactly the case depicted in Example 1.

3. We try to find a component of $X$ that is always non-zero, or a set of components at least one of which is non-zero in each $X$.
If $k = 0$: $X = X_0$, and the first and third components are non-zero.
If $k \neq 0$: by a suitable choice of $k$ (here $k = 1$), the first and third components can be made zero, but the second, fourth and fifth components, $-k$, $k$ and $k$, are never zero.
Hence, a cell in $\{c_1, c_3\}$ and another one in $\{c_2, c_4, c_5\}$ form the minimal partial cover of <f,d>. For example, if we hide $\{c_1, c_5\}$, <f,d> cannot be compromised.
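The solution family of Example 2 can be checked with a few lines of integer arithmetic. The sketch below assumes the rows of A follow the cell order <f,l>, <f,d>, <m,d>, <m,h> and the columns follow c1 to c5 as listed in the example; `solution`, `apply_matrix` and `support` are hypothetical helper names.

```python
# Columns: c1=<f,*>, c2=<m,*>, c3=<*,l>, c4=<*,d>, c5=<*,h>.
A = [
    [1, 0, 1, 0, 0],  # row for <f,l>
    [1, 0, 0, 1, 0],  # row for <f,d>
    [0, 1, 0, 1, 0],  # row for <m,d>
    [0, 1, 0, 0, 1],  # row for <m,h>
]
b = [0, 1, 0, 0]           # vector code of <f,d>
X0 = [1, 0, -1, 0, 0]      # particular solution
X1 = [-1, -1, 1, 1, 1]     # basic solution of the homogeneous system

def solution(k):
    """The member X0 + k * X1 of the solution family."""
    return [p + k * h for p, h in zip(X0, X1)]

def apply_matrix(matrix, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in matrix]

def support(k):
    """1-based indices of the cells actually used by the solution for k."""
    return {i + 1 for i, x in enumerate(solution(k)) if x != 0}

# Every member of the family solves A X = b.
for k in (-2, 0, 1, 3):
    assert apply_matrix(A, solution(k)) == b

print(support(0))  # {1, 3}: <f,d> = <f,*> - <*,l>
print(support(1))  # {2, 4, 5}: <f,d> = -<m,*> + <*,d> + <*,h>
print(support(2))  # {1, 2, 3, 4, 5}
```

The supports for k = 0 and k = 1 are disjoint, which is why no single cell blocks every inference and the minimal partial cover needs one cell from each of the two groups.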

4.2 Algorithm

Now, let us generalize the algorithm for finding the minimal partial cover (Figure 2). Given a forbidden set $F$ in cuboid $C$, first construct the coefficient matrix $A$ using the unprotected cells in $C$ or $C$'s fathers. Then, for each cell $c$ in $F$, if $Ax = c^v$ has solutions, find the set of components at least one of which is non-zero in each solution $X$. Here we use linear system theory [7] to find such cells.

Input: the forbidden set F, and the cuboid C
Output: a minimal partial cover MPC of F
Method:
    construct the coefficient matrix A = [c^v_1, ..., c^v_n]
    for each cell c in F
        if Ax = c^v has solutions
            find the solutions X of Ax = c^v
            find the set of components M_c at least one of which is non-zero in each X
    return MPC = union of M_c over all c in F

Fig. 2. Algorithm 1 (FMPC): finding a minimal partial cover

The solutions of $Ax = c^v$ can be represented as $X = X_0 + [X_1, \dots, X_r]\,[k_1, \dots, k_r]^T$, where $X_1, \dots, X_r$ are the basic solutions of $Ax = 0$, and $X_0$ is a particular solution of $Ax = c^v$. There are $r$ independent components in $X$, each taking 0 in $X_0$ and taking 1 in its own $X_i$ ($i = 1, \dots, r$). For example, in Figure 3, the last three components are independent. Suppose $X_0[i]$ and $X_2[i]$ are the non-zero entries among the $i$-th components of $X_0$ to $X_3$, and $X_2[j]$ is the independent component taking 1 in $X_2$; then at least one of $X[i]$ and $X[j]$ is non-zero in $X$, and the corresponding cells form the minimal partial cover.

[Fig. 3. An example of minimal partial cover: $X = X_0 + [X_1, X_2, X_3]\,[k_1, k_2, k_3]^T$, with the entries $X_0[i]$, $X_2[i]$ and $X_2[j]$ marked and the last three components labeled as independent components.]

Lemma 1. Given $X = X_0 + [X_1, \dots, X_{n-r}]\,[k_1, \dots, k_{n-r}]^T$, the $(r+1)$-th to $n$-th components of $X$ are the independent components. If $X_0[i] \neq 0$, and only $X_{d_1}[i], \dots, X_{d_j}[i]$ of $X_1[i], \dots, X_{n-r}[i]$ are non-zero ($d_1, \dots, d_j \in \{1, \dots, n-r\}$ and $i < r+1$), then:
1. At least one of the components $X[i], X[r+d_1], \dots, X[r+d_j]$ in $X$ is non-zero.
2. Any proper subset of the components $X[i], X[r+d_1], \dots, X[r+d_j]$ can all be zero in $X$.

Lemma 2. Algorithm 1 returns a minimal partial cover of the forbidden set FS.

(The proofs of Lemma 1 and Lemma 2 are omitted due to space limitations.)
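For a single forbidden cell, the output that FMPC is specified to produce can be cross-checked on small inputs by exhaustive search. The sketch below is not the lemma-based procedure described above: it swaps in a brute-force search that tries ever larger sets of available cells to hide and stops at the first set whose removal takes the forbidden vector out of the span of the rest. A smallest such set is necessarily minimal. `brute_force_mpc` and the other names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def rank(vectors):
    """Rank of a list of row vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def in_span(vectors, target):
    vectors = list(vectors)
    return rank(vectors + [target]) == rank(vectors)

def brute_force_mpc(available, target):
    """Smallest tuple of indices into `available` whose removal makes
    `target` non-inferable; None if no such set exists (zero target)."""
    for size in range(len(available) + 1):
        for hide in combinations(range(len(available)), size):
            rest = [v for i, v in enumerate(available) if i not in hide]
            if not in_span(rest, target):
                return hide
    return None

# Example 2: vector codes of c1..c5 and of the forbidden cell <f,d>.
available = [
    (1, 1, 0, 0),  # c1 = <f,*>
    (0, 0, 1, 1),  # c2 = <m,*>
    (1, 0, 0, 0),  # c3 = <*,l>
    (0, 1, 1, 0),  # c4 = <*,d>
    (0, 0, 0, 1),  # c5 = <*,h>
]
cover = brute_force_mpc(available, (0, 1, 0, 0))
print([f"c{i + 1}" for i in cover])  # one cell from {c1, c3}, one from {c2, c4, c5}
```

The search is exponential in the number of available cells, so it serves only as a reference check for small cuboids.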

5 Extending the Minimal Partial Cover to Minimal Cover

In this section, we employ a level-wise framework, together with some optimizing strategies, to extend the minimal partial cover to the minimal cover across each cuboid of the cube.

5.1 Two Optimizing Strategies

Eliminating Single-son Inference. A cell is called a single-son cell if it has only one child in its son cuboid. All the single-son fathers of the forbidden set are definitely sensitive, because each has the same value as its only son. In Example 1, if we hide the two single-son cells <*,l> and <*,h>, all inferences are eliminated. Thus, our algorithm first adds all the single-son fathers of the sensitive cells to the minimal cover. This may both eliminate a large part of the inference and reduce the number of cells we must check for inference.

Finding Candidate Range. In Algorithm 1, we check all the fathers and unprotected siblings of the forbidden cells for inference. However, not all of them are dangerous.

Example 3. A two-dimensional cube is shown in Figure 4(a). The cell <a2,b1>, marked with * in the cuboid <A,B>, is sensitive. We construct the coefficient matrix A for cuboid <A,B> (as shown in Figure 4(b)). The column vectors of A correspond to the 8 father cells and the 5 unprotected cells in cuboid <A,B>. However, only the column vectors A[1], A[2], A[5], A[6], A[9] and A[10] can contribute to inferring the value of <a2,b1>, because the others have all zeros in the corresponding components. We call the sub-matrix formed by A[1], A[2], A[5], A[6], A[9], A[10] and their non-zero components the candidate range of the forbidden set (surrounded by a dashed line in Figure 4(b)).

The candidate range can be found by first setting it to the father cells of the forbidden set, and then iteratively adding the cells that intersect with the candidate range.
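The iterative construction of the candidate range can be sketched as a fixed-point computation over the rows (base cells) that each column touches. The code below is an illustration under assumptions, not the authors' implementation: the column order (the four <a,*> fathers, then the four <*,b> fathers, then the five unprotected cells) is assumed so as to match the indices quoted in Example 3, and `COLUMNS` and `candidate_range` are hypothetical names.

```python
# Each column of A is described by the set of <A,B> cells where it is non-zero.
COLUMNS = [
    ("<a1,*>", {"a1b1", "a1b2"}),   # A[1]
    ("<a2,*>", {"a2b1"}),           # A[2]
    ("<a3,*>", {"a3b3", "a3b4"}),   # A[3]
    ("<a4,*>", {"a4b3"}),           # A[4]
    ("<*,b1>", {"a1b1", "a2b1"}),   # A[5]
    ("<*,b2>", {"a1b2"}),           # A[6]
    ("<*,b3>", {"a3b3", "a4b3"}),   # A[7]
    ("<*,b4>", {"a3b4"}),           # A[8]
    ("<a1,b1>", {"a1b1"}),          # A[9]
    ("<a1,b2>", {"a1b2"}),          # A[10]
    ("<a3,b3>", {"a3b3"}),          # A[11]
    ("<a3,b4>", {"a3b4"}),          # A[12]
    ("<a4,b3>", {"a4b3"}),          # A[13]
]

def candidate_range(columns, forbidden_rows):
    """Return (1-based column indices, rows) of the candidate range.

    Starts from the rows of the forbidden cells; the first pass picks up
    their fathers, and later passes add every column that intersects the
    rows collected so far, until nothing changes."""
    rows = set(forbidden_rows)
    chosen = set()
    changed = True
    while changed:
        changed = False
        for index, (_, support) in enumerate(columns, start=1):
            if index not in chosen and support & rows:
                chosen.add(index)
                rows |= support
                changed = True
    return sorted(chosen), rows

cols, rows = candidate_range(COLUMNS, {"a2b1"})
print(cols)          # [1, 2, 5, 6, 9, 10]
print(sorted(rows))  # ['a1b1', 'a1b2', 'a2b1']
```

Only the selected columns and rows need to be passed to the linear-system step, which is what keeps the matrix small when the cube is sparse.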
[Fig. 4. A two-dimensional data cube. (a) A two-dimensional data cube with cells <*,*>; <a1,*>, <a2,*>, <a3,*>, <a4,*>; <*,b1>, <*,b2>, <*,b3>; and <a1,b1>, <a1,b2>, <a2,b1>*, <a3,b3>, <a3,b4>, <a4,b3>. (b) The coefficient matrix A for cuboid <A,B>.]

5.2 Algorithm

We use a level-wise framework to extend the minimal partial cover to the minimal cover. As shown in Algorithm 2 in Figure 5, we first rank the cuboids in the cube in ascending order of granularity level. Then, for each cuboid, we apply the two optimizing strategies and invoke Algorithm 1 to find the minimal partial cover of the forbidden set in this cuboid. The returned minimal partial cover must itself be checked for inference. This process is repeated until there is no new minimal partial cover in the current cuboid.

Input: the forbidden set FS
Output: a minimal cover MC of FS
Method:
    for each cuboid C* in the cube
        while FS ∩ C* ≠ ∅
            add single-son fathers to MC
            find the candidate range CR for FS
            m = FMPC(FS, CR)      // m is the minimal partial cover of FS returned by FMPC
            FS = FS − FS ∩ C*     // inference to FS ∩ C* has been eliminated
            MC = MC ∪ m
            FS = FS ∪ m           // the minimal partial cover should be protected
    return MC

Fig. 5. Algorithm 2 (FMC: a level-wise algorithm to find a minimal cover)

Theorem 1. Algorithm 2 returns a minimal cover of the forbidden set FS.

(The proof of Theorem 1 is based on Lemma 1 and Lemma 2, and is omitted due to space limitations.)

6 Experimental Results

Implementation. All experiments are conducted on a Pentium 4 2.8 GHz PC with 512MB main memory, running Microsoft Windows XP Professional. The algorithm is implemented using Borland C++ Builder 6 with Microsoft SQL Server 2000.

Data Set. We used synthetic data sets and the real data set of the TPC-H benchmark for our experiments. In the synthetic data sets, we generated data from a Zipfian distribution, and the skew of the data (z) was varied over 0, 1, 2 and 3. The sizes of the data sets vary from 2 to 8 cells, with 3 dimensions and 4 granularity levels in one dimension.

Comparison on Different Zipf Parameter. We apply FMC to the TPC-H benchmark and to the synthetic datasets with parameter z = 0, 1, 2 and 3. We randomly select % of the cells in two cuboids as the forbidden set, and compare the additional cells hidden by FMC and by SeCube (L. Wang et al. 2004). Figure 6(a) shows the results.
When z = 0, the data is uniformly distributed, and fewer additional cells need to be hidden than in the skewed case. Because some values of a dimension appear less often in the skewed datasets, these sparse data are the main cause of inference.

(Footnote: The generator is obtained via ftp.research.microsoft.com/users/viveknar/tpcdskew)

[Fig. 6. Size of additional protected cells / size of cube, for SeCube and FMC. (a) Comparison on different Zipf factors (x-axis: Z factor of Zipfian distribution, and TPC-H); (b) comparison on different forbidden set sizes (x-axis: size of forbidden set, %).]

We also evaluate the effectiveness of the two optimizing strategies. Figure 7(a), which plots the size of the candidate range, shows that at most 5% of the cube needs to be checked for inference. Figure 7(b) shows the number of single-son inference cases. Since these account for a significant part of all inference cases, eliminating the single-son inference first contributes greatly to the approach.

[Fig. 7. Experimental result of two optimizing strategies. (a) Size of candidate range / size of cube; (b) single-son inference / all inference cases (x-axis in both panels: Z factor of Zipfian distribution, and TPC-H).]

[Fig. 8. Size of candidate range / size of cube (x-axis: size of forbidden set, %).]

[Fig. 9. Runtime of FMC in milliseconds (x-axis: size of forbidden set, %).]

Comparison on Varied Forbidden Set. We set the zipf parameter to z=, and change the size of the forbidden set. Figure 6(b) shows the number of additional cells hidden by SeCube [6] and FMC; FMC hides fewer cells than SeCube in all cases. Figure 8 shows the candidate range for different forbidden set sizes. The size of the candidate range stays below 4% in all cases, which means that we only need to check 4% of the whole cube for inference. We also measured the runtime of FMC for different sizes of the forbidden set (Figure 9).

7 Conclusions

In this paper, we present an effective and efficient algorithm to address the privacy preserving OLAP problem. The main idea is to hide part of the data causing the inference, so that the sensitive information can no longer be computed. We guarantee that all the information we hide is necessary, so that as much information as possible can be provided to users while protecting the sensitive data. All the work is done before users interact with the system, so it does not affect the online performance of the OLAP system. Our algorithm is partially based on linear system theory, so its correctness can be strictly proved. Experimental results also demonstrate the effectiveness of the algorithm.

Future work includes applying the method to other aggregation functions and improving the efficiency of the algorithm. We also plan to extend the work to solve the inference problem caused by involving two aggregation functions in one cube.

References

1. L. Wang, D. Wijesekera: Cardinality-based Inference Control in Sum-only Data Cubes. Proc. of the 7th European Symp. on Research in Computer Security.
2. F. Y. Chin, G. Ozsoyoglu: Auditing and inference control in statistical databases. IEEE Trans. on Software Eng. (Apr. 1982).
3. L. H. Cox: Suppression methodology and statistical disclosure control. Journal of American Statistic Association, 75(370).
4. D. E. Denning: Secure statistical databases under random sample queries. ACM Trans. on Database Syst., Vol. 5(3) (Sept. 1980).
5. L. Wang, Y. Li, D. Wijesekera, S. Jajodia: Precisely Answering Multi-dimensional Range Queries without Privacy Breaches. ESORICS 2003.
6. L. Wang, S. Jajodia, D. Wijesekera: Securing OLAP data cubes against privacy breaches. Proc. IEEE Symp. on Security and Privacy, 2004.
7. K. Nicholson: Elementary Linear Algebra. Second Edition, McGraw Hill, 2004.


More information

EFFICIENT ATTRIBUTE REDUCTION ALGORITHM

EFFICIENT ATTRIBUTE REDUCTION ALGORITHM EFFICIENT ATTRIBUTE REDUCTION ALGORITHM Zhongzhi Shi, Shaohui Liu, Zheng Zheng Institute Of Computing Technology,Chinese Academy of Sciences, Beijing, China Abstract: Key words: Efficiency of algorithms

More information

SECURITY IN COMPUTING, FIFTH EDITION

SECURITY IN COMPUTING, FIFTH EDITION 1 SECURITY IN COMPUTING, FIFTH EDITION Chapter 7: Database Security 2 Database Terms Database administrator Database management system (DBMS) Record Field/element Schema Subschema Attribute Relation 3

More information

High Capacity Reversible Watermarking Scheme for 2D Vector Maps

High Capacity Reversible Watermarking Scheme for 2D Vector Maps Scheme for 2D Vector Maps 1 Information Management Department, China National Petroleum Corporation, Beijing, 100007, China E-mail: jxw@petrochina.com.cn Mei Feng Research Institute of Petroleum Exploration

More information

Generating All Solutions of Minesweeper Problem Using Degree Constrained Subgraph Model

Generating All Solutions of Minesweeper Problem Using Degree Constrained Subgraph Model 356 Int'l Conf. Par. and Dist. Proc. Tech. and Appl. PDPTA'16 Generating All Solutions of Minesweeper Problem Using Degree Constrained Subgraph Model Hirofumi Suzuki, Sun Hao, and Shin-ichi Minato Graduate

More information

Computing Data Cubes Using Massively Parallel Processors

Computing Data Cubes Using Massively Parallel Processors Computing Data Cubes Using Massively Parallel Processors Hongjun Lu Xiaohui Huang Zhixian Li {luhj,huangxia,lizhixia}@iscs.nus.edu.sg Department of Information Systems and Computer Science National University

More information
