A Survey on A Privacy Preserving Technique using K-means Clustering Algorithm

Size: px
Start display at page:

Download "A Survey on A Privacy Preserving Technique using K-means Clustering Algorithm"

Transcription

1 A Survey on A Privacy Preserving Technique using K-means Clustering Algorithm Falguni A.Patel 1, Chetna G.Chand 2 1 Student(Master of Engineering), Computer Engineering Department, Government Engineering College, Modasa, Gujarat, India 2 Assistant Professor, Computer Engineering Department, Government Engineering College, Modasa, Gujarat, India ABSTRACT In many organizations large amount of data are collected. These data are sometimes used by the organizations for data mining tasks. However, the data collected may contain private or sensitive information which should be protected. Privacy protection is an important issue if we release data for the mining or sharing purpose. There are various techniques that protect the sensitive data with less information loss which increase data usability and also prevent the sensitive data for various types of attack. Various comprehensive experiments on real as well as synthetic datasets show that they are effective and provide moderate privacy.clustering based noise techniques that not only preserve the privacy but also ensure effective data mining. Keywords: - Data Mining, Privacy Preserving, Clustering, Privacy Preserving Techniques, K-means clustering algorithm. 1. INTRODUCTION: Data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information - information that can be used to increase revenue, cuts costs, or both [1]. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified [2]. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases [2]. While large-scale information technology has been evolving separate transaction and analytical systems, data mining provides the link between the two. Data mining software analyzes relationships and patterns in stored transaction data based on open-ended user queries [3]. Several types of analytical software are available: statistical, machine learning, and neural networks. Generally, any of four types of relationships are sought:[8] Classes: Stored data is used to locate data in predetermined groups. For example, a restaurant chain could mine customer purchase data to determine when customers visit and what they typically order [8]. This information could be used to increase traffic by having daily specials. Clusters: Data items are grouped according to logical relationships or consumer preferences.[8] For example, data can be mined to identify market segments or consumer affinities

2 Associations: Data can be mined to identify associations. The beer-diaper example is an example of associative mining.[8] Sequential patterns: Data is mined to anticipate behavior patterns and trends. For example, an outdoor equipment retailer could predict the likelihood of a backpack being purchased based on a consumer's purchase of sleeping bags and hiking shoes [8]. Introduction of Clustering Clustering can be considered the most important unsupervised learning technique; so, as every other problem of this kind, it deals with finding a structure in a collection of unlabeled data.[3] Clustering is the process of organizing objects into groups whose members are similar in some way.[4]a cluster is therefore a collection of objects which are similar between them and are dissimilar to the objects belonging to other clusters[3] Clustering is used in many fields like pattern reorganization,image analysis,e-health, bio informatics[3] Major existing clustering methods are, distance based, hierarchical based, partition based and probabilistic based. In partitioning based clustering there is K-means clustering algorithm is used [9]. K-means clustering algorithm Input: Number of desired clusters, k, and a database D={d1, d2, dn}containing n data objects. Output: A set of k clusters Steps: 1) Randomly select k data objects from dataset D as initial cluster centers 2) Repeat; 3) Calculate the distance between each data object di (1 <=i<=n) and all k cluster centers cj(1<=j<=k) and assign data object di to the nearest cluster. 4) For each cluster j (1<=j<=k), recalculate the cluster center. 5) until no changing in the center of clusters is positive integer number. Flowchart of k-means clustering algorithm [9]

3 Privacy preserving in Data mining Data Mining is extensively used for knowledge discovery from large databases. The problem with data mining is that with the availability of non- sensitive information, one is able to infer sensitive information that is not to be disclosed [5]. Thus privacy is becoming an increasingly important issue in many data mining applications.the first approach protects the privacy of data by using an extended role based access control approach where sensitive objects identification is used to protect an individual s privacy and another approach uses cryptographic methods.[7] Privacy preserving is used in many areas like E-health, cloud,location based services and wireless sensor networks.[10] There are many different techniques used for privacy preserving data mining like Anonymization based,perturbation based, randomized response based,condensation based and cryptography based[10]. A.Anonymization based PPDM The basic form of the data in a table consists of following four types of attributes. (i) Explicit Identifiers is a set of attributes containing information that identifies a record owner explicitly such as name, SS number etc.[10] (ii) Quasi Identifiers is a set of attributes that could potentially identify a record owner when combined with publicly available data.[10] (iii) Sensitive Attributes is a set of attributes that contains sensitive person specific information such as disease, salary etc[10]. (iv) Non-Sensitive Attributes is a set of attributes that creates no problem if revealed even to untrustworthy [10] Anonymization refers to an approach where identity or/and sensitive data about record owners are to be hidden.[5] It even assumes that sensitive data should be retained for analysis. It's obvious that explicit identifiers should be removed but still there is a danger of privacy intrusion [5] when quasi identifiers are linked to publicly available data. Such attacks are called as linking attacks. Sweeney proposed k-anonymity model using generalization and suppression to achieve k-anonymity i.e. any individual is distinguishable from at least k-1 other ones with respect to quasi-identifier attribute in the anonymized dataset [10]. Replacing a value less specific but semantically consistent value is called as generalization and suppression involves blocking the data. Releasing such data for mining reduces the risk of identification when combined with publically available data.[10] Limitations of the k-anonymity model stem from the two assumptions. First, it may be very hard for the owner of a database to determine which of the attributes are available or which are not available in external tables. The second limitation is that the k-anonymity model assumes a certain method of attack, while in real scenarios; there is no reason why the attacker should not try other methods.[10] B. Perturbation based PPDM Perturbation has an inherent property of simplicity, efficiency and ability to preserve statistical information. In perturbation the original values are replaced with some synthetic data values so that the statistical information computed from the perturbed data does not differ from the statistical information computed from the original data[5]. The perturbed data records do not correspond to real-world record owners, so the attacker cannot perform the sensitive linkages or recover sensitive information from the published data. In perturbation approach, records released is synthetic i.e. it does not correspond to real world entities represented by the original data [5],[6]. Therefore the individual records in the perturbed data are meaningless to the human recipient as only statistical properties of the records are preserved. Perturbation can be done by using additive noise or data swapping or synthetic data generation [5]. Since the perturbation method does not reconstruct the original values but only the distributions, new algorithms are to be developed for mining of the data. This means that a new distribution based mining algorithms need to be developed for each individual data 28 problems like classification, clustering or association rule mining. For example, Some approaches have been developed for distribution-based mining of data for problems such as association rules and classification, it is clear that using distributions instead of original records restricts the range of algorithmic techniques that can be used on the data. Kantar cioglu and Clifton and Rizvi and Harista have developed methods to preserve privacy of association rule mining. In the perturbation approach, any distribution based data mining algorithm works under an implicit assumption to treat each dimension independently. Relevant information for data mining algorithms such as classification remains hidden in interattribute correlations [5]. For example, the classification technique makes use of distribution-based analogue of single attribute split algorithm. However, other techniques such as multivariate decision tree algorithms cannot be modified accordingly to work with the perturbation approach. This is because perturbation approach treats different

4 attributes independently. Hence the distribution based data mining algorithms have an inherent disadvantage of loss of implicit information available in multidimensional records [10]. C. Randomized Response based PPDM Basically, randomized response is statistical technique introduced by Warner to solve a survey problem. In Randomized response, the data is scrambled in such a way that the central place cannot tell with probabilities better than a pre-defined threshold, whether the data from a customer contains truthful information or false information.[5] The information received from each individual user is scrambled and if the number of users is significantly large, the aggregate information of these users can be estimated with good amount of accuracy. This is very useful for decision-tree classification since decision-tree classification is based on aggregate values of a dataset, rather than individual data items. [6]The data collection process in randomization method is carried out using two steps. During first step, the data providers randomize their data and transmit the randomized data to the data receiver. In second step, the data receiver reconstructs the original distribution of the data by employing a distribution reconstruction algorithm. Randomization method is relatively very simple and does not require knowledge of the distribution of other records in the data.[6] Hence, the randomization method can be implemented at data collection time. It does not require a trusted server to contain all the original records in order to perform the Anonymization process. The weakness of a randomization response based PPDM technique is that it treats all the records equal irrespective of their local density. One solution to this is to be needlessly more aggressive in adding noise to all the records in the data. But, it reduces the utility of the data for mining purposes as the reconstructed distribution may not yield results in conformity of the purpose of data mining. The randomized response approach has extended its strengths to a number of data mining problems [5]. In Agrawal and Srikant have discussed the use of randomized response approach for classification. A number of other techniques have also been proposed that work well over a variety of different classifiers. Techniques have been proposed for privacy preserving classification that enhances the effectiveness of classifiers [6]. For example, Gambs, Kegl and Aimeur have proposed methods for privacy-preserving boosting of classifiers. Methods for privacy-preserving mining of association rules have been proposed in the problem of association rules is especially challenging because of the discrete nature of the attributes corresponding to presence or absence of items[10] The randomization approach has also been extended to other applications such as OLAP, and SVD based collaborative filtering. D. Condensation approach based PPDM Condensation approach constructs constrained clusters in dataset and then generates pseudo data from the statistics of these clusters. It is called as condensation because of its approach of using condensed statistics of the clusters to generate pseudo data. It constructs groups of non-homogeneous size from the data, such that it is guaranteed that each record lies in a group whose size is at least equal to its anonymity level.[5] Subsequently, pseudo data is generated from each group so as to create a synthetic data set with the same aggregate distribution as the original data. This approach can be effectively used for the problem of classification [6]. The use of pseudo-data provides an additional layer of protection, as it becomes difficult to perform adversarial attacks on synthetic data. Moreover, the aggregate behavior of them data is preserved, making it useful for a variety of data mining problems.[5] This approach helps in better privacy preservation as compared to other techniques as it uses pseudo data rather than modified data. Moreover, it works even without redesigning data mining algorithms since the pseudo data has the same format as that of the original data.[10] It is very effective in case of data stream problems where the data is highly dynamic. At the same time, data mining results get affected as large amount of information is lost because of the condensation of a larger number of records into a single statistical group entity[10]. E. Cryptography based PPDM Cryptographic techniques are ideally meant for such scenarios where multiple parties collaborate to compute results or share non sensitive mining results and thereby avoiding disclosure of sensitive information [5]. Cryptographic techniques find its utility in such scenarios because of two reasons. First, it offers a well defined model for privacy that includes methods for proving and quantifying it. Second a vast set of cryptographic algorithms and constructs to implement privacy preserving data mining algorithms are available in this domain. The data may be distributed among different collaborators vertically or horizontally.[5] In vertically partitioned data among different collaborators, the individual entities may have different attributes of same set of records and in case of horizontally partitioned data, individual records are spread out across multiple entities, each of which has the same set of

5 attributes. Most of the privacy preserving distributed data mining algorithms reveal nothing other than the final results[6]. Kantarcioglu and Clifton in incorporated cryptographic techniques to preserve privacy in association rule mining over horizontally partitioned data to minimize information shared and at the same time adding very little overheads to the mining task [5]. Lindell and Pinkas in have discussed how to generate ID3 decision trees on horizontally partitioned data.[10] Yang et al. in have discussed a solution for horizontally partitioned data where each customer has a private access only to their own record. Vaidya and Clifton were the first who studied how secure association rule mining can be done for vertically partitioned data [10]. Vaidya and Clifton in proposed a method for clustering over vertically partitioned data. All these methods are almost based on a special encryption protocol known as Secure Multiparty Computation (SMC) technology.[5] Yao in discussed a problem where two millionaires wanted to know who is richer with neither revealing their net worth. So, SMC was coined and developed. SMC defines two basic adversarial models namely (i) Semi-Honest model and (ii) Malicious model. Semi-honest model follows protocols honestly, but can try to infer the secret information of other parties. In Malicious model, malicious adversaries can do anything to infer secret information [5]. They can abort protocol, send spurious messages, collude with other malicious models or even spoof messages. SMC used in distributed privacy preserving data mining consists of a set of secure sub protocols that are used in horizontally and vertically partitioned data: secure sum, secure set union, secure size of intersection and scalar product [5]. Moreover, the data mining results may breach the privacy of individual records [10]. 2. LITERATURE REVIEW: 2.1 The state of the art and tendency of privacy preserving data mining In [1] this paper two problems are considered in privacy preserving data mining.the first one is the protection of sensitive raw data, such as name, id num, address income level and other micro data. The other is the protection of sensitive knowledge showed by data mining,which is also called knowledge hiding database(khd)privacy preserving data mining and traditional data mining are different in data processing, data concerning, technology application and so on. For single data record of privacy preservation Randomized technique,data blocking, Data perturbation techniques used. For Centralized data sets privacy Reconstruction technique based on association rule mining and Reconstruction technique based on classification mining are used. For Distributed data s privacy Secure multiparty computation used. Standardization of privacy preserving technology is most important issue of this PPDM. 2.2 Preservation of privacy in data mining by using PCA based perturbation technique In [2] this paper Geometrical data transformation methods been extensively used for privacy conserving cluster, but drawback is that it is not reversible, that result in law security. The technique that preserves the privacy of delicate information in a multiparty cluster situation called guideline segment investigation based technique. K-means cluster algorithm and machine learning based methodology are used on artificial and realistic world information sets. Security preservation in data processing is a justifiably new subject of research. 2.3 Privacy preserving data classification and similarity evaluation for distributed system. In [3] this paper Privacy preservation data classification, Hidden data discover, Data classifier used. see there is existing data or new data. New data are not directly revealed during classification. Holomorphic cryptosystem, support vector machine, alternating direction method of multiplier are used for privacy preservation. Training data are private but trained model can be public. Correctness, feasibility, and efficiency of proposed scheme via extensive experiments. 2.4 Using K-means clustering algorithm for handling data precision. In [4]this paper Privacy preserving, Anonymized technique, Clustering, K-means clustering algorithm, AES encryption algorithm is used. Original patient record is transformed into Anonymized table which is identified by administrative. To keep anonymized table in cluster form using k-mean clustering algorithm. To partition anonymized table into cluster and each cluster has similar data object which is based on anonymized data attribute. After that

6 grouped similar data to be protected using AES encryption algorithm. More accuracy and better level of privacy of sensitive attribute. 2.5 Efficient privacy preserving classification construction model with differential privacy technology. In [5] this paper Decision tree, efficient classifier used to add noise via Laplace and exponential mechanism. Improving privacy budget allocation method two mode of implementing privacy protection: Interface mode, Full Access mode.controls privacy budget and avoids various types joint visit attack that are based on background knowledge. Studies on how to apply differential privacy technology in the privacy protection of large data have a wide application value. We will continue research in the direction of meeting large data set privacy security access requirements. and practical 2.6 Modified K-means clustering algorithm In [11] this paper presented a modified K-means algorithm for iterative clustering algorithm. This procedure is based on the optimization formulation and a novel iterative method.. According to the above numerical experiment results, the proposed method is an effective clustering method..it takes less number of iterations and give more accurate result then k-means clustering algorithm. 3. COMPARATIVE TABLE: Table -1: Comparative Table Sr.No. Paper Title Method Advantage Disadvantage 1 The state of the art and tendency of privacy preserving data mining Randomized technique,data blocking,data perturbation,secure Standardization of privacy Protection of sensitive raw data and micro data 2 Preservation of privacy in data mining by using PCA based perturbation technique 3 Privacy preserving data classification and similarity evaluation for distributed system 4 Using K-means clustering algorithm for handling data precision 5 Efficient privacy preserving classification construction model with differential privacy technology 6 Modified K-means clustering algorithm multiparty computation K-means cluster algorithm and machine learning based methodology Support vector machine,alternative direction method for multiplier Anonymized technique,kmeans clustering algorithm, AES encryption Decision tree,classifier to add noise, Privacy budget allocation method K-means clustering and modified K-means clustering Security preservation Data are private and correct Data is accurate and private Control privacy budget very well Less no. of iterations and saves time Law security,not reversible New data not directly revealed. For small data clustering is not possible Various types of attacks so maintaining privacy is difficult use for only numerical data 4. CONCLUSION: Privacy preservation of data is main concern for current generation in current scenario. This paper concentrates on various privacy preserving techniques and their advantages and disadvantages. So there is a need to develop a

7 method to provide mechanisms for data privacy which will provide efficiency, security and accuracy. So in future, new privacy preserving techniques with improved or modified K-means clustering algorithm are to be produced which defeats the disadvantages of current systems. 5. REFERENCES: 1) Bo Wang, Jing Yang The state of the art and tendency of privacy preserving data mining Supported by grants from the National Science Foundation of China No and No IEEE ) C.Gokulnath, M.K.Priyam, Vishnu Balan, K.P.Ramaprabhu R. Jayanthi.. Preservation of privacy in data mining by using PCA based perturbation technique 2015 International conference on Smart technologies and Management for Computing, Communication, Controls, Energy and Materials(ICSTM),IEEE ) QI Jia,Linke Guo,Zhangpeng Jin,Yuguang Fung Privacy preserving data classification and similarity evaluation for distributed system,2016 IEEE 36 TH International conference on Distributed Computing System. 4) P.Suganthi, K.Kala, DR.C.Balasubramanin. Using K-means clustering algorithm for handling data precision IEEE 5) Lin Zhang,Yan Liu,Ruchuan Wang,XiongFu,Qiamon Lin. Efficient privacy preserving classification construction model with differntial privacy technology. Journal of system engineering and electronics Vol 28 no 1 February 2017,IEEE ) Zhang P, Tong YH, Tang SW, Yang DQ, Ma XL. An effective method for privacy preserving association rule mining. Journal of Software, 2006, 17(8): ) Stanley R.M.Oliveira and Osmar R. Zaiane. Achieving Privacy Preservation When Sharing Data For Clustering [C].In Proc. of the International Workshop on Secure Data Management in a Connected World (SDM 04) in conjunction with VLDB, Canada, 2004: )Bhavani Thuraisingham, A primer for understanding and applying data mining /00/$ IEEE 9) Shruti Kapil,Meenu Chawla,Mohd Dilshad Ansari, On K-means Data Clustering Algorithm with genetic algorithm /16/$31.00@2016 IEEE 10) Majid Mashir Malik,M.asger Ghazi, Rashid Ali, Privacy Preserving Data Mining Techniques: Current Scenario and FutureProspects,IEEE ) W. Li, Modified K-Means Clustering Algorithm, 2008 Congr. Image Signal Process., pp ,

Privacy Preserving Decision Tree Classification on Horizontal Partition Data

Privacy Preserving Decision Tree Classification on Horizontal Partition Data Privacy Preserving Decision Tree Classification on Horizontal Partition Kamini D. Tandel Shri S ad Vidya Mandal Institute of Technology Bharuch, Gujarat, India Jignasa N. Patel Shri S ad Vidya Mandal Institute

More information

Accumulative Privacy Preserving Data Mining Using Gaussian Noise Data Perturbation at Multi Level Trust

Accumulative Privacy Preserving Data Mining Using Gaussian Noise Data Perturbation at Multi Level Trust Accumulative Privacy Preserving Data Mining Using Gaussian Noise Data Perturbation at Multi Level Trust G.Mareeswari 1, V.Anusuya 2 ME, Department of CSE, PSR Engineering College, Sivakasi, Tamilnadu,

More information

A Review on Privacy Preserving Data Mining Approaches

A Review on Privacy Preserving Data Mining Approaches A Review on Privacy Preserving Data Mining Approaches Anu Thomas Asst.Prof. Computer Science & Engineering Department DJMIT,Mogar,Anand Gujarat Technological University Anu.thomas@djmit.ac.in Jimesh Rana

More information

Best Keyword Cover Search

Best Keyword Cover Search Vennapusa Mahesh Kumar Reddy Dept of CSE, Benaiah Institute of Technology and Science. Best Keyword Cover Search Sudhakar Babu Pendhurthi Assistant Professor, Benaiah Institute of Technology and Science.

More information

Partition Based Perturbation for Privacy Preserving Distributed Data Mining

Partition Based Perturbation for Privacy Preserving Distributed Data Mining BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 17, No 2 Sofia 2017 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.1515/cait-2017-0015 Partition Based Perturbation

More information

Secure Multiparty Computation Introduction to Privacy Preserving Distributed Data Mining

Secure Multiparty Computation Introduction to Privacy Preserving Distributed Data Mining CS573 Data Privacy and Security Secure Multiparty Computation Introduction to Privacy Preserving Distributed Data Mining Li Xiong Slides credit: Chris Clifton, Purdue University; Murat Kantarcioglu, UT

More information

Raunak Rathi 1, Prof. A.V.Deorankar 2 1,2 Department of Computer Science and Engineering, Government College of Engineering Amravati

Raunak Rathi 1, Prof. A.V.Deorankar 2 1,2 Department of Computer Science and Engineering, Government College of Engineering Amravati Analytical Representation on Secure Mining in Horizontally Distributed Database Raunak Rathi 1, Prof. A.V.Deorankar 2 1,2 Department of Computer Science and Engineering, Government College of Engineering

More information

Research Paper SECURED UTILITY ENHANCEMENT IN MINING USING GENETIC ALGORITHM

Research Paper SECURED UTILITY ENHANCEMENT IN MINING USING GENETIC ALGORITHM Research Paper SECURED UTILITY ENHANCEMENT IN MINING USING GENETIC ALGORITHM 1 Dr.G.Kirubhakar and 2 Dr.C.Venkatesh Address for Correspondence 1 Department of Computer Science and Engineering, Surya Engineering

More information

Research Trends in Privacy Preserving in Association Rule Mining (PPARM) On Horizontally Partitioned Database

Research Trends in Privacy Preserving in Association Rule Mining (PPARM) On Horizontally Partitioned Database 204 IJEDR Volume 2, Issue ISSN: 232-9939 Research Trends in Privacy Preserving in Association Rule Mining (PPARM) On Horizontally Partitioned Database Rachit Adhvaryu, 2 Nikunj Domadiya PG Student, 2 Professor

More information

Data mining overview. Data Mining. Data mining overview. Data mining overview. Data mining overview. Data mining overview 3/24/2014

Data mining overview. Data Mining. Data mining overview. Data mining overview. Data mining overview. Data mining overview 3/24/2014 Data Mining Data mining processes What technological infrastructure is required? Data mining is a system of searching through large amounts of data for patterns. It is a relatively new concept which is

More information

ADDITIVE GAUSSIAN NOISE BASED DATA PERTURBATION IN MULTI-LEVEL TRUST PRIVACY PRESERVING DATA MINING

ADDITIVE GAUSSIAN NOISE BASED DATA PERTURBATION IN MULTI-LEVEL TRUST PRIVACY PRESERVING DATA MINING ADDITIVE GAUSSIAN NOISE BASED DATA PERTURBATION IN MULTI-LEVEL TRUST PRIVACY PRESERVING DATA MINING R.Kalaivani #1,S.Chidambaram #2 # Department of Information Techology, National Engineering College,

More information

Accountability in Privacy-Preserving Data Mining

Accountability in Privacy-Preserving Data Mining PORTIA Privacy, Obligations, and Rights in Technologies of Information Assessment Accountability in Privacy-Preserving Data Mining Rebecca Wright Computer Science Department Stevens Institute of Technology

More information

Multilevel Data Aggregated Using Privacy Preserving Data mining

Multilevel Data Aggregated Using Privacy Preserving Data mining Multilevel Data Aggregated Using Privacy Preserving Data mining V.Nirupa Department of Computer Science and Engineering Madanapalle, Andhra Pradesh, India M.V.Jaganadha Reddy Department of Computer Science

More information

An Architecture for Privacy-preserving Mining of Client Information

An Architecture for Privacy-preserving Mining of Client Information An Architecture for Privacy-preserving Mining of Client Information Murat Kantarcioglu Jaideep Vaidya Department of Computer Sciences Purdue University 1398 Computer Sciences Building West Lafayette, IN

More information

Survey Result on Privacy Preserving Techniques in Data Publishing

Survey Result on Privacy Preserving Techniques in Data Publishing Survey Result on Privacy Preserving Techniques in Data Publishing S.Deebika PG Student, Computer Science and Engineering, Vivekananda College of Engineering for Women, Namakkal India A.Sathyapriya Assistant

More information

Automated Information Retrieval System Using Correlation Based Multi- Document Summarization Method

Automated Information Retrieval System Using Correlation Based Multi- Document Summarization Method Automated Information Retrieval System Using Correlation Based Multi- Document Summarization Method Dr.K.P.Kaliyamurthie HOD, Department of CSE, Bharath University, Tamilnadu, India ABSTRACT: Automated

More information

Emerging Measures in Preserving Privacy for Publishing The Data

Emerging Measures in Preserving Privacy for Publishing The Data Emerging Measures in Preserving Privacy for Publishing The Data K.SIVARAMAN 1 Assistant Professor, Dept. of Computer Science, BIST, Bharath University, Chennai -600073 1 ABSTRACT: The information in the

More information

PPKM: Preserving Privacy in Knowledge Management

PPKM: Preserving Privacy in Knowledge Management PPKM: Preserving Privacy in Knowledge Management N. Maheswari (Corresponding Author) P.G. Department of Computer Science Kongu Arts and Science College, Erode-638-107, Tamil Nadu, India E-mail: mahii_14@yahoo.com

More information

The Applicability of the Perturbation Model-based Privacy Preserving Data Mining for Real-world Data

The Applicability of the Perturbation Model-based Privacy Preserving Data Mining for Real-world Data The Applicability of the Perturbation Model-based Privacy Preserving Data Mining for Real-world Data Li Liu, Murat Kantarcioglu and Bhavani Thuraisingham Computer Science Department University of Texas

More information

Privacy-Preserving. Introduction to. Data Publishing. Concepts and Techniques. Benjamin C. M. Fung, Ke Wang, Chapman & Hall/CRC. S.

Privacy-Preserving. Introduction to. Data Publishing. Concepts and Techniques. Benjamin C. M. Fung, Ke Wang, Chapman & Hall/CRC. S. Chapman & Hall/CRC Data Mining and Knowledge Discovery Series Introduction to Privacy-Preserving Data Publishing Concepts and Techniques Benjamin C M Fung, Ke Wang, Ada Wai-Chee Fu, and Philip S Yu CRC

More information

Privacy-Preserving Data Mining in the Fully Distributed Model

Privacy-Preserving Data Mining in the Fully Distributed Model Privacy-Preserving Data Mining in the Fully Distributed Model Rebecca Wright Stevens Institute of Technology www.cs.stevens.edu/~rwright MADNES 05 22 September, 2005 (Includes joint work with Zhiqiang

More information

Secure Multiparty Computation

Secure Multiparty Computation CS573 Data Privacy and Security Secure Multiparty Computation Problem and security definitions Li Xiong Outline Cryptographic primitives Symmetric Encryption Public Key Encryption Secure Multiparty Computation

More information

A REVIEW ON VARIOUS APPROACHES OF CLUSTERING IN DATA MINING

A REVIEW ON VARIOUS APPROACHES OF CLUSTERING IN DATA MINING A REVIEW ON VARIOUS APPROACHES OF CLUSTERING IN DATA MINING Abhinav Kathuria Email - abhinav.kathuria90@gmail.com Abstract: Data mining is the process of the extraction of the hidden pattern from the data

More information

Enhanced Slicing Technique for Improving Accuracy in Crowdsourcing Database

Enhanced Slicing Technique for Improving Accuracy in Crowdsourcing Database Enhanced Slicing Technique for Improving Accuracy in Crowdsourcing Database T.Malathi 1, S. Nandagopal 2 PG Scholar, Department of Computer Science and Engineering, Nandha College of Technology, Erode,

More information

A FUZZY BASED APPROACH FOR PRIVACY PRESERVING CLUSTERING

A FUZZY BASED APPROACH FOR PRIVACY PRESERVING CLUSTERING A FUZZY BASED APPROACH FOR PRIVACY PRESERVING CLUSTERING 1 B.KARTHIKEYAN, 2 G.MANIKANDAN, 3 V.VAITHIYANATHAN 1 Assistant Professor, School of Computing, SASTRA University, TamilNadu, India. 2 Assistant

More information

CS573 Data Privacy and Security. Differential Privacy. Li Xiong

CS573 Data Privacy and Security. Differential Privacy. Li Xiong CS573 Data Privacy and Security Differential Privacy Li Xiong Outline Differential Privacy Definition Basic techniques Composition theorems Statistical Data Privacy Non-interactive vs interactive Privacy

More information

Improving the Efficiency of Fast Using Semantic Similarity Algorithm

Improving the Efficiency of Fast Using Semantic Similarity Algorithm International Journal of Scientific and Research Publications, Volume 4, Issue 1, January 2014 1 Improving the Efficiency of Fast Using Semantic Similarity Algorithm D.KARTHIKA 1, S. DIVAKAR 2 Final year

More information

PRIVACY PRESERVING DATA MINING BASED ON VECTOR QUANTIZATION

PRIVACY PRESERVING DATA MINING BASED ON VECTOR QUANTIZATION PRIVACY PRESERVING DATA MINING BASED ON VECTOR QUANTIZATION D.Aruna Kumari 1, Dr.K.Rajasekhara Rao 2, M.Suman 3 1,3 Department of Electronics and Computer Engineering, Associate professors,csi Life Member

More information

PRIVACY-PRESERVING MULTI-PARTY DECISION TREE INDUCTION

PRIVACY-PRESERVING MULTI-PARTY DECISION TREE INDUCTION PRIVACY-PRESERVING MULTI-PARTY DECISION TREE INDUCTION Justin Z. Zhan, LiWu Chang, Stan Matwin Abstract We propose a new scheme for multiple parties to conduct data mining computations without disclosing

More information

PRIVACY PRESERVING IN DISTRIBUTED DATABASE USING DATA ENCRYPTION STANDARD (DES)

PRIVACY PRESERVING IN DISTRIBUTED DATABASE USING DATA ENCRYPTION STANDARD (DES) PRIVACY PRESERVING IN DISTRIBUTED DATABASE USING DATA ENCRYPTION STANDARD (DES) Jyotirmayee Rautaray 1, Raghvendra Kumar 2 School of Computer Engineering, KIIT University, Odisha, India 1 School of Computer

More information

A Comparative study of Clustering Algorithms using MapReduce in Hadoop

A Comparative study of Clustering Algorithms using MapReduce in Hadoop A Comparative study of Clustering Algorithms using MapReduce in Hadoop Dweepna Garg 1, Khushboo Trivedi 2, B.B.Panchal 3 1 Department of Computer Science and Engineering, Parul Institute of Engineering

More information

Privacy Preserving Data Mining Technique and Their Implementation

Privacy Preserving Data Mining Technique and Their Implementation International Journal of Research Studies in Computer Science and Engineering (IJRSCSE) Volume 4, Issue 2, 2017, PP 14-19 ISSN 2349-4840 (Print) & ISSN 2349-4859 (Online) DOI: http://dx.doi.org/10.20431/2349-4859.0402003

More information

Privacy, Security & Ethical Issues

Privacy, Security & Ethical Issues Privacy, Security & Ethical Issues How do we mine data when we can t even look at it? 2 Individual Privacy Nobody should know more about any entity after the data mining than they did before Approaches:

More information

MTAT Research Seminar in Cryptography Building a secure aggregation database

MTAT Research Seminar in Cryptography Building a secure aggregation database MTAT.07.006 Research Seminar in Cryptography Building a secure aggregation database Dan Bogdanov University of Tartu, Institute of Computer Science 22.10.2006 1 Introduction This paper starts by describing

More information

Privacy-Preserving Algorithms for Distributed Mining of Frequent Itemsets

Privacy-Preserving Algorithms for Distributed Mining of Frequent Itemsets Privacy-Preserving Algorithms for Distributed Mining of Frequent Itemsets Sheng Zhong August 15, 2003 Abstract Standard algorithms for association rule mining are based on identification of frequent itemsets.

More information

. (1) N. supp T (A) = If supp T (A) S min, then A is a frequent itemset in T, where S min is a user-defined parameter called minimum support [3].

. (1) N. supp T (A) = If supp T (A) S min, then A is a frequent itemset in T, where S min is a user-defined parameter called minimum support [3]. An Improved Approach to High Level Privacy Preserving Itemset Mining Rajesh Kumar Boora Ruchi Shukla $ A. K. Misra Computer Science and Engineering Department Motilal Nehru National Institute of Technology,

More information

EFFICIENT RETRIEVAL OF DATA FROM CLOUD USING DATA PARTITIONING METHOD FOR BANKING APPLICATIONS [RBAC]

EFFICIENT RETRIEVAL OF DATA FROM CLOUD USING DATA PARTITIONING METHOD FOR BANKING APPLICATIONS [RBAC] EFFICIENT RETRIEVAL OF DATA FROM CLOUD USING DATA PARTITIONING METHOD FOR BANKING APPLICATIONS [RBAC] Rajalakshmi V., Jothi Nisha V. and Dhanalakshmi S. Faculty of Computing, Sathyabama University, Chennai,

More information

IJSER. Privacy and Data Mining

IJSER. Privacy and Data Mining Privacy and Data Mining 2177 Shilpa M.S Dept. of Computer Science Mohandas College of Engineering and Technology Anad,Trivandrum shilpams333@gmail.com Shalini.L Dept. of Computer Science Mohandas College

More information

Improving Privacy And Data Utility For High- Dimensional Data By Using Anonymization Technique

Improving Privacy And Data Utility For High- Dimensional Data By Using Anonymization Technique Improving Privacy And Data Utility For High- Dimensional Data By Using Anonymization Technique P.Nithya 1, V.Karpagam 2 PG Scholar, Department of Software Engineering, Sri Ramakrishna Engineering College,

More information

FREQUENT ITEMSET MINING USING PFP-GROWTH VIA SMART SPLITTING

FREQUENT ITEMSET MINING USING PFP-GROWTH VIA SMART SPLITTING FREQUENT ITEMSET MINING USING PFP-GROWTH VIA SMART SPLITTING Neha V. Sonparote, Professor Vijay B. More. Neha V. Sonparote, Dept. of computer Engineering, MET s Institute of Engineering Nashik, Maharashtra,

More information

Privacy and Security Ensured Rule Mining under Partitioned Databases

Privacy and Security Ensured Rule Mining under Partitioned Databases www.ijiarec.com ISSN:2348-2079 Volume-5 Issue-1 International Journal of Intellectual Advancements and Research in Engineering Computations Privacy and Security Ensured Rule Mining under Partitioned Databases

More information

International Journal of Scientific & Engineering Research, Volume 8, Issue 4, April-2017 ISSN V.Sathya, and Dr.V.

International Journal of Scientific & Engineering Research, Volume 8, Issue 4, April-2017 ISSN V.Sathya, and Dr.V. International Journal of Scientific & Engineering Research, Volume 8, Issue 4, April-2017 52 Encryption-Based Techniques For Privacy Preserving Data Mining V.Sathya, and Dr.V. Gayathiri ABSTRACT: In present

More information

Privacy in Distributed Database

Privacy in Distributed Database Privacy in Distributed Database Priyanka Pandey 1 & Javed Aktar Khan 2 1,2 Department of Computer Science and Engineering, Takshshila Institute of Engineering and Technology, Jabalpur, M.P., India pandeypriyanka906@gmail.com;

More information

Conceptual Review of clustering techniques in data mining field

Conceptual Review of clustering techniques in data mining field Conceptual Review of clustering techniques in data mining field Divya Shree ABSTRACT The marvelous amount of data produced nowadays in various application domains such as molecular biology or geography

More information

Correlation Based Feature Selection with Irrelevant Feature Removal

Correlation Based Feature Selection with Irrelevant Feature Removal Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,

More information

Comparison of Online Record Linkage Techniques

Comparison of Online Record Linkage Techniques International Research Journal of Engineering and Technology (IRJET) e-issn: 2395-0056 Volume: 02 Issue: 09 Dec-2015 p-issn: 2395-0072 www.irjet.net Comparison of Online Record Linkage Techniques Ms. SRUTHI.

More information

A Study on Data Perturbation Techniques in Privacy Preserving Data Mining

A Study on Data Perturbation Techniques in Privacy Preserving Data Mining A Study on Data Perturbation Techniques in Privacy Preserving Data Mining Nimpal Patel, Shreya Patel Student, Dept. Of Computer Engineering, Grow More Faculty of Engineering Himatnagar, Gujarat, India

More information

A Novel Identification Approach to Encryption Mode of Block Cipher Cheng Tan1, a, Yifu Li 2,b and Shan Yao*2,c

A Novel Identification Approach to Encryption Mode of Block Cipher Cheng Tan1, a, Yifu Li 2,b and Shan Yao*2,c th International Conference on Sensors, Mechatronics and Automation (ICSMA 16) A Novel Approach to Encryption Mode of Block Cheng Tan1, a, Yifu Li 2,b and Shan Yao*2,c 1 Science and Technology on Communication

More information

Implementation of Privacy Mechanism using Curve Fitting Method for Data Publishing in Health Care Domain

Implementation of Privacy Mechanism using Curve Fitting Method for Data Publishing in Health Care Domain Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 5, May 2014, pg.1105

More information

[Gidhane* et al., 5(7): July, 2016] ISSN: IC Value: 3.00 Impact Factor: 4.116

[Gidhane* et al., 5(7): July, 2016] ISSN: IC Value: 3.00 Impact Factor: 4.116 IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY AN EFFICIENT APPROACH FOR TEXT MINING USING SIDE INFORMATION Kiran V. Gaidhane*, Prof. L. H. Patil, Prof. C. U. Chouhan DOI: 10.5281/zenodo.58632

More information

Preserving Data Mining through Data Perturbation

Preserving Data Mining through Data Perturbation Preserving Data Mining through Data Perturbation Mr. Swapnil Kadam, Prof. Navnath Pokale Abstract Data perturbation, a widely employed and accepted Privacy Preserving Data Mining (PPDM) approach, tacitly

More information

Integration of information security and network data mining technology in the era of big data

Integration of information security and network data mining technology in the era of big data Acta Technica 62 No. 1A/2017, 157 166 c 2017 Institute of Thermomechanics CAS, v.v.i. Integration of information security and network data mining technology in the era of big data Lu Li 1 Abstract. The

More information

CLASSIFICATIONANDEVALUATION THE PRIVACY PRESERVING DISTRIBUTED DATA MININGTECHNIQUES

CLASSIFICATIONANDEVALUATION THE PRIVACY PRESERVING DISTRIBUTED DATA MININGTECHNIQUES CLASSIFICATIONANDEVALUATION THE PRIVACY PRESERVING DISTRIBUTED DATA MININGTECHNIQUES 1 SOMAYYEH SEIFI MORADI, 2 MOHAMMAD REZA KEYVANPOUR 1 Department of Computer Engineering, Qazvin University, Qazvin,

More information

Clustering and Association using K-Mean over Well-Formed Protected Relational Data

Clustering and Association using K-Mean over Well-Formed Protected Relational Data Clustering and Association using K-Mean over Well-Formed Protected Relational Data Aparna Student M.Tech Computer Science and Engineering Department of Computer Science SRM University, Kattankulathur-603203

More information

Secure Multiparty Computation

Secure Multiparty Computation Secure Multiparty Computation Li Xiong CS573 Data Privacy and Security Outline Secure multiparty computation Problem and security definitions Basic cryptographic tools and general constructions Yao s Millionnare

More information

Frequent Itemset Mining With PFP Growth Algorithm (Transaction Splitting)

Frequent Itemset Mining With PFP Growth Algorithm (Transaction Splitting) Frequent Itemset Mining With PFP Growth Algorithm (Transaction Splitting) Nikita Khandare 1 and Shrikant Nagure 2 1,2 Computer Department, RMDSOE Abstract Frequent sets play an important role in many Data

More information

Distributed Bottom up Approach for Data Anonymization using MapReduce framework on Cloud

Distributed Bottom up Approach for Data Anonymization using MapReduce framework on Cloud Distributed Bottom up Approach for Data Anonymization using MapReduce framework on Cloud R. H. Jadhav 1 P.E.S college of Engineering, Aurangabad, Maharashtra, India 1 rjadhav377@gmail.com ABSTRACT: Many

More information

A generic and distributed privacy preserving classification method with a worst-case privacy guarantee

A generic and distributed privacy preserving classification method with a worst-case privacy guarantee Distrib Parallel Databases (2014) 32:5 35 DOI 10.1007/s10619-013-7126-6 A generic and distributed privacy preserving classification method with a worst-case privacy guarantee Madhushri Banerjee Zhiyuan

More information

MINING ASSOCIATION RULE FOR HORIZONTALLY PARTITIONED DATABASES USING CK SECURE SUM TECHNIQUE

MINING ASSOCIATION RULE FOR HORIZONTALLY PARTITIONED DATABASES USING CK SECURE SUM TECHNIQUE MINING ASSOCIATION RULE FOR HORIZONTALLY PARTITIONED DATABASES USING CK SECURE SUM TECHNIQUE Jayanti Danasana 1, Raghvendra Kumar 1 and Debadutta Dey 1 1 School of Computer Engineering, KIIT University,

More information

Computational Time Analysis of K-mean Clustering Algorithm

Computational Time Analysis of K-mean Clustering Algorithm Computational Time Analysis of K-mean Clustering Algorithm 1 Praveen Kumari, 2 Hakam Singh, 3 Pratibha Sharma 1 Student Mtech, CSE 4 th SEM, 2 Assistant professor CSE, 3 Assistant professor CSE Career

More information

Dynamic Optimization of Generalized SQL Queries with Horizontal Aggregations Using K-Means Clustering

Dynamic Optimization of Generalized SQL Queries with Horizontal Aggregations Using K-Means Clustering Dynamic Optimization of Generalized SQL Queries with Horizontal Aggregations Using K-Means Clustering Abstract Mrs. C. Poongodi 1, Ms. R. Kalaivani 2 1 PG Student, 2 Assistant Professor, Department of

More information

Reconstruction-based Classification Rule Hiding through Controlled Data Modification

Reconstruction-based Classification Rule Hiding through Controlled Data Modification Reconstruction-based Classification Rule Hiding through Controlled Data Modification Aliki Katsarou, Aris Gkoulalas-Divanis, and Vassilios S. Verykios Abstract In this paper, we propose a reconstruction

More information

Privacy Preserving Two-Layer Decision Tree Classifier for Multiparty Databases

Privacy Preserving Two-Layer Decision Tree Classifier for Multiparty Databases Privacy Preserving Two-Layer Decision Tree Classifier for Multiparty Databases Alka Gangrade T.I.T.-M.C.A. Technocrats Institute of Technology Bhopal, India alkagangrade@yahoo.co.in Ravindra Patel Dept.

More information

Chapter 1, Introduction

Chapter 1, Introduction CSI 4352, Introduction to Data Mining Chapter 1, Introduction Young-Rae Cho Associate Professor Department of Computer Science Baylor University What is Data Mining? Definition Knowledge Discovery from

More information

Privacy Preserving Data Mining

Privacy Preserving Data Mining Privacy Preserving Data Mining D. Aruna Kumari 1, Dr. K. Rajasekhara Rao 2, M. Suman 3 B. V. Susmitha 4, ASNS Ramya 5 and VVS Mounika 1,3 Department of Electronics and Computer Engineering, Associate professors,csi

More information

Clustering of Data with Mixed Attributes based on Unified Similarity Metric

Clustering of Data with Mixed Attributes based on Unified Similarity Metric Clustering of Data with Mixed Attributes based on Unified Similarity Metric M.Soundaryadevi 1, Dr.L.S.Jayashree 2 Dept of CSE, RVS College of Engineering and Technology, Coimbatore, Tamilnadu, India 1

More information

CSC 5930/9010 Cloud S & P: Cloud Primitives

CSC 5930/9010 Cloud S & P: Cloud Primitives CSC 5930/9010 Cloud S & P: Cloud Primitives Professor Henry Carter Spring 2017 Methodology Section This is the most important technical portion of a research paper Methodology sections differ widely depending

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Privacy preserving data mining Li Xiong Slides credits: Chris Clifton Agrawal and Srikant 4/3/2011 1 Privacy Preserving Data Mining Privacy concerns about personal data AOL

More information

The Two Dimensions of Data Privacy Measures

The Two Dimensions of Data Privacy Measures The Two Dimensions of Data Privacy Measures Abstract Orit Levin Page 1 of 9 Javier Salido Corporat e, Extern a l an d Lega l A ffairs, Microsoft This paper describes a practical framework for the first

More information

Privacy Preserving Naïve Bayes Classifier for Horizontally Distribution Scenario Using Un-trusted Third Party

Privacy Preserving Naïve Bayes Classifier for Horizontally Distribution Scenario Using Un-trusted Third Party IOSR Journal of Computer Engineering (IOSRJCE) ISSN: 2278-0661, ISBN: 2278-8727, Volume 7, Issue 6 (Nov. - Dec. 2012), PP 04-12 Privacy Preserving Naïve Bayes Classifier for Horizontally Distribution Scenario

More information

CERIAS Tech Report

CERIAS Tech Report CERIAS Tech Report 25-96 A FRAMEWORK FOR EVALUATING PRIVACY PRESERVING DATA MINING ALGORITHMS by ELISA BERTINO, IGOR NAI FOVINO, LOREDANA PARASILITI PROVENZA Center for Education and Research in Information

More information

Distributed Data Mining with Differential Privacy

Distributed Data Mining with Differential Privacy Distributed Data Mining with Differential Privacy Ning Zhang, Ming Li, Wenjing Lou Department of Electrical and Computer Engineering, Worcester Polytechnic Institute, MA Email: {ning, mingli}@wpi.edu,

More information

Privacy Preserving Mining Techniques and Precaution for Data Perturbation

Privacy Preserving Mining Techniques and Precaution for Data Perturbation Privacy Preserving Mining Techniques and Precaution for Data Perturbation J.P. Maurya 1, Sandeep Kumar 2, Sushanki Nikhade 3 1 Asst.Prof. C.S Dept., IES BHOPAL 2 Mtech Scholar, IES BHOPAL 1 jpeemaurya@gmail.com,

More information

IEEE 2013 JAVA PROJECTS Contact No: KNOWLEDGE AND DATA ENGINEERING

IEEE 2013 JAVA PROJECTS  Contact No: KNOWLEDGE AND DATA ENGINEERING IEEE 2013 JAVA PROJECTS www.chennaisunday.com Contact No: 9566137117 KNOWLEDGE AND DATA ENGINEERING (DATA MINING) 1. A Fast Clustering-Based Feature Subset Selection Algorithm for High Dimensional Data

More information

Co-clustering for differentially private synthetic data generation

Co-clustering for differentially private synthetic data generation Co-clustering for differentially private synthetic data generation Tarek Benkhelif, Françoise Fessant, Fabrice Clérot and Guillaume Raschia January 23, 2018 Orange Labs & LS2N Journée thématique EGC &

More information

Preserving Privacy Using Gradient Descent Methods Applied for Neural Network with Distributed Datasets

Preserving Privacy Using Gradient Descent Methods Applied for Neural Network with Distributed Datasets www.ijcsi.org 728 Preserving Privacy Using Gradient Descent Methods Applied for Neural Network with Distributed Datasets Mr.Sachin P.Yadav, Mr.Amit B.Chougule 1 D.Y.Patil College of Engineering Kolhapur,Maharashtra,India

More information

m-privacy for Collaborative Data Publishing

m-privacy for Collaborative Data Publishing m-privacy for Collaborative Data Publishing Slawomir Goryczka Emory University Email: sgorycz@emory.edu Li Xiong Emory University Email: lxiong@emory.edu Benjamin C. M. Fung Concordia University Email:

More information

Data Anonymization. Graham Cormode.

Data Anonymization. Graham Cormode. Data Anonymization Graham Cormode graham@research.att.com 1 Why Anonymize? For Data Sharing Give real(istic) data to others to study without compromising privacy of individuals in the data Allows third-parties

More information

Iteration Reduction K Means Clustering Algorithm

Iteration Reduction K Means Clustering Algorithm Iteration Reduction K Means Clustering Algorithm Kedar Sawant 1 and Snehal Bhogan 2 1 Department of Computer Engineering, Agnel Institute of Technology and Design, Assagao, Goa 403507, India 2 Department

More information

DISCLOSURE PROTECTION OF SENSITIVE ATTRIBUTES IN COLLABORATIVE DATA MINING V. Uma Rani *1, Dr. M. Sreenivasa Rao *2, V. Theresa Vinayasheela *3

DISCLOSURE PROTECTION OF SENSITIVE ATTRIBUTES IN COLLABORATIVE DATA MINING V. Uma Rani *1, Dr. M. Sreenivasa Rao *2, V. Theresa Vinayasheela *3 www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 3 Issue 5 May, 2014 Page No. 5594-5599 DISCLOSURE PROTECTION OF SENSITIVE ATTRIBUTES IN COLLABORATIVE DATA MINING

More information

Implementation of Aggregate Function in Multi Dimension Privacy Preservation Algorithms for OLAP

Implementation of Aggregate Function in Multi Dimension Privacy Preservation Algorithms for OLAP 324 Implementation of Aggregate Function in Multi Dimension Privacy Preservation Algorithms for OLAP Shivaji Yadav(131322) Assistant Professor, CSE Dept. CSE, IIMT College of Engineering, Greater Noida,

More information

NON-CENTRALIZED DISTINCT L-DIVERSITY

NON-CENTRALIZED DISTINCT L-DIVERSITY NON-CENTRALIZED DISTINCT L-DIVERSITY Chi Hong Cheong 1, Dan Wu 2, and Man Hon Wong 3 1,3 Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong {chcheong, mhwong}@cse.cuhk.edu.hk

More information

CS573 Data Privacy and Security. Cryptographic Primitives and Secure Multiparty Computation. Li Xiong

CS573 Data Privacy and Security. Cryptographic Primitives and Secure Multiparty Computation. Li Xiong CS573 Data Privacy and Security Cryptographic Primitives and Secure Multiparty Computation Li Xiong Outline Cryptographic primitives Symmetric Encryption Public Key Encryption Secure Multiparty Computation

More information

An Approach for Privacy Preserving in Association Rule Mining Using Data Restriction

An Approach for Privacy Preserving in Association Rule Mining Using Data Restriction International Journal of Engineering Science Invention Volume 2 Issue 1 January. 2013 An Approach for Privacy Preserving in Association Rule Mining Using Data Restriction Janakiramaiah Bonam 1, Dr.RamaMohan

More information

Randomized Response Technique in Data Mining

Randomized Response Technique in Data Mining Randomized Response Technique in Data Mining Monika Soni Arya College of Engineering and IT, Jaipur(Raj.) 12.monika@gmail.com Vishal Shrivastva Arya College of Engineering and IT, Jaipur(Raj.) vishal500371@yahoo.co.in

More information

Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data

Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data Ms. Gayatri Attarde 1, Prof. Aarti Deshpande 2 M. E Student, Department of Computer Engineering, GHRCCEM, University

More information

Sanitization Techniques against Personal Information Inference Attack on Social Network

Sanitization Techniques against Personal Information Inference Attack on Social Network Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 12, December 2014,

More information

A Framework for Securing Databases from Intrusion Threats

A Framework for Securing Databases from Intrusion Threats A Framework for Securing Databases from Intrusion Threats R. Prince Jeyaseelan James Department of Computer Applications, Valliammai Engineering College Affiliated to Anna University, Chennai, India Email:

More information

A Review on Cluster Based Approach in Data Mining

A Review on Cluster Based Approach in Data Mining A Review on Cluster Based Approach in Data Mining M. Vijaya Maheswari PhD Research Scholar, Department of Computer Science Karpagam University Coimbatore, Tamilnadu,India Dr T. Christopher Assistant professor,

More information

International Journal of Scientific Research & Engineering Trends Volume 4, Issue 6, Nov-Dec-2018, ISSN (Online): X

International Journal of Scientific Research & Engineering Trends Volume 4, Issue 6, Nov-Dec-2018, ISSN (Online): X Analysis about Classification Techniques on Categorical Data in Data Mining Assistant Professor P. Meena Department of Computer Science Adhiyaman Arts and Science College for Women Uthangarai, Krishnagiri,

More information

PRIVACY PRESERVING ANALYSIS OF GRAPH STRUCTURED DATA

PRIVACY PRESERVING ANALYSIS OF GRAPH STRUCTURED DATA PRIVACY PRESERVING ANALYSIS OF GRAPH STRUCTURED DATA by Xiaoyun He A dissertation submitted to the Graduate school-newark Rutgers, The State University of New Jersey in partial fulfillment of the requirements

More information

IMPLEMENTATION OF UNIFY ALGORITHM FOR MINING OF ASSOCIATION RULES IN PARTITIONED DATABASES

IMPLEMENTATION OF UNIFY ALGORITHM FOR MINING OF ASSOCIATION RULES IN PARTITIONED DATABASES IMPLEMENTATION OF UNIFY ALGORITHM FOR MINING OF ASSOCIATION RULES IN PARTITIONED DATABASES KATARU MANI 1, N. JAYA KRISHNA 2 1 Mtech Student, Department of CSE. EMAIL: manicsit@gmail.com 2 Assistant Professor,

More information

On Privacy-Preservation of Text and Sparse Binary Data with Sketches

On Privacy-Preservation of Text and Sparse Binary Data with Sketches On Privacy-Preservation of Text and Sparse Binary Data with Sketches Charu C. Aggarwal Philip S. Yu Abstract In recent years, privacy preserving data mining has become very important because of the proliferation

More information

Verifying Result Correctness of Outsourced Frequent Item set Mining using Rob Frugal Algorithm

Verifying Result Correctness of Outsourced Frequent Item set Mining using Rob Frugal Algorithm Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology ISSN 2320 088X IMPACT FACTOR: 5.258 IJCSMC,

More information

UAPRIORI: AN ALGORITHM FOR FINDING SEQUENTIAL PATTERNS IN PROBABILISTIC DATA

UAPRIORI: AN ALGORITHM FOR FINDING SEQUENTIAL PATTERNS IN PROBABILISTIC DATA UAPRIORI: AN ALGORITHM FOR FINDING SEQUENTIAL PATTERNS IN PROBABILISTIC DATA METANAT HOOSHSADAT, SAMANEH BAYAT, PARISA NAEIMI, MAHDIEH S. MIRIAN, OSMAR R. ZAÏANE Computing Science Department, University

More information

Separation Shielding and Content Supporting Locus Positioned Problems

Separation Shielding and Content Supporting Locus Positioned Problems Separation Shielding and Content Supporting Locus Positioned Problems Urlam Sridhar Assistant Professor, Department of CSE, Sri Venkateswara College of Engineering and Technology. ABSTRACT: In this paper

More information

International Journal of Modern Engineering and Research Technology

International Journal of Modern Engineering and Research Technology Volume 2, Issue 4, October 2015 ISSN: 2348-8565 (Online) International Journal of Modern Engineering and Research Technology Website: http://www.ijmert.org Privacy Preservation in Data Mining Using Mixed

More information

Void main Technologies

Void main Technologies Sno Title Domain 1. A Cross Tenant Access Control (CTAC) Model for Cloud Computing: Formal Specification and Verification 2. A Lightweight Secure Data Sharing Scheme for Mobile Cloud Computing 3. A Modified

More information

International Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.7, No.3, May Dr.Zakea Il-Agure and Mr.Hicham Noureddine Itani

International Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.7, No.3, May Dr.Zakea Il-Agure and Mr.Hicham Noureddine Itani LINK MINING PROCESS Dr.Zakea Il-Agure and Mr.Hicham Noureddine Itani Higher Colleges of Technology, United Arab Emirates ABSTRACT Many data mining and knowledge discovery methodologies and process models

More information

Computer Based Image Algorithm For Wireless Sensor Networks To Prevent Hotspot Locating Attack

Computer Based Image Algorithm For Wireless Sensor Networks To Prevent Hotspot Locating Attack Computer Based Image Algorithm For Wireless Sensor Networks To Prevent Hotspot Locating Attack J.Anbu selvan 1, P.Bharat 2, S.Mathiyalagan 3 J.Anand 4 1, 2, 3, 4 PG Scholar, BIT, Sathyamangalam ABSTRACT:

More information