ADVANCES in NATURAL and APPLIED SCIENCES


ADVANCES in NATURAL and APPLIED SCIENCES
ISSN: Published BY AENSI Publication EISSN: May 11(7): pages
Open Access Journal

A Privacy Preserving Data Mining Approach for Handling Data with Outliers

1 V.V. Vishnu Priya, 2 A.K. Ilavarasi, 3 Dr. B. Sathiya Bhama
1 PG Student, CSE, Sona College of Technology, Salem, India.
2 Assistant Professor, CSE, Sona College of Technology, Salem, India.
3 HOD, CSE, Sona College of Technology, Salem, India.

Received 28 January 2017; Accepted 22 May 2017; Available online 28 May 2017

Address For Correspondence: V.V. Vishnu Priya, PG Student, CSE, Sona College of Technology, Salem, India. vishnupriya31.v@gmail.com

Copyright 2017 by authors and American-Eurasian Network for Scientific Information (AENSI Publication). This work is licensed under the Creative Commons Attribution International License (CC BY).

ABSTRACT
Organizations publish their private data for research analysis. Publishing datasets for analysis raises serious data privacy concerns. The published data may contain outliers. Because outliers are easily identifiable, an adversary can obtain private information about an individual by linking an outlier with attributes published in an external database. The motivation of this work is to prevent the disclosure of sensitive information. A distinguishability attack can occur when a published dataset contains outliers. Syntactic privacy models could not prevent the attack, whereas plain l-diversity can defend against it. The existing plain l-diversity protects the dataset from the distinguishability attack, but it results in information loss. In this paper we improve the algorithm so that information loss is minimal, using the K Nearest Neighbour algorithm.

KEYWORDS: Privacy, Data Mining, Outliers, Data Sharing.

INTRODUCTION
Data mining [1] is the process of extracting knowledge from data stored in large repositories.
The privacy preserving data mining (PPDM) problem [2] has received growing attention in recent years because huge amounts of information about individuals are stored by different vendors for research purposes. PPDM is a new research topic in data mining and statistical databases, in which data mining algorithms are analyzed to check whether they preserve the privacy of the data. Protecting individuals' data from disclosure is considered essential to maintaining privacy, so privacy plays a major role in the data mining process. The problem is that data mining output can reveal individuals' personal data, which threatens their privacy. People expect that their personal information will not become known to others without their knowledge, but data mining algorithms have failed to protect the privacy of individuals. Privacy is defined as the right of an individual to keep their sensitive information from being disclosed. Privacy requires that, given a set of records, an adversary should not be able to identify the person associated with a record. The results of data mining operations are sensitive. Privacy is one of the important properties [4] that a system needs to satisfy. For this purpose, numerous efforts have been devoted to PPDM algorithms that protect information from being disclosed. One of the basic data mining problems is outlier detection. An outlier is an observation that deviates from the other observations, or from the rest of the data [6]. Outliers can be novel, abnormal, unusual or noisy information, and they may be real or erroneous.

To Cite This Article: V.V. Vishnu Priya, A.K. Ilavarasi, Dr. B. Sathiya Bhama, A Privacy Preserving Data Mining Approach for Handling Data with Outliers. Advances in Natural and Applied Sciences. 11(7); Pages:

V.V. Vishnu Priya et al., 2017/Advances in Natural and Applied Sciences. 11(7) May 2017, Pages:

Related work:
In recent years privacy preserving data publishing has drawn more attention. To protect the privacy of individuals [3], a dataset must be anonymized before it is released. A previous study [3] showed that removing explicit identifiers such as name and SSN (Social Security Number) from the dataset is not enough to maintain privacy, because quasi-identifiers such as zip code and gender can jointly identify a person. The identity of the person can be revealed easily when the released data is compared with a public dataset (e.g. a voter list). Sweeney [5] proposed the k-anonymity method, which is treated as the conventional method for anonymization. A quasi-identifier consists of person-specific attribute information. k-anonymity is achieved using generalization and suppression, so that each individual is indistinguishable from at least k-1 other records. Generalization replaces a value with a less specific but semantically consistent value. Suppression [8] does not release the value at all, which reduces the exactness of applications. k-anonymity of this kind protects against the linkage attack. The authors of [10][5] showed that removing quasi-identifiers from the dataset does not ensure privacy, and suggested that the k-anonymity method is better suited to publishing microdata. The author of [9] suggested a novel bottom-up approach to group the quasi-identifiers. The k-anonymity problem [5] has been proved to be NP-hard. Two types of attack remain possible against k-anonymity: the background knowledge attack and the homogeneity attack. The l-diversity model [7] was introduced to protect against attribute disclosure; it requires each equivalence class to contain at least l well-represented values of the sensitive attribute. Improved methods such as t-closeness, p-sensitive anonymity and (k,e)-anonymity are described in [11].
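As a minimal sketch of the k-anonymity idea described above, the following Python fragment generalizes two quasi-identifiers and checks whether every quasi-identifier combination occurs at least k times. The toy records, the `generalize` rule (masking zip digits, bucketing age by decade) and the function names are illustrative assumptions, not taken from the paper.

```python
from collections import Counter

def generalize(record):
    """Hypothetical generalization: mask the last two zip digits, bucket age by decade."""
    zip_code, age, disease = record
    decade = (age // 10) * 10
    return (zip_code[:3] + "**", f"{decade}-{decade + 9}", disease)

def is_k_anonymous(records, qi_indexes, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(r[i] for i in qi_indexes) for r in records)
    return all(count >= k for count in groups.values())

# Toy microdata: (zip code, age, sensitive attribute). Values are made up.
raw = [
    ("63601", 23, "flu"),
    ("63605", 27, "cancer"),
    ("63712", 41, "flu"),
    ("63718", 45, "asthma"),
]
released = [generalize(r) for r in raw]

raw_ok = is_k_anonymous(raw, qi_indexes=(0, 1), k=2)            # every raw record is unique
released_ok = is_k_anonymous(released, qi_indexes=(0, 1), k=2)  # groups of size 2 after generalization
```

In this toy table each raw record is unique on (zip, age), so a linkage attack would succeed; after generalization each record shares its quasi-identifier values with one other record.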
Like the l-diversity model, several other approaches have been proposed to achieve privacy [13,15,16,18]. They are classified as partitioning methods and randomization methods. In the partitioning-based method, the dataset [15,18] is divided into quasi-identifier groups and only the anonymized groups are published. To increase the utility of the anonymized dataset, a non-homogeneous generalization method was proposed by Koudas [12]. In the randomization approach, the original values are replaced with noise or duplicate values [15]. Li et al. proposed in the t-closeness method [14] that the distribution of sensitive values in the released dataset must be close to that in the original. If outliers are present in the original dataset, they appear in both the original and the modified dataset, so the outliers can be easily detected using the distribution. A few studies show the possibility of attacks on partition-based schemes. Machanavajjhala et al. [17] described some of the attacks on k-anonymity and proposed l-diversity. Our work adapts the l-diversity model. In recent years a new privacy model known as differential privacy [20] has emerged. Under differential privacy, the removal or addition of any one record does not significantly affect the published output [19]. Numerous techniques have been proposed to publish different types of data while satisfying differential privacy [21,23]. Barak et al. [21] propose a method to publish the marginals of a dataset. Blum et al. [22] proposed an approach for releasing one-dimensional data that satisfies differential privacy in a non-interactive way. Hay et al. [24] improve the performance of [22]. The wavelet-based approach [25] is used by Xiao et al. for publishing multi-dimensional datasets.
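To illustrate the differential privacy property mentioned above, here is a hedged sketch of a noisy count query using the Laplace mechanism, a standard construction that this paper does not itself describe. It assumes a counting query (adding or removing one record changes the count by at most 1) and a hypothetical privacy parameter `epsilon`; the data and function names are illustrative only.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(records, predicate, epsilon, rng):
    """Count matching records, then add Laplace noise of scale 1/epsilon.

    Assumption: the query is a count, so one record changes it by at most 1.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)                      # fixed seed so the sketch is repeatable
ages = [23, 27, 41, 45, 52, 38, 61, 29]     # made-up data
true_count = sum(1 for a in ages if a >= 40)

# Each released answer is perturbed; the noise averages out over many draws.
answers = [noisy_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng) for _ in range(5000)]
mean_answer = sum(answers) / len(answers)
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of accuracy.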
3. Privacy preserving method for data containing Outliers:
Organizations are increasingly publishing microdata that contains non-aggregated information about individuals. Non-aggregated data that contains outliers raises serious data privacy concerns. When outliers exist in the dataset, they are easier to distinguish from the crowd, and privacy is breached. In a distinguishability-based attack, the adversary identifies outliers and reveals their private information from an anonymized dataset. The existing plain l-diversity protects the dataset from the distinguishability attack, but it results in information loss since it hides only the hideable outliers present in the dataset. In l-diversity, all records that share the same quasi-identifier values should have distinct values for their sensitive attributes. In a previous study [3] using the l-diversity method, the QI attributes are generalized and the outliers present in the data are hidden in order to maintain privacy. But generalizing and hiding data that contains outliers results in information loss, since only the hideable outliers are hidden and the unhideable outliers are eliminated. In the proposed system the information loss is reduced by enhancing the l-diversity model with the KNN algorithm. The proposed system consists of five steps, shown in Fig. 3.1.
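The l-diversity condition stated above can be checked with a short sketch. This version tests the simplest ("distinct") form, where each quasi-identifier group must contain at least l different sensitive values; the table contents and the function name `is_l_diverse` are assumptions made for illustration, not the paper's implementation.

```python
from collections import defaultdict

def is_l_diverse(records, qi_indexes, sensitive_index, l):
    """True if every quasi-identifier group holds at least l distinct sensitive values."""
    groups = defaultdict(set)
    for r in records:
        key = tuple(r[i] for i in qi_indexes)
        groups[key].add(r[sensitive_index])
    return all(len(values) >= l for values in groups.values())

# Generalized toy table: (zip, age range, sensitive attribute). Values are made up.
homogeneous = [
    ("636**", "20-29", "flu"),
    ("636**", "20-29", "cancer"),
    ("637**", "40-49", "flu"),
    ("637**", "40-49", "flu"),     # second group has a single sensitive value
]
diverse = homogeneous[:3] + [("637**", "40-49", "asthma")]

homogeneous_ok = is_l_diverse(homogeneous, (0, 1), 2, l=2)
diverse_ok = is_l_diverse(diverse, (0, 1), 2, l=2)
```

The first table is 2-anonymous but still leaks the sensitive value of everyone in the second group, which is the homogeneity attack that l-diversity is meant to prevent.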

[Figure: Load Dataset; Cluster records; KNN Classifier; Find Outliers; Group outliers to nearest cluster]
Fig. 3.1: System Architecture of Proposed System

In the proposed system, the dataset is first imported and fuzzy clustering is applied. Fuzzy clustering groups the data into n clusters in which each data point belongs to each cluster to a certain degree. In simple words, each point can belong to more than one cluster. Fuzzy clustering is applied because it helps a data point move to its nearest cluster; the KNN classifier is then applied to find the outliers. Fuzzy clustering finds the combination weights, membership functions and cluster centres that minimize the objective function. Outliers are observations that deviate from the others. KNN classifies the new cases (outliers) based on a distance function. The outliers present in the data are moved to their nearest bucket (cluster).

KNN algorithm steps:
1. Determine the parameter K, the number of nearest neighbours.
2. Calculate the distance between the query instance and all training samples.
3. Sort the distances, find the K nearest neighbours and gather their categories.
4. Use the simple majority of the categories of the nearest neighbours as the prediction value of the query instance.

RESULT AND DISCUSSION
The input is the Adult dataset, which is first loaded; the fuzzy clustering method is then applied to cluster the records. It is used in particular for mapping the outliers to their nearest cluster.
Fig. 4.1: Raw Dataset
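The four KNN steps listed in Section 3 can be sketched as follows. This is a minimal illustration under stated assumptions: the two-dimensional points, the cluster labels and the name `knn_predict` are hypothetical, Euclidean distance is used as the distance function, and the cluster labels stand in for the output of the fuzzy clustering stage.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(query, samples, labels, k):
    """Assign the query to the majority category among its k nearest samples."""
    # Step 2: distance from the query instance to every training sample.
    distances = [(euclidean(query, s), label) for s, label in zip(samples, labels)]
    # Step 3: sort by distance and keep the k nearest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 4: simple majority vote over the neighbours' categories.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up clustered records; labels play the role of cluster (bucket) ids.
samples = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels = ["cluster_a", "cluster_a", "cluster_a", "cluster_b", "cluster_b", "cluster_b"]

outlier = (5.5, 6.0)                                   # lies between the two clusters
assigned = knn_predict(outlier, samples, labels, k=3)  # Step 1: K is chosen as 3
```

Choosing an odd K with two categories avoids tied votes; with more categories a tie-breaking rule would be needed.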

Fig. 4.2: Fuzzy Clustering
Fig. 4.3: Assigning K value for KNN classification
Fig. 4.4: Outlier Detection
Fig. 4.5: Outlier Mapping

After that, the KNN classifier is applied and the outliers are found based on distance. The Euclidean distance is used to calculate the distance between records. The outliers are then moved to their nearest bucket (cluster). In this way the privacy of the dataset is maintained using l-diversity, and information loss is reduced using the KNN classifier by assigning the outliers to their nearest clusters. Table 1 describes the performance in terms of the information loss metric. The information loss is analyzed in terms of the outlier detection error ratio, with results in Figure 4.6. Information loss is defined as, 1 Loss = D 2 O 1 + N DC 1 where, N DC ( D 2, N DC D ) 2

D: Dimensionality of the vector (2,3,4,5,...)
O: Outlier
NDc: The number of training samples per class (>D+1)

Table 1: Comparison of Information Loss over a number of runs
Methods: Existing, Proposed

Table 2 describes the performance of the system using the silhouette metric, shown in Figure 4.7. Accuracy is defined as,

$$\mathrm{Accuracy}(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}$$

where a(i) is the average dissimilarity of i to the other members of its own cluster, and b(i) is the lowest average dissimilarity of i to any other cluster, of which i is not a member. The cluster with this lowest average dissimilarity is said to be the neighbouring cluster of i because it is the next best fit cluster for point i.

Table 2: Comparison of Cluster Accuracy over a number of runs
Methods: Existing, Proposed

Table 3 describes the performance of the system using the time metric. Computational time is analyzed in terms of outlier detection, shown in Figure 4.8. Computational Time (CT) is defined as

CT = Process End Time - Process Start Time

Table 3: Comparison of Computational Time (CPU seconds) over a number of runs
Methods: Existing, Proposed

Fig. 4.6: Information loss Graph
The information loss is reduced compared to the existing method.
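The silhouette-based accuracy defined earlier in this section can be computed per point as in the sketch below. It assumes Euclidean distance as the dissimilarity, at least two clusters, and the common convention of returning 0 for a point that is alone in its cluster; the sample points and the name `silhouette` are illustrative and not taken from the paper's experiments.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(values):
    return sum(values) / len(values)

def silhouette(i, points, labels):
    """Accuracy(i) = (b(i) - a(i)) / max(a(i), b(i)) for point i."""
    own = labels[i]
    same = [euclidean(points[i], p)
            for j, p in enumerate(points) if labels[j] == own and j != i]
    if not same:
        return 0.0  # assumed convention for a singleton cluster
    a = mean(same)  # average dissimilarity to the rest of its own cluster
    # b: lowest average dissimilarity to any cluster that i does not belong to.
    b = min(
        mean([euclidean(points[i], p) for j, p in enumerate(points) if labels[j] == c])
        for c in set(labels) if c != own
    )
    denom = max(a, b)
    return 0.0 if denom == 0 else (b - a) / denom

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]  # made-up records
good_labels = [0, 0, 1, 1]
bad_labels = [0, 1, 1, 1]   # second point placed in the wrong cluster

well_placed = silhouette(0, points, good_labels)
misplaced = silhouette(1, points, bad_labels)
```

Values near 1 indicate a point that fits its cluster well, while negative values indicate a point that would fit the neighbouring cluster better.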

Fig. 4.7: Accuracy Graph
Fig. 4.8: Computational Time Graph

Conclusion:
In this paper, the problem of publishing data with outliers in a privacy-preserving fashion is studied, so that microdata containing outliers can be published in a privacy-preserving way. The existing plain l-diversity system provides privacy only for the hideable outliers and results in information loss. In this paper we improved the algorithm using the K Nearest Neighbour algorithm to reduce information loss.

REFERENCES
1. Han, J. and M. Kamber, Data Mining: Concepts and Techniques, 2nd ed., The Morgan Kaufmann Series in Data Management Systems, Jim Gray, Series Editor.
2. Ankita Shrivastava, U. Dutta, An Emblematic Study of Different Techniques in PPDM, International Journal of Advanced Research in Computer Science and Software Engineering.
3. Hui (Wendy) Wang, Ruilin Liu, Hiding outliers into crowd: Privacy Preserving data publishing with outliers, Elsevier.
4. Elisa Bertino, Dan Lin, and Wei Jiang, A Survey of Quantification of Privacy Preserving Data Mining Algorithms.
5. Sweeney, L., k-anonymity: a model for protecting privacy, Int. J. Uncertain. Fuzziness Knowl. Based Syst., 10(5):
6. Williams, G., R. Baxter, H. He, S. Hawkins and L. Gu, A Comparative Study for RNN for Outlier Detection in Data Mining. In Proceedings of the 2nd IEEE International Conference on Data Mining, page 709, Maebashi City, Japan.
7. Tiancheng Li, Ninghui Li, Jian Zhang, Ian Molloy, "Slicing: A New Approach for Privacy Preserving Data Publishing", IEEE Transactions on Knowledge & Data Engineering, 24(3): , doi: /tkde
8. Samarati, P., Protecting respondent's privacy in Microdata release, IEEE Transactions on Knowledge and Data Engineering, 13:
9. Tiancheng Li, Ninghui Li, Towards Optimal k-anonymization, Data & Knowledge Engineering, Elsevier. 303.

10. Machanavajjhala, A., J. Gehrke, D. Kifer, l-diversity: Privacy Beyond k-anonymity, ACM Transactions on Knowledge Discovery from Data, pp:
11. Sergio Martínez, David Sánchez, Aida Valls, A semantic framework to protect the privacy of electronic health records with non-numerical attributes, Journal of Biomedical Informatics, 46:
12. Wong, W.K., N. Mamoulis, D.W.L. Cheung, Non-homogeneous generalization in privacy preserving data publishing, Proceedings of ACM International Conference on Special Interest Group on Management of Data (SIGMOD) pp:
13. LeFevre, K., D.J. DeWitt, R. Ramakrishnan, Incognito: efficient full-domain k-anonymity, Proceedings of ACM International Conference on Special Interest Group on Management of Data (SIGMOD) pp:
14. Li, N., T. Li, t-closeness: Privacy beyond k-anonymity and l-diversity, Proceedings of the International Conference on Data Engineering (ICDE) pp:
15. Koudas, N., D. Srivastava, T. Yu, Q. Zhang, Distribution based microdata anonymization, Proc. VLDB Endow. 2(1):
16. Li, J., Y. Tao, X. Xiao, Preservation of proximity privacy in publishing numerical sensitive data, Proceedings of ACM International Conference on Special Interest Group on Management of Data (SIGMOD) pp:
17. Li, N., T. Li, t-closeness: Privacy beyond k-anonymity and l-diversity, Proceedings of the IEEE 23rd International Conference on Data Engineering, pp:
18. Chaytor, R., K. Wang, Small domain randomization: same privacy, more utility, Proc. VLDB Endow. 3(1-2):
19. Dwork, C., Differential privacy, Proceedings of International Colloquium on Automata, Languages and Programming (ICALP) pp:
20. Dwork, C., F. McSherry, K. Nissim, A. Smith, Calibrating noise to sensitivity in private data analysis, Proceedings of the Conference on Theory of Cryptography (TCC) pp:
21. Barak, B., K. Chaudhuri, C. Dwork, S. Kale, F. McSherry, K. Talwar, Privacy, accuracy, and consistency too: a holistic solution to contingency table release, Proceedings of ACM Symposium on Principles of Database Systems (PODS), pp:
22. Blum, A., K. Ligett, A. Roth, A learning theory approach to non-interactive database privacy, Proceedings of the ACM Symposium on Theory of Computing (STOC), pp:
23. Xiao, X., G. Bender, M. Hay, J. Gehrke, iReduct: differential privacy with reduced relative errors, Proceedings of ACM International Conference on Special Interest Group on Management of Data (SIGMOD), pp:
24. Li, C., M. Hay, V. Rastogi, G. Miklau, A. McGregor, Optimizing linear counting queries under differential privacy, Proceedings of ACM Symposium on Principles of Database Systems (PODS), pp:
25. Xiao, X., G. Wang, J. Gehrke, Differential privacy via wavelet transforms, IEEE Trans. Knowl. Data Eng., 23(8):

Improving Privacy And Data Utility For High- Dimensional Data By Using Anonymization Technique

Improving Privacy And Data Utility For High- Dimensional Data By Using Anonymization Technique Improving Privacy And Data Utility For High- Dimensional Data By Using Anonymization Technique P.Nithya 1, V.Karpagam 2 PG Scholar, Department of Software Engineering, Sri Ramakrishna Engineering College,

More information

International Journal of Modern Trends in Engineering and Research e-issn No.: , Date: 2-4 July, 2015

International Journal of Modern Trends in Engineering and Research   e-issn No.: , Date: 2-4 July, 2015 International Journal of Modern Trends in Engineering and Research www.ijmter.com e-issn No.:2349-9745, Date: 2-4 July, 2015 Privacy Preservation Data Mining Using GSlicing Approach Mr. Ghanshyam P. Dhomse

More information

Comparative Analysis of Anonymization Techniques

Comparative Analysis of Anonymization Techniques International Journal of Electronic and Electrical Engineering. ISSN 0974-2174 Volume 7, Number 8 (2014), pp. 773-778 International Research Publication House http://www.irphouse.com Comparative Analysis

More information

SMMCOA: Maintaining Multiple Correlations between Overlapped Attributes Using Slicing Technique

SMMCOA: Maintaining Multiple Correlations between Overlapped Attributes Using Slicing Technique SMMCOA: Maintaining Multiple Correlations between Overlapped Attributes Using Slicing Technique Sumit Jain 1, Abhishek Raghuvanshi 1, Department of information Technology, MIT, Ujjain Abstract--Knowledge

More information

Enhanced Slicing Technique for Improving Accuracy in Crowdsourcing Database

Enhanced Slicing Technique for Improving Accuracy in Crowdsourcing Database Enhanced Slicing Technique for Improving Accuracy in Crowdsourcing Database T.Malathi 1, S. Nandagopal 2 PG Scholar, Department of Computer Science and Engineering, Nandha College of Technology, Erode,

More information

Survey Result on Privacy Preserving Techniques in Data Publishing

Survey Result on Privacy Preserving Techniques in Data Publishing Survey Result on Privacy Preserving Techniques in Data Publishing S.Deebika PG Student, Computer Science and Engineering, Vivekananda College of Engineering for Women, Namakkal India A.Sathyapriya Assistant

More information

Survey of Anonymity Techniques for Privacy Preserving

Survey of Anonymity Techniques for Privacy Preserving 2009 International Symposium on Computing, Communication, and Control (ISCCC 2009) Proc.of CSIT vol.1 (2011) (2011) IACSIT Press, Singapore Survey of Anonymity Techniques for Privacy Preserving Luo Yongcheng

More information

A Review of Privacy Preserving Data Publishing Technique

A Review of Privacy Preserving Data Publishing Technique A Review of Privacy Preserving Data Publishing Technique Abstract:- Amar Paul Singh School of CSE Bahra University Shimla Hills, India Ms. Dhanshri Parihar Asst. Prof (School of CSE) Bahra University Shimla

More information

Privacy Preserving in Knowledge Discovery and Data Publishing

Privacy Preserving in Knowledge Discovery and Data Publishing B.Lakshmana Rao, G.V Konda Reddy and G.Yedukondalu 33 Privacy Preserving in Knowledge Discovery and Data Publishing B.Lakshmana Rao 1, G.V Konda Reddy 2, G.Yedukondalu 3 Abstract Knowledge Discovery is

More information

Distributed Data Anonymization with Hiding Sensitive Node Labels

Distributed Data Anonymization with Hiding Sensitive Node Labels Distributed Data Anonymization with Hiding Sensitive Node Labels C.EMELDA Research Scholar, PG and Research Department of Computer Science, Nehru Memorial College, Putthanampatti, Bharathidasan University,Trichy

More information

Implementation of Privacy Mechanism using Curve Fitting Method for Data Publishing in Health Care Domain

Implementation of Privacy Mechanism using Curve Fitting Method for Data Publishing in Health Care Domain Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 5, May 2014, pg.1105

More information

Slicing Technique For Privacy Preserving Data Publishing

Slicing Technique For Privacy Preserving Data Publishing Slicing Technique For Privacy Preserving Data Publishing D. Mohanapriya #1, Dr. T.Meyyappan M.Sc., MBA. M.Phil., Ph.d., 2 # Department of Computer Science and Engineering, Alagappa University, Karaikudi,

More information

Automated Information Retrieval System Using Correlation Based Multi- Document Summarization Method

Automated Information Retrieval System Using Correlation Based Multi- Document Summarization Method Automated Information Retrieval System Using Correlation Based Multi- Document Summarization Method Dr.K.P.Kaliyamurthie HOD, Department of CSE, Bharath University, Tamilnadu, India ABSTRACT: Automated

More information

Emerging Measures in Preserving Privacy for Publishing The Data

Emerging Measures in Preserving Privacy for Publishing The Data Emerging Measures in Preserving Privacy for Publishing The Data K.SIVARAMAN 1 Assistant Professor, Dept. of Computer Science, BIST, Bharath University, Chennai -600073 1 ABSTRACT: The information in the

More information

Secured Medical Data Publication & Measure the Privacy Closeness Using Earth Mover Distance (EMD)

Secured Medical Data Publication & Measure the Privacy Closeness Using Earth Mover Distance (EMD) Vol.2, Issue.1, Jan-Feb 2012 pp-208-212 ISSN: 2249-6645 Secured Medical Data Publication & Measure the Privacy Closeness Using Earth Mover Distance (EMD) Krishna.V #, Santhana Lakshmi. S * # PG Student,

More information

Accumulative Privacy Preserving Data Mining Using Gaussian Noise Data Perturbation at Multi Level Trust

Accumulative Privacy Preserving Data Mining Using Gaussian Noise Data Perturbation at Multi Level Trust Accumulative Privacy Preserving Data Mining Using Gaussian Noise Data Perturbation at Multi Level Trust G.Mareeswari 1, V.Anusuya 2 ME, Department of CSE, PSR Engineering College, Sivakasi, Tamilnadu,

More information

Distributed Bottom up Approach for Data Anonymization using MapReduce framework on Cloud

Distributed Bottom up Approach for Data Anonymization using MapReduce framework on Cloud Distributed Bottom up Approach for Data Anonymization using MapReduce framework on Cloud R. H. Jadhav 1 P.E.S college of Engineering, Aurangabad, Maharashtra, India 1 rjadhav377@gmail.com ABSTRACT: Many

More information

A Novel Technique for Privacy Preserving Data Publishing

A Novel Technique for Privacy Preserving Data Publishing Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 11, November 2014,

More information

Differential Privacy. Seminar: Robust Data Mining Techniques. Thomas Edlich. July 16, 2017

Differential Privacy. Seminar: Robust Data Mining Techniques. Thomas Edlich. July 16, 2017 Differential Privacy Seminar: Robust Techniques Thomas Edlich Technische Universität München Department of Informatics kdd.in.tum.de July 16, 2017 Outline 1. Introduction 2. Definition and Features of

More information

An Efficient Clustering Method for k-anonymization

An Efficient Clustering Method for k-anonymization An Efficient Clustering Method for -Anonymization Jun-Lin Lin Department of Information Management Yuan Ze University Chung-Li, Taiwan jun@saturn.yzu.edu.tw Meng-Cheng Wei Department of Information Management

More information

Demonstration of Damson: Differential Privacy for Analysis of Large Data

Demonstration of Damson: Differential Privacy for Analysis of Large Data Demonstration of Damson: Differential Privacy for Analysis of Large Data Marianne Winslett 1,2, Yin Yang 1,2, Zhenjie Zhang 1 1 Advanced Digital Sciences Center, Singapore {yin.yang, zhenjie}@adsc.com.sg

More information

FREQUENT ITEMSET MINING USING PFP-GROWTH VIA SMART SPLITTING

FREQUENT ITEMSET MINING USING PFP-GROWTH VIA SMART SPLITTING FREQUENT ITEMSET MINING USING PFP-GROWTH VIA SMART SPLITTING Neha V. Sonparote, Professor Vijay B. More. Neha V. Sonparote, Dept. of computer Engineering, MET s Institute of Engineering Nashik, Maharashtra,

More information

A FUZZY BASED APPROACH FOR PRIVACY PRESERVING CLUSTERING

A FUZZY BASED APPROACH FOR PRIVACY PRESERVING CLUSTERING A FUZZY BASED APPROACH FOR PRIVACY PRESERVING CLUSTERING 1 B.KARTHIKEYAN, 2 G.MANIKANDAN, 3 V.VAITHIYANATHAN 1 Assistant Professor, School of Computing, SASTRA University, TamilNadu, India. 2 Assistant

More information

Comparison and Analysis of Anonymization Techniques for Preserving Privacy in Big Data

Comparison and Analysis of Anonymization Techniques for Preserving Privacy in Big Data Advances in Computational Sciences and Technology ISSN 0973-6107 Volume 10, Number 2 (2017) pp. 247-253 Research India Publications http://www.ripublication.com Comparison and Analysis of Anonymization

More information

Privacy Preserving Data Publishing: From k-anonymity to Differential Privacy. Xiaokui Xiao Nanyang Technological University

Privacy Preserving Data Publishing: From k-anonymity to Differential Privacy. Xiaokui Xiao Nanyang Technological University Privacy Preserving Data Publishing: From k-anonymity to Differential Privacy Xiaokui Xiao Nanyang Technological University Outline Privacy preserving data publishing: What and Why Examples of privacy attacks

More information

ADDITIVE GAUSSIAN NOISE BASED DATA PERTURBATION IN MULTI-LEVEL TRUST PRIVACY PRESERVING DATA MINING

ADDITIVE GAUSSIAN NOISE BASED DATA PERTURBATION IN MULTI-LEVEL TRUST PRIVACY PRESERVING DATA MINING ADDITIVE GAUSSIAN NOISE BASED DATA PERTURBATION IN MULTI-LEVEL TRUST PRIVACY PRESERVING DATA MINING R.Kalaivani #1,S.Chidambaram #2 # Department of Information Techology, National Engineering College,

More information

Research Trends in Privacy Preserving in Association Rule Mining (PPARM) On Horizontally Partitioned Database

Research Trends in Privacy Preserving in Association Rule Mining (PPARM) On Horizontally Partitioned Database 204 IJEDR Volume 2, Issue ISSN: 232-9939 Research Trends in Privacy Preserving in Association Rule Mining (PPARM) On Horizontally Partitioned Database Rachit Adhvaryu, 2 Nikunj Domadiya PG Student, 2 Professor

More information

On Privacy-Preservation of Text and Sparse Binary Data with Sketches

On Privacy-Preservation of Text and Sparse Binary Data with Sketches On Privacy-Preservation of Text and Sparse Binary Data with Sketches Charu C. Aggarwal Philip S. Yu Abstract In recent years, privacy preserving data mining has become very important because of the proliferation

More information

Data attribute security and privacy in Collaborative distributed database Publishing

Data attribute security and privacy in Collaborative distributed database Publishing International Journal of Engineering Inventions e-issn: 2278-7461, p-issn: 2319-6491 Volume 3, Issue 12 (July 2014) PP: 60-65 Data attribute security and privacy in Collaborative distributed database Publishing

More information

Partition Based Perturbation for Privacy Preserving Distributed Data Mining

Partition Based Perturbation for Privacy Preserving Distributed Data Mining BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 17, No 2 Sofia 2017 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.1515/cait-2017-0015 Partition Based Perturbation

More information

Privacy Preserved Data Publishing Techniques for Tabular Data

Privacy Preserved Data Publishing Techniques for Tabular Data Privacy Preserved Data Publishing Techniques for Tabular Data Keerthy C. College of Engineering Trivandrum Sabitha S. College of Engineering Trivandrum ABSTRACT Almost all countries have imposed strict

More information

A Proposed Technique for Privacy Preservation by Anonymization Method Accomplishing Concept of K-Means Clustering and DES

A Proposed Technique for Privacy Preservation by Anonymization Method Accomplishing Concept of K-Means Clustering and DES A Proposed Technique for Privacy Preservation by Anonymization Method Accomplishing Concept of K-Means Clustering and DES Priyanka Pachauri 1, Unmukh Datta 2 1 Dept. of CSE/IT, MPCT College Gwalior, India

More information

(δ,l)-diversity: Privacy Preservation for Publication Numerical Sensitive Data

(δ,l)-diversity: Privacy Preservation for Publication Numerical Sensitive Data (δ,l)-diversity: Privacy Preservation for Publication Numerical Sensitive Data Mohammad-Reza Zare-Mirakabad Department of Computer Engineering Scool of Electrical and Computer Yazd University, Iran mzare@yazduni.ac.ir

More information
