ADVANCES in NATURAL and APPLIED SCIENCES
ISSN:
Published by AENSI Publication
EISSN:
May 11(7): pages
Open Access Journal

A Privacy Preserving Data Mining Approach for Handling Data with Outliers

1 V.V. Vishnu Priya, 2 A.K. Ilavarasi, 3 Dr. B. Sathiya Bhama
1 PG Student, CSE, Sona College of Technology, Salem, India.
2 Assistant Professor, CSE, Sona College of Technology, Salem, India.
3 HOD, CSE, Sona College of Technology, Salem, India.

Received 28 January 2017; Accepted 22 May 2017; Available online 28 May 2017

Address For Correspondence: V.V. Vishnu Priya, PG Student, CSE, Sona College of Technology, Salem, India. vishnupriya31.v@gmail.com

Copyright 2017 by authors and American-Eurasian Network for Scientific Information (AENSI Publication). This work is licensed under the Creative Commons Attribution International License (CC BY).

ABSTRACT
Organizations publish their private data for research analysis, and publishing such datasets raises serious data privacy concerns. The published data may contain outliers. Because outliers are easily identifiable, an adversary can capture private information about an individual by linking the outlier record with attributes published in an external database. The motivation of this work is to prevent the disclosure of such sensitive information. A distinguishability attack occurs when the published dataset contains outliers. Syntactic privacy models could not prevent this attack, whereas plain l-diversity can defend against it. However, the existing plain l-diversity protects the dataset from the distinguishability attack at the cost of information loss. In this paper we improve the algorithm so that information loss is minimal, using the K Nearest Neighbour algorithm.

KEYWORDS: Privacy, Data Mining, Outliers, Data Sharing.

INTRODUCTION
Data Mining [1] is the process of extracting knowledge from data stored in large repositories.
The Privacy Preserving Data Mining (PPDM) problem [2] has gained importance in recent years because huge amounts of information about individuals are stored at different vendors for research purposes. PPDM is a new research topic in data mining and in statistical databases, in which data mining algorithms are analyzed to check whether they preserve the privacy of the data. Protecting an individual's data from disclosure is the central function needed to maintain privacy, so privacy plays a major role in the data mining process. The problem is that the output of data mining can reveal an individual's personal data, which threatens that individual's privacy. People expect that their personal information will not become known to others without their knowledge, but conventional data mining algorithms do not protect the privacy of individuals.

Privacy is defined as the right of an individual to keep their sensitive information from being disclosed. Privacy requires that, given a set of records, an adversary should not be able to identify the person associated with a record. The results of data mining operations are sensitive. Privacy is one of the important properties [4] that a system needs to satisfy. For this purpose, numerous efforts have been devoted to PPDM algorithms that protect information from being disclosed.

One of the basic data mining problems is outlier detection. An outlier is an observation point that deviates from the other observations, that is, from the rest of the data [6]. Outliers can be novel, abnormal, unusual or noisy information. Outliers may be real or erroneous.

To Cite This Article: V.V. Vishnu Priya, A.K. Ilavarasi, Dr. B. Sathiya Bhama., A Privacy Preserving Data Mining Approach for Handling Data with Outliers. Advances in Natural and Applied Sciences. 11(7); Pages:
V.V. Vishnu Priya et al., 2017/Advances in Natural and Applied Sciences. 11(7) May 2017, Pages:

Related work:
In recent years privacy preserving data publishing has drawn increasing attention. To protect [3] the privacy of individuals, a dataset must be anonymized before it is released. A previous study [3] showed that removing explicit identifiers such as name and SSN (Social Security Number) from the dataset is not enough to maintain privacy, because quasi-identifiers such as zip code and gender can jointly identify a person. The identity of the person can be revealed easily when the released data is compared with a public dataset (e.g. a voter list).

Sweeney [5] proposed the k-anonymity method, which is treated as the conventional method for anonymization. A quasi-identifier consists of person-specific attributes. k-anonymity is achieved using generalization and suppression, so that each individual is indistinguishable from at least k-1 other records. Generalization replaces a value with a less specific but semantically consistent value. Suppression [8] reduces the exactness of applications and does not release the value at all. The k-anonymity method protects against the linkage attack. The authors of [10][5] showed that removing quasi-identifiers from the dataset does not by itself ensure privacy, so they suggested that the k-anonymity method is better suited for publishing microdata. The author of [9] suggested a novel bottom-up approach to group the quasi-identifiers. The k-anonymity model [5] has been proved to be theoretically NP-hard. Two types of attack remain possible against it: the background knowledge attack and the homogeneity attack.

The l-diversity model [7] was introduced to protect against attribute disclosure. It requires distinct, well-represented values of the sensitive attribute in each equivalence class. Improved methods such as t-closeness, p-sensitive anonymity and (k,e)-anonymity [11] are described there.
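To make the k-anonymity and l-diversity conditions described above concrete, the following is a minimal Python sketch, not the implementation of any cited paper. The toy records, the attribute names (`zip`, `gender`, `disease`) and the helper names are assumptions chosen for illustration, and the l-diversity check uses the simplest "distinct values" reading of the model.

```python
from collections import defaultdict

def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        groups[key].append(rec)
    return groups

def is_k_anonymous(records, quasi_identifiers, k):
    """Every equivalence class must contain at least k records."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(len(g) >= k for g in groups.values())

def is_distinct_l_diverse(records, quasi_identifiers, sensitive, l):
    """Every equivalence class must hold at least l distinct sensitive values."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(len({r[sensitive] for r in g}) >= l for g in groups.values())

def generalize_zip(zip_code, keep=3):
    """Generalization: keep a prefix and mask the remaining digits."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)

# Hypothetical microdata (illustrative values only).
raw = [
    {"zip": "63601", "gender": "F", "disease": "flu"},
    {"zip": "63605", "gender": "F", "disease": "asthma"},
    {"zip": "63612", "gender": "M", "disease": "flu"},
    {"zip": "63619", "gender": "M", "disease": "flu"},
]
generalized = [dict(r, zip=generalize_zip(r["zip"])) for r in raw]

qi = ["zip", "gender"]
print(is_k_anonymous(raw, qi, 2))                               # every record is unique
print(is_k_anonymous(generalized, qi, 2))                       # classes of size 2
print(is_distinct_l_diverse(generalized, qi, "disease", 2))     # one class is homogeneous
```

In this toy table the generalized data is 2-anonymous, yet the male class contains only one disease value, which is exactly the homogeneity attack that motivates l-diversity.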
Besides the l-diversity model, several other approaches have been proposed to achieve privacy [13,15,16,18]. They are classified into partitioning methods and randomization methods. In the partitioning-based method, the dataset [15,18] is divided into quasi-identifier groups and only the anonymized groups are published. To increase the utility of the anonymized dataset, a non-homogeneous generalization method was proposed by Koudas [12]. In the randomization approach the original values are replaced with noise or duplicate values [15]. Li et al. proposed in the t-closeness method [14] that the distribution of sensitive values in the released dataset must be close to the original distribution. If outliers are present in the original dataset, they therefore appear in both the original and the modified dataset, and in this way the outliers can be easily detected using the distribution. A few studies show the possibility of attacks on partition-based schemes. Machanavajjhala et al. [17] described some of the attacks on k-anonymity and proposed l-diversity. Our work adapts the l-diversity model.

In recent years a new privacy model known as differential privacy [20] has emerged. Under differential privacy, the removal or addition of any single record does not significantly affect the outcome of an analysis over the dataset [19]. Numerous techniques have been proposed to publish different types of data while satisfying differential privacy [21,23]. Barak et al. [21] propose a method to publish the marginals of a dataset. Blum et al. [22] proposed an approach for releasing one-dimensional data that satisfies differential privacy in a non-interactive way. Hay et al. [24] improve the performance of [22]. The wavelet-based approach [25] is used by Xiao et al. for publishing multi-dimensional datasets.
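As an illustration of the differential privacy idea mentioned above, the sketch below applies the Laplace mechanism to a counting query. This is a standard textbook construction and not a method taken from this paper; the function names, the example records and the choice of epsilon are assumptions. A count changes by at most 1 when one record is added or removed, so noise with scale 1/epsilon is used.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon, rng=None):
    """Noisy count of records matching `predicate`.

    A counting query has sensitivity 1 (one record changes the count by
    at most 1), so the Laplace scale is 1 / epsilon.
    """
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical records: ages of individuals in a published table.
ages = [23, 35, 41, 52, 67, 29, 58]
rng = random.Random(0)  # fixed seed only to make the demo repeatable
noisy = private_count(ages, lambda a: a >= 50, epsilon=1.0, rng=rng)
print(noisy)  # the true count is 3; the released value is perturbed
```

Smaller values of epsilon give stronger privacy but noisier answers, which is the same privacy versus information-loss trade-off that the partition-based methods face.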
3. Privacy preserving method for data containing Outliers:
Organizations are increasingly publishing microdata that contains non-aggregated information about individuals. Non-aggregated data that contains outliers raises serious data privacy concerns. When outliers exist in the dataset, they are easier to distinguish from the crowd, and privacy is breached. In a distinguishability-based attack the adversary identifies the outliers and reveals their private information from an anonymized dataset. The existing plain l-diversity protects the dataset from the distinguishability attack, but it results in information loss because it hides only the hideable outliers present in the dataset.

In l-diversity, all records that share the same quasi-identifier values should have distinct values for their sensitive attributes. In a previous study [3] using the l-diversity method, the QI attributes are generalized and the outliers present in the data are hidden in order to maintain privacy. But generalizing and hiding data that contains outliers results in information loss, since only the hideable outliers are hidden and the unhideable outliers are eliminated. In the proposed system the information loss is reduced by enhancing the l-diversity model with the KNN algorithm. The proposed system consists of five steps, described in Fig. 3.1.
Fig. 3.1: System Architecture of Proposed System (Load Dataset, Cluster Records, KNN Classifier, Find Outliers, Group Outliers to Nearest Cluster)

In the proposed system the dataset is first imported and fuzzy clustering is applied. Fuzzy clustering groups the data into n clusters such that each data point belongs to each cluster to a certain degree; in simple words, each point can belong to more than one cluster. Fuzzy clustering is applied because it helps each data point move to its nearest cluster. The KNN classifier is then applied to find the outliers. Fuzzy clustering finds the combination weights, membership functions and cluster centres that minimize the objective function. Outliers are observation points that deviate from the others. KNN classifies the new cases (outliers) based on a distance function, and the outliers present in the data are moved to their nearest bucket (cluster).

KNN algorithm steps:
1. Determine the parameter K, the number of nearest neighbours.
2. Calculate the distance between the query instance and all training samples.
3. Find the nearest neighbours by sorting the distances, and gather the categories of the nearest neighbours.
4. Use the simple majority of the categories of the nearest neighbours as the prediction value of the query instance.

RESULT AND DISCUSSION
The input is the Adult dataset, which is first loaded; the fuzzy clustering method is then applied to cluster the records. It is used in particular for mapping the outliers to their nearest cluster.

Fig. 4.1: Raw Dataset
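The KNN steps listed in Section 3 can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' code: the paper does not state how the KNN distances are turned into an outlier decision, so the mean distance to the k nearest neighbours is used here as an assumed outlier score, the cluster centres are assumed to come from a preceding clustering step, and all names and sample points are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k):
    """Steps 1-4: distances to all samples, sort, keep k, majority vote."""
    distances = [(math.dist(p, query), label)
                 for p, label in zip(train_points, train_labels)]
    nearest = sorted(distances, key=lambda d: d[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def knn_outlier_scores(points, k):
    """Assumed outlier score: mean Euclidean distance to the k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

def nearest_cluster(point, centres):
    """Index of the cluster centre closest to `point`."""
    return min(range(len(centres)), key=lambda c: math.dist(point, centres[c]))

# Hypothetical 2-D records: two tight groups plus one far-away point.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8), (5, 20)]
labels = [0, 0, 0, 1, 1, 1]                 # cluster labels of the first six points
centres = [(4 / 3, 4 / 3), (25 / 3, 25 / 3)]  # means of the two groups

scores = knn_outlier_scores(points, k=2)
outlier_index = max(range(len(points)), key=lambda i: scores[i])
outlier = points[outlier_index]

print(outlier)                                         # point with the largest score
print(knn_predict(points[:6], labels, outlier, k=3))   # majority label of its neighbours
print(nearest_cluster(outlier, centres))               # bucket the outlier is moved to
```

Instead of suppressing the outlying record, the sketch assigns it to the closest existing cluster, which mirrors the idea of moving outliers to their nearest bucket to limit information loss.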
Fig. 4.2: Fuzzy Clustering
Fig. 4.3: Assigning K value for KNN classification
Fig. 4.4: Outlier Detection
Fig. 4.5: Outlier Mapping

After that, the KNN classifier is applied and the outliers are found based on their distance. The Euclidean distance is used to calculate the distance between records. The outliers are then moved to their nearest bucket (cluster). In this way the privacy of the dataset is maintained using l-diversity, and information loss is reduced using the KNN classifier by assigning the outliers to their nearest clusters.

Table 1 describes the performance in terms of the information loss metric. The information loss is analyzed in terms of the outlier detection error ratio, with results in Figure 4.6. Information loss is defined as,
1 Loss = D 2 O 1 + N DC 1 where, N DC ( D 2, N DC D ) 2
D: Dimensionality of the vector (2,3,4,5,...)
O: Outlier
NDc: The number of training samples per class (>D+1)

Table 1: Comparisons of Information Loss during number of runs
Methods: Existing, Proposed

Table 2 describes the performance of the system using the silhouette metric, with results in Figure 4.7. Accuracy is defined as,

$$\mathrm{Accuracy}(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}$$

where a(i) is the cluster similarity of i (its average dissimilarity to the other members of its own cluster), and b(i) is the lowest average dissimilarity of i to any other cluster of which i is not a member. The cluster with this lowest average dissimilarity is said to be the neighbouring cluster of i, because it is the next best fit cluster for point i.

Table 2: Comparisons of Cluster Accuracy during number of runs
Methods: Existing, Proposed

Table 3 describes the performance of the system using the time metric. Computational time is analyzed in terms of outlier detection, with results in Figure 4.8. Computational Time (CT) is defined as

CT = Process End Time - Process Start Time

Table 3: Comparisons of Computational Time (CPU seconds) during number of runs
Methods: Existing, Proposed

Fig. 4.6: Information loss Graph

The information loss is reduced compared with the existing method.
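The silhouette-based accuracy and the computational time defined above can be computed as in the following sketch. It is an illustration under stated assumptions, not the evaluation code of the paper: Euclidean distance is assumed as the dissimilarity, at least two clusters are assumed to exist, a point alone in its cluster is given the value 0 by convention, and the sample points and names are hypothetical.

```python
import math
import time

def silhouette(i, points, labels):
    """Silhouette value (b - a) / max(a, b) of point i."""
    own = labels[i]
    same = [math.dist(points[i], p)
            for j, p in enumerate(points) if labels[j] == own and j != i]
    if not same:  # singleton cluster: value 0 by convention
        return 0.0
    a = sum(same) / len(same)  # average dissimilarity within the own cluster
    b = min(                   # lowest average dissimilarity to another cluster
        sum(math.dist(points[i], p) for j, p in enumerate(points) if labels[j] == c)
        / sum(1 for lab in labels if lab == c)
        for c in set(labels) if c != own
    )
    return (b - a) / max(a, b)

# Hypothetical clustering result: two well-separated clusters.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]

start = time.perf_counter()                 # process start time
values = [silhouette(i, points, labels) for i in range(len(points))]
elapsed = time.perf_counter() - start       # CT = end time - start time

print(sum(values) / len(values))  # mean silhouette, close to 1 for tight clusters
print(elapsed)                    # elapsed seconds, never negative
```

Values near 1 indicate that a point sits well inside its own cluster, values near 0 indicate a point on the border between two clusters, and negative values suggest the point would fit the neighbouring cluster better.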
Fig. 4.7: Accuracy Graph
Fig. 4.8: Compilation Time Graph

Conclusion:
In this paper the problem of publishing data with outliers in a privacy-preserving fashion is studied. The microdata containing outliers are published in a privacy preserving way. The existing plain l-diversity system provides privacy only for the hideable outliers, and it results in information loss. In this paper we improved the algorithm using the K Nearest Neighbour algorithm to reduce information loss.

REFERENCES
1. Han, J. and M. Kamber, Data Mining: Concepts and Techniques, 2nd ed., The Morgan Kaufmann Series in Data Management Systems, Jim Gray, Series Editor.
2. Ankita Shrivastava, U. Dutta, An Emblematic Study of Different Techniques in PPDM, International Journal of Advanced Research in Computer Science and Software Engineering.
3. Hui (Wendy) Wang, Ruilin Liu, Hiding outliers into crowd: Privacy Preserving data publishing with outliers, Elsevier.
4. Elisa Bertino, Dan Lin, and Wei Jiang, A Survey of Quantification of Privacy Preserving Data Mining Algorithms.
5. Sweeney, L., k-anonymity: a model for protecting privacy, Int. J. Uncertain. Fuzziness Knowl. Based Syst., 10(5).
6. Williams, G., R. Baxter, H. He, S. Hawkins and L. Gu, A Comparative Study for RNN for Outlier Detection in Data Mining. In Proceedings of the 2nd IEEE International Conference on Data Mining, page 709, Maebashi City, Japan.
7. Tiancheng Li, Ninghui Li, Jian Zhang, Ian Molloy, "Slicing: A New Approach for Privacy Preserving Data Publishing", IEEE Transactions on Knowledge & Data Engineering, 24(3), doi: /tkde
8. Samarati, P., Protecting respondent's privacy in Microdata release, IEEE Transactions on Knowledge and Data Engineering, 13.
9. Tiancheng Li, Ninghui Li, Towards Optimal k-anonymization, Data & Knowledge Engineering, Elsevier. 303.
10. Machanavajjhala, A., J. Gehrke, D. Kifer, l-diversity: Privacy Beyond k-anonymity, ACM Transactions on Knowledge Discovery from Data.
11. Sergio Martínez, David Sánchez, Aida Valls, A semantic framework to protect the privacy of electronic health records with non-numerical attributes, Journal of Biomedical Informatics, 46.
12. Wong, W.K., N. Mamoulis, D.W.L. Cheung, Non-homogeneous generalization in privacy preserving data publishing, Proceedings of ACM International Conference on Special Interest Group on Management of Data (SIGMOD).
13. LeFevre, K., D.J. DeWitt, R. Ramakrishnan, Incognito: efficient full-domain k-anonymity, Proceedings of ACM International Conference on Special Interest Group on Management of Data (SIGMOD).
14. Li, N., T. Li, t-closeness: Privacy beyond k-anonymity and l-diversity, Proceedings of the International Conference on Data Engineering (ICDE).
15. Koudas, N., D. Srivastava, T. Yu, Q. Zhang, Distribution based microdata anonymization, Proc. VLDB Endow. 2(1).
16. Li, J., Y. Tao, X. Xiao, Preservation of proximity privacy in publishing numerical sensitive data, Proceedings of ACM International Conference on Special Interest Group on Management of Data (SIGMOD).
17. Li, N., T. Li, t-closeness: Privacy beyond k-anonymity and l-diversity, Proceedings of the IEEE 23rd International Conference on Data Engineering.
18. Chaytor, R., K. Wang, Small domain randomization: same privacy, more utility, Proc. VLDB Endow. 3(1-2).
19. Dwork, C., Differential privacy, Proceedings of International Colloquium on Automata, Languages and Programming (ICALP).
20. Dwork, C., F. McSherry, K. Nissim, A. Smith, Calibrating noise to sensitivity in private data analysis, Proceedings of the Conference on Theory of Cryptography (TCC).
21. Barak, B., K. Chaudhuri, C. Dwork, S. Kale, F. McSherry, K. Talwar, Privacy, accuracy, and consistency too: a holistic solution to contingency table release, Proceedings of ACM Symposium on Principles of Database Systems (PODS).
22. Blum, A., K. Ligett, A. Roth, A learning theory approach to non-interactive database privacy, Proceedings of the ACM Symposium on Theory of Computing (STOC).
23. Xiao, X., G. Bender, M. Hay, J. Gehrke, iReduct: differential privacy with reduced relative errors, Proceedings of ACM International Conference on Special Interest Group on Management of Data (SIGMOD).
24. Li, C., M. Hay, V. Rastogi, G. Miklau, A. McGregor, Optimizing linear counting queries under differential privacy, Proceedings of ACM Symposium on Principles of Database Systems (PODS).
25. Xiao, X., G. Wang, J. Gehrke, Differential privacy via wavelet transforms, IEEE Trans. Knowl. Data Eng., 23(8).
(α, k)-anonymity: An Enhanced k-anonymity Model for Privacy-Preserving Data Publishing Raymond Chi-Wing Wong, Jiuyong Li +, Ada Wai-Chee Fu and Ke Wang Department of Computer Science and Engineering +
More informationDynamic Optimization of Generalized SQL Queries with Horizontal Aggregations Using K-Means Clustering
Dynamic Optimization of Generalized SQL Queries with Horizontal Aggregations Using K-Means Clustering Abstract Mrs. C. Poongodi 1, Ms. R. Kalaivani 2 1 PG Student, 2 Assistant Professor, Department of
More informationPrivacy Preserving High-Dimensional Data Mashup Megala.K 1, Amudha.S 2 2. THE CHALLENGES 1. INTRODUCTION
Privacy Preserving High-Dimensional Data Mashup Megala.K 1, Amudha.S 2 1 PG Student, Department of Computer Science and Engineering, Sriram Engineering College, Perumpattu-602 024, Tamil Nadu, India 2
More informationGeneralization-Based Privacy-Preserving Data Collection
Generalization-Based Privacy-Preserving Data Collection Lijie Zhang and Weining Zhang Department of Computer Science, University of Texas at San Antonio {lijez,wzhang}@cs.utsa.edu Abstract. In privacy-preserving
More informationA Review on Privacy Preserving Data Mining Approaches
A Review on Privacy Preserving Data Mining Approaches Anu Thomas Asst.Prof. Computer Science & Engineering Department DJMIT,Mogar,Anand Gujarat Technological University Anu.thomas@djmit.ac.in Jimesh Rana
More informationC-NBC: Neighborhood-Based Clustering with Constraints
C-NBC: Neighborhood-Based Clustering with Constraints Piotr Lasek Chair of Computer Science, University of Rzeszów ul. Prof. St. Pigonia 1, 35-310 Rzeszów, Poland lasek@ur.edu.pl Abstract. Clustering is
More informationA Study on Reverse Top-K Queries Using Monochromatic and Bichromatic Methods
A Study on Reverse Top-K Queries Using Monochromatic and Bichromatic Methods S.Anusuya 1, M.Balaganesh 2 P.G. Student, Department of Computer Science and Engineering, Sembodai Rukmani Varatharajan Engineering
More informationDENSITY BASED AND PARTITION BASED CLUSTERING OF UNCERTAIN DATA BASED ON KL-DIVERGENCE SIMILARITY MEASURE
DENSITY BASED AND PARTITION BASED CLUSTERING OF UNCERTAIN DATA BASED ON KL-DIVERGENCE SIMILARITY MEASURE Sinu T S 1, Mr.Joseph George 1,2 Computer Science and Engineering, Adi Shankara Institute of Engineering
More informationData Anonymization - Generalization Algorithms
Data Anonymization - Generalization Algorithms Li Xiong CS573 Data Privacy and Anonymity Generalization and Suppression Z2 = {410**} Z1 = {4107*. 4109*} Generalization Replace the value with a less specific
More informationImplementation of Aggregate Function in Multi Dimension Privacy Preservation Algorithms for OLAP
324 Implementation of Aggregate Function in Multi Dimension Privacy Preservation Algorithms for OLAP Shivaji Yadav(131322) Assistant Professor, CSE Dept. CSE, IIMT College of Engineering, Greater Noida,
More informationResearch Paper SECURED UTILITY ENHANCEMENT IN MINING USING GENETIC ALGORITHM
Research Paper SECURED UTILITY ENHANCEMENT IN MINING USING GENETIC ALGORITHM 1 Dr.G.Kirubhakar and 2 Dr.C.Venkatesh Address for Correspondence 1 Department of Computer Science and Engineering, Surya Engineering
More informationPrivacy-Preserving of Check-in Services in MSNS Based on a Bit Matrix
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 15, No 2 Sofia 2015 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.1515/cait-2015-0032 Privacy-Preserving of Check-in
More informationMeasuring and Evaluating Dissimilarity in Data and Pattern Spaces
Measuring and Evaluating Dissimilarity in Data and Pattern Spaces Irene Ntoutsi, Yannis Theodoridis Database Group, Information Systems Laboratory Department of Informatics, University of Piraeus, Greece
More informationPrivacy Preserving Health Data Mining
IJCST Vo l. 6, Is s u e 4, Oc t - De c 2015 ISSN : 0976-8491 (Online) ISSN : 2229-4333 (Print) Privacy Preserving Health Data Mining 1 Somy.M.S, 2 Gayatri.K.S, 3 Ashwini.B 1,2,3 Dept. of CSE, Mar Baselios
More informationPRIVACY PRESERVING IN DISTRIBUTED DATABASE USING DATA ENCRYPTION STANDARD (DES)
PRIVACY PRESERVING IN DISTRIBUTED DATABASE USING DATA ENCRYPTION STANDARD (DES) Jyotirmayee Rautaray 1, Raghvendra Kumar 2 School of Computer Engineering, KIIT University, Odisha, India 1 School of Computer
More informationPufferfish: A Semantic Approach to Customizable Privacy
Pufferfish: A Semantic Approach to Customizable Privacy Ashwin Machanavajjhala ashwin AT cs.duke.edu Collaborators: Daniel Kifer (Penn State), Bolin Ding (UIUC, Microsoft Research) idash Privacy Workshop
More informationData Privacy in Big Data Applications. Sreagni Banerjee CS-846
Data Privacy in Big Data Applications Sreagni Banerjee CS-846 Outline! Motivation! Goal and Approach! Introduction to Big Data Privacy! Privacy preserving methods in Big Data Application! Progress! Next
More informationAchieving k-anonmity* Privacy Protection Using Generalization and Suppression
UT DALLAS Erik Jonsson School of Engineering & Computer Science Achieving k-anonmity* Privacy Protection Using Generalization and Suppression Murat Kantarcioglu Based on Sweeney 2002 paper Releasing Private
More informationMicrodata Publishing with Algorithmic Privacy Guarantees
Microdata Publishing with Algorithmic Privacy Guarantees Tiancheng Li and Ninghui Li Department of Computer Science, Purdue University 35 N. University Street West Lafayette, IN 4797-217 {li83,ninghui}@cs.purdue.edu
More informationInfrequent Weighted Itemset Mining Using SVM Classifier in Transaction Dataset
Infrequent Weighted Itemset Mining Using SVM Classifier in Transaction Dataset M.Hamsathvani 1, D.Rajeswari 2 M.E, R.Kalaiselvi 3 1 PG Scholar(M.E), Angel College of Engineering and Technology, Tiruppur,
More informationIDENTITY DISCLOSURE PROTECTION IN DYNAMIC NETWORKS USING K W STRUCTURAL DIVERSITY ANONYMITY
IDENTITY DISCLOSURE PROTECTION IN DYNAMIC NETWORKS USING K W STRUCTURAL DIVERSITY ANONYMITY Gowthamy.R 1* and Uma.P 2 *1 M.E.Scholar, Department of Computer Science & Engineering Nandha Engineering College,
More informationNormalization based K means Clustering Algorithm
Normalization based K means Clustering Algorithm Deepali Virmani 1,Shweta Taneja 2,Geetika Malhotra 3 1 Department of Computer Science,Bhagwan Parshuram Institute of Technology,New Delhi Email:deepalivirmani@gmail.com
More informationSanitization Techniques against Personal Information Inference Attack on Social Network
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 12, December 2014,
More informationCo-clustering for differentially private synthetic data generation
Co-clustering for differentially private synthetic data generation Tarek Benkhelif, Françoise Fessant, Fabrice Clérot and Guillaume Raschia January 23, 2018 Orange Labs & LS2N Journée thématique EGC &
More informationAn efficient hash-based algorithm for minimal k-anonymity
An efficient hash-based algorithm for minimal k-anonymity Xiaoxun Sun Min Li Hua Wang Ashley Plank Department of Mathematics & Computing University of Southern Queensland Toowoomba, Queensland 4350, Australia
More informationAn Efficient Clustering for Crime Analysis
An Efficient Clustering for Crime Analysis Malarvizhi S 1, Siddique Ibrahim 2 1 UG Scholar, Department of Computer Science and Engineering, Kumaraguru College Of Technology, Coimbatore, Tamilnadu, India
More informationDynamic Clustering of Data with Modified K-Means Algorithm
2012 International Conference on Information and Computer Networks (ICICN 2012) IPCSIT vol. 27 (2012) (2012) IACSIT Press, Singapore Dynamic Clustering of Data with Modified K-Means Algorithm Ahamed Shafeeq
More informationGeneralizing PIR for Practical Private Retrieval of Public Data
Generalizing PIR for Practical Private Retrieval of Public Data Shiyuan Wang, Divyakant Agrawal, and Amr El Abbadi Department of Computer Science, UC Santa Barbara {sywang, agrawal, amr}@cs.ucsb.edu Abstract.
More informationISSN: (Online) Volume 4, Issue 1, January 2016 International Journal of Advance Research in Computer Science and Management Studies
ISSN: 2321-7782 (Online) Volume 4, Issue 1, January 2016 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online
More informationInternational Journal of Advance Engineering and Research Development CLUSTERING ON UNCERTAIN DATA BASED PROBABILITY DISTRIBUTION SIMILARITY
Scientific Journal of Impact Factor (SJIF): 5.71 International Journal of Advance Engineering and Research Development Volume 5, Issue 08, August -2018 e-issn (O): 2348-4470 p-issn (P): 2348-6406 CLUSTERING
More informationPrivate Database Synthesis for Outsourced System Evaluation
Private Database Synthesis for Outsourced System Evaluation Vani Gupta 1, Gerome Miklau 1, and Neoklis Polyzotis 2 1 Dept. of Computer Science, University of Massachusetts, Amherst, MA, USA 2 Dept. of
More informationComparison of Online Record Linkage Techniques
International Research Journal of Engineering and Technology (IRJET) e-issn: 2395-0056 Volume: 02 Issue: 09 Dec-2015 p-issn: 2395-0072 www.irjet.net Comparison of Online Record Linkage Techniques Ms. SRUTHI.
More informationA Survey on: Privacy Preserving Mining Implementation Techniques
A Survey on: Privacy Preserving Mining Implementation Techniques Mukesh Kumar Dangi P.G. Scholar Computer Science & Engineering, Millennium Institute of Technology Bhopal E-mail-mukeshlncit1987@gmail.com
More informationCorrelation Based Feature Selection with Irrelevant Feature Removal
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,
More informationInternational Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.7, No.3, May Dr.Zakea Il-Agure and Mr.Hicham Noureddine Itani
LINK MINING PROCESS Dr.Zakea Il-Agure and Mr.Hicham Noureddine Itani Higher Colleges of Technology, United Arab Emirates ABSTRACT Many data mining and knowledge discovery methodologies and process models
More informationAn Approach for Privacy Preserving in Association Rule Mining Using Data Restriction
International Journal of Engineering Science Invention Volume 2 Issue 1 January. 2013 An Approach for Privacy Preserving in Association Rule Mining Using Data Restriction Janakiramaiah Bonam 1, Dr.RamaMohan
More informationPrivacy and Security Ensured Rule Mining under Partitioned Databases
www.ijiarec.com ISSN:2348-2079 Volume-5 Issue-1 International Journal of Intellectual Advancements and Research in Engineering Computations Privacy and Security Ensured Rule Mining under Partitioned Databases
More informationSensitive Label Privacy Protection on Social Network Data
Sensitive Label Privacy Protection on Social Network Data Yi Song, Panagiotis Karras, Qian Xiao, and Stéphane Bressan School of Computing National University of Singapore {songyi, xiaoqian, steph}@nus.edu.sg
More informationCLUSTER BASED ANONYMIZATION FOR PRIVACY PRESERVATION IN SOCIAL NETWORK DATA COMMUNITY
CLUSTER BASED ANONYMIZATION FOR PRIVACY PRESERVATION IN SOCIAL NETWORK DATA COMMUNITY 1 V.VIJEYA KAVERI, 2 Dr.V.MAHESWARI 1 Research Scholar, Sathyabama University, Chennai 2 Prof., Department of Master
More informationInternational Journal of Research in Advent Technology, Vol.7, No.3, March 2019 E-ISSN: Available online at
Performance Evaluation of Ensemble Method Based Outlier Detection Algorithm Priya. M 1, M. Karthikeyan 2 Department of Computer and Information Science, Annamalai University, Annamalai Nagar, Tamil Nadu,
More informationPRIVACY PROTECTION OF FREQUENTLY USED DATA SENSITIVE IN CLOUD SEVER
PRIVACY PROTECTION OF FREQUENTLY USED DATA SENSITIVE IN CLOUD SEVER T.Pavithra M.E Computer Science And Engineering S.A.Engineering College, Chennai-600077, Tamil Nadu. Mrs.G.Umarani Ph.D Professor & Head
More informationPPKM: Preserving Privacy in Knowledge Management
PPKM: Preserving Privacy in Knowledge Management N. Maheswari (Corresponding Author) P.G. Department of Computer Science Kongu Arts and Science College, Erode-638-107, Tamil Nadu, India E-mail: mahii_14@yahoo.com
More informationAmbiguity: Hide the Presence of Individuals and Their Privacy with Low Information Loss
: Hide the Presence of Individuals and Their Privacy with Low Information Loss Hui (Wendy) Wang Department of Computer Science Stevens Institute of Technology Hoboken, NJ, USA hwang@cs.stevens.edu Abstract
More informationAnonymization Algorithms - Microaggregation and Clustering
Anonymization Algorithms - Microaggregation and Clustering Li Xiong CS573 Data Privacy and Anonymity Anonymization using Microaggregation or Clustering Practical Data-Oriented Microaggregation for Statistical
More information