Privacy Preserving Data Mining: An approach to safely share and use sensible medical data
1 Privacy Preserving Data Mining: An approach to safely share and use sensible medical data. Gerhard Kranner, Viscovery. Biomax Symposium, June 24th, 2016, Munich
2 Privacy protection vs knowledge gain. What is Privacy Preserving Data Mining? Terms and standards. Risks, limits, and issues. Data mining without need of data disclosure. Data abstraction with perceptual maps. Connectome example.
3 Privacy Preserving Data Mining. PPDM is the responsible use of data mining to extract useful knowledge from data without compromising data privacy. This implies being able to access, explore, and model sensitive data, and to share results and deploy analytical models, while observing legal and ethical standards and, in particular, preserving data confidentiality.
4 Basic terms. Pseudonymization: replace identifying fields within each data record by pseudonyms (artificial codes). De-identification: remove, mask, or generalize identifying information to prevent a person's identity from being connected with information. Anonymization: irreversibly remove the association between an identifying dataset and the data subject.
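As an illustration of the pseudonymization definition above, here is a minimal sketch that replaces identifying fields by artificial codes. The keyed hash (HMAC), the key value, the `P-` prefix, and the field names are assumptions made for this example and do not come from the talk. Because anyone holding the key can recompute the codes, this is pseudonymization and not anonymization.

```python
import hashlib
import hmac

# Hypothetical secret key; it would have to be stored apart from the data.
SECRET_KEY = b"replace-with-a-secret-key"


def pseudonym(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable artificial code for an identifying value."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    return "P-" + digest[:12]


def pseudonymize(record: dict, identifying_fields: tuple) -> dict:
    """Copy a record, replacing each identifying field by its pseudonym."""
    out = dict(record)
    for field in identifying_fields:
        if field in out:
            out[field] = pseudonym(str(out[field]))
    return out


# Made-up record for illustration only.
record = {"name": "Jane Doe", "ssn": "000-00-0000", "diagnosis": "J45"}
print(pseudonymize(record, ("name", "ssn")))
```

The same input always maps to the same code, so records of one person can still be joined after pseudonymization.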
5 Common de-identification methods. Removal of identifiers: direct identifiers (name, address, social security number); quasi-identifiers (birthday, ZIP, sex); any links to identifying information. Data and/or output perturbation: add non-deterministic noise to attribute values; mask, modify, or aggregate values systematically. Generalization (data binning, bucketing): original data values that fall in a given small interval, a bin, are replaced by a value representative of that interval. Generalize all dates to year: 17th March 1983 → 1983. Reduce ZIP codes to three digits: D → 821.
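The generalization and perturbation methods listed above can be sketched in a few lines. The function names, the bin width, the noise scale, and the sample ZIP code are illustrative assumptions, not values from the talk.

```python
import random
from datetime import date


def generalize_date(d: date) -> int:
    """Generalize a full date to its year."""
    return d.year


def generalize_zip(zip_code: str, digits: int = 3) -> str:
    """Keep only the leading digits of a ZIP code."""
    return zip_code[:digits]


def bin_value(x: float, width: float) -> float:
    """Replace x by the midpoint of the bin of the given width that contains it."""
    lower = (x // width) * width
    return lower + width / 2


def perturb(x: float, scale: float, rng: random.Random) -> float:
    """Add zero-mean Gaussian noise with standard deviation `scale`."""
    return x + rng.gauss(0.0, scale)


print(generalize_date(date(1983, 3, 17)))  # the slide's date example
print(generalize_zip("12345"))             # hypothetical ZIP code
print(bin_value(37.0, 10.0))               # 37 falls in the bin [30, 40)
print(perturb(100.0, 2.0, random.Random(42)))
```

Wider bins and larger noise scales lower the re-identification risk but also remove more of the information a later analysis could use.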
6 Example: Two-dimensional binning
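The figure for this slide is not part of the transcription. As a stand-in, the following sketch shows one way two-dimensional binning can work: each record is mapped to a grid cell, and only the cell counts are kept. The attribute pair (age, weight), the bin widths, and the data points are made up for illustration.

```python
from collections import Counter


def bin_2d(points, x_width, y_width):
    """Count how many (x, y) points fall into each cell of a regular grid."""
    counts = Counter()
    for x, y in points:
        cell = (int(x // x_width), int(y // y_width))
        counts[cell] += 1
    return counts


# Made-up (age, weight in kg) pairs.
points = [(34, 71.5), (37, 78.0), (52, 80.2), (58, 88.9), (36, 74.1)]
cells = bin_2d(points, x_width=10, y_width=10)
for (cx, cy), n in sorted(cells.items()):
    print(f"age {cx * 10}-{cx * 10 + 9}, weight {cy * 10}-{cy * 10 + 9}: {n} records")
```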
7 The HIPAA Safe Harbor Method. HIPAA Privacy Rule, USA, 2003: provides mechanisms for using and disclosing health data responsibly without the need for patient consent. EITHER apply the Expert Determination Method OR remove or generalize 18 specific types of data: (A) Names; (B) All geographic subdivisions, including street address, city, county, precinct, ZIP code, if the geographic unit contains fewer than 20,000 people; (C) All elements of dates (except year) for dates that are directly related to an individual, including birth date, admission date, discharge date, death date, and all ages over 89; (D) Telephone numbers; (E) Fax numbers; (F) Email addresses; (G) Social security numbers; (H) Medical record numbers; (I) Health plan beneficiary numbers; (J) Account numbers; (K) Certificate/license numbers; (L) Vehicle identifiers and serial numbers, including license plate numbers; (M) Device identifiers and serial numbers; (N) Web Universal Resource Locators (URLs); (O) Internet Protocol (IP) addresses; (P) Biometric identifiers, including finger and voiceprints; (Q) Full-face photographs and any comparable images; (R) Any other unique identifying number, characteristic, or code.
8 Usual de-identification process. Source: NISTIR 8053, De-Identification of Personal Information, 2015
9 Limits and issues. Re-identification risk: cross-referencing anonymous data with other data sources can re-identify the origin (linkage attack) and may result in harm to individuals or groups. De-identification is of limited use: it is not robust against advanced re-identification methods and is impossible in certain cases. E.g., genetic data cannot be safely anonymized, because the huge amount of pattern information in bio-specimens allows the donors to be re-identified. → One cannot be sure whether information is re-identifiable!
10 Implicit disclosure risk. Attribute disclosure: an adversary derives sensitive information about a patient from released data in conjunction with disclosed information, e.g. all patients in a list have a specific diagnosis. Inferential disclosure: information can be inferred with high confidence from statistical properties of released data, e.g. the income of a data subject can be inferred from the (publicly available) purchase price of a home.
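The attribute-disclosure case above ("all patients in a list have a specific diagnosis") can be checked mechanically before release. This is a minimal sketch, assuming records are dictionaries; the field names and records are invented for illustration.

```python
from collections import defaultdict


def homogeneous_groups(records, quasi_identifiers, sensitive):
    """Return quasi-identifier combinations whose records all share one sensitive value."""
    groups = defaultdict(set)
    for r in records:
        key = tuple(r[q] for q in quasi_identifiers)
        groups[key].add(r[sensitive])
    return [key for key, values in groups.items() if len(values) == 1]


# Made-up, already generalized records.
records = [
    {"year": 1983, "zip3": "821", "diagnosis": "asthma"},
    {"year": 1983, "zip3": "821", "diagnosis": "asthma"},
    {"year": 1975, "zip3": "803", "diagnosis": "asthma"},
    {"year": 1975, "zip3": "803", "diagnosis": "diabetes"},
]
risky = homogeneous_groups(records, ("year", "zip3"), "diagnosis")
print(risky)
```

Anyone known to belong to a group reported here has their diagnosis disclosed, even though no single record can be attributed to them.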
11 Linkage attacks. Link records in datasets based on similarity between subsets of attributes. A combination of attributes makes it possible to discern records in each dataset (fingerprint information). Use machine learning for pattern matching. → Can link the identity of data subjects in a (released or public) dataset with confidential information contained in another dataset.
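A simplified sketch of the linkage idea above: records of a de-identified release are joined with an identified dataset on shared quasi-identifiers. It uses exact matching where the slide mentions similarity and machine learning, and all field names and records are invented for illustration.

```python
from collections import Counter


def link(released, identified, keys):
    """Attach a name to each released record that matches exactly one identified record."""
    index = {}
    for r in identified:
        index.setdefault(tuple(r[k] for k in keys), []).append(r)
    matches = []
    for r in released:
        candidates = index.get(tuple(r[k] for k in keys), [])
        if len(candidates) == 1:
            matches.append((candidates[0]["name"], r))
    return matches


def unique_fraction(records, keys):
    """Fraction of records whose key combination is unique (a 'fingerprint')."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if counts[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


keys = ("birth_year", "zip3", "sex")
released = [  # de-identified, but carries a sensitive attribute
    {"birth_year": 1983, "zip3": "821", "sex": "F", "diagnosis": "asthma"},
    {"birth_year": 1975, "zip3": "803", "sex": "M", "diagnosis": "diabetes"},
    {"birth_year": 1975, "zip3": "803", "sex": "M", "diagnosis": "asthma"},
]
public = [  # identified, e.g. a public register
    {"name": "A. Example", "birth_year": 1983, "zip3": "821", "sex": "F"},
    {"name": "B. Example", "birth_year": 1975, "zip3": "803", "sex": "M"},
    {"name": "C. Example", "birth_year": 1975, "zip3": "803", "sex": "M"},
]
matches = link(released, public, keys)
print(matches)
print(unique_fraction(released, keys))
```

The uniqueness fraction is the kind of figure quoted in re-identification studies: the share of records that a given attribute combination singles out.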
12 Linkage examples for re-identification. Movie ratings: Dataset 1: 500,000 training records containing customer ratings of movies (1 to 5 stars) published by Netflix; Dataset 2: ratings of (personally) registered users at IMDb. With only eight movie ratings and dates, 96% of released Netflix subscribers can be uniquely identified. Medical tests: only four consecutive laboratory test results of CHEM-7 (creatinine) uniquely distinguished 89.9% of test subjects in a sample of 61,280 patients. Credit card transactions: four distinct points in space and time were sufficient to uniquely specify 90% of the individuals in a sample of 1.1 million people.
13 Conclusion. De-identification should be applied: removal of direct identifiers is essential, and it must conform with legal regulations. However, even complete anonymization only reduces matching accuracy; it doesn't prevent re-identification. → Traditional de-identification is not sufficient to ensure privacy, yet it is detrimental to data mining!
14 Consequences. Need comprehensive strategies (Release Models) for the use of confidential data and results: observe data privacy; limit risk of re-identification; minimize information loss. Need technologies that support these strategies: level of disclosed information under control of the application. Ideal application: provides complete conceptual information without disclosing original data.
15 Release Models. Data Use Agreement (DUA) model: make de-identified data available under a legally binding data use agreement. Conceptual model: provide access only to aggregate data while prohibiting access to records containing data on an individual. Enclave model: keep data in a kind of segregated enclave that restricts export of original data; instead, accept queries from qualified users, run the queries on the data, and respond with results.
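The enclave model above can be pictured as an object that holds the records privately and answers only aggregate queries. This is a toy sketch: the class name `Enclave`, the `min_count` threshold (suppression of small groups), and the data are assumptions for illustration. A real enclave would also need authentication, auditing, and protection against combining overlapping queries.

```python
class Enclave:
    """Holds records privately and answers only aggregate queries."""

    def __init__(self, records, min_count=5):
        self._records = list(records)
        self._min_count = min_count  # assumed policy: suppress small groups

    def count(self, predicate):
        """Number of matching records, or None if the group is too small."""
        n = sum(1 for r in self._records if predicate(r))
        return n if n >= self._min_count else None

    def mean(self, field, predicate):
        """Mean of a numeric field over matching records, or None if too few."""
        values = [r[field] for r in self._records if predicate(r)]
        if len(values) < self._min_count:
            return None
        return sum(values) / len(values)


# Made-up records.
enclave = Enclave(
    [{"age": a, "dx": "asthma" if a < 50 else "copd"} for a in (31, 35, 42, 44, 47, 49, 63, 70)],
    min_count=5,
)
print(enclave.count(lambda r: r["dx"] == "asthma"))        # large enough group
print(enclave.mean("age", lambda r: r["dx"] == "asthma"))
print(enclave.count(lambda r: r["dx"] == "copd"))           # suppressed
```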
16 Role and purpose based access control. Source: Indumathi, InTech, 2012
17 PPDM by decoupling models from data. Represent original data in a perceptual map: generates an abstraction that directly shows the data distribution; data statistics are contained in a microcluster ensemble. Perform data mining on the map: explore, visualize, and cluster the data distribution; enhance the model with predictive capabilities. Segregate the map from original data: disclose the map as a conceptual repository for further exploration; deploy the predictive model for use/integration in applications. Enable access to original data via the map: achievable through Micro-Cluster Queries (MCQ); user authorization for MCQ under control of the application.
18 Example: CIROCO data representation Vanfleteren et al., AJRCCM, 2013
19 CIROCO study: Model publication
20 CIROCO study: Diagnostic factors
21 CIROCO study: Aggregate statistics
22 Self-Organizing Maps (SOM). SOMs represent data distributions in perceptual maps: able to create maps from big / complex data; original data can be forgotten; the map maintains essential distribution information and contains local data statistics in microclusters (cluster binning). The released map is a conceptual repository to visually explore data distributions, make complex distributions tangible, and explore patterns and data dependences. Draw benefit from sensitive data without disclosing data.
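To make the idea of a map with per-node microcluster statistics concrete, here is a minimal textbook-style SOM in plain Python. It is a generic sketch and not the Viscovery implementation. The grid size, learning-rate and radius schedules, and the data are assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(nodes, x):
    """Grid position of the node whose weight vector is closest to x."""
    return min(nodes, key=lambda pos: sum((a - b) ** 2 for a, b in zip(nodes[pos], x)))


def train_som(data, rows, cols, epochs=50, seed=0):
    """Train a rows x cols map; returns {grid position: weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise each node from a randomly chosen data record.
    nodes = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac)                                # decaying learning rate
        radius = max(rows, cols) / 2.0 * (1.0 - frac) + 0.5      # shrinking neighbourhood
        for x in rng.sample(data, len(data)):
            bmu = best_matching_unit(nodes, x)
            for pos, w in nodes.items():
                d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * radius ** 2))          # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += rate * h * (x[i] - w[i])
    return nodes


def microcluster_stats(nodes, data):
    """Per-node record count and mean; the records themselves are not kept."""
    members = {pos: [] for pos in nodes}
    for x in data:
        members[best_matching_unit(nodes, x)].append(x)
    return {
        pos: {"count": len(m), "mean": [sum(col) / len(m) for col in zip(*m)]}
        for pos, m in members.items() if m
    }


# Made-up two-dimensional data with two groups.
data = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
nodes = train_som(data, rows=2, cols=2, epochs=30)
stats = microcluster_stats(nodes, data)
for pos, s in sorted(stats.items()):
    print(pos, s)
```

Only `nodes` and `stats` would be released in this sketch. Whether per-node statistics are safe to disclose still depends on how many records fall into each node, so very small microclusters may need suppression.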
23 PPDM with Viscovery. Workflow-oriented system for predictive modeling: explorative data mining, visual clustering; profiling, statistical analyses; classification, non-linear regression. Based on an innovative, patented combination of Self-Organizing Maps (SOM) and multivariate statistics. The map can be segregated from original data: disclosure of the map does not compromise privacy; it can be integrated in operational systems (BioXM); the level of data disclosure is under control of the application.
24 Viscovery data flow (project mode). Diagram labels: Application data; Model data; Viscovery SOMine; Preprocessing; Analytical Datamarts; De-identified data; Modeling; Predictive Models; Application; Results.
25 Viscovery data flow (operational mode). Diagram labels: Viscovery One(2)One Engine; User interaction; Model name; Model loading; PPDM application with user access control; Parameter name; Parameter value; Model recall; Predictive Model; De-identified data; Data record; Result; Model application.
26 Example: Mining the connectome. Connectome matrices of individual brains. Source: de-identified, pseudonymized data (highly confidential): Connectivity Matrix + Diagnosis (Autism) + Personal data. Draw conclusions about personality, mental disorders, etc. Derive network measures: build a network graph from each matrix; calculate network measures (on global or local level), e.g. Clustering Coefficient, Characteristic Path Length, Transitivity, Assortativity, Betweenness. Visualize, explore, and cluster network data in Viscovery.
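Two of the network measures named above can be computed from a connectivity matrix as follows. This is a plain-Python sketch using the standard graph definitions, not the Brain Connectivity Toolbox code. It assumes the connectivity matrix has already been thresholded to a binary, undirected adjacency matrix, and the small example graph is made up.

```python
from collections import deque


def clustering_coefficient(adj):
    """Mean local clustering coefficient of an undirected, unweighted graph."""
    n = len(adj)
    total = 0.0
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j] and j != i]
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(
            1 for a in range(k) for b in range(a + 1, k) if adj[nbrs[a]][nbrs[b]]
        )
        total += 2.0 * links / (k * (k - 1))
    return total / n


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected node pairs (breadth-first search)."""
    n = len(adj)
    total, pairs = 0, 0
    for s in range(n):
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for t, d in dist.items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs if pairs else float("inf")


# Made-up graph: a triangle (0, 1, 2) with node 3 attached to node 2.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
print(clustering_coefficient(adj))
print(characteristic_path_length(adj))
```

Each brain then contributes one row of such measures, which is the kind of tabular input that can be explored and clustered on a map.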
27 Diffusion Tensor Imaging data from the Human Connectome Project Source:
28 Diffusion Tensor Imaging (DTI). Panels: Diffusion Gradients; Reconstructed Fiber Tracts; Connectivity Matrix. Diffusion gradients: directed flow of water molecules detected by MR, indicating fiber tracts. Reconstructed fiber tracts: indicate a potential anatomical connection between two brain areas. Connectivity matrix: thickness of detected fibers between brain areas (color coded).
29 Topological graph of functional network Source: Bullmore, Sporns 2009, Nature Reviews Neuroscience, Vol. 10
30 Calculation of network measures Values are computed by Brain Connectivity Toolbox, Rubinov & Sporns, 2009 Source:
31 Can network measures hold as biomarkers for brain diseases?
32 Stratification of autism patients leveraging comprehensive clinical knowledge without compromising patient data privacy
33 Learn more and visit us at... Viscovery Software GmbH Kupelwiesergasse 27 A-1130 Wien Tel
Overview of Akamai s Personal Data Processing Activities and Role Last Updated: April 2018 This document is maintained by the Akamai Global Data Protection Office 1 Introduction Akamai is a global leader
More informationStarflow Token Sale Privacy Policy
Starflow Token Sale Privacy Policy Last Updated: 23 March 2018 Please read this Privacy Policy carefully. By registering your interest to participate in the sale of STAR tokens (the Token Sale ) through
More informationKnowledge Discovery. Javier Béjar URL - Spring 2019 CS - MIA
Knowledge Discovery Javier Béjar URL - Spring 2019 CS - MIA Knowledge Discovery (KDD) Knowledge Discovery in Databases (KDD) Practical application of the methodologies from machine learning/statistics
More informationonly be used for the purpose of handling an individual transaction. The Personal Information you supply to us, when you opt in to marketing
Privacy The Phoenix Theatre, Blyth Privacy Policy 1. Privacy commitment The website www.thephoenixtheatre.org.uk is owned and operated by The Phoenix Theatre, Blyth. We are committed to safeguarding your
More informationAutomated Information Retrieval System Using Correlation Based Multi- Document Summarization Method
Automated Information Retrieval System Using Correlation Based Multi- Document Summarization Method Dr.K.P.Kaliyamurthie HOD, Department of CSE, Bharath University, Tamilnadu, India ABSTRACT: Automated
More informationData Access, Data Sharing, and Data Use in Education Research Advancing Knowledge through Responsible Conduct
Data Access, Data Sharing, and Data Use in Education Research Advancing Knowledge through Responsible Conduct Felice J. Levine American Educational Research Association OECD Workshop on Fostering Innovation
More information2011 INTERNATIONAL COMPARISON PROGRAM
2011 INTERNATIONAL COMPARISON PROGRAM 2011 ICP DATA ACCESS AND ARCHIVING POLICY GUIDING PRINCIPLES AND PROCEDURES FOR DATA ACCESS ICP Global Office June 2011 Contents I. PURPOSE... 3 II. CONTEXT... 3 III.
More informationJarek Szlichta
Jarek Szlichta http://data.science.uoit.ca/ Approximate terminology, though there is some overlap: Data(base) operations Executing specific operations or queries over data Data mining Looking for patterns
More informationData Mining with Weka
Data Mining with Weka Class 5 Lesson 1 The data mining process Ian H. Witten Department of Computer Science University of Waikato New Zealand weka.waikato.ac.nz Lesson 5.1 The data mining process Class
More informationHIPAA ( ) HIPAA 2017 Compliancy Group, LLC
855 85 HIPAA (855-854-4722) www.compliancygroup.com 1 Started in 2005 by HIPAA auditors & Compliance experts Market need for a total end client solution Created The Guard: cloud-based solution Compliance
More informationAccountability in Privacy-Preserving Data Mining
PORTIA Privacy, Obligations, and Rights in Technologies of Information Assessment Accountability in Privacy-Preserving Data Mining Rebecca Wright Computer Science Department Stevens Institute of Technology
More informationBig Data Analytics: What is Big Data? Stony Brook University CSE545, Fall 2016 the inaugural edition
Big Data Analytics: What is Big Data? Stony Brook University CSE545, Fall 2016 the inaugural edition What s the BIG deal?! 2011 2011 2008 2010 2012 What s the BIG deal?! (Gartner Hype Cycle) What s the
More informationPlaying in the Big (Data) Leagues: Consumer Data Mining Data Privacy and Compliance
Playing in the Big (Data) Leagues: Consumer Data Mining Data Privacy and Compliance Presented by Charlie Bingham, Legal and Corporate Affairs -Enterprise Partner Group, Microsoft Corporation Rachel Reid,
More informationData Mining. Part 2. Data Understanding and Preparation. 2.4 Data Transformation. Spring Instructor: Dr. Masoud Yaghini. Data Transformation
Data Mining Part 2. Data Understanding and Preparation 2.4 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Introduction Normalization Attribute Construction Aggregation Attribute Subset Selection Discretization
More informationChapter 3: Data Mining:
Chapter 3: Data Mining: 3.1 What is Data Mining? Data Mining is the process of automatically discovering useful information in large repository. Why do we need Data mining? Conventional database systems
More informationGraphVar: A user-friendly toolbox for comprehensive graph analyses of functional brain connectivity.
GraphVar: A user-friendly toolbox for comprehensive graph analyses of functional brain connectivity. features: I. Pipeline construction of graph networks II. Calculation of network topological measures
More informationEagles Charitable Foundation Privacy Policy
Eagles Charitable Foundation Privacy Policy Effective Date: 1/18/2018 The Eagles Charitable Foundation, Inc. ( Eagles Charitable Foundation, we, our, us ) respects your privacy and values your trust and
More information