Privacy Preserving Data Mining: An approach to safely share and use sensible medical data

1 Privacy Preserving Data Mining: An approach to safely share and use sensible medical data. Gerhard Kranner, Viscovery. Biomax Symposium, June 24th, 2016, Munich

2 Privacy protection vs knowledge gain. What is Privacy Preserving Data Mining? Terms and standards. Risks, limits, and issues. Data mining without need of data disclosure. Data abstraction with perceptual maps. Connectome example.

3 Privacy Preserving Data Mining. PPDM is the responsible use of data mining to extract useful knowledge from data without compromising data privacy. This implies the ability to access, explore, and model sensitive data, and to share results and deploy analytical models, while observing legal and ethical standards and, in particular, preserving data confidentiality.

4 Basic terms. Pseudonymization: replace identifying fields within each data record by pseudonyms (artificial codes). De-identification: remove, mask, or generalize identifying information to prevent a person's identity from being connected with information. Anonymization: irreversibly remove the association between an identifying dataset and the data subject.
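
The pseudonymization step described on this slide can be sketched as follows. This is a minimal illustration, not the method used in the talk: the keyed hash (HMAC-SHA256), the `P-` prefix, the field names, and the example record are all assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical secret key; it would have to be stored separately from the data,
# because anyone holding it can recompute the pseudonyms.
SECRET_KEY = b"replace-with-a-randomly-generated-key"


def pseudonym(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Map an identifier to a stable artificial code using a keyed hash."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    return "P-" + digest[:12]


def pseudonymize(record: dict, identifying_fields: tuple) -> dict:
    """Return a copy of the record with identifying fields replaced by pseudonyms."""
    out = dict(record)
    for field in identifying_fields:
        if field in out:
            out[field] = pseudonym(str(out[field]))
    return out


# Hypothetical record for illustration only.
record = {"name": "Jane Doe", "ssn": "000-00-0000", "diagnosis": "J44.1"}
safe = pseudonymize(record, ("name", "ssn"))
```

Because the same input always yields the same code, records of one person remain linkable to each other after pseudonymization. That is what separates it from anonymization in the sense of this slide.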

5 Common de-identification methods. Removal of identifiers: direct identifiers (name, address, social security number), quasi-identifiers (birthday, ZIP code, sex), and any links to identifying information. Data and/or output perturbation: add non-deterministic noise to attribute values; mask, modify, or aggregate values systematically. Generalization (data binning, bucketing): original data values that fall in a given small interval, a bin, are replaced by a value representative of that interval. Examples: generalize all dates to the year (17th March 1983 → 1983); reduce ZIP codes to three digits (D → 821).
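
The generalization and perturbation methods listed above can be sketched in a few lines. The function names, the bin width, the noise scale, and the five-digit code `82152` are assumptions chosen for illustration; the slide does not prescribe any of them.

```python
import random
from datetime import date


def generalize_date(d: date) -> int:
    """Generalize a full date to its year."""
    return d.year


def generalize_zip(zip_code: str, digits: int = 3) -> str:
    """Keep only the leading digits of a postal code."""
    return zip_code[:digits]


def bin_value(value: float, width: float) -> float:
    """Replace a value by the midpoint of the bin that contains it."""
    lower = (value // width) * width
    return lower + width / 2


def perturb(value: float, scale: float, rng: random.Random) -> float:
    """Add zero-mean Gaussian noise to an attribute value."""
    return value + rng.gauss(0.0, scale)


year = generalize_date(date(1983, 3, 17))   # 1983
prefix = generalize_zip("82152")            # hypothetical code, reduced to "821"
binned_age = bin_value(37.0, 10.0)          # 37 falls in [30, 40), midpoint 35.0
noisy_weight = perturb(72.0, 1.5, random.Random(0))
```

Generalization is deterministic and loses detail in a controlled way, whereas perturbation keeps the value range but makes individual values unreliable. Both reduce the information available for data mining.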

6 Example: Two-dimensional binning
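
Since the figure for this slide is not part of the transcript, the following is only a generic sketch of two-dimensional binning: each record is assigned to a grid cell by two attributes, and only the per-cell counts are retained. The attribute pair (age, BMI), the bin origins, and the bin widths are assumptions.

```python
from collections import Counter


def bin_index(value: float, lower: float, width: float) -> int:
    """Index of the bin of the given width, counted from `lower`, that contains value."""
    return int((value - lower) // width)


def bin_2d(points, x_lower, x_width, y_lower, y_width) -> Counter:
    """Count how many points fall into each cell of a two-dimensional grid."""
    counts = Counter()
    for x, y in points:
        cell = (bin_index(x, x_lower, x_width), bin_index(y, y_lower, y_width))
        counts[cell] += 1
    return counts


# Hypothetical (age, BMI) pairs for illustration only.
points = [(34, 22.5), (37, 24.0), (52, 31.2), (58, 33.9), (36, 23.1)]
cells = bin_2d(points, x_lower=0, x_width=10, y_lower=15, y_width=5)
```

With these inputs, three records fall into cell (3, 1), i.e. age 30 to 39 and BMI 20 to 25, and two into cell (5, 3). The individual values are no longer recoverable from the counts.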

7 The HIPAA Safe Harbor Method. HIPAA Privacy Rule, USA, 2003: provides mechanisms for using and disclosing health data responsibly without the need for patient consent. EITHER apply the Expert Determination Method OR remove or generalize 18 specific types of data: (A) Names; (B) All geographic subdivisions, including street address, city, county, precinct, ZIP code, if the geographic unit contains fewer than 20,000 people; (C) All elements of dates (except year) for dates that are directly related to an individual, including birth date, admission date, discharge date, death date, and all ages over 89; (D) Telephone numbers; (E) Fax numbers; (F) Email addresses; (G) Social security numbers; (H) Medical record numbers; (I) Health plan beneficiary numbers; (J) Account numbers; (K) Certificate/license numbers; (L) Vehicle identifiers and serial numbers, including license plate numbers; (M) Device identifiers and serial numbers; (N) Web Universal Resource Locators (URLs); (O) Internet Protocol (IP) addresses; (P) Biometric identifiers, including fingerprints and voiceprints; (Q) Full-face photographs and any comparable images; (R) Any other unique identifying number, characteristic, or code.
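
A few of the rules above (dropping direct identifiers, reducing dates to the year, capping ages over 89, and suppressing ZIP prefixes of small areas) can be sketched as below. This covers only part of the 18 categories and is not a compliance tool. The field names, the `"90+"` and `"000"` placeholder values, and the population table are assumptions for illustration.

```python
from datetime import date

# Hypothetical field names standing in for some of the direct identifiers.
DIRECT_IDENTIFIERS = {"name", "phone", "fax", "email", "ssn", "medical_record_number"}

# Hypothetical population counts per three-digit ZIP prefix.
ZIP3_POPULATION = {"821": 150_000, "036": 12_000}


def safe_harbor_sketch(record: dict) -> dict:
    """Apply a small subset of Safe Harbor style generalizations to one record."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # remove direct identifiers entirely
        if isinstance(value, date):
            out[field] = value.year  # keep only the year of any date
        elif field == "age":
            out[field] = "90+" if value > 89 else value  # aggregate ages over 89
        elif field == "zip":
            prefix = str(value)[:3]
            # Keep the prefix only if the area holds at least 20,000 people.
            populous = ZIP3_POPULATION.get(prefix, 0) >= 20_000
            out[field] = prefix if populous else "000"
        else:
            out[field] = value
    return out


# Hypothetical record for illustration only.
record = {
    "name": "Jane Doe",
    "birth_date": date(1925, 3, 17),
    "age": 91,
    "zip": "03601",
    "diagnosis": "J44.1",
}
deidentified = safe_harbor_sketch(record)
```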

8 Usual de-identification process. Source: NISTIR 8053, De-Identification of Personal Information, 2015

9 Limits and issues. Re-identification risk: anonymous data can be cross-referenced with other data sources to re-identify the origin (linkage attack), which may result in harm to individuals or groups. De-identification is of limited use: it is not robust against advanced re-identification methods and is impossible in certain cases. E.g., genetic data cannot be safely anonymized, because the huge amount of pattern information in bio-specimens allows the donors to be re-identified. → One cannot be sure whether information is re-identifiable!

10 Implicit disclosure risk. Attribute disclosure: an adversary derives sensitive information about a patient from released data in conjunction with disclosed information, e.g. all patients in a list have a specific diagnosis. Inferential disclosure: information can be inferred with high confidence from statistical properties of released data, e.g. inferring the income of a data subject from the (publicly available) purchase price of a home.

11 Linkage attacks. Link records in datasets based on similarity between subsets of attributes. A combination of attributes makes it possible to tell records apart in each dataset (fingerprint information). Machine learning can be used for pattern matching. → The identity of data subjects in a (released or public) dataset can be linked with confidential information contained in another dataset.

12 Linkage examples for re-identification. Movie ratings: Dataset 1 contains 500,000 training records of customer movie ratings (1 to 5 stars) published by Netflix; Dataset 2 contains ratings by (personally) registered users at IMDb. With only eight movie ratings and dates, 96% of released Netflix subscribers can be uniquely identified. Medical tests: only four consecutive laboratory test results of CHEM-7 (creatinine) uniquely distinguished 89.9% of test subjects in a sample of 61,280 patients. Credit card transactions: four distinct points in space and time were sufficient to uniquely specify 90% of the individuals in a sample of 1.1 million people.
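
The mechanism behind such linkage results can be illustrated with a toy sketch: a combination of quasi-identifiers acts as a fingerprint, and a record is linked when its fingerprint is unique in both datasets. The exact-match rule, the chosen attributes, and all records below are assumptions. The studies cited on the slide used richer data and similarity-based matching.

```python
from collections import Counter

# Hypothetical quasi-identifiers shared by both datasets.
QUASI_IDENTIFIERS = ("birth_year", "zip3", "sex")


def fingerprint(record: dict) -> tuple:
    """Combination of quasi-identifier values used to tell records apart."""
    return tuple(record[k] for k in QUASI_IDENTIFIERS)


def unique_fraction(records: list) -> float:
    """Share of records whose fingerprint occurs exactly once in the dataset."""
    counts = Counter(fingerprint(r) for r in records)
    return sum(1 for r in records if counts[fingerprint(r)] == 1) / len(records)


def link(released: list, public: list) -> list:
    """Pair names with diagnoses where the fingerprint is unique in both datasets."""
    released_counts = Counter(fingerprint(r) for r in released)
    public_counts = Counter(fingerprint(p) for p in public)
    public_by_fp = {fingerprint(p): p for p in public}
    matches = []
    for r in released:
        fp = fingerprint(r)
        if released_counts[fp] == 1 and public_counts.get(fp) == 1:
            matches.append((public_by_fp[fp]["name"], r["diagnosis"]))
    return matches


# Invented toy data: a "de-identified" release and a public register.
released = [
    {"birth_year": 1983, "zip3": "821", "sex": "F", "diagnosis": "asthma"},
    {"birth_year": 1983, "zip3": "821", "sex": "M", "diagnosis": "COPD"},
    {"birth_year": 1975, "zip3": "803", "sex": "F", "diagnosis": "diabetes"},
    {"birth_year": 1975, "zip3": "803", "sex": "F", "diagnosis": "hypertension"},
]
public = [
    {"name": "A. Example", "birth_year": 1983, "zip3": "821", "sex": "F"},
    {"name": "B. Example", "birth_year": 1975, "zip3": "803", "sex": "F"},
    {"name": "C. Example", "birth_year": 1960, "zip3": "805", "sex": "M"},
]

share_unique = unique_fraction(released)
matches = link(released, public)
```

In this toy data, half of the released records are unique, and one of them is linked to a name, even though no direct identifier was released.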

13 Conclusion. De-identification should be applied: removal of direct identifiers is essential, and the process must conform with legal regulations. However, even complete anonymization only reduces matching accuracy; it does not prevent re-identification. → Traditional de-identification is not sufficient to ensure privacy, yet it is detrimental to data mining!

14 Consequences. Need comprehensive strategies (Release Models) for the use of confidential data and results: observe data privacy, limit the risk of re-identification, minimize information loss. Need technologies that support these strategies: the level of disclosed information is under control of the application. Ideal application: provides complete conceptual information without disclosing original data.

15 Release Models. Data Use Agreement (DUA) model: make de-identified data available under a legally binding data use agreement. Conceptual model: provide access only to aggregate data while prohibiting access to records containing data on an individual. Enclave model: keep the data in a segregated enclave that restricts export of original data; instead, accept queries from qualified users, run the queries on the data, and respond with results.
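
The enclave idea, combined with aggregate-only answers, can be sketched as a small query interface that never returns individual records. The class name, the single `mean_of` query, and the minimum group size of 5 are assumptions for illustration. A real enclave would also need user authorization, auditing, and protection against sequences of overlapping queries.

```python
from statistics import mean

# Hypothetical threshold: answers based on fewer records are refused.
MIN_GROUP_SIZE = 5


class Enclave:
    """Holds the original records and exposes aggregate query results only."""

    def __init__(self, records):
        self._records = list(records)  # never exported

    def mean_of(self, attribute, where):
        """Mean of an attribute over the records selected by the predicate `where`."""
        selected = [r[attribute] for r in self._records if where(r)]
        if len(selected) < MIN_GROUP_SIZE:
            return None  # group too small, refuse to answer
        return mean(selected)


# Invented records for illustration only.
enclave = Enclave({"age": 40 + i, "score": 10 + i} for i in range(10))
answer = enclave.mean_of("score", lambda r: r["age"] >= 45)   # five records selected
refused = enclave.mean_of("score", lambda r: r["age"] >= 48)  # only two records
```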

16 Role and purpose based access control. Source: Indumathi, InTech, 2012

17 PPDM by decoupling models from data. Represent original data in a perceptual map: this generates an abstraction that directly shows the data distribution, with data statistics contained in a microcluster ensemble. Perform data mining on the map: explore, visualize, and cluster the data distribution; enhance the model with predictive capabilities. Segregate the map from the original data: disclose the map as a conceptual repository for further exploration; deploy the predictive model for use/integration in applications. Enable access to original data via the map: achievable through Micro-Cluster Queries (MCQ), with user authorization for MCQ under control of the application.

18 Example: CIROCO data representation Vanfleteren et al., AJRCCM, 2013

19 CIROCO study: Model publication

20 CIROCO study: Diagnostic factors

21 CIROCO study: Aggregate statistics

22 Self-Organizing Maps (SOM). SOMs represent data distributions in perceptual maps: they can create maps from big/complex data; the original data can be forgotten; the map maintains essential distribution information and contains local data statistics in microclusters (cluster binning). The released map is a conceptual repository to visually explore data distributions, make complex distributions tangible, explore patterns and data dependences, and draw benefit from sensitive data without disclosing the data.
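
How a map can carry distribution information without the original records can be sketched with a textbook Kohonen SOM in plain Python. This is the generic algorithm under assumed settings (grid size, learning rate, neighborhood radius, synthetic data), not the patented Viscovery variant mentioned in the talk.

```python
import math
import random


def best_matching_unit(nodes, x):
    """Grid position of the node whose weight vector is closest to x."""
    return min(nodes, key=lambda pos: sum((a - b) ** 2 for a, b in zip(nodes[pos], x)))


def train_som(data, rows, cols, epochs=20, seed=0):
    """Train a rows x cols map; returns {grid position: weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise each node with a randomly chosen sample.
    nodes = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):
            frac = step / total_steps
            rate = 0.5 * (1.0 - frac)                          # decaying learning rate
            radius = max(rows, cols) / 2 * (1.0 - frac) + 0.5  # shrinking neighborhood
            bmu = best_matching_unit(nodes, x)
            for pos, w in nodes.items():
                grid_d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2 * radius ** 2))
                for i in range(dim):
                    w[i] += rate * h * (x[i] - w[i])
            step += 1
    return nodes


def microclusters(nodes, data):
    """Per-node record count and mean vector: the local statistics kept in the map."""
    groups = {}
    for x in data:
        groups.setdefault(best_matching_unit(nodes, x), []).append(x)
    return {
        pos: {"count": len(members), "mean": [sum(col) / len(members) for col in zip(*members)]}
        for pos, members in groups.items()
    }


# Synthetic two-dimensional data with two groups, for illustration only.
data_rng = random.Random(1)
data = [(data_rng.gauss(0, 0.1), data_rng.gauss(0, 0.1)) for _ in range(50)]
data += [(data_rng.gauss(1, 0.1), data_rng.gauss(1, 0.1)) for _ in range(50)]

som = train_som(data, rows=3, cols=3)
stats = microclusters(som, data)
```

After training, `som` and `stats` can be kept while `data` is discarded: the map then holds only node weights and per-node aggregates. Whether such aggregates are safe to release still depends on how many records each microcluster contains.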

23 PPDM with Viscovery. Workflow-oriented system for predictive modeling: explorative data mining, visual clustering; profiling, statistical analyses; classification, non-linear regression. Based on an innovative, patented combination of Self-Organizing Maps (SOM) and multivariate statistics. The map can be segregated from the original data: disclosure of the map does not compromise privacy; it can be integrated in operational systems (BioXM); the level of data disclosure is under control of the application.

24 Viscovery data flow (project mode). Diagram labels: application data, model data, Viscovery SOMine, preprocessing, analytical datamarts, de-identified data, modeling, predictive models, application, results.

25 Viscovery data flow (operational mode). Diagram labels: Viscovery One(2)One Engine, user interaction, model name, model loading, PPDM application with user access control, parameter name, parameter value, model recall, predictive model, de-identified data, data record, result, model application.

26 Example: Mining the connectome. Connectome matrices of individual brains. Source: de-identified, pseudonymized data (highly confidential): connectivity matrix + diagnosis (autism) + personal data. Draw conclusions about personality, mental disorders, ... Derive network measures: build a network graph from each matrix; calculate network measures (on global or local level), e.g. clustering coefficient, characteristic path length, transitivity, assortativity, betweenness. Visualize, explore, and cluster the network data in Viscovery.
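
Two of the network measures named above can be computed from a connectivity matrix as sketched below. The sketch assumes the matrix has already been reduced to a binary, undirected adjacency matrix. It is a plain-Python illustration of the standard definitions, not the Brain Connectivity Toolbox used in the talk, and the four-node graph is invented.

```python
from collections import deque


def neighbors(adj, i):
    """Indices of nodes directly connected to node i."""
    return [j for j, connected in enumerate(adj[i]) if connected and j != i]


def clustering_coefficient(adj):
    """Mean local clustering coefficient of an undirected, unweighted graph."""
    coeffs = []
    for i in range(len(adj)):
        nb = neighbors(adj, i)
        k = len(nb)
        if k < 2:
            coeffs.append(0.0)  # no neighbor pair, coefficient defined as 0 here
            continue
        links = sum(1 for a in range(k) for b in range(a + 1, k) if adj[nb[a]][nb[b]])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)


def characteristic_path_length(adj):
    """Mean shortest path length over all connected ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in range(len(adj)):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbors(adj, u):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(d for node, d in dist.items() if node != source)
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")


# Invented graph: a triangle (nodes 0, 1, 2) with node 3 attached to node 2.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
cc = clustering_coefficient(adj)        # (1 + 1 + 1/3 + 0) / 4 = 7/12
cpl = characteristic_path_length(adj)   # 16 / 12 = 4/3
```

Measures like these reduce each brain's matrix to a handful of numbers, and it is those numbers, not the matrices themselves, that would enter the map.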

27 Diffusion Tensor Imaging data from the Human Connectome Project Source:

28 Diffusion Tensor Imaging (DTI). Diffusion gradients: directed flow of water molecules detected by MR, indicating fiber tracts. Reconstructed fiber tracts: indicate a potential anatomical connection between two brain areas. Connectivity matrix: thickness of detected fibers between brain areas (color coded).

29 Topological graph of functional network Source: Bullmore, Sporns 2009, Nature Reviews Neuroscience, Vol. 10

30 Calculation of network measures Values are computed by Brain Connectivity Toolbox, Rubinov & Sporns, 2009 Source:

31 Can network measures hold as biomarkers for brain diseases?

32 Stratification of autism patients leveraging comprehensive clinical knowledge without compromising patient data privacy

33 Learn more and visit us at: Viscovery Software GmbH, Kupelwiesergasse 27, A-1130 Wien


Improving Privacy And Data Utility For High- Dimensional Data By Using Anonymization Technique Improving Privacy And Data Utility For High- Dimensional Data By Using Anonymization Technique P.Nithya 1, V.Karpagam 2 PG Scholar, Department of Software Engineering, Sri Ramakrishna Engineering College,

More information

Economic and Social Council

Economic and Social Council United Nations Economic and Social Council ECE/TRANS/WP.29/2017/46 Distr.: General 23 December 2016 Original: English Economic Commission for Europe Inland Transport Committee World Forum for Harmonization

More information

Privacy-Preserving. Introduction to. Data Publishing. Concepts and Techniques. Benjamin C. M. Fung, Ke Wang, Chapman & Hall/CRC. S.

Privacy-Preserving. Introduction to. Data Publishing. Concepts and Techniques. Benjamin C. M. Fung, Ke Wang, Chapman & Hall/CRC. S. Chapman & Hall/CRC Data Mining and Knowledge Discovery Series Introduction to Privacy-Preserving Data Publishing Concepts and Techniques Benjamin C M Fung, Ke Wang, Ada Wai-Chee Fu, and Philip S Yu CRC

More information

HIPAA UPDATE. Michael L. Brody, DPM

HIPAA UPDATE. Michael L. Brody, DPM HIPAA UPDATE Michael L. Brody, DPM Objectives: How to respond to a patient s request for a copy of their records. Understand your responsibilities after you send information out to another doctor, hospital

More information

Data Preprocessing Yudho Giri Sucahyo y, Ph.D , CISA

Data Preprocessing Yudho Giri Sucahyo y, Ph.D , CISA Obj ti Objectives Motivation: Why preprocess the Data? Data Preprocessing Techniques Data Cleaning Data Integration and Transformation Data Reduction Data Preprocessing Lecture 3/DMBI/IKI83403T/MTI/UI

More information

Data Compromise Notice Procedure Summary and Guide

Data Compromise Notice Procedure Summary and Guide Data Compromise Notice Procedure Summary and Guide Various federal and state laws require notification of the breach of security or compromise of personally identifiable data. No single federal law or

More information

Emerging Measures in Preserving Privacy for Publishing The Data

Emerging Measures in Preserving Privacy for Publishing The Data Emerging Measures in Preserving Privacy for Publishing The Data K.SIVARAMAN 1 Assistant Professor, Dept. of Computer Science, BIST, Bharath University, Chennai -600073 1 ABSTRACT: The information in the

More information

IT Privacy Certification Outline of the Body of Knowledge (BOK) for the Certified Information Privacy Technologist (CIPT)

IT Privacy Certification Outline of the Body of Knowledge (BOK) for the Certified Information Privacy Technologist (CIPT) Page 1 of 6 IT Privacy Certification Outline of the Body of Knowledge (BOK) for the Certified Information Privacy Technologist (CIPT) I. Understanding the need for privacy in the IT environment A. Evolving

More information

Powering Knowledge Discovery. Insights from big data with Linguamatics I2E

Powering Knowledge Discovery. Insights from big data with Linguamatics I2E Powering Knowledge Discovery Insights from big data with Linguamatics I2E Gain actionable insights from unstructured data The world now generates an overwhelming amount of data, most of it written in natural

More information

Exercising Rights Under the GDPR

Exercising Rights Under the GDPR THE 23ANDME GUIDE Exercising Rights Under the GDPR Right to Object. Right to Rectify. Right to Restrict. JULY 20, 2018 Exercise Your Rights The 23andMe Guide to Objecting, Rectifying, and Restricting Introduction

More information

Big Data - Security with Privacy

Big Data - Security with Privacy Big Data - Security with Privacy Elisa Bertino CS Department, Cyber Center, and CERIAS Purdue University Cyber Center Today we have technologies for Acquiring and sensing data Transmitting data Storing,

More information

K-Anonymity and Other Cluster- Based Methods. Ge Ruan Oct. 11,2007

K-Anonymity and Other Cluster- Based Methods. Ge Ruan Oct. 11,2007 K-Anonymity and Other Cluster- Based Methods Ge Ruan Oct 11,2007 Data Publishing and Data Privacy Society is experiencing exponential growth in the number and variety of data collections containing person-specific

More information

I. INFORMATION WE COLLECT

I. INFORMATION WE COLLECT PRIVACY POLICY USIT PRIVACY POLICY Usit (the Company ) is committed to maintaining robust privacy protections for its users. Our Privacy Policy ( Privacy Policy ) is designed to help you understand how

More information

In this policy, whenever you see the words we, us, our, it refers to Ashby Concert Band Registered Charity Number

In this policy, whenever you see the words we, us, our, it refers to Ashby Concert Band Registered Charity Number ASHBY CONCERT BAND PRIVACY POLICY The privacy and security of your personal information is extremely important to us. This privacy policy explains how and why we use your personal data. We will keep this

More information

Informational Guide for the NewSTEPs Data Repository

Informational Guide for the NewSTEPs Data Repository Informational Guide for the NewSTEPs Data Repository Document Contents What is the NewSTEPs Data Repository... 2 What data is being collected?... 2 Why is this data being collected?... 2 How did NewSTEPs

More information

An Iterative Approach to Examining the Effectiveness of Data Sanitization

An Iterative Approach to Examining the Effectiveness of Data Sanitization An Iterative Approach to Examining the Effectiveness of Data Sanitization By ANHAD PREET SINGH B.Tech. (Punjabi University) 2007 M.S. (University of California, Davis) 2012 DISSERTATION Submitted in partial

More information

Data Mining and Data Warehousing Introduction to Data Mining

Data Mining and Data Warehousing Introduction to Data Mining Data Mining and Data Warehousing Introduction to Data Mining Quiz Easy Q1. Which of the following is a data warehouse? a. Can be updated by end users. b. Contains numerous naming conventions and formats.

More information

Overview of Akamai s Personal Data Processing Activities and Role

Overview of Akamai s Personal Data Processing Activities and Role Overview of Akamai s Personal Data Processing Activities and Role Last Updated: April 2018 This document is maintained by the Akamai Global Data Protection Office 1 Introduction Akamai is a global leader

More information

Starflow Token Sale Privacy Policy

Starflow Token Sale Privacy Policy Starflow Token Sale Privacy Policy Last Updated: 23 March 2018 Please read this Privacy Policy carefully. By registering your interest to participate in the sale of STAR tokens (the Token Sale ) through

More information

Knowledge Discovery. Javier Béjar URL - Spring 2019 CS - MIA

Knowledge Discovery. Javier Béjar URL - Spring 2019 CS - MIA Knowledge Discovery Javier Béjar URL - Spring 2019 CS - MIA Knowledge Discovery (KDD) Knowledge Discovery in Databases (KDD) Practical application of the methodologies from machine learning/statistics

More information

only be used for the purpose of handling an individual transaction. The Personal Information you supply to us, when you opt in to marketing

only be used for the purpose of handling an individual transaction. The Personal Information you supply to us, when you opt in to marketing Privacy The Phoenix Theatre, Blyth Privacy Policy 1. Privacy commitment The website www.thephoenixtheatre.org.uk is owned and operated by The Phoenix Theatre, Blyth. We are committed to safeguarding your

More information

Automated Information Retrieval System Using Correlation Based Multi- Document Summarization Method

Automated Information Retrieval System Using Correlation Based Multi- Document Summarization Method Automated Information Retrieval System Using Correlation Based Multi- Document Summarization Method Dr.K.P.Kaliyamurthie HOD, Department of CSE, Bharath University, Tamilnadu, India ABSTRACT: Automated

More information

Data Access, Data Sharing, and Data Use in Education Research Advancing Knowledge through Responsible Conduct

Data Access, Data Sharing, and Data Use in Education Research Advancing Knowledge through Responsible Conduct Data Access, Data Sharing, and Data Use in Education Research Advancing Knowledge through Responsible Conduct Felice J. Levine American Educational Research Association OECD Workshop on Fostering Innovation

More information

2011 INTERNATIONAL COMPARISON PROGRAM

2011 INTERNATIONAL COMPARISON PROGRAM 2011 INTERNATIONAL COMPARISON PROGRAM 2011 ICP DATA ACCESS AND ARCHIVING POLICY GUIDING PRINCIPLES AND PROCEDURES FOR DATA ACCESS ICP Global Office June 2011 Contents I. PURPOSE... 3 II. CONTEXT... 3 III.

More information

Jarek Szlichta

Jarek Szlichta Jarek Szlichta http://data.science.uoit.ca/ Approximate terminology, though there is some overlap: Data(base) operations Executing specific operations or queries over data Data mining Looking for patterns

More information

Data Mining with Weka

Data Mining with Weka Data Mining with Weka Class 5 Lesson 1 The data mining process Ian H. Witten Department of Computer Science University of Waikato New Zealand weka.waikato.ac.nz Lesson 5.1 The data mining process Class

More information

HIPAA ( ) HIPAA 2017 Compliancy Group, LLC

HIPAA ( ) HIPAA 2017 Compliancy Group, LLC 855 85 HIPAA (855-854-4722) www.compliancygroup.com 1 Started in 2005 by HIPAA auditors & Compliance experts Market need for a total end client solution Created The Guard: cloud-based solution Compliance

More information

Accountability in Privacy-Preserving Data Mining

Accountability in Privacy-Preserving Data Mining PORTIA Privacy, Obligations, and Rights in Technologies of Information Assessment Accountability in Privacy-Preserving Data Mining Rebecca Wright Computer Science Department Stevens Institute of Technology

More information

Big Data Analytics: What is Big Data? Stony Brook University CSE545, Fall 2016 the inaugural edition

Big Data Analytics: What is Big Data? Stony Brook University CSE545, Fall 2016 the inaugural edition Big Data Analytics: What is Big Data? Stony Brook University CSE545, Fall 2016 the inaugural edition What s the BIG deal?! 2011 2011 2008 2010 2012 What s the BIG deal?! (Gartner Hype Cycle) What s the

More information

Playing in the Big (Data) Leagues: Consumer Data Mining Data Privacy and Compliance

Playing in the Big (Data) Leagues: Consumer Data Mining Data Privacy and Compliance Playing in the Big (Data) Leagues: Consumer Data Mining Data Privacy and Compliance Presented by Charlie Bingham, Legal and Corporate Affairs -Enterprise Partner Group, Microsoft Corporation Rachel Reid,

More information

Data Mining. Part 2. Data Understanding and Preparation. 2.4 Data Transformation. Spring Instructor: Dr. Masoud Yaghini. Data Transformation

Data Mining. Part 2. Data Understanding and Preparation. 2.4 Data Transformation. Spring Instructor: Dr. Masoud Yaghini. Data Transformation Data Mining Part 2. Data Understanding and Preparation 2.4 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Introduction Normalization Attribute Construction Aggregation Attribute Subset Selection Discretization

More information

Chapter 3: Data Mining:

Chapter 3: Data Mining: Chapter 3: Data Mining: 3.1 What is Data Mining? Data Mining is the process of automatically discovering useful information in large repository. Why do we need Data mining? Conventional database systems

More information

GraphVar: A user-friendly toolbox for comprehensive graph analyses of functional brain connectivity.

GraphVar: A user-friendly toolbox for comprehensive graph analyses of functional brain connectivity. GraphVar: A user-friendly toolbox for comprehensive graph analyses of functional brain connectivity. features: I. Pipeline construction of graph networks II. Calculation of network topological measures

More information

Eagles Charitable Foundation Privacy Policy

Eagles Charitable Foundation Privacy Policy Eagles Charitable Foundation Privacy Policy Effective Date: 1/18/2018 The Eagles Charitable Foundation, Inc. ( Eagles Charitable Foundation, we, our, us ) respects your privacy and values your trust and

More information