Adaptive Temporal Entity Resolution on Dynamic Databases


1 Adaptive Temporal Entity Resolution on Dynamic Databases

Peter Christen (1) and Ross Gayler (2)
(1) Research School of Computer Science, ANU College of Engineering and Computer Science, The Australian National University, Canberra, Australia
(2) Veda, Melbourne VIC 3000, Australia
Contacts: peter.christen@anu.edu.au / ross.gayler@veda.com.au

This research was funded by the Australian Research Council (ARC), Veda, and Funnelback Pty. Ltd., under Linkage Project LP

April 2013

2 Outline
- Short introduction to entity resolution
- An example application: Identity verification
- Problem formulation and contribution
- An example set of temporal records
- Modelling temporal changes of entities
- Adjusting similarities between records
- Calculating agreement and disagreement probabilities
- The adaptive temporal matching process
- Experimental evaluation
- Conclusions and future work

3 Short introduction to entity resolution
- Entity resolution is the process of identifying and matching records that correspond to the same entity from one or several databases
- Several major challenges to entity resolution
  - Entity identifiers are commonly not available, so personal details often need to be used for matching
  - Real-world data are dirty (typos, variations, etc.)
  - Naive comparison of all record pairs scales quadratically with the sizes of the databases to be matched
  - Lack of training data (true match status of record pairs) makes accurate and automatic classification difficult

4 Example: Identity verification
- Many services require the verification of the personal details provided by customers (government services, credit cards, loans, etc.)
- Verification is based on large databases of known entities (the personal details of individuals, such as their names, addresses, phone numbers, dates of birth, etc.)
- Requires real-time matching of query records with one or several large databases
- Accurate and fast matching is crucial for good service and to prevent identity fraud
- Personal details change over time (databases are dynamic)

5 Problem formulation and contribution
- We investigate how temporal information can be incorporated into the entity resolution process (such as people changing their names or addresses)
- We modify similarities between records according to temporal characteristics of the data
- Building on the earlier approach "Linking temporal records" (Li et al., VLDB Endowment, 2011)
- Our contributions
  - An adaptive entity resolution approach for dynamic data that contain temporal information
  - An efficient temporal adjustment method
  - An evaluation on both synthetic and real data

6 Example set of temporal records

RecID / EntID | Given name | Surname | Street address | City     | Time-stamp
r1 / e1       | Gale       | Miller  | 13 Main Rd     | Sydney   |
r2 / e2       | Peter      | O'Brian | 43/1 Miller St | Sydeny   |
r3 / e1       | Gail       | Miller  | 11 Town Pl     | Hobart   |
r4 / e1       | Gail       | Smith   | 42 Ocean Dr    | Perth    |
r5 / e2       | Pete       | O'Brien | 43 Miller St   | Sydney   |
r6 / e1       | Abigail    | Smith   | 42 Ocean Dr    | Perth    |
r7 / e2       | Peter      | OBrian  | 12 Nice Tr     | Brisbane |
r8 / e1       | Gayle      | Smith   | 11a Town Pl    | Sydney   |

- An entity changes address values more often than surname values
- Small variations in values are possible (no actual changes)
- Several entities can have the same value in an attribute

7 Modelling temporal changes (1)
Basic assumptions and notation used:
- $R$, $r_i$: database containing entity records $r_i$
- $a_j$: attributes of $r_i$, denoted by $r_i.a_j$
- $r_i.e$, $r_i.t$: entity identifier and time-stamp of $r_i$
- $q$, $q.a_j$, $q.t$: query record with attributes $a_j$ and time-stamp ($q$ does not have a known entity identifier)
- $\Delta t$: difference in time-stamps, $\Delta t = r_i.t - q.t$
- $s_{same}$, $s_{match}$: global agreement and match thresholds
The aim is to match a query record, $q$, to its correct true entity in $R$ ($q.e = r_i.e$)
We calculate similarities $0 \le sim_j(r_i.a_j, q.a_j) \le 1$ (values are agreeing if $sim_j \ge s_{same}$, else disagreeing)
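As a rough illustration of the agreement test on this slide, the sketch below uses Python's standard `difflib.SequenceMatcher` as a stand-in for $sim_j$. The slides do not say which similarity functions were used, so the choice of function, the lower-casing, and the names `attribute_similarity` and `is_agreeing` are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def attribute_similarity(value_r, value_q):
    """Stand-in for sim_j: a string similarity in [0, 1] (assumed, not from the slides)."""
    return SequenceMatcher(None, value_r.lower(), value_q.lower()).ratio()


def is_agreeing(value_r, value_q, s_same=0.8):
    """Two attribute values agree if their similarity reaches the threshold s_same."""
    return attribute_similarity(value_r, value_q) >= s_same


print(is_agreeing("Sydney", "Sydney"))  # identical values always agree
print(is_agreeing("Gale", "Abigail"))   # clearly different values do not reach s_same
```

Any approximate string comparison with a range of 0 to 1 could be substituted here; only the thresholding against s_same is taken from the slide.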

8 Modelling temporal changes (2)
To consider temporal aspects, we define:
- $S$ is the event that $q$ and $r_i$ actually refer to the same entity
- $A_j$ is the event that $q$ and $r_i$ have an agreeing value in attribute $a_j$
We consider two probabilities:
- $P(A_j, \Delta t \mid S)$: probability that a query and a database record that actually refer to the same entity have an agreeing value in attribute $a_j$ over $\Delta t$ (no value change)
- $P(\neg A_j, \Delta t \mid \neg S)$: probability that a query and a database record that actually refer to different entities have disagreeing (different) values in attribute $a_j$ over $\Delta t$

9 Adjusting similarities (1)
- Based on the previous two probabilities, we adjust the overall similarity between compared records
- Assume $q$ and $r_i$ have been compared using a set of attribute similarity functions, $s_j = sim_j(r_i.a_j, q.a_j)$
- We assign relative weights, $w_j$, to the attribute similarities, $s_j$:

$$sim(r_i, q) = \frac{\sum_j w_j(s_j, \Delta t)\, s_j}{\sum_j w_j(s_j, \Delta t)}$$

- These weights are calculated based on the likelihood of change in their attribute values

10 Adjusting similarities (2)
We adjust similarities based on $s_j$ and $s_{same}$:
- If $s_j \ge s_{same}$ then $w_j(s_j, \Delta t) = s_j \cdot P(\neg A_j, \Delta t \mid \neg S)$
  The more likely it is that two different entities have the same value in attribute $a_j$ over time $\Delta t$, the less weight is assigned to this agreement
- If $s_j < s_{same}$ then $w_j(s_j, \Delta t) = s_j \cdot P(A_j, \Delta t \mid S)$
  The more likely it is that a value in attribute $a_j$ changes for an entity over time $\Delta t$, the less weight is assigned to this disagreement
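The weighting rule and the weighted average from the two "Adjusting similarities" slides can be sketched as follows. This is a minimal illustration, not the authors' prototype: the function names, the list-based inputs, and the guard that returns 0.0 when all weights are zero are assumptions, and the two probability lists are taken as already estimated for the relevant time difference.

```python
def temporal_weight(s_j, p_agree_same, p_disagree_diff, s_same=0.8):
    """Weight w_j(s_j, dt) for one attribute.

    p_agree_same    -- estimate of P(A_j, dt | S)
    p_disagree_diff -- estimate of P(not A_j, dt | not S)
    """
    if s_j >= s_same:
        # Agreement: weigh down values that different entities often share.
        return s_j * p_disagree_diff
    # Disagreement: weigh down attributes whose values often change over dt.
    return s_j * p_agree_same


def adjusted_similarity(sims, p_agree_same, p_disagree_diff, s_same=0.8):
    """Weighted average of attribute similarities, sum(w_j * s_j) / sum(w_j)."""
    weights = [
        temporal_weight(s, pa, pd, s_same)
        for s, pa, pd in zip(sims, p_agree_same, p_disagree_diff)
    ]
    total = sum(weights)
    if total == 0.0:
        return 0.0  # assumed convention when no attribute carries any weight
    return sum(w * s for w, s in zip(weights, sims)) / total


# Hypothetical values: surname agrees exactly, address is only half similar.
print(adjusted_similarity([1.0, 0.5], [0.9, 0.4], [0.95, 0.8]))
```

In this example the agreeing attribute receives weight 1.0 x 0.95 and the disagreeing one 0.5 x 0.4, so the overall similarity is dominated by the attribute that agrees.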

11 Calculating probabilities
- In a dynamic and real-time setting, $P(A_j, \Delta t \mid S)$ and $P(\neg A_j, \Delta t \mid \neg S)$ need to be calculated and updated in an adaptive and efficient way
- $P(A_j, \Delta t \mid S)$ can be calculated from the data if it is known which records correspond to the same entity (or based on the match decisions made)
- $P(\neg A_j, \Delta t \mid \neg S)$ is calculated as $P(\neg A_j, \Delta t \mid \neg S) = 1 - P(A_j, \Delta t \mid \neg S)$, where $P(A_j, \Delta t \mid \neg S)$ reflects how frequently certain values appear in an attribute (the surname value "Smith" is more frequent than "Dijkstra")
- For details of these calculations, please see the paper
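The slides defer the exact calculations to the paper, so the following is only a simplified sketch of the two ideas named above. It assumes records are dictionaries with keys `e` (entity identifier) and `t` (numeric time-stamp), buckets time differences to whole units, uses exact equality in place of the $sim_j \ge s_{same}$ test, and ignores $\Delta t$ in the frequency-based estimate. None of these choices is confirmed by the source.

```python
from collections import Counter, defaultdict
from itertools import combinations


def estimate_p_agree_same(records, attribute):
    """Estimate P(A_j, dt | S) per time-gap bucket from records with known entities."""
    by_entity = defaultdict(list)
    for rec in records:
        by_entity[rec["e"]].append(rec)

    agree = Counter()
    total = Counter()
    for recs in by_entity.values():
        for a, b in combinations(recs, 2):
            dt = int(abs(a["t"] - b["t"]))    # assumed bucketing of the time gap
            total[dt] += 1
            if a[attribute] == b[attribute]:  # exact match stands in for sim_j >= s_same
                agree[dt] += 1
    return {dt: agree[dt] / total[dt] for dt in total}


def p_disagree_different(values, value):
    """Approximate P(not A_j | not S) for one value as 1 minus its relative frequency."""
    counts = Counter(values)
    return 1.0 - counts[value] / len(values)


surnames = ["smith", "smith", "smith", "dijkstra"]
print(p_disagree_different(surnames, "smith"))     # frequent value, lower probability
print(p_disagree_different(surnames, "dijkstra"))  # rare value, higher probability
```

An adaptive version would keep the `agree` and `total` counters alive and increment them after every match decision instead of recomputing them from scratch.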

12 Adaptive temporal matching process
Assume an initial database $R$ of known entity records, and a stream of query records $q$
For each $q$, the following process is conducted:
1. Get a set of candidate records $C$ from $R$ using an appropriate blocking/indexing technique
2. For each candidate record $c \in C$, calculate the overall adjusted similarity $sim(c, q)$
3. Get $c_{best}$ with the highest similarity $s_{best}$ of all $c$
4. If $s_{best} \ge s_{match}$, set $q.e = c_{best}.e$, else set $q.e$ to a new unique entity identifier value
5. Update $P(A_j, \Delta t \mid S)$ and $P(\neg A_j, \Delta t \mid \neg S)$
6. Add $q$ into database $R$
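The six steps above can be sketched as a simple loop. This is an illustrative outline under stated assumptions: records are dictionaries whose entity identifier is stored under the key `e`, and `get_candidates`, `similarity` and `update_probabilities` are hypothetical callables supplied by the caller (standing in for blocking, the adjusted similarity and the probability update). The scheme for generating new identifiers is also invented for the example.

```python
from itertools import count


def resolve_stream(database, queries, get_candidates, similarity,
                   update_probabilities, s_match=0.7):
    """Match each query record to an entity in the database, then add it."""
    new_ids = count(1)
    for q in queries:
        candidates = get_candidates(database, q)              # step 1
        scored = [(similarity(c, q), c) for c in candidates]  # step 2
        best_sim, best = max(scored, key=lambda pair: pair[0],
                             default=(0.0, None))             # step 3
        if best is not None and best_sim >= s_match:          # step 4
            q["e"] = best["e"]
        else:
            q["e"] = f"new{next(new_ids)}"                    # hypothetical id scheme
        update_probabilities(q, best)                         # step 5
        database.append(q)                                    # step 6
    return database


# Toy usage: every record is a candidate and similarity is exact surname equality.
db = [{"e": "e1", "surname": "miller"}]
stream = [{"surname": "miller"}, {"surname": "obrien"}, {"surname": "obrien"}]
resolve_stream(db, stream,
               get_candidates=lambda database, q: list(database),
               similarity=lambda c, q: 1.0 if c["surname"] == q["surname"] else 0.0,
               update_probabilities=lambda q, best: None)
print([r["e"] for r in db])
```

Because each query is added to the database before the next one is processed, the second "obrien" record is matched to the entity created for the first one.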

13 Experiments and data sets
- We used a North Carolina voter database (personal details of 2.4 million voters, only 113,801 voters with duplicate records)
- We also generated synthetic data sets based on real personal data (details and results in the paper)
- Prototype implemented in Python (code available from the authors)
- Three baseline approaches
  - Traditional entity resolution that does not consider temporal aspects
  - An additional temporal attribute $a_{temp} = \Delta t / \max(\Delta t)$
  - Non-adaptive temporal (no update of probabilities)
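For the second baseline, the temporal attribute is the time difference scaled by the largest time difference. A small sketch of that normalisation follows; the function name, the assumption of non-negative time differences, and the handling of an all-zero input are illustrative choices, and the slides do not say how the attribute is then combined with the other similarities.

```python
def temporal_attribute(delta_ts):
    """Return a_temp = dt / max(dt) for each non-negative time difference."""
    max_dt = max(delta_ts, default=0)
    if max_dt == 0:
        # Assumed convention: no temporal spread means a_temp is 0 everywhere.
        return [0.0 for _ in delta_ts]
    return [dt / max_dt for dt in delta_ts]


print(temporal_attribute([0, 5, 10]))
```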

14 Results on NC voter data
[Figure: NC Voter matching quality with $s_{match} = 0.7$. Y-axis: percentage of true matches correctly identified. Approaches compared: None, Temp Attr, Adapt, Non Adapt. Series: TM and TM top 10, each for $s_{same} = 0.8$ and $s_{same} = 0.9$.]

15 Temporal overhead for NC voter data
[Figure: Timing results for NC Voter data set. Y-axis: time in milliseconds. Components: match time, adjust time, update time. Approaches compared: None, Temp Attr, Adapt, Non Adapt.]

16 Conclusions and future work
- We proposed an efficient approach for adaptive entity resolution on dynamic databases
- We consider temporal aspects to adjust agreement and disagreement weights
- Experiments showed that taking temporal aspects into account can improve matching quality
- Future work includes
  - Take attribute dependencies into account
  - Combine the proposed approach with probabilistic record linkage
  - Incorporate constraints

17 Results on synthetic data
Columns: Parameters, None, Adapt, Avrg rec per ent
Rows: Clean, 0.8 / ; Clean, 0.9 / ; Dirty, 0.8 / ; Dirty, 0.9 /
- Results reported are accuracy, calculated as the percentage of true matches correctly identified
- Parameter values are for $s_{same}$ / $s_{match}$

18 Results on NC voter data (2)
[Figure: NC Voter matching quality with $s_{match} = 0.8$. Y-axis: percentage of true matches correctly identified. Approaches compared: None, Temp Attr, Adapt, Non Adapt. Series: TM and TM top 10, each for $s_{same} = 0.8$ and $s_{same} = 0.9$.]


More information

After Conversation - A Forensic ICQ Logfile Extraction Tool

After Conversation - A Forensic ICQ Logfile Extraction Tool Edith Cowan University Research Online ECU Publications Pre. 2011 2005 After Conversation - A Forensic ICQ Logfile Extraction Tool Kim Morfitt Edith Cowan University Craig Valli Edith Cowan University

More information

Microsoft Implementing Desktop Application Environments

Microsoft Implementing Desktop Application Environments 1800 ULEARN (853 276) www.ddls.com.au Microsoft 20416 - Implementing Desktop Application Environments Length 5 days Price $4290.00 (inc GST) Version B Overview This five-day course provides students with

More information

August 2017 G-NAF. Data Release Report August 2017

August 2017 G-NAF. Data Release Report August 2017 Product: Prepared: G-NAF Data Report Revision History Date Version Change Coordinator 1.0 Initial Version Anthony Hesling Disclaimer PSMA Australia believes this publication to be correct at the time of

More information

ACS CBOK ICT Building Blocks

ACS CBOK ICT Building Blocks Assessing the work readiness skills of ICT graduates: Developing a SFIA-based ICT Curriculum Brian von Konsky PhD(Curtin) FACS CP Charlynn Miller PhD MACS (Senior) CP Asheley Jones DBA Candidate MACS CP

More information

Recordkeeping Standards Analysis of HealthConnect

Recordkeeping Standards Analysis of HealthConnect Recordkeeping Standards Analysis of HealthConnect Electronic Health Records: Achieving an Effective and Ethical Legal and Recordkeeping Framework Australian Research Council Discovery Grant, DP0208109

More information

POLICY FOR THE RE-ISSUE OF NATIONAL CERTIFICATES

POLICY FOR THE RE-ISSUE OF NATIONAL CERTIFICATES POLICY FOR THE RE-ISSUE OF NATIONAL CERTIFICATES Umalusi Umalusi House 37 General van Ryneveld Street Persequor Technopark Pretoria PO Box 151 Persequor Technopark Pretoria 0020 South Africa Tel: +27 12

More information

Staying FIT: Efficient Load Shedding Techniques for Distributed Stream Processing

Staying FIT: Efficient Load Shedding Techniques for Distributed Stream Processing Staying FIT: Efficient Load Shedding Techniques for Distributed Stream Processing Nesime Tatbul Uğur Çetintemel Stan Zdonik Talk Outline Problem Introduction Approach Overview Advance Planning with an

More information

Metadata Elements Comparison: Vetadata and ANZ-LOM

Metadata Elements Comparison: Vetadata and ANZ-LOM Metadata Elements Comparison: Vetadata and ANZ-LOM The Learning Federation and E-standards for Training Version 1.0 April 2008 flexiblelearning.net.au thelearningfederation.edu.au Disclaimer The Australian

More information

UNIVERSITY OF NORTH CAROLINA CHARLOTTE

UNIVERSITY OF NORTH CAROLINA CHARLOTTE STATE OF NORTH CAROLINA OFFICE OF THE STATE AUDITOR BETH A. WOOD, CPA UNIVERSITY OF NORTH CAROLINA CHARLOTTE INFORMATION TECHNOLOGY GENERAL CONTROLS INFORMATION SYSTEMS AUDIT JULY 2017 EXECUTIVE SUMMARY

More information

Autonomic Workload Execution Control Using Throttling

Autonomic Workload Execution Control Using Throttling Autonomic Workload Execution Control Using Throttling Wendy Powley, Patrick Martin, Mingyi Zhang School of Computing, Queen s University, Canada Paul Bird, Keith McDonald IBM Toronto Lab, Canada March

More information

The Pairwise-Comparison Method

The Pairwise-Comparison Method The Pairwise-Comparison Method Lecture 10 Section 1.5 Robb T. Koether Hampden-Sydney College Mon, Sep 11, 2017 Robb T. Koether (Hampden-Sydney College) The Pairwise-Comparison Method Mon, Sep 11, 2017

More information

CORPORATE GOVERNANCE OF INFORMATION & COMMUNICATION TECHNOLOGY

CORPORATE GOVERNANCE OF INFORMATION & COMMUNICATION TECHNOLOGY AS 8015 2005 CORPORATE GOVERNANCE OF INFORMATION & COMMUNICATION TECHNOLOGY This Australian Standard was prepared by Committee IT-030, IT Governance. It was approved on behalf of the Council of Standards

More information

MICROSOFT INFOPATH 2007 ESSENTIALS

MICROSOFT INFOPATH 2007 ESSENTIALS Phone:1300 121 400 Email: enquiries@pdtraining.com.au MICROSOFT INFOPATH 2007 ESSENTIALS Generate a group quote today or register now for the next public course date COURSE LENGTH: 1.0 DAYS This course

More information

Entity Resolution over Graphs

Entity Resolution over Graphs Entity Resolution over Graphs Bingxin Li Supervisor: Dr. Qing Wang Australian National University Semester 1, 2014 Acknowledgements I would take this opportunity to thank my supervisor, Dr. Qing Wang,

More information

MICROSOFT EXCEL 2007 ADVANCED

MICROSOFT EXCEL 2007 ADVANCED Phone:1300 121 400 Email: enquiries@pdtraining.com.au MICROSOFT EXCEL 2007 ADVANCED Generate a group quote today or register now for the next public course date COURSE LENGTH: 1.0 DAYS Excel is the world

More information

Deduplication of Hospital Data using Genetic Programming

Deduplication of Hospital Data using Genetic Programming Deduplication of Hospital Data using Genetic Programming P. Gujar Department of computer engineering Thakur college of engineering and Technology, Kandiwali, Maharashtra, India Priyanka Desai Department

More information

Commercial Projects. Data Centres & IT. Alternative Energy Generation. Manufacturing & Service

Commercial Projects.   Data Centres & IT. Alternative Energy Generation. Manufacturing & Service Commercial Projects Data Centres & IT Alternative Energy Generation Manufacturing & Service www.silcar.com.au 02 Silcar designs, constructs, operates, manages and maintains critical infrastructure assets

More information

Online Mining of Frequent Query Trees over XML Data Streams

Online Mining of Frequent Query Trees over XML Data Streams Online Mining of Frequent Query Trees over XML Data Streams Hua-Fu Li*, Man-Kwan Shan and Suh-Yin Lee Department of Computer Science National Chiao-Tung University Hsinchu, Taiwan 300, R.O.C. http://www.csie.nctu.edu.tw/~hfli/

More information

Ontology-based Integration and Refinement of Evaluation-Committee Data from Heterogeneous Data Sources

Ontology-based Integration and Refinement of Evaluation-Committee Data from Heterogeneous Data Sources Indian Journal of Science and Technology, Vol 8(23), DOI: 10.17485/ijst/2015/v8i23/79342 September 2015 ISSN (Print) : 0974-6846 ISSN (Online) : 0974-5645 Ontology-based Integration and Refinement of Evaluation-Committee

More information

Private Candidates Guide

Private Candidates Guide Private Candidates Guide Please note the following before proceeding with the SCHOOLS REGISTRATION SYSTEM Customers under 18 must be registered by a parent or guardian. A Parent or guardian can use the

More information

A Clustering-Based Framework to Control Block Sizes for Entity Resolution

A Clustering-Based Framework to Control Block Sizes for Entity Resolution A Clustering-Based Framework to Control Block s for Entity Resolution Jeffrey Fisher Research School of Computer Science Australian National University Canberra ACT 0200 jeffrey.fisher@anu.edu.au Peter

More information

Assessing Deduplication and Data Linkage Quality: What to Measure?

Assessing Deduplication and Data Linkage Quality: What to Measure? Assessing Deduplication and Data Linkage Quality: What to Measure? http://datamining.anu.edu.au/linkage.html Peter Christen and Karl Goiser Department of Computer Science, Australian National University,

More information

SmartGossip: : an improved randomized broadcast protocol for sensor networks

SmartGossip: : an improved randomized broadcast protocol for sensor networks SmartGossip: : an improved randomized broadcast protocol for sensor networks Presented by Vilas Veeraraghavan Advisor Dr. Steven Weber Presented to the Center for Telecommunications and Information Networking

More information

PRIVACY POLICY 1. ABOUT THIS POLICY

PRIVACY POLICY 1. ABOUT THIS POLICY Updated Privacy Policy We ve recently updated our Privacy Policy. The updated Privacy Policy will automatically come into effect on 6 August 2018. Your c ontinued use of the Platform from that date onwards

More information

Telephone Survey Response: Effects of Cell Phones in Landline Households

Telephone Survey Response: Effects of Cell Phones in Landline Households Telephone Survey Response: Effects of Cell Phones in Landline Households Dennis Lambries* ¹, Michael Link², Robert Oldendick 1 ¹University of South Carolina, ²Centers for Disease Control and Prevention

More information

Cleanup and Statistical Analysis of Sets of National Files

Cleanup and Statistical Analysis of Sets of National Files Cleanup and Statistical Analysis of Sets of National Files William.e.winkler@census.gov FCSM Conference, November 6, 2013 Outline 1. Background on record linkage 2. Background on edit/imputation 3. Current

More information

CFSE / CFSP Training & Certification

CFSE / CFSP Training & Certification CFSE / CFSP Training & Certification The Certified Functional Safety Expert (CSFE) and the Certified Functional Safety Professional (CFSP) are global programs that apply to the field of functional safety.

More information

The Plurality-with-Elimination Method

The Plurality-with-Elimination Method The Plurality-with-Elimination Method Lecture 9 Section 1.4 Robb T. Koether Hampden-Sydney College Fri, Sep 8, 2017 Robb T. Koether (Hampden-Sydney College) The Plurality-with-Elimination Method Fri, Sep

More information