Deduplication

Records are deduplicated as they are read into the CRS, but it is sometimes useful to deduplicate manually. There are two tabs within the deduplication module:

1. Deduplication: checks records in the segment against other records in the segment and flags potential duplicates.
2. Central Deduplication: checks the records in the segment that have been sent to be published in CENTRAL against existing CENTRAL records and flags possible duplicates.

Contents

Deduplication within a segment
Creating a deduplication filter
The deduplication process
Merging duplicates one at a time
Marking records as not duplicates
Merging all records
Deleting a record
CENTRAL deduplication

Deduplication within a segment

The first thing to do is decide on a strategy for deduplicating the records. There is a default filter built into the CRS, which checks the first 35 characters of the title, the year and the first four characters of the author. This filter is selected by default, but it is also possible to edit it and choose specific criteria for deduplication. As many filters as are necessary can be created and stored within the local CRS installation.

The default filter works well for most types of records, but sometimes it's useful to explore the other options to find duplicates that may not match the default criteria. Adding a Fuzzy search to the Title, for example, can help find records where the title may have been misspelt or has slightly different wording. Fuzzy searching is a lot slower, so it is not activated by default; however, it is definitely worth exploring.

Creating a deduplication filter

1. Go to View Deduplication Preferences on the toolbar.

2. The deduplication preferences box will then appear. Click Create New and enter a name for the new filter, and then click OK.
3. Check the Match records by subtype box to match records by subtype before deduplicating. For example, if there are records in the register with the subtype BOOK, they will not be deduplicated against those records with the subtype JOURNAL.
4. Choose the Record category to be deduplicated (References or Studies) from the drop-down menu.
5. Pick the fields that will give the best chance of a match by checking the box next to the field.

The title field may be the most useful. Options are given to remove punctuation and strip out any brackets, which will increase the chances of finding a match in those records where the punctuation may be slightly different. These options can be selected by checking the boxes. The search can also be restricted to a specific number of characters, so if there are a lot of very long titles which start with the same 35 characters, it might be worth increasing that value to check those records.

A fuzzy search goes through all the records and compares each title, for example, with every other title, one at a time. It works out how different the two titles are and then decides whether they are so different that they are unlikely to be the same value with some misspellings in one, or whether they potentially ought to be the same value. The precision setting determines the cut-off point for making that decision, and the default setting of 38 characters has been found to be a good option for general use.

When matching on Author there is the option to search on the Author surname only, which will ignore any author initials in the field. So, for example, Chen XI would match with Chen I, which may be correct, but of course it would also match Chen XP, which is unlikely to be correct. It's worth experimenting with this feature to find the optimum match.

If Year of publication is included as part of the deduplication filter, just the year can be extracted, leaving out any months, e.g. 1995 Jan would match with 1995. In practice this is likely to be the best setting, because a potential duplicate would not be missed simply because one record had a month and the other didn't.

6. Once all the required fields have been selected, click Save to save the filter.
7. The filter will now appear in the drop-down menu at the top of the screen for future use.
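The filter options described above can be sketched as a function that builds a comparison key for each record. This is only an illustration under stated assumptions, not the CRS implementation: the function name `match_key`, the dictionary field names (`title`, `author`, `year`), and the choice to strip bracket characters rather than bracketed text are all hypothetical. The defaults mirror the default filter: first 35 characters of the title, the year, and the first four characters of the author.

```python
import re
import string


def match_key(record, title_chars=35, author_chars=4,
              strip_punctuation=True, strip_brackets=True, year_only=True):
    """Build a comparison key from a record (hypothetical field names)."""
    title = record.get("title", "")
    if strip_brackets:
        # Assumption: only the bracket characters are removed, not their contents.
        title = re.sub(r"[\[\](){}]", "", title)
    if strip_punctuation:
        title = title.translate(str.maketrans("", "", string.punctuation))
    # Lower-case, collapse whitespace, then keep the leading characters.
    title = " ".join(title.lower().split())[:title_chars]

    author = record.get("author", "").lower()[:author_chars]

    year = record.get("year", "")
    if year_only:
        # "1995 Jan" becomes "1995"; values without a 4-digit year are kept as-is.
        found = re.search(r"\d{4}", year)
        year = found.group(0) if found else year

    return (title, year, author)


a = {"title": "Aspirin for headache: a randomised trial.",
     "author": "Chen XI", "year": "1995 Jan"}
b = {"title": "[Aspirin for headache - a randomised trial]",
     "author": "Chen X", "year": "1995"}

print(match_key(a) == match_key(b))
```

Two records whose keys are equal would be flagged as potential duplicates; widening `title_chars` makes the filter stricter for long titles that share the same opening words.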

The deduplication process

1. Once a filter has been selected or created, the deduplication process can begin. Choose which fields are displayed on the results screen by selecting Table Template.
2. The Custom Table Template box will then appear. Choose the fields to be displayed by checking the box next to the field, and then click the plus sign to add it to the Selected Fields frame on the right. Arrange the order in which the fields appear by highlighting the field name and using the green arrows to move it up or down.
3. All the records in the segment can be deduplicated, or just the marked records.

The default option is to deduplicate all the records in the segment.

4. Click on the green Process arrow to begin the deduplication process.
5. A progress bar will appear at the bottom of the screen, along with a count of possible duplicates found. It is possible to cancel the process at this stage by clicking on the process icon (Stop deduplication) at the bottom of the screen, to the right of the progress bar. The speed of the process will depend on how many records have been chosen to deduplicate and how the deduplication filter has been set up. Using fuzzy searching as part of the deduplication process will slow the process considerably.
6. Once the deduplication process is complete, a message displays showing how many records have been deduplicated successfully, and how many possible duplicates have been returned. The results of the deduplication process will be displayed, and the number of possible matches appears in the top right-hand corner. Results are colour-coded: records in the same colour next to each other are potential duplicates.
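To see why fuzzy searching slows the process, it helps to sketch the pairwise comparison described earlier. The example below is an assumption-laden illustration, not the CRS algorithm: it uses Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure, and the `threshold` parameter (a ratio between 0 and 1) is hypothetical and is not on the same scale as the CRS precision setting.

```python
from difflib import SequenceMatcher
from itertools import combinations


def fuzzy_pairs(titles, threshold=0.9):
    """Compare every title with every other title and return close pairs."""
    pairs = []
    # n titles need n * (n - 1) / 2 comparisons, which is why this is slow.
    for (i, first), (j, second) in combinations(enumerate(titles), 2):
        score = SequenceMatcher(None, first.lower(), second.lower()).ratio()
        if score >= threshold:
            pairs.append((i, j, round(score, 2)))
    return pairs


titles = [
    "Aspirin for acute headache",
    "Asprin for acute headache",       # misspelt title
    "Statins and stroke prevention",
]

for i, j, score in fuzzy_pairs(titles):
    print(titles[i], "<->", titles[j], score)
```

Because every record is compared with every other record, doubling the number of records roughly quadruples the number of comparisons, whereas an exact key match only needs one pass over the records.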

Merging duplicates one at a time

It is possible to look at the records individually side by side, and then merge them:

1. Select both potential duplicates by marking the check boxes.
2. Click the Merge icon on the toolbar.
3. The records are displayed side by side. The record marked with a symbol is a CENTRAL record. Where there is a difference in field values, it will be highlighted in yellow.

4. By default, the CRS chooses the field with the most data as the field that will be kept in the merged record. To change this setting, click on the Merge Preferences icon. It is possible to turn off the preference to use the field with most text, and also the preference to use fields containing non-Latin characters, by unchecking the boxes.

Checking the box to Merge multiple value fields will turn on the preference to merge together fields which have more than one value. Click Apply to return to the records.

5. It is possible to opt to keep all of the fields in either one of the records by clicking on the Select all fields in this record icon. Notice that all the boxes on the selected record are now ticked and the corresponding boxes on the deselected record are now blank.
6. Alternatively, select individual fields from each record that are to be included in the merged record by marking the box next to the field in the Use column.

7. When the required fields are selected, the records can be merged by clicking the Merge icon.
8. A message asks you to confirm. Click Yes and the two records will be merged.
9. The merged records now appear in green in the list of results.

Marking records as not duplicates

If the two records being displayed are not deemed to be duplicates, they can be marked as Not Duplicate so that they don't appear in the list again.

1. Select both potential duplicates by marking the check boxes.
2. Select the Not Duplicate icon.
3. The records will now be flagged as unique and will not be displayed in the next deduplication session.
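The merge preferences described under "Merging duplicates one at a time" (keep the field with the most data, optionally merge multiple-value fields) can be sketched as follows. This is a hypothetical illustration only: the function names, the representation of records as dictionaries, and the use of lists for multiple-value fields are assumptions, not the CRS data model.

```python
def text_length(value):
    """Amount of text in a field; lists count the text of all their values."""
    if isinstance(value, list):
        return sum(len(item) for item in value)
    return len(value)


def merge_records(first, second, prefer_most_text=True, merge_multi_value=False):
    """Merge two records field by field (hypothetical preference names)."""
    merged = {}
    for field in sorted(set(first) | set(second)):
        candidates = [rec[field] for rec in (first, second)
                      if rec.get(field) is not None]
        if not candidates:
            merged[field] = None
        elif (merge_multi_value and len(candidates) == 2
                and all(isinstance(c, list) for c in candidates)):
            # Combine both value lists, dropping repeated values.
            combined = list(candidates[0])
            combined += [v for v in candidates[1] if v not in combined]
            merged[field] = combined
        elif prefer_most_text:
            # Keep whichever value holds the most text (ties keep the first).
            merged[field] = max(candidates, key=text_length)
        else:
            merged[field] = candidates[0]
    return merged


a = {"title": "Aspirin for headache", "year": "1995", "keywords": ["aspirin"]}
b = {"title": "Aspirin for headache: a randomised trial",
     "year": "1995 Jan", "keywords": ["headache", "aspirin"]}

print(merge_records(a, b))
print(merge_records(a, b, merge_multi_value=True)["keywords"])
```

Selecting fields by hand in the Use column corresponds to overriding these defaults for individual fields before confirming the merge.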

Merging all records

There is also the option to merge all of the records without viewing them individually, if you are confident that the records are all duplicates and that the default fields are the required fields in the merged record. Select Mark all records. Notice that only one of the pair of records is marked.

1. Click the Merge All icon.
2. A message will be generated: Merging all records will delete all matching records that have been found within the deduplication process from the local database. Do you wish to continue? Click Yes and the records will be merged and appear in green on the deduplication screen.

Deleting a record

If one (or both) of the duplicates is not needed, it is possible to delete them from your segment.

1. Select the record to be deleted by marking the check box.
2. Click the Delete marked records icon.
3. A message will be generated: Are you sure you want to delete 1 marked record? Click Yes and the record will be deleted.

Tip: It is not possible to delete records which are published in CENTRAL, only records which appear in the user's segment and are not in CENTRAL.

If the deduplication process must be cancelled, click the arrow on the toolbar (Restart deduplication process) to start again.

When the deduplication process has completed, click the End Deduplication icon.

CENTRAL Deduplication

When a record is sent from the user's segment to be published in CENTRAL, the CRS automatically searches for duplicates. The results are displayed in the CENTRAL deduplication tab. Clicking on the record to highlight it will display the local record (My record value, on the left) and the CENTRAL record (Currently published CENTRAL value, on the right) side by side for comparison.

There are then five options:

1. If the records are not duplicates, click Add as New CENTRAL record. This will send the local record to CENTRAL as a new record and it will no longer appear as a duplicate.
2. If the record is a duplicate, but no changes are required either to the local record or the CENTRAL record, click the Add my Group code button, in blue. This will associate the user's group code with the record and cancel the publication of the local record.
3. If the record is a duplicate, but the local version has correct fields and the CENTRAL record has incorrect fields, click the Add group code and update CENTRAL fields button, in green. This will update the CENTRAL record with the correct fields.
4. If the record is a duplicate, but the CENTRAL version has correct fields and the local record has incorrect fields, click the Add group code & copy CENTRAL fields button, in black. This will update the local record with the correct fields and assign your group's code to the CENTRAL record.
5. If the record is a duplicate, but neither is completely correct, then click the button marked Cancel records CENTRAL publication request. It will then be necessary to find the record by searching for it within the CRS, and update the fields before sending the record for publication in CENTRAL again.

The decisions applied to each record will appear as ticked boxes in the corresponding columns next to each record. When the process has been completed for each, click Submit. The records will then be sent for publication in CENTRAL.

If a record has more than one duplicate, the CRS will display this in the Match Count column. Click on the record to highlight it; this will display both matches in two separate tabs.

Assign one of the five options using the same methods outlined above. Any of the five options can also be applied globally to all of the matches. To do this, use the relevant option in the smaller icons above the list of records.

Tip: When a record is owned by more than one group, the CRS will notify the other group's Trials Search Co-ordinator of a conflict. Both owners must agree changes before they are finalised.
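The five CENTRAL deduplication options amount to a small decision table, which can be summarised in code. The sketch below is purely illustrative: the enum, the function `choose_action` and its three yes/no inputs are hypothetical names introduced here, and only the button labels are taken from the text above.

```python
from enum import Enum


class CentralAction(Enum):
    # Values are the button labels as given in the guide.
    ADD_AS_NEW = "Add as New CENTRAL record"
    ADD_GROUP_CODE = "Add my Group code"
    UPDATE_CENTRAL = "Add group code and update CENTRAL fields"
    COPY_CENTRAL = "Add group code & copy CENTRAL fields"
    CANCEL_REQUEST = "Cancel records CENTRAL publication request"


def choose_action(is_duplicate, local_correct, central_correct):
    """Map a reviewer's judgement of a matched pair to one of the five options."""
    if not is_duplicate:
        return CentralAction.ADD_AS_NEW
    if local_correct and central_correct:
        return CentralAction.ADD_GROUP_CODE      # no changes needed
    if local_correct:
        return CentralAction.UPDATE_CENTRAL      # local fields replace CENTRAL's
    if central_correct:
        return CentralAction.COPY_CENTRAL        # CENTRAL fields replace local
    return CentralAction.CANCEL_REQUEST          # fix the record, then resubmit


print(choose_action(True, False, True).value)
```

The judgement of whether each version's fields are correct still has to be made by the person comparing the two records side by side; the function only records which button follows from that judgement.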