Fuzzy Matching in Fraud Analytics. Grant Brodie, President, Arbutus Software
1 Fuzzy Matching in Fraud Analytics Grant Brodie, President, Arbutus Software
2 Outline
- What Is Fuzzy?
- Causes
- Effective Implementation
- Application to Specific Products
- Demonstration
- Q&A
3 Why Is Fuzzy Important?
- Big data
- Too many transactions
- User-entered data (web sites)
- E-commerce
- Less manual oversight
4 What Is Fuzzy?
- Subset of duplicates testing
- Find specific keywords in text (FCPA, PCard)
- Close, but not the same
- Two reasonable definitions:
  - Proximity
  - Looks similar
5 Proximity
- Sorts close together
- Characters: Albert vs. Albertson
- Numbers: 123, vs. 123,
- Dates: Jan 19, 2014 vs. Jan 20,
6 Looks Similar
- Characters: Microsoft vs. Wicrosoft
- Numbers: 127, vs. 12,
- Dates: Jan 13, 2014 vs. Jan 31,
7 Traditional Approach to Close
- Pronunciation based
  - Soundex
  - NYSIIS
- Designed for names
- Many false positives
- Not useful for numbers or dates
8 Fuzzy Today
- Based on physical string matching
  - Levenshtein (ACL)
  - Damerau-Levenshtein (Arbutus)
  - N-Gram
  - Jaro-Winkler
  - And many more
- Differences expressed as a distance or percentage
9 Quick Lesson: Damerau-Levenshtein
- Minimum number of changes needed to turn one string into another
- Allowed changes: insert, delete, replace, transpose
- 123 Main Street vs. 123 Main St = vs = 1 (Levenshtein: 2)
- Rob vs. Robert = 3
- Gary vs. Mary = 1
- Gary vs. gary = 1
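The distance described on this slide can be sketched in a few lines of Python. This is the "optimal string alignment" variant of Damerau-Levenshtein, chosen here for brevity; it is not taken from ACL or Arbutus, and the function name and the `allow_transpose` flag are illustrative assumptions. Switching the flag off gives plain Levenshtein, which shows why a transposition costs 2 there instead of 1.

```python
def edit_distance(a: str, b: str, allow_transpose: bool = True) -> int:
    """Minimum number of single-character edits that turn a into b.

    Edits are insert, delete, replace and (optionally) a swap of two
    adjacent characters. This is the optimal string alignment variant,
    used here as a simple stand-in for Damerau-Levenshtein.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of a
    for j in range(cols):
        d[0][j] = j  # insert every character of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # delete
                d[i][j - 1] + 1,         # insert
                d[i - 1][j - 1] + cost,  # replace (or keep)
            )
            if (allow_transpose and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transpose
    return d[-1][-1]


print(edit_distance("Rob", "Robert"))   # 3
print(edit_distance("Gary", "Mary"))    # 1
print(edit_distance("Gary", "gary"))    # 1 (case counts as a difference)
print(edit_distance("12345", "12435"))  # 1 with transposition
print(edit_distance("12345", "12435", allow_transpose=False))  # 2
```

The table has one cell per pair of character positions, so a single comparison is already proportional to the product of the two string lengths, before any record-to-record pairing is considered.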
10 Problems with String Matching
- Very literal
- Doesn't apply any context
- John Smith vs. John Smith (1)
- Smith John vs. Smith, John (1)
- John Smith vs. john smith (2)
- México vs. Mexico (1)
- John Smith vs. john smith same as John Hmitz (2)
11 What Do You Use?
- Whatever your tool offers
- Almost impossible to implement manually
- VERY compute intensive
12 Causes
- Accidental errors
  - Carelessness/mistyping
  - Transpositions
  - Blurry source
  - Punctuation
  - Extra blanks
  - 1 vs. I, 0 vs. O (particularly with OCR)
13 Errors vs. Fraud
- All of the causes were likely errors
- Fraud uses intentional errors to mask activity
  - Obscure duplicates
  - Obscure relationships
  - Trick through similarity
- Disparate systems make comparison even harder
14 Practical Issues
- Generally hard to target fuzzy tests
- Forced to use broad tests
- Most findings will be errors
- Even so, the finding is still valuable
- Need a process to address errors found
15 Our System Catches Duplicates
- Exact matches only
- Strict application (i.e. company, vendor, invoice)
- May only warn
- Not all duplicates are payments
- Most only test document numbers
16 Types of Duplicates
- Names
  - Personal
  - Corporate
- Addresses
- Document numbers (e.g., invoice)
- Contact information
  - Phone numbers
  - E-mails
17 Issues
- Very compute intensive (wait times)
- Quadratic relationship: 1000x data = 1,000,000x more work
- False positives
- Ease of use
18 False Positives
- Easily the most challenging aspect
- Any time spent on a false positive is wasted
- Can easily outnumber the true positives by 10, 100, 1000 to 1
- If too many, can remove any cost effectiveness
- How does this happen?
  - Only one way to get an exact match
  - Virtually unlimited ways to get close
19 False Positive Examples
- Matching to 12345 with a single difference:
  - Missing (1245): 5
  - Transposition (12435): 4
  - Incorrect (12745): min 45 (175 if alpha, 1,000+ if any char)
  - Extra (123345): min 60 (200+ if alpha, 1,000+ if any char)
- Hundreds/thousands of ways that differ by just 1
- Not just errors, all close values
- Exponentially more with a distance of 2
- Bad actor tries to rely on his needle in a haystack
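The counts on this slide can be checked by brute force. The sketch below enumerates every string exactly one edit away from a five-digit value; the function name and the choice of alphabets are assumptions made for illustration.

```python
import string


def single_edit_neighbors(value: str, alphabet: str) -> dict:
    """Return the distinct strings one edit away from value, by edit type."""
    n = len(value)
    missing = {value[:i] + value[i + 1:] for i in range(n)}
    transposition = {
        value[:i] + value[i + 1] + value[i] + value[i + 2:]
        for i in range(n - 1)
        if value[i] != value[i + 1]  # swapping equal characters changes nothing
    }
    incorrect = {
        value[:i] + c + value[i + 1:]
        for i in range(n)
        for c in alphabet
        if c != value[i]
    }
    extra = {value[:i] + c + value[i:] for i in range(n + 1) for c in alphabet}
    return {
        "missing": missing,
        "transposition": transposition,
        "incorrect": incorrect,
        "extra": extra,
    }


digits = single_edit_neighbors("12345", string.digits)
alnum = single_edit_neighbors("12345", string.digits + string.ascii_uppercase)
for kind in digits:
    print(kind, len(digits[kind]), len(alnum[kind]))
```

For digits only this yields 5 missing, 4 transposition, 45 incorrect and 55 extra. The slide's figure of 60 for an extra character counts 6 positions times 10 digits; the set above holds 55 because inserting a digit next to an identical digit (a second 1 before or after the existing 1, for instance) produces the same string twice. With a 36-character alphanumeric alphabet the incorrect count rises to 175.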
20 How to Address the Issues
- Data preparation
- Utilize context
- Use tight specifications
- Choose software that meets needs
- Rank your results
21 Choose Your Software
- Has the capabilities you need
- Can process your data volumes
- Easy to implement
- Easy to automate
- ACL, Arbutus, IDEA, fraud-specific, non-audit tools
22 Data Preparation
- Remove immaterial differences first (i.e., normalization)
- Text manipulation
  - Upper case
  - Punctuation
  - Extra blanks
  - Foreign characters (México vs. Mexico, Québec vs. Quebec)
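The four text manipulations listed on this slide can be sketched with the Python standard library. This is an illustration of the idea only, not how any audit tool implements it; `basic_normalize` is a hypothetical name, and dropping punctuation outright (instead of replacing it with a blank) is a design choice made here to mirror the "keep only letters, digits and blanks" approach shown on the later slides.

```python
import re
import unicodedata


def basic_normalize(text: str) -> str:
    """Upper-case, strip accents and punctuation, collapse extra blanks."""
    # Foreign characters: split accented letters into base letter plus
    # combining mark, then drop the combining marks (É becomes E).
    decomposed = unicodedata.normalize("NFKD", text)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Upper case.
    upper = unaccented.upper()
    # Punctuation: keep only A-Z, 0-9 and whitespace.
    cleaned = re.sub(r"[^A-Z0-9\s]", "", upper)
    # Extra blanks: split on any whitespace run and rejoin with one blank.
    return " ".join(cleaned.split())


print(basic_normalize("México"))            # MEXICO
print(basic_normalize("  Québec,  Inc. "))  # QUEBEC INC
print(basic_normalize("Smith, John") == basic_normalize("SMITH  JOHN"))  # True
```

After this step, differences in case, accents, punctuation and spacing no longer contribute to the edit distance, so a tight threshold of 1 or 2 is spent on differences that matter.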
23 Data Preparation (Cont.) (Remove immaterial differences first, normalization)
- Eliminate noise words
- Different by type of data
  - Address: Suite, Unit
  - Corporate name: Company, Co, Inc
  - Personal name: Mr, Ms, Dr, Prof
24 Data Preparation (Cont.) (Remove immaterial differences first, normalization)
- Common misspellings/typos
- Common vocabulary (chair vs. silla)
- Different by data type
  - Avenue: Av, Ave, Aven, Avenu
  - First vs. 1st
  - West vs. W
  - Richard, Rick, Dick, Ricky, Rich
25 Data Preparation (Cont.) (Remove immaterial differences first, normalization)
- Word order
  - 123 W Main St. vs. 123 Main St. W
26 Data Preparation: Result
- Well-implemented data prep minimizes the need for fuzzy matching
- Consider the two addresses:
  - # Main Street West
  - 1234 W MAIN ST, Suite 200
- Levenshtein distance is 20
- Applying data prep can make both strings identical: W ST MAIN
27 Text Manipulation: ACL
- Create a computed field
- Upper case: Upper(field) (FUZZYDUP ignores case, but data prep is simpler)
- Punctuation: Include(field, ABCDEFGHIJKLMNOPQRSTUVWXYZ ), but
- Extra blanks (replace 2 with 1): Replace(Replace(field,, ),, )
- Foreign characters: Replace(Replace(field É, E ), Á, A )
- Replace(Replace(Replace(Replace(Include(Upper(field), ABCDEFGHIJKLMNOPQRSTUVWXYZ ),, ),, ),, ), É, E )
- In practice, many more replace calls
- May break up into multiple fields for clarity
28 Text Manipulation: Arbutus
- Create a computed field
- Upper case: Upper(field)
- Punctuation: Include(field, 0~9A~Z ), but
- Extra blanks: Compact(field)
- Foreign characters: Replace(field, É, E, Á, A, )
- Replace(Compact(Include(Upper(field), 0~9A~Z )), É, E )
- May break up into multiple fields for clarity
- Only for unusual situations (use Normalize function)
29 Eliminate Noise Words: ACL
- Use whole words: Omit(field+, INCORPORATED,INC,LIMITED,LTD, F), but
- Omit(field, INC ): CINCH INDUSTRIES becomes CH INDUSTRIES
- Problem is, many noise words to eliminate; two solutions:
  - Long list: Alltrim(Omit(field+, INCORPORATED,INC,LIMITED,LTD,CORPORATION, CORP, ))
  - Sequential omits of a variable in a group: v_field=omit(field v_field=omit(v_field
30 Common Vocabulary: ACL
- Similar to noise words, only Replace instead of Omit
- Use whole words: Replace(field+, ROAD, RD )
  - Otherwise, BROADWAY becomes BRDWAY
- Don't omit, as Peachtree Lane is not the same as Peachtree Court
- Problem is, MANY vocabulary words to potentially normalize
  - USPS 400 street terms, 500+ male names, 700+ female names
  - Nested functions (with Replace instead of Omit)
  - Sequential replaces of a variable in a group
31 Word Order: ACL
- No practical way to address this
32 Noise Words and Common Vocabulary: Arbutus
- If you choose, ACL syntax all works
- Instead: Use Normalize() or SortNormalize()
  - Automatically implements ALL of the data prep described (Upper case, punctuation, blanks, foreign, noise, vocabulary)
  - Normalize(address, addr.txt )
  - Norm( Suite Main Street West, addr.txt ) = MAIN ST W
  - SortNormalize has the same syntax, but = W ST MAIN
- Normalize can use a separate vocabulary file (addr.txt)
  - Replaces or omits any word, on a whole word basis
  - User configurable and selectable, by data type
33 Noise Words and Common Vocabulary: Arbutus
Substitution file (addr.txt, for example):
  FIRST    1ST
  SEVENTH  7TH
  AV       AVE
  AVENU    AVE
  AVENUE   AVE
  AVN      AVE
  PARKWAY  PKWY
  PARKWY   PKWY
  PKWAY    PKWY
  PKY      PKWY
  SUITE
  UNIT
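A whole-word substitution table like the one on this slide can be imitated in Python with a dictionary and a set. The sketch below is not the Arbutus Normalize() or SortNormalize() implementation; the function name, the extra STREET, WEST and ROAD entries, the sample addresses, and the ascending sort order are all assumptions made for illustration. Any consistent word order serves the purpose of making word order immaterial.

```python
import re

# Whole-word replacements, modelled on the substitution file on the slide.
# STREET, WEST and ROAD are added here only for the example.
SUBSTITUTIONS = {
    "FIRST": "1ST", "SEVENTH": "7TH",
    "AV": "AVE", "AVENU": "AVE", "AVENUE": "AVE", "AVN": "AVE",
    "PARKWAY": "PKWY", "PARKWY": "PKWY", "PKWAY": "PKWY", "PKY": "PKWY",
    "STREET": "ST", "WEST": "W", "ROAD": "RD",
}
# Noise words are dropped entirely.
NOISE_WORDS = {"SUITE", "UNIT"}


def normalize_words(text: str, sort_words: bool = False) -> str:
    """Upper-case, drop punctuation and noise words, standardize vocabulary."""
    words = re.sub(r"[^A-Z0-9\s]", "", text.upper()).split()
    kept = [SUBSTITUTIONS.get(w, w) for w in words if w not in NOISE_WORDS]
    if sort_words:
        kept.sort()  # word order no longer matters once sorted
    return " ".join(kept)


# Hypothetical address pair, written two different ways.
a = "200 - 1234 Main Street West"
b = "1234 W MAIN ST, Suite 200"
print(normalize_words(a))                   # 200 1234 MAIN ST W
print(normalize_words(b))                   # 1234 W MAIN ST 200
print(normalize_words(a, sort_words=True))  # 1234 200 MAIN ST W
print(normalize_words(b, sort_words=True))  # 1234 200 MAIN ST W
print(normalize_words("Broadway"))          # BROADWAY (whole words only)
```

Because the lookup works on whole words, BROADWAY is left alone even though ROAD is in the table, which is the problem the ACL slides warn about when replacing substrings.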
34 False Positive Reduction: Utilize Context
- Data elements always have a context
  - Names or address: location (e.g., city, state, ZIP, country, etc.)
  - Documents: vendor, employee, etc.
- Reference the similarities to minimize the ambiguity
  - Same state, city, similar address: 123 Main St., Springfield, IL/MA
  - Same vendor, date, amount, similar invoice number
35 Utilize Context: Application
- ACL FUZZYDUP: Only supports one key field
  - Concatenate fields into a single expression/computed field
  - State+City+Address
  - Other data types require conversion: vendor+date(dt)+str(amount, 16)+invno
- Arbutus DUPLICATES: Supports multiple key fields
  - Specify each key separately
  - Last key can be fuzzy
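The "exact keys first, fuzzy key last" idea can be sketched as follows: records are grouped on the context fields, and only records inside the same group are compared on the fuzzy field. The record layout, field names and sample invoices are hypothetical, and the distance function is the optimal string alignment variant used as a stand-in for Damerau-Levenshtein.

```python
from collections import defaultdict
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (insert, delete, replace, transpose)."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def fuzzy_duplicates(records, exact_keys, fuzzy_key, max_distance=1):
    """Pair up records that agree on exact_keys and are close on fuzzy_key."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[k] for k in exact_keys)].append(rec)
    matches = []
    for group in groups.values():
        # Comparisons stay inside one group, so the quadratic cost applies
        # to each small group rather than to the whole table.
        for left, right in combinations(group, 2):
            dist = edit_distance(left[fuzzy_key], right[fuzzy_key])
            if dist <= max_distance:
                matches.append((left, right, dist))
    return matches


invoices = [  # hypothetical sample data
    {"vendor": "V100", "date": "2014-01-19", "amount": 1250.00, "invno": "INV10234"},
    {"vendor": "V100", "date": "2014-01-19", "amount": 1250.00, "invno": "INV10243"},
    {"vendor": "V200", "date": "2014-01-19", "amount": 1250.00, "invno": "INV10234"},
    {"vendor": "V100", "date": "2014-01-19", "amount": 1250.00, "invno": "INV99999"},
]
for left, right, dist in fuzzy_duplicates(invoices, ["vendor", "date", "amount"], "invno"):
    print(left["invno"], right["invno"], dist)
```

Only the first two invoices are reported: they share vendor, date and amount and differ by one transposition. The third has an identical invoice number but a different vendor, so it never enters the comparison.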
36 False Positive Reduction: Use Tight Specs
- Levenshtein distance 1, or 2 max
- Looser specifications = more false positives
- Avoid Soundex and similar approaches
- There is no substitute for good data prep
37 False Positives: Rank Your Results
- Order based on exposure
  - Size of item
  - Degree of inherent risk (cash)
- Order based on degree of similarity
  - Distance (1 vs. 2)
  - Number of matching same elements
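One way to combine the two orderings on this slide is a compound sort key. The sketch below assumes each candidate match is a dictionary with hypothetical `distance`, `matching_fields` and `amount` entries; which criterion comes first is a judgment call, and similarity is placed first here purely as an example.

```python
def rank_matches(matches):
    """Most similar first; ties broken by more matching fields, then larger amount."""
    return sorted(
        matches,
        key=lambda m: (m["distance"], -m["matching_fields"], -m["amount"]),
    )


candidates = [  # hypothetical findings
    {"id": "A", "distance": 2, "matching_fields": 3, "amount": 90000.0},
    {"id": "B", "distance": 1, "matching_fields": 2, "amount": 150.0},
    {"id": "C", "distance": 1, "matching_fields": 3, "amount": 4200.0},
    {"id": "D", "distance": 1, "matching_fields": 3, "amount": 18000.0},
]
print([m["id"] for m in rank_matches(candidates)])  # ['D', 'C', 'B', 'A']
```

Moving `-m["amount"]` to the front of the key would instead put exposure first, which surfaces the 90,000 item at the top despite its looser match.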
38 Execution: ACL
- Separate menu item: Analyze/fuzzy duplicates
- Choose your (concatenated) key
- Choose diff. threshold (1 or 2)
- Select other fields to use in investigation
- Select the output table name
- Be patient
39 Execution: Arbutus
- Included with duplicates testing: Analyze/duplicates
- Choose your key fields (any type)
- Choose either near or similar processing
- Choose max. difference (0, 1, or 2)
- Select other fields to use in investigation
- Select output location and name
40 Similar Processing: Arbutus
- Specifically designed to work with document IDs
- Uses Damerau-Levenshtein, but automatically pre-processes
  - Removes all blanks and punctuation, upper cases
  - Matches similar characters: O=0, I=1, 5=S, etc.
- Works on all data types
  - 127, vs. 12, (diff. 1)
  - I vs (diff 0)
- Particularly useful with OCR
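The pre-processing described on this slide can be approximated by building a comparison key before measuring distance. The sketch below is an assumption about how such a key could be built, not the Arbutus implementation; only the three look-alike pairs named on the slide are mapped, and the direction of the mapping (letters to digits) is arbitrary as long as both sides of a comparison go through the same function.

```python
# Look-alike characters named on the slide, mapped to a single form.
LOOKALIKES = str.maketrans({"O": "0", "I": "1", "S": "5"})


def similar_key(value) -> str:
    """Build a comparison key: upper case, letters and digits only, look-alikes merged."""
    text = str(value).upper()                         # numbers and dates become strings
    kept = "".join(ch for ch in text if ch.isalnum())  # drop blanks and punctuation
    return kept.translate(LOOKALIKES)


print(similar_key("AB-1O0I"))  # AB1001
print(similar_key("ab 1001"))  # AB1001
print(similar_key("AB-1O0I") == similar_key("ab 1001"))  # True
```

Two document IDs whose keys are equal have a difference of 0 under this scheme; the edit distance is then applied to the keys to find IDs that differ by 1 or 2.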
41 Similar Processing: ACL
- Not explicitly supported
- Pre-process the data to create a computed field
  - Upper case
  - Include only numbers and letters (no blanks, punctuation)
  - Convert numbers and dates to strings (date or string)
- Use the FUZZYDUP command as in the past
42 Manual Duplicates Testing: ACL
- Data prep is still important
- LevDist(string1, string2 <, case sensitive>)
  - Case sensitive by default
  - Filter: LevDist(name1, name2, F) < 3
- IsFuzzyDup(string1, string2, distance <, diff%> )
  - Automatically case insensitive
  - Filter: IsFuzzyDup(name1, name2, 2)
- Can also be used as a join test
43 Manual Duplicates Testing: Arbutus
- All case sensitive, by default (assumes normalized inputs)
- Difference(string1, string2 <, case sensitive>)
  - Filter: difference(name1, name2, F) < 3
- Near(field1, field2, difference)
  - Filter: near(name1, name2, 2)
  - Applies to all data types
  - Char: Damerau-Levenshtein; numbers and dates: proximity (4799 vs 4803)
- Similar(field, field2, difference)
  - Applies to all data types, always uses Damerau-Levenshtein
  - Char: prepared data; numbers and dates: 123,456 vs. 12,456
44 Find Specific Keywords in Text: ACL
- Very common for purchase card reviews, FCPA
- Use the Find function:
  - Filter: IF Find( Exotic, desc)
  - Multiple words: IF Find( Exotic, desc) OR Find( IPad, desc)
  - Not case sensitive, not whole word
- Create a Logical computed field (say Exception ):
  - T IF Find( Exotic, desc)
  - T IF Find( IPad, desc)
  - F
  - Filter: IF Exception
45 Find Specific Keywords in Text: Arbutus
- Find function works the same as ACL
- Use the ListFind function instead:
  - Filter: IF ListFind( exceptions.txt, desc)
  - Simple text file
  - Easily maintained in Notepad
  - Unlimited entries
  - Supports an external reference file or an internal array
  - Like Find function, not case sensitive, not whole word
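The behaviour described for keyword searching (a maintained list of terms, matched anywhere in the text, ignoring case) is simple to imitate. The sketch below is a Python analogue under those assumptions and does not reproduce the ACL Find or Arbutus ListFind functions; the function names and sample descriptions are hypothetical, and the keyword list is passed in as lines so that it could equally come from a text file.

```python
def load_keywords(lines):
    """Turn the lines of a keyword list into upper-cased search terms."""
    return [line.strip().upper() for line in lines if line.strip()]


def list_find(keywords, text: str) -> bool:
    """True if any keyword occurs anywhere in text (no case, no whole-word rule)."""
    haystack = text.upper()
    return any(keyword in haystack for keyword in keywords)


keywords = load_keywords(["Exotic", "iPad", ""])
descriptions = [  # hypothetical purchase card descriptions
    "2 x IPAD mini",
    "Office chairs",
    "Exotically priced lunch",
]
flagged = [d for d in descriptions if list_find(keywords, d)]
print(flagged)  # ['2 x IPAD mini', 'Exotically priced lunch']
```

The third description is flagged because the match is not restricted to whole words, which illustrates why keyword tests also produce false positives that need review.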
46 Continuous Monitoring
- Mostly errors
- Test vs. control
  - Ownership of the process
  - May relate to frequency
- Detective vs. Preventative
  - Entire presentation detective
  - Opportunity to run against documents before committing
  - Preventative almost certainly a control
47 Fuzzy Testing in Action
- Demonstration
01 Transaction Pro Importer version 6.0 PLEASE READ: This help file gives an introduction to the basics of using the product. For more detailed instructions including frequently asked questions (FAQ's)
More informationCA Mahesh Bhatki Mumbai, 28/December/2013. Agenda
Data Analytics and Use of CAATTs Seminar On Investigation and Forensic Accounting & Audit CA Mahesh Bhatki Mumbai, 28/December/2013 Agenda Overview of CAATTs (Tools and Techniques) Some Useful Techniques
More informationPEGASUS DISTRIBUTOR S GUIDE
PEGASUS DISTRIBUTOR S GUIDE GPS /GPRS SOLUTION FOR YOUR FLEET Web Based Tracking System Tel: +44 (0)1509 808168 E- Mail: info@naxertech.com. www.naxertech.co.uk www.naxertech.com Revision History Note:
More informationQuality Control of Clinical Data Listings with Proc Compare
ABSTRACT Quality Control of Clinical Data Listings with Proc Compare Robert Bikwemu, Pharmapace, Inc., San Diego, CA Nicole Wallstedt, Pharmapace, Inc., San Diego, CA Checking clinical data listings with
More informationTags, Categories and Keywords
Tags, Categories and Keywords Document Management Tip Sheet As more and more content gets added to your repository, it will become harder to find what you need. Documents may become buried in multi-level
More informationChapter 3: Introduction to SQL
Chapter 3: Introduction to SQL Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 3: Introduction to SQL Overview of the SQL Query Language Data Definition Basic Query
More informationCASS Cycle L ( ) Certification: Frequently Asked Questions
CASS Cycle L (2007-2008) Certification: Frequently Asked Questions Q. What is CASS Cycle L? A. CASS Cycle L is the next regularly scheduled update of address-matching software. The USPS requires address-matching
More informationFormulas, LookUp Tables and PivotTables Prepared for Aero Controlex
Basic Topics: Formulas, LookUp Tables and PivotTables Prepared for Aero Controlex Review ribbon terminology such as tabs, groups and commands Navigate a worksheet, workbook, and multiple workbooks Prepare
More information3/3/2008. Announcements. A Table with a View (continued) Fields (Attributes) and Primary Keys. Video. Keys Primary & Foreign Primary/Foreign Key
Announcements Quiz will cover chapter 16 in Fluency Nothing in QuickStart Read Chapter 17 for Wednesday Project 3 3A due Friday before 11pm 3B due Monday, March 17 before 11pm A Table with a View (continued)
More informationSearching Guide. November 17, Version 9.5
Searching Guide November 17, 2017 - Version 9.5 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
More informationWhy Use OSU Printing & Mailing Services?
Why Use OSU Printing & Mailing Services? Printing & Mailing Numbers Bulk Mail Pieces Reducing Costs For Our Clients Without proper mail preparation, you could be paying a significant amount more in production
More informationChapter 27 Introduction to Information Retrieval and Web Search
Chapter 27 Introduction to Information Retrieval and Web Search Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 27 Outline Information Retrieval (IR) Concepts Retrieval
More informationCOVER LETTER UNIT 1 LESSON 3
1 COVER LETTER Naviance Family Connection http://connection.naviance.com/cascadehs http://connection.naviance.com/everetths http://connection.naviance.com/henrymjhs http://connection.naviance.com/sequoiahs
More informationWalt Brainerd s Fortran 90 programming tips
Walt Brainerd s Fortran 90 programming tips I WORKETA - March, 2004 Summary by Margarete Domingues (www.cleanscape.net/products/fortranlint/fortran-programming tips.html) Fortran tips WORKETA - 2004 p.1/??
More informationChapter 3: Introduction to SQL. Chapter 3: Introduction to SQL
Chapter 3: Introduction to SQL Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 3: Introduction to SQL Overview of The SQL Query Language Data Definition Basic Query
More informationImplementation of Lexical Analysis
Written ssignments W assigned today Implementation of Lexical nalysis Lecture 4 Due in one week y 5pm Turn in In class In box outside 4 Gates Electronically Prof. iken CS 43 Lecture 4 Prof. iken CS 43
More informationImplementation of Lexical Analysis
Written ssignments W assigned today Implementation of Lexical nalysis Lecture 4 Due in one week :59pm Electronic hand-in Prof. iken CS 43 Lecture 4 Prof. iken CS 43 Lecture 4 2 Tips on uilding Large Systems
More informationImportacular for The Raiser s Edge
Importacular for The Raiser s Edge development@zeidman.info www.zeidman.info UK: 020 3637 0080 US: (646) 570 1131 Table of Contents Overview... 4 Installation for Self-Hosted Users (on premise)... 4 Hosted
More information