Databases, Data Mining & Knowledge Discovery

Similar documents
CSE 530A SQL. Washington University Fall 2013

CSE 530A. ER Model to Relational Schema. Washington University Fall 2013

Microsoft Access - Using Relational Database Data Queries (Stored Procedures) Paul A. Harris, Ph.D. Director, GCRC Informatics.

Question Bank. 4) It is the source of information later delivered to data marts.

Anthem Mobile Provider Finder

Managing Data Resources

KANRI DISTANCE CALCULATOR. User Guide v2.4.9

Touchline Software

Managing Data Resources

Membership Application Form U.S. and Canadian Group

Fundamentals of Information Systems, Seventh Edition

Bulk Upload Instructions: User Data Rules and Guidelines. As of v1.1.7

IT1105 Information Systems and Technology. BIT 1 ST YEAR SEMESTER 1 University of Colombo School of Computing. Student Manual

Introduction. Who wants to study databases?

Data Preprocessing Part 1

A collection of persistent data that can be shared and interrelated. A system or application that must be operational for a company to function.

Lecture 1. Monday, August 25, 2014

User Manual: Logi Vision

CS143: Relational Model

NEMSIS Update and Strategic Vision

Figure 1: Relationship among Three Entities/Tables

Knowledge Discovery and Data Mining

Creating a Relational Database Using Microsoft SQL Code. Farrokh Alemi, Ph.D.

TIM 50 - Business Information Systems

Chapter 11 Database Concepts

Database and Knowledge-Base Systems: Data Mining. Martin Ester

Helping Terra Dotta Help You

Payer s Flat File Specification Version 1.3

I2B2 Query Tool Demo. I2B2 Screen Shots. I2B2 Query Tool Capabilities. Work we have done. Things we are working towards. 1.

Database Management Systems Their Place in Nursing Informatics Education

Computers Are Your Future

IGCSE Information Communication Technology (ICT) Syllabus code Section 5: Data types

Machine Learning - Clustering. CS102 Fall 2017

DATA MINING DATA KNOWLEDGE DECISION ACTION. Data. Decision. Data mining (analysis) Business modeling (using data mining software) Business hypothesis

The DBMS accepts requests for data from the application program and instructs the operating system to transfer the appropriate data.

Introduction to Databases

4th Quarter Communicating with Fans and Advertisers Using Databases

Data Modeling with the Entity Relationship Model. CS157A Chris Pollett Sept. 7, 2005.

Architecture. Outline. Review - Alibris. Announcements. What is Architecture? TIM 50 - Business Information Systems. How do you architect a solution?

Outline. Database Management and Tuning. Index Data Structures. Outline. Index Tuning. Johann Gamper. Unit 5

Last Name First Name M.I. Individual NPI. Group/Clinic Name as it appears on W9 (if applicable) TIN Taxonomy

Data Strategies for Efficiency and Growth

Foundations of Business Intelligence: Databases and Information Management

i2b2 User Guide Informatics for Integrating Biology & the Bedside Version 1.0 October 2012

Interview Questions on DBMS and SQL [Compiled by M V Kamal, Associate Professor, CSE Dept]

Anonymization Case Study 1: Randomizing Names and Addresses

Introduction to MS Access: creating tables, keys, and relationships

Management Information Systems

Introduction to Database. Dr Simon Jones Thanks to Mariam Mohaideen

CHAPTER 6 DATABASE MANAGEMENT SYSTEMS

Technology In Action, Complete, 14e (Evans et al.) Chapter 11 Behind the Scenes: Databases and Information Systems

POLICY. Create a governance process to manage requests to extract de- identified data from the Information Exchange (IE).

Database Systems. Answers

AMERICAN BOARD OF UROLOGY 2017 INSTRUCTIONS FOR SUBMISSION OF ELECTRONIC LOGS

ICTR UW Institute of Clinical and Translational Research. i2b2 User Guide. Version 1.0 Updated 9/11/2017

Tribhuvan University Institute of Science and Technology MODEL QUESTION


Linking Patients in PDMP Data

BCS THE CHARTERED INSTITUTE FOR IT BCS HIGHER EDUCATION QUALIFICATIONS BCS Level 5 Diploma in IT DATABASE SYSTEMS

TNM STAGING AS AN HTTP SERVICE

Literature Databases

Table of Contents. Page 1 of 51

Data Mining & Data Warehouse

Cost-Benefit Analysis of Retrospective vs. Prospective Data Standardization

Data Management Lecture Outline 2 Part 2. Instructor: Trevor Nadeau

22/10/16. Data Coding in SPSS. Data Coding in SPSS. Data Coding in SPSS. Data Coding in SPSS

Maryland IMMUNET System

CDC-ENDORSED DATA ELEMENTS. AIRA Discovery Session November 27, 2017, 4pm Eastern

Sample Answers to Discussion Questions

U1. Data Base Management System (DBMS) Unit -1. MCA 203, Data Base Management System

LR01 - New Enrollment for Legally-Exempt Care Window

Data Warehousing ETL. Esteban Zimányi Slides by Toon Calders

Instructional Improvement System (IIS) Dashboard District User Guide Statewide Longitudinal Data System (SLDS)

How to Use the Cancer-Rates.Info/NJ

Ten Great Reasons to Learn SAS Software's SQL Procedure

Tips for Completing ISBE SIS Student Demographics Excel Template

Interpreter Intelligence

ETL and OLAP Systems

2. Click File and then select Import from the menu above the toolbar. 3. From the Import window click the Create File to Import button:

Student Adv Imports Update Guide

Application Architecture

CSC 4710 / CSC 6710 Database Systems. Rao Casturi

LIFE LONG LEARNING LEVEL INSTRUCTIONS FOR SUBMISSION OF ELECTRONIC LOGS

Data Models and SQL for GIS. John Porter Department of Environmental Sciences University of Virginia

Data and Text Mining

Access EMR Data for Research

B.H.GARDI COLLEGE OF MASTER OF COMPUTER APPLICATION. Ch. 1 :- Introduction Database Management System - 1

Tables. Tables. Caption. Tables. Tables. Tables. Condense data into rows and columns Case 1: Single IV

Workforce Repository and Planning Tool Creating a Standard Baseline report

Web Databases Relational Databases Normalization

LexisNexis C.L.U.E. Auto Fixed Length Input File Mapping Insurance Solutions Portal

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #2: The Relational Model and Relational Algebra

ECS 10 Concepts of Computation Example Final Problems

ERROR: ERROR: ERROR:

Database Principles: Fundamentals of Design, Implementation, and Management Tenth Edition. Chapter 7 Data Modeling with Entity Relationship Diagrams

LR01 - New Enrollment for Legally-Exempt Care Window

Subject Area Data Element Examples Earliest Date Patient Demographics Race, primary language, mortality 2000 Encounters

Relational Data Structure and Concepts. Structured Query Language (Part 1) The Entity Integrity Rules. Relational Data Structure and Concepts

The Relational Model. Week 2

Database Applications (15-415)

Transcription:

Databases, Data Mining & Knowledge Discovery Charlotte Seckman, PhD, RN-BC Assistant Professor, Course Director University of Maryland School of Nursing Nursing Informatics Program

Objectives Define key terms. Discuss various database models and associated terms such as data warehouse, data dashboards, data mining and knowledge discovery.

Definitions A Database is a shared collection of logically related data, designed to meet the information needs of multiple users in an organization Database Management System (DBMS): software that helps organize data in a way that allows fast and easy access to the data

Databases Advantages Store large amounts of data long term Data easily available Fast response Simultaneous access Efficient exchange Used for research Disadvantages Can be expensive to set up Depending on database model direct access to data may be limited Data excess

Database Design From smallest to largest unit: Character Field Record File Database

Database Models Types of DB Models: Flat Relational

Database Models: Flat Binary file; contains one record per line Fields separated by delimiters, e.g. commas, or may have a fixed length No relationship between the records Example: Creating a name and address list on paper, in a word processing program or a spreadsheet. This would be called a flat file

Example: Flat File Last Name First Name Age Gender Race Street Address City State Zip Bunny Bugs 68 Male Rabbit 555 Looney Tunes Lane Tunetown PA 11122 Duck Daffy 72 Male Duck 444 Looney Tunes Drive Tunetown PA 11122 Pig Porky 69 Male Pig 333 Looney Tunes Way Tunetown PA 11122 Bird Tweety 58 Female Bird 222 Looney Tunes Ave Tunetown PA 11122 Cat Sylvester 60 Male Cat 111 Looney Tunes Terr. Tunetown PA 11122 LePew Pepe 69 Male Skunk 110 Looney Tunes Street Tunetown PA 11122

Database Models: Relational Data is organized into two dimensional table Can retrieve data across all files directly without going through pathways Can extract key information

Database Models: Relational Three Types of Rational DB Proprietary licensed by vendors All Electronic Health Record Systems Open Source freely available for use MySQL, PostGIS Embedded packaged as part of other software or hardware Mobile applications to store phone numbers

Relational Model Format Table = File or relation Box = Data item Column = Field Rows = Record Primary key = Common Field used to link Files / Tables

Relational Terms Column=Field (Category/Name) Results Table Results Number Order Number Hgb (gm) Hct (%) 2082 34687 11 28 2276 58997 13 41 3488 40899 12 37 9089 21220 11 31

Relational Terms Row=record (results) Results Table Results Number Order Number Hgb (gm) Hct (%) 2082 34687 11 28 2276 58997 13 41 3488 40899 12 37 9089 21220 11 31

Tables & Relationships Patient Table MRN Last Name Zip Code 222-22-3333 Wilson 21201 111-33-5555 Smith 21234 444-55-7777 Jones 34543 relationship Patient has an order Order Table Order Number MRN Order Type Order Date 34678 222-22-333 HCT/HGB 7/4/2000 23456 111-33-5555 HCT/HGB 7/4/2000 12123 444-55-7777 HCT/HGB 7/4/2000 Order has a result relationship Results Table Result # Order # HGB (gm) HCT% 4567 34678 12 43 7865 23456 13 45 8942 12123 10 34

Working with Databases Query operation to retrieve and update data from a database table i.e. ask a question Reports report generators can filter and display selected fields from tables Forms visual interface for user to input new data

Data Warehouse, Data Dashboard, Data Mining & Knowledge Discovery

Definition: Data Warehouse Virtual storage areas for large databases Contain decision support tools for analysis, reports, mining, and other processes Decision Support Analysis Reports Mining Other Data Warehouse Data Warehouse Data Sources Database Database Database Database

Data Dashboard An interface that allows users to view information from a variety of data sources simultaneously

Data Mining Definition: Technique that looks for hidden patterns and relationships in large groups of data using software (Hebda & Czar, 2013).

Knowledge Discovery Knowledge Discovery in Databases (KDD): Extraction of implicit, unknown, and potentially useful information from data (Hebda & Czar, 2013). KDD refers to the higher level processes that include extraction, interpretation and application of data and is interrelated (and often used interchangeably) with the term data mining.

Knowledge Discovery in Databases Pattern Evaluation Task-related Data Data Warehouse Algorithms Statistics Modeling etc. Databases Selection Data Integration