Data, Information, and Databases


BDIS 6.1
BSAD 141, Dave Novak

Topics Covered
- Information types: transactional vs. analytical
- Five characteristics of information quality
- Database versus a DBMS
- RDBMS: advantages and terminology
- Multi-user issues

The Need for High-Quality Information
- Data are everywhere. Which data are important?
- Recall the difference between data and information
- Which data should the organization store?
- Which data need to be further manipulated?
- Which data are required to make different types of decisions?
- How does the organization convert raw data into the information that is needed?
- Organizations need to obtain and analyze the many different levels, formats, and granularities of organizational information to make decisions
- Decisions are only as good as the quality of the data and information used to make them
- Garbage in, garbage out: using poor-quality data doesn't help

Data Quality Problems: Example of Poor Quality Data

Characteristics of High Quality Data
1) Accurate
2) Complete
3) Consistent
4) Unique
5) Timely

1) Accurate
- Are the data (is the information) correct, precise, and exact?
- For example:
  - Are the data factual?
  - Are the data error-free?
  - Have the data been verified?
  - Correct spelling
  - Precise numbers

2) Complete
- Are the data whole (complete) and do they have all the necessary parts?
- For example:
  - Are there missing values or pieces of data?
  - Full street address
  - Area code along with phone number
  - Empty fields
  - Full names

3) Consistent
- Are the data in agreement with themselves and with known facts?
- For example:
  - Does summary information agree with detailed information?
  - Can you reconcile the data?
  - Do mathematical manipulations yield correct results?
  - Are data manipulations performed consistently for the entire data set?

4) Unique
- Are the data unique (one of a kind), or are there redundant, repetitious, or unnecessary data stored in the same database?
- For example:
  - Are there duplicate records for the same event?
  - Are there different versions of the same file or event (which is the latest or most accurate?)

5) Timely
- Are the data current with respect to decision-making needs?
- Timeliness depends on the situation
- Real-time information: immediate, up-to-date information
- Real-time system: provides real-time information in response to requests
- Real-time is a relative description that depends on the use or need

Examples of How Data Can Be of Poor Quality
- Customers intentionally enter inaccurate information to protect their privacy or because they are irritated
- Different data entry standards and formats are used
- Operators enter abbreviated or erroneous information by accident or to save time
- Third-party and external information contains inconsistencies, inaccuracies, and errors

What Is a Database?
- Database: a collection of information organized in a way that provides efficient retrieval
- There are electronic and physical (paper/print) databases
- A database can be a very simple collection of data, such as names arranged alphabetically in an address book
- Self-describing collection of integrated records
  - Includes metadata about the fields/attributes
  - Governs acceptable data formats for consistency
- Hierarchy of data elements
  - Columns/Fields
  - Rows/Records
  - Tables/Relations
- A location to store and retrieve well-structured and well-governed data

What Is a Database Management System (DBMS)?
- A set of computer programs / software that allows users to store, modify, query, and retrieve data in an organized, systematic, and controlled manner
- A database (the physical collection of data) is typically not portable across different DBMS
- Like application software, different DBMS are generally designed to work with specific system software and specific database schema
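The four DBMS operations named above (store, modify, query, retrieve) can be sketched with SQLite through Python's standard `sqlite3` module. This is only an illustration: the `contact` table and its values are hypothetical, and SQLite stands in for whichever DBMS the course uses.

```python
import sqlite3

# Throwaway in-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table for illustration only.
conn.execute("CREATE TABLE contact (name TEXT, phone TEXT)")

# Store a record.
conn.execute("INSERT INTO contact VALUES (?, ?)", ("Ann Lee", "802-555-0101"))

# Modify the stored record.
conn.execute(
    "UPDATE contact SET phone = ? WHERE name = ?",
    ("802-555-0199", "Ann Lee"),
)

# Query and retrieve the data.
rows = conn.execute("SELECT name, phone FROM contact ORDER BY name").fetchall()
print(rows)
```

The data live in the database; the DBMS is the software layer that accepts these statements and carries them out in a controlled way.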

Database Management System (DBMS)
- What is a database schema?
  - The way in which the objects in the database are logically grouped / organized
  - What are the tables and how are they linked?
  - What are the different user views?
  - What types of procedures and queries are stored?
- A database is typically something inside the DBMS, although in the case of an MS Excel workbook the database is a standalone object

Single File Data Management
- MS Excel is a database, but it is not a DBMS!
- There is NO DB management component: each worksheet is a single large two-dimensional matrix
- A DBMS is software that is used to manage the database and provides a set of tools used to manipulate and query data
- A database is simply an organized collection of data that can be accessed

Why Go Beyond a Spreadsheet?
- Need to store multiple themes of data
- Spreadsheets lack structure and are prone to error
- To reduce redundantly stored data
- Optimized query/reporting
- Databases ENFORCE consistency of the data
- Spreadsheets are clumsy and time-consuming to update, append, or expand
- Multiple-user access

Why Redundancy and Duplication of Data Are Important to Avoid
- Update, insertion, and deletion anomalies
  - Poorly normalized tables require duplicate entries: how do we ensure that when you change a value for one record, the duplicated value is also changed?
  - If an employee leaves or if you stop selling a specific product, should your system permit those records to be deleted?
  - Would you have this level of control over a spreadsheet?
- Redundancy is great for backups but terribly inefficient for data structures. Redundant data:
  - Increase manual time required for development and data entry
  - Increase required disk space
  - Decrease processing speeds and response time
  - Lead to data anomalies and inconsistencies

Types of Database Architectures
- Hierarchical Model
  - Parent/child tree-like structure
  - Parents can have many children, but children have only one parent
- Network Model
  - Permits children to have many parents
  - Offers more direct relationships between entities
  - Mostly replaced by the relational model
- Object Model
  - Ideal when demand for massive amounts of information about single items is frequent (high energy physics, molecular biology, spatial databases, telecommunications, ...)
- Relational Model
  - Most common and what we will study in this class
  - By far the most dominant enterprise data structure
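The update anomaly raised in the redundancy slide can be reproduced in a few lines. The sketch below assumes a hypothetical `product_flat` table that repeats a supplier's phone number on every product row, then contrasts it with a separate `supplier` table where the phone is stored once. All table names and values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Poorly normalized: the supplier phone is duplicated on every product row.
conn.execute(
    "CREATE TABLE product_flat (product TEXT, supplier TEXT, supplier_phone TEXT)"
)
conn.executemany(
    "INSERT INTO product_flat VALUES (?, ?, ?)",
    [("Widget", "Acme", "555-0100"), ("Gadget", "Acme", "555-0100")],
)

# Someone updates the phone on one row and forgets the duplicate.
conn.execute(
    "UPDATE product_flat SET supplier_phone = '555-0199' WHERE product = 'Widget'"
)
flat_phones = {
    row[0]
    for row in conn.execute(
        "SELECT supplier_phone FROM product_flat WHERE supplier = 'Acme'"
    )
}
print(flat_phones)  # two different phone numbers now exist for one supplier

# Storing the phone once removes the possibility of disagreement.
conn.execute("CREATE TABLE supplier (name TEXT PRIMARY KEY, phone TEXT)")
conn.execute("INSERT INTO supplier VALUES ('Acme', '555-0100')")
conn.execute("UPDATE supplier SET phone = '555-0199' WHERE name = 'Acme'")
normalized_phones = {
    row[0]
    for row in conn.execute("SELECT phone FROM supplier WHERE name = 'Acme'")
}
print(normalized_phones)
```

With the duplicated design, correctness depends on every copy being changed together; with the single-copy design there is only one value to change.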

Database Management System (DBMS): NoSQL Database Technologies
- RDBMS are not well suited to handle unstructured data
- NoSQL technologies offer increased flexibility and scalability
- NoSQL technologies are designed with big data needs, as opposed to transaction processing needs, in mind

RDBMS
- The most popular and common DBMS is the relational DBMS (RDBMS)
- A standard program and user interface in the RDBMS is the Structured Query Language (SQL)
  - A programming language used to create, modify, and retrieve information from a database
  - Different databases use different (proprietary) variations of SQL
- RDBMS are still best for most business needs
  - Oracle: Oracle Database and MySQL
  - IBM: DB2 and Informix
  - Microsoft: SQL Server
  - SAP: Sybase Enterprise and Sybase IQ
  - Teradata
  - https://www.wired.com/insights/2013/09/the-future-of-enterprise-data-rdbms-will-be-there/
- Data are organized as a set of formal tables
- Data can be accessed and combined in different ways without altering the data within the tables
- RDBMS can be easily extended / scaled: new data and new categories of data can be added without changing existing data

RDBMS Terminology
- Data model: a picture of logical structures that details the relationships among data elements
- Data dictionary: compiles all of the metadata about the elements in the data model
- Metadata
  - Formal description of data structures (like tables and fields) and any constraints on the table or on values within the table
  - Data about the containers of data
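To make "metadata" concrete, the sketch below asks SQLite to describe a table rather than return its contents. `PRAGMA table_info` is SQLite-specific; other RDBMS expose their data dictionary differently. The `product` table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table; the column definitions are the metadata of interest.
conn.execute(
    "CREATE TABLE product ("
    "product_id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "price REAL)"
)

# Each row of table_info is (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("PRAGMA table_info(product)").fetchall()

# Keep the parts a simple data dictionary would list:
# column name, declared type, NOT NULL flag, and primary-key flag.
metadata = [
    (name, col_type, bool(notnull), bool(pk))
    for _cid, name, col_type, notnull, _default, pk in columns
]
for entry in metadata:
    print(entry)
```

No product rows were inserted, yet the database can still report what a product record must look like. That description is the "data about the containers of data".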

Entity Sets (Tables)
- Relational table or entity set
  - Each table consists of columns (fields/attributes) and rows (records/entities)
  - The table has a name that describes the group of related entities within the table
  - For example, a table labeled Student would contain a group of student entities

Entity / Record / Row
- A person, place, thing, transaction, or event about which data are being collected and stored
- The individual rows in a table contain entities
- Each row is also referred to as a record
- Example?

Attributes / Field / Column
- The data elements that describe the characteristics of a specific entity
- The columns in each table contain the attributes
- Example?

What Is a Relationship?
- When designing a relational DB, data are grouped into tables
- Each table contains all related data elements
- For example, we would store data related to the customer (name, address, phone, etc.) and data related to the customer's particular order (orderid, date, shipping method, etc.) in different tables (Customer and Order)
  - All information specific to a customer would go into a Customer table
  - All information specific to the orders would go into an Order table
  - We would then create a relationship between the tables that allows us to match a particular customer with a particular order
- A relationship in an RDBMS is an association between the entities within the different tables
- There are THREE (3) types of relationships:
  - One-to-One (1:1)
  - One-to-Many (1:M)
  - Many-to-Many (M:M)
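As one possible answer to the "Example?" prompts on the entity and attribute slides, the sketch below builds the Student table mentioned above and separates its attributes (columns) from its entities (rows). The column names and student values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical Student table: three attributes describe each student entity.
conn.execute("CREATE TABLE student (student_id INTEGER, name TEXT, major TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ann Lee", "Business"), (2, "Bob Roy", "Economics")],
)

cursor = conn.execute("SELECT * FROM student ORDER BY student_id")

# Attributes / fields / columns: the names describing each characteristic.
attributes = [col[0] for col in cursor.description]

# Entities / records / rows: one tuple per student.
entities = cursor.fetchall()

print(attributes)
for entity in entities:
    print(entity)
```

Each printed tuple is one entity, and each position in the tuple holds that entity's value for the attribute in the same position of `attributes`.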

Creating Relationships Through Keys
- KEYS are used to create relationships between the entities in different tables in the RDB
- Primary key: a field (or group of fields) that uniquely identifies a given entity in a table
- Foreign key: a primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables
- For our purposes:
  - Every table in an RDBMS MUST have a primary key
  - The foreign key is not required in every table and will only appear on the many side of the relationship

Advantages of RDBMS
- RDBMS advantages from a business perspective include:
  1) Flexibility
  2) Scalability and performance
  3) Improved information integrity (quality) and reduced information redundancy
  4) Information security

1) Flexibility
- Handle changes quickly and easily
- Provide users with different views of the data
  - Arranging data items in different ways depending on the specific user need
  - Showing a particular user only some of the available fields while not showing them other fields

1) Flexibility: Schema
- Different database schema can be owned by or associated with different users
- The schema is a user-personalized set of tables, views, and indexes

2) Scalability and Performance
- A DBMS must expand to meet increased demand while maintaining acceptable performance levels
- Scalability: refers to how well a system can adapt to increased demands
- Performance: measures how quickly a system performs a certain process or transaction
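The key concepts and the "different views" idea above can be sketched together using the Customer and Order example. This is a minimal illustration in SQLite with assumed names: the order table is called `orders` because `ORDER` is a reserved word in SQL, and `customer_public` is a hypothetical view name. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement

# One side of the 1:M relationship: customer_id is the primary key.
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL, phone TEXT)"
)
# Many side: customer_id appears again as a foreign key.
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, order_date TEXT NOT NULL, "
    "customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
)

conn.execute("INSERT INTO customer VALUES (1, 'Ann Lee', '802-555-0101')")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(100, "2024-01-05", 1), (101, "2024-02-10", 1)],
)

# The keys let us match a particular customer with that customer's orders.
orders_for_ann = conn.execute(
    "SELECT o.order_id FROM orders AS o "
    "JOIN customer AS c ON c.customer_id = o.customer_id "
    "WHERE c.name = 'Ann Lee' ORDER BY o.order_id"
).fetchall()

# A view that shows only some of the available fields (phone is hidden).
conn.execute("CREATE VIEW customer_public AS SELECT customer_id, name FROM customer")
view_columns = [
    col[0] for col in conn.execute("SELECT * FROM customer_public").description
]

# An order pointing at a customer that does not exist is refused.
try:
    conn.execute("INSERT INTO orders VALUES (102, '2024-03-01', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orders_for_ann, view_columns, orphan_rejected)
```

The foreign key sits in `orders`, the many side, which matches the rule stated above: one customer row can be referenced by any number of order rows.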

3) Information Integrity
- Information integrity: a measure of information quality
  - Know that data have not been entered incorrectly or altered in an unauthorized manner
- Integrity constraint: rules that help ensure the quality of information
  - We will discuss entity integrity and referential integrity (there is also domain integrity)

3) Information Integrity: Controlling Redundancy
- Redundant data are OK if they serve a specific purpose, such as being used as a backup directly linked to the source
  - Backup systems promote fault tolerance
- Unintentional redundancy is not good
  - Wasted storage
  - Difficult to modify
  - Possible inconsistencies

4) Information Security
- Information is an organizational asset and must be protected
- RDBMS offer several security features
- Access level: determines the level of access each individual user has
  - Who can access the DBMS
- Access control: determines the types of things each group can do
  - Types of access, such as power to create, modify, delete, and/or read
  - Which types of SQL statements can be executed

Multiuser Issues
- DBMS serve many different users with different needs
- Many users may require concurrent access to the same data
- Must preserve the integrity of the data and the performance of the system
- Enterprise DBMS problem: if multiple users (say tens or even hundreds of users) access the same data concurrently, how does the DBMS allow one user to change data without immediately overwriting the change made by another user?
- This is typically referred to as the lost-update problem
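The lost-update problem can be reproduced deterministically by interleaving two users' steps by hand. The sketch below is a simulation under stated assumptions: a single SQLite connection plays both users, and the `stock` table, item, and quantities are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER NOT NULL)")
conn.execute("INSERT INTO stock VALUES ('widget', 10)")
conn.commit()


def read_qty():
    """Return the quantity currently stored for the widget."""
    return conn.execute("SELECT qty FROM stock WHERE item = 'widget'").fetchone()[0]


def write_qty(value):
    """Overwrite the stored quantity with a value computed by the caller."""
    conn.execute("UPDATE stock SET qty = ? WHERE item = 'widget'", (value,))
    conn.commit()


# Both users read the same starting value before either one writes.
seen_by_a = read_qty()
seen_by_b = read_qty()

# User A sells 3 units, then user B sells 2 units, each using a stale read.
write_qty(seen_by_a - 3)
write_qty(seen_by_b - 2)  # silently overwrites A's change

final_qty = read_qty()
expected_qty = 10 - 3 - 2
print(final_qty, expected_qty)
```

Five units were sold but the table records only two of them, because user B's write was based on a value that user A had already changed.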

Multiuser Issues
- Concurrent access is handled through the use of transactions and locks
- Transaction: a single indivisible action that affects some data
  - Once a transaction is committed, it is permanent and changes are visible to all users
  - If a transaction is not committed, changes are rolled back or reversed
- Lock: literally locks the data so that changes cannot be made to the data while another transaction is in process

Summary
- Five characteristics of quality information
- Define database, DBMS, RDBMS, and supporting components and terminology
- Advantages of RDBMS
- What is SQL?
- Describe the lost-update problem and how it is addressed
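A minimal sketch of the two mechanisms follows, again using SQLite from Python with a hypothetical `stock` table. Two assumptions are worth stating: `with conn:` is the `sqlite3` idiom that commits on success and rolls back on an exception, and the `threading.Lock` is an application-level stand-in for the locking a real DBMS performs internally, not the DBMS's own lock manager.

```python
import sqlite3
import threading

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER NOT NULL)")
conn.execute("INSERT INTO stock VALUES ('widget', 10)")
conn.commit()


def qty():
    """Return the quantity currently stored for the widget."""
    return conn.execute("SELECT qty FROM stock WHERE item = 'widget'").fetchone()[0]


# Transaction that fails before commit: its change is rolled back.
try:
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("UPDATE stock SET qty = qty - 3 WHERE item = 'widget'")
        raise RuntimeError("simulated failure before commit")
except RuntimeError:
    pass
after_rollback = qty()

# Stand-in lock: only one read-modify-write may be in process at a time.
stock_lock = threading.Lock()


def sell(amount):
    """Read, compute, and write as one indivisible unit."""
    with stock_lock:
        with conn:
            current = qty()
            conn.execute(
                "UPDATE stock SET qty = ? WHERE item = 'widget'",
                (current - amount,),
            )


sell(3)  # user A
sell(2)  # user B now reads A's committed result, not a stale value
after_commits = qty()
print(after_rollback, after_commits)
```

Because each sale reads and writes inside the same locked transaction, the second sale starts from the first sale's committed value and neither update is lost.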