Department of Computer Science and Information Systems, College of Business and Technology, Morehead State University

Lecture 3 Part A
CIS 311 Introduction to Management Information Systems (Spring 2017)

Computers and information technology have been the main tools used to handle data. In the early days, however, data were not the main part of software. Software was designed so that programs (instructions or processes) were the main part, and data were handled as a byproduct of the programs. This approach is still used sometimes, but it is not a good way to manage a large amount of data, and new approaches to managing data have been introduced since the 1960s. Some earlier models (e.g., the hierarchical model used by IBM's IMS system and the network model used by IDMS) were efficient in terms of speed but too rigid to be used in daily operations. A more reliable and robust database model, the relational model, was introduced in 1969 by Edgar F. Codd, and commercial relational database management systems followed (e.g., Oracle 2). Since then many organizations have used relational database management systems because they are reliable and flexible enough to support their business.

In the 1990s, with the popularity of the Internet and the World Wide Web, organizations began to collect external data and to use a special type of database management system, the data warehouse, to manage and analyze different types of internal and external data. Since 2002, the amount of data that organizations deal with has grown so large that traditional database management systems alone are often not adequate. These traditional systems (e.g., relational database management systems) are still widely used for internal enterprise database management, but for Big Data, different types of database management systems are used (e.g., NoSQL, NewSQL, etc.). Nowadays, all these database management systems (relational DBMS, data warehouses, NoSQL, etc.) are used in business.

Database management systems are used to organize a set of data efficiently and store it economically in a permanent or secondary storage device (e.g., hard disk drives). They also provide efficient tools to locate all or part of the data set on demand, retrieve it, manipulate it, and generate outputs. So far, the most reliable, flexible, and efficient database management systems are based on the relational model, and thus it is important to understand the model.

Relational databases use relations to organize and store data. A relation in a relational database is a two-dimensional table with a primary key and other constraints integrated. In other words, not every table is a relation. To be a relation, a table should meet the following conditions:

1. A relation has a name.
2. Data in a relation are organized in two dimensions: row (record, tuple) and column (field, attribute).
3. All values in a column (field or attribute) are of the same data type. If a column is of a numeric data type, text data cannot be inserted in the column.
4. The order of columns does not matter.
5. The order of rows does not matter.
6. Every relation must have one, and only one, primary key.
7. A foreign key is optional. A relation may have no foreign key, or it may have one or more foreign keys.

A database stores several tables. This improves database management and saves resources. Let's review the following table. It could be a relation if it has a name (EMPLOYEE_INFORMATION) and a primary key (e.g., the empno and deptno columns together could be set as the primary key for the table). Even if it may be a relation, the data should not be stored as is because the table contains redundant data and is subject to several anomalies.

EMPLOYEE_INFORMATION

empno  ename   job        hiredate   sal        deptno  dname       loc
7839   KING    PRESIDENT  17-Nov-81  $5,000.00  10      ACCOUNTING  NEW YORK
7782   CLARK   MANAGER    09-Jun-81  $2,450.00  10      ACCOUNTING  NEW YORK
7934   MILLER  CLERK      23-Jan-82  $1,300.00  10      ACCOUNTING  NEW YORK
7566   JONES   MANAGER    02-Apr-81  $2,975.00  20      RESEARCH    DALLAS
7788   SCOTT   ANALYST    13-Jul-87  $3,000.00  20      RESEARCH    DALLAS
7902   FORD    ANALYST    03-Dec-81  $3,000.00  20      RESEARCH    DALLAS
7369   SMITH   CLERK      17-Dec-80  $800.00    20      RESEARCH    DALLAS
7876   ADAMS   CLERK      13-Jul-87  $1,100.00  20      RESEARCH    DALLAS
7698   BLAKE   MANAGER    01-May-81  $2,850.00  30      SALES       CHICAGO
7499   ALLEN   SALESMAN   20-Feb-81  $1,600.00  30      SALES       CHICAGO
7521   WARD    SALESMAN   22-Feb-81  $1,250.00  30      SALES       CHICAGO
7654   MARTIN  SALESMAN   28-Sep-81  $1,250.00  30      SALES       CHICAGO
7844   TURNER  SALESMAN   08-Sep-81  $1,500.00  30      SALES       CHICAGO
7900   JAMES   CLERK      03-Dec-81  $950.00    30      SALES       CHICAGO

The EMPLOYEE_INFORMATION table contains a lot of redundant data. For instance, the dname column has many repeating values (e.g., ACCOUNTING, RESEARCH, and SALES). The loc column also has repeating values. This table shows only 14 records (rows), and thus the redundant data may be negligible. In practice, however, a relation usually has a large amount of data (e.g., millions or billions of records), and these repeating values would be a big waste.

Another problem is the anomalies in the table. For instance, if the accounting department moves to a new place (e.g., L.A.), then many values (i.e., every NEW YORK cell in the loc column for that department) must be changed to L.A. This is called an update anomaly. In this table, a new department (e.g., an IT department) cannot be recorded until at least one employee belongs to it, because every row needs an empno value for the primary key. This is called an insertion anomaly.
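The update and insertion anomalies described above can be reproduced in a few lines. The following is a minimal sketch, not the course's method: it assumes SQLite (through Python's standard sqlite3 module) in place of the DBMS used in the lecture, loads only four rows and five columns of EMPLOYEE_INFORMATION, and uses a hypothetical IT department (deptno 50, BOSTON) for the insertion case.

```python
import sqlite3

# Assumption: SQLite stands in for the DBMS; only part of the table is loaded.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    create table EMPLOYEE_INFORMATION(
        empno integer not null,
        ename text,
        deptno integer not null,
        dname text,
        loc text,
        primary key (empno, deptno)
    )
    """
)
conn.executemany(
    "insert into EMPLOYEE_INFORMATION values (?, ?, ?, ?, ?)",
    [
        (7839, "KING", 10, "ACCOUNTING", "NEW YORK"),
        (7782, "CLARK", 10, "ACCOUNTING", "NEW YORK"),
        (7934, "MILLER", 10, "ACCOUNTING", "NEW YORK"),
        (7566, "JONES", 20, "RESEARCH", "DALLAS"),
    ],
)

# Update anomaly: one fact (the department moved) forces a change
# in every employee row of that department.
cursor = conn.execute(
    "update EMPLOYEE_INFORMATION set loc = 'L.A.' where deptno = 10"
)
rows_changed = cursor.rowcount
print("rows changed by one move:", rows_changed)

# Insertion anomaly: a department with no employees has no empno,
# so the row is refused (the IT values here are hypothetical).
try:
    conn.execute(
        "insert into EMPLOYEE_INFORMATION values (null, null, 50, 'IT', 'BOSTON')"
    )
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True
print("department without employees rejected:", insert_rejected)
```

With millions of rows, the single update above would touch every employee of the department, which is the cost the notes warn about.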
Another type, the deletion anomaly, occurs when deleting one fact unintentionally deletes others (e.g., if the ACCOUNTING department is removed, the records of all its employees are deleted with it). A table with these problems is called a non-normalized relation (or table).

One easy way to avoid or minimize these problems in the example is to divide the table into two or more tables. For instance, the EMPLOYEE_INFORMATION table could be divided into two tables as shown in the following figure (the Employees and Departments tables). To make each of these a relation, the primary key constraint should be set up in each table (e.g., empno for Employees and deptno for Departments). If the tables are compared with the previous table, it is clear that a significant number of repeated values (i.e., values in many cells) has been eliminated, though not all of them. As shown in the figure, there are fewer repeating values in the new tables. For instance, the value ACCOUNTING is shown only one time. Likewise, the values RESEARCH, SALES, OPERATIONS, NEW YORK, DALLAS, CHICAGO, and BOSTON appear only once.

There are, however, still some redundant values. For instance, repeating values appear in the job column and the deptno column. If there were more information about jobs (e.g., job characteristics, job title number, etc.), a new table could be developed for them. In the given example, however, there is none, and thus keeping the job column as is seems OK. The deptno column in the Employees table is different and plays an important role in the table. In fact, it is the foreign key in the Employees table, which references the primary key in the Departments table. The foreign key is used as a linking point between the two tables. For instance, if a record is selected in the Employees table (e.g., SMITH with the empno 7369), then the foreign key can be used to trace the relevant information in the Departments table (deptno 20). In other words, the value in the deptno column of the Employees table leads to the row with the same deptno value in the Departments table (i.e., the department name is RESEARCH, and the department location is DALLAS).

These tables are normalized relations, and the process of converting tables with anomalies into normalized tables is called normalization. This process, however, can be very complex and time consuming. A better way is to use a conceptual data model, and the most popular one is the entity-relationship model. In practice, system analysts or database administrators develop a conceptual data model first, convert it into a relational data schema, generate SQL statements based on the relational data schema, and execute the SQL statements to create actual tables and databases. For instance, the following SQL statement can be executed in a database management system to create a table (relation) called Departments.

    create table Departments(
        deptno byte,
        dname text,
        loc text,
        constraint pk_departments primary key (deptno)
    );

SQL, or Structured Query Language, is a fourth-generation programming language (non-procedural or goal-oriented). It is the US and international standard language for relational databases. In the previous SQL example, create table is the reserved keyword that is used to create a relation. The word after the create table keyword, Departments, is the name of the table (relation). Inside the parentheses, columns (attributes or fields) are defined with column names, data types, and optional constraints.
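The foreign-key trace described above (from SMITH to the RESEARCH department in DALLAS) can be sketched in code. This is a hedged illustration only: it assumes SQLite through Python's sqlite3 module rather than the DBMS used in the lecture, so the column types are integer and text instead of byte, and the Employees table is cut down to the three columns the trace needs.

```python
import sqlite3

# Assumption: SQLite types (integer, text) replace the types in the notes.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table Departments(
        deptno integer,
        dname text,
        loc text,
        constraint pk_departments primary key (deptno)
    );
    create table Employees(
        empno integer,
        ename text,
        deptno integer,
        constraint pk_employees primary key (empno),
        constraint fk_deptno foreign key (deptno) references Departments (deptno)
    );
    insert into Departments values (20, 'RESEARCH', 'DALLAS');
    insert into Employees values (7369, 'SMITH', 20);
    """
)

# Step 1: read the foreign key value stored in SMITH's record.
(deptno,) = conn.execute(
    "select deptno from Employees where empno = ?", (7369,)
).fetchone()

# Step 2: use that value to find the matching primary key in Departments.
dname, loc = conn.execute(
    "select dname, loc from Departments where deptno = ?", (deptno,)
).fetchone()

print(deptno, dname, loc)
```

The department name and location are stored once, in Departments, and every employee row reaches them through the deptno value.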
The columns are separated by commas, and the whole statement could be written on one line like the following:

    create table Departments(deptno byte, dname text, loc text, constraint pk_departments primary key (deptno));

In practice, however, the multi-line form is used more often because it is easier to read. The semicolon after the closing parenthesis is the ending mark (like a period in an English sentence).

The SQL code in the example is modified to conform to Microsoft Access, and it can be executed in the program using the SQL view of the Query Design menu. To test the code, first open Microsoft Access and click on the Blank database menu. A new database is created with one table (Table1, which can be deleted). Select the Create tab, click on the Query Design menu of the Queries group, and click on the Close button if you see a pop-up window (Show Tables). If this is done correctly, the Query1 design window will be shown.

If the SQL View menu is selected, the Query1 window switches to a text view that contains a SELECT; statement. Delete the SELECT; statement in the Query1 window, type in the example SQL code, and click on the Run button to create the Departments table.

Likewise, in the Query1 window, delete the previous SQL statement and type in the following to create the Employees table. After the code replaces the previous statement, click on the Run button (if it is not shown, select the Design tab).

    create table Employees(
        empno int,
        ename text,
        job text,
        hiredate date,
        sal currency,
        deptno byte,
        constraint pk_employees primary key (empno),
        constraint fk_deptno foreign key (deptno) references Departments (deptno)
    );

Once the relations have been created, an insert into statement can be used to enter a record (row or tuple) into a table. For instance, the following statement can be executed to enter a record into the Departments table (follow the process explained previously to enter the data):

    insert into Departments values(10, 'ACCOUNTING', 'NEW YORK');

There are ways to enter all the data at one time, but they require more advanced programming. For now, type in the following SQL statements one by one to enter the remaining records for the Departments and Employees tables:

    insert into Departments values(20, 'RESEARCH', 'DALLAS');
    insert into Departments values(30, 'SALES', 'CHICAGO');
    insert into Departments values(40, 'OPERATIONS', 'BOSTON');

    insert into Employees values(7839, 'KING', 'PRESIDENT', format('17-11-1981','dd-mm-yyyy'), 5000, 10);
    insert into Employees values(7698, 'BLAKE', 'MANAGER', format('1-5-1981','dd-mm-yyyy'), 2850, 30);
    insert into Employees values(7782, 'CLARK', 'MANAGER', format('9-6-1981','dd-mm-yyyy'), 2450, 10);
    insert into Employees values(7566, 'JONES', 'MANAGER', format('2-4-1981','dd-mm-yyyy'), 2975, 20);
    insert into Employees values(7788, 'SCOTT', 'ANALYST', format('13-jul-1987','dd-mmm-yyyy'), 3000, 20);
    insert into Employees values(7902, 'FORD', 'ANALYST', format('3-12-1981','dd-mm-yyyy'), 3000, 20);
    insert into Employees values(7369, 'SMITH', 'CLERK', format('17-12-1980','dd-mm-yyyy'), 800, 20);
    insert into Employees values(7499, 'ALLEN', 'SALESMAN', format('20-2-1981','dd-mm-yyyy'), 1600, 30);
    insert into Employees values(7521, 'WARD', 'SALESMAN', format('22-2-1981','dd-mm-yyyy'), 1250, 30);
    insert into Employees values(7654, 'MARTIN', 'SALESMAN', format('28-9-1981','dd-mm-yyyy'), 1250, 30);
    insert into Employees values(7844, 'TURNER', 'SALESMAN', format('8-9-1981','dd-mm-yyyy'), 1500, 30);
    insert into Employees values(7876, 'ADAMS', 'CLERK', format('13-jul-1987','dd-mmm-yyyy'), 1100, 20);
    insert into Employees values(7900, 'JAMES', 'CLERK', format('3-12-1981','dd-mm-yyyy'), 950, 30);
    insert into Employees values(7934, 'MILLER', 'CLERK', format('23-1-1982','dd-mm-yyyy'), 1300, 10);

After the insert into statements are executed, the following select statement displays the records of the Departments table:

    select * from Departments;

If the Run button is clicked, the data will be displayed. The records shown in the Query1 window are not the Departments table itself, but a data set retrieved from the table and loaded into the computer's memory. Likewise, the following statement will display the data from the Employees table:

    select * from Employees;
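The notes mention that all records can be entered at one time with some programming. One possible way, shown here only as a sketch, is a parameterized insert run over a list of rows. It assumes Python's standard sqlite3 module and an in-memory SQLite database instead of Microsoft Access; the variable names are illustrative.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the Access file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Departments("
    "deptno integer, dname text, loc text, "
    "constraint pk_departments primary key (deptno))"
)

# The four department records from the notes, as a list of tuples.
departments = [
    (10, "ACCOUNTING", "NEW YORK"),
    (20, "RESEARCH", "DALLAS"),
    (30, "SALES", "CHICAGO"),
    (40, "OPERATIONS", "BOSTON"),
]

# executemany runs the same insert once for each tuple in the list.
# The "with" block commits on success and rolls back if an insert fails.
with conn:
    conn.executemany("insert into Departments values (?, ?, ?)", departments)

rows = conn.execute("select * from Departments order by deptno").fetchall()
for row in rows:
    print(row)
```

The question marks are placeholders filled from each tuple, so the statement text is written once no matter how many records are loaded.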

The following SQL statement summarizes the data from the two tables by computing the total salary for each department and displays the result:

    select Departments.deptno, sum(sal)
    from Departments, Employees
    where Departments.deptno = Employees.deptno
    group by Departments.deptno, dname;

The query (currently Query1) can be saved by clicking on the Save button.

Name it SalaryByDepartment and click the OK button, and the query is shown in the Queries area. The result that the SQL statement retrieves from the database can be used in different application programs (e.g., Excel). To understand the process, first right-click on the query and select the Excel menu of the Export list. Complete the Export to Excel process by clicking on the OK and Close buttons, and the file (SalaryByDepartment.xlsx) can be found in the location where it was exported.
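The summarize-and-export steps above can also be sketched outside Access. The example below is an illustration under stated assumptions: it uses SQLite through Python's sqlite3 module, keeps only the columns the query needs, and writes comma-separated text (which Excel can open) to an in-memory buffer instead of producing an .xlsx file. An `order by` clause is added to the query from the notes so that the row order is predictable.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table Departments(deptno integer primary key, dname text, loc text);
    create table Employees(
        empno integer primary key, ename text, sal integer,
        deptno integer references Departments (deptno)
    );
    """
)
conn.executemany(
    "insert into Departments values (?, ?, ?)",
    [(10, "ACCOUNTING", "NEW YORK"), (20, "RESEARCH", "DALLAS"),
     (30, "SALES", "CHICAGO"), (40, "OPERATIONS", "BOSTON")],
)
conn.executemany(
    "insert into Employees values (?, ?, ?, ?)",
    [(7839, "KING", 5000, 10), (7782, "CLARK", 2450, 10),
     (7934, "MILLER", 1300, 10), (7566, "JONES", 2975, 20),
     (7788, "SCOTT", 3000, 20), (7902, "FORD", 3000, 20),
     (7369, "SMITH", 800, 20), (7876, "ADAMS", 1100, 20),
     (7698, "BLAKE", 2850, 30), (7499, "ALLEN", 1600, 30),
     (7521, "WARD", 1250, 30), (7654, "MARTIN", 1250, 30),
     (7844, "TURNER", 1500, 30), (7900, "JAMES", 950, 30)],
)

# The query from the notes, with "order by" added for a stable row order.
totals = conn.execute(
    """
    select Departments.deptno, sum(sal)
    from Departments, Employees
    where Departments.deptno = Employees.deptno
    group by Departments.deptno, dname
    order by Departments.deptno
    """
).fetchall()

# Write the result as CSV text; the header names are illustrative.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["deptno", "total_sal"])
writer.writerows(totals)
csv_text = buffer.getvalue()
print(csv_text)
```

Department 40 (OPERATIONS) does not appear in the result because it has no employees, so no row in Employees matches it in the where clause. To create a real file, the buffer could be replaced with a file opened for writing.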


In Excel, the data can be modified and analyzed further. The summarized data can be saved in the Excel file, too, but the raw data is usually stored in a database. As mentioned before, Excel is different from Access. Excel is an electronic spreadsheet program which is used to analyze numeric data; Access, on the other hand, is a relational database management system which is used to collect, organize, and store a large amount of data. In Excel, there are many cells in a big table-like structure, but it is not a relation (it has no primary key, foreign key, or any other constraints). In Access, many tables are organized, and each table is a relation which should include a primary key. Optionally, one or more foreign keys can be implemented in a relation.

A primary key is a column or a set of columns in a relation and is used to identify each record. For instance, if a value of the primary key is given, a specific record can be identified (differentiated from other records). A foreign key is an extra column or set of columns in a relation. It is the primary key in another relation (the reference table) and is used to keep a link between the tables (the reference table and the referring table that has the foreign key).

In practice, ordinary business people don't use a database management system directly unless they know the important concepts and techniques. Usually they use business applications or analytic programs (e.g., electronic spreadsheet programs), which import data from database management systems and then process and analyze the imported data. Some business applications or analytic systems are introduced in CIS 211, CIS 385, and other IS courses. Some database management courses (CIS 326, CIS 426) provide more information on database management. It is strongly recommended that Microsoft Access not be used until the advanced concepts and techniques (e.g., Entity-Relationship Model, Relational Data Model, etc.) are learned. On the other hand, all college students, especially business-major students, should master an electronic spreadsheet program.
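The roles of the primary key and the foreign key described above can be checked with a short experiment. This is a minimal sketch under assumptions: it uses SQLite through Python's sqlite3 module rather than Access, a reduced Employees table, a hypothetical helper named `is_rejected`, and a department number (99) that does not exist. SQLite checks foreign keys only after `pragma foreign_keys = on` is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces foreign keys only on request
conn.executescript(
    """
    create table Departments(
        deptno integer, dname text, loc text,
        constraint pk_departments primary key (deptno)
    );
    create table Employees(
        empno integer, ename text, deptno integer,
        constraint pk_employees primary key (empno),
        constraint fk_deptno foreign key (deptno) references Departments (deptno)
    );
    insert into Departments values (10, 'ACCOUNTING', 'NEW YORK');
    insert into Employees values (7839, 'KING', 10);
    """
)


def is_rejected(sql):
    """Return True if the statement is refused because of a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Primary key: empno 7839 already identifies KING, so it cannot be reused.
duplicate_key = is_rejected("insert into Employees values (7839, 'CLARK', 10)")

# Foreign key: department 99 does not exist in the reference table.
orphan_row = is_rejected("insert into Employees values (7782, 'CLARK', 99)")

# A new empno that points to an existing department is accepted.
valid_row = is_rejected("insert into Employees values (7782, 'CLARK', 10)")

print(duplicate_key, orphan_row, valid_row)
```

A spreadsheet would accept all three rows without complaint, which is the practical difference between a table-like grid of cells and a relation with constraints.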