Discovery-based Edit Assistance for Spreadsheets

1 Discovery-based Edit Assistance for Spreadsheets Jácome Cunha, Departamento de Informática, Universidade do Minho, Portugal. Joint work with João Saraiva (DI/UM) and Joost Visser (SIG). VL-HCC, Corvallis, OR, USA, September 23, 2009

2 1.1 Motivation Example: Property Renting System Each row represents a renting transaction. It includes information about properties, owners and clients. It also contains dates and prices (formulas). Row 3 says that john has rented a property owned by tony, with address 5 Novar Dr., price per day 70, between dates 9/1/01 and 9/1/02 and paid for it.

3 1.2 Motivation Redundancy This unstructured model is valid and serves its purpose. However, it contains data redundancy. For example, the rent per day is repeated several times.

4 1.3 Motivation Updating Problems As a result, updates can cause data inconsistency. For example, updating the rent of property pg4 must be performed in several places.

5 1.4 Motivation Deleting Problems Deleting rows can be problematic, too. For example, deleting row 5 will remove all the information about property pg36.

6 1.5 Motivation Modern Programming Environments Naive text editors have been replaced by powerful programming language environments (e.g., Eclipse). They are specialized for the programming language under consideration and help the user throughout the editing process. Knowing the language, they can detect features of the programs being edited that, for example, violate the properties of the underlying language. They guide users in writing correct programs.

7 1.6 Motivation This Talk In this talk, I will show how we can use well-known database techniques and programming language techniques to force users to correctly update and delete spreadsheet data

8 2.0 Overview of our Approach Infer functional dependencies (Fun algorithm). Normalize (3NF) those functional dependencies. Create a Relational Database (RDB) schema (3NF). Embed that schema in the spreadsheet. [Figure: the user's spreadsheet goes through dependency mining and schema synthesis to produce a relational database schema, which is embedded back into the spreadsheet as formulas and visual objects.]

9 3.1 Functional Dependencies Inference Functional Dependencies A Functional Dependency (FD), denoted A → B, means that each value of A is associated with exactly one value of B. For example, the following spreadsheet data defines the functional dependency clientnr → cname.
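
The definition above can be checked mechanically on tabular data. The following is a minimal sketch in Python, not code from the talk: the function name `fd_holds` and the sample rows are assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Sequence

Row = Dict[str, Hashable]


def fd_holds(rows: List[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if each combination of lhs values maps to one rhs combination."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        # setdefault stores the first value seen for this key and returns it,
        # so a later row with a different value reveals a violation.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows (not the spreadsheet shown in the talk).
rows = [
    {"clientnr": "cr76", "cname": "john", "rentperday": 50},
    {"clientnr": "cr76", "cname": "john", "rentperday": 70},
    {"clientnr": "cr56", "cname": "aline", "rentperday": 50},
]

print(fd_holds(rows, ["clientnr"], ["cname"]))       # clientnr determines cname here
print(fd_holds(rows, ["clientnr"], ["rentperday"]))  # cr76 has two different rents
```

Note that such a check only says the dependency holds in the data at hand; it cannot tell whether the dependency is intended by the spreadsheet's author.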

10 3.2 Functional Dependencies Inference The Fun Algorithm The Fun algorithm (Novelli et al.) computes FDs from data. It computes all the FDs defined in the data, even the ones that are not so intuitive. For our example, it returns: ownerno → oname; totaldays → clientno, cname; propertyno → paddress, rentperday, ownerno, oname; ...
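
To illustrate what "computing FDs from data" means, here is a brute-force sketch in Python. It is deliberately not the Fun algorithm, which is far more efficient; it simply tries every small left-hand side. The names `discover_fds` and `max_lhs`, and the sample rows, are assumptions for illustration.

```python
from itertools import combinations


def fd_holds(rows, lhs, target):
    """True if the lhs values determine a single value of target in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        if seen.setdefault(key, row[target]) != row[target]:
            return False
    return True


def discover_fds(rows, max_lhs=2):
    """Brute-force search for minimal FDs lhs -> target with len(lhs) <= max_lhs."""
    attrs = sorted(rows[0])
    found = []  # list of (lhs tuple, target attribute)
    for size in range(1, max_lhs + 1):
        for lhs in combinations(attrs, size):
            for target in attrs:
                if target in lhs:
                    continue
                # Skip non-minimal FDs: a proper subset of lhs already works.
                if any(set(l) < set(lhs) and t == target for l, t in found):
                    continue
                if fd_holds(rows, lhs, target):
                    found.append((lhs, target))
    return found


# Hypothetical sample rows (not the spreadsheet shown in the talk).
rows = [
    {"ownerno": "co40", "oname": "tina", "propertyno": "pg4"},
    {"ownerno": "co93", "oname": "tony", "propertyno": "pg16"},
    {"ownerno": "co40", "oname": "tina", "propertyno": "pg36"},
]

fds = discover_fds(rows)
for lhs, target in fds:
    print(", ".join(lhs), "->", target)
```

On this sample the search also reports oname → ownerno, which holds only because no two owners happen to share a name. This is the kind of "not so intuitive" dependency that a purely data-driven inference can return.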

11 3.3 Functional Dependencies Inference Normalize Functional Dependencies The Synthesize algorithm (Maier) calculates a 3NF set of FDs. It returns a set of attributes with a set of candidate keys. It is necessary to choose one candidate key to be the primary key (PK). We consider some particularities of spreadsheets: Formulas: PKs cannot be formulas. Single-value columns: columns with only one value appear in too many FDs. Semantics of labels: id, number, nr, code may be good pointers to primary keys. Column arrangement: we assume that the PK comes before the rest of the columns.
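
The primary-key heuristics above could be combined roughly as follows. This is a sketch under simplifying assumptions, not the authors' implementation: candidate keys are single columns, label hints are matched as plain substrings, the single-value-column rule is left out, and the names `choose_primary_key`, `position` and `formula_columns` are hypothetical.

```python
from typing import Dict, List, Optional, Set

# Label hints listed on the slide; matching by substring is a crude assumption.
KEY_HINTS = ("id", "number", "nr", "code")


def choose_primary_key(candidates: List[str],
                       position: Dict[str, int],
                       formula_columns: Set[str]) -> Optional[str]:
    """Pick one candidate key using the spreadsheet-specific heuristics."""
    # Formulas: a column computed by a formula cannot be a primary key.
    usable = [c for c in candidates if c not in formula_columns]
    if not usable:
        return None

    def rank(col: str):
        has_hint = any(h in col.lower() for h in KEY_HINTS)
        # False sorts before True, so hinted labels win; ties go to the
        # leftmost column (column arrangement heuristic).
        return (not has_hint, position[col])

    return min(usable, key=rank)


# Hypothetical column layout: clientnr is the first column, cname the second.
print(choose_primary_key(["cname", "clientnr"], {"clientnr": 0, "cname": 1}, set()))
```

Ranking with a tuple keeps the priorities explicit: the label hint dominates, and column position only breaks ties.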

12 3.3 Functional Dependencies Inference Normalize Functional Dependencies For our example, the functional dependencies/schema generated is: owners: ownerno → oname; clients: clientno → cname; properties: propertyno → paddress, rentperday, oname

13 4.0 Spreadsheet Programming Environments The spreadsheet programming environments have several features: auto-completion of column values, non-editable columns, safe deletion of rows, and standard editing.

14 4.1 Spreadsheet Programming Environments Auto-completion of Column Values The spreadsheet environment will not allow the user to introduce two properties with the same number. Instead, it offers a list of possible values to choose from: We use the notion of FD and primary key to determine the value of some columns in the spreadsheet.

15 4.1 Spreadsheet Programming Environments Auto-completion of Column Values For example, the value of the property number (propno) determines the values of the address (paddress), rent per day (rentperday), and owner name (oname). Consequently, the spreadsheet environment is able to automatically fill in the values of the corresponding columns.
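
The auto-completion behaviour described here amounts to a lookup on the primary key. A minimal Python sketch, assuming rows are dictionaries and using the hypothetical function name `autocomplete` and a made-up property number:

```python
def autocomplete(row, key, dependents, existing_rows):
    """Fill the dependent columns of row from an earlier row with the same key."""
    for prev in existing_rows:
        if prev[key] == row[key]:
            filled = dict(row)
            for col in dependents:
                filled[col] = prev[col]
            return filled
    # Unknown key: nothing to copy, the user supplies the values.
    return dict(row)


# Hypothetical data; the property number "pg16" is an assumption.
existing = [
    {"propno": "pg16", "paddress": "5 Novar Dr.", "rentperday": 70, "oname": "tony"},
]
new_row = {"propno": "pg16"}

print(autocomplete(new_row, "propno", ["paddress", "rentperday", "oname"], existing))
```

The input row is copied rather than modified, so the caller can compare what the user typed with what was filled in.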

16 4.1 Spreadsheet Programming Environments Auto-completion of Column Values From the relational model inferred from the original spreadsheet we generate a set of spreadsheet formulas. Consider the FD ownerno → oname. In our spreadsheet, ownerno is in column J and oname in column K. We introduce in column K, at each row r, the following formula: S(K, r) = if(isna(vlookup(Jr, J2:K(r-1), 2, 0)), "", vlookup(Jr, J2:K(r-1), 2, 0)) This formula looks up the value entered in Jr among the previous rows. If a matching value is found, it shows the corresponding value from column K; otherwise it shows an empty string.
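
Generating this formula for a concrete row is a matter of string construction. The Python sketch below is an illustration, not the tool's code: the function name `lookup_formula` is hypothetical, it assumes the dependent column sits immediately to the right of the key column (hence the column index 2), and it assumes a comma as the argument separator, which some spreadsheet locales replace with a semicolon.

```python
def lookup_formula(key_col: str, dep_col: str, row: int, first_row: int = 2) -> str:
    """Build the auto-completion formula for cell dep_col+row.

    Assumes dep_col is the column directly to the right of key_col.
    """
    if row <= first_row:
        # There are no previous data rows to search in.
        raise ValueError("row must come after the first data row")
    lookup = f"VLOOKUP({key_col}{row},{key_col}{first_row}:{dep_col}{row - 1},2,0)"
    # ISNA catches the #N/A error that VLOOKUP returns when no match exists.
    return f'=IF(ISNA({lookup}),"",{lookup})'


print(lookup_formula("J", "K", 5))
```

For row 5 this yields a lookup over the range J2:K4, which matches the J2:K(r-1) range in the formula above.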

17 4.2 Spreadsheet Programming Environments Non-editable Columns Non-primary-key columns become non-editable. This prevents the end user from introducing potentially incorrect data and thus producing update anomalies.

18 4.3 Spreadsheet Programming Environments Safe Deletion of Rows To correctly delete rows in the spreadsheet, a button per row is added. When the user is about to remove important information, this button warns them, giving them the opportunity to continue or cancel the action.
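
One plausible reading of "important information" is that the row holds the last occurrence of some primary-key value, so deleting it loses everything known about that entity. The Python sketch below encodes that assumption; the function name `lost_keys` and the sample rows are hypothetical, and the actual tool may use a different criterion.

```python
def lost_keys(rows, index, key_columns):
    """Return the key columns whose value in rows[index] appears in no other row."""
    target = rows[index]
    others = rows[:index] + rows[index + 1:]
    return [k for k in key_columns
            if all(r[k] != target[k] for r in others)]


# Hypothetical sample rows (not the spreadsheet shown in the talk).
rows = [
    {"propertyno": "pg4", "clientno": "cr76"},
    {"propertyno": "pg4", "clientno": "cr56"},
    {"propertyno": "pg36", "clientno": "cr76"},
]

# A non-empty result means the environment should warn before deleting.
print(lost_keys(rows, 2, ["propertyno", "clientno"]))
print(lost_keys(rows, 0, ["propertyno", "clientno"]))
```

Deleting the last row would lose property pg36, while deleting the first row loses nothing because both of its key values occur elsewhere.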

19 4.4 Spreadsheet Programming Environments Standard Editing The spreadsheet programming environment provides a mechanism to enable/disable the advanced features. When the user disables those features, they are able to introduce data that violates the (previously) inferred relational model. When the user enables the features, the system infers a new relational model that has to be obeyed in future advanced interactions.

20 5.0 Conclusions We used data mining and database techniques to analyze and create spreadsheet programming environments. We have defined Spreadsheet Programming Environments and shown how such environments can be automatically derived from the spreadsheet data. The spreadsheet environment guides the user in introducing correct data. We would like to test these techniques with real users.

21 5.0 Conclusions We have derived a relational model and embedded it as formulas and visual objects.

23 6.0 Metrics from the EUSES Corpus The EUSES corpus was conceived as a shared resource to support research on technologies for improving the dependability of spreadsheet programming. It contains more than 4500 spreadsheets gathered from different sources and developed for different domains. These spreadsheets are assigned to eleven different categories. Among the spreadsheets in the corpus, about 4.4% contain macros, about 2.3% contain charts, and about 56% contain no formulas and are used only to store data.

24 6.0 Metrics from the EUSES Corpus In our preliminary experiment we selected the first ten spreadsheets from each of the eleven categories of the corpus. Table: the number of spreadsheets per category present in the EUSES corpus, selected, and processed in the evaluation. Columns: category, corpus, selected, processed. Rows: cs, database, filby, financial, forms, grades, homework, inventory, jackson, modeling, personal, total.

From Spreadsheets to Relational Databases and Back

From Spreadsheets to Relational Databases and Back From Spreadsheets to Relational Databases and Back Jácome Cunha João Saraiva Joost Visser Departamento de Informática Universidade do Minho Portugal SIG & CWI The Netherlands PEPM 09, January, 19-20 Motivation

More information

MODEL-BASED PROGRAMMING ENVIRONMENTS FOR SPREADSHEETS

MODEL-BASED PROGRAMMING ENVIRONMENTS FOR SPREADSHEETS MODEL-BASED PROGRAMMING ENVIRONMENTS FOR SPREADSHEETS Jácome Cunha Departamento de Informática jacome@di.uminho.pt João Saraiva Departamento de Informática jas@di.uminho.pt Joost Visser Software Improvement

More information

Model-based Programming Environments for Spreadsheets

Model-based Programming Environments for Spreadsheets Model-based Programming Environments for Spreadsheets Jácome Cunha 13, João Saraiva 1, and Joost Visser 2 1 HASLab / INESC TEC, Universidade do Minho, Portugal {jacome,jas}@di.uminho.pt 2 Software Improvement

More information

Chapter 3. The Relational database design

Chapter 3. The Relational database design Chapter 3 The Relational database design Chapter 3 - Objectives Terminology of relational model. How tables are used to represent data. Connection between mathematical relations and relations in the relational

More information

Database Technologies. Madalina CROITORU IUT Montpellier

Database Technologies. Madalina CROITORU IUT Montpellier Database Technologies Madalina CROITORU croitoru@lirmm.fr IUT Montpellier Part 5 NORMAL FORMS AND NORMALISATION Database design In the database design one can follow a bottom up approach or a top down

More information

Normalization. { Ronak Panchal }

Normalization. { Ronak Panchal } Normalization { Ronak Panchal } Chapter Objectives The purpose of normailization Data redundancy and Update Anomalies Functional Dependencies The Process of Normalization First Normal Form (1NF) Second

More information

Database Normalization

Database Normalization Database Normalization Todd Bacastow IST 210 1 Overview Introduction The Normal Forms Relationships and Referential Integrity Real World Exercise 2 Keys in the relational model Superkey A set of one or

More information

CS317 File and Database Systems

CS317 File and Database Systems CS317 File and Database Systems http://nsa.gov1.info/utah-data-center/ http://www.google.com/about/datacenters/gallery/index.html#/locations/the-dalles/1 Lecture 8 Normalization, Bottom-Up from UNF to

More information

Model Inference for Spreadsheets

Model Inference for Spreadsheets Automated Software Engineering manuscript No. (will be inserted by the editor) Model Inference for Spreadsheets Jácome Cunha Martin Erwig Jorge Mendes João Saraiva Received: August 11, 2014 Abstract Many

More information

DBMS Chapter Three IS304. Database Normalization-Comp.

DBMS Chapter Three IS304. Database Normalization-Comp. Database Normalization-Comp. Contents 4. Boyce Codd Normal Form (BCNF) 5. Fourth Normal Form (4NF) 6. Fifth Normal Form (5NF) 7. Sixth Normal Form (6NF) 1 4. Boyce Codd Normal Form (BCNF) In the first

More information

Automatically Inferring ClassSheet Models from Spreadsheets

Automatically Inferring ClassSheet Models from Spreadsheets Automatically Inferring ClassSheet Models from Spreadsheets Jácome Cunha Universidade do Minho & ESTGF jacome@di.uminho.pt Martin Erwig Oregon State University erwig@eecs.oregonstate.edu João Saraiva Universidade

More information

Normalization. Murali Mani. What and Why Normalization? To remove potential redundancy in design

Normalization. Murali Mani. What and Why Normalization? To remove potential redundancy in design 1 Normalization What and Why Normalization? To remove potential redundancy in design Redundancy causes several anomalies: insert, delete and update Normalization uses concept of dependencies Functional

More information

This lecture. Databases -Normalization I. Repeating Data. Redundancy. This lecture introduces normal forms, decomposition and normalization.

This lecture. Databases -Normalization I. Repeating Data. Redundancy. This lecture introduces normal forms, decomposition and normalization. This lecture Databases -Normalization I This lecture introduces normal forms, decomposition and normalization (GF Royle 2006-8, N Spadaccini 2008) Databases - Normalization I 1 / 23 (GF Royle 2006-8, N

More information

Databases -Normalization I. (GF Royle, N Spadaccini ) Databases - Normalization I 1 / 24

Databases -Normalization I. (GF Royle, N Spadaccini ) Databases - Normalization I 1 / 24 Databases -Normalization I (GF Royle, N Spadaccini 2006-2010) Databases - Normalization I 1 / 24 This lecture This lecture introduces normal forms, decomposition and normalization. We will explore problems

More information

CS403- Database Management Systems Solved Objective Midterm Papers For Preparation of Midterm Exam

CS403- Database Management Systems Solved Objective Midterm Papers For Preparation of Midterm Exam CS403- Database Management Systems Solved Objective Midterm Papers For Preparation of Midterm Exam Question No: 1 ( Marks: 1 ) - Please choose one Which of the following is NOT a feature of Context DFD?

More information

Lecture5 Functional Dependencies and Normalization for Relational Databases

Lecture5 Functional Dependencies and Normalization for Relational Databases College of Computer and Information Sciences - Information Systems Dept. Lecture5 Functional Dependencies and Normalization for Relational Databases Ref. Chapter14-15 Prepared by L. Nouf Almujally & Aisha

More information

Unit I. Introduction to Database. Prepared By: Prof.Sushila Aghav. Ref.Database Concepts by Korth

Unit I. Introduction to Database. Prepared By: Prof.Sushila Aghav. Ref.Database Concepts by Korth Unit I Introduction to Database Prepared By: Prof.Sushila Aghav Ref.Database Concepts by Korth Contents Database Concepts Data Models & Types Relational Database-R Model(Table) ER Modeling Concepts of

More information

Functional Dependencies and Finding a Minimal Cover

Functional Dependencies and Finding a Minimal Cover Functional Dependencies and Finding a Minimal Cover Robert Soulé 1 Normalization An anomaly occurs in a database when you can update, insert, or delete data, and get undesired side-effects. These side

More information

Database Design Principles

Database Design Principles Database Design Principles CPS352: Database Systems Simon Miner Gordon College Last Revised: 2/11/15 Agenda Check-in Design Project ERD Presentations Database Design Principles Decomposition Functional

More information

Database Management System Prof. Partha Pratim Das Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur

Database Management System Prof. Partha Pratim Das Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur Database Management System Prof. Partha Pratim Das Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur Lecture - 19 Relational Database Design (Contd.) Welcome to module

More information

Lecture 03. Fall 2017 Borough of Manhattan Community College

Lecture 03. Fall 2017 Borough of Manhattan Community College Lecture 03 Fall 2017 Borough of Manhattan Community College 1 2 Outline 1 Brief History of the Relational Model 2 Terminology 3 Integrity Constraints 4 Views 3 History of the Relational Model The Relational

More information

The EUSES Spreadsheet Corpus: A Shared Resource for Supporting Experimentation with Spreadsheet Dependability Mechanisms

The EUSES Spreadsheet Corpus: A Shared Resource for Supporting Experimentation with Spreadsheet Dependability Mechanisms The EUSES Spreadsheet Corpus: A Shared Resource for Supporting Experimentation with Spreadsheet Dependability Mechanisms Marc Fisher II and Gregg Rothermel Department of Computer Science and Engineering

More information

Static Spreadsheet Analysis

Static Spreadsheet Analysis Debugging of Spreadsheets (DEOS) Project Funded by the Austrian Science Fund (FWF, contract I2144) and the Deutsche Forschungsgemeinschaft (DFG,contract JA 2095/4-1) 1 2 the number of genomics papers packaged

More information

Edited by: Nada Alhirabi. Normalization

Edited by: Nada Alhirabi. Normalization Edited by: Nada Alhirabi Normalization Normalization:Why do we need to normalize? 1. To avoid redundancy (less storage space needed, and data is consistent) Ssn c-id Grade Name Address 123 cs331 A smith

More information

CS403- Database Management Systems Solved MCQS From Midterm Papers. CS403- Database Management Systems MIDTERM EXAMINATION - Spring 2010

CS403- Database Management Systems Solved MCQS From Midterm Papers. CS403- Database Management Systems MIDTERM EXAMINATION - Spring 2010 CS403- Database Management Systems Solved MCQS From Midterm Papers April 29,2012 MC100401285 Moaaz.pk@gmail.com Mc100401285@gmail.com PSMD01 CS403- Database Management Systems MIDTERM EXAMINATION - Spring

More information

Lecture 03. Spring 2018 Borough of Manhattan Community College

Lecture 03. Spring 2018 Borough of Manhattan Community College Lecture 03 Spring 2018 Borough of Manhattan Community College 1 2 Outline 1. Brief History of the Relational Model 2. Terminology 3. Integrity Constraints 4. Views 3 History of the Relational Model The

More information

Chapter 14. Chapter 14 - Objectives. Purpose of Normalization. Purpose of Normalization

Chapter 14. Chapter 14 - Objectives. Purpose of Normalization. Purpose of Normalization Chapter 14 - Objectives Chapter 14 Normalization The purpose of normalization. How normalization can be used when designing a relational database. The potential problems associated with redundant data

More information

We shall represent a relation as a table with columns and rows. Each column of the table has a name, or attribute. Each row is called a tuple.

We shall represent a relation as a table with columns and rows. Each column of the table has a name, or attribute. Each row is called a tuple. Logical Database design Earlier we saw how to convert an unorganized text description of information requirements into a conceptual design, by the use of ER diagrams. The advantage of ER diagrams is that

More information

2. Discovery of Association Rules

2. Discovery of Association Rules 2. Discovery of Association Rules Part I Motivation: market basket data Basic notions: association rule, frequency and confidence Problem of association rule mining (Sub)problem of frequent set mining

More information

Workbooks (File) and Worksheet Handling

Workbooks (File) and Worksheet Handling Workbooks (File) and Worksheet Handling Excel Limitation Excel shortcut use and benefits Excel setting and custom list creation Excel Template and File location system Advanced Paste Special Calculation

More information

Data and Knowledge Management Dr. Rick Jerz

Data and Knowledge Management Dr. Rick Jerz Data and Knowledge Management Dr. Rick Jerz 1 Goals Define big data and discuss its basic characteristics Understand ways to store information Understand the value of a Database Management System Explain

More information

FUNCTIONAL DEPENDENCIES

FUNCTIONAL DEPENDENCIES FUNCTIONAL DEPENDENCIES CS 564- Spring 2018 ACKs: Dan Suciu, Jignesh Patel, AnHai Doan WHAT IS THIS LECTURE ABOUT? Database Design Theory: Functional Dependencies Armstrong s rules The Closure Algorithm

More information

Functional Dependencies, Normalization. Rose-Hulman Institute of Technology Curt Clifton

Functional Dependencies, Normalization. Rose-Hulman Institute of Technology Curt Clifton Functional Dependencies, Normalization Rose-Hulman Institute of Technology Curt Clifton Or Fixing Broken Database Designs This material will almost certainly appear on Exam II next week. Outline Functional

More information

Database Technologies. Madalina CROITORU IUT Montpellier

Database Technologies. Madalina CROITORU IUT Montpellier Database Technologies Madalina CROITORU croitoru@lirmm.fr IUT Montpellier Course practicalities 2 x 2h per week (14 weeks) Basics of database theory relational model, relational algebra, SQL and database

More information

5 Normalization:Quality of relational designs

5 Normalization:Quality of relational designs 5 Normalization:Quality of relational designs 5.1 Functional Dependencies 5.1.1 Design quality 5.1.2 Update anomalies 5.1.3 Functional Dependencies: definition 5.1.4 Properties of Functional Dependencies

More information

Working with Pentaho Interactive Reporting and Metadata

Working with Pentaho Interactive Reporting and Metadata Working with Pentaho Interactive Reporting and Metadata Change log (if you want to use it): Date Version Author Changes Contents Overview... 1 Before You Begin... 1 Other Prerequisites... Error! Bookmark

More information

Part II: Using FD Theory to do Database Design

Part II: Using FD Theory to do Database Design Part II: Using FD Theory to do Database Design 32 Recall that poorly designed table? part manufacturer manaddress seller selleraddress price 1983 Hammers R Us 99 Pinecrest ABC 1229 Bloor W 5.59 8624 Lee

More information

Schema Refinement: Dependencies and Normal Forms

Schema Refinement: Dependencies and Normal Forms Schema Refinement: Dependencies and Normal Forms Grant Weddell Cheriton School of Computer Science University of Waterloo CS 348 Introduction to Database Management Spring 2016 CS 348 (Intro to DB Mgmt)

More information

Data and Knowledge Management. Goals. Big Data. Dr. Rick Jerz

Data and Knowledge Management. Goals. Big Data. Dr. Rick Jerz Data and Knowledge Management Dr. Rick Jerz 1 Goals Define big data and discuss its basic characteristics Understand ways to store information Understand the value of a Database Management System Explain

More information

Schema Refinement: Dependencies and Normal Forms

Schema Refinement: Dependencies and Normal Forms Schema Refinement: Dependencies and Normal Forms Grant Weddell David R. Cheriton School of Computer Science University of Waterloo CS 348 Introduction to Database Management Spring 2012 CS 348 (Intro to

More information

Chapter 6: Relational Database Design

Chapter 6: Relational Database Design Chapter 6: Relational Database Design Chapter 6: Relational Database Design Features of Good Relational Design Atomic Domains and First Normal Form Decomposition Using Functional Dependencies Second Normal

More information

CSCI 403: Databases 13 - Functional Dependencies and Normalization

CSCI 403: Databases 13 - Functional Dependencies and Normalization CSCI 403: Databases 13 - Functional Dependencies and Normalization Introduction The point of this lecture material is to discuss some objective measures of the goodness of a database schema. The method

More information

From Murach Chap. 9, second half. Schema Refinement and Normal Forms

From Murach Chap. 9, second half. Schema Refinement and Normal Forms From Murach Chap. 9, second half The need for normalization A table that contains repeating columns Schema Refinement and Normal Forms A table that contains redundant data (same values repeated over and

More information

Informal Design Guidelines for Relational Databases

Informal Design Guidelines for Relational Databases Outline Informal Design Guidelines for Relational Databases Semantics of the Relation Attributes Redundant Information in Tuples and Update Anomalies Null Values in Tuples Spurious Tuples Functional Dependencies

More information

V. Database Design CS448/ How to obtain a good relational database schema

V. Database Design CS448/ How to obtain a good relational database schema V. How to obtain a good relational database schema Deriving new relational schema from ER-diagrams Normal forms: use of constraints in evaluating existing relational schema CS448/648 1 Translating an E-R

More information

Schema Refinement: Dependencies and Normal Forms

Schema Refinement: Dependencies and Normal Forms Schema Refinement: Dependencies and Normal Forms M. Tamer Özsu David R. Cheriton School of Computer Science University of Waterloo CS 348 Introduction to Database Management Fall 2012 CS 348 Schema Refinement

More information

Database Foundations. 3-9 Validating Data Using Normalization. Copyright 2015, Oracle and/or its affiliates. All rights reserved.

Database Foundations. 3-9 Validating Data Using Normalization. Copyright 2015, Oracle and/or its affiliates. All rights reserved. Database Foundations 3-9 Roadmap Conceptual and Physical Data Models Business Rules Entities Attributes Unique Identifiers Relationships Validating Relationships Tracking Data Changes over Time Validating

More information

Transforming ER to Relational Schema

Transforming ER to Relational Schema Transforming ER to Relational Schema Transformation of ER Diagrams to Relational Schema ER Diagrams Entities (Strong, Weak) Relationships Attributes (Multivalued, Derived,..) Generalization Relational

More information

CS411 Database Systems. 05: Relational Schema Design Ch , except and

CS411 Database Systems. 05: Relational Schema Design Ch , except and CS411 Database Systems 05: Relational Schema Design Ch. 3.1-3.5, except 3.4.2-3.4.3 and 3.5.3. 1 How does this fit in? ER Diagrams: Data Definition Translation to Relational Schema: Data Definition Relational

More information

Querying Spreadsheets: An Empirical Study

Querying Spreadsheets: An Empirical Study Querying Spreadsheets: An Empirical Study Jácome Cunha, João Paulo Fernandes, Rui Pereira, and João Saraiva HASLab/INESC TEC Universidade do Minho, Portugal Universidade Nova de Lisboa, Portugal RELEASE,

More information

CS 338 Functional Dependencies

CS 338 Functional Dependencies CS 338 Functional Dependencies Bojana Bislimovska Winter 2016 Outline Design Guidelines for Relation Schemas Functional Dependency Set and Attribute Closure Schema Decomposition Boyce-Codd Normal Form

More information

COMP 430 Intro. to Database Systems

COMP 430 Intro. to Database Systems COMP 430 Intro. to Database Systems Multi-table SQL Get clickers today! Slides use ideas from Chris Ré and Chris Jermaine. The need for multiple tables Using a single table leads to repeating data Provides

More information

SpreadsheetDoc: An Excel Add-in for Documenting Spreadsheets

SpreadsheetDoc: An Excel Add-in for Documenting Spreadsheets SpreadsheetDoc: An Excel Add-in for Documenting Spreadsheets Diogo Canteiro and Jácome Cunha Universidade Nova de Lisboa, Portugal d.canteiro@campus.fct.unl.pt jacome@fct.unl.pt Abstract. Documentation

More information

Graphical Querying of Model-Driven Spreadsheets

Graphical Querying of Model-Driven Spreadsheets Graphical Querying of Model-Driven Spreadsheets Jácome Cunha 1,2, João Paulo Fernandes 1,3, Rui Pereira 1, and João Saraiva 1 1 HASLab/INESC TEC & Universidade do Minho, Portugal 2 CIICESI, ESTGF, Instituto

More information

Lecture 6 Structured Query Language (SQL)

Lecture 6 Structured Query Language (SQL) ITM661 Database Systems Lecture 6 Structured Query Language (SQL) (Data Definition) T. Connolly, and C. Begg, Database Systems: A Practical Approach to Design, Implementation, and Management, 5th edition,

More information

CS317 File and Database Systems

CS317 File and Database Systems CS317 File and Database Systems http://dilbert.com/strips/comic/2010-08-24/ Lecture 8 Introduction to Normalization October 17, 2017 Sam Siewert Exam #1 Questions? Reminders Working on Grading Ex #3 -

More information

STRUCTURED QUERY LANGUAGE (SQL)

STRUCTURED QUERY LANGUAGE (SQL) STRUCTURED QUERY LANGUAGE (SQL) EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY SQL TIMELINE SCOPE OF SQL THE ISO SQL DATA TYPES SQL identifiers are used

More information

Database Management System 15

Database Management System 15 Database Management System 15 Trivial and Non-Trivial Canonical /Minimal School of Computer Engineering, KIIT University 15.1 First characterize fully the data requirements of the prospective database

More information

The Relational Model. Why Study the Relational Model? Relational Database: Definitions

The Relational Model. Why Study the Relational Model? Relational Database: Definitions The Relational Model Database Management Systems, R. Ramakrishnan and J. Gehrke 1 Why Study the Relational Model? Most widely used model. Vendors: IBM, Microsoft, Oracle, Sybase, etc. Legacy systems in

More information

Data about data is database Select correct option: True False Partially True None of the Above

Data about data is database Select correct option: True False Partially True None of the Above Within a table, each primary key value. is a minimal super key is always the first field in each table must be numeric must be unique Foreign Key is A field in a table that matches a key field in another

More information

The Professional Services Of Dojo Technology. Spreadsheet Files

The Professional Services Of Dojo Technology. Spreadsheet Files The Professional Services Of Dojo Technology Spreadsheet Files File Conversion Solutions This document serves as an opportunity to introduce the custom solutions that have been developed by Dojo for processing

More information

CS 327E Lecture 12. Shirley Cohen. March 7, 2016

CS 327E Lecture 12. Shirley Cohen. March 7, 2016 CS 327E Lecture 12 Shirley Cohen March 7, 2016 Agenda Announcements Readings for today Reading Quiz Concept Questions Homework for next time Reminders Midterm 2 will be next class Project phase will start

More information

Homework 6: FDs, NFs and XML (due April 15 th, 2015, 4:00pm, hard-copy in-class please)

Homework 6: FDs, NFs and XML (due April 15 th, 2015, 4:00pm, hard-copy in-class please) Virginia Tech. Computer Science CS 4604 Introduction to DBMS Spring 2015, Prakash Homework 6: FDs, NFs and XML (due April 15 th, 2015, 4:00pm, hard-copy in-class please) Reminders: a. Out of 100 points.

More information

Lecture 3. Wednesday, September 3, 2014

Lecture 3. Wednesday, September 3, 2014 Lecture 3 Wednesday, September 3, 2014 ER Diagrams Last week, we covered ER diagrams which allow us to show entities, attributes, and relationships The last component of an ER diagram is the cardinality

More information

Introduction to Data Management. Lecture #7 (Relational Design Theory)

Introduction to Data Management. Lecture #7 (Relational Design Theory) Introduction to Data Management Lecture #7 (Relational Design Theory) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Announcements v HW#2 is

More information

Normalisation Chapter2 Contents

Normalisation Chapter2 Contents Contents Objective... 64 Superkey & Candidate Keys... 65 Primary, Alternate and Foreign Keys... 65 Functional Dependence... 67 Using Instances... 70 Normalisation Introduction... 70 Normalisation Problems...

More information

Normalization. VI. Normalization of Database Tables. Need for Normalization. Normalization Process. Review of Functional Dependence Concepts

Normalization. VI. Normalization of Database Tables. Need for Normalization. Normalization Process. Review of Functional Dependence Concepts VI. Normalization of Database Tables Normalization Evaluating and correcting relational schema designs to minimize data redundancies Reduces data anomalies Assigns attributes to tables based on functional

More information

Administration Naive DBMS CMPT 454 Topics. John Edgar 2

Administration Naive DBMS CMPT 454 Topics. John Edgar 2 Administration Naive DBMS CMPT 454 Topics John Edgar 2 http://www.cs.sfu.ca/coursecentral/454/johnwill/ John Edgar 4 Assignments 25% Midterm exam in class 20% Final exam 55% John Edgar 5 A database stores

More information

CS 4201 Compilers 2014/2015 Handout: Lab 1

CS 4201 Compilers 2014/2015 Handout: Lab 1 CS 4201 Compilers 2014/2015 Handout: Lab 1 Lab Content: - What is compiler? - What is compilation? - Features of compiler - Compiler structure - Phases of compiler - Programs related to compilers - Some

More information

Systematic Spreadsheet Construction Processes

Systematic Spreadsheet Construction Processes Systematic Spreadsheet Construction Processes Jorge Mendes, Jácome Cunha, Francisco Duarte, Gregor Engels, João Saraiva and Stefan Sauer HASLab, INESC TEC & Universidade do Minho, Portugal, email: {jorgemendes,saraiva}@di.uminho.pt

More information





























