Data Warehouse and Data Mining
- Elmer McCoy
- 6 years ago
1) Data Mining:
Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by the retrospective tools typical of decision support systems.

The most commonly used techniques in data mining are:
Artificial neural networks: Non-linear predictive models that learn through training and resemble biological neural networks in structure.
Decision trees: Tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset. Specific decision tree methods include Classification and Regression Trees (CART) and Chi-Square Automatic Interaction Detection (CHAID).
Genetic algorithms: Optimization techniques that use processes such as genetic combination, mutation, and natural selection in a design based on the concepts of evolution.
Nearest neighbor method: A technique that classifies each record in a dataset based on a combination of the classes of the k record(s) most similar to it in a historical dataset (where k ≥ 1). Sometimes called the k-nearest neighbor technique.
Rule induction: The extraction of useful if-then rules
from data based on statistical significance.

[Figure: Data Mining Architecture]

2) Data Warehouse:
A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision-making process. A data warehouse is a centralized repository
that stores data from multiple information sources and transforms them into a common, multidimensional data model for efficient querying and analysis.

Subject Oriented: Data warehouses are designed to help you analyze data. For example, to learn more about your company's sales data, you can build a warehouse that concentrates on sales. Using this warehouse, you can answer questions like "Who was our best customer for this item last year?" This ability to define a data warehouse by subject matter, sales in this case, makes the data warehouse subject oriented.

Integrated: Integration is closely related to subject orientation. Data warehouses must put data from disparate sources into a consistent format. They must resolve such problems as naming conflicts and inconsistencies among units of measure. When they achieve this, they are said to be integrated.

Nonvolatile: Nonvolatile means that, once entered into the warehouse, data should not change. This is logical because the purpose of a warehouse is to enable you to analyze what has occurred.

Time Variant: In order to discover trends in business, analysts need large amounts of data. This is very much in contrast to online transaction processing (OLTP) systems, where performance requirements demand that historical data be moved to an archive. A data warehouse's focus on change over time is what is meant by the term time variant.

There are two approaches to data warehousing, top down and bottom up. The top down approach spins off data marts for specific groups of users after the complete data warehouse has been created. The bottom up approach builds the data marts first and then combines them into a single, all-encompassing data warehouse.
Slice and dice refers to a strategy for segmenting, viewing and understanding data in a database. Users slice and dice by cutting a large segment of data into smaller parts, and repeating this process until arriving at the right level of detail for analysis. Slicing and dicing helps provide a closer view of data for analysis and presents data in new and diverse perspectives. The term is typically used with OLAP databases that present information to the user in the form of multidimensional cubes similar to a 3D spreadsheet.

ETL process
ETL (Extract, Transform and Load) is a process in data warehousing responsible for pulling data out of the source systems and placing it into a data warehouse. ETL involves the following tasks:

Extracting the data from source systems (SAP, ERP, other
operational systems). Data from the different source systems is converted into one consolidated data warehouse format which is ready for transformation processing.

Transforming the data, which may involve the following tasks:
- applying business rules (so-called derivations, e.g., calculating new measures and dimensions),
- cleaning (e.g., mapping NULL to 0 or Male to M and Female to F),
- filtering (e.g., selecting only certain columns to load),
- splitting a column into multiple columns and vice versa,
- joining together data from multiple sources (e.g., lookup, merge),
- transposing rows and columns,
- applying any kind of simple or complex data validation (e.g., if the first 3 columns in a row are empty, reject the row from processing).

Loading the data into a data warehouse, a data repository, or other reporting applications.
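The three ETL tasks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source rows, the column names, and the `sales` table are hypothetical, and an in-memory SQLite database stands in for the warehouse. The transform step uses the cleaning, filtering, and validation examples given in the notes.

```python
import sqlite3

# Hypothetical source-system rows (assumption: not taken from the notes).
SOURCE_ROWS = [
    {"id": 1, "name": "Asha", "gender": "Female", "amount": None, "notes": "x"},
    {"id": 2, "name": "Ravi", "gender": "Male", "amount": 250, "notes": "y"},
    {"id": None, "name": None, "gender": None, "amount": 10, "notes": "z"},
]

GENDER_CODES = {"Male": "M", "Female": "F"}
KEEP_COLUMNS = ("id", "name", "gender", "amount")


def extract():
    """Extract: pull rows out of the (simulated) source system."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: validate, filter columns, and clean values."""
    result = []
    for row in rows:
        # Validation: reject the row if its first three columns are all empty.
        if all(row[c] is None for c in ("id", "name", "gender")):
            continue
        # Filtering: keep only the columns we intend to load.
        cleaned = {c: row[c] for c in KEEP_COLUMNS}
        # Cleaning: map Male/Female to M/F and NULL amounts to 0.
        cleaned["gender"] = GENDER_CODES.get(cleaned["gender"], cleaned["gender"])
        if cleaned["amount"] is None:
            cleaned["amount"] = 0
        result.append(cleaned)
    return result


def load(rows, conn):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, name TEXT, gender TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (:id, :name, :gender, :amount)", rows
    )
    conn.commit()


def run_etl():
    conn = sqlite3.connect(":memory:")
    load(transform(extract()), conn)
    return conn.execute(
        "SELECT id, name, gender, amount FROM sales ORDER BY id"
    ).fetchall()
```

Calling `run_etl()` loads two rows: the third source row is rejected by the validation rule, and the missing amount in the first row is stored as 0.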
Normalization

Normalization: It is the process of removing redundant data from your tables in order to improve storage efficiency, data integrity and scalability. This improvement is balanced against an increase in complexity and potential performance losses from joining the normalized tables at query time. There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored.

Normalization is also called the bottom-up approach, because this technique requires full knowledge of every participating attribute and its dependencies on the key attributes. If you try to add new attributes after normalization is done, it may change the normal form of the database design.

Redundancy: Dependencies between attributes within a relation cause redundancy.

Problems without normalization: Without normalization, it becomes difficult to handle and update the database without facing data loss. Insertion, update and deletion anomalies are very frequent if a database is not normalized, because an unnormalized table clearly stores redundant information.

Insert Anomaly: Arises from a lack of data. A row cannot be inserted until all the data needed for its key is available, because null values in keys must be avoided. This kind of anomaly can seriously damage a database.
Update Anomaly: It is due to data redundancy, i.e. multiple occurrences of the same value in a column. Every copy must be updated together, which leads to inefficiency and to inconsistency if one copy is missed.
Deletion Anomaly: It leads to loss of data for rows
that are not stored elsewhere. It could result in loss of vital data.

When normalization decomposes a relation into smaller relations with fewer attributes, joining the resulting relations must reproduce the original relation without any extra rows. The join operations can be performed in any order. This is known as lossless join decomposition. The relations (tables) obtained by normalization should have the following properties: each row is identified by a unique key, there are no repeating groups, columns are homogeneous, each column has a unique name, and so on.

Functional Dependency: The attributes of a table are said to be dependent on each other when an attribute of the table uniquely identifies another attribute of the same table. If column A of a table uniquely identifies column B of the same table, this is written A -> B (attribute B is functionally dependent on attribute A).

Partial Functional Dependency: A functional dependency in which the dependent attribute is already determined by only part of a composite key. Assume a relation R with attributes A, B, C, and D, and assume that the set of functional dependencies F that hold on R is F = {A -> B, D -> C}. From the set of functional dependencies F we can derive the primary key. For R, the key can be (A, D), a composite primary key. That means AD -> BC: AD can uniquely identify B and C. In this case, however, A and D together are not required to identify B or C uniquely. To identify B, attribute A is enough. Likewise, to identify C, attribute D is enough. The functional dependencies AD -> B and AD -> C are therefore called partial functional dependencies.

Trivial Dependency: The dependency of an attribute on a set of attributes is known as a trivial dependency if the set of attributes includes that attribute.
Consider a table with two columns, Student_Id and Student_Name. {Student_Id, Student_Name} -> Student_Id is a trivial functional dependency, as Student_Id is a subset of {Student_Id, Student_Name}. That makes sense because if we know the values of Student_Id and Student_Name, then the value of Student_Id is uniquely determined. Student_Id -> Student_Id and Student_Name -> Student_Name are trivial dependencies too.

Non-Trivial Dependency: If a functional dependency X -> Y holds where Y is not a subset of X, then this dependency is called a non-trivial dependency. Consider an employee table with three attributes: emp_id, emp_name, emp_address. The following functional dependencies are non-trivial:
emp_id -> emp_name (emp_name is not a subset of emp_id)
emp_id -> emp_address (emp_address is not a subset of emp_id)
On the other hand, the following dependency is trivial:
{emp_id, emp_name} -> emp_name (emp_name is a subset of {emp_id, emp_name})

The normal forms are:
a) 1NF
b) 2NF
c) 3NF
d) BCNF
e) 4NF
f) 5NF

a) 1NF: A relation is considered to be in first normal form if all of its attributes have domains that are indivisible, or atomic.
A table is in 1NF if and only if it satisfies the following five conditions:
- There is no top-to-bottom ordering to the rows.
- There is no left-to-right ordering to the columns.
- There are no duplicate rows.
- Every row and column intersection contains exactly one value from the applicable domain.
- All columns are regular (rows have no hidden components such as row IDs or hidden timestamps).
In short, each attribute must contain only a single value from its predefined domain.

b) 2NF:
- The table is in 1NF (first normal form).
- No non-prime attribute is dependent on a proper subset of any candidate key of the table.
2NF is based on full functional dependency. An attribute that is not part of any candidate key is known as a non-prime attribute.

c) 3NF: A functional dependency is said to be transitive if it is indirectly formed by two functional dependencies. For example, X -> Z is a transitive dependency if the following three conditions hold:
- X -> Y
- Y does not determine X
- Y -> Z
A table design is said to be in 3NF if both of the following conditions hold:
- The table is in 2NF.
- No non-prime attribute is transitively dependent on any super key.
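Whether a functional dependency holds in a given set of rows can be checked mechanically, which makes the trivial, non-trivial and transitive cases above easier to see. The sketch below is illustrative only: the function names and the sample employee rows (including the `dept_id` and `dept_name` columns) are assumptions, not part of the notes. Note also that a dependency holding in sample data does not prove it holds for the schema in general; it can only be refuted by data.

```python
def holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in the given rows.

    It holds when no two rows agree on every lhs column
    but differ on some rhs column.
    """
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_trivial(lhs, rhs):
    """X -> Y is trivial when Y is a subset of X."""
    return set(rhs) <= set(lhs)


def is_transitive(rows, x, y, z):
    """X -> Z is transitive via Y if X -> Y, Y does not determine X, and Y -> Z."""
    return holds(rows, x, y) and not holds(rows, y, x) and holds(rows, y, z)


# Hypothetical sample data (assumption for illustration).
EMPLOYEES = [
    {"emp_id": 1, "emp_name": "Asha", "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 2, "emp_name": "Ravi", "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 3, "emp_name": "Asha", "dept_id": 20, "dept_name": "Audit"},
]
```

With this data, `holds(EMPLOYEES, ["emp_id"], ["emp_name"])` is true, while `holds(EMPLOYEES, ["emp_name"], ["emp_id"])` is false because two employees share a name. `is_transitive(EMPLOYEES, ["emp_id"], ["dept_id"], ["dept_name"])` is true, which is the kind of dependency that 3NF removes by moving `dept_name` into a separate department table.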
In other words, 3NF can be explained like this: a table is in 3NF if it is in 2NF and, for each functional dependency X -> Y, at least one of the following conditions holds:
- X is a super key of the table.
- Y is a prime attribute of the table.
An attribute that is a part of one of the candidate keys is known as a prime attribute.

d) BCNF: A relational schema R is considered to be in Boyce-Codd normal form (BCNF) if it is in 3NF and, for every one of its dependencies X -> Y, one of the following conditions holds:
- X -> Y is a trivial functional dependency (i.e., Y is a subset of X).
- X is a super key for schema R.
BCNF is more restrictive than 3NF. While decomposing a relation to bring it into BCNF we may lose some dependencies, i.e. BCNF does not guarantee the dependency preservation property.
Note: A relation with only two attributes is always in BCNF.

e) 4NF:
- It must meet all the requirements of BCNF (and therefore of 3NF).
- An attribute value in one or more rows should not force additional rows in the same table, which is the sign of a multi-valued dependency. The only non-trivial multi-valued dependencies allowed are those whose determinant is a super key.
Every relation in 4NF is in BCNF.

f) 5NF: Fifth normal form (5NF), also known as project-join normal form (PJ/NF), is a level of database normalization designed to reduce redundancy in relational databases recording multi-valued facts by isolating semantically related multiple relationships. A table is said to be in 5NF if and only if
every non-trivial join dependency in it is implied by the candidate keys. A join dependency *{A, B, ..., Z} on R is implied by the candidate key(s) of R if and only if each of A, B, ..., Z is a super key for R.

Entity Relationship Model Part-2

Relationship: A relationship is an association among several entities.
Relationship Set: A relationship set is a set of relationships of the same type.
Relationship Type: A relationship type defines a set of associations among entities of different entity types.

Two types of relationship constraints:
a) Cardinality Ratio (the degree of a relationship is also called cardinality)
b) Participation Constraint

a) Cardinality Ratio: Specifies the number of relationship instances that an entity can participate in. The possible cardinality ratios are: one-to-one (1:1), one-to-many (1:N), many-to-one (N:1), and many-to-many (M:N).
b) Participation Constraint: The participation constraint specifies whether the existence of an entity depends on its being related to another entity via the relationship type. There are two types of participation constraints:
1) Total Participation (existence dependency): The participation of an entity set E in a relationship set R is said to be total if every entity in E participates in at least one relationship in R. This participation is displayed as a double connecting line.
2) Partial Participation: If only some entities in E participate in relationships in R, the participation of entity set E in relationship R is said to be partial. This participation is displayed as a single connecting line.

Extended E-R Features:
1) Specialization: A top-down design process. We take a higher-level entity and add new attributes to it to produce lower-level entities. The lower-level entities inherit the characteristics of the higher-level entity. In an ER diagram, specialization is depicted by a triangle component labeled ISA. Consider an entity set person, with attributes name, street, and city. A person may be further classified as one of the
following:
- customer
- employee
2) Generalization: A bottom-up design approach. It is the union of lower-level entity types to produce a higher-level entity type.
3) Aggregation: Aggregation is a process in which a relationship between two entities is treated as a single entity. For example, the relationship between Student and Course can act as an entity in a relationship with Subject.
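Specialization maps naturally onto class inheritance, which may help when reading the person example above. The following is a minimal sketch, not a database design: the person attributes name, street and city come from the notes, while `credit_rating` and `salary` are hypothetical attributes added to the lower-level entities for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    """Higher-level entity with the attributes named in the notes."""
    name: str
    street: str
    city: str


@dataclass
class Customer(Person):
    """Lower-level entity: inherits Person attributes, adds its own."""
    credit_rating: int  # hypothetical attribute


@dataclass
class Employee(Person):
    """Lower-level entity: inherits Person attributes, adds its own."""
    salary: float  # hypothetical attribute


def attribute_names(entity_type):
    """List the attributes of an entity type, inherited ones first."""
    return [f.name for f in fields(entity_type)]
```

Here `attribute_names(Customer)` returns the three inherited person attributes followed by `credit_rating`, mirroring the ISA relationship. Read in the other direction, extracting the shared attributes of `Customer` and `Employee` into `Person` corresponds to generalization.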
Entity Relationship Model Part-1

Database Model: The logical structure of a database, which fundamentally determines the manner in which data can be stored, organized and manipulated.

1) Hierarchical Model: Data is organized in a tree-like structure, implying a single parent for each record. It allows one-to-many relationships.
2) Network Model: Allows many-to-many relationships in a graph-like structure that permits multiple parents. It organizes data using two fundamental concepts called records and sets.
3) Relational Data Model: A collection of tables to represent data and the relationship
among those data. E.g.: Oracle, Sybase.
4) Object Oriented Data Model: Data and their relationships are organized or contained in a single structure known as an object.
The hierarchical, network and relational data models are types of record-based model.

ENTITY RELATIONSHIP MODEL DESIGN

1) Entity: It is a thing or object in the real world that is distinguishable from all other objects. An entity has a set of properties, and the values for some set of properties may uniquely identify an entity.
2) Entity Set: A collection of entities all having the same properties or attributes.
3) Attributes: Each entity is described by a set of attributes (properties). Attributes are descriptive properties possessed by each member of an entity set. For each attribute, there is a set of permitted values called the domain or value set of the attribute.

Types of attributes:
1) Simple Attributes: Not divided into subparts, e.g. any unique
number.
2) Composite Attributes: Divided into subparts, e.g. Name is divided into first name, middle name and last name.
3) Single-Valued Attribute: Has a single value for a particular entity, e.g. order_id.
4) Multivalued Attribute: Has more than one value for a particular entity, e.g. Phone No.
5) Derived Attribute: The attribute value is dependent on some other attribute, e.g. Age.
Null Values: The entity doesn't have a value for the attribute.
Keys: Keys play an important role in a relational database; they are used for identifying unique rows in a table. They also establish relationships among tables.

Types of Key:
1) Primary Key
2) Composite Key
3) Super Key
4) Candidate Key
5) Secondary Key
6) Foreign Key

1) Primary key: A primary key is a column or set of columns in a table that uniquely identifies tuples (rows) in that table. A relation may contain many candidate keys. When the designer selects one of them to identify a tuple in the relation, it becomes the primary key. This means that if there is only one candidate key, it will automatically be selected as the primary key.
2) Composite key: A key that consists of two or more attributes that uniquely identify an entity occurrence is called a composite key. But any attribute that makes up the composite key is not a simple key
on its own.
3) Super key: A super key is the most general type of key. A super key is a set of one or more columns (attributes) that uniquely identifies rows in a table. Every super key is a superset of a candidate key.
4) Candidate key: A candidate key is a minimal super key, that is, a super key from which no attribute can be removed without losing uniqueness. Candidate keys are the columns or sets of columns that qualify to identify each row (tuple) uniquely. Every table must have at least one candidate key but can have several.
5) Secondary key: Out of all candidate keys, only one gets selected as the primary key; the remaining keys are known as alternate or secondary keys.
6) Foreign key: A FOREIGN KEY in one table points to a PRIMARY KEY in another table. Foreign keys act as a cross-reference between tables.
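To make the key types concrete, here is a small sketch using Python's built-in `sqlite3` module. The `department` and `employee` tables and their columns are hypothetical. In this design `emp_id` and `email` are both candidate keys; `emp_id` is chosen as the primary key, so `email` is an alternate (secondary) key, and `dept_id` is a foreign key. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, which is why that statement appears first.

```python
import sqlite3


def demo_keys():
    """Try three inserts that violate key constraints and record the outcome."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
    conn.execute(
        "CREATE TABLE department ("
        " dept_id INTEGER PRIMARY KEY,"
        " dept_name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE employee ("
        " emp_id INTEGER PRIMARY KEY,"      # chosen candidate key
        " email TEXT UNIQUE NOT NULL,"      # alternate (secondary) key
        " dept_id INTEGER NOT NULL REFERENCES department(dept_id))"  # foreign key
    )
    conn.execute("INSERT INTO department VALUES (10, 'Sales')")
    conn.execute("INSERT INTO employee VALUES (1, 'a@example.com', 10)")

    attempts = [
        ("duplicate_primary_key",
         "INSERT INTO employee VALUES (1, 'b@example.com', 10)"),
        ("duplicate_alternate_key",
         "INSERT INTO employee VALUES (2, 'a@example.com', 10)"),
        ("dangling_foreign_key",
         "INSERT INTO employee VALUES (3, 'c@example.com', 99)"),
    ]
    outcomes = {}
    for label, sql in attempts:
        try:
            conn.execute(sql)
            outcomes[label] = "accepted"
        except sqlite3.IntegrityError:
            outcomes[label] = "rejected"
    return outcomes
```

All three attempts are rejected: the first two repeat a value in a column that must be unique, and the third refers to a department that does not exist.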
Introduction of Database

Data: Facts, figures, statistics etc.
Record: Collection of related data items.
Table or Relation: Collection of related records.
Database: Collection of related relations/data.

In a database, data is organized strictly in row and column format. The columns are called Fields, Attributes or Domains. The rows are called Tuples or Records.

Features of Data in a Database:
1) Security
2) Consistency
3) Non-Redundancy
4) Shared
5) Independence
6) Persistence

DBMS (Database Management System): It is software that allows creation, definition and manipulation of a database. It is the middle layer between data and programs.

File System:
- Stores permanent records in various files.
- Needs an application program to access and manipulate data.

Disadvantages of File System:
- Data redundancy
- Data inconsistency
- Difficulty in accessing data
- Data integrity problems
- Low security

Data redundancy: Data redundancy is the repetition or superfluity of data. It is a common issue in computer data storage and database systems. This data repetition may occur either if a field is repeated in two or more tables or if the field is repeated within the table. Data can appear multiple times in a database for a variety of
reasons. A positive type of data redundancy works to safeguard data and promote consistency. Many developers consider it acceptable for data to be stored in multiple places. The key is to have a central, master field or space for this data, so that there is a way to update all of the places where data is redundant through one central access point. Otherwise, data redundancy can lead to big problems with data inconsistency, where one update does not automatically update another field. For example, a shop may have the same customer's name appearing several times if that customer has bought several different products on different dates.

Disadvantages of Data Redundancy:
1) Increases the size of the database unnecessarily.
2) Causes data inconsistency.
3) Decreases efficiency of the database.
4) May cause data corruption.

Data Isolation: The database must remain in a consistent state after any transaction. No transaction should have any adverse effect on the data residing in the database. If the database was in a consistent state before the execution of a transaction, it must remain consistent after the execution of the transaction as well. As an example, if two people are updating the same catalog item, it's not acceptable for one person's changes to be clobbered when the second person saves a different set of changes. Both users should be able to work in isolation, each working as though he or she is the only user. Each set of changes must be isolated from those of the other users.

Data Integrity is the assurance that information is unchanged from its source, and has not been accidentally (e.g. through programming errors) or maliciously (e.g. through breaches or hacks) modified, altered or destroyed. In other words, it is concerned with the completeness, soundness, and wholeness of the data, in line with the intention of the data's creators. It's a logical property of the DB, independent of the
actual data.

Data Consistency refers to the usability of the data, and the term is mostly used in a single-site environment. Even in a single-site environment, problems with data consistency may arise during recovery activities, when original data is replaced by backup copies. You have to make sure that your data remains usable while it is being backed up.

Data Abstraction: To simplify the interaction between users and the database, the DBMS hides information that is not of interest to the user. This is called data abstraction. The developer hides complexity from users and shows them an abstract view of the data.

DBMS Architecture/3-Tier Architecture:
1) External/View Level: It is the user's view of the database. This level describes the part of the database that is relevant to each user.
2) Conceptual/Logical Level: Describes what data is stored in the database and the relationships among the data. It represents:
- all entities, their attributes and their relationships
- constraints on the data
- security and integrity information
3) Physical/Internal Level: Describes how the data is stored in the database. It covers:
- storage space allocation for data and indexes
- the file system
- data compression and data encryption techniques
- record placement
Schemas: A schema is the overall description of the database. In the three-level architecture, there is one schema at each level. It does not specify relationships among files.
Instances: The collection of information stored in the database at a particular moment.
Sub-schema: It is a subset of the schema and inherits the same properties that the schema has. It is an application programmer's or user's view of the data item types and record types which he or she uses.

Data Independence in DBMS: Upper levels are unaffected by changes in lower levels. There are two types of data independence:
a) Physical Data Independence:
- Physical storage structures or devices can be changed without affecting the conceptual schema.
- Modifications are done to improve performance.
- It provides independence to the conceptual schema and
external schema.
b) Logical Data Independence:
- The conceptual schema can be changed without affecting the external schema.
- The structure of the database is altered when a modification is made to the conceptual schema.
- It provides independence to the external schema.

DBMS Components:
1) Hardware
- Processor/main memory (used for execution)
- Secondary storage devices (for physical storage)
2) Data
3) Software
4) Users
5) Procedures (set of rules for database management)

Types of Users:
a) Naive Users: End users of the database who work through menu-driven application programs, where the type and range of response is always indicated to the users.
b) Online Users: Users who may communicate with the database directly through an online terminal.
c) Application Programmers: Users who are responsible for developing the application programs.
d) DBA (Database Administrator): The DBA directs or performs all activities related to maintaining a successful database environment.

Functions of the DBA:
- Defining the conceptual schema
- Physical database design
- Tuning database performance
- Security and integrity checks
- Backup and recovery strategies
- Improving query processing performance
- Granting user access

Database Languages:
1) DDL (Data Definition Language):
- Deals with database schemas and descriptions, i.e. how the data should reside in the database.
- Used to alter/modify a database or table structure and schema.
Commands used in DDL: Create, Alter, Drop, Rename, Truncate, Comment.
2) DML (Data Manipulation Language):
- Deals with data manipulation.
- These statements affect records in a table.
Commands used in DML: Update, Select, Insert, Delete, Merge, Call,
Lock Table.
Two types of DML:
a) Procedural DML (non-declarative): specifies how the data is to be fetched.
b) Non-procedural DML (declarative): specifies what data is to be fetched.
3) DCL (Data Control Language): Controls the level of access that users have on database objects.
Commands used in DCL: Grant, Revoke.
4) Transaction Language: Controls and manages transactions to maintain the integrity of data across SQL statements.
Commands used in Transaction Language: Set Transaction, Commit, Savepoint, Rollback.
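The command families above can be seen side by side in a short script. This is a sketch using Python's built-in `sqlite3` module with a hypothetical `account` table; it is not tied to any particular DBMS from the notes. SQLite has no user accounts, so the DCL commands Grant and Revoke are not shown. The connection is opened with `isolation_level=None` so that the transaction statements are issued explicitly rather than by the driver.

```python
import sqlite3


def demo_languages():
    """Run DDL, DML and transaction commands against an in-memory database."""
    # isolation_level=None: the driver does not open transactions implicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)

    # DDL: define and then alter the table structure.
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("ALTER TABLE account ADD COLUMN owner TEXT")

    # Transaction language + DML: insert, then undo a bad update.
    conn.execute("BEGIN")
    conn.execute("INSERT INTO account (id, balance, owner) VALUES (1, 100, 'asha')")
    conn.execute("SAVEPOINT before_update")
    conn.execute("UPDATE account SET balance = balance - 500 WHERE id = 1")
    conn.execute("ROLLBACK TO before_update")  # discard only the update
    conn.execute("COMMIT")                     # keep the insert

    # DML: query the committed state.
    return conn.execute("SELECT id, balance, owner FROM account").fetchall()
```

After the script runs, the table holds the inserted row with its original balance of 100: the rollback to the savepoint undoes the update but not the insert, and the commit makes the remaining work permanent.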
Professional Knowledge Complete Theory IT Ebook For IBPS RRB SCALE-2 SBI IT IBPS IT INSURANCE SPECIALIST EXAM
Professional Knowledge Complete Theory IT Ebook For IBPS RRB SCALE-2 SBI IT IBPS IT INSURANCE SPECIALIST EXAM 2017-18 In this ebook cover complete IT theory with some previous year asked questions topic
More informationRedundancy:Dependencies between attributes within a relation cause redundancy.
Normalization Normalization: It is the process of removing redundant data from your tables in order to improve storage efficiency, data integrity and scalability. This improvement is balanced against an
More informationTechno India Batanagar Computer Science and Engineering. Model Questions. Subject Name: Database Management System Subject Code: CS 601
Techno India Batanagar Computer Science and Engineering Model Questions Subject Name: Database Management System Subject Code: CS 601 Multiple Choice Type Questions 1. Data structure or the data stored
More informationData about data is database Select correct option: True False Partially True None of the Above
Within a table, each primary key value. is a minimal super key is always the first field in each table must be numeric must be unique Foreign Key is A field in a table that matches a key field in another
More information8) A top-to-bottom relationship among the items in a database is established by a
MULTIPLE CHOICE QUESTIONS IN DBMS (unit-1 to unit-4) 1) ER model is used in phase a) conceptual database b) schema refinement c) physical refinement d) applications and security 2) The ER model is relevant
More informationSQL Interview Questions
SQL Interview Questions SQL stands for Structured Query Language. It is used as a programming language for querying Relational Database Management Systems. In this tutorial, we shall go through the basic
More informationNormalization in DBMS
Unit 4: Normalization 4.1. Need of Normalization (Consequences of Bad Design-Insert, Update & Delete Anomalies) 4.2. Normalization 4.2.1. First Normal Form 4.2.2. Second Normal Form 4.2.3. Third Normal
More informationNormalization Rule. First Normal Form (1NF) Normalization rule are divided into following normal form. 1. First Normal Form. 2. Second Normal Form
Normalization Rule Normalization rule are divided into following normal form. 1. First Normal Form 2. Second Normal Form 3. Third Normal Form 4. BCNF First Normal Form (1NF) As per First Normal Form, no
More informationB.H.GARDI COLLEGE OF MASTER OF COMPUTER APPLICATION. Ch. 1 :- Introduction Database Management System - 1
Basic Concepts :- 1. What is Data? Data is a collection of facts from which conclusion may be drawn. In computer science, data is anything in a form suitable for use with a computer. Data is often distinguished
More information1. Considering functional dependency, one in which removal from some attributes must affect dependency is called
Q.1 Short Questions Marks 1. Considering functional dependency, one in which removal from some attributes must affect dependency is called 01 A. full functional dependency B. partial dependency C. prime
More informationII B.Sc(IT) [ BATCH] IV SEMESTER CORE: RELATIONAL DATABASE MANAGEMENT SYSTEM - 412A Multiple Choice Questions.
Dr.G.R.Damodaran College of Science (Autonomous, affiliated to the Bharathiar University, recognized by the UGC)Re-accredited at the 'A' Grade Level by the NAAC and ISO 9001:2008 Certified CRISL rated
More informationDATABASE MANAGEMENT SYSTEM SHORT QUESTIONS. QUESTION 1: What is database?
DATABASE MANAGEMENT SYSTEM SHORT QUESTIONS Complete book short Answer Question.. QUESTION 1: What is database? A database is a logically coherent collection of data with some inherent meaning, representing
More informationDATA MINING TRANSACTION
DATA MINING Data Mining is the process of extracting patterns from data. Data mining is seen as an increasingly important tool by modern business to transform data into an informational advantage. It is
More informationCS403- Database Management Systems Solved MCQS From Midterm Papers. CS403- Database Management Systems MIDTERM EXAMINATION - Spring 2010
CS403- Database Management Systems Solved MCQS From Midterm Papers April 29,2012 MC100401285 Moaaz.pk@gmail.com Mc100401285@gmail.com PSMD01 CS403- Database Management Systems MIDTERM EXAMINATION - Spring
More informationSYED AMMAL ENGINEERING COLLEGE
CS6302- Database Management Systems QUESTION BANK UNIT-I INTRODUCTION TO DBMS 1. What is database? 2. Define Database Management System. 3. Advantages of DBMS? 4. Disadvantages in File Processing System.
More information0. Database Systems 1.1 Introduction to DBMS Information is one of the most valuable resources in this information age! How do we effectively and efficiently manage this information? - How does Wal-Mart
More informationDatabase Management System 9