

Code No: R05321204 Set No. 1

1. (a) Draw and explain the architecture for on-line analytical mining.
   (b) Briefly discuss the data warehouse applications. [8+8]
2. Briefly discuss the role of data cube aggregation and dimension reduction in the data reduction process. [16]
3. Write the syntax for the following data mining primitives:
   (a) Task-relevant data.
   (b) Concept hierarchies. [16]
4. Write short notes on the following in detail:
   (a) Measuring the central tendency.
   (b) Measuring the dispersion of data. [16]
5. (a) Write the FP-growth algorithm. Explain.
   (b) What is an iceberg query? Explain with an example. [10+6]
6. (a) What is classification? What is prediction?
   (b) What is Bayes' theorem? Explain Naive Bayesian classification.
   (c) Discuss k-nearest neighbor classifiers and case-based reasoning. [4+6+6]
7. (a) Given the following measurements for the variable age: 18, 22, 25, 42, 28, 43, 33, 35, 56, 28. Standardize the variable by the following:
      i. Compute the mean absolute deviation of age.
      ii. Compute the z-score for the first four measurements.
   (b) What is a distance-based outlier? What are efficient algorithms for mining distance-based outliers? How are outliers determined in this method? [4+4+2+3+3]
8. An e-mail database is a database that stores a large number of electronic mail messages. It can be viewed as a semistructured database consisting mainly of text data. Discuss the following:
   (a) How can such an e-mail database be structured so as to facilitate multidimensional search, such as by sender, by receiver, by subject, by time, and so on?
   (b) What can be mined from such an e-mail database?
   (c) Suppose you have roughly classified a set of your previous e-mail messages as junk, unimportant, normal, or important. Describe how a data mining system may take this as the training set to automatically classify new e-mail messages or unclassified ones. [5+5+6]
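The arithmetic asked for in question 7(a) of Set No. 1 can be checked with a short script. This is a minimal sketch, not a model answer: it assumes the standardization convention in which the z-score divides by the mean absolute deviation rather than the standard deviation, and the names `ages`, `mean_abs_dev`, and `z_scores` are illustrative only.

```python
# Ages taken from question 7(a) of Set No. 1.
ages = [18, 22, 25, 42, 28, 43, 33, 35, 56, 28]

# Mean of the variable.
mean_age = sum(ages) / len(ages)

# Mean absolute deviation: average of |x - mean|.
mean_abs_dev = sum(abs(x - mean_age) for x in ages) / len(ages)

# Assumption: z-score = (x - mean) / mean absolute deviation.
# Swap in the standard deviation if that convention is intended instead.
z_scores = [(x - mean_age) / mean_abs_dev for x in ages[:4]]

print(f"mean = {mean_age:.2f}")
print(f"mean absolute deviation = {mean_abs_dev:.2f}")
for x, z in zip(ages[:4], z_scores):
    print(f"age {x}: z = {z:.3f}")
```

With these ten values the mean is 33 and the mean absolute deviation is 8.8, so the second measurement (22) standardizes to -1.25.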

Code No: R05321204 Set No. 2

1. (a) Explain data mining as a step in the process of knowledge discovery.
   (b) Differentiate between operational database systems and data warehousing. [8+8]
2. (a) Briefly discuss data integration.
   (b) Briefly discuss data transformation. [8+8]
3. (a) Explain the syntax for task-relevant data specification.
   (b) Explain the syntax for specifying the kind of knowledge to be mined. [8+8]
4. (a) Write the algorithm for attribute-oriented induction. Explain the steps involved in it.
   (b) How can concept description mining be performed incrementally and in a distributed manner? [8+8]
5. Explain the Apriori algorithm with an example. [16]
6. Discuss backpropagation classification. [16]
7. (a) Write algorithms for k-means and k-medoids. Explain.
   (b) Discuss density-based methods. [8+8]
8. Suppose that a city transportation department would like to perform data analysis on highway traffic for the planning of highway construction, based on the city traffic data collected at different hours every day.
   (a) Design a spatial data warehouse that stores the highway traffic information so that people can easily see the average and peak time traffic flow by highway, by time of day, and by weekdays, and the traffic situation when a major accident occurs.
   (b) What information can we mine from such a spatial data warehouse to help city planners?
   (c) This data warehouse contains both spatial and temporal data. Propose one mining technique that can efficiently mine interesting patterns from such a spatio-temporal data warehouse. [5+5+6]
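Question 5 of Set No. 2 asks for the Apriori algorithm with an example. The following is a compact sketch of the level-wise join, prune, and count steps. The transaction data and the minimum support count of 3 are made up for illustration, and the function name `apriori` is hypothetical rather than taken from any library.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: scan the transactions for each surviving candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical market-basket data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]
frequent_itemsets = apriori(transactions, min_support=3)
for itemset, count in sorted(frequent_itemsets.items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

On this toy data the only 3-itemset that survives pruning is {bread, milk, diaper}, and it is then rejected because it occurs in just two transactions, so the result contains four 1-itemsets and four 2-itemsets.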

Code No: R05321204 Set No. 3

1. (a) Explain the major issues in data mining.
   (b) Explain the three-tier data warehousing architecture. [8+8]
2. Discuss the role of data compression and numerosity reduction in the data reduction process. [16]
3. Write the syntax for the following data mining primitives:
   (a) The kind of knowledge to be mined.
   (b) Measures of pattern interestingness. [16]
4. (a) What are the differences between concept description in large databases and OLAP?
   (b) Explain the graph displays of basic statistical class description. [8+8]
5. Explain the Apriori algorithm with an example. [16]
6. (a) Describe the data classification process with a neat diagram.
   (b) How does Naive Bayesian classification work? Explain.
   (c) Explain classifier accuracy. [5+5+6]
7. (a) Given two objects represented by the tuples (22,1,42,10) and (20,0,36,8):
      i. Compute the Euclidean distance between the two objects.
      ii. Compute the Manhattan distance between the two objects.
      iii. Compute the Minkowski distance between the two objects, using q=3.
   (b) Explain statistical-based outlier detection and deviation-based outlier detection. [3+3+4+3+3]
8. Explain the following:
   (a) Construction and mining of object cubes
   (b) Mining associations in multimedia data
   (c) Periodicity analysis
   (d) Latent semantic indexing. [4+4+4+4]
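The three distances in question 7(a) of Set No. 3 are all cases of the Minkowski distance, with q=1 giving Manhattan and q=2 giving Euclidean. The sketch below computes them for the two tuples in the question. The helper name `minkowski` is illustrative and not a reference to any particular library.

```python
def minkowski(x, y, q):
    """Minkowski distance of order q between two equal-length tuples."""
    if len(x) != len(y):
        raise ValueError("objects must have the same number of attributes")
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)

# Tuples taken from question 7(a).
obj1 = (22, 1, 42, 10)
obj2 = (20, 0, 36, 8)

manhattan = minkowski(obj1, obj2, 1)   # sum of |differences|
euclidean = minkowski(obj1, obj2, 2)   # square root of the sum of squares
minkowski3 = minkowski(obj1, obj2, 3)  # cube root of the sum of cubes

print(f"Manhattan       = {manhattan:.3f}")
print(f"Euclidean       = {euclidean:.3f}")
print(f"Minkowski (q=3) = {minkowski3:.3f}")
```

The attribute differences are 2, 1, 6, and 2, so the Manhattan distance is 11, the Euclidean distance is the square root of 45, and the q=3 distance is the cube root of 233.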

Code No: R05321204 Set No. 4

1. (a) Explain data mining as a step in the process of knowledge discovery.
   (b) Differentiate between operational database systems and data warehousing. [8+8]
2. (a) Briefly discuss data integration.
   (b) Briefly discuss data transformation. [8+8]
3. (a) Describe why it is important to have a data mining query language.
   (b) The four major types of concept hierarchies are schema hierarchies, set-grouping hierarchies, operation-derived hierarchies, and rule-based hierarchies. Briefly define each type of hierarchy. [8+8]
4. Write short notes on the following in detail:
   (a) Measuring the central tendency.
   (b) Measuring the dispersion of data. [16]
5. (a) How can we mine multilevel association rules efficiently using concept hierarchies? Explain.
   (b) Can we design a method that mines the complete set of frequent itemsets without candidate generation? If yes, explain with an example. [8+8]
6. (a) Explain the basic decision tree induction algorithm.
   (b) Discuss Bayesian classification. [8+8]
7. (a) Given two objects represented by the tuples (22,1,42,10) and (20,0,36,8):
      i. Compute the Euclidean distance between the two objects.
      ii. Compute the Manhattan distance between the two objects.
      iii. Compute the Minkowski distance between the two objects, using q=3.
   (b) Explain statistical-based outlier detection and deviation-based outlier detection. [3+3+4+3+3]
8. (a) Give an example of generalization-based mining of plan databases by divide-and-conquer.
   (b) What is sequential pattern mining? Explain.
   (c) Explain the construction of a multilayered web information base. [8+4+4]