Advances in Data Networks, Communications, Computers and Materials

Size: px
Start display at page:

Download "Advances in Data Networks, Communications, Computers and Materials"

Transcription

1 Multi-dimensional Data Model of Textual Information PAVEL ČECH Department of Information Technologies University of Hradec Kralove Rokitanského 62, Hradec Králové, CZECH REPUBLIC Abstract: - The key to understanding the meaning is in the proper representation of the textual information. The tree like representation of a text is not very convenient for large scale processing since the information about types of phrases, part of speech tags and actual terms are scattered in various tree nodes. The aim of the article is to propose a multidimensional model of textual data that would contain all the lexical information in a structure that enables further semantic processing. The approach in this article is based on the data modeling principles used in data warehousing. Key-Words: - Multi-dimensional data model, data warehousing, star schema, text representation, lexical parsing. 1 Introduction Information retrieval and text mining have already attracted attention of many researchers. The major challenge in these areas is the understanding of meaning reposed in the text. The key to understanding the meaning is in the proper representation of the textual information. Today, it is known that the traditional term based representation of documents disregards the semantic differences contained in connections between the terms [29]. The novel concept based approaches utilize lexical parsing to extract semantics. Lexical parsing usually results in a tree like representation of a text (such as sentence). Such structure is however not very convenient for further large scale processing. The information about types of phrases, part of speech tags and actual terms are scattered in various levels of the tree depending on the text complexity. The aim of the article is to propose a multidimensional model of textual data that would contain all the lexical information in a structure that enables further semantic processing. The approach in this article is based on the data modeling principles used in data warehousing [9, 10, 12]. The supposed major advantage of the proposed approach is in a relational calculus and standardized query language for fast and flexible retrieving of lexical information. Hence, suggested approach would enable easy exploration of the text using the various information retrieval and text mining techniques such as term proximity calculation, n-gram determination, Markov model application and Formal Concept Analysis. The approach also addresses a problem stated by Castells et al that once the replacement of documents by bag of knowledge inevitably implies a loss of information [5]. Using this approach the documents can be fully reconstructed and thus new perspectives of the text could be reflected. The paper presents the description of the proposed multidimensional model and a query possibilities as well as the verification of the approach on a sample set of approx. 200 documents. 2 Related work There are various data models being used in the field of information retrieval and text mining. Basically, the two broad categories can be identified. The first category is term-based in which the vector-space model is being used [6, 17, 19, 20, 25]. The second category is concept or semantic based that utilizes ontology and a graph or tree like structures [5, 7, 11, 14, 16, 21]. There are also approaches that entail dimensionality or multidimensionality [27, 28, 29]. For example, Tseng proposed a multidimensional indexing structure, called D-tree [28], and corresponding query expressions MD2X (Multi- Dimensional Document Expressions) for document warehouses [27]. Tseng characterizes documents as a set of keywords and defines dimensions as a tree structure representing the hierarchical relations ISBN:

2 between keywords. The query expression MD 2 X is an extension to MDX (Multi-Dimensional expressions) that is designed to include complete set of SQL-like constructs so that to enable the traditional OLAP (Online Analytical Processing) operation such as drill up and drill down on the above defined dimensions. Similarly, Urbain et al designed a dimensional information retrieval model for genomics literature search. The model combines concept-based semantics and term statistics within multiple levels of document context in order to identify concise, variable length passages of text that answer a user query [29]. Their approach is based on a parallel dimensional indexing model in which the term index is used for identification of terms related to biological concepts and a concept index is used to search for resolved concepts. Furtehr, de Mazière et al [6] analyzed documents and organized them using special visualization techniques. In order to properly categorize the documents, they employed lexical parsing, n-gram finding and subsequent vectorization to use TF-IDF technique and clustering. There also works that refer directly to data/web warehousing concepts. Thus, the issues of warehousing web data have been discussed in [1, 2, 3, 4] in which the concept of a Web Information Coupling and the WHOWEDA (Warehouse of Web Data) project have been introduced. The elaborated schema was a graph structure consisting of nodes representing web pages and links representing hyperlinks. They provided formal description of the schema and algebra with manipulation operators to query/navigate among interconnected pages. Further, Fuller et al constructed an entity retrieval model for semi-structured information found in distributed and heterogeneous data sources such as world wide web or data of social networks [7]. In their approach they regard documents (HTML pages embedding RDF) to contain a set of statements describing one or more entities. The model use consists of entities and entity descriptions that form a graph-like structure in three levels. It has to be noted, that none but one of the reviewed articles discuss the multidimensional model as it is being declared in the data warehousing concepts and principles. The multidimensionality refers rather to a vector space or the warehouse represents just the collection of data ignoring the subject orientation and other characteristics. Predominant data model is a graph like or tree like structure that are close to how ontologies are organized. The proper application of data warehousing principles is found only in Urbain et al [29]. Urbain, however, does not refer to parts of the speech and phrases as dimensions and also the links between words are not considered. 3 Approach The approach in this paper is based on the adoption of the data warehousing principles widely implemented in the so called business intelligence [15, 30]. Originally, data warehouse referred to a large volume data repository that enabled informed decision making of high level business managers. The term data warehouse was coined and promoted by Bill Inmon, who is also the author of commonly accepted definition of a data warehouse as a subject oriented, nonvolatile, integrated, time variant collection of data in support of management's decisions [10]. Correspondingly, the data warehousing can be regarded as a collection of methods and principles of how to build and operate data warehouse. There are various approaches of how to realize the data warehouse using a relational database [10, 12, 13, 18, 24]. However, the core concept is a multidimensional data model or schema that is, in its simple form, often referred to as a star schema [9, 12]. The dimensions represent entities or perspectives through which the data should be viewed or analyzed. Dimensions are organized around particular measures (usually numerical data such as sales). In relational databases the dimensions and measures are represented as tables (dimension tables and fact tables respectively). Unlike in traditional operational databases the data in data warehouse might not be normalized [9, 12]. The variant of the star schema, in which the dimensions are normalized, requires the data to be separated into several interrelated dimension that form a structure similar to a snowflake. This is used in cases in which data in dimensions can form a hierarchy (e.g. time dimension can be split into days, months, years or other time periods or location can be split into city, country, etc.). However, querying data stored in the snowflake schema is computationally more expensive since it entails more table joins. In the application of a multidimensional model to textual data, the dimensions can be formed from the various lexical elements and structures (such as ISBN:

3 words, sentences, documents and also parts of speech tags, etc. The measures contain occurrences of entities in dimensions. Thus the occurrence can refer to a particular occurrence of a certain word in a given sentence from a given document. Populating the model would require some lexical parsing that might be achieved using the various tools and libraries such as Apache OpenNLP [26], Stanford NLP [23] or GATE [22]. 3.1 Proposed multi-dimensional data model The proposed model consists of the following dimensions (see fig. 1): Fig.1: multi-dimensional data model Words individual words; serves as a unique collection of all words indentified; POS parts of the speech tags of words (such as noun, adjective, etc.); for example the Penn Treebank [8] tagset is suggested to be used; Phrases phrases that the words are parts of (such as noun phrase or prepositional phrase); optional if no information about the phrase would be stored in that case the phrase would be stored in the fact table; Sentences sentences as containing the words and are parts of the paragraphs and documents; optional if no information about the sentence would be stored in that case the sentence would be captured as order in document or paragraph in the fact table; Paragraphs paragraphs as consisting of sentences; the paragraphs would serve for providing context and priority (e.g. paragraph with title might have greater weight than normal text). This dimension is optional if in some documents paragraphs might not be identified. Documents documents such as pages are characterized by the title and location i.e. URL in case of pages or path in case of files in a document repository. Word_Links links between consecutive words; the link is realized between two word occurrences; this dimension might not be necessary since the orders of the various parts are stored, thus when word occurrences are properly ordered the next word occurrence represents the next word. However, linking the words would allow easier manipulation using a relation calculus in which there are no operators for next or previous records. The Documents, Paragraphs, Sentences dimensions allow construction of a hierarchy, however, that would be more computationally demanding. Hence, the star schema is followed linking all dimensions directly to the fact table. Still, the dimensions are suggested to be normalized it means there will be no redundancy. In case the frequency of sentences and phrases are to be calculated a new fact tables for sentence or phrase occurrence might be created. However, the same information might be stored in the word occurrences table. If Word_Links are normalized it would be characterized by link frequency that would be increased whenever the same consecutive pair of words is encountered. This enables pattern identification based e.g. on n-gram computation. In a simple form the model is suggested to contain one fact table called Words_Occurrences that would connect the data from the above mentioned dimensions. In relational databases the connection is realized using the foreign keys concept. Apart from the foreign keys the fact table would contain the word_order that is the order of a given word in a sentence. 3.2 Querying proposed data model The proposed data model can be queried using the standardized SQL (Structured Query Language) or ISBN:

4 MDX (Multi-Dimensional expressions). This would allow for easy manipulation with data since various views can be defined and also the off-the-shelf OLAP tools might be used to analyze the data. Thus, for example an OLAP cube might be constructed above the data to compute aggregates and to allow drill down and drill down operations. The following SQL statement might be used to reconstruct the whole text enriched with part of speech tags and phrase types: SELECT w.word, pos.tag, phr.type wo.document_id, wo.word_order, wo.sentence_order FROM dbo.tf_word_occurrences AS wo INNER JOIN dbo.td_words AS w ON wo.word_id = w.id INNER JOIN dbo.td_pos AS pos ON wo.pos_id = pos.id INNER JOIN dbo.td_phrases AS phr ON wo.phrase_id = phr.id ORDER BY wo.document_id, wo.sentence_order, wo.word_order There is also a possibility to identify word pairs. The following SQL statement would identify all adjective noun pairs: SELECT v1.word AS adjective, v2.word AS noun, FROM dbo.vx_wo_words_pos AS v1 INNER JOIN dbo.vx_wo_words_pos AS v2 ON v1.wo_after_id = v2.id WHERE (v1.tag = 'JJ') AND (v2.tag = 'NN') The expression v1.wo_after_id = v2.id has been realized using the Word_Links dimension. The frequency of the above identified adjective noun pairs in documents can be computed as follows: SELECT document_title, adjective, noun, count(*) counts FROM vl_adjectives_nouns an INNER JOIN td_documents d ON d.id = an.document_id GROUP BY title, adjective, noun HAVING count(*) > 1 ORDER BY title, counts DESC The result set from the above SQL statements is limited only to adjective noun pairs that appear in a given document at least twice. The model enables for even more complex patterns to be analyzed. For instance, in the formation of concept hierarchies the is a pattern provides a good clue to identify sub and super concepts. The search expression between word w1 that is a noun and a word w2 that is also noun and a phrase p that is is a can be formulated as: w1.tag='nn' and p.text='is a' and w2.tag='nn' The pattern might be extended to include also adjectives that would further specify the concepts. 4 Verification The verification of suggested approach has been conducted on the corpus of 180 documents or pages from Wikipedia. In that corpus over 40 thousands unique words were identified with more than 400 thousand word occurrences in 19 thousand sentences. In order to analyze the documents, the suggested multidimensional model based on a star schema was created. Lexical parsing have been provided using OpenNLP framework [26]. During the lexical parsing the paragraphs were not identified and thus the Paragraphs dimension was omitted. Similarly, the sentence order in the document and word links between successive words were stored as a part of the word occurrence fact table. The model enables the easy calculation of various aggregations. The Table 1 presents a sample of noun counts for selected documents. Document title Counts Evolutionary programming 68 Candidate solution 76 Genetic representation 84 Chromosome (genetic algorithm) 89 Business intelligence tools 91 Social behavior 93 Table 1: Count of nouns in selected documents ISBN:

5 More complicated aggregations are also possible. Using the above presented SQL statements the table 2 shows the frequency of keywords consisting of an adjective and noun in a document titled Computational science. Keywords Counts computational science 12 scientific computing 5 numerical analysis 3 computational engineering 2 scientific computation 2 Table 2: Frequency of adjective noun pairs Computing the frequency of keywords can serve as a basis for the TF-IDF calculations and for other analysis. 5 Conclusion The multi-dimensional data model appeared to provide fast and flexible way of retrieving lexical information. The query mechanism is based on the standardized query language and provides for further semantic analysis of the text. The major disadvantage of the approach is the extent of the data. The possibility to fully reconstruct the analyzed text is obviously at the expense of the disk space. However, in space saving alternative mainly the word occurrence fact table would grow proportionally with the length of the text analyzed given that there is relatively small number of unique words that are used in a language. References: [1] S. Bhowmick, S. Madria, W. Ng, and E. Lim, "Web Warehousing: Design and Issues Advances in Database Technologies." vol. 1552, Y. Kambayashi, D.-L. Lee, E.-p. Lim, M. Mohania, and Y. Masunaga, Eds., ed: Springer Berlin / Heidelberg, 1999, pp [2] S. Bhowmick, W.-K. Ng, and E.-P. Lim, "Information Coupling in Web Databases Conceptual Modeling ER 98." vol. 1507, T.- W. Ling, S. Ram, and M. Li Lee, Eds., ed: Springer Berlin / Heidelberg, 1998, pp [3] Y. Cao, E.-P. Lim, and W.-K. Ng, "On Warehousing Historical Web Information Conceptual Modeling ER 2000." vol. 1920, A. Laender, S. Liddle, and V. Storey, Eds., ed: Springer Berlin / Heidelberg, 2000, pp [4] Y. Cao, E.-P. Lim, and W.-K. Ng, "Data model for warehousing historical Web information," Information and Software Technology, vol. 45, pp , [5] P. Castells, M. Fernandez, and D. Vallet, "An Adaptation of the Vector-Space Model for Ontology-Based Information Retrieval," IEEE Trans. on Knowl. and Data Eng., vol. 19, pp , [6] P. A. De Mazière and M. M. Van Hulle, "A clustering study of a 7000 EU document inventory using MDS and SOM," Expert Systems with Applications, vol. 38, pp , [7] C. M. Fuller, D. P. Biros, and D. Delen, "An investigation of data and text mining methods for real world deception detection," Expert Systems with Applications, vol. 38, pp , [8] J. Hajič, "Treebanks and Tagsets," in Encyclopedia of Language & Linguistics (Second Edition), B. Editor-in-Chief: Keith, Ed., ed Oxford: Elsevier, 2006, pp [9] J. Han, M. Kamber, and J. Pei, "Data Warehousing and Online Analytical Processing," in Data Mining (Third Edition), ed Boston: Morgan Kaufmann, 2012, pp [10] W. H. Inmon, Building the Data Warehouse.: Wiley and Sons, [11] B.-Y. Kang, D.-W. Kim, and S.-J. Lee, "Exploiting concept clusters for content-based information retrieval," Information Sciences, vol. 170, pp , [12] R. Kimball and R. Margy, The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, 2nd ed.: Wiley, [13] S. Lightstone, T. Teorey, and T. Nadeau, "Physical design for decision support, warehousing, and OLAP," in Physical Database Design, ed Burlington: Morgan Kaufmann, 2007, pp [14] H.-T. Lin, N.-W. Chi, and S.-H. Hsieh, "A concept-based information retrieval approach for engineering domain-specific technical documents," Advanced Engineering Informatics, vol. 26, pp , [15] D. Loshin, Business Intelligence. San Francisco: Morgan Kaufmann, [16] Q. Luo, E. Chen, and H. Xiong, "A semantic term weighting scheme for text categorization," Expert Systems with Applications, vol. 38, pp , ISBN:

6 [17] W. Mao and W. W. Chu, "The phrase-based vector space model for automatic retrieval of free-text medical documents," Data & Knowledge Engineering, vol. 61, pp , [18] S. T. March and A. R. Hevner, "Integrated decision support systems: A data warehousing perspective," Decision Support Systems, vol. 43, pp , [19] S. Robertson, "Understanding inverse document frequency: on theoretical arguments for IDF," Journal of Documentation, vol. 60, pp , [20] G. Salton, "Term-weighting approaches in automatic text retrieval," Information Processing & Management, vol. 24 pp , [21] R. Setchi, Q. Tang, and I. Stankov, "Semanticbased information retrieval in support of concept design," Advanced Engineering Informatics, vol. 25, pp , [22] T. U. o. Sheffield. (2012, 7 July, 2012). GATE [23] T. Stanford. (2012, 7 July, 2012). Stanford NLP Group. Available: [24] A. Subramanian, L. Douglas Smith, A. C. Nelson, J. F. Campbell, and D. A. Bird, "Strategic planning for data warehousing," Information & Management, vol. 33, pp , [25] X. Tai, F. Ren, and K. Kita, "An information retrieval model based on vector space method by supervised learning," Information Processing & Management, vol. 38, pp , [26] The Apache Software Foundation. (2012, 7 July, 2012). Apache OpenNLP. Available: [27] F. S. C. Tseng, "Design of a multi-dimensional query expression for document warehouses," Information Sciences, vol. 174, pp , [28] F. S. C. Tseng and W.-P. Lin, "D-Tree: a multidimensional indexing structure for constructing document warehouses," Journal of Information Science and Engineering, vol. 22, pp , [29] J. Urbain, N. Goharian, and O. Frieder, "A dimensional retrieval model for integrating semantics and statistical evidence in context for genomics literature search," Computers in Biology and Medicine, vol. 39, pp , [30] S. Williams and N. Williams, The Profit Impact of Business Intelligence. San Francisco: Morgan Kaufmann, ISBN:

A Novel Approach of Data Warehouse OLTP and OLAP Technology for Supporting Management prospective

A Novel Approach of Data Warehouse OLTP and OLAP Technology for Supporting Management prospective A Novel Approach of Data Warehouse OLTP and OLAP Technology for Supporting Management prospective B.Manivannan Research Scholar, Dept. Computer Science, Dravidian University, Kuppam, Andhra Pradesh, India

More information

Novel Materialized View Selection in a Multidimensional Database

Novel Materialized View Selection in a Multidimensional Database Graphic Era University From the SelectedWorks of vijay singh Winter February 10, 2009 Novel Materialized View Selection in a Multidimensional Database vijay singh Available at: https://works.bepress.com/vijaysingh/5/

More information

Data warehouse architecture consists of the following interconnected layers:

Data warehouse architecture consists of the following interconnected layers: Architecture, in the Data warehousing world, is the concept and design of the data base and technologies that are used to load the data. A good architecture will enable scalability, high performance and

More information

1. Inroduction to Data Mininig

1. Inroduction to Data Mininig 1. Inroduction to Data Mininig 1.1 Introduction Universe of Data Information Technology has grown in various directions in the recent years. One natural evolutionary path has been the development of the

More information

Shrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India

Shrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 3 ISSN : 2456-3307 Some Issues in Application of NLP to Intelligent

More information

Fast and Effective System for Name Entity Recognition on Big Data

Fast and Effective System for Name Entity Recognition on Big Data International Journal of Computer Sciences and Engineering Open Access Research Paper Volume-3, Issue-2 E-ISSN: 2347-2693 Fast and Effective System for Name Entity Recognition on Big Data Jigyasa Nigam

More information

Fig 1.2: Relationship between DW, ODS and OLTP Systems

Fig 1.2: Relationship between DW, ODS and OLTP Systems 1.4 DATA WAREHOUSES Data warehousing is a process for assembling and managing data from various sources for the purpose of gaining a single detailed view of an enterprise. Although there are several definitions

More information

Data Mining. Data warehousing. Hamid Beigy. Sharif University of Technology. Fall 1394

Data Mining. Data warehousing. Hamid Beigy. Sharif University of Technology. Fall 1394 Data Mining Data warehousing Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1394 1 / 22 Table of contents 1 Introduction 2 Data warehousing

More information

Database design View Access patterns Need for separate data warehouse:- A multidimensional data model:-

Database design View Access patterns Need for separate data warehouse:- A multidimensional data model:- UNIT III: Data Warehouse and OLAP Technology: An Overview : What Is a Data Warehouse? A Multidimensional Data Model, Data Warehouse Architecture, Data Warehouse Implementation, From Data Warehousing to

More information

Data Warehousing and OLAP Technologies for Decision-Making Process

Data Warehousing and OLAP Technologies for Decision-Making Process Data Warehousing and OLAP Technologies for Decision-Making Process Hiren H Darji Asst. Prof in Anand Institute of Information Science,Anand Abstract Data warehousing and on-line analytical processing (OLAP)

More information

Chapter 3. The Multidimensional Model: Basic Concepts. Introduction. The multidimensional model. The multidimensional model

Chapter 3. The Multidimensional Model: Basic Concepts. Introduction. The multidimensional model. The multidimensional model Chapter 3 The Multidimensional Model: Basic Concepts Introduction Multidimensional Model Multidimensional concepts Star Schema Representation Conceptual modeling using ER, UML Conceptual modeling using

More information

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) CONTEXT SENSITIVE TEXT SUMMARIZATION USING HIERARCHICAL CLUSTERING ALGORITHM

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) CONTEXT SENSITIVE TEXT SUMMARIZATION USING HIERARCHICAL CLUSTERING ALGORITHM INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & 6367(Print), ISSN 0976 6375(Online) Volume 3, Issue 1, January- June (2012), TECHNOLOGY (IJCET) IAEME ISSN 0976 6367(Print) ISSN 0976 6375(Online) Volume

More information

This tutorial will help computer science graduates to understand the basic-to-advanced concepts related to data warehousing.

This tutorial will help computer science graduates to understand the basic-to-advanced concepts related to data warehousing. About the Tutorial A data warehouse is constructed by integrating data from multiple heterogeneous sources. It supports analytical reporting, structured and/or ad hoc queries and decision making. This

More information

Enabling Off-Line Business Process Analysis: A Transformation-Based Approach

Enabling Off-Line Business Process Analysis: A Transformation-Based Approach Enabling Off-Line Business Process Analysis: A Transformation-Based Approach Arnon Sturm Department of Information Systems Engineering Ben-Gurion University of the Negev, Beer Sheva 84105, Israel sturm@bgu.ac.il

More information

ijade Reporter An Intelligent Multi-agent Based Context Aware News Reporting System

ijade Reporter An Intelligent Multi-agent Based Context Aware News Reporting System ijade Reporter An Intelligent Multi-agent Based Context Aware Reporting System Eddie C.L. Chan and Raymond S.T. Lee The Department of Computing, The Hong Kong Polytechnic University, Hung Hong, Kowloon,

More information

DHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI

DHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI DHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI Department of Information Technology IT6702 Data Warehousing & Data Mining Anna University 2 & 16 Mark Questions & Answers Year / Semester: IV / VII Regulation:

More information

Data Mining Concepts & Techniques

Data Mining Concepts & Techniques Data Mining Concepts & Techniques Lecture No. 01 Databases, Data warehouse Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro

More information

Hierarchies in a multidimensional model: From conceptual modeling to logical representation

Hierarchies in a multidimensional model: From conceptual modeling to logical representation Data & Knowledge Engineering 59 (2006) 348 377 www.elsevier.com/locate/datak Hierarchies in a multidimensional model: From conceptual modeling to logical representation E. Malinowski *, E. Zimányi Department

More information

Rocky Mountain Technology Ventures

Rocky Mountain Technology Ventures Rocky Mountain Technology Ventures Comparing and Contrasting Online Analytical Processing (OLAP) and Online Transactional Processing (OLTP) Architectures 3/19/2006 Introduction One of the most important

More information

STRATEGIC INFORMATION SYSTEMS IV STV401T / B BTIP05 / BTIX05 - BTECH DEPARTMENT OF INFORMATICS. By: Dr. Tendani J. Lavhengwa

STRATEGIC INFORMATION SYSTEMS IV STV401T / B BTIP05 / BTIX05 - BTECH DEPARTMENT OF INFORMATICS. By: Dr. Tendani J. Lavhengwa STRATEGIC INFORMATION SYSTEMS IV STV401T / B BTIP05 / BTIX05 - BTECH DEPARTMENT OF INFORMATICS LECTURE: 05 (A) DATA WAREHOUSING (DW) By: Dr. Tendani J. Lavhengwa lavhengwatj@tut.ac.za 1 My personal quote:

More information

Data Warehouse and Data Mining

Data Warehouse and Data Mining Data Warehouse and Data Mining Lecture No. 04-06 Data Warehouse Architecture Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology

More information

A Hybrid Unsupervised Web Data Extraction using Trinity and NLP

A Hybrid Unsupervised Web Data Extraction using Trinity and NLP IJIRST International Journal for Innovative Research in Science & Technology Volume 2 Issue 02 July 2015 ISSN (online): 2349-6010 A Hybrid Unsupervised Web Data Extraction using Trinity and NLP Anju R

More information

Data-Driven Driven Business Intelligence Systems: Parts I. Lecture Outline. Learning Objectives

Data-Driven Driven Business Intelligence Systems: Parts I. Lecture Outline. Learning Objectives Data-Driven Driven Business Intelligence Systems: Parts I Week 5 Dr. Jocelyn San Pedro School of Information Management & Systems Monash University IMS3001 BUSINESS INTELLIGENCE SYSTEMS SEM 1, 2004 Lecture

More information

The Near Greedy Algorithm for Views Selection in Data Warehouses and Its Performance Guarantees

The Near Greedy Algorithm for Views Selection in Data Warehouses and Its Performance Guarantees The Near Greedy Algorithm for Views Selection in Data Warehouses and Its Performance Guarantees Omar H. Karam Faculty of Informatics and Computer Science, The British University in Egypt and Faculty of

More information

Information Modeling and Relational Databases

Information Modeling and Relational Databases Information Modeling and Relational Databases Second Edition Terry Halpin Neumont University Tony Morgan Neumont University AMSTERDAM» BOSTON. HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO

More information

A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2

A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2 A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2 1 Department of Electronics & Comp. Sc, RTMNU, Nagpur, India 2 Department of Computer Science, Hislop College, Nagpur,

More information

GUJARAT TECHNOLOGICAL UNIVERSITY MASTER OF COMPUTER APPLICATIONS (MCA) Semester: IV

GUJARAT TECHNOLOGICAL UNIVERSITY MASTER OF COMPUTER APPLICATIONS (MCA) Semester: IV GUJARAT TECHNOLOGICAL UNIVERSITY MASTER OF COMPUTER APPLICATIONS (MCA) Semester: IV Subject Name: Elective I Data Warehousing & Data Mining (DWDM) Subject Code: 2640005 Learning Objectives: To understand

More information

Data Mining. Data warehousing. Hamid Beigy. Sharif University of Technology. Fall 1396

Data Mining. Data warehousing. Hamid Beigy. Sharif University of Technology. Fall 1396 Data Mining Data warehousing Hamid Beigy Sharif University of Technology Fall 1396 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1396 1 / 31 Table of contents 1 Introduction 2 Data warehousing

More information

Question Bank. 4) It is the source of information later delivered to data marts.

Question Bank. 4) It is the source of information later delivered to data marts. Question Bank Year: 2016-2017 Subject Dept: CS Semester: First Subject Name: Data Mining. Q1) What is data warehouse? ANS. A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile

More information

DATA WAREHOUSING IN LIBRARIES FOR MANAGING DATABASE

DATA WAREHOUSING IN LIBRARIES FOR MANAGING DATABASE DATA WAREHOUSING IN LIBRARIES FOR MANAGING DATABASE Dr. Kirti Singh, Librarian, SSD Women s Institute of Technology, Bathinda Abstract: Major libraries have large collections and circulation. Managing

More information

WKU-MIS-B10 Data Management: Warehousing, Analyzing, Mining, and Visualization. Management Information Systems

WKU-MIS-B10 Data Management: Warehousing, Analyzing, Mining, and Visualization. Management Information Systems Management Information Systems Management Information Systems B10. Data Management: Warehousing, Analyzing, Mining, and Visualization Code: 166137-01+02 Course: Management Information Systems Period: Spring

More information

CS614 - Data Warehousing - Midterm Papers Solved MCQ(S) (1 TO 22 Lectures)

CS614 - Data Warehousing - Midterm Papers Solved MCQ(S) (1 TO 22 Lectures) CS614- Data Warehousing Solved MCQ(S) From Midterm Papers (1 TO 22 Lectures) BY Arslan Arshad Nov 21,2016 BS110401050 BS110401050@vu.edu.pk Arslan.arshad01@gmail.com AKMP01 CS614 - Data Warehousing - Midterm

More information

ETL and OLAP Systems

ETL and OLAP Systems ETL and OLAP Systems Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master studies, first semester

More information

Big Trend in Business Intelligence: Data Mining over Big Data Web Transaction Data. Fall 2012

Big Trend in Business Intelligence: Data Mining over Big Data Web Transaction Data. Fall 2012 Big Trend in Business Intelligence: Data Mining over Big Data Web Transaction Data Fall 2012 Data Warehousing and OLAP Introduction Decision Support Technology On Line Analytical Processing Star Schema

More information

Chapter 27 Introduction to Information Retrieval and Web Search

Chapter 27 Introduction to Information Retrieval and Web Search Chapter 27 Introduction to Information Retrieval and Web Search Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 27 Outline Information Retrieval (IR) Concepts Retrieval

More information

: How does DSS data differ from operational data?

: How does DSS data differ from operational data? by Daniel J Power Editor, DSSResources.com Decision support data used for analytics and data-driven DSS is related to past actions and intentions. The data is a historical record and the scale of data

More information

Extracting knowledge from Ontology using Jena for Semantic Web

Extracting knowledge from Ontology using Jena for Semantic Web Extracting knowledge from Ontology using Jena for Semantic Web Ayesha Ameen I.T Department Deccan College of Engineering and Technology Hyderabad A.P, India ameenayesha@gmail.com Khaleel Ur Rahman Khan

More information

Data Mining. Data warehousing. Hamid Beigy. Sharif University of Technology. Fall 1394

Data Mining. Data warehousing. Hamid Beigy. Sharif University of Technology. Fall 1394 Data Mining Data warehousing Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1394 1 / 31 Table of contents 1 Introduction 2 Data warehousing

More information

Evolution of Database Systems

Evolution of Database Systems Evolution of Database Systems Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Intelligent Decision Support Systems Master studies, second

More information

Text Mining: A Burgeoning technology for knowledge extraction

Text Mining: A Burgeoning technology for knowledge extraction Text Mining: A Burgeoning technology for knowledge extraction 1 Anshika Singh, 2 Dr. Udayan Ghosh 1 HCL Technologies Ltd., Noida, 2 University School of Information &Communication Technology, Dwarka, Delhi.

More information

OLAP Introduction and Overview

OLAP Introduction and Overview 1 CHAPTER 1 OLAP Introduction and Overview What Is OLAP? 1 Data Storage and Access 1 Benefits of OLAP 2 What Is a Cube? 2 Understanding the Cube Structure 3 What Is SAS OLAP Server? 3 About Cube Metadata

More information

Data Warehouse Testing. By: Rakesh Kumar Sharma

Data Warehouse Testing. By: Rakesh Kumar Sharma Data Warehouse Testing By: Rakesh Kumar Sharma Index...2 Introduction...3 About Data Warehouse...3 Data Warehouse definition...3 Testing Process for Data warehouse:...3 Requirements Testing :...3 Unit

More information

RETRACTED ARTICLE. Web-Based Data Mining in System Design and Implementation. Open Access. Jianhu Gong 1* and Jianzhi Gong 2

RETRACTED ARTICLE. Web-Based Data Mining in System Design and Implementation. Open Access. Jianhu Gong 1* and Jianzhi Gong 2 Send Orders for Reprints to reprints@benthamscience.ae The Open Automation and Control Systems Journal, 2014, 6, 1907-1911 1907 Web-Based Data Mining in System Design and Implementation Open Access Jianhu

More information

Advantages of UML for Multidimensional Modeling

Advantages of UML for Multidimensional Modeling Advantages of UML for Multidimensional Modeling Sergio Luján-Mora (slujan@dlsi.ua.es) Juan Trujillo (jtrujillo@dlsi.ua.es) Department of Software and Computing Systems University of Alicante (Spain) Panos

More information

Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services

Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services Course 6234A: Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services Course Details Course Outline Module 1: Introduction to Microsoft SQL Server Analysis Services This module introduces

More information

Basics of Dimensional Modeling

Basics of Dimensional Modeling Basics of Dimensional Modeling Data warehouse and OLAP tools are based on a dimensional data model. A dimensional model is based on dimensions, facts, cubes, and schemas such as star and snowflake. Dimension

More information

Mining User - Aware Rare Sequential Topic Pattern in Document Streams

Mining User - Aware Rare Sequential Topic Pattern in Document Streams Mining User - Aware Rare Sequential Topic Pattern in Document Streams A.Mary Assistant Professor, Department of Computer Science And Engineering Alpha College Of Engineering, Thirumazhisai, Tamil Nadu,

More information

MOLAP Data Warehouse of a Software Products Servicing Call Center

MOLAP Data Warehouse of a Software Products Servicing Call Center MOLAP Data Warehouse of a Software Products Servicing Call Center Z. Kazi, B. Radulovic, D. Radovanovic and Lj. Kazi Technical faculty "Mihajlo Pupin" University of Novi Sad Complete Address: Technical

More information

News Filtering and Summarization System Architecture for Recognition and Summarization of News Pages

News Filtering and Summarization System Architecture for Recognition and Summarization of News Pages Bonfring International Journal of Data Mining, Vol. 7, No. 2, May 2017 11 News Filtering and Summarization System Architecture for Recognition and Summarization of News Pages Bamber and Micah Jason Abstract---

More information

Knowledge Engineering with Semantic Web Technologies

Knowledge Engineering with Semantic Web Technologies This file is licensed under the Creative Commons Attribution-NonCommercial 3.0 (CC BY-NC 3.0) Knowledge Engineering with Semantic Web Technologies Lecture 5: Ontological Engineering 5.3 Ontology Learning

More information

An Overview of Data Warehousing and OLAP Technology

An Overview of Data Warehousing and OLAP Technology An Overview of Data Warehousing and OLAP Technology CMPT 843 Karanjit Singh Tiwana 1 Intro and Architecture 2 What is Data Warehouse? Subject-oriented, integrated, time varying, non-volatile collection

More information

EIAW: Towards a Business-Friendly Data Warehouse Using Semantic Web Technologies

EIAW: Towards a Business-Friendly Data Warehouse Using Semantic Web Technologies EIAW: Towards a Business-Friendly Data Warehouse Using Semantic Web Technologies Guotong Xie 1, Yang Yang 1, Shengping Liu 1, Zhaoming Qiu 1, Yue Pan 1, and Xiongzhi Zhou 2 1 IBM China Research Laboratory

More information

Database Modeling And Design The Fundamental Principles The Morgan Kaufmann Series In Data Management Systems

Database Modeling And Design The Fundamental Principles The Morgan Kaufmann Series In Data Management Systems Database Modeling And Design The Fundamental Principles The Morgan Kaufmann Series In Data Management We have made it easy for you to find a PDF Ebooks without any digging. And by having access to our

More information

Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language

Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language Dong Han and Kilian Stoffel Information Management Institute, University of Neuchâtel Pierre-à-Mazel 7, CH-2000 Neuchâtel,

More information

CHAPTER 8 DECISION SUPPORT V2 ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI

CHAPTER 8 DECISION SUPPORT V2 ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI CHAPTER 8 DECISION SUPPORT V2 ADVANCED DATABASE SYSTEMS Assist. Prof. Dr. Volkan TUNALI Topics 2 Business Intelligence (BI) Decision Support System (DSS) Data Warehouse Online Analytical Processing (OLAP)

More information

DYNAMIC FOAF MANAGEMENT METHOD FOR SOCIAL NETWORKS IN THE SOCIAL WEB ENVIRONMENT

DYNAMIC FOAF MANAGEMENT METHOD FOR SOCIAL NETWORKS IN THE SOCIAL WEB ENVIRONMENT DYNAMIC FOAF MANAGEMENT METHOD FOR SOCIAL NETWORKS IN THE SOCIAL WEB ENVIRONMENT Jong-Soo Sohn and In-Jeong Chung Department of Computer and Information Science Korea University Republic of Korea Abstract

More information

CS377: Database Systems Data Warehouse and Data Mining. Li Xiong Department of Mathematics and Computer Science Emory University

CS377: Database Systems Data Warehouse and Data Mining. Li Xiong Department of Mathematics and Computer Science Emory University CS377: Database Systems Data Warehouse and Data Mining Li Xiong Department of Mathematics and Computer Science Emory University 1 1960s: Evolution of Database Technology Data collection, database creation,

More information

Domain-specific Concept-based Information Retrieval System

Domain-specific Concept-based Information Retrieval System Domain-specific Concept-based Information Retrieval System L. Shen 1, Y. K. Lim 1, H. T. Loh 2 1 Design Technology Institute Ltd, National University of Singapore, Singapore 2 Department of Mechanical

More information

A BFS-BASED SIMILAR CONFERENCE RETRIEVAL FRAMEWORK

A BFS-BASED SIMILAR CONFERENCE RETRIEVAL FRAMEWORK A BFS-BASED SIMILAR CONFERENCE RETRIEVAL FRAMEWORK Qing Guo 1, 2 1 Nanyang Technological University, Singapore 2 SAP Innovation Center Network,Singapore ABSTRACT Literature review is part of scientific

More information

A Data Warehouse Implementation Using the Star Schema. For an outpatient hospital information system

A Data Warehouse Implementation Using the Star Schema. For an outpatient hospital information system A Data Warehouse Implementation Using the Star Schema For an outpatient hospital information system GurvinderKaurJosan Master of Computer Application,YMT College of Management Kharghar, Navi Mumbai ---------------------------------------------------------------------***----------------------------------------------------------------

More information

DATA MINING AND WAREHOUSING

DATA MINING AND WAREHOUSING DATA MINING AND WAREHOUSING Qno Question Answer 1 Define data warehouse? Data warehouse is a subject oriented, integrated, time-variant, and nonvolatile collection of data that supports management's decision-making

More information

Data Mining and Warehousing

Data Mining and Warehousing Data Mining and Warehousing Sangeetha K V I st MCA Adhiyamaan College of Engineering, Hosur-635109. E-mail:veerasangee1989@gmail.com Rajeshwari P I st MCA Adhiyamaan College of Engineering, Hosur-635109.

More information

Lectures for the course: Data Warehousing and Data Mining (IT 60107)

Lectures for the course: Data Warehousing and Data Mining (IT 60107) Lectures for the course: Data Warehousing and Data Mining (IT 60107) Week 1 Lecture 1 21/07/2011 Introduction to the course Pre-requisite Expectations Evaluation Guideline Term Paper and Term Project Guideline

More information

Data Warehousing and Decision Support. Introduction. Three Complementary Trends. [R&G] Chapter 23, Part A

Data Warehousing and Decision Support. Introduction. Three Complementary Trends. [R&G] Chapter 23, Part A Data Warehousing and Decision Support [R&G] Chapter 23, Part A CS 432 1 Introduction Increasingly, organizations are analyzing current and historical data to identify useful patterns and support business

More information

Parmenides. Semi-automatic. Ontology. construction and maintenance. Ontology. Document convertor/basic processing. Linguistic. Background knowledge

Parmenides. Semi-automatic. Ontology. construction and maintenance. Ontology. Document convertor/basic processing. Linguistic. Background knowledge Discover hidden information from your texts! Information overload is a well known issue in the knowledge industry. At the same time most of this information becomes available in natural language which

More information

TDWI Data Modeling. Data Analysis and Design for BI and Data Warehousing Systems

TDWI Data Modeling. Data Analysis and Design for BI and Data Warehousing Systems Data Analysis and Design for BI and Data Warehousing Systems Previews of TDWI course books offer an opportunity to see the quality of our material and help you to select the courses that best fit your

More information

Inference in Hierarchical Multidimensional Space

Inference in Hierarchical Multidimensional Space Proc. International Conference on Data Technologies and Applications (DATA 2012), Rome, Italy, 25-27 July 2012, 70-76 Related papers: http://conceptoriented.org/ Inference in Hierarchical Multidimensional

More information

Semantic Web Mining and its application in Human Resource Management

Semantic Web Mining and its application in Human Resource Management International Journal of Computer Science & Management Studies, Vol. 11, Issue 02, August 2011 60 Semantic Web Mining and its application in Human Resource Management Ridhika Malik 1, Kunjana Vasudev 2

More information

20466C - Version: 1. Implementing Data Models and Reports with Microsoft SQL Server

20466C - Version: 1. Implementing Data Models and Reports with Microsoft SQL Server 20466C - Version: 1 Implementing Data Models and Reports with Microsoft SQL Server Implementing Data Models and Reports with Microsoft SQL Server 20466C - Version: 1 5 days Course Description: The focus

More information

Adnan YAZICI Computer Engineering Department

Adnan YAZICI Computer Engineering Department Data Warehouse Adnan YAZICI Computer Engineering Department Middle East Technical University, A.Yazici, 2010 Definition A data warehouse is a subject-oriented integrated time-variant nonvolatile collection

More information

IJSRD - International Journal for Scientific Research & Development Vol. 4, Issue 06, 2016 ISSN (online):

IJSRD - International Journal for Scientific Research & Development Vol. 4, Issue 06, 2016 ISSN (online): IJSRD - International Journal for Scientific Research & Development Vol. 4, Issue 06, 2016 ISSN (online): 2321-0613 Tanzeela Khanam 1 Pravin S.Metkewar 2 1 Student 2 Associate Professor 1,2 SICSR, affiliated

More information

International Journal of Advance Engineering and Research Development. Survey of Web Usage Mining Techniques for Web-based Recommendations

International Journal of Advance Engineering and Research Development. Survey of Web Usage Mining Techniques for Web-based Recommendations Scientific Journal of Impact Factor (SJIF): 5.71 International Journal of Advance Engineering and Research Development Volume 5, Issue 02, February -2018 e-issn (O): 2348-4470 p-issn (P): 2348-6406 Survey

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 05(b) : 23/10/2012 Data Mining: Concepts and Techniques (3 rd ed.) Chapter

More information

Data Warehousing. Overview

Data Warehousing. Overview Data Warehousing Overview Basic Definitions Normalization Entity Relationship Diagrams (ERDs) Normal Forms Many to Many relationships Warehouse Considerations Dimension Tables Fact Tables Star Schema Snowflake

More information

Data Modeling and Databases Ch 7: Schemas. Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich

Data Modeling and Databases Ch 7: Schemas. Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich Data Modeling and Databases Ch 7: Schemas Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich Database schema A Database Schema captures: The concepts represented Their attributes

More information

SK International Journal of Multidisciplinary Research Hub Research Article / Survey Paper / Case Study Published By: SK Publisher

SK International Journal of Multidisciplinary Research Hub Research Article / Survey Paper / Case Study Published By: SK Publisher ISSN: 2394 3122 (Online) Volume 2, Issue 1, January 2015 Research Article / Survey Paper / Case Study Published By: SK Publisher P. Elamathi 1 M.Phil. Full Time Research Scholar Vivekanandha College of

More information

PRIS at TAC2012 KBP Track

PRIS at TAC2012 KBP Track PRIS at TAC2012 KBP Track Yan Li, Sijia Chen, Zhihua Zhou, Jie Yin, Hao Luo, Liyin Hong, Weiran Xu, Guang Chen, Jun Guo School of Information and Communication Engineering Beijing University of Posts and

More information

CHAPTER 3 Implementation of Data warehouse in Data Mining

CHAPTER 3 Implementation of Data warehouse in Data Mining CHAPTER 3 Implementation of Data warehouse in Data Mining 3.1 Introduction to Data Warehousing A data warehouse is storage of convenient, consistent, complete and consolidated data, which is collected

More information

A MAS Based ETL Approach for Complex Data

A MAS Based ETL Approach for Complex Data A MAS Based ETL Approach for Complex Data O. Boussaid, F. Bentayeb, J. Darmont Abstract : In a data warehousing process, the phase of data integration is crucial. Many methods for data integration have

More information

A Star Schema Has One To Many Relationship Between A Dimension And Fact Table

A Star Schema Has One To Many Relationship Between A Dimension And Fact Table A Star Schema Has One To Many Relationship Between A Dimension And Fact Table Many organizations implement star and snowflake schema data warehouse The fact table has foreign key relationships to one or

More information

Data transfer, storage and analysis for data mart enlargement

Data transfer, storage and analysis for data mart enlargement Data transfer, storage and analysis for data mart enlargement PROKOPOVA ZDENKA, SILHAVY PETR, SILHAVY RADEK Department of Computer and Communication Systems Faculty of Applied Informatics Tomas Bata University

More information

DATA WAREHOUSE MANAGEMENT SYSTEM A CASE STUDY

DATA WAREHOUSE MANAGEMENT SYSTEM A CASE STUDY DATA WAREHOUSE MANAGEMENT SYSTEM A CASE STUDY DARKO KRULJ Trizon Group, Belgrade, Serbia and Montenegro. MILUTIN CUPIC MILAN MARTIC MILIJA SUKNOVIC Faculty of Organizational Science, University of Belgrade,

More information

Complete. The. Reference. Christopher Adamson. Mc Grauu. LlLIJBB. New York Chicago. San Francisco Lisbon London Madrid Mexico City

Complete. The. Reference. Christopher Adamson. Mc Grauu. LlLIJBB. New York Chicago. San Francisco Lisbon London Madrid Mexico City The Complete Reference Christopher Adamson Mc Grauu LlLIJBB New York Chicago San Francisco Lisbon London Madrid Mexico City Milan New Delhi San Juan Seoul Singapore Sydney Toronto Contents Acknowledgments

More information

TEXT PREPROCESSING FOR TEXT MINING USING SIDE INFORMATION

TEXT PREPROCESSING FOR TEXT MINING USING SIDE INFORMATION TEXT PREPROCESSING FOR TEXT MINING USING SIDE INFORMATION Ms. Nikita P.Katariya 1, Prof. M. S. Chaudhari 2 1 Dept. of Computer Science & Engg, P.B.C.E., Nagpur, India, nikitakatariya@yahoo.com 2 Dept.

More information

Information Retrieval

Information Retrieval Information Retrieval CSC 375, Fall 2016 An information retrieval system will tend not to be used whenever it is more painful and troublesome for a customer to have information than for him not to have

More information

IT DATA WAREHOUSING AND DATA MINING UNIT-2 BUSINESS ANALYSIS

IT DATA WAREHOUSING AND DATA MINING UNIT-2 BUSINESS ANALYSIS PART A 1. What are production reporting tools? Give examples. (May/June 2013) Production reporting tools will let companies generate regular operational reports or support high-volume batch jobs. Such

More information

Metadata. Data Warehouse

Metadata. Data Warehouse A DECISION SUPPORT PROTOTYPE FOR THE KANSAS STATE UNIVERSITY LIBRARIES Maria Zamr Bleyberg, Raghunath Mysore, Dongsheng Zhu, Radhika Bodapatla Computing and Information Sciences Department Karen Cole,

More information

Data Warehousing and Decision Support

Data Warehousing and Decision Support Data Warehousing and Decision Support Chapter 23, Part A Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1 Introduction Increasingly, organizations are analyzing current and historical

More information

Semantic Annotation of Web Resources Using IdentityRank and Wikipedia

Semantic Annotation of Web Resources Using IdentityRank and Wikipedia Semantic Annotation of Web Resources Using IdentityRank and Wikipedia Norberto Fernández, José M.Blázquez, Luis Sánchez, and Vicente Luque Telematic Engineering Department. Carlos III University of Madrid

More information

Data Modeling Online Training

Data Modeling Online Training Data Modeling Online Training IQ Online training facility offers Data Modeling online training by trainers who have expert knowledge in the Data Modeling and proven record of training hundreds of students.

More information

Full file at

Full file at Chapter 2 Data Warehousing True-False Questions 1. A real-time, enterprise-level data warehouse combined with a strategy for its use in decision support can leverage data to provide massive financial benefits

More information

Data Warehouse and Data Mining

Data Warehouse and Data Mining Data Warehouse and Data Mining Lecture No. 03 Architecture of DW Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro Basic

More information

Data Warehousing. Data Warehousing and Mining. Lecture 8. by Hossen Asiful Mustafa

Data Warehousing. Data Warehousing and Mining. Lecture 8. by Hossen Asiful Mustafa Data Warehousing Data Warehousing and Mining Lecture 8 by Hossen Asiful Mustafa Databases Databases are developed on the IDEA that DATA is one of the critical materials of the Information Age Information,

More information

A Semantic Model for Concept Based Clustering

A Semantic Model for Concept Based Clustering A Semantic Model for Concept Based Clustering S.Saranya 1, S.Logeswari 2 PG Scholar, Dept. of CSE, Bannari Amman Institute of Technology, Sathyamangalam, Tamilnadu, India 1 Associate Professor, Dept. of

More information

Data Warehouse Design Using Row and Column Data Distribution

Data Warehouse Design Using Row and Column Data Distribution Int'l Conf. Information and Knowledge Engineering IKE'15 55 Data Warehouse Design Using Row and Column Data Distribution Behrooz Seyed-Abbassi and Vivekanand Madesi School of Computing, University of North

More information

ABD - Database Administration

ABD - Database Administration Coordinating unit: 270 - FIB - Barcelona School of Informatics Teaching unit: 747 - ESSI - Department of Service and Information System Engineering Academic year: Degree: 2017 BACHELOR'S DEGREE IN INFORMATICS

More information

The Study on Data Warehouse and Data Mining for Birth Registration System of the Surat City

The Study on Data Warehouse and Data Mining for Birth Registration System of the Surat City The Study on Data Warehouse and Data Mining for Birth Registration System of the Surat City Pushpal Desai M.Sc.(I.T.) Programme Veer Narmad South Gujarat University Surat, India. Desai Apurva Department

More information

Data Warehousing and Decision Support

Data Warehousing and Decision Support Data Warehousing and Decision Support [R&G] Chapter 23, Part A CS 4320 1 Introduction Increasingly, organizations are analyzing current and historical data to identify useful patterns and support business

More information

Data Warehouses Chapter 12. Class 10: Data Warehouses 1

Data Warehouses Chapter 12. Class 10: Data Warehouses 1 Data Warehouses Chapter 12 Class 10: Data Warehouses 1 OLTP vs OLAP Operational Database: a database designed to support the day today transactions of an organization Data Warehouse: historical data is

More information

Data Warehousing and Decision Support (mostly using Relational Databases) CS634 Class 20

Data Warehousing and Decision Support (mostly using Relational Databases) CS634 Class 20 Data Warehousing and Decision Support (mostly using Relational Databases) CS634 Class 20 Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke, Chapter 25 Introduction Increasingly,

More information