Modern Software Engineering Methodologies Meet Data Warehouse Design: 4WD

Similar documents
Summary. The Dimensional Fact Model. Goals and benefits Basic and advanced constructs. Logical design with the DFM Best practices for design

System Development Life Cycle Methods/Approaches/Models

Incremental development A.Y. 2018/2019

Introduction To Software Development CSC Spring 2019 Howard Rosenthal

Software Life Cycle. Main issues: Discussion of different life cycle models Maintenance or evolution

This tutorial will help computer science graduates to understand the basic-to-advanced concepts related to data warehousing.

ProtOLAP: Rapid OLAP Prototyping with On-Demand Data Supply

SOFTWARE LIFE-CYCLE MODELS 2.1

A Comprehensive Approach to Data Warehouse Testing

Software Engineering Lifecycles. Controlling Complexity

This tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining.

Reducing the costs of rework. Coping with change. Software prototyping. Ways to Cope with change. Benefits of prototyping

Modernizing Business Intelligence and Analytics

Data Warehousing. Ritham Vashisht, Sukhdeep Kaur and Shobti Saini

Optimization Online Analytical Processing (OLAP) Data Sales Door Case Study CV Adilia Lestari

Seminars of Software and Services for the Information Society. Data Warehousing Design Issues

Dilbert Scott Adams. CSc 233 Spring 2012

SE420 - Software Quality Assurance

Introduction to Software Engineering

Topic 01. Software Engineering, Web Engineering, agile methodologies.

CMSC 435: Software Engineering Section 0201

Proceedings of the IE 2014 International Conference AGILE DATA MODELS

Activities Common to Software Projects. Software Life Cycle. Activities Common to Software Projects. Activities Common to Software Projects

Introduction to Software Engineering

Testing in Agile Software Development

Proposal of a new Data Warehouse Architecture Reference Model

This tutorial also elaborates on other related methodologies like Agile, RAD and Prototyping.

Visual Modelling of Data Warehousing Flows with UML Profiles

TDWI strives to provide course books that are contentrich and that serve as useful reference documents after a class has ended.

Agile Manifesto & XP. Topics. Rapid software development. Agile methods. Chapter ) What is Agile trying to do?

Improving the Data Warehouse Architecture Using Design Patterns

Objectives. Connecting with Computer Science 2

Second. Incremental development model

Requirements and Design Overview

Pervasive Business Intelligence

Schedule(3/3) March 18th 13:00 Unified Process and Usecase-Driven Approach. (problem definition, use case model)

A Methodology for Integrating XML Data into Data Warehouses

Managing Change and Complexity

Dr. Tom Hicks. Computer Science Department Trinity University

Adnan YAZICI Computer Engineering Department

The requirements engineering process

Architectural Blueprint The 4+1 View Model of Software Architecture. Philippe Kruchten

Systems Analysis and Design in a Changing World, Fourth Edition

CHAPTER 3 Implementation of Data warehouse in Data Mining

Software Process. Software Process

The Scaled Agile Framework

5 Object Oriented Analysis

Software Verification and Validation (VIMMD052) Introduction. Istvan Majzik Budapest University of Technology and Economics

OBJECT ORIENTED SYSTEM DEVELOPMENT Software Development Dynamic System Development Information system solution Steps in System Development Analysis

OBJECTIVES DEFINITIONS CHAPTER 1: THE DATABASE ENVIRONMENT AND DEVELOPMENT PROCESS. Figure 1-1a Data in context

CS SOFTWARE ENGINEERING QUESTION BANK SIXTEEN MARKS

Process of Interaction Design and Design Languages

DEVELOPING DECISION SUPPORT SYSTEMS A MODERN APPROACH

GUJARAT TECHNOLOGICAL UNIVERSITY MASTER OF COMPUTER APPLICATIONS (MCA) Semester: IV

Agile Accessibility. Presenters: Ensuring accessibility throughout the Agile development process

ISSN: [Kaur* et al., 6(10): October, 2017] Impact Factor: 4.116

Optimize tomorrow today.

A Novel Approach of Data Warehouse OLTP and OLAP Technology for Supporting Management prospective

Software Development Process Models

What Time is it in the Data Warehouse?

Adopting Agile Practices

Incremental Programming

Gradational conception in Cleanroom Software Development

Quality Management Plan (QMP)

Towards a Data Warehouse Testing Framework

Testing in the Agile World

1 DATAWAREHOUSING QUESTIONS by Mausami Sawarkar

STUDY OF THE IMPACT OF THE RAPID PROTOTYPING METHOD ON THE PERFORMANCES OF A DESIGN PROCESS

Social Business Intelligence in Action

Software Engineering Principles

THE JOURNEY OVERVIEW THREE PHASES TO A SUCCESSFUL MIGRATION ADOPTION ACCENTURE IS 80% IN THE CLOUD

Goal-Oriented Requirement Analysis for Data Warehouse Design

Incorporating User Centered Requirement Engineering into Agile Software Development

Administrivia. Added 20 more so far. Software Process. Only one TA so far. CS169 Lecture 2. Start thinking about project proposal

Department of Industrial Engineering. Sharif University of Technology. Operational and enterprises systems. Exciting directions in systems

COMPUTER-AIDED DATA-MART DESIGN

Software Processes. Ian Sommerville 2006 Software Engineering, 8th edition. Chapter 4 Slide 1

Requirements and User-Centered Design in an Agile Context

Specifying and Prototyping

Study Of Feature Driven Development Using The Concepts Of Object Oriented Programming System

Full file at

Using SLE for creation of Data Warehouses

5-1McGraw-Hill/Irwin. Copyright 2007 by The McGraw-Hill Companies, Inc. All rights reserved.

Composite Software Data Virtualization The Five Most Popular Uses of Data Virtualization

Managing Data Resources

Business Intelligence Roadmap HDT923 Three Days

Advanced Data Management Technologies

Data Warehousing ETL. Esteban Zimányi Slides by Toon Calders

Darshan Institute of Engineering & Technology for Diploma Studies Rajkot Unit-1

SCHEME OF COURSE WORK. Data Warehousing and Data mining

How Can a Tester Cope With the Fast Paced Iterative/Incremental Process?

SM 3511 Interface Design. Institutionalizing interface design

New Frontiers in Business Intelligence: Distribution and Personalization

UML-Based Conceptual Modeling of Pattern-Bases

Logical design DATA WAREHOUSE: DESIGN Logical design. We address the relational model (ROLAP)

Extended TDWI Data Modeling: An In-Depth Tutorial on Data Warehouse Design & Analysis Techniques

After completing this course, participants will be able to:

Architectural Blueprint

VO Software Engineering

The strategic advantage of OLAP and multidimensional analysis

Transcription:

Modern Software Engineering Methodologies Meet Data Warehouse Design: 4WD Matteo Golfarelli Stefano Rizzi Elisa Turricchia University of Bologna - Italy 13th International Conference on Data Warehousing and Knowledge Discovery (DaWaK'11) August 29, 2011

Summary Motivating scenario Problems - Goals - Principles The methodology: 4WD Practical Evidences Summary and future work

Motivating scenario Data warehouse systems are characterized by a long and expensive development process that hardly meets the ambitious requirements of today s market Low penetration of data warehouse systems in smallmedium firms Data warehouse projects often leave both customers and developers dissatisfied Our contribution: Four-Wheel-Drive (4WD) An innovative methodology to improve the data warehouse development process in terms of efficiency and predictability, that couples traditional methodologies with Agile approaches

4WD: Research method How to design a new methodology for the data warehouse development process? Problems Goals Principles 4WD

Problems in the Data Warehouse Development Process Problem Unclear and uncertain requirements Long time for delivery Complexity of a data warehouse Motivation Difficult communication between users and developers Fast business condition evolution The decision process is flexibly structured and poorly shared across large organization Data mart centric Linear development for each data mart Data integration Huge data volume and the workload unpredictability make performance optimization hard

Goals in the Data Warehouse Development Process Goal Description Effect Reliability Robustness Productivity Timeliness Probability that the delivered system completely and accurately meets user requirements Capability to quickly react to environment changes Efficiency of using the resources assigned to the project to speed up system delivery Accuracy of time and cost assessing High-quality and satisfactory final system Uncertain and changing requirement management Shorter and cheaper projects Reliable resource estimates

Principles in the Data Warehouse Development Process Methodologies Principles Incrementality and risk-based iteration Waterfall [7] RAD [5] POSD [6] SSD [2] MDA [4] CBSE [3] ASD [1] Prototyping User involvement Component reuse Formal and light documentation Automated schema transformation

Relationship between Goals and Principles Principles Goals Reliability Robustness Productivity Timeliness Incrementality and riskbased iteration Continuous feed-back, clearer requirements Better management of change Better management of project resources, rapid feedback Early detection of errors Prototyping Frequent tests, easier error detection Early deliveries User involvement Better requir. validation, better data quality Early error detection Component reuse Error-free components Faster design Predictable development Formal and light documentation Clearer requirements Easier evolution Faster design Automated schema transformation Optimized performances Easier evolution Faster design Predictable design

The Methodology: Four-Wheel-Drive (4WD)

4WD: Description Nested iteration cycles: DM cycle Global plan for the development of the whole data warehouse Incrementally designs and releases one data mart Fact cycle Refines the data mart plan It incrementally designs the facts of a data mart Fact design: Modeling cycle Implementation cycle Modeling cycle Implementation cycle

4WD: DM Cycle Architectural Sketch: overall functional and physical architecture of the data warehouse Conformity Analysis: definition of conformed dimensions across different facts and data marts DM Priorities: trade-off between user priorities and technical constraints DM Design: builds and releases the top-priority data mart

4WD: Fact Cycle Source & Fact macro-analysis: checks the availability, quality and completeness of the data sources and determines the main business facts Fact prioritization: trade-off between user requirements and technical priorities Fact design: develops and releases the top-priority fact

4WD principle applied: 1. Incrementality and Risk-Based Iteration Slicing the system functionality into increments (e.g. 2-4 weeks for a single fact release) Risk guides the data mart and fact priority definition DM strategies Give priority to DM with widely shared hierarchies Prefer DM that are fed up from stable and well-understood data sources Fact strategies Give priority to fact with the main business hierarchies and require the most complex ETL procedures Adopt data-driven approach Plan the length of an iteration in proportion to the complexity of the fact

4WD principle applied: 2. Prototyping Use prototyping to support every data warehouse development phase: To help designers to validate requirements To improve the design of reports and analysis applications To advance testing to the early phases of design To evaluate the feasibility of alternative solutions during logical design of multidimensional schemata and during ETL design

4WD principle applied: 3. User involvement Tight collaboration between users and designers: Preliminary user training (e.g. clarify project goals, explain the multidimensional model) Prototyping to favor user awareness User feedback to detect problems and errors For usability tests of reporting and OLAP front-ends For functional tests of ETL procedures

4WD principle applied: 4. Component reuse Favor the use of predefined elements to support the data warehouse development: Conformed hierarchies Library hierarchies Library facts ETL building blocks Analysis templates

4WD principle applied: 5. Formal and light documentation Formal but lean documentation to formalize requirements, simplify communication and support accurate design: At the data warehouse level: Effective schema to summarize the data marts, data sources and user profiles At the data mart level: Bus matrix to associate each fact with its dimensions At the fact level: Conceptual schema before proceeding with the implementation (e.g. Dimensional Fact Model (DFM) [9])

4WD principle applied: 6. Automated Schema Transformation Automatic transformations between every data warehouse design level: Supply-driven conceptual design Logical schema Conceptual schema A CASE tool: QBX [8] Logical design Conceptual schema Logical schema

Practical evidences 4WD was applied to a project in the area of pay-tvs 2 Data marts: Administration: 9 facts, 5 releases Management control: 3 facts, 2 releases 10 to 26 days for each release Benefit Project development speed-up Reduction of the implementation effort Concise but exhaustive documentation Logical design automation Strategy User involvement Prototyping Reusing of existing reports and dimension tables DFM as conceptual model CASE tool

Summary and Future work We have identified the main problems behind data warehouse projects and we have proposed an innovative data warehouse methodology We carry out a case study to assess the impact of 4WD in a real environment, but many practical extensions are possible: Apply 4WD to different type of companies Design a tool to support the analyst using the 4WD methodology

References [1] Agile Manifesto: Manifesto for agile software development. http://agilemanifesto.org/ (2010) [2] Boehm, B.W.: A spiral model of software development and enhancement. IEEE Computer 21(5), 61 72 (1988) [3] Heineman, G.T., Councill, W.T.: Component-based software engineering: Putting the pieces together. Addison-Wesley (2001) [4] Kruchten, P.: The 4+1 view model of architecture. IEEE Software 12(6), 42 50 (1995) [5] Martin, J.: Rapid application development. MacMillan (1991) [6] Pomberger, G., Bischofberger, W.R., Kolb, D., Pree, W., Schlemm, H.: Prototyping-oriented software development concepts and tools. Structured Programming 12(1), 43 60 (1991) [7] Royce, W.W.: Managing the development of large software systems: Concepts and techniques. In: Proc. ICSE. pp. 328 339. Monterey, California, USA (1987) [8] Battaglia, A., Golfarelli, M., Rizzi, S.. QBX: A CASE Tool for Data Mart Design. To appear on: Proceedings 30th International Conference on Conceptual Modeling (ER 2011) Brussels, Belgium, 2011. [9] Golfarelli, M., Rizzi, S.: Data Warehouse Design: Modern Principles and Methodologies. McGraw- Hill, 2009.

Thank you for your attention Questions?