The Enterprise Data Marketplace

Rick F. van der Lans
Industry analyst
Email: rick@r20.nl
Twitter: @rick_vanderlans
www.r20.nl

Copyright 2018 R20/Consultancy B.V., The Netherlands. All rights reserved. No part of this material may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photographic, or otherwise, without the explicit written permission of the copyright owners.

Rick F. van der Lans

Rick F. van der Lans is an independent consultant, lecturer, and author. He specializes in data warehousing, business intelligence, database technology, and data virtualization. He is managing director of R20/Consultancy B.V. Rick has been involved in various projects in which data warehousing and integration technology were applied.

Rick van der Lans is an internationally acclaimed lecturer. He has lectured professionally worldwide for the last twenty-five years. He has been invited by several major software vendors to present keynote speeches.

He is the author of several books on computing, including his new Data Virtualization for Business Intelligence Systems. Some of these books are available in different languages. The popular Introduction to SQL, for example, is available in English, Dutch, Italian, Chinese, and German and is sold worldwide. He also authored The SQL Guide to Ingres and SQL for MySQL Developers.

Ambassador of Kadenza: Rick works closely with the consultants of Kadenza in many projects. Kadenza is a Dutch consultancy company specializing in business intelligence, data management, big data, data warehousing, data virtualization, and analytics. Our joint experiences and insights are shared in seminars, webinars, blogs, and white papers.

R20/Consultancy B.V. is located in The Netherlands, www.r20.nl.
You can get in touch with Rick via:
Email: rick@r20.nl
Twitter: @Rick_vanderlans
LinkedIn: http://www.linkedin.com/pub/rick-van-der-lans/9/207/223

The Classic Data Warehouse Architecture
Diagram: Source systems, Staging area, Data warehouse, Data marts, Analytics & reporting, connected by three ETL steps.

The Logical Data Warehouse Architecture
Diagram: Source systems, Staging area, Data warehouse (with ETL steps), Big data, Social media data, Open data, and Spreadsheets, all accessed through the Logical Data Warehouse Architecture.

The Data Lake
Diagram: All data sources, Data lake, Investigative analytics, Data science.

Tailor-Made Reports

Part 1: Are We On the Right Track?

Expectations
"Expectation is the root of all heartache."
Shakespeare

Expectation 1: Users Know What They Want
"If I had asked people what they wanted, they would have said faster horses."
Henry Ford

"People don't know what they want until you show it to them."
Steve Jobs

Expectation 1: Users Know What They Want
"It's not the customer's job to know what they want."
Steve Jobs

Expectation 2: Transactional Data Fulfills the User's Information Needs

Expectation 3: Users Understand BI Tools
Source: Wayne Eckerson, http://insideanalysis.com/2013/04/the-promise-of-self-service-bi/, April 2013

Expectation 4: Users Love Developing Reports

Expectation 4: Users Love Developing Reports
"Most good programmers do programming not because they expect to get paid or get adulation by the public, but because it is fun to program."
Linus Torvalds

"In fifteen years we'll be teaching programming just like reading and writing and wondering why we didn't do it sooner."
Mark Zuckerberg

Expectation 5: Users Love Wrestling With Star Schemas

We Expect Too Much

Part 2: The Data Marketplace

The Supply Chain
Diagram: Raw materials, Supplier, Manufacturing, Distribution, Customer, Consumer.
The entire network of entities, directly or indirectly interlinked and interdependent in serving the same consumer or customer. It comprises vendors that supply raw material, producers who convert the material into products, warehouses that store the products, distribution centers that deliver to the retailers, and retailers who bring the product to the ultimate user.

The Data Supply Chain
Entire network of ...
It comprises vendors that supply raw data, producers who convert the data into products, data warehouses that store data, distribution centers that deliver data to the retailers, and retailers who bring the data to the ultimate user.

Actors in the Data Supply Chain
1990 census: 87% of the US population can be identified by ZIP code, gender, and date of birth.
Actors in the data supply chain:
Data producer
Data provider
Data distributor
Data enricher / blender
Data retailer
Data buyer
Data consumer
Companies named on the slide:
Tracking: AdSonar, Pulse260, Quantcast, Rubicon, Undertone, Traffic Marketplace
Acxiom, Equifax, InfoUSA, Teletrack
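
The census figure on the slide above concerns quasi-identifiers: attributes that look harmless on their own but can single out a person in combination. A minimal sketch of how such a share might be measured on a data set, assuming a hypothetical list of records with `zip`, `gender`, and `dob` fields. The function name and the sample rows are made up for illustration and are not census data.

```python
from collections import Counter


def unique_fraction(records, keys=("zip", "gender", "dob")):
    """Return the share of records whose combination of `keys` is unique."""
    if not records:
        return 0.0
    # Count how often each combination of key values occurs.
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


# Made-up sample rows (hypothetical, for illustration only).
people = [
    {"zip": "02139", "gender": "F", "dob": "1961-07-31"},
    {"zip": "02139", "gender": "M", "dob": "1961-07-31"},
    {"zip": "02139", "gender": "M", "dob": "1961-07-31"},
    {"zip": "10001", "gender": "F", "dob": "1975-01-15"},
]

share_all = unique_fraction(people)            # all three attributes combined
share_zip = unique_fraction(people, ("zip",))  # ZIP code alone
```

In this toy data, combining the three attributes singles out more records than the ZIP code alone, which is why a data provider or enricher may need to assess such combinations before publishing a data set.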

Examples of Public Data Marketplaces
DataMarket offers more than 45,000 datasets from around the world, delivered by, among others, 42 governments.
DataStreamX is the global marketplace for commercial data. Founded in 2014, its mission is to accelerate data access worldwide by bringing together buyers and vendors of data on one simple-to-use platform.
QunB allows companies to upload their own data and to combine it with other datasets; these datasets can be sold or given away for free.
Knoema provides access to over 100 million time series. All available data is interactive and can be exported if needed.
Data.Gov offers more than 190,000 datasets.

Shopping for Data at the Data Marketplace

The Private/Enterprise Data Marketplace
Diagram: Data sets, Enterprise Data Marketplace, Business users.

Potential Data Products
Data as file
Report
Service
Data via SQL
Apps
Embeddable KPI
Stream of data

From Tailor-Made to Ready-Made

The Self-Service Data Counter

The Enterprise Data Marketplace and the Shopper
The data marketplace is a storefront
  Users can shop for data products
  Private data and public data
Users are shoppers
  Internal and external users
  Find the data products that meet the users' needs
Users can develop their own data products to be shared with others

Features of a Data Marketplace
Data description: categorization, definitions, tags, search
Metadata: data catalog, business glossary
Data security and privacy
Interfaces: file interface, service interface, SQL interface, analytical interface
Data insert: by owner, by customers
Price: free, subscription, pay by the sip
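
To make the feature list above more concrete, here is a minimal sketch of a catalog entry that carries a description, category, tags, interfaces, and a pricing model, plus a simple search over those entries. The class name, field names, and sample products are hypothetical and not taken from any particular marketplace product.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """One entry in a (hypothetical) data marketplace catalog."""
    name: str
    description: str
    category: str
    tags: set = field(default_factory=set)
    interfaces: set = field(default_factory=set)  # e.g. {"file", "sql", "service"}
    pricing: str = "free"  # "free", "subscription", or "pay-by-the-sip"


def search(catalog, text="", tag=None, interface=None):
    """Return names of products matching free text and, optionally, a tag and an interface."""
    needle = text.lower()
    hits = []
    for p in catalog:
        if needle and needle not in (p.name + " " + p.description).lower():
            continue
        if tag is not None and tag not in p.tags:
            continue
        if interface is not None and interface not in p.interfaces:
            continue
        hits.append(p.name)
    return hits


# Made-up sample products, for illustration only.
catalog = [
    DataProduct("customer_churn", "Monthly churn rate per region", "customer",
                {"kpi", "churn"}, {"sql", "file"}, "subscription"),
    DataProduct("call_center_stats", "Call volume and call length per day", "service",
                {"kpi", "calls"}, {"service"}, "free"),
]

kpi_products = search(catalog, tag="kpi")
churn_via_sql = search(catalog, "churn", interface="sql")
```

A shopper would typically combine free-text search with filters, for example looking for all KPI products that can be reached through a SQL interface.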

Data Warehouse versus Data Marketplace
With data warehouses, IT develops what the business requests; with data marketplaces, IT develops what it thinks the business needs.

Part 3: Data Virtualization to the Rescue

Data Virtualization Overview (1)
Diagram: data consumers on one side, data sources on the other, with the Data Virtualization Server in between.
Data consumers: production application, analytics & reporting, internal portal, mobile app, website, dashboard
Data sources: production databases, applications, data warehouse & data marts, streaming databases, unstructured data, ESB, big data stores, social media data, private data, external data

Data Virtualization Overview (2)
Diagram: the same data consumers and data sources as in Overview (1), with the interfaces on both sides of the Data Virtualization Server.
Consumer side: ODBC/SQL, JDBC/SQL, XML/SOAP, REST/JSON, XQuery, MDX/DAX, CICS, JMS message, SQL statement, SOAP message
Source side: JMS, SQL, SQL+, XSLT, SOAP, Hive, Prop., Excel, JSON

The View from the Applications
Diagram: Data Virtualization Server.

Importing Source Data
Diagram: Data consumer, Data Virtualization Server, Virtual table pointing to source, Source.

Developing Virtual Tables
Diagram: Data consumer, Data Virtualization Server, Virtual table, Virtual table pointing to source, Source.
Virtual table: may contain row selections, column selections, column concatenations, transformations, column and table name changes, groupings, aggregations, data cleansing, ...

Layers of Virtual Tables
Data consumption layer
Enterprise data layer
Data source layer
(all within the Data Virtualization Server)
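
The three layers of virtual tables can be sketched with ordinary SQL views. The example below uses SQLite views from Python as a stand-in; this is an assumption made purely for illustration, since a real data virtualization server would federate queries across many heterogeneous sources instead of a single database. All table, view, and column names are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Stand-in for a source system table (hypothetical schema and rows).
CREATE TABLE src_orders (id INTEGER, cust_first TEXT, cust_last TEXT,
                         amount_cents INTEGER, status TEXT);
INSERT INTO src_orders VALUES
  (1, 'Ann', 'Lee', 1250, 'PAID'),
  (2, 'Ann', 'Lee',  750, 'PAID'),
  (3, 'Bob', 'Kim',  500, 'CANCELLED');

-- Data source layer: virtual table pointing one-to-one at the source.
CREATE VIEW ds_orders AS SELECT * FROM src_orders;

-- Enterprise data layer: row selection, column selection, column
-- concatenation, a transformation (cents to currency units), and renaming.
CREATE VIEW ed_orders AS
SELECT id AS order_id,
       cust_first || ' ' || cust_last AS customer_name,
       amount_cents / 100.0 AS amount
FROM ds_orders
WHERE status = 'PAID';

-- Data consumption layer: grouping and aggregation for one data product.
CREATE VIEW dc_revenue_per_customer AS
SELECT customer_name, SUM(amount) AS revenue
FROM ed_orders
GROUP BY customer_name;
""")

# A data consumer only sees the top-layer virtual table.
rows = con.execute("SELECT * FROM dc_revenue_per_customer").fetchall()
```

Because each layer is defined on top of the one below it, a change in a source only requires adjusting the data source layer, while the virtual tables that consumers use keep their structure.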

Publishing a Virtual Table

Data Protection

The Data Marketplace and Data Virtualization
Data products: Data as file, Data via SQL, Report, Embeddable KPI, Service, App via JSON/REST
Data Virtualization Server layers: Data consumption layer, Enterprise data layer, Data source layer

Logical or Physical?
Physical: Source systems, Staging area, Data warehouse, Data marts, Data products, connected by three ETL steps.
Logical: Source systems, Data virtualization, Data products.

Part 4: Challenges of Data Marketplaces

Challenge 1: Research and Development

But Where To Start?
Service quality
  Call length: the time to answer a call
  Volume of calls handled per call center staff
  Number of escalations: how many bad
  Number of reminders: how many at risk
  Number of alerts: overall summary
  Customer ratings of service: customer satisfaction
  Number of customer complaints: problems
  Number of late tasks: late
Business Process Key Performance Indicators
  Percentage of processes where completion falls within +/- 5% of the estimated completion
  Average process overdue time
  Percentage of overdue processes
  Average process age
  Percentage of processes where the actual number of assigned resources is less than the planned number of assigned resources

Challenge 2: Prioritizing Development of Data Products
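
As one possible starting point, three of the business process KPIs listed on the "But Where To Start?" slide could be computed as in the sketch below. It assumes each process is given as a hypothetical pair of estimated and actual duration in days, and it assumes that "average process overdue time" is averaged over the overdue processes only; the slide does not define this.

```python
def process_kpis(processes, tolerance=0.05):
    """Compute three process KPIs from (estimated, actual) durations in days.

    Assumption: the average overdue time is taken over overdue processes only.
    """
    n = len(processes)
    if n == 0:
        return {"pct_within_tolerance": 0.0, "pct_overdue": 0.0, "avg_overdue": 0.0}
    # Completion within +/- tolerance (default 5%) of the estimate.
    within = sum(1 for est, act in processes if abs(act - est) <= tolerance * est)
    # Overdue time of every process that took longer than estimated.
    overdue = [act - est for est, act in processes if act > est]
    return {
        "pct_within_tolerance": 100.0 * within / n,
        "pct_overdue": 100.0 * len(overdue) / n,
        "avg_overdue": sum(overdue) / len(overdue) if overdue else 0.0,
    }


# Made-up sample: (estimated days, actual days) per process.
sample = [(10, 10), (20, 20.5), (10, 14), (40, 39)]
kpis = process_kpis(sample)
```

A KPI like this could be offered as an embeddable data product before any business user explicitly asks for it.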

Business User Developing New Data Products

Challenge 3: Marketing and Selling Data Products

Challenge 4: Discoverable Data Products
Categories
Descriptions
Definitions
Tags
Metadata: data catalog, business glossary

Challenge 5: Who Pays?
Data products are developed before they are requested
Data warehouse reports are paid in advance
Pay by the sip?
What if data products don't sell?

Challenge 6: Sizing of the Architecture
How many users?
How many reports?
How much data?
Virtual implementation
Cloud
Scaling up and down

Challenge 7: Organization
Developers need input from the business
Developers need to understand the business
Current and future needs
BICC not a cost center anymore
The need for commercially oriented people