Making Data Integration Easy For Multiplatform Data Architectures With Diyotta 4.0
WEBINAR, MAY 15th, 2018, 1PM EST / 10AM PST
Welcome and Logistics
- If you have problems with the sound on your computer, switch to phone dial-in.
- Questions will be answered at the end of the presentation.
- Throughout the presentation you can submit questions through the Zoom control panel on your screen.
- For your convenience, the slides from the presentation and a link to the recorded webinar will be sent to you within 48 hours of the webinar.
Speakers
- Philip Russom, Ph.D., Senior Research Director for Data Management, TDWI
- Ravindra Punuru, CTO, Diyotta
Diyotta 4.0 announcement
Diyotta 4.0 Integrates Cloud and Streaming Data for Multiplatform Data Architecture
Latest Release of Diyotta Enables Enterprise-Class Data Integration for Spark Streaming and Cloud-Based Data Warehouses Snowflake, Amazon Redshift, and Google BigQuery
2018 Diyotta Inc. All Rights Reserved.
Challenges in Data Integration with The Multiplatform Data Architecture Philip Russom, Ph.D. Senior Research Director, TDWI May 15, 2018
AGENDA
Integrating Data Across Multiplatform Data Architecture (MDA)
- Background
- Recap on Multiplatform Data Architectures
- MDA Reference Architecture
- Real-World Examples
- Critical Success Factors for MDA
- Solution Pattern for Data Integration across MDAs
- Conclusions
#TDWI @prussom
Background: Increasing Complexity
Rising complexity of data
- Eclectic mix of old and new data; every structure imaginable
- Generated and integrated, from batch to real time
- Traditional data from enterprise apps, web, third parties
- New sources of data from machines, social media, IoT
Rising complexity of data management solutions
- Mix of home-grown, vendor-built, and open source
- Multiplatform architectures; distributed and heterogeneous; on premises or on cloud; from relational to Hadoop
Complex and diverse in the extreme, the result is: Multiplatform Data Architecture (MDA)
DEFINITION: Multiplatform Data Architecture (MDA)
Numerous, diverse data platform types
- Traditional relational database management systems (RDBMSs)
- Newer DBMSs, based on clouds, columns, appliances, graph analytics, NoSQL, etc.
- Hadoop and its ecosystem
- Other file systems
Diversity isn't new, but the intensity is. Architecture can help with the complexity.
MDA Reference Architecture: Data Warehouse
Data/Application Integration and Metadata Management Infrastructure
Data Views: Logical, Virtual, Federated / Cross-Platform Operations: Data Flow, Query, Sync, Analytics
New Data
- Machine Data: sensors, vehicles, handheld devices, shipping pallets
- Web Data: server logs, social media, ecommerce
Traditional Data
- CRM, SFA, ERP
- Financials, billing, call center, supply chain
Many Ingestion Methods
- Ingestion Zones: Landing and Staging; ETL/ELT; Stream Capture and Event Processing
Data Lake on Hadoop
- Analytic Zones: Exploration & Data Prep; Set-based & Algorithmic Analytics; Sandboxes
- Archive Zones: Infrequently Used Data / Live Archive; Expired Data per Compliance Rules
- Functional Zones: Marketing, Sales, Financials; Healthcare, Manufacturing; Sync with operational apps
Many Delivery Methods
- Data Warehouse: Dimensions, cubes, subject areas, time series, metrics, aggregates...; Trusted data for standard reports
- Specialized DBMSs: Based on columns, appliances, clouds, analytics, graph
DIVERSE PLATFORMS: Web, Client/Server, Storage, Clusters, Racks, Grids, Clouds, Hybrid Combinations
Most data warehouses are now multiplatform data architectures.
Monolith was the norm in the 90s; now rare. Multiplatform hybrid is the new norm.
[Chart: data warehouse architecture types, by share of respondents]
- Central monolithic EDW with no other data platforms
- Central EDW with a few additional data platforms
- Central EDW with many additional data platforms
- Many workload-specific data platforms with non-central EDW
- No true EDW, but many workload-specific data platforms instead
- Other (2%)
Percentages shown for the five named categories: 15%, 37%, 16%, 15%, 15%
Source: 2014 TDWI report "Evolving Data Warehouse Architectures." Based on 538 respondents.
MDA is not...
Not a big-bang enterprise information model
- That's too large, intrusive, time-consuming, risky
- MDA is for solutions, between local and enterprise scope
Not a mere portfolio of platforms and tools
- Although the portfolio affects physical distribution of data
- MDA is more about relations among platforms and the datasets they manage, less about inventory
Not a mere technology stack
- MDA is more about relations among stack layers
Not only about data at rest
- Also data in motion, e.g. from streaming sources
- Also data moving across platforms
REAL-WORLD EXAMPLES OF Multiplatform Data Architectures
Across Industries
- Multiplatform Data Warehouse Environments
- Omnichannel Marketing
- Digital Supply Chain
Vertical Specific
- Banking: International Banking
- Insurance: Claims, Fraud, Actuarials
- Telco: Real-Time Network Forecasting
Critical Success Factors for MDAs
MDA is created one thread at a time
- Threads weave together in a data fabric
- More patchwork quilt than seamless fabric
Threads can be many cross-platform things
- Substantial app and data integration infrastructure
- In-memory, pipelines, data flows, replication, messaging
- Data hubs, workflows, orchestration
- Shared data structures, development artifacts, standards
- Metadata, virtual/logical views, federated queries
Other critical success factors for MDA
- Portfolio management that encourages diverse data platforms
- Data architects and governors who foster threads that weave into architecture
Look for solutions that can:
- Minimize data integration infrastructure: unified, enterprise data integration across MDAs
- Maximize usage of MDAs for best-fit data processing (ELT) for data at rest and in motion
- Maximize reusability, shared artefacts and standards with enterprise-grade features
- Maximize visibility across diverse data hubs using centralized metadata
- Scale in any direction: horizontal, vertical, geographies, systems
End
Making Data Integration Easy For Multiplatform Data Architectures With Diyotta 4.0 Ravindra Punuru, CTO Diyotta
Agenda
- Data Integration on Multiplatform Architectures
- Diyotta 4.0 Features
- Diyotta Demo
What Enterprise Data Integration Looks Like Today
[Diagram labels]
Tools: Point Tools, Legacy Tools, One-off Tools (each repeated across the diagram)
Platforms: Traditional Data Repositories, Hadoop Ecosystem, Cloud Data Warehouses
Components: ODS, EDW, Marts, Legacy Data Store, Snapshots, Ingest/Store, Hive ELT, Spark ELT, Spark Streaming, Kafka, Ingest, Historical DBs, ELT, Distribute
Source systems: Operational Source Systems, Emerging Source Systems, Regional Source Systems
Sources: OLTP, Reference Data, Social Media, Streaming Sources, Regional DBs, Regional Files, External Systems, Web/Online Data, Devices Data, SaaS Sources, Regional DWs, External Data
Diyotta's Unified and Modular Approach
[Diagram labels]
DIYOTTA Controller
Legend: Diyotta-generated processing instructions; Diyotta-enabled data movement
Platforms: Traditional Data Repositories, Hadoop Ecosystem, Cloud Data Warehouses
Components: ODS, EDW, Marts, Legacy Data Store, Snapshots, Ingest/Store, Hive ELT, Spark ELT, Spark Streaming, Kafka, Data Lake, Ingest, Historical DBs, ELT, Distribute
Source systems: Operational Source Systems, Emerging Source Systems, Regional Source Systems
Sources: OLTP, Reference Data, Social Media, Streaming Sources, Regional DBs, Regional Files, External Systems, Web/Online Data, Devices Data, SaaS Sources, Regional DWs, External Data
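The idea of a central controller that generates processing instructions for each platform, so that the work runs where the data lives (ELT), can be sketched in a few lines. This is an illustrative sketch only, not Diyotta's API: the `EltStep` class, the `render_pushdown_sql` and `plan` functions, and all table and platform names are hypothetical, and the generated SQL is deliberately generic rather than dialect-specific.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EltStep:
    """One declarative step in a job flow (hypothetical structure)."""
    platform: str               # where the statement should run, e.g. "hive"
    source_table: str
    target_table: str
    columns: Tuple[str, ...]
    filter_sql: str = ""        # optional WHERE clause body


def render_pushdown_sql(step: EltStep) -> str:
    """Render a generic INSERT ... SELECT for the step's own platform."""
    cols = ", ".join(step.columns)
    sql = (
        f"INSERT INTO {step.target_table} ({cols}) "
        f"SELECT {cols} FROM {step.source_table}"
    )
    if step.filter_sql:
        sql += f" WHERE {step.filter_sql}"
    return sql


def plan(steps: List[EltStep]) -> List[Tuple[str, str]]:
    """Return (platform, instruction) pairs in job-flow order."""
    return [(s.platform, render_pushdown_sql(s)) for s in steps]


# Hypothetical two-platform flow: stage on Hive, then load a mart on Snowflake.
flow = [
    EltStep("hive", "raw.orders", "stage.orders",
            ("order_id", "amount"), "amount > 0"),
    EltStep("snowflake", "stage.orders", "mart.orders",
            ("order_id", "amount")),
]

for platform, instruction in plan(flow):
    print(f"[{platform}] {instruction}")
```

The controller in this sketch only produces text; each statement would be handed to the named platform for execution, which is what keeps transformation work off a separate integration server.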
Diyotta's single Job Flow manages multiple data platforms
- Hadoop
- Teradata
- Redshift
- Google BigQuery
- Snowflake
Diyotta 4.0 features
- Transform user experience with visual excellence: quicker response time, and increased design speed & flexibility.
- Expand your data fabric with cloud data warehouse: cloud data migration, cloud data integration, and unified on-prem and cloud data.
- Expand your data fabric with Lambda architecture: real-time stream data processing, batch data combined with data in motion, and real-time alerts & notifications.
Transform user experience with visual excellence, quicker response time, and increased design speed & flexibility
- Friendly user experience with new, modern user interface
- Faster response time with browser caching efficiency and compressed metadata transfer
- Increased design speed & flexibility with interactive dataflows and high-speed agent data access
Expand your data fabric with cloud data warehouse. Cloud data migration, cloud data integration, and unified on-prem and cloud data.
Expand your data fabric with Lambda architecture. Real-time stream data processing, batch data combined with data in motion, and real-time alerts & notifications.
[Diagram labels]
Diyotta Data Stream: Sources, Event Transformation, Sinks/Targets
Links between stream and batch: Batch data lookup, Batch flow trigger
Diyotta Batch Data Flow: Other Sources, Data Transformation, Other Targets, Others
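Combining batch data with data in motion can be illustrated with a small, framework-free sketch: each streaming event is enriched from a batch-loaded lookup table, and an alert is raised when a reading crosses a threshold. Everything here is an assumption made for illustration (the `enrich` and `process_stream` functions, the event fields, the lookup contents, and the threshold); it does not use or represent Diyotta or Spark Streaming APIs.

```python
from typing import Dict, Iterable, List, Tuple


def enrich(event: Dict, lookup: Dict[str, Dict]) -> Dict:
    """Join one streaming event with batch reference data (batch data lookup)."""
    ref = lookup.get(event["device_id"], {})
    return {**event, "region": ref.get("region", "unknown")}


def process_stream(
    events: Iterable[Dict],
    lookup: Dict[str, Dict],
    threshold: float,
) -> Tuple[List[Dict], List[str]]:
    """Enrich every event and collect alerts for readings above the threshold."""
    enriched: List[Dict] = []
    alerts: List[str] = []
    for event in events:
        row = enrich(event, lookup)
        enriched.append(row)
        if row["reading"] > threshold:
            alerts.append(
                f"ALERT {row['device_id']} ({row['region']}): "
                f"reading {row['reading']}"
            )
    return enriched, alerts


# Hypothetical batch reference data, as if loaded earlier by a batch flow.
device_lookup = {"d1": {"region": "east"}, "d2": {"region": "west"}}

# Hypothetical events arriving from a streaming source.
incoming = [
    {"device_id": "d1", "reading": 98.5},
    {"device_id": "d2", "reading": 41.0},
    {"device_id": "d9", "reading": 77.0},  # no reference row for this device
]

rows, alerts = process_stream(incoming, device_lookup, threshold=75.0)
for message in alerts:
    print(message)
```

In a real deployment the loop would be replaced by a streaming engine and the dictionary by a table maintained by the batch layer; the sketch only shows how the two paths meet at the enrichment step.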
Making Data Integration Easy For Multiplatform Data Architectures With Diyotta 4.0
- Minimize data integration infrastructure: unified, enterprise data integration across MDAs
- Maximize usage of MDAs for best-fit data processing (ELT) for data at rest and in motion
- Maximize reusability, shared artefacts and standards with enterprise-grade features
- Maximize visibility across diverse data hubs using centralized metadata
- Scale in any direction: horizontal, vertical, geographies, systems
Live Demo
- User experience: interactive design.
- Cloud warehouse support use case.
- Data stream use case.
Questions?
Resources
- Diyotta 4.0 Data Sheet: https://uploadsssl.webflow.com/5abbd6c80ca1b5830c921e17/5af05e946154a88cdfb3c4d9_diyotta%204.0%20datasheet.pdf
- Request trial: https://www.diyotta.com/request-trial
- Documentation: https://support.diyotta.com/docs
- Latest blogs on 4.0: https://www.diyotta.com/blog