Whitepaper. Solving Complex Hierarchical Data Integration Issues. What is Complex Data? Types of Data

Similar documents
Self-Service Data Preparation for Qlik. Cookbook Series Self-Service Data Preparation for Qlik

OLAP Introduction and Overview

MetaSuite : Advanced Data Integration And Extraction Software

Low Friction Data Warehousing WITH PERSPECTIVE ILM DATA GOVERNOR

Introduction to Customer Data Platforms

Full file at

Provide Real-Time Data To Financial Applications

Enterprise Directories and Security Management: Merging Technologies for More Control

FIVE BEST PRACTICES FOR ENSURING A SUCCESSFUL SQL SERVER MIGRATION

NatQuery General Questions & Answers

FINANCIAL REGULATORY REPORTING ACROSS AN EVOLVING SCHEMA

Deccansoft Software Services. SSIS Syllabus

SAS Data Integration Studio 3.3. User s Guide

Call: SAS BI Course Content:35-40hours

Data Management Glossary

Built for Speed: Comparing Panoply and Amazon Redshift Rendering Performance Utilizing Tableau Visualizations

Analytic Workspace Manager and Oracle OLAP 10g. An Oracle White Paper November 2004

Instant Data Warehousing with SAP data

Optimized Data Integration for the MSO Market

Teiid Designer User Guide 7.5.0

EDI XLS INVOICE HTML. The Top 10 Ways Data Preparation Helps Tableau Users XML

Xcelerated Business Insights (xbi): Going beyond business intelligence to drive information value

Shine a Light on Dark Data with Vertica Flex Tables

IDU0010 ERP,CRM ja DW süsteemid Loeng 5 DW concepts. Enn Õunapuu

Exploiting Key Answers from Your Data Warehouse Using SAS Enterprise Reporter Software

Data Interoperability An Introduction

CHAPTER 3 Implementation of Data warehouse in Data Mining

@Pentaho #BigDataWebSeries

Automated Netezza Migration to Big Data Open Source

EMC GREENPLUM MANAGEMENT ENABLED BY AGINITY WORKBENCH

Hybrid Data Platform

MD Link Integration MDI Solutions Limited

Crystal Reports. Overview. Contents. How to report off a Teradata Database

Data Virtualization and the API Ecosystem

BEYOND THE RDBMS: WORKING WITH RELATIONAL DATA IN MARKLOGIC

Lambda Architecture for Batch and Stream Processing. October 2018

REPORTING AND QUERY TOOLS AND APPLICATIONS

Data Integration and ETL with Oracle Warehouse Builder

Data Mining: Approach Towards The Accuracy Using Teradata!

Introduction to Federation Server

The Salesforce Migration Playbook

ER/Studio Enterprise Portal Evaluation Guide. Published: March 6, 2009

SAP Agile Data Preparation Simplify the Way You Shape Data PUBLIC

IBM Data Replication for Big Data

metamatrix enterprise data services platform

Data Warehouses Chapter 12. Class 10: Data Warehouses 1

Venezuela: Teléfonos: / Colombia: Teléfonos:

1. Attempt any two of the following: 10 a. State and justify the characteristics of a Data Warehouse with suitable examples.

Construction and Real Estate. Improve system performance and data security with SQL Server

Data Warehouse and Data Mining

TPC-DI. The First Industry Benchmark for Data Integration

Database Vs. Data Warehouse

Audience BI professionals BI developers

12 Minute Guide to Archival Search

Making MongoDB Accessible to All. Brody Messmer Product Owner DataDirect On-Premise Drivers Progress Software

UNLEASHING THE VALUE OF THE TERADATA UNIFIED DATA ARCHITECTURE WITH ALTERYX

Aggregating Knowledge in a Data Warehouse and Multidimensional Analysis

Tableau Metadata Model

Migrate from Netezza Workload Migration

Course Contents: 1 Business Objects Online Training

Building Self-Service BI Solutions with Power Query. Written By: Devin

Designing High-Performance Data Structures for MongoDB

Education Brochure. Education. Accelerate your path to business discovery. qlik.com

Customizing SAS Data Integration Studio to Generate CDISC Compliant SDTM 3.1 Domains

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda

INTRODUCTION. Chris Claterbos, Vlamis Software Solutions, Inc. REVIEW OF ARCHITECTURE

DotNetNuke. Easy to Use Extensible Highly Scalable

Data Warehousing. Adopted from Dr. Sanjay Gunasekaran

DC Area Business Objects Crystal User Group (DCABOCUG) Data Warehouse Architectures for Business Intelligence Reporting.

Teradata Aggregate Designer

Data warehouse architecture consists of the following interconnected layers:

Enterprise Data-warehouse (EDW) In Easy Steps

Architectural challenges for building a low latency, scalable multi-tenant data warehouse

Virtuoso Infotech Pvt. Ltd.

XML Documentation for Adobe Experience Manager

Satisfy the Business Using Db2 Web Query

Tamr Technical Whitepaper

Introduction to K2View Fabric

Pentaho 3.2 Data Integration

XML Gateway. Factsheet. J System Solutions. Version 1.1

StreamServe Persuasion SP5 StreamServe Connect for SAP - Business Processes

MOC 20463C: Implementing a Data Warehouse with Microsoft SQL Server

Caché and Data Management in the Financial Services Industry

Chapter 6 VIDEO CASES

360SCIENCE. use case. Customer Data Matching for Business Intelligence and Finance M&A

MCSA SQL SERVER 2012

Informatica Power Center 10.1 Developer Training

XML Systems & Benchmarks

Sage SQL Gateway Installation and Reference Guide

DEV-33: Get to Know Your Data Open Source Data Integration, Business Intelligence and more Marian Edu

Learning Alliance Corporation, Inc. For more info: go to

Data integration made easy with Talend Open Studio for Data Integration. Dimitar Zahariev BI / DI Consultant

Salesforce Classic Mobile Implementation Guide

ETL Testing Concepts:

What is a multi-model database and why use it?

SIMULATING SECURE DATA EXTRACTION IN EXTRACTION TRANSFORMATION LOADING (ETL) PROCESSES

Fast Innovation requires Fast IT

Salesforce Classic Mobile Implementation Guide

How to Accelerate Merger and Acquisition Synergies

Jitterbit is comprised of two components: Jitterbit Integration Environment

Transcription:

Whitepaper Solving Complex Hierarchical Data Integration Issues What is Complex Data? Historically, data integration and warehousing has consisted of flat or structured data that typically comes from structured operational applications and databases. Data integration today is becoming a much more complex undertaking that involves a variety of platforms, formats, and data sources. Incoming data often has hierarchical structures that must be parsed into a common format, validated, and loaded into internal databases. And for outbound data the challenges are even greater because data from a number of disparate sources such as databases, files, and services must be combined in a structure that complies with established standards and partners requirements. Data integration solutions are being developed to acquire, merge, and transport data in a myriad of formats. Integrated data is loaded into target databases and applications inside the organization or packaged into files and documents for exchange with other organizations. Growth has expanded both analytic data integration in the form of extract, transform, and load [ETL] for data warehousing and operational data integration for consolidation, migration, or synchronization of operational databases. Despite the development of sophisticated data integration solutions in recent years, few adequately address the integration and management of complex hierarchical data. Types of Data Structured Data - Simple tabular data structures from relational database management systems, tables, and flat files in record format. Unstructured Data - Documents of mostly natural language text, like word-processing files, email, and text fields from databases and applications. Some documents may have light metadata, such as spreadsheets and RSS feeds. Semi-structured Data - Data documents exchanged between organizations often combine unstructured and structured data, or, such as in XML, test that has been structured with metadata tags. Semi-structured data documents often comply with open data exchange standards such as SWIFT, NACHA, HIPAA, HL7, RosettaNet, and EDI.

The data processing capabilities of most data integration solutions addresses structured data sources like tabular data from relational database management systems (RDBMSs) and flat files in record format. Unfortunately, much of an organization s data these days is not structured in a tabular way, limiting the benefits that are derived from its data gathering efforts to tabular data from structured sources, even though similar value and benefits could also be achieved by embracing the complex data found in unstructured, semistructured, and hierarchical data sources. As data integration practices and technologies expand to embrace complex data, data integration solutions must grapple with two tasks that are new to most data integration specialists: integrating data from complex and nontraditional sources and assuring the quality of data drawn from those sources. Centerprise Data Integrator for Complex Hierarchical Data Integration Complex Data Integration for Business Users Building hierarchical documents from relational or flat data is a complex undertaking. It is usually accomplished using hand-coded programs developed by IT professionals that are time consuming to write, expensive to maintain, and don t always meet the needs of end users. Businesses are beginning to recognize that business analysts and those who make business decisions should be developing data integration processes to meet their guidelines rather than handing off to developers whose expertise is in perfecting systems but not in how the data from the systems is used. Approachable Data Integration The good news is that there is a powerful yet affordable complex data integration software solution on the market that doesn t require difficult and expensive IT coding processes. Astera s Centerprise Data Integrator has been developed from the ground up specifically so business users can solve their complex data integration issues with XML, electronic data interchange (EDI), web services, etc. first hand. The intuitive, drag-and-drop environment enables non-developers to quickly and easily parse and construct hierarchical structures and manage their complex integration jobs faster and more affordably. Centerprise approachable data integration significantly lowers the need for IT resources and empowers those who use the data to have a say in how it is processed. Working with Hierarchical Data The key to successful management of complex, hierarchical data is superior data mapping functionality. Data mapping is very difficult and must be done correctly in order to deliver the desired results. New technologies have been created in Centerprise that simplify and automate the complexities of data mapping. You can join disparate data sources to build complex trees, automatically build hierarchies from relational databases, and apply transformations to specific tree nodes. With these unique capabilities, users can rapidly create powerful transformations for hierarchical-to-flat, flat-to-hierarchical, and hierarchical-to-hierarchical situations. Single, integrated platform Centerprise provides an integrated and consistent approach for working with both flat and hierarchical files. Because everything is done in a single environment, users working with hierarchical data use the same familiar toolset they use when working with flat structures. All data parsing and structure building is done in the same map, with no need for coding or scripting.

Special components and modes Technologies have been created specifically to provide capabilities for joining disparate data sources to build complex trees, automatically building hierarchies from relational databases, and applying transformations to specific tree nodes. This last item applying transformations to tree nodes enables users to apply common Centerprise transformations such as Aggregate, Filter, Join, Distinct, Sort, Normalize, Denormalize, and others to specific nodes in a hierarchical structure. With this unique technology, users can rapidly create powerful transformations for hierarchical-to-flat, flat-to-hierarchical, and hierarchical-to-hierarchical situations. Hierarchical formats and databases Centerprise supports most hierarchical source formats from databases, files, and web services. Database hierarchies include data models and visual joins. Data models are easy to create in Centerprise for joins that happen in the database. A new and more flexible way to create a data model is through a visual join. Typical hierarchical file formats include COBOL, EDI, XML, reports/pdf files, delimited, and fixed length. EDI - Electronic Data Interchange Today s agile business relies on EDI technology to automatically exchange business documents from different sources in a standard electronic format, thus streamlining operational costs, increasing processing speed, reducing errors, and improving relationships with business partners. Centerprise provides complete bidirectional EDI mapping functionality for creating customized transaction structures, parsing and validating incoming documents, building outgoing documents, and creating outgoing EDI files, including HIPAA-compliant documents. With Centerprise EDI you can: Create custom EDI formats using the Centerprise EDI Message Designer Parse and validate incoming EDI files. Validations include data type, lookup codes, message, functional group and interchange integrity, and conditionals. Error information can also be mapped to create customized outputs. Send and receive messages in all current and past versions of the X12 standard. Additional dialects will be available in the third quarter of 2014. Build standards and partners compliant outbound EDI files Web Services Input and output payloads are almost always hierarchical. Centerprise supports web services such as SOAP and REST, as well as XML and JSON. The software also provides a gateway for countless consumer and enterprise APIs. Centerprise XML functionality can be used to parse incoming documents as well as build outgoing documents. Key functionalities include: Use an existing schema or build one by providing a sample XML or JSON file High-performance reading and writing capabilities able to work efficiently with very large data files Quickly build complex outgoing XML documents using multi-table queries, tree joins, scoped transformations, and other features Custom Hierarchical Files Centerprise also enables parsing and creation of custom hierarchical files. This feature can be used to process hierarchical delimited files produced by many systems.

Centerprise Object Model The Centerprise object model now includes a layout tree to organize referenced objects versus collections, as well as elements. Data preview is different for hierarchical data than for flat files and introduces the concept of anchors and cardinality rules. Hierarchical Transformations Centerprise brings an innovative approach to building complex hierarchical documents that uses a combination of multi-table queries, tree-joins, lookups, and sort, aggregate, and other transformations to progressively build output using a visual, stepwise method. At each step, you can use Centerprise s single-click preview to see data in at that step, making it easy to identify and correct any mapping errors. Centerprise not only provides the functionality to parse incoming EDI, XML, and custom hierarchical files, it also offers an extensive array of unique transformations that can be used to transform files into hierarchical or flat structures, including Filter, Sort, Route, Union, Tree Join, and Pass-through. Loading hierarchical data into a database requires sequencing of inserts or updates to ensure that any referential integrity constraints are not violated. For instance, a purchase order must be inserted before line items can be inserted. Centerprise features the ability to automatically sequences database writes based on referential constraints. Supplemental Transformations Centerprise provides some unique transformations that are very useful when dealing with trees, including FLWOR, Scoped Transformations, Aggregate Transformations, and text processors. FLWOR (FOR, LET, WHERE, ORDER BY, RETURN) must be scoped and allows for querying the tree for a specific node. Scoped Transformations are created with a Scope Map and are a must when dealing with sub-trees. You virtually attach a new sibling node to the existing tree. Aggregate Transformations pull aggregate data from sub-collections and must be used with a scope Text processors deal with two formats at once and are ideal for scenarios where database columns contain structured text (i.e., xml) Common Scenarios Want to learn more? We recently presented a one-hour webinar, Complex Data Integration in Centerprise, which is available for viewing in our Astera.TV video library. Along with demos of all the technologies discussed in this blog, the video contains several great common scenario demos at the end: Creating hierarchies from multiple flat sources Flattening hierarchies Writing to relational tables One hierarchy to another

Centerprise Solutions Data Warehousing High-performance data warehousing ETL features in a unified, intuitive environment. CDC Choose either a batch or real-time change data capture strategy for your particular requirements. Data Mapping Transforms advanced mapping, validating, and cleansing tasks into basic drag-and-drop or single-click commands. ETL Extract data from any source, transform it to suit your needs, and load it into your database or warehouse. Data Conversion Complex doesn't need to be complicated. Visual, code-free parsing, transforming, and loading of data from any source. Data Integration A single platform for complex, hierarchical integration the requires no coding. Data Migration Unique hierarchical data processing technologies automate and streamline data migration projects. EDI Full electronic data interchange functionality combined with Centerprise complex data mapping capabilities. www.astera.com Contact us for more information or to request a free trial sales @astera.com 8888-77-ASTERA Copyright 2014 Astera Software Incorporated. All rights reserved. Astera and Centerprise are registered trademarks of Astera Software Incorporated in the United States and / or other countries. Other marks are the property of their respective owners.