Introduction to Federation Server
Alex Lee
IBM Information Integration Solutions
Manager of Technical Presales, Asia Pacific
2006 IBM Corporation
WebSphere Federation Server
  - Federation overview
  - Tooling support
  - Case studies
  - Summary and references
What if you could
  - Access data anywhere in your enterprise
      - No matter where it resides
      - Regardless of what format it is in
      - Regardless of vendor
  - Without creating new databases and without disruptive changes to existing ones
  - Using standard SQL and any tool that supports JDBC/ODBC
  - While looking to the user like a single database
  [Figure: BI tools, business analysis, management reports]
Then you could
  - Produce information needed by the organization faster
      - Melbourne Health built the world's first solution to access public medical history; discoveries were made days after implementation, versus months with the prior process
  - Improve the productivity of your people
      - Taikang Life saved 90% in people costs to compile real-time reports (1 person instead of 10)
  - Reduce business process costs
      - Neckermann reduced labor costs equivalent to 5 full-time employees per year
      - Pioneer Display increased production efficiency by over 25%
  [Figure: BI tools, business analysis, management reports]
And without
  - Building new databases for data you already have stored in multiple places
  - Acquiring hardware and software infrastructure to support them
  - Keeping them up to date
  - Keeping them secure
  - Assuring their reliability and availability for the next 5-7 years
  [Figure: BI tools, business analysis, management reports]
What is Federation?
  Federation is an integration pattern that allows a collection of resources to be viewed and manipulated as if they were a single resource while retaining their autonomy and integrity. It is the technology on which EII is based.
IBM WebSphere Information Server
Delivering information you can trust
  [Figure: Information Server with Information Services Director; four functions and their products: Understand (Information Analyzer), Cleanse (QualityStage), Transform (DataStage), Federate (Federation Server); shared Metadata Server, parallel processing, and rich connectivity to applications, data, and content]
Data Federation
  - Transparent
      - Appears to be one source
      - Independent of how and where data is stored
      - Applications continue to work despite any change in how data is stored
  - Heterogeneous
      - Accesses data from diverse sources
      - Relational, structured, XML, messages, Web, ...
  - Extensible
      - Bring together almost any data source
      - Wrapper Development Toolkit
  - High Function
      - Full query support against all data
      - Capabilities of sources as well
  - Autonomous
      - Non-disruptive to data sources, existing applications, systems
  - High Performance
      - Optimization of distributed queries
Virtualized Information Access
  - Access diverse and distributed information as if it were in one system
  - Single sign on
  - Unified views
  - Common language
  - Web services or Java API
  - Query and update
  - Optimized access
  [Figure: access interfaces (SQL, SQL/XML, content) through Federation Server, Classic Federation Server for z/OS, and II Content Edition to these sources: mainframe databases, mainframe files, relational databases, XML, Web services, packaged applications, Web, collaboration systems, non-relational sources, content repositories and imaging systems, workflow systems]
Federated Sources
  Access layers: Federation Server and Classic Federation Server for z/OS (SQL), II Content Edition (content)
  - Mainframe databases: IMS, Adabas, CA-Datacom, CA-IDMS
  - Mainframe files: VSAM, Sequential
  - Relational databases: DB2, Informix, Oracle, Sybase, Teradata, Microsoft SQL Server, ODBC
  - XML
  - Web services
  - Packaged applications (WebSphere BI Adaptors): SAP, PeopleSoft, Siebel
  - Other: OLE DB, Excel, flat files, life sciences, custom-built
  - Web and collaboration systems: Lotus Notes, Microsoft Index Server, IBM Lotus Extended Search, Sametime, QuickPlace, Microsoft Exchange
  - Content and imaging: DB2 CM Family, Domino.doc, Documentum, FileNet, Open Text, Stellent, Interwoven, Hummingbird
  - Workflow systems: WebSphere, FileNet
  Plus partner tools and custom-built connectors extend access to more sources
Data Federation Approach
  - Incorporate data sources using wrappers
      - Access to a particular class of data sources or protocols
      - Contains information about data source characteristics
  - High-function relational wrappers from IBM
      - Read/write access
  - Clean, simple interface for nonrelational wrappers
      - Written by IBM, third parties, customers
      - Read only
  [Figure: federated views defined over an optional local table and nicknames; the nicknames reach remote data sources 1, 2, and 3 through server definitions (Server1, Server2, Server3) and wrappers (Wrapper A, Wrapper B)]
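The wrapper and nickname idea on this slide can be illustrated with a small Python sketch. It is only an analogy under stated assumptions: the class names (SqliteWrapper, CsvWrapper, FederatedCatalog) and the scan method are hypothetical and are not the WebSphere Federation Server API, and an in-memory SQLite database and a CSV string stand in for remote sources.

```python
import csv
import io
import sqlite3


class SqliteWrapper:
    """Hypothetical wrapper for one class of source: SQLite databases."""

    def __init__(self, conn):
        self.conn = conn

    def scan(self, remote_name):
        # Return every row of the remote table as a dict keyed by column name.
        cur = self.conn.execute(f'SELECT * FROM "{remote_name}"')
        cols = [d[0] for d in cur.description]
        return [dict(zip(cols, row)) for row in cur]


class CsvWrapper:
    """Hypothetical read-only wrapper for a nonrelational source: CSV text."""

    def __init__(self, files):
        self.files = files  # maps a file name to its CSV content

    def scan(self, remote_name):
        return [dict(r) for r in csv.DictReader(io.StringIO(self.files[remote_name]))]


class FederatedCatalog:
    """Maps local nicknames to (wrapper, remote object) pairs."""

    def __init__(self):
        self.nicknames = {}

    def create_nickname(self, nickname, wrapper, remote_name):
        self.nicknames[nickname] = (wrapper, remote_name)

    def scan(self, nickname):
        wrapper, remote_name = self.nicknames[nickname]
        return wrapper.scan(remote_name)


# Stand-in remote sources (illustrative data only).
remote_db = sqlite3.connect(":memory:")
remote_db.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
remote_db.execute("INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex')")
csv_files = {"orders.csv": "cust_id,amount\n1,100\n2,250\n"}

catalog = FederatedCatalog()
catalog.create_nickname("CUST", SqliteWrapper(remote_db), "customers")
catalog.create_nickname("ORDERS", CsvWrapper(csv_files), "orders.csv")

customers = catalog.scan("CUST")
orders = catalog.scan("ORDERS")
```

The caller only ever names CUST and ORDERS; which wrapper serves each nickname, and where the data lives, stays inside the catalog. Note that the CSV wrapper returns every value as a string, which hints at the type mapping a real wrapper has to describe.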
Data Federation Approach
  - Powerful query processing engine in federated server
      - Decomposes, rewrites and distributes queries
      - Cost-based optimizer chooses query plan with pushdown as appropriate
      - Query execution engine drives wrappers, combines results
  - Compensates for missing function in data source
  - Invokes functions at remote sources as needed
  [Figure: federated server with the DB2 cost-based optimizer producing local + remote execution plans; a relational wrapper and a nonrelational (non-SQL) wrapper, each with a client library, behind nicknames and a local table]
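Pushdown and compensation can be sketched in a few lines of Python. This is a toy under stated assumptions, not the DB2 optimizer: there is no cost model, the decision is a simple capability flag, and the names RelationalSource, FlatFileSource, supports_filter, and federated_query are hypothetical.

```python
import sqlite3


class RelationalSource:
    """Stand-in for a source behind a relational wrapper: it can filter remotely."""

    supports_filter = True

    def __init__(self, rows):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
        self.conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

    def fetch(self, min_amount=None):
        if min_amount is None:
            return self.conn.execute("SELECT region, amount FROM sales").fetchall()
        # Pushdown: the predicate is evaluated at the source.
        return self.conn.execute(
            "SELECT region, amount FROM sales WHERE amount >= ?", (min_amount,)
        ).fetchall()


class FlatFileSource:
    """Stand-in for a non-SQL source: it can only return every record."""

    supports_filter = False

    def __init__(self, rows):
        self.rows = list(rows)

    def fetch(self, min_amount=None):
        return list(self.rows)


def federated_query(sources, min_amount):
    """Return (matching rows, number of rows shipped from the sources)."""
    result, shipped = [], 0
    for source in sources:
        if source.supports_filter:
            rows = source.fetch(min_amount)
            shipped += len(rows)
        else:
            rows = source.fetch()
            shipped += len(rows)
            # Compensation: apply locally the filter the source cannot run.
            rows = [row for row in rows if row[1] >= min_amount]
        result.extend(rows)
    return sorted(result), shipped


relational = RelationalSource([("east", 50), ("east", 500), ("west", 700)])
flat_file = FlatFileSource([("north", 20), ("north", 900)])
rows, shipped = federated_query([relational, flat_file], min_amount=100)
```

Five rows exist across the two sources, but only four are shipped: the relational source filters before sending, while the flat file sends everything and the federated side filters afterwards.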
Tools for modeling
  - Visualize and define mappings between remote schema and federated schema
  - Generate federated schema based on transformations and joins
      - Nicknames
      - Views
  - Simplify creation of virtual schemas
Administration Tools
  - Control Center
      - Tools to configure and administer standard wrappers
      - Plug-in architecture allows custom wrappers to be administered
Tools help manage the complexity
  - Configuration wizard
      - Guides you through federation configuration process
  - Discovery
      - Server discovery: automatically discovers and configures external servers
      - Nickname discovery
  - Deploy
      - Capture configuration to a script, save and deploy
      - Facilitates cloning system configuration for horizontal scaling...
WFS tools help manage the complexity
  - Health Monitoring
      - Monitor health of servers, nicknames that affect configuration
  - Statistics refresh
      - Refresh nickname statistics on demand or by scheduled task
  - Snapshot Monitoring
      - Snapshots are useful for determining the status of a database system
  - Event Monitoring
      - Collect information about the database and any connected applications when specified events occur
When to use Federation
  - Too big: data from multiple sources is just too big to integrate on a permanent basis
  - Too ad hoc: data is too varied and unpredictable to make an ETL process worthwhile
  - Too proprietary: data is owned by disparate entities/organizations that do not want to support ongoing ETL processes
  - Too recent: data from multiple sources is required that must be current or must not be updated while being read
  - Application or tool does not support native access to the sources being accessed
Providing on-demand relational access to multiple types of data
  Requirements
  - Warehouse users need access to up-to-the-minute data from an external source
  - External data may not be integrated into the warehouse until later (or never) because it is not practical or possible/allowed
  Solution
  - Access remote data via Federation Server nicknames and combine with DW data
  - Also enables prototyping of ETL development
  [Figure: client connects to Federation Server, which reaches the data warehouse, flat files or spreadsheets, ODS, and external Web services]
Enabling transparent drill-through to detail data from summary data
  Requirements
  - Users query a summary warehouse that is fed from a detail warehouse by an ETL process
  - Most queries are satisfied from the summary alone; some need to retrieve detail data after initial filtering by the summary
  - Be able to retrieve detail on demand within the context of the summary query without making a new connection
  Solution
  - Extract, transform and load relevant source data into summary and detail databases using bulk data movement
  - Users query summary data using existing application
  - Detail data visible on demand from summary warehouse using federation
  [Figure: client requests data from the summary database; Federation Server requests data from the detail database; DataStage moves bulk data into both databases]
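The drill-through pattern can be approximated with Python's sqlite3 module. This is a sketch under stated assumptions: SQLite's ATTACH DATABASE stands in for a federated link to the detail database, an INSERT ... SELECT stands in for the DataStage bulk load, and the table and column names (sales, store_totals, store, item, amount) are made up for illustration.

```python
import sqlite3

# One connection plays the summary warehouse; the attached database plays
# the detail warehouse that federation would make visible on demand.
summary = sqlite3.connect(":memory:")
summary.execute("ATTACH DATABASE ':memory:' AS detail")

summary.execute("CREATE TABLE detail.sales (store TEXT, item TEXT, amount REAL)")
summary.executemany(
    "INSERT INTO detail.sales VALUES (?, ?, ?)",
    [("A", "pen", 5.0), ("A", "desk", 300.0), ("B", "pen", 4.0), ("B", "ink", 6.0)],
)

# Stand-in for the ETL step that feeds the summary from the detail data.
summary.execute("CREATE TABLE store_totals (store TEXT, total REAL)")
summary.execute(
    "INSERT INTO store_totals "
    "SELECT store, SUM(amount) FROM detail.sales GROUP BY store"
)

# Most queries stop at the summary level.
big_stores = summary.execute(
    "SELECT store, total FROM store_totals WHERE total > ? ORDER BY store", (100,)
).fetchall()

# Drill-through: filter on the summary, fetch detail on the same connection.
drill_rows = summary.execute(
    "SELECT d.store, d.item, d.amount "
    "FROM store_totals AS s JOIN detail.sales AS d ON d.store = s.store "
    "WHERE s.total > ? ORDER BY d.item",
    (100,),
).fetchall()
```

The summary filter narrows the request to store A before any detail rows are read, and the application never opens a second connection.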
Unified view of regionally distributed data with same data model
  Requirements
  - Several regional databases with similar logical data models, but unique data
  - Application needs to see the data as one large database with a single schema
  - Impractical to physically consolidate data
  Solution
  - Access relevant remote tables via Federation Server nicknames
  - Connect matching nicknames from different sources via a UNION ALL view
  - Can optionally cache common data at the federated server or create local aggregates
  [Figure: client connects to Federation Server, which reaches regional databases in Seattle, Phoenix, and San Jose]
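The UNION ALL view over matching regional tables can be tried out with Python's sqlite3 module. This is a minimal sketch, assuming SQLite attached databases as stand-ins for Federation Server nicknames; the orders table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# Autocommit mode, because SQLite refuses ATTACH inside an open transaction.
fed = sqlite3.connect(":memory:", isolation_level=None)

# Illustrative data: (order_id, amount) per regional database.
regions = {
    "seattle": [(1, 120.0), (2, 80.0)],
    "phoenix": [(1, 200.0)],
    "san_jose": [(1, 50.0), (2, 75.0), (3, 25.0)],
}

for name, rows in regions.items():
    # Each attached in-memory database plays one regional source.
    fed.execute(f"ATTACH DATABASE ':memory:' AS {name}")
    fed.execute(f"CREATE TABLE {name}.orders (order_id INTEGER, amount REAL)")
    fed.executemany(f"INSERT INTO {name}.orders VALUES (?, ?)", rows)

# One branch per region, tagged with its origin, combined with UNION ALL.
branches = " UNION ALL ".join(
    f"SELECT '{name}' AS region, order_id, amount FROM {name}.orders"
    for name in regions
)
# A TEMP view is used because only temporary views may span attached databases.
fed.execute(f"CREATE TEMP VIEW all_orders AS {branches}")

totals = fed.execute(
    "SELECT region, COUNT(*), SUM(amount) FROM all_orders "
    "GROUP BY region ORDER BY region"
).fetchall()
```

The application queries all_orders as if it were one table. The added region column keeps rows distinguishable even though order_id values repeat across regions, which is why UNION ALL (no duplicate elimination) is the natural choice here.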
Placement, Consolidation, and Access Choices
  ETL or replication preferred:
  - Centralized data needed for access performance or availability
  - Complex, multi-dimensional queries
  - Point-in-time consistency needed, e.g. close of business
  - Complex transformation needed for semantically consistent data
  Federation preferred:
  - Access performance and load on sources traded for overall lower cost
  - Queries returning small result sets among federated systems
  - Large volume data that is infrequently accessed
  - Data that changes rapidly
  - Business requirements demand current data
  - Data security, licensing or regulations restrict data movement
  - Unique functions must be accessed at data source
  - Data semantics consistent and managed across domains
  - Read/write access is required
Value of Federation
  - Speed time to market for new applications
      - Simplify and enrich portal development
      - Reduce hand-coding by half
      - Reduce skills requirements
      - Use familiar SQL programming model and existing tools
      - Build on a standards-based, strategic integration platform
  - Enhance value and insight from existing assets and applications
      - Work within your existing infrastructure
      - Extend existing warehouses
      - Combine existing data and content assets in new ways
      - Facilitate cross-divisional reporting
  - Increase control over IT costs
      - Reduce need to rip and replace
      - Reduce need to manage redundant data
References
  - Product information on WebSphere Federation Server:
    http://www.ibm.com/software/data/integration/federation_server
  - System and data source requirements:
    http://www.ibm.com/software/data/integration/federation_server/requirements.html
  - WebSphere Federation Server V9.1 infocenter:
    http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp
  Some whitepapers on federation technology:
  - IBM Federated Database Technology:
    http://www.ibm.com/developerworks/db2/library/techarticle/0203haas/0203haas.html
  - Two-part series on using data federation technology:
    http://www.ibm.com/developerworks/db2/library/techarticle/dm-0506lin/
    http://www.ibm.com/developerworks/db2/library/techarticle/dm-0507lin/
  - Maximizing the performance of WebSphere Information Integrator with MQTs:
    http://www.ibm.com/developerworks/db2/library/techarticle/dm-0605lin/
  - Use federated procedures in WebSphere Federation Server:
    http://www.ibm.com/developerworks/db2/library/techarticle/dm-0605bhatia