Using MDM Big Data Relationship Management to Perform the Match Process for MDM Multidomain Edition
Copyright Informatica LLC 1993, Informatica LLC. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica LLC. All other company and product names may be trade names or trademarks of their respective owners and/or copyrighted materials of such owners.
Abstract

You can use MDM Multidomain Edition to create and manage master data. The MDM Hub uses multiple batch processes, including the match process, to create master data. You can use MDM Big Data Relationship Management to perform the match process on a large data set. The MDM Hub can leverage the scalability of MDM Big Data Relationship Management to perform the match process on the initial data and the incremental data. This article describes how to use MDM Big Data Relationship Management to perform the match process for MDM Multidomain Edition.

Supported Versions

- MDM Big Data Relationship Management 10.0 HotFix 5 and HotFix 6
- PowerCenter HotFix 2 and HotFix 3
- PowerExchange for Hadoop HotFix 2 and HotFix 3
- MDM Multidomain Edition

Table of Contents

Overview
Match Process in the MDM Hub
Match Process with MDM Big Data Relationship Management
Scenario
Migrate Data from the Hub Store to HDFS
  Step 1. Import Source Definitions for the Base Objects in the Hub Store
  Step 2. Create a Mapping
  Step 3. Create an HDFS Connection
  Step 4. Run the Mapping
Create the Configuration and Matching Rules Files
Run the Extractor Utility
Run the Initial Clustering Job
  Output Files of the Initial Clustering Job
Migrate the Match-Pair Data from HDFS to the Hub Store
  Step 1. Create a Source Definition for the Match-Pair Output Data
  Step 2. Create a Mapping
  Step 3. Run the Mapping
Updating the Consolidation Indicator of the Matched Records
  Updating the Consolidation Indicator by Running an SQL Query
  Updating the Consolidation Indicator by Using the Designer and the Workflow Manager
Manage the Incremental Data
  Step 1. Migrate the Incremental Data from the Hub Store to HDFS
  Step 2. Run the Initial Clustering Job in the Incremental Mode
  Step 3. Migrate the Match-Pair Data from HDFS to the Hub Store
  Step 4. Update the Consolidation Indicator of the Matched Records
Overview

You can use MDM Multidomain Edition to create and manage master data. Before you create master data, you must load data into the MDM Hub and process it through multiple batch processes, such as the land, stage, load, tokenize, match, consolidate, and publish processes. The match process identifies duplicate records and determines the records that the MDM Hub can automatically consolidate and the records that you must manually review before consolidation. For more information about MDM Multidomain Edition and the batch processes, see the MDM Multidomain Edition documentation.

You can use MDM Big Data Relationship Management to perform the match process on a large data set in a Hadoop environment. The MDM Hub can leverage the scalability of MDM Big Data Relationship Management to perform the match process, and you can continue to use the MDM Hub to manage master data. For more information about MDM Big Data Relationship Management, see the Informatica MDM Big Data Relationship Management User Guide.

Match Process in the MDM Hub

You can run the Match job in the MDM Hub to match and merge the duplicate data that the job identifies.

The following image shows how the Match job functions:

The Match job performs the following tasks:
1. Reads data from the base objects in the Hub Store.
2. Generates match keys for the data in the base object and stores the match keys in the match key table associated with the base object.
3. Uses the match keys to identify the records that match.
4. Updates the Match table with references to the matched record pairs and related information.
5. Updates the consolidation indicator of the matched records in the base objects to 2. The indicator 2 indicates the QUEUED_FOR_MERGE state, which means that the records are ready for a merge.
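After a Match job completes, you can verify the outcome directly in the Hub Store. The following query is a sketch only: C_PARTY is a hypothetical base object name, but the CONSOLIDATION_IND column and the value 2 come from the description above:

    SELECT COUNT(*) AS queued_for_merge
    FROM   C_PARTY                    -- hypothetical base object name
    WHERE  CONSOLIDATION_IND = 2      -- 2 = QUEUED_FOR_MERGE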
Match Process with MDM Big Data Relationship Management

MDM Big Data Relationship Management reads data from HDFS, performs the match process, and creates the match-pair output in HDFS. Use PowerCenter with PowerExchange for Hadoop to migrate data from the Hub Store to HDFS and from HDFS to the Hub Store.

The following image shows how you can migrate data from the Hub Store to HDFS:

To perform the match process with MDM Big Data Relationship Management, perform the following tasks:
1. Use PowerCenter with PowerExchange for Hadoop to migrate data from the base objects in the Hub Store to HDFS in the fixed-width format.
2. Run the extractor utility of MDM Big Data Relationship Management to generate the configuration file and the matching rules file.
3. Run the initial clustering job of MDM Big Data Relationship Management, which performs the following tasks:
   a. Reads the input data from HDFS.
   b. Indexes and groups the input data based on the rules in the matching rules file.
   c. Matches the grouped records and links the matched records.
   d. Creates the match-pair output files in HDFS that contain a list of records and their matched records.
4. Use PowerCenter with PowerExchange for Hadoop to migrate the match-pair output data to the Match table in the Hub Store.
5. Run an SQL query, or use PowerCenter with PowerExchange for Hadoop, to update the consolidation indicator of the matched records in the base objects to 2. The indicator 2 indicates the QUEUED_FOR_MERGE state, which means that the records are ready for a merge.

Scenario

You work for a large healthcare organization that has a large data set as the initial data. You want to use MDM Multidomain Edition to create and manage master data. You also want to leverage the scalability of MDM Big Data Relationship Management to perform the match process.

Migrate Data from the Hub Store to HDFS

Use the PowerCenter Designer and the PowerCenter Workflow Manager to migrate data from the base objects in the Hub Store to HDFS.

1. Import source definitions for the base objects in the Hub Store.
2. Create a mapping with the source instances, a Joiner transformation, and a target definition.
3. Create an HDFS connection.
4. Run the mapping to migrate data from the base objects in the Hub Store to HDFS.

Step 1. Import Source Definitions for the Base Objects in the Hub Store

Use the Designer to import a relational source definition for the base object in the Hub Store. A relational source definition extracts data from a relational source.

If you want to integrate data from more than one base object, create a relational source definition for each base object. For example, if you have separate base objects for person names and addresses, create two relational source definitions: one for the person names and another for the addresses. You can use a Joiner transformation to integrate the person names and the address details.

The following image shows the source definitions for the person names and addresses base objects:

Step 2. Create a Mapping

Use the Designer to create a mapping. A mapping represents the data flow between the source and target instances.

1. Create a mapping and add the source definitions that you imported. When you add the source definitions to the mapping, the Designer adds a Source Qualifier transformation for each source definition.
2. Set the value of the Source Filter property to CONSOLIDATION_IND <> 9 for each Source Qualifier transformation. The filter migrates only the data with a consolidation indicator that is not equal to 9. A SQL sketch of the combined filter and join appears after the mapping steps.
   The following image shows how to configure the Source Filter property for a Source Qualifier transformation:
Note: You can also set additional conditions to configure filters for a match path. The filter of a match path includes or excludes records for matching based on the values of the columns that you specify.

3. Create a Joiner transformation. A Joiner transformation joins data from two or more base objects in the Hub Store. The Joiner transformation uses a condition that matches one or more pairs of columns between the base objects.
4. Link the ports between the Source Qualifier transformations and the Joiner transformation.
5. Configure a condition to integrate data from the base objects.
   The following image shows a condition based on which the Joiner transformation integrates data:
6. Create a target definition based on the Joiner transformation. When you create a target definition from a transformation, the type of target is the same as the source instance by default. You must update the writer type of the target definition to HDFS Flat File Writer in the Workflow Manager.
7. Link the ports between the Joiner transformation and the target definition.
   The following image shows a mapping that contains two source instances, a Joiner transformation, and a target definition:
8. Validate the mapping. The Designer marks a mapping as not valid when it detects errors.
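Taken together, the source filters and the join condition in this mapping behave like the following query. This is only an illustrative sketch: the base object names, the join column, and the selected columns are hypothetical, so substitute the names from your own schema:

    SELECT p.ROWID_OBJECT,
           p.FIRST_NAME,                        -- hypothetical person columns
           p.LAST_NAME,
           a.ADDRESS_LINE_1,                    -- hypothetical address columns
           a.CITY
    FROM   C_PARTY p                            -- hypothetical person base object
    JOIN   C_PARTY_ADDRESS a                    -- hypothetical address base object
           ON  p.ROWID_OBJECT = a.PARTY_ID      -- the Joiner condition from step 5
    WHERE  p.CONSOLIDATION_IND <> 9             -- the Source Filter from step 2
    AND    a.CONSOLIDATION_IND <> 9

The result set corresponds to the integrated rows that the workflow writes to the fixed-width flat file in HDFS.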
Step 3. Create an HDFS Connection

Use the Workflow Manager to create an HDFS connection. You can use the HDFS connection to access the Hadoop cluster.
Step 4. Run the Mapping

Use the Workflow Manager to run a mapping.

1. Create a workflow and configure the PowerCenter Integration Service for the workflow.
2. Create a Session task and select the mapping.
3. Edit the session to update the following properties:
   - Writer type of the target definition to HDFS Flat File Writer.
   - HDFS connection details for the target definition.
   The following image shows the properties of a target definition:
4. Start the workflow to integrate the source data and write the integrated data to the flat file in HDFS.

Create the Configuration and Matching Rules Files

You must create a configuration file and a matching rules file in the XML format to run the MapReduce jobs of MDM Big Data Relationship Management. A configuration file defines parameters related to the Hadoop distribution, repository, SSA-NAME3, input record layout, and metadata. A matching rules file defines indexes, searches, and matching rules for each index.

Use one of the following methods to create the configuration file and the matching rules file:
- Customize the MDMBDRMTemplateConfiguration.xml sample configuration file and the MDMBDRMMatchRuleTemplate.xml sample matching rules file located in the following directory: /usr/local/mdmbdrm-<Version Number>/sample
- Run the extractor utility to generate the configuration and matching rules files. Use one of the following utilities based on the MDM Big Data Relationship Management version:
  - On MDM Big Data Relationship Management 10.0 HotFix 5 or earlier: BDRMExtractConfig.jar
  - On MDM Big Data Relationship Management 10.0 HotFix 6 or later: config-tools.jar
Run the Extractor Utility

You can run the extractor utility to generate a configuration file and a matching rules file. The utility extracts the index, search, and matching rules definitions from the Hub Store and adds the definitions to the matching rules file. The utility also creates a configuration file to which you must manually add the parameters related to the Hadoop distribution, repository, input record layout, and metadata.

Use the following extractor utility based on the MDM Big Data Relationship Management version:
- On MDM Big Data Relationship Management 10.0 HotFix 5 or earlier: /usr/local/mdmbdrm-<Version Number>/bin/BDRMExtractConfig.jar
- On MDM Big Data Relationship Management 10.0 HotFix 6 or later: /usr/local/mdmbdrm-<Version Number>/bin/config-tools.jar

Use the following command to run the BDRMExtractConfig.jar extractor utility:

    java -Done-jar.class.path=<JDBC Driver Name> -jar <Extractor Utility Name> <Driver Class> <Connection String> <User Name> <Password> <Schema Name> <Rule Set Name>

Use the following command to run the config-tools.jar extractor utility:

    java -cp <JDBC Driver Name>:<Extractor Utility Name>:. com.informatica.mdmbde.driver.ExtractConfig <Driver Class> <Connection String> <User Name> <Password> <Schema Name> <Rule Set Name>

Use the following parameters to run the utility:

Extractor Utility Name
  Absolute path and file name of the extractor utility. Use one of the following values based on the MDM Big Data Relationship Management version:
  - On MDM Big Data Relationship Management 10.0 HotFix 5 or earlier: /usr/local/mdmbdrm-<Version Number>/bin/BDRMExtractConfig.jar
  - On MDM Big Data Relationship Management 10.0 HotFix 6 or later: /usr/local/mdmbdrm-<Version Number>/bin/config-tools.jar

JDBC Driver Name
  Absolute path and name of the JDBC driver. Use one of the following values based on the database that you use for the Hub Store:
  - Oracle: ojdbc6.jar
  - Microsoft SQL Server: sqljdbc.jar
  - IBM DB2: db2jcc.jar

Driver Class
  Driver class for the JDBC driver. Use one of the following values based on the database that you use for the Hub Store:
  - Oracle: oracle.jdbc.driver.OracleDriver
  - Microsoft SQL Server: com.microsoft.sqlserver.jdbc.SQLServerDriver
  - IBM DB2: com.ibm.db2.jcc.DB2Driver

Connection String
  Connection string to access the metadata in the Hub Store.
  Use the following format for the connection string based on the database that you use for the Hub Store:
  - Oracle: jdbc:informatica:oracle://<Host Name>:<Port>;SID=<Database Name>
  - Microsoft SQL Server: jdbc:informatica:sqlserver://<Host Name>:<Port>;DatabaseName=<Database Name>
  - IBM DB2: jdbc:informatica:db2://<Host Name>:<Port>;DatabaseName=<Database Name>
  Host Name indicates the name of the server on which you host the database, Port indicates the port through which the database listens, and Database Name indicates the name of the Operational Reference Store (ORS).

User Name
  User name to access the ORS.

Password
  Password for the user name to access the ORS.

Schema Name
  Name of the ORS.

Rule Set Name
  Name of the match rule set. The utility retrieves the match definitions from the rule set and writes the definitions to the matching rules file.

The following sample command runs the config-tools.jar extractor utility in an IBM DB2 environment:

    java -cp /root/db2jcc.jar:/usr/local/mdmbdrm-<Version Number>/bin/config-tools.jar:. com.informatica.mdmbde.driver.ExtractConfig com.ibm.db2.jcc.DB2Driver jdbc:db2://mdmauto10:50000/devut DQUSER Password1 DQUSER TP_Match_Rule_Set

The following sample command runs the BDRMExtractConfig.jar extractor utility in an IBM DB2 environment:

    java -Done-jar.class.path=/root/db2jcc.jar -jar /usr/local/mdmbdrm-<Version Number>/bin/BDRMExtractConfig.jar com.ibm.db2.jcc.DB2Driver jdbc:db2://mdmauto10:50000/devut DQUSER Password1 DQUSER TP_Match_Rule_Set

Run the Initial Clustering Job

An initial clustering job indexes and links the input data in HDFS and persists the indexed and linked data in HDFS. The initial clustering job also creates the match-pair output files in HDFS that contain a list of records and their matched records.

To run the initial clustering job, on the machine where you installed MDM Big Data Relationship Management, run the run_genclusters.sh script located in the following directory: /usr/local/mdmbdrm-<Version Number>

Use the following command to run the run_genclusters.sh script in the initial mode:

    run_genclusters.sh --config=configuration_file_name --input=input_file_in_hdfs [--reducer=number_of_reducers] --hdfsdir=working_directory_in_hdfs --rule=matching_rules_file_name [--matchinfo]
The following table describes the options and the arguments that you can specify to run the run_genclusters.sh script in the initial mode:

--config configuration_file_name
  Absolute path and file name of the configuration file that you create.

--input input_file_in_hdfs
  Absolute path to the input files in HDFS.

--reducer number_of_reducers
  Optional. Number of reducer jobs that you want to run to perform initial clustering. Default is 1.

--hdfsdir working_directory_in_hdfs
  Absolute path to a working directory in HDFS. The initial clustering job uses the working directory to store the output and library files.

--rule matching_rules_file_name
  Absolute path and file name of the matching rules file that you create.

--matchinfo
  Optional. Adds the match score against each matched record. Use the --matchinfo option only if you want to apply any post-match rules on the match-pair output data before you migrate the data to the Match table.

For example, the following command runs the initial clustering job in the initial mode:

    run_genclusters.sh --config=/usr/local/conf/config_big.xml --input=/usr/hdfs/source10million --reducer=16 --hdfsdir=/usr/hdfs/workingdir --rule=/usr/local/conf/matching_rules.xml --matchinfo

Output Files of the Initial Clustering Job

An initial clustering job indexes and links the input data in HDFS and persists the indexed and linked data in HDFS. The initial clustering job also creates the match-pair output files in HDFS that contain a list of records and their matched records. The format of the data in the match-pair output files is the same as the Match table in the Hub Store. You must migrate the data in the match-pair output files to the Match table in the Hub Store.

You can find the output files in the following directories:
- Linked records: <Working Directory of Initial Clustering Job in HDFS>/BDRMClusterGen/<Job ID>/output/dir/pass-join
- Match-pair output files: <Working Directory of Initial Clustering Job in HDFS>/BDRMClusterGen/<Job ID>/output/dir/match-pairs

Each initial clustering job generates a unique ID, and you can identify the job ID based on the time stamp of the <Job ID> directory. You can also use the output log of the initial clustering job to find the directory path of its output files.

Migrate the Match-Pair Data from HDFS to the Hub Store

Use the Designer and the Workflow Manager to migrate data from HDFS to the Match table in the Hub Store.

1. Import a source definition for the match-pair output file.
2. Create a mapping with the source instance, an Expression transformation, and a target definition.
3. Run the mapping to migrate the match-pair data to the Match table in the Hub Store.
Step 1. Create a Source Definition for the Match-Pair Output Data

Use the Designer to import a source definition for the match-pair output files. You can use a single source instance in the mapping even if you have multiple match-pair output files with the same columns. At session run time, you can concatenate all the match-pair output files.

Step 2. Create a Mapping

Use the Designer to create a mapping. A mapping represents the data flow between the source and target instances.

1. Create a mapping and add the source definition that you imported. When you add a source definition to the mapping, the Designer adds a Source Qualifier transformation for the source definition.
2. Create an Expression transformation.
3. Link the ports between the Source Qualifier transformation and the Expression transformation.
4. Set the date format of all the date columns in the source to the following format: dd-mon-yyyy hh24:mi:ss (a worked example of this format mask appears after this list).
   The following image shows how to set the date format for a date column:
5. Create a target definition based on the Expression transformation. When you create a target definition from a transformation, the target type is the same as the source instance by default. You must update the writer type of the target definition to Relational Writer in the Workflow Manager.
6. Link the ports between the Expression transformation and the target definition.
   The following image shows a mapping that migrates data from HDFS to the Match table in the Hub Store:
7. Validate the mapping. The Designer marks a mapping as not valid when it detects errors.
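The format mask in step 4 follows the Oracle date-format conventions: dd is the day, mon is the abbreviated month name, yyyy is the four-digit year, and hh24:mi:ss is the time on a 24-hour clock. As a sketch, a value such as 15-jan-2017 13:45:30 converts cleanly with Oracle's TO_DATE function:

    SELECT TO_DATE('15-jan-2017 13:45:30', 'dd-mon-yyyy hh24:mi:ss') AS sample_date
    FROM   DUAL   -- Oracle syntax; shown only to illustrate the format mask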
Step 3. Run the Mapping

Use the Workflow Manager to run a mapping.

1. Create a workflow and configure the Integration Service for the workflow.
2. Create a Session task and select the mapping.
3. Edit the session to update the writer type of the target definition to Relational Writer.
   The following image shows how to configure the writer type of the target definition:
4. If the match-pair output contains multiple files, perform the following steps:
   a. Create a parameter file with the .par extension and add the following entry to the file:

      [Global]
      $$Param_Filepath=<Match-Pair Output Directory>
      [Parameterization.WF:<Workflow Name>.ST:<Session Task Name>]

   Match-Pair Output Directory indicates the absolute path to the directory that contains the match-pair output, Workflow Name indicates the name of the workflow, and Session Task Name indicates the name of the Session task that you create. For example:

      [Global]
      $$Param_Filepath=/usr/hdfs/workingdir/BDRMClusterGen/MDMBDRM_<Job ID>/output/dir/match-pairs
      [Parameterization.WF:wf_Mapping2_write_match_pairs_MTCH.ST:s_Mapping1a_write_match_pairs_MTCH]

   Create the parameter file to use a single source instance for all the match-pair output files.
   b. Edit the session to update the following properties for the parameter file:
      - Absolute path and file name of the parameter file that you create.
        The following image shows how to specify the absolute path and the file name of the parameter file:
      - File Path property set to $$Param_Filepath for the source qualifier.
        The following image shows how to set the File Path property for a source qualifier:
5. Define the $$Param_Filepath variable for the workflow.
   The following image shows how to define a variable that you use in the workflow:
6. Start the workflow to migrate the match-pair output data in HDFS to the Match table in the Hub Store.
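After the workflow completes, you can spot-check the migrated pairs in the Hub Store. These queries are a sketch: C_PARTY_MTCH is a hypothetical Match table name, while ROWID_OBJECT and ROWID_OBJECT_MATCHED are the columns that the consolidation steps below rely on:

    SELECT COUNT(*) AS migrated_pairs
    FROM   C_PARTY_MTCH;              -- hypothetical Match table name

    SELECT ROWID_OBJECT, ROWID_OBJECT_MATCHED
    FROM   C_PARTY_MTCH;              -- lists each record with its matched record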
Updating the Consolidation Indicator of the Matched Records

In the MDM Hub, the Match job loads the matched record pairs into the Match table and sets the consolidation indicator of the matched records to 2 in the base objects. The indicator 2 indicates the QUEUED_FOR_MERGE state, which means that the records are ready for a merge. Similarly, after you migrate the match-pair output data from HDFS to the Hub Store, you must update the consolidation indicator of the matched records in the base objects to 2.

To update the consolidation indicator of the matched records in the base objects, use one of the following methods:
- Run an SQL query.
- Use the Designer and the Workflow Manager.

For more information about the consolidation indicator, see the Informatica MDM Multidomain Edition Configuration Guide.

Updating the Consolidation Indicator by Running an SQL Query

You can run an SQL query to update the consolidation indicator of the matched records to 2 in the base objects. Use the following SQL query to update the consolidation indicator in the base objects:

    Update <Base Object> set consolidation_ind = 2 where consolidation_ind = 3

The consolidation indicator 2 indicates the QUEUED_FOR_MERGE state, and 3 indicates the NOT MERGED or MATCHED state. For example:

    Update C_RL_PARTY_GROUP set consolidation_ind = 2 where consolidation_ind = 3

Updating the Consolidation Indicator by Using the Designer and the Workflow Manager

You can use the Designer and the Workflow Manager to update the consolidation indicator of the matched records in the base objects.

1. Import a source definition for the Match table.
2. Import a target definition for the base object.
3. Create a mapping with two instances of the source definition, a Union transformation, an Expression transformation, an Update Strategy transformation, and a target definition.
4. Run the mapping to update the consolidation indicator of the matched records to 2 in the base object.

Step 1. Create Source Definitions for the Match Table

Use the Designer to import a source definition for the Match table. Create the source definition based on the data source. For example, if the data source is Oracle, create a relational source.

Step 2. Create a Target Definition for the Base Object

Use the Designer to import a target definition for the base object in the Hub Store from which you migrate data to HDFS. Create the target definition based on the data source. For example, if the data source is Oracle, create a relational target.
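Functionally, the mapping that the following steps build performs the equivalent of this update. The sketch assumes hypothetical table names (C_PARTY for the base object, C_PARTY_MTCH for the Match table); the two SELECT DISTINCT branches correspond to the two Source Qualifier overrides in step 2 below, and the UNION corresponds to the Union transformation:

    UPDATE C_PARTY                    -- hypothetical base object name
    SET    consolidation_ind = 2      -- 2 = QUEUED_FOR_MERGE
    WHERE  ROWID_OBJECT IN (
             SELECT DISTINCT ROWID_OBJECT FROM C_PARTY_MTCH           -- first source instance
             UNION
             SELECT DISTINCT ROWID_OBJECT_MATCHED FROM C_PARTY_MTCH   -- second source instance
           );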
Step 3. Create a Mapping

Use the Designer to create a mapping. A mapping represents the data flow between the source and target instances.

1. Create a mapping and add two instances of the source definition that you imported. You can use the two source instances to compare the records and retrieve unique records. When you add the source definitions to the mapping, the Designer adds a Source Qualifier transformation for each source definition.
2. Add one of the following conditions to each Source Qualifier transformation:

      SELECT distinct <Match Table Name>.ROWID_OBJECT FROM <Match Table Name>
      SELECT distinct <Match Table Name>.ROWID_OBJECT_MATCHED FROM <Match Table Name>

   Note: Ensure that each Source Qualifier transformation contains a unique condition. The conditions retrieve records that have unique ROWID_OBJECT and ROWID_OBJECT_MATCHED values.
   The following image shows how to configure the condition for the Source Qualifier transformation:
3. Create a Union transformation. A Union transformation merges data from multiple pipelines or pipeline branches into one pipeline branch.
4. Create two input groups for the Union transformation.
5. Link the ports between the Source Qualifier transformations and the Union transformation.
6. Create an Expression transformation. Use the Expression transformation to set the consolidation indicator for the matched records.
7. Link the ports between the Union transformation and the Expression transformation.
8. Set the output_consolidation_ind condition to 2 for the Expression transformation.
   The following image shows how to set the output_consolidation_ind condition:
9. Create an Update Strategy transformation. Use the Update Strategy transformation to update the consolidation indicator of the matched records based on the value that you set in the Expression transformation.
10. Link the ports between the Expression transformation and the Update Strategy transformation.
11. Configure the DD_UPDATE constant as the update strategy expression to update the consolidation indicator.
    The following image shows how to configure the update strategy expression:
12. Add the target definition that you imported to the mapping.
13. Link the ports between the Update Strategy transformation and the target instance.
    The following image shows a mapping that updates the consolidation indicator of the matched records in the base object to 2:
14. Validate the mapping. The Designer marks a mapping as not valid when it detects errors.

Step 4. Run the Mapping

Use the Workflow Manager to run a mapping.

1. Create a workflow and configure the Integration Service for the workflow.
2. Create a Session task and select the mapping.
3. Start the workflow to update the consolidation indicator of the matched records to 2 in the base object.

Manage the Incremental Data

After you migrate the match-pair output data from HDFS to the Hub Store, you can manage the incremental data for MDM Multidomain Edition.

1. Migrate the incremental data from the Hub Store to HDFS.
2. Run the initial clustering job in the incremental mode.
3. Migrate the match-pair output data from HDFS to the Match table in the Hub Store.
4. Update the consolidation indicator of the matched records in the base object to 2.

Step 1. Migrate the Incremental Data from the Hub Store to HDFS

Use the Designer and the Workflow Manager to migrate the incremental data from the Hub Store to HDFS.

1. Import source definitions for the base objects in the Hub Store.
2. Create a mapping with the source instances, a Joiner transformation, and a target definition for a flat file in HDFS. A sketch of a possible incremental source filter appears after this list.
3. Run the mapping to migrate the data from the Hub Store to HDFS.
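One way to restrict the mapping to incremental data is to tighten the Source Filter on each Source Qualifier. The following filter is only a sketch built on assumptions: it supposes that new, not-yet-matched records in your hub carry consolidation indicator 4, and C_PARTY is a hypothetical base object name. Adjust the condition to however your hub flags new records, for example a last-update timestamp column:

    SELECT *
    FROM   C_PARTY                    -- hypothetical base object name
    WHERE  CONSOLIDATION_IND = 4      -- assumption: 4 flags new records that are not yet matched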
Step 2. Run the Initial Clustering Job in the Incremental Mode

An initial clustering job indexes and links the input data in HDFS and persists the indexed and linked data in HDFS. The initial clustering job also creates the match-pair output files in HDFS that contain a list of records and their matched records.

To run the initial clustering job in the incremental mode, use the run_genclusters.sh script located in the following directory: /usr/local/mdmbdrm-<Version Number>

Use the following command to run the run_genclusters.sh script in the incremental mode:

    run_genclusters.sh --config=configuration_file_name --input=input_file_in_hdfs [--reducer=number_of_reducers] --hdfsdir=working_directory_in_hdfs --rule=matching_rules_file_name --incremental --clustereddirs=indexed_linked_data_directory [--consolidate] [--matchinfo]

The following table describes the options and the arguments that you can specify to run the run_genclusters.sh script:

--config configuration_file_name
  Absolute path and file name of the configuration file that you create.

--input input_file_in_hdfs
  Absolute path to the input files in HDFS.

--reducer number_of_reducers
  Optional. Number of reducer jobs that you want to run to perform initial clustering. Default is 1.

--hdfsdir working_directory_in_hdfs
  Absolute path to a working directory in HDFS. The initial clustering job uses the working directory to store the output and library files.

--rule matching_rules_file_name
  Absolute path and file name of the matching rules file that you create.

--incremental
  Runs the initial clustering job in the incremental mode. If you want to incrementally update the indexed and linked data in HDFS, run the job in the incremental mode. By default, the initial clustering job runs in the initial mode.
--clustereddirs indexed_linked_data_directory
  Absolute path to the output files that the previous run of the initial clustering job created. You can find the output files of the initial clustering job in the following directory: <Working Directory of Initial Clustering Job in HDFS>/BDRMClusterGen/<Job ID>/output/dir/pass-join
  Each initial clustering job generates a unique ID, and you can identify the job ID based on the time stamp of the <Job ID> directory. You can also use the output log of the initial clustering job to find the directory path of its output files.

--consolidate
  Consolidates the incremental data with the existing indexed and linked data in HDFS. By default, the initial clustering job indexes and links only the incremental data.

--matchinfo
  Optional. Adds the match score against each matched record. Use the --matchinfo option only if you want to apply any post-match rules on the match-pair output data before you migrate the data to the Match table.

For example, the following command runs the initial clustering job in the incremental mode:

    run_genclusters.sh --config=/usr/local/conf/config_big.xml --input=/usr/hdfs/source10million --reducer=16 --hdfsdir=/usr/hdfs/workingdir --rule=/usr/local/conf/matching_rules.xml --clustereddirs=/usr/hdfs/workingdir/BDRMClusterGen/MDMBDRM_<Job ID>/output/dir/pass-join --incremental --matchinfo

You can find the output files of the initial clustering job in the following directories:
- Linked records: <Working Directory of Initial Clustering Job in HDFS>/BDRMClusterGen/<Job ID>/output/dir/pass-join
- Match-pair output files: <Working Directory of Initial Clustering Job in HDFS>/BDRMClusterGen/<Job ID>/output/dir/match-pairs

The format of the data in the match-pair output files is the same as the Match table in the Hub Store. You must migrate the data in the match-pair output files to the Hub Store.

Step 3. Migrate the Match-Pair Data from HDFS to the Hub Store

Use the Designer and the Workflow Manager to migrate the match-pair data from HDFS to the Hub Store.

1. Import a source definition for the match-pair output file.
2. Create a mapping with the source instance, an Expression transformation, and a target definition for the Match table in the Hub Store.
3. Run the mapping to migrate the match-pair data from HDFS to the Match table in the Hub Store.

Step 4. Update the Consolidation Indicator of the Matched Records

Run an SQL query or use the Designer and the Workflow Manager to update the consolidation indicator of the matched records in the base objects.
Author

Bharathan Jeyapal
Lead Technical Writer

Acknowledgements

The author would like to thank Vijaykumar Shenbagamoorthy, Vinod Padmanabha Iyer, Krishna Kanth Annamraju, and Venugopala Chedella for their technical assistance.