Optimizing Testing Performance With Data Validation Option


Informatica LLC. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica LLC. All other company and product names may be trade names or trademarks of their respective owners and/or copyrighted materials of such owners.

Abstract

To optimize test performance in Data Validation Option, you must understand the factors that affect test performance, and the features and techniques that you can use to improve test performance.

Supported Versions

- Data Validation Option Hotfix 2 Update 1
- Data Validation Option
- Data Validation Option

Table of Contents

- Data Validation Option and Performance
- Understanding Tests
- Performance and PowerCenter Mappings
- Factors Affecting Performance
  - Data Volume Tested
  - Data Complexity
  - Types of Tests
  - Quantity of Tests
  - PowerCenter Server Capacity
  - Load on PowerCenter Integration Service
- Performance Optimization Techniques
  - Database Optimization Techniques
  - Sorted Input
  - Caching
  - Sampling
  - Splitting Wide Tables
- Server Configurations
- Performance Metrics Calculations
  - COUNT Tests
  - COUNT_ROWS Tests
  - SUM Tests
  - SET Tests
  - OUTER_VALUE Tests
  - VALUE Tests
- Conclusion

Data Validation Option and Performance

When people look to deploy Data Validation Option, a key concern is performance. This concern is expressed in different ways by different people and includes the following questions:

- How fast are tests executed?
- How much data does Data Validation Option test?
- What type of server does Data Validation Option need?
- Will the deployment of Data Validation Option in the production environment impact the performance of existing jobs?

There is no absolute answer to these questions because many factors contribute to the performance of tests conducted with Data Validation Option. The performance of Data Validation Option tests depends on the following factors:

- Data volume (the number of rows and columns) tested
- Data complexity (heterogeneous, homogeneous, complex joins)
- Types of tests performed (Aggregate, Set, Value, Expressions)
- Number of tests in a given single table object or table pair object
- Capacity of the PowerCenter Integration Service (memory or CPU) executing the tests
- Load on the PowerCenter Integration Service when tests are executed
- Configuration of Data Validation Option jobs to optimize performance

This document explains the various factors that affect performance and the features and techniques that can be applied to increase performance for tests conducted with Data Validation Option. Additionally, some baseline performance numbers are provided to give context. Those numbers should be viewed as a minimal performance level, given that they were produced on a smaller server with no optimizations configured.

Understanding Tests

The first thing to understand is how Data Validation Option executes tests. Data Validation Option provides a client and metadata repository that uses PowerCenter as an execution engine. Thus, performance is about optimizing the processing performed by PowerCenter. Consider the following image:

The image above has three major components:

- Data Validation Option, which consists of a desktop client, its own metadata repository, and a results warehouse with predefined database views.
- PowerCenter, which consists of a metadata repository and a set of services for accessing and processing data.
- Enterprise Data, which consists of a broad spread of data sources, including relational DBMS, warehousing appliances, mainframe data, flat files, and data in the cloud such as Salesforce.com.

The numbers in the following steps correspond with the numbers in the image above and explain the execution process for Data Validation Option:

1. Data Validation Option users define tests and store those (metadata) definitions in the Data Validation Option repository.
2. Tests are executed from a GUI or command line. Data Validation Option generates mappings that embody the test conditions and executes those mappings in PowerCenter.
3. PowerCenter accesses the data being tested and applies the tests to that data.
4. Test results (pass or fail, error data, and so on) are stored in the Data Validation Option warehouse.
5. Users can view test results in reports that are generated from the warehouse.

The key steps from a performance perspective are 2, 3, and 4. In particular, step 2 matters because the generated mapping is what PowerCenter executes, so anything that can be done at design time (that is, in the Data Validation Option client) to optimize for performance should be done.

Performance and PowerCenter Mappings

The basic principle of performance is that overall performance will never be faster than the slowest point in the mapping. PowerCenter mappings consist of three main parts: the reader, the transformation pipeline, and the writer, as the simple model below illustrates.
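To make the slowest-stage principle concrete, here is a minimal sketch. It is only a model of the reader/pipeline/writer decomposition described above; the throughput figures are invented for illustration and are not measurements of any product.

```python
# Minimal model of the reader -> transformation pipeline -> writer principle:
# end-to-end throughput is bounded by the slowest stage.
# Stage throughputs (rows/second) are hypothetical, for illustration only.
stage_throughput = {
    "reader": 120_000,    # rate of reading source data
    "pipeline": 55_000,   # rate of evaluating tests (joins, expressions)
    "writer": 900_000,    # rate of writing results to the warehouse
}

bottleneck = min(stage_throughput, key=stage_throughput.get)
effective_rate = stage_throughput[bottleneck]

rows = 10_000_000
print(f"Bottleneck stage: {bottleneck}")
print(f"Estimated run time: {rows / effective_rate:.0f} seconds")
# Bottleneck stage: pipeline; about 182 seconds for 10 million rows.
```

In this model, making the writer faster changes nothing; only raising the pipeline rate (fewer tests, better caching, pushing work to the database) shortens the run.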

The reader reads all the data from all the sources that are being tested and feeds that data into the transformation pipeline. The pipeline is where tests are processed, results computed, error rows identified, and so on. The writer writes those results to the Data Validation Option warehouse.

If there is a lot of data to read but few tests to perform, the reader could be the slow point in the mapping. Conversely, if there are a lot of tests, or a lot of complex joins and lookups defined on the data, the transformation pipeline will be the slow point. It is rare for the writer to be the slow point, as there is usually very little data written to the Data Validation Option warehouse relative to the amount of data read.

Though simplistic, this description of the process gives a high-level background for the next sections, which explain the factors that affect performance in more detail.

Factors Affecting Performance

A frequent question is how complex Data Validation Option mappings are. Data Validation Option mappings are generated based on the configuration defined by the user in the Data Validation Option client. This configuration includes what data to read, how that data should be pre-processed for testing, what tests are to be performed, where they are performed, and the results to write out. If you think through the various source data types and configuration options in the table pair object, the use of join, SQL, or lookup views, and the various tests defined and where and how they should be executed, you can end up with what appear to be complex mappings. And if those mappings run on underpowered hardware, read data across a slow network, or do not have enough memory to execute effectively, things will slow down. But while those mappings may appear complex, their design is efficient, and they run on PowerCenter, which has an unparalleled track record in the industry for scalable, high-performance data processing. The key is to understand what the issues are, and how to address them, to deliver efficient and high-performance testing scenarios.

Data Volume Tested

The amount of data tested affects performance. More data to test means more data to read from the source, more rows to process, and, potentially, more errors to write. Data volume has the following three dimensions:

Number of Rows
The number of rows dimension is straightforward: the more rows read from a source, the more time spent in the reader and in test execution. But all rows are not created equal.

Number of Columns
A wide table or file with dozens or hundreds of columns, if read, will take more time to read and process than a narrow table.

Width of Individual Columns
Wide columns will take more time to read and process than narrow columns. For example, a string that is hundreds of characters wide will take more compute power than a short string or a number. A rough sizing estimate that multiplies these dimensions together appears in the sketch after the next section.

Data Complexity

Data complexity refers to the complexity of the input. Files or database tables are straightforward, but if you combine different sources into a join view, or different tables into a SQL view, and then use those in a table pair object, the processing time could be higher than with a single table object or file. The time taken to join various tables or to execute the SQL view's SQL in the database depends on the complexity and size of the joins, the complexity of the SQL, and the size of the associated tables.
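Because the three volume dimensions multiply together, a rough sizing estimate is easy to compute. The sketch below is a back-of-the-envelope calculation only; the column names and byte widths are invented, and actual PowerCenter buffer sizing works differently.

```python
# Rough data volume estimate: rows x (sum of column widths in bytes).
# Column layout and widths are illustrative assumptions, not real metadata.
columns = {
    "order_id": 8,         # numeric, ~8 bytes
    "status_code": 10,     # short string
    "customer_name": 100,  # wide string
    "comments": 500,       # very wide string; dominates the row size
}

rows = 10_000_000
bytes_per_row = sum(columns.values())
print(f"{bytes_per_row} bytes/row -> ~{rows * bytes_per_row / 1e9:.1f} GB to move")

# Data Validation Option propagates only the columns that tests use, so
# testing just the two narrow columns moves a fraction of the data.
tested = ["order_id", "status_code"]
tested_bytes = sum(columns[c] for c in tested)
print(f"Testing 2 narrow columns -> ~{rows * tested_bytes / 1e9:.2f} GB")
```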

Types of Tests

Aggregate tests like SUM, MIN, MAX, and AVG execute faster than SET or VALUE tests. This is due to the nature of the tests: aggregates act on a single column, whereas VALUE and SET tests act across columns and need joined tables. Complex expression evaluation can also impact performance, particularly if applied to many columns in the data. For very large tables, the join process itself takes significant resources to complete.

Quantity of Tests

When mapping generation occurs, all tests defined in a single table object or table pair object are encoded into the mapping. The mapping will read and then propagate only the data necessary for the tests. For example, if a flat file with 100 columns is read in, but only five tests are defined that evaluate data in five different columns, then only those five columns are propagated in the mapping, and only five tests are made on each row. This is an efficiency built into the generated mappings. Now, if the same 100-column file has 50 tests on 50 columns instead of five, then 50 columns are propagated and 50 tests are made on each row. Thus both the amount of data tested and the number of tests per row increase by a factor of 10, and both of these affect performance.

PowerCenter Server Capacity

The famous Greek mathematician Archimedes once said: "Give me a lever long enough and a fulcrum on which to place it, and I shall move the world." The same principle applies to PowerCenter. With enough compute power and memory, virtually any amount of data can be processed. But unlimited computing resources are rare. The PowerCenter environment used to execute the Data Validation Option mappings has a tremendous impact on performance. Testing large data sets on an underpowered server with insufficient RAM or processing power is almost guaranteed to be slow.

Load on PowerCenter Integration Service

A powerful server under heavy load can perform poorly. If you test large or complex data sets on a server that has other jobs running on it, ensure that the server has enough additional capacity to run Data Validation Option mappings efficiently. If you run Data Validation Option in production, the Data Validation Option mappings will add load to the server. As with any other PowerCenter job, it is best to have a clear understanding of the resource requirements of Data Validation Option jobs before adding them to an already overloaded server.

Performance Optimization Techniques

There are features in Data Validation Option that can be used to maximize performance. Basic techniques such as reading only the data that is needed for testing, minimizing unnecessary tests, and providing Data Validation Option mappings with the resources they need contribute to increased performance.

Database Optimization Techniques

A simple step to increase performance is to use the database optimization features available in Data Validation Option. These can be applied to data coming from any supported SQL database. Use the following features:

WHERE Clauses

WHERE clauses in table pair objects and single table objects can be defined with SQL or with the PowerCenter expression language. When sourcing data from databases, it is usually more efficient to define the WHERE clause with SQL and then select the Execute Where Clause in DB check box. This executes the WHERE clause in the database and feeds PowerCenter only the rows matching the WHERE condition. This is more efficient than reading the entire table and then processing the WHERE clause in PowerCenter, which reads unmatched data only to throw it away immediately (see the SQL sketch at the end of this section).

Aggregate and COUNT Tests

Aggregate and COUNT tests can be processed in a database very efficiently. To do this, set the Optimization Level in the single table object or table pair object to Where Clause, Sorting and Aggregation in DB.

Sorted Input

When using table pairs with VALUE or SET tests, a join condition is required across table A and table B. If the data coming into the table pair object is already sorted in the same order as the join condition (for example, via a WHERE clause in the table pair sent to the database), PowerCenter can optimize the join operation and use less memory for the process. To indicate that the input is already sorted, select Already Sorted Input in the Optimization Level drop-down in the table pair object or single table object dialog box.

Caching

Caching is a means of memory allocation when executing PowerCenter jobs. As data is processed by PowerCenter, memory is allocated to specific operations up to a specified limit. Operations like joins, sorting, lookups, and aggregations all use cached memory. If the PowerCenter job requires more memory than has been allocated to it, it spills data to disk and then swaps data between disk and memory as required. In-memory processing is significantly faster than spilling to disk, so for best performance, keep all data in memory whenever possible.

By default, Data Validation Option lets PowerCenter decide how to allocate cache memory for a given job. This is known as automatic caching; the option appears on the Advanced tab of the table pair object or single table object dialog box. In general, automatic caching works well, but there are cases where it is not sufficient, for example, with very large data sets, or when complex join views, large lookup views, or extensive sorting are needed within PowerCenter. In those cases, users must explicitly allocate memory to the job.

This explanation of caching is an intentional simplification of the actual process, but it should be sufficient to gain a general understanding of the concepts. From a PowerCenter perspective, cache allocation can get quite involved, as the amount of cache required is specific to each transformation and depends on the amount of data processed by that transformation at runtime.
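Returning to the WHERE clause feature above, the difference between the two strategies can be seen in plain SQL. This is a hedged sketch: the table and column names are hypothetical, and this is not the SQL that Data Validation Option itself generates.

```python
# Contrast of the two WHERE clause strategies described above.
# Table and column names are invented for illustration.

# Execute Where Clause in DB: the database returns only matching rows,
# so the PowerCenter reader moves far less data.
pushed_down = """
    SELECT order_id, order_amount
    FROM sales.orders
    WHERE order_date >= DATE '2015-01-01'
"""

# Without pushdown: every row crosses the wire, and PowerCenter discards
# the non-matching rows immediately after reading them.
read_everything = """
    SELECT order_id, order_amount, order_date
    FROM sales.orders
"""

# If only 5% of the rows match, pushdown reads roughly 5% of the data:
total_rows, match_fraction = 100_000_000, 0.05
print(f"Pushdown reads ~{int(total_rows * match_fraction):,} rows "
      f"instead of {total_rows:,}")
```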

Detailed information on necessary cache allocation is given in the PowerCenter session logs, but for the uninitiated, those logs are quite daunting. If you need more detailed information on the topic of caching, see the appropriate sections in the PowerCenter Performance Tuning Guide.

Setting Table Pair Object and Single Table Object Cache

The simplest approach is to set the total cache allocation on the Advanced tab of the single table object and table pair object dialog box. Simply uncheck Automatic and enter the total amount of cache memory (for example, 256 MB or 1 GB) that you want to allocate to the job, and Data Validation Option will take care of the rest. If the amount is insufficient and data spills to disk, allocate a larger amount and rerun.

Sampling

In some situations, either with very large data sets (hundreds of millions or billions of rows) or in cases where 100% of the data is not required, data sampling can be employed. Sampling is available for both table pair objects and single table objects and can be accessed via the Advanced tab in these dialog boxes. Users specify the percentage of rows needed from the source and an optional seed value that is used to add repeatability to the sampled data. In table pair objects, data is sampled on only one side, that is, either table A or table B. When the data from both sides is joined (via a join condition), only the matching rows across tables A and B pass through and are tested.

Data sampling can operate in one of two modes: in the database (native) or wholly in PowerCenter. Native sampling is supported for databases (Oracle, SQL Server, DB2, Teradata) that support sampling directly. Here, the database does the sampling, and only the sampled data is returned from the database for testing. For all other data sources (flat files, other databases, mainframe data, and so on), the sampling is done in PowerCenter, which means all data is read and then filtered based on the sampling criteria.

It is important to understand how sampling actually works. If a user asks for 5% of the data, then each row has a 5% chance of being selected and passed on for testing. This is true for native sampling as well as for sampling performed by PowerCenter. Thus, if there are 100,000 rows in a table and the user specifies 5%, then about 5,000 rows will be delivered, though not guaranteed to be exactly 5,000. The net result, regardless of the type and amount of data sampled, is a subset of large data volumes that can be tested efficiently to find issues or to give confidence in a system.
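The per-row selection behavior described above is easy to verify with a short simulation. This sketch is independent of the product; the seed here plays the same repeatability role as the seed value in the sampling dialog box.

```python
import random

# Each row has an independent 5% chance of selection, so the delivered
# count is close to, but not guaranteed to be exactly, 5% of the table.
rows, pct, seed = 100_000, 0.05, 42

rng = random.Random(seed)  # fixed seed makes the sample repeatable
sampled = sum(1 for _ in range(rows) if rng.random() < pct)
print(f"Selected {sampled:,} of {rows:,} rows (expected ~{int(rows * pct):,})")

# Re-running with the same seed selects the same rows; a different seed
# gives a different, but statistically similar, sample.
```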

Splitting Wide Tables

Wide tables or files (that is, tables or files with hundreds of columns) are not uncommon in enterprise data environments. Testing such tables, especially with value tests, can place a heavy load on an underpowered server. Unlike sampling, which limits the number of rows being tested, splitting wide tables reduces the number of columns being tested per job.

Imagine a table 500 columns wide, where value tests are required across all 500 columns. Instead of creating a single table pair object with 500 tests and slowing down the PowerCenter Integration Service, you can create five table pair objects, each with 100 tests, and run these table pair objects separately. The end result of this split is the same, that is, all 500 columns are tested, but in smaller chunks that put much less load on the server. Additionally, with the reduced number of tests in the table pair objects, inspecting results in the GUI or creating reports will be simpler, more manageable, and more efficient.

Server Configurations

Though there is no absolute answer to Data Validation Option performance questions, some data is always better than none to help understand what can be expected and what looks anomalous. The following tables provide a baseline set of performance metrics to give some context on expected performance. They provide a rough minimum level of expected performance for tests executed with Data Validation Option. All tests were performed on the following server configuration:

- Operating System: Linux el5 (Red Hat)
- CPU: AMD Opteron 6220
- Processor Speed: 3000 MHz
- Cores: 4
- Memory: 128 GB

All tests used data in Oracle tables for both table A and table B in the table pair object. All tests used default configurations and were performed entirely in PowerCenter. No database optimization, caching, sampling, or other performance enhancements were made. The intention is to provide sample baseline performance numbers on a small server. Larger or more powerful servers, and enabling Data Validation Option optimizations, would likely result in significantly improved performance.

Performance Metrics Calculations

All tests were conducted five times, and the elapsed time for each test run was recorded. The lowest and highest times were discarded, and the average of the middle three was computed and rounded to the nearest second. This is the number displayed in the tables below under Average Time. Rows/Sec is shown to provide a normalized number to compare test performance across different row counts and test types. The ideal situation is a consistent (that is, flat) Rows/Sec measure as the number of rows increases for a given test scenario, which shows linear scalability of the system.
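As a quick illustration of that methodology, the sketch below applies the discard-high-and-low averaging to five run times. The timings are made up for the example and are not taken from the benchmark.

```python
# Trimmed-mean methodology: run five times, discard the lowest and highest
# elapsed times, average the middle three, and round to the nearest second.
timings_sec = [15.2, 16.1, 15.8, 19.4, 16.3]  # hypothetical run times

middle_three = sorted(timings_sec)[1:-1]      # drop the min and the max
average_time = round(sum(middle_three) / len(middle_three))

rows = 1_000_000
print(f"Average Time: {average_time} sec")        # 16 sec
print(f"Rows/Sec:     {rows // average_time:,}")  # 62,500
```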

COUNT Tests

COUNT tests count all non-null values in a column and check whether the expected number of values is present. In the data below, five COUNT tests are performed on a table pair object with 50 columns of data. The performance is consistent, averaging about 57,000 rows per second across all tests.

The following table shows the results of five COUNT tests:

| Joined | Rows | Columns | Test Types | Average Time | Rows/Sec |
|--------|------------|---------|------------|--------------|----------|
| No | 1,000,000 | 50 | Five COUNT | 16 sec | 63,830 |
| No | 2,000,000 | 50 | Five COUNT | 36 sec | 56,604 |
| No | 5,000,000 | 50 | Five COUNT | 1 min 35 sec | 52,817 |
| No | 10,000,000 | 50 | Five COUNT | 3 min 1 sec | 55,351 |

COUNT_ROWS Tests

The COUNT_ROWS test counts all values, including nulls, in a column. The COUNT_ROWS test is faster than the COUNT test because the COUNT test checks each data value for null before incrementing the count. Also, because the COUNT_ROWS test counts all rows in a table, only one test is needed for a table pair object or single table object.

The table below shows results for one COUNT_ROWS test:

| Joined | Rows | Columns | Test Types | Average Time | Rows/Sec |
|--------|------------|---------|----------------|--------------|----------|
| No | 1,000,000 | 50 | One COUNT_ROWS | 7 sec | 136,364 |
| No | 2,000,000 | 50 | One COUNT_ROWS | 12 sec | 171,249 |
| No | 5,000,000 | 50 | One COUNT_ROWS | 23 sec | 220,588 |
| No | 10,000,000 | 50 | One COUNT_ROWS | 51 sec | 196,078 |

SUM Tests

SUM tests calculate the sum of a numeric column. They are a common high-level test performed on data to see whether sums match (for example, across a set of transactions) across data sets.

The following table shows results for five SUM tests:

| Joined | Rows | Columns | Test Types | Average Time | Rows/Sec |
|--------|------------|---------|------------|--------------|----------|
| No | 1,000,000 | 50 | Five SUM | 19 sec | 52,632 |
| No | 2,000,000 | 50 | Five SUM | 36 sec | 56,075 |
| No | 5,000,000 | 50 | Five SUM | 1 min 47 sec | 46,584 |
| No | 10,000,000 | 50 | Five SUM | 3 min 9 sec | 53,191 |

The Rows/Sec is consistent, which shows the linear scalability of SUM tests.
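The COUNT versus COUNT_ROWS distinction above is simply nulls-excluded versus nulls-included counting, which a few lines of code make concrete (the sample column data is invented):

```python
# COUNT counts non-null values; COUNT_ROWS counts every row, nulls included.
# The per-value null check is what makes COUNT the slower of the two.
column = ["A", None, "B", "C", None, "D"]  # hypothetical column data

count_rows = len(column)                                 # COUNT_ROWS -> 6
count = sum(1 for value in column if value is not None)  # COUNT      -> 4

print(f"COUNT_ROWS = {count_rows}, COUNT = {count}")
```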

SET Tests

Set tests look at distinct values within a pair of columns and identify whether there are missing or extra distinct values across those columns.

SET AinB

Use the SET AinB test to identify whether the values in a column in the source table (A) all exist in a column in the lookup table (B). Any value in A that does not exist in B is revealed as an error row.

The following table shows the results of five SET AinB tests:

| Joined | Rows | Columns | Test Type | Average Time | Rows/Sec |
|--------|------------|---------|---------------|--------------|----------|
| Yes | 1,000,000 | 50 | Five SET AinB | 32 sec | 31,250 |
| Yes | 2,000,000 | 50 | Five SET AinB | 1 min 14 sec | 27,027 |
| Yes | 5,000,000 | 50 | Five SET AinB | 3 min 57 sec | 21,097 |
| Yes | 10,000,000 | 50 | Five SET AinB | 7 min 24 sec | 22,523 |

SET ANotinB

Use the SET ANotinB test to ensure that no value in A is also in B. This test is useful for identifying duplicates when merging two data sets, or for validating masked data to ensure that all values were appropriately masked.

The following table shows the results of five SET ANotinB tests:

| Joined | Rows | Columns | Test Type | Average Time | Rows/Sec |
|--------|------------|---------|------------------|--------------|----------|
| Yes | 1,000,000 | 50 | Five SET ANotinB | 38 sec | 26,316 |
| Yes | 2,000,000 | 50 | Five SET ANotinB | 1 min 21 sec | 24,691 |
| Yes | 5,000,000 | 50 | Five SET ANotinB | 4 min 13 sec | 19,763 |
| Yes | 10,000,000 | 50 | Five SET ANotinB | 8 min 36 sec | 19,380 |

Comparing the two types of set tests shows that SET AinB (approximately 25,000 rows/sec) is slightly faster than SET ANotinB (approximately 22,000 rows/sec).
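In set terms, the two tests above are complementary membership checks. The following sketch shows only the semantics on toy data; it is not how Data Validation Option implements the joins internally.

```python
# SET AinB flags values of A that are missing from B;
# SET ANotinB flags values of A that also appear in B.
a = {"CA", "NY", "TX", "ZZ"}  # distinct values in column A
b = {"CA", "NY", "TX", "WA"}  # distinct values in column B

ainb_errors = a - b     # SET AinB error rows: values of A not found in B
anotinb_errors = a & b  # SET ANotinB error rows: values of A found in B

print(f"SET AinB errors:    {sorted(ainb_errors)}")     # ['ZZ']
print(f"SET ANotinB errors: {sorted(anotinb_errors)}")  # ['CA', 'NY', 'TX']
```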

OUTER_VALUE Tests

OUTER_VALUE tests perform a full outer join across table A and table B in the table pair object and display any orphans from either side. In general, OUTER_VALUE tests are performed across the key columns of the data sets, and only one OUTER_VALUE test is typically needed in a given table pair object.

The table below shows the performance of one OUTER_VALUE test:

| Joined | Rows | Columns | Test Types | Average Time | Rows/Sec |
|--------|------------|---------|-----------------|--------------|----------|
| No | 1,000,000 | 50 | One OUTER_VALUE | 10 sec | 103,448 |
| No | 2,000,000 | 50 | One OUTER_VALUE | 18 sec | 111,111 |
| No | 5,000,000 | 50 | One OUTER_VALUE | 38 sec | 131,579 |
| No | 10,000,000 | 50 | One OUTER_VALUE | 1 min 26 sec | 116,279 |

VALUE Tests

VALUE tests are the most common tests executed in Data Validation Option, as they provide the most detail about missing or erroneous values. VALUE tests are executed within PowerCenter and are individual evaluations of row and column data across the table pair object.

In the following tables, an additional column, Comparisons/Sec, shows the number of comparisons executed per second. Total comparisons is the total number of rows multiplied by the total number of VALUE tests; divide that by the number of seconds it takes to complete the test and you have Comparisons/Sec. This number can be used to compare performance across the two scenarios presented below, that is, five VALUE tests versus 50 VALUE tests.

5 VALUE Tests

The following table shows that the performance of five VALUE tests is very good, with an average of about 35,700 rows/second across all scenarios:

| Joined | Rows | Columns | Test Types | Average Time | Rows/Sec | Comparisons/Sec |
|--------|------------|---------|------------|--------------|----------|-----------------|
| No | 1,000,000 | 50 | 5 VALUE | 34 sec | 29,412 | 147,059 |
| No | 2,000,000 | 50 | 5 VALUE | 52 sec | 38,462 | 192,308 |
| No | 5,000,000 | 50 | 5 VALUE | 2 min 16 sec | 36,765 | 183,824 |
| No | 10,000,000 | 50 | 5 VALUE | 4 min 52 sec | 34,247 | 171,233 |
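The Comparisons/Sec arithmetic is a direct application of the definition above. As a worked check against the first row of the table, using the rounded Average Time:

```python
# Comparisons/Sec = (rows x number of VALUE tests) / elapsed seconds.
rows, value_tests, elapsed_sec = 1_000_000, 5, 34

total_comparisons = rows * value_tests  # 5,000,000 comparisons
print(f"Rows/Sec:        {rows / elapsed_sec:,.0f}")               # ~29,412
print(f"Comparisons/Sec: {total_comparisons / elapsed_sec:,.0f}")  # ~147,059
```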

50 VALUE Tests

This scenario contains 50 VALUE tests, one COUNT test, and one OUTER_VALUE test. It is a typical set of tests and can be generated automatically for a table pair object in Data Validation Option.

Here, the Rows/Sec drops by about 10x as compared to the five VALUE tests scenario. This is expected, given the fixed capacity of the server and the tenfold increase in the number of columns being tested. But Comparisons/Sec is, on average, slightly higher than in the five VALUE tests scenario.

| Joined | Rows | Columns | Test Types | Average Time | Rows/Sec | Comparisons/Sec |
|--------|------------|---------|----------------------------------|---------------|----------|-----------------|
| No | 1,000,000 | 50 | 50 VALUE, 1 COUNT, 1 OUTER_VALUE | 3 min 49 sec | 4,367 | 218,341 |
| No | 2,000,000 | 50 | 50 VALUE, 1 COUNT, 1 OUTER_VALUE | … min 14 sec | 2,… | … |
| No | 5,000,000 | 50 | 50 VALUE, 1 COUNT, 1 OUTER_VALUE | … min 20 sec | 2,… | … |
| No | 10,000,000 | 50 | 50 VALUE, 1 COUNT, 1 OUTER_VALUE | 53 min 29 sec | 3,117 | 155,850 |

Conclusion

All performance statistics depend on the conditions under which the performance was measured. This is as true for speed tests for cars as it is for computer software. Many factors affect the performance of Data Validation Option, including, but not limited to, the characteristics of the data, the types of tests, the amount of testing done on the data, the server hardware where the tests are run, and the Data Validation Option testing configuration set by the user. It is important to understand these factors when designing and implementing tests with Data Validation Option.

There are product features and testing tactics that can be used to improve performance, and different approaches suit different situations. For example, when the data is primarily in relational databases, some processing (WHERE clauses, COUNT and aggregate tests) can be performed in the database itself. When very large (hundreds of millions of rows) data sets need to be tested, statistical sampling can be used.

The baseline performance statistics provide performance metrics for a variety of test types with increasing amounts of data. Run on a modest 4-core Linux server without any specific optimizations, these numbers serve as a baseline reference for what can be achieved. More powerful servers or specific performance optimizations will yield even better results, but as the numbers show, Data Validation Option tests perform very well out of the box and scale linearly as data volumes grow. This makes for a predictable and efficient framework for large-scale validation testing, whether in development, QA, or production environments.

Author

Saeed Khan
Principal Product Manager, PowerCenter
