HP Integration with Incorta: Connection Guide

HP Vertica Analytic Database
HP Big Data

Document Release Date: July, 2015
Legal Notices

Warranty

The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein. The information contained herein is subject to change without notice.

Restricted Rights Legend

Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.

Copyright Notice

Copyright 2006-2015 Hewlett-Packard Development Company, L.P.

Trademark Notices

Adobe is a trademark of Adobe Systems Incorporated.
Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation.
UNIX is a registered trademark of The Open Group.

HP Vertica Integration with Incorta p. 2
Contents

About HP Vertica Connection Guides
Overview
Before You Begin
Connecting to HP Vertica Using Incorta
Loading HP Vertica Data into Incorta
Designing Dashboards with Incorta
For More Information
About HP Vertica Connection Guides

HP Vertica connection guides provide basic information about setting up connections to HP Vertica from software that third-party vendors create. These documents provide guidance using one specific version of HP Vertica and one specific version of the third-party vendor's software. Other versions of the third-party product may work with HP Vertica. However, Hewlett-Packard may not have tested these other versions.

Overview

Incorta is a seamless, end-to-end analytical warehouse solution engineered for simple, powerful, real-time analysis of massive volumes of data. This document describes how to:

- Connect to HP Vertica from Incorta.
- Load data from an HP Vertica database into Incorta.
- Create a dashboard that displays real-time analysis of the data.

This document assumes that the reader is familiar with both Incorta and HP Vertica. In preparing this document, Hewlett-Packard tested connecting to HP Vertica 7.x using Incorta 2.1.

Before You Begin

Before you can connect Incorta to HP Vertica, you must install the Incorta software. Incorta provides the HP Vertica 6.1.3 JDBC drivers (vertica-jdk5-6.1.3-0.jar) with the installation package. You do not need to download the drivers separately.

After you install Incorta, you need to:

1. Log in using the Sign In page.
2. Create a connection to your HP Vertica database.
3. Load data from HP Vertica into Incorta.
4. Build your dashboard using that data.

Connecting to HP Vertica Using Incorta

Before you can connect to HP Vertica, define the data source using the following steps:

1. To create a new connection, click the database icon on the left side of the window. Then click + on the top right of the window. The Add New Data Source window opens.
2. Select Vertica from the Database dropdown list.
3. Specify a name for this data source. This example names the HP Vertica data source testdb. Then specify the HP Vertica database username and password, and the JDBC connection string that Incorta needs to connect to your HP Vertica instance.
4. To register your HP Vertica database with Incorta and create the connection, click Add Data Source.
5. After Incorta has registered the connection successfully, click Test Connection to test the connection.

Loading HP Vertica Data into Incorta

Incorta provides end-to-end analytics platform capabilities. You load data from your HP Vertica database into the Incorta engine. Incorta keeps this data in memory for fast performance. Take the following steps:

1. Once you have established a connection from Incorta to your HP Vertica database, click the Schemas and Session Variables icon on the left side of the window. This window displays all the schemas you have currently defined in Incorta.
2. To create a new schema in Incorta for the data you want to load from your HP Vertica instance, in the top right of the window, click Schema Wizard.
3. In the Schema Wizard, enter a name for the Incorta schema that will contain the HP Vertica data. This example names the Incorta schema VMart. Enter the name of the data source, which is the name you assigned when you created the connection (in this example, testdb). Optionally, add a description of the schema. This example loads data into Incorta from the VMart database that ships with HP Vertica. Click Next.
4. The next window displays the HP Vertica schemas and tables. Select the schemas and tables that you want to load and click Next.
The Schema Wizard displays information about the tables that you have selected, including each column's name, label, data type, mapping type, and function. At this point, Incorta has not loaded the data yet.

5. The Schema Wizard also gives you the option to modify the query that selects this data. To do so, click Play, the blue arrow inside a blue circle next to the table name.
6. You might edit this query if you want to omit a table column from the data you load from HP Vertica. To do so, remove that column name from the SELECT statement, and then click Execute to run the modified query.
7. To save the results, without the omitted column, click Save.
8. Once you have the data you want, click Next in the Schema Wizard to have Incorta verify the schema that you have created.
9. To save your new schema, click Finish. After Incorta has verified the schema, it displays detailed information about the schema and tables you selected. Note that Incorta has not yet loaded the HP Vertica data.
10. When you load data from the HP Vertica schemas and tables, Incorta stores that data in memory to perform the analytics. At this point in the example, no data is loaded into memory, so no values display for the number of rows and memory consumed. To tell Incorta to load the data from HP Vertica, click Load. There are three options for loading data:

    - Full: First-time load into Incorta memory.
    - Incremental: Load only data that has been generated in the data source since data was last loaded into Incorta. Incorta uses a SQL query to fetch the new data from the server.
    - Snapshot: Load data into the schema from Incorta's data snapshot instance, rather than directly from the data source. This snapshot is updated each time a full or incremental load occurs.

11. After loading the data, Incorta displays updated information about each table, such as the number of rows and columns, memory usage, and other information.
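As a rough illustration of steps 6 and 10, the following Python sketch builds the kind of SELECT statement you might enter in the Schema Wizard query editor: one that omits a column and, optionally, restricts rows to those newer than a watermark value, which is the general idea behind an incremental load. The helper build_select, the table name, the column names, and the ? parameter placeholder are all hypothetical and chosen only for illustration. The query syntax that Incorta generates or expects for incremental loads may differ.

```python
def build_select(table, columns, omit=(), since_column=None):
    """Return a SELECT statement for `table` (illustrative helper, not an Incorta API).

    columns      -- all column names available in the table
    omit         -- column names to leave out of the SELECT list
    since_column -- if given, add a filter so only rows newer than a
                    watermark value are fetched (incremental-style load)
    """
    omitted = set(omit)
    kept = [c for c in columns if c not in omitted]
    if not kept:
        raise ValueError("at least one column must remain in the SELECT list")
    sql = f"SELECT {', '.join(kept)} FROM {table}"
    if since_column is not None:
        # '?' stands for the watermark value supplied at load time (assumption).
        sql += f" WHERE {since_column} > ?"
    return sql


# Hypothetical table and column names, used only for this example.
columns = ["customer_key", "customer_name", "customer_region", "last_updated"]

# Full-load style query that omits one column.
print(build_select("public.customer_dimension", columns, omit=["customer_name"]))

# Incremental-style query that also filters on a hypothetical timestamp column.
print(build_select("public.customer_dimension", columns,
                   omit=["customer_name"], since_column="last_updated"))
```

Building the column list programmatically is only a convenience for this sketch. In the Schema Wizard itself you edit the SELECT statement by hand and click Execute.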
Note: Incorta compresses the data on ingestion and loads it into memory. The degree of compression depends on the cardinality of the data. If memory is insufficient to hold the data, cold data is removed from memory and is loaded again from disk only if it is required.

<Incorta_installation_directory>/server/logs/catalina.out

12. To define the missing joins manually in Incorta, click + at the top right of the window and specify the columns on which to join multiple tables to create a new table.

Designing Dashboards with Incorta

Now that you have loaded the HP Vertica data into your Incorta schema, you are ready to design a dashboard.

1. To create a new dashboard, click the Content icon on the left side of the window (it looks like a cloud).
2. Click + on the top right of the window.
3. Enter a name for your new dashboard and click Create. Incorta opens the dashboard design page where you can begin to design your dashboard.

The following dashboard uses data from the VMart database that ships with HP Vertica. That data now resides in memory as part of your Incorta schema. The dashboard displays three pieces of information about sales:

- Sales per Year
- Sales by Geographic Region
- Sales by Product Category
4. In the Sales per Year graph, you can drill down to more detailed data, such as quarterly or monthly sales. Click a data point for a given year to display the sales per quarter. Click a data point for a given quarter to display the sales for the three months of that quarter.
5. Similarly, in the Sales by Geographic Region graph, click a data point for a region. For example, if you click the East category, Incorta displays detailed results for all states in the East region.
To cancel the filter and return to the high-level data, click the button next to Remove All. For the Sales by Geographic Region graph, that button reads Customer Region > East.

For More Information

For more information about how Incorta works with HP Vertica, see www.incorta.com.
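For reference, the JDBC connection string requested in the Add New Data Source window generally takes the form jdbc:vertica://host:port/database for the HP Vertica JDBC driver. The Python sketch below assembles such a string. The helper name vertica_jdbc_url and the host name are hypothetical. Port 5433 is the default HP Vertica client port, and VMart is the example database used in this guide. Confirm the exact format against the driver documentation for your HP Vertica version.

```python
def vertica_jdbc_url(host, database, port=5433):
    """Assemble a Vertica-style JDBC connection string (illustrative helper)."""
    if not host or not database:
        raise ValueError("host and database are both required")
    return f"jdbc:vertica://{host}:{port}/{database}"


# Hypothetical host name; replace it with the address of your HP Vertica node.
print(vertica_jdbc_url("vertica.example.com", "VMart"))
# prints: jdbc:vertica://vertica.example.com:5433/VMart
```

You would paste the resulting string into the connection string field, together with the database username and password.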