Data Analysis and Integration


MEIC 2015/2016
Data Analysis and Integration
Lab 5: Working with databases
1st semester
IST/DEI

Installing MySQL

1. Download MySQL Community Server for your operating system.

   For Windows, use one of the following links:
     - Full installer (370.5 MB)
       http://dev.mysql.com/get/downloads/mysqlinstaller/mysql-installer-community-5.7.9.0.msi
     - Web installer (1.6 MB)
       http://dev.mysql.com/get/downloads/mysqlinstaller/mysql-installer-web-community-5.7.9.0.msi

   For Mac OS X, use one of the following links:
     - Mac OS X 10.10 (Yosemite)
       http://dev.mysql.com/get/downloads/mysql-5.7/mysql-5.7.9-osx10.10-x86_64.dmg
     - Mac OS X 10.9 (Mavericks)
       http://dev.mysql.com/get/downloads/mysql-5.7/mysql-5.7.9-osx10.9-x86_64.dmg

   For Linux, use one of the following:
     - Ubuntu
       sudo apt-get install mysql-server mysql-client mysql-workbench
     - Other distros
       http://dev.mysql.com/downloads/mysql/

   (If you are asked to login, click "No thanks, just start my download.")

2. Install MySQL according to the specific instructions for your operating system.

   On Windows, the only thing that you need to install is MySQL Server. Optionally, you might want to install MySQL Workbench and MySQL Notifier as well.

   During the installation process, set the MySQL Root Password. For the moment, you do not need to add any other MySQL User Accounts.

   For convenience, add the MySQL executables to the system PATH.

   On Windows, the path should be C:\Program Files\MySQL\MySQL Server 5.7\bin. Do not replace the entire PATH variable; append this folder to the existing value, separated by a semicolon (;).

   On Mac OS X, to add the MySQL executables to your PATH you might need to run the following commands in a Terminal:

       echo 'export PATH="/usr/local/mysql/bin:$PATH"' >> ~/.bash_profile
       source ~/.bash_profile

   Furthermore, to set the MySQL Root Password, you might need to run:

       mysqladmin -u root password yourpasswordhere
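To confirm that the PATH change took effect, a small check like the one below can help. It is only a sketch: the function name `find_executable` is made up for this example, and it relies on nothing more than Python's standard `shutil.which`, which searches the current PATH the same way the shell does.

```python
import shutil


def find_executable(name="mysql"):
    """Return the full path of an executable found on PATH, or None if absent."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_executable("mysql")
    if location is None:
        # PATH edits only apply to terminals opened after the change.
        print("mysql was not found on PATH; check the bin folder and open a new terminal")
    else:
        print("mysql client found at", location)
```

If the client is not found right after editing PATH, opening a new Command Prompt or Terminal is usually the first thing to try.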

Creating the SteelWheels database

3. Download the SteelWheels.sql file that has been published together with this lab. Save the file to some folder (e.g. Desktop).

4. Open a Command Prompt (or Terminal) and navigate to the folder where you have saved the SteelWheels.sql file.

5. Execute the following command:

       mysql -u root -p

   When prompted, give your MySQL Root Password.

6. You are now logged into MySQL. Execute the following commands to create the database and its user:

       CREATE DATABASE steelwheels;
       CREATE USER 'steelwheels'@'localhost' IDENTIFIED BY 'steelwheels';
       GRANT ALL PRIVILEGES ON steelwheels.* TO 'steelwheels'@'localhost';

7. Execute the following command to create the tables and load the data:

       SOURCE SteelWheels.sql

8. You have now created the SteelWheels database. Write quit to quit the MySQL client.
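Steps 5 to 7 can also be scripted instead of typed interactively. The sketch below only assembles the two `mysql` invocations and runs them on request; the helper names `build_commands` and `run_setup` are invented for illustration, and it assumes that `mysql` is on PATH and that SteelWheels.sql sits in the current folder. It uses the client's `-e` option to pass statements on the command line.

```python
import subprocess

# The statements from step 6, joined into one string for the -e option.
SETUP_SQL = (
    "CREATE DATABASE steelwheels; "
    "CREATE USER 'steelwheels'@'localhost' IDENTIFIED BY 'steelwheels'; "
    "GRANT ALL PRIVILEGES ON steelwheels.* TO 'steelwheels'@'localhost';"
)


def build_commands(sql_file="SteelWheels.sql"):
    """Return the two mysql invocations as argument lists; nothing is executed."""
    create = ["mysql", "-u", "root", "-p", "-e", SETUP_SQL]
    load = ["mysql", "-u", "root", "-p", "-e", f"SOURCE {sql_file}"]
    return create, load


def run_setup(sql_file="SteelWheels.sql"):
    """Run both commands in order; mysql prompts for the root password each time."""
    for command in build_commands(sql_file):
        subprocess.run(command, check=True)
```

Calling `run_setup()` from the folder that contains the SQL file should have the same effect as steps 5 to 7, although running it a second time will fail because the database and user already exist.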

Add the MySQL database driver to PDI

9. Get the MySQL database driver from here:
   http://dev.mysql.com/get/downloads/connector-j/mysql-connector-java-5.1.37.zip

10. Open the ZIP file and extract the JAR file (mysql-connector-java-5.1.37-bin.jar) to some folder.

11. Copy/move the JAR file to the lib folder of your PDI installation (\data-integration\lib). You will find other JARs in that folder.

Creating a database connection in PDI

12. Open PDI (Spoon) and create a new transformation.

13. Change to the View tab, and expand Transformations > Transformation 1 > Database connections.

14. Right-click Database connections and select New.

15. In the Database Connection dialog:
    - In Connection Name write steelwheels
    - In Connection Type select MySQL
    - In Access select Native (JDBC)
    - In Host Name write localhost
    - In Database Name write steelwheels
    - In User Name write steelwheels
    - In Password write steelwheels
    - Click Test to test the connection.

16. Close the Database Connection Test dialog and close the Database Connection dialog with OK.

17. In the View tab, right-click the steelwheels database connection and select Share. This will make the database connection available to every transformation.

Exploring the database

18. Right-click the steelwheels database connection and select Explore.

19. In the Database Explorer dialog, expand steelwheels and Tables.

20. Right-click the customers table and select View SQL.

21. In the Simple SQL editor dialog, change the query to:

        SELECT CUSTOMERNUMBER, CUSTOMERNAME, CITY, COUNTRY FROM CUSTOMERS

22. Click Execute to preview the data.

23. Close the Examine preview data dialog, the Results of the SQL statements dialog, the Simple SQL editor dialog, and the Database Explorer dialog.

Querying the database in a transformation

24. Switch to the Design tab, and drag a Table input step to the transformation.

25. Configure the Table input step:
    - In Connection select steelwheels
    - Click on Get SQL select statement
    - Expand steelwheels and Tables
    - Select orders and click OK
    - When asked if you want to include the field names in the SQL, click Yes
    - At the end of the SQL statement, add:

          WHERE STATUS = 'Shipped'

    - Click on Preview and then OK

    The following window appears:

    [Figure: preview window]

26. Close the window and press OK to close the step configuration.

27. Add the following sequence of steps to the transformation:

    [Figure: sequence of steps. The steps configured below are Calculator, Number range, Sort rows, and Select values.]

28. Configure the Calculator step:
    - Add a new field called diff_days to calculate Date A - Date B (in days), where A is SHIPPEDDATE and B is REQUIREDDATE.

29. Configure the Number range step:
    - In Input field select diff_days
    - In Output field write delivery
    - Configure the Ranges as shown here:

    [Figure: Number range configuration]

30. Configure the Sort rows step to sort by diff_days.
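The logic of the Calculator, Number range, and Sort rows steps can be sketched outside PDI as follows. This is an illustration, not what PDI runs internally: the range boundaries and the labels "early", "on time", and "late" are assumptions (the ranges figure is not available here), and the sample orders are invented placeholders rather than rows from SteelWheels.

```python
from collections import Counter
from datetime import date


def diff_days(shipped, required):
    """Date A - Date B in days, with A = SHIPPEDDATE and B = REQUIREDDATE."""
    return (shipped - required).days


def delivery_label(days):
    """Assumed ranges: negative = early, zero = on time, positive = late."""
    if days < 0:
        return "early"
    if days == 0:
        return "on time"
    return "late"


def classify(orders):
    """Return (delivery, ORDERNUMBER, REQUIREDDATE, SHIPPEDDATE) rows sorted by diff_days."""
    with_diff = [(diff_days(s, r), n, r, s) for n, r, s in orders]
    with_diff.sort(key=lambda row: row[0])
    return [(delivery_label(d), n, r, s) for d, n, r, s in with_diff]


# Invented sample rows: (ORDERNUMBER, REQUIREDDATE, SHIPPEDDATE).
SAMPLE_ORDERS = [
    (1, date(2004, 12, 10), date(2004, 12, 8)),
    (2, date(2004, 12, 10), date(2004, 12, 10)),
    (3, date(2004, 12, 10), date(2004, 12, 13)),
    (4, date(2004, 12, 15), date(2004, 12, 9)),
]

rows = classify(SAMPLE_ORDERS)
print(Counter(label for label, *_ in rows))
```

Counting the labels, as in the last line, is one way to answer questions about how many orders fall into each delivery category once the real data is used.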

31. Configure the Select values step to select the fields delivery, ORDERNUMBER, REQUIREDDATE, and SHIPPEDDATE.

32. Click on the Select values step, and do a Preview.

33. How many orders were shipped earlier than the required date? How many orders were shipped on time? How many orders were shipped late?

34. Save your transformation.

Using query parameters

35. Add a Data grid step to the transformation and connect it to the Table input step.

36. Configure the Data Grid step:
    - Change the Step name to Query params

    - In the Meta tab, create two new fields called date_from and date_to, both of type String.
    - Switch to the Data tab and, in the first row, write 2004-12-01 in the first column, and 2004-12-10 in the second column.
    - Make sure that you have only one row. Delete any additional (empty) rows that are not being used.

37. Double-click the Table input step to change its configuration:
    - At the end of the SQL statement, add the following line:

          AND ORDERDATE BETWEEN ? AND ?

    - Check the option Replace variables in script
    - In Insert data from step select Query params

38. Click on the Select values step, and do a Preview. You should now have only 10 rows. Check that the dates are within the date_from and date_to values.
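The question marks in step 37 are positional placeholders: each one is filled, in order, from the fields of the incoming Query params row. The sketch below shows the same idea with Python's built-in `sqlite3` module, which also uses `?` placeholders. It is a stand-in only: the in-memory table, its three columns, and its rows are invented for the example, and dates are stored as ISO text rather than as MySQL date values.

```python
import sqlite3


def demo_connection():
    """Build an in-memory stand-in for the orders table with invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (ORDERNUMBER INTEGER PRIMARY KEY, ORDERDATE TEXT, STATUS TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            (1, "2004-11-30", "Shipped"),
            (2, "2004-12-01", "Shipped"),
            (3, "2004-12-05", "Cancelled"),
            (4, "2004-12-10", "Shipped"),
            (5, "2004-12-11", "Shipped"),
        ],
    )
    return conn


def shipped_orders_between(conn, date_from, date_to):
    """Bind date_from and date_to to the two ? placeholders, in that order."""
    query = (
        "SELECT ORDERNUMBER, ORDERDATE FROM orders "
        "WHERE STATUS = 'Shipped' "
        "AND ORDERDATE BETWEEN ? AND ? "
        "ORDER BY ORDERNUMBER"
    )
    return conn.execute(query, (date_from, date_to)).fetchall()


print(shipped_orders_between(demo_connection(), "2004-12-01", "2004-12-10"))
```

BETWEEN includes both end points, so orders dated exactly on date_from or date_to are returned as well.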

Storing the results in the database

39. Add a Table output step to the transformation and connect Select values to the Table output. When connecting the two steps, select Main output of step.

40. Configure the Table output step:
    - In Connection select steelwheels
    - In Target table write results
    - Check the option Truncate table
    - Press the SQL button. The Simple SQL editor will appear with a CREATE statement.
        o Replace REQUIREDDATE UNKNOWN with REQUIREDDATE TIMESTAMP
        o Replace SHIPPEDDATE UNKNOWN with SHIPPEDDATE TIMESTAMP
        o Add a line with the primary key: PRIMARY KEY(ORDERNUMBER)
        o Press the Execute button to create the table.

41. Once the results table has been successfully created, you can close the Table output configuration with OK.

42. Click on the Table output step, and do a Preview. You should have the same results as before. The difference is that these results have now been written to the database.

43. Save your transformation.

44. Open a Command Prompt (or Terminal) and execute the following command:

        mysql -u steelwheels -p

    When prompted, give the password: steelwheels

45. You are now logged into MySQL. Execute the following command to select the steelwheels database:

        USE steelwheels;

46. Execute the following command to show the database tables:

        SHOW TABLES;

    Check that there is a results table.

47. Query the results table with:

        SELECT * FROM results;

48. Write quit to quit the MySQL client.
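The combination of a primary key and the Truncate table option means that each run of the transformation replaces the previous contents of the results table instead of appending to it. The sketch below imitates that behaviour with Python's `sqlite3` module. Several things are assumptions: the column types for delivery and ORDERNUMBER are guesses (only the two TIMESTAMP columns and the primary key come from the steps above), SQLite has no TRUNCATE so `DELETE FROM` is used instead, and the rows are invented.

```python
import sqlite3

# Assumed DDL; the statement generated by PDI may use other types.
CREATE_RESULTS = """
CREATE TABLE results (
    delivery TEXT,
    ORDERNUMBER INTEGER,
    REQUIREDDATE TIMESTAMP,
    SHIPPEDDATE TIMESTAMP,
    PRIMARY KEY (ORDERNUMBER)
)
"""


def make_results_table():
    """Create an empty results table in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(CREATE_RESULTS)
    return conn


def write_results(conn, rows):
    """Empty the table, then insert the new rows in one transaction."""
    with conn:
        conn.execute("DELETE FROM results")  # stand-in for TRUNCATE
        conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?)", rows)


conn = make_results_table()
write_results(conn, [("late", 1, "2004-12-10", "2004-12-13")])
write_results(conn, [("early", 2, "2004-12-10", "2004-12-08"),
                     ("on time", 3, "2004-12-10", "2004-12-10")])
print(conn.execute("SELECT * FROM results").fetchall())
```

After the second call only the two newest rows remain, which mirrors what a second run of the transformation does to the results table.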

Pencil & paper exercise

Consider a data migration system where the goal is to populate the following table:

    Person(firstname, lastname)

The input data comes from two different sources:

- The first data source is the table Customer(name, surname):

      +-----------+-----------+
      | name      | surname   |
      +-----------+-----------+
      | Alexander | Scott     |
      | Paul      | Wilson    |
      | Scott     | Craig     |
      | Simon     | Alexander |
      | Craig     | Wilson    |
      +-----------+-----------+
      5 rows in set

  The schema matches between the Customer table and the Person table are:

      Customer.name -> Person.firstname
      Customer.surname -> Person.lastname

- The second data source is a CSV file with the following contents:

  [CSV contents not reproduced]

  The schema matches between the CSV file and the Person table are unknown.

Find these schema matches with a Naive Bayes learner, using the examples in table Customer as training data and the examples in the CSV file as test data.
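To check a hand calculation, a Naive Bayes learner for this exercise can be sketched in a few lines. The version below is one possible formulation, not necessarily the one expected in the course: it treats each whole value as a single token, uses Laplace (add-one) smoothing, and scores a whole column by summing log probabilities. The training values are those of the Customer table; the column passed at the end is a made-up example, since the CSV contents are not available here.

```python
import math
from collections import Counter

# Training examples from the Customer table, labelled with the matched Person attribute.
TRAINING = {
    "firstname": ["Alexander", "Paul", "Scott", "Simon", "Craig"],
    "lastname": ["Scott", "Wilson", "Craig", "Alexander", "Wilson"],
}


def train(training):
    """Return per-label priors and value counts, plus the shared vocabulary."""
    vocabulary = {value for values in training.values() for value in values}
    total = sum(len(values) for values in training.values())
    model = {
        label: {
            "prior": len(values) / total,
            "counts": Counter(values),
            "size": len(values),
        }
        for label, values in training.items()
    }
    return model, vocabulary


def log_score(model, vocabulary, label, column):
    """log P(label) + sum of log P(value | label), with add-one smoothing."""
    entry = model[label]
    score = math.log(entry["prior"])
    for value in column:
        probability = (entry["counts"][value] + 1) / (entry["size"] + len(vocabulary))
        score += math.log(probability)
    return score


def classify_column(column, training=TRAINING):
    """Return the Person attribute with the highest score for the given column values."""
    model, vocabulary = train(training)
    return max(model, key=lambda label: log_score(model, vocabulary, label, column))


# Hypothetical test column; replace with the values of one CSV column.
print(classify_column(["Paul", "Simon"]))
```

With this training data the vocabulary has 6 distinct values, so for example P(Paul | firstname) = (1 + 1) / (5 + 6) = 2/11 while P(Paul | lastname) = (0 + 1) / (5 + 6) = 1/11.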