Integrating Large Datasets from Multiple Sources Calgary SAS Users Group (CSUG)

Similar documents
Electricity Forecasting Full Circle

IFC ENERGY STORAGE MARKET REPORT

Step 1: Open browser to navigate to the data science challenge home page

EUROPEAN ICT PROFESSIONAL ROLE PROFILES VERSION 2 CWA 16458:2018 LOGFILE

ALBERTA S RESIDENTIAL AND COMMERCIAL SOLAR PROGRAM

Oracle and Tangosol Acquisition Announcement

Reporting Template. By Denis Fafard Business Analyst WCB - Alberta

CV2ODBC Procedure. Overview. CV2ODBC Procedure Syntax APPENDIX 4

NERC Staff Organization Chart Budget 2019

Analytics: Server Architect (Siebel 7.7)

Memorandum of Understanding between the Central LHIN and the Toronto Central LHIN to establish a Joint ehealth Program

Appendix C: Run a Report of Individual Listings for your Department

FEATURES BENEFITS SUPPORTED PLATFORMS. Reduce costs associated with testing data projects. Expedite time to market

Principles of Automation

University of Massachusetts Amherst * Boston * Dartmouth * Lowell * President s Office * Worcester

ROCK-POND REPORTING 2.1

Jane s Defence Industry & Markets Intelligence Centre. Develop Advantage. Mitigate Risk. Capture Opportunity.

INTRODUCTION TO SAS HOW SAS WORKS READING RAW DATA INTO SAS

GREEN CAPE WASTE WORKSHOP

Optimizing wind farms

Cloud CFO Trial: Step by Step Guide

Advanced Solutions of Microsoft SharePoint Server 2013 Course Contact Hours

7. Run the TRAVERSE Data Migration Utility from TRAVERSE 10.2 into TRAVERSE 10.5.

Infrastructure Advisory

Top Coding Tips. Neil Merchant Technical Specialist - SAS

GUIDE TO USING THE 2014 AND 2015 CURRENT POPULATION SURVEY PUBLIC USE FILES

2 Year Business Plan Utilities Business Support


Project Management Professional (PMP) Certification Course

Project and Portfolio Management Center

Talend Open Studio for Data Quality. User Guide 5.5.2

A simplistic approach to Grid Computing Edmonton SAS Users Group. April 5, 2016 Bill Benson, Enterprise Data Scienc ATB Financial

Data Management Glossary

Advanced Solutions of Microsoft SharePoint Server 2013

NERC Staff Organization Chart Budget 2019

Biostatistics & SAS programming. Kevin Zhang

A SAS Macro for Producing Benchmarks for Interpreting School Effect Sizes

QMF Analytics v11: Not Your Green Screen QMF

SeUGI 19 - Florence WEB Enabling SAS output. Author : Darryl Lawrence

Sage Intelligence Financial Reporting for Sage ERP X3 Report Designer: Oracle Setup Conversion Guidelines

REPORTING Copyright Framework Private Equity Investment Data Management Ltd

Adviser Central Help Guide

History of NERC December 2012

WORKING WITH P6 PROFESSIONAL Adam Baker & Dan Beck May 27, 2015 DRM May 2015 Technical Webinar

SAS Drug Development Program Portability

Installation Procedure and Registration

IBM IBM WebSphere Information Analyzer v8.0.

Building Self-Service BI Solutions with Power Query. Written By: Devin

Importing CSV Data to All Character Variables Arthur L. Carpenter California Occidental Consultants, Anchorage, AK

Advanced Solutions of Microsoft SharePoint 2013

Data Transfer from Microsoft Access to SAS Made Easy

External Data Connector for SharePoint

Budget Review Process (BRP) Preliminary List of Business Initiatives. Stakeholder Meeting April 10, 2017

South Asia Cross-Border Electricity Trade and role of SARI/E

Bremer WealthLink Reference Guide

Business Intelligence on Dell Quickstart Data Warehouse Appliance Using Toad Business Intelligence Suite

Improving Data Governance in Your Organization. Faire Co Regional Manger, Information Management Software, ASEAN

Data Governance. Production Surveillance & Optimisation. November 2015 Melvin Keightley

Let s manage agents. Tom Sightler, Principal Solutions Architect Dmitry Popov, Product Management

Oracle Financial Services Enterprise Modeling User Guide. Release July 2015

Importing to WIRED Contact From a Database File. Reference Guide

relational databases

The Output Bundle: A Solution for a Fully Documented Program Run

EMC Ionix ControlCenter (formerly EMC ControlCenter) 6.0 StorageScope

SSR Staff Information Sessions Information Technology

Supporting the Cloud Transformation of Agencies across the Public Sector


ER/Studio Enterprise Portal User Guide

Big data privacy in Australia

SOFTWARE PLATFORM INFRASTRUCTURE. as a Service. as a Service. as a Service. Empower Users. Develop Apps. Manage Machines

Hot Tech Tips for using SPSS Statistics Part 1. Laura Watts Mitya Moitra Wednesday October 11 th 2017, 11AM

ORACLE PRIMAVERA RISK ANALYSIS (PERTMASTER )

CA ERwin Data Profiler

Decision D ATCO Electric Ltd. Crystal Lake 722S Substation Telecommunications Tower Replacement

Purchase this book at

STRATHCONA AREA TRANSMISSION UPGRADE PROJECT

Business Model for Global Platform for Big Data for Official Statistics in support of the 2030 Agenda for Sustainable Development

Deliverable 17.3 Test Report on MD-Paedigree Release

Global Software, Inc.'s Database Manager User Manual. Version 14.6

ODBC Setup MS Access 2007 Overview Microsoft Access 2007 can be utilized to create ODBC connections. This page will show you the steps to create an

Experiences and inspiration from Denmark

BUSINESS ANALYTICS. 96 HOURS Practical Learning. DexLab Certified. Training Module. Gurgaon (Head Office)

Using DDE with Microsoft Excel and SAS to Collect Data from Hundreds of Users

A WHITE PAPER By Silwood Technology Limited

Sage X3 Intelligence Financial Reporting. Installation and Upgrade Guide

The Power of PROC SQL Techniques and SAS Dictionary Tables in Handling Data

Sharing and Deploying MATLAB Applications

SLSA IT Systems Training Notes 2018

Open data and analytics for a sustainable energy future. Version 0.2 January 23 rd, 2017

Patching the Holes in SQL Systems with SAS

Investor Presentation. February 2016

Perceptive TransForm E-Forms Manager Data Source

Mike Schulte Data Scientist at the University of Pittsburgh Professor of Economics and Philosophy at Western Michigan University

The EP Technical Data Quality Metrics Initiative

Writing Analytical Queries for Business Intelligence

SHARED SERVICES - INFORMATION TECHNOLOGY

Randy House Vice President of Health Informatics. Saint Luke s Health System. Lynsey McNeal Director Data of Governance. Saint Luke s Health System

Administration and Data Retention. Best Practices for Systems Management

Introduction to the NYISO

PETRONAS E&P Technical Data Quality Metrics Initiative Going Green with Data

Transcription:

Integrating Large Datasets from Multiple Sources Calgary SAS Users Group (CSUG) October 25, 2017 Hotel Le-Germain

Outline About the AESO Large Datasets: AESO Context Usual Process Obtain data Connecting to ODBC drivers for Oracle and SQL server data Query data on Oracle directly Import from Excel/CSV Process data using Data/Proc steps, including SQL commands Export data to SQL server or Excel for further simulation or analysis and presentation About the Presenters and Market Simulation Team 1

About the AESO Independent System Operator that plans/operates provincial transmission grid Continuous evaluation; 20- year planning horizon Provides access for new generation and load Operates wholesale electricity market $2B in 2016 power market transactions; 200+ participants Not-for-profit corporation; no grid asset ownership 2

Large datasets - AESO context The Alberta Electric System Operator (AESO) manages and operates the provincial power grid. Manage and plan the power grid 24 hours a day (operations data) Manage and operate the energy markets (market data) Plan the future of the system and its infrastructure (simulated operations and market data) 3

Large datasets - AESO context Typical projects for the Forecasting and Analytics team include analyzing multiple repositories of datasets that are not always 100% compatible because of different Intervals (seconds, minutes, hours), Units of analysis (individual power plants, transmission lines), Granularity (individual, regional, Alberta-wide), Time stamps (i.e., multiple versions of same record) Mix of historical and forecast data from internal and external sources Results from multiple iterations (Monte Carlo simulations) Real and nominal dollar values Requires blending and merging data carefully, efficiently 4

Using SAS to Obtain Data Connecting and Importing from Oracle 1. Set up connections via MS ODBC Administrator 2. Create CSV with database name, data source, user name and password all as defined in MS ODBC Administrator DB DB source User Password 3. Open SAS program node with the following filename dbpw 'H:\SAS\dbpw.csv'; CSV file directory %macro libodbc(lib, db, user, pw); libname &lib. odbc datasrc="&db." user="&user." pwd="&pw."; %mend; data _null_; attrib lib length=$32; attrib db length=$32; attrib user length=$32; attrib pw length=$256; infile dbpw dsd lrecl=1000; input lib db user pw; call execute(cats('%libodbc(',lib,',',db,',',user,',',pw,')')); run; Voila! Oracle servers are now accessible 5

Using SAS to Obtain Data Connecting and Importing from SQL Server 1. Set up connections via MS ODBC Administrator 2. Open SAS program node with the following libname AURORA odbc datasrc=xxx Library name in SAS DB source (as in ODBC Admin) noprompt= "UID=; PWD=; DSN=AURORA; Tip: leave username and password blank if same as Windows login DB name (as in ODBC Admin) SERVER=XXX; DATABASE=dataset_repository;"; DB server (as in ODBC Admin) DB server (as in ODBC Admin) Voila! SQL server datasets are now added next to Oracle servers 6

Using SAS to Obtain Data Query Data on Oracle or SQL Server Directly DSN Name Need to create prompt message 7

Using SAS to Obtain Data Importing from Excel / CSV Import a single file proc Import datafile= C:\...\File_Name.xlsx" out=work.xxx DBMS="xlsx" REPLACE; SHEET= Sheet1"; GETNAMES=YES; RANGE= RANGE_NAME"; Run; Import multiple files %macro ImportFile(num); %do num = 1 %to 10; proc Import datafile= C:\...\File_Name_&num..xlsx" out=work.xxx_&num. DBMS="xlsx" REPLACE; SHEET= Sheet1"; GETNAMES=YES; RANGE= RANGE_NAME"; Run; %end; %mend ImportFile; Unless otherwise specified, Excel/CSV imports will reside in the temp folder 8

Processing data Analyze Renewable Generation Data Data tables: contain simulated hourly wind and solar data by site from 2014 to 2017 Information table: contains capacity and location of each site Merge two tables (the power of Hash!) 9

Processing data Select Price Data Background: Multiple years of simulation An even number of iterations One price for each year that is closest to the median price of that year Instead of using data step, we used the following approach: Select the median price for each year (Proc Summary) Calculate the difference between the price from each iteration and the median price for each year (Proc SQL) Find the minimum difference for each year (Proc SQL) Find which iteration that created the median price for that year (the power of Hash!) 10

Export data Processed data can be transferred to SQL server (for further simulations) or Excel (for analysis and presentation) Table created in SAS to be exported to SQL server 11

About the Presenters and Market Simulation Team Jin Chen joined the AESO s Market Simulation team as a Senior Analyst, Market Simulation in March 2017. Jin has extensive experience within the Alberta power market in numerous roles that have focused on market simulation, market fundamental and risk quantitative analytics. Through the past 16 years, Jin has been a key contributor in roles including Forecast Specialist and Risk Specialist (ENMAX), Market Analyst (the AESO), Senior Quantitative Analyst and Asset Optimization Analyst (ATCO Power), and Portfolio Analyst (EPCOR Merchant and Capital). Jin has a MA in Economics from the University of Calgary. Leonardo Tovar joined the AESO s Market Simulation team in February 2017. His immediate focus was on driving changes to AESO s market simulation model, championing the use of tools such as SAS for simulation activities and streamlining processes within the Forecasting and Analytics team to increase efficiency. Prior to joining the AESO, Leonardo has had significant government advisory and analytical experience with the Ontario Ministry of Finance in the roles of Special Policy Advisor and Senior Economist. Leonardo s educational background will support his activities within the team via a Master of Policy from the University of Toronto and a BA in both Political Science and Economics from the University of Calgary. About the team: The AESO s Market Simulation team are market advisors to the AESO and its stakeholders through the creation of market assessments and risk management strategies through detailed analysis of market fundamentals that incorporate a leading edge market simulation methodology. The team is part of Forecasting & Analytics which has a mission to be regarded by the AESO and its stakeholders as a trusted Centre of Excellence that provides quantitative and qualitative data-driven recommendations and assessments whilst maintaining an agile and efficient team. The larger team is strategically located within Regulatory & External Affairs to be ready to provide analytic support and advice across the AESO. 12