BIG DATA, FAST PROCESSING SPEEDS


1 BIG DATA, FAST PROCESSING SPEEDS
December 4, 2013
Gary T. Ciampa, SAS Solutions OnDemand Advanced Analytics Lab
Birmingham Users Group 2013

2 OVERVIEW AND AGENDA
Big data introduction
SAS language performance tuning
SAS system facilities
SQL, MACRO and DATA STEP examples
Case study - SAS Revenue Optimization Solution
History and tuning techniques
High Performance Revenue Optimization GRID environment
SAS emerging big data technologies

3 BIG DATA INTRODUCTION
Wiki Knows All: a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.
Forrester: software and/or hardware solutions that allow firms to discover, evaluate, optimize, and deploy predictive models by analyzing big data sources to improve business performance or mitigate risk.
Gartner: technology is the management of high-volume, high-velocity and high-variety information assets that demand cost-effective and innovative forms of information processing for enhanced insight and decision making.

4 the management of high-volume, high-velocity and high-variety assets that demand cost-effective and innovative forms of processing for enhanced insight and decision making

5 BIG DATA ACCORDING TO SAS
Incorporates concepts of IDC dimensions
  Volume: transactions, streaming, sensors
  Variety: database, warehouse, text, metered, OLAP, stocks, etc.
  Velocity: how fast the data is produced and processed (near real-time)
SAS considers additional dimensions
  Variability: in velocity and variety of the data (peaks and valleys, seasonal)
  Complexity: handling disparate sources to cleanse, transform, correlate and establish relationships and hierarchies
SAS Big Data Starting Point:

6 APPROACHES TO PROCESSING BIG DATA
Bigger, Faster, More Powerful is Better
  Increase CPU processor speed and count
  Increase MEMORY capability or speed
  Faster networks and network devices
  High-speed disk arrays, or direct memory disk arrays
Parallel Processing
  Multi-threading capabilities, distributed processing within or across nodes
  Segmented data along with distributed processing
Viable, but not always feasible within constraints (time, resource and dollars)

7 SAS SYSTEM FACILITIES
SAS command line options, AUTOEXEC and CONFIG processing
  Customizes the SAS execution environment
  Settings can affect performance significantly
  Settings may have unexpected or unintended consequences
  Set on the command line, in the configuration file, or within the program
SAS Companion for <OS> (Windows, UNIX, z/OS)
Bonus Options
  VERBOSE option emits options and configuration details
  RTRACE option emits a list of resources that are read or loaded

8 SAS SYSTEM AND HOST OPTIONS
System Options, SAS Files: BUFNO, BUFSIZE, OBS, IBUFNO, IBUFSIZE (index processing)
System Administration, Memory: MEMSIZE, SORTSIZE, SUMSIZE
System Administration, Performance: CPUCOUNT, THREADS
System Options for Macros: MLOGIC, MPRINT, SYMBOLGEN (everyone has their favorites)
NOTE: Use the *correct* SAS Companion for the target OS
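
Before tuning any of these options it helps to see what the session is currently using. The following is a minimal sketch, assuming a standard Base SAS session; the option groups queried are the ones that correspond to the categories listed on the slide, and the values printed depend entirely on the local configuration.

```sas
/* Print the current settings for the option groups named on the slide. */
proc options group=sasfiles;    run;   /* file options such as BUFNO, BUFSIZE, OBS */
proc options group=memory;      run;   /* memory options such as MEMSIZE, SORTSIZE, SUMSIZE */
proc options group=performance; run;   /* performance options such as CPUCOUNT, THREADS */
proc options group=macro;       run;   /* macro options such as MLOGIC, MPRINT, SYMBOLGEN */

/* Read individual option values into the log. */
%put NOTE: MEMSIZE  = %sysfunc(getoption(memsize));
%put NOTE: SORTSIZE = %sysfunc(getoption(sortsize));
%put NOTE: CPUCOUNT = %sysfunc(getoption(cpucount));

/* Options that may be changed during a session are set with OPTIONS. */
options threads fullstimer;
```

Some options, MEMSIZE among them, can only be set at SAS invocation or in the configuration file, which is one reason to check the SAS Companion for the target operating system.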

9 SAS SYSTEM FACILITIES
SAS option STIMER or FULLSTIMER
  System performance statistics: CPU, memory, real and elapsed time
  Subtle differences depending on the OS
SAS option MSGLEVEL: level of detail for messages written to the SAS log
SAS option OBS: last observation or record to process
ARM and PERF macro facility
  Default or custom performance metrics at the programmer's discretion
  PROC or DATA STEP statistics
  User-controlled START and STOP semantics across segments of SAS code
  Discrete log and format, including macros to process and report on metrics

10 SAMPLE OPTIONS STATEMENTS & LOG
options obs=max fullstimer;
data work.sort500k;
  set sgf2013.sort_500000;
run;
NOTE: DATA statement used (Total process time):
      real time           1.66 seconds
      user cpu time       0.12 seconds
      system cpu time     0.34 seconds
      memory              k
      OS Memory           k
      Timestamp           04/25/ :16:21 PM
options obs=10;
data work.sort500k;
  set sgf2013.sort_500000;
run;
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      user cpu time       0.00 seconds
      system cpu time     0.03 seconds

11 SAMPLE ARM / PERF MACRO EXECUTION
%let _armexec=1;
%perfinit(applname="glm_appl_1");
%perfstrt(txnname="glm_txn1");
/* Do some work */
%perfstop;
%perfstrt(txnname="glm_txn2");
ods exclude all;
proc GLM data=one;
  model y = x1;
  by by;
quit;
ods select all;
%perfstop;

12 SAMPLE ARM / PERF MACRO EXECUTION
lines deleted
G, ,2,2,Glm_Txn1, CPU,IO_CNT,MEMORY INFO,THREAD
S, ,2,1,1, , , , , ,6,6
P, ,2,1,1, , ,0, , , ,6,6
lines deleted
G, ,2,2,Glm_Txn2, CPU,IO_CNT,MEMORY INFO,THREAD
S, ,2,2,2, , , , , ,6,6
P, ,2,2,2, , ,0, , , ,6,6
Reference: SAS 9.3 Interface to Application Response Measurement

13 OVERVIEW ENVIRONMENT AND INTRODUCTION
Sample Environment
  RHEL Linux 5.6, Intel Xeon 2.67 GHz, 32 Cores, 256 MB; SAS 9.3, Oracle Table, 44 columns, 10 million records
SAS Language Reference (cost, benefit and considerations)
  Understanding SAS Indexes
  Understanding Integrity Constraints
Use EXISTS (0:04.6) rather than IN (0:05.2). For example,
  select * from table_a a
  where exists (select * from orders o where a.prod_id=o.prod_id);
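
To try the EXISTS versus IN comparison locally, a small harness along the following lines may help. It is a sketch under stated assumptions: the WORK tables, their sizes and the qty column are invented for illustration and do not reproduce the 10 million row Oracle table from the sample environment, so the timings will not match the slide.

```sas
/* Hypothetical test data: 1,000 products, every third one ordered. */
data work.table_a;
  do prod_id = 1 to 1000;
    qty = mod(prod_id, 7);
    output;
  end;
run;

data work.orders;
  do prod_id = 1 to 1000 by 3;
    output;
  end;
run;

options fullstimer;   /* report CPU and memory per step in the log */

proc sql;
  /* Correlated EXISTS form, as on the slide */
  create table work.via_exists as
    select a.*
    from work.table_a a
    where exists (select 1 from work.orders o where a.prod_id = o.prod_id)
    order by a.prod_id;

  /* IN form, for comparison */
  create table work.via_in as
    select a.*
    from work.table_a a
    where a.prod_id in (select prod_id from work.orders)
    order by a.prod_id;
quit;

/* Confirm that both forms return the same rows. */
proc compare base=work.via_exists compare=work.via_in;
run;
```

Which form wins is likely to depend on table sizes, indexes and whether the query is passed through to the database, so measuring on representative data is the safer guide.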

14 INDEXES USING INDEXES FOR PERFORMANCE OPTIMIZATION
INDEX Considerations (TANSTAAFL)
  Data file size: small tables are suitable for sequential processing
  Change rate of the data and use of key variables, NAME versus GENDER
  Generally used when subsetting the data; 25% or less is typical
  Sort by key variables; ordered data improves index behavior
Some operators and conditions are not optimized with an INDEX
  Arithmetic, variable-to-variable, sounds-like operator
  CONTAINS, IS NULL or IS MISSING, TRIM, SUBSTR*
where amount ne 0;   0:28.0 Minutes:Seconds.Tenths
where amount > 0;    0:26.0 Minutes:Seconds.Tenths
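
The index considerations above can be explored with a short experiment. This is a minimal sketch, assuming a Base SAS session; the data set, variable names and row count are hypothetical. MSGLEVEL=I asks SAS to write informational messages to the log, which should include whether an index was chosen for a WHERE clause.

```sas
options msglevel=i fullstimer;

/* Hypothetical table: 100,000 orders with a unique key. */
data work.orders;
  do order_id = 1 to 100000;
    amount = mod(order_id, 50);
    output;
  end;
run;

/* Create a simple unique index on the key variable. */
proc datasets library=work nolist;
  modify orders;
  index create order_id / unique;
quit;

/* Selective subset on the indexed variable: a candidate for index use. */
data work.subset_indexed;
  set work.orders;
  where order_id between 100 and 200;
run;

/* Subset on a non-indexed variable: processed sequentially. */
data work.subset_sequential;
  set work.orders;
  where amount > 0;
run;
```

Comparing the log notes for the two subsetting steps gives a quick view of when the index is actually used.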

15 PROC SQL OPTIMIZING PROC SQL
HAVING versus WHERE
  HAVING operates on all rows returned, not a subset
  Use HAVING on summary operations, after a restricted WHERE step
  Order statements: filter or select rows before grouping
select state from order group by state having state = 'nc';   01:50
select state from order where state = 'nc' group by state;    01:31
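
A runnable illustration of the same idea, assuming the SASHELP.CARS sample table that ships with SAS is available; the output table names are hypothetical. The first two queries return the same result, while the third shows the case HAVING is meant for: a condition on a summary value.

```sas
proc sql;
  /* Filter first: only matching rows reach GROUP BY. */
  create table work.where_first as
    select origin, count(*) as n_models
    from sashelp.cars
    where origin = 'Asia'
    group by origin;

  /* Group all rows, then discard groups with HAVING. */
  create table work.having_last as
    select origin, count(*) as n_models
    from sashelp.cars
    group by origin
    having origin = 'Asia';

  /* WHERE restricts the rows, HAVING restricts the summaries. */
  create table work.big_makes as
    select make, count(*) as n_models
    from sashelp.cars
    where origin = 'Asia'
    group by make
    having count(*) > 10;
quit;
```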

16 PROC SQL OPTIMIZING PROC SQL
Nested (sub-)queries
  Minimize nested queries with a small number of tables
SUBQUERY versus JOIN
select ename from employees emp
  where exists (select price from prices
                where prod_id = emp.prod_id and prices.class = 'j');
>05:00 minutes (terminated with prejudice)
select ename from prices pr, employees emp
  where pr.prod_id = emp.prod_id and pr.class = 'j';
01:40 seconds

17 PROC SQL OPTIMIZING PROC SQL
TABLE order
  Order of tables within the SQL statement impacts performance
  List the tables with the greatest number of rows left to right in the query
  SQL processing scans the last table listed, and merges all of the rows
Assuming TAB1 has 20,000 rows, TAB2 has 10 rows
select count (*) from tab2, tab
select count (*) from tab1, tab

18 PROC SQL OPTIMIZING PROC SQL
EXISTS versus DISTINCT for table join
select distinct date, name from sales s, employee emp
  where s.prod_id = emp.prod_id;
> 7 minutes
select date, name from sales s
  where exists (select x from employee emp where emp.prod_id = s.prod_id);
0:11 seconds (including post distinct step)
Reference: SAS 9.3 SQL Procedure User's Guide

19 SAS MACRO OPTIONS AND CONSIDERATIONS
Use MLOGIC, MPRINT & SYMBOLGEN in the development phase
Do NOT use MLOGIC, MPRINT & SYMBOLGEN in production
Stored Compiled Macro Facility
  Permanent SAS catalog
  Protect intellectual property
  Both AUTOCALL and SESSION macros are available
  Override compiled macros with session instances or AUTOCALL semantics
Minimize nesting macro definitions
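
The Stored Compiled Macro Facility can be sketched as follows. The libref MSTORE, the directory name macstore and the macro %rowcount are hypothetical names chosen for illustration. The sketch creates a scratch directory under the WORK location only to stay self-contained; a real deployment would point the libref at a permanent, controlled location.

```sas
/* Create a scratch directory and assign a libref to it.
   SASMSTORE= needs a libref other than WORK. */
%let mdir = %sysfunc(dcreate(macstore, %sysfunc(pathname(work))));
libname mstore "&mdir";

options mstored sasmstore=mstore;

/* STORE compiles the macro into the SASMACR catalog in MSTORE;
   SOURCE keeps the source text with the compiled macro. */
%macro rowcount(ds) / store source des='Write the row count of a data set to the log';
  %local dsid n rc;
  %let dsid = %sysfunc(open(&ds));
  %if &dsid %then %do;
    %let n  = %sysfunc(attrn(&dsid, nlobs));
    %let rc = %sysfunc(close(&dsid));
    %put NOTE: &ds has &n observations.;
  %end;
  %else %put WARNING: Could not open &ds..;
%mend rowcount;

/* Production-style settings: macro tracing switched off. */
options nomlogic nomprint nosymbolgen;

%rowcount(sashelp.class)
```

Omitting the SOURCE option is one way to keep the macro text out of the catalog when protecting intellectual property is the goal.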

20 SAS MACRO NESTING MACRO INSTANCE
Avoid nesting macros where possible
Nested definition (avoid):
%macro m1;
  %macro m2;   /* nested macro */
  %mend m2;
%mend m1;
Separate definitions (preferred):
%macro m1;
  <macro 1 code goes here>
%mend m1;
%macro m2;
  <macro 2 code goes here>
%mend m2;

21 SAS DATA STEP A FEW EXAMPLES TO CONSIDER
Missing values may perturb performance
  A missing value (.) is propagated across all calculations
total=t4+(x*b)+c*(abc);     01:03 (63 seconds)
total=(x*b)+c*(abc) + t4;   00:59
Superior practice: check for . before the expression
if <operand> ne . then do; <expression>; end;
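
A small, self-contained version of the missing-value guard might look like this. The input values are invented for illustration, and NMISS is used here as one convenient way to test several operands at once; the slide's own pattern tests a single operand with `ne .`.

```sas
data work.calc;
  input x b c abc t4;
  /* Evaluate the expression only when no operand is missing,
     so missing values are not propagated through the arithmetic. */
  if nmiss(x, b, c, abc, t4) = 0 then do;
    total = (x*b) + c*(abc) + t4;
  end;
  datalines;
1 2 3 4 5
1 2 3 4 .
. 2 3 4 5
;
run;

proc print data=work.calc;
run;
```

Rows with any missing operand keep a missing total without the step having to evaluate the expression for them.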

22 SAS DATA STEP A FEW EXAMPLES TO CONSIDER
PROC FORMAT: user-defined formats associated with variables
  Details in the Base SAS 9.3 Procedures Guide
  Reference the format throughout the code; simplifies logic and support
if educ = 0 then neweduc="< 3 yrs old";
else if educ=1 then neweduc="no school";
else if educ=2 then neweduc="nursery school";
10:54
proc format;
  value educf 0="< 3 yrs old" 1="no school" 2="nursery school";
run;
neweduc=put(educ,educf.);
10:32
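
For readers who want to run the format-based approach end to end, here is a minimal sketch. The WORK.PEOPLE data set and its values are hypothetical; the format name and labels follow the slide.

```sas
proc format;
  value educf
    0 = '< 3 yrs old'
    1 = 'no school'
    2 = 'nursery school';
run;

data work.people;
  input educ @@;
  length neweduc $ 14;             /* long enough for 'nursery school' */
  neweduc = put(educ, educf.);     /* one lookup replaces the IF/ELSE chain */
  datalines;
0 1 2 1 0
;
run;

/* The same format can be applied at reporting time without a new variable. */
proc freq data=work.people;
  tables educ;
  format educ educf.;
run;
```

Keeping the mapping in one format definition means a change to the labels is made in a single place.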

23 SAS DATA STEP A FEW EXAMPLES TO CONSIDER
Using the IN operator versus OR conditions
  OR checks all the conditions
  IN matches on the first occurrence
if x=8 or x=9 or x=23 or x=45 then do; end;   01:04
if x in (8,9,23,45) then do; end;             00:58

24 SAS USER FEEDBACK: IN VERSUS OR VALIDATION
Thanks to Bruce Gilsen at Federal Reserve for independent validation
Bruce's Optimization Validation
  1,000,000 OBS, 100 VARIABLES with RANGE VALUES 1 to 100
  Independent DATA STEP, using IN versus OR
  IN 8.15 / 7.88 Seconds (REAL / CPU)
  OR / Seconds (REAL / CPU)
data two;
  set one;
  array vall (*) v1-v100;
  drop i;
  do i = 1 to 100;
    if vall(i) in ( ) then vall(i) = vall(i) + 100;
  end;
run;
data two;
  set one;
  array vall (*) v1-v100;
  drop i;
  do i = 1 to 100;
    if vall(i)= 1 or vall(i) = 2 or vall(i) = 3 or vall(i) = 4 vall(i) = 99 then vall(i) = vall(i) ;
  end;
run;
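
A validation of this kind could be set up roughly as follows. This is a sketch, not the original test: the row count is reduced to keep the run short, the random seed is arbitrary, and the value list (8, 9, 23, 45) is borrowed from the earlier IN versus OR slide because the list used in the original validation is not preserved here.

```sas
%let nobs = 100000;   /* assumption: smaller than the 1,000,000 rows described above */

/* Build test data: 100 variables with integer values from 1 to 100. */
data work.one;
  array vall(*) v1-v100;
  call streaminit(2013);
  do row = 1 to &nobs;
    do i = 1 to 100;
      vall(i) = ceil(100 * rand('uniform'));
    end;
    output;
  end;
  drop row i;
run;

options fullstimer;   /* the log then shows real and CPU time for each step */

/* Version 1: IN operator */
data work.two_in;
  set work.one;
  array vall(*) v1-v100;
  drop i;
  do i = 1 to 100;
    if vall(i) in (8, 9, 23, 45) then vall(i) = vall(i) + 100;
  end;
run;

/* Version 2: chained OR conditions */
data work.two_or;
  set work.one;
  array vall(*) v1-v100;
  drop i;
  do i = 1 to 100;
    if vall(i) = 8 or vall(i) = 9 or vall(i) = 23 or vall(i) = 45
      then vall(i) = vall(i) + 100;
  end;
run;

/* Both versions should produce identical output. */
proc compare base=work.two_in compare=work.two_or;
run;
```

Running each step several times and comparing the FULLSTIMER notes gives a fairer picture than a single run.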

25 CASE STUDY - SAS REVENUE OPTIMIZATION SOLUTION
Big Data Introduction
SAS Language Performance Tuning
SAS System Facilities
SQL, MACRO and DATA STEP examples
Case Study - SAS Revenue Optimization Solution
History and Tuning Techniques
High Performance Revenue Optimization GRID Environment
SAS Emerging Big Data Technologies

26 SOLUTIONS ONDEMAND ADVANCED ANALYTICS LAB
Over a petabyte of data, 400+ customers
Customer Profiles
  Variety of industry sectors, private as well as public
  Multi-tier deployments: client, mid-tier, analytic tier and RDBMS
  Daily and weekly ETL feed requirements
  PROD, QA, DEV environments and data synchronization
  Disparate analytic processing (batch) schedules
  Backup and restore processing that minimizes performance impacts
  99.5% uptime service level agreements

27 CASE STUDY SAS REVENUE OPTIMIZATION SOLUTION
Problem Statement: 33 hours of processing time for one batch component using 30% of projected data. A linear projection gives approximately 110 hours, or about 4 ½ days, of processing time.
Requirement to fit the batch into a 40 hour window
AIX 6.1+, Power7, 64 Bit attached to EMC SAN Arrays
7 CPUS, SMT=4, 128GB RAM, 3700 IOPS, CPU 45%
Approximately 1.2 TB of DATA, target 1.6 TB primary warehouse
Focus on the most significant issues and then repeat as new issues arise
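
The linear projection in the problem statement is simple arithmetic: 33 hours / 0.30 = 110 hours, 110 / 24 is roughly 4.6 days, and fitting 110 hours into a 40 hour window calls for a reduction of about 2.75 times. The sketch below reproduces that calculation; the variable names are illustrative.

```sas
data _null_;
  observed_hours = 33;     /* measured run time from the problem statement */
  data_fraction  = 0.30;   /* share of projected data used in that run */
  window_hours   = 40;     /* required batch window */

  projected_hours = observed_hours / data_fraction;   /* linear scaling */
  projected_days  = projected_hours / 24;
  required_factor = projected_hours / window_hours;   /* speedup needed */

  put projected_hours= 8.1 projected_days= 8.2 required_factor= 8.2;
run;
```

The projection assumes run time scales linearly with data volume, which is the same assumption the problem statement makes.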

31 SAS WORK volume
  Eight-way stripe with eight paths
Warehouse
  Fixed Tier 1 EMC storage; 80 x 100GB disk arrays
  Moved support directories off of volume

32 Weekly Performance
Parallel Executions: 16 processes to 54 processes
IO/SEC: 8.5K to 15.3K
CPU Idle Time: 42% to 13%
Weekly Batch Time: 60 hours to 43 hours
GEO_PRODS: 67 Million to 92 Million

36 SAS GRID SAS REVENUE OPTIMIZATION SOLUTION
Initial RO versions used SAS/CONNECT parallel processing
  Single host deployments with concurrent analytics
  Flat data warehouse structure, non-partitioned SAS tables
SAS High Performance Revenue Optimization Enhancements
  SAS TK GRID architecture: distributed processing across grid nodes
  SAS data partitions distributed across grid nodes
  ETL processes, daily and weekly, to distribute data across partitions
  Grid Captain to manage the processing and analytics across grid nodes
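
For context, single-host parallel processing with SAS/CONNECT (often called MP CONNECT) generally follows the pattern sketched below. This is a generic illustration, not the Revenue Optimization code: it assumes SAS/CONNECT is licensed, and the task names, the PWORK libref and the data sets are hypothetical.

```sas
/* Start child sessions on the same host using the parent's SAS command. */
options autosignon sascmd="!sascmd";

/* Task 1 runs asynchronously; the parent's WORK library is visible as PWORK. */
rsubmit process=task1 wait=no inheritlib=(work=pwork);
  data pwork.part1;
    do i = 1 to 500000;
      x = sqrt(i);
      output;
    end;
  run;
endrsubmit;

/* Task 2 runs at the same time on a different slice of the work. */
rsubmit process=task2 wait=no inheritlib=(work=pwork);
  data pwork.part2;
    do i = 500001 to 1000000;
      x = sqrt(i);
      output;
    end;
  run;
endrsubmit;

/* Block until both tasks finish, then end the child sessions. */
waitfor _all_ task1 task2;
signoff _all_;

/* Combine the partial results in the parent session. */
data work.all_parts;
  set work.part1 work.part2;
run;
```

All tasks in this pattern share one host's CPU, memory and I/O, which is the limitation the grid architecture described above is meant to address.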

37 SAS GRID SAS REVENUE OPTIMIZATION NON GRID

38 SAS GRID SAS HIGH PERFORMANCE REVENUE OPTIMIZATION

39 SAS GRID & EMERGING TECHNOLOGIES
SAS Grid Manager: distributed SAS processing
  Scheduling, Workload Balancing, High Availability & Management
SAS In-Database: queries, aggregations, analytics within DBMS
  9.2M3: DB2, EDW & Oracle; 9.3 Netezza
HADOOP
  Scalable, fault tolerant, distributed file system
  SAS integration includes access, analysis and management
SAS In-Memory Analytics
  Distributed, descriptive, inferential to visualization analytics
  Visual Analytics and Visual Analytics HPA

40 SAS TECHNICAL SUPPORT

41 SAS BIG-DATA HOME PAGE

42 SUMMARY CONSIDERATIONS
PERFORMANCE IMPROVEMENT IS A CONTINUAL PROCESS
Focus on the most severe hotspots within the SAS program and operating environment
Use INDEX where appropriate
Exploit SAS OPTIONS tuning
Consider SAS Grid Products
Evaluate SAS Visual Analytics and Visual Analytics HPA

43 SAS SOLUTIONS ONDEMAND ADVANCED ANALYTICS LAB


More information

Virtualizing Oracle on VMware

Virtualizing Oracle on VMware Virtualizing Oracle on VMware Sudhansu Pati, VCP Certified 4/20/2012 2011 VMware Inc. All rights reserved Agenda Introduction Oracle Databases on VMware Key Benefits Performance, Support, and Licensing

More information

SAP IQ - Business Intelligence and vertical data processing with 8 GB RAM or less

SAP IQ - Business Intelligence and vertical data processing with 8 GB RAM or less SAP IQ - Business Intelligence and vertical data processing with 8 GB RAM or less Dipl.- Inform. Volker Stöffler Volker.Stoeffler@DB-TecKnowledgy.info Public Agenda Introduction: What is SAP IQ - in a

More information

Advanced Database Systems

Advanced Database Systems Advanced Database Systems DBMS Internals Data structures and algorithms to implement RDBMS Internals of non relational data management systems Why to take this course? To understand the strengths and weaknesses

More information

Data Integration and ETL with Oracle Warehouse Builder

Data Integration and ETL with Oracle Warehouse Builder Oracle University Contact Us: 1.800.529.0165 Data Integration and ETL with Oracle Warehouse Builder Duration: 5 Days What you will learn Participants learn to load data by executing the mappings or the

More information

The Oracle Database Appliance I/O and Performance Architecture

The Oracle Database Appliance I/O and Performance Architecture Simple Reliable Affordable The Oracle Database Appliance I/O and Performance Architecture Tammy Bednar, Sr. Principal Product Manager, ODA 1 Copyright 2012, Oracle and/or its affiliates. All rights reserved.

More information

Dell Microsoft Business Intelligence and Data Warehousing Reference Configuration Performance Results Phase III

Dell Microsoft Business Intelligence and Data Warehousing Reference Configuration Performance Results Phase III [ White Paper Dell Microsoft Business Intelligence and Data Warehousing Reference Configuration Performance Results Phase III Performance of Microsoft SQL Server 2008 BI and D/W Solutions on Dell PowerEdge

More information

Introduction to Database Services

Introduction to Database Services Introduction to Database Services Shaun Pearce AWS Solutions Architect 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved Today s agenda Why managed database services? A non-relational

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe CHAPTER 19 Query Optimization Introduction Query optimization Conducted by a query optimizer in a DBMS Goal: select best available strategy for executing query Based on information available Most RDBMSs

More information

Oracle Database 10G. Lindsey M. Pickle, Jr. Senior Solution Specialist Database Technologies Oracle Corporation

Oracle Database 10G. Lindsey M. Pickle, Jr. Senior Solution Specialist Database Technologies Oracle Corporation Oracle 10G Lindsey M. Pickle, Jr. Senior Solution Specialist Technologies Oracle Corporation Oracle 10g Goals Highest Availability, Reliability, Security Highest Performance, Scalability Problem: Islands

More information

UNFAIR ADVANTAGE Your Road to SAP Hana 2016 PURE STORAGE INC.

UNFAIR ADVANTAGE Your Road to SAP Hana 2016 PURE STORAGE INC. UNFAIR ADVANTAGE Your Road to SAP Hana 1 1 AGENDA Road to S4 Hana Road to S4 Hana Your Business Opportunity Why is your storage decision important for SAP? Pure Storage and SAP Global Partnership SAP Co-Innovation

More information

CIS 601 Graduate Seminar Presentation Introduction to MapReduce --Mechanism and Applicatoin. Presented by: Suhua Wei Yong Yu

CIS 601 Graduate Seminar Presentation Introduction to MapReduce --Mechanism and Applicatoin. Presented by: Suhua Wei Yong Yu CIS 601 Graduate Seminar Presentation Introduction to MapReduce --Mechanism and Applicatoin Presented by: Suhua Wei Yong Yu Papers: MapReduce: Simplified Data Processing on Large Clusters 1 --Jeffrey Dean

More information

Data Mining Concepts & Techniques

Data Mining Concepts & Techniques Data Mining Concepts & Techniques Lecture No. 01 Databases, Data warehouse Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro

More information

Cloud Analytics and Business Intelligence on AWS

Cloud Analytics and Business Intelligence on AWS Cloud Analytics and Business Intelligence on AWS Enterprise Applications Virtual Desktops Sharing & Collaboration Platform Services Analytics Hadoop Real-time Streaming Data Machine Learning Data Warehouse

More information

Top Five Reasons for Data Warehouse Modernization Philip Russom

Top Five Reasons for Data Warehouse Modernization Philip Russom Top Five Reasons for Data Warehouse Modernization Philip Russom TDWI Research Director for Data Management May 28, 2014 Sponsor Speakers Philip Russom TDWI Research Director, Data Management Steve Sarsfield

More information

Guide Users along Information Pathways and Surf through the Data

Guide Users along Information Pathways and Surf through the Data Guide Users along Information Pathways and Surf through the Data Stephen Overton, Overton Technologies, LLC, Raleigh, NC ABSTRACT Business information can be consumed many ways using the SAS Enterprise

More information

CIS 601 Graduate Seminar. Dr. Sunnie S. Chung Dhruv Patel ( ) Kalpesh Sharma ( )

CIS 601 Graduate Seminar. Dr. Sunnie S. Chung Dhruv Patel ( ) Kalpesh Sharma ( ) Guide: CIS 601 Graduate Seminar Presented By: Dr. Sunnie S. Chung Dhruv Patel (2652790) Kalpesh Sharma (2660576) Introduction Background Parallel Data Warehouse (PDW) Hive MongoDB Client-side Shared SQL

More information

Huge market -- essentially all high performance databases work this way

Huge market -- essentially all high performance databases work this way 11/5/2017 Lecture 16 -- Parallel & Distributed Databases Parallel/distributed databases: goal provide exactly the same API (SQL) and abstractions (relational tables), but partition data across a bunch

More information

SharePoint 2010 Technical Case Study: Microsoft SharePoint Server 2010 Social Environment

SharePoint 2010 Technical Case Study: Microsoft SharePoint Server 2010 Social Environment SharePoint 2010 Technical Case Study: Microsoft SharePoint Server 2010 Social Environment This document is provided as-is. Information and views expressed in this document, including URL and other Internet

More information

SAS Enterprise Miner Performance on IBM System p 570. Jan, Hsian-Fen Tsao Brian Porter Harry Seifert. IBM Corporation

SAS Enterprise Miner Performance on IBM System p 570. Jan, Hsian-Fen Tsao Brian Porter Harry Seifert. IBM Corporation SAS Enterprise Miner Performance on IBM System p 570 Jan, 2008 Hsian-Fen Tsao Brian Porter Harry Seifert IBM Corporation Copyright IBM Corporation, 2008. All Rights Reserved. TABLE OF CONTENTS ABSTRACT...3

More information

One Size Fits All: An Idea Whose Time Has Come and Gone

One Size Fits All: An Idea Whose Time Has Come and Gone ICS 624 Spring 2013 One Size Fits All: An Idea Whose Time Has Come and Gone Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 1/9/2013 Lipyeow Lim -- University

More information

BIG DATA AND HADOOP ON THE ZFS STORAGE APPLIANCE

BIG DATA AND HADOOP ON THE ZFS STORAGE APPLIANCE BIG DATA AND HADOOP ON THE ZFS STORAGE APPLIANCE BRETT WENINGER, MANAGING DIRECTOR 10/21/2014 ADURANT APPROACH TO BIG DATA Align to Un/Semi-structured Data Instead of Big Scale out will become Big Greatest

More information

MCSA SQL SERVER 2012

MCSA SQL SERVER 2012 MCSA SQL SERVER 2012 1. Course 10774A: Querying Microsoft SQL Server 2012 Course Outline Module 1: Introduction to Microsoft SQL Server 2012 Introducing Microsoft SQL Server 2012 Getting Started with SQL

More information

Database Architecture 2 & Storage. Instructor: Matei Zaharia cs245.stanford.edu

Database Architecture 2 & Storage. Instructor: Matei Zaharia cs245.stanford.edu Database Architecture 2 & Storage Instructor: Matei Zaharia cs245.stanford.edu Summary from Last Time System R mostly matched the architecture of a modern RDBMS» SQL» Many storage & access methods» Cost-based

More information

Abstract. The Challenges. ESG Lab Review InterSystems IRIS Data Platform: A Unified, Efficient Data Platform for Fast Business Insight

Abstract. The Challenges. ESG Lab Review InterSystems IRIS Data Platform: A Unified, Efficient Data Platform for Fast Business Insight ESG Lab Review InterSystems Data Platform: A Unified, Efficient Data Platform for Fast Business Insight Date: April 218 Author: Kerry Dolan, Senior IT Validation Analyst Abstract Enterprise Strategy Group

More information

Distributed KIDS Labs 1

Distributed KIDS Labs 1 Distributed Databases @ KIDS Labs 1 Distributed Database System A distributed database system consists of loosely coupled sites that share no physical component Appears to user as a single system Database

More information

Field Testing Buffer Pool Extension and In-Memory OLTP Features in SQL Server 2014

Field Testing Buffer Pool Extension and In-Memory OLTP Features in SQL Server 2014 Field Testing Buffer Pool Extension and In-Memory OLTP Features in SQL Server 2014 Rick Heiges, SQL MVP Sr Solutions Architect Scalability Experts Ross LoForte - SQL Technology Architect - Microsoft Changing

More information

EMC STORAGE FOR MILESTONE XPROTECT CORPORATE

EMC STORAGE FOR MILESTONE XPROTECT CORPORATE Reference Architecture EMC STORAGE FOR MILESTONE XPROTECT CORPORATE Milestone multitier video surveillance storage architectures Design guidelines for Live Database and Archive Database video storage EMC

More information

The SERVER Procedure. Introduction. Syntax CHAPTER 8

The SERVER Procedure. Introduction. Syntax CHAPTER 8 95 CHAPTER 8 The SERVER Procedure Introduction 95 Syntax 95 Syntax Descriptions 96 Examples 101 ALLOCATE SASFILE Command 101 Syntax 101 Introduction You invoke the SERVER procedure to start a SAS/SHARE

More information

Increasing Performance of Existing Oracle RAC up to 10X

Increasing Performance of Existing Oracle RAC up to 10X Increasing Performance of Existing Oracle RAC up to 10X Prasad Pammidimukkala www.gridironsystems.com 1 The Problem Data can be both Big and Fast Processing large datasets creates high bandwidth demand

More information

Architecting Microsoft Azure Solutions (proposed exam 535)

Architecting Microsoft Azure Solutions (proposed exam 535) Architecting Microsoft Azure Solutions (proposed exam 535) IMPORTANT: Significant changes are in progress for exam 534 and its content. As a result, we are retiring this exam on December 31, 2017, and

More information

Optimizing Fusion iomemory on Red Hat Enterprise Linux 6 for Database Performance Acceleration. Sanjay Rao, Principal Software Engineer

Optimizing Fusion iomemory on Red Hat Enterprise Linux 6 for Database Performance Acceleration. Sanjay Rao, Principal Software Engineer Optimizing Fusion iomemory on Red Hat Enterprise Linux 6 for Database Performance Acceleration Sanjay Rao, Principal Software Engineer Version 1.0 August 2011 1801 Varsity Drive Raleigh NC 27606-2072 USA

More information

Deploy a High-Performance Database Solution: Cisco UCS B420 M4 Blade Server with Fusion iomemory PX600 Using Oracle Database 12c

Deploy a High-Performance Database Solution: Cisco UCS B420 M4 Blade Server with Fusion iomemory PX600 Using Oracle Database 12c White Paper Deploy a High-Performance Database Solution: Cisco UCS B420 M4 Blade Server with Fusion iomemory PX600 Using Oracle Database 12c What You Will Learn This document demonstrates the benefits

More information

IBM Data Retrieval Technologies: RDBMS, BLU, IBM Netezza, and Hadoop

IBM Data Retrieval Technologies: RDBMS, BLU, IBM Netezza, and Hadoop #IDUG IBM Data Retrieval Technologies: RDBMS, BLU, IBM Netezza, and Hadoop Frank C. Fillmore, Jr. The Fillmore Group, Inc. The Baltimore/Washington DB2 Users Group December 11, 2014 Agenda The Fillmore

More information

Oracle Database 11g Direct NFS Client Oracle Open World - November 2007

Oracle Database 11g Direct NFS Client Oracle Open World - November 2007 Oracle Database 11g Client Oracle Open World - November 2007 Bill Hodak Sr. Product Manager Oracle Corporation Kevin Closson Performance Architect Oracle Corporation Introduction

More information

PowerCenter 7 Architecture and Performance Tuning

PowerCenter 7 Architecture and Performance Tuning PowerCenter 7 Architecture and Performance Tuning Erwin Dral Sales Consultant 1 Agenda PowerCenter Architecture Performance tuning step-by-step Eliminating Common bottlenecks 2 PowerCenter Architecture:

More information

<Insert Picture Here> Enterprise Data Management using Grid Technology

<Insert Picture Here> Enterprise Data Management using Grid Technology Enterprise Data using Grid Technology Kriangsak Tiawsirisup Sales Consulting Manager Oracle Corporation (Thailand) 3 Related Data Centre Trends. Service Oriented Architecture Flexibility

More information

IBM DB2 BLU Acceleration vs. SAP HANA vs. Oracle Exadata

IBM DB2 BLU Acceleration vs. SAP HANA vs. Oracle Exadata Research Report IBM DB2 BLU Acceleration vs. SAP HANA vs. Oracle Exadata Executive Summary The problem: how to analyze vast amounts of data (Big Data) most efficiently. The solution: the solution is threefold:

More information

Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics

Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics Cy Erbay Senior Director Striim Executive Summary Striim is Uniquely Qualified to Solve the Challenges of Real-Time

More information