Cheat sheet: Data Processing Optimization - for Pharma Analysts & Statisticians


Karthik Chidambaram, Senior Program Director, Data Strategy, Genentech, CA

ABSTRACT

This paper provides tips and techniques that analysts and statisticians can use to optimize the data processing routines in their day-to-day work. Considerable productivity is lost to slow SAS servers and slow response times from IT teams. However, there are tools and techniques that analysts can apply on their own to work around these inefficiencies. This paper lists those techniques and shares experience with the SAS GRID architecture.

Key sections of the paper:
1. Tips and techniques to optimize SAS programs to bypass bottlenecks
2. Hidden gems: quick tips to administer & optimize parameters to enhance processing huge volumes of data
3. GRID: A quick primer on GRID (from an analyst/statistician perspective) and its advantages

TIPS AND TECHNIQUES TO OPTIMIZE SAS PROGRAMS TO BYPASS BOTTLENECKS

OPTIMIZING WINDOWS MACHINE FOR PROCESSING YOUR PROGRAMS

When servers or machines underperform, the blame is usually placed on the SAS system. However, there are cases where the back-end system itself can be tuned to better serve the analytics. For instance, under Windows 7, follow these steps to optimize application performance:
1. Open the Control Panel.
2. Click System and Security.
3. Select System.
4. Click the Advanced system settings task.
5. Select the Advanced tab.
6. In the Performance box, click Settings and then select the Advanced tab.
7. To optimize performance of an interactive SAS session, select Programs. To optimize performance of a batch SAS session, select Background services.
8. Click OK.

This setting tunes how Windows allocates processor resources for the type of SAS processing we use, which helps the stability and responsiveness of the server or PC.
Irrespective of the Windows version used, the optimization listed above can be applied, although the navigation path may differ slightly.

USING HIGHLY RECURSIVE PROCESS WITH MODERATE SIZED DATASETS? CONSIDER MEMLIB OR MEMCACHE

With the MEMLIB and MEMCACHE options, we can create memory-based libraries. Using memory-based libraries reduces I/O to and from the disk. In particular, if our permanent library is on a SAN, we will see a substantial processing improvement with the MEMLIB option. Memory-based libraries can be used in several ways:
1. As storage for the work library
2. For processing SAS libraries with high I/O
3. As a cache for very large SAS libraries

CHECK THE ASSIGNMENT OF THE SAS WORK LIBRARY

Especially in server-based SAS processing, there is an ever-increasing need for additional space on the work server. When the number of users or the size of the data being processed increases, the workspace is enlarged correspondingly. In most cases, this impacts the performance of the system. SAS processes are I/O intensive and use the work library to store temporary files. There are two common issues with the SAS work library setup:
1. Size of the work folder
2. Network connectivity to the work folder from the server
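Both issues can be investigated from inside a SAS session. The following is a minimal sketch, not a definitive diagnostic: the data set name work.io_check is hypothetical, and sashelp.shoes is used only as stand-in data because it ships with Base SAS.

```sas
/* Print the physical path behind the WORK library to the log. */
%put NOTE: WORK library path is %sysfunc(pathname(work));

/* Show the current value of the WORK system option. */
proc options option=work;
run;

/* Turn on detailed resource statistics in the log. */
options fullstimer;

/* A throwaway step that writes to WORK (hypothetical data set name). */
/* Compare "real time" with the CPU times reported in the log.        */
data work.io_check;
    set sashelp.shoes;
run;

/* List the members of WORK; the listing also reports the physical location. */
proc datasets library=work;
quit;
```

If the real time reported by FULLSTIMER is much larger than the combined CPU time, the step is often waiting on I/O, which would point to the work folder or the connection to it.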

Work around: Check the SAS work library assignment using PROC DATASETS. Check for I/O issues by switching on the FULLSTIMER option. If you notice I/O issues, try to define a different location using the WORK system option at SAS invocation or by modifying the SAS work assignment in autoexec.sas.

OPTIMIZE YOUR CODE

Many times, a simple change to the code can yield a large efficiency gain. A quick look at some efficient SAS coding practices:

- If a flat file will be read multiple times, it is better to create a SAS dataset from it once. Reading a SAS dataset is much faster than reading a flat file.
- When using arrays in long programs, where the array values generated in the DATA step are not intended for the output dataset, declare the array with _TEMPORARY_. Temporary array elements are not written to the output dataset, and their memory is released after the processing is complete.
- To reduce I/O, apply filters at the beginning of the code, especially when dealing with huge volumes of data. Even while filtering, a combination of WHERE and KEEP statements can yield additional performance gains.
- The SAS program data vector (PDV) allocates buffer space based on the number of variables that are read in and the number of variables that are created during DATA step processing. Hence, if we use only 4 of the 10 variables in a dataset, a KEEP= option on the SET statement is more efficient than a KEEP statement at the end of the program. This is because the KEEP= option, when used on the SET statement, avoids reading the unwanted columns into the buffer.

Less Efficient Code:
    DATA sample;
        SET source;
        /* other SAS statements */
        KEEP var1 var2 var3;
    RUN;

Efficient Code:
    DATA sample;
        SET source (KEEP = var1 var2 var3);
        /* other SAS statements */
    RUN;

Both IF and WHERE statements can be used to subset a dataset based on specified criteria. Though both produce exactly the same results in most cases, they differ considerably in the way they operate on the data.
With the IF statement, the data is read into the program data vector before the condition is evaluated. Thus all records are read into the PDV irrespective of their values and the criteria. In contrast, the WHERE statement checks the criteria before the data is read into the PDV, so the unwanted records are never read into the buffer space at all. The WHERE statement is therefore the better option for subsetting, especially for datasets with a large number of variables.

Less Efficient Code:
    DATA subst;
        SET source;
        IF sales > 1000;
    RUN;

Efficient Code:
    DATA subst;
        SET source;
        WHERE sales > 1000;
    RUN;
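The two techniques above, KEEP= on the SET statement and WHERE instead of a subsetting IF, can be combined and checked against each other. This is a minimal sketch under stated assumptions: sashelp.shoes stands in for the paper's source dataset, and the output names subst_if and subst_where are hypothetical.

```sas
/* Less efficient: every variable of every row enters the PDV,   */
/* then rows are discarded and columns dropped at output time.   */
data subst_if;
    set sashelp.shoes;
    if Sales > 1000;
    keep Region Product Sales;
run;

/* More efficient: columns and rows are restricted as the data   */
/* is read. Sales must be in the KEEP= list for WHERE to use it. */
data subst_where;
    set sashelp.shoes (keep=Region Product Sales);
    where Sales > 1000;
run;

/* Confirm that both approaches produce the same dataset. */
proc compare base=subst_if compare=subst_where;
run;
```

Running both steps with OPTIONS FULLSTIMER on a realistically large dataset is the simplest way to see whether the difference matters in a given environment.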

HIDDEN GEMS: QUICK TIPS TO ADMINISTER & OPTIMIZE PARAMETERS TO ENHANCE PROCESSING HUGE VOLUMES OF DATA

Many SAS users do not adjust the SAS system options and work with the default settings on the system. There are several hundred such options, and it is virtually impossible to master the right setting for each of them. This section highlights a few parameters that may offer substantial performance benefits.

BUFNO=, BUFSIZE=, CATCACHE=, AND COMPRESS= SYSTEM OPTIONS

BUFNO: SAS uses the BUFNO= option to adjust the number of open page buffers when it processes a SAS data set. Increasing this option's value can improve our application's performance by allowing SAS to read more data with fewer passes; however, memory usage increases. Experiment with different values for this option to determine the optimal value for our needs. Note: We can also use the CBUFNO= system option to control the number of extra page buffers to allocate for each open SAS catalog.

BUFSIZE: When the Base SAS engine creates a data set, it uses the BUFSIZE= option to set the permanent page size for the data set. The page size is the amount of data that can be transferred to one buffer in a single I/O operation. The default value for BUFSIZE= is determined by the operating environment and is set to optimize sequential access. To improve performance for direct (random) access, we should change the value of BUFSIZE=. Whether we use our operating environment's default value or specify a value, the engine always writes complete pages regardless of how full or empty those pages are. If we know that the total amount of data is going to be small, we can set a small page size with the BUFSIZE= option, so that the total data set size remains small and we minimize the amount of wasted space on a page.
In contrast, if we know that we are going to have many observations in a data set, we should set BUFSIZE= so that as little overhead as possible is needed, because each page requires some additional overhead. Large data sets that are accessed sequentially benefit from larger page sizes because fewer system calls are required to read the data set. Note that because observations cannot span pages, there is typically some unused space on a page.

CATCACHE: SAS uses this option to determine the number of SAS catalogs to keep open at one time. Increasing its value uses more memory, although this might be warranted if our application uses catalogs that will be needed relatively soon by other applications. (The catalogs closed by the first application are cached and can be accessed more efficiently by subsequent applications.)

COMPRESS: One further technique that can reduce I/O processing is to store our data as compressed data sets by using the COMPRESS= data set option. However, storing our data this way means that more CPU time is needed to decompress the observations as they are made available to SAS. If our concern is I/O rather than CPU usage, compressing our data might improve the I/O performance of our application.

SASFILE STATEMENT

The SASFILE global statement opens a SAS data set and allocates enough buffers to hold the entire data set in memory. Once it is read, the data is held in memory, available to subsequent DATA and PROC steps, until either a second SASFILE statement closes the file and frees the buffers or the program ends, which automatically closes the file and frees the buffers. Using the SASFILE statement can improve performance by:
- Reducing multiple open/close operations (including allocation and freeing of memory for buffers) to process a SAS data set to one open/close operation
- Reducing I/O processing by holding the data in memory
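The SASFILE behavior described above can be sketched as follows. This is an illustrative example only: work.shoes is a hypothetical copy of sashelp.shoes made so the example is self-contained, and the two procedures simply stand in for any steps that read the same data set repeatedly.

```sas
/* Create a data set to hold in memory (hypothetical name). */
data work.shoes;
    set sashelp.shoes;
run;

/* Open the data set, allocate buffers, and read it into memory. */
sasfile work.shoes load;

/* Both steps now read from the in-memory copy. */
proc means data=work.shoes;
    var Sales;
run;

proc freq data=work.shoes;
    tables Region;
run;

/* Close the file and free the buffers. */
sasfile work.shoes close;
```

The explicit CLOSE is optional at the end of a program, since the file is closed automatically when the program ends, but releasing the buffers early frees memory for later steps.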
If our SAS program consists of steps that read a SAS data set multiple times, and we have enough memory to hold the entire file in real memory, the program should benefit from using the SASFILE statement. SASFILE is also especially useful as part of a program that starts a SAS server such as a SAS/SHARE server.

IBUFSIZE SYSTEM OPTION

An index is an optional SAS file that we can create for a SAS data file in order to provide direct access to specific observations. The index file consists of entries that are organized into hierarchical levels, such as a tree structure, and connected by pointers. When an index is used to process a request, such as for WHERE processing, SAS searches the index file in order to rapidly locate the requested records. Typically, we do not need to specify an index page size. However, the following situations could require a different page size:

- The page size affects the number of levels in the index. The more pages there are, the more levels in the index. The more levels, the longer the index search takes. Increasing the page size allows more index values to be stored on each page, thus reducing the number of pages (and the number of levels). The number of pages required for the index varies with the page size, the length of the index value, and the values themselves. The main resource that is saved when reducing levels in the index is I/O. If our application is experiencing a lot of I/O in the index file, increasing the page size might help. However, we must re-create the index file after increasing the page size.
- The index file structure requires a minimum of three index values to be stored on a page. If the length of an index value is very large, we might get an error message that the index could not be created because the page size is too small to hold three index values. Increasing the page size should eliminate the error.

REUSE SYSTEM OPTION

The REUSE= system option specifies whether SAS reuses freed space when observations are added to a compressed SAS data set. If space is reused, observations that are added to the SAS data set are inserted wherever enough free space exists, instead of at the end of the SAS data set. Specifying REUSE=NO results in less efficient usage of space if we delete or update many observations in a SAS data set. Even when space is reused, however, the APPEND procedure, the FSEDIT procedure, and other procedures that add observations to the SAS data set continue to add observations to the end of the data set, as they do for uncompressed SAS data sets. We cannot change the REUSE= attribute of a compressed SAS data set after it is created.
Space is tracked and reused in the compressed SAS data set according to the REUSE= value that was specified when the SAS data set was created, not the value in effect when we add and delete observations. Even with REUSE=YES, the APPEND procedure will add observations at the end. It may be worthwhile to check the default setting for this option and set it to YES, especially in environments dealing with a lot of data updates.

SAS GRID: A QUICK PRIMER ON GRID (FROM AN ANALYST/STATISTICIAN PERSPECTIVE) AND ITS ADVANTAGES

SAS Grid Manager delivers grid computing capabilities, enabling organizations to create a managed, shared environment for processing large volumes of data and analytic programs. The grid effectively combines several servers, with dynamic load balancing. From an analyst's point of view, without the IT terms, Grid Manager replaces a single server shared by a pool of users with a pool of CPUs, balancing the load across several machines to provide better performance and reliability. Some key benefits include:

- Automatically tailors SAS Data Integration Studio and SAS Enterprise Miner for parallel processing and job submission in a grid environment.
- Balances the load of many SAS Enterprise Guide users through easy submission to the grid.
- Provides load balancing for all SAS servers to improve throughput and response time of all SAS clients.
- Uses SAS Code Analyzer to analyze job dependencies in SAS programs and generates grid-ready code:
    - Used by SAS Data Integration Studio and SAS Enterprise Guide to import SAS programs.
- Provides automated session spawning and distributed processing of SAS programs across a set of diverse computing resources.
- Speeds up processing of applicable SAS programs and applications, and provides more efficient computing resource utilization.
- Enables scheduling of production SAS workflows to be executed across grid resources:
    - Provides a process flow diagram to create SAS flows of one or more SAS jobs that can be simple or complex to meet our needs.
    - Uses all of the policies and resources of the grid.
- Enables many SAS solutions and user-written programs to be easily configured for submission to a grid of shared resources.
- Integrates with all SAS Business Intelligence clients and analytic applications by storing grid-enabled code as SAS Stored Processes.
- Provides greater resilience for mission-critical applications and high availability for the SAS environment.
- Includes a command-line batch submission utility called SASGSUB:
    - Allows us to submit and forget, and reconnect later to retrieve results.
    - Enables integration with other standard enterprise schedulers.

- Enables batch submission to leverage checkpoint and automatically restart jobs.
    - Applies grid policies to SAS workspace servers when they are launched through the grid.

CONCLUSION

This paper has highlighted basic, easy rules for optimizing SAS processing. With minimal changes to our code, we can make sure that our programs run effectively and efficiently, leveraging the features of the SAS system.

REFERENCES

SAS Online Help

ACKNOWLEDGMENTS

The author would like to thank his family, friends, peers and supervisors for their encouragement, support and suggestions.

CONTACT INFORMATION

Karthikeyan Chidambaram, a SAS certified professional, has over 15 years of experience in SAS in a variety of roles including SAS Administration, Statistical Analysis and ETL programming. Your comments and questions are valued and encouraged. Contact the author at:

Karthikeyan Chidambaram
Genentech Inc.
1 DNA Way
South San Francisco, CA
Phone:
karthihere@hotmail.com, Chidambaram.karthikeyan@gene.com

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.

Optimizing System Performance

Optimizing System Performance 243 CHAPTER 19 Optimizing System Performance Definitions 243 Collecting and Interpreting Performance Statistics 244 Using the FULLSTIMER and STIMER System Options 244 Interpreting FULLSTIMER and STIMER

More information

Performance Considerations

Performance Considerations 149 CHAPTER 6 Performance Considerations Hardware Considerations 149 Windows Features that Optimize Performance 150 Under Windows NT 150 Under Windows NT Server Enterprise Edition 4.0 151 Processing SAS

More information

Paper CC16. William E Benjamin Jr, Owl Computer Consultancy LLC, Phoenix, AZ

Paper CC16. William E Benjamin Jr, Owl Computer Consultancy LLC, Phoenix, AZ Paper CC16 Smoke and Mirrors!!! Come See How the _INFILE_ Automatic Variable and SHAREBUFFERS Infile Option Can Speed Up Your Flat File Text-Processing Throughput Speed William E Benjamin Jr, Owl Computer

More information

Chapter 12. File Management

Chapter 12. File Management Operating System Chapter 12. File Management Lynn Choi School of Electrical Engineering Files In most applications, files are key elements For most systems except some real-time systems, files are used

More information

Grid Computing in SAS 9.4

Grid Computing in SAS 9.4 Grid Computing in SAS 9.4 SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2013. Grid Computing in SAS 9.4. Cary, NC: SAS Institute Inc. Grid Computing

More information

Stephen M. Beatrous, SAS Institute Inc., Cary, NC John T. Stokes, SAS Institute Inc., Austin, TX

Stephen M. Beatrous, SAS Institute Inc., Cary, NC John T. Stokes, SAS Institute Inc., Austin, TX 1/0 Performance Improvements in Release 6.07 of the SAS System under MVS, ems, and VMS' Stephen M. Beatrous, SAS Institute Inc., Cary, NC John T. Stokes, SAS Institute Inc., Austin, TX INTRODUCTION The

More information

Paper Best Practices for Managing and Monitoring SAS Data Management Solutions. Gregory S. Nelson

Paper Best Practices for Managing and Monitoring SAS Data Management Solutions. Gregory S. Nelson Paper 113-2012 Best Practices for Managing and Monitoring SAS Data Management Solutions Gregory S. Nelson President and CEO ThotWave Technologies, Chapel Hill, North Carolina Abstract SAS and DataFlux

More information

APPENDIX 3 Tuning Tips for Applications That Use SAS/SHARE Software

APPENDIX 3 Tuning Tips for Applications That Use SAS/SHARE Software 177 APPENDIX 3 Tuning Tips for Applications That Use SAS/SHARE Software Authors 178 Abstract 178 Overview 178 The SAS Data Library Model 179 How Data Flows When You Use SAS Files 179 SAS Data Files 179

More information

Effective Usage of SAS Enterprise Guide in a SAS 9.4 Grid Manager Environment

Effective Usage of SAS Enterprise Guide in a SAS 9.4 Grid Manager Environment Paper SAS375-2014 Effective Usage of SAS Enterprise Guide in a SAS 9.4 Grid Manager Environment Edoardo Riva, SAS Institute Inc., Cary, NC ABSTRACT With the introduction of new features in SAS 9.4 Grid

More information

SAS Studio: A New Way to Program in SAS

SAS Studio: A New Way to Program in SAS SAS Studio: A New Way to Program in SAS Lora D Delwiche, Winters, CA Susan J Slaughter, Avocet Solutions, Davis, CA ABSTRACT SAS Studio is an important new interface for SAS, designed for both traditional

More information

SAS File Management. Improving Performance CHAPTER 37

SAS File Management. Improving Performance CHAPTER 37 519 CHAPTER 37 SAS File Management Improving Performance 519 Moving SAS Files Between Operating Environments 520 Converting SAS Files 520 Repairing Damaged Files 520 Recovering SAS Data Files 521 Recovering

More information

Submitting Code in the Background Using SAS Studio

Submitting Code in the Background Using SAS Studio ABSTRACT SAS0417-2017 Submitting Code in the Background Using SAS Studio Jennifer Jeffreys-Chen, SAS Institute Inc., Cary, NC As a SAS programmer, how often does it happen that you would like to submit

More information

SAS Scalable Performance Data Server 4.3

SAS Scalable Performance Data Server 4.3 Scalability Solution for SAS Dynamic Cluster Tables A SAS White Paper Table of Contents Introduction...1 Cluster Tables... 1 Dynamic Cluster Table Loading Benefits... 2 Commands for Creating and Undoing

More information

An Introduction to Parallel Processing with the Fork Transformation in SAS Data Integration Studio

An Introduction to Parallel Processing with the Fork Transformation in SAS Data Integration Studio Paper 2733-2018 An Introduction to Parallel Processing with the Fork Transformation in SAS Data Integration Studio Jeff Dyson, The Financial Risk Group ABSTRACT The SAS Data Integration Studio job is historically

More information

SAS Factory Miner 14.2: Administration and Configuration

SAS Factory Miner 14.2: Administration and Configuration SAS Factory Miner 14.2: Administration and Configuration SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2016. SAS Factory Miner 14.2: Administration

More information

How to Optimize Jobs on the Data Integration Service for Performance and Stability

How to Optimize Jobs on the Data Integration Service for Performance and Stability How to Optimize Jobs on the Data Integration Service for Performance and Stability 1993-2016 Informatica LLC. No part of this document may be reproduced or transmitted in any form, by any means (electronic,

More information

Maximizing SAS Software Performance Under the Unix Operating System

Maximizing SAS Software Performance Under the Unix Operating System Maximizing SAS Software Performance Under the Unix Operating System Daniel McLaren, Henry Ford Health system, Detroit, MI George W. Divine, Henry Ford Health System, Detroit, MI Abstract The Unix operating

More information

SAS Model Manager 15.1: Quick Start Tutorial

SAS Model Manager 15.1: Quick Start Tutorial SAS Model Manager 15.1: Quick Start Tutorial Overview This Quick Start Tutorial is an introduction to some of the primary features of SAS Model Manager. The tutorial covers basic tasks that are related

More information

The DATA Statement: Efficiency Techniques

The DATA Statement: Efficiency Techniques The DATA Statement: Efficiency Techniques S. David Riba, JADE Tech, Inc., Clearwater, FL ABSTRACT One of those SAS statements that everyone learns in the first day of class, the DATA statement rarely gets

More information

Qlik Sense Enterprise architecture and scalability

Qlik Sense Enterprise architecture and scalability White Paper Qlik Sense Enterprise architecture and scalability June, 2017 qlik.com Platform Qlik Sense is an analytics platform powered by an associative, in-memory analytics engine. Based on users selections,

More information

Dynamic Projects in SAS Enterprise Guide How to Create and Use Parameters

Dynamic Projects in SAS Enterprise Guide How to Create and Use Parameters Paper HW02 Dynamic Projects in SAS Enterprise Guide How to Create and Use Parameters Susan J. Slaughter, Avocet Solutions, Davis, CA Lora D. Delwiche, University of California, Davis, CA ABSTRACT SAS Enterprise

More information

Ten tips for efficient SAS code

Ten tips for efficient SAS code Ten tips for efficient SAS code Host Caroline Scottow Presenter Peter Hobart Managing the webinar In Listen Mode Control bar opened with the white arrow in the orange box Efficiency Overview Optimisation

More information

SAS Visual Analytics Environment Stood Up? Check! Data Automatically Loaded and Refreshed? Not Quite

SAS Visual Analytics Environment Stood Up? Check! Data Automatically Loaded and Refreshed? Not Quite Paper SAS1952-2015 SAS Visual Analytics Environment Stood Up? Check! Data Automatically Loaded and Refreshed? Not Quite Jason Shoffner, SAS Institute Inc., Cary, NC ABSTRACT Once you have a SAS Visual

More information

Business Insight Authoring

Business Insight Authoring Business Insight Authoring Getting Started Guide ImageNow Version: 6.7.x Written by: Product Documentation, R&D Date: August 2016 2014 Perceptive Software. All rights reserved CaptureNow, ImageNow, Interact,

More information

Easing into Data Exploration, Reporting, and Analytics Using SAS Enterprise Guide

Easing into Data Exploration, Reporting, and Analytics Using SAS Enterprise Guide Paper 809-2017 Easing into Data Exploration, Reporting, and Analytics Using SAS Enterprise Guide ABSTRACT Marje Fecht, Prowerk Consulting Whether you have been programming in SAS for years, are new to

More information

SSIM Collection & Archiving Infrastructure Scaling & Performance Tuning Guide

SSIM Collection & Archiving Infrastructure Scaling & Performance Tuning Guide SSIM Collection & Archiving Infrastructure Scaling & Performance Tuning Guide April 2013 SSIM Engineering Team Version 3.0 1 Document revision history Date Revision Description of Change Originator 03/20/2013

More information

The SERVER Procedure. Introduction. Syntax CHAPTER 8

The SERVER Procedure. Introduction. Syntax CHAPTER 8 95 CHAPTER 8 The SERVER Procedure Introduction 95 Syntax 95 Syntax Descriptions 96 Examples 101 ALLOCATE SASFILE Command 101 Syntax 101 Introduction You invoke the SERVER procedure to start a SAS/SHARE

More information

Technical Paper. Performance and Tuning Considerations for SAS on Dell EMC VMAX 250 All-Flash Array

Technical Paper. Performance and Tuning Considerations for SAS on Dell EMC VMAX 250 All-Flash Array Technical Paper Performance and Tuning Considerations for SAS on Dell EMC VMAX 250 All-Flash Array Release Information Content Version: 1.0 April 2018 Trademarks and Patents SAS Institute Inc., SAS Campus

More information

WRITE SAS CODE TO GENERATE ANOTHER SAS PROGRAM

WRITE SAS CODE TO GENERATE ANOTHER SAS PROGRAM WRITE SAS CODE TO GENERATE ANOTHER SAS PROGRAM A DYNAMIC WAY TO GET YOUR DATA INTO THE SAS SYSTEM Linda Gau, ProUnlimited, South San Francisco, CA ABSTRACT In this paper we introduce a dynamic way to create

More information

An Introduction to Compressing Data Sets J. Meimei Ma, Quintiles

An Introduction to Compressing Data Sets J. Meimei Ma, Quintiles An Introduction to Compressing Data Sets J. Meimei Ma, Quintiles r:, INTRODUCTION This tutorial introduces compressed data sets. The SAS system compression algorithm is described along with basic syntax.

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung SOSP 2003 presented by Kun Suo Outline GFS Background, Concepts and Key words Example of GFS Operations Some optimizations in

More information

My SAS Grid Scheduler

My SAS Grid Scheduler ABSTRACT Paper 1148-2017 My SAS Grid Scheduler Patrick Cuba, Cuba BI Consulting No Batch Scheduler? No problem! This paper describes the use of a SAS DI Studio job that can be started by a time dependent

More information

SAS Solutions for the Web: Static and Dynamic Alternatives Matthew Grover, S-Street Consulting, Inc.

SAS Solutions for the Web: Static and Dynamic Alternatives Matthew Grover, S-Street Consulting, Inc. SAS Solutions for the Web: Static and Dynamic Alternatives Matthew Grover, S-Street Consulting, Inc. Abstract This paper provides a detailed analysis of creating static and dynamic web content using the

More information

Adobe LiveCycle ES and the data-capture experience

Adobe LiveCycle ES and the data-capture experience Technical Guide Adobe LiveCycle ES and the data-capture experience Choosing the right solution depends on the needs of your users Table of contents 2 Rich application experience 3 Guided experience 5 Dynamic

More information

Taking Advantage of the SAS System on the Windows Platform

Taking Advantage of the SAS System on the Windows Platform Taking Advantage of the SAS System on the Windows Platform 09:45 Friday Gary Mehler, SAS Institute Introduction! Current state of the Windows platform! Current state of PC hardware! Current state of SAS

More information

Extending the Scope of Custom Transformations

Extending the Scope of Custom Transformations Paper 3306-2015 Extending the Scope of Custom Transformations Emre G. SARICICEK, The University of North Carolina at Chapel Hill. ABSTRACT Building and maintaining a data warehouse can require complex

More information

SCSUG-2017 SAS Grid Job Search Performance Piyush Singh, Ghiyasuddin Mohammed Faraz Khan, Prasoon Sangwan TATA Consultancy Services Ltd.

SCSUG-2017 SAS Grid Job Search Performance Piyush Singh, Ghiyasuddin Mohammed Faraz Khan, Prasoon Sangwan TATA Consultancy Services Ltd. SCSUG-2017 SAS Grid Job Search Performance Piyush Singh, Ghiyasuddin Mohammed Faraz Khan, Prasoon Sangwan TATA Consultancy Services Ltd. ABSTRACT Have you ever tried to find the job execution information

More information

Perform scalable data exchange using InfoSphere DataStage DB2 Connector

Perform scalable data exchange using InfoSphere DataStage DB2 Connector Perform scalable data exchange using InfoSphere DataStage Angelia Song (azsong@us.ibm.com) Technical Consultant IBM 13 August 2015 Brian Caufield (bcaufiel@us.ibm.com) Software Architect IBM Fan Ding (fding@us.ibm.com)

More information

Shared File System Requirements for SAS Grid Manager. Table Talk #1546 Ben Smith / Brian Porter

Shared File System Requirements for SAS Grid Manager. Table Talk #1546 Ben Smith / Brian Porter Shared File System Requirements for SAS Grid Manager Table Talk #1546 Ben Smith / Brian Porter About the Presenters Main Presenter: Ben Smith, Technical Solutions Architect, IBM smithbe1@us.ibm.com Brian

More information

Data Set Options CHAPTER 2

Data Set Options CHAPTER 2 5 CHAPTER 2 Data Set Options Definition 6 6 Using Data Set Options 6 Using Data Set Options with Input or Output SAS Data Sets 6 How Data Set Options Interact with System Options 7 Data Set Options by

More information
