Making the most of SAS Jobs in LSAF

PharmaSUG 2018 - Paper AD-26
Making the most of SAS Jobs in LSAF
Sonali Garg, Alexion; Greg Weber, DataCeutics

ABSTRACT

SAS Life Science Analytics Framework (LSAF) provides a 21 CFR Part 11 compliant, web-based Statistical Computing Environment (SCE) where study data can be stored securely with versioning and audit history capabilities. Jobs are a critical part of an LSAF environment, as programs must be run via jobs. Starting in version 4.7.1, the Run and Populate feature is a new and improved way of creating and running jobs. This paper presents our experiences with creating jobs prior to 4.7.1, the issues we encountered, and how Run and Populate resolves these concerns and helps us write efficient jobs, reducing the time and resources needed to get our results.

INTRODUCTION

In LSAF (formerly known as SAS Drug Development), jobs are required in order to execute programs in the global Repository and permanently save the output and logs to the repository. The creation of jobs, while not overly complicated, is one of the more challenging and time-consuming activities for new users. It can be especially frustrating when your new job runs successfully in the Workspace but then fails when run in the Repository. In this paper, we review the components of a job and one of the common mistakes made by new users when creating jobs. We also review the execution of jobs and how it differs when running in the Workspace versus the Repository. Once we have covered the basics of creating jobs, we discuss how to take advantage of the Run and Populate feature to make the task of creating a job more foolproof and efficient.

JOBS IN LSAF - COMPONENTS OF A JOB

In LSAF, there are two working areas: the global Repository and the user's Workspace. You develop files in your workspace and then check the files in to the repository to share them with other users. In order for a SAS program to be run in the repository and generate results accessible to all users, a job must be created. To create a job, you define the following:

1. General Options: Description/detail of the job and the location of the log, listing, and manifest.
2. SAS Programs: List of SAS programs in the order they are to be executed.
3. Parameters: Parameters and values to pass to the job (converted to SAS macro variables; a sketch follows this list).
4. Inputs: Any input files and datasets.
5. Output: Folder(s) where the output data is saved.
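
To make item 3 concrete, a job parameter surfaces inside the program as a SAS macro variable at execution time. The following is a minimal sketch, not code from the paper: the parameter name STUDYID, the libref srcdata, and the dataset names are hypothetical.

   /* A job parameter named STUDYID (hypothetical) arrives in the
      program as the macro variable &studyid when the job executes. */
   %put NOTE: Building DM for study &studyid.;

   data work.dm_one_study;
     set srcdata.dm;               /* srcdata: hypothetical libref */
     where studyid = "&studyid.";  /* subset using the parameter   */
   run;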

The screenshot above is the main job creation interface. In the red box are the items to define for each job. The focus of this paper is the definition of the job inputs.

When creating jobs in LSAF prior to version 4.7.1, all inputs and outputs must be defined manually. The files and folders needed as inputs are browsed to and then added to the job. This is a tedious and error-prone process, which can cause frustration for users. Inputs are often missed when creating a job in the Workspace, and the job then fails when run in the Repository. This leads users to define inputs at the folder level, which is easier and takes less time.

The screenshot below shows a job to create the DM dataset. Two of the inputs are located in the folders macros and source. These folders contain many files, but the user has chosen to define the input at the folder level instead of defining multiple inputs at the file level. The problem with this shortcut is that it defines many unnecessary inputs and creates inefficient jobs. We refer to jobs with unnecessary inputs as bulky jobs.

What is so bad about having a few extra inputs, you might ask? Unlike when creating and running a job in the user Workspace, when a job runs in the Repository, the input and output folders and the input files are copied into the transient workspace (a temporary area created by LSAF when the job runs). If there are more input files than the job requires, unnecessary data is copied to the transient workspace, which is a shared and limited storage resource. In addition, this increases job completion time. In the job displayed above, the user has defined the source folder as an input. When the job is run, all the files in the source folder are copied to the transient workspace. Though not obvious by looking at the inputs, the source folder for this project contains a large lab dataset that is not needed for this job.

RUN AND POPULATE

Run and Populate is a new LSAF feature for creating and executing jobs. Prior to this feature, all job inputs and outputs were defined manually, as discussed above. Our contention is that job inputs should be defined at the file level (job outputs are always at the folder level). However, as discussed, defining inputs at the file level is tedious and error-prone, which leads users to define inputs at the folder level instead. Run and Populate resolves this by automatically defining inputs at the file level, and only the files needed by the program, thus optimizing the job.

USING RUN AND POPULATE

1. As always, jobs must be created in the workspace.
2. The user develops and successfully executes the SAS program(s) interactively in the workspace.
3. The program(s) must run error-free to take advantage of Run and Populate.
4. Define the General Options, Parameters, and SAS Program(s). These items are still defined manually.
5. Run the job, selecting Run and Populate, to define the inputs and outputs.
6. All inputs will be defined at the file level, creating an efficient and optimally sized job.

The Run and Populate feature was added to the job creation interface. The job created manually above was run using Run and Populate, with the following result: all inputs were automatically populated by LSAF and defined at the file level. Only required inputs were defined, resulting in a more efficient job.

OTHER ADVANTAGES AND CONSIDERATIONS WHEN USING RUN AND POPULATE

When job inputs are defined at the file level (as they are when using Run and Populate), you can easily extract the job inputs from the job file. This is done with the LSAF API macro %lsaf_getjobinputs:

   %lsaf_getjobinputs(lsaf_path=./dm.job, sas_dsname=jobinputs);

Running this macro on the job shown above produces a dataset containing the input files with their full paths.

One need we had was the ability to copy a job from one LSAF analysis to another and then execute it, so we also needed to copy the input files to the new analysis. This is tedious to do manually, so to automate the process we created a macro that reads a job, retrieves the inputs with %lsaf_getjobinputs as shown above, iterates through the resulting dataset, and uses the %lsaf_copy API macro to copy each required file to the new analysis. Other LSAF API macros used to make the process robust included %lsaf_objectexists and %lsaf_deleteobject, to check whether a file already exists and delete it before copying. A sketch of this approach follows.
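
The macro below is a minimal sketch of that approach, not the authors' production code. It assumes the dataset returned by %lsaf_getjobinputs has a column named path, that %lsaf_objectexists sets the macro variable &_lsafObjectExists_, and that the parameter names shown for %lsaf_deleteobject and %lsaf_copy match the installed LSAF SAS Macro API version; verify each against the API documentation.

   %macro copy_job_inputs(job=, target=);
     /* Retrieve the job's input files into a dataset. */
     %lsaf_getjobinputs(lsaf_path=&job., sas_dsname=work.jobinputs);

     /* Load each input path into a local macro variable so the
        %do loop can iterate over them. The column name 'path'
        is an assumption. */
     %local n_in i src tgt;
     data _null_;
       set work.jobinputs end=last;
       call symputx(cats('in', _n_), path, 'L');
       if last then call symputx('n_in', _n_, 'L');
     run;

     %do i = 1 %to &n_in.;
       %let src = &&in&i.;
       %let tgt = &target./%scan(&src., -1, /);  /* keep the file name */

       /* If the file already exists in the target analysis, delete
          it before copying (the macro variable and parameter names
          here are assumptions). */
       %lsaf_objectexists(lsaf_path=&tgt.);
       %if &_lsafObjectExists_. eq 1 %then %do;
         %lsaf_deleteobject(lsaf_path=&tgt.);
       %end;

       /* Copy the input file to the new analysis. */
       %lsaf_copy(lsaf_path=&src., lsaf_target_path=&tgt.);
     %end;
   %mend copy_job_inputs;

   /* Example call with hypothetical repository paths: */
   %copy_job_inputs(job=/SAS/StudyA/jobs/dm.job, target=/SAS/StudyB/source)

The exists-then-delete guard makes the staging step repeatable: the job can be copied again as it evolves without manual cleanup.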

This process works very well for us, but we did run into a couple of issues:

- Since Run and Populate scans the SAS programs to determine the inputs and outputs of every program run in the job, coding approach plays an important role. For example, we encountered a utility macro that used PROC CONTENTS on a whole library (_all_) to check for the existence of one particular file. This resulted in Run and Populate identifying every file in the library as an input, which clearly led to an unnecessarily bulky job. The resolution was to recode the macro to avoid _all_ and explicitly check for the existence of the file, as shown in the sketch after this list. The lesson learned was that when designing our macros, we need to take into account how they will affect the behavior of Run and Populate.

- Another issue we encountered was with jobs designed to alter their processing through job parameters. These jobs do not benefit from Run and Populate, since the inputs depend on the value of the chosen parameter.

- A further consideration is jobs that run multiple programs, for example a job that runs all SDTM programs. In this case, it makes more sense to include inputs at the folder level instead of using Run and Populate, which would require the user to update the job each time a new input file is required.
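
As an illustration of the first issue, here is a hedged before-and-after sketch (library and dataset names are hypothetical). The library-wide PROC CONTENTS touches every member of the library, so Run and Populate registers them all as inputs; a targeted EXIST check touches only the dataset in question.

   /* Before: scans every member of the library, so Run and Populate
      registers the entire library as job inputs. */
   proc contents data=srcdata._all_ noprint out=work.members;
   run;

   /* After: check only the one dataset the program actually needs. */
   %macro check_ds(ds);
     %if %sysfunc(exist(&ds.)) %then %put NOTE: &ds. exists.;
     %else %put WARNING: &ds. does not exist.;
   %mend check_ds;
   %check_ds(srcdata.dm)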

CONCLUSION

The addition of the Run and Populate feature has simplified the creation of efficient jobs in LSAF. Defining job inputs at the file level also offers advantages when automating processes that need to extract the individual inputs from a job. While there are situations where Run and Populate is not appropriate, the authors feel the functionality has great benefits.

REFERENCES

SAS Life Science Analytics Framework 4.7: User's Guide
SAS Life Science Analytics Framework - SAS Macro API, Version 1.5

ACKNOWLEDGMENTS

Steven Hege, Associate Director and Alexion SAS Administrator
Melissa Martinez, Principal Health and Life Sciences Industry Consultant, Global Hosting and US Professional Services

RECOMMENDED READING

The Hitchhiker's Guide to the Galaxy by Douglas Adams

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the authors at:

Sonali Garg
Alexion Pharmaceuticals
475-230-3926
Sonali.Garg@alexion.com
http://www.alexion.com/

Greg Weber
DataCeutics
610-970-2333 (x226)
weberg@dataceutics.com
http://www.dataceutics.com/

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. (R) indicates USA registration. Other brand and product names are trademarks of their respective companies.