Create Metadata Documentation using ExcelXP


Paper AD13

Christine Teng, Merck Research Labs, Merck & Co., Inc., Rahway, NJ

ABSTRACT

The purpose of the metadata documentation is two-fold. First, it facilitates quick understanding of the project design. Second, it allows rapid validation that data sets and variables adhere to the electronic submission requirements for clinical trials.

SAS 9 provides several approaches to create Excel output. One of them is an experimental tagset called ExcelXP, which is available for download from the ODS Markup Resources site at http://support.sas.com/rnd/base/topics/odsmarkup/. The SAS 9 ExcelXP tagset generates XML output that conforms to the Microsoft XML Spreadsheet Specification ("XML Spreadsheet Reference", Microsoft Corp.). The XML output can be created on a UNIX or Windows platform and can be read by Excel 2000 and later releases.

In this paper, I use the ExcelXP tagset in conjunction with the SAS Dictionary to create metadata documentation for a group of data sets from a mock clinical trial project. A SAS macro is created based on the following requirements:

1. Create a project workbook that contains multiple worksheets.
2. Create a metadata table inside a worksheet for each data set.
3. If any given data set has a test code, create a second table that lists the test codes under the metadata table within the same worksheet.
4. Create a worksheet that is comprised of all variables within a project. In addition, identify all data sets that contain each variable.
5. In a separate worksheet, create a global dictionary for all test codes defined in the project along with the associated test description defined in PROC FORMAT.

SAS 9, Windows, Intermediate Level

Key Words: ExcelXP, Tagset, SAS Dictionary, PROC SQL

INTRODUCTION

The SAS 9 ExcelXP tagset generates XML output that conforms to the Microsoft XML Spreadsheet Specification ("XML Spreadsheet Reference", Microsoft Corp.). It provides the functionality to create multiple worksheets in a workbook as well as multiple tables within a single worksheet. These features are very useful for creating metadata documentation in which each data set has its own labeled worksheet, which makes it quicker to locate the information for a group of data sets. With the SAS DICTIONARY tables and PROC SQL, the metadata documentation can be created without hard coding.

The details of using PROC SQL and the SAS DICTIONARY tables are not covered here. For more information, please refer to the SAS manuals or the paper I coauthored for PharmaSUG 2006, "Simple Ways to Use PROC SQL and SAS DICTIONARY TABLES to Verify Data Structure of the Electronic Submission Data Sets".

This paper is not a tutorial about the ExcelXP tagset. Rather, it demonstrates another application of the ExcelXP tagset. Detailed tutorials and references for the ExcelXP tagset are listed in the references section of this paper.

To control the appearance of the output within Excel, PROC TEMPLATE can be used to create a style template. A template defines how to format output produced by a procedure or DATA step. For information about PROC TEMPLATE, please consult this site: http://support.sas.com/rnd/base/topics/odsmarkup/tagsets.html. SAS provides many standard templates that allow for customization. To see a list of templates provided by SAS, (1) go to the Results window, (2) right-click on Results and select Templates, (3) expand sashelp.tmplmst (see Table-1 in Appendix).

In the macro that builds the metadata documentation, I created a customized style template that uses certain fonts, colors and spacing inside my Excel workbook. This step is not required to use ExcelXP. However, a style template makes the output more presentable.

DESIGN REQUIREMENTS

The following are the requirements for the metadata documentation:

A. Create a macro program with two parameters:
   1. DATADIR is used to assign the input library name.
   2. DSETNAME is used to assign a list of data sets separated by +.
   The preferred design is that DATADIR is a required parameter. If the value of DSETNAME is not provided, all data sets under the DATADIR directory should be used. Otherwise, the data sets specified in the DSETNAME macro variable are used. For this exercise, we use the data sets provided in the DSETNAME macro variable.

    %ls_datastruc(datadir = datadir, dsetname = demog_mk+weighte_mk+vital_mk+labchem_mk+ms_mk)

B. Create a metadata table inside a worksheet for each data set defined in the macro parameters. The label of each data set should be listed first, followed by the attributes of the variables. (See Table-2 in Appendix)

C. If a data set contains an EXAM_CD field, create a second table after the metadata table in the same worksheet. (See Table-3 in Appendix)

D. After the worksheets for all data sets are created, create a worksheet that is comprised of all variables from the individual worksheets to build a global dictionary for the data sets that were specified. In addition, identify all data sets that contain each variable. (See Table-4 in Appendix) This worksheet is used to cross-reference all tables and allows one to quickly spot any inconsistencies. For example, in Table-4, the EXAMPARM variable appears twice with different attributes, which means that the variable was defined differently among programs. We need to go back and correct the definition of the variable if the attributes should be the same, or give the variable a new name if the difference is intentional.

E. Create a global dictionary of the test code (variable name is EXAM_CD) to list all EXAM_CD values defined in the data sets provided in the DSETNAME macro parameter. In addition, include the EXAM_CD description provided in PROC FORMAT. (See Table-5 in Appendix) Normally, the PROC FORMAT descriptions appear on table output, for example as a title or test name. This worksheet is used to check that each description is associated with the correct test code.

IMPLEMENTATION

Since the ExcelXP tagset is still evolving, it has some limitations and its functionality may change in the future. Users should always download the latest update and review the changes and enhancements.

To use the ExcelXP tagset, first download the latest ExcelXP tagset from the ODS MARKUP page. This page also provides links to documentation for using and customizing tagsets. For this exercise, I use the ExcelXP tagset version dated June 2006. Before using the ExcelXP tagset, review the tagset code or execute the following statement to see a list of the options available in the ExcelXP tagset:

    ODS tagsets.excelxp file = "test.xml" options(doc="help");

For the pre-configuration part of requirement A below, only specifications are described, since the coding for this part is not the focus of this paper. The sections where the worksheets are built have more detailed coding information.

REQUIREMENT A

Create a macro program with two parameters.

    %MACRO ls_datastruc(datadir=, dsetname=);
    *Pre-configuration before building the worksheets;

The pre-configuration step builds the following tables and macro variable:

NULLTBL: A table used to build the header in the global worksheets for requirements D and E.

GLOBTBL: A table that contains all variables from the data sets in the DSETNAME list; for each variable, it holds the list of data sets that contain the variable. It is built from the dictionary.columns table and is used in requirement D.

TESTTBL: A table that contains all EXAM_CD values and the associated exam_cd short descriptions. The exam_cd values are collected from the individual data sets within the DSETNAME list. This is used in requirement E.

FMTDESCP: A table created by unloading the format with the PROC FORMAT CNTLOUT= option. It contains the full descriptions of the exam codes defined in PROC FORMAT. This is used in requirement E.

EXAMLST: A macro variable that contains all data sets that have the variable exam_cd. This is used to build the sub-table for requirement C.

The remainder of the macro outline sets up the style template and the workbook. The lines marked ":" are elided in the paper.

    *Set up the style template;
    proc template;
        define style styles.xlstatistical;
            parent = styles.statistical;
            :
            :
    *Set up the workbook;
    *Include the ExcelXP tagset code;
    ods listing close;
    ods tagsets.excelxp path = "c:\temp\excelxp" file = "AD13.xml" style = XLStatistical;
    *Build the worksheets (see requirements below);
    %MEND;

REQUIREMENT B

Create a metadata table inside a worksheet for each data set defined in the macro parameters. The SELECT statements below run inside a PROC SQL step, and the macro variable &_ODSDEST is expected to resolve to the open ODS destination (tagsets.excelxp). The library name in the WHERE clauses is taken from the DATADIR parameter so that the macro is not tied to one libref.

    %let num = 1;
    %let list = %upcase(%scan(&dsetname, &num, '+'));
    %*Use Do-While loop to create individual worksheet;
    %do %while (&list. ne );
        *Create worksheet with defined options;
        ods &_ODSDEST options(absolute_column_width = '6, 16, 6, 35, 35'
                              sheet_interval = 'none'
                              sheet_name = "&list");
        *Print data set name and label at the beginning of the sheet;
        select ' ', substr(memname,1) as Data_Set, ' ',
               substr(memlabel,1) as Data_Set_Label, ' ' as Created_by
            from dictionary.tables
            where libname = "%upcase(&datadir)" and memtype = "DATA" and memname = "&list";
        *Print data set columns and attributes information;
        select int(varnum) as Pos, upcase(name) as VarName,
               propcase(catx('',type,put(length, best4.))) as TypeLen,
               substr(label,1) as Label, ' ' as Deriviation_Comments
            from dictionary.columns
            where libname = "%upcase(&datadir)" and memtype = "DATA" and memname = "&list"
            order by varnum;
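The loop above depends on pre-configuration that the paper does not show. As an illustration only, the same pattern can be sketched in a self-contained form. The macro name, the output file name, and the use of SASHELP data sets in place of project data are all assumptions made for this sketch; they are not part of the original macro.

```sas
/* Hypothetical macro and file names; SASHELP stands in for the project library. */
%macro sketch_workbook(lib=SASHELP, dsetname=class+cars);
    %local num list;

    ods listing close;
    ods tagsets.excelxp file = "sketch_metadata.xml";

    proc sql;
    %let num  = 1;
    %let list = %upcase(%scan(&dsetname, &num, +));
    %do %while (%length(&list) > 0);
        /* One worksheet per data set; 'none' keeps both tables on that sheet. */
        ods tagsets.excelxp options(sheet_interval = 'none' sheet_name = "&list");

        /* First table: data set name and label. */
        select memname as Data_Set, memlabel as Data_Set_Label
            from dictionary.tables
            where libname = "%upcase(&lib)" and memtype = "DATA"
              and memname = "&list";

        /* Second table: one row per variable, in position order. */
        select varnum as Pos,
               upcase(name) as VarName,
               propcase(catx(' ', type, put(length, 4.))) as TypeLen,
               label as Label
            from dictionary.columns
            where libname = "%upcase(&lib)" and memtype = "DATA"
              and memname = "&list"
            order by varnum;

        /* Advance to the next data set in the +-delimited list. */
        %let num  = %eval(&num + 1);
        %let list = %upcase(%scan(&dsetname, &num, +));
    %end;
    quit;

    ods tagsets.excelxp close;
    ods listing;
%mend sketch_workbook;

%sketch_workbook(lib=SASHELP, dsetname=class+cars)
```

The sketch follows the paper's approach of re-issuing the ODS statement with a new sheet_name at the top of each iteration. The column widths and the custom style are left out to keep it short.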

REQUIREMENT C

If a data set contains an EXAM_CD field, create a second table after the metadata table in the same worksheet.

        %if %index(&examlst., &list.) %then %do;
            select distinct exam_cd label='Exam Code', examunit label='Unit', ' ',
                   examparm as description, ' ' as Week
                from &datadir..&list.;
        %end;
        %*Ready to build the next worksheet;
        %let num = %eval(&num + 1);
        %let list = %upcase(%scan(&dsetname, &num, '+'));
    %end;

REQUIREMENT D

Create a worksheet that is comprised of all variables from the individual worksheets to build a global dictionary for the data sets that were specified. In addition, identify all data sets that contain each variable.

    ods &_ODSDEST options(absolute_column_width = '10, 6, 30, 85'
                          sheet_interval = 'none'
                          sheet_name = 'VarDictionary');
    *Create a header at the beginning of the worksheet;
    select ' ' label='Purpose: ', ' ', ' ' label='Reference for Variable Dictionary', ' '
        from NULLTBL;
    *Create variable dictionary and the tables that contain it;
    select VarName, TypeLen, Label, memnames label='In Data Sets'
        from GLOBTBL
        order by varname;

REQUIREMENT E

Create a global dictionary of the test code (variable name is EXAM_CD) to list all EXAM_CD values defined in the data sets provided in the DSETNAME macro parameter. In addition, include the EXAM_CD description provided in PROC FORMAT.

    ods &_ODSDEST options(absolute_column_width = '10, 25, 65'
                          sheet_interval = 'none'
                          sheet_name = 'StudyTests');
    *Create a header at the beginning of the worksheet;
    select ' ' label='Purpose: ', ' ' label='List of Tests Done'
        from NULLTBL;
    *Create exam_cd dictionary with description from PROC FORMAT;
    select distinct a.exam_cd label='Exam Code',
           a.examparm label='Parameter Name',
           b.description label='Format Description'
        from TESTTBL a left join FMTDESCP b
            on a.exam_cd = b.exam_cd
        order by a.exam_cd;

As shown above, I use only a few of the options provided by ExcelXP. With PROC SQL, the SAS DICTIONARY tables and ExcelXP, I am able to quickly build a workbook with multiple worksheets that contain the metadata information for a list of data sets. This information is very useful for learning or verifying a project database design.
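Requirement E relies on the FMTDESCP table, whose construction is described only in words. A minimal sketch of one way to build such a table and join it to the test codes follows. The format name EXAMFMT, the code values, and the sample rows are invented for illustration; the paper's own FMTDESCP uses the column names exam_cd and description, whereas the sketch keeps the default CNTLOUT= names START and LABEL.

```sas
/* All names and values below are made up for this sketch. */
proc format;
    value $examfmt
        'HR'  = 'Heart Rate'
        'SBP' = 'Systolic Blood Pressure';
run;

/* CNTLOUT= writes the selected format definition to a data set. */
proc format cntlout = fmtdescp(keep = fmtname start label);
    select $examfmt;
run;

/* Stand-in for TESTTBL: test codes collected from the data sets. */
data testtbl;
    length exam_cd $8 examparm $40;
    exam_cd = 'HR';  examparm = 'Heart rate';  output;
    exam_cd = 'SBP'; examparm = 'Systolic BP'; output;
    exam_cd = 'WT';  examparm = 'Weight';      output;  /* no format entry */
run;

proc sql;
    /* The left join keeps codes that have no format description, */
    /* so a missing description shows up as a blank cell.         */
    select a.exam_cd  label = 'Exam Code',
           a.examparm label = 'Parameter Name',
           b.label    label = 'Format Description'
        from testtbl a left join fmtdescp b
            on a.exam_cd = b.start
        order by a.exam_cd;
quit;
```

In this sketch the WT row would appear with an empty description, which is the kind of gap the StudyTests worksheet is meant to expose.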

SUMMARY

ExcelXP is one of the many tools in SAS for creating Excel output, and it requires only simple configuration. Combined with the SAS Dictionary tables, I found it a simple and very useful way to create documentation for quality assurance purposes. Please visit the SAS support website at http://support.sas.com/rnd/base/topics/odsmarkup/ for additional ExcelXP tagset information and examples.

REFERENCES

DelGobbo, V. 2006. "Creating AND Importing Multi-Sheet Excel Workbooks the Easy Way with SAS". Proceedings of the Thirty-First Annual SAS Users Group International Conference, 31. CD-ROM. Paper 115.

Gebhart, E. 2005. "ODS Markup: The SAS Reports You've Always Dreamed Of". Proceedings of the Thirtieth Annual SAS Users Group International Conference, 30. CD-ROM. Paper 85.

Zender, C. 2005. "The Power of Table Templates and DATA _NULL_". Proceedings of the Thirtieth Annual SAS Users Group International Conference, 30. CD-ROM. Paper 88.

PharmaSUG 2006 Paper: "Simple Ways to Use PROC SQL and DICTIONARY TABLES to Verify Data Structure of the Electronic Submission Data Sets" by Christine S. Teng and Wenjie Wang.

SAS Macro Language: Reference

SAS SQL Procedure User's Guide

ACKNOWLEDGMENTS

The author would like to thank the management team for their encouragement and review of this paper.

TRADEMARKS

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks of their respective companies.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Christine Teng
Merck & Co., Inc.
Rahway, NJ 07065
christine_teng@merck.com

APPENDIX

Table 1 (Available Tagsets in SAS 9)

Table 2 (Requirement B)

Table 3 (Requirement C)

Table 4 (Requirement D)

Table 5 (Requirement E)