Automating the Production of Formatted Item Frequencies using Survey Metadata


Tim Tilert, Centers for Disease Control and Prevention (CDC) / National Center for Health Statistics (NCHS)
Jane Zhang, CDC / NCHS
Lewis Berman, CDC / NCHS

1. ABSTRACT

The National Health and Nutrition Examination Survey (NHANES) collects a vast array of questionnaire and examination data regarding the health and nutritional status of the United States population. Ongoing release of NHANES data to the public is one of the many tasks associated with the survey. Codebooks consisting of data item names and associated metadata, along with corresponding item frequencies, accompany the public data release. The challenge is to use existing metadata to automate the production of the detailed response or exam result frequencies for every data item released.

This poster illustrates a novel solution built on the SAS/IntrNet system, along with the unique challenges posed by combining metadata with actual survey data to produce automated frequency distributions. These challenges include associating item labels from the metadata with the actual survey data via dynamic SAS formats, systematically computing ranges for data which were not coded, handling floating point number limitations, ordering the final results in a standardized fashion, and updating the database with the resulting computed frequencies.

2. INTRODUCTION

The NHANES is designed to monitor the health and nutritional status of the U.S. population. In 1999, NHANES became a continuous survey fielded on an ongoing basis. The survey sample selected each year is a multi-staged probability sample of persons of all ages and is representative of the noninstitutionalized U.S. civilian population. Data are released in two-year cycles. Participation in the survey is voluntary. Findings are reported for the total U.S. population, as well as for selected race/ethnicity groups such as African Americans and Mexican Americans living in the U.S.

NHANES data are obtained by personal interviews, health examinations, and laboratory tests. All data collection methods follow standardized protocols. Initially, people who are selected for the survey samples are interviewed in their homes. The interviewed individual is then invited to participate in a health examination component. The health examinations are conducted in Mobile Examination Centers (MEC). Examinees receive a preliminary report of their examination findings at the conclusion of the MEC exam and a final report of findings after all laboratory processing is completed.

3. PROBLEM

For each survey component (Blood Pressure Exam, Total Cholesterol Lab, Prescription Medication Questionnaire, for example), there are numerous exam, lab, or questionnaire items. Tied to the public release of the data, the National Center for Health Statistics (NCHS) releases frequencies for each of these items. Producing these frequencies is tedious for several reasons. First, some of the items have character values while other items have numeric values, so one cannot simply run proc means or proc freq for all items to produce frequencies. Second, many of these survey items (both character and numeric) have several hundred or even thousands of distinct values, so a simple proc freq statement produces a table which is too difficult to read and is unmanageable from a publication standpoint.

In the past, a programmer was assigned to each component to address these issues. These programmers had to walk through each component, item by item, and determine whether proc freq or proc means should be run for each item. In addition, they had to write out SAS format statements for each item so that the resulting frequencies were formatted correctly.

The goal of this effort was to find a way to automate these frequencies, dynamically and automatically format all the values for each item, convert unmanageable lists of distinct values to value ranges, and present the resulting output in a consistent, easy-to-understand order.

4. APPROACH and METHODOLOGY

By using the pre-existing metadata that was created and validated in a web-based codebook application, it became possible to automate the production of the survey frequencies. A series of SAS macros was developed to combine the data to be released (residing in SAS datasets) with the pre-existing metadata (stored in Sybase). Through the integration of the web-based codebook application with SAS/IntrNet, users are now able to call these SAS macros directly from the web-based codebook application to dynamically and automatically format all the values for each survey item, convert unmanageable lists of distinct values to value ranges, present the resulting output in a consistent, easy-to-understand order, and save this final frequency output to the Sybase database.

4.1 DYNAMIC SAS FORMATS

In order to explain the development methodology, it is important to understand the metadata. The metadata for each survey component is stored in Sybase and is then presented as Hyper Text Markup Language (HTML) codebooks or data dictionaries. Below are two excerpts from the NHANES Cardiovascular Fitness Examination codebook:

CVQ220m    Priority 2 Stop, other specified reasons
English Text: Reason for Priority 2 Stop: Other specified reasons
Codes: 1 = Yes, 2 = No
Skip To Values:

CVDEXLEN    Length of CV fitness exam (min)
English Text: Length of the CV fitness exam (minutes)

All of the values presented in these codebook excerpts are stored in a metadata database, and it is these values which are used to dynamically create the formatted frequencies. The first requirement is to read all the item names (CVQ220m, CVDEXLEN) and the corresponding coded values (1 = Yes, 2 = No) for these items from the Sybase tables into separate SAS datasets.

Then, in order to dynamically create the SAS formats, each item requires its own unique format name. Since a survey component has a limited number of items, the approach is simply to use the observation number (_n_) to create the unique format names while still satisfying the SAS constraints that a format name be eight characters or less and not end with a digit. After creating the format names, the program loops through all the items. If the metadata database defines the item as numeric, the format name begins with "fm"; if the item is character, the format name begins with "$fm". See the code below:

[Code listing not recoverable from the source transcription.]
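The original listing is not legible in this copy, so the sketch below is not the authors' code; it only illustrates one way the naming rule just described could be written. The dataset work.items and its variables item_name and item_type ('N' or 'C') are hypothetical, and the trailing letter "f" is simply one assumed way to keep a name from ending in a digit.

```sas
/* Hypothetical input: work.items, one row per survey item, with
   item_name and item_type ('N' = numeric, 'C' = character).     */
data work.item_formats;
  set work.items;
  length fmtname $8;
  /* _n_ makes the name unique within the component; the trailing
     letter keeps the name from ending in a digit.                */
  if item_type = 'C' then fmtname = cats('$fm', _n_, 'f');
  else fmtname = cats('fm', _n_, 'f');
run;
```

Under these assumptions the third item would receive the name fm3f (or $fm3f if it is character), and the names stay within eight characters for up to 9,999 items per component.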

Then, depending on whether the item is character or numeric, the appropriate macro is called to create the SAS formats. This is fairly straightforward. Two SAS datasets are created (one for numeric values and one for character values), each containing the starting value, the ending value, and the label to be used when formatting individual values. These datasets are then employed in the SAS proc format statements later in the program.

4.2 CONVERT DISTINCT VALUES TO VALUE RANGES

Most of the SAS formats are straightforward, with one exception: converting overly large lists of values to a value range. For example, the length of a Cardiovascular Fitness exam (CVDEXLEN) has 791 distinct values, far too many to be practically displayed in a single frequency table.

The approach taken is to run proc freq for every item, regardless of whether it is character or numeric. A value in our metadata table designates the maximum number of discrete values that we will allow to display in a frequency table. The default is 50. This means that if more than 50 distinct uncoded values are found for an item, these distinct values are converted to one range of values.

This test and the subsequent conversion are accomplished by outputting the frequencies generated and counting the number of records in the resulting output file. If the number of records in the output file exceeds the maximum number of values allowed, the outputted values are converted to a range for numeric values, or simply labeled using the desired metadata label for character values. Otherwise, the outputted values are displayed as they are.

Since proc freq orders its output by value by default and the frequencies have just been saved to a file, it is very straightforward at this point to create the range of values.
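The count-and-compare test described above might look like the following minimal sketch. The dataset work.release and the macro variable names are assumptions made for illustration; CVDEXLEN and the default of 50 come from the text.

```sas
%let maxvals = 50;   /* default threshold quoted in the text */

/* One output row per distinct non-missing value, ordered by value. */
proc freq data=work.release(where=(not missing(cvdexlen))) noprint;
  tables cvdexlen / out=work.freq_out;
run;

/* Count the rows in the output file and compare with the threshold. */
data _null_;
  if 0 then set work.freq_out nobs=n_values;
  call symputx('use_range', n_values > &maxvals);
  stop;
run;

%put NOTE: use_range=&use_range;
```

Here use_range resolves to 1 when the item has more distinct values than the threshold and to 0 otherwise, so later macro logic can branch on it.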
The first record in the output frequency file becomes the "from" value in the range, while the last record becomes the "to" value.

4.3 HANDLING FLOATING POINT NUMBER LIMITATIONS

Once the range issue had been solved, the application worked well, but periodically the output for a given item contained one of the range delimiters as its own value record, in addition to a formatted range of values. This duplication only happens with floating point numeric values, that is, numbers with decimal places. After looking through the temporary datasets, it was discovered that the numbers don't match exactly, as they differ in the outermost decimal places. This mismatch is due to the limitations of floating point numeric representation, which exist in nearly every software package and hardware device.

With some research [1], it was determined that there is a fuzz value that can be used in the format datasets that tells SAS to ignore differences smaller than a certain precision value. Since the differences are all past six decimal places and that level of precision is not required, the fuzz value in the numeric formats is set to [value not preserved in the source]. This resolves all of the data misrepresentations.
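The range and fuzz steps can be sketched together as a single format control dataset. This is an illustration under stated assumptions, not the authors' program: work.freq_out is assumed to hold one row per distinct value of CVDEXLEN in ascending order, the format name fm2f is hypothetical, and the tolerance of 1e-6 is an assumed stand-in because the paper's own fuzz value is missing from this copy.

```sas
/* Build a one-row control dataset: first record gives the "from"
   value, last record gives the "to" value.                        */
data work.range_fmt;
  set work.freq_out end=last;
  length fmtname $8 type $1 label $40;
  retain start;
  keep fmtname type start end label fuzz;
  if _n_ = 1 then start = cvdexlen;   /* first record: "from" value */
  if last then do;
    end     = cvdexlen;               /* last record: "to" value    */
    fmtname = 'fm2f';                 /* hypothetical format name   */
    type    = 'N';                    /* numeric format             */
    label   = 'Range of Values';
    fuzz    = 1e-6;                   /* assumed tolerance          */
    output;
  end;
run;

/* Create the format from the control dataset. */
proc format cntlin=work.range_fmt;
run;

/* Apply it so the whole range collapses to one frequency row. */
proc freq data=work.release;
  tables cvdexlen;
  format cvdexlen fm2f.;
run;
```

With the fuzz in place, a delimiter that differs from the stored start or end value only in its last binary digits still falls inside the range instead of appearing as a separate row.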

4.4 ORDERING THE FINAL RESULTS

Sorting the output values is not a trivial task. The values for an item can be either character or numeric. There are significant differences between sorting numeric values and sorting character values, and an algorithm was needed that would work in all cases. Since the maximum length of a coded value was decided a priori to be 40 characters in our database, we chose to create a special character variable (dom_val_sort) in the database, also with a length of 40, that could be used for sorting the values.

If the coded value was numeric, the value of dom_val_sort was front-filled with blanks. This way, instead of 40 preceding 5 as it would when sorting on the coded value itself, 5 always precedes 40 when sorted using dom_val_sort. Conversely, if the coded value was character, the value of dom_val_sort was back-filled with blanks. Finally, to ensure that MISSING values are always displayed last in the outputted frequencies, their dom_val_sort value was set to a 40-character string of "Z" characters.

4.5 UPDATING THE DATABASE

In order to produce the HTML output using the previously developed web application, the database needs to be updated to include the frequencies as well as the newly created sort order variable (dom_val_sort). This was accomplished using a simple proc append statement. In the very first attempt at updating the database, the program elicited the following error: "Unable to update a Sybase table with an Identity field with SAS V8.2." After more research [2], it was discovered that this was a known error in SAS V8.2 that required the download and installation of SAS technical support hotfix 82SB09. After applying the hotfix, the program was able to update the database successfully.

4.6 RESULTS

Below are the same codebook excerpts shown earlier from the NHANES Cardiovascular Fitness Examination codebook. These excerpts are from the new codebooks.
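As a brief aside on the sort key from section 4.4, the padding rules can be sketched as follows. The dataset work.domain_values and the flags is_numeric and is_missing are hypothetical; dom_val is assumed to be a 40-character variable holding the coded value, and dom_val_sort is the variable named in the text.

```sas
/* Hypothetical input: work.domain_values with dom_val ($40) and
   0/1 flags is_numeric and is_missing.                           */
data work.sorted_keys;
  set work.domain_values;
  length dom_val_sort $40;
  if is_missing then dom_val_sort = repeat('Z', 39);    /* 40 Z characters       */
  else if is_numeric then dom_val_sort = right(dom_val); /* front-fill with blanks */
  else dom_val_sort = left(dom_val);                     /* back-fill with blanks  */
run;

proc sort data=work.sorted_keys;
  by dom_val_sort;
run;
```

Because a blank sorts before any digit, a right-aligned "5" precedes a right-aligned "40", which is the ordering the text asks for.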
Note that these examples now include the automatically computed, formatted frequencies:

CVQ220m    Priority 2 Stop, other specified reasons
English Text: Reason for Priority 2 Stop: Other specified reasons

Code or Value    Description    Count    Skip to Item
1                Yes            42
2                No             411
.                Missing        4699

CVDEXLEN    Length of CV fitness exam (min)
English Text: Length of the CV fitness exam (minutes)

Code or Value                              Description        Count                              Skip to Item
0 to [upper bound not preserved in source] Range of Values    [count not preserved in source]
.                                          Missing            0

5. CONCLUSIONS

By combining existing metadata with survey release data, it is possible to take a long, tedious, very user-involved process and turn it into an easy-to-use, automated SAS/IntrNet program. In prior releases, codebooks were tediously created via manual data entry into Microsoft FrontPage. The frequency files were also manually created from user-defined macros for each survey item. Moving forward, it is now possible to combine the codebook information with formatted frequencies into a single output file and to produce this file automatically without any manual user intervention. This significantly speeds up and simplifies the release process and offers end users an easier-to-use, fully integrated data dictionary complete with frequencies.

6. REFERENCES

1. Pete Lund, "More than Just Value: A Look into the Depths of PROC FORMAT," SAS Users Group International 27th Annual Conference Proceedings.
2. SAS Technical Support Web Site, SN [number not preserved in the source], "Unable to update a Sybase table with an Identity field with SAS V8.2."

7. ACKNOWLEDGEMENTS

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.

8. CONTACT INFORMATION

Tim Tilert
Centers for Disease Control and Prevention / National Center for Health Statistics
3311 Toledo Rd.
Hyattsville, MD 20782
Work phone: (301)
Fax: (301)
tnt6@cdc.gov

Date Last Modified: September 7, 2004
Submitted to: The Northeast SAS Users Group

Summarizing Impossibly Large SAS Data Sets For the Data Warehouse Server Using Horizontal Summarization

Summarizing Impossibly Large SAS Data Sets For the Data Warehouse Server Using Horizontal Summarization Summarizing Impossibly Large SAS Data Sets For the Data Warehouse Server Using Horizontal Summarization Michael A. Raithel, Raithel Consulting Services Abstract Data warehouse applications thrive on pre-summarized

More information

ABSTRACT MORE THAN SYNTAX ORGANIZE YOUR WORK THE SAS ENTERPRISE GUIDE PROJECT. Paper 50-30

ABSTRACT MORE THAN SYNTAX ORGANIZE YOUR WORK THE SAS ENTERPRISE GUIDE PROJECT. Paper 50-30 Paper 50-30 The New World of SAS : Programming with SAS Enterprise Guide Chris Hemedinger, SAS Institute Inc., Cary, NC Stephen McDaniel, SAS Institute Inc., Cary, NC ABSTRACT SAS Enterprise Guide (with

More information

SAS Application to Automate a Comprehensive Review of DEFINE and All of its Components

SAS Application to Automate a Comprehensive Review of DEFINE and All of its Components PharmaSUG 2017 - Paper AD19 SAS Application to Automate a Comprehensive Review of DEFINE and All of its Components Walter Hufford, Vincent Guo, and Mijun Hu, Novartis Pharmaceuticals Corporation ABSTRACT

More information

Maximizing Statistical Interactions Part II: Database Issues Provided by: The Biostatistics Collaboration Center (BCC) at Northwestern University

Maximizing Statistical Interactions Part II: Database Issues Provided by: The Biostatistics Collaboration Center (BCC) at Northwestern University Maximizing Statistical Interactions Part II: Database Issues Provided by: The Biostatistics Collaboration Center (BCC) at Northwestern University While your data tables or spreadsheets may look good to

More information

SAS Macro Technique for Embedding and Using Metadata in Web Pages. DataCeutics, Inc., Pottstown, PA

SAS Macro Technique for Embedding and Using Metadata in Web Pages. DataCeutics, Inc., Pottstown, PA Paper AD11 SAS Macro Technique for Embedding and Using Metadata in Web Pages Paul Gilbert, Troy A. Ruth, Gregory T. Weber DataCeutics, Inc., Pottstown, PA ABSTRACT This paper will present a technique to

More information

Statistics, Data Analysis & Econometrics

Statistics, Data Analysis & Econometrics ST009 PROC MI as the Basis for a Macro for the Study of Patterns of Missing Data Carl E. Pierchala, National Highway Traffic Safety Administration, Washington ABSTRACT The study of missing data patterns

More information

How to Implement the One-Time Methodology Mark Tabladillo, Ph.D., MarkTab Consulting, Atlanta, GA Associate Faculty, University of Phoenix

How to Implement the One-Time Methodology Mark Tabladillo, Ph.D., MarkTab Consulting, Atlanta, GA Associate Faculty, University of Phoenix Paper PO-09 How to Implement the One-Time Methodology Mark Tabladillo, Ph.D., MarkTab Consulting, Atlanta, GA Associate Faculty, University of Phoenix ABSTRACT This paper demonstrates how to implement

More information

There s No Such Thing as Normal Clinical Trials Data, or Is There? Daphne Ewing, Octagon Research Solutions, Inc., Wayne, PA

There s No Such Thing as Normal Clinical Trials Data, or Is There? Daphne Ewing, Octagon Research Solutions, Inc., Wayne, PA Paper HW04 There s No Such Thing as Normal Clinical Trials Data, or Is There? Daphne Ewing, Octagon Research Solutions, Inc., Wayne, PA ABSTRACT Clinical Trials data comes in all shapes and sizes depending

More information

CMISS the SAS Function You May Have Been MISSING Mira Shapiro, Analytic Designers LLC, Bethesda, MD

CMISS the SAS Function You May Have Been MISSING Mira Shapiro, Analytic Designers LLC, Bethesda, MD ABSTRACT SESUG 2016 - RV-201 CMISS the SAS Function You May Have Been MISSING Mira Shapiro, Analytic Designers LLC, Bethesda, MD Those of us who have been using SAS for more than a few years often rely

More information

Using Metadata Queries To Build Row-Level Audit Reports in SAS Visual Analytics

Using Metadata Queries To Build Row-Level Audit Reports in SAS Visual Analytics SAS6660-2016 Using Metadata Queries To Build Row-Level Audit Reports in SAS Visual Analytics ABSTRACT Brandon Kirk and Jason Shoffner, SAS Institute Inc., Cary, NC Sensitive data requires elevated security

More information

Dear friends of Survey Solutions,

Dear friends of Survey Solutions, Dear friends of Survey Solutions, In version 5.12.0 that we have released on September 06, 2016 we release the interviewer cover page and improvements related to functioning of rosters, as well as other

More information

Centers for Disease Control and Prevention National Center for Health Statistics

Centers for Disease Control and Prevention National Center for Health Statistics Wireless-Only and Wireless-Mostly Households: A growing challenge for telephone surveys Stephen Blumberg sblumberg@cdc.gov Julian Luke jluke@cdc.gov Centers for Disease Control and Prevention National

More information

Quality Control of Clinical Data Listings with Proc Compare

Quality Control of Clinical Data Listings with Proc Compare ABSTRACT Quality Control of Clinical Data Listings with Proc Compare Robert Bikwemu, Pharmapace, Inc., San Diego, CA Nicole Wallstedt, Pharmapace, Inc., San Diego, CA Checking clinical data listings with

More information

Chaining Logic in One Data Step Libing Shi, Ginny Rego Blue Cross Blue Shield of Massachusetts, Boston, MA

Chaining Logic in One Data Step Libing Shi, Ginny Rego Blue Cross Blue Shield of Massachusetts, Boston, MA Chaining Logic in One Data Step Libing Shi, Ginny Rego Blue Cross Blue Shield of Massachusetts, Boston, MA ABSTRACT Event dates stored in multiple rows pose many challenges that have typically been resolved

More information

NRS STATE DATA QUALITY CHECKLIST

NRS STATE DATA QUALITY CHECKLIST A Project of the U.S. Department of Education NRS STATE DATA QUALITY CHECKLIST State: Date: Completed by (name and title): A. Data Foundation and Structure Acceptable Quality 1. State has written assessment

More information

Lecture 1 Getting Started with SAS

Lecture 1 Getting Started with SAS SAS for Data Management, Analysis, and Reporting Lecture 1 Getting Started with SAS Portions reproduced with permission of SAS Institute Inc., Cary, NC, USA Goals of the course To provide skills required

More information

Making the most of SAS Jobs in LSAF

Making the most of SAS Jobs in LSAF PharmaSUG 2018 - Paper AD-26 Making the most of SAS Jobs in LSAF Sonali Garg, Alexion; Greg Weber, DataCeutics ABSTRACT SAS Life Science Analytics Framework (LSAF) provides the ability to have a 21 CFR

More information

HEALTH AND RETIREMENT STUDY 2006 Internet Survey Final, Version 1.0 November Data Description and Usage. November 2008, Version 1.

HEALTH AND RETIREMENT STUDY 2006 Internet Survey Final, Version 1.0 November Data Description and Usage. November 2008, Version 1. HEALTH AND RETIREMENT STUDY 2006 Internet Survey Final, Version 1.0 November 2008 Data Description and Usage November 2008, Version 1.0 TABLE OF CONTENTS TABLE OF CONTENTS... II 1. INTRODUCTION... 1 2.

More information

SESUG 2014 IT-82 SAS-Enterprise Guide for Institutional Research and Other Data Scientists Claudia W. McCann, East Carolina University.

SESUG 2014 IT-82 SAS-Enterprise Guide for Institutional Research and Other Data Scientists Claudia W. McCann, East Carolina University. Abstract Data requests can range from on-the-fly, need it yesterday, to extended projects taking several weeks or months to complete. Often institutional researchers and other data scientists are juggling

More information

INTRODUCTION to SAS STATISTICAL PACKAGE LAB 3

INTRODUCTION to SAS STATISTICAL PACKAGE LAB 3 Topics: Data step Subsetting Concatenation and Merging Reference: Little SAS Book - Chapter 5, Section 3.6 and 2.2 Online documentation Exercise I LAB EXERCISE The following is a lab exercise to give you

More information

Events User Guide for Microsoft Office Live Meeting from Global Crossing

Events User Guide for Microsoft Office Live Meeting from Global Crossing for Microsoft Office Live Meeting from Global Crossing Contents Events User Guide for... 1 Microsoft Office Live Meeting from Global Crossing... 1 Contents... 1 Introduction... 2 About This Guide... 2

More information

Using a Fillable PDF together with SAS for Questionnaire Data Donald Evans, US Department of the Treasury

Using a Fillable PDF together with SAS for Questionnaire Data Donald Evans, US Department of the Treasury Using a Fillable PDF together with SAS for Questionnaire Data Donald Evans, US Department of the Treasury Introduction The objective of this paper is to demonstrate how to use a fillable PDF to collect

More information

A SAS/AF Application for Linking Demographic & Laboratory Data For Participants in Clinical & Epidemiologic Research Studies

A SAS/AF Application for Linking Demographic & Laboratory Data For Participants in Clinical & Epidemiologic Research Studies Paper 208 A SAS/AF Application for Linking Demographic & Laboratory Data For Participants in Clinical & Epidemiologic Research Studies Authors: Emily A. Mixon; Karen B. Fowler, University of Alabama at

More information

KEYWORDS Metadata, macro language, CALL EXECUTE, %NRSTR, %TSLIT

KEYWORDS Metadata, macro language, CALL EXECUTE, %NRSTR, %TSLIT MWSUG 2017 - Paper BB15 Building Intelligent Macros: Driving a Variable Parameter System with Metadata Arthur L. Carpenter, California Occidental Consultants, Anchorage, Alaska ABSTRACT When faced with

More information

Version 8 Base SAS Performance: How Does It Stack-Up? Robert Ray, SAS Institute Inc, Cary, NC

Version 8 Base SAS Performance: How Does It Stack-Up? Robert Ray, SAS Institute Inc, Cary, NC Paper 9-25 Version 8 Base SAS Performance: How Does It Stack-Up? Robert Ray, SAS Institute Inc, Cary, NC ABSTRACT This paper presents the results of a study conducted at SAS Institute Inc to compare the

More information

Hands-On Workshops. Creating Java Based Applications

Hands-On Workshops. Creating Java Based Applications Creating Java Based Applications Destiny Corporation, Wethersfield, CT INTRODUCTION This presentation is designed to enable the user to create a Java Based Application. It will demonstrate this process

More information

Resolving Text Substitutions

Resolving Text Substitutions Resolving Text Substitutions Jason Ostergren, Helena Stolyarova, Danilo Gutierrez October 2010 13th International ti Blaise Users Conference Baltimore, Maryland Survey Research Operations Survey Research

More information

Let s get started with the module Getting Data from Existing Sources.

Let s get started with the module Getting Data from Existing Sources. Welcome to Data Academy. Data Academy is a series of online training modules to help Ryan White Grantees be more proficient in collecting, storing, and sharing their data. Let s get started with the module

More information

A Visual Step-by-step Approach to Converting an RTF File to an Excel File

A Visual Step-by-step Approach to Converting an RTF File to an Excel File A Visual Step-by-step Approach to Converting an RTF File to an Excel File Kirk Paul Lafler, Software Intelligence Corporation Abstract Rich Text Format (RTF) files incorporate basic typographical styling

More information

Using SAS software to shrink the data in your applications

Using SAS software to shrink the data in your applications Paper 991-2016 Using SAS software to shrink the data in your applications Ahmed Al-Attar, AnA Data Warehousing Consulting LLC, McLean, VA ABSTRACT This paper discusses the techniques I used at the Census

More information

What to Expect When You Need to Make a Data Delivery... Helpful Tips and Techniques

What to Expect When You Need to Make a Data Delivery... Helpful Tips and Techniques What to Expect When You Need to Make a Data Delivery... Helpful Tips and Techniques Louise Hadden, Abt Associates Inc. QUESTIONS YOU SHOULD ASK REGARDING THE PROJECT Is there any information regarding

More information

Omitting Records with Invalid Default Values

Omitting Records with Invalid Default Values Paper 7720-2016 Omitting Records with Invalid Default Values Lily Yu, Statistics Collaborative Inc. ABSTRACT Many databases include default values that are set inappropriately. These default values may

More information

SAS Online Training: Course contents: Agenda:

SAS Online Training: Course contents: Agenda: SAS Online Training: Course contents: Agenda: (1) Base SAS (6) Clinical SAS Online Training with Real time Projects (2) Advance SAS (7) Financial SAS Training Real time Projects (3) SQL (8) CV preparation

More information

The Impossible An Organized Statistical Programmer Brian Spruell and Kevin Mcgowan, SRA Inc., Durham, NC

The Impossible An Organized Statistical Programmer Brian Spruell and Kevin Mcgowan, SRA Inc., Durham, NC Paper CS-061 The Impossible An Organized Statistical Programmer Brian Spruell and Kevin Mcgowan, SRA Inc., Durham, NC ABSTRACT Organization is the key to every project. It provides a detailed history of

More information

WHO STEPS Surveillance Support Materials. STEPS Epi Info Training Guide

WHO STEPS Surveillance Support Materials. STEPS Epi Info Training Guide STEPS Epi Info Training Guide Department of Chronic Diseases and Health Promotion World Health Organization 20 Avenue Appia, 1211 Geneva 27, Switzerland For further information: www.who.int/chp/steps WHO

More information

Pharmaceuticals, Health Care, and Life Sciences

Pharmaceuticals, Health Care, and Life Sciences Successful Lab Result Conversion for LAB Analysis Data with Minimum Effort Pushpa Saranadasa, Merck & Co., Inc. INTRODUCTION In the pharmaceutical industry, the statistical results of a clinical trial's

More information

Maryland OneStop Statewide License Portal State of Maryland Department of Information Technology

Maryland OneStop Statewide License Portal   State of Maryland Department of Information Technology Maryland OneStop Statewide License Portal http://onestop.md.gov/ State of Maryland Department of Information Technology Category: Digital Government: Government to Citizen Contact: Michael G. Leahy Secretary

More information

CC13 An Automatic Process to Compare Files. Simon Lin, Merck & Co., Inc., Rahway, NJ Huei-Ling Chen, Merck & Co., Inc., Rahway, NJ

CC13 An Automatic Process to Compare Files. Simon Lin, Merck & Co., Inc., Rahway, NJ Huei-Ling Chen, Merck & Co., Inc., Rahway, NJ CC13 An Automatic Process to Compare Files Simon Lin, Merck & Co., Inc., Rahway, NJ Huei-Ling Chen, Merck & Co., Inc., Rahway, NJ ABSTRACT Comparing different versions of output files is often performed

More information

ANSI Standards: Creating a local, searchable database

ANSI Standards: Creating a local, searchable database ANSI Standards: Creating a local, searchable database By Norma J. Dowell Introduction Iowa State University Library, as a large technical university, has an extensive paper collection of American National

More information

Copy That! Using SAS to Create Directories and Duplicate Files

Copy That! Using SAS to Create Directories and Duplicate Files Copy That! Using SAS to Create Directories and Duplicate Files, SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and

More information

SUGI 29 Data Warehousing, Management and Quality

SUGI 29 Data Warehousing, Management and Quality Building a Purchasing Data Warehouse for SRM from Disparate Procurement Systems Zeph Stemle, Qualex Consulting Services, Inc., Union, KY ABSTRACT SAS Supplier Relationship Management (SRM) solution offers

More information

Paper HOW-06. Tricia Aanderud, And Data Inc, Raleigh, NC

Paper HOW-06. Tricia Aanderud, And Data Inc, Raleigh, NC Paper HOW-06 Building Your First SAS Stored Process Tricia Aanderud, And Data Inc, Raleigh, NC ABSTRACT Learn how to convert a simple SAS macro into three different stored processes! Using examples from

More information
