PharmaSUG 2015 - Paper QT22

Creating output datasets using SQL (Structured Query Language) only
Andrii Stakhniv, Experis Clinical, Ukraine


ABSTRACT

PROC SQL is one of the most powerful procedures in SAS. With this tool we can easily manipulate data and create a large number of outputs. I will illustrate how we can create final datasets for three common types of safety outputs using SQL (Structured Query Language) only:

- Outputs with a fixed structure (like Disposition outputs);
- Outputs with descriptive statistics (like Demographics or Laboratory outputs);
- Outputs with a flexible 2-level hierarchy structure (like Adverse Events outputs).

The approaches and tricks presented here can be used in everyday work because they are easy to implement and understand. This information can also be a helpful resource for those who are new to statistical programming.

INTRODUCTION

SAS is widely used in the pharmaceutical industry as the main software tool for clinical trial analyses. At the core of SAS is a specific programming language which contains four key elements:

- Data steps
- Procedures
- Macros
- ODS (Output Delivery System)

With these key elements we can easily manipulate data, perform different types of analysis, and produce a large variety of reports. One of the most powerful procedures in SAS is PROC SQL; SQL is very flexible and therefore popular among developers. The purpose of this paper is to demonstrate the capabilities of SQL and to provide basic knowledge about the clinical trial output creation process through step-by-step instructions and examples.

GENERATING INPUT DATA

According to the Clinical Data Interchange Standards Consortium (CDISC) standards, Analysis Data Model (ADaM) datasets are used to generate statistical outputs for a regulatory submission. They are built on Study Data Tabulation Model (SDTM) datasets, which contain the data collected during the study, organized by clinical domain.

To illustrate the statistical output creation process, we need two ADaM datasets: an Analysis Subject Level (ASL) dataset and an analysis dataset (ANL). The ASL dataset has a one-record-per-subject structure, where the USUBJID variable uniquely identifies each record. It usually stores demographic data, treatment group and population information, baseline characteristics, and any other subject- or study-specific data. An ANL dataset can have different structures depending on the clinical domain, but here we use one where each subject can have several records and each record is classified by both a main category and a sub category (for example, an Adverse Events dataset with System Organ Class and Preferred Term classification).

For our purpose, we assume that we have 30 subjects (patients) in our study and 3 treatment arms. To make sure that the applied methods are not data dependent, we set up these parameters at the very beginning of the program.

%let pt_num = 30;   /* Number of patients         */
%let tr_num = 3;    /* Number of treatment groups */
%let asl = _asl;    /* Subject Level dataset name */
%let anl = _anl;    /* Analysis dataset name      */

Everything is now ready to generate our dummy ASL and ANL datasets. To keep the presentation clean, we simplify the list of variables.

** Generation of ASL (Analysis Subject Level) dataset **;
data &asl(drop = i r);
    do i = 1 to &pt_num by 1;
        ** Random variables **;
        r = rand("uniform");                  /* [0, 1]  */
        k = floor(10*r);                      /* [0, 10] */

        ** Create subject (patient) id number **;
        USUBJID = 1000 + i;

        ** Set up treatment variables: numeric and character **;
        TRTN = round(1 + (&tr_num - 1) * r);
        TRT  = "Treatment - " || compress(put(trtn, best.));

        ** Set up population / discontinued / completer flags **;
        POPFL  = ifc(k < 9, "Y", "");
        COMPFL = ifc(k < 7, "Y", "");
        DISCFL = ifc(k = 7, "Y", "");

        ** Set up 2 numeric variables and their character counterparts **;
        ** (Example: AGE & AGECAT or some baseline value)              **;
        NUM_VAR1 = round(20 + (50 - 20) * r);        /* [20, 50] */
        CHR_VAR1 = ifc(num_var1 <= 35, "<= 35", "> 35");
        NUM_VAR2 = round(-5 + (5 + 5) * r, 0.001);   /* [-5, 5]  */
        CHR_VAR2 = ifc(num_var2 <= 0, "Negative", "Positive");
        output;
    end;
run;

** Generation of ANL (Analysis) dataset **;
data &anl(drop = i r);
    do i = 1 to 25 by 1;
        ** Random variables **;
        r = rand("uniform");   /* [0, 1] */

        USUBJID = 1000 + round(1 + (&pt_num - 1) * r);   /* [1001, 1000 + &pt_num] */
        MCAT = "Main Category - " || compress(put(round(1 + (5 - 1) * rand("uniform")), best.));
        SCAT = "Sub Category - "  || compress(put(round(1 + (5 - 1) * rand("uniform")), best.));
        output;
    end;
run;

DATA PROCESSING

Creation of any statistical output always starts with a review of the specifications: we need to know which template is required and which data will be needed. Earlier we generated the input datasets, so now we can perform data processing. This step is common to all clinical outputs and starts with selecting an appropriate population. This population can be Safety, Intent-to-Treat, All Patients, or any specific/modified population. To generalize, we assume that the POPFL variable in the ASL dataset identifies the required population. The ANL dataset contains all available data, so we need to subset it to subjects from the specified population. In practice, we usually also need additional filtering of the ANL; a minimal illustrative sketch of such a filter follows.
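As an illustration only (this example is not from the original paper), such additional filtering might keep only treatment-emergent records. TRTEMFL below is a hypothetical flag that is not part of the dummy ANL dataset generated above.

** Sketch only: additional ANL filtering on a hypothetical          **;
** treatment-emergent flag (TRTEMFL is an assumption, not a         **;
** variable created in the dummy data above).                       **;
proc sql;
    create table anl_f as
        select a.*
        from &anl as a
        where a.trtemfl = "Y";   /* keep treatment-emergent records only */
quit;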

** Subset ASL dataset on population **;
proc sql;
    create table asl_p as
        select a.*
        from &asl as a
        where a.popfl = "Y";

** Subset ANL dataset **;
    create table anl_p as
        select a.*
              ,d.trtn
        from asl_p as d
        join &anl as a
            on a.usubjid = d.usubjid;
quit;

Now we can prepare the treatment group counts (the N/output denominators) and the labels which we will use for the output header.

** Get summary treatment information **;
proc sql;
    create table trt as
        select t.trtn
              ,t.trt
              ,count(distinct t.usubjid) as N                        /* N - denominator */
              ,trim(left(t.trt)) || " (N=" ||
               compress(put(calculated N, best.)) || ")" as LBL      /* treatment label */
        from asl_p as t
        group by t.trtn, t.trt
        order by t.trtn, t.trt;

** Save treatment labels into macro variables **;
    select LBL into :lbl1-:lbl%left(&tr_num)
    from trt;
quit;

OUTPUTS WITH A FIXED STRUCTURE

Fixed-structure outputs display summary information in a pre-defined format and are commonly used for safety analyses. A sample of such an output is presented in Table 1, where the value outside the brackets shows the number of subjects and the value inside the brackets displays the proportion (percentage of the total number of subjects in the treatment group).

Table 1: Sample of output with a fixed structure

                                                    Treatment 1   ...   Treatment T
Subjects completed the study                        X (XX.X%)           X (XX.X%)
Subjects discontinued the study                     X (XX.X%)           X (XX.X%)
Subjects who have at least one record in the
analysis dataset                                    X (XX.X%)           X (XX.X%)

The SQL approach presented below is based on the following steps:

1. Create the template (skeleton) dataset, with an ordering variable for the category.
2. Using the Cartesian product (cross join), extend the skeleton with the treatment group information and counts.
3. For each category, compute numbers and percentages for each treatment group.
4. Format values according to the template.
5. Go from a vertical structure to horizontal (transpose by treatment).

** Define skeleton dataset **;
proc sql;
    create table skel
        ( ID  num
         ,ORD num
         ,TXT char(100) );

** Fill out our skeleton dataset with data **;
    insert into skel values(1, 1, "Subjects completed the study");
    insert into skel values(1, 2, "Subjects discontinued study");
    insert into skel values(1, 3, "Subjects who have at least one record in the analysis dataset");

** Add treatment information to our skeleton **;
    create table part1 as
        select *
              ,. as Pcnt
        from trt cross join skel;

** Count those who completed the study **;
    update part1 as p
        set Pcnt = (select count(distinct t.usubjid)
                    from asl_p as t
                    where t.trtn = p.trtn and t.compfl = "Y")
        where ORD = 1;

** Count those who discontinued the study **;
    update part1 as p
        set Pcnt = (select count(distinct t.usubjid)
                    from asl_p as t
                    where t.trtn = p.trtn and t.discfl = "Y")
        where ORD = 2;

** Count those who have a record in the ANL dataset **;
    update part1 as p
        set Pcnt = (select count(distinct t.usubjid)
                    from anl_p as t
                    where t.trtn = p.trtn)
        where ORD = 3;

** Prepare result variable **;
    create table fin1 as
        select p.id
              ,p.ord
              ,p.txt
              ,p.trtn
              ,p.trt
              ,p.lbl
              ,case when p.pcnt ne 0
                    then put(p.pcnt, 6.) || " (" || put((p.pcnt/p.n)*100, 5.1) || "%)"
                    else put(p.pcnt, 6.)
               end as RES length=30
        from part1 as p
        order by p.id, p.ord, p.txt, p.trtn, p.trt;
quit;
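Before performing the final transpose in pure SQL, here is a minimal sketch (not taken from the paper) of how the per-treatment columns and their labels could instead be generated with a macro %do loop, so the code works for any number of treatment groups. The macro name trt_transpose and its parameters are illustrative assumptions.

** Sketch only: building the transposed treatment columns in a macro   **;
** %do loop, so the code is independent of the number of treatments.   **;
** The macro name and parameters are illustrative assumptions.         **;
%macro trt_transpose(in=, out=);
    %local i;
    proc sql;
        create table &out as
            select f.id
                  ,f.ord
                  ,f.txt as TXT1
                  ,"" as TXT2 length=100
                  %do i = 1 %to &tr_num;
                      ,max(case when f.trtn = &i then f.res else "" end)
                          as _&i label="&&lbl&i"
                  %end;
            from &in as f
            group by f.id, f.ord, f.txt
            order by f.id, f.ord, f.txt;
    quit;
%mend trt_transpose;

%trt_transpose(in=fin1, out=final1_m);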

Assigning the treatment labels can also be performed in a macro loop, as sketched above, which allows more flexible code; for simplicity, however, we will maintain pure SQL here.

** Transpose dataset by treatment **;
proc sql;
    create table final1 as
        select f.id
              ,f.ord
              ,f.txt as TXT1
              ,"" as TXT2 length=100
              ,max(case when TRTN = 1 then RES else "" end) as _1 label="&lbl1"
              ,max(case when TRTN = 2 then RES else "" end) as _2 label="&lbl2"
              ,max(case when TRTN = 3 then RES else "" end) as _3 label="&lbl3"
        from fin1 as f
        group by f.id, f.ord, f.txt
        order by f.id, f.ord, f.txt;
quit;

Using the skeleton-update method keeps the code simple and maintainable, and it can easily be extended if we want to add more categories. Keeping the vertical structure also means the code continues to work when a study has more than 3 treatment groups.

Another way to perform the transpose is to apply the PROC TRANSPOSE procedure. In this case we do not need to define and keep the treatment labels in dedicated macro variables, as they are obtained directly from the LBL variable.

proc transpose data=fin1 out=tfinal1(drop=_name_);
    by ID TXT ORD;
    var RES;
    id TRTN;
    idlabel LBL;
run;

OUTPUTS WITH DESCRIPTIVE STATISTICS

Such outputs are very common in statistical analysis. They appear in many variations, which calls for a more complicated template for demonstration. In the ASL dataset we generated two pairs of variables (a numeric variable and its character equivalent), and we will calculate statistics of the NUM_VAR1 variable across each value of the CHR_VAR2 variable. A sample of this output is presented in Table 2.

Table 2: Sample of output with descriptive statistics

                          Treatment 1      ...   Treatment T
<CHR_VAR2 value X>
  N                       XX                     XX
  Mean                    XX.X                   XX.X
  SD                      XX.XX                  XX.XX
  Median                  XX.X                   XX.X
  Min - Max               XX.X - XX.X            XX.X - XX.X

The SQL approach applied here contains the following steps:

1. Using summary functions such as MEAN, STD, MIN, and MAX under aggregation, we can calculate the required statistics. For MEDIAN we need a special subquery, as it is a row function in SAS 9.3.
2. Go from a horizontal structure to vertical (transpose the statistics), additionally generating an ordering variable, formatting values, and adding treatment information.
3. Go from a vertical structure to horizontal (transpose by treatment).

** Calculate requested statistics **;
** MEDIAN is a row function in SAS 9.3, so we need to apply a trick **;
proc sql;
    create table stat1 as
        select t.trtn
              ,t.trt
              ,t.chr_var2
              ,count(*) as n
              ,mean(t.num_var1) as MEAN
              ,std(t.num_var1)  as STD
              ,min(t.num_var1)  as MIN
              ,max(t.num_var1)  as MAX
              ,(select avg(m.num_var1)
                from (select a.num_var1
                      from asl_p as a cross join asl_p as b
                      where a.trtn = t.trtn and b.trtn = t.trtn
                        and a.chr_var2 = t.chr_var2 and b.chr_var2 = t.chr_var2
                      group by a.num_var1
                      having sum(case when a.num_var1 = b.num_var1 then 1 else 0 end)
                             >= abs(sum(sign(a.num_var1 - b.num_var1)))
                     ) as m
               ) as MEDIAN
        from asl_p as t
        group by t.trtn, t.trt, t.chr_var2
        order by t.trtn, t.trt, t.chr_var2;
quit;

Descriptive statistics can also be calculated with the PROC MEANS procedure on sorted data.

proc sort data=asl_p out=asl_ps;
    by TRTN TRT CHR_VAR2;
run;

proc means data=asl_ps noprint;
    by TRTN TRT CHR_VAR2;
    var NUM_VAR1;
    output out = stat2 (drop = _type_ _freq_)
        n = n mean = mean min = min max = max median = median std = std;
run;

The resulting statistics can be checked with the PROC COMPARE procedure.

proc compare base = stat1 compare = stat2
    method = absolute criterion = 0.0000001
    %str(warn)ings listall maxprint = (200, 4000);
run;

Now, in one SQL script, we perform several operations to prepare the data for the treatment transpose. For this we use one big subquery to construct a vertical structure from our horizontal statistics. Here we also need to take care of the ordering of the newly created items (the ORD variable is added for this purpose) and to format the resulting statistics (the RES variable holds the result value). Once this subquery is done, we can add the treatment information.

** Go from horizontal structure to vertical **;
** Add treatment information **;
** Prepare result variable **;
proc sql;
    create table fin2 as
        select d.*
              ,t.lbl
        from (select 2 as ID, 1 as ORD
                    ,s.CHR_VAR2 as TXT1 length = 100
                    ,"n" as TXT2 length = 100
                    ,s.TRTN, s.TRT
                    ,put(s.n, 5.) as RES length = 30
              from stat1 as s
              union
              select 2 as ID, 2 as ORD
                    ,s.CHR_VAR2 as TXT1 length = 100
                    ,"Mean" as TXT2 length = 100
                    ,s.TRTN, s.TRT
                    ,put(s.mean, 7.1) as RES length = 30
              from stat1 as s
              union
              select 2 as ID, 3 as ORD
                    ,s.CHR_VAR2 as TXT1 length = 100
                    ,"SD" as TXT2 length = 100
                    ,s.TRTN, s.TRT
                    ,put(s.std, 8.2) as RES length = 30
              from stat1 as s
              union
              select 2 as ID, 4 as ORD
                    ,s.CHR_VAR2 as TXT1 length = 100
                    ,"Median" as TXT2 length = 100
                    ,s.TRTN, s.TRT
                    ,put(s.median, 7.1) as RES length = 30
              from stat1 as s
              union
              select 2 as ID, 5 as ORD
                    ,s.CHR_VAR2 as TXT1 length = 100
                    ,"Min - Max" as TXT2 length = 100
                    ,s.TRTN, s.TRT
                    ,compress(put(s.min, 7.1)) || " - " || compress(put(s.max, 7.1)) as RES length = 30
              from stat1 as s
             ) as d
        join trt as t
            on t.trtn = d.trtn
        order by d.id, d.txt1, d.ord, d.trtn, d.trt;
quit;

Similar to what we did for the fixed-structure output, we perform a final transpose by treatment.

** Transpose dataset by treatment **;
proc sql;
    create table final2 as
        select f.id
              ,f.ord
              ,case when f.ord = 1 then f.txt1 else "" end as TXT1
              ,f.TXT2
              ,max(case when TRTN = 1 then RES else "" end) as _1 label="&lbl1"
              ,max(case when TRTN = 2 then RES else "" end) as _2 label="&lbl2"
              ,max(case when TRTN = 3 then RES else "" end) as _3 label="&lbl3"
        from fin2 as f
        group by f.id, f.txt1, f.ord, f.txt2
        order by f.id, f.txt1, f.ord, f.txt2;
quit;

OUTPUTS WITH A FLEXIBLE 2-LEVEL HIERARCHY

Flexible 2-level hierarchy outputs are also very common in statistical analysis. As an example, take an Adverse Events output, where the 1st level is the System Organ Class (SOC) variable and the 2nd level is the related Preferred Term (PT). This output is flexible because the list of displayed SOCs and PTs is unknown at the beginning and can change whenever the data change. For illustration, in the ANL dataset we created two dependent variables, the Main category (MCAT) and the Sub category (SCAT), which are generic equivalents of SOC and PT. A sample of this output is presented in Table 3, where the number in brackets shows the number of records (events) defined by the MCAT/SCAT values and the number outside the brackets displays the number of unique subjects.

Table 3: Sample of output with a flexible 2-level hierarchy

Main category (MCAT)   Sub category (SCAT)    Treatment 1   ...   Treatment T
<MCAT value M1>                               XX ( XX)            XX ( XX)
                       <SCAT value M1S1>      XX ( XX)            XX ( XX)
                       <SCAT value M1S...>    XX ( XX)            XX ( XX)
                       <SCAT value M1SK>      XX ( XX)            XX ( XX)
...
<MCAT value MX>                               XX ( XX)            XX ( XX)
                       <SCAT value MXS1>      XX ( XX)            XX ( XX)
                       <SCAT value MXS...>    XX ( XX)            XX ( XX)
                       <SCAT value MXSN>      XX ( XX)            XX ( XX)

The SQL approach applied here contains the following steps:

1. Using two subqueries as input tables, get all of the required data in a vertical view, where the first subquery constructs the skeleton of the 2-level structure (the MCAT/SCAT combinations present in the data) and the second subquery prepares the subject and event counts.
2. Go from a vertical structure to horizontal (transpose by treatment).

The Pcnt variable holds the subject counts, and for the event counts we create an Ecnt variable. Obviously, the Pcnt value (number of unique subjects) cannot be greater than the Ecnt value (number of events) for any MCAT/SCAT combination. The ordering variable ORD is set to missing because we want the items sorted alphabetically by the MCAT and SCAT variables. The method applied here can also easily be extended to 3-level hierarchy outputs, for example when a Severity variable is added to an existing AE output; a sketch of such an extension follows the code below.

** Prepare skeleton and counts using subqueries **;
proc sql;
    create table fin3 as
        select b.*
              ,3 as ID
              ,case when ~missing(t.pcnt) then t.pcnt else 0 end as Pcnt
              ,case when ~missing(t.ecnt) then t.ecnt else 0 end as Ecnt
              ,case when ~missing(b.scat) then "" else b.mcat end as TXT1 length = 100
              ,b.SCAT as TXT2 length = 100
              ,put(calculated Pcnt, 4.) || " (" || put(calculated Ecnt, 4.) || ")" as RES length=30
        from (select *
              from trt cross join
                   (select distinct MCAT, "" as SCAT from anl_p)
              union
              select *
              from trt cross join
                   (select distinct MCAT, SCAT from anl_p)
             ) as b
        left join
             (select p.trtn
                    ,p.mcat
                    ,"" as SCAT
                    ,count(distinct p.usubjid) as Pcnt
                    ,count(*) as Ecnt
              from anl_p as p
              group by p.trtn, p.mcat
              union
              select p.trtn
                    ,p.mcat
                    ,p.scat
                    ,count(distinct p.usubjid) as Pcnt
                    ,count(*) as Ecnt
              from anl_p as p
              group by p.trtn, p.mcat, p.scat
             ) as t
            on b.trtn = t.trtn and b.mcat = t.mcat and b.scat = t.scat
        order by b.trtn, b.mcat, b.scat;

** Transpose dataset by treatment **;
    create table final3(drop = MCAT SCAT) as
        select f.id
              ,. as ORD
              ,f.MCAT
              ,f.SCAT
              ,f.TXT1
              ,f.TXT2
              ,max(case when TRTN = 1 then RES else "" end) as _1 label="&lbl1"
              ,max(case when TRTN = 2 then RES else "" end) as _2 label="&lbl2"
              ,max(case when TRTN = 3 then RES else "" end) as _3 label="&lbl3"
        from fin3 as f
        group by f.id, f.mcat, f.scat, f.txt1, f.txt2
        order by f.id, f.mcat, f.scat, f.txt1, f.txt2;
quit;
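As mentioned above, the same pattern extends to a 3-level hierarchy. The sketch below (not from the paper) shows only the skeleton part and assumes a hypothetical severity variable SEV in the ANL data; the counting subquery would gain a matching third union branch grouped by TRTN, MCAT, SCAT, and SEV.

** Sketch only: skeleton for a 3-level hierarchy (MCAT / SCAT / SEV).   **;
** SEV is a hypothetical severity variable that does not exist in the   **;
** dummy ANL dataset generated in this paper.                           **;
proc sql;
    create table skel3 as
        select * from trt cross join
            (select distinct MCAT, "" as SCAT, "" as SEV from anl_p)
        union
        select * from trt cross join
            (select distinct MCAT, SCAT, "" as SEV from anl_p)
        union
        select * from trt cross join
            (select distinct MCAT, SCAT, SEV from anl_p);
quit;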

CONCLUSION

A huge number of statistical analysis outputs are based on these three types of templates. This paper demonstrates that PROC SQL is a powerful tool for creating a wide variety of outputs. However, as the illustrated examples show, SQL code can become more complicated when multiple nested subqueries are used; in that case you may need to split your queries into smaller parts or use a solution outside of SQL. PROC SQL can often be a good alternative to the SAS DATA step and other SAS procedures.

REFERENCES

SAS 9.4 SQL Procedure User's Guide. SAS Institute Inc., 2013. Available at:
http://support.sas.com/documentation/cdl/en/sqlproc/65065/pdf/default/sqlproc.pdf

Chao Huang. "Top 10 SQL Tricks in SAS." Available at:
http://support.sas.com/resources/papers/proceedings14/1561-2014.pdf

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Name:            Andrii Stakhniv
Enterprise:      Experis Clinical
Address:         23 Baggoutovskaya street
City, State ZIP: Kyiv 04107, Ukraine
Work Phone:      +1 407 512 10 06 ext. 2407
E-mail:          andrii.stakhniv@intego-group.com

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.