
A Macro To Perform A T-Test For 2 Independent Samples Using Sufficient Statistics

Lan-Feng Tsai, Edwards Lifesciences LLC, Irvine, California

Abstract

The T-test is a commonly used statistical test to compare the mean of one sample to a predetermined value, the means of paired samples, or the means of 2 independent samples. The test statistic for the T-test is based on the sample means, sample standard deviations, and sample sizes. Therefore, if only the summary statistics are known and the raw data are unavailable, the result of the T-test can still be calculated. Because SAS procedures generally expect raw data to perform a T-test, a SAS macro to perform a T-test for 2 independent samples using sufficient statistics is proposed. The advantages of performing a T-test using sufficient statistics are also discussed.

Introduction

A T-test can be performed using only summary statistics because the summary statistics are sufficient, consistent, and unbiased estimators for its normal model. The definition of a sufficient statistic is given as follows (Rice 1995):

A statistic $T(X_1, \ldots, X_n)$ is said to be sufficient for $\theta$ if the conditional distribution of $X_1, \ldots, X_n$, given $T = t$, does not depend on $\theta$ for any value of $t$.

Calculation

The purpose of this macro is to perform a T-test for 2 independent sample means using sufficient statistics (summary statistics). The theoretical details can be found in statistical textbooks (Arnold 1990, Rice 1995) and will not be discussed here. The following formulas are used to calculate the result of the T-test. Here $n$, $\bar{x}$, and $s$ denote the sample size, sample mean, and sample standard deviation of a group (subscripts 1 and 2 identify the groups), $\alpha$ is the alpha level, and $t_{p,\nu}$ and $\chi^2_{p,\nu}$ are the $p$-th quantiles of the t and chi-square distributions with $\nu$ degrees of freedom.

Confidence interval for sample mean: $\bar{x} \pm t_{1-\alpha/2,\,n-1}\, s/\sqrt{n}$

Upper and lower confidence interval for sample standard deviation: $\left[\sqrt{\dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}},\ \sqrt{\dfrac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}}}\right]$

Sample standard error: $s/\sqrt{n}$

Sample mean difference: $\bar{x}_1 - \bar{x}_2$

Pooled sample standard deviation (or standard deviation for sample mean difference): $s_p = \sqrt{\dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$

Pooled sample standard error: $s_p\sqrt{1/n_1 + 1/n_2}$

Confidence interval for sample mean difference: $(\bar{x}_1 - \bar{x}_2) \pm t_{1-\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{1/n_1 + 1/n_2}$
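The estimation quantities listed so far need nothing beyond the three summary statistics per group, which a short sketch can make concrete. The Python below is only an illustration in a second language, not part of the paper's SAS macro; the function name `summary_quantities` and the input values are hypothetical, and the confidence intervals are left out because they need t and chi-square quantiles that the Python standard library does not provide.

```python
import math


def summary_quantities(n1, m1, s1, n2, m2, s2):
    """Estimation quantities for two independent samples.

    n1, n2: sample sizes; m1, m2: sample means;
    s1, s2: sample standard deviations (not standard errors).
    The function name and dictionary keys are illustrative only.
    """
    se1 = s1 / math.sqrt(n1)  # sample standard error, group 1
    se2 = s2 / math.sqrt(n2)  # sample standard error, group 2
    diff = m1 - m2            # sample mean difference
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    pooled_sd = math.sqrt(pooled_var)
    pooled_se = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    return {
        "se1": se1,
        "se2": se2,
        "diff": diff,
        "pooled_sd": pooled_sd,
        "pooled_se": pooled_se,
    }


# Made-up summary statistics, chosen only to keep the arithmetic simple.
q = summary_quantities(n1=10, m1=5.0, s1=2.0, n2=10, m2=3.0, s2=2.0)
for name, value in q.items():
    print(f"{name:>10}: {value:.4f}")
```

With equal standard deviations in both groups, the pooled standard deviation equals the common value, which gives a quick sanity check on any implementation.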

Confidence interval for pooled sample standard deviation: $\left[\sqrt{\dfrac{(n_1+n_2-2)s_p^2}{\chi^2_{1-\alpha/2,\,n_1+n_2-2}}},\ \sqrt{\dfrac{(n_1+n_2-2)s_p^2}{\chi^2_{\alpha/2,\,n_1+n_2-2}}}\right]$

F value for folded F statistic: $f = \dfrac{\max(s_1^2,\, s_2^2)}{\min(s_1^2,\, s_2^2)}$

Degrees of freedom for folded F statistic: $df_f = \big(\max(n_1-1,\, n_2-1),\ \min(n_1-1,\, n_2-1)\big)$

P-value for folded F statistic: $2\,\big[1 - P(F_{\max(n_1-1,\,n_2-1),\ \min(n_1-1,\,n_2-1)} \le f)\big]$

Degrees of freedom for equal variances: $df_{eq} = n_1 + n_2 - 2$

Test statistic for equal variances: $t_{eq} = \dfrac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{1/n_1 + 1/n_2}}$

P-value for equal variances: $2\,\big[1 - P(t_{n_1+n_2-2} \le |t_{eq}|)\big]$

Degrees of freedom for unequal variances: $df_{uneq} = \dfrac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1-1} + \dfrac{(s_2^2/n_2)^2}{n_2-1}}$

Test statistic for unequal variances: $t_{uneq} = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$

P-value for unequal variances: $2\,\big[1 - P(t_{df_{uneq}} \le |t_{uneq}|)\big]$

In the P-value formulas, $t_\nu$ and $F_{\nu_1,\nu_2}$ denote random variables following the t and F distributions with the indicated degrees of freedom.

Discussion

Normality assumption

One of the assumptions of the T-test is normality, and this assumption cannot be examined without the raw data. Therefore, one should keep in mind that the normality assumption might not be valid when comparing 2 independent sample means using summary statistics in a T-test.

Advantages of performing T-tests using sufficient statistics

Sometimes, statisticians obtain only summary statistics from clients, published literature, or other sources. T-tests can still be performed, keeping in mind the potential violation of the normality assumption. For example, we can compare the result of our product with the results of competitors' products from published journals, companies' websites, or advertisements. This macro can also be a handy tool when we would like to do a quick comparison of 2 sample means or to validate the results of T-tests from other statistical software.

The Macro

The PROC TTEST in Version 8 (Appendix 1) cannot perform a T-test using just summary statistics without a _STAT_ variable. Therefore, a _STAT_ variable must be created along with the summary statistics that are to be entered in the macro. The macro parameters MIN and MAX for both groups and the alpha level are not required. However, the numeric missing value "." needs to be entered to avoid confusion if the MINs and MAXs are not used. The confidence intervals for the standard deviations of the groups are not calculated in the PROC TTEST. However, they can, in fact, be calculated using the formula provided above. This is expected to be solved in SAS Version 9. This macro creates a separate text file containing the confidence intervals for the standard deviations of the groups. A macro (Appendix 2) to perform a T-test using summary statistics in SAS Version 6 is also shown.

Conclusion

One-sample T-tests and paired-sample T-tests can also be performed using only summary statistics. More macros for such T-tests will be developed using summary statistics in the future.

Acknowledgement

The author would like to thank William Anderson PhD, Rita Kristy, Brian Ramos, and Felicia Ho for their generous comments.

Reference

Arnold, S. F., Mathematical Statistics (1990), Prentice-Hall, Inc., p.366, p.373.
Rice, J. A., Mathematical Statistics and Data Analysis, Second Edition (1995), Duxbury Press, p.280, p.388.
SAS/STAT User's Guide, Version 8 (1999), SAS Institute Inc.

Contact Information

Lan-Feng Tsai
One Edwards Way
Irvine, CA
lan_feng_tsai@edwards.com

SAS is a registered trademark of SAS Institute Inc., Cary, NC, USA.

Appendix

1. Version 8 SAS code:

    *------------------------------------------------------------;
    * ttest8 macro: Perform V8 proc ttest using summary statistics;
    * Position parameters;
    *   g1: sample name of group 1;
    *   n1: sample size of group 1;
    *   m1: sample mean of group 1;
    *   s1: sample standard deviation of group 1;
    *   i1: sample minimum of group 1;
    *   x1: sample maximum of group 1;
    *   g2: sample name of group 2;
    *   n2: sample size of group 2;
    *   m2: sample mean of group 2;
    *   s2: sample standard deviation of group 2;
    *   i2: sample minimum of group 2;
    *   x2: sample maximum of group 2;
    *   alpha: alpha level (default is 0.05);
    * Note: i1, x1, i2, x2 are not required, enter values or . for missing;
    * Written by Lan-Feng Tsai;
    *------------------------------------------------------------;
    %macro ttest8(g1, n1, m1, s1, i1, x1, g2, n2, m2, s2, i2, x2, alpha);

    /* One row per group and statistic, in the layout PROC TTEST reads */
    data sumstat;
      %let len=%sysfunc(max(%length(&g1), %length(&g2)));
      length group $&len. _stat_ $4.;
      %do i=1 %to 2;
        group="&&g&i"; sumstat=&&n&i; _stat_="n";    output;
        group="&&g&i"; sumstat=&&m&i; _stat_="mean"; output;
        group="&&g&i"; sumstat=&&s&i; _stat_="std";  output;
        group="&&g&i"; sumstat=&&i&i; _stat_="min";  output;
        group="&&g&i"; sumstat=&&x&i; _stat_="max";  output;
      %end;
    run;

    proc print;
    run;

    proc ttest
      %if &alpha ne %then %do; alpha=&alpha %end;
      ;
      class group;
      var sumstat;
    run;

    /* Confidence limits for each group's standard deviation */
    data _null_;
      file "ttestmacro_v8.txt";
      g1="&g1"; n1=&n1; s1=&s1;
      g2="&g2"; n2=&n2; s2=&s2;
      %if &alpha ne %then %do;
        lcl1=sqrt(((n1-1)*s1**2)/cinv((1-&alpha/2), n1-1));
        ucl1=sqrt(((n1-1)*s1**2)/cinv(&alpha/2, n1-1));
        lcl2=sqrt(((n2-1)*s2**2)/cinv((1-&alpha/2), n2-1));
        ucl2=sqrt(((n2-1)*s2**2)/cinv(&alpha/2, n2-1));
      %end;
      %else %do;
        lcl1=sqrt(((n1-1)*s1**2)/cinv(0.975, n1-1));
        ucl1=sqrt(((n1-1)*s1**2)/cinv(0.025, n1-1));
        lcl2=sqrt(((n2-1)*s2**2)/cinv(0.975, n2-1));
        ucl2=sqrt(((n2-1)*s2**2)/cinv(0.025, n2-1));
      %end;
      /* Assumed layout: only the fragment "@21 LCL UCL ... ucl2" of the
         original PUT statement is legible */
      put @1 "Group" @21 "LCL" @35 "UCL"
        / @1 g1 @21 lcl1 @35 ucl1
        / @1 g2 @21 lcl2 @35 ucl2;
    run;

    %mend ttest8;

2. Version 6 SAS code:

    *------------------------------------------------------------;
    * ttest6 macro: gives similar output as proc ttest V6 using;
    *               sufficient statistics;
    * Position parameters;
    *   mg1: name of group 1;
    *   mn1: sample size of group 1;
    *   mm1: sample mean of group 1;
    *   ms1: sample standard deviation of group 1;
    *   mg2: name of group 2;
    *   mn2: sample size of group 2;
    *   mm2: sample mean of group 2;
    *   ms2: sample standard deviation of group 2;
    * Note: specify output file in a FILENAME statement;
    * Written by Lan-Feng Tsai;
    *------------------------------------------------------------;
    %macro ttest6(mg1, mn1, mm1, ms1, mg2, mn2, mm2, ms2);

    data _null_;
      file "ttestmacro_V6.txt";
      attrib g1 g2 format=$8.
             n1 n2 format=8.
             m1 m2 s1 s2 t_uneq t_eq f format=8.2
             df_uneq df_eq format=8.1
             f_p t_p_uneq t_p_eq format=8.4;
      g1="&mg1"; n1=&mn1; m1=&mm1; s1=&ms1;
      g2="&mg2"; n2=&mn2; m2=&mm2; s2=&ms2;

      /* Folded F test for equality of variances */
      v1=s1**2; v2=s2**2;
      f=max(of v1, v2)/min(of v1, v2);
      df1=n1-1; df2=n2-1;
      dfmax=max(of df1, df2); dfmin=min(of df1, df2);
      f_p=2*(1-probf(f, dfmax, dfmin)); /* 2-sided */

      /* T statistics and degrees of freedom */
      v_pool=((n1-1)*v1+(n2-1)*v2)/(n1+n2-2);
      t_uneq=(m1-m2)/sqrt(v1/n1+v2/n2);
      t_eq=(m1-m2)/sqrt(v_pool*(1/n1+1/n2));
      df_uneq=(v1/n1+v2/n2)**2/((v1/n1)**2/(n1-1)+(v2/n2)**2/(n2-1));
      df_eq=n1+n2-2;
      t_p_uneq=2*(1-probt(abs(t_uneq), df_uneq));
      t_p_eq=2*(1-probt(abs(t_eq), df_eq));

      /* Assumed layout: the original PUT statements are only partly
         legible; the final line follows the surviving fragment */
      put @1 "Group" @12 "N" @22 "Mean" @32 "Std Dev"
        / @1 g1 @12 n1 @22 m1 @32 s1
        / @1 g2 @12 n2 @22 m2 @32 s2;
      put / @1 "Variances" @12 "T" @22 "DF" @32 "Prob>|T|"
        / @1 "Unequal" @12 t_uneq @22 df_uneq @32 t_p_uneq
        / @1 "Equal" @12 t_eq @22 df_eq @32 t_p_eq;
      put / "For H0: Variances are equal, F' = " f +5
            "DF = (" dfmax +(-1) ", " dfmin +(-1) ")" +5
            "Prob>F' = " f_p;
    run;

    %mend ttest6;
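Since the paper suggests using the macro to validate T-test results against other software, the test statistics computed by ttest6 can be cross-checked with a few lines in another language. The Python sketch below mirrors the macro's arithmetic for the folded F statistic, the pooled and unpooled t statistics, and the Satterthwaite degrees of freedom. It is an illustration under stated assumptions, not the author's code: the function name `two_sample_t` and the inputs are hypothetical, and the P-values are omitted because the Python standard library has no t or F distribution functions (a statistics library would be needed for those).

```python
import math


def two_sample_t(n1, m1, s1, n2, m2, s2):
    """Test statistics for two independent samples from summary statistics.

    s1 and s2 are sample standard deviations. Names are illustrative;
    the arithmetic follows the ttest6 macro step by step.
    """
    v1, v2 = s1**2, s2**2

    # Folded F statistic: larger variance over smaller variance.
    f = max(v1, v2) / min(v1, v2)
    # Degrees of freedom ordered as in the macro (max, min).
    df_f = (max(n1 - 1, n2 - 1), min(n1 - 1, n2 - 1))

    # Equal-variance (pooled) t statistic.
    v_pool = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t_eq = (m1 - m2) / math.sqrt(v_pool * (1 / n1 + 1 / n2))
    df_eq = n1 + n2 - 2

    # Unequal-variance t statistic with Satterthwaite degrees of freedom.
    t_uneq = (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)
    df_uneq = (v1 / n1 + v2 / n2) ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
    )

    return {
        "f": f,
        "df_f": df_f,
        "t_eq": t_eq,
        "df_eq": df_eq,
        "t_uneq": t_uneq,
        "df_uneq": df_uneq,
    }


# Made-up summary statistics for two groups of different size and spread.
result = two_sample_t(n1=12, m1=10.0, s1=3.0, n2=8, m2=8.0, s2=1.5)
for name, value in result.items():
    print(name, value)
```

Two properties are useful when comparing outputs: with equal sample sizes and equal standard deviations the two t statistics coincide, and the Satterthwaite degrees of freedom always fall between min(n1-1, n2-1) and n1+n2-2.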

EXST3201 Mousefeed01 Page 1

EXST3201 Mousefeed01 Page 1 EXST3201 Mousefeed01 Page 1 3 /* 4 Examine differences among the following 6 treatments 5 N/N85 fed normally before weaning and 85 kcal/wk after 6 N/R40 fed normally before weaning and 40 kcal/wk after

More information

Cut Out The Cut And Paste: SAS Macros For Presenting Statistical Output ABSTRACT INTRODUCTION

Cut Out The Cut And Paste: SAS Macros For Presenting Statistical Output ABSTRACT INTRODUCTION Cut Out The Cut And Paste: SAS Macros For Presenting Statistical Output Myungshin Oh, UCLA Department of Biostatistics Mel Widawski, UCLA School of Nursing ABSTRACT We, as statisticians, often spend more

More information

Frequencies, Unequal Variance Weights, and Sampling Weights: Similarities and Differences in SAS

Frequencies, Unequal Variance Weights, and Sampling Weights: Similarities and Differences in SAS ABSTRACT Paper 1938-2018 Frequencies, Unequal Variance Weights, and Sampling Weights: Similarities and Differences in SAS Robert M. Lucas, Robert M. Lucas Consulting, Fort Collins, CO, USA There is confusion

More information

Heteroskedasticity and Homoskedasticity, and Homoskedasticity-Only Standard Errors

Heteroskedasticity and Homoskedasticity, and Homoskedasticity-Only Standard Errors Heteroskedasticity and Homoskedasticity, and Homoskedasticity-Only Standard Errors (Section 5.4) What? Consequences of homoskedasticity Implication for computing standard errors What do these two terms

More information

PLS205 Lab 1 January 9, Laboratory Topics 1 & 2

PLS205 Lab 1 January 9, Laboratory Topics 1 & 2 PLS205 Lab 1 January 9, 2014 Laboratory Topics 1 & 2 Welcome, introduction, logistics, and organizational matters Introduction to SAS Writing and running programs saving results checking for errors Different

More information

Want to Do a Better Job? - Select Appropriate Statistical Analysis in Healthcare Research

Want to Do a Better Job? - Select Appropriate Statistical Analysis in Healthcare Research Want to Do a Better Job? - Select Appropriate Statistical Analysis in Healthcare Research Liping Huang, Center for Home Care Policy and Research, Visiting Nurse Service of New York, NY, NY ABSTRACT The

More information

Soci Statistics for Sociologists

Soci Statistics for Sociologists University of North Carolina Chapel Hill Soci708-001 Statistics for Sociologists Fall 2009 Professor François Nielsen Stata Commands for Module 7 Inference for Distributions For further information on

More information

Defining Test Data Using Population Analysis Clarence Wm. Jackson, CQA - City of Dallas CIS

Defining Test Data Using Population Analysis Clarence Wm. Jackson, CQA - City of Dallas CIS Defining Test Data Using Population Analysis Clarence Wm. Jackson, CQA - City of Dallas CIS Abstract Defining test data that provides complete test case coverage requires the tester to accumulate data

More information

Intermediate SAS: Statistics

Intermediate SAS: Statistics Intermediate SAS: Statistics OIT TSS 293-4444 oithelp@mail.wvu.edu oit.wvu.edu/training/classmat/sas/ Table of Contents Procedures... 2 Two-sample t-test:... 2 Paired differences t-test:... 2 Chi Square

More information

Cluster Randomization Create Cluster Means Dataset

Cluster Randomization Create Cluster Means Dataset Chapter 270 Cluster Randomization Create Cluster Means Dataset Introduction A cluster randomization trial occurs when whole groups or clusters of individuals are treated together. Examples of such clusters

More information

Stata versions 12 & 13 Week 4 Practice Problems

Stata versions 12 & 13 Week 4 Practice Problems Stata versions 12 & 13 Week 4 Practice Problems SOLUTIONS 1 Practice Screen Capture a Create a word document Name it using the convention lastname_lab1docx (eg bigelow_lab1docx) b Using your browser, go

More information

Z-TEST / Z-STATISTIC: used to test hypotheses about. µ when the population standard deviation is unknown

Z-TEST / Z-STATISTIC: used to test hypotheses about. µ when the population standard deviation is unknown Z-TEST / Z-STATISTIC: used to test hypotheses about µ when the population standard deviation is known and population distribution is normal or sample size is large T-TEST / T-STATISTIC: used to test hypotheses

More information

Simulating Multivariate Normal Data

Simulating Multivariate Normal Data Simulating Multivariate Normal Data You have a population correlation matrix and wish to simulate a set of data randomly sampled from a population with that structure. I shall present here code and examples

More information

Chapters 5-6: Statistical Inference Methods

Chapters 5-6: Statistical Inference Methods Chapters 5-6: Statistical Inference Methods Chapter 5: Estimation (of population parameters) Ex. Based on GSS data, we re 95% confident that the population mean of the variable LONELY (no. of days in past

More information

Paper CC-016. METHODOLOGY Suppose the data structure with m missing values for the row indices i=n-m+1,,n can be re-expressed by

Paper CC-016. METHODOLOGY Suppose the data structure with m missing values for the row indices i=n-m+1,,n can be re-expressed by Paper CC-016 A macro for nearest neighbor Lung-Chang Chien, University of North Carolina at Chapel Hill, Chapel Hill, NC Mark Weaver, Family Health International, Research Triangle Park, NC ABSTRACT SAS

More information

Data Analysis and Hypothesis Testing Using the Python ecosystem

Data Analysis and Hypothesis Testing Using the Python ecosystem ARISTOTLE UNIVERSITY OF THESSALONIKI Data Analysis and Hypothesis Testing Using the Python ecosystem t-test & ANOVAs Stavros Demetriadis Assc. Prof., School of Informatics, Aristotle University of Thessaloniki

More information

Introduction to Statistical Analyses in SAS

Introduction to Statistical Analyses in SAS Introduction to Statistical Analyses in SAS Programming Workshop Presented by the Applied Statistics Lab Sarah Janse April 5, 2017 1 Introduction Today we will go over some basic statistical analyses in

More information

Section 2.3: Simple Linear Regression: Predictions and Inference

Section 2.3: Simple Linear Regression: Predictions and Inference Section 2.3: Simple Linear Regression: Predictions and Inference Jared S. Murray The University of Texas at Austin McCombs School of Business Suggested reading: OpenIntro Statistics, Chapter 7.4 1 Simple

More information

Confidence Intervals: Estimators

Confidence Intervals: Estimators Confidence Intervals: Estimators Point Estimate: a specific value at estimates a parameter e.g., best estimator of e population mean ( ) is a sample mean problem is at ere is no way to determine how close

More information

Stat 302 Statistical Software and Its Applications SAS: Data I/O & Descriptive Statistics

Stat 302 Statistical Software and Its Applications SAS: Data I/O & Descriptive Statistics Stat 302 Statistical Software and Its Applications SAS: Data I/O & Descriptive Statistics Fritz Scholz Department of Statistics, University of Washington Winter Quarter 2015 February 19, 2015 2 Getting

More information

A Macro Application on Confidence Intervals for Binominal Proportion

A Macro Application on Confidence Intervals for Binominal Proportion A SAS @ Macro Application on Confidence Intervals for Binominal Proportion Kaijun Zhang Sheng Zhang ABSTRACT: FMD K&L Inc., Fort Washington, Pennsylvanian Confidence Intervals (CI) are very important to

More information

Statistics and Data Analysis. Common Pitfalls in SAS Statistical Analysis Macros in a Mass Production Environment

Statistics and Data Analysis. Common Pitfalls in SAS Statistical Analysis Macros in a Mass Production Environment Common Pitfalls in SAS Statistical Analysis Macros in a Mass Production Environment Huei-Ling Chen, Merck & Co., Inc., Rahway, NJ Aiming Yang, Merck & Co., Inc., Rahway, NJ ABSTRACT Four pitfalls are commonly

More information

Excel Tips and FAQs - MS 2010

Excel Tips and FAQs - MS 2010 BIOL 211D Excel Tips and FAQs - MS 2010 Remember to save frequently! Part I. Managing and Summarizing Data NOTE IN EXCEL 2010, THERE ARE A NUMBER OF WAYS TO DO THE CORRECT THING! FAQ1: How do I sort my

More information

SAS Graphs in Small Multiples Andrea Wainwright-Zimmerman, Capital One, Richmond, VA

SAS Graphs in Small Multiples Andrea Wainwright-Zimmerman, Capital One, Richmond, VA Paper SIB-113 SAS Graphs in Small Multiples Andrea Wainwright-Zimmerman, Capital One, Richmond, VA ABSTRACT Edward Tufte has championed the idea of using "small multiples" as an effective way to present

More information

E-Campus Inferential Statistics - Part 2

E-Campus Inferential Statistics - Part 2 E-Campus Inferential Statistics - Part 2 Group Members: James Jones Question 4-Isthere a significant difference in the mean prices of the stores? New Textbook Prices New Price Descriptives 95% Confidence

More information

Correcting for natural time lag bias in non-participants in pre-post intervention evaluation studies

Correcting for natural time lag bias in non-participants in pre-post intervention evaluation studies Correcting for natural time lag bias in non-participants in pre-post intervention evaluation studies Gandhi R Bhattarai PhD, OptumInsight, Rocky Hill, CT ABSTRACT Measuring the change in outcomes between

More information

Generating Customized Analytical Reports from SAS Procedure Output Brinda Bhaskar and Kennan Murray, RTI International

Generating Customized Analytical Reports from SAS Procedure Output Brinda Bhaskar and Kennan Murray, RTI International Abstract Generating Customized Analytical Reports from SAS Procedure Output Brinda Bhaskar and Kennan Murray, RTI International SAS has many powerful features, including MACRO facilities, procedures such

More information

A Format to Make the _TYPE_ Field of PROC MEANS Easier to Interpret Matt Pettis, Thomson West, Eagan, MN

A Format to Make the _TYPE_ Field of PROC MEANS Easier to Interpret Matt Pettis, Thomson West, Eagan, MN Paper 045-29 A Format to Make the _TYPE_ Field of PROC MEANS Easier to Interpret Matt Pettis, Thomson West, Eagan, MN ABSTRACT: PROC MEANS analyzes datasets according to the variables listed in its Class

More information

Bland-Altman Plot and Analysis

Bland-Altman Plot and Analysis Chapter 04 Bland-Altman Plot and Analysis Introduction The Bland-Altman (mean-difference or limits of agreement) plot and analysis is used to compare two measurements of the same variable. That is, it

More information

Macros for Two-Sample Hypothesis Tests Jinson J. Erinjeri, D.K. Shifflet and Associates Ltd., McLean, VA

Macros for Two-Sample Hypothesis Tests Jinson J. Erinjeri, D.K. Shifflet and Associates Ltd., McLean, VA Paper CC-20 Macros for Two-Sample Hypothesis Tests Jinson J. Erinjeri, D.K. Shifflet and Associates Ltd., McLean, VA ABSTRACT Statistical Hypothesis Testing is performed to determine whether enough statistical

More information

revision of the validation protocol and of the completely executed validation report.

revision of the validation protocol and of the completely executed validation report. John N. Zorich, Jr. Zorich Technical Consulting & Publishing Sunnyvale, California (cell) 408-203-8811 http://www.johnzorich.com johnzorich@yahoo.com This copy of the Validation Protocol / Report # ZTC-7,

More information

Chemical Reaction dataset ( https://stat.wvu.edu/~cjelsema/data/chemicalreaction.txt )

Chemical Reaction dataset ( https://stat.wvu.edu/~cjelsema/data/chemicalreaction.txt ) JMP Output from Chapter 9 Factorial Analysis through JMP Chemical Reaction dataset ( https://stat.wvu.edu/~cjelsema/data/chemicalreaction.txt ) Fitting the Model and checking conditions Analyze > Fit Model

More information

Week 4: Simple Linear Regression III

Week 4: Simple Linear Regression III Week 4: Simple Linear Regression III Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ARR 1 Outline Goodness of

More information

Generating Least Square Means, Standard Error, Observed Mean, Standard Deviation and Confidence Intervals for Treatment Differences using Proc Mixed

Generating Least Square Means, Standard Error, Observed Mean, Standard Deviation and Confidence Intervals for Treatment Differences using Proc Mixed Generating Least Square Means, Standard Error, Observed Mean, Standard Deviation and Confidence Intervals for Treatment Differences using Proc Mixed Richann Watson ABSTRACT Have you ever wanted to calculate

More information

SD10 A SAS MACRO FOR PERFORMING BACKWARD SELECTION IN PROC SURVEYREG

SD10 A SAS MACRO FOR PERFORMING BACKWARD SELECTION IN PROC SURVEYREG Paper SD10 A SAS MACRO FOR PERFORMING BACKWARD SELECTION IN PROC SURVEYREG Qixuan Chen, University of Michigan, Ann Arbor, MI Brenda Gillespie, University of Michigan, Ann Arbor, MI ABSTRACT This paper

More information

Example 5.25: (page 228) Screenshots from JMP. These examples assume post-hoc analysis using a Protected LSD or Protected Welch strategy.

Example 5.25: (page 228) Screenshots from JMP. These examples assume post-hoc analysis using a Protected LSD or Protected Welch strategy. JMP Output from Chapter 5 Factorial Analysis through JMP Example 5.25: (page 228) Screenshots from JMP. These examples assume post-hoc analysis using a Protected LSD or Protected Welch strategy. Fitting

More information

Identifying Duplicate Variables in a SAS Data Set

Identifying Duplicate Variables in a SAS Data Set Paper 1654-2018 Identifying Duplicate Variables in a SAS Data Set Bruce Gilsen, Federal Reserve Board, Washington, DC ABSTRACT In the big data era, removing duplicate data from a data set can reduce disk

More information

Source df SS MS F A a-1 [A] [T] SS A. / MS S/A S/A (a)(n-1) [AS] [A] SS S/A. / MS BxS/A A x B (a-1)(b-1) [AB] [A] [B] + [T] SS AxB

Source df SS MS F A a-1 [A] [T] SS A. / MS S/A S/A (a)(n-1) [AS] [A] SS S/A. / MS BxS/A A x B (a-1)(b-1) [AB] [A] [B] + [T] SS AxB Keppel, G. Design and Analysis: Chapter 17: The Mixed Two-Factor Within-Subjects Design: The Overall Analysis and the Analysis of Main Effects and Simple Effects Keppel describes an Ax(BxS) design, which

More information

CSC 328/428 Summer Session I 2002 Data Analysis for the Experimenter FINAL EXAM

CSC 328/428 Summer Session I 2002 Data Analysis for the Experimenter FINAL EXAM options pagesize=53 linesize=76 pageno=1 nodate; proc format; value $stcktyp "1"="Growth" "2"="Combined" "3"="Income"; data invstmnt; input stcktyp $ perform; label stkctyp="type of Stock" perform="overall

More information

Summarizing Organization Performance Metrics Tania Skinner Intel Corporation

Summarizing Organization Performance Metrics Tania Skinner Intel Corporation Summarizing Organization Performance Metrics Tania Skinner Intel Corporation Tania.skinner@intel.com Intel Corporation 5200 NE Elam Young Parkway Hillsboro, OR 97124 MS: EG3-319 Objectives Teach a novel

More information

EXST SAS Lab Lab #6: More DATA STEP tasks

EXST SAS Lab Lab #6: More DATA STEP tasks EXST SAS Lab Lab #6: More DATA STEP tasks Objectives 1. Working from an current folder 2. Naming the HTML output data file 3. Dealing with multiple observations on an input line 4. Creating two SAS work

More information

Creating Macro Calls using Proc Freq

Creating Macro Calls using Proc Freq Creating Macro Calls using Proc Freq, Educational Testing Service, Princeton, NJ ABSTRACT Imagine you were asked to get a series of statistics/tables for each country in the world. You have the data, but

More information

Oh Quartile, Where Art Thou? David Franklin, TheProgrammersCabin.com, Litchfield, NH

Oh Quartile, Where Art Thou? David Franklin, TheProgrammersCabin.com, Litchfield, NH PharmaSUG 2013 Paper SP08 Oh Quartile, Where Art Thou? David Franklin, TheProgrammersCabin.com, Litchfield, NH ABSTRACT "Why is my first quartile number different from yours?" It was this question that

More information

Applied Statistics and Econometrics Lecture 6

Applied Statistics and Econometrics Lecture 6 Applied Statistics and Econometrics Lecture 6 Giuseppe Ragusa Luiss University gragusa@luiss.it http://gragusa.org/ March 6, 2017 Luiss University Empirical application. Data Italian Labour Force Survey,

More information

Lab 5 - Risk Analysis, Robustness, and Power

Lab 5 - Risk Analysis, Robustness, and Power Type equation here.biology 458 Biometry Lab 5 - Risk Analysis, Robustness, and Power I. Risk Analysis The process of statistical hypothesis testing involves estimating the probability of making errors

More information

Robust Linear Regression (Passing- Bablok Median-Slope)

Robust Linear Regression (Passing- Bablok Median-Slope) Chapter 314 Robust Linear Regression (Passing- Bablok Median-Slope) Introduction This procedure performs robust linear regression estimation using the Passing-Bablok (1988) median-slope algorithm. Their

More information

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency Math 1 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency lowest value + highest value midrange The word average: is very ambiguous and can actually refer to the mean,

More information

Unit 5: Estimating with Confidence

Unit 5: Estimating with Confidence Unit 5: Estimating with Confidence Section 8.3 The Practice of Statistics, 4 th edition For AP* STARNES, YATES, MOORE Unit 5 Estimating with Confidence 8.1 8.2 8.3 Confidence Intervals: The Basics Estimating

More information

Paper A Simplified and Efficient Way to Map Variable Attributes of a Clinical Data Warehouse

Paper A Simplified and Efficient Way to Map Variable Attributes of a Clinical Data Warehouse Paper 117-28 A Simplified and Efficient Way to Map Variable Attributes of a Clinical Data Warehouse Yanyun Shen, Genentech, Inc., South San Francisco ABSTRACT In the pharmaceutical industry, pooling a

More information

Submitting SAS Code On The Side

Submitting SAS Code On The Side ABSTRACT PharmaSUG 2013 - Paper AD24-SAS Submitting SAS Code On The Side Rick Langston, SAS Institute Inc., Cary NC This paper explains the new DOSUBL function and how it can submit SAS code to run "on

More information

T-test og variansanalyse i SAS. T-test og variansanalyse i SAS p.1/18

T-test og variansanalyse i SAS. T-test og variansanalyse i SAS p.1/18 T-test og variansanalyse i SAS T-test og variansanalyse i SAS p.1/18 T-test og variansanalyse i SAS T-test (Etstik, tostik, parrede observationer) Variansanalyse SAS-procedurer: PROC TTEST PROC GLM T-test

More information

Online Supplementary Appendix for. Dziak, Nahum-Shani and Collins (2012), Multilevel Factorial Experiments for Developing Behavioral Interventions:

Online Supplementary Appendix for. Dziak, Nahum-Shani and Collins (2012), Multilevel Factorial Experiments for Developing Behavioral Interventions: Online Supplementary Appendix for Dziak, Nahum-Shani and Collins (2012), Multilevel Factorial Experiments for Developing Behavioral Interventions: Power, Sample Size, and Resource Considerations 1 Appendix

More information

Subset Selection in Multiple Regression

Subset Selection in Multiple Regression Chapter 307 Subset Selection in Multiple Regression Introduction Multiple regression analysis is documented in Chapter 305 Multiple Regression, so that information will not be repeated here. Refer to that

More information

Interval Estimation. The data set belongs to the MASS package, which has to be pre-loaded into the R workspace prior to use.

Interval Estimation. The data set belongs to the MASS package, which has to be pre-loaded into the R workspace prior to use. Interval Estimation It is a common requirement to efficiently estimate population parameters based on simple random sample data. In the R tutorials of this section, we demonstrate how to compute the estimates.

More information

title1 "Visits at &string1"; proc print data=hospitalvisits; where sitecode="&string1";

title1 Visits at &string1; proc print data=hospitalvisits; where sitecode=&string1; PharmaSUG 2012 Paper TF01 Macro Quoting to the Rescue: Passing Special Characters Mary F. O. Rosenbloom, Edwards Lifesciences LLC, Irvine, CA Art Carpenter, California Occidental Consultants, Anchorage,

More information

Recall the expression for the minimum significant difference (w) used in the Tukey fixed-range method for means separation:

Recall the expression for the minimum significant difference (w) used in the Tukey fixed-range method for means separation: Topic 11. Unbalanced Designs [ST&D section 9.6, page 219; chapter 18] 11.1 Definition of missing data Accidents often result in loss of data. Crops are destroyed in some plots, plants and animals die,

More information

Calculation of the Mean and Variance of Lognormal Data Which Contains Left-Censored Observations

Calculation of the Mean and Variance of Lognormal Data Which Contains Left-Censored Observations Calculation of the Mean and Variance of Lognormal Data Which Contains Left-Censored Observations Stephen J. Ganocy The Goodyear Tire & Rubber Company Akron, Ohio Abstract The mean and variance of data

More information

A SAS MACRO for estimating bootstrapped confidence intervals in dyadic regression

A SAS MACRO for estimating bootstrapped confidence intervals in dyadic regression A SAS MACRO for estimating bootstrapped confidence intervals in dyadic regression models. Robert E. Wickham, Texas Institute for Measurement, Evaluation, and Statistics, University of Houston, TX ABSTRACT

More information

VALIDITY OF 95% t-confidence INTERVALS UNDER SOME TRANSECT SAMPLING STRATEGIES

VALIDITY OF 95% t-confidence INTERVALS UNDER SOME TRANSECT SAMPLING STRATEGIES Libraries Conference on Applied Statistics in Agriculture 1996-8th Annual Conference Proceedings VALIDITY OF 95% t-confidence INTERVALS UNDER SOME TRANSECT SAMPLING STRATEGIES Stephen N. Sly Jeffrey S.

More information

STAT:5201 Applied Statistic II

STAT:5201 Applied Statistic II STAT:5201 Applied Statistic II Two-Factor Experiment (one fixed blocking factor, one fixed factor of interest) Randomized complete block design (RCBD) Primary Factor: Day length (short or long) Blocking

More information

Laboratory Topics 1 & 2

Laboratory Topics 1 & 2 PLS205 Lab 1 January 12, 2012 Laboratory Topics 1 & 2 Welcome, introduction, logistics, and organizational matters Introduction to SAS Writing and running programs; saving results; checking for errors

More information

SAS/STAT 13.1 User s Guide. The NESTED Procedure

SAS/STAT 13.1 User s Guide. The NESTED Procedure SAS/STAT 13.1 User s Guide The NESTED Procedure This document is an individual chapter from SAS/STAT 13.1 User s Guide. The correct bibliographic citation for the complete manual is as follows: SAS Institute

More information

14.2 Do Two Distributions Have the Same Means or Variances?

14.2 Do Two Distributions Have the Same Means or Variances? 14.2 Do Two Distributions Have the Same Means or Variances? 615 that this is wasteful, since it yields much more information than just the median (e.g., the upper and lower quartile points, the deciles,

More information
