SAS Example A10 data sales infile U:\Documents\...\sales.txt input Region : $8. State $2. +1 Month monyy5. Headcnt Revenue Expenses format Month monyy5. Revenue dollar12.2 proc sort by Region State Month Mervyn Marasinghe proc print proc print label by Region State format Expenses dollar10.2 label State= State Month= Month Revenue= Sales Revenue Expenses= Overhead Expenses id Region State sum Revenue Expenses sumby Region title Sales report by State and Region 2 Sample Data Set sales.txt Output Delivery System (ODS) SOUTHERN FL JAN78 10 10000 8000 SOUTHERN FL FEB78 10 11000 8500 SOUTHERN FL MAR78 9 13500 9800 SOUTHERN GA JAN78 5 8000 2000 SOUTHERN GA FEB78 7 6000 1200 PLAINS NM MAR78 2 500 1350 NORTHERN MA MAR78 3 1000 1500 NORTHERN NY FEB78 4 2000 4000 NORTHERN NY MAR78 5 5000 6000 EASTERN NC JAN78 12 20000 9000 EASTERN NC FEB78 12 21000 8990 EASTERN NC MAR78 12 20500 9750 EASTERN VA JAN78 10 15000 7500 EASTERN VA FEB78 10 15500 7800 EASTERN VA MAR78 11 16600 8200 CENTRAL OH JAN78 13 21000 12000 CENTRAL OH FEB78 14 22000 13000 CENTRAL OH MAR78 14 22500 13200 CENTRAL MI JAN78 10 10000 8000 CENTRAL MI FEB78 9 11000 8200 CENTRAL MI MAR78 10 12000 8900 CENTRAL IL JAN78 4 6000 2000 CENTRAL IL FEB78 4 6100 2000 CENTRAL IL MAR78 4 6050 2100 ODS allows flexibility in presenting output from SAS procedures ODS can arrange output in more presentable ways It also can create output in a variety of formats (called destinations) Examples of currently available ODS destinations: LISTING: produces traditional monospace output PS or PDF: output that is formatted for a highresolution printer such as PostScript or PDF HTML: output that is formatted in various markup languages such as HTML RTF: output that is formatted for use with Microsoft Word Traditionally, results of a SAS procedure were displayed in a the output window in the LISTING format. Use the Results tab in the Preferences window to change HTML to LISTING format 3
Current Default Settings in SAS 9.3 The default destination in the SAS windowing environment is HTML (changed from traditional LISTING destination) The default style for the HTML destination is HTMLBlue This new style is an all-color style that is designed to integrate tables and modern statistical graphics In the past, SAS users used SAS/Graph to produce high quality graphics. We ll call these traditional SAS graphics. Template-based graphics produced by ODS Graphics (different from traditional SAS graphics) is now used for producing graphs is most SAS procs Previously, in the SAS Windowing environment, ODS GRAPHICS ON statement was used to enable (and ODS GRAPHICS OFF statement to disable) ODS Graphics SAS ODS Example 1 data oranges input Variety $ Flavor texture Looks Total=Flavor+Texture+Looks datalines navel 9 8 6 temple 7 7 7 valencia 8 9 9 mandarin 5 7 8 proc sort data=oranges by descending Total ods rtf file= U:\Documents\...\oranges.rtf" proc print data=oranges title 'Taste Test Results for Oranges' ods rtf close 6 PROC MEANS PROC MEANS options CLASS variables VAR variables TYPES request(s) WAYS list BY variables FREQ variable WEIGHT variable ID variables OUTPUT OUT=SAS_dataset statistics PROC MEANS (continued) Statistics: N, NMISS, MEAN, STD, MIN, MAX, RANGE, SUM, VAR, USS, CSS, CV, STDERR, T, PRT, SUMWGT Example of OUTPUT statement output out=stats mean=av_ht Av_Wt stderr=se_ht SE_Wt Options: DATA=, PRINT, MAXDEC=, FW=, MISSING, NWAY, IDMIN, DESCENDING, ORDER=, VARDEF=, statistics MAXDEC = n no. of decimal places (2) FW = n field width (12) For printing computed statistics ORDER=FREQ, DATA, INTERNAL, FORMATTED VARDEF=DF, N, WGT, WDF 7 8
SAS Example B5 data biology input Id Sex $ Age Year Height Weight datalines 7389 M 24 4 69.2 132.5 3945 F 19 2 58.5 112.0 4721 F 20 2 65.3 98.6 6327 M 20 1 70.2 135.4 8472 F 20 4 66.8 142.6 4875 M 20 1 74.2 160.4 proc means data=biology fw=8 maxdec=3 class Year Sex var Height Weight output out=stats mean=av_ht Av_Wt stderr=se_ht SE_Wt proc print data=stats title Biology Class Data Set: Output Statement PROC UNIVARIATE PROC UNIVARIATE options VAR variables CLASS variable-1 <(v-options)> < variable-2 <(v- options)> >...< / KEYLEVEL= value1 ( value1 value2 ) > BY variables ID variable OUTPUT < OUT=SAS-data-set > < keyword_1=names...keyword_k=names > < percentile-options > Some Options: DATA =, NOPRINT, PLOT, FREQ, NORMAL, ALPHA=, CIBASIC (TYPE= ALPHA=), MU0=, TRIM= (TYPE= ALPHA=), PCTLDEF=, VARDEF=, PCTLDEF=1,2,3,4 OR 5 (5 methods of computing percentiles) VARDEF=DF, N, WEIGHT, WDF (divisor for NORMAL computing variance) computes the Shapiro-Wilk statistic W if n 2000 or the Kolmogorov-Smirnov statistic D if n > 2000 10 PROC UNIVARIATE (continued) TYPE= LOWER, UPPER, TWOSIDED (specify type of confidence intervals) TRIM= list of integers or fractions specifying amount of trimming ALPHA=.05 (for both CIBASIC and TRIM specifications) Class Options: The v-options are MISSING or ORDER= ORDER = FREQ, DATA, INTERNAL, FORMATTED specifies how levels of the variable are ordered in the output Some Keywords for use with OUTPUT statement: N, NMISS, NOBS, MEAN, SUM, SD, VAR, SKEWNESS, KURTOSIS, SUMWGT, MAX, MIN, RANGE, Q3, MEDIAN, Q1, ORANGE, P1, P5, P10, P90, P95, P99, MODE, SIGNRANK, NORMAL, PCTLNAME=, PCTLPTS=, PCTLPRE= Example: proc univariate data=survey class county var acreage rainfall output out=new mean=ave1 ave2 var= var1 var2 SAS ODS Example 2 data biology input Id Sex $ Age Year Height Weight datalines 7389 M 24 4 69.2 132.5 3945 F 19 2 58.5 112.0 4721 F 20 2 65.3 98.6 6327 M 20 1 70.2 135.4 8472 F 20 4 66.8 142.6 4875 M 20 1 74.2 160.4 ods rtf file= U:\Documents\...\progB2_out.rtf" ods select BasicMeasures Quantiles TestsForNormality proc univariate data=biology Normal var Height title Biology class: Analysis of Height Distribution ods rtf close 11
SAS Example C12 (simplified) data rainfall input Control Seeded @@ Logseed = log10(seeded) label Logseed="Log of Seeded Rainfall" datalines 1202.6 2745.6 830.1 1697.8 372.4 1656.0 345.5 978.0 321.2 703.4 244.3 489.1 163.0 430.0 147.8 334.1 95.0 302.8 87.0 274.7 81.2 274.7 68.5 255.0 47.3 242.5 41.1 200.7 36.6 198.6 29.0 129.6 28.6 119.0 26.3 118.3 26.1 115.3 24.4 92.4 21.7 40.6 17.3 32.7 11.5 31.4 4.9 17.5 4.9 7.7 1.0 4.1 proc univariate data=rainfall var Logseed probplot Logseed/normal(mu=est sigma=est) PROC TABULATE PROC TABULATE options CLASS variable VAR variable FREQ variable WEIGHT variable BY variable TABLE expression, expression, / options KEYLABEL keyword= text Options: DATA=, MISSING, FORMAT=, ORDER=, FORMCHAR( ) =, NOSEPS, DEPTH= FORMAT = any valid SAS or user defined format (BEST12.2) ORDER = FREQ, DATA, INTERNAL,FORMATTED FORMCHAR= ---- + --- DEPTH = 10 14 PROC TABULATE (continued) PROC TABULATE (continued) TABLE statement TABLE expression 1, expression 2, expression 3 /options defines defines defines pages rows columns Expression consists of: Classification variables (CLASS variables) Analysis variables (variables in VAR) Statistics (N, MEAN, STD, MIN, MAX, etc.) Format specifications Other expressions Example Expressions: Assume we have the variables A,B,C,D,X and Y defined in SAS statements as follows: CLASS A B C D VAR X Y A*B A*B*C A*MEAN*X A B A B*C (A B)*C 15 Examples of TABLE statements. class variable TABLE PRODUCT,, SALETYPE*(QUANTITY AMOUNT) page expression row expression analysis variables column expression In the examples below, the required table format is specified with a TABLE statement and the output produced by different TABLE statements are sketched. The TABLE statement used appears on top of the table produced. 16
PROC TABULATE Examples First we will consider only three variables:, CITYSIZE, and POP where and CITYSIZE are CLASS variables and POP is an analysis variable (numerical, continuous-valued). Each observation in the data contains data for a single city for many cities in several regions. Variable CITYSIZE POP Description code for region of the country code for relative population size (S=small, M=medium, L=large) urban population Most applications like this need both CLASS and VAR statements in the PROC TABULATE step in addition to the TABLE statement: PROC TABULATE TITLE Population by Region CLASS VAR POP Examples of TABLE statement Example 1: TABLE,POP POP SUM NC 4650000.00 NE 6666000.00 SO 6864000.00 WE 8376000.00 Example 2: TABLE,CITYSIZE*POP *SUM CITYSIZE L M S POP POP POP SUM SUM SUM NC 3750000.00 750000.00 15000.00 NE 5022000.00 1422000.00 222000.00 SO 4488000.00 2088000.00 288000.00 WE 5592000.00 2592000.00 192000.00 17 18 Example 3: TABLE *CITYSIZE,POP*SUM POP SUM CITYSIZE NC L 3750000.00 M 750000.00 S 150000.00 NE L 5022000.00 M 1422000.00 S 222000.00 SO L 4488000.00 M 2088000.00 S 288000.00 WE L 5592000.00 M 2592000.00 S 192000.00 Example 4: TABLE *CITYSIZE*(SUM MEAN),POP CITYSIZE POP NC L SUM 3750000.00 MEAN 625000.00 M SUM 750000.00 MEAN 125000.00 S SUM 150000.00 MEAN 25000.00 NE L SUM 5022000.00 MEAN 837000.00 M SUM 1422000.00 MEAN 237000.00 S SUM 222000.00 MEAN 37000.00 SO L SUM 4488000.00 MEAN 748000.00 M SUM 2088000.00 MEAN 348000.00 S SUM 288000.00 MEAN 48000.00 WE 19 20
Next, add four additional variables to the example: PRODUCT and SALETYPE are class variables: QUANTITY and AMOUNT are analysis variables. Modified proc step: PROC TABULATE TITLE Sales by Region and City Size CLASS PRODUCT SALETYPE VAR POP QUANTITY AMOUNT The report produced by the following TABLE statement contains a separate table for each PRODUCT and provides an analysis of wholesale and retail sales by and by CITYSIZE. Example 5: TABLE PRODUCT, CITYSIZE,SALETYPE*(QUANTITY AMOUNT) PRODUCT A100 SALETYPE R W QUANTITY AMOUNT QUANTITY AMOUNT SUM SUM SUM SUM NC 1250.00 31250.00 1250.00 25000.00 NE 1600.00 40000.00 1600.00 32000.00 SO 1880.00 47000.00 1880.00 37600.00 WE 1840.00 46000.00 1840.00 36800.00 CITYSIZE L 3190.00 79750.00 3190.00 63800.00 M 2440.00 61000.00 2440.00 48800.00 S 940.00 23500.00 940.00 18800.00 PRODUCT A200 SALETYPE R W QUANTITY AMOUNT QUANTITY AMOUNT SUM SUM SUM SUM NC 1295.00 32375.00 1295.00 25900.00 21 22