Importing an Excel Worksheet into SAS (commands=import_excel.sas)

I. Preparing Excel Data for a Statistics Package

These instructions apply to setting up an Excel file for SAS, SPSS, Stata, etc.

How to Set up the Excel File:

- Place the variable names in the first row. Be sure the names follow these rules:
    - variable names can be no more than 8 characters (longer variable names are currently allowed in SAS and SPSS)
    - variable names must start with a letter
    - variable names may only have letters, numbers, or underscores in them
    - do not use the following characters in variable names: % $ # @ ! + * ~ " . -
    - no blanks in variable names
    - be sure that each variable name is unique (no duplicate variable names)
    - be sure variable names are on the first row only!
- Include only the raw, un-summarized data. Delete extraneous data in your Excel file, like row or column totals, graphs, comments, annotations, etc. To prevent "ghost" rows and columns, copy only the raw data onto a new worksheet, and save values only from there.
- Include a unique identifying number for each case. Sometimes you may have more than one identifier, such as Household ID and Subject ID; place these in separate columns. If you have several spreadsheets containing data on the same individuals, include their identifier(s) on each sheet.
- Include only one value per cell. Don't enter data such as "120/80" for blood pressure. Enter systolic blood pressure as one variable, and diastolic blood pressure as another variable. Don't enter data as "A,C,D" or "BDF" if there are three possible answers to a question. Include a separate column for each answer.
- Don't leave blank rows or columns in the data.
- Don't mix numeric and character values (e.g. names and ID numbers) in the same column.
- Use numeric values when feasible. While character variables are allowed in statistical packages, they are not as flexible as numeric variables, which are preferred.
- Date values are best entered in three columns: one for month, one for day, one for year. You can change them into date values in your statistics package later.
- If you have missing values, you can indicate them with a numeric code, such as 99 or 999, or you can leave the cell blank. Be sure, if you use a missing value code, that it cannot be confused with a "real" data value.
- Save the spreadsheet with values only, not formulas. Do not underline text, or use boldface or italics.
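Once the data are in SAS, the three date columns and any numeric missing-value codes can be converted in a data step. The following is only a minimal sketch: the data set names (visits, visits2), the variable names, the sample values, and the choice of 999 as the missing-value code are all assumptions made for illustration.

```sas
/* Hypothetical data standing in for an imported worksheet. */
data visits;
    input id bmonth bday byear sbp dbp;
    datalines;
1 3 15 1960 120 80
2 11 2 1975 999 999
3 7 30 1982 135 85
;
run;

data visits2;
    set visits;
    /* Build one SAS date value from the month, day, and year columns. */
    birthdate = mdy(bmonth, bday, byear);
    format birthdate mmddyy10.;
    /* Recode the assumed missing-value code 999 to SAS missing (.). */
    if sbp = 999 then sbp = .;
    if dbp = 999 then dbp = .;
run;

proc print data=visits2;
run;
```

Recoding the missing-value code before running any procedures keeps values such as 999 from being averaged in as if they were real measurements.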
An excerpt from an Excel file might look like this:

How to Save the Excel File:

Excel allows you the option of saving a file in several different formats. If you're having problems, Version 4.0 Excel Worksheets can be read by most statistical packages. To save your Excel file in version 4.0, go to the File menu and choose Save As... and then select Excel 4.0 Worksheet (not Workbook) as the file type. You will be able to save only one worksheet at a time in Excel 4.0 format. To preserve your original Excel data, use a different name when saving in this special format. To be sure that the file name will be easily recognizable on any system, use a name not longer than eight characters, and add the extension .xls.

Multiple Worksheets:

If you have several worksheets, you can select the worksheet that you wish to import when you bring the data into SAS, but you will need to bring in each sheet individually and then merge them in the statistical package you are using. The consultants at CSCAR can help you with this.

A document very similar to this one is available online at http://www.umich.edu/~cscar/sftware/frmexcel.html

What Type of Excel Files Can You Import to SAS?

You can import Excel worksheets, starting with very early versions of Excel (e.g., Excel version 4.0). You can also import individual sheets from workbooks for later versions of Excel (e.g. Excel 2000), but only one sheet at a time. Excel 2007 (.xlsx) files can be opened by the updated versions of SAS 9.2 or SAS 9.3, but if you're using an earlier version of SAS, you will have to save the files as .xls files before proceeding.
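For the multiple-worksheet case described above, the merge step in SAS might look like the sketch below. It assumes each sheet has already been brought into SAS as its own data set. Here two small hypothetical data sets (demog and labs) stand in for the imported sheets, and the identifier names hhid and subjid are likewise made up for illustration.

```sas
/* Stand-ins for two imported sheets; all names and values are hypothetical. */
data demog;
    input hhid subjid age;
    datalines;
1 1 34
1 2 36
2 1 51
;
run;

data labs;
    input hhid subjid sbp dbp;
    datalines;
1 1 120 80
2 1 135 85
1 2 118 76
;
run;

/* MERGE with a BY statement requires both data sets to be sorted by the BY variables. */
proc sort data=demog; by hhid subjid; run;
proc sort data=labs;  by hhid subjid; run;

/* One row per hhid/subjid combination, with the columns from both sheets. */
data combined;
    merge demog labs;
    by hhid subjid;
run;

proc print data=combined;
run;
```

This is why the identifier(s) need to appear on every sheet: the BY variables are what SAS uses to line up rows from the separate data sets.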
II. Importing the Excel File to SAS

Step-By-Step Instructions:

Go to the File Menu and select Import Data.

Select the type of data file that you would like to import from the pull-down menu. Click on the Next> button to proceed.

In the dialog box that opens, click on Browse to locate the file you wish to import.
Select the Excel file to open and click on the Open button. The file name that you have chosen will appear in the browse dialog box. Click on OK.

In the next dialog box, you will need to select the table that you want to import from the pull-down list. In this example, we are selecting the table named EMPLOYEE, which is, in fact, the only sheet in this workbook.
Click on Next> to proceed. You will be taken to a dialog box that allows you to save the SAS data set to a library. The default temporary library WORK will be automatically filled in for you, but you need to type the data set (Member) name. In this case, we are saving the data set as WORK.EMPLOYEE.
At this point, you have two choices:

- If you click on Finish, the data set will be saved, and you can proceed to work with it.
- If you click on Next>, as illustrated here, you will go to the following dialog box, where you will have a chance to save the SAS commands that were used to import the data set. You can use these commands later to re-import the data.

I usually click on Next>, so I can save my commands. This process is shown below.

Browse to a location where you wish to save your SAS commands and give them a name, as in the example below (the commands were saved on the desktop as import_employee.sas).
Click on Save and you will then see the dialog box below.
You can now click on Finish to complete importing the data set. Check the SAS Log. You should see the following message in the log:

NOTE: WORK.EMPLOYEE data set was successfully created.

Using SAS commands to import Excel Files:

If you saved your commands in the previous step, you can now bring them into your SAS enhanced editor, by going to File > Open Program and browsing to the command file that you saved. Alternatively, you can type these commands by hand and submit them to SAS. The command file is shown below:

PROC IMPORT OUT= WORK.EMPLOYEE
            DATAFILE= "C:\labdata\EMPLOYEE.XLS"
            DBMS=EXCEL REPLACE;
     RANGE="EMPLOYEE$";
     GETNAMES=YES;
     MIXED=NO;
     SCANTEXT=YES;
     USEDATE=YES;
     SCANTIME=YES;
RUN;
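Beyond the log message, it can be worth confirming that the variable names and types came through as expected. The sketch below is one possible check, not part of the saved command file. It assumes WORK.EMPLOYEE already exists from the import step, and the choice of five rows is arbitrary.

```sas
/* Assumes WORK.EMPLOYEE was created by the import step. */

/* List the variables, their types (numeric or character), and formats. */
proc contents data=work.employee;
run;

/* Print the first five rows to compare against the Excel sheet. */
proc print data=work.employee (obs=5);
run;
```

A column that was expected to be numeric but appears as character in the PROC CONTENTS output usually points to mixed numeric and character values in that column of the worksheet.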
The data set can be modified by creating a new data step, with additional commands, for example:

data employee2;
   set employee;
   saldiff = salary-salbegin;
   if 0<= educ < 12 then edcat = 1;
   if educ = 12 then edcat = 2;
   if educ > 12 then edcat = 3;
run;

SAS can now be used to run procedures on this new data set, for example, Proc Means, as shown below:

proc means data=employee2;
run;

The MEANS Procedure

Variable    Label         N            Mean         Std Dev         Minimum         Maximum
-------------------------------------------------------------------------------------------
id          id          474     237.5000000     136.9762753       1.0000000     474.0000000
bdate       bdate       473        -1179.56         4302.33       -11282.00         4058.00
educ        educ        474      13.4915612       2.8848464       8.0000000      21.0000000
jobcat      jobcat      474       1.4113924       0.7732014       1.0000000       3.0000000
salary      salary      474        34419.57        17075.66        15750.00       135000.00
salbegin    salbegin    474        17016.09         7870.64         9000.00        79980.00
jobtime     jobtime     474      81.1097046      10.0609449      63.0000000      98.0000000
prevexp     prevexp     474      95.8607595     104.5862361               0     476.0000000
minority    minority    474       0.2194093       0.4142836               0       1.0000000
saldiff                 474        17403.48        10814.62         5550.00        76240.00
edcat                   474       2.3755274       0.6775720       1.0000000       3.0000000
-------------------------------------------------------------------------------------------
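A recoded variable such as edcat is easier to trust after a quick cross-tabulation against the variable it was built from. The sketch below is one way to do that. It assumes the data set and variables created in the data step above (employee2, educ, edcat) exist in the WORK library.

```sas
/* Cross-check the recode: each educ value should fall in exactly one edcat column. */
/* MISSING shows missing values as their own row or column;                         */
/* NOROW, NOCOL, and NOPERCENT suppress percentages so only counts are printed.     */
proc freq data=employee2;
    tables educ*edcat / missing norow nocol nopercent;
run;
```

The lower bound in "if 0<= educ < 12" matters here: SAS treats a missing numeric value as smaller than any number, so without that bound a missing educ would be placed in category 1.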