Mimicking the Data Step Dash and Double Dash in PROC SQL Arlene Amodeo, Law School Admission Council, Newtown, PA

ABSTRACT

The SQL procedure is a powerful and versatile procedure in SAS that allows the programmer to utilize SQL syntax to perform data set merges (joins), update data sets (tables), and update observations (records). Its main use is to duplicate the functionality of the MERGE and BY statements in the DATA step, but most of its worth comes from its expansion upon these functionalities and its shorter processing time. It allows the user to utilize a wide range of summary functions without having to code PROCs and to merge on variables without the restrictions inherent in the BY statement. However, there are two shortcuts in the DATA step that are not available in PROC SQL: the dash (-) for selecting a range of suffixed variables, and the double dash (--) for selecting a range of variables from a data set.

This paper shows how to integrate PROC SQL, dictionary members, and character functions to mimic the functionality of the dash and double dash. This method can be utilized easily to gather several ranges of variable names and rearrange the ordering of columns in the output data set. Examples of how to use this code are given in the context of the LSAC National Longitudinal Bar Passage Study data. Macro programs are provided in the Appendices to enable the reader to implement these methods quickly without hard-coding. This code was developed using SAS 9.2 for PC; execution of this code requires licenses for Base SAS and SAS/STAT. The intended audience of this paper has some awareness of SAS macro programming and PROC SQL syntax.

INTRODUCTION

The very large potential for data manipulation is a chief advantage of choosing SAS over similar software: if you can think it, you can probably do it in SAS, and usually with half the battle you'd face with other software.
Two basic procedures for manipulating data sets are available in SAS: the DATA step (native to SAS), and the SQL procedure (based on the Structured Query Language used by several other commercial software products, such as Oracle). PROC SQL incorporates most of the functionality of the DATA step. It allows the programmer to utilize all SAS character functions and also perform the same types of merges (joins) as with DATA steps, using a more universal nomenclature.

In many situations, PROC SQL has several advantages (and relatively few disadvantages) over the DATA step. PROC SQL not only performs functions across rows (as the DATA step does), but it can also perform functions down observations, an endeavor that is typically code-intensive and requires caution when implemented in the DATA step. PROC SQL also makes merging data sets a less stressful task; it does not have the strict restrictions on the merging variables that the MERGE-BY method in the DATA step has. Data merges (joins) in PROC SQL can be performed on merging variables with different names (e.g., NAME and FULL_NAME), different cases (join records on LOWCASE(FIRST_NAME)), and even with several variables serving as one merging variable (CATX(" ", FIRST_NAME, LAST_NAME) matched to FULL_NAME).

In addition to having powerful and versatile data manipulation capabilities, SAS contains several shortcuts to help eliminate tedious coding. Two of these shortcuts are the dash (-) and the double dash (--), each of which generates lists of variables. The (single) dash allows the programmer to specify that all variables that are similarly structured in name, with a common prefix and a numeric suffix, should be processed (e.g., ANSWER1 through ANSWER30). The double dash allows the programmer to quickly specify that all variables in the program data vector that lie between two variables should be processed (e.g., NAME through PHONE_NUMBER, for specifying that NAME, ADDRESS, and PHONE_NUMBER should be included).
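To make the two shortcuts concrete, the following is a minimal sketch on a small made-up data set. The data set WORK.DEMO, its variables, and its values are assumptions chosen for illustration; they are not part of the Bar Passage Study data.

```sas
/* Hypothetical toy data set. The LENGTH statement fixes the order of the
   first three variables in the program data vector:
   name, address, phone_number, then answer1-answer3. */
data work.demo;
  length name $20 address $30 phone_number $12;
  input name $ address $ phone_number $ answer1-answer3;
  datalines;
Ann Elm_St 555-0100 1 0 1
Bob Oak_Ave 555-0101 0 0 1
;
run;

/* Single dash: a numbered range of suffixed variables */
proc means data=work.demo n mean;
  var answer1-answer3;
run;

/* Double dash: every variable positioned from name through phone_number */
proc print data=work.demo;
  var name--phone_number;
run;
```

The single dash depends only on the variable names, while the double dash depends on the position of the variables in the data set, which is why the two need different treatment when they are mimicked in PROC SQL.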
Unfortunately, the dash and double dash can only be used in DATA steps, data set options, and SAS statements; they cannot be used within the SELECT clause of PROC SQL. This is a rare example of a functionality that is lost when moving from the DATA step to PROC SQL. A consequence of this is that when a user utilizes PROC SQL to select variables, he or she must type each variable individually, even in cases where the dash or double dash would automatically generate the list in another context. However, as another example of SAS's versatility, a combination of procedures and data views can be integrated to duplicate the functionality of the dash and double dash for use in PROC SQL.

The first four sections of this paper introduce the basic nomenclature, concepts, and code that can be utilized to mimic the variable list shortcuts in PROC SQL. The last three sections describe code for each of the shortcuts (the dash for a range of suffixed variables, and the double dash for a range of variables between two specified names). The motivation for developing a method of mimicking both the single and double dashes is discussed; a short example of the code to accomplish this is provided; and macros are provided in the appendices, along with short descriptions in the text. The Bar Passage Study data collected and published by LSAC is cited in short examples.

Footnote 1 (on the double dash described above): Typically, this means that the variable list will contain all variables that were created between NAME and PHONE_NUMBER chronologically when those variables were first created.

PROC SQL: BASIC SYNTAX

The SQL procedure is based on the Structured Query Language (SQL) used by products such as MySQL, Oracle, and Microsoft Access. The most basic purpose of the language is to query data from tables (sometimes imposing particular criteria) and to join one or more tables into one table (or temporary view). When SQL code is written to act upon a table (data set), it is said that the table is being queried. PROC SQL can be used to create data sets and views through typical methods such as concatenation, subsetting, and merging. For several key components of the DATA step, there are analogous features in PROC SQL that have their own names unique to PROC SQL. Table 1 shows the equivalent names for DATA step components and associated data set attributes whose essences are also mimicked in PROC SQL.

Table 1: Equivalent SAS Terms Between the DATA Step and PROC SQL

DATA step term/method | PROC SQL term/method
Data set | Table
Statement | Clause (CREATE TABLE/CREATE VIEW, SELECT, INTO, FROM, WHERE, GROUP BY, HAVING, ORDER BY)
Observation | Row
Variable | Column
Merge | Join
RENAME oldvarname = newvarname | SELECT oldvarname AS newvarname (newvarname is referred to as an alias)
CALL SYMPUT(macrovariable, value) | INTO clause, in general used within the SELECT statement

The INTO clause can be utilized to create one or more macro variables very efficiently. The example below creates one macro variable that holds each value encountered in a column, where the elements within the variable are separated by a comma and a space (Williams, SESUG 2008):

    SELECT DISTINCT columnname1
        INTO :macrovariable SEPARATED BY ", "
        FROM datasetname;
    QUIT;

Figure 1 shows the syntax for PROC SQL. Only the SELECT and FROM clauses are required. In general, the SELECT clause indicates which variables should be placed into the table (or view) from the table(s) indicated in the FROM clause; in contrast to the DATA step, multiple variables are separated by commas, not spaces.
The WHERE clause is used to select only certain rows and to execute joins between multiple data sets specified in the FROM clause: the variable that would be in the BY statement of a MERGE BY would appear in the WHERE clause. Additional code can be used to specify what type of join(s) should be performed, using specific SQL syntax and the ON keyword rather than the WHERE clause. For more information on SQL joins, reference the paper PROC SQL: Tips and Translations for Data Step Users (Marcella and Jorgensen, 2010).

Just as each DATA step completes with a RUN statement, each CREATE ... ORDER BY block ends with a semicolon, and each SQL block ends with a QUIT statement. The RUN statement is not necessary. Several CREATE TABLE ... ORDER BY blocks can be created within one SQL procedure, each ending with a semicolon. It is important to note that, by default, the results of a query that does not use the CREATE clause will be printed to output; to prevent this, use the NOPRINT option in the PROC SQL statement.

Fig. 1 PROC SQL syntax

    PROC SQL noprint/print;
        CREATE TABLE/CREATE VIEW newtablename AS
        SELECT tablename.var1 AS alias1, tablename.var2 AS alias2
        FROM tablename
        WHERE <condition>
        GROUP BY varname
        HAVING <condition>
        ORDER BY varname;
    QUIT;
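As a concrete instance of the syntax in Figure 1 and of the INTO ... SEPARATED BY pattern from Table 1, here is a minimal sketch against SASHELP.CLASS, which ships with SAS. The output table name WORK.CLASS_SUMMARY and the macro variable name age_list are assumptions chosen for illustration.

```sas
proc sql noprint;
  /* One statement using every clause of Figure 1: summarize students
     aged 12 and over by sex, keeping groups with more than 5 students */
  create table work.class_summary as
    select sex,
           count(*)    as n_students,
           avg(height) as mean_height
    from sashelp.class
    where age >= 12
    group by sex
    having count(*) > 5
    order by sex;

  /* Store every distinct age in one macro variable,
     separated by a comma and a space */
  select distinct age
    into :age_list separated by ', '
    from sashelp.class;
quit;

/* Write the resulting list to the log */
%put NOTE: age_list = &age_list.;
```

Note that the statement boundaries are marked by semicolons only: the CREATE TABLE statement runs from CREATE through ORDER BY, and a single QUIT ends the procedure.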

Just as the DATA step can create macro variables with CALL SYMPUT, PROC SQL can also create macro variables. Within the SELECT clause, the user can specify that a column value (or a function of a column value) should be inserted into one or more macro variables; this is done using the INTO clause. For example, if a programmer wanted to have one macro variable that contained the average value of age from the SASHELP.CLASS data set, he or she could code as in Figure 2. In this code, AVG is called an aggregate function; it works down rows in the table to find the average value of age, just as PROC MEANS would when the VAR statement is specified as VAR age;. The average value for age is inserted into the macro variable average_age.

Fig. 2 Example PROC SQL code

    PROC SQL;
        SELECT AVG(age) INTO :average_age
        FROM SASHELP.CLASS;
    QUIT;

Table 1 above briefly shows how one macro variable can be created to hold all of the values found in a column. In conjunction with automatic data sets called dictionary members, this code can be utilized to help mimic the dashes in PROC SQL. Dictionary members are described briefly in the next section, and the last three sections of this paper discuss how to put it all together. In addition to several NESUG papers, the manual SAS SQL Procedure User's Guide, published by the SAS Institute, gives an excellent summary of the various ways in which PROC SQL can be utilized.

DICTIONARY MEMBERS AND THE DICTIONARY.COLUMNS TABLE

During each SAS session there exist automatic data views called dictionary members. These data sets contain information about the data sets that are in the current SAS session. For example, DICTIONARY.TABLES has one row to describe each data set in the current SAS session (one row for SASHELP.CLASS, one row for SASHELP.SALES, etc.). The data set DICTIONARY.COLUMNS contains one row to describe every variable in the data sets (one row for each variable in SASHELP.CLASS, one row for each variable in SASHELP.SALES, etc.).
Table 2 displays the contents of this data set.

Table 2: Contents of DICTIONARY.COLUMNS

Column Name | Column Label
libname | Library Name
memname | Member Name
memtype | Member Type
name | Column Name
type | Column Type
length | Column Length
npos | Column Position
varnum | Column Number in Table
label | Column Label
format | Column Format
informat | Column Informat
idxusage | Column Index Type
sortedby | Order in Key Sequence
xtype | Extended Type
notnull | Not NULL?
precision | Precision
scale | Scale
transcode | Transcoded?
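A short query shows what these columns look like in practice. This is a minimal sketch against SASHELP.CLASS; it relies on the fact that the libname, memname, and memtype values in DICTIONARY.COLUMNS are stored in uppercase, so the comparison strings are written in uppercase as well.

```sas
proc sql;
  /* One row per variable of SASHELP.CLASS, in data set order */
  select name, type, length, varnum
    from dictionary.columns
    where libname = 'SASHELP'
      and memname = 'CLASS'
      and memtype = 'DATA'
    order by varnum;
quit;
```

The query should list the five variables of SASHELP.CLASS (Name, Sex, Age, Height, Weight) with varnum running from 1 to 5. DICTIONARY tables can be referenced this way only inside PROC SQL.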

(Marcella and Jorgensen, NESUG 2010)

As seen in Table 2, DICTIONARY.COLUMNS contains the library name (libname) and data set name (memname) in which each column resides. It also contains the position of the variable in the data set, in the context of a table with rows and columns; for example, varnum = 3 indicates that the variable is in the third column from the left. In conjunction with PROC SQL, the DICTIONARY.COLUMNS table can be used to generate a single macro variable that contains a list of column/variable names from a certain data set/table, where the column/variable names have certain characteristics or are located in a certain position in the SAS data set. This forms a foundation for mimicking the dash shortcuts in PROC SQL. It is worth noting that the columns in the dictionary members may have trailing blanks in their values, and this needs to be taken into consideration when generating macro variables whose values are intended to be used as SAS code. Each variable list shortcut is discussed in turn in the remaining sections of this paper.

SAS CHARACTER FUNCTIONS

SAS contains a wide variety of character functions to manipulate character variables, extract substrings from character variables, and count occurrences of substrings and characters within character variables. These three tasks are conducted by many more than just three functions, and many more tasks can be performed with character functions. The book SAS Functions by Example by Ronald Cody details many of these with excellent clarity and organization. For the purposes of mimicking the dash and double dash, this paper will utilize the LENGTH, CAT, and SUBSTR functions, as described generally below.

1. LENGTH: counts the number of characters used in a character variable, ignoring trailing blanks. For each observation, the result varies according to the value of the character variable for that observation.
   This does not return the length property of the character variable as created in the SAS data set (that can be determined by querying the DICTIONARY.COLUMNS view, as seen in the previous section).
2. CAT(string1, string2, ..., stringn): concatenates string1, string2, ..., stringn with each string taken verbatim (i.e., the strings are not trimmed for leading or trailing blanks). Each string can be a SAS variable (numeric or character) or a string expressed within quotation marks. CATT also concatenates strings with the same syntax as CAT, but trims the trailing blanks from each element mentioned in the list; CATX will trim and concatenate each entry and place a specified delimiter between each entry.
3. SUBSTR(string, start_position, length): extracts a substring from string, starting at the position start_position and having length length. String can be a SAS character variable or a string within quotation marks, while start_position and length must be positive integers (e.g., a SAS numeric variable containing such a value, a SAS function that resolves to such a value, or simply a constant positive integer). For example, SUBSTR("LSAC", 3, 2) would resolve to "AC", and SUBSTR("LSAC", 1, 1 + 2) would resolve to "LSA".

METHOD FOR MIMICKING THE VARIABLE SHORTCUTS IN PROC SQL

The code below describes how PROC SQL, dictionary members, and character functions can be integrated to mimic the variable shortcuts in the SELECT clause. This basic structure is used throughout this paper.

    PROC SQL;
        * libname, memname, and memtype values are stored in uppercase;
        SELECT <function of MEMNAME and NAME> INTO :macrovar
        FROM DICTIONARY.COLUMNS
        WHERE libname = "LIBRARYNAME"
          AND memname = "DATASETNAME"
          AND memtype = "DATA";

        *We will call this our target select clause in which we want to use the dash/double dash shortcuts;
        SELECT &macrovar.
        FROM libraryname.datasetname;
    QUIT;

DATA FOR EXAMPLES: LSAC NATIONAL LONGITUDINAL BAR PASSAGE STUDY

From 1989 through 1996, the Law School Admission Council (LSAC) conducted a longitudinal study of law school matriculates that collected data regarding a variety of subjects relevant to aspiring lawyers, such as their Law School Selection and Expectations and their Aspirations in their law career. Four surveys were administered throughout the legal education careers of these matriculates: an Entering Student Questionnaire (ESQ), a First-Year Follow-up Questionnaire (FSQ), a Second-Year Follow-up Questionnaire (SFQ), and a Third-Year Follow-up Questionnaire (TFQ). Because this was a longitudinal study, many of the items on the surveys were identical or nearly identical, in order to allow analyses of changes in the goals, educational statuses, and employment statuses of aspiring lawyers, and of their progress in passing the State Bar Exams. The data collected during this study was released in 1999 and is available upon request from LSAC.

Variables (columns) from the Bar Passage Study are used in the examples in this paper. Figure 3 shows the relevant excerpts of the ESQ that are referenced (repeated on the subsequent three surveys as well) and the demographic variables that are also used in examples. Figure 4 shows some of the contents of DICTIONARY.COLUMNS for a subset of the Bar Passage Study data, inputted from a flat data file by means of a DATA step. Readers can explore further examples by using the SASHELP.PRICEDATA data set.

Fig. 3 Select Demographic Variables and Responses 67a Through 67k to Question 67 of the Entering Student Questionnaire of the Longitudinal Bar Passage Study (also duplicated in the other three surveys of this study)

Fig. 4 Select Contents of the Bar Passage Study Data Set

MIMICKING THE SINGLE DASH FOR SUFFIXED VARIABLES

MOTIVATION FOR DEVELOPING NEW CODE FOR PROC SQL

If a data set contains N variables, each named with the same prefix and a numerical suffix, then references to these N variables can be made using the single dash in DATA steps, data set options, and procedures to convey that each variable with this naming convention should be processed. For example, in the Bar Passage Study, each respondent was asked to rate (on a scale from 1 to 5) how appealing a certain employment setting in their field seems to them (Figure 3). The 12 coded responses to the 12 options within this item can be named ESQ_E4_1, ESQ_E4_2, ..., ESQ_E4_12, and the DATA step in Figure 5 could be used to input the data succinctly. In Figure 5, the single dash indicates that 12 variables should be created (ESQ_E4_1, ESQ_E4_2, ..., ESQ_E4_12), and the 1. informat indicates that each variable is numeric with one digit. We will refer to "ESQ_E4_" as the prefix part of these suffixed variable names.

Fig. 5 Example of Using the Single Dash to Input Data of Same Type, Length, and Informat

    DATA BPS;
        INFILE "&filepath.\bps.txt" LRECL = 1000 MISSOVER PAD OBS = 10;
        INPUT student_id 5. (ESQ_E4_1 - ESQ_E4_12) (1.);
    RUN;

If a programmer desired frequencies on each, he or she could code as shown in Figure 6.

Fig. 6 Example of Using the Single Dash to Process Data in a Procedure

    PROC FREQ DATA = BPS;
        TABLES ESQ_E4_1 - ESQ_E4_12;
    RUN;

Additionally, if he or she wanted to KEEP or DROP these variables, or a subset of these variables, in a new data set, the variable list shortcut using the dash can be used yet again, as seen in Figure 7.

Fig. 7 Example of Using the Single Dash to Select a Range of Variables to Input Into a Data Set

    DATA ESQ_BPS;
        SET BPS (KEEP = ESQ_E4_7 - ESQ_E4_12);
    RUN;

When moving to PROC SQL, the dash cannot be used in this manner in the SELECT clause to query those columns from the BPS table. Because arithmetic operators can be used in the SELECT clause, SAS would interpret the dash as a minus sign and subtract ESQ_E4_12 from ESQ_E4_1. The single dash can be used in the FROM clause as a data set option, as shown in Figure 8. However, if the programmer additionally wishes to rename the columns prior to creating the new data set (say, using the prefix E4_ instead of ESQ_E4_), the programmer cannot do this easily with the RENAME= option in the same set of data set options.

Fig. 8 Example of a Simple Method for Mimicking the Single Dash in PROC SQL

    PROC SQL noprint;
        CREATE TABLE ESQ_BPS AS
        SELECT *
        FROM BPS (KEEP = ESQ_E4_1 - ESQ_E4_12);
    QUIT;

EXPLANATION OF HOW TO ACCOMPLISH MIMICRY OF THE SINGLE DASH

The trick to enabling both the renaming of the columns and the easy selection of the columns involves creating a short program that saves a function of the data in the DICTIONARY.COLUMNS view into a macro variable. The output of this method is a global macro variable that contains the oldvarname AS newvarname codes with the proper placement of the required keywords, spaces, and commas. Then, the macro variable can be referenced in the SELECT clause of a new PROC SQL block where the dashed list was desired (we'll call this the "target SELECT clause"). Figure 9 shows the basic code for accomplishing this. As coded in this figure, this macro variable can be referenced in the SELECT clause only during the current SAS session.

Fig. 9 Basic Code for Mimicking the Single Dash in PROC SQL Using Dictionary Members and Enabling Dynamism and the Renaming of Suffixed Variables

    PROC SQL;
        * First block: save the DICTIONARY.COLUMNS rows for the suffixed variables and add a numeric suffix column;
        CREATE TABLE BPS_COLUMNS AS
        SELECT *,
               INPUT(SUBSTR(name, 1 + LENGTH("ESQ_E4_"), LENGTH(name) - LENGTH("ESQ_E4_")), 3.) AS suffix
        FROM DICTIONARY.COLUMNS
        WHERE LOWCASE(memname) = LOWCASE("BPS")
          AND LOWCASE(memtype) = "data"
          AND libname = "WORK"
          AND SUBSTR(LOWCASE(name), 1, LENGTH("ESQ_E4_")) = LOWCASE("ESQ_E4_")
        ORDER BY suffix;

        * Second block: create a macro variable to hold the SELECT code we want to generate and use;
        SELECT CAT(TRIMN(memname), ".", TRIMN(name), " AS ", "E4_", suffix)
            INTO :suffixed_list SEPARATED BY ", "
        FROM BPS_COLUMNS
        WHERE SUBSTR(LOWCASE(name), 1, LENGTH("ESQ_E4_")) = LOWCASE("ESQ_E4_")
        ORDER BY suffix;

        * Third block: the target SELECT clause;
        CREATE TABLE ESQ_BPS AS
        SELECT student_id, &suffixed_list.
        FROM BPS;
    QUIT;

In the first block, a new table called BPS_COLUMNS is created. Only the rows of DICTIONARY.COLUMNS that correspond to the suffixed variables are saved to the new table.
The specification for this selection is made by the WHERE clause: the SUBSTR function uses the length of the original prefix to compare a substring of NAME with the original prefix, and if this substring matches it ("ESQ_E4_"), then the record is placed into the new table BPS_COLUMNS. The SELECT clause places a new column called suffix into this table. Assured by the WHERE clause that the value of the variable name contains the name of a suffixed variable in data set BPS, the SUBSTR function within the SELECT clause determines what the suffix is by using the length of NAME and the length of the original prefix ("ESQ_E4_"). In addition to the suffix, all other columns from DICTIONARY.COLUMNS are saved into this new table, as indicated by the use of the asterisk (*) in the SELECT clause.

In the second block of Figure 9, this new table BPS_COLUMNS is queried to create a macro variable called suffixed_list, which will contain the old names of the suffixed variables and the desired new names placed into proper SELECT syntax (see footnote 2). The CAT function places the AS keyword and spaces, and the SEPARATED BY clause places the commas, so that the macro variable suffixed_list contains exactly the code that is needed by the target SELECT clause in the third block to query and rename the suffixed variables from BPS. It is worth noting here that the trailing blanks in the memname and name columns are trimmed by the TRIMN function before concatenation. The CAT function is used rather than the CATT function to preserve the trailing blank after the AS keyword. The first terms of the CAT function are the data set name, a period, and the original variable name (the original prefix followed by the suffix). These are followed by the AS keyword, the new prefix, and the suffix again. The new prefix and the suffix form an alias for each column that will be created using the macro variable suffixed_list in the target SELECT statement, in the third block of Figure 9. Figure 10 shows the DICTIONARY.COLUMNS printout for the new data set (ESQ_BPS) created in the third block; this can be compared to Figure 4, which shows some of the original data set (BPS) contents. The macro variable suffixed_list resolves to:

    BPS.ESQ_E4_1 AS E4_1, BPS.ESQ_E4_2 AS E4_2, BPS.ESQ_E4_3 AS E4_3, BPS.ESQ_E4_4 AS E4_4, BPS.ESQ_E4_5 AS E4_5, BPS.ESQ_E4_6 AS E4_6, BPS.ESQ_E4_7 AS E4_7, BPS.ESQ_E4_8 AS E4_8, BPS.ESQ_E4_9 AS E4_9, BPS.ESQ_E4_10 AS E4_10, BPS.ESQ_E4_11 AS E4_11, BPS.ESQ_E4_12 AS E4_12

Fig. 10 Contents of New Data Set ESQ_BPS Using the Code in Figure 9

Footnote 2: The first two SELECT blocks could be combined into one using nested queries, where the SELECT clause of the first block would appear within the FROM clause of the second. Instead, the two queries are separated here to provide easier dissection of the purpose of each block of code. For more information on nested queries, see Williams (SESUG 2008).

%MIMIC_DASH: A MACRO TO MIMIC THE SINGLE DASH IN PROC SQL

The code in Figure 9 can be made more dynamic by modifying it for inclusion in a macro program. Each hard-coded string that is referenced in Figure 9 can be turned into a macro variable that a user specifies in a macro invocation. Appendix I includes a macro, called MIMIC_DASH, that will accept the data set name, the library name, the current prefix for the suffixed variables, and the desired new prefix for the variables the user wishes to select and rename in the target SELECT clause. Although it is certainly not recommended that a programmer use any data file without first referencing its code book, the beginning and ending suffixes are automatically determined in this code simply for the sake of dynamism. To enable efficiency in renaming more than one list of suffixed variables in a SELECT clause, the macro accepts a positive integer iteration number that is used to create the macro variable suffixed_list_#.

Table 3 summarizes some key properties of this code that must be considered before the macro is put into use. As evidenced by the length of the program, checks are included to be sure that the input parameters entered by the user are valid. Note that if a variable exists with a name that is strictly the prefix that the user specifies, without a numerical suffix, the variable will be ignored (see footnote 3). Because the scope of a macro variable created within a macro program is generally local (that is, it can only be used within the current macro), a line of code is used to declare the variable as global so that it can be referenced in any coding environment during the current SAS session.

Footnote 3: For an example of such a circumstance, see the price variable and the suffixed price variables in SASHELP.PRICEDATA.

Table 3: Features of the Macro Program MIMIC_DASH in Appendix I

Features considered:
1. Check that library and data set exist, variables exist, and suffix list is unbroken prior to execution of key code
2. Reference table and column names in the SELECT clause
3. Prevent rewrite of macro variable after subsequent invocations by allowing input of a suffix for the outputted macro variable
4. Rename columns (can keep the names the same by re-specifying the same prefix)
5. Include a variable named with only the specified prefix, but no suffix
6. Assign new suffixes to the current suffixed list
7. Use the outputted global macro variable suffixed_list_# in another SAS session

Macro invocation: %MIMIC_DASH(iteration_num, lib_name, dataset_name, old_prefix, new_prefix)

Figure 11 shows an example of how this macro can be invoked and how the macro variables can be utilized with the Bar Passage Study data, using the variables described in Figure 3 from the four surveys that were administered. Figure 12 displays the contents of DICTIONARY.COLUMNS for the new data set ESQ_BPS, created by the outputted macro variable from the macro. The macro variable resolves to:

    BPS.ESQ_E4_1 AS E4_1, BPS.ESQ_E4_2 AS E4_2, BPS.ESQ_E4_3 AS E4_3, BPS.ESQ_E4_4 AS E4_4, BPS.ESQ_E4_5 AS E4_5, BPS.ESQ_E4_6 AS E4_6, BPS.ESQ_E4_7 AS E4_7, BPS.ESQ_E4_8 AS E4_8, BPS.ESQ_E4_9 AS E4_9, BPS.ESQ_E4_10 AS E4_10, BPS.ESQ_E4_11 AS E4_11, BPS.ESQ_E4_12 AS E4_12

Fig. 11 Example Invocation of Macro MIMIC_DASH

    %MIMIC_DASH(iteration_num = 1,
                lib_name = WORK,
                dataset_name = BPS,
                old_prefix = ESQ_E4_,
                new_prefix = E4_)

    PROC SQL;
        CREATE TABLE ESQ_BPS AS
        SELECT student_id, &suffixed_list_1.
        FROM BPS;
    QUIT;

Fig. 12 Contents of New Data Set ESQ_BPS, Created Using the Variable List Provided by the Macro MIMIC_DASH
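The idea behind such a macro can be sketched in a few lines. The macro below is a simplified, hypothetical helper, not the MIMIC_DASH macro from Appendix I: its name (suffix_list_sketch), its parameters, and the toy data set WORK.TOY are assumptions for illustration, and it omits the validity checks described above.

```sas
/* Hypothetical helper, NOT the MIMIC_DASH macro from Appendix I */
%macro suffix_list_sketch(lib=WORK, ds=, old_prefix=, new_prefix=, outvar=suffixed_list);
  %global &outvar.;                  /* make the result usable in open code */
  %local plen;
  %let plen = %length(&old_prefix.); /* length of the old prefix */

  proc sql noprint;
    /* Build "ds.oldname AS newname" for every prefix+digits variable */
    select catx(' ',
                cats("&ds.", '.', name),
                'AS',
                cats("&new_prefix.", substr(name, &plen. + 1)))
      into :&outvar. separated by ', '
      from dictionary.columns
      where libname = upcase("&lib.")
        and memname = upcase("&ds.")
        and memtype = 'DATA'
        /* the name starts with the prefix (case-insensitive) ... */
        and upcase(substr(name, 1, &plen.)) = upcase("&old_prefix.")
        /* ... and everything after the prefix is a digit */
        and notdigit(trim(substr(name, &plen. + 1))) = 0
      /* numeric order, so that suffix 10 sorts after suffix 9 */
      order by input(substr(name, &plen. + 1), 8.);
  quit;
%mend suffix_list_sketch;

/* Toy data set with suffixed variables q1-q3 (made up for illustration) */
data work.toy;
  input id q1-q3;
  datalines;
1 5 4 3
2 2 1 5
;
run;

%suffix_list_sketch(lib=WORK, ds=toy, old_prefix=q, new_prefix=item, outvar=list_1);
%put NOTE: list_1 = &list_1.;

proc sql;
  create table work.toy_renamed as
    select id, &list_1.
    from toy;
quit;
```

The NOTDIGIT test is one way to get the behavior described above, where a variable named with only the prefix and no numeric suffix is ignored. Passing a different outvar value on each call plays the role of the iteration number, so earlier lists are not overwritten.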

MIMICKING THE DOUBLE DASH FOR A LIST OF SEQUENTIAL VARIABLES

MOTIVATION FOR DEVELOPING NEW CODE FOR PROC SQL

As discussed above, the single dash in SAS allows a programmer to easily specify that each variable in a list of suffixed variables should be processed within a particular statement or option. The double dash (--) in SAS allows a programmer to specify that every variable found between two variables should be included for processing. For example, in the BPS data set, the first three variables in the file are as given in Figure 3: student_id, sex, and ethnicity. If a programmer wanted to request that only these variables (i.e., every variable between the student ID and the self-reported race/ethnicity) be kept in a new data set, he or she could code as in Figure 13. In this figure, the data set BPS_DEMO contains only these three variables: student_id, sex, and ethnicity.

Fig. 13 Example of the Use of the Double Dash to Select Variables Into a New Data Set

    DATA BPS_DEMO;
        SET BPS (KEEP = student_id -- ethnicity);
    RUN;

Unfortunately, there is no shortcut in PROC SQL for such a variable list. Attempts to use the double dash will result in a SAS error. This can be inconvenient when a significant number of the columns of a table should be included in a SELECT clause, especially when two or more tables are being merged or concatenated. Here again is a situation where the DICTIONARY.COLUMNS table is useful. This table contains a column named varnum that holds the position number of the variable. For example, if the variable was the third variable created in the program data vector, then the value for varnum in the associated row of DICTIONARY.COLUMNS would be 3. Figure 14 shows an example of how this can be utilized to help create a macro variable string that holds the desired variable list. Figure 15 shows the data from DICTIONARY.COLUMNS that is related to the new table BPS_DEMO.

Fig. 14 Basic code for mimicking the double dash in PROC SQL using a macro variable to hold the variable list

    PROC SQL noprint;
        *Find the position number for the first variable name;
        SELECT varnum INTO :begin_var_num
        FROM DICTIONARY.COLUMNS
        WHERE LOWCASE(memname) = LOWCASE("BPS")
          AND libname = "WORK"
          AND LOWCASE(memtype) = "data"
          AND LOWCASE(name) = LOWCASE("student_id");

        *Find the position number for the second variable name;
        SELECT varnum INTO :end_var_num
        FROM DICTIONARY.COLUMNS
        WHERE LOWCASE(memname) = LOWCASE("BPS")
          AND libname = "WORK"
          AND LOWCASE(memtype) = "data"
          AND LOWCASE(name) = LOWCASE("ethnicity");

        *Now create a macro variable sequential_list to hold the SELECT clause we desire;
        SELECT CATT(memname, ".", name) INTO :sequential_list SEPARATED BY ", "
        FROM DICTIONARY.COLUMNS
        WHERE varnum >= &begin_var_num.
          AND varnum <= &end_var_num.
          AND LOWCASE(memname) = LOWCASE("BPS")
          AND libname = "WORK"
          AND LOWCASE(memtype) = "data";

        *Create data set BPS_DEMO with only those variables between - and including - student_id and ethnicity;
        CREATE TABLE BPS_DEMO AS
        SELECT &sequential_list.
        FROM BPS;
    QUIT;
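Readers without access to the Bar Passage Study data can try the same pattern on SASHELP.CLASS. The following is a minimal sketch, not the paper's code: the macro variable names (begin_num, end_num, range_list) and the output table WORK.CLASS_RANGE are assumptions, and the list holds plain column names rather than table-qualified ones.

```sas
proc sql noprint;
  /* Positions of the first and last variables of the desired range */
  select varnum into :begin_num
    from dictionary.columns
    where libname = 'SASHELP' and memname = 'CLASS' and memtype = 'DATA'
      and upcase(name) = 'SEX';

  select varnum into :end_num
    from dictionary.columns
    where libname = 'SASHELP' and memname = 'CLASS' and memtype = 'DATA'
      and upcase(name) = 'HEIGHT';

  /* Every column positioned between the two, in data set order */
  select name into :range_list separated by ', '
    from dictionary.columns
    where libname = 'SASHELP' and memname = 'CLASS' and memtype = 'DATA'
      and varnum between &begin_num. and &end_num.
    order by varnum;

  /* Intended to match KEEP = Sex -- Height in a DATA step */
  create table work.class_range as
    select &range_list.
    from sashelp.class;
quit;

%put NOTE: range_list = &range_list.;
```

The explicit ORDER BY varnum is a design choice in this sketch: it states the intended column order rather than relying on the order in which DICTIONARY.COLUMNS happens to return its rows.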

Fig. 15 Contents of Data Set BPS_DEMO, Created Using the Macro Variable Created in Fig. 14

EXPLANATION OF HOW TO ACCOMPLISH MIMICRY OF THE DOUBLE DASH

In the first query of Figure 14, the varnum value of the variable that begins the desired list is determined using DICTIONARY.COLUMNS and the WHERE clause; then, this value is saved into a macro variable called begin_var_num (a local macro variable when this code is run inside a macro program). In the second query, this is also done for the variable that concludes the desired list, and the value is saved into a macro variable called end_var_num. In the third query, each variable that is found to have a varnum value between these two values is identified using the WHERE clause and the macro variables begin_var_num and end_var_num. Each distinct name (column/variable name) that occurs between these two variables in the program data vector is saved into one global macro variable called sequential_list, preceded by its table name, with each entry separated by a comma and a space, as indicated in the SEPARATED BY clause. Because the global macro variable sequential_list was created with the syntax required by the SELECT clause, this variable can be referenced in a future SELECT clause to mimic the double dash that is used within the KEEP option of a DATA step, as seen in the final query. The macro variable created in Figure 14 resolves to:

    BPS.student_id, BPS.sex, BPS.ethnicity

As seen for the single dash in Figure 8, the mimicry of the variable list shortcut (--) can also be accomplished by using the KEEP option in the FROM clause. However, the method used in Figure 14 has one small advantage: when utilized in a dynamic fashion, multiple variable lists can be created to be used in PROC SQL, and the placement of the resulting macro variables can be used to rearrange columns. In contrast, the KEEP option will not rearrange any columns; it will maintain their relative positions in the new table.
An efficient way to take advantage of this additional benefit of rearranging columns is to create multiple variable lists quickly using a macro program. The next subsection describes such a macro, which is provided in Appendix II.

%MIMIC_DOUBLE_DASH: A MACRO TO MIMIC THE DOUBLE DASH IN PROC SQL

A macro for the double dash appears in the appendices of this paper. As is true in the macro for the single dash, this macro allows the user to specify an iteration number to prevent overwriting of the global macro variable sequential_list in subsequent invocations of the macro. Again, the macro variables produced can only be used in the current SAS session. As was done in the macro MIMIC_DASH, the scope of this variable is declared as global so it can be used during the current SAS session in open code. The key features of this macro are briefly summarized in Table 4.

Table 4: Features of the macro program MIMIC_DOUBLE_DASH in Appendix II

Macro invocation: %MIMIC_DOUBLE_DASH(iteration_num, lib_name, dataset_name, begin_var_name, end_var_name)

Feature of macro?   Feature
Yes                 Check that library and data set exist, variables exist, and the specified first variable in the list occurs before the specified last variable in the program data vector
Yes                 Reference table as well as column names
Yes                 Prevent rewrite of macro variable after subsequent invocations by allowing input of a suffix for the outputted macro variable
-                   Rename columns
-                   Create a suffixed variable list from a specified list
-                   Use the outputted global macro variable sequential_list_# in a new SAS session

Figure 16 shows an example of how to invoke this macro to produce the data set seen in Figure 15. The macro variable sequential_list_1 resolves to:

BPS.student_id, BPS.sex, BPS.ethnicity

Fig. 16 Example Invocation of Macro MIMIC_DOUBLE_DASH

%MIMIC_DOUBLE_DASH(iteration_num = 1,
                   lib_name = WORK,
                   dataset_name = BPS,
                   begin_var_name = student_id,
                   end_var_name = ethnicity)

PROC SQL;
   CREATE TABLE BPS_DEMO AS
   SELECT &sequential_list_1.
   FROM BPS;
QUIT;

INTEGRATING THE OUTPUT FROM THE MACROS

Because the user helps define the name of the output macro variable that contains the variable list, the two macros, MIMIC_DASH and MIMIC_DOUBLE_DASH, can be invoked repeatedly without the outputted macro variables being overwritten. This allows the user to mix and match the variable lists within a SELECT clause. Figure 17 shows how this can be done using the Bar Passage Study data set. In this code, the user wanted to rename the variables that contain responses to the basic survey item in Figure 3, using a brief description of what the item contained. The user also wanted to select just a few variables from the first 15 displayed in the file and rearrange the ordering of the columns relative to the original arrangement in the data set BPS. Figure 18 shows the contents of the new data set BPS_SUBSET, which can be compared to the original Bar Passage Study data described in Figures 3 and 4.

Fig. 17 Several Invocations of the Macros MIMIC_DASH and MIMIC_DOUBLE_DASH to Create a New Data Set with Renamed Suffixed Variables and Rearranged Columns

/* rename the questionnaire item codes to a vernacular phrase */
%MIMIC_DASH(iteration_num = 1,
            lib_name = WORK,
            dataset_name = BPS,
            old_prefix = ESQ_E4_,
            new_prefix = ESQ_HOW_APPEALING_WORK_)

%MIMIC_DASH(iteration_num = 2,
            lib_name = WORK,
            dataset_name = BPS,
            old_prefix = SFQ_E6_,
            new_prefix = SFQ_HOW_APPEALING_WORK_)

/* select the first three demographic variables only */
%MIMIC_DOUBLE_DASH(iteration_num = 1,
                   lib_name = WORK,
                   dataset_name = BPS,
                   begin_var_name = student_id,
                   end_var_name = ethnicity)

/* select the number of bar attempts and the outcomes and dates of the first and last
   bar exam attempts, and place them at the end of the table */
%MIMIC_DOUBLE_DASH(iteration_num = 2,
                   lib_name = WORK,
                   dataset_name = BPS,
                   begin_var_name = num_bar_attempts,
                   end_var_name = last_bar_mm)

PROC SQL NOPRINT;
   CREATE TABLE BPS_SUBSET AS
   SELECT &sequential_list_1.,
          &sequential_list_2.,
          &suffixed_list_1.,
          &suffixed_list_2.
   FROM BPS;
QUIT;

Fig. 18 Contents of the New Data Set BPS_SUBSET, Created by the Code in Figure 17
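One way to confirm the renaming and the new column order is sketched below. It is not part of the paper's code and assumes that the four macro invocations and the PROC SQL step of Figure 17 completed without error in the current session.

```sas
/* Write each resolved list to the log to see exactly what the SELECT clause received */
%PUT NOTE: sequential_list_1 = &sequential_list_1.;
%PUT NOTE: sequential_list_2 = &sequential_list_2.;
%PUT NOTE: suffixed_list_1   = &suffixed_list_1.;
%PUT NOTE: suffixed_list_2   = &suffixed_list_2.;

/* List the columns of BPS_SUBSET by position in the data set rather than alphabetically */
PROC CONTENTS DATA = WORK.BPS_SUBSET VARNUM;
RUN;
```

Changing the order in which the four macro variables appear in the SELECT clause changes the positions reported by PROC CONTENTS accordingly.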

CONCLUSIONS

The single and double dash operators are a very convenient way to quickly generate variable lists within SAS statements and options. Neither of these list generators is duplicated by similar syntax in the SELECT clause of the SQL procedure. However, by using PROC SQL, dictionary members, character functions, and macro variables in conjunction with each other, the functionality of the dash and double dash can be mimicked with a macro variable that holds the desired variable list. Implementing this method also introduces the programmer to many useful and efficient SAS tools, such as macros, dictionary members, and character functions.

While both the dash and double dash can be used in the KEEP option with the table specified in the FROM clause, this shortcut does not allow the columns to be rearranged when the CREATE TABLE clause generates the new table, nor does it allow variables to be renamed. The two macros provided in this paper, MIMIC_DASH and MIMIC_DOUBLE_DASH, help introduce these benefits and allow the user to take fuller advantage of the alias specification and column reordering that are normally available via the SELECT statement. Readers can explore this technique with the SASHELP.PRICEDATA data set available in PC versions of SAS.
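A starting point for that exploration might look like the sketch below, which repeats the approach of Figure 14 against SASHELP.PRICEDATA. The column names price1, price17, and date are assumptions about that sample data set (check them with PROC CONTENTS first), and the names price_list and PRICE_SUBSET are hypothetical.

```sas
PROC SQL NOPRINT;
   /* Position of the first variable in the range (price1 is an assumed column name) */
   SELECT varnum INTO :begin_var_num
   FROM DICTIONARY.COLUMNS
   WHERE libname = "SASHELP"
     AND memname = "PRICEDATA"
     AND LOWCASE(memtype) = "data"
     AND LOWCASE(name) = "price1";

   /* Position of the last variable in the range (price17 is an assumed column name) */
   SELECT varnum INTO :end_var_num
   FROM DICTIONARY.COLUMNS
   WHERE libname = "SASHELP"
     AND memname = "PRICEDATA"
     AND LOWCASE(memtype) = "data"
     AND LOWCASE(name) = "price17";

   /* Build the comma-separated, table-qualified list for the SELECT clause */
   SELECT CATX(".", memname, name) INTO :price_list SEPARATED BY ", "
   FROM DICTIONARY.COLUMNS
   WHERE libname = "SASHELP"
     AND memname = "PRICEDATA"
     AND LOWCASE(memtype) = "data"
     AND varnum >= &begin_var_num.
     AND varnum <= &end_var_num.;

   /* Place the range first and the date column last to show the reordering;
      the alias matches the qualifier stored in price_list */
   CREATE TABLE PRICE_SUBSET AS
   SELECT &price_list., PRICEDATA.date
   FROM SASHELP.PRICEDATA AS PRICEDATA;
QUIT;
```

The explicit table alias is used so that the PRICEDATA qualifier in the generated list resolves even though the table is referenced with a two-level name.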

APPENDIX I: MIMIC_DASH MACRO

/* **********************************************************************************************************
MACRO: MIMIC_DASH

GOAL
   The goal of this program is to enable the programmer to easily create a variable list of
   suffixed variable names in PROC SQL's SELECT clause. This will mimic the dash shortcut that is
   commonly used when working in native SAS code (DATA steps and procedures).

PURPOSE
   Create a global macro variable to hold proper SELECT syntax for a variable list with suffixed
   variable names. Ultimately allow the programmer to produce a variable list of suffixed variable
   names in PROC SQL's SELECT clause without hand-typing each.

INPUT PARAMETERS
   iteration_num   A positive integer that will be used to name a global macro variable that
                   accomplishes the PURPOSE of this macro
   lib_name        The name of the library that contains the data set
   dataset_name    The name of the data set that contains the suffixed variables
   old_prefix      The current prefix of the suffixed variable names
   new_prefix      The prefix that will be used to create aliases. This can be the same as the
                   old_prefix.

OUTPUT
   A global macro variable called &suffixed_list_&iteration_num. that allows the programmer to
   achieve the GOAL
   An intermediate data set called &dataset_name._columns that holds the names of the columns of
   &dataset_name. that held the suffixed variable names. This data set is deleted upon completion
   of the macro execution.

MISC. IMPORTANT NOTES
   The macro variable that is created can only be used in the current SAS session.
********************************************************************************************************** */

%MACRO MIMIC_DASH(iteration_num, lib_name, dataset_name, old_prefix, new_prefix);

   OPTIONS SYMBOLGEN SPOOL MPRINT MLOGIC MERROR VALIDVARNAME = V7;

   /* create lowercase macro variables from the inputted parameters, to ease comparison to DICTIONARY.COLUMNS */
   %LET LC_lib_name = %LOWCASE(&lib_name.);
   %LET LC_dataset_name = %LOWCASE(&dataset_name.);
   %LET LC_OLD_PREFIX = %LOWCASE(&old_prefix.);

   /* find length of the old prefix */
   %LET LEN_OLD_PREFIX = %LENGTH(&old_prefix.);

   /* extract the prefix from the dictionary.columns entry */
   %LET EXTRACT_PREFIX = SUBSTR(LOWCASE(name), 1, &len_old_prefix.);
   %LET EXTRACT_SUFFIX = INPUT(SUBSTR(name, 1 + &len_old_prefix., LENGTH(name) - &len_old_prefix.), 3.);

   /* check that the variable and data set exist before trying to process */
   %LOCAL ERROR_COUNT;
   %LET ERROR_COUNT = 0;

   /* check that the new_prefix entered is a valid SAS variable name */
   %IF %SYSFUNC(NVALID(&new_prefix.)) = 0 %THEN %DO;
      %put ERROR: (&SYSMACRONAME.) The user-specified new prefix %upcase(&new_prefix.) is not a valid SAS variable name;
      %put ERROR: (&SYSMACRONAME.) Terminating.;
      %LET ERROR_COUNT = %EVAL(&error_count. + 1);
   %END;
   %ELSE %DO;
      %LET ERROR_COUNT = &error_count.;
   %END;

   /* check that the iteration number is an integer digit */
   %IF %DATATYP(&iteration_num.) ^= NUMERIC %THEN %DO;
      %put ERROR: (&SYSMACRONAME.) Iteration number entered by user is not a number.;
      %put ERROR: (&SYSMACRONAME.) Terminating;
      %LET ERROR_COUNT = 1;
   %END;
   %ELSE %DO;
      %IF %SYSFUNC(MOD(&iteration_num.,%SYSFUNC(FLOOR(&iteration_num.)))) ^= 0 OR &iteration_num. < 0 %THEN %DO;
         %put ERROR: (&SYSMACRONAME.) Iteration number entered is not an integer;
         %put ERROR: (&SYSMACRONAME.) Terminating.;
         %LET ERROR_COUNT = %EVAL(&error_count. + 1);
      %END;
      %ELSE %DO;
         %LET ERROR_COUNT = &error_count.;
      %END;
   %END;

   /* check that the library exists */
   %IF %SYSFUNC(LIBREF(&lib_name.)) ^= 0 %THEN %DO;
      %put ERROR: (&SYSMACRONAME.) Library %upcase(&lib_name.) does not exist.;
      %put ERROR: (&SYSMACRONAME.) TERMINATING.;
      %LET ERROR_COUNT = %EVAL(&error_count. + 1);
   %END;
   %ELSE %DO;
      %LET ERROR_COUNT = &error_count.;
   %END;

   /* check that the data set exists and also that it is contained within the library */
   %IF %SYSFUNC(EXIST(&lib_name..&dataset_name.)) ^= 1 %THEN %DO;
      %put ERROR: (&SYSMACRONAME.) Data set %upcase(&lib_name..&dataset_name.) does not exist.;
      %put ERROR: (&SYSMACRONAME.) TERMINATING.;
      %LET ERROR_COUNT = %EVAL(&error_count. + 1);
   %END;

   %ELSE %DO;
      %LET ERROR_COUNT = &error_count.;
   %END;

   /* check that the old prefix is actually used in a variable name */
   PROC SQL NOPRINT;
      SELECT N(name) INTO :NUMBER_VARS_FOUND
      FROM DICTIONARY.COLUMNS
      WHERE &extract_prefix. = "&LC_old_prefix."
        AND LOWCASE(memname) = "&LC_dataset_name."
        AND LOWCASE(libname) = "&LC_lib_name."
        AND LOWCASE(memtype) = "data";
   QUIT;

   %IF &number_vars_found. = 0 %THEN %DO;
      %put ERROR: (&SYSMACRONAME.) Variable with prefix %upcase(&old_prefix.) not found;
      %put ERROR: (&SYSMACRONAME.) Terminating.;
      %LET ERROR_COUNT = %EVAL(&error_count. + 1);
   %END;
   %ELSE %DO;
      %LET ERROR_COUNT = &error_count.;
   %END;

   /* dynamically determine the start and stop point of the suffixed variable list */
   %put NOTE: Dynamically determining the variables with prefix %upcase(&old_prefix.) from dataset %upcase(&dataset_name.);

   /* find the beginning of the suffixed variables, and the end */
   PROC SQL NOPRINT;
      /* For now, we create this new table -- from DICTIONARY.COLUMNS -- to find the beginning and ending suffix values */
      /* Later, we will use this new table again to generate the variable list */
      CREATE TABLE &dataset_name._columns AS
      SELECT *, &extract_suffix. AS suffix
      FROM DICTIONARY.COLUMNS
      WHERE LENGTH(name) NE %LENGTH(&lc_old_prefix.)
        AND &extract_prefix. = "&LC_old_prefix."
        AND LOWCASE(memname) = "&LC_dataset_name."
        AND LOWCASE(libname) = "&LC_lib_name."
        AND LOWCASE(memtype) = "data";

      /* the beginning */
      SELECT MIN(SUFFIX) INTO :BEGIN_SUFFIX
      FROM &dataset_name._columns
      WHERE SUFFIX NE .;

      /* the end */
      SELECT MAX(SUFFIX) INTO :END_SUFFIX
      FROM &dataset_name._columns
      WHERE SUFFIX NE .;
   QUIT;

   %put NOTE: BEGIN_SUFFIX dynamically determined as : &begin_suffix.;
   %put NOTE: END_SUFFIX dynamically determined as : &end_suffix.;

   /* check that the suffixes are numeric */
   %IF %DATATYP(&begin_suffix.) ^= NUMERIC OR %DATATYP(&end_suffix.) ^= NUMERIC %THEN %DO;
      %put ERROR: (&SYSMACRONAME.) Suffixes are NOT numeric;
      %put Terminating.;
      %LET ERROR_COUNT = %EVAL(&error_count. + 1);
   %END;
   %ELSE %DO;
      %LET ERROR_COUNT = &error_count.;
   %END;

   /* Be sure that the interval increases by 1 */
   %DO LOOP_SUFFIX = &begin_suffix. %TO &end_suffix.;
      %LET OPEN_&dataset_name. = %SYSFUNC(OPEN(&lib_name..&dataset_name.,i));
      %IF %SYSFUNC(VARNUM(&&OPEN_&dataset_name.., &old_prefix.&loop_suffix.)) = 0 %THEN %DO;
         %put ERROR: (&SYSMACRONAME.) There is not a contiguous range of variables between %upcase(&old_prefix.%left(&begin_suffix.)) and %upcase(&old_prefix.%left(&end_suffix.));
         %put ERROR: (&SYSMACRONAME.) Terminating.;
         %LET ERROR_COUNT = %EVAL(&error_count. + 1);
      %END;
      %ELSE %DO;
         %LET ERROR_COUNT = &error_count.;
      %END;
      %LET CLOSE_&dataset_name. = %SYSFUNC(CLOSE(&&OPEN_&dataset_name..));
   %END;

   %IF &error_count. ^= 0 %THEN %DO;
      %put ERROR: (&SYSMACRONAME.) Errors encountered. See log for details.;
      %put ERROR: (&SYSMACRONAME.) Terminating.;
   %END;
   %ELSE %DO;
      /* now select the variables in PROC SQL (rename if the user requested it that way) */
      /* and put into global macro variable suffixed_list_# */
      %GLOBAL suffixed_list_&iteration_num.;

      PROC SQL NOPRINT;
         SELECT CAT("&dataset_name..", "&old_prefix.", SUFFIX, " AS ", "&new_prefix.", SUFFIX)
            INTO :suffixed_list_&iteration_num. SEPARATED BY ", "
         FROM &dataset_name._columns
         WHERE &extract_prefix. = "&LC_old_prefix."
           AND SUFFIX NE .
         ORDER BY SUFFIX;
      QUIT;

      %put NOTE: Here is a global macro variable that you can use in the current SAS session;
      %put NOTE: within the PROC SQL SELECT clause to select suffixed variables;
      %put NOTE: that are renamed as you specified in the macro invocation;
      %put NOTE: MACRO VARIABLE suffixed_list_&iteration_num.: &&suffixed_list_&iteration_num.;
   %END;

   /* erase the data set we created while we performed these tasks: dataset_name._columns */
   PROC DATASETS LIBRARY = WORK MEMTYPE = DATA NOLIST;
      DELETE &dataset_name._columns;
   RUN;
   QUIT;

%MEND MIMIC_DASH;

APPENDIX II: MIMIC_DOUBLE_DASH MACRO

/* **********************************************************************************************************
MACRO: MIMIC_DOUBLE_DASH

GOAL
   The goal of this program is to enable the programmer to easily create a list of variable names
   that would be created by the double dash (--) in SAS DATA steps and procedures, in a PROC SQL
   SELECT clause that precludes the use of this shortcut.

PURPOSE
   Create a global macro variable to hold proper SELECT syntax for a variable list. Ultimately
   allow the programmer to produce a variable list in PROC SQL's SELECT clause without hand-typing
   each variable name.

INPUT PARAMETERS
   iteration_num   A positive integer that will be used to name a global macro variable that
                   accomplishes the PURPOSE of this macro
   lib_name        The name of the library that contains the data set
   dataset_name    The name of the data set that contains the variables
   begin_var_name  The first name in the variable list that would be produced by --.
   end_var_name    The last name in the variable list that would be produced by --.

OUTPUT
   A global macro variable called &sequential_list_&iteration_num. that allows the programmer to
   achieve the GOAL

MISC. IMPORTANT NOTES
   The macro variable that is created can only be used in the current SAS session.
********************************************************************************************************** */

%MACRO MIMIC_DOUBLE_DASH(iteration_num, lib_name, dataset_name, begin_var_name, end_var_name);

   OPTIONS SYMBOLGEN MPRINT MLOGIC;

   /* create lowercase macro variables from the inputted parameters, to ease comparison to DICTIONARY.COLUMNS */
   %LET LC_lib_name = %LOWCASE(&lib_name.);
   %LET LC_dataset_name = %LOWCASE(&dataset_name.);
   %LET LC_begin_var_name = %LOWCASE(&begin_var_name.);
   %LET LC_end_var_name = %LOWCASE(&end_var_name.);

   /* check that the iteration number is an integer digit */
   %LOCAL ERROR_COUNT;
   %LET ERROR_COUNT = 0;

   %IF %DATATYP(&iteration_num.) ^= NUMERIC %THEN %DO;
      %put ERROR: (&SYSMACRONAME.) Iteration number entered by user is not a number.;
      %put ERROR: (&SYSMACRONAME.) Terminating;
      %LET ERROR_COUNT = 1;
   %END;
   %ELSE %DO;
      %IF %SYSFUNC(MOD(&iteration_num.,%SYSFUNC(FLOOR(&iteration_num.)))) ^= 0 OR &iteration_num. < 0 %THEN %DO;
         %put ERROR: (&SYSMACRONAME.) Iteration number entered is not an integer;
         %put ERROR: (&SYSMACRONAME.) Terminating.;
         %LET ERROR_COUNT = %EVAL(&error_count. + 1);
      %END;
      %ELSE %DO;
         %LET ERROR_COUNT = &error_count.;
      %END;
   %END;

   /* check that the library exists */
   %IF %SYSFUNC(LIBREF(&lib_name.)) ^= 0 %THEN %DO;
      %put ERROR: (&SYSMACRONAME.) Library %upcase(&lib_name.) does not exist.;
      %put ERROR: (&SYSMACRONAME.) Terminating.;
      %LET ERROR_COUNT = %EVAL(&error_count. + 1);
   %END;

   %ELSE %DO;
      %LET ERROR_COUNT = &error_count.;
   %END;

   /* check that the data set exists */
   %IF %SYSFUNC(EXIST(&lib_name..&dataset_name.)) ^= 1 %THEN %DO;
      %put ERROR: (&SYSMACRONAME.) Data set %upcase(&lib_name..&dataset_name.) does not exist.;
      %put ERROR: (&SYSMACRONAME.) Terminating.;
      %LET ERROR_COUNT = %EVAL(&error_count. + 1);
   %END;
   %ELSE %DO;
      %LET ERROR_COUNT = &error_count.;
   %END;

   /* check that the two variables exist */
   %LET OPEN_&dataset_name. = %SYSFUNC(OPEN(&lib_name..&dataset_name.,i));
   %IF %SYSFUNC(VARNUM(&&OPEN_&dataset_name.., &begin_var_name.)) = 0
       OR %SYSFUNC(VARNUM(&&OPEN_&dataset_name.., &end_var_name.)) = 0 %THEN %DO;
      %put ERROR: (&SYSMACRONAME.) Variable %upcase(&begin_var_name.) or %upcase(&end_var_name.) does not exist in data set %upcase(&dataset_name.).;
      %put ERROR: (&SYSMACRONAME.) Terminating.;
      %LET ERROR_COUNT = %EVAL(&error_count. + 1);
   %END;
   %ELSE %DO;
      %LET ERROR_COUNT = &error_count.;
   %END;

   /* now check that the beginning variable is before the ending variable in the data set */
   PROC SQL NOPRINT;
      /* find the position number for the first variable name */
      SELECT VARNUM INTO :BEGIN_VAR_NUM
      FROM DICTIONARY.COLUMNS
      WHERE LOWCASE(memname) = "&LC_dataset_name."
        AND LOWCASE(name) = "&LC_begin_var_name."
        AND LOWCASE(libname) = "&LC_lib_name."
        AND LOWCASE(memtype) = "data";

      /* find the position number for the second variable name */
      SELECT VARNUM INTO :END_VAR_NUM
      FROM DICTIONARY.COLUMNS
      WHERE LOWCASE(memname) = "&LC_dataset_name."
        AND LOWCASE(name) = "&LC_end_var_name."
        AND LOWCASE(libname) = "&LC_lib_name."
        AND LOWCASE(memtype) = "data";
   QUIT;

   /* if the first variable name entered actually was created AFTER the second... */
   /* then terminate */
   /* otherwise, continue onward to select the variables indicated */

   %IF &begin_var_num. > &end_var_num. %THEN %DO;
      %put ERROR: (&SYSMACRONAME.) User-supplied first variable occurs after second variable name in dataset %upcase(&dataset_name.);
      %put ERROR: (&SYSMACRONAME.) Terminating.;
      %LET ERROR_COUNT = %EVAL(&error_count. + 1);
   %END;
   %ELSE %DO;
      %LET ERROR_COUNT = &error_count.;
   %END;

   %LET CLOSE_&dataset_name. = %SYSFUNC(CLOSE(&&OPEN_&dataset_name..));

   %IF &error_count. ^= 0 %THEN %DO;
      %put ERROR: (&SYSMACRONAME.) See log for details.;
      %put ERROR: (&SYSMACRONAME.) Terminating.;
   %END;
   %ELSE %DO;
      /* declare the string that will hold the variable list as a global variable */
      %GLOBAL sequential_list_&iteration_num.;

      PROC SQL NOPRINT;
         SELECT CATX(".", memname, name)
            INTO :sequential_list_&iteration_num. SEPARATED BY ", "
         FROM DICTIONARY.COLUMNS
         WHERE VARNUM >= &begin_var_num.
           AND VARNUM <= &end_var_num.
           AND LOWCASE(memname) = "&LC_dataset_name."
           AND LOWCASE(libname) = "&LC_lib_name."
           AND LOWCASE(memtype) = "data";
      QUIT;

      %put NOTE: Here is your list of variable names to use in the PROC SQL SELECT clause;
      %put NOTE: This is a global macro variable you can use in the current SAS session;
      %put NOTE: MACRO VARIABLE sequential_list_&iteration_num.: &&sequential_list_&iteration_num..;
   %END;

%MEND MIMIC_DOUBLE_DASH;


Chapter 1. Introduction. 1.1 More about SQL More about This Book 5 Chapter 1 Introduction 1.1 More about SQL 2 1.2 More about This Book 5 SAS defines Structured Query Language (SQL) as a standardized, widely used language that retrieves data from and updates data in tables

More information

Retrieving Data Using the SQL SELECT Statement. Copyright 2004, Oracle. All rights reserved.

Retrieving Data Using the SQL SELECT Statement. Copyright 2004, Oracle. All rights reserved. Retrieving Data Using the SQL SELECT Statement Objectives After completing this lesson, you should be able to do the following: List the capabilities of SQL SELECT statements Execute a basic SELECT statement

More information

WHAT ARE SASHELP VIEWS?

WHAT ARE SASHELP VIEWS? Paper PN13 There and Back Again: Navigating between a SASHELP View and the Real World Anita Rocha, Center for Studies in Demography and Ecology University of Washington, Seattle, WA ABSTRACT A real strength

More information

DBLOAD Procedure Reference

DBLOAD Procedure Reference 131 CHAPTER 10 DBLOAD Procedure Reference Introduction 131 Naming Limits in the DBLOAD Procedure 131 Case Sensitivity in the DBLOAD Procedure 132 DBLOAD Procedure 132 133 PROC DBLOAD Statement Options

More information

SAS 9.3 LIBNAME Engine for DataFlux Federation Server

SAS 9.3 LIBNAME Engine for DataFlux Federation Server SAS 9.3 LIBNAME Engine for DataFlux Federation Server User s Guide SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2012. SAS 9.3 LIBNAME Engine for

More information

Chapter 2 The SAS Environment

Chapter 2 The SAS Environment Chapter 2 The SAS Environment Abstract In this chapter, we begin to become familiar with the basic SAS working environment. We introduce the basic 3-screen layout, how to navigate the SAS Explorer window,

More information
