Paper TT17 An Animated Guide : Speed Merges with Key Merging and the _IORC_ Variable Russ Lavery Contractor for Numeric resources, Inc.

ABSTRACT

The key merge (a.k.a. the _IORC_ merge) is an efficiency technique. It merges two files without a slow, disk-space-consuming pre-sort of the files. Because this merge does not require pre-sorting of the files to be merged, it can be faster and use less disk space than a by merge. The _IORC_ merge is considered one of the "Table Lookup Techniques" and, as such, competes with by merges, formats, if-else if blocks, key-indexing, bitmapping, hashing and SQL joins. It is a useful technique for SAS programmers because it is fairly fast (faster than a by merge) and easy to understand. Additionally, _IORC_ merging is part of the material in the SAS certification exam.

This paper accompanies an animated presentation at NESUG and cannot duplicate the animated effect. It outlines the features of the _IORC_ merge as they were presented. More material can be found in the excellent articles in the online SUGI proceedings that are listed in the reference section of this paper.

INTRODUCTION

This paper explains the details of the _IORC_ merge and how the Program Data Vector (PDV) is modified as the _IORC_ merge executes. An understanding of the PDV is key to understanding this technique.

PROGRAM DATA VECTOR (PDV) FACTS

The PDV can be thought of as a data storage area. It functions much like a one-line Excel spreadsheet. The PDV has a column for every variable you read in from the data set, every variable you create in the data step, and some automatic variables (_n_, _ERROR_ and _IORC_). When SAS processes a set statement in a data step, it copies your data, ONE LINE AT A TIME, into the program data vector. All calculations in your data step are performed in the program data vector, and the results of your calculations are stored in the PDV.
When all the statements in the data step have executed, the values in the PDV are written to the output file. If data comes into your PDV from a SAS file (as opposed to cards or a text file), it is automatically retained until that data set is accessed again with a set statement. If a data step accesses two SAS data sets, variables from both data sets are retained in the PDV until the set statement associated with each data set next executes.

THE BUSINESS PROBLEM

Imagine you are working at a college and you send your assistant to the gym on the first day of class. He interviews people waiting in line to get their gym lockers and asks them if they are joggers. This information is recorded in a file called Day_. At the end of term your assistant goes to the health office in the gym and looks at the records of people who visited the health office complaining of either shin splints (a runner's problem) or tennis elbow (tennis elbow information is not required, but your assistant got carried away). This information is put in a file called UpDt. Note that the names in the two files do not match well. Also note that shin splints are coded S (shin splints) and N (no), and tennis elbow is coded T (tennis elbow) and N (no) (see Figure 1).

Our merging goal is a file that contains all the people we interviewed on day 1, together with the information about their health problems that we collected at the end of term. The source data sets, code, PDV and output file are shown in Figures 1, 2, 3 and 4. These figures are also shown, in a larger size, in the appendix.

THE SAS CODE

The files we want to merge are of different sizes. The first step in an _IORC_ merge is to index the larger file. As you can see in Figure 1, the data step where the merge takes place has two set statements. Count from the top and think of them as set1 and set2. Set1 executes first. The data set in set2 must have the key= option and an index. The larger file should be indexed and put in set2, the set statement that has the key= option.

The positions of the small and large data sets can be reversed (with the smaller file in set2) and the technique will still work. However, the job will run faster if the larger file is indexed and used in set2 (the set statement with the key= option). You might code the indexing as:

Proc datasets lib=work;
  modify UpDt;
  index create name/unique;
quit;

[Figure 1: the source data sets Day_ (the smaller file, used in the first set) and UpDt (the larger file, indexed and used in the set with the key= option), the PDV (Name, Run, Sh_Sp, T_Elb, _N_, _ERROR_, _IORC_), the output file, and the data step code below, with circles (1), (2) and (3) marking the two set statements and the output.]

data new3;
  set Day_;
  set UpDt key=name/unique;
  array setmiss(*) $ Sh_Sp--T_Elb;
  if _iorc_ then
  do;
    _error_=0;
    _iorc_=0;
    do i=1 to dim(setmiss);
      setmiss(i)="";
    end;
  end;
run;

Figure 1 shows the first observation being processed. The statement data new3 creates the PDV and sets user variables to missing. At the top of the data step several things happen automatically. SAS sets _n_ = 1 and sets _error_ and _IORC_ to zero (_IORC_ is set to zero at the top of the data step ONLY for the first observation). The data step executes statements from the top down and first executes the following statement (circle (1) in Figure 1):

set day_;

The above statement reads only the variables from the file Day_ into the PDV. At this time the PDV contains values only for Name and Run (Y). Sh_Sp and T_Elb contain missing values. Next SAS executes the statement (2):

set UpDt key=name/unique;

This second set statement performs an indexed table lookup inside UpDt. Since the option is key=name, SAS looks in the PDV and gets the current value of name. It then uses that value to perform an indexed lookup in the file UpDt. When (and if) SAS finds an observation with that name, the second set statement executes. The attempt to perform an indexed lookup and the copying of the information to the PDV are separate steps.
When the set executes, it copies the values of the variables in UpDt from the data set into the PDV. Since there was a successful table lookup, _error_ and _IORC_ stay at zero. Since _IORC_ is zero, the x-ed out box of code (Figure 1) does not execute for this observation. When SAS reaches the bottom of the data step, it outputs the observation to the output file (circle (3) in Figure 1). The output file contains variables from both files. Figure 2 shows part of the processing of a "no-match" observation.

[Figure 2: the same data sets and data step code as Figure 1, with the PDV shown after the first set statement has read the second observation (Russ) from Day_.]

When control passes to the top of the data step, two automatic variables are modified. First, _n_ is incremented by 1. Second, while it is not easy to see here, _error_ is set to zero. The value of _IORC_ is not automatically modified at the top of the data step. SAS processes the first set statement, the line marked with a (1) in Figure 2:

set day_;

This line reads the data from observation 2 of the file Day_ into the PDV. After it executes, the PDV contains the values "Russ" and "N" from the second observation in Day_. However, because data from SAS data sets is automatically retained, the PDV still contains the information that came from the previous person's record in the file UpDt. Data step processing continues as shown below.

Figure 3 shows the execution of the second set statement, the line marked with a (2) in Figure 3:

set UpDt key=name/unique;

SAS looks in the PDV and gets the current value of name (Russ). It then attempts an indexed lookup in the file UpDt for an observation with name=Russ. When it fails to find such an observation, the set does not execute. SAS sets _error_ to 1 and _IORC_ to a non-zero number. Dangerously, it does not reset the values of Sh_Sp and T_Elb to missing. The values of these variables have been retained from the previous observation, are not correct for Russ, and must be corrected manually. Since the value of _IORC_ is not zero, the box of code (circle (3) in Figure 3) executes.

[Figure 3: the same data sets and data step code as Figure 1, with the PDV shown after the failed indexed lookup for Russ. The Sh_Sp and T_Elb values are retained from the previous observation.]

Figure 4 shows the effects of executing the box of code (circle (3) in Figure 4). The variable _error_ is set to zero (circle (1) in Figure 4) to suppress the printing of the error message in the log. The variable _IORC_ is set to zero (circle (2) in Figure 4) just to be tidy. This resetting of _IORC_ is not required for correct execution of the merge. We use array logic (circle (3) in Figure 4) to set all the variables that came from the indexed data set (UpDt) to missing. Note that the reset-to-missing logic would have to be a bit more complex if we had brought in a mixture of numeric and character variables from UpDt (a SAS array must be all numeric or all character). After all the code in the box has finished executing, the observation is copied to the output data set.

[Figure 4: the same data sets and data step code as Figure 1, with the PDV shown after the box of code has reset _error_, _IORC_, Sh_Sp and T_Elb for Russ.]
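The walkthrough in Figures 1 to 4 can be collected into one small program. The following is a minimal sketch rather than the code from the slides: the rows in Day_ and UpDt are made up for illustration, and CALL MISSING is swapped in for the paper's array loop because it accepts a mix of character and numeric variables.

```sas
/* Toy stand-ins for the two files; the rows are made up for illustration. */
data Day_;
  length name $ 8 Run $ 1;
  input name $ Run $;
  datalines;
Sue N
Russ Y
Fred Y
;
run;

data UpDt;
  length name $ 8 Sh_Sp $ 1 T_Elb $ 1;
  input name $ Sh_Sp $ T_Elb $;
  datalines;
Eric S N
Sue N N
Fred S N
Walt N T
;
run;

/* Index the lookup file on the key variable. */
proc datasets lib=work nolist;
  modify UpDt;
  index create name / unique;
quit;

data new3;
  set Day_;                       /* set1: read the next observation from Day_   */
  set UpDt key=name / unique;     /* set2: indexed lookup on the current name    */
  if _iorc_ ne 0 then do;         /* lookup failed: name is not in UpDt          */
    _error_ = 0;                  /* keep the PDV dump out of the log            */
    call missing(Sh_Sp, T_Elb);   /* clear values retained from an earlier match */
  end;
run;

proc print data=new3;
run;
```

If the CALL MISSING line is commented out, the Russ row should show Sue's Sh_Sp and T_Elb values, which is the retained-value trap described above.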

The code in Figures 1 through 4 created a data set that has all the observations from Day_, regardless of the success of the matching attempt. If the match on an observation was successful, the output data set has variables from both input data sets. If the matching attempt was not successful, the observation has missing values for the variables from UpDt. This output structure is often what a client wants.

SELECTING OBSERVATIONS IN BOTH FILES

The code would be slightly different if your goal were to create a data set that contains just the people who are in both files. That code is below. The pictures above can be used to examine the details of the logic.

*Index larger file;
proc datasets lib=work;
  modify UpDt;
  index create name/unique;
quit;

data new2;
  set day_;
  set UpDt key=name/unique;
  if _iorc_ NE 0 then
  do;
    _error_=0;
    delete;
  end;
run;

This code checks for a "failed index lookup" by checking _IORC_. If _IORC_ is not zero, the code resets _error_ to zero and deletes the observation.

USING AN _IORC_ MERGE WITH A COMPOUND INDEX

Proc SQL;
  create table Small (name char(5), sex char(1));
  insert into Small
    values("pat","m")
    values("pat","f")
    values("sam","f")
    values("russ","m");
quit;

Proc SQL;
  create table ForCmpIndx (name char(5), sex char(1), age num);
  insert into ForCmpIndx
    values("pat","m", 0)
    values("pat","f", 4)
    values("sam","f", 9);
quit;

proc sql;
  create index NmSex on ForCmpIndx(name, sex);
quit;

data Cmpnd;
  set Small;
  set ForCmpIndx key=nmsex/unique;
  if _iorc_ NE 0 then age=.;
run;

The process is the same as before, but with a compound index. The output file is:

Obs  name  sex  age
1    Pat   M    0
2    Pat   F    4
3    Sam   F    9
4    Russ  M    .

Figure 5

As Figure 5 shows, a compound index is quite easy to use in an _IORC_ merge. Note that failures to find must still be fixed, as is shown for subject Russ.
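The test `_iorc_ NE 0` treats every non-zero return code as "not found". As a hedged variation on the compound-index lookup, the sketch below uses the %SYSRC autocall macro, which translates the mnemonics _SOK (success) and _DSENOM (no matching observation) into _IORC_ values, so that other I/O problems are reported separately. The ages are placeholders, and building the index with PROC DATASETS instead of PROC SQL is a choice made here, not the paper's.

```sas
/* Toy data; the ages are placeholders, not values from the paper. */
data Small;
  length name $ 5 sex $ 1;
  input name $ sex $;
  datalines;
pat m
pat f
sam f
russ m
;
run;

data ForCmpIndx;
  length name $ 5 sex $ 1;
  input name $ sex $ age;
  datalines;
pat m 30
pat f 40
sam f 50
;
run;

/* Composite index NmSex on the pair (name, sex). */
proc datasets lib=work nolist;
  modify ForCmpIndx;
  index create NmSex=(name sex);
quit;

data Cmpnd;
  set Small;
  set ForCmpIndx key=NmSex / unique;
  select (_iorc_);
    when (%sysrc(_sok)) ;          /* match found: age was read from ForCmpIndx */
    when (%sysrc(_dsenom)) do;     /* no observation with this name/sex pair    */
      _error_ = 0;
      age = .;
    end;
    otherwise do;                  /* any other I/O problem: report and stop    */
      put "ERROR: unexpected _IORC_ value " _iorc_=;
      stop;
    end;
  end;
run;
```

The OTHERWISE branch is a design choice: it stops the step on an unexpected return code instead of silently treating it as a failed lookup.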

USING AN _IORC_ MERGE TO SELECT/UPDATE A VARIABLE IN A SPECIFIED ORDER

Imagine a business situation where you have files containing customer-reported address changes for this year (2008) as well as for 2007, 2006 and 2005. This might occur in a non-profit organization where the files are records of contributions. Management wants, for a select group of people, the most recent address. This can be done with an _IORC_ merge. The client wants the most current address for these people:

Proc SQL;
  create table GetThese (name char(9));
  insert into GetThese
    Values("")
    Values("Chee")
    Values("Lahong")
    Values("Murali")
    Values("Yanmei")
    Values("Russ");
quit;

[Figure 6: the GetThese SQL code beside the four address files, shown before and after blank lines and lines with blank addresses are removed. The current-address file holds a single address (AptCB). The 2007 file holds Apt7B, Lahong Apt7L and Murali Apt7M. The 2006 file holds Apt6B, Chee Apt6C and Murali Apt6M. The 2005 file holds Apt5B, Chee Apt5C, Lahong Apt5L, Murali Apt5M and Yanmei Apt5Y. GetThese and the current-address file are sorted; the 2007, 2006 and 2005 files are indexed. Russ has no address at all.]

Figure 6 shows the data files in a few ways. The SQL code shows the people for whom we want addresses. The yellow boxes show information, by year, in a layout that makes it easy to see who is present in any particular year. The data files should be sorted, or indexed, before being fed into the _IORC_ merge.
[Figure 6 (code and PDV view): the four address files, the sorted list of names to look up, the PDV (Name, CAddr, Addr07, Addr06, Addr05, _ERROR_, _IORC_) after Chee has been processed, and the data step code below.]

Data MostCurrent;
  merge getthese CurrAddr;
  by name;
  if CAddr in ("", " ") then
  do; /*search 2007 file*/
    set Addr07 key=name /unique;
    CAddr=Addr07;
    if _iorc_ NE 0 then
    do; /*Search 2006 file*/
      _error_=0;
      set Addr06 key=name /unique;
      CAddr=Addr06;
      if _iorc_ NE 0 then
      do; /*Search 2005 file*/
        _error_=0;
        set Addr05 key=name /unique;
        CAddr=Addr05;
        if _iorc_ NE 0 then
        do; /*name ~found*/
          _error_=0;
          CAddr="nomatch";
        end;
      end; /*end of 2005 do*/
    end; /*end of 2006 do*/
  end; /*end of 2007 do*/
run;

Figure 6 shows the SAS supervisor after processing Chee's data. Data read from a SAS data set is automatically retained, so the PDV started out holding the previous person's data. Chee's name was read from the file GetThese, leaving the previous person's Caddr in the PDV. SAS tried to do a by merge to get information on Chee from the CAddr (current year address) file and failed. Caddr, in the PDV, was set to missing. SAS used an _IORC_ lookup to try to read Chee's address from Addr07 and failed, causing the variables _error_ and _IORC_ to become non-zero. Moronically, the data step then copied Chee's missing value for Addr07 into Caddr. This data step could have been coded to eliminate this, but the resulting code might not run faster and would not fit on a PowerPoint slide.

Since the value in _IORC_ is not zero, processing continues. _error_ is assigned a value of 0 for two reasons. First, it will not automatically be reset to zero if there is a successful _IORC_ lookup. Second, if _error_ is not zero when control reaches the bottom of the data step, SAS will write a note to the log. If there are thousands of no-finds, the log will become unwieldy.

SAS then uses another _IORC_ lookup to read Addr06 and finds information on Chee in that file. _IORC_ is set to zero because of the successful read. This automatic reset to zero is in contrast to how SAS treats the variable _error_. A value is read into Addr06 in the PDV and then assigned to Caddr. Since the value of _IORC_ is zero, the X-ed out code does not execute.
[Figure 7: the same code and source files as Figure 6, with the PDV after Chee has been processed and the final output file. The slide is annotated "Note the automatic retains." The rows of the final file that can be read are:]

Name    CAddr  Addr07  Addr06  Addr05
Chee    Apt6C          Apt6C
Lahong  Apt7L  Apt7L   Apt6C
Murali  Apt7M  Apt7M   Apt6C
Yanmei  Apt5Y  Apt7M   Apt6C   Apt5Y

Figure 7 shows the final file. Note that Caddr has the correct information but that Addr07 and Addr06 have errors because of the automatic retaining of values read in from SAS data sets. These variables are not needed by the client and, in early versions of this paper, had been removed by a drop on the Data statement. The drop option was removed to allow these variables, and their values, to flow through to the final data set as an aid to understanding the internal process of this merge.
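The cascade of lookups described above can be tried end to end with a few toy rows. This is a sketch under stated assumptions, not the slide code: the name Ann and all addresses attached to her are hypothetical, the other rows loosely follow the figures, and the step copies an address into CAddr only after a successful lookup (the variation the paper says could have been coded).

```sas
/* Toy data: Ann is a hypothetical name; the rows loosely follow the figures. */
data GetThese;
  length name $ 9;
  input name $;
  datalines;
Ann
Chee
Lahong
Murali
Russ
Yanmei
;
run;

data CurrAddr;
  length name $ 9 CAddr $ 7;
  input name $ CAddr $;
  datalines;
Ann AptCA
;
run;

data Addr07;
  length name $ 9 Addr07 $ 7;
  input name $ Addr07 $;
  datalines;
Lahong Apt7L
Murali Apt7M
;
run;

data Addr06;
  length name $ 9 Addr06 $ 7;
  input name $ Addr06 $;
  datalines;
Chee Apt6C
Murali Apt6M
;
run;

data Addr05;
  length name $ 9 Addr05 $ 7;
  input name $ Addr05 $;
  datalines;
Chee Apt5C
Lahong Apt5L
Murali Apt5M
Yanmei Apt5Y
;
run;

/* The by merge needs sorted input; the keyed lookups need indexes. */
proc sort data=GetThese; by name; run;
proc sort data=CurrAddr; by name; run;

proc datasets lib=work nolist;
  modify Addr07; index create name / unique; run;
  modify Addr06; index create name / unique; run;
  modify Addr05; index create name / unique; run;
quit;

data MostCurrent;
  merge GetThese CurrAddr;
  by name;
  if missing(CAddr) then do;               /* no current address: search back */
    set Addr07 key=name / unique;
    if _iorc_ = 0 then CAddr = Addr07;     /* copy only after a successful read */
    else do;
      _error_ = 0;
      set Addr06 key=name / unique;
      if _iorc_ = 0 then CAddr = Addr06;
      else do;
        _error_ = 0;
        set Addr05 key=name / unique;
        if _iorc_ = 0 then CAddr = Addr05;
        else do;                           /* name not found in any year */
          _error_ = 0;
          CAddr = "nomatch";
        end;
      end;
    end;
  end;
  keep name CAddr;                         /* drop the retained Addr07 to Addr05 */
run;
```

With these toy rows, Chee should come out with Apt6C, Lahong with Apt7L, Murali with Apt7M, Yanmei with Apt5Y and Russ with the nomatch flag. Removing the KEEP statement lets the retained Addr07 and Addr06 values show through, as in Figure 7.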

While the above merge is interesting, and useful when resources are limited, the same result could be produced with the simple by merge shown below. In a merge, data sets to the right overwrite data sets to the left.

proc sort data=getthese; by name; run;
proc sort data=curraddr; by name; run;
proc sort data=addr07; by name; run;
proc sort data=addr06; by name; run;
proc sort data=addr05; by name; run;

Data EasyWay;
  merge GetThese
        Addr05 (rename=(addr05=caddr))
        Addr06 (rename=(addr06=caddr))
        Addr07 (rename=(addr07=caddr))
        CurrAddr;
  by name;
run;

CONCLUSION

The table lookup function (i.e. merging two data sets, selecting observations from a large file that are also in a smaller file, or performing a long series of if-else if processing) is a common SAS task, and SAS programmers should know the best ways to perform it. The _IORC_ merge is a fast way of selecting observations from a large data set. It does not require sorting of the data sets and thus conserves CPU time and disk space. For more details please read the excellent articles that are online at the NESUG and SUGI web sites and that are mentioned in the reference section. The articles by Sandra Aker were especially helpful to the author.

Anyone needing to perform a table lookup quickly, and without sorting the data sets, should also investigate using formats and hashing. There are several articles on table lookup using formats in the NESUG and SUGI proceedings. Hashing is a little more difficult to master than _IORC_ and format table lookups but can be very fast.
REFERENCES

Aker, Sandra, 1997, Table Look-up using Indexes, SQL, Arrays and formats without using Matched Merge Data Steps, In the Proceedings of the 1997 North East SAS Users Group Conference, page 3

Aker, Sandra, 1999, Using Indexes to Perform Table Look-up, In the Proceedings of the 1999 North East SAS Users Group Conference, page 79

Carpenter, Art, 200, Table Lookups: From IF-THEN to key-indexing, In the Proceedings of the 23rd SAS Users Group International Conference, paper 58

Croonen and Theuwissen, 2002, Table Look-up: Techniques Beyond the Obvious, In the Proceedings of the 27th SAS Users Group International Conference, paper

Foley, Malachy J., 1997, Advanced MATCH-MERGING: Techniques, Tricks, and Traps, In the Proceedings of the 22nd SAS Users Group International Conference, paper 39

Gober, John, 1998, Understanding Indexed Datasets and Using Direct Access Queries, In the Proceedings of the 23rd SAS Users Group International Conference, paper 64

McAllister, Doug, 1998, Indexed Table Lookup vs Multiple Data Sets, In the Proceedings of the 1998 North East SAS Users Group Conference, page 40

Raffee, Dana, 1997, NO MORE MERGE: Alternative Table Lookup Techniques, In the Proceedings of the 22nd SAS Users Group International Conference, paper 88

Riba, David, 2002, Table Look-up Techniques Other Than the Matched Merge DATA Step, In the Proceedings of the 27th SAS Users Group International Conference, paper 27

Stinson, Walter, 2000, Indexing: My new best friend for table lookup, In the Proceedings of the 2000 North East SAS Users Group Conference, page 293

Zdeb, Mike, 200, Five (or more) Alternatives for Record Selection From One File Based On Information In Another, In the Proceedings of the 200 North East SAS Users Group Conference, page 355

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Russell Lavery, Contractor for Numeric Resources
Ardmore, PA
russ.lavery@verizon.net

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.


More information

Essentials of PDV: Directing the Aim to Understanding the DATA Step! Arthur Xuejun Li, City of Hope National Medical Center, Duarte, CA

Essentials of PDV: Directing the Aim to Understanding the DATA Step! Arthur Xuejun Li, City of Hope National Medical Center, Duarte, CA PharmaSUG 2013 - Paper TF17 Essentials of PDV: Directing the Aim to Understanding the DATA Step! Arthur Xuejun Li, City of Hope National Medical Center, Duarte, CA ABSTRACT Beginning programmers often

More information

Updating Data Using the MODIFY Statement and the KEY= Option

Updating Data Using the MODIFY Statement and the KEY= Option Updating Data Using the MODIFY Statement and the KEY= Option Denise J. Moorman and Deanna Warner Denise J. Moorman is a technical support analyst at SAS Institute. Her area of expertise is base SAS software.

More information

Programming Gems that are worth learning SQL for! Pamela L. Reading, Rho, Inc., Chapel Hill, NC

Programming Gems that are worth learning SQL for! Pamela L. Reading, Rho, Inc., Chapel Hill, NC Paper CC-05 Programming Gems that are worth learning SQL for! Pamela L. Reading, Rho, Inc., Chapel Hill, NC ABSTRACT For many SAS users, learning SQL syntax appears to be a significant effort with a low

More information

29 Shades of Missing

29 Shades of Missing SESUG 2015 ABSTRACT Paper CC106 29 Shades of Missing Darryl Putnam, Pinnacle Solutions, LLC Missing values can have many flavors of missingness in your data and understanding these flavors of missingness

More information

Control Structures. Code can be purely arithmetic assignments. At some point we will need some kind of control or decision making process to occur

Control Structures. Code can be purely arithmetic assignments. At some point we will need some kind of control or decision making process to occur Control Structures Code can be purely arithmetic assignments At some point we will need some kind of control or decision making process to occur C uses the if keyword as part of it s control structure

More information

SESUG 2014 IT-82 SAS-Enterprise Guide for Institutional Research and Other Data Scientists Claudia W. McCann, East Carolina University.

SESUG 2014 IT-82 SAS-Enterprise Guide for Institutional Research and Other Data Scientists Claudia W. McCann, East Carolina University. Abstract Data requests can range from on-the-fly, need it yesterday, to extended projects taking several weeks or months to complete. Often institutional researchers and other data scientists are juggling

More information

Paper # Jazz it up a Little with Formats. Brian Bee, The Knowledge Warehouse Ltd

Paper # Jazz it up a Little with Formats. Brian Bee, The Knowledge Warehouse Ltd Paper #1495-2014 Jazz it up a Little with Formats Brian Bee, The Knowledge Warehouse Ltd Abstract Formats are an often under-valued tool in the SAS toolbox. They can be used in just about all domains to

More information

Arthur L. Carpenter California Occidental Consultants, Oceanside, California

Arthur L. Carpenter California Occidental Consultants, Oceanside, California Paper 028-30 Storing and Using a List of Values in a Macro Variable Arthur L. Carpenter California Occidental Consultants, Oceanside, California ABSTRACT When using the macro language it is not at all

More information

A Macro for Systematic Treatment of Special Values in Weight of Evidence Variable Transformation Chaoxian Cai, Automated Financial Systems, Exton, PA

A Macro for Systematic Treatment of Special Values in Weight of Evidence Variable Transformation Chaoxian Cai, Automated Financial Systems, Exton, PA Paper RF10-2015 A Macro for Systematic Treatment of Special Values in Weight of Evidence Variable Transformation Chaoxian Cai, Automated Financial Systems, Exton, PA ABSTRACT Weight of evidence (WOE) recoding

More information

NO MORE MERGE. Alternative Table Lookup Techniques

NO MORE MERGE. Alternative Table Lookup Techniques NO MORE MERGE. Alternative Table Lookup Techniques Dana Rafiee, Destiny Corporation/DDISC Group Ltd. U.S., Wethersfield, CT ABSTRACT This tutorial is designed to show you several techniques available for

More information

SD10 A SAS MACRO FOR PERFORMING BACKWARD SELECTION IN PROC SURVEYREG

SD10 A SAS MACRO FOR PERFORMING BACKWARD SELECTION IN PROC SURVEYREG Paper SD10 A SAS MACRO FOR PERFORMING BACKWARD SELECTION IN PROC SURVEYREG Qixuan Chen, University of Michigan, Ann Arbor, MI Brenda Gillespie, University of Michigan, Ann Arbor, MI ABSTRACT This paper

More information

Checking for Duplicates Wendi L. Wright

Checking for Duplicates Wendi L. Wright Checking for Duplicates Wendi L. Wright ABSTRACT This introductory level paper demonstrates a quick way to find duplicates in a dataset (with both simple and complex keys). It discusses what to do when

More information

Automating Comparison of Multiple Datasets Sandeep Kottam, Remx IT, King of Prussia, PA

Automating Comparison of Multiple Datasets Sandeep Kottam, Remx IT, King of Prussia, PA Automating Comparison of Multiple Datasets Sandeep Kottam, Remx IT, King of Prussia, PA ABSTRACT: Have you ever been asked to compare new datasets to old datasets while transfers of data occur several

More information

Posters. Workarounds for SASWare Ballot Items Jack Hamilton, First Health, West Sacramento, California USA. Paper

Posters. Workarounds for SASWare Ballot Items Jack Hamilton, First Health, West Sacramento, California USA. Paper Paper 223-25 Workarounds for SASWare Ballot Items Jack Hamilton, First Health, West Sacramento, California USA ABSTRACT As part of its effort to insure that SAS Software is useful to its users, SAS Institute

More information

Super boost data transpose puzzle

Super boost data transpose puzzle Paper 2100-2016 Super boost data transpose puzzle Ahmed Al-Attar, AnA Data Warehousing Consulting LLC, McLean, VA ABSTRACT This paper compares different solutions to a data transpose puzzle presented to

More information

Understanding and Applying the Logic of the DOW-Loop

Understanding and Applying the Logic of the DOW-Loop PharmaSUG 2014 Paper BB02 Understanding and Applying the Logic of the DOW-Loop Arthur Li, City of Hope National Medical Center, Duarte, CA ABSTRACT The DOW-loop is not official terminology that one can

More information

Outlook Integration Guide

Outlook Integration Guide PracticeMaster Outlook Integration Guide Copyright 2012-2015 Software Technology, Inc. 1621 Cushman Drive Lincoln, NE 68512 (402) 423-1440 Tabs3.com Tabs3, PracticeMaster, and the "pinwheel" symbol ( )

More information

Paper Haven't I Seen You Before? An Application of DATA Step HASH for Efficient Complex Event Associations. John Schmitz, Luminare Data LLC

Paper Haven't I Seen You Before? An Application of DATA Step HASH for Efficient Complex Event Associations. John Schmitz, Luminare Data LLC Paper 1331-2017 Haven't I Seen You Before? An Application of DATA Step HASH for Efficient Complex Event Associations ABSTRACT John Schmitz, Luminare Data LLC Data processing can sometimes require complex

More information

. NO MORE MERGE - Alternative Table Lookup Techniques Dana Rafiee, Destiny Corporation/DDISC Group Ltd. U.S., Wethersfield, CT

. NO MORE MERGE - Alternative Table Lookup Techniques Dana Rafiee, Destiny Corporation/DDISC Group Ltd. U.S., Wethersfield, CT betfomilw tltlljri4ls. NO MORE MERGE - Alternative Table Lookup Techniques Dana Rafiee, Destiny Corporation/DDISC Group Ltd. U.S., Wethersfield, CT ABSTRACT This tutorial is designed to show you several

More information

DATA Step in SAS Viya : Essential New Features

DATA Step in SAS Viya : Essential New Features Paper SAS118-2017 DATA Step in SAS Viya : Essential New Features Jason Secosky, SAS Institute Inc., Cary, NC ABSTRACT The is the familiar and powerful data processing language in SAS and now SAS Viya.

More information

Why Hash? Glen Becker, USAA

Why Hash? Glen Becker, USAA Why Hash? Glen Becker, USAA Abstract: What can I do with the new Hash object in SAS 9? Instead of focusing on How to use this new technology, this paper answers Why would I want to? It presents the Big

More information

SAS Styles ODS, Right? No Programming! Discover a Professional SAS Programming Style That Will Last a Career

SAS Styles ODS, Right? No Programming! Discover a Professional SAS Programming Style That Will Last a Career SAS Styles ODS, Right? No Programming! Discover a Professional SAS Programming Style That Will Last a Career Joe Perry, Perry & Associates Consulting, Oceanside, CA The typical, new SAS programmer has

More information

Using PROC SQL to Calculate FIRSTOBS David C. Tabano, Kaiser Permanente, Denver, CO

Using PROC SQL to Calculate FIRSTOBS David C. Tabano, Kaiser Permanente, Denver, CO Using PROC SQL to Calculate FIRSTOBS David C. Tabano, Kaiser Permanente, Denver, CO ABSTRACT The power of SAS programming can at times be greatly improved using PROC SQL statements for formatting and manipulating

More information

Uncommon Techniques for Common Variables

Uncommon Techniques for Common Variables Paper 11863-2016 Uncommon Techniques for Common Variables Christopher J. Bost, MDRC, New York, NY ABSTRACT If a variable occurs in more than one data set being merged, the last value (from the variable

More information

How to Implement the One-Time Methodology Mark Tabladillo, Ph.D., MarkTab Consulting, Atlanta, GA Associate Faculty, University of Phoenix

How to Implement the One-Time Methodology Mark Tabladillo, Ph.D., MarkTab Consulting, Atlanta, GA Associate Faculty, University of Phoenix Paper PO-09 How to Implement the One-Time Methodology Mark Tabladillo, Ph.D., MarkTab Consulting, Atlanta, GA Associate Faculty, University of Phoenix ABSTRACT This paper demonstrates how to implement

More information

Techniques for Large Scale Data Linking in SAS. By Damien John Melksham

Techniques for Large Scale Data Linking in SAS. By Damien John Melksham Techniques for Large Scale Data Linking in SAS By Damien John Melksham What is Data Linking? Called everything imaginable: Data linking, record linkage, mergepurge, entity resolution, deduplication, fuzzy

More information

SAS - By Group Processing umanitoba.ca/centres/mchp

SAS - By Group Processing umanitoba.ca/centres/mchp SAS - By Group Processing umanitoba.ca/centres/mchp Winnipeg SAS users Group SAS By Group Processing Are you First or Last In Line Charles Burchill Manitoba Centre for Health Policy, University of Manitoba

More information

SYSTEM 2000 Essentials

SYSTEM 2000 Essentials 7 CHAPTER 2 SYSTEM 2000 Essentials Introduction 7 SYSTEM 2000 Software 8 SYSTEM 2000 Databases 8 Database Name 9 Labeling Data 9 Grouping Data 10 Establishing Relationships between Schema Records 10 Logical

More information

Using SAS Enterprise Guide to Coax Your Excel Data In To SAS

Using SAS Enterprise Guide to Coax Your Excel Data In To SAS Paper IT-01 Using SAS Enterprise Guide to Coax Your Excel Data In To SAS Mira Shapiro, Analytic Designers LLC, Bethesda, MD ABSTRACT Kirk Paul Lafler, Software Intelligence Corporation, Spring Valley,

More information

Data Edit-checks Integration using ODS Tagset Niraj J. Pandya, Element Technologies Inc., NJ Vinodh Paida, Impressive Systems Inc.

Data Edit-checks Integration using ODS Tagset Niraj J. Pandya, Element Technologies Inc., NJ Vinodh Paida, Impressive Systems Inc. PharmaSUG2011 - Paper DM03 Data Edit-checks Integration using ODS Tagset Niraj J. Pandya, Element Technologies Inc., NJ Vinodh Paida, Impressive Systems Inc., TX ABSTRACT In the Clinical trials data analysis

More information

Same Data Different Attributes: Cloning Issues with Data Sets Brian Varney, Experis Business Analytics, Portage, MI

Same Data Different Attributes: Cloning Issues with Data Sets Brian Varney, Experis Business Analytics, Portage, MI Paper BB-02-2013 Same Data Different Attributes: Cloning Issues with Data Sets Brian Varney, Experis Business Analytics, Portage, MI ABSTRACT When dealing with data from multiple or unstructured data sources,

More information

Give me EVERYTHING! A macro to combine the CONTENTS procedure output and formats. Lynn Mullins, PPD, Cincinnati, Ohio

Give me EVERYTHING! A macro to combine the CONTENTS procedure output and formats. Lynn Mullins, PPD, Cincinnati, Ohio PharmaSUG 2014 - Paper CC43 Give me EVERYTHING! A macro to combine the CONTENTS procedure output and formats. Lynn Mullins, PPD, Cincinnati, Ohio ABSTRACT The PROC CONTENTS output displays SAS data set

More information

Journey to the center of the earth Deep understanding of SAS language processing mechanism Di Chen, SAS Beijing R&D, Beijing, China

Journey to the center of the earth Deep understanding of SAS language processing mechanism Di Chen, SAS Beijing R&D, Beijing, China Journey to the center of the earth Deep understanding of SAS language processing Di Chen, SAS Beijing R&D, Beijing, China ABSTRACT SAS is a highly flexible and extensible programming language, and a rich

More information

How to Incorporate Old SAS Data into a New DATA Step, or What is S-M-U?

How to Incorporate Old SAS Data into a New DATA Step, or What is S-M-U? How to Incorporate Old SAS Data into a New DATA Step, or What is S-M-U? Andrew T. Kuligowski Nielsen Media Research Abstract / Introduction S-M-U. Some people will see these three letters and immediately

More information

Paper CC-013. Die Macro Die! Daniel Olguin, First Coast Service Options, Jacksonville, FL

Paper CC-013. Die Macro Die! Daniel Olguin, First Coast Service Options, Jacksonville, FL Die Macro Die! Daniel Olguin, First Coast Service Options, Jacksonville, FL ABSTRACT Have you ever tried to work your way through some convoluted SAS Macro Language code and thought to yourself, there

More information

Applications Development

Applications Development AD003 User Implementation and Revision of Business Rules Without Hard Coding: Macro-Generated SAS Code By Michael Krumenaker, Sr. Project Manager, Palisades Research, Inc. and Jit Bhattacharya, Manager

More information

Hypothesis Testing: An SQL Analogy

Hypothesis Testing: An SQL Analogy Hypothesis Testing: An SQL Analogy Leroy Bracken, Boulder Creek, CA Paul D Sherman, San Jose, CA ABSTRACT This paper is all about missing data. Do you ever know something about someone but don't know who

More information

There s No Such Thing as Normal Clinical Trials Data, or Is There? Daphne Ewing, Octagon Research Solutions, Inc., Wayne, PA

There s No Such Thing as Normal Clinical Trials Data, or Is There? Daphne Ewing, Octagon Research Solutions, Inc., Wayne, PA Paper HW04 There s No Such Thing as Normal Clinical Trials Data, or Is There? Daphne Ewing, Octagon Research Solutions, Inc., Wayne, PA ABSTRACT Clinical Trials data comes in all shapes and sizes depending

More information

Customized Flowcharts Using SAS Annotation Abhinav Srivastva, PaxVax Inc., Redwood City, CA

Customized Flowcharts Using SAS Annotation Abhinav Srivastva, PaxVax Inc., Redwood City, CA ABSTRACT Customized Flowcharts Using SAS Annotation Abhinav Srivastva, PaxVax Inc., Redwood City, CA Data visualization is becoming a trend in all sectors where critical business decisions or assessments

More information

Quicker Than Merge? Kirby Cossey, Texas State Auditor s Office, Austin, Texas

Quicker Than Merge? Kirby Cossey, Texas State Auditor s Office, Austin, Texas Paper 076-29 Quicker Than Merge? Kirby Cossey, Texas State Auditor s Office, Austin, Texas ABSTRACT How many times do you need to extract a few records from an extremely large dataset? INTRODUCTION In

More information

Automating the Production of Formatted Item Frequencies using Survey Metadata

Automating the Production of Formatted Item Frequencies using Survey Metadata Automating the Production of Formatted Item Frequencies using Survey Metadata Tim Tilert, Centers for Disease Control and Prevention (CDC) / National Center for Health Statistics (NCHS) Jane Zhang, CDC

More information

The inner workings of the datastep. By Mathieu Gaouette Videotron

The inner workings of the datastep. By Mathieu Gaouette Videotron The inner workings of the datastep By Mathieu Gaouette Videotron Plan Introduction The base The base behind the scene Control in the datastep A side by side compare with Proc SQL Introduction Most of you

More information

A Practical Guide to SAS Extended Attributes

A Practical Guide to SAS Extended Attributes ABSTRACT Paper 1980-2015 A Practical Guide to SAS Extended Attributes Chris Brooks, Melrose Analytics Ltd All SAS data sets and variables have standard attributes. These include items such as creation

More information

KEYWORDS Metadata, macro language, CALL EXECUTE, %NRSTR, %TSLIT

KEYWORDS Metadata, macro language, CALL EXECUTE, %NRSTR, %TSLIT MWSUG 2017 - Paper BB15 Building Intelligent Macros: Driving a Variable Parameter System with Metadata Arthur L. Carpenter, California Occidental Consultants, Anchorage, Alaska ABSTRACT When faced with

More information

Macros I Use Every Day (And You Can, Too!)

Macros I Use Every Day (And You Can, Too!) Paper 2500-2018 Macros I Use Every Day (And You Can, Too!) Joe DeShon ABSTRACT SAS macros are a powerful tool which can be used in all stages of SAS program development. Like most programmers, I have collected

More information

USING HASH TABLES FOR AE SEARCH STRATEGIES Vinodita Bongarala, Liz Thomas Seattle Genetics, Inc., Bothell, WA

USING HASH TABLES FOR AE SEARCH STRATEGIES Vinodita Bongarala, Liz Thomas Seattle Genetics, Inc., Bothell, WA harmasug 2017 - Paper BB08 USING HASH TABLES FOR AE SEARCH STRATEGIES Vinodita Bongarala, Liz Thomas Seattle Genetics, Inc., Bothell, WA ABSTRACT As part of adverse event safety analysis, adverse events

More information

SC-15. An Annotated Guide: Using Proc Tabulate And Proc Summary to Validate SAS Code Russ Lavery, Contractor, Ardmore, PA

SC-15. An Annotated Guide: Using Proc Tabulate And Proc Summary to Validate SAS Code Russ Lavery, Contractor, Ardmore, PA SC-15 An Annotated Guide: Using Proc Tabulate And Proc Summary to Validate SAS Code Russ Lavery, Contractor, Ardmore, PA ABSTRACT This paper discusses how Proc Tabulate and Proc Summary can be used to

More information

Techdata Solution. SAS Analytics (Clinical/Finance/Banking)

Techdata Solution. SAS Analytics (Clinical/Finance/Banking) +91-9702066624 Techdata Solution Training - Staffing - Consulting Mumbai & Pune SAS Analytics (Clinical/Finance/Banking) What is SAS SAS (pronounced "sass", originally Statistical Analysis System) is an

More information

Outlook Integration Guide

Outlook Integration Guide Tabs3 Billing PracticeMaster General Ledger Accounts Payable Trust Accounting TA BS3.COM PracticeMaster Outlook Integration Guide Copyright 2012-2018 Software Technology, LLC 1621 Cushman Drive Lincoln,

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu 2/25/2013 Jure Leskovec, Stanford CS246: Mining Massive Datasets, http://cs246.stanford.edu 3 In many data mining

More information

SAS 9 Programming Enhancements Marje Fecht, Prowerk Consulting Ltd Mississauga, Ontario, Canada

SAS 9 Programming Enhancements Marje Fecht, Prowerk Consulting Ltd Mississauga, Ontario, Canada SAS 9 Programming Enhancements Marje Fecht, Prowerk Consulting Ltd Mississauga, Ontario, Canada ABSTRACT Performance improvements are the well-publicized enhancement to SAS 9, but what else has changed

More information

Paper Cool SQL Tricks Russ Lavery, Contractor

Paper Cool SQL Tricks Russ Lavery, Contractor ABSTRACT Paper 1823-2018 Cool SQL Tricks Russ Lavery, Contractor This paper is a collection of eleven tips and tricks using SAS PROC SQL. It is intended for an intermediate SAS person who would like to

More information

Effectively Utilizing Loops and Arrays in the DATA Step

Effectively Utilizing Loops and Arrays in the DATA Step Paper 1618-2014 Effectively Utilizing Loops and Arrays in the DATA Step Arthur Li, City of Hope National Medical Center, Duarte, CA ABSTRACT The implicit loop refers to the DATA step repetitively reading

More information

Using Data Set Options in PROC SQL Kenneth W. Borowiak Howard M. Proskin & Associates, Inc., Rochester, NY

Using Data Set Options in PROC SQL Kenneth W. Borowiak Howard M. Proskin & Associates, Inc., Rochester, NY Using Data Set Options in PROC SQL Kenneth W. Borowiak Howard M. Proskin & Associates, Inc., Rochester, NY ABSTRACT Data set options are an often over-looked feature when querying and manipulating SAS

More information

Efficiently Join a SAS Data Set with External Database Tables

Efficiently Join a SAS Data Set with External Database Tables ABSTRACT Paper 2466-2018 Efficiently Join a SAS Data Set with External Database Tables Dadong Li, Michael Cantor, New York University Medical Center Joining a SAS data set with an external database is

More information

Automating Preliminary Data Cleaning in SAS

Automating Preliminary Data Cleaning in SAS Paper PO63 Automating Preliminary Data Cleaning in SAS Alec Zhixiao Lin, Loan Depot, Foothill Ranch, CA ABSTRACT Preliminary data cleaning or scrubbing tries to delete the following types of variables

More information

Using SAS software to shrink the data in your applications

Using SAS software to shrink the data in your applications Paper 991-2016 Using SAS software to shrink the data in your applications Ahmed Al-Attar, AnA Data Warehousing Consulting LLC, McLean, VA ABSTRACT This paper discusses the techniques I used at the Census

More information

David Franklin Independent SAS Consultant TheProgramersCabin.com

David Franklin Independent SAS Consultant TheProgramersCabin.com Countdown of the Top 10 Ways to Merge Data Trivia The film The Poseidon Adventure is based on a real life event that involved the Queen Mary in 1942 the ship was hit by a 92 foot wave which listed the

More information

Ready To Become Really Productive Using PROC SQL? Sunil K. Gupta, Gupta Programming, Simi Valley, CA

Ready To Become Really Productive Using PROC SQL? Sunil K. Gupta, Gupta Programming, Simi Valley, CA PharmaSUG 2012 - Paper HW04 Ready To Become Really Productive Using PROC SQL? Sunil K. Gupta, Gupta Programming, Simi Valley, CA ABSTRACT Using PROC SQL, can you identify at least four ways to: select

More information

An SQL Tutorial Some Random Tips

An SQL Tutorial Some Random Tips An SQL Tutorial Some Random Tips Presented by Jens Dahl Mikkelsen SAS Institute A/S Author: Paul Kent SAS Institute Inc, Cary, NC. Short Stories Towards a Better UNION Outer Joins. More than two too. Logical

More information

Efficient Processing of Long Lists of Variable Names

Efficient Processing of Long Lists of Variable Names Efficient Processing of Long Lists of Variable Names Paulette W. Staum, Paul Waldron Consulting, West Nyack, NY ABSTRACT Many programmers use SAS macro language to manipulate lists of variable names. They

More information

SQL Metadata Applications: I Hate Typing

SQL Metadata Applications: I Hate Typing SQL Metadata Applications: I Hate Typing Hannah Fresques, MDRC, New York, NY ABSTRACT This paper covers basics of metadata in SQL and provides useful applications, including: finding variables on one or

More information

Exploring DATA Step Merge and PROC SQL Join Techniques Kirk Paul Lafler, Software Intelligence Corporation, Spring Valley, California

Exploring DATA Step Merge and PROC SQL Join Techniques Kirk Paul Lafler, Software Intelligence Corporation, Spring Valley, California Exploring DATA Step Merge and PROC SQL Join Techniques Kirk Paul Lafler, Software Intelligence Corporation, Spring Valley, California Abstract Explore the various DATA step merge and PROC SQL join processes.

More information

Learn Well Technocraft

Learn Well Technocraft Section 1: Getting started The Word window New documents Document navigation Section 2: Editing text Working with text The Undo and Redo commands Cut, copy, and paste Find and replace Section 3: Text formatting

More information