PROC FORMAT: USE OF THE CNTLIN OPTION FOR EFFICIENT PROGRAMMING
Karuna Nerurkar and Andrea Robertson, GMIS Inc.

ABSTRACT

Proc Format can be a useful tool for improving programming efficiency. This paper demonstrates how to use Proc Format with the Cntlin option to read in an external dataset and convert it into a custom format. It discusses pitfalls to avoid when using Proc Format, such as using a numeric variable containing a decimal as your LABEL. It also demonstrates the use of Proc Format as a more efficient mechanism than sorting and merging when performing a table lookup (i.e., using a key variable to extract observations from one dataset based on the values of the key variable in a second dataset).

INTRODUCTION

The Format procedure is a useful tool when the selection of records from one dataset depends on information contained in a second dataset. This is commonly called a table lookup. One requirement for successfully performing a table lookup is that both datasets contain one common element, or key. First, this paper provides a detailed explanation of how to perform a table lookup using the Format procedure. Then it compares the Format procedure table lookup to a Sort/Merge table lookup to illustrate the gain in efficiency that the Format procedure achieves over the Sort/Merge process when a table lookup is performed on large datasets. The paper also provides tips on avoiding pitfalls when creating the format, and discusses memory limitations associated with the Format procedure.

THE DATA

The monthly processing of health insurance claims data is used as an example in which a table lookup can be done using the Format procedure. In this situation, a history file containing claims data may consist of millions of claims for millions of members.
Updating all of the claims in this file on a monthly basis, which would consist of combining newly entered claims with those in the history file, cleansing the data, and performing any desired enhancements, would be costly in terms of CPU and execution time. Since the new month of claims data will only contain records for a subset of the members in the history file, there is no need to reprocess the entire history file. Only the history records for those members who have services reflected in the new month of claims data need to be selected for reprocessing. This is where the Format procedure can be a useful tool.

USING THE FORMAT PROCEDURE TO PERFORM A TABLE LOOKUP

In order to create a custom format beginning with a SAS dataset (the input control dataset), the Format procedure expects to find three variables within the dataset:

1. START - Range starting variable
2. LABEL - Label: informatted or formatted value
3. FMTNAME - Format or informat name

The variable TYPE is not required, but if it is not included, the name of a character format must be preceded by a dollar sign ($). An additional variable, END, is also optional. START and END specify a range of values that will be assigned a given LABEL value. If END is absent from the input control dataset, SAS assigns it the same value as START (in essence, creating a range containing only one value). FMTNAME is the name of the format to be created.

The format name, when used (as described below) in conjunction with a key variable in your lookup dataset, executes a comparison of the values of that key variable with the values of the START variable (or the START-END range) contained in the format. When a match between the value of the key variable and START is found, the Data step returns the value of LABEL that is paired with that value of START. There are other variables that can be specified in the input control dataset, but they are not required.
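The control dataset variables described above can be illustrated with a small hand-built example. This is only a sketch: the dataset name CTRL, the format name REGION, and the code ranges are hypothetical and do not come from the paper's claims data. It shows START and END defining ranges, TYPE='C' marking a character format, and the FMTLIB option printing the format so that it can be checked.

```sas
/* Hypothetical input control dataset: one observation per range. */
data ctrl;
   length fmtname $8 type $1 start end $3 label $10;
   retain fmtname 'REGION' type 'C';   /* TYPE='C' => character format $REGION */
   input start $ end $ label $;
   datalines;
A01 A49 EAST
A50 A99 WEST
B01 B01 NORTH
;
run;

/* Convert the control dataset into a format; FMTLIB prints its contents. */
proc format cntlin=ctrl fmtlib;
   select $region;
run;

/* Apply the format with the PUT function and write the result to the log. */
data _null_;
   code   = 'A17';
   region = put(code, $region.);
   put code= region=;
run;
```

With these ranges, a code such as 'A17' should fall in the first range. The third observation has START equal to END, which is the single-value case that SAS assumes when END is omitted.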
Refer to the SAS Procedures Guide for a listing of these optional variables.

Using the health insurance claims data as an example, the information contained in the history file for those members who have received services this month needs to be updated. A key that uniquely identifies members must exist in both datasets. Both the history file (the master file) and the current month file (the transaction file) contain, along with other variables, each member's social security number, which will be used as the key. Since records from the history file that match members represented in the current month file must be selected, the key variable will be used to create the format. Since the format will serve a yes/no function (select the record or do not select the record), the same value will be assigned to LABEL for all values of the START variable. Additionally, since a single format is being created, the same FMTNAME will be assigned to all observations.

For the sake of the example, a small test dataset will be used.

IN.HIS94 (history file): CHARGES
IN2.CURR0794 (current month file): CHARGES

The first step is to create a dataset containing the variables START, LABEL, and FMTNAME. This can be done using the following SAS statements:

Example 1 Step 1:

   DATA TEST;
      RETAIN LABEL '' FMTNAME '';
      SET IN2.CURR0794 (KEEP= RENAME=(=START));

The above example creates a dataset named TEST containing the following observations:

START LABEL FMTNAME

The next step would be to execute the Format procedure, which converts this dataset into a format; however, if this dataset were used in its current form, the Format procedure would fail because overlapping ranges exist. Each value (range) of the START variable must be unique in order for the Format procedure to successfully create a format for that variable. One way to avoid overlapping ranges is to sort the dataset TEST by the key variable using the NODUPKEY option so that only one observation per member remains in the final dataset. The sort is illustrated below:

Example 1 Step 2:

   PROC SORT DATA=TEST OUT=TEST2 NODUPKEY;
      BY START;

This code creates a dataset called TEST2 containing the following observations:

START LABEL FMTNAME

Proc Format with the Cntlin option is then used to create the format.
If you are creating a temporary format (i.e., one that will exist only while the program is running), use the following SAS statement:

Example 1 Step 3A:

   PROC FORMAT CNTLIN=TEST2;

This will read in the dataset TEST2 (the input control dataset) and create the format on the WORK disk. If you want the format to be available for use beyond the completion of the current job, you must create a permanent format library to house the format:

Example 1 Step 3B:

   LIBNAME MYLIB 'MYACCT.MEMB.NEWMON' DISP=(NEW,CATLG) UNIT=DASD RETPD=365 SPACE=(TRK,(0,5),RLSE);
   PROC FORMAT CNTLIN=TEST2 LIBRARY=MYLIB;

Note that the above example was structured for use in MVS. The library allocation statement will need to be modified to conform to the operating system that is being used. This example reads the dataset TEST2 and creates the format as a member of the permanent format library called 'MYACCT.MEMB.NEWMON'.

The next step in performing the table lookup is to read in the history file, using the format that was created to select only the desired records. This can be accomplished in several ways, one of which is illustrated below:

Example 2:

   DATA OUT.HIS94NEW;
      SET IN.HIS94 (WHERE=(PUT(,.)=''))
          IN2.CURR0794;
      [statements]

This data step evaluates each record in IN.HIS94 before reading it into the output dataset to determine whether the WHERE clause is satisfied by the record. Additionally, all records in IN2.CURR0794 are read into the data step, since they will be added to the history file. When a match is found between the value of the key in the history file and a value of the START variable in the format (which represents the social security numbers contained in the current month of data), the value of the LABEL (in this case '') is returned, which satisfies the WHERE clause. When no match is found between the value of the key and any value of the START variable, the first n bytes of the key value are returned, with n being equal to the length of the LABEL variable. Each record in IN.HIS94 is evaluated for selection as shown below:

Returned Value   Select (Y/N)?

This completes the steps necessary for performing a table lookup using the Format procedure. Note that IN2.CURR0794 need not be included in the SET statement in order for a table lookup to be performed. When the history dataset and the current month dataset are combined as shown in Example 2, the following records will be output to OUT.HIS94NEW:

CHARGE

COMPARISON OF FORMAT WITH SORT/MERGE

Thus far, this paper has discussed how to use the Format procedure to perform a table lookup. A table lookup can also be accomplished by sorting and merging the two datasets.
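Before turning to the Sort/Merge alternative, the format-based lookup described so far can be condensed into one self-contained sketch. Everything named here is an assumption made for illustration: the inline test data, the datasets HIST and CURR (standing in for the history and current month files), the key MEMBER_ID (standing in for the paper's key variable), the format name $CURMEM, and the label 'Y'.

```sas
/* Hypothetical history file. */
data hist;
   input member_id $ charges;
   datalines;
A001 100
A001 250
B002 75
C003 310
;
run;

/* Hypothetical current month file. */
data curr;
   input member_id $ charges;
   datalines;
A001 40
C003 60
C003 15
;
run;

/* One observation per distinct key, so no START value repeats. */
proc sort data=curr(keep=member_id) out=keys nodupkey;
   by member_id;
run;

/* Input control dataset: same LABEL and FMTNAME on every observation. */
data ctrl;
   retain fmtname 'CURMEM' type 'C' label 'Y';
   set keys(rename=(member_id=start));
run;

/* Temporary format, stored in WORK for the duration of the session. */
proc format cntlin=ctrl;
run;

/* Select matching history records and append the current month. */
data hist_new;
   set hist(where=(put(member_id, $curmem.) = 'Y'))
       curr;
run;
```

In this sketch the history records for B002 should be left out, because B002 does not appear in the current month. The label 'Y' is safe only because no test key begins with 'Y': an unmatched key returns its own first byte, since the label is one byte long.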
Consider the same example of updating the history file with data from a current month file. If the two files have a common key, then a table lookup can be performed. For the monthly and history files, this key is the social security number. In order to perform a merge, both files must be pre-sorted by the key.

The first step is to sort the monthly file by the key and remove observations with duplicate key values. The NODUPKEY option must be used in order to avoid the duplication of records that can occur in a many-to-many merge.

Example 3 Step 1:

   PROC SORT DATA=IN2.CURR0794(KEEP=) OUT=UNIQUE NODUPKEY;
      BY;

If the monthly file is sorted this way, the dataset UNIQUE is created. Note that UNIQUE should contain only the key variable. This avoids overwriting values of similarly named variables when this dataset is merged with the history file.

The next step is to sort the history file by the same key variable. Keep all the records in the history file, and do not drop any variables.

Example 3 Step 2:

   PROC SORT DATA=IN.HIS94 OUT=TEMP.HISA;
      BY;

When IN.HIS94 is sorted, the following dataset is created:

CHARGES

To select records from the history file for members who also appear in the current monthly file, merge the history file with the UNIQUE file that was created from the monthly file.

Example 3 Step 3:

   DATA TEMP2.HISB;
      MERGE TEMP.HISA(IN=A) UNIQUE(IN=B);
      BY;
      IF A AND B;

The statement 'IF A AND B' instructs SAS to output only those records where the key value appears in both of the input datasets, thus accomplishing the table lookup. The resulting dataset will be as follows:

CHARGES

At this point, a dataset containing the history file records for members who also have records in the current monthly file has been created. The next step is to combine this data with the current month file so that update processing can be accomplished.

Example 3 Step 4:

   DATA OUT.HIS94NEW;
      SET TEMP2.HISB IN2.CURR0794;

For the Format procedure, the corresponding four steps are as follows:

1. Create dataset UNIQUE the same way as in the Sort/Merge process.

2. Create input control dataset POP:

   DATA POP;
      SET UNIQUE(RENAME=(=START));
      RETAIN LABEL '*' TYPE 'C' FMTNAME 'P;

3. Use the input control dataset to create a format:

   PROC FORMAT CNTLIN=POP;

4. Use this format to extract records from the history file.
At the same time, set it together with the monthly file:

   DATA OUT.HIS94NEW;
      SET IN.HIS94 (WHERE=(PUT(, $ F.)='*'))
          IN2.CURR0794;

A comparison between the Sort/Merge process and the Format procedure was performed under the following conditions:

Operating system: MVS
SAS: Version 6.08
History file: LRECL (record length) = 632, number of observations = 344,768
Monthly file: LRECL (record length) = 632, number of observations = 5,584

Step   Sort/Merge (CPU seconds)   Format (CPU seconds)
Overall

The Sort/Merge process required almost twice as much CPU time as the Format procedure. In addition, a substantial amount of storage is needed to hold the large intermediate datasets created by the Sort/Merge process, TEMP.HISA and TEMP2.HISB. Nevertheless, if the transaction and master files are comparable in size, more resources may be used by the Format procedure than by the Sort/Merge process.

MEMORY LIMITATIONS

Since formats reside in memory, memory limitations may be reached while creating a very large format. For example, if the transaction file has a large number of unique social security numbers (more than 00,000), an error message indicating insufficient memory size may result. In our research, we found that when the START variable had a record length of 6 bytes, the maximum number of records we could put in a format was 83,69 when MEMSIZE=6M. To overcome this memory limitation, either increase the memory size or create multiple formats.

Example 4: Increasing MEMSIZE

The memory size can be increased by coding the region parameter and the MEMSIZE option in your JCL. Memory size can also be increased by coding MEMSIZE in the program, as illustrated below.

   OPTIONS MEMSIZE=6M;
   PROC SORT DATA=IN2.CURR0794(KEEP=) OUT=UNIQUE NODUPKEY;
      BY;
   DATA POP;
      SET UNIQUE(RENAME=(=START));
      LABEL='*';
      TYPE='C';
      FMTNAME=' F';
   PROC FORMAT CNTLIN=POP;

Increase memory size as much as your system will allow. If memory is still inadequate, create multiple formats as illustrated in the next example.

Example 5: Creation of multiple control datasets

You can circumvent the memory requirement by creating multiple input control datasets to produce multiple smaller formats. Note that this example is structured for use in an MVS environment.
Example 5 Program 1:

   LIBNAME USERFMT 'MYACCT.MEMB.NEWMON' DISP=(NEW,CATLG) UNIT=DASD RETPD=365 SPACE=(TRK,(0,5),RLSE);
   OPTIONS MEMSIZE=6M;
   PROC SORT DATA=IN2.CURR0794(KEEP=) OUT=UNIQUE NODUPKEY;
      BY;
   DATA POP POP2;
      SET UNIQUE(RENAME=(=START));
      RETAIN LABEL '*';
      IF _N_ < 8369 THEN DO;
         FMTNAME='$ F';
         OUTPUT POP;
      END;
      ELSE DO;
         FMTNAME='$2F';
         OUTPUT POP2;
      END;
   PROC FORMAT CNTLIN=POP LIBRARY=USERFMT;
   PROC FORMAT CNTLIN=POP2 LIBRARY=USERFMT;

In the above example, the two formats $ F and $2F are created from dataset UNIQUE, which contains 366,338 observations. Another program, or another data step in the same program, can then access the permanent library 'USERFMT' and use the formats to extract information from the claims file as follows.

Example 5 Program 2:

In this example, both of the formats should be used as criteria to extract records corresponding to patients appearing in the dataset IN.HIS94. This is done using the PUT function as follows:

   OPTIONS MEMSIZE=6M FMTSEARCH=(USERFMT);
   DATA OUT.HIS94NEW;
      SET IN.HIS94 (WHERE=(PUT(,$F.)='*' OR PUT(,$2F.)='*'));

Note that if your transaction file has a large number of unique social security numbers and creating two input control datasets is not enough to get around memory limitations, you can create as many input control datasets and formats as you like. However, since you can load only two formats in one data step, you will have to use multiple data steps to load these formats. Even though more CPU time is needed for multiple data steps, the Format procedure is still less CPU intensive than the Sort/Merge.

POTENTIAL PITFALLS

When selecting a value for your LABEL, keep in mind the possible values of your comparison variable (in this case, the social security number). If, in Example 1 above, we had assigned LABEL='' when creating the format, the following would be the outcome of evaluating the records in IN.HIS94:

Returned Value   Select (Y/N)?

Since the LABEL is only one byte in length, in the three cases where a match is not found between the key and START (, , and 9344), the first byte of the social security number is returned (, 2, and 9 respectively). In the first case (), this value matches the value of LABEL, so the two records are erroneously selected.

One way to ensure that the value of your LABEL will not cause records to be selected erroneously is to create an 'other' condition. To do this, you will need to make use of the DO WHILE loop, as the following example illustrates:

Example 6:

   PROC SORT DATA=IN2.CURR0794 OUT=TEST NODUPKEY;
      BY;
   DATA TEST2;
      RETAIN FMTNAME '';
      DO WHILE (NOT EOF);
         SET TEST (KEEP= RENAME=(=START)) END=EOF;
         LABEL='';
         OUTPUT;
      END;
      START='OTHER';
      LABEL='BAD';
      OUTPUT;
      STOP;
   PROC FORMAT CNTLIN=TEST2;
   DATA OUT.HIS94NEW;
      SET IN.HIS94 (WHERE=(PUT(,.)=''))
          IN2.CURR0794;

In the data step where TEST2 is created, once the 'not end of file' condition is no longer satisfied (i.e., the end of file has been reached), the Do While loop terminates. The next few lines create an additional record with 'OTHER' as the START value. All records with a value of the key variable that is not in the list of START values fall into the 'other' condition, so the value of LABEL that is paired with the 'other' condition is returned (in this case 'BAD').
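The 'other' condition can also be sketched in a self-contained form. Note that this sketch swaps in a different mechanism from the paper's START='OTHER' record: it sets the control dataset variable HLO to 'O', which is how the OTHER range is flagged in an input control dataset in current SAS releases. It also writes the extra record with IF EOF rather than a DO WHILE loop and STOP. The key MEMBER_ID, the format name $CURMEM, and the labels 'YES' and 'BAD' are hypothetical.

```sas
/* Hypothetical list of distinct keys from the current month file. */
data keys;
   input member_id $;
   datalines;
A001
C003
;
run;

data ctrl;
   length start $8 label $3 hlo $1;
   retain fmtname 'CURMEM' type 'C';
   set keys(rename=(member_id=start)) end=eof;
   label = 'YES';
   hlo   = ' ';
   output;
   if eof then do;          /* one extra record after the last key */
      start = ' ';
      label = 'BAD';
      hlo   = 'O';          /* HLO='O' flags this record as the OTHER range */
      output;
   end;
run;

proc format cntlin=ctrl;
run;

/* Check one matching and one non-matching key in the log. */
data _null_;
   length id $8;
   do id = 'A001', 'B002';
      result = put(id, $curmem.);
      put id= result=;
   end;
run;
```

In this sketch A001 should return 'YES' and B002 should return 'BAD'. Because every unmatched key returns 'BAD' instead of its own leading bytes, a label such as 'YES' cannot be matched by accident.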
Repeating the prior example, the following would be the outcome of evaluating the records in IN.HIS94 when the 'other' condition is included:

Returned Value   Select (Y/N)?
9344 BAD BAD BAD BAD

Notice the last statement in the TEST2 data step of Example 6 above. The Stop statement causes the data step to stop processing after the 'other' condition record has been output. If the Stop statement is not used, a second 'other' record will be generated and the following note will appear in your log:

   NOTE: DATA STEP stopped due to looping.

One other pitfall involves the assignment of the LABEL. You may create a format for reasons other than a table lookup, such as the assignment of an adjustment factor in order to normalize dollar values across localities within a region. In this instance, it would make sense to have the START variable contain zip codes and the LABEL variable contain the associated adjustment factor. However, if you assigned a numeric LABEL value (such as .25), only digits to the left of the decimal will be returned when the format is executed. This occurs because SAS truncates all numeric LABEL values to integers. To get around this, assign the LABEL value as a character. After you execute the format to assign the adjustment factor to a new variable (ADJUST), you can convert this variable back to a numeric for computing purposes.

CONCLUSION

This paper discussed the use of the Format procedure with the Cntlin option when performing a table lookup, and compared the CPU time of the Format procedure to that of the Sort/Merge process. If the master file is substantially larger than the transaction file, it is advantageous to use formats for the update process to save CPU time. Increase memory size as much as your system will allow in order to make larger formats. If memory is still inadequate, create multiple formats. However, if the transaction and master files are comparable in size, you might use more resources in the Format procedure than you would to perform the Sort/Merge process. The paper also discussed potential pitfalls to avoid when assigning the LABEL variable.

REFERENCES

SAS Institute Inc., SAS Procedures Guide, Version 6, Third Edition, Cary, N.C.: SAS Institute Inc., 1990. pp

ACKNOWLEDGEMENTS

SAS is a registered trademark of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Merge Processing and Alternate Table Lookup Techniques Prepared by
Merge Processing and Alternate Table Lookup Techniques Prepared by The syntax for data step merging is as follows: International SAS Training and Consulting This assumes that the incoming data sets are
More informationCreate a Format from a SAS Data Set Ruth Marisol Rivera, i3 Statprobe, Mexico City, Mexico
PharmaSUG 2011 - Paper TT02 Create a Format from a SAS Data Set Ruth Marisol Rivera, i3 Statprobe, Mexico City, Mexico ABSTRACT Many times we have to apply formats and it could be hard to create them specially
More informationTable Lookups: Getting Started With Proc Format
Table Lookups: Getting Started With Proc Format John Cohen, AstraZeneca LP, Wilmington, DE ABSTRACT Table lookups are among the coolest tricks you can add to your SAS toolkit. Unfortunately, these techniques
More informationGary L. Katsanis, Blue Cross and Blue Shield of the Rochester Area, Rochester, NY
Table Lookups in the SAS Data Step Gary L. Katsanis, Blue Cross and Blue Shield of the Rochester Area, Rochester, NY Introduction - What is a Table Lookup? You have a sales file with one observation for
More informationTable Lookups: From IF-THEN to Key-Indexing
Table Lookups: From IF-THEN to Key-Indexing Arthur L. Carpenter, California Occidental Consultants ABSTRACT One of the more commonly needed operations within SAS programming is to determine the value of
More informationHash Objects for Everyone
SESUG 2015 Paper BB-83 Hash Objects for Everyone Jack Hall, OptumInsight ABSTRACT The introduction of Hash Objects into the SAS toolbag gives programmers a powerful way to improve performance, especially
More information9 Ways to Join Two Datasets David Franklin, Independent Consultant, New Hampshire, USA
9 Ways to Join Two Datasets David Franklin, Independent Consultant, New Hampshire, USA ABSTRACT Joining or merging data is one of the fundamental actions carried out when manipulating data to bring it
More informationOptimizing System Performance
243 CHAPTER 19 Optimizing System Performance Definitions 243 Collecting and Interpreting Performance Statistics 244 Using the FULLSTIMER and STIMER System Options 244 Interpreting FULLSTIMER and STIMER
More informationUSING SAS SOFTWARE TO COMPARE STRINGS OF VOLSERS IN A JCL JOB AND A TSO CLIST
USING SAS SOFTWARE TO COMPARE STRINGS OF VOLSERS IN A JCL JOB AND A TSO CLIST RANDALL M NICHOLS, Mississippi Dept of ITS, Jackson, MS ABSTRACT The TRANSLATE function of SAS can be used to strip out punctuation
More informationBEYOND FORMAT BASICS 1
BEYOND FORMAT BASICS 1 CNTLIN DATA SETS...LABELING VALUES OF VARIABLE One common use of a format in SAS is to assign labels to values of a variable. The rules for creating a format with PROC FORMAT are
More informationHow to Incorporate Old SAS Data into a New DATA Step, or What is S-M-U?
Paper 54-25 How to Incorporate Old SAS Data into a New DATA Step, or What is S-M-U? Andrew T. Kuligowski Nielsen Media Research Abstract / Introduction S-M-U. Some people will see these three letters and
More information. NO MORE MERGE - Alternative Table Lookup Techniques Dana Rafiee, Destiny Corporation/DDISC Group Ltd. U.S., Wethersfield, CT
betfomilw tltlljri4ls. NO MORE MERGE - Alternative Table Lookup Techniques Dana Rafiee, Destiny Corporation/DDISC Group Ltd. U.S., Wethersfield, CT ABSTRACT This tutorial is designed to show you several
More informationTackling Unique Problems Using TWO SET Statements in ONE DATA Step. Ben Cochran, The Bedford Group, Raleigh, NC
MWSUG 2017 - Paper BB114 Tackling Unique Problems Using TWO SET Statements in ONE DATA Step Ben Cochran, The Bedford Group, Raleigh, NC ABSTRACT This paper illustrates solving many problems by creatively
More informationSAS Scalable Performance Data Server 4.3
Scalability Solution for SAS Dynamic Cluster Tables A SAS White Paper Table of Contents Introduction...1 Cluster Tables... 1 Dynamic Cluster Table Loading Benefits... 2 Commands for Creating and Undoing
More informationFormat-o-matic: Using Formats To Merge Data From Multiple Sources
SESUG Paper 134-2017 Format-o-matic: Using Formats To Merge Data From Multiple Sources Marcus Maher, Ipsos Public Affairs; Joe Matise, NORC at the University of Chicago ABSTRACT User-defined formats are
More informationBeginning Tutorials. PROC FSEDIT NEW=newfilename LIKE=oldfilename; Fig. 4 - Specifying a WHERE Clause in FSEDIT. Data Editing
Mouse Clicking Your Way Viewing and Manipulating Data with Version 8 of the SAS System Terry Fain, RAND, Santa Monica, California Cyndie Gareleck, RAND, Santa Monica, California ABSTRACT Version 8 of the
More informationSTOP MERGING AND START COMBINING by Robert S. Nicol U.S. Quality Algorithms
STOP MERGING AND START COMBINING by Robert S. Nicol U.S. Quality Algorithms There are many ways to combine data within the SAS system. Probably the most widely used method is the. While the merge is very
More informationMerging Data Eight Different Ways
Paper 197-2009 Merging Data Eight Different Ways David Franklin, Independent Consultant, New Hampshire, USA ABSTRACT Merging data is a fundamental function carried out when manipulating data to bring it
More informationPaper Haven't I Seen You Before? An Application of DATA Step HASH for Efficient Complex Event Associations. John Schmitz, Luminare Data LLC
Paper 1331-2017 Haven't I Seen You Before? An Application of DATA Step HASH for Efficient Complex Event Associations ABSTRACT John Schmitz, Luminare Data LLC Data processing can sometimes require complex
More informationUsing an ICPSR set-up file to create a SAS dataset
Using an ICPSR set-up file to create a SAS dataset Name library and raw data files. From the Start menu, launch SAS, and in the Editor program, write the codes to create and name a folder in the SAS permanent
More informationCountdown of the Top 10 Ways to Merge Data David Franklin, Independent Consultant, Litchfield, NH
PharmaSUG2010 - Paper TU06 Countdown of the Top 10 Ways to Merge Data David Franklin, Independent Consultant, Litchfield, NH ABSTRACT Joining or merging data is one of the fundamental actions carried out
More informationSYSTEM 2000 Essentials
7 CHAPTER 2 SYSTEM 2000 Essentials Introduction 7 SYSTEM 2000 Software 8 SYSTEM 2000 Databases 8 Database Name 9 Labeling Data 9 Grouping Data 10 Establishing Relationships between Schema Records 10 Logical
More informationDitch the Data Memo: Using Macro Variables and Outer Union Corresponding in PROC SQL to Create Data Set Summary Tables Andrea Shane MDRC, Oakland, CA
ABSTRACT Ditch the Data Memo: Using Macro Variables and Outer Union Corresponding in PROC SQL to Create Data Set Summary Tables Andrea Shane MDRC, Oakland, CA Data set documentation is essential to good
More informationAre Your SAS Programs Running You? Marje Fecht, Prowerk Consulting, Cape Coral, FL Larry Stewart, SAS Institute Inc., Cary, NC
Paper CS-044 Are Your SAS Programs Running You? Marje Fecht, Prowerk Consulting, Cape Coral, FL Larry Stewart, SAS Institute Inc., Cary, NC ABSTRACT Most programs are written on a tight schedule, using
More informationCharacteristics of a "Successful" Application.
Characteristics of a "Successful" Application. Caroline Bahler, Meridian Software, Inc. Abstract An application can be judged "successful" by two different sets of criteria. The first set of criteria belongs
More informationPerformance Considerations
149 CHAPTER 6 Performance Considerations Hardware Considerations 149 Windows Features that Optimize Performance 150 Under Windows NT 150 Under Windows NT Server Enterprise Edition 4.0 151 Processing SAS
More informationNO MORE MERGE. Alternative Table Lookup Techniques
NO MORE MERGE. Alternative Table Lookup Techniques Dana Rafiee, Destiny Corporation/DDISC Group Ltd. U.S., Wethersfield, CT ABSTRACT This tutorial is designed to show you several techniques available for
More informationHow to Incorporate Old SAS Data into a New DATA Step, or What is S-M-U?
How to Incorporate Old SAS Data into a New DATA Step, or What is S-M-U? Andrew T. Kuligowski Nielsen Media Research Abstract / Introduction S-M-U. Some people will see these three letters and immediately
More informationAn Efficient Method to Create Titles for Multiple Clinical Reports Using Proc Format within A Do Loop Youying Yu, PharmaNet/i3, West Chester, Ohio
PharmaSUG 2012 - Paper CC12 An Efficient Method to Create Titles for Multiple Clinical Reports Using Proc Format within A Do Loop Youying Yu, PharmaNet/i3, West Chester, Ohio ABSTRACT Do you know how to
More informationUsing the SQL Editor. Overview CHAPTER 11
205 CHAPTER 11 Using the SQL Editor Overview 205 Opening the SQL Editor Window 206 Entering SQL Statements Directly 206 Entering an SQL Query 206 Entering Non-SELECT SQL Code 207 Creating Template SQL
More informationFormats. Formats Under UNIX. HEXw. format. $HEXw. format. Details CHAPTER 11
193 CHAPTER 11 Formats Formats Under UNIX 193 Formats Under UNIX This chapter describes SAS formats that have behavior or syntax that is specific to UNIX environments. Each format description includes
More informationPosters. Workarounds for SASWare Ballot Items Jack Hamilton, First Health, West Sacramento, California USA. Paper
Paper 223-25 Workarounds for SASWare Ballot Items Jack Hamilton, First Health, West Sacramento, California USA ABSTRACT As part of its effort to insure that SAS Software is useful to its users, SAS Institute
More informationChaining Logic in One Data Step Libing Shi, Ginny Rego Blue Cross Blue Shield of Massachusetts, Boston, MA
Chaining Logic in One Data Step Libing Shi, Ginny Rego Blue Cross Blue Shield of Massachusetts, Boston, MA ABSTRACT Event dates stored in multiple rows pose many challenges that have typically been resolved
More informationChoosing the Right Technique to Merge Large Data Sets Efficiently Qingfeng Liang, Community Care Behavioral Health Organization, Pittsburgh, PA
Choosing the Right Technique to Merge Large Data Sets Efficiently Qingfeng Liang, Community Care Behavioral Health Organization, Pittsburgh, PA ABSTRACT This paper outlines different SAS merging techniques
More informationChecking for Duplicates Wendi L. Wright
Checking for Duplicates Wendi L. Wright ABSTRACT This introductory level paper demonstrates a quick way to find duplicates in a dataset (with both simple and complex keys). It discusses what to do when
More informationPROC FORMAT Jack Shoemaker Real Decisions Corporation
140 Beginning Tutorials PROC FORMAT Jack Shoemaker Real Decisions Corporation Abstract: Although SAS stores and processes data intemally as either characters or numbers, you can control the external view
More information10 The First Steps 4 Chapter 2
9 CHAPTER 2 Examples The First Steps 10 Invoking the Query Window 11 Changing Your Profile 11 ing a Table 13 ing Columns 14 Alias Names and Labels 14 Column Format 16 Creating a WHERE Expression 17 Available
More informationLoading Data. Introduction. Understanding the Volume Grid CHAPTER 2
19 CHAPTER 2 Loading Data Introduction 19 Understanding the Volume Grid 19 Loading Data Representing a Complete Grid 20 Loading Data Representing an Incomplete Grid 21 Loading Sparse Data 23 Understanding
More informationQuicker Than Merge? Kirby Cossey, Texas State Auditor s Office, Austin, Texas
Paper 076-29 Quicker Than Merge? Kirby Cossey, Texas State Auditor s Office, Austin, Texas ABSTRACT How many times do you need to extract a few records from an extremely large dataset? INTRODUCTION In
More informationProgramming Beyond the Basics. Find() the power of Hash - How, Why and When to use the SAS Hash Object John Blackwell
Find() the power of Hash - How, Why and When to use the SAS Hash Object John Blackwell ABSTRACT The SAS hash object has come of age in SAS 9.2, giving the SAS programmer the ability to quickly do things
More informationWhat to Expect When You Need to Make a Data Delivery... Helpful Tips and Techniques
What to Expect When You Need to Make a Data Delivery... Helpful Tips and Techniques Louise Hadden, Abt Associates Inc. QUESTIONS YOU SHOULD ASK REGARDING THE PROJECT Is there any information regarding
More informationS-M-U (Set, Merge, and Update) Revisited
Paper 3444-2015 S-M-U (Set, Merge, and Update) Revisited Andrew T. Kuligowski, HSN ABSTRACT It is a safe assumption that almost every SAS user learns how to use the SET statement not long after they re
More informationWorking with Administrative Databases: Tips and Tricks
3 Working with Administrative Databases: Tips and Tricks Canadian Institute for Health Information Emerging Issues Team Simon Tavasoli Administrative Databases > Administrative databases are often used
More information1. Join with PROC SQL a left join that will retain target records having no lookup match. 2. Data Step Merge of the target and lookup files.
Abstract PaperA03-2007 Table Lookups...You Want Performance? Rob Rohrbough, Rohrbough Systems Design, Inc. Presented to the Midwest SAS Users Group Monday, October 29, 2007 Paper Number A3 Over the years
More informationSAS PROGRAM EFFICIENCY FOR BEGINNERS. Bruce Gilsen, Federal Reserve Board
SAS PROGRAM EFFICIENCY FOR BEGINNERS Bruce Gilsen, Federal Reserve Board INTRODUCTION This paper presents simple efficiency techniques that can benefit inexperienced SAS software users on all platforms.