Penetrating the Matrix Justin Z. Smith, William Gui Zupko II, U.S. Census Bureau, Suitland, MD

ABSTRACT

While working on a time series modeling problem, we needed to find the row and column that corresponded to the minimum of a dataset. Processing the dataset via SAS¹ DATA step methods proved somewhat inefficient for the task because SAS is designed to run along observations, but not through variables without considerable manipulation of the dataset. By reading the dataset into PROC IML and treating it as a matrix, we were able to pinpoint the specific location with ease. PROC IML was able to penetrate the matrix and create macro variables that told us where in the matrix our minimum value was located. These macro variables were then used to identify the tentative autoregressive and moving average orders for time series modeling using PROC ARIMA.

DISCLAIMER

This report is released to inform interested parties of research and to encourage discussion. The views expressed are those of the authors and not necessarily those of the U.S. Census Bureau.

INTRODUCTION

SAS is designed to process observations in a dataset in the DATA step. Although there are many procedures available to check through variables, we encountered a problem when we tried to locate a specific number within a dataset. The twist in our situation was that the location of the minimum value in the dataset was more important than the minimum value itself. Finding the minimum of a matrix is simple enough, but what do you do if you need to find the row and column that correspond to the minimum of a matrix?

Using publicly available historical shipments and inventories data from the U.S. Census Bureau's office of Manufacturers' Shipments, Inventories, and Orders (M3)², we created an autoregressive integrated moving average (ARIMA) model to try to predict the value of shipments (VS) / total inventory (TI) ratio over time.
An ARIMA model involves autoregressive (AR) terms, which are past values of the series, and moving average (MA) terms, which are past values of random errors. The number of AR terms is denoted by p, and the number of MA terms is denoted by q. There are several methods to tentatively identify suitable p and q for a model. This paper focuses on using the minimum information criterion³ to accomplish this. The minimum information criterion is a statistic based on the residuals from the model plus a penalty that is a function of the number of model parameters. When the MINIC option is used in the IDENTIFY statement, PROC ARIMA produces a matrix in which the rows are a specified range of p values, the columns are a specified range of q values, and the entries are values of the information criterion (Brocklebank, 2005).

¹ SAS and all other SAS Institute Inc. product or service names are registered trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.

² Estimates of manufacturers' shipments, inventories, and orders are subject to survey error and revision. For further details on survey design, methodology, and data limitations, see http://www.census.gov/manufacturing/m3.

³ See http://support.sas.com/documentation/cdl/en/etsug/60372/html/default/viewer.htm#etsug_arima_sect030.htm for detailed information.
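For reference, the model structure and the criterion described in the Introduction can be written out. This is a sketch using common textbook and SAS/ETS conventions rather than formulas taken from the paper. The symbols are introduced here for illustration only: w_t is the (differenced) series, mu its mean, epsilon_t the random error, sigma-tilde squared the estimated error variance for a given (p, q), and n the number of observations. Sign conventions for the MA terms vary between texts.

```latex
% ARMA(p, q) model for the series w_t (assumed notation)
(w_t - \mu) = \sum_{i=1}^{p} \phi_i \,(w_{t-i} - \mu)
              + \varepsilon_t - \sum_{j=1}^{q} \theta_j \,\varepsilon_{t-j}

% BIC-type information criterion of the kind tabulated by MINIC:
% a residual-variance term plus a penalty growing with p + q
\mathrm{BIC}(p, q) = \ln\!\left(\tilde{\sigma}^{2}_{(p,q)}\right)
                     + \frac{2\,(p + q)\,\ln n}{n}
```

The first term shrinks as the fit improves, while the second grows with the number of parameters, so the smallest entry of the matrix marks the tentative (p, q) pair.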

EXPLANATION OF CODE

We used the following PROC ARIMA code to produce the matrix shown in Table 1. Because PROC ARIMA writes its results directly to the output window, we used ODS OUTPUT to create a dataset to house our matrix. ODS OUTPUT takes results from the output window and creates a dataset with that information; in this case, we wanted the Minimum Information Criterion table. We called the resulting dataset MINICDATA.

    ods output "Minimum Information Criterion"=minicdata;
    proc arima data=mergevsti;
       identify var=ratio(1) minic p=(1:7) q=(1:7) alpha=.1;
    quit;
    ods output close;

To test a fairly wide range of possible p and q values, we programmed PROC ARIMA to calculate the information criterion with p and q both ranging from 1 to 7. Usually, these values are restricted by business rules, which are processing criteria specific to a given survey. Such rules may require the orders to be below a certain bound, for example p ≤ 3 and q ≤ 4, or impose other restrictions, such as not having both p and q in the same model (a "mixed model"). The p values represent the number of autoregressive terms and are the rows of the matrix. The q values represent the number of moving average terms and are the columns of the matrix. Note that the ratio(1) part of the code takes the first difference of the series. We used this method with this series to achieve stationarity.
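When capturing a table with ODS OUTPUT, it can help to confirm how the procedure names that table. The sketch below is not part of the authors' program; it assumes the mergevsti dataset from Appendix A already exists and uses ODS TRACE to list each output object in the log.

```sas
/* Sketch: list the ODS output objects that PROC ARIMA produces.      */
/* Assumes the mergevsti dataset from Appendix A has been created.    */
ods trace on;    /* write each output object's name, label, and path to the log */
proc arima data=mergevsti;
   identify var=ratio(1) minic p=(1:7) q=(1:7) alpha=.1;
quit;
ods trace off;   /* stop tracing */
```

The log entry for the criterion table shows the name and label that an ODS OUTPUT statement can refer to.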
Table 1. Minimum Information Criterion

Lags        MA 1       MA 2       MA 3       MA 4       MA 5       MA 6       MA 7
AR 1    -8.9123    -8.89189   -8.92248   -8.90214   -8.87953   -8.85736   -8.86446
AR 2    -8.88938   -8.87657   -8.89979   -8.88173   -8.86487   -8.84228   -8.84288
AR 3    -8.94442   -8.92118   -8.90444   -8.89207   -8.86915   -8.86154   -8.86481
AR 4    -8.92167   -8.89854   -8.88296   -8.87913   -8.85824   -8.84391   -8.84368
AR 5    -8.89854   -8.87564   -8.86011   -8.85737   -8.83558   -8.82066   -8.82182
AR 6    -8.87571   -8.85268   -8.84426   -8.83715   -8.81434   -8.81242   -8.80423
AR 7    -8.89687   -8.87379   -8.85815   -8.83566   -8.81606   -8.80399   -8.78121

We then manipulated the matrix by deleting the first row, because a business rule requires a prediction to be based on at least two prior observations. This gave us the matrix shown in Table 2.

Table 2. Minimum Information Criterion

Lags        MA 1       MA 2       MA 3       MA 4       MA 5       MA 6       MA 7
AR 2    -8.88938   -8.87657   -8.89979   -8.88173   -8.86487   -8.84228   -8.84288
AR 3    -8.94442*  -8.92118   -8.90444   -8.89207   -8.86915   -8.86154   -8.86481
AR 4    -8.92167   -8.89854   -8.88296   -8.87913   -8.85824   -8.84391   -8.84368
AR 5    -8.89854   -8.87564   -8.86011   -8.85737   -8.83558   -8.82066   -8.82182
AR 6    -8.87571   -8.85268   -8.84426   -8.83715   -8.81434   -8.81242   -8.80423
AR 7    -8.89687   -8.87379   -8.85815   -8.83566   -8.81606   -8.80399   -8.78121

* Minimum value.

By observation, we can see that the minimum of this matrix is the value -8.94442, which we've highlighted in Table 2, corresponding to the AR 3 and MA 1 location. Therefore, we want our macro variables to be p = 3 and q = 1.

Our first solution was to make a dataset for each column, calculate the minimum in each dataset using PROC MEANS, and keep the column with the minimum of the minimums. We would then loop through that column and note the row in which the minimum occurred. This was a slow process for the simple matrix we were using, and it would be even slower for matrices with many more rows and columns.

Our second solution was to use PROC IML. We read in the MINICDATA dataset and looped through the rows (i) and columns (j), storing i and j where the matrix entry was the minimum.

PROC IML is different from other PROCs. It does not use the DATA or SET statements of the DATA step. While the USE statement is like the SET statement, no columns are read into IML unless the READ statement is used. We used the READ ALL INTO statement to bring the MINICDATA dataset into IML as a matrix, which we named pqmat. We also needed to tell PROC IML where to find the row and column names: the ROWNAME= option takes the row labels from the rowname variable in the dataset, and the COLNAME= option stores the names of the variables that were read, which begin with the string ma, in a matrix we called ma_.

    proc iml;
    use minicdata;
    read all into pqmat[rowname=rowname colname=ma_];

Next, we obtained the number of rows and columns. These values were used as the bounds of the DO loops over the rows and columns. We also calculated the minimum of the matrix here. Calculating the minimum outside the DO loops is more efficient than calculating it inside them, because it only needs to be calculated once rather than on every iteration.

    nrow=nrow(pqmat);
    ncol=ncol(pqmat);
    minmatrix=min(pqmat);

Below are the DO loops. Here we stored the values of i and j in macro variables whenever the entry at (i,j) equaled the minimum identified above.
The PRINT statement is not required in this step, but it is useful for confirming that the matrix was created correctly. We printed nrow and ncol to confirm that we had the correct number of rows and columns in the matrix. This gave us further confirmation that the data had been created properly.

    print pqmat nrow ncol;
    do i=1 to nrow;
       do j=1 to ncol;
          if pqmat[i,j]=minmatrix then call symput('i',left(char(i)));
          if pqmat[i,j]=minmatrix then call symput('j',left(char(j)));
       end;
    end;
    quit;

Next, we used i and j to create the macro variables p and q. This correction step was needed because the matrix indices i and j start at 1, while the first row of the matrix no longer corresponds to p = 1. Since we have a business rule that future observations in our time series must be based on at least two past observations, p must be at least two, so the value of p is calculated by increasing the value of i by one. Because q starts at one, no additional correction was made to q. However, we left this portion of code in for q to remind us that if we start at a different range for q, or have other business rules for q, then we would need to modify j to calculate q as well. For the dataset above, the minimum was identified at i = 2 and j = 1. Therefore, this correction step calculates the correct p = 3 and q = 1.

    %let p=%eval(&i+1);
    %let q=%eval(&j);

Last, we wrote the values of p and q to the log window for review.

    %put p=&p;
    %put q=&q;

Our code determined that p = 3 and q = 1 gives the model with the minimum information criterion, which is therefore the preferred tentative model. Using PROC ARIMA with the ESTIMATE statement and macro variables evaluating to p = 3 and q = 1, the output window showed the parameter estimates for the model. Figure 1 shows the output screen. The IML code printed not only the matrix with its row and column names, but also the variables we created showing how many rows and columns are in the matrix.

Fig. 1

A modification to the code would be necessary if, for example, the matrix had more than one cell containing the minimum. In that case, our code would merely store the last (p, q) pair where the minimum occurred. In our context, models that happen to have the same value of the minimum information criterion would be statistically indistinguishable. The modification would be to insert a counter to total the number of minimums identified, and to store the different p and q values by incorporating this counter in the macro variable names.

By storing these values as macro variables, we were able to use them to set up parameters for later use. To see this demonstrated in an entire process, please refer to our parent paper, "Creating and Displaying an Econometric Model Automatically," paper # 52, by the same authors.

CONCLUSION

Finding the minimum of a matrix is a simple task, but finding the row and column corresponding to the minimum of the matrix is somewhat more involved. PROC IML gives us the flexibility to search a SAS dataset through its observations, and also through its variables. Programmers often need greater flexibility than row-based calculations or comparisons. PROC IML grants that flexibility by using a matrix for its calculations, which allows programmers to compare across both observations and columns. We learned that some care needs to be taken to identify the correct row and column if rows or columns are being deleted. Additionally, the code needs to be modified if there is more than one minimum. Note that while we worked with the minimum function, the method is general, and many criteria can be used depending on the application.
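One way the multiple-minimum modification might look is sketched below. It is not the authors' code: it swaps the nested DO loops for the SAS/IML LOC and NDX2SUB functions (available in recent SAS/IML releases), and the 3 x 3 matrix, the offsets pstart and qstart, and the macro variable names nmin, p1, q1, p2, q2 are all hypothetical names chosen for illustration.

```sas
proc iml;
   /* hypothetical criterion matrix with a tied minimum of -8.9 */
   pqmat = {-8.1 -8.9 -8.3,
            -8.9 -8.2 -8.4,
            -8.0 -8.5 -8.6};
   pstart = 2;   /* assumed AR order represented by row 1    */
   qstart = 1;   /* assumed MA order represented by column 1 */

   idx  = loc(pqmat = min(pqmat));          /* positions of every cell equal to the minimum */
   subs = ndx2sub(dimension(pqmat), idx);   /* one (row, column) pair per position          */
   nmin = nrow(subs);                       /* counter: number of minimums found            */

   call symputx('nmin', nmin);
   do k = 1 to nmin;
      /* build names p1, q1, p2, q2, ... from the counter */
      call symputx('p' + strip(char(k)), subs[k,1] + pstart - 1);
      call symputx('q' + strip(char(k)), subs[k,2] + qstart - 1);
   end;
quit;

%put nmin=&nmin p1=&p1 q1=&q1 p2=&p2 q2=&q2;
```

In this hypothetical matrix the minimum appears in two cells, so two (p, q) pairs are stored instead of only the last one. Replacing the literal matrix with the USE and READ statements from the paper would apply the same idea to MINICDATA.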

REFERENCES

Brocklebank, John C. and Dickey, David (2005). SAS for Forecasting Time Series. Cary, NC: SAS Institute Inc.

ACKNOWLEDGEMENTS

The authors would like to thank Courtney Harris, Jan Lattimore, and Amy Newman-Smith for their reviews and suggestions, which greatly improved this paper.

CONTACT INFORMATION

If you would like to submit comments and questions, please contact the authors at:

Justin Z. Smith
U.S. Census Bureau
4600 Silver Hill Road, Room 7K072D
Suitland, MD, 20746
301-763-4413
Justin.z.smith@census.gov

William Gui Zupko II
U.S. Census Bureau
4600 Silver Hill Road, Room 7K072D
Suitland, MD, 20746
301-763-3446
william.e.zupko.ii@census.gov

Appendix A: Input Data

/*M3 data, seasonally adjusted VS and TI at the total mfg level*/
data mergevsti;
input date:monyy5. vs_total ti_total;
format date monyy5.;
datalines;
jan92 227886 377747
feb92 228960 376638
mar92 238575 377648
apr92 239888 377531
may92 243627 377654
jun92 245708 377082
jul92 245763 377349
aug92 242836 379429
sep92 244603 379462
oct92 243490 378987
nov92 247071 377905
dec92 248163 378651
jan93 247798 376288
feb93 249667 375910
mar93 250209 377934
apr93 248805 378770
may93 251941 378971
jun93 252936 379478
jul93 249094 378389
aug93 249398 378743
sep93 252572 379894
oct93 256506 379468
nov93 255374 379739
dec93 255982 379681
jan94 260272 380659
feb94 261247 381759
mar94 261706 382947
apr94 263348 383691
may94 266838 386139
jun94 267949 387392
jul94 271236 391085
aug94 273787 392423
sep94 272880 392872
oct94 276547 394852
nov94 279426 397530
dec94 286836 399852
jan95 287105 403161
feb95 288543 406220
mar95 284558 410024
apr95 289663 412534
may95 285586 416327
jun95 288701 417926
jul95 287937 419593
aug95 289702 420152
sep95 295112 422860
oct95 292999 423719
nov95 291615 423301
dec95 300577 424742
jan96 287221 427690
feb96 282396 427844
mar96 294270 428362
apr96 295746 427394
may96 299515 426806
jun96 303345 426639
jul96 299811 426888
aug96 301541 427364
sep96 305512 429140
oct96 301295 429950
nov96 308479 430558
dec96 307658 430446
jan97 306977 431518
feb97 314531 432085
mar97 315603 431909
apr97 317686 433933
may97 315287 435245
jun97 320528 435749
jul97 322600 436683
aug97 322149 438307
sep97 323819 439556
oct97 322226 442060
nov97 328562 441998
dec97 326506 443529
jan98 321169 445177
feb98 329667 447602
mar98 328451 448496
apr98 324302 449386
may98 325284 450058
jun98 319938 450950
jul98 315796 451158
aug98 323818 451074
sep98 325481 451086
oct98 327564 452093
nov98 328584 452407
dec98 328455 448974
jan99 328480 446675
feb99 332452 446854
mar99 328838 448629
apr99 330167 448150
may99 335194 449765
jun99 333653 449522
jul99 334244 451565
aug99 340453 452055
sep99 338604 454960
oct99 343111 456154
nov99 343605 460219
dec99 344306 463529
jan00 352338 463722
feb00 340160 466474
mar00 349152 466898
apr00 354512 469510
may00 350027 469787
jun00 355114 473820
jul00 351491 475724
aug00 347221 477621
sep00 355737 478058
oct00 349286 480193
nov00 346501 482493
dec00 350443 481233
jan01 342474 483147
feb01 346204 477906
mar01 340459 472767
apr01 330526 470030
may01 338811 465649
jun01 330972 459958
jul01 327969 454392
aug01 330804 449989
sep01 321520 445089
oct01 320508 440129
nov01 319528 434128
dec01 322291 427806
jan02 320292 425185
feb02 319887 420256
mar02 321723 417758
apr02 326698 417589
may02 328712 416421
jun02 327930 416950
jul02 325813 417439
aug02 329800 418515
sep02 331563 420253
oct02 330077 420500
nov02 330039 419863
dec02 324698 422953
jan03 329297 421283
feb03 332730 425181
mar03 337001 422819
apr03 325544 422128
may03 328434 420338
jun03 332584 417332
jul03 336592 414243
aug03 332740 412691
sep03 340301 410476
oct03 340496 410213
nov03 340123 408691
dec03 340947 408273
jan04 339466 408024
feb04 339840 410902
mar04 354606 413114
apr04 350930 415066
may04 353062 418637
jun04 356104 422815
jul04 355761 426207
aug04 362223 429598
sep04 363487 430679
oct04 369119 434481
nov04 373258 439752
dec04 376479 440780
jan05 382149 446921
feb05 382258 452051
mar05 385607 456098
apr05 390747 457412
may05 389197 457576
jun05 391153 458868
jul05 388728 462230
aug05 398153 461826
sep05 405096 463144
oct05 405681 467466
nov05 408000 470073
dec05 415856 473977
jan06 416409 480035
feb06 414458 481252
mar06 419071 486579
apr06 412626 492054
may06 423338 496788
jun06 423041 502704
jul06 415986 507534
aug06 423874 511417
sep06 412905 516201
oct06 410528 519492
nov06 415622 521651
dec06 429422 522693
jan07 419566 524915
feb07 427395 527627
mar07 435243 529591
apr07 438289 533764
may07 444854 538556
jun07 442764 541364
jul07 445213 541468
aug07 445091 542181
sep07 446177 548284
oct07 452258 551259
nov07 462938 555956
dec07 461934 562058
jan08 467518 570084
feb08 461513 572292
mar08 461387 576268
apr08 474957 576865
may08 475426 577593
jun08 481484 584288
jul08 485305 585792
aug08 467038 587005
sep08 453849 580966
oct08 434235 575167
nov08 399940 567341
dec08 382988 550196
jan09 369445 545645
feb09 370864 537665
mar09 360570 531471
apr09 357454 525473
may09 356007 521764
jun09 362735 516523
jul09 367357 512990
aug09 368630 509820
sep09 374858 506784
oct09 378727 510371
nov09 383894 512617
dec09 389564 512889
jan10 391192 513731
feb10 389580 518607
mar10 397323 520370
apr10 400920 523410
may10 396819 523255
jun10 393959 527044
jul10 402458 530012
aug10 401696 532323
sep10 405645 537957
oct10 408082 544410
nov10 412779 550059
dec10 423543 557617
jan11 431064 565167
feb11 431886 571854
mar11 445386 580076
apr11 443493 588509
may11 443344 592935
jun11 446021 595119
jul11 453156 598034
;
run;

/*create the ratio of value of shipments over total inventory*/
data mergevsti;
set mergevsti;
ratio=vs_total/ti_total;
run;

Appendix B: PROC ARIMA and PROC IML code

/*get the minimum information criterion of the first differenced series*/
ods output "Minimum Information Criterion"=minicdata;
proc arima data=mergevsti;
   identify var=ratio(1) minic p=(1:7) q=(1:7) alpha=.1; /*took the first difference*/
quit;
ods output close;

/*clean up the minicdata dataset. This is because we want a forecast based on at least 2 past observations*/
data minicdata;
   set minicdata;
   /*upcase makes the comparison independent of the case of the row label*/
   if upcase(rowname) in('AR 1') then delete;
run;

/*use proc IML to manipulate dataset to get p and q for suggested AR(p) and MA(q) process*/
proc iml;
   use minicdata;
   read all into pqmat[rowname=rowname colname=ma_];

   /*get num of rows and columns. This will be used in the DO loop*/
   nrow=nrow(pqmat);
   ncol=ncol(pqmat);
   minmatrix=min(pqmat); /*calculate minimum of matrix*/

   /*display matrix, nrow, and ncol, get i and j*/
   print pqmat nrow ncol;
   do i=1 to nrow;
      do j=1 to ncol;
         if pqmat[i,j]=minmatrix then call symput('i',left(char(i)));
         if pqmat[i,j]=minmatrix then call symput('j',left(char(j)));
      end;
   end;
quit;

/*this adjustment is done to get the correct p and q*/
/*this is because we want a forecast based on at least 2 past observations*/
%let p=%eval(&i+1);
%let q=%eval(&j);
%put p=&p;
%put q=&q;

/*estimate the model using the identified p and q*/
proc arima data=mergevsti;
   identify var=ratio;
   estimate p=&p q=&q;
quit;
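The estimation step above identifies the undifferenced ratio series, whereas the information criterion was computed on its first difference. If the intent is to fit the tentative orders to the same differenced series, a variant along the following lines could be used. This is a sketch under that assumption, not the authors' code, and it relies on the macro variables p and q and the mergevsti dataset created earlier.

```sas
/* Sketch: estimate on the first-differenced series, matching the     */
/* ratio(1) specification used when the MINIC table was produced.     */
/* Assumes &p and &q were set by the %let statements above.           */
proc arima data=mergevsti;
   identify var=ratio(1) noprint;   /* same differencing as the MINIC step */
   estimate p=&p q=&q;
quit;
```

Comparing the parameter estimates from both versions shows how much the choice of differencing affects the fitted model.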