Using SAS to Manage Biological Species Data and Calculate Diversity Indices

Size: px
Start display at page:

Download "Using SAS to Manage Biological Species Data and Calculate Diversity Indices"

Transcription

1 SCSUG November 2014 Using SAS to Manage Biological Species Data and Calculate Diversity Indices ABSTRACT Paul A. Montagna, Harte Research Institute, TAMU-CC, Corpus Christi, TX Species level information is necessary for conservation, management, and research into ecological and environmental systems. A common problem is that one doesn t know what the community composition will be, so data management is complex because it has to be managed by species and not by samples. Once the data management problem is solved, there will be a need to summarize information at the sample level, and perhaps calculate a diversity index. This presentation will describe algorithms for commonly used diversity indices, and demonstrate how SAS was used to solve the data management and computational problems. KEYWORDS PROC SORT, PROC TRANSPOSE, DATA STEP PROGRAMMING, ARRAYS, DO LOOPS, GOTO INTRODUCTION Biodiversity is a common indicator of ecosystem health and often is measured in environmental assessment studies. Both terrestrial and aquatic samples are composed of multiple species and it is useful to summarize the species in a sample by computing a diversity index. Diversity indices intend to describe richness (i.e., how many species are present) and evenness (i.e., how the counts are distributed among the species). For example, it is common for disturbed samples to have dominance (i.e., low evenness regardless of the richness) compared to samples from undisturbed environments. An important attribute of diversity is that the more samples taken, the more species are found. This concept is referred to as the species-sample size principle and it implies that one will always find new species. This attribute implies that when creating biological databases, it is always advisable to use a vector approach where each species from a sample is a record (Table 1). A matrix approach (where each sample is a record and species are in columns) will not work because it would be necessary to continually add new columns when new species are found. Thus, samples are decomposed and must be reassembled in order calculate diversity indices. In the present study, various diversity algorithms are discussed, and then SAS data base approaches and analytical methods are shown to implement the equations. MATHEMATICAL METHODS There are many approaches to calculating diversity indices, and most fall into one of the two classes being: richness indices or evenness indices (Ludwig and Reynolds 1988). The simplest richness index is R, the total number of species (equation 1).... (1) Richness indices are weak because they vary as a function of sample size. One of the first richness indices to account for sample size (n) is R1 (Margalef 1958).... (2) A similar formulation which is a less severe transformation of s sample size is R2 (Menhinick 1964).... (3) There are many different forms of diversity indices that try to account for evenness as well as richness. Hill (1973) reviewed many and demonstrated that they were part of a family or series of diversity numbers that are based on the proportional representation of each of the i species in a sample. Hill s diversity numbers are N0, N1, and N2. N0 is the total count, which I define above as R, the total number of species. Hill s N1 is the exponential form of the Shannon and Weaver (1948) H' diversity index. where Table 1. Biological data. Sample Species Value (4) 1

2 [( ) ( )]... (5) Hill s N2 is the reciprocal of Simpson (1949) index λ.... (6) where ( )... (7) Evenness indices have also been created. An evenness index should be maximum if all the species are equally abundant, and then then should decrease to zero as relative abundances. Again Hill (1973) shows how a series of proportional ratios of Hill s Numbered indices can be used to generalize the equations and computing. The most common evenness index used by ecologists is Pielou (1975) J' where a perfectly even community has an index value of one because there is one individual on a proportional basis for every species. Hill named J' E1 Sheldon (1969) proposed the exponential form of E1, and Hill named this E2. Heip (1974) proposed subtracting the minimum from E2, and Hill named this E3.... (8)... (9)... (10) Hill (1973) proposed the ration of N2 to N1, and this is named E4. This ratio is useful because is it in units of very abundant species. That is, as N1 and N2 reduce to one species, the ratio reduces to a value of (11) Alatalo (1981) proposed a modified Hill s ratio that approaches zero as a single species becomes more dominant. ( )... (12) DATA METHODS The method to create biological databases is an important decision. Species data are multivariate responses in a sampling design, meaning the dataset will contain samples as rows and species as columns, and it will be easier to analyze the data once it is in this matrix format. However, it is recommended that biological community data be stored in a vector format (Table 1). The data model should contain every species within every sample as rows and the count (or size) of the species in a column (Table 1). This is important because, we never know which species will be found in any given sample, and new species are found all the time. So, most biological data is stored as a vector, not a matrix. Often samples are collected in a complex experimental design, so you can either include columns for all elements of the design, or include a key variable for each measurement value and then create a second relational table with key variables. It is important to distinguish your data from your metadata. Metadata is data about the data. Include the data source with contact information, what the weather was like, the exact georeferenced location, the units of the variables, adequate descriptions of variable names and codes used to identify treatments, and any other information that is usually overlooked or assumed to be easy to remember. Just remember this: You won t remember it. It is very important to write down all the details, and then record it in your data base. For example, note that in Table 1, species 1 occurs in samples 2 and 3, but in sample 2 there are two new species (3 and 4). This storage approach creates a problem because simply computing sample means with PROC MEANS would return the wrong values. For example the average for Species 2 would be 5, because the zero value for Species 2 was not found in the second sample. Zeros must be in the data set to calculate means. This is easily accomplished by transposing the dataset twice. The first transpose will create the missing cells by using the PREFIX option and the ID statement so that each species is placed in its own column. The second transpose returns the data set to the original format, and then it is easy to replace the missing values with zeros as in the example below. 2

3 PROC SORT DATA=sp; BY sample species; PROC TRANSPOSE DATA=sp OUT=tsp PREFIX=S; ID species; PROC TRANSPOSE DATA=tsp OUT=sp1; DATA sp2; SET sp1; IF value=. THEN value=0; Table 2. Results of first transpose. The result of the program to put Sample S1 S2 S3 S4 zeros in the dataset for missing species is shown in Table 2. The vector in Table 1 (data sp ) has been converted to a matrix (data tsp. The second transpose puts the dataset back into a vector format (data sp1 ), and then the data step is used to convert the missing values to zero (Table 3, data sp2 ). The data set with zeros can be used to calculate the average abundance of each species correctly. Table 3. Results of second transpose. Sample Species Value While using the ID statement in TRANSPOSE is necessary to create a dataset with zeros, it is not if our goal is to simply convert the species data into an array that can then be used to calculate diversity indices. In fact, to calculate the diversity indices, we only need to TRANSPOSE once without the ID statement. An example of this is listed below: PROC SORT DATA=sp; BY sample species; PROC TRANSPOSE DATA=sp OUT=tsp2 PREFIX=S; Table 4. Results of transpose without the ID statement. Sample S1 S2 S This data transform creates an array with missing values (Table 4, data tsp2). Notice that the values of the species are now compressed and that the column names (S1 to S3) no longer represent the individual species, but just the next species found in a sample. In contrast, notice in Table 2, which is a result of using the ID statement, that the column names do represent the individual specific species. Leaving out the ID statement creates an array, which is more useful for creating the programming code to calculate the diversity indices. The next task is to calculate various measures of biodiversity (Equations 1-13). This is best accomplished by transposing the data and treating each sample as an array without the ID statement as done in Table 4. We can also calculate total abundance of organisms (i.e., to sum values). First, the ARRAY has to be declared, then it is a simple matter to calculate the index contribution for each species, and then sum the individual contributions. Two cautions: 1) include a test to make certain data exists on the line, that is the sample is not empty or no organisms were found, or there are zeros in the row, and 2) make sure there are no divide by zero errors, which can happen if there is just one species in a sample. Although not included here, it is simple to add an IF THEN ELSE type statement or GOTO statement and test for the values of denominators in the equations. The SAS DATA step code below calculates the indices. Also, don t forget to FORMAT the calculated variables to display the correct number of significant digits. 3

4 PROC TRANSPOSE DATA=sp OUT=tsp2 PREFIX=S; DATA diversity; SET tsp2; ARRAY s s1-s100; /* Create temporary species array */ DROP s1-s100 _name_; DO OVER s; /* Loop to convert zero to missing */ IF s=0 THEN s=.; END; maxn=max(of s1-s100); /* maxn = highest number of individuals */ tn=sum(of s1-s100); /* tn = total number of individuals */ R=n(of s1-s100); /* R = Richness, total species EQ-1 */ R1=(R-1)/log(tn); /* R1 = Margalef index EQ-2 */ R2=R/sqrt(tn); /* R2 = Menhinick index EQ-3 */ Hprime=0; Lambda=0; /* Define all indices and set to zero */ N0=0; N1=0; N2=0; E1=0; E2=0; E3=0; E4=0; E5=0; DO OVER s; /* Loop to calculate diversity indices */ If s=. THEN GOTO msp; /* Skip species with missing values */ Hprime = Hprime + (-(s/tn) * log(s/tn)); /* Shannon H index EQ-5 */ Lambda = Lambda + (s/tn)**2; /* Simpson λ index EQ-7 */ msp: ; END; N0=R; /* Hill's Number 0, same as R EQ-1 */ N1=exp(Hprime); /* Hill's number 1 EQ-4 */ N2=1/Lambda; /* Hill's number 2 EQ-6 */ E1=log(n1)/log(n0); /* Eveness index, J, EQ-8 */ E2=N1/N0; /* Eveness index EQ-9 */ E3=(N1-1)/(N0-1); /* Eveness index EQ-10 */ E4=N2/N1; /* Eveness index EQ-11 */ E5=(N2-1)/(N1-1); /* Eveness index EQ-12 */ OUTPUT; /* Add new calculated values to dataset */ RETURN; /* Start the next data step */ FORMAT Hprime Lambda R1 R2 N0 N1 N2 E1 E2 E3 E4 E5 8.2; FORMAT tn R 8.0; DROP maxn; The program above contains two complex action steps necessary to calculate the proportions of each individual species: the ARRAY and DO statements. The ARRAY statement creates a series of temporary variables that are used to identify each individual species. The DO statement allows for repetitive processing in a single data step. For example, the first DO loop tests each species (s) and if it is zero it converts it to a missing value. Combining the ARRAY and DO statements is necessary to calculate the proportional representation of each species to the total abundance. This is implemented in the second DO loop where the individual species (s) is divided by the total number of organisms (tn). One caution about DO loops, you have to be careful which letter used as a name and make sure it is unique to the data set. For example, here S is used for each species column, so if we had another numeric variable beginning with an S then it could be processed as if it were a species. This would not occur if it was a character variable. SAS code can be added to test if there are no species present (If tn=. OR tn=0 THEN ), and then the indices can be set to either missing or zero. Likewise, it is simple to add a test if the denominator would be zero, and execute the calculation only for non-zero counts. CONCLUSION The calculation of diversity indices is a common practice among those who collect, manage, and study biological community samples or data. Biological community data has a unique hierarchical structure because each sample generates measurement values (in this case count data) for a unique number of species within each sample. This structure alone has implications on how to design data models for biological community structure data. 4

5 The SAS system provides powerful tools for data management, statistical analysis, and data visualization. In fact, it would very difficult and time consuming to perform the analyses presented here without the ability to easily transpose data and write code for complex equations. All diversity indices have a weakness in that they are species independent. For example, a sample with 3 species and a second sample with 3 different species have the same richness, but the samples are completely different. Also some indices are very sensitive to sample size, such as R1 and R2. When samples sizes are equal, R is just fine. The most commonly used indices are H' (Equation 5) for diversity and J' (Equation 8) for evenness. While there are many diversity indices, some are more easily interpretable than others. For example Hill s diversity indices N1 and N2 are in the units of number of dominant species, and thus are easy to understand. Likewise, Hill s evenness index E5 is also interpretable because it approaches zero as a single species becomes dominant. REFERENCES Alatalo, R.V Problems in the measurement of evenness in ecology. Oikos 37: Heip, C A new index measuring evenness. Journal of the Marine Biological Association 54: Hill, M.O Diversity and evenness: a unifying notation and its consequences. Ecology 54: Ludwig, J.A. and J.F. Reynolds Statistical Ecology: A Primer on Methods and Computing, John Wiley & Sons, New York. Margalef, R Information Theory in Ecology. General Systematics 3: Pielou, E.C Ecological Diversity. Wiley, New York. Shannon, C. E. and W. Weaver The Mathematical Theory of Communication. University of Illinois Press. Urbana, IL. Sheldon, A.L Equitability indices: dependence on species count. Ecology 50: Simpson, E.H Measurement of diversity. Nature 163:688 ACKNOWLEDGMENTS This publication was made possible, in part, by the National Oceanic and Atmospheric Administration, Office of Education Educational Partnership Program award (NA11SEC ). Its contents are solely the responsibility of the award recipient Paul Montagna and do not necessarily represent the official views of the U.S. Department of Commerce, National Oceanic and Atmospheric Administration. CONTACT INFORMATION Your comments and questions are valued and encouraged. Contact the author at: Paul A. Montagna Harte Research Institute for Gulf of Mexico Studies Texas A&M University-Corpus Christi 6300 Ocean Drive, Unit 5869 Corpus Christi, Texas paul.montagna@tamucc.edu Office (361) TRADEMARK INFORMATION SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. indicates USA registration. Other brand and product names are trademarks of their respective companies. 5

CMISS the SAS Function You May Have Been MISSING Mira Shapiro, Analytic Designers LLC, Bethesda, MD

CMISS the SAS Function You May Have Been MISSING Mira Shapiro, Analytic Designers LLC, Bethesda, MD ABSTRACT SESUG 2016 - RV-201 CMISS the SAS Function You May Have Been MISSING Mira Shapiro, Analytic Designers LLC, Bethesda, MD Those of us who have been using SAS for more than a few years often rely

More information

Introduction to entropart

Introduction to entropart Introduction to entropart Eric Marcon AgroParisTech UMR EcoFoG Bruno Hérault Cirad UMR EcoFoG Abstract entropart is a package for R designed to estimate diversity based on HCDT entropy or similarity-based

More information

A Visual Step-by-step Approach to Converting an RTF File to an Excel File

A Visual Step-by-step Approach to Converting an RTF File to an Excel File A Visual Step-by-step Approach to Converting an RTF File to an Excel File Kirk Paul Lafler, Software Intelligence Corporation Abstract Rich Text Format (RTF) files incorporate basic typographical styling

More information

SAS/STAT 13.1 User s Guide. The NESTED Procedure

SAS/STAT 13.1 User s Guide. The NESTED Procedure SAS/STAT 13.1 User s Guide The NESTED Procedure This document is an individual chapter from SAS/STAT 13.1 User s Guide. The correct bibliographic citation for the complete manual is as follows: SAS Institute

More information

Producing Summary Tables in SAS Enterprise Guide

Producing Summary Tables in SAS Enterprise Guide Producing Summary Tables in SAS Enterprise Guide Lora D. Delwiche, University of California, Davis, CA Susan J. Slaughter, Avocet Solutions, Davis, CA ABSTRACT This paper shows, step-by-step, how to use

More information

Graph Theory for Modelling a Survey Questionnaire Pierpaolo Massoli, ISTAT via Adolfo Ravà 150, Roma, Italy

Graph Theory for Modelling a Survey Questionnaire Pierpaolo Massoli, ISTAT via Adolfo Ravà 150, Roma, Italy Graph Theory for Modelling a Survey Questionnaire Pierpaolo Massoli, ISTAT via Adolfo Ravà 150, 00142 Roma, Italy e-mail: pimassol@istat.it 1. Introduction Questions can be usually asked following specific

More information

1. Introduction. 2. Modelling elements III. CONCEPTS OF MODELLING. - Models in environmental sciences have five components:

1. Introduction. 2. Modelling elements III. CONCEPTS OF MODELLING. - Models in environmental sciences have five components: III. CONCEPTS OF MODELLING 1. INTRODUCTION 2. MODELLING ELEMENTS 3. THE MODELLING PROCEDURE 4. CONCEPTUAL MODELS 5. THE MODELLING PROCEDURE 6. SELECTION OF MODEL COMPLEXITY AND STRUCTURE 1 1. Introduction

More information

Describe the two most important ways in which subspaces of F D arise. (These ways were given as the motivation for looking at subspaces.

Describe the two most important ways in which subspaces of F D arise. (These ways were given as the motivation for looking at subspaces. Quiz Describe the two most important ways in which subspaces of F D arise. (These ways were given as the motivation for looking at subspaces.) What are the two subspaces associated with a matrix? Describe

More information

The NESTED Procedure (Chapter)

The NESTED Procedure (Chapter) SAS/STAT 9.3 User s Guide The NESTED Procedure (Chapter) SAS Documentation This document is an individual chapter from SAS/STAT 9.3 User s Guide. The correct bibliographic citation for the complete manual

More information

Not Just Merge - Complex Derivation Made Easy by Hash Object

Not Just Merge - Complex Derivation Made Easy by Hash Object ABSTRACT PharmaSUG 2015 - Paper BB18 Not Just Merge - Complex Derivation Made Easy by Hash Object Lu Zhang, PPD, Beijing, China Hash object is known as a data look-up technique widely used in data steps

More information

Section 1.1 Definitions and Properties

Section 1.1 Definitions and Properties Section 1.1 Definitions and Properties Objectives In this section, you will learn to: To successfully complete this section, you need to understand: Abbreviate repeated addition using Exponents and Square

More information

Linear Programming with Bounds

Linear Programming with Bounds Chapter 481 Linear Programming with Bounds Introduction Linear programming maximizes (or minimizes) a linear objective function subject to one or more constraints. The technique finds broad use in operations

More information

How to Conduct a Heuristic Evaluation

How to Conduct a Heuristic Evaluation Page 1 of 9 useit.com Papers and Essays Heuristic Evaluation How to conduct a heuristic evaluation How to Conduct a Heuristic Evaluation by Jakob Nielsen Heuristic evaluation (Nielsen and Molich, 1990;

More information

DESIGN OF A COMPOSITE ARITHMETIC UNIT FOR RATIONAL NUMBERS

DESIGN OF A COMPOSITE ARITHMETIC UNIT FOR RATIONAL NUMBERS DESIGN OF A COMPOSITE ARITHMETIC UNIT FOR RATIONAL NUMBERS Tomasz Pinkiewicz, Neville Holmes, and Tariq Jamil School of Computing University of Tasmania Launceston, Tasmania 7250 AUSTRALIA Abstract: As

More information

Paper CC-016. METHODOLOGY Suppose the data structure with m missing values for the row indices i=n-m+1,,n can be re-expressed by

Paper CC-016. METHODOLOGY Suppose the data structure with m missing values for the row indices i=n-m+1,,n can be re-expressed by Paper CC-016 A macro for nearest neighbor Lung-Chang Chien, University of North Carolina at Chapel Hill, Chapel Hill, NC Mark Weaver, Family Health International, Research Triangle Park, NC ABSTRACT SAS

More information

York Public Schools Subject Area: Mathematics Course: 6 th Grade Math NUMBER OF DAYS TAUGHT DATE

York Public Schools Subject Area: Mathematics Course: 6 th Grade Math NUMBER OF DAYS TAUGHT DATE 6.1.1.d 6.EE.A.1 Represent large numbers using exponential notation (i.e.. 10x10x10x10x10) (Review PV chart first) Write evaluate numerical expressions involving whole number exponents 5 days August, Lesson

More information

Move-to-front algorithm

Move-to-front algorithm Up to now, we have looked at codes for a set of symbols in an alphabet. We have also looked at the specific case that the alphabet is a set of integers. We will now study a few compression techniques in

More information

Gateway Regional School District VERTICAL ALIGNMENT OF MATHEMATICS STANDARDS Grades 3-6

Gateway Regional School District VERTICAL ALIGNMENT OF MATHEMATICS STANDARDS Grades 3-6 NUMBER SENSE & OPERATIONS 3.N.1 Exhibit an understanding of the values of the digits in the base ten number system by reading, modeling, writing, comparing, and ordering whole numbers through 9,999. Our

More information

Multi-Criteria Decision Making 1-AHP

Multi-Criteria Decision Making 1-AHP Multi-Criteria Decision Making 1-AHP Introduction In our complex world system, we are forced to cope with more problems than we have the resources to handle We a framework that enable us to think of complex

More information

ABSTRACT INTRODUCTION PROBLEM: TOO MUCH INFORMATION? math nrt scr. ID School Grade Gender Ethnicity read nrt scr

ABSTRACT INTRODUCTION PROBLEM: TOO MUCH INFORMATION? math nrt scr. ID School Grade Gender Ethnicity read nrt scr ABSTRACT A strategy for understanding your data: Binary Flags and PROC MEANS Glen Masuda, SRI International, Menlo Park, CA Tejaswini Tiruke, SRI International, Menlo Park, CA Many times projects have

More information

Software Development Techniques. 26 November Marking Scheme

Software Development Techniques. 26 November Marking Scheme Software Development Techniques 26 November 2015 Marking Scheme This marking scheme has been prepared as a guide only to markers. This is not a set of model answers, or the exclusive answers to the questions,

More information

DM and Cluster Identification Algorithm

DM and Cluster Identification Algorithm DM and Cluster Identification Algorithm Andrew Kusiak, Professor oratory Seamans Center Iowa City, Iowa - Tel: 9-9 Fax: 9- E-mail: andrew-kusiak@uiowa.edu Homepage: http://www.icaen.uiowa.edu/~ankusiak

More information

A SAS MACRO for estimating bootstrapped confidence intervals in dyadic regression

A SAS MACRO for estimating bootstrapped confidence intervals in dyadic regression A SAS MACRO for estimating bootstrapped confidence intervals in dyadic regression models. Robert E. Wickham, Texas Institute for Measurement, Evaluation, and Statistics, University of Houston, TX ABSTRACT

More information

Tweaking your tables: Suppressing superfluous subtotals in PROC TABULATE

Tweaking your tables: Suppressing superfluous subtotals in PROC TABULATE ABSTRACT Tweaking your tables: Suppressing superfluous subtotals in PROC TABULATE Steve Cavill, NSW Bureau of Crime Statistics and Research, Sydney, Australia PROC TABULATE is a great tool for generating

More information

Macros I Use Every Day (And You Can, Too!)

Macros I Use Every Day (And You Can, Too!) Paper 2500-2018 Macros I Use Every Day (And You Can, Too!) Joe DeShon ABSTRACT SAS macros are a powerful tool which can be used in all stages of SAS program development. Like most programmers, I have collected

More information

For Module 2 SKILLS CHECKLIST. Fraction Notation. George Hartas, MS. Educational Assistant for Mathematics Remediation MAT 025 Instructor

For Module 2 SKILLS CHECKLIST. Fraction Notation. George Hartas, MS. Educational Assistant for Mathematics Remediation MAT 025 Instructor Last Updated: // SKILLS CHECKLIST For Module Fraction Notation By George Hartas, MS Educational Assistant for Mathematics Remediation MAT 0 Instructor Assignment, Section. Divisibility SKILL: Determine

More information

Maximizing Statistical Interactions Part II: Database Issues Provided by: The Biostatistics Collaboration Center (BCC) at Northwestern University

Maximizing Statistical Interactions Part II: Database Issues Provided by: The Biostatistics Collaboration Center (BCC) at Northwestern University Maximizing Statistical Interactions Part II: Database Issues Provided by: The Biostatistics Collaboration Center (BCC) at Northwestern University While your data tables or spreadsheets may look good to

More information

(Refer Slide Time 3:31)

(Refer Slide Time 3:31) Digital Circuits and Systems Prof. S. Srinivasan Department of Electrical Engineering Indian Institute of Technology Madras Lecture - 5 Logic Simplification In the last lecture we talked about logic functions

More information

by Kevin M. Chevalier

by Kevin M. Chevalier Precalculus Review Handout.4 Trigonometric Functions: Identities, Graphs, and Equations, Part I by Kevin M. Chevalier Angles, Degree and Radian Measures An angle is composed of: an initial ray (side) -

More information

A SAS Macro Utility to Modify and Validate RTF Outputs for Regional Analyses Jagan Mohan Achi, PPD, Austin, TX Joshua N. Winters, PPD, Rochester, NY

A SAS Macro Utility to Modify and Validate RTF Outputs for Regional Analyses Jagan Mohan Achi, PPD, Austin, TX Joshua N. Winters, PPD, Rochester, NY PharmaSUG 2014 - Paper BB14 A SAS Macro Utility to Modify and Validate RTF Outputs for Regional Analyses Jagan Mohan Achi, PPD, Austin, TX Joshua N. Winters, PPD, Rochester, NY ABSTRACT Clinical Study

More information

Multiplying and Dividing Rational Expressions

Multiplying and Dividing Rational Expressions Page 1 of 14 Multiplying and Dividing Rational Expressions Attendance Problems. Simplify each expression. Assume all variables are nonzero. x 6 y 2 1. x 5 x 2 2. y 3 y 3 3. 4. x 2 y 5 Factor each expression.

More information

Data Manipulation with SQL Mara Werner, HHS/OIG, Chicago, IL

Data Manipulation with SQL Mara Werner, HHS/OIG, Chicago, IL Paper TS05-2011 Data Manipulation with SQL Mara Werner, HHS/OIG, Chicago, IL Abstract SQL was developed to pull together information from several different data tables - use this to your advantage as you

More information

words, and models Assessment limit: Use whole numbers (0-10,000) b. Express whole numbers in expanded form

words, and models Assessment limit: Use whole numbers (0-10,000) b. Express whole numbers in expanded form VSC - Mathematics Print pages on legal paper, landsca Grade PK Grade K Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade Number a. Build concept of number and a. Extend concept of number and

More information

KEYWORDS Metadata, macro language, CALL EXECUTE, %NRSTR, %TSLIT

KEYWORDS Metadata, macro language, CALL EXECUTE, %NRSTR, %TSLIT MWSUG 2017 - Paper BB15 Building Intelligent Macros: Driving a Variable Parameter System with Metadata Arthur L. Carpenter, California Occidental Consultants, Anchorage, Alaska ABSTRACT When faced with

More information

YEAR 7 KEY STAGE THREE CURRICULUM KNOWLEDGE AND SKILLS MAPPING TOOL

YEAR 7 KEY STAGE THREE CURRICULUM KNOWLEDGE AND SKILLS MAPPING TOOL KEY STAGE THREE CURRICULUM KNOWLEDGE AND SKILLS MAPPING TOOL KNOWLEDGE SUBJECT: Mathematics SKILLS YEAR 7 Number Place Value Number Addition and Subtraction Number Multiplication and Division Number -

More information

Using Basic Formulas 4

Using Basic Formulas 4 Using Basic Formulas 4 LESSON SKILL MATRIX Skills Exam Objective Objective Number Understanding and Displaying Formulas Display formulas. 1.4.8 Using Cell References in Formulas Insert references. 4.1.1

More information

1. To add (or subtract) fractions, the denominators must be equal! a. Build each fraction (if needed) so that both denominators are equal.

1. To add (or subtract) fractions, the denominators must be equal! a. Build each fraction (if needed) so that both denominators are equal. MAT000- Fractions Purpose One of the areas most frustrating for teachers and students alike is the study of fractions, specifically operations with fractions. Year after year, students learn and forget

More information

Mini-Lectures by Section

Mini-Lectures by Section Mini-Lectures by Section BEGINNING AND INTERMEDIATE ALGEBRA, Mini-Lecture 1.1 1. Learn the definition of factor.. Write fractions in lowest terms.. Multiply and divide fractions.. Add and subtract fractions..

More information

Cheat sheet: Data Processing Optimization - for Pharma Analysts & Statisticians

Cheat sheet: Data Processing Optimization - for Pharma Analysts & Statisticians Cheat sheet: Data Processing Optimization - for Pharma Analysts & Statisticians ABSTRACT Karthik Chidambaram, Senior Program Director, Data Strategy, Genentech, CA This paper will provide tips and techniques

More information

Statistics, Data Analysis & Econometrics

Statistics, Data Analysis & Econometrics ST009 PROC MI as the Basis for a Macro for the Study of Patterns of Missing Data Carl E. Pierchala, National Highway Traffic Safety Administration, Washington ABSTRACT The study of missing data patterns

More information

Common Core State Standards Mathematics (Subset K-5 Counting and Cardinality, Operations and Algebraic Thinking, Number and Operations in Base 10)

Common Core State Standards Mathematics (Subset K-5 Counting and Cardinality, Operations and Algebraic Thinking, Number and Operations in Base 10) Kindergarten 1 Common Core State Standards Mathematics (Subset K-5 Counting and Cardinality,, Number and Operations in Base 10) Kindergarten Counting and Cardinality Know number names and the count sequence.

More information

Omitting Records with Invalid Default Values

Omitting Records with Invalid Default Values Paper 7720-2016 Omitting Records with Invalid Default Values Lily Yu, Statistics Collaborative Inc. ABSTRACT Many databases include default values that are set inappropriately. These default values may

More information

Fast, Easy, and Publication-Quality Ecological Analyses with PC-ORD

Fast, Easy, and Publication-Quality Ecological Analyses with PC-ORD Emerging Technologies Fast, Easy, and Publication-Quality Ecological Analyses with PC-ORD JeriLynn E. Peck School of Forest Resources, Pennsylvania State University, University Park, Pennsylvania 16802

More information

A Prehistory of Arithmetic

A Prehistory of Arithmetic A Prehistory of Arithmetic History and Philosophy of Mathematics MathFest August 8, 2015 Patricia Baggett Andrzej Ehrenfeucht Dept. of Math Sci. Computer Science Dept. New Mexico State Univ. University

More information

6.001 Notes: Section 4.1

6.001 Notes: Section 4.1 6.001 Notes: Section 4.1 Slide 4.1.1 In this lecture, we are going to take a careful look at the kinds of procedures we can build. We will first go back to look very carefully at the substitution model,

More information

Frequently Asked Questions Updated 2006 (TRIM version 3.51) PREPARING DATA & RUNNING TRIM

Frequently Asked Questions Updated 2006 (TRIM version 3.51) PREPARING DATA & RUNNING TRIM Frequently Asked Questions Updated 2006 (TRIM version 3.51) PREPARING DATA & RUNNING TRIM * Which directories are used for input files and output files? See menu-item "Options" and page 22 in the manual.

More information

Hexadecimal Numbers. Journal: If you were to extend our numbering system to more digits, what digits would you use? Why those?

Hexadecimal Numbers. Journal: If you were to extend our numbering system to more digits, what digits would you use? Why those? 9/10/18 1 Binary and Journal: If you were to extend our numbering system to more digits, what digits would you use? Why those? Hexadecimal Numbers Check Homework 3 Binary Numbers A binary (base-two) number

More information

Customizing ODS Graphics to Publish Visualizations

Customizing ODS Graphics to Publish Visualizations ABSTRACT SCSUG November 2018 Customizing ODS Graphics to Publish Visualizations Paul A. Montagna, Harte Research Institute, Texas A&M University-Corpus Christi, Corpus Christi, TX The procedures provided

More information

3rd Grade Texas Math Crosswalk Document:

3rd Grade Texas Math Crosswalk Document: New TX Math 3.1(A) apply mathematics to problems arising in everyday life, society, and the workplace; 3.1(B) use a problem-solving model that incorporates analyzing given information, formulating a plan

More information

Introduction to Programming in C Department of Computer Science and Engineering. Lecture No. #44. Multidimensional Array and pointers

Introduction to Programming in C Department of Computer Science and Engineering. Lecture No. #44. Multidimensional Array and pointers Introduction to Programming in C Department of Computer Science and Engineering Lecture No. #44 Multidimensional Array and pointers In this video, we will look at the relation between Multi-dimensional

More information

Section 4 General Factorial Tutorials

Section 4 General Factorial Tutorials Section 4 General Factorial Tutorials General Factorial Part One: Categorical Introduction Design-Ease software version 6 offers a General Factorial option on the Factorial tab. If you completed the One

More information

Repetition Through Recursion

Repetition Through Recursion Fundamentals of Computer Science I (CS151.02 2007S) Repetition Through Recursion Summary: In many algorithms, you want to do things again and again and again. For example, you might want to do something

More information

Multiplying and Dividing Rational Expressions

Multiplying and Dividing Rational Expressions Multiplying and Dividing Rational Expressions Warm Up Simplify each expression. Assume all variables are nonzero. 1. x 5 x 2 3. x 6 x 2 x 7 Factor each expression. 2. y 3 y 3 y 6 x 4 4. y 2 1 y 5 y 3 5.

More information

Unit Map: Grade 3 Math

Unit Map: Grade 3 Math Place Value 3.3 Number and operations. The student represents and compares whole numbers and understand relationships related to place value. Place Value of Whole Numbers 3.3A compose and decompose numbers

More information

Easing into Data Exploration, Reporting, and Analytics Using SAS Enterprise Guide

Easing into Data Exploration, Reporting, and Analytics Using SAS Enterprise Guide Paper 809-2017 Easing into Data Exploration, Reporting, and Analytics Using SAS Enterprise Guide ABSTRACT Marje Fecht, Prowerk Consulting Whether you have been programming in SAS for years, are new to

More information

Interleaving a Dataset with Itself: How and Why

Interleaving a Dataset with Itself: How and Why cc002 Interleaving a Dataset with Itself: How and Why Howard Schreier, U.S. Dept. of Commerce, Washington DC ABSTRACT When two or more SAS datasets are combined by means of a SET statement and an accompanying

More information

CPSC 532L Project Development and Axiomatization of a Ranking System

CPSC 532L Project Development and Axiomatization of a Ranking System CPSC 532L Project Development and Axiomatization of a Ranking System Catherine Gamroth cgamroth@cs.ubc.ca Hammad Ali hammada@cs.ubc.ca April 22, 2009 Abstract Ranking systems are central to many internet

More information

Getting started with simulating data in R: some helpful functions and how to use them Ariel Muldoon August 28, 2018

Getting started with simulating data in R: some helpful functions and how to use them Ariel Muldoon August 28, 2018 Getting started with simulating data in R: some helpful functions and how to use them Ariel Muldoon August 28, 2018 Contents Overview 2 Generating random numbers 2 rnorm() to generate random numbers from

More information

Lastly, in case you don t already know this, and don t have Excel on your computers, you can get it for free through IT s website under software.

Lastly, in case you don t already know this, and don t have Excel on your computers, you can get it for free through IT s website under software. Welcome to Basic Excel, presented by STEM Gateway as part of the Essential Academic Skills Enhancement, or EASE, workshop series. Before we begin, I want to make sure we are clear that this is by no means

More information

This paper describes a report layout for reporting adverse events by study consumption pattern and explains its programming aspects.

This paper describes a report layout for reporting adverse events by study consumption pattern and explains its programming aspects. PharmaSUG China 2015 Adverse Event Data Programming for Infant Nutrition Trials Ganesh Lekurwale, Singapore Clinical Research Institute, Singapore Parag Wani, Singapore Clinical Research Institute, Singapore

More information

HyperNiche. Multiplicative Habitat Modeling

HyperNiche. Multiplicative Habitat Modeling HyperNiche TM Multiplicative Habitat Modeling Suggested citation: McCune, B. and M. J. Mefford. 2004. HyperNiche. Multiplicative Habitat Modeling. MjM Software, Gleneden Beach, Oregon, U.S.A. Cover: Ecologically

More information

these developments has been in the field of formal methods. Such methods, typically given by a

these developments has been in the field of formal methods. Such methods, typically given by a PCX: A Translation Tool from PROMELA/Spin to the C-Based Stochastic Petri et Language Abstract: Stochastic Petri ets (SPs) are a graphical tool for the formal description of systems with the features of

More information

%MAKE_IT_COUNT: An Example Macro for Dynamic Table Programming Britney Gilbert, Juniper Tree Consulting, Porter, Oklahoma

%MAKE_IT_COUNT: An Example Macro for Dynamic Table Programming Britney Gilbert, Juniper Tree Consulting, Porter, Oklahoma Britney Gilbert, Juniper Tree Consulting, Porter, Oklahoma ABSTRACT Today there is more pressure on programmers to deliver summary outputs faster without sacrificing quality. By using just a few programming

More information

Is the statement sufficient? If both x and y are odd, is xy odd? 1) xy 2 < 0. Odds & Evens. Positives & Negatives. Answer: Yes, xy is odd

Is the statement sufficient? If both x and y are odd, is xy odd? 1) xy 2 < 0. Odds & Evens. Positives & Negatives. Answer: Yes, xy is odd Is the statement sufficient? If both x and y are odd, is xy odd? Is x < 0? 1) xy 2 < 0 Positives & Negatives Answer: Yes, xy is odd Odd numbers can be represented as 2m + 1 or 2n + 1, where m and n are

More information

Outline and Reading. Analysis of Algorithms 1

Outline and Reading. Analysis of Algorithms 1 Outline and Reading Algorithms Running time ( 3.1) Pseudo-code ( 3.2) Counting primitive operations ( 3.4) Asymptotic notation ( 3.4.1) Asymptotic analysis ( 3.4.2) Case study ( 3.4.3) Analysis of Algorithms

More information

The Consequences of Poor Data Quality on Model Accuracy

The Consequences of Poor Data Quality on Model Accuracy The Consequences of Poor Data Quality on Model Accuracy Dr. Gerhard Svolba SAS Austria Cologne, June 14th, 2012 From this talk you can expect The analytical viewpoint on data quality Answers to the questions

More information

Shell CSCE 314 TAMU. Haskell Functions

Shell CSCE 314 TAMU. Haskell Functions 1 CSCE 314: Programming Languages Dr. Dylan Shell Haskell Functions 2 Outline Defining Functions List Comprehensions Recursion 3 Conditional Expressions As in most programming languages, functions can

More information

Using a Fillable PDF together with SAS for Questionnaire Data Donald Evans, US Department of the Treasury

Using a Fillable PDF together with SAS for Questionnaire Data Donald Evans, US Department of the Treasury Using a Fillable PDF together with SAS for Questionnaire Data Donald Evans, US Department of the Treasury Introduction The objective of this paper is to demonstrate how to use a fillable PDF to collect

More information

An Application of PROC NLP to Survey Sample Weighting

An Application of PROC NLP to Survey Sample Weighting An Application of PROC NLP to Survey Sample Weighting Talbot Michael Katz, Analytic Data Information Technologies, New York, NY ABSTRACT The classic weighting formula for survey respondents compensates

More information

Intermediate Algebra. Gregg Waterman Oregon Institute of Technology

Intermediate Algebra. Gregg Waterman Oregon Institute of Technology Intermediate Algebra Gregg Waterman Oregon Institute of Technology c 2017 Gregg Waterman This work is licensed under the Creative Commons Attribution 4.0 International license. The essence of the license

More information

Equations and Problem Solving with Fractions. Variable. Expression. Equation. A variable is a letter used to represent a number.

Equations and Problem Solving with Fractions. Variable. Expression. Equation. A variable is a letter used to represent a number. MAT 040: Basic Math Equations and Problem Solving with Fractions Variable A variable is a letter used to represent a number. Expression An algebraic expression is a combination of variables and/or numbers

More information

PharmaSUG Paper AD06

PharmaSUG Paper AD06 PharmaSUG 2012 - Paper AD06 A SAS Tool to Allocate and Randomize Samples to Illumina Microarray Chips Huanying Qin, Baylor Institute of Immunology Research, Dallas, TX Greg Stanek, STEEEP Analytics, Baylor

More information

Chapter 1: Number and Operations

Chapter 1: Number and Operations Chapter 1: Number and Operations 1.1 Order of operations When simplifying algebraic expressions we use the following order: 1. Perform operations within a parenthesis. 2. Evaluate exponents. 3. Multiply

More information

CHAPTER 6. The Normal Probability Distribution

CHAPTER 6. The Normal Probability Distribution The Normal Probability Distribution CHAPTER 6 The normal probability distribution is the most widely used distribution in statistics as many statistical procedures are built around it. The central limit

More information

Lukáš Plch at Mendel university in Brno

Lukáš Plch at Mendel university in Brno Lukáš Plch lukas.plch@mendelu.cz at Mendel university in Brno CAB Abstracts Greenfile Econlit with Full Text OECD ilibrary the most comprehensive database of its kind, instant access to over 7.3 million

More information

Optimizing System Performance

Optimizing System Performance 243 CHAPTER 19 Optimizing System Performance Definitions 243 Collecting and Interpreting Performance Statistics 244 Using the FULLSTIMER and STIMER System Options 244 Interpreting FULLSTIMER and STIMER

More information

CS1 Recitation. Week 1

CS1 Recitation. Week 1 CS1 Recitation Week 1 Admin READ YOUR CS ACCOUNT E-MAIL!!! Important announcements, like when the cluster will be unavailable, or when you need to reset your password. If you want to forward your e-mail:

More information

parameters, network shape interpretations,

parameters, network shape interpretations, GIScience 20100 Short Paper Proceedings, Zurich, Switzerland, September. Formalizing Guidelines for Building Meaningful Self- Organizing Maps Jochen Wendel 1, Barbara. P. Buttenfield 1 1 Department of

More information

Information Criteria Methods in SAS for Multiple Linear Regression Models

Information Criteria Methods in SAS for Multiple Linear Regression Models Paper SA5 Information Criteria Methods in SAS for Multiple Linear Regression Models Dennis J. Beal, Science Applications International Corporation, Oak Ridge, TN ABSTRACT SAS 9.1 calculates Akaike s Information

More information

Stage 1 Place Value Calculations Geometry Fractions Data. Name and describe (using appropriate vocabulary) common 2d and 3d shapes

Stage 1 Place Value Calculations Geometry Fractions Data. Name and describe (using appropriate vocabulary) common 2d and 3d shapes Stage 1 Place Value Calculations Geometry Fractions Data YEAR 7 Working towards Read and write whole numbers in words and figures Mental methods for addition and subtraction, Name and describe (using appropriate

More information

2.2 Order of Operations

2.2 Order of Operations 2.2 Order of Operations Learning Objectives Evaluate algebraic expressions with grouping symbols. Evaluate algebraic expressions with fraction bars. Evaluate algebraic expressions using a graphing calculator.

More information

SAS/STAT 13.1 User s Guide. The Power and Sample Size Application

SAS/STAT 13.1 User s Guide. The Power and Sample Size Application SAS/STAT 13.1 User s Guide The Power and Sample Size Application This document is an individual chapter from SAS/STAT 13.1 User s Guide. The correct bibliographic citation for the complete manual is as

More information

command.name(measurement, grouping, argument1=true, argument2=3, argument3= word, argument4=c( A, B, C ))

command.name(measurement, grouping, argument1=true, argument2=3, argument3= word, argument4=c( A, B, C )) Tutorial 3: Data Manipulation Anatomy of an R Command Every command has a unique name. These names are specific to the program and case-sensitive. In the example below, command.name is the name of the

More information

When using computers, it should have a minimum number of easily identifiable states.

When using computers, it should have a minimum number of easily identifiable states. EET 3 Chapter Number Systems (B) /5/4 PAGE Number Systems (B) Number System Characteristics (Tinder) What s important in choosing a number system? Basically, there are four important characteristics desirable

More information

Armstrong State University Engineering Studies MATLAB Marina 2D Arrays and Matrices Primer

Armstrong State University Engineering Studies MATLAB Marina 2D Arrays and Matrices Primer Armstrong State University Engineering Studies MATLAB Marina 2D Arrays and Matrices Primer Prerequisites The 2D Arrays and Matrices Primer assumes knowledge of the MATLAB IDE, MATLAB help, arithmetic operations,

More information

REDUCING INFORMATION OVERLOAD IN LARGE SEISMIC DATA SETS. Jeff Hampton, Chris Young, John Merchant, Dorthe Carr and Julio Aguilar-Chang 1

REDUCING INFORMATION OVERLOAD IN LARGE SEISMIC DATA SETS. Jeff Hampton, Chris Young, John Merchant, Dorthe Carr and Julio Aguilar-Chang 1 REDUCING INFORMATION OVERLOAD IN LARGE SEISMIC DATA SETS Jeff Hampton, Chris Young, John Merchant, Dorthe Carr and Julio Aguilar-Chang 1 Sandia National Laboratories and 1 Los Alamos National Laboratory

More information

Checking for Duplicates Wendi L. Wright

Checking for Duplicates Wendi L. Wright Checking for Duplicates Wendi L. Wright ABSTRACT This introductory level paper demonstrates a quick way to find duplicates in a dataset (with both simple and complex keys). It discusses what to do when

More information

While You Were Sleeping, SAS Was Hard At Work Andrea Wainwright-Zimmerman, Capital One Financial, Inc., Richmond, VA

While You Were Sleeping, SAS Was Hard At Work Andrea Wainwright-Zimmerman, Capital One Financial, Inc., Richmond, VA Paper BB-02 While You Were Sleeping, SAS Was Hard At Work Andrea Wainwright-Zimmerman, Capital One Financial, Inc., Richmond, VA ABSTRACT Automating and scheduling SAS code to run over night has many advantages,

More information

METAPOPULATION DYNAMICS

METAPOPULATION DYNAMICS 16 METAPOPULATION DYNAMICS Objectives Determine how extinction and colonization parameters influence metapopulation dynamics. Determine how the number of patches in a system affects the probability of

More information

Paul's Online Math Notes. Online Notes / Algebra (Notes) / Systems of Equations / Augmented Matricies

Paul's Online Math Notes. Online Notes / Algebra (Notes) / Systems of Equations / Augmented Matricies 1 of 8 5/17/2011 5:58 PM Paul's Online Math Notes Home Class Notes Extras/Reviews Cheat Sheets & Tables Downloads Algebra Home Preliminaries Chapters Solving Equations and Inequalities Graphing and Functions

More information

Introduction to Matlab. By: Hossein Hamooni Fall 2014

Introduction to Matlab. By: Hossein Hamooni Fall 2014 Introduction to Matlab By: Hossein Hamooni Fall 2014 Why Matlab? Data analytics task Large data processing Multi-platform, Multi Format data importing Graphing Modeling Lots of built-in functions for rapid

More information

University of Alberta

University of Alberta A Brief Introduction to MATLAB University of Alberta M.G. Lipsett 2008 MATLAB is an interactive program for numerical computation and data visualization, used extensively by engineers for analysis of systems.

More information

CPSC 67 Lab #5: Clustering Due Thursday, March 19 (8:00 a.m.)

CPSC 67 Lab #5: Clustering Due Thursday, March 19 (8:00 a.m.) CPSC 67 Lab #5: Clustering Due Thursday, March 19 (8:00 a.m.) The goal of this lab is to use hierarchical clustering to group artists together. Once the artists have been clustered, you will calculate

More information

Real-time Monitoring of Multi-mode Industrial Processes using Feature-extraction Tools

Real-time Monitoring of Multi-mode Industrial Processes using Feature-extraction Tools Real-time Monitoring of Multi-mode Industrial Processes using Feature-extraction Tools Y. S. Manjili *, M. Niknamfar, M. Jamshidi Department of Electrical and Computer Engineering The University of Texas

More information

Middle School Math Course 3

Middle School Math Course 3 Middle School Math Course 3 Correlation of the ALEKS course Middle School Math Course 3 to the Texas Essential Knowledge and Skills (TEKS) for Mathematics Grade 8 (2012) (1) Mathematical process standards.

More information

Gateway Regional School District VERTICAL ARTICULATION OF MATHEMATICS STANDARDS Grades K-4

Gateway Regional School District VERTICAL ARTICULATION OF MATHEMATICS STANDARDS Grades K-4 NUMBER SENSE & OPERATIONS K.N.1 Count by ones to at least 20. When you count, the last number word you say tells the number of items in the set. Counting a set of objects in a different order does not

More information

Model Design and Evaluation

Model Design and Evaluation Model Design and Evaluation The General Modeling Process YES Define Goals Compartments Systematize Add Spatial Dimension Complete? NO Formulate & Flowchart Return Return NO Complete? Deliver Implement

More information

Curriculum Guide for Doctor of Philosophy degree program with a specialization in ENVIRONMENTAL PUBLIC HEALTH

Curriculum Guide for Doctor of Philosophy degree program with a specialization in ENVIRONMENTAL PUBLIC HEALTH 2018 2019 Curriculum Guide for Doctor of Philosophy degree program with a specialization in ENVIRONMENTAL PUBLIC HEALTH The Doctor of Philosophy (PhD) degree in Environmental Sciences requires a significant

More information

SOLVING THE TRANSPORTATION PROBLEM WITH PIECEWISE-LINEAR CONCAVE COST FUNCTIONS

SOLVING THE TRANSPORTATION PROBLEM WITH PIECEWISE-LINEAR CONCAVE COST FUNCTIONS Review of the Air Force Academy No.2 (34)/2017 SOLVING THE TRANSPORTATION PROBLEM WITH PIECEWISE-LINEAR CONCAVE COST FUNCTIONS Tatiana PAȘA, Valeriu UNGUREANU Universitatea de Stat, Chișinău, Moldova (pasa.tatiana@yahoo.com,

More information

OLAP Drill-through Table Considerations

OLAP Drill-through Table Considerations Paper 023-2014 OLAP Drill-through Table Considerations M. Michelle Buchecker, SAS Institute, Inc. ABSTRACT When creating an OLAP cube, you have the option of specifying a drill-through table, also known

More information