Notes on the SAS Data Step and an Introduction to Simulation


W. John Braun
University of Western Ontario
Department of Statistical and Actuarial Sciences

Chapter 1 Introduction

1.1 Objectives and a Brief Overview

You are about to be introduced to one of the most commonly used statistical packages: SAS (Statistical Analysis System). SAS has been (and continues to be) developed at the SAS Institute in Research Triangle Park at Cary, North Carolina. We will be using the SAS Enterprise version in this course.

The purpose of statistical computing, and the reason for learning SAS, is data analysis. Given a set of data, one wishes to analyze it appropriately in order to make a decision or to acquire some new insights into the population from which the data were extracted.

This set of notes will take us, seemingly, in the reverse direction. The ultimate goal of these notes is to teach you how to use SAS to simulate different kinds of data. There are at least two reasons for learning how to do this. First, it gives you a way of making up data for your own future exercises, so that you can test out different SAS analysis procedures and find out what kinds of data are appropriate for a given procedure. Second, knowing how to simulate a set of data is a step towards understanding what kind of structure underlies the data, or the mathematical model which is being studied as an approximation to the real population.

Thus, we will first use SAS to create artificial data of different types. Later, we will learn how to use SAS procedures to analyze real data; the artificial data can then be used for practice.

We will begin by learning how to run SAS jobs in the Windows NT environment. In practice, SAS is often run on Unix platforms, in which case the procedures for running SAS jobs differ from what will be described here, but the content of the SAS programs is almost identical. Documentation for SAS programs will be considered briefly. Then the Data Step will be considered in some detail. Matters of input/output and flow control will be discussed. The main application will be to the generation of random numbers and the creation of artificial data, as alluded to above.

1.2 Introduction to SAS

The SAS system is a software system for data analysis.

Some Definitions

1. Data - collections of letters (characters) and/or numbers, each representing measurement information.
2. Analysis -
   - checking the data for errors (data cleaning)
   - graphical displays
   - statistical tests
   - interpreting results

To invoke SAS in the lab (Room 256 WSC), log into the network and click on the SAS Enterprise icon. Choose the File menu, and then Open Program, to get started. You now have a Program Editor window and an Output window. The Program Editor is ready for you to type in a SAS program or to open an existing program.

In a Unix environment, SAS programs (named, for example, filename.sas) are written using an editor such as vi or pico. These programs are then submitted to the CPU for compilation and execution. The results can then be read from a list file (called filename.lst). Error diagnostics are available in a file called filename.log. In the Windows environment, things are a little different.

1.3 Main Components of a SAS program

1. DATA step - for reading and manipulating data. Sometimes programming is done in this step.
2. PROC step - for analyzing data. A SAS procedure is used to conduct the analysis on data that is contained in a SAS dataset prepared during the DATA step. Thus, the PROC step must always follow a DATA step.
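The two components can be seen together in a very small program. The following is only a sketch: the data set name SIZES matches a name used later in these notes, but the three data lines are made up for illustration, and PROC PRINT is used simply because it lists the data set in the Output window.

```sas
/* DATA step: build a small SAS data set from in-line data. */
DATA SIZES;
  INPUT NAME $ HEIGHT WEIGHT;   /* $ marks NAME as a character variable */
  DATALINES;
ANN 160 55
BOB 175 70
CAT 168 62
;
RUN;

/* PROC step: work with the data set prepared by the DATA step. */
PROC PRINT DATA=SIZES NOOBS;
RUN;
```

The DATA step runs first and produces the data set; the PROC step then refers to it by name with the DATA= option.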

Chapter 2 The Data Step

2.1 Some Definitions

1. Data Value - a single measurement, e.g. the height of a person (Joe).
2. Observation - a set of data values for the same individual, e.g. name, height, weight, age and sex of Joe.
3. Variable - a set of data values for the same measurement, e.g. the heights of 10 different people.
4. Data set - a collection of observations. We usually think of the observations as being the rows of the data set, while the variables make up the columns of the data set.

Example

Consider the following data set, which consists of 4 observations on 5 different variables (NAME, HEIGHT, WEIGHT, AGE, SEX).

    NAME   HEIGHT   WEIGHT   AGE   SEX
    JOE    ...      ...      ...   M
    MARY   ...      ...      ...   F
    SUE    ...      ...      ...   F
    TOM    ...      ...      ...   M

Here, we have 3 numeric variables (HEIGHT, WEIGHT, AGE) and 2 character variables (NAME, SEX).

Exercise

Consider the following data set:

    TEMPERATURE   PRESSURE   MINIMUM WIND SPEED   MAXIMUM WIND SPEED
    ...

1. How many variables are there?
2. How many observations on each variable?

The Data Step is the point in the SAS program at which one or more SAS data sets are created. These data sets may be read in from external files or created from within the SAS program itself. It should be noted that a single SAS program can consist of more than one Data Step, though we shall find a single Data Step sufficient for present purposes.

The Data Step consists of a sequence of statements, each ending with a semicolon. These statements are primarily concerned with the construction of data sets and the management of data.

2.2 Data

The first line of the Data Step consists of the Data statement. This statement indicates that a data step is starting, and it tells SAS the name of the SAS data set which is being created.

Syntax:

    DATA setname;

The data set name is a word which is somehow descriptive of the data set with which it is associated. It must consist of at most 32 letters and/or numbers. The first character must be a letter.

Examples

The following statement tells SAS that a SAS data set called WEATHER is going to be created.

    DATA WEATHER;

The following statement tells SAS that a SAS data set called GRADES98 is going to be created.

    DATA GRADES98;

Some programming applications do not involve a data set. The following statement tells SAS to begin a data step without creating a data set.

    DATA _NULL_;

This type of data statement frees up memory that might otherwise be used unnecessarily. We will use it when doing simulations.

2.3 Numeric Assignment

The Assignment statement is used for creating new variables and modifying existing variables.

Syntax:

    varname = value;

Naming Variables in SAS: A variable name must begin with a letter and may be 1 to 32 characters long in current versions of SAS (older versions allowed at most 8), e.g. NAME, HEIGHT, WEIGHT, AGE, SEX. If we have two samples of heights, we could label the 2 height variables HEIGHT1 and HEIGHT2. 1HEIGHT and 2HEIGHT are not valid variable names.

Example

    TEMP = -21.7;

The above statement assigns the value -21.7 to the variable TEMP.

Example

We can create a SAS data set called WEATHER consisting of one observation on each of 4 variables using the following sequence of assignment statements.

    DATA WEATHER;
    DATE = 22;
    PRESSURE = ...;
    WIND = 19;
    TEMP = -21.7;

The resulting SAS data set is as follows:

    WEATHER
    DATE   PRESSURE   WIND   TEMP
    22     ...        19     -21.7
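A quick way to experiment with assignment statements is to combine them with a data step that creates no data set (DATA _NULL_) and to print the values to the SAS log. The sketch below assumes nothing beyond the statements described above; the variable GUST and the numbers added to WIND are made up for illustration.

```sas
DATA _NULL_;              /* no data set is created */
  WIND = 19;              /* create a new variable */
  GUST = WIND + 10;       /* GUST is a made-up variable built from WIND */
  WIND = WIND + 1;        /* modify an existing variable */
  PUT WIND= GUST=;        /* named output is written to the SAS log */
RUN;
```

The log should show WIND=20 GUST=29: the second assignment to WIND replaces its earlier value, while GUST keeps the value computed before WIND was changed.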

An equivalent result is obtained using the Input and Datalines statements:

    DATA WEATHER;
    INPUT DATE PRESSURE WIND TEMP;
    DATALINES;
    22 ... 19 -21.7
    ;

Example

Create a SAS data set called GRADES98 containing the following data set:

    ID   EXAM   FINAL
    ...

    DATA GRADES98;
    INPUT ID EXAM FINAL;
    DATALINES;
    ...
    ;

2.4 INFILE and INPUT: Importing Data from an External File

Often, data has been entered into a text file, for example, from a spreadsheet or data editor, or perhaps from another SAS program. The INFILE statement is used in the Data Step to tell SAS which file the data is located in. Then, the INPUT statement reads the data into a SAS dataset.

Syntax:

    INFILE 'filename';
    INPUT var1 var2 ... varn;

Example

Suppose the data set of the exercise in Section 2.1 was entered into a file called weather.dat. We can produce a SAS data set called WEATHER by executing the following program.

    /* Example of reading data */
    DATA WEATHER;
    INFILE 'WEATHER.DAT';
    INPUT TEMP PRESSURE MINWIND MAXWIND;
    PROC PRINT NOOBS;   /* This statement is NOT necessary, but it allows one to see
                           the contents of the SAS data set in the Output window. */
    RUN;                /* This statement IS necessary. The program will not run otherwise. */

2.5 Comments and Documentation

It is often important to add documentation to any computer programs which you create. Comment statements should be used to describe program contents. Proper documentation allows you or other users to read and understand your program more easily. This is particularly useful if the program is to be updated later.

In SAS, there are two forms of comment statements:

1. /* comment */

   e.g. /* The variable RADIUS measures the cross-sectional radius of each tree at a distance of 1 meter from the ground. */

2. * comment;

   e.g. * The variable RADIUS measures the cross-sectional radius of each tree at a distance of 1 meter from the ground.;

A useful form of documentation includes a statement at the beginning of the program consisting of the title of the program, the name of the programmer, and the date. Sometimes variables are defined here. A brief description of the purpose of the program is useful as well. In the body of the program, it is often useful to explain any special commands used there.

Example

The following lines would make up a SAS file:

    /* Descriptive Analysis of a Sample of Four Individuals
       By P. Brooks
       January 15, 2007

       This program computes the mean and standard deviation for the height,
       weight and age of a random sample of people.

       Variables: HEIGHT = height in centimeters.
                  WEIGHT = weight in kilograms.
                  AGE    = age in years. */
    DATA SIZES;
    INFILE 'sizes.dat';
    INPUT HEIGHT AGE WEIGHT;
    PROC MEANS MEAN STD;
    * The extra arguments produce only the sample mean and
      sample standard deviation for each variable;
    RUN;

2.6 File and Put

The FILE statement is used to specify an external output file.

Syntax:

    FILE 'filename';

The PUT statement causes SAS to print to the external file named in an earlier FILE statement.

Syntax:

    PUT varname1 varname2 ...;

Example

The following lines cause SAS to print the values 22, ..., 19, -21.7 to a file called weather.dat.

    DATA WEATHER;
    FILE 'weather.dat';
    INPUT DATE PRESSURE WIND TEMP;
    PUT DATE PRESSURE WIND TEMP;
    DATALINES;
    22 ... 19 -21.7
    ;
    RUN;

Each occurrence of a Put statement causes the current value of the relevant variables to be output to the file named in the File statement.

Example

    DATA _NULL_;
    FILE 'GRADES.08';
    IF _N_=1 THEN PUT '2008 GRADES';   /* _N_ counts the observations as they are
                                          input to the dataset */
    LENGTH NAME $ 8;                   /* This Length statement ensures that the variable
                                          NAME can contain values up to 8 characters
                                          in length. */
    INPUT NAME $ GRADE;                /* The $ tells SAS that NAME is a character variable. */
    PUT NAME GRADE;
    DATALINES;
    JOE 57.5
    MARY 83
    JENNIFER 64.5
    ;
    RUN;

This produces a file called GRADES.08 containing the lines

    2008 GRADES
    JOE 57.5
    MARY 83
    JENNIFER 64.5

Exercises

1. Write out the contents of the file epa.dat produced by the following:
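FILE/PUT and INFILE/INPUT are mirror images of each other, and a round trip shows how they fit together. The sketch below writes two of the grade records from the example above to a file and reads them back into a data set. To avoid depending on a particular directory it uses a temporary file declared with a FILENAME statement, which these notes do not cover; the fileref SCRATCH and the data set name GRADES are made-up names.

```sas
FILENAME SCRATCH TEMP;     /* temporary file; SCRATCH is a made-up fileref */

/* Step 1: write the records to the file with FILE and PUT. */
DATA _NULL_;
  FILE SCRATCH;
  INPUT NAME $ GRADE;
  PUT NAME GRADE;
  DATALINES;
JOE 57.5
MARY 83
;
RUN;

/* Step 2: read the same file back with INFILE and INPUT. */
DATA GRADES;
  INFILE SCRATCH;
  INPUT NAME $ GRADE;
RUN;

PROC PRINT DATA=GRADES NOOBS;
RUN;
```

With a real file, the fileref would be replaced by a quoted file name in both the FILE and the INFILE statement.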

    DATA _NULL_;
    FILE 'epa.dat';
    PUT 'SOME MILEAGE MEASUREMENTS';
    LENGTH CAR $ 13;
    CAR = 'BUICK CENTURY';
    DISTANCE = 540;
    FUEL = 40;
    PUT CAR DISTANCE FUEL;
    CAR = 'HONDA CRX';
    DISTANCE = 720;
    FUEL = 30;
    PUT CAR DISTANCE FUEL;
    RUN;

2. Check your answer by executing the above lines on a computer.
3. Was a SAS data set created? Check this by adding the line

       PROC PRINT NOOBS;

   (then look in the Output window and the Log file for more information).
4. Reorganize the program so that it uses the Datalines statement.

2.7 Arithmetic

SAS can be used as a calculator to perform simple arithmetic.

1. Addition: varname = varname1 + varname2;
2. Subtraction: varname = varname1 - varname2;
3. Multiplication: varname = varname1 * varname2;
4. Division: varname = varname1 / varname2;
5. Power (varname1 raised to the power varname2): varname = varname1 ** varname2;
6. Modular arithmetic: varname = MOD(varname1, varname2); this computes the remainder resulting from division of varname1 by varname2 and assigns this value to varname.

Example

    /* some examples of arithmetic calculations */
    DATA _NULL_;
    FILE 'arith.out';
    X = 15;
    Y = 6;
    SUM = X + Y;
    DIFF = X - Y;          /* DIFF = DIFFERENCE */
    PRODUCT = X * Y;
    QUOTIENT = X/Y;
    POWER = X ** Y;
    REMAIND = MOD(X,Y);    /* REMAIND = REMAINDER */
    PUT X Y SUM DIFF PRODUCT;
    PUT QUOTIENT POWER REMAIND;
    RUN;

Execution of the above SAS program produces a file called arith.out which contains the following lines:

    15 6 21 9 90
    2.5 11390625 3

Exercises

1. What are the contents of the file convert.tmp produced by the following program?

       DATA _NULL_;
       FILE 'convert.tmp';
       TEMPC = 20;
       TEMPF = TEMPC* ...;
       PUT TEMPC 'degrees Celsius =' TEMPF 'degrees Fahrenheit.';
       RUN;

2. Suppose X = 45, Y = 32, and Z = 7. Find the value of the variable ANSWER in each of the following:
   (a) ANSWER = X - Y;
   (b) ANSWER = Z ** Z;
   (c) ANSWER = MOD(X,Y);
   (d) ANSWER = MOD(Y,Z);
   (e) ANSWER = MOD(X,Y) + MOD(X,Z);
3. Using the fact that 1 mile = 1.6 kilometers, write a complete SAS program which converts a distance of 26 miles into kilometer units, and which prints the following into a file called convert.dst:

       A distance of 26 miles is the same as a distance of 41.6 kilometers.

The Floor Function

Syntax:

    varname = FLOOR(varname1);

This statement assigns the greatest integer less than or equal to varname1 to the variable varname. For example, for any value between 27 and 28 the result is 27, and for any value between -17 and -16 the result is -17.

Example

    X = 47.39;
    Y = FLOOR(X);

The value of Y is 47.

Exercises

1. Write out the contents of the file arith.dat produced by

       DATA _NULL_;
       FILE 'arith.dat';
       X = ...;
       Y = FLOOR(X);
       PUT X Y;
       RUN;

2. Modify the above program to compute the greatest integer less than or equal to
   (a) ...
   (b) ...
   (c) W, where W = 32X, and X = ...
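FLOOR returns the greatest integer less than or equal to its argument, which matters most for negative values and for values that are already integers. The sketch below prints a few cases to the SAS log. The test values are made up, and the INT function (which truncates toward zero) is not otherwise used in these notes; it appears only for contrast.

```sas
DATA _NULL_;
  A = 27.8;  B = -16.2;  C = 5;     /* made-up test values */
  FA = FLOOR(A);                    /* 27                              */
  FB = FLOOR(B);                    /* -17: rounds down, away from 0   */
  FC = FLOOR(C);                    /* 5: an integer is left unchanged */
  IB = INT(B);                      /* -16: INT truncates toward 0     */
  PUT FA= FB= FC= IB=;
RUN;
```

The log should show FA=27 FB=-17 FC=5 IB=-16.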

Chapter 3 If: Controlling Flow of Operations

The IF statement is very important in database management. It is used to control the flow of operations which are applied to variables, depending on the values of relevant variables. In other words, if a certain variable takes on a certain value, a certain operation might be performed; otherwise, the operation is not performed or a different operation is performed in its place.

Syntax:

    IF (condition) THEN (SAS statement);
    ELSE (SAS statement);

SAS evaluates the condition to determine whether it is true or false. If the condition is true, SAS proceeds to carry out the SAS statement. The ELSE statement is optional. It provides an alternative action if the condition is false.

Possible conditions to test are

    varname GE constant,   varname LE constant
    varname < constant,    varname > constant
    varname = constant,    varname NE constant

Testing the first condition above amounts to testing whether the variable with name varname is greater than or equal to the specified constant (another variable name could be used here as well). The second condition listed concerns less than or equal, and the last condition involves testing for inequality.

Example: Coding

The variable SEX can take values M and F. It is sometimes more convenient to code this variable numerically using 1 for males and 0 for females. The IF statement can be used to do this as follows:

    IF SEX = 'M' THEN SEXCODE = 1;
    ELSE SEXCODE = 0;

In other words, if the variable SEX takes the value M, then the new variable SEXCODE takes the value 1. Otherwise, SEXCODE takes the value 0.

Example: Outlier Detection

Suppose X is a variable whose mean is MU and standard deviation is SIGMA. We may decide that the value of X is to be considered outlying if it is more than 3 standard deviations from MU. The following SAS lines determine if the value of X is outlying. The variable OUTLIER is assigned the value 1 if X is an outlier, and it is assigned the value 0 if X is not an outlier.

    OUTLIER = 0;
    Z = (X - MU)/SIGMA;
    IF Z > 3 THEN OUTLIER = 1;
    ELSE IF Z < -3 THEN OUTLIER = 1;

Exercises

1. Execute the following program and view the contents of the file demog.dat.

       DATA DEMOGRAP;
       FILE 'demog.dat';
       INPUT SEX $;
       IF SEX = 'M' THEN SEXCODE = 1;
       ELSE SEXCODE = 0;
       PUT SEXCODE;
       DATALINES;
       M
       F
       M
       M
       F
       ;
       RUN;

2. The following data has been recorded over a period of 5 hours at a switch: 0,1,1,1,0. The switch is off when the value of the above variable (called testcode) is 0, and on when the value is 1. Write a SAS program which assigns the value on to the variable test when the testcode value is 1 and off when testcode is 0.
3. A random variable X has mean 14 and variance 49. Write a SAS program which determines which of the following values of X are outliers: 15, 23, -8, 31, 17. The results should be output to a file called outliers.ex.
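The outlier rule can also be written with a single condition on the absolute value of Z, which is equivalent to the two-branch IF/ELSE IF form. The sketch below applies it to several observations read with DATALINES. The mean of 50, the standard deviation of 10 and the four data values are assumptions made up for this illustration, as is the data set name CHECK.

```sas
DATA CHECK;
  MU = 50;  SIGMA = 10;            /* assumed mean and standard deviation */
  INPUT X;
  Z = (X - MU)/SIGMA;
  IF ABS(Z) > 3 THEN OUTLIER = 1;  /* more than 3 SDs from MU, either side */
  ELSE OUTLIER = 0;
  DATALINES;
52
85
12
47
;
RUN;

PROC PRINT DATA=CHECK NOOBS;
RUN;
```

Here the values 85 (Z = 3.5) and 12 (Z = -3.8) are flagged with OUTLIER = 1, while 52 and 47 are not.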

Chapter 4 DOing things repeatedly

The DO statement is often useful for simulation. It is also sometimes useful in other kinds of data preparation and analysis.

4.1 Simple DO

The simple DO statement (which is usually used in association with an IF statement) tells SAS to execute a set of SAS statements. This set of statements is usually referred to as a DO group.

Syntax:

    DO;
      SAS statements
    END;

Example

    DATA _NULL_;
    FILE 'do.eg';
    INPUT X Y;
    IF X > Y THEN DO;
      Z1 = X+Y;
      Z2 = X-Y;
    END;
    ELSE DO;
      Z1 = X-Y;
      Z2 = X+Y;
    END;
    PUT X Y Z1 Z2;
    DATALINES;
    ...
    ;
    RUN;

Executing the above program results in a file called do.eg which contains the following:

    ...
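A DO group is needed whenever more than one statement should depend on the same IF condition; the group is opened by DO and closed by a matching END statement. As a small sketch, the program below swaps X and Y whenever X is the larger of the two, which takes three statements. The variable HOLD and the two data lines are made up for illustration, and the output goes to the SAS log because no FILE statement is used.

```sas
DATA _NULL_;
  INPUT X Y;
  IF X > Y THEN DO;      /* all three statements run only when X > Y */
    HOLD = X;
    X = Y;
    Y = HOLD;
  END;
  PUT X Y;               /* X is now never larger than Y */
  DATALINES;
5 3
2 7
;
RUN;
```

The log should contain the lines 3 5 and 2 7. Without the DO and END, only the first statement after THEN would be conditional, and the other two would run for every observation.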

4.2 Iterative DO

The iterative DO statement tells SAS to perform a computation several times.

Syntax:

    DO varname = constant1 TO constant2 BY constant3;
      SAS statements
    END;

Example

Suppose we wish to add up all the numbers from 1 to 100. The following SAS program does this for us:

    DATA _NULL_;
    NUMSUM = 0;                   /* NUMSUM is the variable which will ultimately
                                     contain the sum we are interested in. */
    DO INDEX = 1 TO 100;
      NUMSUM = NUMSUM + INDEX;    /* At each iteration of the DO group, the current
                                     value of INDEX is added to the current value
                                     of NUMSUM. */
    END;
    FILE 'sum.100';
    PUT NUMSUM;
    RUN;

The file sum.100 will then contain the value 5050, which is the sum of the first 100 integers.

Example

Suppose we wish to add up all the even numbers between 1 and 101. The following SAS program does this for us:

    DATA _NULL_;
    NUMSUM = 0;
    DO INDEX = 2 TO 100 BY 2;
      NUMSUM = NUMSUM + INDEX;
    END;
    FILE 'even.sum';
    PUT NUMSUM;
    RUN;

The file even.sum will then contain the value 2550, which is the sum of the first 50 even numbers.

Exercises

1. Write a SAS program which calculates the sum of all multiples of 3 between 1 and 121. Ans. 2460
2. Modify the above program so that it calculates the sum of all integers from 51 through 100. Ans. 3775
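When writing loops of this kind, it helps to test the program on a case where the answer is known from a formula. The sketch below adds the odd numbers from 1 to 99 using BY 2 and compares the result with the fact that the sum of the first n odd numbers is n squared (here n = 50). The variable names ODDSUM, K and CHECK are made up for this illustration, and the output goes to the SAS log.

```sas
DATA _NULL_;
  ODDSUM = 0;
  DO K = 1 TO 99 BY 2;       /* K takes the values 1, 3, 5, ..., 99 */
    ODDSUM = ODDSUM + K;
  END;
  CHECK = 50**2;             /* sum of the first 50 odd numbers */
  PUT ODDSUM= CHECK=;
RUN;
```

Both values should be 2500. Each iterative DO group is closed by an END statement, and the statements after END are executed once, after the loop has finished.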

3. Modify the above program so that it calculates the sum of all squares from 1 to ...
4. Modify the above program so that it calculates the sum of square roots of even numbers between 1 and ...
5. Modify the above program so that it calculates 20! (the product of all integers between 1 and 20).

4.3 DO While (optional)

In order to use the iterative DO, one needs to know the number of times the computation is to be performed. Often, this number is not known beforehand. Instead, one might require that the computation be performed while a particular condition is satisfied.

Syntax:

    DO WHILE (condition);
      SAS statements
    END;

The SAS statements in the DO group are executed as long as the condition is found to be true. The condition is tested once before the beginning of each loop. The first time that the condition is found to be false, the DO group statements are no longer executed and SAS moves on beyond the END statement.

Example

Suppose we want to determine the largest value of n so that

    $\sum_{i=1}^{n} i^2 < 10000$

One approach to this problem is to successively add terms to the sum while the sum is less than 10000, and to stop accumulating as soon as the sum reaches or exceeds this amount. The following statements accomplish this:

    DATA _NULL_;
    NUMSUM = 0;
    INDEX = 0;
    DO WHILE (NUMSUM < 10000);
      INDEX = INDEX+1;
      NUMSUM = NUMSUM + INDEX**2;
    END;
    INDEX = INDEX-1;
    FILE 'sum.out';
    PUT INDEX;
    RUN;

The loop stops at the first value of INDEX for which the sum is no longer below 10000, so one is subtracted after the loop. The final value of INDEX is the solution n.

Exercises

1. Write a SAS program which finds the largest n satisfying $\sum_{i=1}^{n} i^3 < \ldots$

2. Write a SAS program which finds the largest n satisfying $n! < \ldots$
3. Write a SAS program which finds the smallest n satisfying $n! > \ldots$
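Problems of the "smallest n" kind differ slightly from the "largest n" kind: the loop runs while the target has not yet been passed, and no correction is needed afterwards. As a sketch on a made-up problem (not one of the exercises), the program below finds the smallest n for which 2 to the power n exceeds 1000. The variable names N and POWER are chosen for this illustration, and the result is written to the SAS log.

```sas
DATA _NULL_;
  N = 0;
  POWER = 1;                   /* POWER always holds 2**N */
  DO WHILE (POWER <= 1000);    /* keep going until POWER exceeds 1000 */
    N = N + 1;
    POWER = 2 * POWER;
  END;
  PUT N= POWER=;
RUN;
```

The log should show N=10 POWER=1024. Because the condition is tested before each pass, the loop stops at the first N that passes the target, which is exactly the smallest such N.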

Chapter 5 Simulation

5.1 Generation of Pseudorandom Numbers

We begin our discussion of simulation with a brief exploration of the mechanics of pseudorandom number generation. Pseudorandom numbers are useful in simulation studies. We will briefly describe a common method for simulating independent uniform random variables on the interval [0,1].

A multiplicative congruential random number generator produces a sequence of pseudorandom numbers, $u_0, u_1, u_2, \ldots$, which are approximately independent uniform random variables on the interval [0,1]. We now describe how to construct such a generator.

Let m be a large integer, and let b be another integer which is smaller than m. b is often somewhere around the square root of m. To begin, an integer $x_0$ is chosen between 1 and m. $x_0$ is called the seed. It is best chosen in some non-systematic manner. Once the seed has been chosen, the generator proceeds as follows:

    $x_1 = b x_0 \pmod{m}$
    $u_1 = x_1 / m$

$u_1$ is the first pseudorandom number. Dividing by m ensures that it takes some value between 0 and 1. If m and b are chosen properly, it is difficult to predict the value of $u_1$, given the value of $x_0$ only. The second pseudorandom number is then obtained in the same manner:

    $x_2 = b x_1 \pmod{m}$
    $u_2 = x_2 / m$

$u_2$ is another pseudorandom number, which is approximately independent of $u_1$. The method continues using the following formulas:

    $x_n = b x_{n-1} \pmod{m}$
    $u_n = x_n / m$

This method produces numbers which are in reality non-random, but if done properly, the numbers appear to be random (i.e. unpredictable). Different values of b and m give rise to pseudorandom number generators of varying quality. If they are not chosen with some care, then the generator will produce numbers that do not appear to be random.
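To make the recursion concrete, here is a toy example worked by hand. The values m = 7, b = 3 and seed 1 are chosen purely for illustration; they are far too small for real use and do not come from these notes.

```latex
\begin{align*}
x_1 &= 3 \cdot 1 \bmod 7 = 3, & u_1 &= 3/7 \approx 0.429,\\
x_2 &= 3 \cdot 3 \bmod 7 = 2, & u_2 &= 2/7 \approx 0.286,\\
x_3 &= 3 \cdot 2 \bmod 7 = 6, & u_3 &= 6/7 \approx 0.857,\\
x_4 &= 3 \cdot 6 \bmod 7 = 4, & u_4 &= 4/7 \approx 0.571,\\
x_5 &= 3 \cdot 4 \bmod 7 = 5, & u_5 &= 5/7 \approx 0.714,\\
x_6 &= 3 \cdot 5 \bmod 7 = 1, & u_6 &= 1/7 \approx 0.143.
\end{align*}
```

After six steps the sequence has returned to the seed, so it repeats from there; six is the longest cycle possible with m = 7. By contrast, taking b = 6 with the same m and seed gives 6, 1, 6, 1, ..., a cycle of length two, which illustrates how a poor choice of b can produce numbers that are obviously not random.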
A number of statistical tests have been developed for assessing the quality of a pseudorandom number generator.

Example

The following lines of SAS create a file called RANDOM.DAT which contains 5 pseudorandom numbers based on the multiplicative congruential generator

    $x_n = 171 x_{n-1} \pmod{30269}$
    $u_n = x_n / 30269$

with initial seed $x_0 = 23121$:

    /* Rudimentary Pseudorandom Number Generator */
    DATA _NULL_;
    FILE 'RANDOM.DAT';
    B = 171;
    M = 30269;
    SEED = 23121;
    X = SEED;
    DO I = 1 TO 5;
      X = MOD(B*X, M);
      U = X/M;
      PUT X U;
    END;
    RUN;

The results are stored in the file RANDOM.DAT in two columns. The first column consists of the integers $x_1, x_2, \ldots, x_5$. The second column consists of numbers ranging between 0 and 1. These are the uniform pseudorandom numbers, $u_1, u_2, \ldots, u_5$.

A related operation is used internally by SAS to produce pseudorandom numbers automatically with the function UNIFORM.

Example

The following lines of SAS create a file called RANDOM.DAT which contains 50 uniform pseudorandom numbers based on the SAS generator UNIFORM with initial seed $x_0 = 27218$:

    /* Example demonstrating use of SAS RNG with fixed seed. */
    DATA _NULL_;
    SEED = 27218;
    FILE 'RANDOM.DAT';
    DO I = 1 TO 50;
      U = UNIFORM(SEED);
      PUT U;
    END;
    RUN;

It is often of interest to look at the distribution of a set of pseudorandom numbers. For the numbers generated in the previous example, we would proceed as follows:

    DATA RANDOM;
    INFILE 'RANDOM.DAT';
    INPUT U;
    PROC CHART;
    VBAR U;
    RUN;

The bars of the histogram should all be roughly the same height, if the numbers are really uniformly distributed.
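Besides the histogram, one simple (and only partial) check on a choice of b and m is how long the sequence runs before it starts to repeat. Returning to the rudimentary generator of the first example (b = 171, m = 30269, seed 23121), the sketch below counts the steps until the sequence returns to its seed. The variable name PERIOD is made up for this illustration, the loop is capped at M steps as a safeguard, and the result is written to the SAS log rather than to a file.

```sas
DATA _NULL_;
  B = 171;  M = 30269;  SEED = 23121;
  X = MOD(B*SEED, M);                   /* first step of the generator */
  PERIOD = 1;
  DO WHILE (X NE SEED AND PERIOD < M);  /* stop when the seed reappears */
    X = MOD(B*X, M);
    PERIOD = PERIOD + 1;
  END;
  PUT PERIOD=;
RUN;
```

Since x is never 0 here, the sequence can visit at most m - 1 = 30268 different values, so that is the largest period that could be reported. A long period is necessary for a good generator but does not by itself guarantee that the numbers look random.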

Exercises

1. Generate 200 random numbers using the generator from the first example with an initial seed of ...
2. Write a program (or modify the second program in the second example) which produces a histogram of the numbers produced in the previous exercise.
3. Generate 200 random numbers using the SAS UNIFORM generator from example 2 with an initial seed of ... Produce a histogram of this simulated data.
4. Modify the generator of the first example so that it produces 200 random numbers from the generator $x_n = 172 x_{n-1} \pmod{30307}$ with initial seed $x_0 = \ldots$
5. Generate 1000 pseudorandom numbers using the SAS function UNIFORM, and store them in a file called UNIF.DAT.
6. Modify the above program to simulate the random variable Y = 1/(U + 1), where U is a uniform random variable on the interval [0,1]. Specifically, generate 1000 values of this random variable and put them in a file called RANDOM.DAT. Also, plot the histogram of the random numbers $y_1, \ldots, y_{1000}$. Since Y is not a uniform random variable, the histogram will no longer be flat; what is the shape of the distribution?
7. Write a program which generates 100 independent observations on a uniformly distributed random variable on the interval [0, 100]. Estimate the mean, variance and standard deviation of such a uniform random variable.
8. Use the FLOOR function together with UNIFORM to simulate 100 random integers between 0 and ...

5.2 Simulation of Bernoulli Trials

A Bernoulli trial is an experiment in which there are 2 possible outcomes. For example, a light bulb may work or it may not work; these are the only possibilities. For another example, consider a student who guesses on a multiple choice test question which has 5 options; the student may guess correctly with probability 0.2 and incorrectly with probability 0.8.

Suppose we would like to know how well such a student would do on a multiple choice test consisting of 100 questions.
We can get an idea by using simulation: each question corresponds to an independent Bernoulli trial with probability of success equal to 0.2. We can simulate the correctness of the student for each question by generating an independent uniform random number. If this number is less than .2, we say that the student guessed correctly; otherwise, we say that the student guessed incorrectly. This will work because the probability that a uniform random variable is less than .2 is exactly .2, while the probability that a uniform random variable exceeds .2 is exactly .8, which is the same as the probability that the student guesses incorrectly. Thus, the uniform random number generator is simulating the student. The SAS version of this is as follows:

    DATA _NULL_;
    SEED = 12883;
    FILE 'STUDENT.ANS';
    PUT 'CORRECT U';
    DO QUESTION = 1 TO 100;
      U = UNIFORM(SEED);
      IF U < .2 THEN CORRECT = 1;
      ELSE CORRECT = 0;
      PUT CORRECT U;
    END;
    RUN;

The first column of the file STUDENT.ANS contains the results of the student's guesses. A 1 is recorded each time the student correctly guesses the answer, while a 0 is recorded each time the student is wrong. The second column records the value of the variable U; note that whenever its value is less than .2, the value of CORRECT is 1, and when U takes a value exceeding .2, the value of CORRECT is 0.

Exercises

1. Write a SAS program which simulates a student guessing at a True-False test consisting of 40 questions.
2. Write a SAS program which simulates 500 light bulbs, each of which has probability .99 of working.
3. Write a SAS program which simulates a binomial random variable Y with parameters n = 25 and p = .4. (Y is the sum of 25 independent Bernoulli random variables with p = .4.) Now, modify the program so that it generates 100 of these binomial random variables and writes them to a file called binom.dat. In order to do this, you will need to nest one DO group inside another. Write another program which reads the data from binom.dat into a SAS data set and produces a histogram. Estimate the mean and variance using PROC MEANS. Compare these estimates with their theoretical counterparts. Recall that the theoretical mean of a binomial random variable is $np$ and the theoretical variance is $np(1-p)$.

5.3 Binomial Random Numbers

The RANBIN function can be used to automatically generate binomial random numbers.

Syntax:

    Y = RANBIN(seed,n,p);

The seed is any positive integer, while n and p are the binomial parameters. The function assigns a random binomial realization to the variable Y.

Example

Suppose 10% of the vacuum tubes produced by a machine are defective, and suppose 15 tubes are produced each hour. Each tube is independent of all other tubes. Simulate the number of defective tubes produced by the machine for each hour over a 24-hour period. Record the simulated numbers of defectives in a file called TUBE.DAT.
Since 15 tubes are produced each hour and each tube has a 0.1 probability of being defective, independent of the state of the other tubes, the number of defectives produced in one hour is a binomial random variable with n = 15 and p = 0.1. To simulate the number of defectives for each hour in a 24-hour period, we need to generate 24 binomial random numbers:

    /* Simulation of defective vacuum tubes */
    DATA _NULL_;
    SEED = 21223;
    N = 15;
    P = .1;
    FILE 'TUBE.DAT';
    PUT 'HOUR NUMBER OF DEFECTIVES';
    DO HOUR = 1 TO 24;
      DEFECTVS = RANBIN(SEED,N,P);
      PUT HOUR DEFECTVS;
    END;
    RUN;
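Before looking at the simulated values in TUBE.DAT, it is useful to know roughly what to expect. The following short calculation applies the binomial mean and variance formulas, $np$ and $np(1-p)$, to the tube example with n = 15 and p = 0.1; the probability of an hour with no defectives is included as an extra check and is not part of the original example.

```latex
\begin{align*}
E(Y) &= np = 15 \times 0.1 = 1.5,\\
\mathrm{Var}(Y) &= np(1-p) = 15 \times 0.1 \times 0.9 = 1.35,\\
P(Y = 0) &= (1-p)^{n} = 0.9^{15} \approx 0.206.
\end{align*}
```

So the simulated hourly counts should average about 1.5, giving roughly 24 x 1.5 = 36 defectives over the whole day, and about one hour in five should show no defectives at all. With only 24 simulated hours, moderate departures from these figures are to be expected.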

Exercise

Generate 1000 binomial values using RANBIN. Then use PROC MEANS to estimate the average and variance of a binomial random variable with n = 18 and p = .75. Compare with the theoretical mean and variance.

5.4 Poisson Random Numbers

We can generate Poisson random numbers using SAS with the RANPOI function. It is similar to the RANBIN function, but there is only one parameter instead of two.

Syntax:

    Y = RANPOI(seed, lambda);

In this case, lambda is the mean of the Poisson random variable.

Example

Suppose traffic accidents occur at an intersection with a mean of 3.7 per year. Simulate the annual number of accidents for a 10-year period, assuming that the numbers occurring from year to year are independent.

    /* Example of Poisson variate generation
       -- Simulation of Traffic Accidents */
    DATA _NULL_;
    SEED = ...;
    LAMBDA = 3.7;
    FILE 'ACCIDENT.DAT';
    PUT 'YEAR NUMBER OF ACCIDENTS';
    DO YEAR = 1 TO 10;
      ACCIDENT = RANPOI(SEED, LAMBDA);
      PUT YEAR ACCIDENT;
    END;
    RUN;

Exercises

1. Modify the above program to simulate the number of accidents per year for 15 years, when the average rate is 2.8 accidents per year.
2. Simulate the number of surface defects in the finish of a sports car for 20 cars, where the mean is 1.2 defects per car.
3. Estimate the mean and variance of a Poisson random variable whose mean rate is 7.2 by simulating 1000 such variates and using PROC MEANS. Compare with the theoretical values, recalling that the variance and mean are equal for Poisson random variables.

5.5 Exponential Random Numbers

The exponential distribution can be used as a simple model for the time until a component fails, or until a light bulb burns out. A random variable T has an exponential distribution with mean λ if

    $P(T \le t) = 1 - e^{-t/\lambda}$

for any non-negative t. The mean or expected value of T is $\lambda$ and the variance of T is $\lambda^2$.

The simplest way to simulate exponential random variables is to generate a uniform random variable U on [0,1], and set

    $1 - e^{-T/\lambda} = U$

Solving this for T, we have

    $T = -\lambda \log(1 - U)$

It can be shown that T defined in this way has an exponential distribution with mean λ.

The SAS function RANEXP can be used to generate random exponential variates with mean 1.

Syntax:

    T = RANEXP(seed);

This produces an exponential variate T having mean 1. To change the mean to lambda, we must use

    T = lambda * RANEXP(seed);

Example

    /* SIMULATION OF N EXPONENTIAL LAMBDA RANDOM VARIATES */
    DATA _NULL_;
    SEED = 12238;
    LAMBDA = 2.5;
    N = 10;
    FILE 'EXPO.RVS';
    DO I = 1 TO N;
      T = RANEXP(SEED)*LAMBDA;
      PUT T;
    END;
    RUN;

Exercises

1. Suppose that a certain type of battery has a lifetime which is exponentially distributed with mean 55 hours. Simulate 1000 such lifetimes to estimate the mean and variance of the lifetime for this type of battery. Compare with the theoretical mean and variance.
2. The central limit theorem says that the sample mean for a random sample of size n from a population with mean $\mu$ and variance $\sigma^2$ is approximately normally distributed with mean $\mu$ and variance $\sigma^2/n$, where the approximation improves as n increases. The following programs provide a demonstration for the case where the underlying population is exponentially distributed:

       /* PROGRAM 1: Computation of averages of samples of size N
          coming from exponential lambda populations */
       DATA _NULL_;
       SEED = 12238;
       LAMBDA = 2.5;
       NSAMPLES = 1000;
       N = 10;
       FILE 'EXPO.AVG';
       DO NSAMPLE = 1 TO NSAMPLES;   /* We are going to simulate NSAMPLES independent
                                        samples of size N, computing the average in
                                        each case. */
         TSUM = 0;
         DO I = 1 TO N;
           T = RANEXP(SEED)*LAMBDA;
           TSUM = TSUM + T;          /* Accumulating the sample values to form a sum */
         END;
         TAVG = TSUM/N;              /* TAVG = average of the current sample. */
         PUT TAVG;                   /* Storing sample averages for use in next program
                                        where they will be plotted as a histogram. */
       END;
       RUN;

       /* PROGRAM 2: Histogram of averages to demonstrate CLT */
       DATA EXPO_AVG;
       INFILE 'EXPO.AVG';
       INPUT TAVG;
       PROC CHART;
       VBAR TAVG;
       PROC MEANS MEAN VAR;          /* We've included this procedure to compare the mean
                                        and variance of the averages with what is expected
                                        by the theory */
       VAR TAVG;
       RUN;

   Run the above programs for N = 3, 6, 10, 20, 30, 40. Note how the histogram begins to resemble the familiar bell-shaped curve as N increases. How large would you say N should be in order for the normal approximation to be considered accurate, when the underlying population is exponential?

5.6 Normal Random Numbers

Standard normal random variables can be generated using the RANNOR function in SAS.

Syntax:

    Z = RANNOR(seed);

This produces a value of a normal random variable Z which has mean 0 and variance 1. Recall that if X is normal with mean $\mu$ and variance $\sigma^2$, then $X = \mu + \sigma Z$, where Z has mean 0 and variance 1. Therefore, to simulate a random variable X having mean mu and standard deviation sigma, use

    X = mu + sigma*rannor(seed);

Example

Use simulation to estimate P(Z < 1.25), where Z is a standard normal random variable.

Idea: Simulate a large number (say, 1000) of standard normal random variates and compute the proportion that lie below 1.25.

    DATA _NULL_;
    FILE 'NORMAL.PRB';
    SEED = 19218;
    N = 1000;
    VALUE = 1.25;
    COUNT = 0;
    DO I = 1 TO N;
        Z = RANNOR(SEED);
        IF Z < VALUE THEN COUNT = COUNT + 1;
    END;
    PROBEST = COUNT/N;
    PUT 'AN EMPIRICAL ESTIMATE OF P(Z < ' VALUE ') IS ' PROBEST;
    RUN;

Exercises

1. Simulate 100 normal random variates having mean 51 and standard deviation 5.2. Compute the average and standard deviation of your simulated sample and compare with the theoretical values.

2. Simulate 1000 standard normal random variates Z, and use your simulated sample to estimate
   (a) P(Z > 2.5).
   (b) P(0 < Z < 1.645).
   (c) P(1.2 < Z < 1.45).
   (d) P(-1.2 < Z < 1.3).
   Compare with the theoretical values (i.e. consult a normal table).

3. Using the fact that a $\chi^2$ random variable on 1 degree of freedom has the same distribution as the square of a standard normal random variable, simulate 100 independent values of such a $\chi^2$ random variable, and estimate its mean and variance. (Compare with the theoretical values: 1, 2.)

4. A $\chi^2$ random variable on n degrees of freedom has the same distribution as the sum of the squares of n independent standard normal random variables. Simulate a $\chi^2$ random variable on 8 degrees of freedom, and estimate its mean and variance. (Compare with the theoretical values: 8, 16.)
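A simulated probability such as the estimate of P(Z < 1.25) in the example of Section 5.6 can be checked without a normal table by calling PROBNORM, one of the probability functions listed in the reference chapter of these notes. The following is only a sketch of that comparison: the seed and sample size are copied from the example, while the variable names EXACT and DIFF are illustrative. Because no FILE statement is given, the PUT output goes to the SAS log.

```sas
/* Sketch: compare a simulated estimate of P(Z < 1.25) with PROBNORM.  */
/* EXACT and DIFF are illustrative names, not part of the notes.       */
DATA _NULL_;
   SEED = 19218;
   N = 1000;
   VALUE = 1.25;
   COUNT = 0;
   DO I = 1 TO N;
      Z = RANNOR(SEED);                      /* standard normal variate */
      IF Z < VALUE THEN COUNT = COUNT + 1;   /* count values below 1.25 */
   END;
   PROBEST = COUNT/N;                        /* simulated proportion    */
   EXACT = PROBNORM(VALUE);                  /* theoretical probability */
   DIFF = PROBEST - EXACT;                   /* simulation error        */
   PUT PROBEST= EXACT= DIFF=;                /* written to the SAS log  */
RUN;
```

Rerunning with a larger N should, on average, shrink DIFF, which is one informal way to judge how many variates a simulation needs.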

Chapter 6

REFERENCE: Other Data Step Functions

A SAS dataset with the variables X1, X2, X3, X4 is used in some of the examples below.

6.1 Arithmetic Functions

ABS(X) - returns the absolute value of X: $|X|$. EXAMPLE: Y=ABS(X1); (Y = ).

MAX(X1,X2,...,XN) - returns the largest value among the values of the arguments. EXAMPLE: Y=MAX(X1,X2,X3,X4); (Y = ).

MIN(X1,X2,...,XN) - returns the smallest value among the values of the arguments. EXAMPLE: Y=MIN(X1,X2,X3,X4); (Y = ).

MOD(N1,N2) - returns the remainder when N1 is divided by N2. EXAMPLE: Y=MOD(X1,X2); (Y = ).

SIGN(X) - returns the sign of X (-1 or 1), or 0 if X is 0. EXAMPLE: Y=SIGN(X1); (Y = ).

SQRT(X) - returns the square root of X: $\sqrt{X}$. When X is negative, it returns a missing value (.). EXAMPLE: Y=SQRT(X1); (Y = ).

6.2 Truncation Functions

CEIL(X) - returns the smallest integer greater than or equal to X.

FLOOR(X) - returns the largest integer less than or equal to X.

INT(X) - returns the same value as FLOOR(X) if X is positive, and the same value as CEIL(X) if X is negative.

ROUND(X,Z) - returns the value of X rounded to the nearest unit of Z.
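The difference between the truncation functions is easiest to see on one positive and one negative non-integer value. The short sketch below does that; the values 2.3, -2.7 and 2.37 and all variable names are illustrative choices, not taken from the notes, and the results are written to the SAS log.

```sas
/* Sketch: truncation functions on a positive and a negative value. */
/* The test values and variable names are illustrative.             */
DATA _NULL_;
   A = 2.3;
   B = -2.7;
   CEIL_A  = CEIL(A);             /* 3                                       */
   FLOOR_A = FLOOR(A);            /* 2                                       */
   INT_A   = INT(A);              /* 2, agrees with FLOOR for positive A     */
   CEIL_B  = CEIL(B);             /* -2                                      */
   FLOOR_B = FLOOR(B);            /* -3                                      */
   INT_B   = INT(B);              /* -2, agrees with CEIL for negative B     */
   ROUND_C = ROUND(2.37, 0.1);    /* 2.4, rounded to the nearest tenth       */
   PUT CEIL_A= FLOOR_A= INT_A=;
   PUT CEIL_B= FLOOR_B= INT_B=;
   PUT ROUND_C=;
RUN;
```

In other words, INT always truncates toward zero, while FLOOR and CEIL always move down and up respectively.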

6.3 Special Mathematical Functions

EXP(X): $e^X$.

GAMMA(X): the complete gamma function, $\int_0^\infty t^{X-1} e^{-t}\,dt$.

LOG(X): the natural logarithm of X.

LOG2(X): the logarithm to the base 2 of X.

LOG10(X): the logarithm to the base 10 of X.

6.4 Trigonometric and Hyperbolic Functions

ARCOS(X): inverse cosine of X.
ARSIN(X): inverse sine of X.
ATAN(X): inverse tangent of X.
COS(X): cosine of X.
COSH(X): hyperbolic cosine of X.
SIN(X): sine of X.
SINH(X): hyperbolic sine of X.
TAN(X): tangent of X.
TANH(X): hyperbolic tangent of X.

6.5 Statistical functions

CSS(X1,X2,...,XN): the corrected sum of squares, $\sum_{i=1}^{N} X_i^2 - N\bar{X}^2$.

CV(X1,X2,...,XN): the coefficient of variation - the standard deviation of $X_1, \ldots, X_N$ divided by the mean of $X_1, \ldots, X_N$.

MEAN(X1,...,XN): $\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i$. EXAMPLE: Y = MEAN(X1,X2,X3,X4); (Y = ).

N(X1,...,XN): number of nonmissing arguments. EXAMPLE: Y=N(.,4.1,.,3.7,5.7); (Y = 3).

NMISS(X1,...,XN): number of missing values. EXAMPLE: Y=NMISS(.,4.1,.,3.7,5.7); (Y = 2).

RANGE(X1,...,XN): maximum minus the minimum. EXAMPLE: Y=RANGE(X1,X2,X3,X4); (Y = ).

STD(X1,...,XN): standard deviation.

STDERR(X1,...,XN): standard error (standard deviation divided by $\sqrt{N}$).

SUM(X1,...,XN): $\sum_{i=1}^{N} X_i$.

USS(X1,...,XN): uncorrected sum of squares, $\sum_{i=1}^{N} X_i^2$.

VAR(X1,...,XN): variance.
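The relationships among the statistical functions of Section 6.5 can be checked numerically. The sketch below reuses the argument list from the N and NMISS examples; the variable names and the three check quantities are illustrative additions. It relies on two facts about these SAS functions that the notes do not state explicitly: missing arguments are ignored, so N here counts the 3 nonmissing values, and VAR uses the divisor N - 1.

```sas
/* Sketch: statistical functions applied to the list (., 4.1, ., 3.7, 5.7). */
/* Variable names and the *_CHECK quantities are illustrative.              */
DATA _NULL_;
   N_OBS  = N(., 4.1, ., 3.7, 5.7);        /* 3 nonmissing arguments   */
   N_MISS = NMISS(., 4.1, ., 3.7, 5.7);    /* 2 missing arguments      */
   XBAR   = MEAN(., 4.1, ., 3.7, 5.7);     /* mean of nonmissing values */
   USS_X  = USS(., 4.1, ., 3.7, 5.7);      /* uncorrected sum of squares */
   CSS_X  = CSS(., 4.1, ., 3.7, 5.7);      /* corrected sum of squares   */
   VAR_X  = VAR(., 4.1, ., 3.7, 5.7);
   STD_X  = STD(., 4.1, ., 3.7, 5.7);
   SE_X   = STDERR(., 4.1, ., 3.7, 5.7);

   /* Each check should agree with the matching function up to rounding. */
   CSS_CHECK = USS_X - N_OBS*XBAR**2;      /* compare with CSS_X */
   VAR_CHECK = CSS_X/(N_OBS - 1);          /* compare with VAR_X */
   SE_CHECK  = STD_X/SQRT(N_OBS);          /* compare with SE_X  */

   PUT N_OBS= N_MISS= XBAR=;
   PUT CSS_X= CSS_CHECK=;
   PUT VAR_X= VAR_CHECK=;
   PUT SE_X= SE_CHECK=;
RUN;
```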

6.6 Probability functions

The following functions can be used to determine various probabilities. The syntax is similar to that used for the random number generator functions.

GAMINV(P,eta): returns the value of x such that

$P = \frac{1}{\Gamma(\eta)} \int_0^x t^{\eta-1} e^{-t}\,dt$

($0 \le P < 1$, and $\eta > 0$).

POISSON(lambda,N): returns the probability that an observation from a Poisson distribution is less than or equal to N. $\lambda$ is the mean parameter. i.e.

$\mathrm{POISSON}(\lambda, N) = \sum_{j=0}^{N} \frac{e^{-\lambda} \lambda^j}{j!}$

PROBBNML(p,n,m): returns the probability that an observation from a binomial distribution with parameters p and n is less than or equal to m. i.e.

$\mathrm{PROBBNML}(p, n, m) = \sum_{j=0}^{m} \binom{n}{j} p^j (1-p)^{n-j}$.

PROBCHI(x,nu): returns the probability that a random variable with a chi-square distribution on $\nu$ degrees of freedom falls below x.

PROBF(x,ndf,ddf): returns the probability that a random variable with an F distribution on ndf numerator degrees of freedom and ddf denominator degrees of freedom falls below x.

PROBGAM(x,eta): returns the probability that a random variable with a gamma distribution with shape parameter $\eta$ falls below x. i.e.

$\mathrm{PROBGAM}(x, \eta) = \frac{1}{\Gamma(\eta)} \int_0^x t^{\eta-1} e^{-t}\,dt$.

PROBIT(x): returns the inverse of the standard normal cumulative distribution function. i.e. if X is a standard normal random variable, then x is the probability that X will take on a value less than PROBIT(x).

PROBNORM(x): returns the probability that a standard normal random variable will fall below x.

PROBT(x,nu): returns the probability that a random variable with Student's t distribution on $\nu$ degrees of freedom will fall below x.

TINV(p,nu): returns the quantile of order p (the 100p-th percentile) of the Student's t distribution on $\nu$ degrees of freedom.

Example

Find the probability that a random variable with a t distribution on 8 degrees of freedom is less than 1.4. i.e. P(T < 1.4) = ? where T is t-distributed on 8 d.f. The following program writes the correct probability into the file PROB.T.

    DATA _NULL_;
    FILE 'PROB.T';
    PROB = PROBT(1.4, 8);
    PUT PROB;
    RUN;
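Several of the probability functions come in cumulative/inverse pairs, and the cumulative functions for discrete distributions can be differenced to obtain point probabilities. The sketch below illustrates both ideas; the particular arguments (1.25, the Poisson mean 3.7, the count 2) and the variable names are illustrative choices, and output goes to the SAS log.

```sas
/* Sketch: inverse pairs and point probabilities from cumulative functions. */
/* Arguments and variable names are illustrative.                           */
DATA _NULL_;
   /* PROBIT undoes PROBNORM: Z_BACK should be close to 1.25. */
   P_NORM = PROBNORM(1.25);
   Z_BACK = PROBIT(P_NORM);

   /* TINV undoes PROBT: T_BACK should be close to 1.4. */
   P_T    = PROBT(1.4, 8);
   T_BACK = TINV(P_T, 8);

   /* P(Y = 2) for a Poisson mean of 3.7, as a difference of cumulative */
   /* probabilities, compared with the formula exp(-lambda)*lambda^2/2!. */
   LAMBDA  = 3.7;
   P_EQ_2  = POISSON(LAMBDA, 2) - POISSON(LAMBDA, 1);
   P_CHECK = EXP(-LAMBDA)*LAMBDA**2/2;

   PUT P_NORM= Z_BACK=;
   PUT P_T= T_BACK=;
   PUT P_EQ_2= P_CHECK=;
RUN;
```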

Exercises

1. Compute the probability that a Poisson random variable with mean rate 11.4 takes on values less than
   (a) 1.
   (b) 2.
   (c) 5.
   (d) 11.
   (e) 15.
   (f)

2. Repeat the previous question for a binomial random variable with p = .45 and n =

3. The time that it takes a bus to arrive at the next stop is normally distributed with mean 10.4 minutes and standard deviation 1.2. Compute the probabilities that the bus will arrive in less than
   (a) 5 minutes.
   (b) 8 minutes.
   (c) 10.5 minutes.
   (d) 12.5 minutes.
   (e) 13.1 minutes.
   (f) 15.2 minutes.
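Two details are easy to miss when applying the probability functions to problems like these. PROBNORM refers to the standard normal distribution, so a normal variable with mean mu and standard deviation sigma must be standardized first; and POISSON and PROBBNML return "less than or equal to" probabilities, so a strict "less than k" probability for an integer-valued variable uses k - 1. The sketch below shows both adjustments with hypothetical parameter values (they are deliberately not the ones in the exercises), and all variable names are illustrative.

```sas
/* Sketch: standardizing for PROBNORM and handling strict inequalities. */
/* All parameter values below are hypothetical.                         */
DATA _NULL_;
   /* Normal with mean 20 and standard deviation 3: P(X < 24.5). */
   MU = 20;
   SIGMA = 3;
   X = 24.5;
   P_NORMAL = PROBNORM((X - MU)/SIGMA);   /* standardize, then look up */

   /* Poisson with mean 4.0: P(Y < 3) equals P(Y <= 2). */
   LAMBDA = 4.0;
   K = 3;
   P_LESS   = POISSON(LAMBDA, K - 1);     /* strictly less than K      */
   P_ATMOST = POISSON(LAMBDA, K);         /* less than or equal to K   */

   PUT P_NORMAL= P_LESS= P_ATMOST=;       /* written to the SAS log    */
RUN;
```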


More information

Cecil Jones Academy Mathematics Fundamentals

Cecil Jones Academy Mathematics Fundamentals Year 10 Fundamentals Core Knowledge Unit 1 Unit 2 Estimate with powers and roots Calculate with powers and roots Explore the impact of rounding Investigate similar triangles Explore trigonometry in right-angled

More information

STA Module 4 The Normal Distribution

STA Module 4 The Normal Distribution STA 2023 Module 4 The Normal Distribution Learning Objectives Upon completing this module, you should be able to 1. Explain what it means for a variable to be normally distributed or approximately normally

More information

STA /25/12. Module 4 The Normal Distribution. Learning Objectives. Let s Look at Some Examples of Normal Curves

STA /25/12. Module 4 The Normal Distribution. Learning Objectives. Let s Look at Some Examples of Normal Curves STA 2023 Module 4 The Normal Distribution Learning Objectives Upon completing this module, you should be able to 1. Explain what it means for a variable to be normally distributed or approximately normally

More information

LAB #2: SAMPLING, SAMPLING DISTRIBUTIONS, AND THE CLT

LAB #2: SAMPLING, SAMPLING DISTRIBUTIONS, AND THE CLT NAVAL POSTGRADUATE SCHOOL LAB #2: SAMPLING, SAMPLING DISTRIBUTIONS, AND THE CLT Statistics (OA3102) Lab #2: Sampling, Sampling Distributions, and the Central Limit Theorem Goal: Use R to demonstrate sampling

More information
