Assignments. Math 338 Lab 1: Introduction to R. Atoms, Vectors and Matrices


Assignments

Generally speaking, there are three basic forms of storing data in R. The first is the single atom, that is, a single number. Assigning a number to an object in this case is trivial: use <- or = to assign the value to a name. In what follows, > denotes the R prompt. The second form is the vector, in which we assign a name to an array of numbers. This is done with the command c, which stands for concatenate. We can then call any member of the vector, replace a member with a new value, or perform arithmetic operations on the whole vector, as shown below. The third form is the matrix, created with the command matrix. We first supply the data and then tell R the dimensions of the matrix. For example, we can put an array of 9 numbers into a matrix with 3 rows and 3 columns. We demonstrate all of these below.

Atoms, Vectors and Matrices

(a) Atoms

> sam=2
> sam
[1] 2
> sam+sam
[1] 4
> (2*sam*2)/2
[1] 4
> sam^(1/3)
[1] 1.259921
> sqrt(sam)
[1] 1.414214
> abs(-sam)
[1] 2
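As a brief sketch of the assignment step described above, the block below stores a number under a name with both <- and = and checks that the two forms give the same result. The names x and y are illustrative and do not appear in the lab.

```r
# Two equivalent ways to assign a single number (an atom) to a name.
x <- 2      # arrow form
y = 2       # equals form

# Both names now hold the same value.
identical(x, y)

# Arithmetic works on the stored value just as on the number itself.
x^(1/3)     # cube root of 2
sqrt(x)     # square root of 2
```

At the top level of a script the two forms behave the same way; the arrow form is the one most R style guides prefer.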

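(b) Vectors

> class.age=c(35,35,36,37,37,38,38,39,40.5,43,44,44.5,50,19)
> class.age
 [1] 35.0 35.0 36.0 37.0 37.0 38.0 38.0 39.0 40.5 43.0 44.0 44.5 50.0 19.0
> class.age[3]
[1] 36
> class.age[1:5]
[1] 35 35 36 37 37
> class.age[-5]
 [1] 35.0 35.0 36.0 37.0 38.0 38.0 39.0 40.5 43.0 44.0 44.5 50.0 19.0
> class.age[-c(2,7)]
 [1] 35.0 36.0 37.0 37.0 38.0 39.0 40.5 43.0 44.0 44.5 50.0 19.0
> class.age*2
 [1]  70  70  72  74  74  76  76  78  81  86  88  89 100  38
> sqrt(class.age)
> class.age^(-1)
> class.age*class.age
> class.age^2
> mean(class.age)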

[1] 38.28571
> median(class.age)
[1] 38
> class.age=(class.age)/2
> class.age
> class.age=class.age*2

Often it is useful to create an empty vector. Here is how this is done:

> hi=numeric(10)
> hi
 [1] 0 0 0 0 0 0 0 0 0 0

Vectors do not have to be numerical. We can create a vector of characters:

> hi=c("hello","whasup","longday")
> hi
[1] "hello"   "whasup"  "longday"

Later, it becomes useful to ask R for the length of a vector:

> length(class.age)
[1] 14

(c) Matrices

> sam = matrix(nrow=3,ncol=4)
> sam
     [,1] [,2] [,3] [,4]
[1,]   NA   NA   NA   NA
[2,]   NA   NA   NA   NA
[3,]   NA   NA   NA   NA
> sam = matrix(c(1,2,3,4,5,6,7,8,9,10,11,12),nrow=3,byrow=TRUE)

> sam
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
> sam<-matrix(c(1,2,3,4,5,6,7,8,9,10,11,12),nrow=3,byrow=FALSE)
> sam
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> sally =c(1,2,3,4,5,6,7,8,9,10,11,12)
> sam=matrix(sally,nrow=3,byrow=TRUE)
> sam
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
> v1=c(1,2,3,4)
> v2=c(5,6,7,8)
> v3=c(9,10,11,12)
> sam=matrix(c(v1,v2,v3),nrow=3,byrow=TRUE)
> sam
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
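The effect of the byrow argument in the matrix examples above can be checked directly. The following is a small sketch; the names vals, by_row and by_col are illustrative and are not part of the lab.

```r
# The same 12 numbers placed into a 3-row matrix in two different orders.
vals <- 1:12
by_row <- matrix(vals, nrow = 3, byrow = TRUE)    # fill one row at a time
by_col <- matrix(vals, nrow = 3, byrow = FALSE)   # fill one column at a time (the default)

dim(by_row)     # number of rows, then number of columns
by_row[1, ]     # first row when filling by row
by_col[1, ]     # first row when filling by column

# Filling by row gives the transpose of a 4-row matrix filled by column.
all(by_row == t(matrix(vals, nrow = 4)))
```

Only nrow is given here; R works out the number of columns from the length of the data.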

> sam[1,]
[1] 1 2 3 4
> sam[,2]
[1]  2  6 10
> sam[1,3]
[1] 3
> sam[3,]<-v2
> sam
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    5    6    7    8
> sam[1,]<-log(v1)
> sam
     [,1]      [,2]     [,3]     [,4]
[1,]    0 0.6931472 1.098612 1.386294
[2,]    5 6.0000000 7.000000 8.000000
[3,]    5 6.0000000 7.000000 8.000000

(d) Lists

R provides a powerful additional storage function called list. The importance of list is that we can store objects of different natures, such as matrices, vectors, or atoms, in a single object and then call the different parts of that object separately. Let's assume that we would like to store the following three objects in a list object called s:

> s1=3
> s2=seq(1,10,2)
> s3=matrix(c(1:9),nrow=3)
> s1
[1] 3
> s2

[1] 1 3 5 7 9
> s3
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
> s<-list(s1,s2,s3)
> s
[[1]]
[1] 3

[[2]]
[1] 1 3 5 7 9

[[3]]
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

> s[[1]]
[1] 3
> s[[2]]
[1] 1 3 5 7 9
> s[[3]]
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

The for loop

Often it becomes necessary to repeat certain calculations a number of times. This is done in R with a simple command called for. Here are some examples:

> for(i in 1:3)
+ {
+ print("sam")
+ }
[1] "sam"
[1] "sam"
[1] "sam"
> s=matrix(c(1,2,3,4,5,6,7,8,9),nrow=3,byrow=TRUE)
> for(i in 1:3)
+ print(s)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9

or:

> for(i in 1:3)
+ { print(s)}
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
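The loops above only print. A loop can also store one result per pass in a vector created beforehand with numeric. The sketch below does this for the row totals of s; the name row.total is illustrative and not from the lab.

```r
# Rebuild the 3 x 3 matrix used in the loop examples.
s <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, byrow = TRUE)

# Create an empty vector with one slot per row, then fill it inside the loop.
row.total <- numeric(nrow(s))
for (i in 1:nrow(s)) {
  row.total[i] <- sum(s[i, ])   # total of row i
}
row.total

# The built-in rowSums gives the same totals without an explicit loop.
rowSums(s)
```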

> for(i in 1:3)
+ {
+ print(s[i,])
+ }
[1] 1 2 3
[1] 4 5 6
[1] 7 8 9

Functions

In principle, there are two sorts of functions in R. The most common and useful ones are the library functions, that is, the commands that are already written. For example, mean and sd calculate the average and the standard deviation of an object, say a vector, respectively. Here are a couple of examples:

> s2<-seq(1,10,2)
> s2
[1] 1 3 5 7 9
> mean(s2)
[1] 5
> var(s2)
[1] 10
> sd(s2)
[1] 3.162278
> median(s2)
[1] 5

The second type of function is the kind that users of R create themselves. These functions remain in R's memory unless you delete or overwrite them. As expected, the command to create a function is function. Here is an example of a function that takes a matrix and calculates, for each of its rows, the standard deviation divided by the mean, σ/µ. This measurement is called the coefficient of variation. Note that in writing this function, I use the commands mean and sd.

In general, any time you are not sure what an R command does, or you want to learn about its specifics, just type a question mark followed by the command at the prompt.

> m.cv<-function(mat)
{
  u=nrow(mat)
  t=numeric(u)
  for(i in 1:u)
  {
    t[i]<-sd(mat[i,])/mean(mat[i,])
  }
  return(t)
}
> sm3<-matrix(c(1:9),nrow=3)
> sm3
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
> m.cv(sm3)
[1] 0.75 0.60 0.50
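The same row-by-row calculation can be written without an explicit loop by using apply. This is an alternative sketch, not the method used in the lab; the name cv.apply is illustrative.

```r
# The matrix from the example above, filled column by column.
sm3 <- matrix(c(1:9), nrow = 3)

# apply(X, 1, f) calls f once on each row of X (1 = rows, 2 = columns).
# Here f returns the coefficient of variation of that row: sd / mean.
cv.apply <- apply(sm3, 1, function(r) sd(r) / mean(r))
cv.apply
```

Each row of sm3 has standard deviation 3 and means 4, 5 and 6, so the result should agree with the output of m.cv(sm3).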

Visualizing Data: Pie Charts, Stem plots, Histograms

Categorical Data

For categorical data, we keep track of the counts or relative frequencies of each group. A graphical presentation should therefore reflect the percentage of occurrences in each category. This is usually done with two types of graphs: (1) pie charts and (2) barplots. Both are easy to create in R. In most cases it makes sense to label the categories; we show how to do this below.

Example 1. The counts and percentages of the marital status of American women were collected by the Current Population Survey in 1995 as follows:

Marital Status    Count (millions)    Percent
Never Married     43.9                22.9
Married           116.7               60.9
Widowed           13.4                7.0
Divorced          17.6                9.2

Here are the commands to produce the pie chart for these data (figure 1):

> married<-c(43.9,116.7,13.4,17.6)
> married.code<-as.factor(c(1,2,3,4))
> pie(married,married.code)

Alternatively, we could label each slice of the pie by creating a character vector that contains the name of each slice (figure 2):

> married.code<-c("never married","married","widowed","divorced")
> pie(married,married.code)

To create a barplot for the married data, this time using the percentages, it is sufficient to execute the following (figure 3):

> married<-c(22.9,60.9,7,9.2)
> barplot(married,names.arg=married.code)
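The percentages typed in for the barplot can also be computed from the counts rather than entered by hand. The following sketch assumes the counts given in Example 1; the name married.pct is illustrative.

```r
# Counts in millions, labelled by category.
married <- c(43.9, 116.7, 13.4, 17.6)
names(married) <- c("never married", "married", "widowed", "divorced")

# Each count as a percentage of the total, rounded to one decimal place.
married.pct <- round(100 * married / sum(married), 1)
married.pct
```

Because married.pct carries names, calling barplot(married.pct) would label the bars with those names without a separate names.arg.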

Figure 1: Pie chart for the Married data.

Stemplots and Histograms

For quantitative data, stemplots and histograms are useful visual tools.

Example 2. Let's revisit the class-age data we introduced previously. To create the stemplot for these data, we can do the following:

> class.age<-c(35,35,36,37,37,38,38,39,40.5,43,44,44.5,50,19)
> class.age
 [1] 35.0 35.0 36.0 37.0 37.0 38.0 38.0 39.0 40.5 43.0 44.0 44.5 50.0 19.0
> stem(class.age)

  The decimal point is 1 digit(s) to the right of the |

Figure 2: Pie chart for the Married data with labels.

> stem(class.age,scale=2)

  The decimal point is 1 digit(s) to the right of the |

> test<-c(0,0.01,0.22,0.34,0.36,0.31,0.36,0.45,0.4,0.55,0.65)
> stem(test)

  The decimal point is 1 digit(s) to the left of the |

> stem(test,scale=2)

  The decimal point is 1 digit(s) to the left of the |

Figure 3: Barplot for the Married data with labels.

To create a histogram for the class-age data, it is sufficient to use the hist command (figure 4):

> hist(class.age)

Figure 4: The histogram of class.age.

We can make the bars finer by supplying our own break points. Here is a simple trick (figure 5):

> b1<-seq(15,50,3)
> b1
 [1] 15 18 21 24 27 30 33 36 39 42 45 48
> b1<-seq(15,50,3)+2
> hist(class.age,breaks=b1)
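When breaks are supplied by hand, they must span the whole range of the data, which is why 2 is added to the sequence above so that the last break reaches 50. The sketch below checks this and reads the bin counts without drawing anything; the name h is illustrative.

```r
class.age <- c(35, 35, 36, 37, 37, 38, 38, 39, 40.5, 43, 44, 44.5, 50, 19)

# Break points 17, 20, ..., 50: bins of width 3.
b1 <- seq(15, 50, 3) + 2

# The smallest and largest observations must lie within the breaks.
range(class.age)
range(b1)

# plot = FALSE returns the histogram object instead of drawing it.
h <- hist(class.age, breaks = b1, plot = FALSE)
h$counts                               # number of observations in each bin
sum(h$counts) == length(class.age)     # every observation falls in some bin
```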

Figure 5: The histogram of class.age with finer classes.

Measuring Center: The Mean, the Median, and the Quartiles

Measures of center play a fundamental role in understanding statistical distributions. The most important ones are the mean, the median, and the other quantiles.

> mean(class.age)
[1] 38.28571
> median(class.age)
[1] 38
> quantile(class.age)
    0%    25%    50%    75%   100% 
19.000 36.250 38.000 42.375 50.000 
> quantile(class.age,prob=0.66)

  66% 
39.87 

Comparing Mean and Median

The Symmetric Case

For symmetric distributions such as the one in figure 6, the median and the mean are close to each other.

Figure 6: A symmetric distribution.

For distributions that are skewed to the left, such as the one in figure 7, the mean is smaller than the median (why?). Finally, for right-skewed distributions, the mean is larger than the median (figure 8).

Measuring Spread: The Standard Deviation

The standard deviation measures how far, in the units of the data, the observations typically lie from the mean. For example, the standard deviations for the data in figures 6, 7, and 8 are 10.03, 0.19, and 0.19 respectively. To calculate the variance and the standard deviation for the class.age data, we can type:

> var(class.age)
> sd(class.age)

Figure 7: Left-skewed distribution. Mean = 0.75.

Project 1. Visualizing Grades.

First, read the file grades.txt from the webpage. To do this, run the following code in R:

> grades=read.table("

This will generate a data frame called grades in R for you. Rows are students, and columns represent the Verbal SAT score, the Math SAT score, and the GPA for each student. To examine the dimensions of this object you can type:

> dim(grades)

This confirms the size we expected. Now we are in a position to answer the following questions:

Figure 8: Right-skewed distribution. Mean = 0.24.

(1.) Create barplots, dotplots, stemplots, and histograms for the three variables of interest. To make life easier, use the command attach to make the columns of the data frame grades available by name. Next, proceed by just typing the name of the column of interest. Here is what I mean:

> attach(grades)
> GPA

(2.) Calculate the min, the max, the mean, the median, the first quartile, the third quartile, and the standard deviation of each variable. A good chunk of that information may be obtained with the command summary:

> summary(GPA)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 

(3.) Report your findings in detail. Compare the verbal scores with the math scores. Comment on the symmetry, measures of center, measures of spread, and the potential outliers in each distribution. Make sure to comment on the statistical features of the GPA as well.

Boxplots

Boxplots are efficient tools for representing data distributions. The five-number summary can be read off a boxplot. Additionally, we can identify outliers with boxplots. Recall the three distributions in figures 6, 7 and 8, which are symmetric, left-skewed and right-skewed respectively. Here is how I created figure 9 below:

> n1=rnorm(100000,2,3)
> n2=rpois(100000,3)
> n3=rbeta(100000,12,3)
> par(mfrow=c(3,2))
> hist(n1)
> boxplot(n1)
> hist(n2)
> boxplot(n2)
> hist(n3)
> boxplot(n3)

Note that the command par(mfrow=c(3,2)) creates a 3 by 2 grid in the graphics area. Side-by-side boxplots are very helpful in comparing two or more distributions. For example, figure 10 shows side-by-side boxplots for two of the distributions of figure 9.

> boxplot(n1,n2)
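The numbers a boxplot is built from can be inspected without drawing it. The sketch below applies two base R functions, fivenum and boxplot.stats, to the class.age data from earlier in the lab; the names five and bs are illustrative.

```r
class.age <- c(35, 35, 36, 37, 37, 38, 38, 39, 40.5, 43, 44, 44.5, 50, 19)

# Five-number summary: minimum, lower hinge, median, upper hinge, maximum.
five <- fivenum(class.age)
five

# boxplot.stats returns what boxplot would draw:
#   $stats  whisker ends, hinges and median
#   $out    points more than 1.5 box lengths beyond the hinges
bs <- boxplot.stats(class.age)
bs$stats
bs$out
```

Here the hinges are 36 and 43, so the box length is 7 and any value below 25.5 or above 53.5 is flagged, which singles out the age of 19.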

Figure 9: Histograms along with boxplots for the three simulated datasets.

Figure 10: Side-by-side boxplots for two of the distributions of figure 9.

Data can be in the form of numbers, words, measurements, observations or even just descriptions of things.

Data can be in the form of numbers, words, measurements, observations or even just descriptions of things. + What is Data? Data is a collection of facts. Data can be in the form of numbers, words, measurements, observations or even just descriptions of things. In most cases, data needs to be interpreted and

More information

Statistics 251: Statistical Methods

Statistics 251: Statistical Methods Statistics 251: Statistical Methods Summaries and Graphs in R Module R1 2018 file:///u:/documents/classes/lectures/251301/renae/markdown/master%20versions/summary_graphs.html#1 1/14 Summary Statistics

More information

Density Curve (p52) Density curve is a curve that - is always on or above the horizontal axis.

Density Curve (p52) Density curve is a curve that - is always on or above the horizontal axis. 1.3 Density curves p50 Some times the overall pattern of a large number of observations is so regular that we can describe it by a smooth curve. It is easier to work with a smooth curve, because the histogram

More information

Chapter 2: The Normal Distributions

Chapter 2: The Normal Distributions Chapter 2: The Normal Distributions Measures of Relative Standing & Density Curves Z-scores (Measures of Relative Standing) Suppose there is one spot left in the University of Michigan class of 2014 and

More information

Statistics Lecture 6. Looking at data one variable

Statistics Lecture 6. Looking at data one variable Statistics 111 - Lecture 6 Looking at data one variable Chapter 1.1 Moore, McCabe and Craig Probability vs. Statistics Probability 1. We know the distribution of the random variable (Normal, Binomial)

More information

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order. Chapter 2 2.1 Descriptive Statistics A stem-and-leaf graph, also called a stemplot, allows for a nice overview of quantitative data without losing information on individual observations. It can be a good

More information

Chapter 2: Descriptive Statistics

Chapter 2: Descriptive Statistics Chapter 2: Descriptive Statistics Student Learning Outcomes By the end of this chapter, you should be able to: Display data graphically and interpret graphs: stemplots, histograms and boxplots. Recognize,

More information

Parents Names Mom Cell/Work # Dad Cell/Work # Parent List the Math Courses you have taken and the grade you received 1 st 2 nd 3 rd 4th

Parents Names Mom Cell/Work # Dad Cell/Work # Parent   List the Math Courses you have taken and the grade you received 1 st 2 nd 3 rd 4th Full Name Phone # Parents Names Birthday Mom Cell/Work # Dad Cell/Work # Parent email: Extracurricular Activities: List the Math Courses you have taken and the grade you received 1 st 2 nd 3 rd 4th Turn

More information

Chapter 5: The standard deviation as a ruler and the normal model p131

Chapter 5: The standard deviation as a ruler and the normal model p131 Chapter 5: The standard deviation as a ruler and the normal model p131 Which is the better exam score? 67 on an exam with mean 50 and SD 10 62 on an exam with mean 40 and SD 12? Is it fair to say: 67 is

More information

CHAPTER 2: SAMPLING AND DATA

CHAPTER 2: SAMPLING AND DATA CHAPTER 2: SAMPLING AND DATA This presentation is based on material and graphs from Open Stax and is copyrighted by Open Stax and Georgia Highlands College. OUTLINE 2.1 Stem-and-Leaf Graphs (Stemplots),

More information

VCEasy VISUAL FURTHER MATHS. Overview

VCEasy VISUAL FURTHER MATHS. Overview VCEasy VISUAL FURTHER MATHS Overview This booklet is a visual overview of the knowledge required for the VCE Year 12 Further Maths examination.! This booklet does not replace any existing resources that

More information

UNIT 1A EXPLORING UNIVARIATE DATA

UNIT 1A EXPLORING UNIVARIATE DATA A.P. STATISTICS E. Villarreal Lincoln HS Math Department UNIT 1A EXPLORING UNIVARIATE DATA LESSON 1: TYPES OF DATA Here is a list of important terms that we must understand as we begin our study of statistics

More information

Statistical Methods. Instructor: Lingsong Zhang. Any questions, ask me during the office hour, or me, I will answer promptly.

Statistical Methods. Instructor: Lingsong Zhang. Any questions, ask me during the office hour, or  me, I will answer promptly. Statistical Methods Instructor: Lingsong Zhang 1 Issues before Class Statistical Methods Lingsong Zhang Office: Math 544 Email: lingsong@purdue.edu Phone: 765-494-7913 Office Hour: Monday 1:00 pm - 2:00

More information

Lecture 3 Questions that we should be able to answer by the end of this lecture:

Lecture 3 Questions that we should be able to answer by the end of this lecture: Lecture 3 Questions that we should be able to answer by the end of this lecture: Which is the better exam score? 67 on an exam with mean 50 and SD 10 or 62 on an exam with mean 40 and SD 12 Is it fair

More information

CHAPTER 2 DESCRIPTIVE STATISTICS

CHAPTER 2 DESCRIPTIVE STATISTICS CHAPTER 2 DESCRIPTIVE STATISTICS 1. Stem-and-Leaf Graphs, Line Graphs, and Bar Graphs The distribution of data is how the data is spread or distributed over the range of the data values. This is one of

More information

Lecture 3 Questions that we should be able to answer by the end of this lecture:

Lecture 3 Questions that we should be able to answer by the end of this lecture: Lecture 3 Questions that we should be able to answer by the end of this lecture: Which is the better exam score? 67 on an exam with mean 50 and SD 10 or 62 on an exam with mean 40 and SD 12 Is it fair

More information

MATH11400 Statistics Homepage

MATH11400 Statistics Homepage MATH11400 Statistics 1 2010 11 Homepage http://www.stats.bris.ac.uk/%7emapjg/teach/stats1/ 1.1 A Framework for Statistical Problems Many statistical problems can be described by a simple framework in which

More information

AP Statistics Prerequisite Packet

AP Statistics Prerequisite Packet Types of Data Quantitative (or measurement) Data These are data that take on numerical values that actually represent a measurement such as size, weight, how many, how long, score on a test, etc. For these

More information

Name: Date: Period: Chapter 2. Section 1: Describing Location in a Distribution

Name: Date: Period: Chapter 2. Section 1: Describing Location in a Distribution Name: Date: Period: Chapter 2 Section 1: Describing Location in a Distribution Suppose you earned an 86 on a statistics quiz. The question is: should you be satisfied with this score? What if it is the

More information

STA Module 4 The Normal Distribution

STA Module 4 The Normal Distribution STA 2023 Module 4 The Normal Distribution Learning Objectives Upon completing this module, you should be able to 1. Explain what it means for a variable to be normally distributed or approximately normally

More information

STA /25/12. Module 4 The Normal Distribution. Learning Objectives. Let s Look at Some Examples of Normal Curves

STA /25/12. Module 4 The Normal Distribution. Learning Objectives. Let s Look at Some Examples of Normal Curves STA 2023 Module 4 The Normal Distribution Learning Objectives Upon completing this module, you should be able to 1. Explain what it means for a variable to be normally distributed or approximately normally

More information

Chapter2 Description of samples and populations. 2.1 Introduction.

Chapter2 Description of samples and populations. 2.1 Introduction. Chapter2 Description of samples and populations. 2.1 Introduction. Statistics=science of analyzing data. Information collected (data) is gathered in terms of variables (characteristics of a subject that

More information

Week 4: Describing data and estimation

Week 4: Describing data and estimation Week 4: Describing data and estimation Goals Investigate sampling error; see that larger samples have less sampling error. Visualize confidence intervals. Calculate basic summary statistics using R. Calculate

More information

STA Rev. F Learning Objectives. Learning Objectives (Cont.) Module 3 Descriptive Measures

STA Rev. F Learning Objectives. Learning Objectives (Cont.) Module 3 Descriptive Measures STA 2023 Module 3 Descriptive Measures Learning Objectives Upon completing this module, you should be able to: 1. Explain the purpose of a measure of center. 2. Obtain and interpret the mean, median, and

More information

Chapter 2. Descriptive Statistics: Organizing, Displaying and Summarizing Data

Chapter 2. Descriptive Statistics: Organizing, Displaying and Summarizing Data Chapter 2 Descriptive Statistics: Organizing, Displaying and Summarizing Data Objectives Student should be able to Organize data Tabulate data into frequency/relative frequency tables Display data graphically

More information

Chapter 3 - Displaying and Summarizing Quantitative Data

Chapter 3 - Displaying and Summarizing Quantitative Data Chapter 3 - Displaying and Summarizing Quantitative Data 3.1 Graphs for Quantitative Data (LABEL GRAPHS) August 25, 2014 Histogram (p. 44) - Graph that uses bars to represent different frequencies or relative

More information

MATH 112 Section 7.2: Measuring Distribution, Center, and Spread

MATH 112 Section 7.2: Measuring Distribution, Center, and Spread MATH 112 Section 7.2: Measuring Distribution, Center, and Spread Prof. Jonathan Duncan Walla Walla College Fall Quarter, 2006 Outline 1 Measures of Center The Arithmetic Mean The Geometric Mean The Median

More information

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data.

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data. 1 CHAPTER 1 Introduction Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data. Variable: Any characteristic of a person or thing that can be expressed

More information

Vocabulary. 5-number summary Rule. Area principle. Bar chart. Boxplot. Categorical data condition. Categorical variable.

Vocabulary. 5-number summary Rule. Area principle. Bar chart. Boxplot. Categorical data condition. Categorical variable. 5-number summary 68-95-99.7 Rule Area principle Bar chart Bimodal Boxplot Case Categorical data Categorical variable Center Changing center and spread Conditional distribution Context Contingency table

More information

MATH& 146 Lesson 10. Section 1.6 Graphing Numerical Data

MATH& 146 Lesson 10. Section 1.6 Graphing Numerical Data MATH& 146 Lesson 10 Section 1.6 Graphing Numerical Data 1 Graphs of Numerical Data One major reason for constructing a graph of numerical data is to display its distribution, or the pattern of variability

More information

2.1: Frequency Distributions and Their Graphs

2.1: Frequency Distributions and Their Graphs 2.1: Frequency Distributions and Their Graphs Frequency Distribution - way to display data that has many entries - table that shows classes or intervals of data entries and the number of entries in each

More information

Section 1.2. Displaying Quantitative Data with Graphs. Mrs. Daniel AP Stats 8/22/2013. Dotplots. How to Make a Dotplot. Mrs. Daniel AP Statistics

Section 1.2. Displaying Quantitative Data with Graphs. Mrs. Daniel AP Stats 8/22/2013. Dotplots. How to Make a Dotplot. Mrs. Daniel AP Statistics Section. Displaying Quantitative Data with Graphs Mrs. Daniel AP Statistics Section. Displaying Quantitative Data with Graphs After this section, you should be able to CONSTRUCT and INTERPRET dotplots,

More information

Acquisition Description Exploration Examination Understanding what data is collected. Characterizing properties of data.

Acquisition Description Exploration Examination Understanding what data is collected. Characterizing properties of data. Summary Statistics Acquisition Description Exploration Examination what data is collected Characterizing properties of data. Exploring the data distribution(s). Identifying data quality problems. Selecting

More information

Chapter 3: Data Description - Part 3. Homework: Exercises 1-21 odd, odd, odd, 107, 109, 118, 119, 120, odd

Chapter 3: Data Description - Part 3. Homework: Exercises 1-21 odd, odd, odd, 107, 109, 118, 119, 120, odd Chapter 3: Data Description - Part 3 Read: Sections 1 through 5 pp 92-149 Work the following text examples: Section 3.2, 3-1 through 3-17 Section 3.3, 3-22 through 3.28, 3-42 through 3.82 Section 3.4,

More information

CHAPTER 2: DESCRIPTIVE STATISTICS Lecture Notes for Introductory Statistics 1. Daphne Skipper, Augusta University (2016)

CHAPTER 2: DESCRIPTIVE STATISTICS Lecture Notes for Introductory Statistics 1. Daphne Skipper, Augusta University (2016) CHAPTER 2: DESCRIPTIVE STATISTICS Lecture Notes for Introductory Statistics 1 Daphne Skipper, Augusta University (2016) 1. Stem-and-Leaf Graphs, Line Graphs, and Bar Graphs The distribution of data is

More information

Table of Contents (As covered from textbook)

Table of Contents (As covered from textbook) Table of Contents (As covered from textbook) Ch 1 Data and Decisions Ch 2 Displaying and Describing Categorical Data Ch 3 Displaying and Describing Quantitative Data Ch 4 Correlation and Linear Regression

More information

Create a bar graph that displays the data from the frequency table in Example 1. See the examples on p Does our graph look different?

Create a bar graph that displays the data from the frequency table in Example 1. See the examples on p Does our graph look different? A frequency table is a table with two columns, one for the categories and another for the number of times each category occurs. See Example 1 on p. 247. Create a bar graph that displays the data from the

More information

Chapter 2 Describing, Exploring, and Comparing Data

Chapter 2 Describing, Exploring, and Comparing Data Slide 1 Chapter 2 Describing, Exploring, and Comparing Data Slide 2 2-1 Overview 2-2 Frequency Distributions 2-3 Visualizing Data 2-4 Measures of Center 2-5 Measures of Variation 2-6 Measures of Relative

More information

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency Math 1 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency lowest value + highest value midrange The word average: is very ambiguous and can actually refer to the mean,

More information

STA 570 Spring Lecture 5 Tuesday, Feb 1

STA 570 Spring Lecture 5 Tuesday, Feb 1 STA 570 Spring 2011 Lecture 5 Tuesday, Feb 1 Descriptive Statistics Summarizing Univariate Data o Standard Deviation, Empirical Rule, IQR o Boxplots Summarizing Bivariate Data o Contingency Tables o Row

More information

Measures of Position. 1. Determine which student did better

Measures of Position. 1. Determine which student did better Measures of Position z-score (standard score) = number of standard deviations that a given value is above or below the mean (Round z to two decimal places) Sample z -score x x z = s Population z - score

More information

AP Statistics Summer Assignment:

AP Statistics Summer Assignment: AP Statistics Summer Assignment: Read the following and use the information to help answer your summer assignment questions. You will be responsible for knowing all of the information contained in this

More information

LAB 1 INSTRUCTIONS DESCRIBING AND DISPLAYING DATA

LAB 1 INSTRUCTIONS DESCRIBING AND DISPLAYING DATA LAB 1 INSTRUCTIONS DESCRIBING AND DISPLAYING DATA This lab will assist you in learning how to summarize and display categorical and quantitative data in StatCrunch. In particular, you will learn how to

More information

Name Date Types of Graphs and Creating Graphs Notes

Name Date Types of Graphs and Creating Graphs Notes Name Date Types of Graphs and Creating Graphs Notes Graphs are helpful visual representations of data. Different graphs display data in different ways. Some graphs show individual data, but many do not.

More information

2.1 Objectives. Math Chapter 2. Chapter 2. Variable. Categorical Variable EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES

2.1 Objectives. Math Chapter 2. Chapter 2. Variable. Categorical Variable EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES Chapter 2 2.1 Objectives 2.1 What Are the Types of Data? www.managementscientist.org 1. Know the definitions of a. Variable b. Categorical versus quantitative

More information

ECLT 5810 Data Preprocessing. Prof. Wai Lam

ECLT 5810 Data Preprocessing. Prof. Wai Lam ECLT 5810 Data Preprocessing Prof. Wai Lam Why Data Preprocessing? Data in the real world is imperfect incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate

More information

IAT 355 Visual Analytics. Data and Statistical Models. Lyn Bartram

IAT 355 Visual Analytics. Data and Statistical Models. Lyn Bartram IAT 355 Visual Analytics Data and Statistical Models Lyn Bartram Exploring data Example: US Census People # of people in group Year # 1850 2000 (every decade) Age # 0 90+ Sex (Gender) # Male, female Marital

More information

a. divided by the. 1) Always round!! a) Even if class width comes out to a, go up one.

a. divided by the. 1) Always round!! a) Even if class width comes out to a, go up one. Probability and Statistics Chapter 2 Notes I Section 2-1 A Steps to Constructing Frequency Distributions 1 Determine number of (may be given to you) a Should be between and classes 2 Find the Range a The

More information

Chapter 2 Modeling Distributions of Data

Chapter 2 Modeling Distributions of Data Chapter 2 Modeling Distributions of Data Section 2.1 Describing Location in a Distribution Describing Location in a Distribution Learning Objectives After this section, you should be able to: FIND and

More information

STA Module 2B Organizing Data and Comparing Distributions (Part II)

STA Module 2B Organizing Data and Comparing Distributions (Part II) STA 2023 Module 2B Organizing Data and Comparing Distributions (Part II) Learning Objectives Upon completing this module, you should be able to 1 Explain the purpose of a measure of center 2 Obtain and

More information

STA Learning Objectives. Learning Objectives (cont.) Module 2B Organizing Data and Comparing Distributions (Part II)

STA Learning Objectives. Learning Objectives (cont.) Module 2B Organizing Data and Comparing Distributions (Part II) STA 2023 Module 2B Organizing Data and Comparing Distributions (Part II) Learning Objectives Upon completing this module, you should be able to 1 Explain the purpose of a measure of center 2 Obtain and

More information
