Regression III: Advanced Methods
Slide 1: Lecture 3: Distributions
Regression III: Advanced Methods
William G. Jacoby, Michigan State University
Slide 2: Goals of the lecture
- Examine data in graphical form
- Graphs for looking at univariate distributions
  - Histograms
  - Density estimation
  - Quantile comparison plots
  - Boxplots
- Plots exploring relationships
  - Parallel boxplots
  - Scatterplots (including scatterplot matrices and dynamic three-dimensional scatterplots)
  - Bivariate and three-dimensional density estimation
  - Conditioning plots
Slide 3: Histograms
- A histogram dissects the range of the data into bins of equal width along the horizontal axis
- The vertical axis represents the frequency counts (or percents, proportions); the bars represent the counts
- Fewer bins give a smoother histogram but less detail about the distribution
- Trade-off between smoothness and detail: we want to preserve as much detail as possible, but we do not want the graph to be so rough that its shape is difficult to discern
- Stem-and-leaf displays
  - An alternative form of histogram that uses the numerical data themselves to form the bars
  - Each data value is broken into two parts: a stem and a leaf
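The stem-and-leaf idea can be sketched in a few lines of Python (the lecture itself works in R). This is only an illustration under simple assumptions: the values are non-negative integers, the stem is everything above the last digit, and the leaf is the last digit. The function name and the data are made up.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf display for non-empty, non-negative integers."""
    leaves_by_stem = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # stem = tens and above, leaf = last digit
        leaves_by_stem[stem].append(leaf)
    lines = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = "".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines

for line in stem_and_leaf([12, 15, 21, 27, 27, 43]):
    print(line)
```

Each row plays the role of a histogram bar whose length is the number of leaves, while the digits keep the individual values readable.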
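Slide 4: Choosing the number of bins
- A simple rule of thumb for small datasets (approx. 100 observations or fewer) is: number of bins $\approx 2\sqrt{n}$
- For larger samples, the n.bins function in the car package for R implements the formula recommended by Freedman and Diaconis (1981): number of bins $= \left\lceil \dfrac{n^{1/3}\,(x_{(n)} - x_{(1)})}{2\,(Q_3 - Q_1)} \right\rceil$
[Figure: frequency histogram of income]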
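As a rough Python sketch (not the car package's code), the two bin-count rules can be computed as follows. It assumes the small-sample rule of thumb is 2√n and uses the Freedman-Diaconis bin width 2·IQR·n^(-1/3); the quartiles come from Python's `statistics.quantiles`, which may differ slightly from the hinges R would use. Function names and data are hypothetical.

```python
import math
import statistics

def bins_rule_of_thumb(n):
    """Small-sample rule of thumb: about 2 * sqrt(n) bins."""
    return math.ceil(2 * math.sqrt(n))

def bins_freedman_diaconis(x):
    """Bin count implied by the Freedman-Diaconis bin width 2 * IQR * n**(-1/3)."""
    n = len(x)
    q1, _, q3 = statistics.quantiles(x, n=4, method="inclusive")
    width = 2 * (q3 - q1) / n ** (1 / 3)
    if width == 0:
        raise ValueError("interquartile range is zero; the rule is undefined")
    return math.ceil((max(x) - min(x)) / width)

incomes = [float(v) for v in range(1, 101)]  # made-up data: 1, 2, ..., 100
print(bins_rule_of_thumb(len(incomes)))
print(bins_freedman_diaconis(incomes))
```

Because the Freedman-Diaconis rule is driven by the interquartile range, a few extreme values change the suggested bin width far less than they would under a rule based on the standard deviation.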
Slide 5: Nonparametric Density Estimation
- Although histograms can be very useful for examining the distribution of a variable, their appearance can differ dramatically depending on the number of bins employed
- This problem can be (partially) overcome using nonparametric density estimation
- Density estimation attempts to estimate the probability density function of a variable from the sample; less formally, it can be thought of as a way of averaging and smoothing the histogram
- Since a density function encloses an area of 1, we must first rescale the histogram so that the total area under the smoothed line (the area within the bins) equals 1. In other words, we examine the proportion of cases at specific points in the histogram rather than the frequency counts
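The rescaling step can be illustrated with a short Python sketch. It assumes equal-width bins, so each bar's height becomes count / (n × bin width); the counts and the bin width below are invented for the example.

```python
def density_heights(counts, bin_width):
    """Rescale frequency counts so that the bars enclose a total area of 1."""
    n = sum(counts)
    return [c / (n * bin_width) for c in counts]

counts = [3, 7, 12, 6, 2]   # made-up frequency counts for five income bins
bin_width = 500.0           # made-up bin width
heights = density_heights(counts, bin_width)
area = sum(h * bin_width for h in heights)
print(heights)
print(area)  # total area of the rescaled bars
```

After rescaling, the height of a bar is a density rather than a count, which is what allows a smoothed density curve to be drawn on the same vertical scale.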
Slide 6: Kernel Density Estimation
- Essentially a sophisticated form of locally weighted averaging of the distribution
- Uses a weight function (kernel) that ensures the enclosed area of the curve equals 1
- Probability density functions (such as the standard normal density function) are good choices because they are smooth and symmetric
- The kernel density estimate is calculated as follows: $\hat{p}(x) = \dfrac{1}{nh} \sum_{i=1}^{n} K\!\left(\dfrac{x - X_i}{h}\right)$, where $K$ is the kernel function and $h$ is the bandwidth
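A minimal Python sketch of a kernel density estimate, assuming a standard normal kernel and the usual form (1/(nh)) Σ K((x − X_i)/h). The function names, data, and bandwidth are illustrative only and are not taken from the lecture's R script.

```python
import math

def gaussian_kernel(z):
    """Standard normal density, used here as the kernel K."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def kde(x0, data, h):
    """Kernel density estimate at x0: (1 / (n * h)) * sum of K((x0 - x_i) / h)."""
    n = len(data)
    return sum(gaussian_kernel((x0 - xi) / h) for xi in data) / (n * h)

data = [1.0, 2.0, 2.5, 4.0, 7.0]   # made-up observations
for x0 in (0.0, 2.0, 4.0, 6.0):
    print(x0, round(kde(x0, data, h=0.8), 4))
```

Because each kernel is itself a density with area 1 and the sum is divided by n, the estimate integrates to 1 regardless of the bandwidth; h only controls how smooth the curve is.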
Slide 7: Selecting the Bandwidth
- Unlike with histograms, we no longer set the number of bins; instead we must select the bandwidth $h$
- We can do this visually, but statistical theory provides some help: $h = 0.9\,\sigma\,n^{-1/5}$
- The population standard deviation $\sigma$ is unknown, so we replace it with an adaptive estimator of spread (the sample standard deviation $S$ can be inflated if the underlying density isn't normal): $A = \min\!\left(S,\ \dfrac{\text{hinge spread}}{1.349}\right)$
- The hinge spread is the interquartile range; 1.349 is the hinge spread of the standard normal distribution
- The formula for the bandwidth is then: $h = 0.9\,A\,n^{-1/5}$
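The bandwidth rule can be sketched in Python as below. The sketch assumes the adaptive spread is the smaller of the sample standard deviation and the interquartile range divided by 1.349, and that the multiplier is 0.9; quartiles from `statistics.quantiles` stand in for the hinges, which can differ slightly in small samples. Names and data are hypothetical.

```python
import statistics

def adaptive_spread(data):
    """Smaller of the sample SD and IQR / 1.349 (robust to heavy tails and outliers)."""
    s = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(s, (q3 - q1) / 1.349)

def bandwidth(data):
    """Rule-of-thumb bandwidth h = 0.9 * A * n**(-1/5)."""
    return 0.9 * adaptive_spread(data) * len(data) ** (-1 / 5)

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]   # made-up data with one outlier
print(statistics.stdev(data))    # inflated by the outlier
print(adaptive_spread(data))     # falls back on IQR / 1.349
print(bandwidth(data))
```

With the outlier present, the standard deviation is far larger than the rescaled interquartile range, so the adaptive estimator keeps the bandwidth from being stretched by a single extreme value.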
Slide 8: An example of Kernel Density Estimation
- If the underlying density is substantially nonnormal, the bandwidth formula produces a window width $2h$ that is too wide (i.e., the estimate is oversmoothed), but it is a good starting point
- As the bandwidth increases, the density curve becomes smoother
- Ideally we want a smooth curve like the black line in the figure (bw=1087)
[Figure: density histogram of income with kernel density estimates for bw=1087 and bw=400]
Slide 9: R script for previous graph
Slide 10: Density Estimates with confidence envelopes
- The sm package for R allows you to plot variability bands with a width of two standard errors
- These bands can be especially useful for assessing modality
- More details are in Bowman and Azzalini (1997: Chapter 2)
[Figure: probability density function of income with variability bands]
Slide 11: Density Estimates with a normal reference band
- The sm package also allows you to fit a normal reference band, which indicates the likely position of the density estimate when the data are normal (the blue area in the figure)
- Again, more details are in Bowman and Azzalini (1997: Chapter 2)
[Figure: probability density function of income with a normal reference band]
Slide 12: Quantile Comparison Plots (1)
- Quantile comparison plots are most useful for examining the tails of the distribution
- Unlike histograms and density estimates, they do not require arbitrary bins and thus preserve the continuous nature of the data
- They do so by comparing the sample distribution to a theoretical cumulative distribution function (CDF)
- We could compare the CDF directly with the empirical cumulative distribution function (ECDF), which gives the proportion of the data that fall below each x value as x moves from left to right
- Because the ECDF is typically rough (a stair-step function that rises by 1/n at each observation), however, the quantile comparison plot does not construct it directly
Slide 13: Quantile Comparison Plots (2)
- Order the cases from lowest to highest: $X_{(1)}, X_{(2)}, \dots, X_{(n)}$
- Calculate the cumulative proportion before each $X_{(i)}$: $P_i = \dfrac{i - 1/2}{n}$
- The $z_i$ values that correspond to the cumulative probabilities $P_i$ are found from the inverse of the CDF: $z_i = P^{-1}(P_i)$, where $P$ is the theoretical CDF
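Slide 14: Quantile Comparison Plots (3)
- Plot the $z_i$ values on the horizontal axis and the $X_{(i)}$ on the vertical axis
- If $X$ is normal, then $X_{(i)} \approx \mu + \sigma z_i$. In other words, the plot should be approximately linear
- Draw a line that connects the hinges (the quartiles)
- A 95% confidence envelope is constructed as follows: $\hat{X}_{(i)} \pm 2 \times SE(X_{(i)})$, with $SE(X_{(i)}) = \dfrac{\hat{\sigma}}{p(z_i)} \sqrt{\dfrac{P_i\,(1 - P_i)}{n}}$, where $\hat{X}_{(i)}$ is the fitted value on the line, $\hat{\sigma}$ is the estimated standard deviation, and $p(z_i)$ is the density of the reference distribution at $z_i$
- A positive skew is indicated by points above the line at both ends; a negative skew is indicated by points below the line at both ends
- Heavy-tailed distributions are indicated by points above the confidence envelope for high values and points below it for low values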
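The construction of the plotted points can be sketched in Python as follows. The sketch assumes a normal reference distribution and the cumulative proportion (i − 1/2)/n; it produces the (z, x) pairs only and does not draw the plot or the confidence envelope. The function name and data are made up.

```python
from statistics import NormalDist

def normal_quantile_pairs(data):
    """Return (z_i, x_(i)) pairs for a normal quantile comparison plot."""
    xs = sorted(data)                    # order statistics x_(1) <= ... <= x_(n)
    n = len(xs)
    std_normal = NormalDist()            # mean 0, standard deviation 1
    pairs = []
    for i, x in enumerate(xs, start=1):
        p = (i - 0.5) / n                # cumulative proportion before x_(i)
        z = std_normal.inv_cdf(p)        # matching standard normal quantile
        pairs.append((z, x))
    return pairs

for z, x in normal_quantile_pairs([3.1, 0.5, 2.2, 9.0, 4.4]):
    print(round(z, 3), x)
```

Plotting x against z for these pairs gives the quantile comparison plot; points drifting above a straight line at both ends would suggest the positive skew described above.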
Slide 15: Quantile Comparison Plots (4)
- QQ-plots are easily implemented using the car package
- Here we see a positive skew, since points lie above the line on both tails (a negative skew is indicated by points below the line on both tails)
- As we can see, this plot is useful for examining the tails of a distribution
- It tells us nothing about the mode, however
[Figure: normal quantile comparison plot of income (vertical axis: income; horizontal axis: norm quantiles); labelled points: GENERAL.MANAGERS, PHYSICIANS, LAWYERS, OSTEOPATHS.CHIROPRACTORS, VETERINARIANS, PILOTS]
Slide 16: Boxplots
- Display the center, spread, and outliers of a distribution
- The vertical axis represents the range of the variable
- A box is drawn around the hinge spread
- A line is drawn at the median
- Observations more than 1.5 hinge spreads beyond the hinges are treated as outliers and marked individually
- Whiskers connect the box to the most extreme nonoutlying observations
[Figure: boxplot of income; labelled outliers: GENERAL.MANAGERS, PHYSICIANS, LAWYERS, OSTEOPATHS.CHIROPRACTORS, VETERINARIANS]
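The quantities a boxplot draws can be computed with a short Python sketch. It uses the quartiles from `statistics.quantiles` as a stand-in for the hinges (an assumption; the two can differ slightly) and the 1.5 hinge-spread rule for outliers. The function name and data are invented.

```python
import statistics

def boxplot_summary(data):
    """Median, hinges, whisker ends, and outliers under the 1.5 hinge-spread rule."""
    xs = sorted(data)
    q1, median, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    spread = q3 - q1
    lower_fence = q1 - 1.5 * spread
    upper_fence = q3 + 1.5 * spread
    inside = [x for x in xs if lower_fence <= x <= upper_fence]
    outliers = [x for x in xs if x < lower_fence or x > upper_fence]
    return {
        "median": median,
        "hinges": (q1, q3),
        "whiskers": (inside[0], inside[-1]),  # most extreme nonoutlying values
        "outliers": outliers,
    }

print(boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

Note that the whiskers stop at actual observations inside the fences, not at the fences themselves.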
Slide 17: Side-by-side Boxplots
- Helpful for comparing many distributions
[Figure: boxplots of income for different occupation types (bc, prof, wc)]
Slide 18: Why use graphs to examine relationships?
- Anscombe's (1973) contrived data show the importance of using graphs in data analysis rather than simply looking at numerical output
- Four very different relationships have exactly the same correlation coefficient, standard deviation of the residuals, and regression coefficients and standard errors
- The linear regression line adequately summarizes only graph (a)
[Figure: four scatterplots of Y against X: (a) Accurate summary; (b) Distorts curvilinear rel.; (c) Drawn to outlier; (d) "Chases" outlier]
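The point can be checked numerically with a small Python sketch. The data are Anscombe's published quartet, keyed here by the panel letters used on the slide (that mapping is an assumption); the helper name is hypothetical and the least-squares fit is computed by hand.

```python
import math

X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "a": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "b": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "c": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "d": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
          [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

def fit_summary(x, y):
    """Least-squares slope and intercept plus the correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / math.sqrt(sxx * syy)

for panel, (x, y) in ANSCOMBE.items():
    slope, intercept, r = fit_summary(x, y)
    print(panel, round(slope, 2), round(intercept, 2), round(r, 2))
```

All four panels give essentially the same slope, intercept, and correlation, so only a plot reveals how different the four relationships are.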
Slide 19: Scatterplots
- One of the most widely used statistical graphs; it summarizes the relationship between two quantitative variables
- Including nonparametric smooths and linear regression lines can aid visualization of the trend
- Jittering the data (i.e., adding a small random component to each value) is a useful technique when one of the variables does not take on many different values, or when the sample is so large that the data are over-plotted and there are few empty spaces on the graph
- We can jitter one or both of the variables in a scatterplot
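Jittering can be sketched in Python as below. The sketch assumes uniform noise within ±amount, which is one common choice and not necessarily what the lecture's R code does; the function name, the seed argument, and the data are made up.

```python
import random

def jitter(values, amount, seed=None):
    """Add uniform random noise in [-amount, amount] to each value."""
    rng = random.Random(seed)   # a seed makes the jitter reproducible
    return [v + rng.uniform(-amount, amount) for v in values]

ages = [20, 20, 21, 21, 21, 22]          # made-up data with many ties
print(jitter(ages, amount=0.4, seed=1))  # tied points are now spread apart
```

The noise is for display only: summaries and models should still be computed from the original, unjittered values.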
Slide 20: Jittering scatterplots
[Figure: two scatterplots of conservative attitudes against age, one with no jitter and one jittered]
Slide 21: Identifying categorical variables
- Distinguishing between the categories of a categorical control variable in scatterplots can reveal important patterns we might otherwise miss
- Different slopes for nondemocracies and democracies suggest an interaction between democracy and income inequality in their effects on attitudes
[Figure: scatterplot of attitudes towards inequality (mean) against Gini coefficient, with democratic and non-democratic countries distinguished]
Slide 22: Bivariate Density Estimates
- The kernel smoothing method used to smooth histograms can be easily extended to the joint distribution of two continuous random variables
- The bivariate density estimate takes the following form: $\hat{p}(x_1, x_2) = \dfrac{1}{n h_1 h_2} \sum_{i=1}^{n} K\!\left(\dfrac{x_1 - X_{1i}}{h_1}\right) K\!\left(\dfrac{x_2 - X_{2i}}{h_2}\right)$, where $K$ is the kernel function and $h_1$ and $h_2$ are the joint smoothing parameters
- For univariate densities, probabilities are associated with the area under the density curve. For a bivariate density, probabilities are associated with the volume under the density surface, where the total volume equals one
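A minimal Python sketch of a bivariate estimate, assuming a product of two standard normal kernels with separate bandwidths h1 and h2 (a common form, not necessarily the exact one the sm package uses). Names, data, and bandwidths are illustrative.

```python
import math

def gaussian_kernel(z):
    """Standard normal density, used as the kernel K in each dimension."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def bivariate_kde(x1, x2, data, h1, h2):
    """Product-kernel density estimate at the point (x1, x2)."""
    n = len(data)
    total = sum(
        gaussian_kernel((x1 - a) / h1) * gaussian_kernel((x2 - b) / h2)
        for a, b in data
    )
    return total / (n * h1 * h2)

points = [(0.0, 1.0), (1.5, 2.0), (3.0, 0.5)]   # made-up (x1, x2) pairs
print(bivariate_kde(1.0, 1.0, points, h1=0.7, h2=0.7))
```

Evaluating this function over a grid of (x1, x2) values gives the surface that perspective, image, and contour plots display; adding a third kernel factor and bandwidth extends the same idea to three dimensions.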
Slide 23: Types of Bivariate Density Plots
- Perspective plots: the joint distribution is shown in a 3D plot; height is used to show the level of density
- Image plots: different intensities of colour or shading denote density levels
- Contour plots or slice plots: lines trace paths of constant density (similar to the depiction of elevation in a geographical contour map)
Slide 24: R script for Bivariate Density Plots
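Slide 25: Three Dimensional Density Estimates
- The three-dimensional density estimate also extends simply from the bivariate case: $\hat{p}(x_1, x_2, x_3) = \dfrac{1}{n h_1 h_2 h_3} \sum_{i=1}^{n} K\!\left(\dfrac{x_1 - X_{1i}}{h_1}\right) K\!\left(\dfrac{x_2 - X_{2i}}{h_2}\right) K\!\left(\dfrac{x_3 - X_{3i}}{h_3}\right)$, where $K$ is the kernel function and $h_1$, $h_2$, and $h_3$ are the joint smoothing parameters
- In these plots, contours represent closed surfaces
- Like the other density estimates, these are helpful for assessing clustering of the data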
Slide 26: Some examples of three dimensional density estimates
[Figure: three-dimensional density estimates; axis labels: prestige, secpay, income, education, gini, gdp]
Slide 27: Scatterplot matrix
- Plots individual scatterplots for all possible bivariate relationships at one time
- Can be enhanced by adding density estimates for each variable on the diagonal
- Note: only marginal relationships are depicted (i.e., no control for other variables)
[Figure: scatterplot matrix of gini, secpay, and gdp]
Slide 28: Conditioning plots: An example from the CES
[Figure: conditioning plot of jitter(lascale) against age, given education and men]
Slide 29: Next Topic: Transformations for univariate and bivariate data
More informationECLT 5810 Data Preprocessing. Prof. Wai Lam
ECLT 5810 Data Preprocessing Prof. Wai Lam Why Data Preprocessing? Data in the real world is imperfect incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate
More informationGoals of the Lecture. SOC6078 Advanced Statistics: 9. Generalized Additive Models. Limitations of the Multiple Nonparametric Models (2)
SOC6078 Advanced Statistics: 9. Generalized Additive Models Robert Andersen Department of Sociology University of Toronto Goals of the Lecture Introduce Additive Models Explain how they extend from simple
More informationWeek 2: Frequency distributions
Types of data Health Sciences M.Sc. Programme Applied Biostatistics Week 2: distributions Data can be summarised to help to reveal information they contain. We do this by calculating numbers from the data
More informationVCEasy VISUAL FURTHER MATHS. Overview
VCEasy VISUAL FURTHER MATHS Overview This booklet is a visual overview of the knowledge required for the VCE Year 12 Further Maths examination.! This booklet does not replace any existing resources that
More informationLAB 1 INSTRUCTIONS DESCRIBING AND DISPLAYING DATA
LAB 1 INSTRUCTIONS DESCRIBING AND DISPLAYING DATA This lab will assist you in learning how to summarize and display categorical and quantitative data in StatCrunch. In particular, you will learn how to
More informationNo. of blue jelly beans No. of bags
Math 167 Ch5 Review 1 (c) Janice Epstein CHAPTER 5 EXPLORING DATA DISTRIBUTIONS A sample of jelly bean bags is chosen and the number of blue jelly beans in each bag is counted. The results are shown in
More informationappstats6.notebook September 27, 2016
Chapter 6 The Standard Deviation as a Ruler and the Normal Model Objectives: 1.Students will calculate and interpret z scores. 2.Students will compare/contrast values from different distributions using
More informationStat 528 (Autumn 2008) Density Curves and the Normal Distribution. Measures of center and spread. Features of the normal distribution
Stat 528 (Autumn 2008) Density Curves and the Normal Distribution Reading: Section 1.3 Density curves An example: GRE scores Measures of center and spread The normal distribution Features of the normal
More informationMultivariate probability distributions
Multivariate probability distributions September, 07 STAT 0 Class Slide Outline of Topics Background Discrete bivariate distribution 3 Continuous bivariate distribution STAT 0 Class Slide Multivariate
More informationOverview. Frequency Distributions. Chapter 2 Summarizing & Graphing Data. Descriptive Statistics. Inferential Statistics. Frequency Distribution
Chapter 2 Summarizing & Graphing Data Slide 1 Overview Descriptive Statistics Slide 2 A) Overview B) Frequency Distributions C) Visualizing Data summarize or describe the important characteristics of a
More informationCHAPTER 3: Data Description
CHAPTER 3: Data Description You ve tabulated and made pretty pictures. Now what numbers do you use to summarize your data? Ch3: Data Description Santorico Page 68 You ll find a link on our website to a
More informationNOTES TO CONSIDER BEFORE ATTEMPTING EX 1A TYPES OF DATA
NOTES TO CONSIDER BEFORE ATTEMPTING EX 1A TYPES OF DATA Statistics is concerned with scientific methods of collecting, recording, organising, summarising, presenting and analysing data from which future
More informationSo..to be able to make comparisons possible, we need to compare them with their respective distributions.
Unit 3 ~ Modeling Distributions of Data 1 ***Section 2.1*** Measures of Relative Standing and Density Curves (ex) Suppose that a professional soccer team has the money to sign one additional player and
More informationLecture Notes 3: Data summarization
Lecture Notes 3: Data summarization Highlights: Average Median Quartiles 5-number summary (and relation to boxplots) Outliers Range & IQR Variance and standard deviation Determining shape using mean &
More informationPage 1. Graphical and Numerical Statistics
TOPIC: Description Statistics In this tutorial, we show how to use MINITAB to produce descriptive statistics, both graphical and numerical, for an existing MINITAB dataset. The example data come from Exercise
More informationMeasures of Central Tendency
Page of 6 Measures of Central Tendency A measure of central tendency is a value used to represent the typical or average value in a data set. The Mean The sum of all data values divided by the number of
More informationData Mining: Exploring Data. Lecture Notes for Chapter 3
Data Mining: Exploring Data Lecture Notes for Chapter 3 Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Look for accompanying R code on the course web site. Topics Exploratory Data Analysis
More information