Using R. Liang Peng Georgia Institute of Technology January 2005
1. Introduction

Quote from the R project:

R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.

R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.

One of R's strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.
R is available as Free Software under the terms of the Free Software Foundation's GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows and MacOS.

2. Installing R under Windows

Go to the R download page, click Windows (95 and later), click base, click rw2100.exe, and save the file to disk. Then double-click rw2100.exe in the directory where you saved it and follow the instructions.

3. Starting R

Create a shortcut icon, R, on the desktop. Right-click this icon and select Properties. In Start in, type D:\3770Spring05 as the working directory for this course. Now double-click the R icon to start R. After starting R, click Packages in the top menu to install or update packages.

4. Objects

Generating random vectors:

    rnorm(10, mean=1, sd=2)
    runif(10)
    rchisq(10, df=2)
    rt(10, df=2)
    rcauchy(10)
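As a small illustration of the generators listed above, the following sketch draws a few vectors and checks some basic properties. The seed value and the variable names (z, u, w, z_again) are arbitrary choices made for this illustration.

```r
# Fix the seed so the random draws are reproducible (the seed value is arbitrary).
set.seed(3770)

z <- rnorm(10, mean = 1, sd = 2)   # 10 draws from a normal with mean 1 and sd 2
u <- runif(10)                     # 10 draws from Uniform(0, 1)
w <- rchisq(10, df = 2)            # 10 draws from chi-square with 2 df

length(z)      # number of elements in the vector
range(u)       # uniform draws lie between 0 and 1
all(w >= 0)    # chi-square draws are non-negative

# Re-setting the same seed reproduces the same vector.
set.seed(3770)
z_again <- rnorm(10, mean = 1, sd = 2)
identical(z, z_again)
```

Without set.seed, each call returns a different vector, which is usually what is wanted in a simulation but makes results harder to reproduce.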
Some elementary graphics functions:

1) plot(x, y, type="l", lty=1, xlim, ylim, xlab, ylab, main)
2) Adding things to a plot.
    points(x, y)          # add points
    lines(x, y)           # add lines
    text(x, y, labels)    # point labels
3) Lines of various kinds.
    abline(a, b)          # intercept a, slope b
    abline(h=c)           # horizontal at c
    abline(v=c)           # vertical at c
4) Diagnostic plot.
    qqnorm(y)             # normal scores
5) Miscellaneous.
    hist(x)               # histogram
    barplot(f)            # histogram-like bar chart
    dotchart(x, labels)
    pie(f)

Standard & Poor stock index:

    x = read.table("spidata.txt", sep=",")   # input data from file spidata.txt
    x                                        # look at the data
    dim(x)                 # dimension
    x[,1]                  # first column (daily return of stock price)
    x[,2]                  # second column (daily volume)
    length(x[,1])
    plot(x[,1])
    plot(x[,2])
    plot(x[,1], x[,2], type="l", lty=1, xlab="daily return", ylab="daily volume", main="Comparison")
    hist(x[,1])
    hist(x[,1], probability=TRUE)
    qqnorm(x[,1])
    qqline(x[,1])
    sort(x[,1])
    sort(x[,1], decreasing=TRUE)
    order(x[,1])
    abs(x[,1])
    x[,1]*x[,2]
    x[,1]+x[,2]
    x[,1][5:10]
    min(x[,1])
    max(x[,1])
    summary(x[,1])

5. Control, Loops and Functions

1) General forms:
    for (var in vector) statement
    while (condition) statement
    repeat statement
2) Define a function:
    fname = function(arg1, arg2, ...) statement

6. Basic statistics

Standard statistical procedures are in the package stats. For instance, the one-sample t-test and the corresponding confidence interval can be obtained with the command t.test. Type help("t.test") to obtain the following information.

Student's t-test

Description:
    Performs one sample t-tests on vectors of data.
Usage:
    t.test(x, ...)

Arguments:
    x: a numeric vector of data values.
    alternative: a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". You can specify just the initial letter.
    mu: a number indicating the true value of the mean.
    conf.level: confidence level of the interval.

To complete the classical approach, we first need to determine a rejection region for the t-statistic. Recall that a critical value is in fact a quantile, and hence we may compute a critical value for the t-statistic by means of the command qt.

    help("TDist")

The Student t Distribution

Description:
    Density, distribution function, quantile function and random generation for the t distribution with df degrees of freedom.

Usage:
    dt(x, df)
    pt(q, df)
    qt(p, df)
    rt(n, df)
Arguments:
    x, q: vector of quantiles.
    p: vector of probabilities.
    n: number of observations. If length(n) > 1, the length is taken to be the number required.
    df: degrees of freedom (> 0, maybe non-integer).

Value:
    dt gives the density, pt gives the distribution function, qt gives the quantile function, and rt generates random deviates.

7. An example

In Exercise 8.32 we are asked to do a two-sided test of the null hypothesis H0: µ = 100, using α = 0.05 (which corresponds to confidence level 95%). First, we compute the critical value t_0.025, which is the 0.975 quantile of the t-distribution with 12 − 1 = 11 degrees of freedom.

    qt(0.975, df=11)
    [1] 2.200985

Hence, we now know that the rejection region is equal to {t : |t| ≥ 2.201}. Next, we compute the value taken by the test statistic.

    t.test(C1, mu=100, conf.level=0.95)
This produces the following output.

    One Sample t-test

    data:  C1
    t = [...], df = 11, p-value = [...]
    alternative hypothesis: true mean is not equal to 100
    95 percent confidence interval:
     [...]
    sample estimates:
    mean of x
     [...]

The test statistic takes the value [...], which lies outside the rejection region. Hence, we do not reject: there is no evidence that the population mean reading µ is not equal to 100. Note that the output also contains the P-value, which is larger than the significance level 0.05. Hence, it immediately follows that the null hypothesis is not rejected. In this way we could have avoided the computation of the critical value. Moreover, the output shows that the 95% confidence interval for µ contains 100. That is, H0: µ = 100 is not rejected.
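The three equivalent ways of reaching the decision in this example (rejection region, P-value, confidence interval) can be sketched as follows. The data vector below is made up for illustration and is not the data of Exercise 8.32. The names mu0, alpha, t_stat, t_crit, p_val and the reject_by_* flags are likewise choices made for this sketch.

```r
# Hypothetical readings (NOT the Exercise 8.32 data); 12 values, so df = 11.
C1 <- c(98, 102, 101, 97, 103, 99, 100, 104, 96, 101, 102, 99)

mu0   <- 100    # value of the mean under the null hypothesis
alpha <- 0.05   # significance level
n     <- length(C1)

# Test statistic, critical value and two-sided P-value computed by hand.
t_stat <- (mean(C1) - mu0) / (sd(C1) / sqrt(n))
t_crit <- qt(1 - alpha / 2, df = n - 1)
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)

# The same test via t.test, plus its confidence interval.
res <- t.test(C1, mu = mu0, conf.level = 1 - alpha)
ci  <- res$conf.int

reject_by_region <- abs(t_stat) >= t_crit        # classical approach
reject_by_pvalue <- p_val <= alpha               # P-value approach
reject_by_ci     <- mu0 < ci[1] || mu0 > ci[2]   # confidence-interval approach

c(region = reject_by_region, pvalue = reject_by_pvalue, ci = reject_by_ci)
```

For a two-sided test all three flags agree, which is why the critical value need not be computed once the P-value or the confidence interval is available.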
8. Generating data

Many of the exercises in Devore's book only involve summarized data, and accordingly the data sets only contain summarized data. For instance, if you open the SPSS file with the data for Exercise 9.7, you may obtain the following output.

    library(foreign)
    ex09.07 = read.spss("d:/manual Install/Datasets/SPSS/Ch09/ex9-07.sav")
    summary(ex09.07)
             Length Class  Mode
    GENDER   2      -none- character
    SAMPLE_S 2      -none- numeric
    SAMPLE_M 2      -none- numeric
    SAMPLE_1 2      -none- numeric
    attach(ex09.07)
    data.frame(GENDER, SAMPLE_S, SAMPLE_M, SAMPLE_1)
      GENDER  SAMPLE_S SAMPLE_M SAMPLE_1
    1 Males   [...]
    2 Females [...]

The use of statistical software typically requires the original data. Thus, we
need original data with the same sample means and sample standard deviations as reported in the summary. We may simulate the male data as follows: (i) generate 97 independent standard normal random variables; (ii) scale them, so as to obtain a random sample with mean 0 and standard deviation 1; (iii) multiply each value in the sample by 4.83 and add 10.4. The output below shows that this approach works.

    males = 4.83*scale(rnorm(97))+10.4
    mean(males)
    [1] 10.4
    sd(males)
    [1] 4.83

In the same way, we may simulate the female data, and compare both samples by means of a two-sample t-test.

    females = 4.68*scale(rnorm(148))+9.26
    t.test(males, females)

    Welch Two Sample t-test

    data:  males and females
    t = 1.829, df = [...], p-value = [...]
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     [...]
    sample estimates:
    mean of x mean of y
     [...]
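The simulation recipe in this section can be wrapped in a small function. This is a minimal sketch: the helper name simulate_exact and the seed are hypothetical choices made for this illustration. It also shows why the simulated data reproduce the reported t statistic: the Welch statistic depends on the data only through the sample sizes, means and standard deviations.

```r
# Hypothetical helper: returns a sample of size n whose sample mean and
# sample standard deviation equal m and s (up to rounding error).
simulate_exact <- function(n, m, s) {
  z <- as.vector(scale(rnorm(n)))   # scale() returns a matrix; flatten it
  s * z + m
}

set.seed(1)   # arbitrary seed, for reproducibility only
males   <- simulate_exact(97, 10.4, 4.83)
females <- simulate_exact(148, 9.26, 4.68)

c(mean(males), sd(males))       # sample mean and sd of the simulated males
c(mean(females), sd(females))   # sample mean and sd of the simulated females

# Welch statistic computed from the summary numbers alone ...
t_summary <- (10.4 - 9.26) / sqrt(4.83^2 / 97 + 4.68^2 / 148)
# ... and from the simulated data.
t_welch <- unname(t.test(males, females)$statistic)

round(c(t_summary, t_welch), 3)   # both round to 1.829
```

Because only the summary statistics enter the test, any seed gives the same t statistic, degrees of freedom and P-value; only the individual simulated observations change.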
More informationBluman & Mayer, Elementary Statistics, A Step by Step Approach, Canadian Edition
Bluman & Mayer, Elementary Statistics, A Step by Step Approach, Canadian Edition Online Learning Centre Technology Step-by-Step - Minitab Minitab is a statistical software application originally created
More informationYour Name: Section: 2. To develop an understanding of the standard deviation as a measure of spread.
Your Name: Section: 36-201 INTRODUCTION TO STATISTICAL REASONING Computer Lab #3 Interpreting the Standard Deviation and Exploring Transformations Objectives: 1. To review stem-and-leaf plots and their
More informationIBM SPSS Statistics and open source: A powerful combination. Let s go
and open source: A powerful combination Let s go The purpose of this paper is to demonstrate the features and capabilities provided by the integration of IBM SPSS Statistics and open source programming
More informationIPS9 in R: Bootstrap Methods and Permutation Tests (Chapter 16)
IPS9 in R: Bootstrap Methods and Permutation Tests (Chapter 6) Bonnie Lin and Nicholas Horton (nhorton@amherst.edu) July, 8 Introduction and background These documents are intended to help describe how
More informationLecture 6: Chapter 6 Summary
1 Lecture 6: Chapter 6 Summary Z-score: Is the distance of each data value from the mean in standard deviation Standardizes data values Standardization changes the mean and the standard deviation: o Z
More informationComparing Groups and Hypothesis Testing
Comparing Groups and Hypothesis Testing We ve mainly reviewed about informally comparing the distribution of data in different groups. Now we want to review the tools you know about how use statistics
More informationInterval Estimation. The data set belongs to the MASS package, which has to be pre-loaded into the R workspace prior to use.
Interval Estimation It is a common requirement to efficiently estimate population parameters based on simple random sample data. In the R tutorials of this section, we demonstrate how to compute the estimates.
More informationAppendix A: A Sample R Session
Appendix A: The purpose of this appendix is to become accustomed to the R and the way if responds to line commands. R can be downloaded from http://cran.r-project.org/ Be sure to download the version of
More information