5b. Descriptive Statistics - Part II


In this lab we'll cover how to calculate the descriptive statistics that we discussed in class. We will also learn how to summarize large multi-level databases efficiently, and how to export the results as a CSV file. Such a table of summary statistics can be the basis for the graphical representations that we cover in the subsequent labs. You will need to download the aspen.csv data from the course website.

5b.1 Descriptive statistics functions in R

Load the dataset (aspen.csv with the variables DBH and VOLUME) into R, attach the data, and try out the following functions, which give you summary statistics. Insert the name of the variable that you are interested in between the parentheses:

    mean() median() max() min() range() which.max() which.min() var() sd() quantile()

Percentiles can be calculated with the quantile function by adding a vector of the percentiles that you want. You may customize the variable name and the percentiles for your own purposes, e.g.:

    quantile(DBH, c(0.025, 0.05, 0.95, 0.975))

There are also these handy functions to calculate multiple statistics at once: fivenum(), summary()

5b.2 Multi-level summaries in R

Load the dataset from Lab 2 that you created by merging in the protein data (i.e. you should have the independent variables FARM and VARIETY and the dependent variables YIELD and PROTEIN, plus an ID). If you had trouble with this, you can also download the combined dataset from the website under today's lab.

    dat1 = read.csv("lentils_with_protein.csv")
    head(dat1)
    attach(dat1)

Now we want to do some more complex data summaries. For example, we may want to know the mean and standard deviation of yield and protein content for each lentil variety at each farm. With the functions above, this would require a lot of programming to subset the data and calculate the statistics of interest.
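To illustrate the manual approach just described, here is a minimal sketch in base R. The data frame dat_toy and its values are made up for illustration (they are not the lentil data), and all object names are hypothetical.

```r
# Made-up toy data standing in for the lentil dataset (illustrative values only)
dat_toy <- data.frame(
  FARM    = c("Farm1", "Farm1", "Farm1", "Farm1", "Farm2", "Farm2", "Farm2", "Farm2"),
  VARIETY = c("A", "A", "B", "B", "A", "A", "B", "B"),
  YIELD   = c(10, 12, 20, 22, 30, 34, 40, 44)
)

# Manual approach for a single group: subset first, then calculate
f1a <- dat_toy[dat_toy$FARM == "Farm1" & dat_toy$VARIETY == "A", ]
mean_f1a <- mean(f1a$YIELD)
sd_f1a   <- sd(f1a$YIELD)

# The same for every FARM x VARIETY combination requires a loop
groups <- unique(dat_toy[, c("FARM", "VARIETY")])
manual <- groups
manual$MYIELD <- NA_real_
for (i in seq_len(nrow(groups))) {
  keep <- dat_toy$FARM == groups$FARM[i] & dat_toy$VARIETY == groups$VARIETY[i]
  manual$MYIELD[i] <- mean(dat_toy$YIELD[keep])
}
print(manual)
```

Even for one statistic and one variable this takes several lines of bookkeeping, which is why a grouping function is more convenient.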
Thankfully, there is a nice R package, plyr, that does this automatically for us, but you need to install it first. From the menu in R, choose Packages, then Install packages. Next, you get a pop-up window where you can choose a download site. They all have exactly the same contents, so you may pick something nearby, for example Canada (BC). Then you get another pop-up window where you can choose a package. Scroll down and double-click plyr.

The first thing you always have to do with additional packages is to load them into computer memory. You have to rerun this library command every time you re-open R. Only the base functionality is loaded by default, not the additional packages that you may have installed.

    library(plyr)

Now we are ready to calculate some summaries. The function ddply() first needs to be told which dataset to work with (dat1), then given a list of independent variables to make the summaries by (in parentheses and preceded by a period). Next, we specify summarise (ddply() can do other things as well), and finally you provide the function or formula that should be applied to the data. Try:

    ddply(dat1, .(FARM), summarise, myield=mean(YIELD))
    ddply(dat1, .(FARM, VARIETY), summarise, myield=mean(YIELD))
    ddply(dat1, .(FARM), summarise, mprotein=mean(PROTEIN))

The result from the last command may strike you as odd. However, this is how many functions work by default. If you have missing values, the result of the function will be a missing value, unless you add an argument that removes missing values before the calculation.

    ddply(dat1, .(FARM), summarise, mprotein=mean(PROTEIN, na.rm=TRUE))

This is worth keeping in mind for all statistical functions. Try:

    mean(dat1$PROTEIN) versus mean(dat1$PROTEIN, na.rm=TRUE)
    sd(dat1$PROTEIN) versus sd(dat1$PROTEIN, na.rm=TRUE)
    range(dat1$PROTEIN) versus range(dat1$PROTEIN, na.rm=TRUE)

To top things off, let's calculate the mean and standard error of yield and protein content. We have not covered standard errors in class, but I want to demonstrate that you can use ddply() with any formula of interest. To get the standard error, you have to divide the standard deviation by the square root of the number of observations. We can get the number of observations with the function length(). For protein content, we need the number of non-missing observations, which we can get with length(PROTEIN[!is.na(PROTEIN)]). We'll write the results into a table (dat2) and then export the result as a CSV.
    dat2 = ddply(dat1, .(FARM, VARIETY), summarise,
      myield = mean(YIELD),
      seyield = sd(YIELD) / sqrt(length(YIELD)),
      mprotein = mean(PROTEIN, na.rm=TRUE),
      seprotein = sd(PROTEIN, na.rm=TRUE) / sqrt(length(PROTEIN[!is.na(PROTEIN)]))
    )
    write.csv(dat2, "lentil_summary.csv")

This may be a little complicated if you are not used to programming, but you can see that it is a powerful way to quickly summarize your data. It's the numerical equivalent of: boxplot(YIELD~VARIETY*FARM).

Just for the record, the R base package has a function that is similar to ddply(), but its output is a bit ugly, it loses the names of the group variables, and it can't do complex functions. The general syntax is aggregate(variable(s), by=list(class variable(s)), FUN=statistic of your choice). For example:

    aggregate(YIELD, by=list(FARM, VARIETY), FUN=mean)

or a two-variable example:

    aggregate(dat1[, c("YIELD", "PROTEIN")], by=list(FARM, VARIETY), FUN=mean, na.rm=TRUE)

There is no reason to use this rather than the plyr package, however.
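The standard-error formula used in the ddply() call (standard deviation divided by the square root of the number of non-missing observations) can also be wrapped in a small helper. The sketch below is only an illustration: the function name se and the vector protein_toy are hypothetical, and the values are made up.

```r
# Hypothetical helper: standard error of the mean, ignoring missing values
se <- function(x) {
  x <- x[!is.na(x)]          # keep only the non-missing observations
  sd(x) / sqrt(length(x))    # standard deviation / sqrt(n)
}

# Made-up protein values with one missing observation
protein_toy <- c(2, 4, 4, 4, 5, 5, 7, 9, NA)

print(mean(protein_toy))                # missing value, because of the NA
print(mean(protein_toy, na.rm = TRUE))  # mean of the 8 non-missing values
print(se(protein_toy))                  # standard error based on n = 8
```

Assuming the helper is defined before the summary is run, it should be usable inside the summarise step in the same way as mean() or sd(), for example seprotein = se(PROTEIN).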

5b.3 Multi-level summaries in SAS with UNIVARIATE and BOXPLOT

SAS has a very useful and powerful procedure to calculate summary statistics. Let's try PROC UNIVARIATE for the aspen.csv dataset. This gives you all the statistics you ever wanted to know:

    proc univariate data=lentils;
      var YIELD;
    run;

There is a more useful alternative: specify which statistics you actually want, and write them into an output file that you can export. Let's try this with the lentil.csv dataset. For this dataset we can also calculate statistics for multiple levels, by specifying one or more by variables. In the code below, we ask for the number of non-missing values (n) and the average (mean) of yield:

    proc sort data=lentils; by FARM VARIETY; run;   /* required sorting for by variables! */
    proc univariate data=lentils noprint;
      var YIELD;
      by FARM VARIETY;
      output out=summary_stats1 n=nyield mean=myield;
    run;

You can get any number of statistics for multiple variables. Again, merge in your protein dataset and try several or all of the following options:

    N NMISS MAX MIN RANGE SUM MEAN MODE VAR STD CV STDMEAN MEDIAN
    P1 P5 P10 P90 P95 P99 Q1 Q3 QRANGE SKEWNESS KURTOSIS

    proc sort data=lentils; by FARM VARIETY; run;   /* required sorting for by variables! */
    proc univariate data=lentils noprint;
      var YIELD PROTEIN;
      by FARM VARIETY;
      output out=summary_stats2 q1=q1yield q1protein q3=q3yield q3protein;
    run;

In the output out= file, you can add as many variables and statistics as you like, but each has to have a different name. A good idea is to choose a combination of the statistic and variable name as above. You don't see these file names in your SAS spreadsheet, but you do when you export your results.

To export your table of summary statistics, it's actually fine to use the SAS wizard from the file menu, but you can also code it to save you some clicks:

    proc export data=summary_stats2
      outfile="c:\your path\your file name.csv"
      dbms=csv replace;
    run;

You can also do some quick boxplots in SAS, but the graphical quality is poor and customization options are minimal. Don't waste too much time with graphics in SAS. The data also needs to be sorted correctly for this procedure, but sorting "incorrectly", for example by both Farm and Variety, can trick SAS into displaying multi-level boxplots.

    proc sort data=lentils; by variety; run;
    /* also try: proc sort data=lentils; by farm variety; */
    goptions reset=all;
    proc boxplot data=lentils;   /* Simple boxplot */
      plot yield*variety;
    run;

Add two rows to your lentils.csv file with a mild and a far outlier (for separate Farm*Variety groups) and see how they display with this extended code:

    proc boxplot data=lentils;   /* Customized boxplot */
      ID id;                     /* This line labels outliers so that you can identify them */
      plot yield*variety / cboxfill=ligr cboxes=black;
      insetgroup n mean q1 q3;   /* This line adds statistics of choice for each variety */
    run;

5b.4 Pivot tables with Excel

Pivot tables are one of the rare cases where Excel beats R and SAS in functionality. They are the equivalent of PROC UNIVARIATE and the AGGREGATE command. If you just want some quick summary statistics from a database, pivot tables are often easier to use, and you have more flexibility in arranging your table of summary statistics. I highly recommend exploring this functionality with your own data. AGGREGATE and UNIVARIATE are of course still useful if the calculation of summary statistics is part of a larger program, or if you want to calculate statistics for a large number of dependent variables (for example, you will quickly get tired of dragging variables to the value field if you have 365 daily measurements over the course of a year).

Open lentils.csv in Excel and choose the ribbon Insert > Pivot Table.

Calculate means for Farm and Variety combinations:

1. Drag Yield into the value field and choose Value field settings from the drop-down list, then select Average.
2. Arrange the table in various ways by dragging Farm and Variety into the Rows / Columns / Filter fields.
3. The variables in Filter get a drop-down list, where you can choose subsets (e.g. remove Farm1 data by unchecking the box).
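For comparison, a pivot-table-like layout (one factor in rows, the other in columns, means in the cells) can be sketched in base R with tapply(). The data frame dat_toy and its values are made up for illustration and are not the lentil data.

```r
# Made-up toy data standing in for lentils.csv (illustrative values only)
dat_toy <- data.frame(
  FARM    = c("Farm1", "Farm1", "Farm1", "Farm1", "Farm2", "Farm2", "Farm2", "Farm2"),
  VARIETY = c("A", "A", "B", "B", "A", "A", "B", "B"),
  YIELD   = c(10, 12, 20, 22, 30, 34, 40, 44)
)

# Rows = FARM, columns = VARIETY, cells = mean YIELD
pivot <- tapply(dat_toy$YIELD,
                list(FARM = dat_toy$FARM, VARIETY = dat_toy$VARIETY),
                mean)
print(pivot)

# Swapping rows and columns corresponds to dragging fields in Excel
print(t(pivot))
```

Unlike the Excel pivot table, this layout is fixed once computed, so rearranging it means rerunning the call with the factors in a different order.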

CHALLENGE:

You now have all the statistical tools to manage data tables and calculate summary statistics. In the next labs we will cover graphical presentation of quantitative data, and this will complete the toolkit that you need for the first round of project websites and project presentations.

For your project, you should address a real scientific or applied question (even if you have to make up the data). You are likely on the right track if you can answer the following questions for your project (you may also answer them for the two datasets that you have come across so far: aspen.csv from Lab 1 and lentils.csv from Lab 2):

1. What is the population that the investigators probably want to study?
2. What are your sampling units or experimental units?
3. How would you select a sample from a population in a way that allows you to generalize?
4. What is the predictor variable(s)?
5. What is the response variable(s)?
6. Is the predictor variable manipulated in an experiment or simply observed?
7. What is your idea of the causal relationship between the response variable and the predictor variable (scientific hypothesis)?
8. If you don't have a non-trivial scientific hypothesis, what are the practical implications of the findings? Can you use the relationship to predict or control the response variable?

Later we will ask:

9. What is the appropriate statistical method to study the relationship between the response variable and the predictor variable (statistical hypothesis)?
10. Assuming you find a statistically significant relationship, can you think of a reasonable alternative explanation for the apparent link between the variables?
11. How accurate will the prediction or control be?

3. Data Tables & Data Management

3. Data Tables & Data Management 3. Data Tables & Data Management In this lab, we will learn how to create and manage data tables for analysis. We work with a very simple example, so it is easy to see what the code does. In your own projects

More information

Lab #9: ANOVA and TUKEY tests

Lab #9: ANOVA and TUKEY tests Lab #9: ANOVA and TUKEY tests Objectives: 1. Column manipulation in SAS 2. Analysis of variance 3. Tukey test 4. Least Significant Difference test 5. Analysis of variance with PROC GLM 6. Levene test for

More information

EXST SAS Lab Lab #6: More DATA STEP tasks

EXST SAS Lab Lab #6: More DATA STEP tasks EXST SAS Lab Lab #6: More DATA STEP tasks Objectives 1. Working from an current folder 2. Naming the HTML output data file 3. Dealing with multiple observations on an input line 4. Creating two SAS work

More information

STA 570 Spring Lecture 5 Tuesday, Feb 1

STA 570 Spring Lecture 5 Tuesday, Feb 1 STA 570 Spring 2011 Lecture 5 Tuesday, Feb 1 Descriptive Statistics Summarizing Univariate Data o Standard Deviation, Empirical Rule, IQR o Boxplots Summarizing Bivariate Data o Contingency Tables o Row

More information

Lab 1. Introduction to R & SAS. R is free, open-source software. Get it here:

Lab 1. Introduction to R & SAS. R is free, open-source software. Get it here: Lab 1. Introduction to R & SAS R is free, open-source software. Get it here: http://tinyurl.com/yfet8mj for your own computer. 1.1. Using R like a calculator Open R and type these commands into the R Console

More information

THE UNIVERSITY OF BRITISH COLUMBIA FORESTRY 430 and 533. Time: 50 minutes 40 Marks FRST Marks FRST 533 (extra questions)

THE UNIVERSITY OF BRITISH COLUMBIA FORESTRY 430 and 533. Time: 50 minutes 40 Marks FRST Marks FRST 533 (extra questions) THE UNIVERSITY OF BRITISH COLUMBIA FORESTRY 430 and 533 MIDTERM EXAMINATION: October 14, 2005 Instructor: Val LeMay Time: 50 minutes 40 Marks FRST 430 50 Marks FRST 533 (extra questions) This examination

More information

Using SAS to Analyze CYP-C Data: Introduction to Procedures. Overview

Using SAS to Analyze CYP-C Data: Introduction to Procedures. Overview Using SAS to Analyze CYP-C Data: Introduction to Procedures CYP-C Research Champion Webinar July 14, 2017 Jason D. Pole, PhD Overview SAS overview revisited Introduction to SAS Procedures PROC FREQ PROC

More information

Reading data in SAS and Descriptive Statistics

Reading data in SAS and Descriptive Statistics P8130 Recitation 1: Reading data in SAS and Descriptive Statistics Zilan Chai Sep. 18 th /20 th 2017 Outline Intro to SAS (windows, basic rules) Getting Data into SAS Descriptive Statistics SAS Windows

More information

Excel 2007/2010. Don t be afraid of PivotTables. Prepared by: Tina Purtee Information Technology (818)

Excel 2007/2010. Don t be afraid of PivotTables. Prepared by: Tina Purtee Information Technology (818) Information Technology MS Office 2007/10 Users Guide Excel 2007/2010 Don t be afraid of PivotTables Prepared by: Tina Purtee Information Technology (818) 677-2090 tpurtee@csun.edu [ DON T BE AFRAID OF

More information

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data.

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data. 1 CHAPTER 1 Introduction Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data. Variable: Any characteristic of a person or thing that can be expressed

More information

Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9

Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9 Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9 Contents 1 Introduction to Using Excel Spreadsheets 2 1.1 A Serious Note About Data Security.................................... 2 1.2

More information

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order. Chapter 2 2.1 Descriptive Statistics A stem-and-leaf graph, also called a stemplot, allows for a nice overview of quantitative data without losing information on individual observations. It can be a good

More information

Chapter 3. Descriptive Measures. Slide 3-2. Copyright 2012, 2008, 2005 Pearson Education, Inc.

Chapter 3. Descriptive Measures. Slide 3-2. Copyright 2012, 2008, 2005 Pearson Education, Inc. Chapter 3 Descriptive Measures Slide 3-2 Section 3.1 Measures of Center Slide 3-3 Definition 3.1 Mean of a Data Set The mean of a data set is the sum of the observations divided by the number of observations.

More information

Laboratory Topics 1 & 2

Laboratory Topics 1 & 2 PLS205 Lab 1 January 12, 2012 Laboratory Topics 1 & 2 Welcome, introduction, logistics, and organizational matters Introduction to SAS Writing and running programs; saving results; checking for errors

More information

Objective: Class Activities

Objective: Class Activities Objective: A Pivot Table is way to present information in a report format. The idea is that you can click drop down lists and change the data that is being displayed. Students will learn how to group data

More information

CHAPTER 3: Data Description

CHAPTER 3: Data Description CHAPTER 3: Data Description You ve tabulated and made pretty pictures. Now what numbers do you use to summarize your data? Ch3: Data Description Santorico Page 68 You ll find a link on our website to a

More information

Easing into Data Exploration, Reporting, and Analytics Using SAS Enterprise Guide

Easing into Data Exploration, Reporting, and Analytics Using SAS Enterprise Guide Paper 809-2017 Easing into Data Exploration, Reporting, and Analytics Using SAS Enterprise Guide ABSTRACT Marje Fecht, Prowerk Consulting Whether you have been programming in SAS for years, are new to

More information

Ivy s Business Analytics Foundation Certification Details (Module I + II+ III + IV + V)

Ivy s Business Analytics Foundation Certification Details (Module I + II+ III + IV + V) Ivy s Business Analytics Foundation Certification Details (Module I + II+ III + IV + V) Based on Industry Cases, Live Exercises, & Industry Executed Projects Module (I) Analytics Essentials 81 hrs 1. Statistics

More information

Minitab 17 commands Prepared by Jeffrey S. Simonoff

Minitab 17 commands Prepared by Jeffrey S. Simonoff Minitab 17 commands Prepared by Jeffrey S. Simonoff Data entry and manipulation To enter data by hand, click on the Worksheet window, and enter the values in as you would in any spreadsheet. To then save

More information

Applied Regression Modeling: A Business Approach

Applied Regression Modeling: A Business Approach i Applied Regression Modeling: A Business Approach Computer software help: SAS SAS (originally Statistical Analysis Software ) is a commercial statistical software package based on a powerful programming

More information

PowerPoint Presentation to Accompany GO! All In One. Chapter 13

PowerPoint Presentation to Accompany GO! All In One. Chapter 13 PowerPoint Presentation to Accompany GO! Chapter 13 Create, Query, and Sort an Access Database; Create Forms and Reports 2013 Pearson Education, Inc. Publishing as Prentice Hall 1 Objectives Identify Good

More information

STAT:5400 Computing in Statistics

STAT:5400 Computing in Statistics STAT:5400 Computing in Statistics Introduction to SAS Lecture 18 Oct 12, 2015 Kate Cowles 374 SH, 335-0727 kate-cowles@uiowaedu SAS SAS is the statistical software package most commonly used in business,

More information

STA Module 2B Organizing Data and Comparing Distributions (Part II)

STA Module 2B Organizing Data and Comparing Distributions (Part II) STA 2023 Module 2B Organizing Data and Comparing Distributions (Part II) Learning Objectives Upon completing this module, you should be able to 1 Explain the purpose of a measure of center 2 Obtain and

More information

STA Learning Objectives. Learning Objectives (cont.) Module 2B Organizing Data and Comparing Distributions (Part II)

STA Learning Objectives. Learning Objectives (cont.) Module 2B Organizing Data and Comparing Distributions (Part II) STA 2023 Module 2B Organizing Data and Comparing Distributions (Part II) Learning Objectives Upon completing this module, you should be able to 1 Explain the purpose of a measure of center 2 Obtain and

More information

Exercise 1: Introduction to Stata

Exercise 1: Introduction to Stata Exercise 1: Introduction to Stata New Stata Commands use describe summarize stem graph box histogram log on, off exit New Stata Commands Downloading Data from the Web I recommend that you use Internet

More information

STA9750 Lecture I OUTLINE 1. WELCOME TO 9750!

STA9750 Lecture I OUTLINE 1. WELCOME TO 9750! STA9750 Lecture I OUTLINE 1. Welcome to STA9750! a. Blackboard b. Tentative syllabus c. Remote access to SAS 2. Introduction to reading data with SAS a. Manual input b. Reading from a text file c. Import

More information

Excel Primer CH141 Fall, 2017

Excel Primer CH141 Fall, 2017 Excel Primer CH141 Fall, 2017 To Start Excel : Click on the Excel icon found in the lower menu dock. Once Excel Workbook Gallery opens double click on Excel Workbook. A blank workbook page should appear

More information

It s Proc Tabulate Jim, but not as we know it!

It s Proc Tabulate Jim, but not as we know it! Paper SS02 It s Proc Tabulate Jim, but not as we know it! Robert Walls, PPD, Bellshill, UK ABSTRACT PROC TABULATE has received a very bad press in the last few years. Most SAS Users have come to look on

More information

Data Analysis Guidelines

Data Analysis Guidelines Data Analysis Guidelines DESCRIPTIVE STATISTICS Standard Deviation Standard deviation is a calculated value that describes the variation (or spread) of values in a data set. It is calculated using a formula

More information

Basic Concepts #6: Introduction to Report Writing

Basic Concepts #6: Introduction to Report Writing Basic Concepts #6: Introduction to Report Writing Using By-line, PROC Report, PROC Means, PROC Freq JC Wang By-Group Processing By-group processing in a procedure step, a BY line identifies each group

More information

Excel Basics 1. Running Excel When you first run Microsoft Excel you see the following menus and toolbars across the top of your new worksheet

Excel Basics 1. Running Excel When you first run Microsoft Excel you see the following menus and toolbars across the top of your new worksheet Excel Basics 1. Running Excel When you first run Microsoft Excel you see the following menus and toolbars across the top of your new worksheet The Main Menu Bar is located immediately below the Program

More information

Introductory SAS example

Introductory SAS example Introductory SAS example STAT:5201 1 Introduction SAS is a command-driven statistical package; you enter statements in SAS s language, submit them to SAS, and get output. A fairly friendly user interface

More information

Workshop. Import Workshop

Workshop. Import Workshop Import Overview This workshop will help participants understand the tools and techniques used in importing a variety of different types of data. It will also showcase a couple of the new import features

More information

STA Rev. F Learning Objectives. Learning Objectives (Cont.) Module 3 Descriptive Measures

STA Rev. F Learning Objectives. Learning Objectives (Cont.) Module 3 Descriptive Measures STA 2023 Module 3 Descriptive Measures Learning Objectives Upon completing this module, you should be able to: 1. Explain the purpose of a measure of center. 2. Obtain and interpret the mean, median, and

More information

Using Excel Tables to Manipulate Billing Data, Part 2

Using Excel Tables to Manipulate Billing Data, Part 2 Using Excel Tables to Manipulate Billing Data, Part 2 By Nate Moore, CPA, MBA, CMPE The May-June 2012 issue of Billing introduced tables in Excel, a powerful tool that is used to sort, filter, and organize

More information

Basics: How to Calculate Standard Deviation in Excel

Basics: How to Calculate Standard Deviation in Excel Basics: How to Calculate Standard Deviation in Excel In this guide, we are going to look at the basics of calculating the standard deviation of a data set. The calculations will be done step by step, without

More information

Using Excel, Chapter 2: Descriptive Statistics

Using Excel, Chapter 2: Descriptive Statistics 1 Using Excel, Chapter 2: Descriptive Statistics Individual Descriptive Statistics using Excel Functions 2 A Summary of Descriptive Statistics Using the Analysis ToolPak (Windows Users) 3 A Summary of

More information

Chapter 6: Modifying and Combining Data Sets

Chapter 6: Modifying and Combining Data Sets Chapter 6: Modifying and Combining Data Sets The SET statement is a powerful statement in the DATA step. Its main use is to read in a previously created SAS data set which can be modified and saved as

More information

Chapter 6: DESCRIPTIVE STATISTICS

Chapter 6: DESCRIPTIVE STATISTICS Chapter 6: DESCRIPTIVE STATISTICS Random Sampling Numerical Summaries Stem-n-Leaf plots Histograms, and Box plots Time Sequence Plots Normal Probability Plots Sections 6-1 to 6-5, and 6-7 Random Sampling

More information

ADVANCED INQUIRIES IN ALBEDO: PART 2 EXCEL DATA PROCESSING INSTRUCTIONS

ADVANCED INQUIRIES IN ALBEDO: PART 2 EXCEL DATA PROCESSING INSTRUCTIONS ADVANCED INQUIRIES IN ALBEDO: PART 2 EXCEL DATA PROCESSING INSTRUCTIONS Once you have downloaded a MODIS subset, there are a few steps you must take before you begin analyzing the data. Directions for

More information

Descriptive Statistics, Standard Deviation and Standard Error

Descriptive Statistics, Standard Deviation and Standard Error AP Biology Calculations: Descriptive Statistics, Standard Deviation and Standard Error SBI4UP The Scientific Method & Experimental Design Scientific method is used to explore observations and answer questions.

More information

SPREADSHEETS. (Data for this tutorial at

SPREADSHEETS. (Data for this tutorial at SPREADSHEETS (Data for this tutorial at www.peteraldhous.com/data) Spreadsheets are great tools for sorting, filtering and running calculations on tables of data. Journalists who know the basics can interview

More information

Dr. Barbara Morgan Quantitative Methods

Dr. Barbara Morgan Quantitative Methods Dr. Barbara Morgan Quantitative Methods 195.650 Basic Stata This is a brief guide to using the most basic operations in Stata. Stata also has an on-line tutorial. At the initial prompt type tutorial. In

More information

Creating a new form with check boxes, drop-down list boxes, and text box fill-ins. Customizing each of the three form fields.

Creating a new form with check boxes, drop-down list boxes, and text box fill-ins. Customizing each of the three form fields. In This Chapter Creating a new form with check boxes, drop-down list boxes, and text box fill-ins. Customizing each of the three form fields. Adding help text to any field to assist users as they fill

More information

Chapter 2 Describing, Exploring, and Comparing Data

Chapter 2 Describing, Exploring, and Comparing Data Slide 1 Chapter 2 Describing, Exploring, and Comparing Data Slide 2 2-1 Overview 2-2 Frequency Distributions 2-3 Visualizing Data 2-4 Measures of Center 2-5 Measures of Variation 2-6 Measures of Relative

More information

CHAPTER 6. The Normal Probability Distribution

CHAPTER 6. The Normal Probability Distribution The Normal Probability Distribution CHAPTER 6 The normal probability distribution is the most widely used distribution in statistics as many statistical procedures are built around it. The central limit

More information

Graphical Analysis of Data using Microsoft Excel [2016 Version]

Graphical Analysis of Data using Microsoft Excel [2016 Version] Graphical Analysis of Data using Microsoft Excel [2016 Version] Introduction In several upcoming labs, a primary goal will be to determine the mathematical relationship between two variable physical parameters.

More information

Access: You will have to

Access: You will have to Access: You will have to Create a new blank database Import data from a text file and set up the fields correctly Add some records to the table Create some reports. o For these reports you will need to

More information

Introduction to Stata: An In-class Tutorial

Introduction to Stata: An In-class Tutorial Introduction to Stata: An I. The Basics - Stata is a command-driven statistical software program. In other words, you type in a command, and Stata executes it. You can use the drop-down menus to avoid

More information

Writing Reports with the

Writing Reports with the Writing Reports with the SAS System s TABULATE Procedure or Big Money Proc Tabulate Ben Cochran The Bedford Group bencochran@nc.rr.com Writing Reports with the SAS System s TABULATE Procedure Copyright

More information

Using Excel for Graphical Analysis of Data

Using Excel for Graphical Analysis of Data Using Excel for Graphical Analysis of Data Introduction In several upcoming labs, a primary goal will be to determine the mathematical relationship between two variable physical parameters. Graphs are

More information

1. What is a PivotTable? What is a Cross Tab Report?

1. What is a PivotTable? What is a Cross Tab Report? Data Analysis & Business Intelligence Made Easy with Excel Power Tools Excel Data Analysis Basics = E-DAB Notes for Video: E-DAB-04: Summary Reports with Standard PivotTables & Slicers Objectives of Video:

More information

CHAPTER 2 DESCRIPTIVE STATISTICS

CHAPTER 2 DESCRIPTIVE STATISTICS CHAPTER 2 DESCRIPTIVE STATISTICS 1. Stem-and-Leaf Graphs, Line Graphs, and Bar Graphs The distribution of data is how the data is spread or distributed over the range of the data values. This is one of

More information

Learner Expectations UNIT 1: GRAPICAL AND NUMERIC REPRESENTATIONS OF DATA. Sept. Fathom Lab: Distributions and Best Methods of Display

Learner Expectations UNIT 1: GRAPICAL AND NUMERIC REPRESENTATIONS OF DATA. Sept. Fathom Lab: Distributions and Best Methods of Display CURRICULUM MAP TEMPLATE Priority Standards = Approximately 70% Supporting Standards = Approximately 20% Additional Standards = Approximately 10% HONORS PROBABILITY AND STATISTICS Essential Questions &

More information

ST Lab 1 - The basics of SAS

ST Lab 1 - The basics of SAS ST 512 - Lab 1 - The basics of SAS What is SAS? SAS is a programming language based in C. For the most part SAS works in procedures called proc s. For instance, to do a correlation analysis there is proc
