Solutions to Problem Set 2 Andrew Stokes Fall 2017


This answer key uses both dplyr and base R.

Set the working directory and point R to it.

    input <- "/users/jasoncollins/desktop/gh811/assignments/problemset2"
    setwd(input)

Read in any relevant libraries. The solutions below incorporate dplyr, a package in R that makes data manipulation faster and more intuitive than base R.

    suppressMessages(library(dplyr))

Now we are ready to read in the Malawi data.

    malawi_raw <- read.csv("malawi2010s.csv")

The first step in processing the data is to apply the inclusion criteria. We restrict the dataset to households in which the primary fuel used for cooking is wood. We also require that household cooking is performed either in the main residence or in a separate building. We exclude households in which cooking is performed outdoors because of the reduced risk of exposure to smoke in these instances. To reduce the size of the dataset, we also subset the data to include only those variables needed for the analysis.

In dplyr, we use the select verb to subset columns and the filter verb to subset rows. (Recall that tbl_df creates a local data frame.)

    malawi <- tbl_df(malawi_raw)
    malawi <- malawi %>%
      select(hv024, hv025, hv226, hv227, hv239, hv240, hv241, hv270, hv108_01, shdist) %>%
      filter(hv226 == 8 & hv241 %in% c(1, 2))
    malawi

    ## # A tibble: 15,661 x 10
    ##      hv024  hv025 hv226 hv227 hv239 hv240 hv241   hv270 hv108_01 shdist
    ##     <fctr> <fctr> <int> <int> <int> <int> <int>  <fctr>    <int> <fctr>
    ##  1 central  rural                                middle        0  dedza
    ##  2 central  rural                               richest       15  dedza
    ##  3 central  rural                                richer        3  dedza
    ##  4 central  rural                               richest        2  dedza
    ##  5 central  rural                                middle        3  dedza
    ##  6 central  rural                                poorer        5  dedza
    ##  7 central  rural                               poorest        3  dedza
    ##  8 central  rural                                poorer        5  dedza
    ##  9 central  rural                               richest        0  dedza
    ## 10 central  rural                                richer        3  dedza
    ## # ... with 15,651 more rows

Question 1

How many households were eliminated as a result of applying the stated inclusion criteria and how many remain in the final analytic dataset?
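A question of this kind comes down to comparing row counts before and after the subsetting step. The following is a minimal sketch on a made-up data frame, not the Malawi data: the column names `fuel`, `place`, and `other` and all values are hypothetical stand-ins chosen for illustration.

```r
library(dplyr)

# Toy stand-in for the survey data; every value here is invented.
toy_raw <- data.frame(
  fuel  = c(8, 8, 5, 8, 8, 2),   # 8 plays the role of "wood"
  place = c(1, 3, 1, 2, 1, 2),   # 1 or 2 play the role of the allowed cooking places
  other = letters[1:6]           # a column the analysis does not need
)

# Keep only the needed columns, then only the rows meeting both criteria.
toy <- toy_raw %>%
  select(fuel, place) %>%
  filter(fuel == 8 & place %in% c(1, 2))

# Rows before, rows after, and rows eliminated.
n_before  <- nrow(toy_raw)
n_after   <- nrow(toy)
n_dropped <- n_before - n_after
c(before = n_before, after = n_after, dropped = n_dropped)
```

Inside a pipe, filter can refer to columns by bare name, so there is no need to prefix them with the data frame name.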

    dim(malawi_raw)[1]
    ## [1] 24825
    dim(malawi)[1]
    ## [1] 15661
    dim(malawi_raw)[1] - dim(malawi)[1]
    ## [1] 9164

Applying the inclusion criteria eliminated 9,164 households, leaving 15,661 in the final analytic dataset.

Question 2

We are now asked to construct a variable that identifies households with an improved wood stove. We consider the stove an improved wood stove if ANY of the following criteria are met:

- Food is cooked on an open stove (hv239=2)
- Food is cooked on a closed stove with chimney (hv239=3)
- Household has a chimney (hv240=1)
- Household has a hood (hv240=2)

We consider it not an improved stove if ALL of the following criteria are met:

- Food is cooked over an open fire (hv239=1)
- Household has neither chimney nor hood (hv240=0)

We use a nested ifelse statement to construct the improved wood stove variable. Households that meet neither set of criteria are coded 9. Don't forget to exclude the missing values!

    malawi$impstove <- ifelse(malawi$hv239 == 2 | malawi$hv239 == 3 |
                                malawi$hv240 == 1 | malawi$hv240 == 2, 1,
                              ifelse(malawi$hv239 == 1 & malawi$hv240 == 0, 0, 9))
    table(malawi$impstove)

    malawi <- malawi %>% filter(impstove != 9)
    table(malawi$impstove)
    ##
    ##  0  1
    ##

Question 3

Having created the improved stove variable, we are now asked to compare households with improved stoves to those without on several characteristics:

- Mean value of wealth index (hv270)
- Percent of households with wealth index of poorest or poorer
- Percent of households from the southern region of Malawi (hv024)
- Percent of households that are urban (hv025)
- Percent of households that have a bednet for sleeping (hv227)
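Each comparison in this list follows the same pattern: group the households by stove status, then take the mean of a numeric or 0/1 variable within each group. The following is a minimal sketch of that pattern on a made-up data frame. The columns `wealth` and `urban` and all values are hypothetical; only the name `impstove` is borrowed from the problem set.

```r
library(dplyr)

# Toy data: five invented households.
toy <- data.frame(
  impstove = c(0, 0, 0, 1, 1),   # 1 = improved stove
  wealth   = c(1, 2, 3, 4, 5),   # numeric wealth score
  urban    = c(0, 0, 1, 1, 1)    # 0/1 indicator
)

# One row per stove status; the mean of a 0/1 indicator is a proportion,
# so multiplying by 100 gives a percent.
toy_summary <- toy %>%
  group_by(impstove) %>%
  summarise(mean_wealth = mean(wealth),
            pct_urban   = 100 * mean(urban))
toy_summary
```

Because the mean of a 0/1 variable is the share of ones, the same call covers both the "mean value" and the "percent of households" rows of the comparison.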

We can use the dplyr summarise verb to calculate the values for the table. Let's start by calculating the mean value of the wealth index by improved stove status. First we need to recode the raw data for hv270, which is stored as text.

    malawi$wealth_index_num <- 9
    malawi$wealth_index_num[malawi$hv270 == "poorest"] <- 1
    malawi$wealth_index_num[malawi$hv270 == "poorer"]  <- 2
    malawi$wealth_index_num[malawi$hv270 == "middle"]  <- 3
    malawi$wealth_index_num[malawi$hv270 == "richer"]  <- 4
    malawi$wealth_index_num[malawi$hv270 == "richest"] <- 5

Check to make sure it worked!

    table(malawi$wealth_index_num)

Now that the data are numeric, we can use dplyr to calculate the mean value of the wealth index by improved stove status.

    malawi %>%
      group_by(impstove) %>%
      summarise(mean_wealth = mean(wealth_index_num))
    ##   impstove mean_wealth

The next characteristic is the percent of households with wealth index of poorest or poorer. First we create a new variable, low_ses, which we code as 1 if the household belongs to one of those two categories, else 0.

    malawi$low_ses <- 9
    malawi$low_ses[malawi$hv270 == "poorest" | malawi$hv270 == "poorer"] <- 1
    malawi$low_ses[malawi$hv270 == "middle" | malawi$hv270 == "richer" |
                     malawi$hv270 == "richest"] <- 0

Then, using code similar to the above, we calculate the proportion poorest or poorer by improved stove status.

    malawi %>%
      group_by(impstove) %>%
      summarise(mean_low_ses = mean(low_ses))
    ##   impstove mean_low_ses

Next up is the percent of households from the southern region of Malawi (hv024). I first construct a new dummy variable, southern, that indicates whether the household is located in the southern region. To find the label used for that region, we inspect the levels of hv024, the variable the recode book suggests holds the region.

    levels(malawi$hv024)
    ## [1] "central"  "northern" "southern"

    malawi$southern <- ifelse(malawi$hv024 == "southern", 1, 0)

Then, I calculate the proportion that reside in the southern region by improved stove status.

    malawi %>%
      group_by(impstove) %>%
      summarise(mean_southern = mean(southern))
    ##   impstove mean_southern

Next is the percent of households that are urban (hv025).

    malawi$urban <- ifelse(malawi$hv025 == "urban", 1, 0)
    malawi %>%
      group_by(impstove) %>%
      summarise(mean_urban = mean(urban))
    ##   impstove mean_urban

The final one is the percent of households that have a bednet for sleeping (hv227). This one is simple and doesn't require any recoding.

    malawi %>%
      group_by(impstove) %>%
      summarise(mean_net = mean(hv227))
    ##   impstove mean_net

Question 5

Which district of Malawi has the highest prevalence of improved wood stoves?

    imp_stove_districts <- malawi %>%
      group_by(shdist) %>%
      summarise(mean_imp_stove = mean(impstove))
    print(imp_stove_districts, n = 30)

    ## # A tibble: 27 x 2
    ##    shdist      mean_imp_stove
    ##    <fctr>               <dbl>
    ##  1 balaka
    ##  2 blantyre
    ##  3 chikwawa
    ##  4 chiradzulu
    ##  5 chitipa
    ##  6 dedza
    ##  7 dowa
    ##  8 karonga
    ##  9 kasungu
    ## 10 lilongwe
    ## 11 machinga
    ## 12 mangochi
    ## 13 mchinji
    ## 14 mulanje
    ## 15 mwanza
    ## 16 mzimba
    ## 17 neno
    ## 18 nkhatabay
    ## 19 nkhota kota
    ## 20 nsanje
    ## 21 ntcheu
    ## 22 ntchisi
    ## 23 phalombe
    ## 24 rumphi
    ## 25 salima
    ## 26 thyolo
    ## 27 zomba

From the table above, it appears that the highest prevalence is found in Mulanje district.

Question 6

Now we are asked to restrict the districts to those that have a prevalence of improved wood stoves greater than the median and to represent this with a bar chart. We can use base R or dplyr here. The base R version below takes the column-wise proportions of the impstove-by-district table, keeps the row for impstove = 1, and then keeps only the districts above the median.

    x <- prop.table(table(malawi$impstove, malawi$shdist), 2)[2, ]
    z <- x[x > median(x)]
    barplot(z, xlab = "District", ylab = "Proportion of Improved Stoves",
            main = "Improved Stoves by District", ylim = c(0, 0.08))
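Since the text notes that either base R or dplyr would work for this step, here is a minimal dplyr sketch of the same above-the-median restriction. It runs on a made-up data frame: the district labels "a" to "d" and all values are hypothetical, and this is one possible way to write it rather than the answer key's own code.

```r
library(dplyr)

# Toy data: eight invented households in four invented districts.
toy <- data.frame(
  district = c("a", "a", "b", "b", "c", "c", "c", "d"),
  impstove = c(1, 0, 0, 0, 1, 1, 0, 0)
)

# Proportion of improved stoves per district.
by_district <- toy %>%
  group_by(district) %>%
  summarise(prop_imp = mean(impstove))

# Keep only districts strictly above the median district-level proportion.
above_median <- by_district %>%
  filter(prop_imp > median(prop_imp))

# barplot() needs plain vectors, so pull the columns out of the tibble.
barplot(above_median$prop_imp,
        names.arg = as.character(above_median$district),
        xlab = "District", ylab = "Proportion of Improved Stoves")
```

After summarise drops the single grouping variable, median(prop_imp) inside filter is computed across all districts, which matches the base R comparison x > median(x).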

[Figure: bar chart "Improved Stoves by District". x-axis: District (labels visible: balaka, lilongwe, mulanje, mzimba, nsanje, rumphi, zomba); y-axis: Proportion of Improved Stoves.]

Question 7

We are asked to generate a boxplot showing the distribution of education (in single years) of the household head, comparing households that have an improved wood stove to those that do not. We first drop records where hv108_01 is 98 or higher, so that only values representing years of schooling remain.

    malawi <- malawi %>% filter(hv108_01 < 98)
    malawi$edu <- as.numeric(malawi$hv108_01)
    boxplot(edu ~ impstove, data = malawi,
            main = "Years of school by improved stove status",
            xlab = "Improved Stove", ylab = "Education in Single Years",
            font.lab = 3, col = "darkgreen")
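To see why the special codes have to go before plotting, and to read off the numbers a boxplot draws, the following minimal sketch may help. It uses a made-up data frame (all values are hypothetical) and calls boxplot with plot = FALSE, which returns the summary statistics instead of drawing.

```r
# Toy data: eleven invented household heads; 98 stands in for a special code.
toy_raw <- data.frame(
  edu      = c(0, 2, 3, 5, 8, 98, 4, 6, 8, 10, 12),
  impstove = c(0, 0, 0, 0, 0, 0,  1, 1, 1, 1,  1)
)

# Left in place, the 98 would be treated as 98 years of schooling.
toy <- toy_raw[toy_raw$edu < 98, ]

# plot = FALSE returns the statistics without drawing anything.
# Each column of $stats is one group: lower whisker, lower hinge,
# median, upper hinge, upper whisker.
b <- boxplot(edu ~ impstove, data = toy, plot = FALSE)
b$names
b$stats
```

The third row of b$stats holds the group medians, which is usually the first thing to compare between the two boxes.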

[Figure: boxplot "Years of school by improved stove status". x-axis: Improved Stove; y-axis: Education in Single Years.]

Question 8

For question 8 we are asked to use a for loop to do something we can already do without one: get the proportion of improved stoves within each wealth status category. To do this, we iterate through the wealth status categories and calculate the proportion of improved stoves in each one individually.

    imp.wealth <- c()
    for (i in levels(malawi$hv270)) {
      imp.wealth[i] <- prop.table(table(malawi$impstove[which(malawi$hv270 == i)]))[2]
    }
    imp.wealth
    ##  middle  poorer poorest  richer richest

Now we generate a barplot, reordering the categories from poorest to richest.

    barplot(imp.wealth[c("poorest", "poorer", "middle", "richer", "richest")],
            main = "Proportion of Improved Stoves by Wealth Status, Malawi",
            xlab = "Wealth Index", ylab = "Proportion of Improved Stoves",
            ylim = c(0, 0.1))

[Figure: bar chart "Proportion of Improved Stoves by Wealth Status, Malawi". x-axis: Wealth Index (poorest, poorer, middle, richer, richest); y-axis: Proportion of Improved Stoves.]
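As question 8 points out, the loop reproduces something R can do without one. The following minimal sketch compares a loop with a single tapply call on a made-up data frame (the `wealth` column and all values are hypothetical). The loop here takes the mean of the 0/1 indicator instead of indexing a table, which is a swapped-in variant of the answer key's loop.

```r
# Toy data: six invented households in three invented wealth categories.
toy <- data.frame(
  wealth   = factor(c("poorest", "poorest", "poorer", "poorer", "middle", "middle")),
  impstove = c(0, 1, 0, 0, 1, 1)
)

# Loop version: the mean of a 0/1 indicator is the proportion of ones.
prop_loop <- c()
for (w in levels(toy$wealth)) {
  prop_loop[w] <- mean(toy$impstove[toy$wealth == w])
}

# Vectorised version: apply mean() to impstove within each wealth level.
prop_tapply <- tapply(toy$impstove, toy$wealth, mean)

prop_loop
prop_tapply
```

Taking the mean also behaves sensibly when a category contains no improved stoves: it returns 0, whereas picking the second cell of a table that holds only the value 0 would give NA.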


More information

CMPSC 390 Visual Computing Spring 2014 Bob Roos Notes on R Graphs, Part 2

CMPSC 390 Visual Computing Spring 2014 Bob Roos   Notes on R Graphs, Part 2 Notes on R Graphs, Part 2 1 CMPSC 390 Visual Computing Spring 2014 Bob Roos http://cs.allegheny.edu/~rroos/cs390s2014 Notes on R Graphs, Part 2 Bar Graphs in R So far we have looked at basic (x, y) plots

More information

Section 2-2 Frequency Distributions. Copyright 2010, 2007, 2004 Pearson Education, Inc

Section 2-2 Frequency Distributions. Copyright 2010, 2007, 2004 Pearson Education, Inc Section 2-2 Frequency Distributions Copyright 2010, 2007, 2004 Pearson Education, Inc. 2.1-1 Frequency Distribution Frequency Distribution (or Frequency Table) It shows how a data set is partitioned among

More information

Dr. Barbara Morgan Quantitative Methods

Dr. Barbara Morgan Quantitative Methods Dr. Barbara Morgan Quantitative Methods 195.650 Basic Stata This is a brief guide to using the most basic operations in Stata. Stata also has an on-line tutorial. At the initial prompt type tutorial. In

More information

Session 1: Overview of CSPro, Dictionary and Forms

Session 1: Overview of CSPro, Dictionary and Forms Session 1: Overview of CSPro, Dictionary and Forms At the end of this lesson participants will be able to: Identify different CSPro modules and tools and their roles in the survey workflow Create a simple

More information

Notes on Topology. Andrew Forrester January 28, Notation 1. 2 The Big Picture 1

Notes on Topology. Andrew Forrester January 28, Notation 1. 2 The Big Picture 1 Notes on Topology Andrew Forrester January 28, 2009 Contents 1 Notation 1 2 The Big Picture 1 3 Fundamental Concepts 2 4 Topological Spaces and Topologies 2 4.1 Topological Spaces.........................................

More information

Dual-Frame Sample Sizes (RDD and Cell) for Future Minnesota Health Access Surveys

Dual-Frame Sample Sizes (RDD and Cell) for Future Minnesota Health Access Surveys Dual-Frame Sample Sizes (RDD and Cell) for Future Minnesota Health Access Surveys Steven Pedlow 1, Kanru Xia 1, Michael Davern 1 1 NORC/University of Chicago, 55 E. Monroe Suite 2000, Chicago, IL 60603

More information

while (condition) { body_statements; for (initialization; condition; update) { body_statements;

while (condition) { body_statements; for (initialization; condition; update) { body_statements; ITEC 136 Business Programming Concepts Week 01, Part 01 Overview 1 Week 7 Overview Week 6 review Four parts to every loop Initialization Condition Body Update Pre-test loops: condition is evaluated before

More information

Programming Iterative Loops. for while

Programming Iterative Loops. for while Programming Iterative Loops for while What was an iterative loop, again? Recall this definition: Iteration is when the same procedure is repeated multiple times. Some examples were long division, the Fibonacci

More information

Лекция 4 Трансформация данных в R

Лекция 4 Трансформация данных в R Анализ данных Лекция 4 Трансформация данных в R Гедранович Ольга Брониславовна, старший преподаватель кафедры ИТ, МИУ volha.b.k@gmail.com 2 Вопросы лекции Фильтрация (filter) Сортировка (arrange) Выборка

More information

From the User Profile section of your employer account, select User Profile and enter your new password.

From the User Profile section of your employer account, select User Profile and enter your new password. Signing Into The Employer User Account On the ApplyToEducation Homepage (www.applytoeducation.com) sign in using your assigned username and password. If you forgot your username and/or password, click

More information

Lab 1: Introduction to data

Lab 1: Introduction to data Lab 1: Introduction to data Some define Statistics as the field that focuses on turning information into knowledge. The first step in that process is to summarize and describe the raw information - the

More information

2.1 Objectives. Math Chapter 2. Chapter 2. Variable. Categorical Variable EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES

2.1 Objectives. Math Chapter 2. Chapter 2. Variable. Categorical Variable EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES Chapter 2 2.1 Objectives 2.1 What Are the Types of Data? www.managementscientist.org 1. Know the definitions of a. Variable b. Categorical versus quantitative

More information

Statistical Software Camp: Introduction to R

Statistical Software Camp: Introduction to R Statistical Software Camp: Introduction to R Day 1 August 24, 2009 1 Introduction 1.1 Why Use R? ˆ Widely-used (ever-increasingly so in political science) ˆ Free ˆ Power and flexibility ˆ Graphical capabilities

More information

Preparing for Data Analysis

Preparing for Data Analysis Preparing for Data Analysis Prof. Andrew Stokes March 27, 2018 Managing your data Entering the data into a database Reading the data into a statistical computing package Checking the data for errors and

More information

Statistics 251: Statistical Methods

Statistics 251: Statistical Methods Statistics 251: Statistical Methods Summaries and Graphs in R Module R1 2018 file:///u:/documents/classes/lectures/251301/renae/markdown/master%20versions/summary_graphs.html#1 1/14 Summary Statistics

More information

Data Science & . June 14, 2018

Data Science &  . June 14, 2018 Data Science & Email June 14, 2018 Attention. Source: OPTE Project How do you build repeat audience attention? EMAIL It s the best way to: own your audience cultivate relationships through repeatable

More information

TRANSANA and Chapter 8 Retrieval

TRANSANA and Chapter 8 Retrieval TRANSANA and Chapter 8 Retrieval Chapter 8 in Using Software for Qualitative Research focuses on retrieval a crucial aspect of qualitatively coding data. Yet there are many aspects of this which lead to

More information

ICSSR Data Service Indian Social Science Data Repository R : User Guide Indian Council of Social Science Research

ICSSR Data Service Indian Social Science Data Repository R : User Guide Indian Council of Social Science Research http://www.icssrdataservice.in/ ICSSR Data Service Indian Social Science Data Repository R : User Guide Indian Council of Social Science Research ICSSR Data Service Contents 1. Introduction 1 2. Installation

More information

Quick introduction to descriptive statistics and graphs in. R Commander. Written by: Robin Beaumont

Quick introduction to descriptive statistics and graphs in. R Commander. Written by: Robin Beaumont Quick introduction to descriptive statistics and graphs in R Commander Written by: Robin Beaumont e-mail: robin@organplayers.co.uk http://www.robin-beaumont.co.uk/virtualclassroom/stats/course1.html Date

More information

Dr. V. Alhanaqtah. Econometrics. Graded assignment

Dr. V. Alhanaqtah. Econometrics. Graded assignment LABORATORY ASSIGNMENT 4 (R). SURVEY: DATA PROCESSING The first step in econometric process is to summarize and describe the raw information - the data. In this lab, you will gain insight into public health

More information

Chapter 6: Modifying and Combining Data Sets

Chapter 6: Modifying and Combining Data Sets Chapter 6: Modifying and Combining Data Sets The SET statement is a powerful statement in the DATA step. Its main use is to read in a previously created SAS data set which can be modified and saved as

More information

Data Mining. 3.3 Rule-Based Classification. Fall Instructor: Dr. Masoud Yaghini. Rule-Based Classification

Data Mining. 3.3 Rule-Based Classification. Fall Instructor: Dr. Masoud Yaghini. Rule-Based Classification Data Mining 3.3 Fall 2008 Instructor: Dr. Masoud Yaghini Outline Using IF-THEN Rules for Classification Rules With Exceptions Rule Extraction from a Decision Tree 1R Algorithm Sequential Covering Algorithms

More information

Example how not to do it: JMP in a nutshell 1 HR, 17 Apr Subject Gender Condition Turn Reactiontime. A1 male filler

Example how not to do it: JMP in a nutshell 1 HR, 17 Apr Subject Gender Condition Turn Reactiontime. A1 male filler JMP in a nutshell 1 HR, 17 Apr 2018 The software JMP Pro 14 is installed on the Macs of the Phonetics Institute. Private versions can be bought from

More information