8.3 simulating from the fitted model Chris Parrish July 3, 2016
Contents

- speed of light (Simon Newcomb, 1882)
  - simulate data, fit the model, and check the coverage of the conf intervals
  - model
  - fit
  - post
  - create the replicated data
  - create fake data
  - figure
- roaches
  - data
  - model
  - fit
  - post
  - figure
  - model
  - fit
  - post
  - switch to glm
  - figure

simulating from the fitted model

reference:
- ARM chapter 8, github

    library(arm)      # for sim
    library(rstan)
    rstan_options(auto_write = TRUE)
    options(mc.cores = parallel::detectCores())
    library(ggplot2)
    library(reshape2) # for melt

speed of light (Simon Newcomb, 1882)

algorithm for model checking: compare the distribution of the data y to the distributions of simulated ỹ.

1. data: y
2. use the data to fit the model: y with parameters β and σ; find β and σ such that y = Xβ + ɛ
3. use the model to generate n hypothetical values of ỹ: ỹ = Xβ + ɛ
4. replicate step 3 n.sims times; the result is a matrix with dims (n.sims, n)
5. compare the distribution of the real data y with the distributions of the simulated ỹ

check the model: if the distributions of the simulated ỹ do not correspond to the distribution of the original data y, then the model is suspect

simulate data, fit the model, and check the coverage of the conf intervals

    source("lightspeed.data.r", echo = TRUE)
    ##
    ## > "y" <- c(28, 26, 33, 24, 34, -44, 27, 16, 40, -2,
    ## + 29, 22, 24, 21, 25, 30, 23, 29, 31, 19, 24, 20, 36, 32, 36,
    ## + 28, 25, 21, 28, 29, 37,... [TRUNCATED]
    ##
    ## > "N" <- 66
    str(y)
    ##  num [1:66] ...

model

lightspeed.stan

    data {
      int<lower=0> N;
      vector[N] y;
    }
    parameters {
      vector[1] beta;
      real<lower=0> sigma;
    }
    model {
      y ~ normal(beta[1], sigma);
    }

fit

    ## Model fit (lightspeed.stan)
    ## lm (y ~ 1)
    datalist. <- c("N","y")
    lightspeed.sf <- stan(file='lightspeed.stan', data=datalist., iter=, chains=)
    plot(lightspeed.sf)
    ## ci_level: 0.8 (80% intervals)
    ## outer_level: 0.95 (95% intervals)
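The five-step checking algorithm listed earlier in this section can also be sketched without Stan. This is only an illustration, not the author's code: it assumes the standard noninformative prior p(β, σ²) ∝ 1/σ², under which the posterior of an intercept-only normal model can be sampled directly, and it uses made-up stand-in data rather than Newcomb's measurements. Every name ending in `.demo` is hypothetical.

```r
set.seed(1)

# 1. stand-in data (hypothetical; not Newcomb's measurements)
y.demo <- rnorm(66, mean = 26, sd = 10)
n.demo <- length(y.demo)
n.sims.demo <- 1000

# 2. fit the model y = beta + error: direct posterior draws, assuming
#    the prior p(beta, sigma^2) proportional to 1/sigma^2
#    sigma^2 | y      ~ (n - 1) * s^2 / chi-square(n - 1)
#    beta | sigma, y  ~ normal(mean(y), sigma / sqrt(n))
sigma.demo <- sqrt((n.demo - 1) * var(y.demo) /
                   rchisq(n.sims.demo, df = n.demo - 1))
beta.demo <- rnorm(n.sims.demo, mean = mean(y.demo),
                   sd = sigma.demo / sqrt(n.demo))

# 3. and 4. one replicated dataset of size n per posterior draw
y.rep.demo <- array(NA, c(n.sims.demo, n.demo))
for (s in 1:n.sims.demo) {
  y.rep.demo[s, ] <- rnorm(n.demo, beta.demo[s], sigma.demo[s])
}

# 5. compare the data with the replications, here through their minima
min(y.demo)
summary(apply(y.rep.demo, 1, min))
```

Each row of `y.rep.demo` plays the role of one ỹ; the Stan fit in this document supplies the draws of β and σ that step 2 samples directly here.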
[Figure: plot(lightspeed.sf), interval estimates for beta[1] and sigma]

    pairs(lightspeed.sf)

[Figure: pairs plot of beta[1], sigma, and lp__]

    print(lightspeed.sf)
    ## Inference for Stan model: lightspeed.
    ## [chain settings and summary table for beta[1], sigma, and lp__ with columns
    ##  mean, se_mean, sd, 2.5%, 25%, 50%, 75%, 97.5%, n_eff, Rhat]
    ##
    ## Samples were drawn using NUTS(diag_e) at Fri Jul 8 ... 2016.
    ## For each parameter, n_eff is a crude measure of effective sample size,
    ## and Rhat is the potential scale reduction factor on split chains (at
    ## convergence, Rhat=1).
    ## The estimated Bayesian Fraction of Missing Information is a measure of
    ## the efficiency of the sampler with values close to 1 being ideal.
    ## For each chain, these estimates are
    ## ...

post

    post <- extract(lightspeed.sf)
    str(post)
    ## List of 3
    ##  $ beta : num [1:..., 1] ...
    ##   ..- attr(*, "dimnames")=List of 2
    ##   .. ..$ iterations: NULL
    ##   .. ..$           : NULL
    ##  $ sigma: num [1:...(1d)] ...
    ##   ..- attr(*, "dimnames")=List of 1
    ##   .. ..$ iterations: NULL
    ##  $ lp__ : num [1:...(1d)] ...
    ##   ..- attr(*, "dimnames")=List of 1
    ##   .. ..$ iterations: NULL

create the replicated data

    ## Create the replicated data
    n.sims <-

create fake data

    ## Create fake data
    n <- 15
    y.rep <- array (NA, c(n.sims, n))
    for (s in 1:n.sims){
      y.rep[s,] <- rnorm (n, post$beta[s], post$sigma[s])
    }
    str(y.rep)
    ##  num [1:..., 1:15] ...

    ## Histogram of replicated data (Figure 8.4)
    y.new <- melt(y.rep)
    y.new$Var2 <- factor(y.new$Var2,
                         levels=as.character(1:15),
                         labels=paste0('Replication #', 1:15))
    str(y.new)
    ## 'data.frame': ... obs. of 3 variables:
    ##  $ Var1 : int ...
    ##  $ Var2 : Factor w/ 15 levels "Replication #1",..: ...
    ##  $ value: num ...
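`melt` turns the n.sims by n matrix into a long data frame with one row per matrix cell, which is the layout `facet_wrap` needs. A rough base-R equivalent, shown on a small hypothetical matrix `m.demo`, makes that layout explicit; the column names mirror those in the `str(y.new)` output, and everything else here is illustrative.

```r
# small hypothetical matrix: 3 simulations (rows) by 2 replications (columns)
m.demo <- matrix(c(10, 20, 30, 40, 50, 60), nrow = 3, ncol = 2)

# long format: one row per matrix cell, filled column by column
long.demo <- data.frame(
  Var1  = as.vector(row(m.demo)),  # row index of each cell
  Var2  = as.vector(col(m.demo)),  # column index of each cell
  value = as.vector(m.demo)
)

# label the column index the way the histogram code does
long.demo$Var2 <- factor(long.demo$Var2,
                         levels = as.character(1:2),
                         labels = paste0("Replication #", 1:2))
print(long.demo)
```

Because the facet variable is the column index, each histogram panel collects one column of `y.rep` across all simulations.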
    p <- ggplot(y.new, aes(value)) +
      geom_histogram(colour = "seashell", fill = "wheat", binwidth=5) +
      theme_gray() +
      facet_wrap( ~ Var2, ncol=5) +
      theme(axis.title.y = element_blank(), axis.title.x=element_blank())
    print(p)

[Figure: histograms of Replication #1 through Replication #15, five panels per row]

    ## Write a function to make histograms with specified bin widths and ranges
    Hist.preset <- function (a, width, xtitle, ytitle, maintitle){
      # dev.new()
      a.hi <- max (a, na.rm=TRUE)
      a.lo <- min (a, na.rm=TRUE)
      if (is.null(width)) width <- min (sqrt(a.hi-a.lo), 1e-5)
      bin.hi <- width*ceiling(a.hi/width)
      bin.lo <- width*floor(a.lo/width)
      frame = data.frame(x=a)
      p <- ggplot(frame,aes(x=x)) +
        geom_histogram(colour = "seashell", fill = "wheat", binwidth=width) +
        theme_gray() +
        scale_x_continuous(xtitle) +
        scale_y_continuous(ytitle) +
        labs(title=maintitle)
      print(p)
    }

    ## Run the function
    for (s in 1:20){
      Hist.preset (y.rep[s,], width=5, "", "", paste("Replication #",s,sep=""))
    }

[Figures: one histogram per call, titled Replication #1 through Replication #20]
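`Hist.preset` computes `bin.hi` and `bin.lo` by snapping the data range outward to multiples of the bin width, although the function as shown does not go on to use them. A short sketch with hypothetical values shows what that snapping produces and how the resulting edges could be laid out.

```r
# hypothetical values and bin width
a.demo <- c(-44, 16, 28, 40)
width.demo <- 5

# snap the range outward to multiples of the bin width
bin.hi.demo <- width.demo * ceiling(max(a.demo) / width.demo)
bin.lo.demo <- width.demo * floor(min(a.demo) / width.demo)

# evenly spaced bin edges covering all the data
edges.demo <- seq(bin.lo.demo, bin.hi.demo, by = width.demo)
range(edges.demo)
```

Using `floor` for the lower edge matters for negative values: the lower edge has to move away from zero, not toward it, to stay below the smallest observation.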
    ## Numerical test
    Test <- function (y){
      min (y)
    }
    test.rep <- rep (NA, n.sims)
    for (s in 1:n.sims){
      test.rep[s] <- Test (y.rep[s,])
    }
    str(test.rep)
    ##  num [1:...] ...

figure 8.5

    ## Histogram Figure 8.5
    # dev.new()
    frame.rep = data.frame(x = test.rep)    # T(y.rep) for each replication
    frame.obs <- data.frame(x = Test(y))    # observed T(y)
    p <- ggplot(frame.rep, aes(x = x)) +
      geom_histogram(colour = "seashell", fill = "wheat") +
      geom_segment(aes(x = x, y =, xend = x, yend =, color = "saddlebrown"), data = frame.obs) +
      theme_gray() +
      theme(legend.position="none") +
      labs(title="Observed T(y) and distribution of T(y.rep)")
    print(p)
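The numerical test reduces to a single number: the share of replications whose test statistic exceeds the observed one. The sketch below uses hypothetical stand-in objects (data with one planted low outlier, and replications drawn from a single fixed normal distribution rather than from posterior draws) and uses `apply` in place of the explicit loop.

```r
set.seed(2)

# hypothetical data: 65 well-behaved values plus one low outlier
y.obs.demo <- c(rnorm(65, mean = 26, sd = 5), -44)

# hypothetical replications: each row is one simulated dataset
# (a real check would draw each row from its own posterior draw)
n.sims.demo <- 1000
y.rep.demo <- matrix(rnorm(n.sims.demo * 66,
                           mean = mean(y.obs.demo), sd = sd(y.obs.demo)),
                     nrow = n.sims.demo)

Test <- function(y) min(y)

# T(y.rep) for every replication, without a for loop
test.rep.demo <- apply(y.rep.demo, 1, Test)

# share of replications whose minimum exceeds the observed minimum;
# a value near 0 or 1 suggests the model does not reproduce this feature
p.demo <- mean(test.rep.demo > Test(y.obs.demo))
p.demo
```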
[Figure 8.5: Observed T(y) and distribution of T(y.rep); histogram of count against x]

roaches

data

    ## Read the cleaned data
    # All data are at
    # if bad initial values, this model fails
    # NOTE: can't find same exact data set as ARM book uses..
    roachdata <- read.csv ("roachdata.csv")
    str(roachdata)
    ## 'data.frame': 262 obs. of 6 variables:
    ##  $ X        : int ...
    ##  $ y        : int ...
    ##  $ roach1   : num ...
    ##  $ treatment: int ...
    ##  $ senior   : int ...
    ##  $ exposure2: num ...
    attach(roachdata)
    ## The following object is masked _by_ .GlobalEnv:
    ##
    ##     y
model

roaches.stan

    data {
      int<lower=0> N;
      vector[N] exposure2;
      vector[N] roach1;
      vector[N] senior;
      vector[N] treatment;
      int y[N];
    }
    transformed data {
      vector[N] log_expo;
      log_expo = log(exposure2);
    }
    parameters {
      vector[4] beta;
    }
    model {
      y ~ poisson_log(log_expo + beta[1] + beta[2] * roach1
                      + beta[3] * treatment + beta[4] * senior);
    }

fit

    datalist. <- list(N=length(roachdata$y), y=roachdata$y, roach1=roachdata$roach1,
                      treatment=roachdata$treatment, exposure2=roachdata$exposure2,
                      senior=roachdata$senior)
    roaches.sf <- stan(file='roaches.stan', data=datalist., iter=5, chains=)
    print(roaches.sf)
    ## Inference for Stan model: roaches.
    ## [chain settings and summary table for beta[1] through beta[4] and lp__ with
    ##  columns mean, se_mean, sd, 2.5%, 25%, 50%, 75%, 97.5%, n_eff, Rhat]
    ##
    ## Samples were drawn using NUTS(diag_e) at Fri Jul 8 ... 2016.
    ## For each parameter, n_eff is a crude measure of effective sample size,
    ## and Rhat is the potential scale reduction factor on split chains (at
    ## convergence, Rhat=1).
    ## The estimated Bayesian Fraction of Missing Information is a measure of
    ## the efficiency of the sampler with values close to 1 being ideal.
    ## For each chain, these estimates are
    ## ...

post

    post <- extract(roaches.sf)

    ## Comparing the data to a replicated dataset
    n <- length(roachdata$y)
    X <- cbind (rep(1,n), roach1, treatment, senior)
    y.hat <- exposure2 * exp (X %*% colMeans(post$beta))
    y.rep <- rpois (n, y.hat)
    print (mean (roachdata$y==0))
    ## [1] ...
    print (mean (y.rep==0))
    ## [1] ...

    ## Comparing the data to replicated datasets
    n.sims <-
    y.rep <- array (NA, c(n.sims, n))
    for (s in 1:n.sims){
      y.hat <- exposure2 * exp (X %*% post$beta[s,])
      y.rep[s,] <- rpois (n, y.hat)
    }

    # test statistic: proportion of zeros
    Test <- function (y){
      mean (y==0)
    }
    test.rep <- rep (NA, n.sims)
    for (s in 1:n.sims){
      test.rep[s] <- Test (y.rep[s,])
    }

    # p-value
    print (mean (test.rep > Test(roachdata$y)))
    ## [1] ...

figure

    ## Histogram Figure
    # dev.new()
    frame = data.frame(x = test.rep)
    frame5 = data.frame(x = Test(roachdata$y))
    p <- ggplot(frame, aes(x=x)) +
      geom_histogram(colour = "seashell", fill = "wheat") +
      geom_segment(aes(x = x, y =, xend = x, yend = 5, color = "saddlebrown"), data = frame5) +
      theme_gray() +
      theme(legend.position="none") +
      labs(title="Observed T(y) and distribution of T(y.rep)")
    print(p)
    ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

[Figure: Observed T(y) and distribution of T(y.rep); histogram of count against x]

T(y) = 0.36, but all the values of test.rep are much smaller.

    summary(test.rep)
    ##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    ##    ...

model

roaches_overdispersion.stan

    data {
      int<lower=0> N;
      vector[N] exposure2;
      vector[N] roach1;
      vector[N] senior;
      vector[N] treatment;
      int y[N];
    }
    transformed data {
      vector[N] log_expo;
      log_expo = log(exposure2);
    }
    parameters {
      vector[4] beta;
      vector[N] lambda;
      real<lower=0> tau;
    }
    transformed parameters {
      real<lower=0> sigma;
      sigma = 1.0 / sqrt(tau);
    }
    model {
      tau ~ gamma(.,.);
      for (i in 1:N) {
        lambda[i] ~ normal(0, sigma);
        y[i] ~ poisson_log(lambda[i] + log_expo[i] + beta[1] + beta[2]*roach1[i]
                           + beta[3]*senior[i] + beta[4]*treatment[i]);
      }
    }

fit

    ## Checking the overdispersed model
    # NOTE: can't find same exact data set as ARM book uses..
    roaches_overdispersion.sf <- stan(file='roaches_overdispersion.stan',
                                      data=datalist., iter=, chains=)
    # print(roaches_overdispersion.sf)

post

    post <- extract(roaches_overdispersion.sf)

switch to glm

    glm. <- glm(y ~ roach1 + treatment + senior, data = roachdata,
                family=quasipoisson, offset=log(exposure2))
    sim. <- sim(glm., n.sims)

    # replicated datasets
    y.rep <- array (NA, c(n.sims, n))
    overdisp <- summary(glm.)$dispersion
    for (s in 1:n.sims){
      y.hat <- exposure2 * exp (X %*% sim.@coef[s,])
      a <- y.hat/(overdisp-1)              # using R's parametrization for the
      y.rep[s,] <- rnegbin (n, y.hat, a)   # negative binomial distribution
    }
    test.rep <- rep (NA, n.sims)
    for (s in 1:n.sims){
      test.rep[s] <- Test (y.rep[s,])
    }

compare each value of test.rep with the number Test(roachdata$y)

    # p-value
    summary(test.rep)
    ##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    ##    ...
    print (mean (test.rep > Test(roachdata$y)))
    ## [1] .68
    Test(roachdata$y)
    ## [1] ...

figure

    ## Histogram Figure
    # dev.new()
    frame = data.frame(x = test.rep)
    frame5 = data.frame(x = Test(roachdata$y))
    p5 <- ggplot(frame, aes(x=x)) +
      geom_histogram(colour = "seashell", fill = "wheat") +
      geom_segment(aes(x = x, y =, xend = x, yend =, color = "saddlebrown"), data = frame5) +
      theme_gray() +
      theme(legend.position="none") +
      labs(title="Observed T(y) and distribution of T(y.rep)")
    print(p5)
    ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

[Figure: Observed T(y) and distribution of T(y.rep); histogram of count against x]
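The size parameter `a` in the negative binomial draws follows from the variance formula. With mean μ and size a the variance is μ + μ²/a, so choosing a = μ/(overdisp − 1) gives a variance of μ · overdisp, the same variance the quasi-Poisson fit implies. The sketch below checks this numerically and also compares the probability of a zero count under the Poisson and negative binomial models with the same mean. It uses base R's `rnbinom` and `dnbinom` in place of `rnegbin`, and the mean and overdispersion values are hypothetical, not taken from the roach fit.

```r
set.seed(4)

mu.demo       <- 20   # hypothetical expected count
overdisp.demo <- 5    # hypothetical overdispersion factor

# size parameter chosen so that variance = mu * overdisp
a.demo <- mu.demo / (overdisp.demo - 1)

# rnbinom with (size, mu) has variance mu + mu^2 / size
x.demo <- rnbinom(100000, size = a.demo, mu = mu.demo)

# empirical variance-to-mean ratio; should be close to overdisp.demo
ratio.demo <- var(x.demo) / mean(x.demo)

# probability of a zero count under each model with the same mean
zero.pois.demo   <- dpois(0, mu.demo)
zero.negbin.demo <- dnbinom(0, size = a.demo, mu = mu.demo)

c(ratio = ratio.demo, zero.pois = zero.pois.demo, zero.negbin = zero.negbin.demo)
```

For the same mean, the negative binomial puts far more probability on zero than the Poisson does, which is why the proportion-of-zeros test statistic is a natural check for these two models.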
Type Package Version 1.1.2 Package ANOVAreplication September 30, 2017 Title Test ANOVA Replications by Means of the Prior Predictive p- Author M. A. J. Zondervan-Zwijnenburg Maintainer M. A. J. Zondervan-Zwijnenburg
More informationPackage simex. R topics documented: September 7, Type Package Version 1.7 Date Imports stats, graphics Suggests mgcv, nlme, MASS
Type Package Version 1.7 Date 2016-03-25 Imports stats, graphics Suggests mgcv, nlme, MASS Package simex September 7, 2017 Title SIMEX- And MCSIMEX-Algorithm for Measurement Error Models Author Wolfgang
More informationMissing Data Analysis for the Employee Dataset
Missing Data Analysis for the Employee Dataset 67% of the observations have missing values! Modeling Setup For our analysis goals we would like to do: Y X N (X, 2 I) and then interpret the coefficients
More informationTest Run to Check the Installation of JAGS & Rjags
John Miyamoto File = D:\bugs\test.jags.install.docm 1 Test Run to Check the Installation of JAGS & Rjags The following annotated code is extracted from John Kruschke's R scripts, "E:\btut\r\BernBetaBugsFull.R"
More informationSupplementary tutorial for A Practical Guide and Power Analysis for GLMMs: Detecting Among Treatment Variation in Random Effects using R
Supplementary tutorial for A Practical Guide and Power Analysis for GLMMs: Detecting Among Treatment Variation in Random Effects using R Kain, Morgan, Ben M. Bolker, and Michael W. McCoy Introduction The
More informationIntroduction to R and the tidyverse. Paolo Crosetto
Introduction to R and the tidyverse Paolo Crosetto Lecture 1: plotting Before we start: Rstudio Interactive console Object explorer Script window Plot window Before we start: R concatenate: c() assign:
More informationInstall RStudio from - use the standard installation.
Session 1: Reading in Data Before you begin: Install RStudio from http://www.rstudio.com/ide/download/ - use the standard installation. Go to the course website; http://faculty.washington.edu/kenrice/rintro/
More informationUsing Machine Learning to Optimize Storage Systems
Using Machine Learning to Optimize Storage Systems Dr. Kiran Gunnam 1 Outline 1. Overview 2. Building Flash Models using Logistic Regression. 3. Storage Object classification 4. Storage Allocation recommendation
More informationYEAR 12 Core 1 & 2 Maths Curriculum (A Level Year 1)
YEAR 12 Core 1 & 2 Maths Curriculum (A Level Year 1) Algebra and Functions Quadratic Functions Equations & Inequalities Binomial Expansion Sketching Curves Coordinate Geometry Radian Measures Sine and
More informationPackage DPBBM. September 29, 2016
Type Package Title Dirichlet Process Beta-Binomial Mixture Version 0.2.5 Date 2016-09-21 Author Lin Zhang Package DPBBM September 29, 2016 Maintainer Lin Zhang Depends R (>= 3.1.0)
More informationComputing With R Handout 1
Computing With R Handout 1 Getting Into R To access the R language (free software), go to a computing lab that has R installed, or a computer on which you have downloaded R from one of the distribution
More informationSpatio-temporal Under-five Mortality Methods for Estimation
Spatio-temporal Under-five Mortality Methods for Estimation Load Package and Data DemoData contains model survey data provided by DHS. Note that this data is fake, and does not represent any real country
More informationToday s Lecture. Factors & Sampling. Quick Review of Last Week s Computational Concepts. Numbers we Understand. 1. A little bit about Factors
Today s Lecture Factors & Sampling Jarrett Byrnes September 8, 2014 1. A little bit about Factors 2. Sampling 3. Describing your sample Quick Review of Last Week s Computational Concepts Numbers we Understand
More informationStatsMate. User Guide
StatsMate User Guide Overview StatsMate is an easy-to-use powerful statistical calculator. It has been featured by Apple on Apps For Learning Math in the App Stores around the world. StatsMate comes with
More informationSimulating power in practice
Simulating power in practice Author: Nicholas G Reich This material is part of the statsteachr project Made available under the Creative Commons Attribution-ShareAlike 3.0 Unported License: http://creativecommons.org/licenses/by-sa/3.0/deed.en
More informationWeek 1 R Warm-Ups for Finance
Week 1 R Warm-Ups for Finance Copyright 2016, William G. Foote. All rights reserved. Copyright 2016, William G. Foote. All rights reserved. Week 1 R Warm-Ups for Finance 1 / 97 Imagine this... You work
More informationWork through the sheet in any order you like. Skip the starred (*) bits in the first instance, unless you re fairly confident.
CDT R Review Sheet Work through the sheet in any order you like. Skip the starred (*) bits in the first instance, unless you re fairly confident. 1. Vectors (a) Generate 100 standard normal random variables,
More information