Risk Management Using R, Summer Semester 2013
1. Problem (vectors and factors)

a) Create a vector containing the numbers 1 to 10. In this vector, replace all numbers greater than 4 with 5.
b) Create a sequence of length 5 starting at 0 with an increment of 2.5.
c) Create a sequence from 1 to 5 and repeat each element 4 times.
d) Create a vector called x with the letters "a", "b", "c", "d" and "e". The vector should be of length 30, with the frequencies chosen randomly by the sample function. Remove all "e" letters from x. The functions which, == or != may be useful.
e) Make the x vector a factor, i.e., use as.factor. Remove the "d" level completely. This means that levels(x) should give you "a" "b" "c" and not "a" "b" "c" "d". You may want to consult the help page of the factor function. Rename the levels "a", "b" and "c" to "A", "B" and "C". Merge the levels "A" and "B" into a single category called "one" and rename level "C" to "two".

Solution

## a)
a <- 1:10 # seq(1,10)
a[a>4] <- 5
a
## [1] 1 2 3 4 5 5 5 5 5 5
## b)
seq(0, length=5, by=2.5)
## [1]  0.0  2.5  5.0  7.5 10.0
## c)
rep(1:5, each=4)
## [1] 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 5 5
## d)
set.seed(123)
x <- sample(letters[1:5], size=30, replace=TRUE)
x <- x[x!="e"] # x[-which(x=='e')]
x
## [1] "b" "d" "c" "a" "c" "c" "c" "c" "d" "c" "a" "b" "a" "b" "d" "d" "d"
## [18] "d" "c" "c" "b" "a"
## e)
xf <- as.factor(x)
xf <- xf[xf!="d"]
levels(xf) # level 'd' is still there

Nikolay Robinzonov, 14th June 2013
## [1] "a" "b" "c" "d"
xf <- xf[,drop=TRUE]
levels(xf) # level 'd' is gone
## [1] "a" "b" "c"
xf <- factor(xf, labels=c("A","B","C"))
xf
## [1] B C A C C C C C A B A B C C B A
## Levels: A B C
## one option with the 'car' package
library("car")
recode(xf, "c('A','B')='one'; else='two'")
## [1] one two one two two two two two one one one one two two one one
## Levels: one two
## or alternatively
xx <- list(one=c("A","B"), two=c("C"))
levels(xf) <- xx
xf
## [1] one two one two two two two two one one one one two two one one
## Levels: one two

2. Problem (matrices)

a) Sample 9 times from the normal distribution (rnorm) with mean 10 and variance 4. Call this vector x. Make your results reproducible, e.g., consider the set.seed function.
b) Create a matrix X from x with 3 columns.
c) Switch the first two columns in X and calculate the column and row averages (mean), as well as the standard deviations (sd). Make use of the apply function.
d) Using the functions lower.tri, upper.tri and diag, assign the following values to your X matrix:

$$X = \begin{pmatrix} 2 & 3 & 3 \\ 1 & 2 & 3 \\ 1 & 1 & 2 \end{pmatrix}$$

e) Compute $X' + X$ and $X + 10 I_3$. Can you also compute $(X'X)^{-1}$ and $(X + X')^{-1}$, i.e., use the solve function? Do you encounter any problems, and what would be the statistician's solution?

Solution

## a)
set.seed(123)
x <- rnorm(9, mean=10, sd=2)
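Because rnorm() takes the standard deviation rather than the variance, a variance of 4 translates to sd = 2. A minimal sketch of a simulation check follows; the sample size and the object name z are arbitrary choices for illustration and not part of the exercise.

```r
# rnorm() is parameterised by the standard deviation, not the variance.
set.seed(123)
z <- rnorm(1e5, mean = 10, sd = sqrt(4))  # variance 4 corresponds to sd = 2
mean(z)  # should be close to 10
var(z)   # should be close to 4
```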
## b)
X <- matrix(x, ncol=3)
## c)
X <- X[,c(2,1,3)] # column switch
colMeans(X)
## [1]
apply(X, 2, mean)
## [1]
rowMeans(X)
## [1]
apply(X, 1, mean)
## [1]
apply(X, 1, sd)
## [1]
apply(X, 2, sd)
## [1]
## d)
X[lower.tri(X)] <- 1
X[upper.tri(X)] <- 3
diag(X) <- 2
X
##      [,1] [,2] [,3]
## [1,]    2    3    3
## [2,]    1    2    3
## [3,]    1    1    2
## e)
X + t(X)
##      [,1] [,2] [,3]
## [1,]    4    4    4
## [2,]    4    4    4
## [3,]    4    4    4
X + 10 * diag(3)
##      [,1] [,2] [,3]
## [1,]   12    3    3
## [2,]    1   12    3
## [3,]    1    1   12
solve(t(X) %*% X)
##      [,1] [,2] [,3]
## [1,]
## [2,]
## [3,]
solve(X + t(X)) # X + t(X) is not invertible (singular)
## Error: Lapack routine dgesv: system is exactly singular: U[2,2] = 0
XX <- X + t(X) + diag(0.01, nrow(X))
solve(XX)
##      [,1] [,2] [,3]
## [1,]
## [2,]
## [3,]

3. Problem (dates & times)

(a) Make a sequence x of dates starting at January 1, 2013 and ending 90 days later. Extract all Fridays from x.
(b) Find the date lying exactly 10,000 days in the past. Use Sys.time() for the current date.
(c)
i) Load the Rweek.mat file using the readMat() function from the R.matlab package.
ii) The file contains weekly (Thursday to Thursday) percentage net returns of the S&P 500, FTSE, and DAX indices over the period from January 1984 to August [year missing]. Assign the data to a data.frame called rweek, name the columns appropriately and add an additional column indicating the dates. Get the returns in 2004 only.
(d)
i) Load the daxmin.dat data containing DAX minute observations in the time interval March 20-27, [year missing]. Use the paste and strptime functions to obtain a vector with the date/time information. Create a data.frame called daxini containing the date/time vector and the minute levels of the DAX. Plot the time series.
ii) Using the function aggregate, obtain the highest and the lowest daily observations. Do the same with the cast function from the reshape package. Afterwards, obtain the highest DAX levels per hour for each day.
iii) Compute the squared percentage log-returns and plot the median values per minute. Try to make the figure resemble Figure 1 as closely as possible.

Solution

## a)
x <- seq(from=as.Date("2013-01-01"), length=90, by="day")
x[weekdays(x) == "Friday"]
## [1] "2013-01-04" "2013-01-11" "2013-01-18" "2013-01-25" "2013-02-01"
## [6] "2013-02-08" "2013-02-15" "2013-02-22" "2013-03-01" "2013-03-08"
## [11] "2013-03-15" "2013-03-22" "2013-03-29"
## b)
as.Date(Sys.time())
## [1] " "
as.Date(Sys.time() - 10000 * 60 * 60 * 24)
## [1] " "

Figure 1: Median value per minute of the squared log-returns of the German DAX for the period March 20-27, [year missing]. (x-axis: time of day, 09:30 to 17:00; annotation: "NYSE opens".)

## c)
library(R.matlab, quietly=TRUE)
datei <- readMat("Rweek.mat")
rweek <- data.frame(datei$rweek)
names(rweek) <- c("sp500", "ftse", "dax")
## add a "time" column
time <- seq(as.Date(" "), by="week", length=nrow(rweek)) # start date elided in the source
rweek$time <- time
## get the 2004 returns
ind <- format(rweek$time, "%Y") == "2004"
## rweek[ind,]
## d)
daxini <- read.table(file="daxmin.dat", header = FALSE)
x <- paste(daxini[,1], daxini[,2])
z <- strptime(x, format = "%Y%m%d %H:%M")
dax <- data.frame(datetime = z, val = daxini[,3])
plot(dax$datetime, dax$val, type="l") # overnight jumps
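The paste and strptime step can be tried on a few made-up rows first. This is only a sketch: the three rows below are hypothetical and merely assume the column layout implied by the format string above (date as YYYYMMDD, time as HH:MM, then the index level). The tz = "UTC" argument is an addition here, so that the result does not depend on the local time zone.

```r
# Hypothetical rows mimicking the assumed layout of daxmin.dat
raw <- data.frame(V1 = c(20200106, 20200106, 20200107),
                  V2 = c("09:00", "09:01", "09:00"),
                  V3 = c(5000.1, 5000.4, 4990.7))

# Glue date and time together, then parse with the same format string
stamp <- strptime(paste(raw$V1, raw$V2), format = "%Y%m%d %H:%M", tz = "UTC")

# Store as POSIXct inside the data.frame
toy <- data.frame(datetime = as.POSIXct(stamp), val = raw$V3)
format(toy$datetime, "%Y-%m-%d %H:%M")
```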
[Plot: dax$val against dax$datetime, Mar 21 to Mar 27]

plot(1:nrow(dax), dax$val, type="l") # fix gaps

[Plot: dax$val against 1:nrow(dax)]

## min and max values per day
dax$day <- as.Date(dax$datetime)
dax$day <- as.factor(dax$day)
## tapply(dax$val, dax$day, max)
## aggregate(list(max=dax$val), list(date=dax$day), max)
aggregate(list(Rng=dax$val), list(Date=dax$day), range)
## Date Rng.1 Rng.2
library(reshape)
cast(dax, day ~ ., value = "val", range)
## day X1 X2
## per hour per day
dax$hour <- factor(format(dax$datetime, "%H"))
cast(dax, hour ~ day, value = "val", max)
## hour
## aggregate(list(max=dax$val), list(hour=dax$hour, Date=dax$day), max)
## tapply(dax$val, list(dax$hour, dax$day), max)
## log-returns
dax$ret <- c(NA, diff(log(dax$val))) * 100
dax$sqret <- dax$ret^2 # squared log-returns
dax$min <- format(dax$datetime, "%H:%M")
mmin <- cast(dax, min ~ ., value = "sqret", median) # median level per minute
plot(1:nrow(mmin), mmin[,2], type="l", axes=FALSE, xlab="", ylab="median of the squared log-returns")
axis(2)
ii <- grep("30$|00$", mmin[,1]) # tick marks at full and half hours
axis(1, at=(1:nrow(mmin))[ii], labels=mmin[ii,1])
nyx <- which(mmin[,1]=="15:30") # x-coord.
nyy <- quantile(na.omit(mmin[,2]), 0.999) # y-coord.
abline(v=nyx, col=2, lty=2)
text(nyx, nyy, "NYSE opens")

[Plot: median of the squared log-returns per minute, 09:30 to 17:00, with a dashed vertical line labelled "NYSE opens" at 15:30]
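The cast() calls above rely on the reshape package. As a sketch, the same per-minute median can also be obtained with base R's tapply(). The data below are simulated (one hypothetical trading hour on each of two arbitrary dates), since the original file is not reproduced here; the names sim and med are illustrative.

```r
set.seed(1)
start <- as.POSIXct("2020-01-06 09:00", tz = "UTC")

# 60 one-minute steps on each of two consecutive days (offsets in seconds)
times <- start + c(0:59, 1440 + 0:59) * 60

# Simulated price levels; purely illustrative, not DAX data
sim <- data.frame(datetime = times,
                  val = 5000 * exp(cumsum(rnorm(length(times), sd = 1e-4))))

sim$ret   <- c(NA, diff(log(sim$val))) * 100   # percentage log-returns
sim$sqret <- sim$ret^2                         # squared log-returns
sim$min   <- format(sim$datetime, "%H:%M")     # minute of the day

# Median squared return for each minute, taken across days
med <- tapply(sim$sqret, sim$min, median, na.rm = TRUE)
head(med)
```

The result is a named vector indexed by minute, which can be passed to plot() in the same way as the second column of mmin.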
4. Problem (functions)

(a) Create an empty list mylist and a numeric vector x using

set.seed(1234)
mylist <- list()
x <- sample(1:500, size=20)

Using a for-loop, assign five new elements to mylist which contain selected values from x according to the following rule: the first element contains all $x_i \in [1, 100]$, the second element contains all $x_i \in [101, 200]$, and so on until the fifth element, which contains all $x_i \in [401, 500]$. Sort the values of each element in ascending order.

(b) Write a function for computing the density of the normal distribution and compare it to the standard dnorm function. The density of the normal distribution is defined as

$$f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}. \qquad (1)$$

(c) The function embed is very useful for time series analysis since it returns the lagged values of a multivariate time series. Try for example embed(cbind(1:10,101:110), 3) or embed(1:10, 4) to understand what it does. Write another function, say embed2, which returns the lagged values of a multivariate time series up to a given lag length p. This new function should assign informative column names indicating the names of the time series and the respective lags. When applied to the rweek data set

head(rweek)
## sp500 ftse dax

the output should look similar to

embed2(rweek, resp="dax", lag=2)
## dax sp500.L1 sp500.L2 ftse.L1 ftse.L2 dax.L1 dax.L2

(d) Using the rweek data set, the previous function embed2, and the function lm, fit the following model

$$y_{\mathrm{DAX},t} = \beta_0 + \beta_1 y_{\mathrm{DAX},t-1} + \beta_2 y_{\mathrm{DAX},t-2} + \beta_3 y_{\mathrm{sp500},t-1} + \beta_4 y_{\mathrm{sp500},t-2} + \beta_5 y_{\mathrm{ftse},t-1} + \beta_6 y_{\mathrm{ftse},t-2} + \varepsilon_t, \qquad (2)$$

where $\varepsilon_t \sim N(0, \sigma^2)$.
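To see what embed does before writing embed2, it helps to check the layout on the two suggested inputs. A short sketch (the object names e1 and e2 are illustrative): each row holds the current value followed by its lags, and for a matrix input the columns are grouped by lag rather than by series.

```r
# Univariate case: 10 observations, dimension 4 (current value plus 3 lags)
e1 <- embed(1:10, 4)
dim(e1)    # 7 rows, 4 columns
e1[1, ]    # 4 3 2 1: the value at t = 4, followed by lags 1 to 3

# Bivariate case: two series, dimension 3 (current value plus 2 lags)
e2 <- embed(cbind(1:10, 101:110), 3)
dim(e2)    # 8 rows, 6 columns
e2[1, ]    # 3 103 2 102 1 101: lag 0 of both series, then lag 1, then lag 2
```

Since the matrix case interleaves the series, one way to get per-series column names is to call embed on each column separately and bind the results afterwards.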
Solution

## a)
for(i in 1:5){
  z <- x[x > (i-1) * 100 & x <= i * 100]
  mylist[[i]] <- sort(z)
}
## b)
mynorm <- function(x, mu=0, sigma=1)
  1/sqrt(2*pi*sigma^2) * exp(-(x-mu)^2/(2*sigma^2))
mynorm(0.3)
## [1] 0.3813878
dnorm(0.3)
## [1] 0.3813878
## c)
embed2 <- function(y, resp, lag = 1){
  Names <- colnames(y)
  P <- lag + 1
  res <- list()
  for(z in Names){
    x <- as.matrix(y[,z, drop=FALSE])
    xlagged <- embed(x, P)
    colnames(xlagged) <- c(z, paste(z, 1:lag, sep = ".L"))
    if(z == resp) yresp <- xlagged[, 1, drop=FALSE] # unlagged response
    xlagged <- xlagged[,2:P, drop=FALSE] # keep the lags only
    res[[z]] <- xlagged
  }
  res <- do.call(cbind, res)
  res <- cbind(yresp,res)
  if(is.data.frame(y)) res <- as.data.frame(res)
  res
}
rweek <- rweek[,-4] # drop the time column
round(head(embed2(rweek, resp="dax", lag=2)),2)
## dax sp500.L1 sp500.L2 ftse.L1 ftse.L2 dax.L1 dax.L2
## d)
dat <- embed2(rweek, resp="dax", lag=2)
fit <- lm(dax ~ ., data=dat)
summary(fit)
##
## Call:
## lm(formula = dax ~ ., data = dat)
##
## Residuals:
## Min 1Q Median 3Q Max
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) **
## sp500.L1 *
## sp500.L2
## ftse.L1
## ftse.L2
## dax.L1
## dax.L2
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.98 on 1118 degrees of freedom
## Multiple R-squared: , Adjusted R-squared:
## F-statistic: 1.11 on 6 and 1118 DF, p-value:
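The p-value of the overall F-test is missing from the output above. Taking the reported F-statistic (1.11, rounded) and the degrees of freedom (6 and 1118) at face value, it can be approximated with pf(). This is a sketch based on the rounded statistic, not the value printed in the original summary; the names f_stat and p_val are illustrative.

```r
# Upper-tail probability of the F distribution for the reported statistic
f_stat <- 1.11
p_val <- pf(f_stat, df1 = 6, df2 = 1118, lower.tail = FALSE)
p_val
```

If the reported statistic is accurate, a p-value well above 0.05 would indicate that the six lagged regressors are not jointly significant at conventional levels.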
More informationRegression Lab 1. The data set cholesterol.txt available on your thumb drive contains the following variables:
Regression Lab The data set cholesterol.txt available on your thumb drive contains the following variables: Field Descriptions ID: Subject ID sex: Sex: 0 = male, = female age: Age in years chol: Serum
More informationStat 4510/7510 Homework 4
Stat 45/75 1/7. Stat 45/75 Homework 4 Instructions: Please list your name and student number clearly. In order to receive credit for a problem, your solution must show sufficient details so that the grader
More informationddhazard Diagnostics Benjamin Christoffersen
ddhazard Diagnostics Benjamin Christoffersen 2017-11-25 Introduction This vignette will show examples of how the residuals and hatvalues functions can be used for an object returned by ddhazard. See vignette("ddhazard",
More informationThe nor1mix Package. June 12, 2007
The nor1mix Package June 12, 2007 Title Normal (1-d) Mixture Models (S3 Classes and Methods) Version 1.0-7 Date 2007-03-15 Author Martin Mächler Maintainer Martin Maechler
More informationSTATS PAD USER MANUAL
STATS PAD USER MANUAL For Version 2.0 Manual Version 2.0 1 Table of Contents Basic Navigation! 3 Settings! 7 Entering Data! 7 Sharing Data! 8 Managing Files! 10 Running Tests! 11 Interpreting Output! 11
More informationLab #7 - More on Regression in R Econ 224 September 18th, 2018
Lab #7 - More on Regression in R Econ 224 September 18th, 2018 Robust Standard Errors Your reading assignment from Chapter 3 of ISL briefly discussed two ways that the standard regression inference formulas
More informationR Graphics. SCS Short Course March 14, 2008
R Graphics SCS Short Course March 14, 2008 Archeology Archeological expedition Basic graphics easy and flexible Lattice (trellis) graphics powerful but less flexible Rgl nice 3d but challenging Tons of
More informationStat 528 (Autumn 2008) Density Curves and the Normal Distribution. Measures of center and spread. Features of the normal distribution
Stat 528 (Autumn 2008) Density Curves and the Normal Distribution Reading: Section 1.3 Density curves An example: GRE scores Measures of center and spread The normal distribution Features of the normal
More informationPackage mnormt. April 2, 2015
Package mnormt April 2, 2015 Version 1.5-2 Date 2015-04-02 Title The Multivariate Normal and t Distributions Author Fortran code by Alan Genz, R code by Adelchi Azzalini Maintainer Adelchi Azzalini
More informationTHE UNIVERSITY OF BRITISH COLUMBIA FORESTRY 430 and 533. Time: 50 minutes 40 Marks FRST Marks FRST 533 (extra questions)
THE UNIVERSITY OF BRITISH COLUMBIA FORESTRY 430 and 533 MIDTERM EXAMINATION: October 14, 2005 Instructor: Val LeMay Time: 50 minutes 40 Marks FRST 430 50 Marks FRST 533 (extra questions) This examination
More informationExtremely short introduction to R Jean-Yves Sgro Feb 20, 2018
Extremely short introduction to R Jean-Yves Sgro Feb 20, 2018 Contents 1 Suggested ahead activities 1 2 Introduction to R 2 2.1 Learning Objectives......................................... 2 3 Starting
More informationLibPAS Graphs Table Trend/PI Trend Period Comparison PI Gap Graph/PI Summary Graph
LibPAS Graphs Graphic drill-downs are available in the Table, Trend/PI, Trend, Period Comparison, and PI Gap Report Types. Graph/PI and Summary Graph Report Types were designed specifically as graph reports.
More informationSYS 6021 Linear Statistical Models
SYS 6021 Linear Statistical Models Project 2 Spam Filters Jinghe Zhang Summary The spambase data and time indexed counts of spams and hams are studied to develop accurate spam filters. Static models are
More information