Intermediate Programming in R Session 1: Data. Olivia Lau, PhD



Outline
- About Me
- About You
- Course Overview and Logistics
- R Data Types
- R Data Structures
- Importing Data
- Recoding Data

About Me
- Using and programming in R for over 10 years
- Working in high tech; previously worked in the federal sector and academia
- Expertise in:
  - Linear and general linear models
  - Survival and hazard rate models
  - Multi-level models
  - Experimentation and causal inference
- Taught A Crash Course in R Programming at the 2010 and 2012 useR! conferences
- For more information, see

About You
- Have taken Introduction to R or have equivalent experience
  - Familiar with R data types
  - Familiar with R data structures
  - Comfortable typing at the command line
- Take a moment to introduce yourselves via the Meet and Greet on the course website
  - What is your background?
  - What do you do?
  - What do you want to get out of the course?

Course Overview and Logistics
- 4 modules
  - Data (and review)
  - Loops
  - Functions
  - Avoiding loops
- Self-paced, so please feel free to pause, rewind, and review
- I will answer questions twice per day, once in the early morning and once in the early evening, Pacific time. Students are encouraged to reply to questions as well.
- Note: Throughout the slides, R code will look like this

Setting up your work environment
- Install R version 2.14 or greater
  - If you have R installed, you can check the version with R.Version()$version.string
  - If your R version number is less than 2.14, you must install the latest version
  - Windows users: Make sure you install to C:\Program Files
- Install the R editor of your choice (Word, Notepad, and TextEdit are not sufficient)
  - Emacs with ESS
  - Vim with Vim-R-plugin
  - TinnR
  - Notepad++
  - Eclipse with StatET

Some Reminders
- R is case-sensitive
- R for Windows uses / (forward slash) instead of \ (back slash) in file paths
- Ctrl-C kills the R command being executed
- If you get a syntax error, check your commas
- If you get stuck, try
  - args(command) to see the inputs to the command function
  - help(command) for detailed help on the command function
  - ls() to see the contents of your workspace
  - names(object) or str(object) to see the contents of data frames and lists
  - Do classes match up as they should? Check with class(object)
- Final reminders:
  - getwd() and setwd() to identify and set your working directory
  - save() to save your workspace or specific objects
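As a quick illustration of the inspection helpers listed above, the following sketch runs them on a small throwaway list. The object name `demo` and its contents are made up for this example and do not come from the slides.

```r
# A small throwaway object to inspect (hypothetical example data)
demo <- list(id = 1:3, label = c("a", "b", "c"))

args(read.csv)     # show the arguments that read.csv() accepts
ls()               # objects currently in the workspace
names(demo)        # element names of the list
str(demo)          # compact view of the structure
class(demo$id)     # class of a single element
getwd()            # current working directory
```

Running str() first is often the fastest way to spot a column that was read in with an unexpected class.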

Check In 1
What are the arguments to the read.table command?
Answer
    > args(read.table)
    function (file, header = FALSE, sep = "", quote = "\"'", dec = ".",
        row.names, col.names, as.is = !stringsAsFactors, na.strings = "NA",
        colClasses = NA, nrows = -1, skip = 0, check.names = TRUE,
        fill = !blank.lines.skip, strip.white = FALSE, blank.lines.skip = TRUE,
        comment.char = "#", allowEscapes = FALSE, flush = FALSE,
        stringsAsFactors = default.stringsAsFactors(), fileEncoding = "",
        encoding = "unknown", text)


R Data Types: Atomic or scalar units
- Smallest building blocks in the R language
- All units have a class attribute
  - Numeric (or integer)
  - Character
  - Logical (TRUE or FALSE)
  - Date (either as Date [without time] or a full POSIXct time stamp)
  - Additional specialized classes can be defined

    foo <- 25
    class(foo)
    foobar <- "super"
    class(foobar)
    sotrue <- TRUE
    class(sotrue)

Special classes: Dates and Factors
- A factor is a categorical value
  - Levels are usually represented as character strings, but may also be numeric
  - Can be unordered (nominal) values, or ordered values
- Time is represented in two ways
  - Dates with day and optionally time zone values, e.g.,
        as.Date( , format = "%Y-%m-%d")
  - Time stamps with date, time in hours, minutes, and seconds, and optionally time zone attributes
        as.POSIXct( :30:00, format = "%Y-%m-%d %H:%M:%S")
  - POSIXct stores time stamps as the number of seconds since January 1, 1970
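The factor and time classes described above can be sketched as follows. The category labels and the date 2012-06-15 are invented for illustration, and the time zone is fixed to UTC (an assumption made here so the result does not depend on the local machine).

```r
# Unordered (nominal) factor
fuel <- factor(c("gas", "diesel", "gas"))
levels(fuel)                 # levels are sorted alphabetically by default

# Ordered factor with explicitly ranked levels
size <- factor(c("small", "large", "medium"),
               levels = c("small", "medium", "large"), ordered = TRUE)
size[1] < size[2]            # comparisons respect the level order

# A Date without time, and a full time stamp (hypothetical values)
d  <- as.Date("2012-06-15", format = "%Y-%m-%d")
ts <- as.POSIXct("2012-06-15 08:30:00",
                 format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

as.numeric(d)                # days since 1970-01-01
as.numeric(ts)               # seconds since 1970-01-01 00:00:00 UTC
```

Because both classes are stored as plain numbers, arithmetic such as `d + 7` or `ts + 3600` works directly.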

Check In 2
Create a timestamp for one hour in the future. Do not hardcode the date and time, but use the system function Sys.time()
Answer
    Sys.time() + (60 * 60)

R Data Types: Homogeneous Data Structures
- A homogeneous data structure contains scalars all of the same class
- These data structures are indexed with square brackets [], with commas separating dimensions
- Vector: One dimension
      foo.v <- c(2, 3, 4, 5)
      names(foo.v) <- c("eeny", "meeny", "miny", "moe")
- Matrix: Two dimensions (first is always row, second is always column)
      foo.m <- matrix(1:20, nrow = 4, ncol = 5)
      rownames(foo.m) <- c("A", "B", "C", "D")
      colnames(foo.m) <- c("E", "F", "G", "H", "I")
- Array: K dimensions
      foo.a <- array(1:30, dim = c(2, 5, 3),
                     dimnames = list(c("r1", "r2"), NULL, c("z1", "z2", "z3")))
      dim(foo.a)
- Hit pause, and take a moment to create the data structures foo.v, foo.m, and foo.a

Check In 3
Extract the element named eeny from foo.v
Answer
    foo.v["eeny"]
Extract row 3 from the matrix foo.m
Answer
    foo.m[3, ]
Extract the matrix associated with column 4 of foo.a
Answer
    foo.a[, 4, ]
The result is a matrix with 2 rows (the first dimension) and 3 columns (the third dimension).
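The three extractions in this check-in can be run end to end with the sketch below. It rebuilds foo.v, foo.m, and foo.a so that it is self-contained; the name-based lookup and the `drop = FALSE` variant are additions for illustration rather than part of the slide's answers.

```r
foo.v <- c(2, 3, 4, 5)
names(foo.v) <- c("eeny", "meeny", "miny", "moe")
foo.m <- matrix(1:20, nrow = 4, ncol = 5,
                dimnames = list(c("A", "B", "C", "D"),
                                c("E", "F", "G", "H", "I")))
foo.a <- array(1:30, dim = c(2, 5, 3),
               dimnames = list(c("r1", "r2"), NULL, c("z1", "z2", "z3")))

foo.v["eeny"]                # named element: 2
foo.m[3, ]                   # row "C": 3 7 11 15 19
foo.m["C", "G"]              # same row by name, one column: 11
foo.a[, 4, ]                 # 2 x 3 matrix holding 7 8 / 17 18 / 27 28 by column
foo.a[, 4, , drop = FALSE]   # keeps all three dimensions (2 x 1 x 3)
```

R fills matrices and arrays column by column, which is why row 3 of foo.m steps by 4 and each slice of foo.a starts 10 values after the previous one.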

R Data Types: Heterogeneous Data Structures: Lists
- Most general type: The list
- Can contain any type of data structure: scalars, vectors, matrices, arrays, other lists, etc.
- Has names and length attributes
- Come in two flavors: S3 and S4
  - Use $ or [[ ]] to extract elements from S3 lists
  - Use @ to extract elements from S4 lists

    foo.l <- list(vec = foo.v, mat = foo.m, arr = foo.a)
    foo.l$vec
    foo.l$vec["eeny"]
    foo.l[[3]][2, , ]
    attributes(foo.l)
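The two extraction styles can be compared side by side in a minimal sketch. The S4 class name `Reading` and its slots are hypothetical, defined here only to have something to extract from.

```r
# S3-style list: extract with $ or [[ ]]
foo.l <- list(vec = c(eeny = 2, meeny = 3), mat = matrix(1:4, nrow = 2))
foo.l$vec["eeny"]      # element by name, then value by name
foo.l[["mat"]][2, ]    # second row of the matrix element
foo.l["vec"]           # single brackets return a list of length 1

# S4 object: define a class with typed slots, extract with @
setClass("Reading", representation(label = "character", value = "numeric"))
r <- new("Reading", label = "temp", value = 21.5)
r@value                # slot access
slotNames(r)           # names of the available slots
```

The difference between `[` and `[[` matters in practice: `foo.l["vec"]` is still a list, so passing it to a function that expects a numeric vector will fail.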

R Data Types: Data frames
- A data frame is a list in which all of the elements have the same length
- Data frames use S3 methods of extraction

    library(MASS)
    data(Cars93)
    names(Cars93)
    str(Cars93)
    dim(Cars93)
    summary(Cars93)
    head(Cars93)
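To make the "list of equal-length elements" description concrete, here is a small sketch on a made-up data frame (the columns `id` and `score` are hypothetical), showing that list-style and matrix-style extraction reach the same value.

```r
df <- data.frame(id = 1:3, score = c(90.5, 82, 77.25))  # hypothetical data

is.list(df)           # a data frame is a list of columns
length(df)            # number of columns (list elements)
sapply(df, length)    # every column has the same length
df$score              # list-style extraction of a column
df[["score"]][2]      # second value of the score column
df[2, "score"]        # matrix-style extraction of the same value
```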

Check In 4
In the Cars93 data set from the MASS library, identify the first 10 values in Weight
Answer
    library(MASS)
    data(Cars93)
    Cars93$Weight[1:10]

Importing Data
- Text files, space delimited
      worldbank <- read.table("worldbank.txt", header = TRUE)
- Text files, tab delimited
      worldbank <- read.table("worldbank.tab", header = TRUE, sep = "\t")
- Text files, comma delimited
      worldbank <- read.csv("worldbank.csv")
- Text files, fixed width: read.fwf()
- If reading a text file takes a long time,
  - Pre-specify column classes using the colClasses argument for text files
  - Alternatively, use scan()
- SAS, STATA, SPSS, and other foreign file types can be imported using the foreign library
      library(foreign)
      worldbank <- read.dta("worldbank.dta")  # For STATA files
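The colClasses advice above can be tried without the course data. The sketch below writes a tiny CSV to a temporary file first; the file contents and the column names `country`, `year`, and `gdp` are invented for illustration and are not the actual worldbank file.

```r
# Write a small hypothetical CSV so the example is self-contained
tmp <- tempfile(fileext = ".csv")
writeLines(c("country,year,gdp",
             "A,2002,1.5",
             "B,2003,2.25"), tmp)

# Default import: R inspects the data to guess each column's class
wb1 <- read.csv(tmp)

# Pre-specified classes: R skips the guessing step
wb2 <- read.csv(tmp, colClasses = c("character", "integer", "numeric"))
sapply(wb2, class)

# read.csv() is a wrapper around read.table() with header and sep preset
wb3 <- read.table(tmp, header = TRUE, sep = ",",
                  colClasses = c("character", "integer", "numeric"))
all.equal(wb2, wb3)

unlink(tmp)  # remove the temporary file
```

On a file this small the speed difference is invisible; the benefit of colClasses shows up on files with many rows, and it also documents the classes you expect.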

Importing Data: Sanity Checks
- Check the number of rows and columns with dim(), nrow(), or ncol()
- Check for variable names using names(); assign names if necessary
- Check for missing values with
      apply(worldbank, 2, function(x) sum(is.na(x)))
  - Do you have the right number of missing values for each variable?
  - Were missing values coded in the original data (e.g., -99 = missing)? If so, use
        read.*(..., na.strings = c( , , "-99"))
- Check variable classes; recode if necessary
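A minimal sketch of these checks on an invented file follows. The data, the choice of -99 as the missing-value code, and the exact na.strings vector are assumptions for illustration. colSums(is.na(x)) is used as a shorter equivalent of the apply() call above.

```r
# Hypothetical raw file in which -99 and empty fields both mean "missing"
tmp <- tempfile(fileext = ".csv")
writeLines(c("country,year,gdp",
             "A,2002,1.5",
             "B,2003,-99",
             "C,,3.0"), tmp)

raw   <- read.csv(tmp)                                   # -99 stays a number
clean <- read.csv(tmp, na.strings = c("", "NA", "-99"))  # -99 becomes NA

dim(clean)                # rows and columns
names(clean)              # variable names
colSums(is.na(raw))       # missing values per column before recoding
colSums(is.na(clean))     # missing values per column after recoding
sapply(clean, class)      # variable classes

unlink(tmp)
```

Comparing the two missing-value counts shows why the check matters: without na.strings, the -99 would silently enter any mean or regression as a real GDP value.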

Recoding Data: Change Variable Classes
- Never attach() a data frame
- By default (in R versions before 4.0.0), R coerces character strings (including dates) to factors
  - To override for all character variables, use read.*(..., as.is = TRUE)
  - If some character fields are factors and others character, read.*() as normal, then recode to the correct class
- From factor to character
      worldbank$yearcode <- as.character(worldbank$yearcode)
- From factor to numeric
      worldbank$year <- as.numeric(as.character(worldbank$year))
- From character to ordered factor
      worldbank$yearcode <- factor(worldbank$yearcode,
                                   levels = paste0("YR", 2002:2011), ordered = TRUE)
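The reason the factor-to-numeric conversion goes through as.character() is easiest to see on a small example. The values below are made up; only the conversion pattern comes from the slide.

```r
year.f <- factor(c("2005", "2002", "2011"))   # hypothetical factor of years

as.numeric(year.f)                 # internal level codes: 2 1 3 (not the years)
as.numeric(as.character(year.f))   # the actual values: 2005 2002 2011

# Character to ordered factor with an explicit level order
yearcode <- c("YR2003", "YR2002", "YR2011")
yearcode.f <- factor(yearcode, levels = paste0("YR", 2002:2011), ordered = TRUE)
yearcode.f[1] > yearcode.f[2]      # TRUE: YR2003 comes after YR2002
```

Calling as.numeric() directly on a factor returns the position of each value among the sorted levels, which produces plausible-looking but wrong numbers.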

Recoding Data: Change Variable Names
- Check existing variables with names(worldbank)
- Rename variables in two steps
  - Create the new variable
        worldbank$year.factor <- as.factor(worldbank$yearcode)
  - Remove the old variable
        worldbank$yearcode <- NULL
- R gives no error message if you are overwriting an existing variable
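The two-step rename can be sketched on a toy data frame as below. The data and the new name `gdp.usd` are hypothetical, and the in-place replacement via names() is an alternative shown for comparison, not the approach taken on the slide.

```r
wb <- data.frame(yearcode = c("YR2002", "YR2003"),
                 gdp = c(1.5, 2.25))          # hypothetical data

# Two-step approach: create the new variable, then remove the old one
wb$year.factor <- as.factor(wb$yearcode)
wb$yearcode <- NULL
names(wb)                                     # "gdp" "year.factor"

# Alternative when only the name changes: replace it in place
names(wb)[names(wb) == "gdp"] <- "gdp.usd"
names(wb)                                     # "gdp.usd" "year.factor"
```

Checking names() after each step is a cheap guard against the silent overwrite mentioned above.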

Recoding Data: Subsets
- Identify rows, columns, or vector positions using
  - A logical vector of the same dimension as the object
  - A numeric vector with the dimension index of the object
  - A character vector with the element names (row names, column names, etc.) of the object
  - Any combination of the above three
- Extract identified positions and save them as new objects in the workspace
      yr2002 <- worldbank[worldbank$year == 2002, ]
      ck2002 <- worldbank[which(worldbank$year == 2002), ]
      identical(yr2002, ck2002)
- Replace identified positions with new values
      worldbank$before2005 <- 0
      worldbank[worldbank$year < 2005, "before2005"] <- 1
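Whether the logical and which() versions agree depends on missing values in the indexing variable. The sketch below uses an invented data frame with one missing year to show the difference; the data are hypothetical.

```r
wb <- data.frame(country = c("A", "B", "C", "D"),
                 year    = c(2002, 2004, NA, 2002))   # hypothetical data

yr2002 <- wb[wb$year == 2002, ]          # logical index: the NA yields an all-NA row
ck2002 <- wb[which(wb$year == 2002), ]   # which() keeps only the TRUE positions
nrow(yr2002)                             # 3
nrow(ck2002)                             # 2
identical(yr2002, ck2002)                # FALSE here because year contains an NA

# Replace identified positions; which() keeps the NA row out of the index
wb$before2005 <- 0
wb[which(wb$year < 2005), "before2005"] <- 1
wb$before2005                            # 1 1 0 1
```

When the indexing variable has no missing values, the two forms select the same rows, which is what the identical() check on the slide confirms.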

Recoding Data: Merging data
- Both data frames to be merged should already be R objects in the workspace
- R creates a primary key by looking for identical variable names in dataset x and dataset y
  - Check that the variable names are what you expect before joining
  - If no common variables are found, R will perform a combinatoric expansion (every row of x paired with every row of y), resulting in very large data sets
- R supports 4 standard types of joins using one command: merge()
  - Inner join (default, unless there are no common variables)
        merge(x, y)
  - Outer join
        merge(x, y, all = TRUE)
  - Left join
        merge(x, y, all.x = TRUE)
  - Right join
        merge(x, y, all.y = TRUE)
- R's equivalent of SQL's UNION ALL
      rbind(x, y)
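The four joins and the no-common-variable case can be compared on two tiny data frames. All data below are invented for illustration; the explicit `by = "id"` call is an added safeguard, not something the slide prescribes.

```r
x <- data.frame(id = c(1, 2, 3), gdp = c(1.5, 2.25, 3.0))   # hypothetical
y <- data.frame(id = c(2, 3, 4), pop = c(10, 20, 30))       # hypothetical

intersect(names(x), names(y))        # the key merge() will use: "id"

inner <- merge(x, y)                 # ids 2, 3
outer <- merge(x, y, all = TRUE)     # ids 1, 2, 3, 4
left  <- merge(x, y, all.x = TRUE)   # ids 1, 2, 3
right <- merge(x, y, all.y = TRUE)   # ids 2, 3, 4

# Naming the key explicitly guards against accidental matches on other columns
inner2 <- merge(x, y, by = "id")

# No shared names: merge() pairs every row of x with every row of z
z <- data.frame(code = c("a", "b", "c"))
nrow(merge(x, z))                    # 3 x 3 = 9 rows

# rbind() stacks rows and requires matching column names
nrow(rbind(x, x))                    # 6
```

Rows without a match in the outer, left, and right joins get NA in the columns that come from the other data frame.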

Assignment
- Introduce yourself on the class discussion board
- Reading for this week
  - From the course text, Paul Teetor's R Cookbook:
    - Chapters 1-2
    - Chapter 4, Sections 7-10 only
    - Chapter 5 (stop at the beginning of Section 5.1 on p. 101)
  - R help pages for:
    - which
    - merge
- Problem set as assigned


Lab 4 CSE 7, Spring 2018 This lab is an introduction to using logical and comparison operators in Matlab. LEARNING OBJECTIVES: Lab 4 CSE 7, Spring 2018 This lab is an introduction to using logical and comparison operators in Matlab 1 Use comparison operators (< > = == ~=) between two scalar values to create

More information

R Short Course Session 1

R Short Course Session 1 R Short Course Session 1 Daniel Zhao, PhD Sixia Chen, PhD Department of Biostatistics and Epidemiology College of Public Health, OUHSC 10/23/2015 Outline Overview of the 5 sessions Pre-requisite requirements

More information

Chapter 7 File Access. Chapter Table of Contents

Chapter 7 File Access. Chapter Table of Contents Chapter 7 File Access Chapter Table of Contents OVERVIEW...105 REFERRING TO AN EXTERNAL FILE...105 TypesofExternalFiles...106 READING FROM AN EXTERNAL FILE...107 UsingtheINFILEStatement...107 UsingtheINPUTStatement...108

More information

Lecture 3. Homework Review and Recoding I R Teaching Team. September 5, 2018

Lecture 3. Homework Review and Recoding I R Teaching Team. September 5, 2018 Lecture 3 Homework Review and Recoding I 2018 R Teaching Team September 5, 2018 Acknowledgements 1. Mike Fliss & Sara Levintow! 2. stackoverflow (particularly user David for lecture styling - link) 3.

More information

SPSS TRAINING SPSS VIEWS

SPSS TRAINING SPSS VIEWS SPSS TRAINING SPSS VIEWS Dataset Data file Data View o Full data set, structured same as excel (variable = column name, row = record) Variable View o Provides details for each variable (column in Data

More information

Stat 579: Objects in R Vectors

Stat 579: Objects in R Vectors Stat 579: Objects in R Vectors Ranjan Maitra 2220 Snedecor Hall Department of Statistics Iowa State University. Phone: 515-294-7757 maitra@iastate.edu, 1/23 Logical Vectors I R allows manipulation of logical

More information

Data Input/Output. Introduction to R for Public Health Researchers

Data Input/Output. Introduction to R for Public Health Researchers Data Input/Output Introduction to R for Public Health Researchers Common new user mistakes we have seen 1. Working directory problems: trying to read files that R can t find RStudio can help, and so do

More information

Description/History Objects/Language Description Commonly Used Basic Functions. More Specific Functionality Further Resources

Description/History Objects/Language Description Commonly Used Basic Functions. More Specific Functionality Further Resources R Outline Description/History Objects/Language Description Commonly Used Basic Functions Basic Stats and distributions I/O Plotting Programming More Specific Functionality Further Resources www.r-project.org

More information

Unix tutorial. Thanks to Michael Wood-Vasey (UPitt) and Beth Willman (Haverford) for providing Unix tutorials on which this is based.

Unix tutorial. Thanks to Michael Wood-Vasey (UPitt) and Beth Willman (Haverford) for providing Unix tutorials on which this is based. Unix tutorial Thanks to Michael Wood-Vasey (UPitt) and Beth Willman (Haverford) for providing Unix tutorials on which this is based. Terminal windows You will use terminal windows to enter and execute

More information

Introduction to R. Le Yan HPC User LSU

Introduction to R. Le Yan HPC User LSU Introduction to R Le Yan HPC User Services @ LSU 3/18/2015 HPC training series Spring 2015 The History of R R is a dialect of the S language S was initiated at the Bell Labs as an internal statistical

More information

Surviving SPSS.

Surviving SPSS. Surviving SPSS http://dataservices.gmu.edu/workshops/spss http://dataservices.gmu.edu/software/spss Debby Kermer George Mason University Libraries Data Services Research Consultant Mason Data Services

More information

An Introduction to R- Programming

An Introduction to R- Programming An Introduction to R- Programming Hadeel Alkofide, Msc, PhD NOT a biostatistician or R expert just simply an R user Some slides were adapted from lectures by Angie Mae Rodday MSc, PhD at Tufts University

More information

The statistical software R

The statistical software R The statistical software R Luca Frigau University of Cagliari Ph.D. course Quantitative Methods A.A. 2017/2018 1 / 75 Outline 1 R and its history 2 Logic and objects Data acquisition Object structure and

More information

The goal of this handout is to allow you to install R on a Windows-based PC and to deal with some of the issues that can (will) come up.

The goal of this handout is to allow you to install R on a Windows-based PC and to deal with some of the issues that can (will) come up. Fall 2010 Handout on Using R Page: 1 The goal of this handout is to allow you to install R on a Windows-based PC and to deal with some of the issues that can (will) come up. 1. Installing R First off,

More information

Mobile MOUSe MTA DATABASE ADMINISTRATOR FUNDAMENTALS ONLINE COURSE OUTLINE

Mobile MOUSe MTA DATABASE ADMINISTRATOR FUNDAMENTALS ONLINE COURSE OUTLINE Mobile MOUSe MTA DATABASE ADMINISTRATOR FUNDAMENTALS ONLINE COURSE OUTLINE COURSE TITLE MTA DATABASE ADMINISTRATOR FUNDAMENTALS COURSE DURATION 10 Hour(s) of Self-Paced Interactive Training COURSE OVERVIEW

More information

User Guide. Data Preparation R-1.1

User Guide. Data Preparation R-1.1 User Guide Data Preparation R-1.1 Contents 1. About this Guide... 4 1.1. Document History... 4 1.2. Overview... 4 1.3. Target Audience... 4 2. Introduction... 4 2.1. Introducing the Big Data BizViz Data

More information

Reading and writing data

Reading and writing data An introduction to WS 2017/2018 Reading and writing data Dr. Noémie Becker Dr. Sonja Grath Special thanks to: Prof. Dr. Martin Hutzenthaler and Dr. Benedikt Holtmann for significant contributions to course

More information

Package SIRItoGTFS. May 21, 2018

Package SIRItoGTFS. May 21, 2018 Package SIRItoGTFS May 21, 2018 Type Package Title Compare SIRI Datasets to GTFS Tables Version 0.2.4 Date 2018-05-21 Allows the user to compare SIRI (Service Interface for Real Time Information) data

More information

Introduction to R. Daniel Berglund. 9 November 2017

Introduction to R. Daniel Berglund. 9 November 2017 Introduction to R Daniel Berglund 9 November 2017 1 / 15 R R is available at the KTH computers If you want to install it yourself it is available at https://cran.r-project.org/ Rstudio an IDE for R is

More information

seq(), seq_len(), min(), max(), length(), range(), any(), all() Comparison operators: <, <=, >, >=, ==,!= Logical operators: &&,,!

seq(), seq_len(), min(), max(), length(), range(), any(), all() Comparison operators: <, <=, >, >=, ==,!= Logical operators: &&,,! LECTURE 3: DATA STRUCTURES IN R (contd) STAT598z: Intro. to computing for statistics Vinayak Rao Department of Statistics, Purdue University SOME USEFUL R FUNCTIONS seq(), seq_len(), min(), max(), length(),

More information

Statistics for Biologists: Practicals

Statistics for Biologists: Practicals Statistics for Biologists: Practicals Peter Stoll University of Basel HS 2012 Peter Stoll (University of Basel) Statistics for Biologists: Practicals HS 2012 1 / 22 Outline Getting started Essentials of

More information

Module 1: Introduction RStudio

Module 1: Introduction RStudio Module 1: Introduction RStudio Contents Page(s) Installing R and RStudio Software for Social Network Analysis 1-2 Introduction to R Language/ Syntax 3 Welcome to RStudio 4-14 A. The 4 Panes 5 B. Calculator

More information

User Guide. Data Preparation R-1.0

User Guide. Data Preparation R-1.0 User Guide Data Preparation R-1.0 Contents 1. About this Guide... 4 1.1. Document History... 4 1.2. Overview... 4 1.3. Target Audience... 4 2. Introduction... 4 2.1. Introducing the Big Data BizViz Data

More information

II.Matrix. Creates matrix, takes a vector argument and turns it into a matrix matrix(data, nrow, ncol, byrow = F)

II.Matrix. Creates matrix, takes a vector argument and turns it into a matrix matrix(data, nrow, ncol, byrow = F) II.Matrix A matrix is a two dimensional array, it consists of elements of the same type and displayed in rectangular form. The first index denotes the row; the second index denotes the column of the specified

More information

Data Structures STAT 133. Gaston Sanchez. Department of Statistics, UC Berkeley

Data Structures STAT 133. Gaston Sanchez. Department of Statistics, UC Berkeley Data Structures STAT 133 Gaston Sanchez Department of Statistics, UC Berkeley gastonsanchez.com github.com/gastonstat/stat133 Course web: gastonsanchez.com/stat133 Data Types and Structures To make the

More information

Outline. What are SAS (1966), Stata (1985), and SPSS (1968)? Migrating to R for SAS/SPSS/Stata Users

Outline. What are SAS (1966), Stata (1985), and SPSS (1968)? Migrating to R for SAS/SPSS/Stata Users UCLA Department of Statistics Statistical Consulting Center Outline Vivian Lew vlew@stat.ucla.edu October 4, 2009 Statistical packages allow you to read raw data and transform it into a proprietary file

More information

An Introduction to Using R

An Introduction to Using R An Introduction to Using R Dino Christenson & Scott Powell Ohio StateUniversity November 20, 2007 Introduction to R Outline I. What is R? II. Why use R? III. Where to get R? IV. GUI & scripts V. Objects

More information

A Guide for the Unwilling S User

A Guide for the Unwilling S User A Guide for the Unwilling S User Patrick Burns Original: 2003 February 23 Current: 2005 January 2 Introduction Two versions of the S language are available a free version called R, and a commercial version

More information

Oracle Database 10g: Introduction to SQL

Oracle Database 10g: Introduction to SQL ORACLE UNIVERSITY CONTACT US: 00 9714 390 9000 Oracle Database 10g: Introduction to SQL Duration: 5 Days What you will learn This course offers students an introduction to Oracle Database 10g database

More information