Computer lab 2 Course: Introduction to R for Biologists



April 23,

1 Scripting

As you have seen, you often want to run a sequence of commands several times, perhaps with small changes. An efficient way to do this is to store your commands in a text file and run the file from R. This is called scripting. It is vital for efficient analyses, and it also documents exactly what you did in your computations for later use.

1. Create a directory at some convenient place on your computer, possibly a specific folder for this course, for storing your R files. It is usually easiest to keep the scripts for a single analysis in a separate folder, so if you prefer, create a sub-folder named lab2 or something similar to collect the files for this lab.

2. Change your working directory to your newly created directory. In RStudio you can find the command Set Working Directory under the Tools or Project menu and browse for the directory. In any R environment you can use the functions getwd and setwd to see which working directory you are currently in and to set the working directory.

3. Objects you create in R can be stored in a workspace when you close R. The workspace is stored in the current directory, the one you just set above. Try this out: create a couple of objects in R, close R while saving the workspace, then go to the directory you created, right-click on the R icon and select open with RStudio (this is how to do it under Windows). Your objects should now be available; try ls(). You can also load workspaces manually with the load button in the top right workspace panel in RStudio or with the command load().

4. Now select New R Script from the File menu in RStudio and save it as myscript.r in your chosen directory. If you are not using RStudio, create a file of the same name in a text editor of your choice. In this file, write

    mydata <- c(432, 44, 1)
    mean(mydata)

and save it. Scripts are run with the source command, which can also be accessed through the Code menu in RStudio. To run your script, write in the R console

    source("myscript.r")

If you get an error message, try the function dir(); it lists the files in the current directory. If myscript.r is not listed, either move the file to your current working directory or change your working directory to the file's location. If you do not get an error message, the script executed, but you will not see any output. If you try ls(), you will see that R now has an object called mydata. By default, commands in a sourced script are silent, so the mean of the data was not printed when the script was run. To make R print the result of a command in a script you need to write, for example,

    print(mean(mydata))

Edit your file accordingly and run the script again to see the output. (Don't forget to save your file after you have edited it.)

5. Script files such as myscript.r above are useful for storing sequences of commands. They can also store other text that explains your computations and your thinking, right next to the commands. The symbol # makes R ignore all text following it on the same line; it is therefore called the comment symbol. Write a text file containing a solution to the following exercise: Read in the data 34, 54, 25, 53, 24, 41, 49, 32, 26, 51 and analyze it by producing some of the summary statistics and some of the plots you have learned to produce so far. (Hint: again use c() to combine the data into a vector; other useful functions are mean, sd, hist, and summary.) The text file should contain, as comments, an explanation of what each command does.

A very useful feature of RStudio is the ability to run only part of a script. This is done by selecting those lines in the file panel, top left, and pressing Ctrl + Enter (Cmd + Enter on Mac). Try this on some of the code in your script.

NOTE: All your obligatory exercises should be written and handed in using the format above: a text file that can be run as a script by R, and which contains, as comments, the additional text that explains the computations.

2 Data structures in R

The data objects we have looked at so far have been either vectors or matrices, containing either numbers, text strings, or logical values. We will now look at a few other common data structures: factors, data frames, and lists. From now on it is suggested that you write the solutions to your exercises in scripts, so that you can easily redo steps, change commands, and get back to your solutions in the future.

1. Categorical variables are variables that can take on certain specific levels: the variable sex could have the two levels male and female, and a variable color could have the levels red, green, and blue, for example. Such variables are represented in R with factors. Create a factor as follows:

    > data <- c("woman", "man", "man", "woman", "woman")
    > d <- factor(data)

A factor is represented in a specific way in the computer; try to guess how by applying the functions levels and as.numeric to d. Also try out the function as.character on d. Storing a vector as a factor changes the behavior of many functions, sometimes in a direction you want, sometimes not. We will return to cases where factors are very useful.

2. Real data sets often come in the form of tables. Often, each row represents an observation, and each column an attribute of each observation. The attributes can be of various types, sometimes represented by numbers, sometimes by text. Try out and explain the outputs of the following commands:

    > attribute1 <- c(34, 52, 31)
    > attribute2 <- 1:3
    > attribute3 <- c("man", "woman", "woman")
    > mymatrix1 <- cbind(attribute1, attribute2)
    > mymatrix2 <- cbind(attribute1, attribute2, attribute3)

NOTE: If you have lines of code in your script that you do not want to run, it can be useful to comment them out by writing a # in front of each such line. This allows you to keep the code but prevents it from being executed.

3. If you examine mymatrix2 (simply write mymatrix2 in the console) you will see that all the values are treated as characters, as opposed to numbers as in mymatrix1. Since a matrix cannot contain different data types, R forces all the data into the same type without giving a warning. However, mixing different types of data is often necessary, and this is possible using a data frame. Try

    > myframe <- data.frame(attribute1, attribute2, attribute3)

You will see when you display it that the data frame has named columns. Use the names function to assign suitable names to the three columns.

4. If you, for example, named the first column Age, you will now be able to access this column in two ways:

    > myframe$Age
    > myframe[,1]

Note that R is case-sensitive, so the name after $ must match the column name exactly. Use the first type of access to change the first woman's age from 52 to 49, and the second type of access to change the man's age from 34 to a new value.

5. Use the class function to investigate the type of the third column: you will find that it is a factor (in R versions before 4.0.0; from 4.0.0 on, data.frame keeps text columns as character by default). This may or may not be what you would like. Read the help for data.frame, and find a way to re-construct myframe in such a way that the last column gets type character and not factor. This can also be done by replacing the last column with itself but changing the type using as.character.

6. In the data above, each row represented an observation, so naturally all columns had the same length. In other types of data, the data set might be a collection of several vectors of different lengths. In this case you can use a list to collect the data in a single object. OPTIONAL: Use help on the list function to find out how you can represent such data as a list. You can, for example, create a list containing mymatrix1, mymatrix2, and myframe using

    > mylist <- list(mymatrix1, mymatrix2, myframe)

You access the elements of a list using double square brackets; for example, mylist[[1]] gives you the contents of mymatrix1.
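To make the list idea in item 6 a little more concrete, here is a minimal sketch. The object names (weights, labels, flag, mylist2) and their values are invented for illustration and are not part of the lab data; the point is only to show how elements of different lengths and types sit in one list and how single and double square brackets differ.

```r
# Hypothetical vectors of different lengths and types (made up for illustration)
weights <- c(12.1, 9.8, 14.3, 11.0)
labels  <- c("control", "treated")
flag    <- TRUE

# A list can hold all of them in one object; naming the elements is optional
mylist2 <- list(weights = weights, labels = labels, flag = flag)

# Double square brackets return the element itself ...
first_element <- mylist2[[1]]
# ... and named elements can also be reached with $
second_element <- mylist2$labels

# Single square brackets return a shorter list, not the element
sub_list <- mylist2[1]

print(length(mylist2))          # number of elements in the list
print(class(first_element))     # class of the extracted vector
print(class(sub_list))          # class of the one-element sub-list
print(sapply(mylist2, length))  # length of each element
```

Naming the elements is a design choice rather than a requirement: positions such as mylist2[[1]] keep working, but names tend to make scripts easier to read later.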

3 Input and output of data

We are finally getting to a very important point: input and output of data. Real data sets will most often be in the form of output from some other program. A general way of inputting such data to R is to make sure it is in some kind of text format.

1. Download the file Example1.txt from the course homepage and put it into your current R directory. Open the file in Windows, with for example Notepad: you will see that it has three columns of data, that the first line holds the headings, that the first column is text and the other two columns consist of numbers, and that the columns are separated by tab characters. The data is in fact part of the result of a microarray experiment; the first column consists of names of probes for genes. The file was produced by Microsoft Excel, using the output option tab-delimited text.

2. A general way to input data in the format of a table is to use the read.table function. Try first

    > mydata <- read.table("Example1.txt")

You are likely to get an error message, as you need to adapt the read.table call to this particular type of file. Read help(read.table) to identify the problem or problems, and use the help information to find a way to change the arguments of read.table so that it reads in the data without problems.

3. Investigate your new object using functions you know. A useful function may be head. Other useful functions to apply are dim, class, and names; make sure you understand the output from each. Try also

    > class(mydata$genename)

which will show that the first column is a factor, and not just a character vector (in R versions before 4.0.0, where text columns are converted to factors by default). Columns stored as factors may cause unexpected behavior if they are intended to be interpreted just as character vectors. Go back to the help for read.table, or for read.delim, and find an option so that when you re-read the data from file, the first column becomes a character vector, i.e., the last command above responds with character.

4. Try the function

    > newdata <- edit(mydata)

and change some of the probe names to names you find prettier.

5. Create from newdata a new data set consisting of only the lines where the probe name has - as the second character. (Hint: consider the function substr.)

6. To write out data in table format, the function write.table is often useful. Read help(write.table) to find out how to output your data again to a text file; name it newdata.txt. Use Notepad or another text editor to view the data file. OPTIONAL: If you like, try to open the new file with Excel. In the help file for write.table, you may find an alternative command which may be better suited for outputting a table that is going to be read into Excel.

7. Data can also be contained in packages. For example, the package connected to our textbook, ISwR, contains a number of data sets. Activate the package (for this R session) by writing

    > library(ISwR)

If you get an error message, it means that the package has not yet been installed on the computer you use. To install it, use either

    > install.packages("ISwR")

or the Packages menu (under Windows). After the package has been activated with the library function, use

    > help(package=ISwR)

to see a list of the data sets contained in the package. You can read more about each data set using help; e.g., for the energy data set, write

    > help(energy)

To activate the data set, so that it appears among your objects when you use the ls function, use the data function; e.g., to activate the energy data set, write

    > data(energy)

Finally, visualize the data: try the two commands

    > plot(energy)
    > plot(energy$expend~energy$stature)

and explain the output.
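The reading, filtering, and writing steps in items 2 to 6 can be sketched as a round trip on a small made-up table. Everything below is an assumption for illustration: the data frame toy, its column names, and the temporary file are hypothetical and are not the course file Example1.txt. It is one possible set of arguments, not the only one that works.

```r
# Hypothetical table, made up for illustration (not the course data)
toy <- data.frame(probe = c("A-01", "B202", "C-33"),
                  red   = c(1.5, 2.0, 0.7),
                  green = c(1.1, 2.4, 0.9),
                  stringsAsFactors = FALSE)

# Write a tab-delimited text file with a header line and no row names
toyfile <- tempfile(fileext = ".txt")
write.table(toy, file = toyfile, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back: header = TRUE uses the first line as column names,
# sep = "\t" splits on tabs, stringsAsFactors = FALSE keeps text as character
toy2 <- read.table(toyfile, header = TRUE, sep = "\t", stringsAsFactors = FALSE)

print(head(toy2))
print(class(toy2$probe))

# Keep only the rows whose probe name has "-" as its second character
dashed <- toy2[substr(toy2$probe, 2, 2) == "-", ]
print(dashed)
```

Writing to a file created by tempfile() keeps the sketch from overwriting anything in your working directory; in the lab itself you would use your own file names instead.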

4 R programming

So far, you have applied R either by using single commands or by using sequences of commands placed together in a script. One of the strengths of R is that you can seamlessly expand the way you use it into using it as a programming language.

1. Even if a script can store a useful sequence of commands, it is not very flexible if you want to apply it to different data or use it multiple times. The standard functions in R, such as plot and sd, offer this flexibility. In R you have the option to write your own functions. As an example, let's assume that you need to go through a vector of words and replace any occurrence of the string 3XSSC with another given string. Write the following code in a script:

    myreplace <- function(v, newstring="yes") {
      index <- v == "3XSSC"
      v[index] <- newstring
      return(v)
    }

Source the script so that the code gets executed in R. Check the workspace panel in RStudio or use ls() to see that the function appeared. Now test this function on the first column of the data set read in as mydata above by writing

    > outputvector <- myreplace(mydata[,1])

Can you tell the difference between mydata[,1] and outputvector? (Hint: use head to look at the first few values of each of them.)

Now let's go through the meaning of each line in this function.

    myreplace <- function(v, newstring="yes") {
    }

This part states that we create a function called myreplace that has two input arguments (also known as parameters), v and newstring. The argument newstring also has a default value, yes, that will be used if we do not supply a value for newstring. The curly brackets { and } then define what is inside our function. Any code between them will be executed when the function is called.

    index <- v == "3XSSC"
    v[index] <- newstring
    return(v)

These three lines define what the function actually does. It first finds all occurrences of 3XSSC and stores this information in index. Then it replaces all the elements containing 3XSSC with the value stored in newstring. Finally, the return command states that v should be given as the output of our function call.

Now try to call the function myreplace on the first column of mydata again, but give the function an additional argument so that it changes 3XSSC into another word.

2. Make your function more general by giving it an extra argument, with default value No, indicating which word should be replaced. Test this new version of your function by, for example, replacing geno1 in mydata with something else.

3. In some situations you want to perform the same set of commands multiple times. This can be done with loops. There are a few different options for this in R, but the most common one is the for loop. Write the following code in a script and run it:

    for ( i in 1:10 ) {
      print(i)
    }

This loop runs the code print(i) for each value of i contained in the vector 1:10. Note again the construction 1:10, which is a very quick way of creating a sequence of values. Now create a new loop that prints the first ten gene names in mydata.

4. The final concept to consider is conditional execution of commands. This means having code in your script or function that only gets executed if some condition holds. For example, try running the following code using a script:

    a <- 10
    if ( a > 15) {
      print("a is greater than 15")
    } else {
      print("a is less than or equal to 15")
    }

This code checks the statement after if, in this case whether a is larger than 15, and if that is true it executes the first block of code. If it is not true, it checks whether an else command is given and executes any code following that. Try what happens when you change the value of a to something larger than 15.

Now combine your knowledge of for loops and if statements to write a short script that steps through the first twenty rows of mydata, for each row calculates the difference between the red and green values (columns 2 and 3), and then prints the name of any gene where the difference is larger than a given threshold. (Hint: use the abs function to get around problems with negative differences.) The same procedure can be done using only direct vector operations, but try to use a for loop containing an if statement.

If you want to read more about for loops and if statements, Dalgaard covers this. You can also find information in chapter 9 of the text An Introduction to R, accessible through the built-in R help; use help.start() to start it.

The final two exercises are very nice and combine many of the concepts we have covered so far, but they can be slightly demanding. At this point you have the option to head straight to the first three hand-in assignments, labs three to five.

5. OPTIONAL: Transform the data stored in mydata as follows: for all lines where the gene name is duplicated, remove all but the first one. Then sort all lines according to the gene names in the first column. You may have use for such functions as sort, unique, and duplicated.

6. OPTIONAL: We would like to create a new function, similar to myreplace, that replaces the first letter in each word, if it is a capital letter, with the corresponding lower-case letter. You may have use for the built-in vectors LETTERS and letters. The best thing, for speed of execution, is to write your function using vectors: try this. Alternatively, try to write the function using a for loop.
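The combination of a for loop and an if statement described in item 4 above can be sketched on a small made-up table. The data frame toy, its column names, and the value of cutoff are all hypothetical and chosen only for this example; they are not the course data or the threshold asked for in the exercise. The sketch also shows the vectorised alternative mentioned in the text, so the two approaches can be compared.

```r
# Hypothetical data, made up for illustration (not the course data)
toy <- data.frame(gene  = c("g1", "g2", "g3", "g4"),
                  red   = c(5.0, 2.0, 7.5, 1.0),
                  green = c(4.5, 6.0, 7.0, 4.0),
                  stringsAsFactors = FALSE)
cutoff <- 2  # hypothetical threshold, chosen only for this example

# Loop version: visit each row and test the condition one row at a time
selected <- character(0)
for (i in seq_len(nrow(toy))) {
  difference <- abs(toy$red[i] - toy$green[i])
  if (difference > cutoff) {
    print(toy$gene[i])
    selected <- c(selected, toy$gene[i])
  }
}

# Vectorised version: the comparison is done on whole columns at once
selected_vec <- toy$gene[abs(toy$red - toy$green) > cutoff]
print(selected_vec)
```

Using seq_len(nrow(toy)) instead of 1:nrow(toy) is a small safety choice: it yields an empty sequence, and therefore no iterations, if the table happens to have zero rows.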

R in Linguistic Analysis. Week 2 Wassink Autumn 2012

R in Linguistic Analysis. Week 2 Wassink Autumn 2012 R in Linguistic Analysis Week 2 Wassink Autumn 2012 Today R fundamentals The anatomy of an R help file but first... How did you go about learning the R functions in the reading? More help learning functions

More information

Statistics for Biologists: Practicals

Statistics for Biologists: Practicals Statistics for Biologists: Practicals Peter Stoll University of Basel HS 2012 Peter Stoll (University of Basel) Statistics for Biologists: Practicals HS 2012 1 / 22 Outline Getting started Essentials of

More information

Instruction: Download and Install R and RStudio

Instruction: Download and Install R and RStudio 1 Instruction: Download and Install R and RStudio We will use a free statistical package R, and a free version of RStudio. Please refer to the following two steps to download both R and RStudio on your

More information

This document is designed to get you started with using R

This document is designed to get you started with using R An Introduction to R This document is designed to get you started with using R We will learn about what R is and its advantages over other statistics packages the basics of R plotting data and graphs What

More information

Mails : ; Document version: 14/09/12

Mails : ; Document version: 14/09/12 Mails : ; Document version: 14/09/12 A freely available language and environment Statistical computing Graphics Supplementary

More information

A brief introduction to R

A brief introduction to R A brief introduction to R Cavan Reilly September 29, 2017 Table of contents Background R objects Operations on objects Factors Input and Output Figures Missing Data Random Numbers Control structures Background

More information

Module 1: Introduction RStudio

Module 1: Introduction RStudio Module 1: Introduction RStudio Contents Page(s) Installing R and RStudio Software for Social Network Analysis 1-2 Introduction to R Language/ Syntax 3 Welcome to RStudio 4-14 A. The 4 Panes 5 B. Calculator

More information

Introduction to Statistics using R/Rstudio

Introduction to Statistics using R/Rstudio Introduction to Statistics using R/Rstudio R and Rstudio Getting Started Assume that R for Windows and Macs already installed on your laptop. (Instructions for installations sent) R on Windows R on MACs

More information

An Introduction to Stata Exercise 1

An Introduction to Stata Exercise 1 An Introduction to Stata Exercise 1 Anna Folke Larsen, September 2016 1 Table of Contents 1 Introduction... 1 2 Initial options... 3 3 Reading a data set from a spreadsheet... 5 4 Descriptive statistics...

More information

R: BASICS. Andrea Passarella. (plus some additions by Salvatore Ruggieri)

R: BASICS. Andrea Passarella. (plus some additions by Salvatore Ruggieri) R: BASICS Andrea Passarella (plus some additions by Salvatore Ruggieri) BASIC CONCEPTS R is an interpreted scripting language Types of interactions Console based Input commands into the console Examine

More information

Introduction to R. base -> R win32.exe (this will change depending on the latest version)

Introduction to R. base -> R win32.exe (this will change depending on the latest version) Dr Raffaella Calabrese, Essex Business School 1. GETTING STARTED Introduction to R R is a powerful environment for statistical computing which runs on several platforms. R is available free of charge.

More information

Chapter 2 Assignment (due Thursday, April 19)

Chapter 2 Assignment (due Thursday, April 19) (due Thursday, April 19) Introduction: The purpose of this assignment is to analyze data sets by creating histograms and scatterplots. You will use the STATDISK program for both. Therefore, you should

More information

Reading and writing data

Reading and writing data An introduction to WS 2017/2018 Reading and writing data Dr. Noémie Becker Dr. Sonja Grath Special thanks to: Prof. Dr. Martin Hutzenthaler and Dr. Benedikt Holtmann for significant contributions to course

More information

Lab 1: Getting started with R and RStudio Questions? or

Lab 1: Getting started with R and RStudio Questions? or Lab 1: Getting started with R and RStudio Questions? or 1. Installing R and RStudio To install R, go to and click on the Download

More information

Author: Leonore Findsen, Qi Wang, Sarah H. Sellke, Jeremy Troisi

Author: Leonore Findsen, Qi Wang, Sarah H. Sellke, Jeremy Troisi 0. Downloading Data from the Book Website 1. Go to 2. Click on Data Sets 3. Click on Data Sets: PC Text 4. Click on Click here to download. 5. Right Click PC Text and choose

More information

Lecture 1: Getting Started and Data Basics

Lecture 1: Getting Started and Data Basics Lecture 1: Getting Started and Data Basics The first lecture is intended to provide you the basics for running R. Outline: 1. An Introductory R Session 2. R as a Calculator 3. Import, export and manipulate

More information

A whirlwind introduction to using R for your research

A whirlwind introduction to using R for your research A whirlwind introduction to using R for your research Jeremy Chacón 1 Outline 1. Why use R? 2. The R-Studio work environment 3. The mock experimental analysis: 1. Writing and running code 2. Getting data

More information

Lab 1. Introduction to R & SAS. R is free, open-source software. Get it here:

Lab 1. Introduction to R & SAS. R is free, open-source software. Get it here: Lab 1. Introduction to R & SAS R is free, open-source software. Get it here: for your own computer. 1.1. Using R like a calculator Open R and type these commands into the R Console

More information

Entering and Outputting Data 2 nd best TA ever: Steele H. Valenzuela February 2-6, 2015

Entering and Outputting Data 2 nd best TA ever: Steele H. Valenzuela February 2-6, 2015 Entering and Outputting Data 2 nd best TA ever: Steele H. Valenzuela February 2-6, 2015 Contents Things to Know Before You Begin.................................... 1 Entering and Outputting Data......................................

More information


LAB #1: DESCRIPTIVE STATISTICS WITH R NAVAL POSTGRADUATE SCHOOL LAB #1: DESCRIPTIVE STATISTICS WITH R Statistics (OA3102) Lab #1: Descriptive Statistics with R Goal: Introduce students to various R commands for descriptive statistics. Lab

More information

Lab 1 (fall, 2017) Introduction to R and R Studio

Lab 1 (fall, 2017) Introduction to R and R Studio Lab 1 (fall, 201) Introduction to R and R Studio Introduction: Today we will use R, as presented in the R Studio environment (or front end), in an introductory setting. We will make some calculations,

More information

GEO 425: SPRING 2012 LAB 9: Introduction to Postgresql and SQL

GEO 425: SPRING 2012 LAB 9: Introduction to Postgresql and SQL GEO 425: SPRING 2012 LAB 9: Introduction to Postgresql and SQL Objectives: This lab is designed to introduce you to Postgresql, a powerful database management system. This exercise covers: 1. Starting

More information

STAT 540 Computing in Statistics

STAT 540 Computing in Statistics STAT 540 Computing in Statistics Introduces programming skills in two important statistical computer languages/packages. 30-40% R and 60-70% SAS Examples of Programming Skills: 1. Importing Data from External

More information

An Introduction to R- Programming

An Introduction to R- Programming An Introduction to R- Programming Hadeel Alkofide, Msc, PhD NOT a biostatistician or R expert just simply an R user Some slides were adapted from lectures by Angie Mae Rodday MSc, PhD at Tufts University

More information

POL 345: Quantitative Analysis and Politics

POL 345: Quantitative Analysis and Politics POL 345: Quantitative Analysis and Politics Precept Handout 1 Week 2 (Verzani Chapter 1: Sections 1.2.4 1.4.31) Remember to complete the entire handout and submit the precept questions to the Blackboard

More information

Biology 345: Biometry Fall 2005 SONOMA STATE UNIVERSITY Lab Exercise 2 Working with data in Excel and exporting to JMP Introduction

Biology 345: Biometry Fall 2005 SONOMA STATE UNIVERSITY Lab Exercise 2 Working with data in Excel and exporting to JMP Introduction Biology 345: Biometry Fall 2005 SONOMA STATE UNIVERSITY Lab Exercise 2 Working with data in Excel and exporting to JMP Introduction In this exercise, we will learn how to reorganize and reformat a data

More information

Statistical Software Camp: Introduction to R

Statistical Software Camp: Introduction to R Statistical Software Camp: Introduction to R Day 1 August 24, 2009 1 Introduction 1.1 Why Use R? ˆ Widely-used (ever-increasingly so in political science) ˆ Free ˆ Power and flexibility ˆ Graphical capabilities

More information

An Introductory Tutorial: Learning R for Quantitative Thinking in the Life Sciences. Scott C Merrill. September 5 th, 2012

An Introductory Tutorial: Learning R for Quantitative Thinking in the Life Sciences. Scott C Merrill. September 5 th, 2012 An Introductory Tutorial: Learning R for Quantitative Thinking in the Life Sciences Scott C Merrill September 5 th, 2012 Chapter 2 Additional help tools Last week you asked about getting help on packages.

More information

Introduction to R Reading, writing and exploring data

Introduction to R Reading, writing and exploring data Introduction to R Reading, writing and exploring data R-peer-group QUB February 12, 2013 R-peer-group (QUB) Session 2 February 12, 2013 1 / 26 Session outline Review of last weeks exercise Introduction

More information

Linkage analysis with paramlink Session I: Introduction and pedigree drawing

Linkage analysis with paramlink Session I: Introduction and pedigree drawing Linkage analysis with paramlink Session I: Introduction and pedigree drawing In this session we will introduce R, and in particular the package paramlink. This package provides a complete environment for

More information

Stata: A Brief Introduction Biostatistics

Stata: A Brief Introduction Biostatistics Stata: A Brief Introduction Biostatistics 140.621 2005-2006 1. Statistical Packages There are many statistical packages (Stata, SPSS, SAS, Splus, etc.) Statistical packages can be used for Analysis Data

More information

No Name What it does? 1 attach Attach your data frame to your working environment. 2 boxplot Creates a boxplot.

No Name What it does? 1 attach Attach your data frame to your working environment. 2 boxplot Creates a boxplot. No Name What it does? 1 attach Attach your data frame to your working environment. 2 boxplot Creates a boxplot. 3 confint A metafor package function that gives you the confidence intervals of effect sizes.

More information

Introduction to scientific programming in R

Introduction to scientific programming in R Introduction to scientific programming in R John M. Drake & Pejman Rohani 1 Introduction This course will use the R language programming environment for computer modeling. The purpose of this exercise

More information

Introduction to R. UCLA Statistical Consulting Center R Bootcamp. Irina Kukuyeva September 20, 2010

Introduction to R. UCLA Statistical Consulting Center R Bootcamp. Irina Kukuyeva September 20, 2010 UCLA Statistical Consulting Center R Bootcamp Irina Kukuyeva September 20, 2010 Outline 1 Introduction 2 Preliminaries 3 Working with Vectors and Matrices 4 Data Sets in R 5 Overview

More information

Week 1: Introduction to R, part 1

Week 1: Introduction to R, part 1 Week 1: Introduction to R, part 1 Goals Learning how to start with R and RStudio Use the command line Use functions in R Learning the Tools What is R? What is RStudio? Getting started R is a computer program

More information

ELEC4042 Signal Processing 2 MATLAB Review (prepared by A/Prof Ambikairajah)

ELEC4042 Signal Processing 2 MATLAB Review (prepared by A/Prof Ambikairajah) Introduction ELEC4042 Signal Processing 2 MATLAB Review (prepared by A/Prof Ambikairajah) MATLAB is a powerful mathematical language that is used in most engineering companies today. Its strength lies

More information

An Introduction to Statistical Computing in R

An Introduction to Statistical Computing in R An Introduction to Statistical Computing in R K2I Data Science Boot Camp - Day 1 AM Session May 15, 2017 Statistical Computing in R May 15, 2017 1 / 55 AM Session Outline Intro to R Basics Plotting In

More information

EGR 111 Functions and Relational Operators

EGR 111 Functions and Relational Operators EGR 111 Functions and Relational Operators This lab is an introduction to writing your own MATLAB functions. The lab also introduces relational operators and logical operators which allows MATLAB to compare

More information

Introduction to SPSS

Introduction to SPSS Introduction to SPSS Purpose The purpose of this assignment is to introduce you to SPSS, the most commonly used statistical package in the social sciences. You will create a new data file and calculate

More information
